Chess and Testing

[Originally on the Ministry of Testing, Jan 2014, now only on the Web Archives]: The Day Testing Died But Didn’t

In May 11 1997 a computer beat the world champion Kasparov in chess [1] – not convincingly, but still. From then on chess could be reduced to a set of scripts and the scripts automated so fast that it was comparable to the human mind [2]. But the human chess players continued to succeed – not by more rote memorisation, but by more intuition and feelings.

Imagine that – to play world champion chess and base your moves on feelings. This is what Magnus Carlsen does [3]. As of January 2014 he is the reigning World Chess Champion [4] and the no. 1 ranked player in the world [5] with the highest rating in history [6]. I must admit that I read about him in the paper [7][8] [9], but the story relates to how even one of the most complex brain games can be automated, and yet there are still moves to explore.

To play according to textbooks is fine, up to a certain level. Perhaps up to master level, but not to grandmasters. [10]

Originally chess was a game played on a board, but even more so in the brain of the players. Grand masters of the cold war super powers played each other with full focus on both the board moves and the body moves.

Encyclopaedias of chess moves have been written; 1700+ chess moves have been given mnemonics like “the Sicilian Defence”, “King Gambit” (SFDEPOT anyone?). And the chess masters have played and played and memorized and played (against) the computer again and again.

There are books, terminology, strategies and schools of chess [11]. To quote Wikipedia:

A school (of chess)means a (chess) player or group of players that share common ideas about the strategy (of the game). There have been several schools in the history (of modern chess). Today there is less dependence on schools – players draw on many sources and play according to their personal style.

After Kasparov there were other world chess champions [12] – and lately 23 year old Magnus Carlsen, as mentioned. Carlsen started playing chess in 1998; he played Kasparov [13] as a 13 year old for a draw and later had Kasparov and the Danish grand master Peter Heine Nielsen as a trainer. Heine Nielsen explains about Carlsen:

“While the existing World Champion Anand [14]’s strength was being able to prepare thoroughly and calculate moves very fast while playing, Carlsen is different – he thrives in the contexts that are not distilled by the computer or text books. When it’s man to man – then he’s the opposite of a computer; the one that often does the unexpected yet effectual play. He plays a variety of openings – making it really hard to prepare for.”

Carlsen can’t describe, what goes on in his brain, while he plays chess. Some moves just feels good; and when the opponent play is somewhat based on computer calculations – that is maybe the best response. [15]

Chess didn’t die with the automation, chess didn’t die by being distilled in text books and templates and mnemonics – but chess evolved. The current unfair advantage for Carlsen is his irrationality and intuition – it’s what sets him apart from the scripts.

The day testing died – but didn’t, is another story Or is it?

References

  1. http://en.wikipedia.org/wiki/Deep_Blue_versus_Garry_Kasparov
  2. http://en.wikipedia.org/wiki/Turing_test
  3. http://en.wikipedia.org/wiki/Magnus_Carlsen
  4. http://en.wikipedia.org/wiki/World_Chess_Championship
  5. http://en.wikipedia.org/wiki/FIDE_World_Rankings
  6. http://en.wikipedia.org/wiki/Comparison_of_top_chess_players_throughout_history
  7. The article is my inspiration – and will be paraphrased
  8. Danish and pay-walled http://jyllands-posten.dk/eceRedirect?articleId=6190682
  9. By the way, I don’t know much about chess
  10. http://en.chessbase.com/post/vladimir-kramnik-che-is-so-deep-i-simply-feel-lost-
  11. http://en.wikipedia.org/wiki/School_of_chess
  12. http://en.wikipedia.org/wiki/World_Chess_Champion
  13. http://en.wikipedia.org/wiki/Garry_Kasparov
  14. http://en.wikipedia.org/wiki/Viswanathan_Anand
  15. Quote from the Danish article, my translation

How Automation Affects the Business

As of writing I am managing the testing of a large enterprise IT program, where we are implementing a new commercial enterprise solution (COTS).

Over the last many months there have been requirement workshops upon requirement workshop to write down what the new system should be able to do for the various business units. We have had many representatives from the business units as part of the workshops and now have about 1000 specific business requirements that needs to be tested.

Some requirements are closed questions, others are more open-ended or similarly require some thinking. Currently the ratio is that 70% is done by test automation and 30% is for a few of the subject matter experts to test. Management was happy with this, as this made the project faster, the solution more robust and the project less reliant on taking the business people away from their “real work”.

So far so good

The other day I reached out by mail to more of the business people involved in the workshops to let them know that testing had started, and that they would be able to access the solution under test when it had been “hardened”. But so far, only a few “track leads” would be involved.

The feedback surprised me, as my message was both good and bad. Good in the sense that they would not be involved so much, but also bad that they would not be involved so much. One wrote back to me:

  • There is still a risk that the solution will not be as the workshops intended, as the requirements and solution might not capture precisely, what was agreed during the workshops
  • Having been part of the workshop, we are held responsible by our coworkers as to how well the new system supports the business
  • Why don’t the project want our involvement on this?

... but that was “just feelings”, he wrote in the end. And indeed it is – No matter how it looks at first, it’s always a people problem and even if we have a successful test automation effort – we can still fail to appreciate the experts knowledge and by that fail to solve the business problems.

More about “Leading when the experts test” at ConTest NYC 2019.

I was out hiking in April. But city management had locked the toilet up - out in the woods. As an END user my problem was then solved by doing it in the woods. And all fancy sheds where for naught.
I was out hiking in April. But city management had locked the toilet up – out in the woods. As an END user my problem was then solved by doing it in the woods. And all fancy sheds where for naught.

Visual Tests are Still Code

Among the currently shiny new test automation things are visual “script-less” test automation tools. But the visual test flows are still code – and thus require discipline to structure and maintain. Otherwise you are just adding yet another layer of spaghetti code.

Among the current shiny new test automation tools are visual “script-less” automation tools like LeapWork [9], Blue Prism [10] and UiPath [7]. These tools are a part of a new class of business process automation tools called “Robot Process Automation” (RPA) [4]. There are two sub types of – “RPA” which focuses on processing data and Robot Desktop Automation (RDA).

RDA is interesting in the context of test automation [9], as they can automate GUI interactions – also on top of enterprise package applications (SaaS, COTS, OOTB etc. [2]). The test automation challenge for most of these enterprise applications (SAP, MS Dynamics [6] etc.) is that they come with no access to the code-base, even if these are pure-play web based – the GUI is all there is.

All you can to these type of business solutions is usually to add customization and configurations by entering or editing data directly in through the GUI. Some of these systems allow configurations and customization in the form of config-file – they really should be under change control [3], as they are part of the pipeline. 

visual tests are code

part of the ship

part of the crew

Bootstrap Bill Turner

Using RDA tools for test automation [9] is a novel [1] uncharted approach [12]. The editing of the “tests”/flows is usually done in a stand-alone application studio (Graphical IDE) with interactions to the solution under test (across the GUI and over Citrix and RDP) and to any test management and issue tracking system.

Interestingly the other more “data processing” RPA tools like Automation Anywhere [5] uses a VB-Script like syntax. Writing and maintaining “scripts” like that is quite like the common approaches to GUI automation using frameworks like Siluki, tagUI, Applitools [11].

Applitools etc. are coding frameworks you can apply if you have the application code base or want to write test automation directly as code. There could be benefits in coding UI testing in all web-only projects directly using Selenium and Applitools. Most enterprise business solutions are often stand-alone applications, or their web code is horrible to hook into, as often the selectors seems randomly generated (been-there-done-that).

Hence the primary driver for RDA adoption in for test automation is to take the RDA & RPA [4] tools and apply their strengths in process automation of enterprise business solutions [2] to drive the test execution. And of a business flow could be “automating” activities during onboarding [7] or an SAP purchase order as below images:

Another key driver for adoption of RPA for test automation is their visual approach in presenting interactions/tests as flows. Some do it gracefully and user-friendly (LeapWork) – others have a more old-school workflow/swim lane approach (Blue Prism, UiPath). In both cases the visual flows illustrate an interaction across multiple GUI applications to perform business actions (yes, this still happens).

These drivers probably to make the barrier to entry seem more manageable. The visual ones very easily turn into visual spaghetti code if you don’t keep an eye on it and use sub flows, low coupling and high cohesion [13].  … as with any other non-trivial code (of a certain McCabe complexity [14]). One interesting way to go about a “coding” practice for visual test cases could be inspired by how BDD can be implemented in LeapWork [8] with annotation and self-referencing unit tests.

At the end of the day even a visual test automation project is a coding project, that should be part of the project code base like everything else [3]. And probably best maintained by software engineers within the project team (where possible) – unless you want a team of test engineers spending all day playing catch-up to maintain the automation code.

  1. Since 2017’ish.
  2. COTS/OOTB = Commercial of the shelf, out of the box
  3. https://twitter.com/mipsytipsy/status/1146968926493929472
  4. https://www.horsesforsources.com/2019_RTS_survey_070619
  5. https://www.linkedin.com/pulse/automation-anywhere-example-neil-kolban/
  6. https://www.leapwork.com/blog/automate-testing-microsoft-dynamics-365-crm
  7. https://www.uipath.com/blog/how-rpa-can-help-companies-rethink-hr-tasks
  8. https://www.capgemini.dk/bdd-in-leapwork/#tab5
  9. https://dojo.ministryoftesting.com/dojo/lessons/rpa-as-a-power-tool-for-testing  
  10. https://crunchytechbytz.wordpress.com/2018/03/13/automation-with-blue-prism/
  11. https://applitools.com/features
  12. https://jlottosen.wordpress.com/2019/04/20/broaden-the-scope-of-sut/
  13. https://medium.com/clarityhub/low-coupling-high-cohesion-3610e35ac4a6
  14. https://en.wikipedia.org/wiki/Cyclomatic_complexity

Assumptions of the Test Pyramid

This may be a heresy to some… While the Test Automation Pyramid as a model may be right in many contexts, – but the model will be similarly wrong in other test automation contexts.

First let’s look at one of the assumptions of the Test Automation Pyramid:

Martin Fowler, 2012

Fowlers assumption (2012) is that UI automation is slow and expensive. Similarly Cohen (2009) writes that testing in the UI is “brittle, expensive, and time consuming“. Recently (2019) there have been developed at least two types of tools that break those assumptions – and make it relatively faster and cheaper to have automated GUI tests than before.

Example 1: Tools like Applitools Eyes let’s you do prepare test automation code that compare images of the UI. Angie Jones has an excellent code example of how to compare PDF files.

Example 2: Robot Desktop Automation tools gives the possibility of automating and autotomize end user business processes. These kind of tools can be used to write, maintain and schedule end user activities.

I have performed an analysis that shows that using RDA for test automation has similar costs and speed as with using Selenium for test automation … but then not all projects are web projects.

Still, the underlying assumption of both Applitools,the pyramid above and even Bach’s earth model is that the system under test consists of accessible code on the service and unit layers.

UI testing may be all there is!

In the context of Software-as-a-Service, standard commercially packaged applications and solutions – the business still want to test the system they are starting to use, but they have no access to the code. While they must reasonably expect the vendor to have tested the solution, the business implementing the IT package would want to test it in their setting using their own people.

As testing professionals we can help the business both not to request the kitchen sink, while also test all the things (that matter). As with all other testing – even the dreaded UAT – some of it is simple repeatable tasks (checks) while others are more subtle experiments (tests).

Perhaps we can estimate a ratio between the checks and the tests? Perhaps that ratio has more checks..? That would depend on what the business would like to know (what is their perception of quality) and how well the domain is codified (Genesis / Commodity).

There is a discussion and collection of alternative pyramids on “The Club”.


When testing web features doesn’t matter

(aka not every testing problem can be solved by a webdriver)

Web features doesn’t matter as much in the contexts I usually work in. While some may be delivered over the web, the focus for testing is on the whole system’s fit for the business. Adding automation in testing to the mix gives additional challenges as there is no source code in the solution to interact with, and we have to find other solutions to solve the tedious tasks in testing with.

One area where this is the case, is when implementing standard commercial software packages (COTS) for the enterprise or public sector. Solutions like SAP for retail CRM and ERP, Microsoft Dynamics 365 for Finance and Operations, EPIC for hospitals, Service Now for IT Service management etc. These are standard solutions that can be configured and customized, but the general source code is not available.

Thus the “test automation pyramid” falls short to help us automate things as only the GUI is available for interacting with the solution. Test engineers might want to setup CI/CD but the success of that depends on the system architecture and the provided as-is methods of releasing updates. Some of the solutions above are provided as SaaS but quite often they still run “on premise” and the business still wants releases tested before launching things on a corporate scale.

Screenshot of an RPA test on SAP

Another example is the many bespoke software solutions that are still in operation. My local electronics store has two interfaces for the sales persons: web to look things up (specs and availability) and a mainframe system to set up the actual purchase (Point of sales, POC). Many public organisations and enterprises are only now transitioning from the desktop applications of the 1990’es to more up to date solutions. Unfortunately systems that are 10+ years old have very little of live and relevant specifications and neither CI/CD suites.

While some COTS and POC solutions are being delivered over the web, testing web particularities the very least of our focus areas. The web particularities seems to matter more if the solutions are business-to-consumer but not so much when it’s business-to-employer or business-to-business.

In a business-to-employer and business-to-business context, usually only one browser is in scope. And that’s it. There is little interest for HTTP status codes, broken links, browser compatibility or login forms.

The primary testing challenges of these projects cannot be solved by Selenium, Cypress or neat tweaks in the latest JavaScript library.

Focusing only on testing the web in contexts like these we fail in

  • covering the whole system landscape across applications of different technologies
  • addressing the real questions of the business subject matter experts and the business

It’s in this context that RPA probably has some benefits in providing automation of tedious testing tasks to the tester with a business background. That is, they are business people first – and then they do the testing that matters to the context.

Trending: Shift-Left

TL;DR: Shift-Left is about testing early and automated. Shift technical with this trend or facilitate that testing happens.

Shift-Left is the label we apply when testing moves closer to development and integrated into the development activities. The concept is many IT years old, and there are already some excellent articles out there: What the Shift Left in Testing Means (Smart Bear, no date), “Shift left” has become “drop right” (Test Plant 2014), Shift Left QA. How to do it. Why it matters (Work Soft, 2015).

To me Shift-Left is still an active trend and change how to do testing. This goes along with Shift-Right, Shift-Coach and Shift-Deliver discussed separately. I discussed these trend labels at Nordic Testing Days 2016 during the talk “How to Test in IT operations“.

Here are some contexts where Shift-Left happens:

  • Google have “Software Engineer in Test” as job title according to the book “How we Test Software at Google“.
  • Microsoft have similar “Software Design Engineer in Test” as discussed by Alan Page in “The SDET Pendulum” and in the e-book “A-word
  • A project I was regarding pharmaceutical  Track and Trace, had no testers. I didn’t even test but did compliance documentation of test activities. The developers tested. First via peer review, then via peer execution of story tests and then validation activities. No testers, just the same team – for various reasons.
  • A project I was in regarding a website and API for trading property information had no testers, but had continuous build and deploy with even more user oriented test cases that I could ever grab. (see: Fell in the trap of total coverage)

The general approach to Shift-Left is that “checking” moves earlier in the cycle in form of automation. More BDD, more TDD, more automated tests, continuous builds, frequent feedback and green bars. More based on “Test automation pyramid” (blog discussion, whiteboard testing video). Discussing the pyramid model reveals that testing and checking goes together in the lower levels too. I’m certain that (exploratory) testing happens among technicians and service-level developers; – usually not explicitly, but still.

To have “no QA” is not easy. Not easy on the testers because they need to shift and become more SET/SDET-like or shift something else (Shift-Right and Shift-Coach and Shift-Deliver). Neither is it easy on the team, as the team has to own the quality activities – as discussed in “So we’re going “No QA’s”. How do we get the devs to do enough testing?

Testers and test managers cannot complain, when testing and checking is performed in new ways. When tool-supported testing take over the boring less-complex checks, we can either own these checks or  move to facilitate that these checks are in place. Similarly when the (exploratory) brain-based testing of the complex and unknown is being handed over to some other person. Come to think of it I always prefer testing done by subject matter experts in the project, be it users, clients, testers or other specialists.

We need to shift to adapt to new contexts and new ways of aiding in delivering working solutions to our clients.

jollyrum