I read an interesting paper the other day on the subject of automated software testing: “A Context-Driven Approach to Automation in Testing,” by James Bach and Michael Bolton. It is about the difference between using tools to aid testing and using tools to replace testing, and it raises some interesting points about how automation is viewed in the software development world. I have felt this focus on automation in my own position – every time a test is executed manually, the question becomes: “how can we automate this so it can be repeated in the future?” What the paper so aptly points out is that automated tools cannot replicate the testing a human performs.
When I reflect on the automated testing suites that I have contributed to and maintained, I realize two things:
- Automated tests rarely fail because of an actual bug that needs fixing, and
- Any bugs that arise are more likely to be discovered by manual, exploratory testing.
Automated Tests Rarely Fail as a Result of Bugs
Now, this is not to say that automated tests don’t serve their purpose. They are great as smoke tests (small, quick tests of basic functionality) and for testing deterministic interfaces (such as APIs, which have a specific contract that a computer can easily verify). The problem arises when these tests are not updated or maintained regularly. For example, I often see small failures in a testing suite that I developed for one of our back-end services. This service queries a Learning Management System (LMS) on behalf of a student to fetch all their due dates, and also lets students create personal dates. In a test of the date-creation feature, the date payload sent to the service is hard-coded. This test failed a couple of weeks ago because students can only add dates for courses they are enrolled in, and the hard-coded data had fallen out of date with the test user’s enrollments on the live system.
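To illustrate the difference, here is a minimal sketch of the brittle pattern versus a more maintainable one. The endpoint paths, field names, and test user are all hypothetical – this is not our actual service – but the shape of the problem is the same:

```python
import requests

BASE_URL = "https://example.test/due-dates-service"  # hypothetical service URL
TEST_USER = "test-student-01"                        # hypothetical test account


def test_create_personal_date_brittle():
    """Brittle version: the course ID is hard-coded, so the test breaks
    whenever the test user's enrollments change on the live system."""
    payload = {
        "userId": TEST_USER,
        "courseId": "MATH-101",          # hard-coded; may no longer be an active enrollment
        "title": "Study for midterm",
        "dueDate": "2025-11-30T23:59:00Z",
    }
    response = requests.post(f"{BASE_URL}/dates", json=payload)
    assert response.status_code == 201


def test_create_personal_date_maintainable():
    """More maintainable version: look up the user's current enrollments
    first, then build the payload from whatever is actually active."""
    enrollments = requests.get(f"{BASE_URL}/users/{TEST_USER}/enrollments").json()
    assert enrollments, "test user has no active enrollments to create a date against"

    payload = {
        "userId": TEST_USER,
        "courseId": enrollments[0]["courseId"],  # derived from live data, not hard-coded
        "title": "Study for midterm",
        "dueDate": "2025-11-30T23:59:00Z",
    }
    response = requests.post(f"{BASE_URL}/dates", json=payload)
    assert response.status_code == 201
```

The second test costs a few more lines up front, but it is the version that doesn’t quietly rot as the live data changes underneath it.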
Maintenance of automated testing suites is often overlooked – people tend to treat these tests as something you write once and then depend on forever without another look. When a test fails after maintenance has been neglected, the time it takes to diagnose the problem is disproportionately large compared to the time it would have taken to keep the test (and its data) up to date in the first place.
Bugs Are More Likely to Be Discovered by Manual, Exploratory Testing
A point that Bach and Bolton make in their article is that computers can never replicate the behaviour of a human interacting with a piece of software. For one, computers have a limited ability to react to unexpected behaviour. They don’t “see” the screen the way a human does, and therefore they often cannot interact with it in the same way. As an example, I was recently writing an automated test that needed to pick an option from a drop-down menu in order to get a webpage into a state where I could check its layout. The problem was that the open drop-down menu obscured a part of the page I needed to check: the testing framework selected the option in such a way that the menu, for whatever reason, never disappeared again after the selection. This “eccentricity” of the testing framework was a blocker in testing the page, and it is something a human being would never encounter.
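Here is a rough Selenium-flavoured sketch of that situation. The page URL, element IDs, and the particular workaround (pressing Escape to dismiss the menu) are hypothetical; the point is that the tool needs an explicit extra step that a human would never even think about:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select

driver = webdriver.Chrome()
driver.get("https://example.test/dashboard")  # hypothetical page under test

# Selecting the option works, but in my case the open menu stayed on screen
# and covered the region of the page I wanted to check afterwards.
Select(driver.find_element(By.ID, "course-filter")).select_by_visible_text("MATH 101")

# Workaround: explicitly dismiss the menu (here, by sending Escape to the page)
# before checking the layout underneath it.
driver.find_element(By.TAG_NAME, "body").send_keys(Keys.ESCAPE)

# Only now can the previously obscured element be checked.
footer = driver.find_element(By.ID, "page-footer")
assert footer.is_displayed()

driver.quit()
```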
The biggest and most important bugs are only found when a tester is intelligently and deliberately exploring the software – not when a computer is performing programmed checks.
Some Caveats…
The points above are not fair as blanket statements. If they were always true of automated tests, there would be no reason to ever write them, and all testing would have to be performed manually. This is simply what I have found in my experience at my company, and something we are working on improving. Automated tests can be designed to account for the flakiness and instability I described above, and they can be written intelligently to support a tester in manual testing.
One of my favourite things about automated tests is that they handle the mundane and boring, but still necessary, checks. A framework I have been playing with, Galen, is designed to test the layout of a webpage across different browsers and resolutions (important for our goal of making our pages properly responsive). One of the developers on my team raised a concern that I liked the tool because it could replace my work rather than support it. That is a little true and a little not – I like that it can do the work of checking basic layouts across many browsers (cross-browser testing is the most tedious thing in the entire world) and thus free up my time to focus on more intelligent testing.
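Galen expresses these checks in its own spec files, but as a rough Python/Selenium analogue of the kind of responsive-layout check it automates (page URL and element IDs are again hypothetical), the idea is simply to repeat the same basic assertions at several viewport sizes so a human doesn’t have to:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

VIEWPORTS = [(1920, 1080), (1024, 768), (375, 667)]  # desktop, tablet, phone

driver = webdriver.Chrome()
driver.get("https://example.test/dashboard")  # hypothetical page under test

for width, height in VIEWPORTS:
    driver.set_window_size(width, height)

    header = driver.find_element(By.ID, "header")
    content = driver.find_element(By.ID, "content")

    # The header should span (roughly) the full viewport width at every size.
    assert abs(header.size["width"] - width) < 40, f"header too narrow at {width}x{height}"

    # The main content should sit below the header, never overlap it.
    assert content.location["y"] >= header.location["y"] + header.size["height"], \
        f"content overlaps header at {width}x{height}"

driver.quit()
```

Checks like these are exactly the sort of thing I am happy to hand off to a tool, because each one is trivial on its own and soul-crushing when multiplied across every browser and screen size we support.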
My department has a bit of a love affair going on right now with writing as many automated tests as possible. My goal going forward is to do this more intelligently, so as not to write tests that need more maintenance than they are worth.