Reinventing Performance Testing: Agile

I am looking forward to sharing my thoughts on ‘Reinventing Performance Testing’ at the imPACt performance and capacity conference by CMG held on November 7-10, 2016 in La Jolla, CA. I decided to publish a few parts here to see if anything triggers a discussion.

It will be published as separate posts:
Introduction (a short teaser)
Cloud
Agile (this post)
Continuous Integration
New Architectures
New Technologies

Agile development eliminates the main problem of traditional development: you need a working system before you can test it, so performance testing happened at the last moment. While it was always recommended to start performance testing earlier, there were usually rather few activities you could do before the system was ready. Now, with agile development, we get a major “shift left”, which indeed allows testing to start early.

In practice it is not that straightforward. Practical agile development is struggling with performance in general. Agile methods are oriented toward breaking projects into small tasks, which is quite difficult to do with performance (and many other non-functional requirements) – performance-related activities usually span the whole project.

There is no standard approach to specifying performance requirements in agile methods. Mostly it is suggested to present them either as user stories or as constraints. The difference between the user story and constraint approaches is not in the performance requirements per se, but in how they are addressed during the development process. The point of the constraint approach is that user stories should represent finite, manageable tasks, while performance-related activities can’t be handled that way because they usually span multiple components and iterations. Those who suggest using user stories address that concern in another way – for example, by separating the cost of initial compliance from the cost of ongoing compliance.

And practical agile development is struggling with performance testing in particular. Theoretically it should be rather straightforward: every iteration you have a working system and know exactly where you stand with the system’s performance. You shouldn’t have to wait until the end of a waterfall process to figure out where you are – on every iteration you can track your performance against requirements and see the progress (making adjustments for what is already implemented and what is not yet). Clearly this is supposed to make the whole performance engineering process easier and to solve the main problem of the traditional approach: the system has to be ready for performance testing, so testing happens very late in the process, when the cost of fixing the issues found is very high.

From the agile development side, the problem is that, unfortunately, it doesn’t always work this way in practice. That is why such notions as “hardening iterations” and “technical debt” get introduced. Although it is probably the same old problem: functionality gets priority over performance (which is somewhat understandable: you first need some functionality before you can talk about its performance). So performance-related activities slip toward the end of the project, and the chance to implement a proper performance engineering process built around performance requirements may be missed.

From the performance testing side, the problem is that performance engineering teams don’t scale well, at least not in their traditional form, even assuming that they are competent and effective. They work well in traditional corporate environments where they check products for performance before release, but they face challenges as soon as we start to expand the scope of performance engineering (early involvement, more products/configurations/scenarios, etc.). And agile projects, where we need to test the product every iteration or build, expose the problem through a sharply increased volume of work to do.

Just to avoid misunderstandings: I am a strong supporter of having performance teams, and I believe it is the best approach to building a performance culture. Performance is a special area, and performance specialists should have an opportunity to work together to grow professionally. The details of the organizational structure may vary, but a center of performance expertise (formal or informal) should exist. The only point here is that while this approach works fine in traditional environments, it needs major changes in organization, tools, and skills when the scope of performance engineering needs to be extended (as in the case of agile projects).

The remedies usually recommended are automation and making performance everyone’s job (full immersion). However, these have not yet developed into mature practices and will probably vary much more with context than the traditional approach does.

Automation here means not only using tools (in performance testing we almost always use tools), but automating the whole process, including setting up environments, running tests, and reporting and analyzing results. Historically, performance testing automation was almost non-existent (at least in traditional environments). Performance testing automation is much more difficult than, for example, functional testing automation. Setups are much more complicated. The list of possible issues is long. Results are complex (not just pass/fail). It is not easy to compare two result sets. So it is definitely much more difficult and may require more human intervention. And, of course, interface changes are a major challenge, especially when recording is used to create scripts, as it is difficult to predict whether product changes will break them.
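Just to illustrate, here is a minimal sketch of what such end-to-end automation could look like: set up an environment, run a load test, analyze the results, and report pass/fail. It assumes JMeter is available on the PATH; the deployment script, test plan name, and response-time goal are made up for the example, not a prescription.

```python
#!/usr/bin/env python3
"""Minimal sketch of end-to-end load test automation: set up the
environment, run the test, analyze the results, report pass/fail.
Assumptions: JMeter is on the PATH, deploy_test_env.sh is a
hypothetical deployment script, and results.jtl uses JMeter's
default CSV format with 'elapsed' and 'success' columns."""

import csv
import subprocess
import sys

MAX_AVG_MS = 500  # hypothetical response-time goal for this build


def run(cmd):
    print(">>", " ".join(cmd))
    subprocess.run(cmd, check=True)


def main():
    # 1. Set up the test environment (hypothetical script).
    run(["./deploy_test_env.sh"])

    # 2. Run the load test in non-GUI mode.
    run(["jmeter", "-n", "-t", "checkout.jmx", "-l", "results.jtl"])

    # 3. Analyze and report the results.
    times, errors = [], 0
    with open("results.jtl", newline="") as f:
        for row in csv.DictReader(f):
            times.append(int(row["elapsed"]))
            errors += row["success"].lower() != "true"

    avg = sum(times) / len(times) if times else 0
    print(f"samples={len(times)}  avg={avg:.0f} ms  errors={errors}")
    sys.exit(0 if times and errors == 0 and avg <= MAX_AVG_MS else 1)


if __name__ == "__main__":
    main()
```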

The cost of performance testing automation is high. You need to know the system well enough to make automation meaningful. Automation for a brand-new system doesn’t make much sense – the overhead is too high. So there was almost no automation in traditional environments (where testing happens at the end with a record/playback tool). When you test the system only once in a while, before the next major release, the chances of re-using your artifacts are low.

It is the opposite when the same system is tested again and again (as it should be in agile projects): it makes sense to invest in setting up automation. That rarely happened in traditional environments – even if you test each release, the releases are far apart and the differences between them prevent re-using the artifacts (especially with recorded scripts – APIs are usually more stable). So demand for automation was rather low and tool vendors didn’t pay much attention to it. Well, the situation is changing – we may see more automation-related features in load testing tools soon.

While automation will take on a significant role in the future, it addresses only one side of the challenge. The other side of the agile challenge is usually left unmentioned. The blessing of agile development, early testing, requires another mindset and another set of skills and tools. Performance testing of new systems is agile and exploratory in itself and can’t be replaced by automation (well, at least not in the foreseeable future). Automation will complement it, together with further involvement of development, offloading performance engineers from routine tasks that don’t require sophisticated research and analysis. But testing early – which brings the most benefit by identifying problems when the cost of fixing them is low – does require research and analysis; it is not a routine activity and can’t be easily formalized.

It is similar to functional testing, where both automated regression testing and exploratory testing are needed – with the difference that tools are used in performance testing in any case, and setting up continuous performance testing is much newer and more challenging.

The problem is that early performance testing requires a mentality change from a simplistic “record/playback” performance testing occurring late in the product life-cycle to a performance engineering approach starting early in the product life-cycle. You need to translate “business functions” performed by the end user into component/unit-level usage and end-user requirements into component/unit-level requirements. You need to go from the record/playback approach to utilizing programming skills to generate the workload and create stubs to isolate the component from other parts of the system. You need to go from “black box” performance testing to “gray box”, understanding the architecture of the system and how your load impacts it.
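As a rough sketch (not a recipe), here is what component-level, programmatic load generation with a stub might look like, using only the Python standard library. The component endpoint, the stub port, and the canned response are all hypothetical, and the component is assumed to be configured to call the stub instead of its real downstream dependency.

```python
"""Minimal sketch of component-level load generation with a stub,
as opposed to record/playback. Assumptions: the component under test
exposes a hypothetical endpoint at COMPONENT_URL and is configured to
call its downstream dependency at localhost:9090, where we run a stub
that returns a canned response. Only the standard library is used."""

import time
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from threading import Thread
from urllib.request import urlopen

COMPONENT_URL = "http://localhost:8080/api/price?item=42"  # hypothetical
CONCURRENCY = 20          # simulated users
REQUESTS_PER_USER = 50


class StubHandler(BaseHTTPRequestHandler):
    """Stands in for the real downstream service."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b'{"discount": 0.1}')  # canned response

    def log_message(self, *args):
        pass  # keep the load run quiet


def simulated_user(_):
    """One simulated user: a loop of component-level requests."""
    latencies = []
    for _ in range(REQUESTS_PER_USER):
        start = time.perf_counter()
        urlopen(COMPONENT_URL).read()
        latencies.append(time.perf_counter() - start)
    return latencies


if __name__ == "__main__":
    # Start the stub for the downstream dependency.
    stub = ThreadingHTTPServer(("localhost", 9090), StubHandler)
    Thread(target=stub.serve_forever, daemon=True).start()

    # Generate the workload against the component and summarize latencies.
    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        latencies = sorted(
            t for user in pool.map(simulated_user, range(CONCURRENCY))
            for t in user)

    print(f"median={latencies[len(latencies) // 2] * 1000:.1f} ms  "
          f"p95={latencies[int(len(latencies) * 0.95)] * 1000:.1f} ms")
```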

The concept of exploratory performance testing is still rather alien. But the notion of exploring is much more important for performance testing than for functional testing. The functionality of a system is usually more or less defined (whether it is well documented is a separate question), and testing boils down to validating that it works properly. In performance testing, you won’t have a clue how the system will behave until you try it. Having requirements – which in most cases are goals you want your system to meet – doesn’t help much here, because the actual system behavior may not even be close to them. It is more a performance engineering process (with tuning, optimization, troubleshooting, and fixing multi-user issues) that eventually brings the system to the proper state than just testing.

If we take the testing-approach dimension, the opposite of exploratory testing would be regression testing. We want to make sure that we have no regressions as we modify the product – and we want to do it quickly and, if possible, automatically. And as soon as we get to an iterative development process, where the product is changing all the time, we need to verify that there is no regression all the time. It is a very important part of the continuum, without which your testing won’t quite work: you will miss regressions again and again, going through the agony of tracking them down in real time. Automated regression testing becomes a must as soon as we get to iterative development, where we need to test each iteration.
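Here is a minimal sketch of what such an automated regression check could look like on every build; the JSON result format, file names, and 20% tolerance are assumptions for illustration only.

```python
"""Minimal sketch of an automated performance regression check that
could run on every build. Assumptions (for illustration only): each
run writes per-transaction 90th-percentile response times in ms to a
JSON file, e.g. {"login": 180, "search": 420}; baseline.json holds
accepted results from a previous build; anything more than 20% slower
is flagged as a regression."""

import json
import sys

TOLERANCE = 1.20  # allow up to 20% degradation before failing the build


def check(baseline_path="baseline.json", current_path="current.json"):
    with open(baseline_path) as f:
        baseline = json.load(f)
    with open(current_path) as f:
        current = json.load(f)

    regressions = []
    for name, base_ms in baseline.items():
        cur_ms = current.get(name)
        if cur_ms is None:
            print(f"WARNING: transaction '{name}' missing from current run")
        elif cur_ms > base_ms * TOLERANCE:
            regressions.append(name)
            print(f"REGRESSION: {name}: {base_ms} ms -> {cur_ms} ms")

    return not regressions


if __name__ == "__main__":
    sys.exit(0 if check() else 1)
```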

So we have a continuum from regression testing to exploratory testing, with traditional load testing being just a dot somewhere in the middle of that dimension. Which approach to use (or, more exactly, which combination of approaches to use) depends on the system. When the system is completely new, it will be mainly exploratory testing. If the system is well known and you need to test it again and again for each minor change, it will be regression testing, and this is where you can benefit from automation (complemented by exploratory testing of new functional areas, which are later added to the regression suite as their behavior becomes well understood).

If we see the continuum this way, the question of which kind of testing is better looks completely meaningless. You need to use the right combination of approaches for your system in order to achieve the best results. Seeing the whole testing continuum between regression and exploratory testing should help in understanding what should be done.
