Inverting the Testing Pyramid

As more and more companies move to the Cloud, they want their latest, greatest software features to be available to their users as soon as they are built. However, several issues block them from moving ahead.

One key issue is the massive amount of time it takes for someone to certify that a new feature is indeed working as expected, and to assure that the rest of the features continue to work. In spite of this long waiting cycle, we still cannot assure that our software will not have any issues. In fact, many times our assumptions about the user’s needs or behavior might themselves be wrong. This long testing cycle only helps us validate that our assumptions work as assumed.

How can we break out of this rut & get thin slices of our features in front of our users to validate our assumptions early?

Most software organizations today suffer from what I call the “Inverted Testing Pyramid” problem. They spend most of their time and effort manually checking software. Some invest in automation, but mostly in building slow, complex, fragile end-to-end GUI tests. Very little effort is spent on building a solid foundation of unit & acceptance tests.

This over-investment in end-to-end tests is a slippery slope. Once you start down this path, you end up investing ever more time & effort on testing, with diminishing returns.

They end up with the majority (80-90%) of their tests being end-to-end GUI tests. Some effort is spent on writing so-called “Integration tests” (typically 5-15%), leaving a shocking 1-5% of their tests as unit/micro tests.

Why is this a problem?

  • The base of the pyramid is constructed from end-to-end GUI tests, which are famous for their fragility and complexity. A small pixel change in the location of a UI component can result in test failure. GUI tests are also very time-sensitive, sometimes resulting in random failure (false-negative.) See the sketch after this list.
  • To make matters worse, most teams struggle to automate their end-to-end tests early on, which results in a huge amount of time spent on manual regression testing. It’s quite common to find test teams struggling to catch up with development. This lag causes many other hard development problems.
  • The number of end-to-end tests required to get good coverage is much higher, and each test more complex, than the equivalent combination of unit tests plus a few selected end-to-end tests. (BEWARE: Don’t be Seduced by Code Coverage Numbers)
  • Maintaining a large number of end-to-end tests is quite a nightmare for teams. Following are some core issues with end-to-end tests:
    • It requires deep domain knowledge and high technical skills to write quality end-to-end tests.
    • They take a lot of time to execute.
    • They are relatively resource intensive.
    • Testing negative paths in end-to-end tests is very difficult (or impossible) compared to lower level tests.
    • When an end-to-end test fails, we don’t get pin-pointed feedback about what went wrong.
    • They are more tightly coupled with the environment and have external dependencies, hence fragile. Slight changes to the environment can cause the tests to fail (false-negative.)
    • From a refactoring point of view, they don’t give developers the same sense of confidence that unit tests do.
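
To make the timing-sensitivity point concrete, here is a minimal sketch using Selenium’s Python bindings (the URL, element ID and expected value are all hypothetical):

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Chrome()
    try:
        driver.get("https://example.com/orders")  # hypothetical app URL

        # Naive lookup -- fails randomly if the page renders slowly:
        #   driver.find_element(By.ID, "order-total")

        # Robust lookup -- every interaction needs an explicit wait, which
        # slows the suite down and still depends on the environment:
        total = WebDriverWait(driver, timeout=10).until(
            EC.presence_of_element_located((By.ID, "order-total"))
        )
        assert total.text == "$42.00", f"unexpected total: {total.text}"
    finally:
        driver.quit()

Even with the explicit wait, the test remains coupled to a running browser, a deployed application and seeded test data.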

Again, don’t get me wrong. I’m not suggesting end-to-end integration tests are a scam. I certainly think they have a place and time.

Imagine an automobile company building an automobile without testing/checking the nuts and bolts, all the way up to the engine, transmission, brakes, etc., and then just assembling the whole thing somehow and asking you to drive it. Would you test-drive that automobile? Yet you will see many software companies using this approach to building software.

What I propose and help many organizations achieve is the right balance of end-to-end tests, acceptance tests and unit tests. I call this “Inverting the Testing Pyramid.” [Inspired by Jonathan Wilson’s book called Inverting The Pyramid: The History Of Football Tactics].

[Figure: Inverting the Testing Pyramid]

In a later blog post I will highlight various tactics used to invert the pyramid.

Update: I recently came across Alister Scott’s blog post Introducing the software testing ice-cream cone (anti-pattern). I strongly suggest you read it.

  • Jared

    This is already described elsewhere: http://www.testing-software.org/Testing/index.html (Jason Huggins) and by Mike Cohn at http://blog.mountaingoatsoftware.com/the-forgotten-layer-of-the-test-automation-pyramid.

    I think there are a number of straw-men involved in the automation pyramid as well, not least of which is the vague definitions that teams (agile and otherwise) have for things like ‘GUI tests’, ‘acceptance/integration’ tests and ‘unit tests’.

  • James Grenning

    I agree with your assessment that unit tests are the base. The only way to have thoroughly tested code is to have thorough unit tests. Simple math can illustrate what you are saying with the pyramid. Imagine a system with three modules that interact with each other. If each can be completely tested with 10 tests, you will need 30 tests, plus a handful more to check the interaction scenarios.

    If instead you take an end-to-end test strategy and want to thoroughly test the system, you need on the order of 1000 (10 x 10 x 10) tests for all possible combinations. This is impractical. An end-to-end-only test strategy can’t work.

    In the context of the pyramid, we get full testing with 30+ unit tests, and some sample of tests at the AT and UI levels. At the higher levels we’re checking connections and representative scenarios, never thinking a full test is possible.

    James

  • Tom Eble

    Agree overall – end-to-end automated tests are too slow and lack the ability to track back to the bad code. There are definitely some issues with fragility, but these are mostly mitigated by a good automator using sound development practices. Most fragility I encounter is environmental. However, pixel-change fragility is not an issue. Automation tools have not cared about pixels in more than a decade. Now, wholesale changes to the UI workflow will break the tests, but again, this can be mitigated by building the test suite where such changes are localized.

  • Bill44077

    Hi,
    Good info! I noticed that when you flipped the pyramid, you changed the name of Integration Tests to Acceptance Tests. Are you saying that they are one and the same? Or were you trying to make another point that I missed? The most common definition that I’ve seen for Acceptance Tests is “end-to-end,” which I would think would be the same as GUI Tests, no?
    I really enjoy your blogs – keep up the great work!
    regards!

    • Naresh Jain

      Thanks Bill for the kind words.

      Good catch with respect to the Integration Tests vs. Acceptance Tests. Acceptance Tests are not the same as Integration Tests.

      Integration tests are about 2 nodes/points successfully talking to each other.

      If I had a service which could add 2 numbers, my integration test would call this service with 2 numbers (a and b) and check if I got back a number (z). It would *not* check if z = a + b, i.e., whether the result was really added correctly. That’s beyond the scope of the integration test.

      I would have many unit tests on the service side to make sure the calculator’s functionality was implemented correctly.
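
      To make this concrete, here’s a minimal sketch in Python (the /add endpoint on localhost and the pytest-style tests are hypothetical illustrations):

          import requests

          def add(a, b):
              # The unit under test: the calculator's actual logic.
              return a + b

          def test_unit_addition_is_correct():
              # Unit test: asserts behaviour -- the result really is a + b.
              assert add(2, 3) == 5

          def test_integration_service_responds():
              # Integration test: only asserts that the two nodes can talk
              # and a number comes back -- not that the number is correct.
              r = requests.get("http://localhost:8080/add",
                               params={"a": 2, "b": 3}, timeout=5)
              assert r.status_code == 200
              assert isinstance(r.json(), (int, float))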

      Integration and unit tests are technology/implementation facing, while Acceptance Tests (ATs) are *mostly* end-to-end; they are business facing and drive development.

      But a lot depends on the context.

      For example: I’ve built many systems where our ATs went one layer below the view layer (at the controller or presenter). That worked better for driving the business rules.

      I’ve worked on systems where we did not use ATs at all.

      And I’ve also worked on systems where we had ATs at many different levels. At the module level, the ATs verified/drove the functionality of the module. If this module had dependencies on other modules, we would fake them out.

      At the Application level, the ATs would have all the modules talking to each other, but the dependencies on other applications were faked out.

      At an overall system level, every application used by an organization was completely integrated.

      (Be careful not to quote me out of context. These modules that I’m talking about are end-to-end modules (containing all the architectural layers), and each module was roughly 3-8 million lines of code.)

  • Sebastien Biffi

    “GUI tests are also very time-sensitive, sometimes resulting in random failure (false-negative.)”

    I think you are speaking about a false-positive test, no? (see http://en.wikipedia.org/wiki/Type_I_and_type_II_errors).

    By the way, it’s a nice article you wrote there ;o)

    • Naresh Jain

      Thanks Sebastien. 

      IME end-to-end tests (esp. GUI based tests) give both false-negative and false-positive (stuff is broken but tests don’t catch it) errors.

  • sreeram_ng

    Hi Naresh, very interesting thought. How can an independent testing organization invert the pyramid and write a higher % of low-level test cases, as opposed to the current practice of writing requirement-driven integration or end-to-end test cases? Is inverting the pyramid possible only when the same organization does both development and testing?

    • Naresh Jain

      Hi Sreeram, I don’t have much experience working with independent testing organisations. However, with my little experience, I think it’s extremely hard (if not impossible) to invert the test pyramid if the dev and test members are not working in tight collaboration with a common goal. Having an independent test team kind of gets in the way of this nature of tight collaboration.

  • manizzzz

    Naresh, the blog is really great. But one thing that confuses me a bit is how you are segregating Business Acceptance testing, integration testing and workflow testing. I feel all three are tightly coupled with each other. This leads to writing fragile end-to-end tests.

    • Naresh Jain

      Thanks. The main thing to keep in mind when segregating tests: you want a single reason for each test to fail (the Single Responsibility Principle for tests). And when they fail, they should give you pin-pointed, unambiguous feedback.

      If we take the unit tests and keep moving up the pyramid, we are ensuring that we are only adding one variation (dimension of change) at a time. If we add multiple variations at the same time, then when the test fails, we won’t know which of those variables caused an issue.

      Based on this core principle, I use the following rule of thumb to segregate tests (a rough code sketch follows the list):

      1. Unit (Isolation) Test – Validates if a class/function (in isolation) is implemented correctly. It is technology/implementation focused.

      2. Business/Domain Logic Acceptance Test – Validates a core business rule. This might span multiple domain objects/functions, hence it is bigger in scope than a unit test. It is also more business/user facing, in contrast to the unit test, which is more technology/implementation focused. At this level, we stub/mock out all external & internal dependencies; our focus is to validate the business/domain logic. For example, we don’t worry about authentication, performance or scalability here.

      3. Integration Test – Has a single responsibility: ensure our system can talk to a sub-system correctly. It is mostly interested in validating whether the sub-system has been configured correctly. We don’t assert any behaviour or state here. For example: we invoke a REST service and ensure we get a 200 HTTP status; we send a malformed XML input and get the expected failure message from the downstream system; or we ensure that we can invoke a stored proc on a DB without it throwing an exception.

      4. Workflow Test – Forget the UI; think about the main steps a user has to perform to achieve a goal/task. Take those steps (and maybe a few main variations around those steps) and capture them as your workflow test/spec. These tests are supposed to tell you the story of a user journey. In the workflow test, you stub/mock out all external systems like the email gateway, payment gateway, inventory system, etc. You are interested in whether the user can achieve their goal/task by following these steps, assuming the external systems work as expected. Your integration tests can ensure they do.

      5. End-to-End Flow Test – For a given scenario, we pick a couple of workflow tests, remove the stubs/mocks, and ensure we can actually go through the whole flow integrated with external systems. Those are your end-to-end flow tests.
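
      Here’s a rough sketch of levels 2-5 in Python (pytest-style; the Store and PaymentGateway names are purely illustrative):

          class PaymentGateway:
              # Stand-in for a client that talks to a real external system.
              def charge(self, amount):
                  return True  # imagine a network call here

          class FakePaymentGateway:
              # Stub used everywhere below the end-to-end level.
              def __init__(self):
                  self.charged = []

              def charge(self, amount):
                  self.charged.append(amount)
                  return True

          class Store:
              def __init__(self, gateway):
                  self.gateway = gateway

              def checkout(self, cart):
                  total = sum(cart)                  # domain logic
                  return self.gateway.charge(total)  # external dependency

          def test_business_logic():
              # Level 2: validates the business rule (totalling), with the
              # external dependency faked out.
              store = Store(FakePaymentGateway())
              store.checkout([10, 5])
              assert store.gateway.charged == [15]

          def test_integration():
              # Level 3: can we reach the gateway at all? No behaviour or
              # state asserted beyond "the call succeeds".
              assert PaymentGateway().charge(1) is True

          def test_workflow():
              # Level 4: the user's main steps, external systems stubbed out.
              store = Store(FakePaymentGateway())
              assert store.checkout([20]) is True

          # Level 5 (end-to-end flow): the same workflow with the real
          # PaymentGateway and no stubs, run sparingly against a live system.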

      Hope this helps. BTW the slides have quite a bit of detail on this, including the tools to be used and so on.

