Friday, July 28, 2017

Why DevOps implies microservices

The microservice design approach pertains to architecture, whereas DevOps pertains to how you build, test, and deliver—these should therefore be separate considerations, right?

No. Back in the 1980s I worked in the microchip design and testing field, and it was essential—even back then—to design chips in a way that made them testable: "design-for-testability" was—and is to this day—a major consideration for design. In fact, a good percentage of a chip's circuitry was there only to enable the chip to be tested.

Designing for testability is not new, but designing software for testability is even more important today than it was only a decade ago. Today, if you want to "shift left" your integration testing, you need to be able to create small-footprint, transient integration test environments. That means you are deploying your apps again and again, and if your apps are big and monolithic, then not only will each deployment take a long time, but the apps will also have a large footprint, which translates into money.

Consider, for example, that one of your application components is a large SOA-based system that provides an "information layer", and that this SOA layer can only be deployed as a single unit (i.e., all-or-nothing). If one of your teams is building an app that uses that SOA layer, then in order for the team to create an integration test environment on demand, they will have to dynamically deploy the entire SOA-based system into their environment.

Why do they need to do this to test? Because they want to test changes to their app without disturbing anyone else. That's why they need a test environment of their own, although it only needs to be their own for the duration of their test. They can't use a shared test instance of the SOA layer, because the SOA layer contains databases and is therefore stateful, so the tests change the SOA layer. (Even worse, the features being tested might even include code changes to the SOA layer.) Any change to the SOA layer would affect other teams that are using that SOA layer for their testing, and that is why a test environment must be exclusive to the test agent: it cannot be shared, at least for the duration of the test run. In other words, a test environment needs to be isolated.

Many DevOps teams achieve isolation by spinning up an environment on demand and deploying all of the various application components to that environment, running the tests, and then destroying the environment. Thus, if one or more of the components is a large, monolithic system, it is difficult and expensive to use a dynamic environment testing strategy. This is a very significant handicap.

Microservices are small, independently deployable components, and so when one needs to perform an integration test that involves microservices, one can select only those needed for the test, deploy them, run the tests, and then destroy them. Of course, one must consider dependencies among the microservices: that is part of the test planning that goes into the development of a feature or story. Essentially, one must design what the "test bench" should be for the system-under-test, i.e., what other components are needed, besides the component that has been modified and is being tested, to enable the test to be performed.
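One way to think about designing the test bench is as a walk over the dependency graph: start from the modified service and collect everything it needs, directly or indirectly. The sketch below assumes the dependencies are known and written down as a simple mapping; the service names and the `select_test_bench` function are made up for illustration.

```python
def select_test_bench(changed, dependencies):
    """Return the changed service plus everything it transitively depends on."""
    needed = set()
    stack = [changed]
    while stack:
        service = stack.pop()
        if service in needed:
            continue  # already visited; also protects against cycles
        needed.add(service)
        stack.extend(dependencies.get(service, ()))
    return sorted(needed)


# Hypothetical dependency map: each service lists the services it calls.
DEPENDENCIES = {
    "orders": ["customers", "inventory"],
    "inventory": ["catalog"],
    "customers": [],
    "catalog": [],
    "billing": ["customers"],
    "reporting": ["orders", "billing"],
}

bench = select_test_bench("orders", DEPENDENCIES)
```

Here a change to `orders` needs only four of the six services deployed; `billing` and `reporting` stay out of the environment. With a monolith, the equivalent of this selection is not possible, because the whole system is the smallest deployable unit.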

The small footprint and independent deployability of microservices are therefore a major enabler for shift-left integration testing. If one has monolithic components, then shifting integration testing left, to the team (or to the individual developer), is very difficult, and one usually has to fall back to a shared integration test process in which all components are deployed at regular intervals into a shared environment and tests are run there. In such a process, there are usually a lot of failed tests: the test run does not stay "green", and so determining the cause of failures is difficult. Also, in that approach, it is necessary to use "feature toggles" to turn off features that are not yet complete, since incompletely integrated features will cause tests that used to pass to fail, putting the features those tests cover in doubt. Using feature toggles is complicated, and it is a messy situation. That is why DevOps teams try to "shift left" and perform early integration testing before they make a feature visible to the downstream test pipeline; but to do that, your components need to be small enough to deploy frequently into local integration test environments.