Channel: Tom's Labs » Continuous Improvement

Pretested commits – why does it matter to us?

The problem

Our CI was frequently red, and that creates work. Over a two-week period we measured that 54% of the builds failed. More than half of those failures followed a previous failure. By a conservative estimate, that means at least 25% of the commits were made while the CI was already red.
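The estimate can be checked with a few lines of arithmetic. This is only a sketch of the reasoning, assuming one build per commit (the post does not say so explicitly) and taking "more than half" at its floor of 50%; the function name is made up for illustration.

```python
def lower_bound_commits_on_red(failure_rate, share_of_failures_after_failure):
    """Lower bound on the share of builds that started while the CI was red.

    Only failed builds that followed another failure are counted. Successful
    builds that fixed a red CI were also started on red, so the real share
    is higher than this figure.
    """
    return failure_rate * share_of_failures_after_failure


# 54% of builds failed (measured); at least 50% of those followed a failure.
estimate = lower_bound_commits_on_red(0.54, 0.50)
print(f"at least {estimate:.0%} of builds started on a red CI")
```

The product comes out slightly above the 25% quoted, which fits the description of a low estimate.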

OK, but where exactly is the problem?

  • It’s more difficult to fix the problems when they’ve been there for a while
  • As time passes it becomes less likely that the guy who broke the build will be the guy to fix it. And it’s definitely more difficult to fix other people’s bugs.
  • The fact that I commit while the CI is red means that I won’t get the necessary feedback. I can easily introduce a new error that doesn’t get caught until the first one is fixed.
  • When a developer updates his code he also gets other people’s bugs, so when he discovers a problem he can’t be sure it came from his own modifications.
  • It is a reason to maintain extra branches (one for development, one for patches to production)

So, indeed it creates work.

Analysis

So let’s just establish that everyone always runs the tests before publishing their work, right?
- Tried it, doesn’t work.
Why not?
- Because it’s always tempting not to.
Why?
- Because it’s difficult to be sure whether it’s your bug or someone else’s.
Why?
- Because other developers commit untested code.
Oops! Vicious circle.
But that’s not the only reason, what’s more?
- It takes time to run the tests and it blocks the development environment.
Nasty. Any more?
- It takes discipline to run all the tests before every “publication” of my code and discipline is a limited resource. Someone will run out of it and then it will get a lot more tempting for the others to skip it.
Oops! Vicious circle again.

So establishing a run-your-tests-locally-before-committing-or-else-shame-on-you culture isn’t going to work for very long.

Solution

For about two years a workflow variously called pretested commit, delayed commit, private build or stable build has been emerging. It’s even a feature of the TeamCity CI server. What’s so cool about it is that it’s not a countermeasure to the problem “developers commit untested code”; it eliminates the problem altogether by removing the root cause, “running tests blocks developers” (the fancy Japanese term is poka-yoke). The basic workflow is that all tests are run before the commit to the shared development branch (let’s call it the stable branch) actually happens. This ensures that the latest version of the stable branch always contains code that passes the tests. It also means that whenever you update your code you don’t get bugs from the others. In fact, if there are any bugs, they’re all yours!

The way we chose to implement the commit barrier was to first migrate the whole project from SVN to Git and then use the Jenkins Git plugin to configure the following workflow.

Say a developer wants to start a new feature.

  • He starts by checking out a new feature branch.
  • He commits some modifications locally (since we’re using Git).
  • He does some more work and commits again.
  • Then he pushes his branch to the team repository under refs/heads/merge-requests/<my-name>/<branch-name>.
  • Jenkins takes the branch and merges it with the stable branch.
  • If the code doesn’t merge cleanly, nothing is done to the stable branch and the committer is notified by mail.
  • If any tests fail, then again nothing is done to the stable branch and the committer is notified by mail.
  • If the build succeeds, the stable branch is updated* with this latest stable version.
    (* : in Git, branches are like post-its that you can move around, so “updating the stable branch” just means “move the post-it named stable to the commit that was just tested”)
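The server side of these steps can be sketched in a few lines. This is not the Jenkins Git plugin configuration we used, just a minimal illustration of the same gate under stated assumptions: the remote is called origin, the shared branch is called stable, and the tests run with `make test` (a hypothetical command; substitute your own). The function and constant names are made up, and the mail notification is left out.

```python
import subprocess

STABLE = "stable"                # assumed name of the shared branch
TEST_COMMAND = ["make", "test"]  # hypothetical; use the project's real test entry point


def run(cmd):
    """Run a command and return True when it exits with status 0."""
    return subprocess.run(cmd).returncode == 0


def gate(candidate_ref, runner=run):
    """Try to promote candidate_ref (e.g. "origin/merge-requests/<my-name>/<branch-name>").

    Returns "fetch-failed", "merge-conflict", "tests-failed",
    "push-rejected" or "promoted".
    """
    if not runner(["git", "fetch", "origin"]):
        return "fetch-failed"
    # Work on a detached HEAD so that no branch moves before the tests pass.
    if not runner(["git", "checkout", "--detach", f"origin/{STABLE}"]):
        return "fetch-failed"
    if not runner(["git", "merge", "--no-edit", candidate_ref]):
        runner(["git", "merge", "--abort"])  # leave the work tree clean
        return "merge-conflict"
    if not runner(TEST_COMMAND):
        return "tests-failed"
    # Move the post-it: a plain (non-forced) push is only accepted
    # when it fast-forwards the stable branch.
    if not runner(["git", "push", "origin", f"HEAD:refs/heads/{STABLE}"]):
        return "push-rejected"
    return "promoted"
```

On the developer side, the push from the list above would be something like `git push origin HEAD:refs/heads/merge-requests/<my-name>/<branch-name>`. The `runner` parameter exists only so the gate logic can be exercised without a real repository.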

I must add that this is what fit our reality best; there are many ways of doing it. For instance, if you have a really fast build that runs in isolation, you might want something simpler.

So we get an always stable branch. That is, with respect to our automated tests. No more useless work created by untested commits.

Of course there are some things that don’t get into this pretested commit build. We have some long-running tests that are not convenient to include. Our build time, for a successful build, is currently 30 minutes, which is already long. The build monopolizes a shared resource, so we can’t run two of them in parallel. If we extended the test suite to cover later stages, such as deployment and smoke tests on pre-production platforms, each build would take longer and so would the queue of merge requests waiting to be built.
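To get a feel for what a serialized 30-minute build means for the queue, here is a rough back-of-the-envelope sketch. It assumes every build takes the full 30 minutes (failed builds may stop earlier) and an 8-hour working day, which is an assumption and not a measurement; the function names are illustrative.

```python
BUILD_MINUTES = 30  # duration of a successful build, as quoted above


def wait_minutes(position_in_queue, build_minutes=BUILD_MINUTES):
    """Minutes until the merge request at this 1-based queue position
    has finished building, when builds run strictly one at a time."""
    return position_in_queue * build_minutes


def max_merges_per_day(workday_hours=8, build_minutes=BUILD_MINUTES):
    """Upper bound on merge requests that can be built in one working day."""
    return (workday_hours * 60) // build_minutes


print(wait_minutes(3))                       # third in the queue
print(max_merges_per_day())                  # with the current build time
print(max_merges_per_day(build_minutes=10))  # with a hypothetical faster build
```

Under these assumptions, the build time directly caps how many merge requests the whole group can get through in a day.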

Another interesting fact is that the faster the test suite is, the more tests we can move into the pre-commit build and the less costly any mistake is. What a clear connection between a fast test suite and productivity!

Results & learnings

The build is still red sometimes, but that just means that someone monopolized the CI for a couple of minutes. So it’s not much of an issue.

A change like this, involving 3 teams, 2 projects and 15+ developers is not so easy to get going. It takes enough analysis and measurements to have a majority agree that there is a problem worth solving and that there are good enough solutions to it. For instance the fact that the CI is red is not a problem in itself. The real problem is the extra work that is created by the mechanisms described in the beginning of this post.

Still, sometimes the commit barrier can be annoying, since it’s now more difficult to supply a patch: it actually has to pass the tests! No shit. This is of course good for the majority of modifications and for the developers as a group. But in some edge cases (like an i18n fix) it can be an annoyance to the individual developer.

The actual setup is surprisingly easy with Git. We used the Jenkins Git plugin but it’d take a couple of lines of shell to do almost the same.

Probably the biggest difficulty was modifying the deployment scripts in a safe manner. The modifications were done by a developer, while the testing and the switch-over were done by ops. Don’t split such a task between two groups unless you have utterly fluid communication between them.

Next steps

Theoretically we’re now able to always deploy from the stable branch. Even patches can go in here, because the development branch is always in a fit state, so there is no more need for production branches. In practice we’re not quite ready, but it is our next reachable step.

We also need to work on reducing the build time. As the application grows we will add tests, so if we don’t speed them up our build time will grow.
