Saturday, June 10, 2017

CI Done Right

Broken.

Recently, some people ran into serious issues with a broken setuptools release [1, 2, 3]. One particular complaint was about CI systems breaking just because of this one central package. Replying to those reactions, I had a conversation on Twitter about certain design decisions in continuous integration [4, 5].

In this post, I want to assemble part of my personal experience from building those systems over the last couple of years. The result should be a cozy little guideline for those setting up their CI/CD systems within a corporate environment.

So, let's start with those rough steps:
  1. Define what is to be tested.
  2. Define what will happen in case of success and failure of those tests.
Neither of those steps is easy, and setting them up overnight won't work. CI systems differ from environment to environment. As usual, it depends. That said, let's dig into the details.

What is to be tested?

We can break down this question into those subpoints:
  1. Define the build environment, including
    1. system binaries
    2. system configs
    3. directory structures
    4. running services
  2. Define the test environment, including
    1. your source-code
    2. your packages
    3. 3rd-party packages
    4. non-source-code data
Define this stuff. Don't even start without having put some thought into it. Usually, we want to test our code and its integration with 3rd-party code, but not the 3rd-party code itself.
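
To make that concrete, here is a minimal sketch of how a build environment definition could be checked at the start of every CI run. Everything in it - binary names, directories, service ports - is a made-up example, not a prescription.

    # Hedged sketch: declare the build environment explicitly and fail fast
    # if the CI worker doesn't match it. All names below are examples only.
    import os
    import shutil
    import socket

    REQUIRED_BINARIES = ["git", "python3"]               # system binaries
    REQUIRED_DIRECTORIES = ["/srv/build", "/srv/cache"]   # directory structures
    REQUIRED_SERVICES = [("localhost", 5432)]             # running services (host, port)

    def check_environment():
        """Return a list of human-readable problems; empty means the worker is usable."""
        problems = []
        for binary in REQUIRED_BINARIES:
            if shutil.which(binary) is None:
                problems.append(f"missing binary: {binary}")
        for directory in REQUIRED_DIRECTORIES:
            if not os.path.isdir(directory):
                problems.append(f"missing directory: {directory}")
        for host, port in REQUIRED_SERVICES:
            try:
                with socket.create_connection((host, port), timeout=2):
                    pass
            except OSError:
                problems.append(f"service not reachable: {host}:{port}")
        return problems

    if __name__ == "__main__":
        issues = check_environment()
        if issues:
            raise SystemExit("broken build environment:\n" + "\n".join(issues))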

And there is more to it. Software usually comes in the form of releases. In practice, release A does almost the same things as release B. That's why we have a constant package name with changing versions attached to it.

However, on an abstract level, each release of a package is actually a different package in itself. There is a diff, isn't there? So, we need to bother with package releases before implementing our CI system, which basically boils down to:

Dependency Update Cadence

You guessed it - we need to define an update cadence for everything already defined above (usually the dependencies of your code): system binaries, 3rd-party packages, your packages, etc.
  • How often do you update the OS environment?
  • How often do you update 3rd-party dependencies? Which ones?
  • How often do you update your packages?
  • Manually?
  • On a regular basis like monthly?
  • Do you really need to be bleeding edge? Or will a more stable version do?
Ever updated one of your important frameworks (e.g. Django)? Don't tell me you could fix the incompatibilities in a few million lines of legacy source code within a day - all of them.

Some packages require bleeding-edge versions of other dependencies (like virtualenv installing the latest setuptools). Using a private package server and some cronjobs is one way to answer those questions in a sensible and enterprise-friendly way.
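
What such a cronjob could do, for example, is compare your pinned versions against the index and report drift instead of updating blindly. The sketch below uses the public JSON API of pypi.org; a private package server would simply expose its own URL. The package names and pins are placeholders.

    # Hedged sketch: report newer releases instead of pulling them in automatically.
    import json
    import urllib.request

    PINNED = {"setuptools": "35.0.2", "requests": "2.18.1"}   # example pins
    INDEX_URL = "https://pypi.org/pypi/{name}/json"           # or your private index

    def latest_version(name):
        with urllib.request.urlopen(INDEX_URL.format(name=name), timeout=10) as response:
            return json.load(response)["info"]["version"]

    for name, pinned in PINNED.items():
        latest = latest_version(name)
        if latest != pinned:
            # In a real setup this would open a ticket, send a mail or create an
            # update branch - anything but silently changing the worker.
            print(f"{name}: pinned {pinned}, index offers {latest}")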

Pinning dependencies is another way. Use it to define exact versions, and define a (preferably automated) process for updating those pins according to your needs.
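
As a hedged example of what enforcing those pins can mean in practice, the following sketch compares a requirements-style pin file against what is actually installed on the CI worker before the tests run. It uses pkg_resources (which ships with setuptools) and only understands strict name==version lines.

    # Hedged sketch: refuse to run tests if the installed packages drifted from the pins.
    import pkg_resources

    def read_pins(path="requirements.txt"):
        """Collect strict 'name==version' pins; everything else is ignored."""
        pins = {}
        with open(path) as handle:
            for line in handle:
                line = line.split("#", 1)[0].strip()
                if "==" in line:
                    name, _, version = line.partition("==")
                    pins[name.strip()] = version.strip()
        return pins

    mismatches = []
    for name, pinned in read_pins().items():
        try:
            installed = pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            installed = None
        if installed != pinned:
            mismatches.append(f"{name}: pinned {pinned}, installed {installed}")

    if mismatches:
        raise SystemExit("environment drifted from the pins:\n" + "\n".join(mismatches))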

This will give you a proper test environment, where you can rely on what surrounds your code while it is being tested. Test results aren't meaningful otherwise.
Accounting for dependency versions also enables you to change their update cadence whenever the need arises.
As it seemed, some of the people who were using the bleeding-edge version of setuptools had no way to change their setup quickly enough. So, they had to rely on how fast the team around setuptools could fix it. That is unacceptable in corporate software engineering.

What happens in case of test success and test failure?

As mentioned at the beginning, every CI/CD setup is different and it should serve its creators' needs. So, here's a collection of items people might want to trigger after a successful test run (YMMV, but usually that means zero failures), followed by a short sketch of how such a chain could look:
  • merge code
  • create a release aka version tagging
  • build a package of this release
  • trigger jobs for dependent packages
  • update/deploy QA (re-running the tests there again?) and production systems with these releases
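
Here is that sketch for a Python package. The git commands and the setup.py build step are placeholders for whatever your pipeline actually uses; triggering dependent jobs and deployments is left out because it is entirely specific to your CI/CD server.

    # Hedged sketch: the steps after a green test run. All commands are examples.
    import subprocess

    def on_success(version):
        # create a release aka version tagging
        subprocess.run(["git", "tag", "-a", f"v{version}", "-m", f"release {version}"], check=True)
        subprocess.run(["git", "push", "origin", f"v{version}"], check=True)
        # build a package of this release (sdist/wheel in the Python world)
        subprocess.run(["python", "setup.py", "sdist", "bdist_wheel"], check=True)
        # triggering jobs for dependent packages and updating QA/production
        # would follow here, via whatever API your CI/CD server offers

    on_success("1.2.3")   # example version, normally derived from your release process
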
Here we go with another list for the case of test failures (usually that means at least one failing test); again, a sketch follows the list:
  • notify developers
  • re-run tests under certain circumstances
  • do nothing ;-)
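
And a small, hedged sketch for the failure side: re-run the suite a limited number of times (only sensible for known-flaky suites) and notify people if it stays red. The test command and the notification are stand-ins for your own tooling.

    import subprocess
    import sys

    MAX_RUNS = 2                                # >1 only makes sense for flaky suites
    TEST_COMMAND = ["python", "-m", "pytest"]   # placeholder for your real test entry point

    def notify(message):
        # stand-in for a real notification: mail, chat webhook, ticket system, ...
        print(message, file=sys.stderr)

    def run_tests_with_retry():
        for attempt in range(1, MAX_RUNS + 1):
            result = subprocess.run(TEST_COMMAND)
            if result.returncode == 0:
                return True
            notify(f"test run {attempt}/{MAX_RUNS} failed (exit code {result.returncode})")
        return False

    if not run_tests_with_retry():
        notify("tests still failing after retries - developers, please have a look")
        raise SystemExit(1)
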
For the sake of completeness: all of these items should be triggered and executed automatically, with no supervision required except in case of errors in the CI/CD machinery itself. That's a design aspect you need to care about deeply.

Considering all those points should help you build CI/CD systems while minimizing wasted enterprise resources (aka your time) and increasing acceptance within your team.

Happy coding,
Sven

[1] https://www.reddit.com/r/Python/comments/6elcaa/psa_setuptools_broken_release_36_dont_use_it/
[2] https://github.com/pypa/setuptools/issues/1042
[3] https://github.com/pypa/setuptools/pull/1043

[4] https://twitter.com/kunsv/status/870399225883836417
[5] https://twitter.com/lucaswiman/status/870682012675211265
