Lesson 10

Continuous Integration

Ensure quality when frequently merging new code

In the following two lessons we will be setting up Continuous Integration and Continuous Deployment (CI/CD). You don't have to set these processes up for your project, but they can be hugely beneficial. We are going to start by just focusing on Continuous Integration.

Continuous Integration (CI) is one of those terms that people have opinions about. This lesson is not too concerned with providing the most accurate or pure definition and implementation of what a Continuous Integration process should look like. We will just focus on the general idea and how to get value out of it using GitHub Actions.

A Brief Introduction to Continuous Integration

The main idea behind Continuous Integration is continuously integrating code into a shared repository. As with most software management terms, it's a bit vague and has some room for interpretation. If you're interested in the origin of Continuous Integration, the term itself is usually credited to Grady Booch, and I believe the practice as most people know it can be traced back to Kent Beck and the creation of the Extreme Programming methodology. Let's see if we can get a sense of the spirit of what it is (at least in the sense of how most people use the concept today, not necessarily in line with its original definition).

An example of what is NOT Continuous Integration:

Various chunks of code and features live off in their own branches somewhere, or on a developer's local machine, for long periods of time. Once fully completed, these changes are merged back in with the rest of the code in a shared repository (potentially causing merge conflicts and integration issues with other new code).

An example of what IS Continuous Integration:

Small chunks of code are worked on and frequently merged back in with the rest of the code in a shared repository. Since code is pushed frequently, merge conflicts are less likely, integration issues are easier to deal with, and the barrier to deploying the software is lower.

The core idea is that we are merging back into the main branch of the repository as often as possible.

Side note: what really constitutes CI?

I mentioned this is one of those things people have opinions about, and that's fine. If the goal is to find the best possible way to deliver high quality software, that's great... but people do have a tendency to squabble over definitions in ways that don't actually result in better outcomes for themselves or their team.

We will be continuing with the example we used in the previous lesson, where we discussed using a task branching or branch per issue approach to managing our project.

Some people would not consider this to truly be Continuous Integration since we are creating branches, whereas a purer definition would perhaps require that all code is always committed directly to the main branch.

Personally, I find the task branching approach more practical, and by design these branches are very short lived, so they will soon be integrated back into the main branch. In my view, it still captures the spirit of Continuous Integration and delivers most of the benefits.

This is what works for me, but just keep in mind that the goal is to find what works best for you or your team. Be willing to experiment and see what you like best, but don't get too caught up trying to meet definitions just for the sake of doing it the right way.

Merging Frequently into Main: A Recipe for Disaster?

But... your main branch is usually where your production releases are pulled from. We can't have developers committing code willy-nilly and messing up our production builds!

This is where CI Builds come into play. A core part of making this process feasible is having well-defined automated tests. Our tests should ensure that the code is working as expected, and warn us when it's not. This creates the level of confidence required to frequently merge code into the main branch.

We want to reduce the friction involved in getting code into the main branch. If we rely on some kind of manual testing process to gain any degree of confidence that the code works, that process is going to add a lot of friction to every merge.

The process for adding a new feature or bug fix might look like this (we will assume a TDD approach is being used, but other testing strategies are fine too):

  1. Create a test for the new feature
  2. Write code to satisfy the new test
  3. Make sure you have the latest code from the remote repository
  4. Run all of the existing tests locally to make sure they pass
  5. Merge the new code back into the main branch
  6. A CI Build is created automatically to run tests on the code that was just merged
  7. The passing or failing state of the branch is displayed (and can also be used for more advanced governance controls)
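Steps 1 and 2 above might look something like the following minimal sketch. The `format_price` feature and its test are hypothetical and not part of this lesson's project; they are shown in Python with the standard `unittest` module purely for illustration. In a TDD flow the test is written first and fails until the function exists.

```python
import unittest


# Step 2: the implementation, written only after the test below existed
# and was failing. `format_price` is a hypothetical feature.
def format_price(cents: int) -> str:
    """Format a non-negative amount in cents as a dollar string."""
    dollars, remainder = divmod(cents, 100)
    return f"${dollars}.{remainder:02d}"


# Step 1: the test that describes what the new feature should do.
class FormatPriceTest(unittest.TestCase):
    def test_formats_whole_and_fractional_amounts(self):
        self.assertEqual(format_price(1999), "$19.99")
        self.assertEqual(format_price(500), "$5.00")
        self.assertEqual(format_price(5), "$0.05")


if __name__ == "__main__":
    unittest.main()
```

Once tests like this exist, the same command that runs them locally can be run by a CI Build after every merge.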

Running the tests locally is an important step, but it relies on everyone following the rules, that is:

  1. Finish feature
  2. Pull in latest changes from main
  3. Run tests
  4. Merge code
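As a rough illustration, the checklist above could be scripted so that the merge is only attempted when every earlier step succeeds. This is a sketch under assumptions: the remote is named `origin`, the main branch is named `main`, and `npm test` stands in for whatever command runs your project's tests.

```python
import subprocess
from typing import Callable, Sequence

# Assumed commands: adjust the remote, branch name, and test command
# to match your own project.
PRE_MERGE_STEPS: list[list[str]] = [
    ["git", "pull", "origin", "main"],  # 2. Pull in latest changes from main
    ["npm", "test"],                    # 3. Run tests (assumed test command)
]


def run_command(command: Sequence[str]) -> int:
    """Run a command and return its exit code (0 means success)."""
    return subprocess.run(command).returncode


def safe_to_merge(
    steps: Sequence[Sequence[str]],
    run: Callable[[Sequence[str]], int] = run_command,
) -> bool:
    """Run each step in order and stop at the first one that fails."""
    for command in steps:
        if run(command) != 0:
            print(f"Stopped: '{' '.join(command)}' failed")
            return False
    return True


if __name__ == "__main__":
    if safe_to_merge(PRE_MERGE_STEPS):
        print("All checks passed, safe to merge (step 4)")
```

The weakness of a script like this is the same as that of the manual checklist: it only helps if every developer remembers to run it.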

Let's say we run the tests locally and everything is passing... but we forgot to pull in the latest code from the main branch before doing that. Now when we merge back into the main branch there might actually be failing tests that we don't know about. Maybe we forgot to run the tests entirely before merging. Maybe a developer was just feeling a bit lazy and thought... it's just a minor change, it'll be fine!

If we have CI builds set up to run these tests automatically, we can easily see whether the main branch is in a stable state or not, no matter what any individual does. If code is added to the repository, then our tests will run.
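Conceptually, a CI Build boils down to running the test command against the code that was just added and recording the result. The sketch below is a simplified model of that idea, not how any particular CI service is implemented; the `BuildResult` type, the function names, and the commit identifiers are all hypothetical.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BuildResult:
    commit: str   # identifier of the code that was tested
    passed: bool  # True when the test command exited with code 0


def run_ci_build(commit: str, test_command: Sequence[str]) -> BuildResult:
    """Run the test command for a commit and record whether it passed."""
    completed = subprocess.run(test_command)
    return BuildResult(commit=commit, passed=completed.returncode == 0)


def branch_status(results: Sequence[BuildResult]) -> str:
    """Judge the branch by its most recent build."""
    if not results:
        return "unknown"
    return "passing" if results[-1].passed else "failing"


if __name__ == "__main__":
    # Stand-in test commands: one that succeeds and one that fails.
    passing_tests = [sys.executable, "-c", "raise SystemExit(0)"]
    failing_tests = [sys.executable, "-c", "raise SystemExit(1)"]
    history = [
        run_ci_build("commit-1", passing_tests),
        run_ci_build("commit-2", failing_tests),
    ]
    print(branch_status(history))
```

The important property is that the build is triggered by the code arriving in the repository, so the status is recorded no matter what any individual developer did or forgot to do locally.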

We can use something like GitHub Actions (Jenkins and CircleCI are also popular options) to automatically:
