Git, Versioning, and Branching for Embedded Linux Development

When building a product using Linux, versioning and branching of your software is an important consideration. Everyone’s needs are different depending on the size of the team, culture, and testing requirements, so there is no one size that fits all. However, after working on a number of different projects for a dozen or so different companies, I have found several practices that are commonly used.

The most fundamental is to use Git to manage your software development. It is surprising how many small companies developing software do not yet use Git, or any version control system at all. The many benefits of Git stand alone, but the minute you enter into the Linux world (or the Open Source Software (OSS) world in general), there are thousands of projects available, and 99.9% of them are developed in Git. Having a good understanding of Git is essential in using OSS. This is not really specific to Linux development, but applies to any modern software development (cloud, web, mobile, etc.). Most foundational software technologies today are open source. There is no other development methodology that scales to the complexity of modern software systems, and no one company contains the resources needed to create and maintain these systems.

Git does have a steep learning curve and usually requires 2-3 weeks of committed use for new users to become comfortable with it. However, it is well worth it.

An extension of using Git is to consider hosting your Git repositories in the cloud. There are many companies who provide this service at a reasonable cost (some like GitLab are free), or it is relatively simple to host your own Git server (can be done for $2.50/mo at Vultr using Gitea). Some may balk at putting a company’s valuable IP (intellectual property) assets in the cloud (where hackers might have an easier time getting to them) vs behind the secure company firewall. However, which of the following is the main impediment to your business?

1. The danger of competitors hacking into your cloud server, stealing your IP, and then implementing a product with it.
2. Shipping products in a timely fashion, and then maintaining them.

In my observation, #2 seems to be the challenge for most, especially as systems become more complex. Unless you are a top secret government agency, or a cutting edge chip/algorithm company, your software IP is probably of marginal use to anyone else. The reason for this is that integration and execution are most often the challenge, not coding. There is plenty of free/OSS software out there to do just about anything. Integrating this into something that will make a product is the hard part. It is difficult to get software and hardware working together, debug problems, communicate with cloud infrastructure, handle software updates, maintain a product, add features, etc.

Modern products are typically not static entities where you design it, toss it over the wall to production, produce millions, make lots of money, and never see it again. Rather, modern products are complex, dynamic beasts that have continually changing requirements, bugs to fix, features needed for new customers, manufacturing issues to solve, obsolete parts to replace, and are part of a larger complex system. Execution is the hard and valuable part.

Increasing developer efficiency and collaboration is important, especially with a distributed team of developers. You can require your developers to use company issued Windows computers and fussy VPNs in an attempt to protect your IP, or you can simply host your Git repositories in the cloud secured by industry proven methods such as https and ssh. As systems become more complex, it is critical to be able to involve domain experts in the project (consultants, contractors, employees at a different office, manufacturing, etc). A Git server in the cloud is the foundation of distributed development collaboration. Emailing zip files may be tempting as quick and easy, but it is not a sustainable way to develop products that have so many moving parts.

As an extension of this, use a web based Git repository manager (like Gitea, Gitlab, or a hosted solution) to implement your Git server. These repository managers greatly simplify the management of Git repos (teams, users, permissions, creating repositories, issue tracking, etc).

Software should be released with a version number. Semantic Versioning is a good place to start. Your needs may vary, but you may as well start with the good ideas of others. Additionally, a changelog should be maintained that lists all changes for each software release. Being able to quickly understand what changed with each release is critical as multiple versions are being used. Keeping a changelog requires a little discipline, but is fairly easy to do if you keep an unreleased section at the top of your changelog file to track changes for the upcoming release, as described in the above link.
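
The "unreleased section" habit is easy to automate. The following is only a minimal sketch, assuming the changelog is a text file with a "## [Unreleased]" heading at the top and one "## [version] - date" heading per release. The heading layout, the function name cut_release, and the sample entries and dates are assumptions made for illustration, not something prescribed above.

```python
UNRELEASED = "## [Unreleased]"


def cut_release(changelog, version, date):
    """Turn the current Unreleased section into a dated release section.

    The entries that were listed under Unreleased end up under the new
    version heading, and a fresh, empty Unreleased section is left on top
    for the next round of changes.
    """
    if UNRELEASED not in changelog:
        raise ValueError("changelog has no Unreleased section")
    released = f"{UNRELEASED}\n\n## [{version}] - {date}"
    # Only the first occurrence is the real section heading.
    return changelog.replace(UNRELEASED, released, 1)


if __name__ == "__main__":
    # Hypothetical changelog content, for illustration only.
    text = (
        "# Changelog\n\n"
        "## [Unreleased]\n\n"
        "- Fix boot delay\n\n"
        "## [1.0.0] - 2018-01-05\n\n"
        "- Initial release\n"
    )
    print(cut_release(text, "1.0.1", "2018-02-01"))
```

Developers only ever add lines under the Unreleased heading, and cutting a release becomes a single mechanical step.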

A discussion on branching only really makes sense if you are using Git. Source code branches greatly improve your development process flow. The ease with which Git allows you to create and merge source code branches enables development flows which were impractical before Git.

With an Embedded Linux project you are typically building images that get loaded on some type of hardware device. In this case, it’s useful to have development and production releases. Production releases are sent to customers, and development releases are used internally to test new features under development. Every software component that is under active development (OE build system, custom applications, kernel, etc) should likewise have production and development branches that feed into their respective builds. For a more extensive discussion of this topic, consider this post. Again, start with the good ideas of others, and then use what makes sense.

An OE build can be configured to pull from the latest HEAD of your custom components, so that for each new build you get the latest from the respective branches for various components. To do this, you typically have something like the following in a bitbake recipe:

SRC_URI = "git://git.mycompany.com/myproject.git;branch=develop;protocol=ssh;user=git"
PV = "4.1.15+git${SRCPV}"
SRCREV = "${AUTOREV}"

In some cases, for production builds we’ll manually specify the Git hash or tag in the bitbake recipe so that versions of all components are explicitly locked down. To specify a git version, replace ${AUTOREV} with a Git hash. Either way, if you include the Git version (SRCPV) in the package version (PV) variable, you can always figure out what Git version was used to build a package for a particular release. Below is an example of a package file name:

my-app_4.1.15+git0+a05d9b23b9-r0.10_var_som_mx6.ipk

In this case, a05d9b23b9 tells you what Git version was used to build my-app. This is often adequate if you don’t spend a lot of time building or patching old software releases. With each release of software, it is beneficial to tag the software components used in the build with a version. This allows a clear view of what changed between releases, and allows us to easily check out old versions of software.
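
If you need to recover the Git revision from many package files at once, the lookup can be scripted. This is a rough sketch that assumes file names follow the "+git<count>+<hash>-r<revision>" pattern seen in the example above. The function name git_revision_from_package is hypothetical, and the pattern may need adjusting for other PV layouts.

```python
import re

# Matches the "+git0+a05d9b23b9-r" portion of a package file name.
# Assumption: the hash is lowercase hex and is followed by the "-r" revision.
_SRCPV_RE = re.compile(r"\+git\d+\+(?P<rev>[0-9a-f]+)-r")


def git_revision_from_package(filename):
    """Return the abbreviated Git hash embedded in a package file name.

    Returns None when the name carries no Git revision.
    """
    match = _SRCPV_RE.search(filename)
    return match.group("rev") if match else None


if __name__ == "__main__":
    name = "my-app_4.1.15+git0+a05d9b23b9-r0.10_var_som_mx6.ipk"
    print(git_revision_from_package(name))
```

The returned hash can then be passed to Git (for example, to check out or tag that commit) in the component's repository.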

With development and production branches, we should consider how to version the different branches. One simple way is to use a sequence like:

1.0.0 (production)
1.0.900 (develop)
1.0.901 (develop)
1.0.1 (production bug fix)
1.0.902 (develop)
merge develop to production and do a new release
1.1.0 (production)
1.1.900 (continue development)
1.1.901 (develop)
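
The numbering rule behind this sequence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: versions are plain MAJOR.MINOR.PATCH strings, development builds use PATCH values of 900 and up, and the helper names (parse, is_development, next_development) are invented for this example.

```python
DEV_PATCH_BASE = 900  # PATCH values at or above this mark a development build


def parse(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_development(version):
    """True when the PATCH field is in the 9xx development range."""
    return parse(version)[2] >= DEV_PATCH_BASE


def next_development(last_production, last_development=None):
    """Return the next development version after the given releases.

    The MAJOR.MINOR pair always comes from the last production release.
    The PATCH counter continues from the previous development build when
    that build belongs to the same MAJOR.MINOR, and restarts at 900
    otherwise.
    """
    major, minor, _ = parse(last_production)
    if last_development is not None:
        dev_major, dev_minor, dev_patch = parse(last_development)
        if (dev_major, dev_minor) == (major, minor):
            return f"{major}.{minor}.{dev_patch + 1}"
    return f"{major}.{minor}.{DEV_PATCH_BASE}"


if __name__ == "__main__":
    print(next_development("1.0.0"))             # first develop build
    print(next_development("1.0.1", "1.0.901"))  # after a production bug fix
    print(next_development("1.1.0", "1.0.902"))  # after a new minor release
```

Note that a production bug fix such as 1.0.1 does not reset the development counter, which matches the sequence above.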

In this case, the development version is always the last production release MAJOR.MINOR and the PATCH version starts at 900 and is incremented with each development release. If a version number ends in 9xx, you know it’s a development build. You can use more complex version schemes like 1.1.0-alpha1 (instead of 1.0.901), but then you might run into sorting problems where 1.1.0-alpha1 is listed after 1.1.0, even though 1.1.0 is a later release than 1.1.0-alpha1. Recent versions of the BEC software updater look at all update files on a USB disk and pick the latest one using a natural sort of version numbers within the text. In the example below, you can see how versions are sorted:

[cbrake@mars ver]$ ls -lv
total 0
-rw-r--r-- 1 cbrake cbrake 0 Dec 7 10:23 1.0.0
-rw-r--r-- 1 cbrake cbrake 0 Dec 7 10:23 1.0.1
-rw-r--r-- 1 cbrake cbrake 0 Dec 7 10:23 1.0.900
-rw-r--r-- 1 cbrake cbrake 0 Dec 7 10:23 1.0.901
-rw-r--r-- 1 cbrake cbrake 0 Dec 7 10:23 1.1.0
-rw-r--r-- 1 cbrake cbrake 0 Dec 7 10:23 1.1.0-alpha1

In this case, 1.0.9xx works much better for development builds if you want update systems to automatically pick the latest release from a list of options. You can also use versions like 1.0.0+1.1.0-alpha1 to get around the sorting problem if you want something a little more explicit.
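
To see why the ordering comes out this way, here is a minimal natural sort sketch in Python. It is not the BEC updater's actual code, nor an exact reimplementation of ls -v. It simply splits each name into text and number chunks and compares the numbers as integers, which is enough to reproduce the ordering shown above. The names natural_key and latest are made up for this example.

```python
import re


def natural_key(name):
    """Build a sort key that compares digit runs as integers.

    re.split with a capturing group alternates text and digit chunks, so
    even positions are always strings and odd positions are always digit
    runs. Two keys therefore never compare a string against an integer.
    """
    chunks = re.split(r"(\d+)", name)
    return [int(chunk) if index % 2 else chunk
            for index, chunk in enumerate(chunks)]


def latest(names):
    """Pick the name that sorts last under the natural ordering."""
    return max(names, key=natural_key)


if __name__ == "__main__":
    names = ["1.1.0-alpha1", "1.0.901", "1.0.0", "1.1.0", "1.0.900", "1.0.1"]
    for name in sorted(names, key=natural_key):
        print(name)
    print("latest:", latest(names))
```

With this key, 1.1.0-alpha1 sorts after 1.1.0 because 1.1.0 is a prefix of it, while 1.0.0+1.1.0-alpha1 sorts before 1.1.0 because the comparison is already decided at the MINOR field.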

Versioning and branching do not have to be complex, and in many cases the above is adequate for small teams. Your needs may vary, but the important thing is to have a source control, versioning, and branching process in place. These practices require some discipline to implement, and may seem unnecessary to those who have not used them. With a little experience, however, we soon realize that introducing an appropriate amount of organization and process into our development efforts clearly communicates what we did, what we are doing, and what gets released to production. This then frees us up to focus on what really adds value.