Problem:
Our client has a hardware/software solution consisting of multiple applications running on an industrial box that contains multiple SoCs. It must be highly available, and physical access to the shipped devices is very limited. The devices are connected to the internet.
The problem was that their CI/CD process was completely manual: binaries were built on developer machines in an uncontrolled environment, and delivery consisted of securely copying binaries to the devices over the internet by hand.
Solution:
Our solution had two parts: first, to create a repository containing the build tools, all external source code and the library dependencies; second, to create a deployment process that delivers binary images securely and reliably and keeps the devices functional even after a failed deployment.
For the build, we changed their repository structure, scripts and Makefiles so that the build runs in a controlled environment, and we dockerised it. We also set up a build farm using TeamCity, which could also deliver the artifacts back to the developers' own devices. We changed the artifact format from a zip of binaries to pkg packages and published them to a Nexus package repository, so that standard installation tools can install the binaries together with all of their dependencies.
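As a rough illustration of what a dockerised Makefile build can look like, the sketch below assembles a `docker run` command that mounts the source tree into a container and runs `make` there. The function name, the image name and the `/src` mount point are hypothetical and not taken from the client's setup; only the standard `docker run` options `--rm`, `-v` and `-w` are used.

```python
def docker_build_command(source_dir, image, make_target="all"):
    """Return the argument list for running `make` inside a build container.

    All names here are illustrative assumptions: `image` stands for whatever
    build image holds the toolchain, and /src is an arbitrary mount point.
    """
    return [
        "docker", "run",
        "--rm",                      # remove the container when the build ends
        "-v", f"{source_dir}:/src",  # mount the checked-out sources
        "-w", "/src",                # run make from the source root
        image,
        "make", make_target,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch needs no Docker.
    print(" ".join(docker_build_command("/home/ci/project", "example/build-env:1.0")))
```

Because the toolchain lives in the image rather than on a developer machine, the same command can be run by a build agent or locally, and it can be passed to `subprocess.run(cmd, check=True)` to execute it.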
For the deployment, we first updated the applications so that they can run without root privileges. We changed them to run as systemd services and connected the external watchdog to the Linux system instead of to the applications. We repartitioned the device to include a safe boot partition with a minimal Linux installation, which allows the running partition to be completely rewritten even after an update failure. This also allows us to update the operating system itself safely.
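One way to picture the role of the safe boot partition is a small boot-selection rule: after an update, the new system image gets a limited number of boot attempts, and if none of them is confirmed as healthy, the device starts the safe partition so the system partition can be rewritten. The sketch below is only an assumption about how such a rule could work; the `BootState` fields, the attempt limit and the partition labels are hypothetical and do not describe the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class BootState:
    """Hypothetical persistent flags kept outside the system partition."""
    update_pending: bool = False  # new image written, not yet confirmed healthy
    boot_attempts: int = 0        # boots tried since the image was written
    max_attempts: int = 3         # assumed limit before falling back


def choose_partition(state: BootState) -> str:
    """Return 'system' for a normal boot or 'safe' to fall back."""
    if not state.update_pending:
        return "system"
    if state.boot_attempts >= state.max_attempts:
        return "safe"             # the new image never confirmed itself
    state.boot_attempts += 1      # count this try before handing over
    return "system"


def confirm_boot(state: BootState) -> None:
    """Called once the updated system is known to be healthy."""
    state.update_pending = False
    state.boot_attempts = 0


if __name__ == "__main__":
    state = BootState(update_pending=True)
    print([choose_partition(state) for _ in range(4)])
```

With the assumed limit of three, an unconfirmed image is tried three times and the fourth boot selects the safe partition. A watchdog reset during a hung boot simply triggers the next call to `choose_partition`.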
We mounted the system partition read-only and changed the applications to write all their data to a controlled space on the device. We created a custom management website where our client can provision the devices with a single click. We wrote Ansible playbooks to configure the devices, and these playbooks are started automatically from the website.
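A minimal sketch of how a web backend might start a playbook for a single device is shown below. It only builds the `ansible-playbook` command line using the standard `-i`, `--limit` and `-e` options; the function name, file names, host name and variables are hypothetical and not taken from the project.

```python
def build_ansible_command(playbook, inventory, device, extra_vars=None):
    """Return the argument list for configuring one device with Ansible.

    The playbook, inventory and device names are illustrative assumptions.
    """
    cmd = ["ansible-playbook", "-i", inventory, "--limit", device]
    # Sort the variables so the generated command is deterministic.
    for key, value in sorted((extra_vars or {}).items()):
        cmd += ["-e", f"{key}={value}"]
    cmd.append(playbook)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch needs no Ansible.
    print(" ".join(build_ansible_command(
        "provision.yml", "devices.ini", "box-0042", {"release": "1.4.2"})))
```

A "provision" button handler could pass the returned list to `subprocess.run(cmd, check=True)` and report the exit status back to the page.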
Technologies:
CI/CD and live deployment processes; TeamCity, k8s, Docker, Ansible, Nexus, CMake, custom Python scripts