Update: The Necessity of Configuration and System Management Tools
Over the last few months I wrote my master’s thesis in Computer Science about configuration and system management tools at the Technical University of Berlin, Germany, supervised by Prof. Dr. Odej Kao. I was also supported by Flying Circus Internet Operations GmbH.
The official title of my thesis is “Evaluating Methods to Maintain System Stability and Security When Reversing Changes Made by Configuration and System Management Tools in UNIX Environments”. In essence, it points out what you as a system administrator should care about and take into account when using configuration and system management tools such as Ansible, Chef, and Puppet. Using these tools is easy and takes a lot off your plate when dealing with larger IT environments, but if you overlook certain aspects, you will likely break your environment at some point.
Consider the following example:
A hosting provider offers virtual machines that have an IP address that is assigned from a pool of free addresses when the machine is first set up. At some point a customer requests the server to be shut down and removed. Once the server is removed, its IP address is freed and goes back into the pool. However, a customer is able to request a restore of a server from a backup within a given time frame after its removal. This works perfectly fine as long as the original IP address is not in use. However, configuring the network setup will fail if the IP address has been recycled to a different machine. [ch. 1, p. 1]
My master’s thesis deals with this and many other cases where problems might be hard to spot.
To help you inspect your IT environment and find those problems, I came up with a taxonomy of IT resources that helps you classify all resources [ch. 5.1, p. 23].
For the example given above, the IP addresses of VMs are identifying resources.
After classifying your resources, you can follow the rules I derived from the taxonomy [ch. 5.2, p. 26] to narrow down potential conflicts. In the above case, “Rule 6: Use environment-wide unique identifiers” is violated: once the address is recycled, it no longer identifies exactly one machine across the live environment and its backups. The hosting provider should keep a record of IP addresses that are still in use inside backups before actually recycling the addresses.
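The record-keeping suggested above can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions, not the implementation from the thesis: the class `IPPool`, its methods, and the example addresses are all hypothetical. The idea is that an address of a removed machine is parked as "reserved" while a backup still refers to it, and only returns to the free pool once that backup has expired.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class IPPool:
    """Hypothetical address pool; all names here are assumptions for illustration."""

    free: set[str] = field(default_factory=set)
    in_use: dict[str, str] = field(default_factory=dict)    # ip -> running VM
    reserved: dict[str, str] = field(default_factory=dict)  # ip -> VM that exists only as a backup

    @staticmethod
    def _ip_of(table: dict[str, str], vm: str) -> str:
        # Look up the address recorded for a VM in the given table.
        for ip, owner in table.items():
            if owner == vm:
                return ip
        raise KeyError(f"no address recorded for {vm!r}")

    def allocate(self, vm: str) -> str:
        # Hand out a free address; reserved addresses are never candidates.
        if not self.free:
            raise RuntimeError("address pool exhausted")
        ip = min(self.free)  # deterministic pick, good enough for a sketch
        self.free.remove(ip)
        self.in_use[ip] = vm
        return ip

    def remove_vm(self, vm: str, has_backup: bool) -> None:
        # Only recycle the address right away if no backup refers to it.
        ip = self._ip_of(self.in_use, vm)
        del self.in_use[ip]
        if has_backup:
            self.reserved[ip] = vm
        else:
            self.free.add(ip)

    def restore_vm(self, vm: str) -> str:
        # A restore gets the original address back, because it was never recycled.
        ip = self._ip_of(self.reserved, vm)
        del self.reserved[ip]
        self.in_use[ip] = vm
        return ip

    def expire_backup(self, vm: str) -> None:
        # Once the restore window has passed, the address may be reused.
        ip = self._ip_of(self.reserved, vm)
        del self.reserved[ip]
        self.free.add(ip)


pool = IPPool(free={"10.0.0.1", "10.0.0.2"})
a = pool.allocate("vm-a")
pool.remove_vm("vm-a", has_backup=True)  # address is parked, not freed
b = pool.allocate("vm-b")                # cannot receive vm-a's address
restored = pool.restore_vm("vm-a")       # original address is still available
```

With this bookkeeping, the restore of `vm-a` succeeds even though another machine was set up in the meantime, because the new machine was given a different address. The trade-off is that reserved addresses are unavailable for the whole restore window, so the pool has to be sized accordingly.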
In chapter 5.3 I provide a variety of use cases that you might encounter in your IT environment. Chapter 6 takes up those cases and outlines possible ways to solve the issues they raise.
You are allowed to use all resources that are part of this article free of charge as long as you include my name and a reference to this article in a place the end-user of your product sees.
You may not copy, distribute, or publish the thesis as a whole unless I have explicitly granted permission.