Friday, December 30, 2011

One year later

Today, it's my blog's first anniversary, so I thought this is a good opportunity to do some reflection about it.

Motivation


So why have I started this blog in the first place? The main reason is that as a software engineer and researcher, I don't necessarily only write code or write academic papers:

  • Sometimes I have to solve engineering problems, instead of doing science (whatever that means). In my opinion, this is also interesting and useful to report about. For example, while most of my research is about Nix, NixOS, Disnix, sometimes I have to apply my work in specific scenarios/environments which is challenging enough to get it to work. A notable example is the deployment .NET software. Without proper application of the stuff I'm investigating, the usefulness of my research isn't so great either.
  • Sometimes I have to express ideas and opinions. For example, what others say about the research we're doing or about general software engineering issues.
  • Sometimes I need to do knowledge transfer about issues I have to keep explaining over and over again, like the blog post about software deployment in general.
  • Sometimes I have some fun projects also well.

The main purpose of my blog is to fill this gap with academic paper publishing. Apparently, it seems that some of my blog posts have raised quite some attention, which I'm very happy about. Moreover, some blog posts (like the .NET related stuff) also gave me some early feedback which helped me solving a problem which I was struggling with for a long time.

Writing for a blog


For some reason, I find writing blog posts more convenient than writing academic papers. In order to write an academic paper I have to take a lot of stuff into account next to the scientific contribution I want to make. For example:

  • Because I'm doing research in software deployment and because this is a neglected research topic, I have to adapt my paper to fit in the scope of a conference I have to submit to, because they are typically about something else. This is not always easy.
  • I have to identify the audience of the particular conference and learn about the subjects and concepts they talk about.
  • I have to read a lot of related work papers to determine the exact scientific contribution and to describe what the differences are.
  • I have to integrate my idea into the concepts of the given conference.
  • I have to explain the same concepts over and over again, because they are not generally understood. What is software deployment? Why is software deployment difficult and important? What is Nix? Why is Nix conceptually different compared to conventional package managers? etc. etc.
  • In each paragraph, I have to convince the reader over and over again.
  • It should fit within the page limit
  • I have to keep myself aware of the fact that my paper must be scientific paper and not an engineering paper.

Writing for my blog is actually much easier for me, because I don't have to spent so much time on these other issues next to the contribution I want to make. Furthermore, because I can link to earlier topics and I know my audience a bit, I don't have to explain the same things over and over again.

Although these observations are quite negative, this does not mean that I want to claim that I shouldn't write any academic papers and that academic papers are useless, but nonetheless it's still a funny observation I have.

Blog posts


Another funny observation is the top 10 of most popular blog posts. The top 10 (at this time) is as follows:

  1. Software deployment complexity. I'm happy that my general blog post about software deployment complexity has raised so much attention, because within the software engineering research community it's typically an ignored subject. In just one week, it surpassed the number of hits of all my other blog posts.
  2. Second computer. For some reason my blog post about my good ol' Commodore Amiga is very popular. For a long time, this was actually my most popular blog post and it's not even research related! It seems that 19 years after Commodore's demise, the Amiga is far from dead.
  3. First blog post. This is the first blog post in which I briefly introduce myself. Apparently, people want to know who I am :-)
  4. NixOS: A purely functional Linux distribution. This is a general blog post explaining the idea and features of NixOS, the Linux distribution built on top of the Nix package manager, which we use as a foundation for our current research project.
  5. First computer. Apart from my good ol' Commodore Amiga, my first computer the: Commodore 128 also still lives on!
  6. The Nix package manager. This blog post covers the Nix package manager, on which NixOS is built. Apparently people are interested in the main Nix concepts as well.
  7. Concepts of programming languages. This blog post is about the 'Concepts of programming languages' course taught at the TU Delft, for which I was a teaching assistent. This course covers various programming languages, which are conceptually different from each other. I actually have no idea why this blog post is so popular.
  8. Using NixOS for declarative deployment and testing. This blog post covers two very nice applications of NixOS in which you can automatically deploy a distributed network of NixOS machines and use the same specification to generate a network of virtual machines for testing. I have presented this topic at FOSDEM, Europe's biggest free and open source software event, and my presentation was well received there :-).
  9. On Nix, NixOS and the Filesystem Hierarchy Standard (FHS). This blog post responds to one of the common criticisms I have received about Nix and NixOS from other distributions. In this blog post I explain why we are different and that it's for a very good reason.
  10. Self-adaptive deployment with Disnix. This blog post is about one of my biggest research contributions, which I have presented at the SEAMS symposium (co-located with ICSE) in Hawaii. In this blog post I have built a framework on top of Disnix and the underlying purely functional deployment model of Nix to make systems self-adaptable by redeployment in case of an event.

My biggest surprise is the fact that the Amiga stuff is so popular. Actually, I have some fun projects that I'm working on, so if I can find any time for it, there may be more to report about.

Conclusion


It seems for me that it pays off to have blog, so hopefully next year there will be much more interesting things to report about. I have one more thing to say and that is:


HAPPY NEW YEAR!!!!!!! (I took the fireworks picture above during the ICSE in Hawaii, just in case you wanted to know ;) )

Wednesday, December 21, 2011

Techniques and lessons for improvement of deployment processes

So far, all my software deployment related blog posts were mostly about techniques implemented in Nix, NixOS and Disnix and some general deployment information.
In my career as a Masters student and PhD student, I have written quite a number of Nix expressions for many packages and I have also "Nixified" several large code bases of commercial companies to make it possible to use the distinct features of Nix to make software deployment processes reproducible and reliable.

However, for many systems it turns out that implementing Nix support (for instance, to allow it to be deployed by our Nix-based deployment tools) is more difficult, tedious and laborious than expected. In this blog post, I'm going to describe a number of interesting techniques and lessons that could improve the deployment process of a system, based on my experience with Nix and Disnix. Second, by taking these lessons into account, it also becomes easier to adopt Nix as underlying deployment system.

  1. Deal with the deployment complexity from the beginning in your development process. Although this lesson isn't technical and may sound very obvious to you, I can ensure you that if you have a codebase which grows rapidly, without properly thinking about deployment, it becomes a big burden to implement the techniques described in this blog post later in the development process. In some cases, you may even have to re-engineer specific parts of your system, which can be very expensive to do, especially if your codebase consists of millions of lines of code.

  2. Make your deployment process scriptable. This lesson is also an obvious one, especially if you have adopted an agile software development process, in which you want to create value as soon as possible (without a deployed system there is no value).

    Although it's obvious, I have seen many cases where developers still build and package components of a software system by manually performing build and package tasks in the IDE and by doing all the installation steps by hand (with some deployment documentation). Such deployment processes typically take a lot of time and are subject to errors (because they are performed manually and humans make mistakes).

    Quite often these developers, who perform these manual/ad-hoc deployment processes, tell me that it's much too complicated to learn how a build system works (such as GNU Make or Apache Ant) and to maintain these build specifications.

    The only thing I can recommend to them is that it really is worth the effort, because it significantly reduces the time to make a new release of the product and also allows you test your software sooner and more frequently.

  3. Decompose your system and their build processes. Although I have seen many software projects using some kind of automation, I have also encountered a lot of build tools which treat the the complete system as a single monolithic blob, which must be built and deployed as a whole.

    In my opinion it's better to decompose your system in parts which can be built separately, for the following reasons:

    • It may significantly reduce build times, because only components that have been changed have to be rebuilt. There is no need to reexamine the complete codebase for each change.
    • It increases flexibility, because it becomes easier to replace specific parts of a system with other variants.
    • It allows a better means of sharing components across systems and better integration in different environments. As I have stressed out in an earlier blog post, software nowadays is rarely self-contained and run in many types of environments.
    • It allows you to perform builds faster, because they can be performed in parallel.

    In order to decompose a system properly, you have to think early about a good architecture of your system and keep reflecting about it. As a general rule, the build process of a sub component should only depend on the source code, build script and the build results of the dependencies. It should not be necessary to have cross references between source files among components.

    Typically, a bad architecture for a system also implies a very complex and inefficient build process.

    I have encountered some extreme cases, in which the source code system is one big directory of files, containing hundreds of dependencies including third-party libraries in binary form, infrastructure components (such as the database and web servers) containing many tweaks and custom configuration files.

    Although such code bases can be automatically deployed, it offers almost no flexibility, because it is a big burden to replace a specific part (such as a third party library) and still ensure that the system works as expected. Also it requires the system to be deployed in a clean environment because it offers no integration. Typically these kind of systems aren't updated that frequently either.

  4. Make build-time and run-time properties of components configurable. Another obvious lession, but I quite frequently encounter build processes which make implicit assumptions about the locations where specific dependencies can be found.

    For example, in some Linux packages, I have seen a number of Makefiles which have hardcoded references to the /usr/bin directory to find specific build tools, which I have to replace myself. Although most Linux systems have these build-time dependencies stored in this folder, there are some exceptions such as NixOS and GoboLinux.

    In my opinion it's better to make all locations and settings configurable, either by allowing it to be specified as a build parameter, environment variable (e.g. PATH) or stored in a configuration file, which can be easily edited.

  5. Isolate your build artifacts. I have noticed that some development environments store the output of a build in separate folders (for example Eclipse Java projects) whereas others store all the build artifacts of all projects in a single folder (such as Visual Studio C# projects). Similarly, with end user installations in production environments, libraries are also stored in global directories, such as /usr/lib on Linux environments, or C:\Windows\System32 on Windows.

    Based on my experience with Nix, I think it's better to store the build output of every component in isolation, by storing them in separate folders. Although this makes it harder to find components, it also offers some huge benefits, such as the ability to store multiple versions next to each other. Also upgrades can be made more reliable, because you can install another version next to an existing one. Furthermore, builds can also be made more pure and reproducible, because it's required to provide the locations of these components explicitly. In global folders such as /usr/lib a dependency may still be implicitly found even if it's not specified, which cannot happen if all the dependencies are stored in separate locations. More information about isolation can be found in my blog post about NixOS and the FHS.

  6. Think about a component naming convention. Apart from storing components in separate directories which offers some benefits, it's also important to think about how to name them. Typically, the most common naming scheme that's being used consists of a name and version number. This naming scheme can be insufficient in some cases, for example:

    • It does not take the architecture of the component into account. In a name-version naming scheme a 32-bit variant and a 64-bit variant are considered the same, while they may not be compatible with another component.
    • It does not take the optional features into account. For example, another component may require certain optional features enabled in order to work. It may be possible that a program incorrectly uses a component which does not support a required optional feature.

    I recommend you to also take relevant build-time properties into account when naming a component. In Nix you get such a naming convention for free, although the naming convention is also very strict. Nix hashes all the build-time dependencies, such as the source code, libraries, compilers and build scripts and makes this hash code part of the component name, for example: /nix/store/r8vvq9kq18pz08v24918my6r9vs7s0n3-firefox-8.0.1. Because of this naming scheme, different variants of a component, which are (for example) compiled with another version of a compiler of for a different system architecture do not share the same name.

    In some cases, it may not be necessary to adopt such a strict naming scheme, because the underlying component technology used is robust enough to offer enough flexibility. In one of my case studies covering Visual Studio projects, we only distinguish between variants of component by taking the name, version, interface and architectures into account. because they have no optional features and we assume the .NET framework is always compatible. We organize the filesystem in such a way that 32-bit and 64-bit variants are stored in separate directories. Although a naming scheme like this is less powerful than Nix's, it already offers huge benefits compared to traditional deployment approaches.

    Another example of a more strictly named component store is the .NET Global Assembly Cache (GAC), which makes a distinction between components based on the name, version, culture and a cryptographic key. However, the GAC is only suitable for .NET library assemblies and not for other types of components and cannot take other build properties into account.

  7. Think about component composition. A system must be able to find its dependencies at build-time and run-time. If the components are stored in a separate folders, you need to take extra effort in order to compose them. Some methods of addressing dependencies are:

    • Modify the environment. Most platforms use environment variables to address components, such as PATH for looking up executables, CLASSPATH for looking up Java class files, PERL5LIB for addressing Perl libraries. In order to perform a build these environment variables must be adapted to contain all the dependencies. In order to run the executable, the executable needs to be wrapped in a process, which sets these environment variables. Essentially, this method binds dependencies statically to another component.
    • Composing symlink trees. You can also provide a means of looking up components from a single location by composing the contents of the required dependencies in a symlink tree and by referring to its contents. In conjunction with a chroot environment, and a bind mount of the component directory, you can make builds pure. Because of this dynamic binding approach, you can replace dependencies, without a rebuild/reconfiguration, but you cannot easily run one executable that uses a particular variant of a library, while another runs another variant. This method of addressing components is used by GoboLinux.
    • Copy everything into a single folder. This is a solution which always work, however it is also a very inefficient as you cannot share components.

    Each composition approach has its own pros and cons. The Nix package manager supports all three approaches, however the first approach is mostly used for builds as its the most reliable one, because static binding always ensures that the dependencies are present and correct and that you can safely run multiple variants of compositions simultaneously. The second approach is used by Nix for creating Nix profiles, which end users can use to conveniently access their installed programs.

  8. Granularity of dependencies. This lesson is very important while deploying service-oriented systems in a distributed environment. If service components have dependencies on components with a large level of granularity, upgrades may become very expensive because some components are unnecessarily reconfigured/rebuilt.

    I have encountered a case in which a single configuration file was used to specify the locations of all service components of a system. When the location of a particular service component changes, every service component had to be reconfigured (even services that did not need access to that particular service component) which is very expensive.

    It's better to design these configuration files in such a way that they only contain properties that a service components need to know.


Conclusion


In this blog post I have listed some interesting lessons to improve the deployment process of complex systems based on my experience with Nix and related tools. These can be implemented in various deployment processes and tools. Furthermore, it becomes easier to adopt Nix related tooling by taking these lessons into account.

I also have to remark that it is not always obvious to implement all these lessons. For example, it is hard to integrate Visual Studio C# projects in a stricter deployment model supporting isolation, because the .NET runtime has no convenient way to address run-time dependencies in arbitrary locations. However, there is a solution for this particular problem, which I have described in an earlier blog post about .NET deployment with Nix.

UPDATE: I have given a presentation about this subject at Philips recently. As usual, the slides can be obtained from the talks page of my homepage.

Monday, December 12, 2011

An evaluation and comparison of GoboLinux


In this blog post, I'm going to evaluate deployment properties of GoboLinux and compare it with NixOS. Both Linux distributions are unconventional because they deviate from the Filesystem Hierarchy Standard (FHS). Apart from this, GoboLinux shares the same idea that the filesystem can be used to organize the management of packages and other system components, instead of relying on databases.

The purpose of this blog post is not to argue which distribution is better, but to see how this distribution achieves certain deployment properties, what some differences are compared to NixOS and what we can learn from it.

Filesystem organization


Unlike NixOS, which only deviates from the FHS where necessary, GoboLinux has a filesystem tree which completely deviates from the FHS, as shown below:


GoboLinux stores static parts of programs in isolation in the /Programs folder. The /System folder is used to compose the system, to store system configuration settings and variable data. /Files is used for storing data not belonging to any program. /Mount is used for accessing external devices such as DVD-ROM drives. The /Users folder contains home directories. /Depot is a directory without a predefined structure, which can be organized by the user.

GoboLinux also provides compatibility with the FHS. For example, the /usr and /bin directories are actually also present, but not visible. These directories are in fact symlinks trees referring to files in the /System directory (as you may see in the picture above). They are made invisible to end-users by a special kernel module called GoboHide.

Package organization


Like NixOS, GoboLinux stores each package in a seperate directories, which do not change after they have been built. However, GoboLinux uses a different naming convention compared to NixOS. In GoboLinux every package uses the /Program/<Name>/<Version> naming convention, such as: /Programs/Bzip2/1.0.4 or /Programs/LibOGG/1.1.3. Furthermore, each program directory contains a Current symlink which points to the version of the component, that is actually in use. Some program directories also have a Settings/ directory containing system configuration files for the given package.

The use of isolated directories offers a number of benefits like NixOS:

  • Because every version is stored in its own directory, we can safely store multiple versions next to each other, without having to worry that a file of another version gets overwritten.
  • By using a symlink, pointing to the current version, we can atomically switch between versions by flipping the symlink (also used by Nix to switch Nix profiles), which ensures that there is no situation in which a package contains both files of the old version and new version.

In contrast to NixOS, GoboLinux uses a nominal naming convention and dependency specifications, which are based on the name of the package and version number. With a nominal naming convention, it cannot make a distinction between several variants of components, for example:

  • It does not reflect which optional features are enabled in a package. For example, many libraries and programs have optional dependencies on other libraries. Some programs may require a particular variant of a library with a specific option enabled.
  • It does not take the build-time dependencies, such as build tools or library versions into account, such as the version of GCC or glibc. Older versions may have ABI incompatibilities with newer versions.
  • It does not take the architecture of the binaries into account. Using this naming scheme, it is harder to store 32-bit and 64-bit binaries safely next to each other.

In NixOS, however, every package name is an exact specification, because the component name contains an hash-code derived from all build-time dependencies to build the package, such as the compiler version, libraries and build scripts.

For example, in NixOS bzip2 may be stored under the following path: /nix/store/6iyhp953ay3c0f9mmvw2xwvrxzr0kap5-bzip2-1.0.5. If bzip2 is compiled with a different version of GCC or linked to a different version of glibc, it gets a different hash-code and thus another filename. Because no component shares the same name, it can be safely stored next to other variants.

Another advantage of the Nix naming convention is that unprivileged users can also build and install software without interfering with other users. If for example, a user injects a trojan horse in a particular package, the resulting component is stored in a different Nix store path and will not affect other users.

However, because of these exact dependency specifications, upgrades in NixOS may be more expensive. In order to link a particular program to a new version of a library, it must be rebuild, while in GoboLinux the library dependency can be replaced without rebuilding (although it may not guarantee that the upgraded version will work).

System composition


Of course, storing packages in separate directories does not immediately result in a working system. In GoboLinux, the system configuration and structure is composed in the /System directory.

The /System directory contains the following directories:

  • Kernel/, contains everything related to the Linux kernel, such as Boot/ storing kernel images, Modules/ storing kernel modules and Objects/ storing device files.
  • Links/, is a big symlink tree composing the contents of all packages that are currently in use. This symlink tree makes it possible to refer to executables in Executables/, libraries in Libraries/ and other files from a single location. The Environment/ directory is used to set essential environment variables for the installed programs.
  • Settings/ contains configuration files, which in other distributions are commonly found in /etc. Almost all files are symlinks to the Settings/ folder included in each program directory.
  • Variable/ contains all non-static (variable) data, such as cache and log files. In conventional distributions these files are stored in /var.

Like NixOS, GoboLinux uses symlink trees to compose a system and to make the contents of packages accessible to end users.

Recipes


Like NixOS, GoboLinux uses declarative specifications to build packages. GoboLinux calls these specifications recipes. Each recipe directory contains a file called Recipe which describes how to build a package from source code and a folder Resources/ defining various other package properties, such as a description and a specification of build-time and run-time dependencies.

compile_version=1.8.2
url="$httpSourceforge/lesstif/lesstif-0.95.2.tar.bz2"
file_size=2481073
file_md5=754187dbac09fcf5d18296437e72a32f
recipe_type=configure
make_variables=(
   "ACLOCALDIR=$target/Shared/aclocal"
   "aclocaldir=$target/Shared/aclocal"
)

An example of a recipe, describing how to build Lesstif, is shown above. Basically, for autotools based projects, you only need to specify the location where the source code tarball can be obtained, and some optional parameters. From this recipe, the tarball is download from the sourceforge web server and the standard autotools build procedure is performed (i.e. ./configure; make; make install) with the given parameters.

In the Resources/Dependencies run-time dependencies can be specified and in Resource/BuildDependencies build-time dependencies can be specified. The dependencies are specified by giving a name and an optional version number or version number range.

GoboLinux recipes and Nix expressions both offer abstractions to declaratively specify build actions. A big difference between those specifications, is the way dependencies are specified. In GoboLinux only nominal dependency specifications are used, which may not be complete enough, as explained earlier. In Nix expressions, you refer to build functions that build these dependencies from source-code and their build-time dependencies, which are stored in isolation in the Nix store using hash codes.

Furthermore, in Nix expressions run-time and build-time dependencies are not separately specified. In Nix, every run-time dependency is specified as build-time dependency. After a build has been performed, Nix conservatively scans for hash occurrences of build-time dependencies inside a realized component to identify them as run-time dependencies. Although this sounds risky, it works extremely well, because chances are very slim that an exact occurrence of a hash code represents something else.

Building packages


In GoboLinux, the Compile tool can be used to build a package from a recipe. Builds performed by this tool are not entirely pure, because it looks for dependencies in the /System/Links directory. Because this directory contains all the installed packages on the system, it may be possible that the build of a recipe may accidentally succeed when a dependency is not specified, because it can be implicitly found.

In order to make builds pure, GoboLinux provides the ChrootCompile extension, which performs builds in a chroot environment. ChrootCompile tool bind mounts the /Program directory in the chroot environment and creates a small /System only containing symlinks to the dependencies that are specified in the recipe. Because only the specified dependencies can be found in /System/Links directory, a build cannot accidentally succeed if a dependency has been omitted.

Both Nix and ChrootCompile have the ability to prevent undeclared dependencies to accidentally succeed a build, which improves reproducibility. In Nix, this goal is achieved differently. In Nix, the environment variables in which a build is performed are completely cleared (well not completely, but almost :-) ), and dependencies which are specified are added to the PATH and other environment variables, which allow build tools to find the dependencies.

Nix builds can be optionally performed in a chroot environment, but this is not mandatory. In NixOS, the traditional FHS directories, such as /usr don't exist and cannot make a build impure. Furthermore, common utilities such as GCC have been patched so that they ignore standard directories, such as /usr/include, removing many impurities.

Furthermore, Nix also binds dependency relationships statically to the executables (e.g. by modifying the RPATH header in an ELF binary), instead of allowing binaries to look in global directories like: /System/Links, which GoboLinux executables do. Although, GoboLinux builds are pure inside a chroot environment, their run-time behaviour may be different when a user decides to upgrade a version of its dependency.

Conclusion


In this blog post I did an evaluation of GoboLinux and I compared some features with NixOS. In my opinion, evaluating GoboLinux shows a number of interesting lessons:

  • There are many deployment related aspects (and perhaps other aspects as well) that can be solved and improved by just using the filesystem. I often see that people write additional utilities, abstraction layers and databases to perform similar things, while you can also use symlink trees and bind mounts to provide abstractions and compositions. Additionally, this also shows that the filesystem, which is an essential key component for UNIX-like systems, is still important and very powerful.
  • Storing packages in separate directories is a simple way to manage their contents, store multiple versions safely next to each other and to make upgrades more reliable.

GoboLinux and NixOS have a number of similar deployment properties, because of their unorthodox filesystem organization, but also some differences and limitations. The following table summarizes the differences covered in this blog post:

GoboLinux NixOS
FHS compatibility Yes (through hidden symlink trees) No (deviates on some FHS aspects)
Component naming Nominal (name-version) Exact (hash-name-version)
Component binding Dynamic (dependency can be replaced without rebuild) Static (rebuild required if a dependency changes)
Granularity of component isolation Only between versions Between all build-time dependencies (including version)
Unprivileged user installations No Yes
Build specifications Stand-alone recipes Nix expressions which need to refer to all dependency expressions
Build-time dependency addressing /System/Links symlink tree in chroot environment Setting environment variables + modified tools + optional chroot environment
Run-time dependencies Specified manually Extracted by scanning for hash occurrences
Run-time dependency resolving Dynamic, by looking in /System/Links Static (e.g. encoded in RPATH)

The general conclusion of this blog post is that both distributions achieve better deployment properties compared to conventional Linux distributions, by their unorthodox filesystem organisation. Because of the purely functional deployment model of the Nix package manager, NixOS is more powerful (and strict) than GoboLinux when it comes to reliability, although this comes at the expense of extra rebuild times and additional disk space.

The only bad thing I can say about GoboLinux is that the latest 014.01 release is outdated (2008) and it looks like 015 is in progress for almost three years... I'm not sure if there will be a new release soon, which is a pity.

And of course, apart from these deployment properties, there are many other differences I haven't covered here, but that's up to the reader to make a decision.

I'm planning to use these lessons for a future blog post, which elaborates more on techniques for making software deployment processes more reliable.