NixOS Planet

August 12, 2020

Tweag I/O

Developing Python with Poetry & Poetry2nix: Reproducible flexible Python environments

Most Python projects are in fact polyglot. Indeed, many popular libraries on PyPI are Python wrappers around C code. This applies particularly to popular scientific computing packages, such as scipy and numpy. Normally, this is the terrain where Nix shines, but its support for Python projects has often been labor-intensive, requiring lots of manual fiddling and fine-tuning. One of the reasons for this is that most Python package management tools do not expose enough static information about the project to offer the determinism needed by Nix.

Thanks to Poetry, this is a problem of the past — its rich lock file offers more than enough information to get Nix running, with minimal manual intervention. In this post, I will show how to use Poetry, together with Poetry2nix, to easily manage Python projects with Nix. I will show how to package a simple Python application both using the existing support for Python in Nixpkgs, and then using Poetry2nix. This will both show why Poetry2nix is more convenient, and serve as a short tutorial covering its features.

Our application

We are going to package a simple application, a Flask server with two endpoints: one returning a static string “Hello World” and another returning a resized image. This application was chosen because:

  1. It can fit into a single file for the purposes of this post.
  2. Image resizing using Pillow requires the use of native libraries, which is something of a strength of Nix.

The code for it is in the imgapp/__init__.py file:

from flask import send_file
from flask import Flask
from io import BytesIO
from PIL import Image
import requests


app = Flask(__name__)


IMAGE_URL = "https://farm1.staticflickr.com/422/32287743652_9f69a6e9d9_b.jpg"
IMAGE_SIZE = (300, 300)


@app.route('/')
def hello():
    return "Hello World!"


@app.route('/image')
def image():
    r = requests.get(IMAGE_URL)
    if not r.status_code == 200:
        raise ValueError(f"Response code was '{r.status_code}'")

    img_io = BytesIO()

    img = Image.open(BytesIO(r.content))
    img.thumbnail(IMAGE_SIZE)
    img.save(img_io, 'JPEG', quality=70)

    img_io.seek(0)

    return send_file(img_io, mimetype='image/jpeg')


def main():
    app.run()


if __name__ == '__main__':
    main()

The status quo for packaging Python with Nix

There are two standard techniques for integrating Python projects with Nix.

Nix only

The first technique uses only Nix for package management, and is described in the Python section of the Nix manual. While it works and may look very appealing on the surface, it uses Nix for all package management needs, which comes with some drawbacks:

  1. We are essentially tied to whatever package version Nixpkgs provides for any given dependency. This can be worked around with overrides, but those can cause version incompatibilities. This happens often in complex Python projects, such as data science ones, which tend to be very sensitive to version changes.
  2. We are tied to using packages already in Nixpkgs. While Nixpkgs has many Python packages already packaged up (around 3,000 right now), there are many packages missing — PyPI, the Python Package Index, has more than 200,000 packages. This can of course be worked around with overlays and manual packaging, but this quickly becomes a daunting task.
  3. In a team setting, every team member wanting to add packages needs to buy in to Nix and at least have some experience using and understanding Nix.

All these factors lead us to a conclusion: we need to embrace Python tooling so we can efficiently work with the entire Python ecosystem.

Pip and Pypi2Nix

The second standard method tries to overcome the faults above by using a hybrid approach of Python tooling together with Nix code generation. Instead of writing dependencies manually in Nix, they are extracted from the requirements.txt file that users of Pip and Virtualenv are very used to. That is, from a requirements.txt file containing the necessary dependencies:

requests
pillow
flask

we can use pypi2nix to package our application in a more automatic fashion than before:

nix-shell -p pypi2nix --run "pypi2nix -r requirements.txt"

However, Pip is not a dependency manager and therefore the requirements.txt file is not explicit enough — it lacks both exact versions for libraries and system dependencies. As a result, the command above will not produce a working Nix expression. In order to make pypi2nix work correctly, one has to manually find all the native dependencies incurred by the use of Pillow:

nix-shell -p pypi2nix --run "pypi2nix -V 3.8 -E pkgconfig -E freetype -E libjpeg -E openjpeg -E zlib -E libtiff -E libwebp -E tcl -E lcms2 -E xorg.libxcb -r requirements.txt"

This will generate a large Nix expression that will indeed work as expected. Further use of Pypi2nix is left to the reader, but we can already draw some conclusions about this approach:

  1. Code generation results in huge Nix expressions that can be hard to debug and understand. These expressions will typically be checked into a project repository, and can get out of sync with actual dependencies.
  2. It’s very high friction, especially around native dependencies.

Having many large Python projects, I wasn’t satisfied with the status quo around Python package management. So I looked into what could be done to make the situation better, and which tools could be more appropriate for our use-case. A potential candidate was Pipenv, however its dependency solver and lock file format were difficult to work with. In particular, Pipenv’s detection of “local” vs “non-local” dependencies did not work properly inside the Nix shell and gave us the wrong dependency graph. Eventually, I found Poetry and it looked very promising.

Poetry and Poetry2nix

The Poetry package manager is a relatively recent addition to the Python ecosystem but it is gaining popularity very quickly. Poetry features a nice CLI with good UX and deterministic builds through lock files.

Poetry uses pip under the hood and, for this reason, inherited some of its shortcomings and lock file design. I managed to land a few patches in Poetry before the 1.0 release to improve the lock file format, and now it is fit for use in Nix builds. The result was Poetry2nix, whose key design goals were:

  1. Dead simple API.
  2. Work with the entire Python ecosystem using regular Python tooling.
  3. Python developers should not have to be Nix experts, and vice versa.
  4. Being an expert should allow you to “drop down” into the lower levels of the build and customise it.

Poetry2nix is not a code generation tool — it is implemented in pure Nix. This fixes many of the problems outlined in the previous paragraphs, since there is a single point of truth for dependencies and their versions.

But what about our native dependencies from before? How does Poetry2nix know about those? Poetry2nix comes with an extensive set of built-in overrides for a lot of common packages, including Pillow. Users are encouraged to contribute overrides upstream for popular packages, so everyone can have a better user experience.

Now, let’s see how Poetry2nix works in practice.

Developing with Poetry

Let’s start with only our application file above (imgapp/__init__.py) and a shell.nix:

{ pkgs ? import <nixpkgs> {} }:

pkgs.mkShell {

  buildInputs = [
    pkgs.python3
    pkgs.poetry
  ];

}

Poetry comes with some nice helpers to create a project, so we run:

$ poetry init

And then we’ll add our dependencies:

$ poetry add requests pillow flask

We now have two files in the folder:

  • The first one is pyproject.toml which not only specifies our dependencies but also replaces setup.py.
  • The second is poetry.lock which contains our entire pinned Python dependency graph.

For Nix to know which scripts to install in the bin/ output directory, we also need to add a scripts section to pyproject.toml:

[tool.poetry]
name = "imgapp"
version = "0.1.0"
description = ""
authors = ["adisbladis <adisbladis@gmail.com>"]

[tool.poetry.dependencies]
python = "^3.7"
requests = "^2.23.0"
pillow = "^7.1.2"
flask = "^1.1.2"

[tool.poetry.dev-dependencies]

[tool.poetry.scripts]
imgapp = 'imgapp:main'

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"

Packaging with Poetry2nix

Since Poetry2nix is not a code generation tool but implemented entirely in Nix, this step is trivial. Create a default.nix containing:

{ pkgs ? import <nixpkgs> {} }:
pkgs.poetry2nix.mkPoetryApplication {
  projectDir = ./.;
}

We can now invoke nix-build to build our package defined in default.nix. Poetry2nix will automatically infer package names, dependencies, meta attributes and more from the Poetry metadata.
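
As an aside, Poetry2nix can also give you a development shell containing all of the locked dependencies without building the application itself. Here is a minimal sketch of such a shell.nix, assuming the mkPoetryEnv helper that Poetry2nix provides alongside mkPoetryApplication:

{ pkgs ? import <nixpkgs> {} }:

pkgs.mkShell {
  buildInputs = [
    # A Python interpreter with every dependency from poetry.lock on its path
    (pkgs.poetry2nix.mkPoetryEnv { projectDir = ./.; })
    # Poetry itself, for adding or updating dependencies
    pkgs.poetry
  ];
}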

Manipulating overrides

Many overrides for system dependencies are already upstream, but what if some are lacking? These overrides can be manipulated and extended manually:

poetry2nix.mkPoetryApplication {
    projectDir = ./.;
    overrides = poetry2nix.overrides.withDefaults (self: super: {
      foo = super.foo.overridePythonAttrs (oldAttrs: {});
    });
}
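
In practice, the body of an override usually adds a missing native build input or tweaks build flags. As a rough sketch (the package name and its need for libffi are purely illustrative, not taken from this project):

{ pkgs ? import <nixpkgs> {} }:

pkgs.poetry2nix.mkPoetryApplication {
  projectDir = ./.;
  overrides = pkgs.poetry2nix.overrides.withDefaults (self: super: {
    # Hypothetical dependency that needs a native library at build time
    somepackage = super.somepackage.overridePythonAttrs (oldAttrs: {
      buildInputs = (oldAttrs.buildInputs or [ ]) ++ [ pkgs.libffi ];
    });
  });
}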

Conclusion

By embracing both modern Python package management tooling and the Nix language, we can achieve best-in-class user experience for Python developers and Nix developers alike.

There are ongoing efforts to make Poetry2nix and other Nix Python tooling work better with data science packages like numpy and scipy. I believe that Nix may soon rival Conda on Linux and macOS for data science.

Python + Nix has a bright future ahead of it!

August 12, 2020 12:00 AM

August 11, 2020

Sander van der Burg

Experimenting with Nix and the service management properties of Docker

In the previous blog post, I have analyzed Nix and Docker as deployment solutions and described in what ways these solutions are similar and different.

To summarize my findings:

  • Nix is a source-based package manager responsible for obtaining, installing, configuring and upgrading packages in a reliable and reproducible manner and facilitating the construction of packages from source code and their dependencies.
  • Docker's purpose is to fully manage the life-cycle of applications (services and ordinary processes) in a reliable and reproducible manner, including their deployments.

As explained in my previous blog post, one prominent goal that both solutions have in common is to facilitate reliable and reproducible deployment. They use different kinds of techniques to accomplish this goal.

Although Nix and Docker can be used for a variety of comparable use cases (such as constructing images, deploying test environments, and constructing packages from source code), one prominent feature that the Nix package manager does not provide is process (or service) management.

In a Nix-based workflow you need to augment Nix with another solution that can facilitate process management.

In this blog post, I will investigate how Docker could fulfill this role -- it is pretty much the opposite of the combined use case scenarios I have shown in the previous blog post, in which Nix takes over the role of a conventional package manager in supplying packages for the construction of an image, or even drives the complete construction process of images.

Existing Nix integrations with process management


Although Nix does not do any process management, there are sister projects that can, such as:

  • NixOS builds entire machine configurations from a single declarative deployment specification and uses the Nix package manager to deploy and isolate all static artifacts of a system. It will also automatically generate and deploy systemd units for services defined in a NixOS configuration.
  • nix-darwin can be used to specify a collection of services in a deployment specification and uses the Nix package manager to deploy all services and their corresponding launchd configuration files.

Although both projects do a great job (e.g. they both provide a big collection of deployable services) what I consider a disadvantage is that they are platform specific -- both solutions only work on a single operating system (Linux and macOS) and a single process management solution (systemd and launchd).

If you are using Nix in a different environment, such as a different operating system, a conventional (non-NixOS) Linux distribution, or a different process manager, then there is no off-the-shelf solution that will help you manage services for packages provided by Nix.

Docker functionality


Docker could be considered a multi-functional solution for application management. I can categorize its functionality as follows:

  • Process management. The life-cycle of a container is bound to the life-cycle of a root process that needs to be started or stopped.
  • Dependency management. To ensure that applications have all the dependencies that they need and that no dependency is missing, Docker uses images containing a complete root filesystem with all required files to run an application.
  • Resource isolation is heavily used for a variety of different reasons:
    • Foremost, to ensure that the root filesystem of the container does not conflict with the host system's root filesystem.
    • It is also used to prevent conflicts with other kinds of resources. For example, the isolated network interfaces allow services to bind to the same TCP ports that may also be in use by the host system or other containers.
    • It offers some degree of protection. For example, a malicious process will not be able to see or control a process belonging to the host system or a different container.
  • Resource restriction can be used to limit the amount of system resources that a process can consume, such as the amount of RAM.

    Resource restriction can be useful for a variety of reasons, for example, to prevent a service from eating up all the system's resources affecting the stability of the system as a whole.
  • Integrations with the host system (e.g. volumes) and other services.

As described in the previous blog post, Docker uses a number of key concepts to implement the functionality shown above, such as layers, namespaces and cgroups.

Developing a Nix-based process management solution


For quite some time, I have been investigating the process management domain and worked on a prototype solution to provide a more generalized infrastructure that complements Nix with process management -- I came up with an experimental Nix-based process manager-agnostic framework that has the following objectives:

  • It uses Nix to deploy all required packages and other static artifacts (such as configuration files) that a service needs.
  • It integrates with a variety of process managers on a variety of operating systems. So far, it can work with: sysvinit scripts, BSD rc scripts, supervisord, systemd, cygrunsrv and launchd.

    In addition to process managers, it can also automatically convert a processes model to deployment specifications that Disnix can consume.
  • It uses declarative specifications to define functions that construct managed processes and process instances.

    Processes can be declared in a process-manager specific and process-manager agnostic way. The latter makes it possible to target all six supported process managers with the same declarative specification, albeit with a limited set of features.
  • It allows you to run multiple instances of processes, by introducing a convention to cope with potential resource conflicts between process instances -- instance properties and potential conflicts can be configured with function parameters and can be changed in such a way that they do not conflict.
  • It can facilitate unprivileged user deployments by using Nix's ability to perform unprivileged package deployments and introducing a convention that allows you to disable user switching.

To summarize how the solution works from a user point of view, we can write a process manager-agnostic constructor function as follows:


{createManagedProcess, tmpDir}:
{port, instanceSuffix ? "", instanceName ? "webapp${instanceSuffix}"}:

let
  webapp = import ../../webapp;
in
createManagedProcess {
  name = instanceName;
  description = "Simple web application";
  inherit instanceName;

  process = "${webapp}/bin/webapp";
  daemonArgs = [ "-D" ];

  environment = {
    PORT = port;
    PID_FILE = "${tmpDir}/${instanceName}.pid";
  };
  user = instanceName;
  credentials = {
    groups = {
      "${instanceName}" = {};
    };
    users = {
      "${instanceName}" = {
        group = instanceName;
        description = "Webapp";
      };
    };
  };

  overrides = {
    sysvinit = {
      runlevels = [ 3 4 5 ];
    };
  };
}

The Nix expression above is a nested function that defines in a process manager-agnostic way a configuration for a web application process containing an embedded web server serving a static HTML page.

  • The outer function header (first line) refers to parameters that are common to all process instances: createManagedProcess is a function that can construct process manager configurations and tmpDir refers to the directory in which temp files are stored (which is /tmp in conventional Linux installations).
  • The inner function header (second line) refers to instance parameters -- when it is desired to construct multiple instances of this process, we must make sure that we have configured these parameters in such a way that they do not conflict with other processes.

    For example, when we assign a unique TCP port and a unique instance name (a property used by the daemon tool to create unique PID files) we can safely have multiple instances of this service co-existing on the same system.
  • In the body, we invoke the createManagedProcess function to generate configurations files for a process manager.
  • The process parameter specifies the executable that we need to run to start the process.
  • The daemonArgs parameter specifies command-line instructions passed to the process executable when the process should daemonize itself (the -D parameter instructs the webapp process to daemonize).
  • The environment parameter specifies all environment variables. Environment variables are used as a generic configuration facility for the service.
  • The user parameter specifies the name the process should run as (each process instance has its own user and group with the same name as the instance).
  • The credentials parameter is used to automatically create the group and user that the process needs.
  • The overrides parameter makes it possible to override the parameters generated by the createManagedProcess function with process manager-specific overrides, to configure features that are not universally supported.

    In the example above, we use an override to configure the runlevels in which the service should run (runlevels 3-5 are typically used to boot a system that is network capable). Runlevels are a sysvinit-specific concept.

In addition to defining constructor functions that allow us to construct zero or more process instances, we also need to define the actual process instances. These can be declared in a processes model:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange processManager;
  };
in
rec {
  webapp = rec {
    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };
  };

  nginxReverseProxy = rec {
    port = 8080;

    pkg = constructors.nginxReverseProxyHostBased {
      webapps = [ webapp ];
      inherit port;
    } {};
  };
}

The above Nix expression defines two process instances and uses the following conventions:

  • The first line is a function header in which the function parameters correspond to adjustable properties that apply to all process instances:
    • stateDir allows you to globally override the base directory in which all state is stored (the default value is: /var).
    • We can also change the location of each individual state directory (tmpDir, cacheDir, logDir, runtimeDir, etc.) if desired.
    • forceDisableUserChange can be enabled to prevent the process manager from changing user permissions and creating users and groups. This is useful to facilitate unprivileged user deployments, in which the user typically has no rights to change user permissions.
    • The processManager parameter allows you to pick a process manager. All process configurations will be automatically generated for the selected process manager.

      For example, if we pick systemd, then all configurations get translated to systemd units, while supervisord causes all configurations to be translated to supervisord configuration files.
  • To get access to constructor functions, we import a constructors expression that composes all constructor functions by calling them with their common parameters (not shown in this blog post).

    The constructors expression also contains a reference to the Nix expression that deploys the webapp service, shown in our previous example.
  • The processes model defines two processes: a webapp instance that listens to TCP port 5000 and Nginx that acts as a reverse proxy forwarding requests to webapp process instances based on the virtual host name.
  • webapp is declared a dependency of the nginxReverseProxy service (by passing webapp as a parameter to the constructor function of Nginx). This causes webapp to be activated before the nginxReverseProxy.
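
Because the webapp constructor takes instance parameters, the body of the same processes model could just as well declare several webapp instances side by side -- a hypothetical sketch, reusing the port and instanceSuffix parameters of the constructor function shown earlier:

rec {
  webapp1 = rec {
    port = 5000;
    dnsName = "webapp1.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "1";
    };
  };

  webapp2 = rec {
    port = 5001;
    dnsName = "webapp2.local";

    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
  };

  nginxReverseProxy = rec {
    port = 8080;

    pkg = constructors.nginxReverseProxyHostBased {
      webapps = [ webapp1 webapp2 ];
      inherit port;
    } {};
  };
}

Each instance gets a unique TCP port and instance suffix, so the generated PID files do not collide -- exactly the conflict-avoidance convention described above.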

To deploy all process instances with a process manager, we can invoke a variety of tools that are bundled with the experimental Nix process management framework.

The process model can be deployed as sysvinit scripts for an unprivileged user, with the following command:


$ nixproc-sysvinit-switch --state-dir /home/sander/var \
    --force-disable-user-change processes.nix

The above command automatically generates sysvinit scripts, changes the base directory of all state folders to a directory in the user's home directory: /home/sander/var and disables user changing (and creation) so that an unprivileged user can run it.

The following command uses systemd as a process manager with the default parameters, for production deployments:


$ nixproc-systemd-switch processes.nix

The above command automatically generates systemd unit files and invokes systemd to deploy the processes.

In addition to the examples shown above, the framework contains many more tools, such as: nixproc-supervisord-switch, nixproc-launchd-switch, nixproc-bsdrc-switch, nixproc-cygrunsrv-switch, and nixproc-disnix-switch that all work with the same processes model.

Integrating Docker into the process management framework


Both Docker and the Nix-based process management framework are multi-functional solutions. After comparing the functionality of Docker and the process management framework, I realized that it is possible to integrate Docker into this framework as well, if I use it in an unconventional way, by disabling or substituting some of its conflicting features.

Using a shared Nix store


As explained in the beginning of this blog post, Docker's primary means to provide dependencies is by using images that are self-contained root file systems containing all necessary files (e.g. packages, configuration files) to allow an application to work.

In the previous blog post, I have also demonstrated that instead of using traditional Dockerfiles to construct images, we can also use the Nix package manager as a replacement. A Docker image built by Nix is typically smaller than a conventional Docker image built from a base Linux distribution, because it only contains the runtime dependencies that an application actually needs.

A major disadvantage of using Nix constructed Docker images is that they only consist of one layer -- as a result, there is no reuse between container instances running different services that use common libraries. To alleviate this problem, Nix can also build layered images, in which common dependencies are isolated in separate layers as much as possible.

There is even a more optimal reuse strategy possible -- when running Docker on a machine that also has Nix installed, we do not need to put anything that is in the Nix store in a disk image. Instead, we can share the host system's Nix store between Docker containers.

This may sound scary, but as I have explained in the previous blog post, paths in the Nix store are prefixed with SHA256 hash codes. When two Nix store paths with identical hash codes are built on two different machines, their build results should be (nearly) bit-identical. As a result, it is safe to share the same Nix store path between multiple machines and containers.

A hacky way to build a container image, without actually putting any of the Nix-built packages in the container, is to use the following expression:


with import <nixpkgs> {};

let
  cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
in
dockerTools.buildImage {
  name = "nginxexp";
  tag = "test";

  runAsRoot = ''
    ${dockerTools.shadowSetup}
    groupadd -r nogroup
    useradd -r nobody -g nogroup -d /dev/null
    mkdir -p /var/log/nginx /var/cache/nginx /var/www
    cp ${./index.html} /var/www/index.html
  '';

  config = {
    Cmd = map (arg: builtins.unsafeDiscardStringContext arg) cmd;
    Expose = {
      "80/tcp" = {};
    };
  };
}

The above expression is quite similar to the Nix-based Docker image example shown in the previous blog post, that deploys Nginx serving a static HTML page.

The only difference is how I configure the start command (the Cmd parameter). In the Nix expression language, strings have context -- if a string with context is passed to a build function (any string that contains a value that evaluates to a Nix store path), then the corresponding Nix store paths automatically become a dependency of the package that the build function builds.

By using the unsafe builtins.unsafeDiscardStringContext function, I can discard the context of strings. As a result, the Nix packages that the image requires are still built; however, because their context is discarded, they are no longer considered dependencies of the Docker image. As a consequence, they will not be integrated into the image that dockerTools.buildImage creates.

(As a sidenote: there are still two Nix store paths that end up in the image, namely bash and glibc, which is a runtime dependency of bash. This is caused by the fact that the internals of the dockerTools.buildImage function refer to bash without discarding its context. In theory, it is possible to eliminate this dependency as well.)

To run the container and make sure that the required Nix store paths are available, I can mount the host system's Nix store as a shared volume:


$ docker run -p 8080:80 -v /nix/store:/nix/store -it nginxexp:latest

By mounting the host system's Nix store (with the -v parameter), Nginx should still behave as expected -- it is not provided by the image, but referenced from the shared Nix store.

(As a sidenote: mounting the host system's Nix store for sharing is not a new idea. It has already been intensively used by the NixOS test driver for many years to rapidly create QEMU virtual machines for system integration tests).

Using the host system's network


As explained in the previous blog post, every Docker container by default runs in its own private network namespace making it possible for services to bind to any port without conflicting with the services on the host system or services provided by any other container.

The Nix process management framework does not work with private networks, because it is not a generalizable concept (i.e. namespaces are a Linux-only feature). Aside from Docker, the only other process manager supported by the framework that can work with namespaces is systemd.

To prevent ports and other dynamic resources from conflicting with each other, the process management framework makes it possible to configure them through instance function parameters. If the instance parameters have unique values, they will not conflict with other process instances (based on the assumption that the packager has identified all possible conflicts that a process might have).

Because we already have a framework that prevents conflicts, we can also instruct Docker to use the host system's network with the --network host parameter:


$ docker run -v /nix/store:/nix/store --network host -it nginxexp:latest

The only thing the framework cannot provide you is protection -- with a private network namespace, a malicious service cannot connect to ports used by other containers or the host system, but the framework cannot protect you in that way.

Mapping a base directory for storing state


Services that run in containers are not always stateless -- they may rely on data that should be persistently stored, such as databases. The Docker recommendation to handle persistent state is not to store it in a container's writable layer, but on a shared volume on the host system.

Data stored outside the container makes it possible to reliably upgrade a container -- when it is desired to install a newer version of an application, the container can be discarded and recreated from a new image.

For the Nix process management framework, integration with a state directory outside the container is also useful. With an extra shared volume, we can mount the host system's state directory:


$ docker run -v /nix/store:/nix/store \
    -v /var:/var --network host -it nginxexp:latest

Orchestrating containers


The last piece in the puzzle is to orchestrate the containers: we must create or discard them, start or stop them, and perform all required steps in the right order.

Moreover, to prevent the Nix packages that a container needs from being garbage collected, we need to make sure that they are a dependency of a package that is registered as in use.

I came up with my own convention to implement the container deployment process. When building the processes model for the docker process manager, the following files are generated that help me orchestrate the deployment process:


01-webapp-docker-priority
02-nginx-docker-priority
nginx-docker-cmd
nginx-docker-createparams
nginx-docker-settings
webapp-docker-cmd
webapp-docker-createparams
webapp-docker-settings

In the above list, we have the following kinds of files:

  • The files that have a -docker-settings suffix contain general properties of the container, such as the image that needs to be used as a template.
  • The files that have a -docker-createparams suffix contain the command line parameters that are propagated to docker create to create the container. If a container with the same name already exists, the container creation is skipped and the existing instance is used instead.
  • To prevent the Nix packages that a Docker container needs from being garbage collected the generator creates a file with a -docker-cmd suffix containing the Cmd instruction including the full Nix store paths of the packages that a container needs.

    Because the strings' contexts are not discarded in the generation process, the packages become a dependency of the configuration file. As long as this configuration file is deployed, the packages will not get garbage collected.
  • To ensure that the containers are activated in the right order, we have two files that are prefixed with two numeric digits and have a -docker-priority suffix. The numeric digits determine in which order the containers should be activated -- in the above example the webapp process gets activated before Nginx (which acts as a reverse proxy).

With the following command, we can automatically generate the configuration files shown above for all processes in the processes model, and use them to automatically create and start Docker containers for all process instances:


$ nixproc-docker-switch processes.nix
55d833e07428: Loading layer [==================================================>] 46.61MB/46.61MB
Loaded image: webapp:latest
f020f5ecdc6595f029cf46db9cb6f05024892ce6d9b1bbdf9eac78f8a178efd7
nixproc-webapp
95b595c533d4: Loading layer [==================================================>] 46.61MB/46.61MB
Loaded image: nginx:latest
b195cd1fba24d4ec8542c3576b4e3a3889682600f0accc3ba2a195a44bf41846
nixproc-nginx

The result is two running Docker containers that correspond to the process instances shown in the processes model:


$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b195cd1fba24 nginx:latest "/nix/store/j3v4fz9h…" 15 seconds ago Up 14 seconds nixproc-nginx
f020f5ecdc65 webapp:latest "/nix/store/b6pz847g…" 16 seconds ago Up 15 seconds nixproc-webapp

and we should be able to access the example HTML page, by opening the following URL: http://localhost:8080 in a web browser.

Deploying Docker containers in a heterogeneous and/or distributed environment


As explained in my previous blog posts about the experimental Nix process management framework, the processes model is a subset of a Disnix services model. When it is desired to deploy processes to a network of machines or combine processes with other kinds of services, we can easily turn a processes model into a services model.

For example, I can change the processes model shown earlier into a services model that deploys Docker containers:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir;
    inherit forceDisableUserChange;
    processManager = "docker";
  };
in
rec {
  webapp = rec {
    name = "webapp";

    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };

    type = "docker-container";
  };

  nginxReverseProxy = rec {
    name = "nginxReverseProxy";

    port = 8080;

    pkg = constructors.nginxReverseProxyHostBased {
      webapps = [ webapp ];
      inherit port;
    } {};

    type = "docker-container";
  };
}

In the above example, I have added a name attribute to each process (a required property for Disnix service models) and a type attribute referring to: docker-container.

In Disnix, a service could take any form. A plugin system (named Dysnomia) is responsible for managing the life-cycle of a service, such as activating or deactivating it. The type attribute is used to tell Disnix that we should use the docker-container Dysnomia module. This module will automatically create and start the container on activation, and stop and discard the container on deactivation.

To deploy the above services to a network of machines, we require an infrastructure model (that captures the available machines and their relevant deployment properties):


{
  test1.properties.hostname = "test1";
}

The above infrastructure model contains only one target machine: test1 with a hostname that is identical to the machine name.

We also require a distribution model that maps services in the services model to machines in the infrastructure model:


{infrastructure}:

{
  webapp = [ infrastructure.test1 ];
  nginxReverseProxy = [ infrastructure.test1 ];
}

In the above distribution model, we map all the processes in the services model to the test1 target machine in the infrastructure model.

With the following command, we can deploy our Docker containers to the remote test1 target machine:


$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

When the above command succeeds, the test1 target machine provides running webapp and nginxReverseProxy containers.

(As a sidenote: to make Docker container deployments work with Disnix, the Docker service already needs to be predeployed to the target machines in the infrastructure model, or the Docker daemon needs to be deployed as a container provider).

Deploying conventional Docker containers with Disnix


The nice thing about the docker-container Dysnomia module is that it is generic enough to also work with conventional Docker containers (that work with images, not a shared Nix store).

For example, we can deploy Nginx as a regular container built with the dockerTools.buildImage function:


{dockerTools, stdenv, nginx}:

let
  dockerImage = dockerTools.buildImage {
    name = "nginxexp";
    tag = "test";
    contents = nginx;

    runAsRoot = ''
      ${dockerTools.shadowSetup}
      groupadd -r nogroup
      useradd -r nobody -g nogroup -d /dev/null
      mkdir -p /var/log/nginx /var/cache/nginx /var/www
      cp ${./index.html} /var/www/index.html
    '';

    config = {
      Cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
      Expose = {
        "80/tcp" = {};
      };
    };
  };
in
stdenv.mkDerivation {
  name = "nginxexp";
  buildCommand = ''
    mkdir -p $out
    cat > $out/nginxexp-docker-settings <<EOF
    dockerImage=${dockerImage}
    dockerImageTag=nginxexp:test
    EOF

    cat > $out/nginxexp-docker-createparams <<EOF
    -p
    8080:80
    EOF
  '';
}

In the above example, instead of using the process manager-agnostic createManagedProcess, I directly construct a Docker-based Nginx image (by using the dockerImage attribute) and container configuration files (in the buildCommand parameter) to make the container deployments work with the docker-container Dysnomia module.

It is also possible to deploy containers from images that are constructed with Dockerfiles. After we have built an image in the traditional way, we can export it from Docker with the following command:


$ docker save nginx-debian -o nginx-debian.tar.gz

and then we can use the following Nix expression to deploy a container using our exported image:


{dockerTools, stdenv, nginx}:

stdenv.mkDerivation {
  name = "nginxexp";
  buildCommand = ''
    mkdir -p $out
    cat > $out/nginxexp-docker-settings <<EOF
    dockerImage=${./nginx-debian.tar.gz}
    dockerImageTag=nginxexp:test
    EOF

    cat > $out/nginxexp-docker-createparams <<EOF
    -p
    8080:80
    EOF
  '';
}

In the above expression, the dockerImage property refers to our exported image.

Although Disnix is flexible enough to also orchestrate Docker containers (thanks to its generalized plugin architecture), I did not develop the docker-container Dysnomia module to make Disnix compete with existing container orchestration solutions, such as Kubernetes or Docker Swarm.

Disnix is a heterogeneous deployment tool that can be used to integrate units that have all kinds of shapes and forms on all kinds of operating systems -- having a docker-container module makes it possible to mix Docker containers with other service types that Disnix and Dysnomia support.

Discussion


In this blog post, I have demonstrated that we can integrate Docker as a process management backend option into the experimental Nix process management framework, by substituting some of its conflicting features.

Moreover, because a Disnix service model is a superset of a processes model, we can also use Disnix as a simple Docker container orchestrator and integrate Docker containers with other kinds of services.

Compared to Docker, the Nix process management framework supports a number of features that Docker does not:

  • Docker is heavily developed around Linux-specific concepts, such as namespaces and cgroups. As a result, it can only be used to deploy software built for Linux.

    The Nix process management framework should work on any operating system that is supported by the Nix package manager (e.g. Nix also has first class support for macOS, and can also be used on other UNIX-like operating systems such as FreeBSD). The same also applies to Disnix.
  • The Nix process management framework can work with sysvinit, BSD rc and Disnix process scripts, that do not require any external service to manage a process' life-cycle. This is convenient for local unprivileged user deployments. To deploy Docker containers, you need to have the Docker daemon installed first.
  • Docker has an experimental rootless deployment mode, but in the Nix process management framework facilitating unprivileged user deployments is a first class concept.

On the other hand, the Nix process management framework does not take over all responsibilities of Docker:

  • Docker heavily relies on namespaces to prevent resource conflicts, such as overlapping TCP ports and global state directories. The Nix process management framework solves conflicts by avoiding them (i.e. configuring properties in such a way that they are unique). The conflict avoidance approach works as long as a service is well-specified. Unfortunately, preventing conflicts is not a hard guarantee that the tool can provide you.
  • Docker also provides some degree of protection by using namespaces and cgroups. The Nix process management framework does not support this out of the box, because these concepts are not generalizable over all the process management backends it supports. (As a sidenote: it is still possible to use these concepts by defining process manager-specific overrides).

From a functionality perspective, docker-compose comes close to the features that the experimental Nix process management framework supports. docker-compose allows you to declaratively define container instances and their dependencies, and automatically deploy them.

However, as its name implies docker-compose is specifically designed for deploying Docker containers whereas the Nix process management framework is more general -- it should work with all kinds of process managers, uses Nix as the primary means to provide dependencies, it uses the Nix expression language for configuration and it should work on a variety of operating systems.

The fact that Docker (and containers in general) are multi-functional solutions is not an observation only made by me. For example, this blog post also demonstrates that containers can work without images.

Availability


The Docker backend has been integrated into the latest development version of the Nix process management framework.

To use the docker-container Dysnomia module (so that Disnix can deploy Docker containers), you need to install the latest development version of Dysnomia.

by Sander van der Burg (noreply@blogger.com) at August 11, 2020 07:18 PM

August 07, 2020

Matej Cotman

Neovim, WSL and Nix

How to use Neovim (Neovim-Qt) under WSL 1/2 with the power of Nix

Intro

Well we all know that generally development on Linux is easier than on Windows, but sometimes you are forced to use Windows. But that does not mean that all those nice tools from Linux are not available to you, as we will see in this post.

Windows has something called WSL, which enables you to run Linux tools natively in a Windows subsystem. Not everything is without issues: you cannot run graphical Linux applications, because Windows does not run an Xorg server. Yes, there are Xorg ports that run there, but in this case that is just one more unwanted layer -- remember, building efficient solutions is what every engineer should strive for.

What I did was use the Windows pre-built binaries of Neovim-Qt and run the Neovim installed with Nix inside WSL.

Ok, you could say then: why not use VS Code with some Vim/Neovim plugin and use the so-called Remote-WSL plugin to access WSL… Well yes, but at least I stumbled upon a few issues. The first was that CPU usage was through the roof when the Remote-WSL extension was in use on WSL1 (I could not just run Windows Update on a client's managed computer), and the suggested fix was to install a specific version of libc with dpkg (which is absurd in the first place, because this is a good way to ruin your whole environment). Applying this fix did the trick for lowering the CPU usage. The second issue came right after: when I wanted to install some package with the APT package manager, the libc install had, like I predicted, done its damage -- I could not install or uninstall anything with APT. Nix comes again to the rescue.

By the way, the sleep command forgot how to work under WSL and Ubuntu 20.04 (source).

Let’s see the solution

Neovim-Qt

Neovim-Qt has nicely built binaries for Windows on their GitHub page, so I just downloaded that zip and unpacked it into C:/Program Files/neovim-qt/. But any location would do.

WSL

Open PowerShell as Administrator and run:

dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart

If, for whatever reason, you do not have an up-to-date Windows that can install WSL2, then reboot now.

Reboot Time

You should now have WSL1 enabled, and you can proceed to install Ubuntu 20.04 (or any other Linux distro you like) from the Microsoft Store. Do not forget to click Launch after installing it (it will ask you to create a user).

Nix

To install Nix, you first need to open a terminal emulator and run wsl.exe (you can also just run it from the Start menu), and then run:

bash <(curl -L https://nixos.org/nix/install)

To finish, you can just close the terminal and open wsl.exe again.

That's it.

The Nix script

Now here is the absolutely most awesome part that connects everything together.

{ pkgs ? import <nixpkgs> {} }:
pkgs.writeScript "run-neovim-qt.sh" ''
    #!${pkgs.stdenv.shell}
    set -e

    # get random free port
    export NVIM_LISTEN="127.0.0.1:$(${pkgs.python3Packages.python}/bin/python -c 'import socket; s=socket.socket(); s.bind(("", 0)); print(s.getsockname()[1]); s.close()')"

    # use python's sleep, because coreutils' sleep does not function under Ubuntu 20.04 and WSL
    #   after delay start nvim-qt - so that nvim starts before the GUI
    { ${pkgs.python3Packages.python}/bin/python -c 'import time; time.sleep(1)'; "''${NVIM_QT_PATH}" --server "$NVIM_LISTEN"; } &

    # start nvim
    ${pkgs.neovim}/bin/nvim --listen "$NVIM_LISTEN" --headless "$@"
''

Save it to your drive or download with wget under WSL:

wget https://raw.githubusercontent.com/matejc/helper_scripts/master/nixes/neovim-qt.nix 

Then build the command with:

nix-build ./neovim-qt.nix

The resulting script is ./result.

Usage

First we need to tell the script where Neovim-Qt is located:

export NVIM_QT_PATH='/mnt/c/Program Files/neovim-qt/bin/nvim-qt.exe'

You can save this line into .bashrc or .profile and restart the terminal, so that you do not need to repeat this step every time you start a WSL shell.

The final step is:

./result my/awesome/code.py

Conclusion

Too much work, you think? Well, how much more time would you spend installing and configuring VS Code or Atom to work in a similar environment? And what about Nix? You can install it without the use of native package managers (in case the native one is b0rked), and once you do, you have the power to install your favorite development environment with a single command.

I like this solution; in my eyes it's simple and efficient. What are your thoughts?

Until next time… I wish you happy hacking!

  • High cpu usage of node process in Remote-WSL extension #2921
  • Neovim-Qt Releases
  • WSL on Windows 10
  • Quick start with Nix

August 07, 2020 10:00 PM

July 31, 2020

Tweag I/O

Nix Flakes, Part 3: Managing NixOS systems

This is the third in a series of blog posts about Nix flakes. The first part motivated why we developed flakes — to improve Nix’s reproducibility, composability and usability — and gave a short tutorial on how to use flakes. The second part showed how flakes enable reliable caching of Nix evaluation results. In this post, we show how flakes can be used to manage NixOS systems in a reproducible and composable way.

What problems are we trying to solve?

Lack of reproducibility

One of the main selling points of NixOS is reproducibility: given a specification of a system, if you run nixos-rebuild to deploy it, you should always get the same actual system (modulo mutable state such as the contents of databases). For instance, we should be able to reproduce in a production environment the exact same configuration that we’ve previously validated in a test environment.

However, the default NixOS workflow doesn’t provide reproducible system configurations out of the box. Consider a typical sequence of commands to upgrade a NixOS system:

  • You edit /etc/nixos/configuration.nix.
  • You run nix-channel --update to get the latest version of the nixpkgs repository (which contains the NixOS sources).
  • You run nixos-rebuild switch, which evaluates and builds a function in the nixpkgs repository that takes /etc/nixos/configuration.nix as an input.

In this workflow, /etc/nixos/configuration.nix might not be under configuration management (e.g. point to a Git repository), or if it is, it might be a dirty working tree. Furthermore, configuration.nix doesn’t specify what Git revision of nixpkgs to use; so if somebody else deploys the same configuration.nix, they might get a very different result.

Lack of traceability

The ability to reproduce a configuration is not very useful if you can’t tell what configuration you’re actually running. That is, from a running system, you should be able to get back to its specification. So there is a lack of traceability: the ability to trace derived artifacts back to their sources. This is an essential property of good configuration management, since without it, we don’t know what we’re actually running in production, so reproducing or fixing problems becomes much harder.

NixOS currently doesn’t not have very good traceability. You can ask a NixOS system what version of Nixpkgs it was built from:

$ nixos-version --json | jq -r .nixpkgsRevision
a84b797b28eb104db758b5cb2b61ba8face6744b

Unfortunately, this doesn’t allow you to recover configuration.nix or any other external NixOS modules that were used by the configuration.

Lack of composability

It’s easy to enable a package or system service in a NixOS configuration if it is part of the nixpkgs repository: you just add a line like environment.systemPackages = [ pkgs.hello ]; or services.postgresql.enable = true; to your configuration.nix. But what if we want to use a package or service that isn’t part of Nixpkgs? Then we’re forced to use mechanisms like $NIX_PATH, builtins.fetchGit, imports using relative paths, and so on. These are not standardized (since everybody uses different conventions) and are inconvenient to use (for example, when using $NIX_PATH, it’s the user’s responsibility to put external repositories in the right directories).

Put another way: NixOS is currently built around a monorepo workflow — the entire universe should be added to the nixpkgs repository, because anything that isn’t, is much harder to use.

It’s worth noting that any NixOS system configuration already violates the monorepo assumption: your system’s configuration.nix is not part of the nixpkgs repository.

Using flakes for NixOS configurations

In the previous post, we saw that flakes are (typically) Git repositories that have a file named flake.nix, providing a standardized interface to Nix artifacts. We saw flakes that provide packages and development environments; now we’ll use them to provide NixOS system configurations. This solves the problems described above:

  • Reproducibility: the entire system configuration (including everything it depends on) is captured by the flake and its lock file. So if two people check out the same Git revision of a flake and build it, they should get the same result.
  • Traceability: nixos-version prints the Git revision of the top-level configuration flake, not its nixpkgs input.
  • Composability: it’s easy to pull in packages and modules from other repositories as flake inputs.

Prerequisites

Flake support has been added as an experimental feature to NixOS 20.03. However, flake support is not part of the current stable release of Nix (2.3). So to get a NixOS system that supports flakes, you first need to switch to the nixUnstable package and enable some experimental features. This can be done by adding the following to configuration.nix:

nix.package = pkgs.nixUnstable;
nix.extraOptions = ''
  experimental-features = nix-command flakes
'';

Creating a NixOS configuration flake

Let’s create a flake that contains the configuration for a NixOS container.

$ git init my-flake
$ cd my-flake
$ nix flake init -t templates#simpleContainer
$ git commit -a -m 'Initial version'

Note that the -t flag to nix flake init specifies a template from which to copy the initial contents of the flake. This is useful for getting started. To see what templates are available, you can run:

$ nix flake show templates

For reference, this is what the initial flake.nix looks like:

{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-20.03";

  outputs = { self, nixpkgs }: {

    nixosConfigurations.container = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules =
        [ ({ pkgs, ... }: {
            boot.isContainer = true;

            # Let 'nixos-version --json' know about the Git revision
            # of this flake.
            system.configurationRevision = nixpkgs.lib.mkIf (self ? rev) self.rev;

            # Network configuration.
            networking.useDHCP = false;
            networking.firewall.allowedTCPPorts = [ 80 ];

            # Enable a web server.
            services.httpd = {
              enable = true;
              adminAddr = "morty@example.org";
            };
          })
        ];
    };

  };
}

That is, the flake has one input, namely nixpkgs - specifically the 20.03 branch. It has one output, nixosConfigurations.container, which evaluates a NixOS configuration for tools like nixos-rebuild and nixos-container. The main argument is modules, which is a list of NixOS configuration modules. This takes the place of the file configuration.nix in non-flake deployments. (In fact, you can write modules = [ ./configuration.nix ] if you’re converting a pre-flake NixOS configuration.)

Let’s create and start the container! (Note that nixos-container currently requires you to be root.)

# nixos-container create flake-test --flake /path/to/my-flake
host IP is 10.233.4.1, container IP is 10.233.4.2

# nixos-container start flake-test

To check whether the container works, let’s try to connect to it:

$ curl http://flake-test/
<html><body><h1>It works!</h1></body></html>

As an aside, if you just want to build the container without the nixos-container command, you can do so as follows:

$ nix build /path/to/my-flake#nixosConfigurations.container.config.system.build.toplevel

Note that system.build.toplevel is an internal NixOS option that evaluates to the “system” derivation that commands like nixos-rebuild, nixos-install and nixos-container build and activate. The symlink /run/current-system points to the output of this derivation.

Hermetic evaluation

One big difference between “regular” NixOS systems and flake-based NixOS systems is that the latter record the Git revisions from which they were built. We can query this as follows:

# nixos-container run flake-test -- nixos-version --json
{"configurationRevision":"9190c396f4dcfc734e554768c53a81d1c231c6a7"
,"nixosVersion":"20.03.20200622.13c15f2"
,"nixpkgsRevision":"13c15f26d44cf7f54197891a6f0c78ce8149b037"}

Here, configurationRevision is the Git revision of the repository /path/to/my-flake. Because evaluation is hermetic, and the lock file locks all flake inputs such as nixpkgs, knowing the revision 9190c39… allows you to completely reconstruct this configuration at a later point in time. For example, if you want to deploy this particular configuration to a container, you can do:

# nixos-container update flake-test \
    --flake /path/to/my-flake?rev=9190c396f4dcfc734e554768c53a81d1c231c6a7

Dirty configurations

It’s not required that you commit all changes to a configuration before deploying it. For example, if you change the adminAddr line in flake.nix to

adminAddr = "rick@example.org";

and redeploy the container, you will get:

# nixos-container update flake-test
warning: Git tree '/path/to/my-flake' is dirty
...
reloading container...

and the container will no longer have a configuration Git revision:

# nixos-container run flake-test -- nixos-version --json | jq .configurationRevision
null

While this may be convenient for testing, in production we really want to ensure that systems are deployed from clean Git trees. One way is to disallow dirty trees on the command line:

# nixos-container update flake-test --no-allow-dirty
error: --- Error -------------------- nix
Git tree '/path/to/my-flake' is dirty

Another is to require a clean Git tree in flake.nix, for instance by adding a check to the definition of system.configurationRevision:

system.configurationRevision =
  if self ? rev
  then self.rev
  else throw "Refusing to build from a dirty Git tree!";

Adding modules from third-party flakes

One of the main goals of flake-based NixOS is to make it easier to use packages and modules that are not included in the nixpkgs repository. As an example, we’ll add Hydra (a continuous integration server) to our container.

Here’s how we add it to our container. We specify it as an additional input:

  inputs.hydra.url = "github:NixOS/hydra";

and as a corresponding function argument to the outputs function:

  outputs = { self, nixpkgs, hydra }: {

Finally, we enable the NixOS module provided by the hydra flake:

      modules =
        [ hydra.nixosModules.hydraTest

          ({ pkgs, ... }: {
            ... our own configuration ...

            # Hydra runs on port 3000 by default, so open it in the firewall.
            networking.firewall.allowedTCPPorts = [ 3000 ];
          })
        ];

Note that we can discover the name of this module by using nix flake show:

$ nix flake show github:NixOS/hydra
github:NixOS/hydra/d0deebc4fc95dbeb0249f7b774b03d366596fbed
├───…
├───nixosModules
│   ├───hydra: NixOS module
│   ├───hydraProxy: NixOS module
│   └───hydraTest: NixOS module
└───overlay: Nixpkgs overlay

After committing this change and running nixos-container update, we can check whether hydra is working in the container by visiting http://flake-test:3000/ in a web browser.
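
A quick way to do the same check from the command line is to request the HTTP headers of Hydra's web interface (assuming the container name used above):

$ curl -I http://flake-test:3000/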

Working with lock files

There are a few command-line flags accepted by nix, nixos-rebuild and nixos-container that make updating lock files more convenient. A very common action is to update a flake input to the latest version; for example,

$ nixos-container update flake-test --update-input nixpkgs --commit-lock-file

updates the nixpkgs input to the latest revision on the nixos-20.03 branch, and commits the new lock file with a commit message that records the input change.

A useful flag during development is --override-input, which allows you to point a flake input to another location, completely overriding the input location specified by flake.nix. For example, this is how you can build the container against a local Git checkout of Hydra:

$ nixos-container update flake-test --override-input hydra /path/to/my/hydra

Adding overlays from third-party flakes

Similarly, we can add Nixpkgs overlays from other flakes. (Nixpkgs overlays add or override packages in the pkgs set.) For example, here is how you add the overlay provided by the nix flake:

  outputs = { self, nixpkgs, nix }: {
    nixosConfigurations.container = nixpkgs.lib.nixosSystem {
      ...
      modules =
        [
          ({ pkgs, ... }: {
            nixpkgs.overlays = [ nix.overlay ];
            ...
          })
        ];
    };
  };
}

Using nixos-rebuild

Above we saw how to manage NixOS containers using flakes. Managing “real” NixOS systems works much the same, except using nixos-rebuild instead of nixos-container. For example,

# nixos-rebuild switch --flake /path/to/my-flake#my-machine

builds and activates the configuration specified by the flake output nixosConfigurations.my-machine. If you omit the name of the configuration (#my-machine), nixos-rebuild defaults to using the current host name.
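
If you want to verify a configuration before activating it, a build-only invocation along the same lines should work (a sketch mirroring the command above):

$ nixos-rebuild build --flake /path/to/my-flake#my-machine

This produces a result symlink pointing at the system derivation, without touching the running system.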

Pinning Nixpkgs

It’s often convenient to pin the nixpkgs flake to the exact version of nixpkgs used to build the system. This ensures that commands like nix shell nixpkgs#<package> work more efficiently since many or all of the dependencies of <package> will already be present. Here is a bit of NixOS configuration that pins nixpkgs in the system-wide flake registry:

nix.registry.nixpkgs.flake = nixpkgs;

Note that this only affects commands that reference nixpkgs without further qualifiers; more specific flake references like nixpkgs/nixos-20.03 or nixpkgs/348503b6345947082ff8be933dda7ebeddbb2762 are unaffected.
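
With the registry pinned, bare nixpkgs references on the command line resolve to the exact revision the system was built from. As a quick sketch (hello is just an arbitrary example package):

$ nix registry list
$ nix shell nixpkgs#hello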

Conclusion

In this blog post we saw how Nix flakes make NixOS configurations hermetic and reproducible. In a future post, we’ll show how we can do the same for cloud deployments using NixOps.

Acknowledgment: The development of flakes was partially funded by Target Corporation.

July 31, 2020 12:00 AM

July 29, 2020

Sander van der Burg

On using Nix and Docker as deployment automation solutions: similarities and differences

As frequent readers of my blog may probably already know, I have been using Nix-related tools for quite some time to solve many of my deployment automation problems.

Although I have worked in environments in which Nix and its related sub-projects are well known, when I show some of Nix's use cases to larger groups of DevOps-minded people, a frequent response I hear is that it looks very similar to Docker. People also often ask me what advantages Nix has over Docker.

So far, I have not even covered Docker once on my blog, despite its popularity, including very popular sister projects such as docker-compose and Kubernetes.

The main reason why I never wrote anything about Docker is not because I do not know about it or how to use it, but simply because I never had any notable use cases that would lead to something publishable -- most problems for which Docker could be a solution, I solved by other means, typically by using a Nix-based solution somewhere in the stack.

Docker is a container-based deployment solution. It was not the first (neither in the Linux world, nor in the UNIX world in general), but since its introduction in 2013 it has grown very rapidly in popularity. I believe its popularity can mainly be attributed to its ease of use and its extensible image ecosystem: Docker Hub.

In fact, Docker (and Kubernetes, a container orchestration solution that incorporates Docker) have become so popular that they have set a new standard when it comes to organizing systems and automating deployment -- today, in many environments, I get the feeling that the question is no longer what kind of deployment solution is best for a particular system and organization, but rather: "how do we get it into containers?".

The same thing applies to the "microservices paradigm" that is supposed to facilitate modular systems. If I compare the characteristics of microservices with the definition of a "software component" in Clemens Szyperski's Component Software book, then I would argue that they have more in common than they have differences.

One of the reasons why I think microservices are considered a success (or at least considered moderately more successful by some than older concepts, such as web services and software components) is that they map easily onto containers, which can be conveniently managed with Docker. For some people, a microservice and a Docker container are pretty much the same thing.

Modular software systems have all kinds of advantages, but their biggest disadvantage is that the deployment of a system becomes more complicated as the number of components and dependencies grows. With Docker containers this problem can be (somewhat) addressed in a convenient way.

In this blog post, I will provide my view on Nix and Docker -- I will elaborate about some of their key concepts, explain in what ways they are different and similar, and I will show some use-cases in which both solutions can be combined to achieve interesting results.

Application domains


Nix and Docker are both deployment solutions for slightly different, but also somewhat overlapping, application domains.

The Nix package manager (on the recently revised homepage) advertises itself as follows:

Nix is a powerful package manager for Linux and other Unix systems that makes package management reliable and reproducible. Share your development and build environments across different machines.

whereas Docker advertises itself as follows (in the getting started guide):

Docker is an open platform for developing, shipping, and running applications.

To summarize my interpretations of the descriptions:

  • Nix's chief responsibility is, as its description implies, package management: it provides a collection of software tools that automate the process of installing, upgrading, configuring, and removing computer programs in a consistent manner.

    There are two properties that set Nix apart from most other package management solutions. First, Nix is a source-based package manager -- it can be used as a tool to construct packages from source code and their dependencies, by invoking build scripts in "pure build environments".

    Moreover, it borrows concepts from purely functional programming languages to make deployments reproducible, reliable and efficient.
  • Docker's chief responsibility is much broader than package management -- Docker facilitates full process/service life-cycle management. Package management can be considered to be a sub problem of this domain, as I will explain later in this blog post.

Although both solutions map to slightly different domains, there is one prominent objective that both solutions have in common. They both facilitate reproducible deployment.

With Nix, the goal is that if you build a package from source code and a set of dependencies, and perform the same build with the same inputs on a different machine, the build results should be (nearly) bit-identical.

With Docker, the objective is to facilitate reproducible environments for running applications -- when running an application container on one machine that provides Docker, and running the same application container on another machine, they both should work in an identical way.

Although both solutions facilitate reproducible deployments, their reproducibility properties are based on different kinds of concepts. I will explain more about them in the next sections.

Nix concepts


As explained earlier, Nix is a source-based package manager that borrows concepts from purely functional programming languages. Packages are built from build recipes called Nix expressions, such as:


with import <nixpkgs> {};

stdenv.mkDerivation {
  name = "file-5.38";

  src = fetchurl {
    url = "ftp://ftp.astron.com/pub/file/file-5.38.tar.gz";
    sha256 = "0d7s376b4xqymnrsjxi3nsv3f5v89pzfspzml2pcajdk5by2yg2r";
  };

  buildInputs = [ zlib ];

  meta = {
    homepage = https://darwinsys.com/file;
    description = "A program that shows the type of files";
  };
}

The above Nix expression invokes the function stdenv.mkDerivation, which creates a build environment in which we build the package file from source code:

  • The name parameter provides the package name.
  • The src parameter invokes the fetchurl function that specifies where to download the source tarball from.
  • buildInputs refers to the build-time dependencies that the package needs. The file package only uses one dependency: zlib, which provides deflate compression support.

    The buildInputs parameter is used to automatically configure the build environment in such a way that zlib can be found as a library dependency by the build script.
  • The meta parameter specifies the package's metadata. Metadata is used by Nix to provide information about the package, but it is not used by the build script.

The Nix expression does not specify any build instructions -- if no build instructions are provided, the stdenv.mkDerivation function executes the standard GNU Autotools build procedure: ./configure; make; make install.
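
When a package does not follow the Autotools convention, these defaults can be overridden by providing explicit phases. Below is a minimal, hypothetical sketch (the package name and build commands are made up) that replaces the build and install phases:


with import <nixpkgs> {};

stdenv.mkDerivation {
  name = "myhello-1.0";  # hypothetical package
  src = ./.;             # assumes the source code lives next to this file

  # Override the default GNU Autotools phases with explicit instructions
  buildPhase = ''
    $CC -o hello hello.c
  '';

  installPhase = ''
    mkdir -p $out/bin
    cp hello $out/bin/
  '';
}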

Nix combines several concepts to make builds more reliable and reproducible.

Foremost, packages managed by Nix are stored in a so-called Nix store (/nix/store) in which every package build resides in its own directory.

When we build the above Nix expression with the following command:


$ nix-build file.nix

then we may get the following Nix store path as output:


/nix/store/6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g-file-5.38

Each entry in the Nix store has a SHA256-derived hash prefix (e.g. 6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g in the path above) that is derived from all the build inputs used to build the package.

If we were to build file, for example, with a different build script or a different version of zlib, then the resulting Nix store path would have a different hash prefix. As a result, we can safely store multiple versions and variants of the same package next to each other, because they will never share the same name.

Because each package resides in its own directory in the Nix store, rather than global directories that are commonly used on conventional Linux systems, such as /bin and /lib, we get stricter purity guarantees -- dependencies can typically not be found if they have not been specified in any of the search environment variables (e.g. PATH) or provided as build parameters.

In conventional Linux systems, package builds might still accidentally succeed if they unknowingly use an undeclared dependency. When deploying such a package to another system that does not have this undeclared dependency installed, the package might not work properly or not at all.

In simple single-user Nix installations, builds typically get executed in an environment in which most environment variables (including search path environment variables, such as PATH) are cleared or set to dummy values.

Build abstraction functions (such as stdenv.mkDerivation) will populate the search path environment variables (e.g. PATH, CLASSPATH, PYTHONPATH etc.) and configure build parameters to ensure that the dependencies in the Nix store can be found.

Builds are only allowed to write in the build directory or designated output folders in the Nix store.

When a build completes successfully, its results are made immutable (by removing the write permission bits in the Nix store) and their timestamps are reset to 1 second after the epoch (to improve build determinism).

Storing packages in isolation and providing an environment with cleared environment variables is obviously not a guarantee that builds will be pure. For example, build scripts may still have hard-coded absolute paths to executables on the host system, such as /bin/install and a C compiler may still implicitly search for headers in /usr/include.

To alleviate the problem with hard-coded global directory references, some common build utilities, such as GCC, deployed by Nix have been patched to ignore global directories, such as /usr/include.

When using Nix in multi-user mode, extra precautions have been taken to ensure build purity:

  • Each build runs as an unprivileged user that does not have write access to any directory other than its own build directory and the designated output Nix store paths.
  • On Linux, a build can optionally run in a chroot environment, which completely disables access to the global directories of the host system. In addition, the Nix store paths of all declared dependencies are bind-mounted, preventing the build process from accessing undeclared dependencies in the Nix store (the chances that you encounter such a build are slim, but still...)
  • On Linux kernels that support namespaces, the Nix build environment will use them to improve build purity.

    The network namespace helps the Nix builder to prevent a build process from accessing the network -- when a build process downloads an undeclared dependency from a remote location, we cannot be sure that we get a predictable result.

    In Nix, only builds that are so-called fixed output derivations (whose output hashes need to be known in advance) are allowed to download files from remote locations, because their output results can be verified.

    (As a sidenote: namespaces are also intensively used by Docker containers, as I will explain in the next section.)
  • On macOS, builds can optionally be executed in an app sandbox, that can also be used to restrict access to various kinds of shared resources, such as network access.

Besides isolation, using hash code prefixes has another advantage. Because every build with the same hash code is (nearly) bit-identical, it also provides a nice optimization opportunity.

When we evaluate a Nix expression and the resulting hash code corresponds to a Nix store path that is already valid, then we do not have to build the package again -- because the result would be bit-identical, we can simply return the Nix store path of the package that is already in the Nix store.

This property is also used by Nix to facilitate transparent binary package deployments. If we want to build a package with a certain hash prefix, and we know that another machine or binary cache already has this package in its Nix store, then we can download a binary substitute.

Another interesting benefit of using hash codes is that we can also identify the runtime dependencies that a package needs -- if a Nix store path contains references to other Nix store paths, then we know that these are runtime dependencies of the corresponding package.

Scanning for Nix store paths may sound scary, but there is only a very slim chance that a hash code string represents something else. In practice, it works really well.

For example, the following shell command shows all the runtime dependencies of the file package:


$ nix-store -qR /nix/store/6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g-file-5.38
/nix/store/y8n2b9nwjrgfx3kvi3vywvfib2cw5xa6-libunistring-0.9.10
/nix/store/fhg84pzckx2igmcsvg92x1wpvl1dmybf-libidn2-2.3.0
/nix/store/bqbg6hb2jsl3kvf6jgmgfdqy06fpjrrn-glibc-2.30
/nix/store/5x6l9xm5dp6v113dpfv673qvhwjyb7p5-zlib-1.2.11
/nix/store/6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g-file-5.38

If we query the dependencies of another package that is built from the same Nix packages set, such as cpio:


$ nix-store -qR /nix/store/bzm0mszhvbr6hp4gmar4czsn52hz07q1-cpio-2.13
/nix/store/y8n2b9nwjrgfx3kvi3vywvfib2cw5xa6-libunistring-0.9.10
/nix/store/fhg84pzckx2igmcsvg92x1wpvl1dmybf-libidn2-2.3.0
/nix/store/bqbg6hb2jsl3kvf6jgmgfdqy06fpjrrn-glibc-2.30
/nix/store/bzm0mszhvbr6hp4gmar4czsn52hz07q1-cpio-2.13

When looking at the outputs above, you will probably notice that both file and cpio share the same kinds of dependencies (e.g. libidn2, libunistring and glibc), with the same hash code prefixes. Because they are the same Nix store paths, they are shared on disk (and in RAM, because the operating system caches the same files in memory), leading to more efficient disk and RAM usage.

The fact that we can detect references to Nix store paths is because packages in the Nix package repository use an unorthodox form of static linking.

For example, ELF executables built with Nix have the store paths of their library dependencies in their RPATH header values (the ld command in Nixpkgs has been wrapped to transparently add the paths of library dependencies to a binary's RPATH).

Python programs (and other programs written in interpreted languages) typically use wrapper scripts that set the PYTHONPATH (or equivalent) environment variables to contain Nix store paths providing the dependencies.
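
As an illustration, such a generated wrapper script roughly looks like the sketch below (all store paths and program names are hypothetical placeholders):


#!/nix/store/<hash>-bash-4.4-p23/bin/bash -e
export PYTHONPATH='/nix/store/<hash>-python3.8-requests-2.23.0/lib/python3.8/site-packages'${PYTHONPATH:+':'}$PYTHONPATH
exec "/nix/store/<hash>-myapp-1.0/bin/.myapp-wrapped" "$@"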

Docker concepts


The Docker overview page states the following about what Docker can do:

When you use Docker, you are creating and using images, containers, networks, volumes, plugins, and other objects.

Although you can create many kinds of objects with Docker, the two most important objects are the following:

  • Images. The overview page states: "An image is a read-only template with instructions for creating a Docker container.".

    More concretely, this means that images are created from build recipes called Dockerfiles. They produce self-contained root file systems containing all the files necessary to run a program, such as binaries, libraries and configuration files. The resulting image is immutable (read-only) and cannot change after it has been built.
  • Containers. The overview gives the following description: "A container is a runnable instance of an image".

    More specifically, this means that a container's life-cycle (whether it is in a started or stopped state) is bound to the life-cycle of a root process, which runs in a (somewhat) isolated environment using the contents of a Docker image as its root file system.

Besides the object types explained above, there are many more kinds of objects, such as volumes (that mount a directory from the host file system to a path in the container) and port forwardings from the host system to a container. For more information about these remaining objects, consult the Docker documentation.

Docker combines several concepts to facilitate reproducible and reliable container deployment. To be able to isolate containers from each other, it uses several kinds of Linux namespaces:

  • The mount namespace: this is, in my opinion, the most important namespace. After setting up a private mount namespace, every subsequent mount that we make will be visible inside the container, but not to other containers/processes that are in a different mount namespace.

    A private mount namespace is used to mount a new root file system (the contents of the Docker image), different from the host system's root file system, with all the essential system software and other artifacts needed to run an application.
  • The Process ID (PID) namespace facilitates process isolation. A process/container with a private PID namespace will not be able to see or control the host system's processes (the opposite is actually possible).
  • The network namespace separates network interfaces from the host system. In a private network namespace, a container has one or more private network interfaces with their own IP addresses, port assignments and firewall settings.

    As a result, a service such as the Apache HTTP server in a Docker container can bind to port 80 without conflicting with another HTTP server that binds to the same port on the host system or in another container instance.
  • The Inter-Process Communication (IPC) namespace separates the ability of processes to communicate with each other via shared-memory mechanisms, such as the SHM family of functions.
  • The UTS namespace isolates system identifiers, such as the hostname and NIS domain name.

Another important concept that containers use is cgroups, which can be used to limit the amount of system resources that a container may consume, such as the amount of RAM.
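
For example, resource limits can be imposed when starting a container (the limits shown are arbitrary examples):


$ docker run --memory=256m --cpus=1.5 -it debian:buster /bin/sh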

Finally, to optimize/reduce storage overhead, Docker uses layers and a union filesystem (there are a variety of file system options for this) to combine these layers by "stacking" them on top of each other.

A running container basically mounts an image's read-only layers on top of each other, and keeps the final layer writable so that processes in the container can create and modify files on the system.

Whenever you construct an image from a Dockerfile, each modification operation generates a new layer. Each layer is immutable (it will never change after it has been created) and is uniquely identifiable with a hash code, similar to Nix store paths.

For example, we can build an image with the following Dockerfile that deploys and runs the Apache HTTP server on a Debian Buster Linux distribution:


FROM debian:buster

RUN apt-get update
RUN apt-get install -y apache2
ADD index.html /var/www/html
CMD ["apachectl", "-D", "FOREGROUND"]
EXPOSE 80/tcp

The above Dockerfile executes the following steps:

  • It takes the debian:buster image from Docker Hub as a base image.
  • It updates the Debian package database (apt-get update) and installs the Apache HTTPD server package from the Debian package repository.
  • It adds an example page (index.html) to the document root folder.
  • It executes the apachectl -D FOREGROUND command-line instruction to start the Apache HTTP server in foreground mode. The container's life-cycle is bound to the life-cycle of this foreground process.
  • It informs Docker that the container listens on TCP port 80. Connecting to port 80 makes it possible for a user to retrieve the example index.html page.

With the following command we can build the image:


$ docker build . -t debian-apache

Resulting in the following layers:


$ docker history debian-apache:latest
IMAGE CREATED CREATED BY SIZE COMMENT
a72c04bd48d6 About an hour ago /bin/sh -c #(nop) EXPOSE 80/tcp 0B
325875da0f6d About an hour ago /bin/sh -c #(nop) CMD ["apachectl" "-D" "FO… 0B
35d9a1dca334 About an hour ago /bin/sh -c #(nop) ADD file:18aed37573327bee1… 129B
59ee7771f1bc About an hour ago /bin/sh -c apt-get install -y apache2 112MB
c355fe9a587f 2 hours ago /bin/sh -c apt-get update 17.4MB
ae8514941ea4 33 hours ago /bin/sh -c #(nop) CMD ["bash"] 0B
<missing> 33 hours ago /bin/sh -c #(nop) ADD file:89dfd7d3ed77fd5e0… 114MB

As may be observed, the base Debian Buster image and every change made in the Dockerfile results in a new layer with a new hash code, as shown in the IMAGE column.

Layers and Nix store paths share the similarity that they are immutable and they can both be identified with hash codes.

They are also different -- first, a Nix store path is the result of building a package or a static artifact, whereas a layer is the result of making a filesystem modification. Second, for a Nix store path, the hash code is derived from all inputs, whereas the hash code of a layer is derived from the output: its contents.

Furthermore, Nix store paths are always isolated because they reside in a unique directory (enforced by the hash prefixes), whereas a layer might have files that overlap with files in other layers. In Docker, when a conflict is encountered, the files in the layer that is stacked on top take precedence.

We can construct a second image using the same Debian Linux distribution image that runs Nginx with the following Dockerfile:


FROM debian:buster

RUN apt-get update
RUN apt-get install -y nginx
ADD nginx.conf /etc
ADD index.html /var/www
CMD ["nginx", "-g", "daemon off;", "-c", "/etc/nginx.conf"]
EXPOSE 80/tcp

The above Dockerfile looks similar to the previous, except that we install the Nginx package from the Debian package repository and we use a different command-line instruction to start Nginx in foreground mode.

When building the image, its storage will be optimized -- both images share the same base layer (the Debian Buster Linux base distribution):


$ docker history debian-nginx:latest
IMAGE CREATED CREATED BY SIZE COMMENT
b7ae6f38ae77 2 hours ago /bin/sh -c #(nop) EXPOSE 80/tcp 0B
17027888ce23 2 hours ago /bin/sh -c #(nop) CMD ["nginx" "-g" "daemon… 0B
41a50a3fa73c 2 hours ago /bin/sh -c #(nop) ADD file:18aed37573327bee1… 129B
0f5b2fdcb207 2 hours ago /bin/sh -c #(nop) ADD file:f18afd18cfe2728b3… 189B
e49bbb46138b 2 hours ago /bin/sh -c apt-get install -y nginx 64.2MB
c355fe9a587f 2 hours ago /bin/sh -c apt-get update 17.4MB
ae8514941ea4 33 hours ago /bin/sh -c #(nop) CMD ["bash"] 0B
<missing> 33 hours ago /bin/sh -c #(nop) ADD file:89dfd7d3ed77fd5e0… 114MB

If you compare the above output with the previous docker history output, then you will notice that the bottom layer (last row) refers to the same layer using the same hash code behind the ADD file: statement in the CREATED BY column.

This ability to share the base distribution means that we do not have to store another 114MB Debian Buster image, saving us storage and RAM.

Some common misconceptions


What I have noticed is that quite a few people compare containers to virtual machines (and even give containers that name, incorrectly suggesting that they are the same thing!).

A container is not a virtual machine, because it does not emulate or virtualize hardware -- virtual machines have a virtual CPU, virtual memory, virtual disk etc. that have similar capabilities and limitations as real hardware.

Furthermore, containers do not run a full operating system -- they run processes managed by the host system's Linux kernel. As a result, Docker containers will only deploy software that runs on Linux, and not software that was built for other operating systems.

(As a sidenote: Docker can also be used on Windows and macOS -- on these non-Linux platforms, a virtualized Linux system is used for hosting the containers, but the containers themselves are not separated by using virtualization).

Containers cannot even be considered "light weight virtual machines".

The means to isolate containers from each other only apply to a limited number of potentially shared resources. For example, a resource that cannot be unshared is the system's clock, although this may change in the near future, because in March 2020 a time namespace was added to the newest Linux kernel version. I believe this namespace is not yet offered as a generally available feature in Docker.

Moreover, namespaces, which normally provide separation/isolation between containers, are objects, and these objects can also be shared among multiple container instances (an uncommon use case, because by default every container has its own private namespaces).

For example, it is also possible for two containers to share the same IPC namespace -- then processes in both containers will be able to communicate with each other with a shared-memory IPC mechanism, but they cannot do any IPC with processes on the host system or containers not sharing the same namespace.

Finally, unlike a virtual machine, certain system resources are not constrained by default -- for example, a container is allowed to consume all the RAM of the host machine unless a RAM restriction has been configured. An unrestricted container could potentially affect the stability of the machine as a whole and of other containers running on the same machine.

A comparison of use cases


As mentioned in the introduction, when I show people Nix, then I often get a remark that it looks very similar to Docker.

In this section, I will compare some of their common use cases.

Managing services


In addition to building a Docker image, I believe the most common use case for Docker is to manage services, such as custom REST API services (that are self-contained processes with an embedded web server), web servers or database management systems.

For example, after building an Nginx Docker image (as shown in the section about Docker concepts), we can also launch a container instance using the previously constructed image to serve our example HTML page:


$ docker run -p 8080:80 --name nginx-container -it debian-nginx

The above command creates a new container instance using our Nginx image as its root file system and then starts the container in interactive mode -- the command's execution will block and display the output of the Nginx process on the terminal.

If we want the container to run in the background instead, we can pass the -d (detach) parameter in place of -it.

The -p parameter configures a port forwarding from the host system to the container: traffic to the host system's port 8080 gets forwarded to port 80 in the container, where the Nginx server listens.

We should be able to see the example HTML page, by opening the following URL in a web browser:


http://localhost:8080

After stopping the container, its state will be retained. We can remove the container permanently, by running:


$ docker rm nginx-container

The Nix package manager has no equivalent use case for managing running processes, because its purpose is package management, not process/service life-cycle management.

However, some projects based on Nix do address this. Most notably NixOS, a Linux distribution built around the Nix package manager that uses a single declarative configuration file to capture a machine's configuration, generates systemd unit files to manage the life-cycles of services.

The Nix package manager can also be used on other operating systems, such as conventional Linux distributions, macOS and other UNIX-like systems. There is no universal solution that allows you to complement Nix with service management support on all the platforms that Nix supports.

Experimenting with packages


Another common use case is using Docker to experiment with packages that should not remain permanently installed on a system.

One way of doing this is by directly pulling a Linux distribution image (such as Debian Buster):


$ docker pull debian:buster

and then starting a container in an interactive shell session, in which we install the packages that we want to experiment with:


$ docker run --name myexperiment -it debian:buster /bin/sh
# apt-get update
# apt-get install -y file
# file --version
file-5.22
magic file from /etc/magic:/usr/share/misc/magic

The above example suffices to experiment with the file package, but its deployment is not guaranteed to be reproducible.

For example, the result of running the apt-get instructions shown above is file version 5.22. If I were to run the same instructions a week later, I might get a different version (e.g. 5.23).

The Docker way of making such a deployment scenario reproducible is to install the packages in a Dockerfile as part of the image construction process:


FROM debian:buster

RUN apt-get update
RUN apt-get install -y file

We can build the container image with our file package as follows:


$ docker build . -t file-experiment

and then deploy a container that uses that image:


$ docker run --name myexperiment -it file-experiment /bin/sh

As long as we deploy a container with the same image, we will always have the same version of the file executable:


$ docker run --name myexperiment -it file-experiment /bin/sh
# file --version
file-5.22
magic file from /etc/magic:/usr/share/misc/magic

With Nix, generating reproducible development environments with packages is a first-class feature.

For example, to launch a shell session providing the file package from the Nixpkgs collection, we can simply run:


$ nix-shell -p file
$ file --version
file-5.39
magic file from /nix/store/j4jj3slm15940mpmympb0z99a2ghg49q-file-5.39/share/misc/magic

As long as the Nix expression sources remain the same (e.g. the Nix channel is not updated, or NIX_PATH is hardwired to a certain Git revision of Nixpkgs), the deployment of the development environment is reproducible -- we should always get the same file package with the same Nix store path.
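
One way to hardwire the environment to a specific revision of Nixpkgs is to pass it explicitly on the command line (a sketch; <revision> is a placeholder for a Nixpkgs Git revision):


$ nix-shell -p file -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/<revision>.tar.gz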

Building development projects/arbitrary packages


As shown in the section about Nix's concepts, one of Nix's key features is to generate build environments for building packages and other software projects. I have shown that with a simple Nix expression consisting of only a few lines of code, we can build the file package from source code and its build dependencies in such a dedicated build environment.

In Docker, only building images is a first-class concept. However, building arbitrary software projects and packages is also something you can do by using Docker containers in a specific way.

For example, we can create a bash script that builds the same example package (file) shown in the section that explains Nix's concepts:


#!/bin/bash -e

mkdir -p /build
cd /build

wget ftp://ftp.astron.com/pub/file/file-5.38.tar.gz

tar xfv file-5.38.tar.gz
cd file-5.38
./configure --prefix=/opt/file
make
make install

tar cfvz /out/file-5.38-binaries.tar.gz /opt/file

Compared to its Nix expression counterpart, the build script above does not use any abstractions -- as a consequence, we have to explicitly write all the steps required to build the package:

  • Create a dedicated build directory.
  • Download the source tarball from the FTP server.
  • Unpack the tarball.
  • Execute the standard GNU Autotools build procedure (./configure; make; make install), installing the binaries in an isolated folder (/opt/file).
  • Create a binary tarball from the /opt/file folder and store it in the /out directory (that is a volume shared between the container and the host system).

To create a container that runs the build script and to provide its dependencies in a reproducible way, we need to construct an image from the following Dockerfile:


FROM debian:buster

RUN apt-get update
RUN apt-get install -y wget gcc make libz-dev
ADD ./build.sh /
CMD /build.sh

The above Dockerfile builds an image using the Debian Buster Linux distribution, installs all mandatory build utilities (wget, gcc, and make) and library dependencies (libz-dev), and executes the build script shown above.

With the following command, we can build the image:


$ docker build . -t buildenv

and with the following command, we can create and launch the container that executes the build script (and automatically discard it as soon as it finishes its task):


$ docker run -v $(pwd)/out:/out --rm -t buildenv

To make sure that we can keep our resulting binary tarball after the container gets discarded, we have created a shared volume that maps the out directory in our current working directory onto the /out directory in the container.

When the build script finishes, the output directory should contain our generated binary tarball:


$ ls out/
file-5.38-binaries.tar.gz

Although both Nix and Docker can provide reproducible environments for building packages (in the case of Docker, we need to make sure that all dependencies are provided by the Docker image), builds performed in a Docker container are not guaranteed to be pure, because Docker does not take the same precautions that Nix takes:

  • In the build script, we download the source tarball without checking its integrity. This might cause an impurity, because the tarball on the remote server could change (this could happen for non-malicious as well as malicious reasons).
  • While running the build, we have unrestricted network access. The build script might unknowingly download all kinds of undeclared/unknown dependencies from external sites whose results are not deterministic.
  • We do not reset any timestamps -- as a result, when performing the same build twice in a row, the second result might be slightly different because of the timestamps integrated in the build product.

Coping with these impurities in a Docker workflow is the responsibility of the build script implementer. With Nix, most of it is transparently handled for you.

Moreover, the build script implementer is also responsible for retrieving the build artifact and storing it somewhere, e.g. in a directory outside the container or in a remote artifact repository.

In Nix, the result of a build process is automatically stored in isolation in the Nix store. We can also quite easily turn a Nix store into a binary cache and let other Nix consumers download from it, e.g. by installing nix-serve, Hydra (the Nix-based continuous integration service) or Cachix, or by manually generating a static binary cache.
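
As a sketch, a simple file-based binary cache could be generated with nix copy (the target directory is arbitrary):


$ nix copy --to file:///var/www/binary-cache $(nix-build file.nix)

The resulting directory can then be served over HTTP and configured as a substituter by other Nix installations (subject to the usual signing/trust configuration).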

Beyond the ability to execute builds, Nix has another great advantage for building packages from source code. On Linux systems, the Nixpkgs collection is entirely bootstrapped, except for the bootstrap binaries -- this gives us almost full traceability of all direct and transitive dependencies used at build time.

With Docker you typically do not have such insights -- images get constructed from binaries obtained from arbitrary locations (e.g. binary packages that originate from Linux distributions' package repositories). As a result, it is impossible to get any insight into how these package dependencies were constructed from source code.

For most people, knowing exactly from which sources a package has been built is not considered important, but it can still be useful for more specialized use cases. For example, it can help you determine whether your system is constructed from trustworthy/audited sources and whether you have violated the license of a third-party library.

Combined use cases


As explained earlier in this blog post, Nix and Docker are deployment solutions for slightly different application domains.

There are quite a few solutions developed by the Nix community that can combine Nix and Docker in interesting ways.

In this section, I will show some of them.

Experimenting with the Nix package manager in a Docker container


Since Docker is such a common solution to provide environments in which users can experiment with packages, the Nix community also provides a Nix Docker image, that allows you to conveniently experiment with the Nix package manager in a Docker container.

We can pull this image as follows:


$ docker pull nixos/nix

Then launch a container interactively:


$ docker run -it nixos/nix

And finally, pull the package specifications from the Nix channel and install any Nix package that we want in the container:


$ nix-channel --add https://nixos.org/channels/nixpkgs-unstable nixpkgs
$ nix-channel --update
$ nix-env -f '<nixpkgs>' -iA file
$ file --version
file-5.39
magic file from /nix/store/bx9l7vrcb9izgjgwkjwvryxsdqdd5zba-file-5.39/share/misc/magic

Using the Nix package manager to deliver the required packages to construct an image


In the examples that construct Docker images for Nginx and the Apache HTTP server, I use the Debian Buster Linux distribution as the base image, to which I add the required packages from the Debian package repository to run the services.

This is a common practice for constructing Docker images -- as I have already explained in the section that covers Docker's concepts, package management is a sub-problem of the process/service life-cycle management problem, but Docker leaves solving this problem to the Linux distribution's package manager.

Instead of using conventional Linux distributions and their package management solutions, such as Debian, Ubuntu (using apt-get), Fedora (using yum) or Alpine Linux (using apk), it is also possible to use Nix.

The following Dockerfile can be used to create an image that uses Nginx deployed by the Nix package manager:


FROM nixos/nix

RUN nix-channel --add https://nixos.org/channels/nixpkgs-unstable nixpkgs
RUN nix-channel --update
RUN nix-env -f '<nixpkgs>' -iA nginx

RUN mkdir -p /var/log/nginx /var/cache/nginx /var/www
ADD nginx.conf /etc
ADD index.html /var/www

CMD ["nginx", "-g", "daemon off;", "-c", "/etc/nginx.conf"]
EXPOSE 80/tcp
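
Building and running this image works the same way as with the Debian-based examples (the image tag is an arbitrary choice):


$ docker build . -t nginx-nix
$ docker run -p 8080:80 -it nginx-nix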

Using Nix to build Docker images


Earlier, I have shown that the Nix package manager can also be used in a Dockerfile to obtain all the packages required to run a service.

In addition to building software packages, Nix can also build all kinds of static artifacts, such as disk images, DVD ROM ISO images, and virtual machine configurations.

The Nixpkgs repository also contains an abstraction function to build Docker images that does not require any Docker utilities.

For example, with the following Nix expression, we can build a Docker image that deploys Nginx:


with import <nixpkgs> {};

dockerTools.buildImage {
  name = "nginxexp";
  tag = "test";

  contents = nginx;

  runAsRoot = ''
    ${dockerTools.shadowSetup}
    groupadd -r nogroup
    useradd -r nobody -g nogroup -d /dev/null
    mkdir -p /var/log/nginx /var/cache/nginx /var/www
    cp ${./index.html} /var/www/index.html
  '';

  config = {
    Cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
    ExposedPorts = {
      "80/tcp" = {};
    };
  };
}

The above expression propagates the following parameters to the dockerTools.buildImage function:

  • The name of the image is nginxexp, with the tag test.
  • The contents parameter specifies all Nix packages that should be installed in the Docker image.
  • The runAsRoot parameter refers to a script that runs as the root user in a QEMU virtual machine. This virtual machine is used to provide the dynamic parts of the Docker image, such as setting up user accounts and configuring the state of the Nginx service.
  • The config parameter specifies image configuration properties, such as the command to execute and which TCP ports should be exposed.

Running the following command:


$ nix-build
/nix/store/qx9cpvdxj78d98rwfk6a5z2qsmqvgzvk-docker-image-nginxexp.tar.gz

This produces a compressed tarball that contains all the files belonging to the Docker image. We can load the image into Docker with the following command:


$ docker load -i \
/nix/store/qx9cpvdxj78d98rwfk6a5z2qsmqvgzvk-docker-image-nginxexp.tar.gz

and then launch a container instance that uses the Nix-generated image:


$ docker run -p 8080:80/tcp -it nginxexp:test

When we look at the Docker images overview:


$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
nginxexp test cde8298f025f 50 years ago 61MB

There are two properties that stand out when you compare the Nix-generated Docker image to conventional Docker images:

  • The first odd property is that the overview says that the image was created 50 years ago. This is explainable: to make Nix builds pure and deterministic, timestamps are typically reset to 1 second after the epoch (January 1st, 1970), to ensure that we always get the same bit-identical build result.
  • The second property is the size of the image: 61MB is considerably smaller than our Debian-based Docker image.

    To give you a comparison: the docker history command-line invocation (shown earlier in this blog post) that displays the layers of which the Debian-based Nginx image consists, shows that the base Linux distribution image consumes 114 MB, the update layer 17.4 MB and the layer that provides the Nginx package is 64.2 MB.

The reason why Nix-generated images are so small is that Nix knows exactly which runtime dependencies are required to run Nginx. As a result, we can restrict the image to contain only Nginx and its required runtime dependencies, leaving out all unnecessary software.

The Debian-based Nginx container is much bigger, because it also contains a base Debian Linux system with all kinds of command-line utilities and libraries, that are not required to run Nginx.

A similar observation applies to the Nix Docker image shown in the previous sections -- that image was constructed from an Alpine Linux image and contains a small, but fully functional, Linux distribution. As a result, it is bigger than a Docker image generated directly from a Nix expression.

Although a Nix-generated Docker image is smaller than most conventional images, one of its disadvantages is that the image consists of only a single layer -- as we have seen in the section about Nix concepts, many services typically share the same runtime dependencies (such as glibc). Because these common dependencies are not in a reusable layer, they cannot be shared.

To optimize reuse, it is also possible to build layered Docker images with Nix:


with import <nixpkgs> {};

dockerTools.buildLayeredImage {
  name = "nginxexp";
  tag = "test";

  contents = nginx;

  maxLayers = 100;

  extraCommands = ''
    mkdir -p var/log/nginx var/cache/nginx var/www
    cp ${./index.html} var/www/index.html
  '';

  config = {
    Cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
    ExposedPorts = {
      "80/tcp" = {};
    };
  };
}

The above Nix expression is similar to the previous one, but uses dockerTools.buildLayeredImage to construct a layered image.

We can build and load the image as follows:


$ docker load -i $(nix-build layered.nix)

When we retrieve the history of the image, we will see the following:


$ docker history nginxexp:test
IMAGE CREATED CREATED BY SIZE COMMENT
b91799a04b99 50 years ago 1.47kB store paths: ['/nix/store/snxpdsksd4wxcn3niiyck0fry3wzri96-nginxexp-customisation-layer']
<missing> 50 years ago 200B store paths: ['/nix/store/6npz42nl2hhsrs98bq45aqkqsndpwvp1-nginx-root.conf']
<missing> 50 years ago 1.79MB store paths: ['/nix/store/qsq6ni4lxd8i4g9g4dvh3y7v1f43fqsp-nginx-1.18.0']
<missing> 50 years ago 71.3kB store paths: ['/nix/store/n14bjnksgk2phl8n69m4yabmds7f0jj2-source']
<missing> 50 years ago 166kB store paths: ['/nix/store/jsqrk045m09i136mgcfjfai8i05nq14c-source']
<missing> 50 years ago 1.3MB store paths: ['/nix/store/4w2zbpv9ihl36kbpp6w5d1x33gp5ivfh-source']
<missing> 50 years ago 492kB store paths: ['/nix/store/kdrdxhswaqm4dgdqs1vs2l4b4md7djma-pcre-8.44']
<missing> 50 years ago 4.17MB store paths: ['/nix/store/6glpgx3pypxzb09wxdqyagv33rrj03qp-openssl-1.1.1g']
<missing> 50 years ago 385kB store paths: ['/nix/store/7n56vmgraagsl55aarx4qbigdmcvx345-libxslt-1.1.34']
<missing> 50 years ago 324kB store paths: ['/nix/store/1f8z1lc748w8clv1523lma4w31klrdpc-geoip-1.6.12']
<missing> 50 years ago 429kB store paths: ['/nix/store/wnrjhy16qzbhn2qdxqd6yrp76yghhkrg-gd-2.3.0']
<missing> 50 years ago 1.22MB store paths: ['/nix/store/hqd0i3nyb0717kqcm1v80x54ipkp4bv6-libwebp-1.0.3']
<missing> 50 years ago 327kB store paths: ['/nix/store/79nj0nblmb44v15kymha0489sw1l7fa0-fontconfig-2.12.6-lib']
<missing> 50 years ago 1.7MB store paths: ['/nix/store/6m9isbbvj78pjngmh0q5qr5cy5y1kzyw-libxml2-2.9.10']
<missing> 50 years ago 580kB store paths: ['/nix/store/2xmw4nxgfximk8v1rkw74490rfzz2gjp-libtiff-4.1.0']
<missing> 50 years ago 404kB store paths: ['/nix/store/vbxifzrl7i5nvh3h505kyw325da9k47n-giflib-5.2.1']
<missing> 50 years ago 79.8kB store paths: ['/nix/store/jc5bd71qcjshdjgzx9xdfrnc9hsi2qc3-fontconfig-2.12.6']
<missing> 50 years ago 236kB store paths: ['/nix/store/9q5gjvrabnr74vinmjzkkljbpxi8zk5j-expat-2.2.8']
<missing> 50 years ago 482kB store paths: ['/nix/store/0d6vl8gzwqc3bdkgj5qmmn8v67611znm-xz-5.2.5']
<missing> 50 years ago 6.28MB store paths: ['/nix/store/rmn2n2sycqviyccnhg85zangw1qpidx0-gcc-9.3.0-lib']
<missing> 50 years ago 1.98MB store paths: ['/nix/store/fnhsqz8a120qwgyyaiczv3lq4bjim780-freetype-2.10.2']
<missing> 50 years ago 757kB store paths: ['/nix/store/9ifada2prgfg7zm5ba0as6404rz6zy9w-dejavu-fonts-minimal-2.37']
<missing> 50 years ago 1.51MB store paths: ['/nix/store/yj40ch9rhkqwyjn920imxm1zcrvazsn3-libjpeg-turbo-2.0.4']
<missing> 50 years ago 79.8kB store paths: ['/nix/store/1lxskkhsfimhpg4fd7zqnynsmplvwqxz-bzip2-1.0.6.0.1']
<missing> 50 years ago 255kB store paths: ['/nix/store/adldw22awj7n65688smv19mdwvi1crsl-libpng-apng-1.6.37']
<missing> 50 years ago 123kB store paths: ['/nix/store/5x6l9xm5dp6v113dpfv673qvhwjyb7p5-zlib-1.2.11']
<missing> 50 years ago 30.9MB store paths: ['/nix/store/bqbg6hb2jsl3kvf6jgmgfdqy06fpjrrn-glibc-2.30']
<missing> 50 years ago 209kB store paths: ['/nix/store/fhg84pzckx2igmcsvg92x1wpvl1dmybf-libidn2-2.3.0']
<missing> 50 years ago 1.63MB store paths: ['/nix/store/y8n2b9nwjrgfx3kvi3vywvfib2cw5xa6-libunistring-0.9.10']

As you may notice, all Nix store paths reside in their own layers. If we were to also build a layered Docker image for the Apache HTTP server, we would end up using less disk space (because common dependencies such as glibc can be reused) and less RAM (because these common dependencies can be shared in RAM).

Mapping Nix store paths onto layers obviously has limitations -- there is a maximum number of layers that Docker can use (in the Nix expression above, I have imposed a limit of 100 layers; recent versions of Docker support a somewhat higher number).

Complex systems packaged with Nix typically have many more dependencies than the number of layers that Docker can mount. To cope with this limitation, the dockerTools.buildLayeredImage abstraction function tries to merge infrequently used dependencies into a shared layer. More information about this process can be found in Graham Christensen's blog post.

Besides the use cases shown in the examples above, there is much more you can do with the dockerTools functions in Nixpkgs -- you can also pull images from Docker Hub (with the dockerTools.pullImage function) and use the dockerTools.buildImage function to use existing Docker images as a basis to create hybrids combining conventional Linux software with Nix packages.
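
As a sketch, such a hybrid could look as follows (the image name is made up; the digest and hash values are placeholders that you would normally obtain with a tool such as nix-prefetch-docker):


with import <nixpkgs> {};

dockerTools.buildImage {
  name = "debian-with-file";  # hypothetical image name
  tag = "test";

  # Use an image pulled from Docker Hub as the base layer
  fromImage = dockerTools.pullImage {
    imageName = "debian";
    imageDigest = "sha256:<digest>";  # placeholder
    sha256 = "<hash>";                # placeholder
  };

  # Add a package from Nixpkgs on top of the conventional base image
  contents = [ file ];
}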

Conclusion


In this blog post, I have elaborated about using Nix and Docker as deployment solutions.

What they both have in common is that they facilitate reliable and reproducible deployment.

They can be used for a variety of use cases in two different domains (package management and process/service management). Some of these use cases are common to both Nix and Docker.

Nix and Docker can also be combined in several interesting ways -- Nix can be used as a package manager to deliver package dependencies in the construction process of an image, and Nix can also be used directly to build images, as a replacement for Dockerfiles.

This table summarizes the conceptual differences between Nix and Docker covered in this blog post:

                            | Nix                                                          | Docker
Application domain          | Package management                                           | Process/service management
Storage units               | Package build results                                        | File system changes
Storage model               | Isolated Nix store paths                                     | Layers + union file system
Component addressing        | Hashes computed from inputs                                  | Hashes computed from a layer's contents
Service/process management  | Unsupported                                                  | First-class feature
Package management          | First-class support                                          | Delegated responsibility to a distro's package manager
Development environments    | nix-shell                                                    | Create image with dependencies + run shell session in container
Build management (images)   | dockerTools.buildImage {}, dockerTools.buildLayeredImage {}  | Dockerfile
Build management (packages) | First-class function support                                 | Implementer's responsibility, can be simulated
Build environment purity    | Many precautions taken                                       | Only images provide some reproducibility, implementer's responsibility
Full source traceability    | Yes (on Linux)                                               | No
OS support                  | Many UNIX-like systems                                       | Linux (real system or virtualized)

I believe the last item in the table deserves a bit of clarification -- Nix works on other operating systems than Linux, e.g. macOS, and can also deploy binaries for those platforms.

Docker can be used on Windows and macOS, but it still deploys Linux software -- on Windows and macOS containers are deployed to a virtualized Linux environment. Docker containers can only work on Linux, because they heavily rely on Linux-specific concepts: namespaces and cgroups.

Aside from the functional parts, Nix and Docker also have some fundamental non-functional differences. One of them is usability.

Although I am a long-time Nix user (since 2007), I can see why Docker is so popular: it is well known and provides quite an optimized user experience. It does not deviate much from the way traditional Linux systems are managed -- this probably explains why so many users incorrectly call containers "virtual machines": they manifest themselves as units that provide almost fully functional Linux distributions.

From my own experience, it is typically more challenging to convince a new audience to adopt Nix -- getting people used to the idea that a package build can be modeled as a pure function invocation (in which the function parameters are the package's build inputs) and that a specialized Nix store is used to store all static artifacts is sometimes difficult.

Both Nix and Docker support reuse: the former by means of using identical Nix store paths and the latter by using identical layers. For both solutions, these objects can be identified with hash codes.

In practice, reuse with Docker is not always optimal -- for frequently used services, such as Nginx and the Apache HTTP server, it is not common practice to manually derive images from a Linux distribution base image.

Instead, most Docker users will obtain specialized Nginx and Apache HTTP images. The official Docker Nginx images are constructed from Debian Buster and Alpine Linux, whereas the official Apache HTTP images only support Alpine Linux. Sharing common dependencies between these two images is only possible if we use the Alpine Linux-based images for both.

In practice, it happens quite frequently that people run images constructed from all kinds of different base images, making it very difficult to share common dependencies.

Another impractical aspect of Nix is that it works conveniently for software compiled from source code, but packaging and deploying pre-built binaries is typically a challenge -- ELF binaries typically do not work out of the box and need to be patched, or deployed to an FHS user environment in which dependencies can be found in their "usual" locations (e.g. /bin, /lib etc.).

Related work


In this blog post, I have restricted my analysis to Nix and Docker. Both tools are useful on their own, but they are also the foundations of entire solution eco-systems. I did not elaborate much about solutions in these extended eco-systems.

For example, Nix does not do any process/service management, but there are Nix-related projects that can address this concern. Most notably, NixOS, a Linux distribution fully managed by Nix, uses systemd to manage services.

For Nix users on macOS, there is a project called nix-darwin that integrates with launchd, which is the default service manager on macOS.

There also used to be an interesting cross-over project between Nix and Docker (called nix-docker) combining Nix's package management capabilities with Docker's isolation capabilities and supervisord's ability to manage multiple services in a container -- it takes a configuration file (that looks similar to a NixOS configuration) defining a set of services, generates a complete supervisord configuration (with all required services and dependencies) and deploys it to a container. Unfortunately, the project is no longer maintained.

Nixery is a Docker-compatible container registry that is capable of transparently building and serving container images using Nix.

Docker is also an interesting foundation for an entire eco-system of solutions. Most notably Kubernetes, a container-orchestrating system that works with a variety of container tools including Docker. docker-compose makes it possible to manage collections of Docker containers and dependencies between containers.

There are also many solutions available to make building development projects with Docker (and other container technologies) more convenient than my file package build example. Gitlab CI, for example, provides first-class Docker integration. Tekton is a Kubernetes-based framework that can be used to build CI/CD systems.

There are also quite a few Nix cross-over projects that integrate with the extended containers eco-system, such as Kubernetes and docker-compose. For example, arion can generate docker-compose configuration files with specialized containers from NixOS modules. KuberNix can be used to bootstrap a Kubernetes cluster with the Nix package manager, and Kubenix can be used to build Kubernetes resources with Nix.

As explained in my comparisons, package management is not something that Docker supports as a first-class feature, but Docker has been an inspiration for package management solutions as well.

Most notably, several years ago I did a comparison between Nix and Ubuntu's Snappy package manager. The latter deploys every package (and all its required dependencies) as a container.

In this comparison blog post, I raised a number of concerns about reuse. Snappy does not have any means to share common dependencies between packages, and as a result, Snaps can be quite disk space and memory consuming.

Flatpak can be considered an alternative and more open solution to Snappy.

I still do not understand why these Docker-inspired package management solutions have not used Nix (e.g. storing packages in isolated folders) or Docker (e.g. using layers) as an inspiration to optimize reuse and simplify the construction of packages.

Future work


In the next blog post, I will elaborate more about integrating the Nix package manager with tools that can address the process/service management concern.

by Sander van der Burg (noreply@blogger.com) at July 29, 2020 08:57 PM

July 28, 2020

Cachix

Upstream caches: avoiding pushing paths in cache.nixos.org

One of the most requested features, so-called upstream caches, was released today. It is enabled by default for all caches, and the owner of a binary cache can disable it via Settings. When you push store paths to Cachix, querying cache.nixos.org adds an overhead of a few hundred milliseconds, but you save storage and possibly minutes by avoiding pushing paths that are already available upstream. Queries to cache.nixos.org are also cached, so subsequent push operations do not incur that overhead.

by Domen Kožar (support@cachix.org) at July 28, 2020 02:30 PM

July 20, 2020

Cachix

Documentation and More Documentation

Documentation is an important ingredient of a successful software project. The last few weeks I’ve worked on improving the status quo on two fronts: 1) https://nix.dev is an opinionated guide for developers getting things done using the Nix ecosystem. A few highlights:

  • Getting started repository template with a tutorial for using declarative and reproducible developer environments
  • Setting up GitHub Actions with Nix
  • Nix language anti-patterns to avoid and recommended alternatives

by Domen Kožar (support@cachix.org) at July 20, 2020 02:45 PM

July 08, 2020

Tweag I/O

Setting up Buildkite for Nix-based projects using Terraform and GCP

In this post I’m going to show how to set up, with Terraform, a Buildkite-based CI using your own workers that run on GCP. For reference, the complete Terraform configuration for this post is available in this repository. The benefits of this setup are:

  • The setup gives you complete control over how fast your workers are.
  • The workers come with Nix pre-installed, so you won’t need to spend time downloading the same Docker container again and again on every push, as usually happens with most cloud CI providers.
  • The workers come with a distributed Nix cache set up. So authors of CI scripts won’t have to bother about caching at all.

Secrets

We are going to need to import two secret resources:

resource "secret_resource" "buildkite_agent_token" {}
resource "secret_resource" "nix_signing_key" {}

To initialize the resources, execute the following from the root directory of your project:

$ terraform import secret_resource.<name> <value>

where:

  • buildkite_agent_token is obtained from the Buildkite site.
  • nix_signing_key can be generated by running:

    nix-store --generate-binary-cache-key <your-key-name> key.private key.public

    The key.private file will contain the value for the signing key. I’ll explain later in the post how to use the contents of the key.public file.

Custom NixOS image

The next step is to use the nixos_image_custom module to create a NixOS image with custom configuration.

resource "google_storage_bucket" "nixos_image" {
  name     = "buildkite-nixos-image-bucket-name"
  location = "EU"
}

module "nixos_image_custom" {
  source      = "git::https://github.com/tweag/terraform-nixos.git//google_image_nixos_custom?ref=40fedb1fae7df5bd7ad9defdd71eb06b7252810f"
  bucket_name = "${google_storage_bucket.nixos_image.name}"
  nixos_config = "${path.module}/nixos-config.nix"
}

The snippet above first creates a bucket nixos_image where the generated image will be uploaded, then it uses the nixos_image_custom module, which handles generation of the image using the configuration from the nixos-config.nix file. The file is assumed to be in the same directory as the Terraform configuration, hence ${path.module}/.

Service account and cache bucket

To control access to different resources we will also need a service account:

resource "google_service_account" "buildkite_agent" {
  account_id   = "buildkite-agent"
  display_name = "Buildkite agent"
}

We can use it to set access permissions for the storage bucket that will contain the Nix cache:

resource "google_storage_bucket" "nix_cache_bucket" {
  name     = "nix-cache-bucket-name"
  location = "EU"
  force_destroy = true
  retention_policy {
    retention_period = 7889238 # three months
  }
}

resource "google_storage_bucket_iam_member" "buildkite_nix_cache_writer" {
  bucket = "${google_storage_bucket.nix_cache_bucket.name}"
  role = "roles/storage.objectAdmin"
  member = "serviceAccount:${google_service_account.buildkite_agent.email}"
}

resource "google_storage_bucket_iam_member" "buildkite_nix_cache_reader" {
  bucket = "${google_storage_bucket.nix_cache_bucket.name}"
  role   = "roles/storage.objectViewer"
  member = "allUsers"
}

The bucket is configured to automatically delete objects that are older than 3 months. We give the service account the ability to write to and read from the bucket (the roles/storage.objectAdmin role). The rest of the world gets the ability to read from the bucket (the roles/storage.objectViewer role).

NixOS configuration

Here is the content of my nixos-config.nix. This NixOS configuration can serve as a starting point for writing your own. The numbered points refer to the notes below.

{ modulesPath, pkgs, ... }:
{
  imports = [
    "${modulesPath}/virtualisation/google-compute-image.nix"
  ];
  virtualisation.googleComputeImage.diskSize = 3000;
  virtualisation.docker.enable = true;

  services = {
    buildkite-agents.agent = {
      enable = true;
      extraConfig = ''
      tags-from-gcp=true
      '';
      tags = {
        os = "nixos";
        nix = "true";
      };
      tokenPath = "/run/keys/buildkite-agent-token"; # (1)
      runtimePackages = with pkgs; [
        bash
        curl
        gcc
        gnutar
        gzip
        ncurses
        nix
        python3
        xz
        # (2) extend as necessary
      ];
    };
    nix-store-gcs-proxy = {
      nix-cache-bucket-name = { # (3)
        address = "localhost:3000";
      };
    };
  };

  nix = {
    binaryCaches = [
      "https://cache.nixos.org/"
      "https://storage.googleapis.com/nix-cache-bucket-name" # (4)
    ];
    binaryCachePublicKeys = [
      "cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY="
      "<insert your public signing key here>" # (5)
    ];
    extraOptions = ''
      post-build-hook = /etc/nix/upload-to-cache.sh # (6)
    '';
  };

  security.sudo.enable = true;
  services.openssh.passwordAuthentication = false;
  security.sudo.wheelNeedsPassword = false;
}

Notes:

  1. This file will be created later by the startup script (see below).
  2. The collection of packages that are available to the Buildkite script can be edited here.
  3. Replace nix-cache-bucket-name by the name of the bucket used for the Nix cache.
  4. Similarly to (3) replace nix-cache-bucket-name in the URL.
  5. Insert the contents of the key.public file you generated earlier.
  6. The file will be created later by the startup script.

Compute instances and startup script

The following snippet sets up an instance group manager which controls multiple (3 in this example) Buildkite agents. The numbered points refer to the notes below.

data "template_file" "buildkite_nixos_startup" { # (1)
  template = "${file("${path.module}/files/buildkite_nixos_startup.sh")}"

  vars = {
    buildkite_agent_token = "${secret_resource.buildkite_agent_token.value}"
    nix_signing_key = "${secret_resource.nix_signing_key.value}"
  }
}

resource "google_compute_instance_template" "buildkite_nixos" {
  name_prefix  = "buildkite-nixos-"
  machine_type = "n1-standard-8"

  disk {
    boot         = true
    disk_size_gb = 100
    source_image = "${module.nixos_image_custom.self_link}"
  }

  metadata_startup_script = "${data.template_file.buildkite_nixos_startup.rendered}"

  network_interface {
    network = "default"

    access_config = {}
  }

  metadata {
    enable-oslogin = true
  }

  service_account {
    email = "${google_service_account.buildkite_agent.email}"

    scopes = [
      "compute-ro",
      "logging-write",
      "storage-rw",
    ]
  }

  scheduling {
    automatic_restart   = false
    on_host_maintenance = "TERMINATE"
    preemptible         = true # (2)
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "google_compute_instance_group_manager" "buildkite_nixos" {
  provider           = "google-beta"
  name               = "buildkite-nixos"
  base_instance_name = "buildkite-nixos"
  target_size        = "3" # (3)
  zone               = "<your-zone>" # (4)

  version {
    name              = "buildkite_nixos"
    instance_template = "${google_compute_instance_template.buildkite_nixos.self_link}"
  }

  update_policy {
    type                  = "PROACTIVE"
    minimal_action        = "REPLACE"
    max_unavailable_fixed = 1
  }
}

Notes:

  1. The file files/buildkite_nixos_startup.sh is shown below.
  2. Because of the remote Nix cache, the nodes can be preemptible (short-lived, never lasting longer than 24 hours), which results in much lower GCP costs.
  3. Changing target_size allows you to scale the system. This is the number of instances that are controlled by the instance group manager.
  4. Insert your desired zone here.

Finally, here is the startup script:

# workaround https://github.com/NixOS/nixpkgs/issues/42344
chown root:keys /run/keys
chmod 750 /run/keys
umask 037
echo "${buildkite_agent_token}" > /run/keys/buildkite-agent-token
chown root:keys /run/keys/buildkite-agent-token
umask 077
echo '${nix_signing_key}' > /run/keys/nix_signing_key
chown root:keys /run/keys/nix_signing_key

cat <<EOF > /etc/nix/upload-to-cache.sh
#!/bin/sh

set -eu
set -f # disable globbing
export IFS=' '

echo "Uploading paths" $OUT_PATHS
exec nix copy --to http://localhost:3000?secret-key=/run/keys/nix_signing_key \$OUT_PATHS
EOF
chmod +x /etc/nix/upload-to-cache.sh

This script uses the Nix post build hook approach for uploading to the cache without polluting the CI script.

Conclusion

The setup allows us to run Nix builds in an environment where Nix tooling is available. It also provides a remote Nix cache which does not require that the authors of CI scripts set it up or, even, be aware of it at all. We use this setup on many of Tweag’s projects and found that both mental and performance overheads are minimal. A typical CI script looks like this:

steps:
  - label: Build and test
    command: nix-build -A distributed-closure --no-out-link

Builds with an up-to-date cache that do not cause any re-builds may finish in literally one second.

July 08, 2020 12:00 AM

June 25, 2020

Tweag I/O

Nix Flakes, Part 2: Evaluation caching

Nix evaluation is often quite slow. In this blog post, we’ll have a look at a nice advantage of the hermetic evaluation model enforced by flakes: the ability to cache evaluation results reliably. For a short introduction to flakes, see our previous blog post.

Why Nix evaluation is slow

Nix uses a simple, interpreted, purely functional language to describe package dependency graphs and NixOS system configurations. So to get any information about those things, Nix first needs to evaluate a substantial Nix program. This involves parsing potentially thousands of .nix files and running a Turing-complete language.

For example, the command nix-env -qa shows you which packages are available in Nixpkgs. But this is quite slow and takes a lot of memory:

$ command time nix-env -qa | wc -l
5.09user 0.49system 0:05.59elapsed 99%CPU (0avgtext+0avgdata 1522792maxresident)k
28012

Evaluating individual packages or configurations can also be slow. For example, using nix-shell to enter a development environment for Hydra, we have to wait a bit, even if all dependencies are present in the Nix store:

$ command time nix-shell --command 'exit 0'
1.34user 0.18system 0:01.69elapsed 89%CPU (0avgtext+0avgdata 434808maxresident)k

That might be okay for occasional use but a wait of one or more seconds may well be unacceptably slow in scripts.

Note that the evaluation overhead is completely independent from the time it takes to actually build or download a package or configuration. If something is already present in the Nix store, Nix won’t build or download it again. But it still needs to re-evaluate the Nix files to determine which Nix store paths are needed.

Caching evaluation results

So can’t we speed things up by caching evaluation results? After all, the Nix language is purely functional, so it seems that re-evaluation should produce the same result, every time. Naively, maybe we can keep a cache that records that attribute A of file X evaluates to derivation D (or whatever metadata we want to cache). Unfortunately, it’s not that simple; cache invalidation is, after all, one of the only two hard problems in computer science.

The reason this didn’t work is that in the past Nix evaluation was not hermetic. For example, a .nix file can import other Nix files through relative or absolute paths (such as ~/.config/nixpkgs/config.nix for Nixpkgs) or by looking them up in the Nix search path ($NIX_PATH). So unless we perfectly keep track of all the files used during evaluation, a cached result might be inconsistent with the current input.
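To make this concrete, here is a small sketch of the non-hermetic lookups mentioned above; its result depends on files that live outside the expression itself, so a naive cache cannot tell when its entries become stale:

{
  # Depends on a file in the user's home directory:
  userConfig = import ~/.config/nixpkgs/config.nix;

  # Resolved via the Nix search path ($NIX_PATH):
  pkgs = import <nixpkgs> { };
}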

(As an aside: for a while, Nix has had an experimental replacement for nix-env -qa called nix search, which used an ad hoc cache for package metadata. It had exactly this cache invalidation problem: it wasn’t smart enough to figure out whether its cache was up to date with whatever revision of Nixpkgs you were using. So it had a manual flag --update-cache to allow the user to force cache invalidation.)

Flakes to the rescue

Flakes solve this problem by ensuring fully hermetic evaluation. When you evaluate an output attribute of a particular flake (e.g. the attribute defaultPackage.x86_64-linux of the dwarffs flake), Nix disallows access to any files outside that flake or its dependencies. It also disallows impure or platform-dependent features such as access to environment variables or the current system type.
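For readers who have not seen a flake yet, the following minimal flake.nix sketch (using the flake schema as it exists at the time of writing; the hello package is just a placeholder) exposes such a defaultPackage.x86_64-linux output attribute:

{
  description = "A minimal example flake";

  inputs.nixpkgs.url = "github:NixOS/nixpkgs";

  outputs = { self, nixpkgs }: {
    # An output attribute that the new nix command-line tools can evaluate:
    defaultPackage.x86_64-linux = nixpkgs.legacyPackages.x86_64-linux.hello;
  };
}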

This allows the nix command to aggressively cache evaluation results without fear of cache invalidation problems. Let’s see this in action by running Firefox from the nixpkgs flake. If we do this with an empty evaluation cache, Nix needs to evaluate the entire dependency graph of Firefox, which takes a quarter of a second:

$ command time nix shell nixpkgs#firefox -c firefox --version
Mozilla Firefox 75.0
0.26user 0.05system 0:00.39elapsed 82%CPU (0avgtext+0avgdata 115224maxresident)k

But if we do it again, it’s almost instantaneous (and takes less memory):

$ command time nix shell nixpkgs#firefox -c firefox --version
Mozilla Firefox 75.0
0.01user 0.01system 0:00.03elapsed 93%CPU (0avgtext+0avgdata 25840maxresident)k

The cache is implemented using a simple SQLite database that stores the values of flake output attributes. After the first command above, the cache looks like this:

$ sqlite3 ~/.cache/nix/eval-cache-v1/302043eedfbce13ecd8169612849f6ce789c26365c9aa0e6cfd3a772d746e3ba.sqlite .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE Attributes (
    parent      integer not null,
    name        text,
    type        integer not null,
    value       text,
    primary key (parent, name)
);
INSERT INTO Attributes VALUES(0,'',0,NULL);
INSERT INTO Attributes VALUES(1,'packages',3,NULL);
INSERT INTO Attributes VALUES(1,'legacyPackages',0,NULL);
INSERT INTO Attributes VALUES(3,'x86_64-linux',0,NULL);
INSERT INTO Attributes VALUES(4,'firefox',0,NULL);
INSERT INTO Attributes VALUES(5,'type',2,'derivation');
INSERT INTO Attributes VALUES(5,'drvPath',2,'/nix/store/7mz8pkgpl24wyab8nny0zclvca7ki2m8-firefox-75.0.drv');
INSERT INTO Attributes VALUES(5,'outPath',2,'/nix/store/5x1i2gp8k95f2mihd6aj61b5lydpz5dy-firefox-75.0');
INSERT INTO Attributes VALUES(5,'outputName',2,'out');
COMMIT;

In other words, the cache stores all the attributes that nix shell had to evaluate, in particular legacyPackages.x86_64-linux.firefox.{type,drvPath,outPath,outputName}. It also stores negative lookups, that is, attributes that don’t exist (such as packages).

The name of the SQLite database, 302043eedf….sqlite in this example, is derived from the contents of the top-level flake. Since the flake’s lock file contains content hashes of all dependencies, this is enough to efficiently and completely capture all files that might influence the evaluation result. (In the future, we’ll optimise this a bit more: for example, if the flake is a Git repository, we can simply use the Git revision as the cache name.)

The nix search command has been updated to use the new evaluation cache instead of its previous ad hoc cache. For example, searching for Blender is slow the first time:

$ command time nix search nixpkgs blender
* legacyPackages.x86_64-linux.blender (2.82a)
  3D Creation/Animation/Publishing System
5.55user 0.63system 0:06.17elapsed 100%CPU (0avgtext+0avgdata 1491912maxresident)k

but the second time it is pretty fast and uses much less memory:

$ command time nix search nixpkgs blender
* legacyPackages.x86_64-linux.blender (2.82a)
  3D Creation/Animation/Publishing System
0.41user 0.00system 0:00.42elapsed 99%CPU (0avgtext+0avgdata 21100maxresident)k

The evaluation cache at this point is about 10.9 MiB in size. The overhead for creating the cache is fairly modest: with the flag --no-eval-cache, nix search nixpkgs blender takes 4.9 seconds.

Caching and store derivations

There is only one way, in a sense, in which cached results can become “stale”. Nix evaluation produces store derivations such as /nix/store/7mz8pkgpl24wyab8nny0zclvca7ki2m8-firefox-75.0.drv as a side effect. (.drv files are essentially a serialization of the dependency graph of a package.) These store derivations may be garbage-collected, in which case the evaluation cache points to a path that no longer exists. Thus, Nix checks whether the .drv file still exists, and if not, falls back to evaluating normally.

Future improvements

Currently, the evaluation cache is only created and used locally. However, Nix could automatically download precomputed caches, similar to how it has a binary cache for the contents of store paths. That is, if we need a cache like 302043eedf….sqlite, we could first check if it’s available on cache.nixos.org and if so fetch it from there. In this way, when we run a command such as nix shell nixpkgs#firefox, we could even avoid the need to fetch the actual source of the flake!

Another future improvement is to populate and use the cache in the evaluator itself. Currently the cache is populated and cached in the user interface (that is, the nix command). The command nix shell nixpkgs#firefox will create a cache entry for firefox, but not for the dependencies of firefox; thus a subsequent nix shell nixpkgs#thunderbird won’t see a speed improvement even though it shares most of its dependencies. So it would be nice if the evaluator had knowledge of the evaluation cache. For example, the evaluation of thunks that represent attributes like nixpkgs.legacyPackages.x86_64-linux.<package name> could check and update the cache.

June 25, 2020 12:00 AM

nixbuild.net

Automatic Resource Optimization

As of today, nixbuild.net will automatically select resources (CPU count and memory amount) for builds submitted to it. Based on historic build data, nixbuild.net calculates a resource allocation that will make your build as performant as possible, while wasting minimal CPU time. This means nixbuild.net users get faster and cheaper builds, while also taking away the user’s burden of figuring out what resource settings to use for each individual build.

Previously, all builds were assigned 4 CPUs unless the user configured resource selection differently. However, configuring different resource settings for individual builds was difficult, since Nix has no notion of such settings. Additionally, it is really tricky to know whether a build will gain anything from being allocated many CPUs, or if that just makes the build more expensive. It generally requires the user to try out the build with different settings, which is time-consuming for a single build and almost insurmountable for a large set of builds with different characteristics.

Now, each individual build will be analyzed and can be assigned between 1 and 16 CPUs, depending on how well the build utilizes multiple CPUs. The memory allocation will be adapted to minimize the amount of unused memory.

The automatic resource optimization has been tested both internally and by a selected number of beta users, and the results have been very positive so far. We’re happy to make this feature available to all nixbuild.net users, since it aligns perfectly with the service’s core idea of being simple, cost-effective and performant.

How Does it Work?

The automatic resource optimization works in two steps:

  1. When a Nix derivation is submitted to nixbuild.net, we look for similar derivations that have been built on nixbuild.net before. A heuristic approach is used, where derivations are compared based on package names and version numbers. This approach can be improved in the future, by looking at more parts of the derivations, like dependencies and build scripts.

  2. A number of the most recent, most similar derivations are selected. We then analyze the build data of those derivations. Since we have developed a secure sandbox specifically for running Nix builds, we’re also able to collect a lot of data about the builds. One metric that is collected is CPU utilization, and that lets us make predictions about how well a build would scale, performance-wise, if it was given more CPUs.

    We also look at metrics about the historic memory usage, and make sure the new build is allocated enough memory.

by nixbuild.net (support@nixbuild.net) at June 25, 2020 12:00 AM

June 18, 2020

Tweag I/O

Long-term reproducibility with Nix and Software Heritage

Reproducible builds and deployments — this is the ambition that Nix proclaims for itself. This ambition comes, however, with a fine print: builds are reproducible only if the original source code still exists. Otherwise, Nix can’t download, build and deploy it. The community maintains binary caches like cache.nixos.org, but these don’t preserve anything — caches are ephemeral. After all, preserving source code is not Nix’s primary mission!

Software Heritage, on the other hand, aspires to preserve software forever. Software Heritage is an independent foundation, grounded in academia, supported by public money, and backed by many of the world’s largest software companies. It can thus reliably pursue the tedious task of collecting and archiving public code and making it available through its website.

Quite naturally, this situation suggested an opportunity for collaboration: can we combine Nix’s reproducible build definitions with Software Heritage’s long-term archive? Will this collaboration bring us closer to the dream of forever replicable and successful builds? We thought that this was an effort worth pursuing, so a partnership started.

This is a story about the challenges that we encountered when trying to make this happen, what we already achieved, and the lessons that we have learned.

Archiving all sources in Nixpkgs

Whenever a Nix user installs a package that isn’t in the binary cache, Nix tries to download the original source and build it from scratch. When these sources don’t exist anymore, or don’t correspond anymore to what Nix expects them to be, the experience can be frustrating — instead of a reproducible build, the user ends up with a “file not found” HTTP error message.

With a long-term source code archive at hand, we could simply redirect the download request to it and move on with the build process. In other words, we would like to make Nix fall back on the Software Heritage archive to download the missing source from there.

For this to work, we must ensure that the source code has been fed to the Software Heritage archive previously. The best way to do this is simply to tell Software Heritage which source code we would like to have archived. Indeed, Software Heritage's long-term goal is to archive all the source code produced in the world, and it is quite eager to be pointed to locations where more can be found!

The first step of this joint effort was to compile a list of all source code URLs required by Nixpkgs and make them available to Software Heritage. A Nix community project is in charge of generating the list of source code URLs required by a Nixpkgs build, and it is available here. This list indexes every tarball used by the Nixpkgs master branch; other sources, such as patches, JARs, or Git clones, are currently excluded and hence not archived.

A source list is a simple JSON file that looks like this:

{
  "sources": [
    {
      "type": "url",
      "urls": [
        "https://ftpmirror.gnu.org/hello/hello-2.10.tar.gz",
      ],
      "integrity": "sha256-MeBmE3qWJnbon2nRtlOC3pWn732RS4y5VvQepy4PUWs=",
    },
    ...
  ],
  "version": 1,
  "revision": "cc4e04c26672dd74e5fd0fecb78b435fb55368f7"
}

Here,

  • version is the version of this file’s format,
  • revision is a Nixpkgs commit ID,
  • type is the type of the source. Only the url type is currently supported,
  • integrity is the hash used by the Nix fixed output derivation that downloads the source (see the sketch after this list).
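For illustration, the integrity value above is precisely the fixed-output hash that a Nix fetcher uses for that tarball; a hypothetical sketch (assuming a nixpkgs version whose fetchurl accepts SRI-style hash attributes):

{ fetchurl }:

# Fixed-output derivation: Nix verifies that the downloaded tarball matches the
# expected hash, which is the same value as the integrity field shown above.
fetchurl {
  url = "https://ftpmirror.gnu.org/hello/hello-2.10.tar.gz";
  hash = "sha256-MeBmE3qWJnbon2nRtlOC3pWn732RS4y5VvQepy4PUWs=";
}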

We then implemented a Software Heritage loader which fetches this JSON file once per day and archives all listed tarballs in a snapshot.

You can see an example snapshot here. This snapshot was created on 3 June 2020, points to the Nixpkgs commit 46fcaf3c8a13f32e2c147fd88f97c4ad2d3b0f27, and contains 21,100 branches, each one corresponding to one of the tarballs used by this Nixpkgs revision.

Every day, Software Heritage is now archiving more than 21,000 tarballs used by Nixpkgs. But how can we plug them back into Nix?

Falling back on Software Heritage

Two issues need to be addressed to make this happen:

First, we need to find the correct source -- the one that Nix tried to download unsuccessfully -- in the Software Heritage archive. We cannot simply use the original source URL to query the Software Heritage archive, because the URL alone doesn't uniquely identify the content that is behind it. What if the code that a URL points to has changed over time? To check this, Nix associates a hash with each downloaded artifact. Such a build step is called a fixed output derivation, because Nix verifies that the output hash remains unchanged. It thus ensures that the content that it downloads is always the same, and emits an error otherwise. What we therefore really want is to query Software Heritage by the content, that is, by the hash of the source code artifact itself.

Let’s do it then! Does this mean that the problem is solved? Unfortunately not, and the reason is that the design decisions behind Nix, Software Heritage and other package repositories are not fully compatible. Tarballs play a central role in this.

Content Hash vs Tarball Hash

When Software Heritage archives a tarball, it first unpacks it and then stores all files and directories. This makes it possible to deduplicate files: when two tarballs contain the same file, it is only stored once. This is a nice solution, but unfortunately the tarball that is reassembled from the archive may differ from the original one -- for example, if file permissions or timestamps were not preserved in this reassembly procedure.

Nix, being really stubborn with respect to reproducibility, computes the checksum of the reassembled tarball and detects that it differs from the hash of the original. It therefore cannot ensure that the tarball downloaded from the Software Heritage archive corresponds to the one it expects -- and rightly so, because a different tarball could easily lead to a different build.

Nix already deals with some situations where the hash of a source tarball can change without affecting build reproducibility. For example, fetchFromGitHub unpacks the tarball before computing the hash of the unpacked contents. This is because GitHub produces release tarballs on the fly, and a change in the compression algorithm would invalidate the hash expected by Nix. As a result, when Nix uses the fetchFromGitHub strategy, it already manages to download tarballs from Software Heritage. Currently, this is the case for about 6,000 sources in Nixpkgs.

For all other sources, we would have to modify the Nixpkgs fetchers to compute checksums on the unpacked tarball contents. This seems reasonable, since we want to make sure that the source code is the same and we don't care much about the format in which it is transferred (which can also be important for security reasons). However, using the hash of the tarball directly is often more convenient, since it is exposed by many repositories such as PyPI and Hackage. When updating a package in Nixpkgs, the maintainer can just pick the checksum provided by the package repository, instead of recalculating it locally.
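Note that nixpkgs already ships fetchers that work on unpacked contents; the following small sketch uses fetchzip, which unpacks the archive first and computes the fixed-output hash over the resulting directory tree (the hash value is a placeholder, not a real one):

{ fetchzip }:

# Because the hash covers the unpacked contents, a change in the compression
# format or in tarball metadata does not invalidate it.
fetchzip {
  url = "https://ftpmirror.gnu.org/hello/hello-2.10.tar.gz";
  sha256 = "0000000000000000000000000000000000000000000000000000"; # placeholder
}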

Ideally, Nix fetchers and these package repositories would provide checksums of the unpacked contents of their packages as much as possible. This would give us the freedom to identify content independently of the way it is packed, transferred and stored. We haven’t yet decided how to move on, but rest assured that we will continue to work on it.

Wrap up

Thanks to Software Heritage, a significant part of the sources used by Nixpkgs are now archived forever. Moreover, about 6,000 out of the 21,000 tarballs used by Nixpkgs can already be used by the future Nix fallback mechanism. We now want to increase the number of archived source code tarballs, add support for Git sources and start to implement the fallback mechanism in Nix. At the same time, we aim to increase the number of fixed output derivations whose hash is computed on the unpacked tarball.

We want build reproducibility to become the standard in the software world. For this reason, the Software Heritage loader archiving Nixpkgs source code tarballs has been designed to be easily reused by other actors. For example, the Guix project already started publishing and archiving their sources using the same component! Finally, we want to thank NLnet for the funding that makes this work possible.

June 18, 2020 12:00 AM

June 17, 2020

Mayflower

Windows-on-NixOS, part 2: Make it go fast!

This is part 2 of a series of blog posts explaining how we took an existing Windows installation on hardware and moved it into a VM running on top of NixOS. Previously, we discussed how we performed the actual storage migration. In this post, we’ll cover the various performance optimisations we tried, what worked, and what didn’t work. GPU passthrough Since the machine is, amongst other things, used for gaming, graphics performance is critical.

June 17, 2020 09:00 AM

June 11, 2020

Sander van der Burg

Using Disnix as a simple and minimalistic dependency-based process manager

In my previous blog post I have demonstrated that I can deploy an entire service-oriented system locally with Disnix without the need of obtaining any external physical or virtual machines (or even Linux containers).

The fact that I could do this with relative ease is a benefit of using my experimental process manager-agnostic deployment framework that I have developed earlier, allowing you to target a variety of process management solutions with the same declarative deployment specifications.

Most notably, the fact that the framework can both work with processes that daemonize and let foreground processes automatically daemonize makes it very convenient to do local unprivileged user deployments.

To refresh your memory: a process that daemonizes spawns another process that keeps running in the background while the invoking process terminates after the initialization is done. Since there is no way for the caller to know the PID of the daemon process, daemons typically follow the convention to write a PID file to disk (containing the daemon's process ID), so that it can eventually be reliably terminated.

In addition to spawning a daemon process that remains in the background, services should also implement a number of steps to make themselves well-behaving, such as resetting signal handlers, clearing privacy-sensitive environment variables, and dropping privileges.

In earlier blog posts, I argued that managing foreground processes with a process manager is typically more reliable (e.g. a PID of a foreground process is always known to be right).

On the other hand, processes that daemonize also have certain advantages:

  • They are self contained -- they do not rely on any external services to operate. This makes it very easy to run a collection of processes for local experimentation.
  • They have a standard means to notify the caller that the service is ready: by convention, the executable that spawns the daemon process is only supposed to terminate when the daemon has been successfully initialized. Foreground processes that are managed by systemd, by contrast, should invoke the non-standard sd_notify() function to notify systemd that they are ready.

Although these concepts are nice, properly daemonizing a process is the responsibility of the service implementer -- as a consequence, there is no guarantee that all services will properly implement all the steps that make a daemon well-behaving.

The management of daemons is straightforward and self contained, the Nix expression language provides all kinds of advantages over data-oriented configuration languages (e.g. JSON or YAML), and Disnix has a flexible deployment model that works with a dependency graph and a plugin system that can activate and deactivate all kinds of components. I realized that I could integrate these facilities to build my own simple dependency-based process manager.

In this blog post, I will describe how this process management approach works.

Specifying a process configuration


A simple Nix expression capturing a daemon deployment configuration might look as follows:

{writeTextFile, mydaemon}:

writeTextFile {
  name = "mydaemon";
  text = ''
    process=${mydaemon}/bin/mydaemon
    pidFile=/var/run/mydaemon.pid
  '';
  destination = "/etc/dysnomia/process";
}

The above Nix expression generates a textual configuration file:

  • The process field specifies the path to the executable to start (which in turn spawns a daemon process that keeps running in the background).
  • The pidFile field indicates the location of the PID file containing the process ID of the daemon process, so that it can be reliably terminated.

Most common system services (e.g. the Apache HTTP server, MySQL and PostgreSQL) can daemonize on their own and follow the same conventions. As a result, the deployment system can save you some configuration work by providing reasonable default values:

  • If no pidFile is provided, then the deployment system assumes that the daemon generates a PID file with the same name as the executable and resides in the directory that is commonly used for storing PID files: /var/run.
  • If a package provides only a single executable in the bin/ sub folder, then it is also not required to specify a process.

The fact that the configuration system provides reasonable defaults, means that for trivial services we do not have to specify any configuration properties at all -- simply providing a single executable in the package's bin/ sub folder suffices.

Do these simple configuration facilities really suffice to manage all kinds of system services? The answer is most likely no, because we may also want to manage processes that cannot daemonize on their own, or we may need to initialize some state first before the service can be used.

To provide these additional facilities, we can create a wrapper script around the executable and refer to it in the process field of the deployment specification.

The following Nix expression generates a deployment configuration for a service that requires state and only runs as a foreground process:

{stdenv, writeTextFile, writeScript, daemon, myForegroundService}:

let
  myForegroundServiceWrapper = writeScript "myforegroundservice-wrapper" ''
    #! ${stdenv.shell} -e

    mkdir -p /var/lib/myservice
    exec ${daemon}/bin/daemon -U -F /var/run/mydaemon.pid -- \
      ${myForegroundService}/bin/myservice
  '';
in
writeTextFile {
  name = "mydaemon";
  text = ''
    process=${myForegroundServiceWrapper}
    pidFile=/var/run/mydaemon.pid
  '';
  destination = "/etc/dysnomia/process";
}

As you may observe, the Nix expression shown above generates a wrapper script that does the following:

  • First, it creates the required state directory: /var/lib/myservice so that the service can work properly.
  • Then it invokes libslack's daemon command to automatically daemonize the service. The daemon command will automatically store a PID file containing the daemon's process ID, so that the configuration system knows how to terminate it. The value of the -F parameter passed to the daemon executable and the pidFile configuration property are the same.

Typically, in deployment systems that use a data-driven configuration language (such as YAML or JSON) obtaining a wrapped executable is a burden, but in the Nix expression language this is quite convenient -- the language allows you to automatically build packages and other static artifacts such as configuration files and scripts, and pass their corresponding Nix store paths as parameters to configuration files.

The combination of wrapper scripts and a simple configuration file suffices to manage all kinds of services, but it is fairly low-level -- to automate the deployment process of a system service, you basically need to re-implement the same kinds of configuration properties all over again.

In the Nix process management framework, I have developed a high-level abstraction function for creating managed processes that can be used to target all kinds of process managers:

{createManagedProcess, runtimeDir}:
{port}:

let
  webapp = import ../../webapp;
in
createManagedProcess rec {
  name = "webapp";
  description = "Simple web application";

  # This expression can both run in foreground or daemon mode.
  # The process manager can pick which mode it prefers.
  process = "${webapp}/bin/webapp";
  daemonArgs = [ "-D" ];

  environment = {
    PORT = port;
    PID_FILE = "${runtimeDir}/${name}.pid";
  };
}

The above Nix expression is a constructor function that generates a configuration for a web application process (with an embedded HTTP server) that returns a static HTML page.

The createManagedProcess abstraction function can be used to generate configuration artifacts for systemd, supervisord and launchd, and various kinds of scripts, such as sysvinit scripts and BSD rc scripts.

I can also easily adjust the generator infrastructure to generate the configuration files shown earlier (capturing the path of an executable and a PID file) with a wrapper script.

Managing daemons with Disnix


As explained in earlier blog posts about Disnix, services in a Disnix deployment model are abstract representations of basically any kind of deployment unit.

Every service is annotated with a type field. Disnix consults a plugin system named Dysnomia to invoke the corresponding plugin that can manage the lifecycle of that service, e.g. by activating or deactivating it.

Implementing a Dysnomia module for directly managing daemons is quite straightforward -- as an activation step, I just have to start the process defined in the configuration file (or the single executable that resides in the bin/ sub folder of the package).

As a deactivation step (whose purpose is to stop a process), I simply need to send a TERM signal to the PID in the PID file, by running:

$ kill $(cat $pidFile)

Translation to a Disnix deployment specification


The last remaining bits of the puzzle are process dependency management and the translation to a Disnix services model, so that Disnix can carry out the deployment.

Deployments managed by the Nix process management framework are driven by so-called processes models that capture the properties of running process instances, such as:

{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "disnix"
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager;
  };
in
rec {
  webapp = rec {
    port = 5000;
    dnsName = "webapp.local";

    pkg = constructors.webapp {
      inherit port;
    };
  };

  nginxReverseProxy = rec {
    port = 8080;

    pkg = constructors.nginxReverseProxyHostBased {
      webapps = [ webapp ];
      inherit port;
    } {};
  };
}

The above Nix expression is a simple example of a processes model defining two running processes:

  • The webapp process is the web application process described earlier that runs an embedded HTTP server and serves a static HTML page.
  • The nginxReverseProxy process is an Nginx web server that acts as a reverse proxy server for the webapp process. To make this service work properly, it needs to be activated after the webapp process. To ensure that the activation is done in the right order, webapp is passed as a process dependency to the nginxReverseProxyHostBased constructor function.

As explained in previous blog posts, Disnix deployments are driven by three kinds of deployment specifications: a services model that captures the service components of which a system consists, an infrastructure model that captures all available target machines and their configuration properties and a distribution model that maps services in the services model to machines in the infrastructure model.

The processes model and Disnix services model are quite similar -- the latter is actually a superset of the processes model.

We can translate process instances to Disnix services in a straightforward manner. For example, the nginxReverseProxy process can be translated into the following Disnix service configuration:

nginxReverseProxy = rec {
  name = "nginxReverseProxy";
  port = 8080;

  pkg = constructors.nginxReverseProxyHostBased {
    webapps = [ webapp ];
    inherit port;
  } {};

  activatesAfter = {
    inherit webapp;
  };

  type = "process";
};

In the above specification, the process configuration has been augmented with the following properties:

  • A name property because this is a mandatory field for every service.
  • In the process management framework, all process instances are managed by the same process manager, but in Disnix, services can have all kinds of shapes and forms and require a plugin to manage their life-cycles.

    To allow Disnix to manage daemons, we specify the type property to refer to our process Dysnomia module that starts and terminates a daemon from a simple textual specification.
  • The process dependencies are translated to Disnix inter-dependencies by using the activatesAfter property.

    In Disnix, inter-dependency parameters serve two purposes -- they provide the inter-dependent services with configuration parameters and they ensure the correct activation ordering.

    The activatesAfter parameter disregards the first purpose (propagating configuration parameters), because we already use the process management framework's own convention for propagating process dependencies.

A services model alone does not suffice to allow Disnix to carry out the deployment of processes. Since we are only interested in local deployment, we can simply provide an infrastructure model with only a localhost target and a distribution model that maps all services to localhost.

To accomplish this, we can use the same principles for local deployments described in the previous blog post.
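To sketch what these two extra models could look like for the webapp example above (the file names, and the idea of reusing the process names as service names, are my own assumptions), they stay very small:

# infrastructure.nix: a single local target
{
  localhost.properties.hostname = "localhost";
}

# distribution.nix: map every process instance to localhost
{infrastructure}:

{
  webapp = [ infrastructure.localhost ];
  nginxReverseProxy = [ infrastructure.localhost ];
}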

An example deployment scenario


I have added a new tool called nixproc-disnix-switch to the Nix process management framework that automatically converts processes models into Disnix deployment models and invokes Disnix to locally deploy a system.

The following command will carry out the complete deployment of our webapp example system, shown earlier, using Disnix as a simple dependency-based process manager:

$ nixproc-disnix-switch --state-dir /home/sander/var \
--force-disable-user-change processes.nix

In addition to using Disnix for deploying processes, we can also use its other features. For example, another application of Disnix I typically find useful is the deployment visualization tool.

We can also use Disnix to generate a DOT graph from the deployment architecture of the currently deployed system and generate an image from it:

$ disnix-visualize > out.dot
$ dot -Tpng out.dot > out.png

Resulting in the following diagram:


In the first blog post that I wrote about the Nix process management framework (in which I explored a functional discipline using sysvinit-scripts as a basis), I was using hand-drawn diagrams to illustrate deployments.

With the Disnix backend, I can use Disnix's visualization tool to automatically generate these diagrams.

Discussion


In this blog post, I have shown that by implementing a few very simple concepts, we can use Disnix as a process management backend for the experimental Nix-based process management framework.

Although it was fun to develop a simple process management solution, my goal is not to compete with existing process management solutions (such as systemd, launchd or supervisord) -- this solution is primarily designed for simple use cases and local experimentation.

For production deployments, you probably still want to use a more sophisticated solution. For example, in production scenarios you also want to check the status of running processes and send them reload instructions. These are features that the Disnix backend does not support.

The Nix process management framework supports a variety of process managers, but none of them can be universally used on all platforms that Disnix can run on. For example, the sysvinit-script module works conveniently for local deployments but is restricted to Linux only. Likewise, the bsdrc-script module only works on FreeBSD (and theoretically on NetBSD and OpenBSD). supervisord works on most UNIX-like systems, but is not self contained -- processes rely on the availability of the supervisord service to run.

This Disnix-based process management solution is simple and portable to all UNIX-like systems that Disnix has been tested on.

The process module described in this blog post is a replacement for the process module that already exists in the current release of Dysnomia. The reason why I want it to be replaced is that Dysnomia now provides better alternatives to the old process module.

For example, when it is desired to have your process managed by systemd, then the new systemd-unit module should be used that is more reliable, supports many more features and has a simpler implementation.

Furthermore, I made a couple of mistakes in the past. The old process module was originally implemented as a simple module that would start a foreground process in the background, by using the nohup command. At the time I developed that module, I did not know much about developing daemons, nor about the additional steps daemons need to carry out to make themselves well-behaving.

nohup is not a proper solution for daemonizing foreground processes, such as critical system services -- a process might inherit privacy-sensitive environment variables, it does not change the current working directory to the root folder (keeping external drives mounted), and it could also behave unpredictably if signal handlers have been changed from the default behaviour.

At some point I believed that it was more reliable to use a process manager to manage the lifecycle of a process and adjusted the process module to do that. Originally I used Upstart for this purpose, and later I switched to systemd, with sysvinit-scripts (and the direct approach with nohup) as alternative implementations.

Basically, the process module provided three kinds of implementations, none of which provided an optimal deployment experience.

I made a similar mistake with Dysnomia's wrapper module. Originally, its only purpose was to delegate the execution of deployment activities to a wrapper script included with the component that needs to be deployed. Because I was using this script mostly to deploy daemons, I have also adjusted the wrapper module to use an external process manager to manage the lifecycle of the daemon that the wrapper script might spawn.

Because of these mistakes and poor separation of functionality, I have decided to deprecate the old process and wrapper modules. Since they are frequently used and I do not want to break compatibility with old deployments, they can still be used if Dysnomia is configured in legacy mode, which is the default setting for the time being.

When using the old modules, Dysnomia will display a warning message explaining that you should migrate to the better alternatives.

Availability


The process Dysnomia module described in this blog post is part of the current development version of Dysnomia and will become available in the next release.

The Nix process management framework (which is still a highly-experimental prototype) includes the disnix backend (described in this blog post), allowing you to automatically translate a processes model into Disnix deployment models and use Disnix to deploy a system.

by Sander van der Burg (noreply@blogger.com) at June 11, 2020 06:15 PM

May 26, 2020

Sander van der Burg

Deploying heterogeneous service-oriented systems locally with Disnix

In the previous blog post, I have shown a new useful application area that is built on top of the combination of my experimental Nix-based process management framework and Disnix.

Both of these underlying solutions have a number of similarities -- as their names obviously suggest, they both strongly depend on the Nix package manager to deploy all their package dependencies and static configuration artifacts, such as configuration files.

Furthermore, they are both driven by models written in the Nix expression language to automate the deployment processes of entire systems.

These models are built on a number of simple conventions that are frequently used in the Nix packages repository:

  • All units of which a system consists are defined as Nix expressions declaring a function. Each function parameter refers to a dependency or configuration property required to construct the unit from its sources.
  • To compose a particular variant of a unit, we must invoke the function that builds and configures the unit with parameters providing the dependencies and configuration properties that the unit needs.
  • To make all units conveniently accessible from a single location, the content of the configuration units is typically blended into a symlink tree called Nix profiles.

Besides these commonalities, their main difference is that the process management framework is specifically designed as a solution for systems that are composed out of running processes (i.e. daemons in UNIX terminology).

This framework makes it possible to construct multiple instances of running processes, isolate their resources (by avoiding conflicting resource configuration properties), and manage running processes with a variety of process management solutions, such as sysvinit scripts, BSD rc scripts, systemd, launchd and supervisord.

The process management framework is quite useful for single-machine deployments and local experimentation, but it does not do distributed deployment or heterogeneous service deployment -- it cannot (at least not conveniently) deploy units that are not daemons, such as databases, Java web applications deployed to a Servlet container, or PHP applications deployed to a PHP-enabled web server.

Disnix is a solution to automate the deployment processes of service-oriented systems -- distributed systems that are composed of components, using a variety of technologies, into a network of machines.

To accomplish full automation, Disnix integrates and combines a number of activities and tools, such as Nix for package management and Dysnomia for state management (Dysnomia takes care of the activation and deactivation steps for services, and can optionally manage snapshots and restores of state). Dysnomia provides a plugin system that makes it possible to manage a variety of component types, including processes and databases.

Disnix and Dysnomia can also include the features of the Nix process management framework for the deployment of services that are running processes, if desired.

The scope of Disnix is quite broad in comparison to the process management framework, but it can also be used to automate all kinds of sub problems. For example, it can also be used as a remote package deployment solution to build and deploy packages in a network of heterogeneous machines (e.g. Linux and macOS).

After comparing the properties of both deployment solutions, I have identified another interesting sub use case for Disnix -- deploying heterogeneous service-oriented systems (that are composed out of components using a variety of technologies) locally for experimentation purposes.

In this blog post, I will describe how Disnix can be used for local deployments.

Motivating example: deploying a Java-based web application and web service system


One of the examples I have shown in the previous blog post is an over-engineered Java-based web application and web service system whose only purpose is to display the string "Hello world!".

The "Hello" string is returned by the HelloService and consumed by another service called HelloWorldService that composes the sentence "Hello world!" from the first message. The HelloWorld web application is the front-end responsible for displaying the sentence to the end user.

When deploying the system to a single target machine, it could have the following deployment architecture:


In the architecture diagram shown above, ovals denote services, arrows denote inter-dependency relationships (requiring that a service gets activated before another), the dark grey boxes denote container environments, and the light grey box denotes a machine (there is only one machine in this example).

As you may notice, only one service in the diagram shown above is a daemon, namely Apache Tomcat (simpleAppservingTomcat) that can be managed by the experimental Nix process management framework.

The remaining services have a different form -- the web application front-end (HelloWorld) is a Java web application that is embedded in Catalina, the Servlet container that comes with Apache Tomcat. The web services are Axis2 archives that are deployed to the Axis2 container (which in turn is a web application managed by Apache Tomcat).

In the previous blog post, I have shown that we can deploy and distribute these services over a small network of machines.

It is also possible to completely deploy this system locally, without any external physical or virtual machines or network connectivity.

Configuring the client interface for local deployment


To execute deployment tasks remotely, Disnix invokes an external process that is called a client interface. By default, Disnix uses the disnix-ssh-client that remotely executes commands via SSH and transfers data via SCP.

It is also possible to use alternative client interfaces so that different communication protocols and methods can be used. For example, there is also an external package that provides a SOAP client (disnix-soap-client) and a NixOps client (disnix-nixops-client).

A client interface can also communicate with a local Disnix service instance. For example, configuring the following environment variable:

$ export DISNIX_CLIENT_INTERFACE=disnix-client

instructs the Disnix tools to use the D-Bus client to communicate with a local Disnix service instance.

It is also possible to bypass the local Disnix service and directly execute all deployment activities with the following interface:

$ export DISNIX_CLIENT_INTERFACE=disnix-runactivity

The disnix-runactivity client interface is particularly useful for single-user/unprivileged deployments. With disnix-client, you need a Disnix D-Bus daemon running in the background that authorizes the user to execute deployments. With disnix-runactivity, nothing is required beyond a single-user Nix installation.

Deploying the example system locally


As explained in earlier blog posts about Disnix, deployments are driven by three kinds of deployment specifications: a services model capturing all the services of which a system consists and how they depend on each other, an infrastructure model capturing all available target machines and their relevant configuration properties (including so-called container services that can host application services), and a distribution model mapping services in the services model to target machines in the infrastructure model (and to container services that a machine may provide).

Normally, Disnix deploys services to remote machines defined in the infrastructure model. For local deployments, we simply need to provide an infrastructure model with only one entry:

{
  localhost.properties.hostname = "localhost";
}

In the distribution model, we must map all services to the localhost target:

{infrastructure}:

{
  simpleAppservingTomcat = [ infrastructure.localhost ];
  axis2 = [ infrastructure.localhost ];

  HelloService = [ infrastructure.localhost ];
  HelloWorldService = [ infrastructure.localhost ];
  HelloWorld = [ infrastructure.localhost ];
}

With the above infrastructure and distribution models that facilitate local deployment, and the services model of the example system, we can deploy the entire system on our local machine:

$ disnix-env -s services.nix -i infrastructure-local.nix -d distribution-local.nix

Deploying the example system locally as an unprivileged user


The deployment scenario shown earlier supports local deployment, but still requires super-user privileges. For example, to deploy Apache Tomcat, we must have write access to the state directory (/var) to configure Apache Tomcat's state and deploy the Java web application archives. An unprivileged user typically lacks the permissions to perform modifications in the /var directory.

One of the key features of the Nix process management framework is that it makes all state directories configurable. State directories can be changed in such a way that unprivileged users can also deploy services (e.g. by changing the state directory to a sub folder of the user's home directory).

Disnix service models can also define these process management configuration parameters:

{ pkgs, system, distribution, invDistribution
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "systemd"
}:

let
  processType =
    if processManager == null then "managed-process"
    else if processManager == "sysvinit" then "sysvinit-script"
    else if processManager == "systemd" then "systemd-unit"
    else if processManager == "supervisord" then "supervisord-program"
    else if processManager == "bsdrc" then "bsdrc-script"
    else if processManager == "cygrunsrv" then "cygrunsrv-service"
    else if processManager == "launchd" then "launchd-daemon"
    else throw "Unknown process manager: ${processManager}";

  constructors = import ../../../nix-processmgmt/examples/service-containers-agnostic/constructors.nix {
    inherit pkgs stateDir runtimeDir logDir cacheDir tmpDir forceDisableUserChange processManager;
  };

  customPkgs = import ../top-level/all-packages.nix {
    inherit system pkgs stateDir;
  };
in
rec {
  simpleAppservingTomcat = constructors.simpleAppservingTomcat {
    httpPort = 8080;
    type = processType;
  };
  ...
}

The above Nix expression shows a partial services model for the Java example system. The first four function parameters (pkgs, system, distribution, and invDistribution) are standard Disnix services model parameters.

The remaining parameters are specific to the process management framework -- they allow you to change the state directories, force-disable user changing (which is useful for unprivileged user deployments), and select the process manager that should be used for daemons.

I have added a new command-line parameter (--extra-params) to the Disnix tools that can be used to propagate values for these additional parameters.

With the following command-line instruction, we change the base directory of the state directories to the user's home directory, force-disable user changing (switching users is something only a privileged user can do), and change the process manager to sysvinit scripts:

$ disnix-env -s services.nix -i infrastructure-local.nix -d distribution-local.nix \
  --extra-params '{
    stateDir = "/home/sander/var";
    processManager = "sysvinit";
    forceDisableUserChange = true;
  }'

With the above command, we can deploy the example system completely as an unprivileged user, without requiring any process/service manager to manage Apache Tomcat.

Working with predeployed container services


In our examples so far, we have deployed systems that are entirely self-contained. However, it is also possible to deploy services to container services that have already been deployed by other means. For example, you can install Apache Tomcat with your host system's distribution and use Dysnomia to integrate with it.

To allow Disnix to deploy services to these containers, we need an infrastructure model that captures their properties. We can automatically generate such an infrastructure model from the Dysnomia container configuration files by running:

$ disnix-capture-infra infrastructure.nix > \
infrastructure-captured.nix

and using the captured infrastructure model to locally deploy the system:

$ disnix-env -s services.nix -i infrastructure-captured.nix -d distribution-local.nix
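
The captured model is generated for you, but it may help to see roughly what it could look like. The sketch below is hypothetical: the container names and property values are illustrative and depend entirely on the Dysnomia container configuration files present on the host system.

{
  localhost = {
    properties = {
      hostname = "localhost";
    };
    containers = {
      # Illustrative container entries; the real ones mirror the Dysnomia
      # container configuration files found on the machine.
      tomcat-webapplication = {
        tomcatPort = 8080;
      };
      axis2-webservice = {
        tomcatPort = 8080;
      };
    };
  };
}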

Undeploying a system


For local experimentation, it is probably quite common that you want to completely undeploy the system as soon as you no longer need it. Normally, this is done by writing an empty distribution model and redeploying the system with it, but that is still a bit of a hassle.

In the latest development version of Disnix, an undeploy can be done with the following command-line instruction:

$ disnix-env --undeploy -i infrastructure.nix

Availability


The --extra-params and --undeploy Disnix command-line options are part of the current development version of Disnix and will become available in the next release.

by Sander van der Burg (noreply@blogger.com) at May 26, 2020 09:49 PM

May 25, 2020

Tweag I/O

Nix Flakes, Part 1: An introduction and tutorial

This is the first in a series of blog posts intended to provide a gentle introduction to flakes, a new Nix feature that improves reproducibility, composability and usability in the Nix ecosystem. This blog post describes why flakes were introduced, and gives a short tutorial on how to use them.

Flakes were developed at Tweag and funded by Target Corporation and Tweag.

What problems do flakes solve?

Once upon a time, Nix pioneered reproducible builds: it tries hard to ensure that two builds of the same derivation graph produce an identical result. Unfortunately, the evaluation of Nix files into such a derivation graph isn’t nearly as reproducible, despite the language being nominally purely functional.

For example, Nix files can access arbitrary files (such as ~/.config/nixpkgs/config.nix), environment variables, Git repositories, files in the Nix search path ($NIX_PATH), command-line arguments (--arg) and the system type (builtins.currentSystem). In other words, evaluation isn’t as hermetic as it could be. In practice, ensuring reproducible evaluation of things like NixOS system configurations requires special care.
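
To illustrate the kind of impurity meant here, consider the following small (and deliberately non-hermetic) sketch; every value in it depends on the machine and environment doing the evaluation rather than on the source tree alone:

{
  home   = builtins.getEnv "HOME";   # environment variable
  pkgs   = import <nixpkgs> { };     # resolved via the Nix search path ($NIX_PATH)
  system = builtins.currentSystem;   # system type of the evaluating machine
}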

Furthermore, there is no standard way to compose Nix-based projects. It’s rare that everything you need is in Nixpkgs; consider for instance projects that use Nix as a build tool, or NixOS system configurations. Typical ways to compose Nix files are to rely on the Nix search path (e.g. import <nixpkgs>) or to use fetchGit or fetchTarball. The former has poor reproducibility, while the latter provides a bad user experience because of the need to manually update Git hashes to update dependencies.
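
For reference, the fetchTarball-based composition style mentioned above often looks roughly like the sketch below. The revision and hash are placeholders that would have to be bumped by hand on every update, which is exactly the user-experience problem flakes address:

let
  # Placeholder revision and hash -- both must be updated manually on every change.
  nixpkgs = fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/<rev>.tar.gz";
    sha256 = "<hash produced by nix-prefetch-url --unpack>";
  };
in
import nixpkgs { }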

There is also no easy way to deliver Nix-based projects to users. Nix has a “channel” mechanism (essentially a tarball containing Nix files), but it’s not easy to create channels and they are not composable. Finally, Nix-based projects lack a standardized structure. There are some conventions (e.g. shell.nix or release.nix) but they don’t cover many common use cases; for instance, there is no way to discover the NixOS modules provided by a repository.

Flakes are a solution to these problems. A flake is simply a source tree (such as a Git repository) containing a file named flake.nix that provides a standardized interface to Nix artifacts such as packages or NixOS modules. Flakes can have dependencies on other flakes, with a “lock file” pinning those dependencies to exact revisions to ensure reproducible evaluation.

The flake file format and semantics are described in a NixOS RFC, which is currently the best reference on flakes.

Trying out flakes

Flakes are currently implemented in an experimental branch of Nix. If you want to play with flakes, you can get this version of Nix from Nixpkgs:

$ nix-shell -I nixpkgs=channel:nixos-20.03 --packages nixFlakes

Since flakes are an experimental feature, you also need to add the following line to ~/.config/nix/nix.conf:

experimental-features = nix-command flakes

or pass the flag --experimental-features 'nix-command flakes' whenever you call the nix command.

Using flakes

To see flakes in action, let’s start with a simple Unix package named dwarffs (a FUSE filesystem that automatically fetches debug symbols from the Internet). It lives in a GitHub repository at https://github.com/edolstra/dwarffs; it is a flake because it contains a file named flake.nix. We will look at the contents of this file later, but in short, it tells Nix what the flake provides (such as Nix packages, NixOS modules or CI tests).

The following command fetches the dwarffs Git repository, builds its default package and runs it.

$ nix shell github:edolstra/dwarffs --command dwarffs --version
dwarffs 0.1.20200406.cd7955a

The command above isn’t very reproducible: it fetches the most recent version of dwarffs, which could change over time. But it’s easy to ask Nix to build a specific revision:

$ nix shell github:edolstra/dwarffs/cd7955af31698c571c30b7a0f78e59fd624d0229 ...

Nix tries very hard to ensure that the result of building a flake from such a URL is always the same. This requires it to restrict a number of things that Nix projects could previously do. For example, the dwarffs project requires a number of dependencies (such as a C++ compiler) that it gets from the Nix Packages collection (Nixpkgs). In the past, you might use the NIX_PATH environment variable to allow your project to find Nixpkgs. In the world of flakes, this is no longer allowed: flakes have to declare their dependencies explicitly, and these dependencies have to be locked to specific revisions.

In order to do so, dwarffs’s flake.nix file declares an explicit dependency on Nixpkgs, which is also a flake. We can see the dependencies of a flake as follows:

$ nix flake list-inputs github:edolstra/dwarffs
github:edolstra/dwarffs/d11b181af08bfda367ea5cf7fad103652dc0409f
├───nix: github:NixOS/nix/3aaceeb7e2d3fb8a07a1aa5a21df1dca6bbaa0ef
│   └───nixpkgs: github:NixOS/nixpkgs/b88ff468e9850410070d4e0ccd68c7011f15b2be
└───nixpkgs: github:NixOS/nixpkgs/b88ff468e9850410070d4e0ccd68c7011f15b2be

So the dwarffs flake depends on a specific version of the nixpkgs flake (as well as the nix flake). As a result, building dwarffs will always produce the same result. We didn’t specify this version in dwarffs’s flake.nix. Instead, it’s recorded in a lock file named flake.lock that is generated automatically by Nix and committed to the dwarffs repository.

Flake outputs

Another goal of flakes is to provide a standard structure for discoverability within Nix-based projects. Flakes can provide arbitrary Nix values, such as packages, NixOS modules or library functions. These are called its outputs. We can see the outputs of a flake as follows:

$ nix flake show github:edolstra/dwarffs
github:edolstra/dwarffs/d11b181af08bfda367ea5cf7fad103652dc0409f
├───checks
│   ├───aarch64-linux
│   │   └───build: derivation 'dwarffs-0.1.20200409'
│   ├───i686-linux
│   │   └───build: derivation 'dwarffs-0.1.20200409'
│   └───x86_64-linux
│       └───build: derivation 'dwarffs-0.1.20200409'
├───defaultPackage
│   ├───aarch64-linux: package 'dwarffs-0.1.20200409'
│   ├───i686-linux: package 'dwarffs-0.1.20200409'
│   └───x86_64-linux: package 'dwarffs-0.1.20200409'
├───nixosModules
│   └───dwarffs: NixOS module
└───overlay: Nixpkgs overlay

While a flake can have arbitrary outputs, some of them, if they exist, have a special meaning to certain Nix commands and therefore must have a specific type. For example, the output defaultPackage.<system> must be a derivation; it’s what nix build and nix shell will build by default unless you specify another output. The nix CLI allows you to specify another output through a syntax reminiscent of URL fragments:

$ nix build github:edolstra/dwarffs#checks.aarch64-linux.build

By the way, the standard checks output specifies a set of derivations to be built by a continuous integration system such as Hydra. Because flake evaluation is hermetic and the lock file locks all dependencies, it’s guaranteed that the nix build command above will evaluate to the same result as the one in the CI system.
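
As a hypothetical illustration (this is not taken from the dwarffs repository), a minimal flake.nix could expose its default package as a CI check like this; the use of Nixpkgs' hello package here is just a stand-in:

{
  description = "Sketch: exposing the default package as a CI check";

  inputs.nixpkgs.url = github:NixOS/nixpkgs/nixos-20.03;

  outputs = { self, nixpkgs }: {
    defaultPackage.x86_64-linux =
      (import nixpkgs { system = "x86_64-linux"; }).hello;

    # checks.<system>.<name> must be derivations; CI systems such as Hydra build them.
    checks.x86_64-linux.build = self.defaultPackage.x86_64-linux;
  };
}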

The flake registry

Flake locations are specified using a URL-like syntax such as github:edolstra/dwarffs or git+https://github.com/NixOS/patchelf. But because such URLs would be rather verbose if you had to type them all the time on the command line, there also is a flake registry that maps symbolic identifiers such as nixpkgs to actual locations like https://github.com/NixOS/nixpkgs. So the following are (by default) equivalent:

$ nix shell nixpkgs#cowsay --command cowsay Hi!
$ nix shell github:NixOS/nixpkgs#cowsay --command cowsay Hi!

It’s possible to override the registry locally. For example, you can override the nixpkgs flake to your own Nixpkgs tree:

$ nix registry add nixpkgs ~/my-nixpkgs

or pin it to a specific revision:

$ nix registry add nixpkgs github:NixOS/nixpkgs/5272327b81ed355bbed5659b8d303cf2979b6953

Writing your first flake

Unlike Nix channels, creating a flake is pretty simple: you just add a flake.nix and possibly a flake.lock to your project’s repository. As an example, suppose we want to create our very own Hello World and distribute it as a flake. Let’s create this project first:

$ git init hello
$ cd hello
$ printf '#include <stdio.h>\nint main() { printf("Hello World"); }\n' > hello.c
$ git add hello.c

To turn this Git repository into a flake, we add a file named flake.nix at the root of the repository with the following contents:

{
  description = "A flake for building Hello World";

  inputs.nixpkgs.url = github:NixOS/nixpkgs/nixos-20.03;

  outputs = { self, nixpkgs }: {

    defaultPackage.x86_64-linux =
      # Notice the reference to nixpkgs here.
      with import nixpkgs { system = "x86_64-linux"; };
      stdenv.mkDerivation {
        name = "hello";
        src = self;
        buildPhase = "gcc -o hello ./hello.c";
        installPhase = "mkdir -p $out/bin; install -t $out/bin hello";
      };

  };
}

The command nix flake init creates a basic flake.nix for you.

Note that any file that is not tracked by Git is invisible during Nix evaluation, in order to ensure hermetic evaluation. Thus, you need to make flake.nix visible to Git:

$ git add flake.nix

Let’s see if it builds!

$ nix build
warning: creating lock file '/home/eelco/Dev/hello/flake.lock'
warning: Git tree '/home/eelco/Dev/hello' is dirty

$ ./result/bin/hello
Hello World

or equivalently:

$ nix shell --command hello
Hello World

It’s also possible to get an interactive development environment in which all the dependencies (like GCC) and shell variables and functions from the derivation are in scope:

$ nix dev-shell
$ eval "$buildPhase"
$ ./hello
Hello World

So what does all that stuff in flake.nix mean?

  • The description attribute is a one-line description shown by nix flake info.
  • The inputs attribute specifies other flakes that this flake depends on. These are fetched by Nix and passed as arguments to the outputs function.
  • The outputs attribute is the heart of the flake: it’s a function that produces an attribute set. The function arguments are the flakes specified in inputs.

    The self argument denotes this flake. It's primarily useful for referring to the source of the flake (as in src = self;) or to other outputs (e.g. self.defaultPackage.x86_64-linux).

  • The attributes produced by outputs are arbitrary values, except that (as we saw above) there are some standard outputs such as defaultPackage.${system}.
  • Every flake has some metadata, such as self.lastModifiedDate, which is used to generate a version string like hello-20191015 (see the sketch below).
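
As a hedged sketch of how that metadata might be used (an assumption about usage, not code from the flake shown earlier): inside the outputs function of the flake.nix above, the defaultPackage could derive its name from self.lastModifiedDate, which has the form "YYYYMMDDHHMMSS":

defaultPackage.x86_64-linux =
  with import nixpkgs { system = "x86_64-linux"; };
  stdenv.mkDerivation {
    # Keep only the YYYYMMDD prefix of the last-modified timestamp as the version.
    name = "hello-${builtins.substring 0 8 self.lastModifiedDate}";
    src = self;
    buildPhase = "gcc -o hello ./hello.c";
    installPhase = "mkdir -p $out/bin; install -t $out/bin hello";
  };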

You may have noticed that the dependency specification github:NixOS/nixpkgs/nixos-20.03 is imprecise: it says that we want to use the nixos-20.03 branch of Nixpkgs, but doesn’t say which Git revision. This seems bad for reproducibility. However, when we ran nix build, Nix automatically generated a lock file that precisely states which revision of nixpkgs to use:

$ cat flake.lock
{
  "nodes": {
    "nixpkgs": {
      "info": {
        "lastModified": 1587398327,
        "narHash": "sha256-mEKkeLgUrzAsdEaJ/1wdvYn0YZBAKEG3AN21koD2AgU="
      },
      "locked": {
        "owner": "NixOS",
        "repo": "nixpkgs",
        "rev": "5272327b81ed355bbed5659b8d303cf2979b6953",
        "type": "github"
      },
      "original": {
        "owner": "NixOS",
        "ref": "nixos-20.03",
        "repo": "nixpkgs",
        "type": "github"
      }
    },
    "root": {
      "inputs": {
        "nixpkgs": "nixpkgs"
      }
    }
  },
  "root": "root",
  "version": 5
}

Any subsequent build of this flake will use the version of nixpkgs recorded in the lock file. If you add new inputs to flake.nix, when you run any command such as nix build, Nix will automatically add corresponding locks to flake.lock. However, it won’t replace existing locks. If you want to update a locked input to the latest version, you need to ask for it:

$ nix flake update --update-input nixpkgs
$ nix build

To wrap things up, we can now commit our project and push it to GitHub, after making sure that everything is in order:

$ nix flake check
$ git commit -a -m 'Initial version'
$ git remote add origin git@github.com:edolstra/hello.git
$ git push -u origin master

Other users can then use this flake:

$ nix shell github:edolstra/hello -c hello

Next steps

In the next blog post, we’ll talk about typical uses of flakes, such as managing NixOS system configurations, distributing Nixpkgs overlays and NixOS modules, and CI integration.

May 25, 2020 12:00 AM