NixOS Planet

September 24, 2020

Sander van der Burg

Assigning unique IDs to services in Disnix deployment models

As described in some of my recent blog posts, one of the more advanced features of Disnix as well as the experimental Nix process management framework is to deploy multiple instances of the same service to the same machine.

To make running multiple service instances on the same machine possible, these tools rely on conflict avoidance rather than isolation (the technique typically used for containers). To allow multiple service instances to co-exist on the same machine, they need to be configured in such a way that they do not allocate any conflicting resources.

Although for small systems it is doable to configure multiple instances by hand, this process gets tedious and time consuming for larger and more technologically diverse systems.

One particular kind of conflicting resource that could be configured automatically is numeric IDs, such as TCP/UDP port numbers, user IDs (UIDs), and group IDs (GIDs).

In this blog post, I will describe how multiple service instances are configured (in Disnix and the process management framework) and how we can automatically assign unique numeric IDs to them.

Configuring multiple service instances


To facilitate conflict avoidance in Disnix and the Nix process management framework, services are configured as follows:


{createManagedProcess, tmpDir}:
{port, instanceSuffix ? "", instanceName ? "webapp${instanceSuffix}"}:

let
  webapp = import ../../webapp;
in
createManagedProcess {
  name = instanceName;
  description = "Simple web application";
  inherit instanceName;

  # This expression can run both in foreground and in daemon mode.
  # The process manager can pick which mode it prefers.
  process = "${webapp}/bin/webapp";
  daemonArgs = [ "-D" ];

  environment = {
    PORT = port;
    PID_FILE = "${tmpDir}/${instanceName}.pid";
  };
  user = instanceName;
  credentials = {
    groups = {
      "${instanceName}" = {};
    };
    users = {
      "${instanceName}" = {
        group = instanceName;
        description = "Webapp";
      };
    };
  };

  overrides = {
    sysvinit = {
      runlevels = [ 3 4 5 ];
    };
  };
}

The Nix expression shown above is a nested function that describes how to deploy a simple self-contained REST web application with an embedded HTTP server:

  • The outer function header (first line) specifies all common build-time dependencies and configuration properties that the service needs:

    • createManagedProcess is a function that can be used to define process manager agnostic configurations that can be translated to configuration files for a variety of process managers (e.g. systemd, launchd, supervisord etc.).
    • tmpDir refers to the temp directory in which temp files are stored.
  • The inner function header (second line) specifies all instance parameters -- these are the parameters that must be configured in such a way that conflicts with other process instances are avoided:

    • The instanceName parameter (that can be derived from the instanceSuffix) is a value used by some of the process management backends (e.g. the ones that invoke the daemon command) to derive a unique PID file for the process. When running multiple instances of the same process, each of them requires a unique PID file name.
    • The port parameter specifies the TCP port to which the service binds. Binding the service to a port that is already taken by another service causes the deployment of this service to fail.
  • In the function body, we invoke the createManagedProcess function to construct configuration files for all supported process manager backends to run the webapp process:

    • As explained earlier, the instanceName is used to configure the daemon executable in such a way that it allocates a unique PID file.
    • The process parameter specifies which executable we need to run, either as a foreground process or as a daemon.
    • The daemonArgs parameter specifies which command-line parameters need to be propagated to the executable when the process should daemonize on its own.
    • The environment parameter specifies all environment variables. The webapp service uses these variables for runtime property configuration.
    • The user parameter is used to specify that the process should run as an unprivileged user. The credentials parameter is used to configure the creation of the user account and corresponding user group.
    • The overrides parameter is used to override the process manager-agnostic parameters with process manager-specific parameters. For the sysvinit backend, we configure the runlevels in which the service should run.

Although the convention shown above makes it possible to avoid conflicts (assuming that all potential conflicts have been identified and exposed as function parameters), these parameters are typically configured manually:


{ pkgs, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
, ...
}:

let
  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager;
  };

  processType = import ../../nixproc/derive-dysnomia-process-type.nix {
    inherit processManager;
  };
in
rec {
  webapp1 = rec {
    name = "webapp1";
    port = 5000;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "1";
    };
    type = processType;
  };

  webapp2 = rec {
    name = "webapp2";
    port = 5001;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
    type = processType;
  };
}

The above Nix expression is both a valid Disnix services model and a valid processes model. It composes two web application process instances that can run concurrently on the same machine by invoking the nested constructor function shown in the previous example:

  • Each webapp instance has its own unique instance name, by specifying a unique numeric instanceSuffix that gets appended to the service name.
  • Every webapp instance binds to a unique TCP port (5000 and 5001) that should not conflict with system services or other process instances.

Previous work: assigning port numbers


Although configuring two process instances is still manageable, the configuration process becomes more tedious and time-consuming as the number and variety of processes (each having their own potential conflicts) grow.

Five years ago, I already identified a resource that could be automatically assigned to services: port numbers.

I have created a very simple port assigner tool that allows you to specify a global port pool and a target-specific port pool. The former is used to assign globally unique port numbers to all services in the network, whereas the latter assigns port numbers that are unique to the target machine to which the service is deployed (this is to cope with the scarcity of port numbers).

Although the tool is quite useful for systems that do not consist of too many different kinds of components, I ran into a number of limitations when I wanted to manage a more diverse set of services:

  • Port numbers are not the only numeric IDs that services may require. When deploying systems that consist of self-contained executables, you typically want to run them as unprivileged users for security reasons. User accounts on most UNIX-like systems require unique user IDs, and the corresponding users' groups require unique group IDs.
  • We typically want to manage multiple resource pools, for a variety of reasons. For example, when we have a number of HTTP server instances and a number of database instances, then we may want to pick port numbers in the 8000-9000 range for the HTTP servers, whereas for the database servers we want to use a different pool, such as 5000-6000.

Assigning unique numeric IDs


To address these shortcomings, I have developed a replacement tool that acts as a generic numeric ID assigner.

This new ID assigner tool works with ID resource configuration files, such as:


rec {
  ports = {
    min = 5000;
    max = 6000;
    scope = "global";
  };

  uids = {
    min = 2000;
    max = 3000;
    scope = "global";
  };

  gids = uids;
}

The above ID resource configuration file (idresources.nix) defines three resource pools: ports represents the port numbers to be assigned to the webapp processes, uids refers to user IDs and gids to group IDs. The group IDs' resource configuration is identical to the user IDs' configuration.

Each resource attribute supports the following configuration properties:

  • The min value specifies the minimum ID to hand out, max the maximum ID.
  • The scope value specifies the scope of the resource pool. global (which is the default option) means that the IDs assigned from this resource pool to services are globally unique for the entire system.

    The machine scope can be used to assign IDs that are unique for the machine to which a service is distributed. When the latter option is used, services that are distributed to two separate machines may have the same ID, as the sketch below illustrates.
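
    For example, a machine-scoped variant of the ports pool shown earlier could be declared as follows (an illustrative sketch, not part of the configuration above):

    ports = {
      min = 5000;
      max = 6000;
      scope = "machine";
    };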

We can adjust the services/processes model in such a way that every service will use dynamically assigned IDs and that each service specifies for which resources it requires a unique ID:


{ pkgs, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
, ...
}:

let
  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager ids;
  };

  processType = import ../../nixproc/derive-dysnomia-process-type.nix {
    inherit processManager;
  };
in
rec {
  webapp1 = rec {
    name = "webapp1";
    port = ids.ports.webapp1 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "1";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };

  webapp2 = rec {
    name = "webapp2";
    port = ids.ports.webapp2 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };
}

In the above services/processes model, we have made the following changes:

  • In the beginning of the expression, we import the dynamically generated ids.nix expression that provides ID assignments for each resource. If the ids.nix file does not exist, we fall back to an empty attribute set. This construction (in which the absence of ids.nix is tolerated) allows the ID assigner to bootstrap the ID assignment process.
  • Every hardcoded port attribute of every service is replaced by a reference to the ids attribute set that is dynamically generated by the ID assigner tool. To allow the ID assigner to evaluate the services model on the first run, we provide a fallback port value of 0.
  • Every service specifies for which resources it requires a unique ID through the requiresUniqueIdsFor attribute. In the above example, both service instances require unique IDs to assign a port number, user ID to the user and group ID to the group.

The port assignments are propagated as function parameters to the constructor functions that configure the services (as shown earlier in this blog post).

We could also implement a similar strategy with the UIDs and GIDs, but a more convenient mechanism is to compose the function that creates the credentials, so that it transparently uses our uids and gids assignments.

As shown in the expression above, the ids attribute set is also propagated to the constructors expression. The constructors expression indirectly composes the createCredentials function as follows:


{pkgs, ids ? {}, ...}:

{
  createCredentials = import ../../create-credentials {
    inherit (pkgs) stdenv;
    inherit ids;
  };

  ...
}

The ids attribute set is propagated to the function that composes the createCredentials function. As a result, it will automatically assign the UIDs and GIDs in the ids.nix expression when the user configures a user or group with a name that exists in the uids and gids resource pools.
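
A minimal sketch of what such a lookup could look like inside a credentials-creating function is shown below (the attribute names are illustrative assumptions; the actual create-credentials implementation may differ):

{ ids ? {} }:

{ users ? {}, groups ? {}, ... }:

{
  # Attach a dynamically assigned UID/GID to every user/group whose name
  # occurs in the uids/gids resource pools; otherwise leave it unset.
  users = builtins.mapAttrs
    (name: user: user // { uid = ids.uids.${name} or null; })
    users;

  groups = builtins.mapAttrs
    (name: group: group // { gid = ids.gids.${name} or null; })
    groups;
}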

To make these UIDs and GIDs assignments go smoothly, it is recommended to give a process instance the same process name, instance name, user and group names.

Using the ID assigner tool


By combining the ID resources specification with the three Disnix models: a services model (that defines all distributable services, shown above), an infrastructure model (that captures all available target machines and their properties) and a distribution model (that maps services to target machines in the network), we can automatically generate an ids configuration that contains all ID assignments:


$ dydisnix -s services.nix -i infrastructure.nix -d distribution.nix \
--id-resources idresources.nix --output-file ids.nix

The above command will generate an ids configuration file (ids.nix) that provides, for each resource in the ID resources model, a unique assignment to every service that is distributed to a target machine in the network. (Services that are not distributed to any machine in the distribution model are skipped, so that no IDs are wasted on them.)

The output file (ids.nix) has the following structure:


{
  "ids" = {
    "gids" = {
      "webapp1" = 2000;
      "webapp2" = 2001;
    };
    "uids" = {
      "webapp1" = 2000;
      "webapp2" = 2001;
    };
    "ports" = {
      "webapp1" = 5000;
      "webapp2" = 5001;
    };
  };
  "lastAssignments" = {
    "gids" = 2001;
    "uids" = 2001;
    "ports" = 5001;
  };
}

  • The ids attribute contains for each resource (defined in the ID resources model) the unique ID assignments per service. As shown earlier, both service instances require unique IDs for ports, uids and gids. The above attribute set stores the corresponding ID assignments.
  • The lastAssignments attribute memorizes the last ID assignment per resource. Once an ID is assigned, it will not be immediately reused. This is to allow rollbacks and to prevent data from incorrectly becoming owned by the wrong user accounts. Once the maximum ID limit is reached, the ID assigner will start searching for a free assignment from the beginning of the resource pool.

In addition to assigning IDs to services that are distributed to machines in the network, it is also possible to assign IDs to all services (regardless of whether they have been deployed or not):


$ dydisnix -s services.nix \
--id-resources idresources.nix --output-file ids.nix

Since the above command does not know anything about the target machines, it only works with an ID resources configuration that defines global scope resources.

When you intend to upgrade an existing deployment, you typically want to retain already assigned IDs, while obsolete ID assignments should be removed, and new IDs should be assigned to services that have none yet. This prevents unnecessary redeployments.

When removing the first webapp service and adding a third instance:


{ pkgs, system
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager ? "sysvinit"
, ...
}:

let
  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir forceDisableUserChange processManager ids;
  };

  processType = import ../../nixproc/derive-dysnomia-process-type.nix {
    inherit processManager;
  };
in
rec {
  webapp2 = rec {
    name = "webapp2";
    port = ids.ports.webapp2 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "2";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };

  webapp3 = rec {
    name = "webapp3";
    port = ids.ports.webapp3 or 0;
    dnsName = "webapp.local";
    pkg = constructors.webapp {
      inherit port;
      instanceSuffix = "3";
    };
    type = processType;
    requiresUniqueIdsFor = [ "ports" "uids" "gids" ];
  };
}

And running the following command (that provides the current ids.nix as a parameter):


$ dydisnix -s services.nix -i infrastructure.nix -d distribution.nix \
--id-resources idresources.nix --ids ids.nix --output-file ids.nix

we will get the following ID assignment configuration:


{
  "ids" = {
    "gids" = {
      "webapp2" = 2001;
      "webapp3" = 2002;
    };
    "uids" = {
      "webapp2" = 2001;
      "webapp3" = 2002;
    };
    "ports" = {
      "webapp2" = 5001;
      "webapp3" = 5002;
    };
  };
  "lastAssignments" = {
    "gids" = 2002;
    "uids" = 2002;
    "ports" = 5002;
  };
}

As may be observed, since the webapp2 process is in both the current and the previous configuration, its ID assignments will be retained. webapp1 gets removed because it is no longer in the services model. webapp3 gets the next numeric IDs from the resource pools.

Because the configuration of webapp2 stays the same, it does not need to be redeployed.

The models shown earlier are valid Disnix services models. As a consequence, they can be used with Dynamic Disnix's ID assigner tool: dydisnix-id-assign.

Although these Disnix services models are also valid processes models (used by the Nix process management framework), not every processes model is guaranteed to be compatible with a Disnix services model.

For processes models that are not compatible, it is possible to use the nixproc-id-assign tool, which acts as a wrapper around the dydisnix-id-assign tool:


$ nixproc-id-assign --id-resources idresources.nix processes.nix

Internally, the nixproc-id-assign tool converts a processes model into a Disnix services model (augmenting the process instance objects with missing properties) and propagates it to the dydisnix-id-assign tool.
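
Conceptually, such an augmentation could look something like the sketch below, which fills in the kinds of attributes a Disnix services model expects (the attribute names and the chosen values are illustrative assumptions, not the actual conversion code):

# Hypothetical sketch: derive Disnix-like service attribute sets from the
# process instances in a processes model.
processes:

builtins.mapAttrs (name: process:
  {
    inherit name;
    dependsOn = {};    # assume no inter-service dependencies are specified
    type = "process";  # illustrative process type
  } // process
) processes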

A more advanced example


The webapp processes example is fairly trivial and only needs unique IDs for three kinds of resources: port numbers, UIDs, and GIDs.

I have also developed a more complex example for the Nix process management framework that exposes several commonly used system services on Linux systems, such as:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
  ids = if builtins.pathExists ./ids.nix then (import ./ids.nix).ids else {};

  constructors = import ./constructors.nix {
    inherit pkgs stateDir runtimeDir logDir tmpDir cacheDir forceDisableUserChange processManager ids;
  };
in
rec {
  apache = rec {
    port = ids.httpPorts.apache or 0;

    pkg = constructors.simpleWebappApache {
      inherit port;
      serverAdmin = "root@localhost";
    };

    requiresUniqueIdsFor = [ "httpPorts" "uids" "gids" ];
  };

  postgresql = rec {
    port = ids.postgresqlPorts.postgresql or 0;

    pkg = constructors.postgresql {
      inherit port;
    };

    requiresUniqueIdsFor = [ "postgresqlPorts" "uids" "gids" ];
  };

  influxdb = rec {
    httpPort = ids.influxdbPorts.influxdb or 0;
    rpcPort = httpPort + 2;

    pkg = constructors.simpleInfluxdb {
      inherit httpPort rpcPort;
    };

    requiresUniqueIdsFor = [ "influxdbPorts" "uids" "gids" ];
  };
}

The above processes model exposes three service instances: an Apache HTTP server (that works with a simple configuration that serves web applications from a single virtual host), PostgreSQL and InfluxDB. Each service requires a unique user ID and group ID so that their privileges are separated.

To make these services more accessible and usable, we do not use a shared ports resource pool. Instead, each service type consumes port numbers from its own resource pool.

The following ID resources configuration can be used to provision the unique IDs to the services above:


rec {
  uids = {
    min = 2000;
    max = 3000;
  };

  gids = uids;

  httpPorts = {
    min = 8080;
    max = 8085;
  };

  postgresqlPorts = {
    min = 5432;
    max = 5532;
  };

  influxdbPorts = {
    min = 8086;
    max = 8096;
    step = 3;
  };
}


The above ID resources configuration defines a shared UIDs and GIDs resource pool, but separate ports resource pools for each service type. This has the following implications if we deploy multiple instances of each service type:

  • All Apache HTTP server instances get a TCP port assignment between 8080-8085.
  • All PostgreSQL server instances get a TCP port assignment between 5432-5532.
  • All InfluxDB server instances get a TCP port assignment between 8086-8096. An InfluxDB instance allocates two port numbers: one for the HTTP server and one for the RPC service (the latter's port number is the base port number + 2). We use a step count of 3 so that this convention can be retained for each InfluxDB instance, as the sketch below illustrates.
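
For example, a hypothetical second InfluxDB instance added to the processes model shown earlier would presumably receive the next port from the pool (8089, with a step count of 3), leaving 8091 free for its RPC port:

influxdb2 = rec {
  httpPort = ids.influxdbPorts.influxdb2 or 0;
  rpcPort = httpPort + 2;

  pkg = constructors.simpleInfluxdb {
    inherit httpPort rpcPort;
  };

  requiresUniqueIdsFor = [ "influxdbPorts" "uids" "gids" ];
};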

Conclusion


In this blog post, I have described a new tool: dydisnix-id-assign that can be used to automatically assign unique numeric IDs to services in Disnix service models.

Moreover, I have described nixproc-id-assign, which acts as a thin wrapper around this tool to automatically assign numeric IDs to services in the Nix process management framework's processes model.

This tool replaces the old dydisnix-port-assign tool in the Dynamic Disnix toolset (described in the blog post written five years ago) that is much more limited in its capabilities.

Availability


The dydisnix-id-assign tool is available in the current development version of Dynamic Disnix. The nixproc-id-assign tool is part of the current implementation of the Nix process management framework prototype.

by Sander van der Burg (noreply@blogger.com) at September 24, 2020 06:24 PM

September 16, 2020

Tweag I/O

Implicit Dependencies in Build Systems

In making a build system for your software, you codified the dependencies between its parts. But, did you account for implicit software dependencies, like system libraries and compiler toolchains?

Implicit dependencies give rise to the biggest and most common problem with software builds - the lack of hermeticity. Without hermetic builds, reproducibility and cacheability are lost.

This post motivates the desire for reproducibility and cacheability, and explains how we achieve hermetic, reproducible, highly cacheable builds by taking control of implicit dependencies.

Reproducibility

Consider a developer newly approaching a code repository. After cloning the repo, the developer must install a long list of “build requirements” and plod through multiple steps of “setup”, only to find that, yes indeed, the build fails. Yet, it worked just fine for their colleague! The developer, typically not expert in build tooling, must debug the mysterious failure not of their making. This is bad for morale and for productivity.

This happens because the build is not reproducible.

One very common reason for the failure is that the compiler toolchain on the developer’s system is different from that of the colleague. This happens even with build systems that use sophisticated build software, like Bazel. Bazel implicitly uses whatever system libraries and compilers are currently installed in the developer’s environment.

A common workaround is to provide developers with a Docker image equipped with a certain compiler toolchain and system libraries, and then to mandate that the Bazel build occurs in that context.

That solution has a number of drawbacks. First, if the developer is using macOS, the virtualized build context runs substantially slower. Second, the Bazel build cache, developer secrets, and the source code remain outside of the image and this adds complexity to the Docker invocation. Third, the Docker image must be rebuilt and redistributed as dependencies change and that’s extra maintenance. Fourth, and this is the biggest issue, Docker image builds are themselves not reproducible - they nearly always rely on some external state that does not remain constant across build invocations, and that means the build can fail for reasons unrelated to the developer’s code.

A better solution is to use Nix to supply the compiler toolchain and system library dependencies. Nix is a software package management system somewhat like Debian’s APT or macOS’s Homebrew. Nix goes much farther to help developers control their environments. It is unsurpassed when it comes to reproducible builds of software packages.

Nix facilitates use of the Nixpkgs package set. That set is the largest single set of software packages. It is also the freshest package set. It provides build instructions that work both on Linux and macOS. Developers can easily pin any software package at an exact version.
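
As a minimal sketch of what such pinning can look like (the revision, hash and package names are placeholders, not recommendations), a development shell that supplies a fixed compiler toolchain and system libraries could be declared like this:

# shell.nix -- a pinned development environment (illustrative placeholders).
let
  pkgs = import (builtins.fetchTarball {
    # Pin Nixpkgs at an exact revision so every developer gets the same toolchain.
    url = "https://github.com/NixOS/nixpkgs/archive/<revision>.tar.gz";
    sha256 = "<sha256-of-that-tarball>";
  }) {};
in
pkgs.mkShell {
  buildInputs = [
    pkgs.gcc    # compiler toolchain
    pkgs.zlib   # example system library
  ];
}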

Learn more about using Nix with Bazel, here.

Cacheability

Not only should builds be reproducible, but they should also be fast. Fast builds are achieved by caching intermediate build results. Cache entries are keyed based on the precise dependencies as well as the build instructions that produce the entries. Builds will only benefit from a (shared, distributed) cache when they have matching dependencies. Otherwise, cache keys (which depend on the precise dependencies) will be different, and there will be cache misses. This means that the developer will have to rebuild targets locally. These unnecessary local rebuilds slow development.

The solution is to make the implicit dependencies into explicit ones, again using Nix, making sure to configure and use a shared Nix cache.

Learn more about configuring a shared Bazel cache, here.

Conclusion

It is important to eliminate implicit dependencies in your build system in order to retain build reproducibility and cacheability. Identify Nix packages that can replace the implicit dependencies of your Bazel build and use rules_nixpkgs to declare them as explicit dependencies. That will yield a fast, correct, hermetic build.

September 16, 2020 12:00 AM

September 10, 2020

Tweag I/O

Towards a content-addressed model for Nix

This is my first post about content-addressability in Nix — a long-awaited feature that is hopefully coming soon! In this post I will show you how this feature will improve the Nix infrastructure. I’ll come back in another post to explain the technical challenges of adding content-addressability to Nix.

Nix has a wonderful model for handling packages. Because each derivation is stored under (aka addressed by) a unique name, multiple versions of the same library can coexist on the same system without issues: each version of the library has a distinct name, as far as Nix is concerned.

What’s more, if openssl is upgraded in Nixpkgs, Nix knows that all the packages that depend on openssl (i.e., almost everything) must be rebuilt, if only so that they point at the name of the new openssl version. This way, a Nix installation will never feature a package built for one version of openssl, but dynamically linked against another: as a user, it means that you will never have an undefined symbol error. Hurray!

The input-addressed store

How does Nix achieve this feat? The idea is that the name of a package is derived from all of its inputs (that is, the complete list of dependencies, as well as the package description). So if you change the git tag from which openssl is fetched, its name changes; and if the name of openssl changes, then the name of any package that has openssl among its dependencies changes as well.

However this can be very pessimistic: even changes that aren’t semantically meaningful can imply mass rebuilding and downloading. As a slightly extreme example, this merge-request on Nixpkgs makes a tiny change to the way openssl is built. It doesn’t actually change openssl, yet it requires rebuilding an insane number of packages. Because, as far as Nix is concerned, all these packages have different names, hence are different packages. In reality, though, they weren’t.

Nevertheless, the cost of the rebuild has to be borne by the Nix infrastructure: Hydra builds all packages to populate the cache, and all the newly built packages must be stored. It costs both time and money (in CPU power and storage space).

Unnecessary rebuilds?

Most distributions, by default, don’t rebuild packages when their dependencies change, and have a (more-or-less automated) process to detect changes that require rebuilding reverse dependencies. For example, Debian tries to detect ABI changes automatically and Fedora has a more manual process. But Nix doesn’t.

The issue is that the notion of a “breaking change” is a very fuzzy one. Should we follow Debian and consider that only ABI changes are breaking? This criterion only applies for shared libraries, and as the Debian policy acknowledges, only for “well-behaved” programs. So if we follow this criterion, there’s still need for manual curation, which is precisely what Nix tries to avoid.

The content-addressed model

Quite happily, there is a criterion to avoid many useless rebuilds without sacrificing correctness: detecting when changes in a package (or one of its dependencies) yield the exact same output. That might seem like an edge case, but the openssl example above (and many others) shows that there’s a practical application to it. As another example, go depends on perl for its tests, so an upgrade of perl requires rebuilding all the Go packages in Nixpkgs, although it most likely doesn’t change the output of the go derivation.

But, for Nix to recognise that a package is not a new package, the new, unchanged, openssl or go packages must have the same name as the old version. Therefore, the name of a package must not be derived from its inputs which have changed, but, instead, it should be derived from the content of the compiled package. This is called content addressing.
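
As a toy illustration of the idea (this is not how Nix store paths are actually computed), a content-derived name depends only on the bytes of the output, so identical contents always map to the same name:

let
  # Derive a name from the contents themselves rather than from the inputs.
  contentAddress = contents: "pkg-" + builtins.hashString "sha256" contents;
in
{
  a = contentAddress "hello world";   # same contents => same name as b
  b = contentAddress "hello world";
  c = contentAddress "hello world!";  # different contents => a different name
}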

Content addressing is how you can be sure that when you and a colleague at the other side of the world type git checkout 7cc16bb8cd38ff5806e40b32978ae64d54023ce0 you actually have the exact same content in your tree. Git commits are content addressed, therefore the name 7cc16bb8cd38ff5806e40b32978ae64d54023ce0 refers to that exact tree.

Yet another example of content-addressed storage is IPFS. With IPFS, files can be stored on any number of computers, and even moved from computer to computer. The content-derived name is used as a way to give an intrinsic name to a file, regardless of where it is stored.

In fact, even the particular use case that we are discussing here - avoiding recompilation when a rebuilt dependency hasn’t changed - can be found in various build systems such as Bazel. In build systems, such recompilation avoidance is sometimes known as the early cutoff optimization (see the build systems à la carte paper, for example).

So all we need to do is to move the Nix store from an input-addressed model to a content-addressed model, as used by many tools already, and we will be able to save a lot of storage space and CPU usage, by rebuilding many fewer packages. Nixpkgs contributors will see their CI time improved. It could also allow serving a binary cache over IPFS.

Well, like many things with computers, this is actually way harder than it sounds (which explains why this hasn’t already been done despite being discussed nearly 15 years ago in the original paper), but we now believe that there’s a way forward… more on that in a later post.

Conclusion

A content-addressed store for Nix would help reduce the insane load that Hydra has to sustain. While content-addressing is a common technique both in distributed systems and build systems (Nix is both!), getting to the point where it was feasible to integrate content-addressing in Nix has been a long journey.

In a future post, I’ll explain why it was so hard, and how we finally managed to propose a viable design for a content-addressed Nix.

September 10, 2020 12:00 AM

August 28, 2020

nixbuild.net

nixbuild.net is Generally Available

Today, nixbuild.net is exiting private beta and becoming generally available! Anyone can now sign up for a nixbuild.net account and immediately start building using the free CPU hours included with every account.

After the free CPU hours have been consumed, the pricing is simple: 0.12 EUR (excl. VAT) per CPU hour consumed, billed monthly.

As part of this GA announcement, a number of marketing and documentation improvements have been published.

We’re really happy for nixbuild.net to enter this new phase — making simple, performant and scalable remote builds available to every Nix user! We’re excited to see how the service is used, and we have lots of plans for the future of nixbuild.net.

by nixbuild.net (support@nixbuild.net) at August 28, 2020 12:00 AM

August 20, 2020

Tweag I/O

How Nix grew a marketing team

Recently I witnessed the moment when a potential Nix user reached eureka — the moment where everything regarding Nix made sense. My friend, now a Nix user, screamed with joy: “We need to Nix–ify everything!”

Moments like these reinforce my belief that Nix is a solution from — and for — the future. A solution that could reach many more people, if only learning about Nix didn’t demand investing as much time and effort as it does now.

I think that Nix has the perfect foundation for becoming a success but that it still needs better marketing. Many others agree with me, and that’s why we formed the Nix marketing team.

I would like to convince you that indeed, marketing is the way to go and that it is worth it. Therefore, in this post I will share my thoughts on what kind of success we aim for, and which marketing efforts we are currently pursuing. The marketing team is already giving its first results, and with your input, we can go further.

What does success look like?

At the time of writing this post, I have been using Nix for 10 years. I organized one and attended most of the Nix conferences since then, and talked to many people in the community. All of this does not give me the authority to say what success for Nix looks like, but it does give me a great insight into what we — the Nix community — can agree on.

Success for Nix would mean that the next time you encounter a project on GitHub, it already contains a default.nix for you to start developing. Success would mean that the next time you try to run a server in the cloud, NixOS is offered to you. Or, even more ambitiously, it would mean other communities recognising Nix as a de facto standard that improves the industry as a whole.

To some, this success statement may seem very obvious. However, it is important to say it out loud and often, so we can keep focus, and keep working on the parts of Nix that will contribute the most to this success.

The importance of marketing

Before we delve into what Nix still lacks, I would like to say that we — engineers and developers — should be aware of our bias against marketing. This bias becomes clear when we think about what we think are the defining aspects for a project’s success. We tend to believe that code is everything, and that good code leads to good results. But what if I tell you that good marketing constitutes more than 50% of the success of a project? Would you be upset? We have to overcome this bias, since it prevents us from seeing the big picture.

Putting aside those Sunday afternoons when I code for the pure joy of stretching my mind, most of the time I simply want to solve a problem. The joy of seeing others realize that their problem is not a problem anymore is one of the best feelings I have experienced as a developer. This is what drives me. Not the act of coding itself, but the act of solving the problem. Coding is then only part of the solution. Others need to know about the existence of your code, understand how it can solve their problem, and furthermore they need to know how to use it.

That is why marketing, and, more generally, non-technical work, is at least as important as technical work. Documentation, writing blog posts, creating content for the website, release announcements, conference talks, conference booths, forums, chat channels, email lists, demo videos, use cases, swag, search engine optimisation, social media presence, engaging with the community… These are all crucial parts of any successful project.

Nix needs better marketing, from a better website to better documentation, along with all the ingredients mentioned above. If we want Nix to grow as a project we need to improve our marketing game, since this is the area of work that has historically received the least attention. And we are starting to work on it. In the middle of March 2020, a bunch of us got together and announced the creation of the Nix marketing team. Since then we meet roughly every two weeks to discuss and work on non-technical challenges that the Nix project is facing.

But before the Nix marketing team could start doing any actual work we had to answer an important question:

What is Nix?

I want to argue that the Nix community is still missing an answer to an apparently very simple question: What is Nix?.

The reason why what is Nix? is a harder question than it may appear at first is that any complete answer has to tell us what and who Nix is for. Knowing the audience and primary use cases is a precondition to improving the website, documentation, or even Nix itself.

This is what the Nix marketing team discussed first. We identified the following audiences and primary use cases:

  1. Development environments (audience: developers)
  2. Deploying to the cloud (audience: system administrators)

It doesn’t mean other use cases are not important — they are. We are just using the primary use cases as a gateway drug into the rest of the Nix ecosystem. In this way, new users will not be overwhelmed with all the existing options and will have a clear idea of where to start.

Some reasons for selecting the two use cases are:

  • Both use cases are relatively polished solutions. Clearly, there is still much to be improved, but currently these are the two use cases with the best user experience in the Nix ecosystem.
  • One use case is a natural continuation of another. First, you develop and then you can use the same tools to package and deploy.
  • Market size for both use cases is huge, which means there is a big potential.

A differentiating factor — why somebody would choose Nix over others — is Nix’s ability to provide reproducible results. The promise of reproducibility is the aspect that already attracts the majority of Nix’s user base. From this, we came up with a slogan for Nix:

Reproducible builds and deploys

With the basic question answered we started working.

What has been done so far? How can I help?

So far, the Marketing team focused on improving the website:

  1. Moved the website to Netlify. The important part is not switching to Netlify, but separating the website from the Nix infrastructure. This removes the fear of a website update bringing down parts of Nix infrastructure.
  2. Simplified navigation. If you remember, the navigation was different for each project that was listed on the website. We removed the project differentiation and unified navigation. This shows the Nix ecosystem as a unified story rather than a collection of projects. One story is easier to follow than five.
  3. Created a new learn page. Discoverability of documentation was a huge problem. Links to popular topics in manuals are now more visible. Some work on entry level tutorials has also started. Good and beginner friendly learning resources are what is going to create the next generation of Nix users.
  4. Created new team pages. We collected information about different official and less official teams working on Nix. The work here is not done, but it shows that many teams don’t have clear responsibilities. It shows how decisions are made and invites new Nix users to become more involved with the project.
  5. Improved landing page. Instead of being told what Nix is, visitors will experience it from the start. The landing page is filled with examples that will convince them to give Nix a try.

The work of the marketing team has just started, and there is still a lot to be done. We are working hard on redesigning the website and improving the messaging. The roadmap will tell you more about what to expect next.

If you wish to help come and say hi to #nixos-marketing on irc.freenode.org.

Conclusion

Marketing, and non-technical work, is all too often an afterthought for developers. I really wish it weren’t the case. Having clearly defined problems, audience and strategy should be as important to us as having clean and tested code. This is important for Nix. This is important for any project that aims to succeed.

August 20, 2020 12:00 AM

August 13, 2020

nixbuild.net

Build Reuse in nixbuild.net

Performance and cost-effectiveness are core values for nixbuild.net. How do you make a Nix build as performant and cheap as possible? The answer is — by not running it at all!

This post goes into some detail about the different ways nixbuild.net is able to safely reuse build results. The post gets technical, but the main message is that nixbuild.net really tries to avoid building if it can, in order to save time and money for its users.

Binary Caches

The most obvious way of reusing build results is by utilising binary caches, and an earlier blog post described how this is supported by nixbuild.net. In short, if something has been built on cache.nixos.org, nixbuild.net can skip building it and just fetch it. It is also possible to configure other binary caches to use, and even treat the builds of specific nixbuild.net users in the same way as a trusted binary cache.

No Shared Uploads

As part of the Nix remote build protocol, inputs (dependencies) can be uploaded directly to nixbuild.net. Those inputs are not necessarily trustworthy, because we don’t know how they were produced. Therefore, those inputs are only allowed to be used by the user who uploaded them. The exception is if the uploaded input had a signature from a binary cache key; then we allow it to be used by all accounts that trust that specific key. Also, if explicit trust has been set up between two accounts, uploaded paths will be shared.

Derivation Sharing

Another method of reuse, unique to nixbuild.net, is the sharing of build results between users that don’t necessarily trust each other. It works like this:

  1. When we receive a build request, we get a derivation from the user’s Nix client. In essence, this derivation describes what inputs (dependencies) the build needs, and what commands must be run to produce the build output.

  2. The inputs are described in the derivation simply as a list of store paths (/nix/store/abc, /nix/store/xyz). The way the Nix remote build protocol works, those store paths have already been provided to us, either because we already had trusted variants of them in our storage, or because we’ve downloaded them from binary caches, or because the client uploaded them to us.

    In order for us to be able to run the build, we need to map the input store paths to the actual file contents of the inputs. This mapping can actually vary even though store paths are the same. This is because a Nix store path does not depend on the contents of the path, but rather on the dependencies of the path. So we can very well have multiple versions of the same store path in our storage, because multiple users might have uploaded differing builds of the same paths.

    Anyhow, we will end up with a mapping that depends entirely on what paths the user is allowed to use. So, two users may build the exact same derivation but get different store-path-to-content mappings.

  3. At this stage, we store a representation of both the derivation itself, and the mapping described in previous step. Together, these two pieces represent a unique derivation in nixbuild.net’s database.

  4. Now, we can build the derivation. The build runs inside an isolated, virtualized sandbox that has no network access and nothing other than its inputs inside its filesystem.

    The sandbox is of course vital for keeping your builds secure, but it has another application, too: If we already have built a specific derivation (with a specific set of input content), this build result can be reused for any user that comes along and requests a build of the exact same derivation with the exact same set of input content.

We do not yet have any numbers on how big an impact this type of build result sharing has in practice. The effectiveness will depend on how reproducible the builds are, and of course also on how many users are likely to build the same derivations.

For an organization with a large set of custom packages that wants to share binary builds with contributors and users, it could turn out to be useful. The benefit for users is that they don’t actually have to blindly trust a binary cache but instead can be sure that they get binaries that correspond to the Nix derivations they have evaluated.

by nixbuild.net (support@nixbuild.net) at August 13, 2020 12:00 AM

August 12, 2020

Tweag I/O

Developing Python with Poetry & Poetry2nix: Reproducible flexible Python environments

Most Python projects are in fact polyglot. Indeed, many popular libraries on PyPi are Python wrappers around C code. This applies particularly to popular scientific computing packages, such as scipy and numpy. Normally, this is the terrain where Nix shines, but its support for Python projects has often been labor-intensive, requiring lots of manual fiddling and fine-tuning. One of the reasons for this is that most Python package management tools do not give enough static information about the project, not offering the determinism needed by Nix.

Thanks to Poetry, this is a problem of the past — its rich lock file offers more than enough information to get Nix running, with minimal manual intervention. In this post, I will show how to use Poetry, together with Poetry2nix, to easily manage Python projects with Nix. I will show how to package a simple Python application both using the existing support for Python in Nixpkgs, and then using Poetry2nix. This will both show why Poetry2nix is more convenient, and serve as a short tutorial covering its features.

Our application

We are going to package a simple application, a Flask server with two endpoints: one returning a static string “Hello World” and another returning a resized image. This application was chosen because:

  1. It can fit into a single file for the purposes of this post.
  2. Image resizing using Pillow requires the use of native libraries, which is something of a strength of Nix.

The code for it is in the imgapp/__init__.py file:

from flask import send_file
from flask import Flask
from io import BytesIO
from PIL import Image
import requests


app = Flask(__name__)


IMAGE_URL = "https://farm1.staticflickr.com/422/32287743652_9f69a6e9d9_b.jpg"
IMAGE_SIZE = (300, 300)


@app.route('/')
def hello():
    return "Hello World!"


@app.route('/image')
def image():
    r = requests.get(IMAGE_URL)
    if not r.status_code == 200:
        raise ValueError(f"Response code was '{r.status_code}'")

    img_io = BytesIO()

    img = Image.open(BytesIO(r.content))
    img.thumbnail(IMAGE_SIZE)
    img.save(img_io, 'JPEG', quality=70)

    img_io.seek(0)

    return send_file(img_io, mimetype='image/jpeg')


def main():
    app.run()


if __name__ == '__main__':
    main()

The status quo for packaging Python with Nix

There are two standard techniques for integrating Python projects with Nix.

Nix only

The first technique uses only Nix for package management, and is described in the Python section of the Nix manual. While it works and may look very appealing on the surface, it uses Nix for all package management needs, which comes with some drawbacks:

  1. We are essentially tied to whatever package version Nixpkgs provides for any given dependency. This can be worked around with overrides, but those can cause version incompatibilities. This happens often in complex Python projects, such as data science ones, which tend to be very sensitive to version changes.
  2. We are tied to using packages already in Nixpkgs. While Nixpkgs has many Python packages already packaged up (around 3000 right now) there are many packages missing — PyPi, the Python Package Index has more than 200000 packages. This can of course be worked around with overlays and manual packaging, but this quickly becomes a daunting task.
  3. In a team setting, every team member wanting to add packages needs to buy in to Nix and at least have some experience using and understanding Nix.

All these factors lead us to a conclusion: we need to embrace Python tooling so we can efficiently work with the entire Python ecosystem.

Pip and Pypi2Nix

The second standard method tries to overcome the faults above by using a hybrid approach of Python tooling together with Nix code generation. Instead of writing dependencies manually in Nix, they are extracted from the requirements.txt file that users of Pip and Virtualenv are very used to. That is, from a requirements.txt file containing the necessary dependencies:

requests
pillow
flask

we can use pypi2nix to package our application in a more automatic fashion than before:

nix-shell -p pypi2nix --run "pypi2nix -r requirements.txt"

However, Pip is not a dependency manager and therefore the requirements.txt file is not explicit enough — it lacks both exact versions for libraries, and system dependencies. Therefore, the command above will not produce a working Nix expression. In order to make pypi2nix work correctly, one has to manually find all dependencies incurred by the use of Pillow:

nix-shell -p pypi2nix --run "pypi2nix -V 3.8 -E pkgconfig -E freetype -E libjpeg -E openjpeg -E zlib -E libtiff -E libwebp -E tcl -E lcms2 -E xorg.libxcb -r requirements.txt"

This will generate a large Nix expression, that will indeed work as expected. Further use of Pypi2nix is left to the reader, but we can already draw some conclusions about this approach:

  1. Code generation results in huge Nix expressions that can be hard to debug and understand. These expressions will typically be checked into a project repository, and can get out of sync with actual dependencies.
  2. It’s very high friction, especially around native dependencies.

Having many large Python projects, I wasn’t satisfied with the status quo around Python package management. So I looked into what could be done to make the situation better, and which tools could be more appropriate for our use-case. A potential candidate was Pipenv, however its dependency solver and lock file format were difficult to work with. In particular, Pipenv’s detection of “local” vs “non-local” dependencies did not work properly inside the Nix shell and gave us the wrong dependency graph. Eventually, I found Poetry and it looked very promising.

Poetry and Poetry2nix

The Poetry package manager is a relatively recent addition to the Python ecosystem but it is gaining popularity very quickly. Poetry features a nice CLI with good UX and deterministic builds through lock files.

Poetry uses pip under the hood and, for this reason, inherited some of its shortcomings and lock file design. I managed to land a few patches in Poetry before the 1.0 release to improve the lock file format, and now it is fit for use in Nix builds. The result was Poetry2nix, whose key design goals were:

  1. Dead simple API.
  2. Work with the entire Python ecosystem using regular Python tooling.
  3. Python developers should not have to be Nix experts, and vice versa.
  4. Being an expert should allow you to “drop down” into the lower levels of the build and customise it.

Poetry2nix is not a code generation tool — it is implemented in pure Nix. This fixes many of the problems outlined in the previous paragraphs, since there is a single point of truth for dependencies and their versions.

But what about our native dependencies from before? How does Poetry2nix know about those? Indeed, Poetry2nix comes with an extensive set of overrides built-in for a lot of common packages, including Pillow. Users are encouraged to contribute overrides upstream for popular packages, so everyone can have a better user experience.

Now, let’s see how Poetry2nix works in practice.

Developing with Poetry

Let’s start with only our application file above (imgapp/__init__.py) and a shell.nix:

{ pkgs ? import <nixpkgs> {} }:

pkgs.mkShell {

  buildInputs = [
    pkgs.python3
    pkgs.poetry
  ];

}

Poetry comes with some nice helpers to create a project, so we run:

$ poetry init

And then we’ll add our dependencies:

$ poetry add requests pillow flask

We now have two files in the folder:

  • The first one is pyproject.toml which not only specifies our dependencies but also replaces setup.py.
  • The second is poetry.lock which contains our entire pinned Python dependency graph.

For Nix to know which scripts to install in the bin/ output directory, we also need to add a scripts section to pyproject.toml:

[tool.poetry]
name = "imgapp"
version = "0.1.0"
description = ""
authors = ["adisbladis <adisbladis@gmail.com>"]

[tool.poetry.dependencies]
python = "^3.7"
requests = "^2.23.0"
pillow = "^7.1.2"
flask = "^1.1.2"

[tool.poetry.dev-dependencies]

[tool.poetry.scripts]
imgapp = 'imgapp:main'

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"

Packaging with Poetry2nix

Since Poetry2nix is not a code generation tool but implemented entirely in Nix, this step is trivial. Create a default.nix containing:

{ pkgs ? import <nixpkgs> {} }:
pkgs.poetry2nix.mkPoetryApplication {
  projectDir = ./.;
}

We can now invoke nix-build to build our package defined in default.nix. Poetry2nix will automatically infer package names, dependencies, meta attributes and more from the Poetry metadata.
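
For example (assuming the imgapp script declared in the pyproject.toml scripts section above), building and running the application could look like this:

$ nix-build
$ ./result/bin/imgapp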

Manipulating overrides

Many overrides for system dependencies are already upstream, but what if some are lacking? These overrides can be manipulated and extended manually:

poetry2nix.mkPoetryApplication {
    projectDir = ./.;
    overrides = poetry2nix.overrides.withDefaults (self: super: {
      foo = super.foo.overridePythonAttrs (oldAttrs: {});
    });
}
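
For instance, a hypothetical package foo that needs pkg-config and zlib at build time could be overridden roughly as follows (the package and dependency names are only an illustration, assuming pkgs and poetry2nix are in scope):

poetry2nix.mkPoetryApplication {
    projectDir = ./.;
    overrides = poetry2nix.overrides.withDefaults (self: super: {
      foo = super.foo.overridePythonAttrs (oldAttrs: {
        # Add the native dependencies that building "foo" from source would need.
        nativeBuildInputs = (oldAttrs.nativeBuildInputs or []) ++ [ pkgs.pkgconfig ];
        buildInputs = (oldAttrs.buildInputs or []) ++ [ pkgs.zlib ];
      });
    });
}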

Conclusion

By embracing both modern Python package management tooling and the Nix language, we can achieve best-in-class user experience for Python developers and Nix developers alike.

There are ongoing efforts to make Poetry2nix and other Nix Python tooling work better with data science packages like numpy and scipy. I believe that Nix may soon rival Conda on Linux and MacOS for data science.

Python + Nix has a bright future ahead of it!

August 12, 2020 12:00 AM

August 11, 2020

Sander van der Burg

Experimenting with Nix and the service management properties of Docker

In the previous blog post, I have analyzed Nix and Docker as deployment solutions and described in what ways these solutions are similar and different.

To summarize my findings:

  • Nix is a source-based package manager responsible for obtaining, installing, configuring and upgrading packages in a reliable and reproducible manner and facilitating the construction of packages from source code and their dependencies.
  • Docker's purpose is to fully manage the life-cycle of applications (services and ordinary processes) in a reliable and reproducible manner, including their deployments.

As explained in my previous blog post, two prominent goals both solutions have in common is to facilitate reliable and reproducible deployment. They both use different kinds of techniques to accomplish these goals.

Although Nix and Docker can be used for a variety of comparable use cases (such as constructing images, deploying test environments, and constructing packages from source code), one prominent feature that the Nix package manager does not provide is process (or service) management.

In a Nix-based workflow you need to augment Nix with another solution that can facilitate process management.

In this blog post, I will investigate how Docker could fulfill this role -- it is pretty much the opposite of the combined use-case scenarios I have shown in the previous blog post, in which Nix can take over the role of a conventional package manager in supplying packages during the construction of an image, and even drive the complete construction process of images.

Existing Nix integrations with process management


Although Nix does not do any process management, there are sister projects that can, such as:

  • NixOS builds entire machine configurations from a single declarative deployment specification and uses the Nix package manager to deploy and isolate all static artifacts of a system. It will also automatically generate and deploy systemd units for services defined in a NixOS configuration.
  • nix-darwin can be used to specify a collection of services in a deployment specification and uses the Nix package manager to deploy all services and their corresponding launchd configuration files.

Although both projects do a great job (e.g. they both provide a big collection of deployable services), what I consider a disadvantage is that they are platform specific -- each solution only works with a single operating system (Linux or macOS, respectively) and a single process management solution (systemd or launchd).

If you are using Nix in a different environment, such as a different operating system, a conventional (non-NixOS) Linux distribution, or a different process manager, then there is no off-the-shelf solution that will help you manage services for packages provided by Nix.

Docker functionality


Docker could be considered a multi-functional solution for application management. I can categorize its functionality as follows:

  • Process management. The life-cycle of a container is bound to the life-cycle of a root process that needs to be started or stopped.
  • Dependency management. To ensure that applications have all the dependencies that they need and that no dependency is missing, Docker uses images containing a complete root filesystem with all required files to run an application.
  • Resource isolation is heavily used for a variety of different reasons:
    • Foremost, to ensure that the root filesystem of the container does not conflict with the host system's root filesystem.
    • It is also used to prevent conflicts with other kinds of resources. For example, the isolated network interfaces allow services to bind to the same TCP ports that may also be in use by the host system or other containers.
    • It offers some degree of protection. For example, a malicious process will not be able to see or control a process belonging to the host system or a different container.
  • Resource restriction can be used to limit the amount of system resources that a process can consume, such as the amount of RAM.

    Resource restriction can be useful for a variety of reasons, for example, to prevent a service from eating up all the system's resources affecting the stability of the system as a whole.
  • Integrations with the host system (e.g. volumes) and other services.

As described in the previous blog post, Docker uses a number of key concepts to implement the functionality shown above, such as layers, namespaces and cgroups.

Developing a Nix-based process management solution


For quite some time, I have been investigating the process management domain and worked on a prototype solution to provide a more generalized infrastructure that complements Nix with process management -- I came up with an experimental Nix-based process manager-agnostic framework that has the following objectives:

  • It uses Nix to deploy all required packages and other static artifacts (such as configuration files) that a service needs.
  • It integrates with a variety of process managers on a variety of operating systems. So far, it can work with: sysvinit scripts, BSD rc scripts, supervisord, systemd, cygrunsrv and launchd.

    In addition to process managers, it can also automatically convert a processes model to deployment specifications that Disnix can consume.
  • It uses declarative specifications to define functions that construct managed processes and process instances.

    Processes can be declared in a process-manager specific and process-manager agnostic way. The latter makes it possible to target all six supported process managers with the same declarative specification, albeit with a limited set of features.
  • It allows you to run multiple instances of processes, by introducing a convention to cope with potential resource conflicts between process instances -- instance properties and potential conflicts can be configured with function parameters and can be changed in such a way that they do not conflict.
  • It can facilitate unprivileged user deployments by using Nix's ability to perform unprivileged package deployments and introducing a convention that allows you to disable user switching.

To summarize how the solution works from a user point of view, we can write a process manager-agnostic constructor function as follows:


{createManagedProcess, tmpDir}:
{port, instanceSuffix ? "", instanceName ? "webapp${instanceSuffix}"}:

let
webapp = import ../../webapp;
in
createManagedProcess {
name = instanceName;
description = "Simple web application";
inherit instanceName;

process = "${webapp}/bin/webapp";
daemonArgs = [ "-D" ];

environment = {
PORT = port;
PID_FILE = "${tmpDir}/${instanceName}.pid";
};
user = instanceName;
credentials = {
groups = {
"${instanceName}" = {};
};
users = {
"${instanceName}" = {
group = instanceName;
description = "Webapp";
};
};
};

overrides = {
sysvinit = {
runlevels = [ 3 4 5 ];
};
};
}

The Nix expression above is a nested function that defines in a process manager-agnostic way a configuration for a web application process containing an embedded web server serving a static HTML page.

  • The outer function header (first line) refers to parameters that are common to all process instances: createManagedProcess is a function that can construct process manager configurations and tmpDir refers to the directory in which temp files are stored (which is /tmp in conventional Linux installations).
  • The inner function header (second line) refers to instance parameters -- when it is desired to construct multiple instances of this process, we must make sure that these parameters are configured in such a way that they do not conflict with other processes.

    For example, when we assign a unique TCP port and a unique instance name (a property used by the daemon tool to create unique PID files) we can safely have multiple instances of this service co-existing on the same system.
  • In the body, we invoke the createManagedProcess function to generate configuration files for a process manager.
  • The process parameter specifies the executable that we need to run to start the process.
  • The daemonArgs parameter specifies command-line arguments passed to the process executable when the process should daemonize itself (the -D parameter instructs the webapp process to daemonize).
  • The environment parameter specifies all environment variables. Environment variables are used as a generic configuration facility for the service.
  • The user parameter specifies the name of the user the process should run as (each process instance has its own user and group with the same name as the instance).
  • The credentials parameter is used to automatically create the group and user that the process needs.
  • The overrides parameter makes it possible to override the parameters generated by the createManagedProcess function with process manager-specific overrides, to configure features that are not universally supported.

    In the example above, we use an override to configure the runlevels in which the service should run (runlevels 3-5 are typically used to boot a system that is network capable). Runlevels are a sysvinit-specific concept.

In addition to defining constructor functions allowing us to construct zero or more process instances, we also need to construct process instances. These can be defined in a processes model:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
, processManager
}:

let
constructors = import ./constructors.nix {
inherit pkgs stateDir runtimeDir logDir tmpDir;
inherit forceDisableUserChange processManager;
};
in
rec {
webapp = rec {
port = 5000;
dnsName = "webapp.local";

pkg = constructors.webapp {
inherit port;
};
};

nginxReverseProxy = rec {
port = 8080;

pkg = constructors.nginxReverseProxyHostBased {
webapps = [ webapp ];
inherit port;
} {};
};
}

The above Nix expression defines two process instances and uses the following conventions:

  • The first line is a function header in which the function parameters correspond to adjustable properties that apply to all process instances:
    • stateDir allows you to globally override the base directory in which all state is stored (the default value is: /var).
    • We can also change the location of each individual state directory (tmpDir, cacheDir, logDir, runtimeDir etc.) if desired.
    • forceDisableUserChange can be enabled to prevent the process manager from changing user permissions and creating users and groups. This is useful to facilitate unprivileged user deployments, in which the user typically has no rights to change user permissions.
    • The processManager parameter allows you to pick a process manager. All process configurations will be automatically generated for the selected process manager.

      For example, if we pick systemd, then all configurations get translated to systemd units; picking supervisord causes all configurations to be translated to supervisord configuration files.
  • To get access to constructor functions, we import a constructors expression that composes all constructor functions by calling them with their common parameters (not shown in this blog post).

    The constructors expression also contains a reference to the Nix expression that deploys the webapp service, shown in our previous example.
  • The processes model defines two processes: a webapp instance that listens to TCP port 5000 and Nginx that acts as a reverse proxy forwarding requests to webapp process instances based on the virtual host name.
  • webapp is declared a dependency of the nginxReverseProxy service (by passing webapp as a parameter to the constructor function of Nginx). This causes webapp to be activated before the nginxReverseProxy.

To deploy all process instances with a process manager, we can invoke a variety of tools that are bundled with the experimental Nix process management framework.

The processes model can be deployed as sysvinit scripts for an unprivileged user, with the following command:


$ nixproc-sysvinit-switch --state-dir /home/sander/var \
--force-disable-user-change processes.nix

The above command automatically generates sysvinit scripts, changes the base directory of all state folders to a directory in the user's home directory (/home/sander/var), and disables user changing (and creation) so that an unprivileged user can run it.

The following command uses systemd as a process manager with the default parameters, for production deployments:


$ nixproc-systemd-switch processes.nix

The above command automatically generates systemd unit files and invokes systemd to deploy the processes.

In addition to the examples shown above, the framework contains many more tools, such as: nixproc-supervisord-switch, nixproc-launchd-switch, nixproc-bsdrc-switch, nixproc-cygrunsrv-switch, and nixproc-disnix-switch that all work with the same processes model.
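Assuming that these tools accept the same common parameters as the sysvinit variant shown above (which I believe they do, but treat this as an assumption), an unprivileged supervisord-based deployment would look very similar:

$ nixproc-supervisord-switch --state-dir /home/sander/var \
--force-disable-user-change processes.nix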

Integrating Docker into the process management framework


Both Docker and the Nix-based process management framework are multi-functional solutions. After comparing the functionality of Docker and the process management framework, I realized that it is possible to integrate Docker into this framework as well, if I would use it in an unconventional way, by disabling or substituting some of its conflicting features.

Using a shared Nix store


As explained in the beginning of this blog post, Docker's primary means to provide dependencies is by using images that are self-contained root file systems containing all necessary files (e.g. packages, configuration files) to allow an application to work.

In the previous blog post, I have also demonstrated that instead of using traditional Dockerfiles to construct images, we can also use the Nix package manager as a replacement. A Docker image built by Nix is typically smaller than a conventional Docker image built from a base Linux distribution, because it only contains the runtime dependencies that an application actually needs.

A major disadvantage of using Nix constructed Docker images is that they only consist of one layer -- as a result, there is no reuse between container instances running different services that use common libraries. To alleviate this problem, Nix can also build layered images, in which common dependencies are isolated in separate layers as much as possible.

There is even a more optimal reuse strategy possible -- when running Docker on a machine that also has Nix installed, we do not need to put anything that is in the Nix store in a disk image. Instead, we can share the host system's Nix store between Docker containers.

This may sound scary, but as I have explained in the previous blog post, paths in the Nix store are prefixed with SHA256 hash codes. When two Nix store paths with identical hash codes are built on two different machines, their build results should be (nearly) bit-identical. As a result, it is safe to share the same Nix store path between multiple machines and containers.

A hacky solution to build a container image, without actually putting any of the Nix built packages in the container, can be done with the following expression:


with import <nixpkgs> {};

let
cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
in
dockerTools.buildImage {
name = "nginxexp";
tag = "test";

runAsRoot = ''
${dockerTools.shadowSetup}
groupadd -r nogroup
useradd -r nobody -g nogroup -d /dev/null
mkdir -p /var/log/nginx /var/cache/nginx /var/www
cp ${./index.html} /var/www/index.html
'';

config = {
Cmd = map (arg: builtins.unsafeDiscardStringContext arg) cmd;
Expose = {
"80/tcp" = {};
};
};
}

The above expression is quite similar to the Nix-based Docker image example shown in the previous blog post, that deploys Nginx serving a static HTML page.

The only difference is how I configure the start command (the Cmd parameter). In the Nix expression language, strings have context -- any string that interpolates a value evaluating to a Nix store path carries that path in its context, and if such a string is passed to a build function, the corresponding Nix store paths automatically become dependencies of the package that the build function builds.

By using the unsafe builtins.unsafeDiscardStringContext function I can discard the context of strings. As a result, the Nix packages that the image requires are still built. However, because their context is discarded they are no longer considered dependencies of the Docker image. As a consequence, they will not be integrated into the image that the dockerTools.buildImage creates.

(As a sidenote: there are still two Nix store paths that end up in the image, namely bash and glibc, which is a runtime dependency of bash. This is caused by the fact that the internals of the dockerTools.buildImage function make a reference to bash without discarding its context. In theory, it is possible to eliminate this dependency as well).
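To illustrate in isolation what discarding a string's context does, consider the following small sketch (using the hello package purely as an example). Both generated files contain exactly the same text, but only the first one retains hello as a dependency, because only its string still carries context:

with import <nixpkgs> {};

{
  withContext = writeText "with-context" "${hello}/bin/hello";

  withoutContext = writeText "without-context"
    (builtins.unsafeDiscardStringContext "${hello}/bin/hello");
}

Building both attributes and inspecting the results with nix-store -qR shows hello in the closure of the first file only.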

To run the container and make sure that the required Nix store paths are available, I can mount the host system's Nix store as a shared volume:


$ docker run -p 8080:80 -v /nix/store:/nix/store -it nginxexp:latest

By mounting the host system's Nix store (with the -v parameter), Nginx should still behave as expected -- it is not provided by the image, but referenced from the shared Nix store.

(As a sidenote: mounting the host system's Nix store for sharing is not a new idea. It has already been intensively used by the NixOS test driver for many years to rapidly create QEMU virtual machines for system integration tests).

Using the host system's network


As explained in the previous blog post, every Docker container by default runs in its own private network namespace making it possible for services to bind to any port without conflicting with the services on the host system or services provided by any other container.

The Nix process management framework does not work with private networks, because it is not a generalizable concept (i.e. namespaces are a Linux-only feature). Aside from Docker, the only other process manager supported by the framework that can work with namespaces is systemd.

To prevent ports and other dynamic resources from conflicting with each other, the process management framework makes it possible to configure them through instance function parameters. If the instance parameters have unique values, they will not conflict with other process instances (based on the assumption that the packager has identified all possible conflicts that a process might have).

Because we already have a framework that prevents conflicts, we can also instruct Docker to use the host system's network with the --network host parameter:


$ docker run -v /nix/store:/nix/store --network host -it nginxexp:latest

The only thing the framework cannot provide you is protection -- a malicious service running in a private network namespace cannot connect to ports used by other containers or the host system, but the framework cannot protect you against such services.

Mapping a base directory for storing state


Services that run in containers are not always stateless -- they may rely on data that should be persistently stored, such as databases. The Docker recommendation to handle persistent state is not to store it in a container's writable layer, but on a shared volume on the host system.

Data stored outside the container makes it possible to reliably upgrade a container -- when it is desired to install a newer version of an application, the container can be discarded and recreated from a new image.

For the Nix process management framework, integration with a state directory outside the container is also useful. With an extra shared volume, we can mount the host system's state directory:


$ docker run -v /nix/store:/nix/store \
-v /var:/var --network host -it nginxexp:latest

Orchestrating containers


The last piece of the puzzle is to orchestrate the containers: we must create or discard them, start or stop them, and perform all required steps in the right order.

Moreover, to prevent the Nix packages that a container needs from being garbage collected, we need to make sure that they are a dependency of a package that is registered as in use.

I came up with my own convention to implement the container deployment process. When building the processes model for the docker process manager, the following files are generated that help me orchestrate the deployment process:


01-webapp-docker-priority
02-nginx-docker-priority
nginx-docker-cmd
nginx-docker-createparams
nginx-docker-settings
webapp-docker-cmd
webapp-docker-createparams
webapp-docker-settings

In the above list, we have the following kinds of files:

  • The files that have a -docker-settings suffix contain general properties of the container, such as the image that needs to be used as a template.
  • The files that have a -docker-createparams suffix contain the command line parameters that are propagated to docker create to create the container. If a container with the same name already exists, the container creation is skipped and the existing instance is used instead.
  • To prevent the Nix packages that a Docker container needs from being garbage collected the generator creates a file with a -docker-cmd suffix containing the Cmd instruction including the full Nix store paths of the packages that a container needs.

    Because the strings' contexts are not discarded in the generation process, the packages become a dependency of the configuration file. As long as this configuration file is deployed, the packages will not get garbage collected.
  • To ensure that the containers are activated in the right order, we have two files that are prefixed with two numeric digits and have a -docker-priority suffix. The numeric digits determine in which order the containers should be activated -- in the above example the webapp process gets activated before Nginx (that acts as a reverse proxy).

With the following command, we can automatically generate the configuration files shown above for all our processes in the processes model, and use it to automatically create and start docker containers for all process instances:


$ nixproc-docker-switch processes.nix
55d833e07428: Loading layer [==================================================>] 46.61MB/46.61MB
Loaded image: webapp:latest
f020f5ecdc6595f029cf46db9cb6f05024892ce6d9b1bbdf9eac78f8a178efd7
nixproc-webapp
95b595c533d4: Loading layer [==================================================>] 46.61MB/46.61MB
Loaded image: nginx:latest
b195cd1fba24d4ec8542c3576b4e3a3889682600f0accc3ba2a195a44bf41846
nixproc-nginx

The result is two running Docker containers that correspond to the process instances shown in the processes model:


$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b195cd1fba24 nginx:latest "/nix/store/j3v4fz9h…" 15 seconds ago Up 14 seconds nixproc-nginx
f020f5ecdc65 webapp:latest "/nix/store/b6pz847g…" 16 seconds ago Up 15 seconds nixproc-webapp

and we should be able to access the example HTML page by opening the following URL in a web browser: http://localhost:8080.

Deploying Docker containers in a heterogeneous and/or distributed environment


As explained in my previous blog posts about the experimental Nix process management framework, the processes model is a subset of a Disnix services model. When it is desired to deploy processes to a network of machines or combine processes with other kinds of services, we can easily turn a processes model into a services model.

For example, I can change the processes model shown earlier into a services model that deploys Docker containers:


{ pkgs ? import <nixpkgs> { inherit system; }
, system ? builtins.currentSystem
, stateDir ? "/var"
, runtimeDir ? "${stateDir}/run"
, logDir ? "${stateDir}/log"
, cacheDir ? "${stateDir}/cache"
, tmpDir ? (if stateDir == "/var" then "/tmp" else "${stateDir}/tmp")
, forceDisableUserChange ? false
}:

let
constructors = import ./constructors.nix {
inherit pkgs stateDir runtimeDir logDir tmpDir;
inherit forceDisableUserChange;
processManager = "docker";
};
in
rec {
webapp = rec {
name = "webapp";

port = 5000;
dnsName = "webapp.local";

pkg = constructors.webapp {
inherit port;
};

type = "docker-container";
};

nginxReverseProxy = rec {
name = "nginxReverseProxy";

port = 8080;

pkg = constructors.nginxReverseProxyHostBased {
webapps = [ webapp ];
inherit port;
} {};

type = "docker-container";
};
}

In the above example, I have added a name attribute to each process (a required property for Disnix service models) and a type attribute referring to: docker-container.

In Disnix, a service could take any form. A plugin system (named Dysnomia) is responsible for managing the life-cycle of a service, such as activating or deactivating it. The type attribute is used to tell Disnix that we should use the docker-container Dysnomia module. This module will automatically create and start the container on activation, and stop and discard the container on deactivation.

To deploy the above services to a network of machines, we require an infrastructure model (that captures the available machines and their relevant deployment properties):


{
test1.properties.hostname = "test1";
}

The above infrastructure model contains only one target machine: test1 with a hostname that is identical to the machine name.

We also require a distribution model that maps services in the services model to machines in the infrastructure model:


{infrastructure}:

{
webapp = [ infrastructure.test1 ];
nginxReverseProxy = [ infrastructure.test1 ];
}

In the above distribution model, we map all the processes in the services model to the test1 target machine in the infrastructure model.

With the following command, we can deploy our Docker containers to the remote test1 target machine:


$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

When the above command succeeds, the test1 target machine provides running webapp and nginxReverseProxy containers.

(As a sidenote: to make Docker container deployments work with Disnix, the Docker service already needs to be predeployed to the target machines in the infrastructure model, or the Docker daemon needs to be deployed as a container provider).

Deploying conventional Docker containers with Disnix


The nice thing about the docker-container Dysnomia module is that it is generic enough to also work with conventional Docker containers (that work with images, not a shared Nix store).

For example, we can deploy Nginx as a regular container built with the dockerTools.buildImage function:


{dockerTools, stdenv, nginx}:

let
dockerImage = dockerTools.buildImage {
name = "nginxexp";
tag = "test";
contents = nginx;

runAsRoot = ''
${dockerTools.shadowSetup}
groupadd -r nogroup
useradd -r nobody -g nogroup -d /dev/null
mkdir -p /var/log/nginx /var/cache/nginx /var/www
cp ${./index.html} /var/www/index.html
'';

config = {
Cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
Expose = {
"80/tcp" = {};
};
};
};
in
stdenv.mkDerivation {
name = "nginxexp";
buildCommand = ''
mkdir -p $out
cat > $out/nginxexp-docker-settings <<EOF
dockerImage=${dockerImage}
dockerImageTag=nginxexp:test
EOF

cat > $out/nginxexp-docker-createparams <<EOF
-p
8080:80
EOF
'';
}

In the above example, instead of using the process manager-agnostic createManagedProcess, I directly construct a Docker-based Nginx image (by using the dockerImage attribute) and container configuration files (in the buildCommand parameter) to make the container deployments work with the docker-container Dysnomia module.

It is also possible to deploy containers from images that are constructed with Dockerfiles. After we have built an image in the traditional way, we can export it from Docker with the following command:


$ docker save nginx-debian -o nginx-debian.tar.gz

and then we can use the following Nix expression to deploy a container using our exported image:


{dockerTools, stdenv, nginx}:

stdenv.mkDerivation {
name = "nginxexp";
buildCommand = ''
mkdir -p $out
cat > $out/nginxexp-docker-settings <<EOF
dockerImage=${./nginx-debian.tar.gz}
dockerImageTag=nginxexp:test
EOF

cat > $out/nginxexp-docker-createparams <<EOF
-p
8080:80
EOF
'';
}

In the above expression, the dockerImage property refers to our exported image.

Although Disnix is flexible enough to also orchestrate Docker containers (thanks to its generalized plugin architecture), I did not develop the docker-container Dysnomia module to make Disnix compete with existing container orchestration solutions, such as Kubernetes or Docker Swarm.

Disnix is a heterogeneous deployment tool that can be used to integrate units that have all kinds of shapes and forms on all kinds of operating systems -- having a docker-container module makes it possible to mix Docker containers with other service types that Disnix and Dysnomia support.

Discussion


In this blog post, I have demonstrated that we can integrate Docker as a process management backend option into the experimental Nix process management framework, by substituting some of its conflicting features.

Moreover, because a Disnix service model is a superset of a processes model, we can also use Disnix as a simple Docker container orchestrator and integrate Docker containers with other kinds of services.

Compared to Docker, the Nix process management framework supports a number of features that Docker does not:

  • Docker is heavily developed around Linux-specific concepts, such as namespaces and cgroups. As a result, it can only be used to deploy software built for Linux.

    The Nix process management framework should work on any operating system that is supported by the Nix package manager (e.g. Nix also has first class support for macOS, and can also be used on other UNIX-like operating systems such as FreeBSD). The same also applies to Disnix.
  • The Nix process management framework can work with sysvinit, BSD rc and Disnix process scripts, that do not require any external service to manage a process' life-cycle. This is convenient for local unprivileged user deployments. To deploy Docker containers, you need to have the Docker daemon installed first.
  • Docker has an experimental rootless deployment mode, but in the Nix process management framework facilitating unprivileged user deployments is a first class concept.

On the other hand, the Nix process management framework does not take over all responsibilities of Docker:

  • Docker heavily relies on namespaces to prevent resource conflicts, such as overlapping TCP ports and global state directories. The Nix process management framework solves conflicts by avoiding them (i.e. configuring properties in such a way that they are unique). The conflict avoidance approach works as long as a service is well-specified. Unfortunately, preventing conflicts is not a hard guarantee that the tool can provide you.
  • Docker also provides some degree of protection by using namespaces and cgroups. The Nix process management framework does not support this out of the box, because these concepts are not generalizable over all the process management backends it supports. (As a sidenote: it is still possible to use these concepts by defining process manager-specific overrides).

From a functionality perspective, docker-compose comes close to the features that the experimental Nix process management framework supports. docker-compose allows you to declaratively define container instances and their dependencies, and automatically deploy them.

However, as its name implies docker-compose is specifically designed for deploying Docker containers whereas the Nix process management framework is more general -- it should work with all kinds of process managers, uses Nix as the primary means to provide dependencies, it uses the Nix expression language for configuration and it should work on a variety of operating systems.

The fact that Docker (and containers in general) is a multi-functional solution is not an observation made only by me. For example, this blog post also demonstrates that containers can work without images.

Availability


The Docker backend has been integrated into the latest development version of the Nix process management framework.

To use the docker-container Dysnomia module (so that Disnix can deploy Docker containers), you need to install the latest development version of Dysnomia.

by Sander van der Burg (noreply@blogger.com) at August 11, 2020 07:18 PM

August 07, 2020

Matej Cotman

Neovim, WSL and Nix

How to use Neovim (Neovim-Qt) under WSL 1/2 with the power of Nix

Intro

Well, we all know that development on Linux is generally easier than on Windows, but sometimes you are forced to use Windows. That does not mean that all those nice tools from Linux are unavailable to you, as we will see in this post.

Windows has something called WSL (Windows Subsystem for Linux), which enables you to run Linux tools natively on Windows. Not all is without issues: you cannot run graphical Linux applications, because Windows does not run an Xorg server. Yes, there are Xorg ports that run on Windows, but in this case that is just one more unwanted layer -- remember, building efficient solutions is what every engineer should strive for.

What I did is use the pre-built Windows binaries of Neovim-Qt and run Neovim, installed with Nix, inside WSL.

Ok, you could say: why not use VS Code with some Vim/Neovim plugin and the so-called Remote-WSL plugin to access WSL… Well yes, but I at least stumbled upon a few issues. The first was that CPU usage went through the roof when the Remote-WSL extension was in use on WSL 1 (I could not just run Windows Update on a client's managed computer), and the suggested fix was to install a specific version of libc with dpkg (which is absurd in the first place, because this is a good way to ruin your whole environment). Applying this fix did the trick for lowering the CPU usage. The second issue came right after: when I wanted to install some package with the APT package manager, the libc install, as I predicted, had done its damage -- I could not install or uninstall anything with APT. Nix comes to the rescue again.

By the way, the sleep command forgot how to work under WSL with Ubuntu 20.04 (source).

Let’s see the solution

Neovim-Qt

Neovim-Qt has nicely built Windows binaries on its GitHub releases page, so I just downloaded the zip and unpacked it into C:/Program Files/neovim-qt/. But any location would do.

WSL

Open PowerShell as Administrator and run:

dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart

If, for whatever reason, your Windows is not up to date enough to install WSL 2, then reboot now.

Reboot Time

You should now have WSL 1 enabled, and you can proceed to install Ubuntu 20.04 (or any other Linux distro you like) from the Microsoft Store. Do not forget to click Launch after installing it (it will ask you to create a user).

Nix

To install Nix, you first need to open some terminal emulator and run wsl.exe -- but you can also just run it from the Start menu.

bash <(curl -L https://nixos.org/nix/install)

To finish you can just close the terminal and open wsl.exe again.

That's it.

The Nix script

Now here is the absolutely most awesome part that connects everything together.

{ pkgs ? import <nixpkgs> {} }:
pkgs.writeScript "run-neovim-qt.sh" ''
    #!${pkgs.stdenv.shell}
    set -e

    # get random free port
    export NVIM_LISTEN="127.0.0.1:$(${pkgs.python3Packages.python}/bin/python -c 'import socket; s=socket.socket(); s.bind(("", 0)); print(s.getsockname()[1]); s.close()')"

    # use python's sleep, because coreutils' sleep does not function under Ubuntu 20.04 and WSL
    #   after delay start nvim-qt - so that nvim starts before the GUI
    { ${pkgs.python3Packages.python}/bin/python -c 'import time; time.sleep(1)'; "''${NVIM_QT_PATH}" --server "$NVIM_LISTEN"; } &

    # start nvim
    ${pkgs.neovim}/bin/nvim --listen "$NVIM_LISTEN" --headless "$@"
''

Save it to your drive or download with wget under WSL:

wget https://raw.githubusercontent.com/matejc/helper_scripts/master/nixes/neovim-qt.nix 

Then build the command with:

nix-build ./neovim-qt.nix

The resulting script is ./result.

Usage

First we need to tell the script where Neovim-Qt is located:

export NVIM_QT_PATH='/mnt/c/Program Files/neovim-qt/bin/nvim-qt.exe'

You can save this into .bashrc or .profile and restart the terminal, so that you do not need to repeat this step every time you open a WSL shell.

The final step is:

./result my/awesome/code.py

Conclusion

Too much work, you think? Well, how much more time would you spend installing and configuring VS Code or Atom to work in a similar environment? And what about Nix? You can install it without the use of the native package manager (in case the native one is b0rked), and once you do, you have the power to install your favorite development environment with a single command.

I like this solution; in my eyes it is simple and efficient. What are your thoughts?

Until next time… I wish you happy hacking!

High cpu usage of node process in Remote-WSL extension #2921
Neovim-Qt Releases
WSL on Windows 10
Quick start with Nix

August 07, 2020 10:00 PM

July 31, 2020

Tweag I/O

Nix Flakes, Part 3: Managing NixOS systems

This is the third in a series of blog posts about Nix flakes. The first part motivated why we developed flakes — to improve Nix’s reproducibility, composability and usability — and gave a short tutorial on how to use flakes. The second part showed how flakes enable reliable caching of Nix evaluation results. In this post, we show how flakes can be used to manage NixOS systems in a reproducible and composable way.

What problems are we trying to solve?

Lack of reproducibility

One of the main selling points of NixOS is reproducibility: given a specification of a system, if you run nixos-rebuild to deploy it, you should always get the same actual system (modulo mutable state such as the contents of databases). For instance, we should be able to reproduce in a production environment the exact same configuration that we’ve previously validated in a test environment.

However, the default NixOS workflow doesn’t provide reproducible system configurations out of the box. Consider a typical sequence of commands to upgrade a NixOS system:

  • You edit /etc/nixos/configuration.nix.
  • You run nix-channel --update to get the latest version of the nixpkgs repository (which contains the NixOS sources).
  • You run nixos-rebuild switch, which evaluates and builds a function in the nixpkgs repository that takes /etc/nixos/configuration.nix as an input.

In this workflow, /etc/nixos/configuration.nix might not be under configuration management (e.g. point to a Git repository), or if it is, it might be a dirty working tree. Furthermore, configuration.nix doesn’t specify what Git revision of nixpkgs to use; so if somebody else deploys the same configuration.nix, they might get a very different result.

Lack of traceability

The ability to reproduce a configuration is not very useful if you can’t tell what configuration you’re actually running. That is, from a running system, you should be able to get back to its specification. So there is a lack of traceability: the ability to trace derived artifacts back to their sources. This is an essential property of good configuration management, since without it, we don’t know what we’re actually running in production, so reproducing or fixing problems becomes much harder.

NixOS currently doesn’t not have very good traceability. You can ask a NixOS system what version of Nixpkgs it was built from:

$ nixos-version --json | jq -r .nixpkgsRevision
a84b797b28eb104db758b5cb2b61ba8face6744b

Unfortunately, this doesn’t allow you to recover configuration.nix or any other external NixOS modules that were used by the configuration.

Lack of composability

It’s easy to enable a package or system service in a NixOS configuration if it is part of the nixpkgs repository: you just add a line like environment.systemPackages = [ pkgs.hello ]; or services.postgresql.enable = true; to your configuration.nix. But what if we want to use a package or service that isn’t part of Nixpkgs? Then we’re forced to use mechanisms like $NIX_PATH, builtins.fetchGit, imports using relative paths, and so on. These are not standardized (since everybody uses different conventions) and are inconvenient to use (for example, when using $NIX_PATH, it’s the user’s responsibility to put external repositories in the right directories).

Put another way: NixOS is currently built around a monorepo workflow — the entire universe should be added to the nixpkgs repository, because anything that isn’t, is much harder to use.

It’s worth noting that any NixOS system configuration already violates the monorepo assumption: your system’s configuration.nix is not part of the nixpkgs repository.

Using flakes for NixOS configurations

In the previous post, we saw that flakes are (typically) Git repositories that have a file named flake.nix, providing a standardized interface to Nix artifacts. We saw flakes that provide packages and development environments; now we’ll use them to provide NixOS system configurations. This solves the problems described above:

  • Reproducibility: the entire system configuration (including everything it depends on) is captured by the flake and its lock file. So if two people check out the same Git revision of a flake and build it, they should get the same result.
  • Traceability: nixos-version prints the Git revision of the top-level configuration flake, not its nixpkgs input.
  • Composability: it’s easy to pull in packages and modules from other repositories as flake inputs.

Prerequisites

Flake support has been added as an experimental feature to NixOS 20.03. However, flake support is not part of the current stable release of Nix (2.3). So to get a NixOS system that supports flakes, you first need to switch to the nixUnstable package and enable some experimental features. This can be done by adding the following to configuration.nix:

nix.package = pkgs.nixUnstable;
nix.extraOptions = ''
  experimental-features = nix-command flakes
'';

Creating a NixOS configuration flake

Let’s create a flake that contains the configuration for a NixOS container.

$ git init my-flake
$ cd my-flake
$ nix flake init -t templates#simpleContainer
$ git commit -a -m 'Initial version'

Note that the -t flag to nix flake init specifies a template from which to copy the initial contents of the flake. This is useful for getting started. To see what templates are available, you can run:

$ nix flake show templates

For reference, this is what the initial flake.nix looks like:

{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-20.03";

  outputs = { self, nixpkgs }: {

    nixosConfigurations.container = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules =
        [ ({ pkgs, ... }: {
            boot.isContainer = true;

            # Let 'nixos-version --json' know about the Git revision
            # of this flake.
            system.configurationRevision = nixpkgs.lib.mkIf (self ? rev) self.rev;

            # Network configuration.
            networking.useDHCP = false;
            networking.firewall.allowedTCPPorts = [ 80 ];

            # Enable a web server.
            services.httpd = {
              enable = true;
              adminAddr = "morty@example.org";
            };
          })
        ];
    };

  };
}

That is, the flake has one input, namely nixpkgs - specifically the 20.03 branch. It has one output, nixosConfigurations.container, which evaluates a NixOS configuration for tools like nixos-rebuild and nixos-container. The main argument is modules, which is a list of NixOS configuration modules. This takes the place of the file configuration.nix in non-flake deployments. (In fact, you can write modules = [ ./configuration.nix ] if you’re converting a pre-flake NixOS configuration.)

Let’s create and start the container! (Note that nixos-container currently requires you to be root.)

# nixos-container create flake-test --flake /path/to/my-flake
host IP is 10.233.4.1, container IP is 10.233.4.2

# nixos-container start flake-test

To check whether the container works, let’s try to connect to it:

$ curl http://flake-test/
<html><body><h1>It works!</h1></body></html>

As an aside, if you just want to build the container without the nixos-container command, you can do so as follows:

$ nix build /path/to/my-flake#nixosConfigurations.container.config.system.build.toplevel

Note that system.build.toplevel is an internal NixOS option that evaluates to the “system” derivation that commands like nixos-rebuild, nixos-install and nixos-container build and activate. The symlink /run/current-system points to the output of this derivation.

Hermetic evaluation

One big difference between “regular” NixOS systems and flake-based NixOS systems is that the latter record the Git revisions from which they were built. We can query this as follows:

# nixos-container run flake-test -- nixos-version --json
{"configurationRevision":"9190c396f4dcfc734e554768c53a81d1c231c6a7"
,"nixosVersion":"20.03.20200622.13c15f2"
,"nixpkgsRevision":"13c15f26d44cf7f54197891a6f0c78ce8149b037"}

Here, configurationRevision is the Git revision of the repository /path/to/my-flake. Because evaluation is hermetic, and the lock file locks all flake inputs such as nixpkgs, knowing the revision 9190c39… allows you to completely reconstruct this configuration at a later point in time. For example, if you want to deploy this particular configuration to a container, you can do:

# nixos-container update flake-test \
    --flake /path/to/my-flake?rev=9190c396f4dcfc734e554768c53a81d1c231c6a7

Dirty configurations

It’s not required that you commit all changes to a configuration before deploying it. For example, if you change the adminAddr line in flake.nix to

adminAddr = "rick@example.org";

and redeploy the container, you will get:

# nixos-container update flake-test
warning: Git tree '/path/to/my-flake' is dirty
...
reloading container...

and the container will no longer have a configuration Git revision:

# nixos-container run flake-test -- nixos-version --json | jq .configurationRevision
null

While this may be convenient for testing, in production we really want to ensure that systems are deployed from clean Git trees. One way is to disallow dirty trees on the command line:

# nixos-container update flake-test --no-allow-dirty
error: --- Error -------------------- nix
Git tree '/path/to/my-flake' is dirty

Another is to require a clean Git tree in flake.nix, for instance by adding a check to the definition of system.configurationRevision:

system.configurationRevision =
  if self ? rev
  then self.rev
  else throw "Refusing to build from a dirty Git tree!";

Adding modules from third-party flakes

One of the main goals of flake-based NixOS is to make it easier to use packages and modules that are not included in the nixpkgs repository. As an example, we’ll add Hydra (a continuous integration server) to our container.

Here’s how we add it to our container. We specify it as an additional input:

  inputs.hydra.url = "github:NixOS/hydra";

and as a corresponding function argument to the outputs function:

  outputs = { self, nixpkgs, hydra }: {

Finally, we enable the NixOS module provided by the hydra flake:

      modules =
        [ hydra.nixosModules.hydraTest

          ({ pkgs, ... }: {
            ... our own configuration ...

            # Hydra runs on port 3000 by default, so open it in the firewall.
            networking.firewall.allowedTCPPorts = [ 3000 ];
          })
        ];

Note that we can discover the name of this module by using nix flake show:

$ nix flake show github:NixOS/hydra
github:NixOS/hydra/d0deebc4fc95dbeb0249f7b774b03d366596fbed
├───…
├───nixosModules
│   ├───hydra: NixOS module
│   ├───hydraProxy: NixOS module
│   └───hydraTest: NixOS module
└───overlay: Nixpkgs overlay

After committing this change and running nixos-container update, we can check whether hydra is working in the container by visiting http://flake-test:3000/ in a web browser.
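For reference, after these changes the complete flake.nix looks roughly like this (a sketch assembled from the fragments above):

{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-20.03";
  inputs.hydra.url = "github:NixOS/hydra";

  outputs = { self, nixpkgs, hydra }: {

    nixosConfigurations.container = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules =
        [ hydra.nixosModules.hydraTest

          ({ pkgs, ... }: {
            boot.isContainer = true;
            system.configurationRevision = nixpkgs.lib.mkIf (self ? rev) self.rev;

            networking.useDHCP = false;

            # Open the web server and Hydra ports in the firewall.
            networking.firewall.allowedTCPPorts = [ 80 3000 ];

            services.httpd = {
              enable = true;
              adminAddr = "morty@example.org";
            };
          })
        ];
    };

  };
}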

Working with lock files

There are a few command line flags accepted by nix, nixos-rebuild and nixos-container that make updating the lock file more convenient. A very common action is to update a flake input to the latest version; for example,

$ nixos-container update flake-test --update-input nixpkgs --commit-lock-file

updates the nixpkgs input to the latest revision on the nixos-20.03 branch, and commits the new lock file with a commit message that records the input change.

A useful flag during development is --override-input, which allows you to point a flake input to another location, completely overriding the input location specified by flake.nix. For example, this is how you can build the container against a local Git checkout of Hydra:

$ nixos-container update flake-test --override-input hydra /path/to/my/hydra

Adding overlays from third-party flakes

Similarly, we can add Nixpkgs overlays from other flakes. (Nixpkgs overlays add or override packages in the pkgs set.) For example, here is how you add the overlay provided by the nix flake:

  outputs = { self, nixpkgs, nix }: {
    nixosConfigurations.container = nixpkgs.lib.nixosSystem {
      ...
      modules =
        [
          ({ pkgs, ... }: {
            nixpkgs.overlays = [ nix.overlay ];
            ...
          })
        ];
    };
  };
}

Using nixos-rebuild

Above we saw how to manage NixOS containers using flakes. Managing “real” NixOS systems works much the same, except using nixos-rebuild instead of nixos-container. For example,

# nixos-rebuild switch --flake /path/to/my-flake#my-machine

builds and activates the configuration specified by the flake output nixosConfigurations.my-machine. If you omit the name of the configuration (#my-machine), nixos-rebuild defaults to using the current host name.

Pinning Nixpkgs

It’s often convenient to pin the nixpkgs flake to the exact version of nixpkgs used to build the system. This ensures that commands like nix shell nixpkgs#<package> work more efficiently since many or all of the dependencies of <package> will already be present. Here is a bit of NixOS configuration that pins nixpkgs in the system-wide flake registry:

nix.registry.nixpkgs.flake = nixpkgs;

Note that this only affects commands that reference nixpkgs without further qualifiers; more specific flake references like nixpkgs/nixos-20.03 or nixpkgs/348503b6345947082ff8be933dda7ebeddbb2762 are unaffected.
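In a flake-based configuration like the one shown earlier, the nixpkgs flake input is already in scope inside the outputs function, so the setting can simply be added to one of the modules (a sketch):

  outputs = { self, nixpkgs, ... }: {
    nixosConfigurations.my-machine = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules =
        [ ({ pkgs, ... }: {
            # Pin the system-wide 'nixpkgs' registry entry to this flake input.
            nix.registry.nixpkgs.flake = nixpkgs;
            # ... the rest of the configuration ...
          })
        ];
    };
  };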

Conclusion

In this blog post we saw how Nix flakes make NixOS configurations hermetic and reproducible. In a future post, we’ll show how we can do the same for cloud deployments using NixOps.

Acknowledgment: The development of flakes was partially funded by Target Corporation.

Nix Flakes Series

July 31, 2020 12:00 AM

July 29, 2020

Sander van der Burg

On using Nix and Docker as deployment automation solutions: similarities and differences

As frequent readers of my blog may probably already know, I have been using Nix-related tools for quite some time to solve many of my deployment automation problems.

Although I have worked in environments in which Nix and its related sub projects are well-known, when I show some of Nix's use cases to larger groups of DevOps-minded people, a frequent answer that I have been hearing is that it looks very similar to Docker. People also often ask me what advantages Nix has over Docker.

So far, I have not even covered Docker once on my blog, despite its popularity, including very popular sister projects such as docker-compose and Kubernetes.

The main reason why I never wrote anything about Docker is not because I do not know about it or how to use it, but simply because I never had any notable use cases that would lead to something publishable -- most of the problems for which Docker could be a solution, I solved by other means, typically by using a Nix-based solution somewhere in the solution stack.

Docker is a container-based deployment solution, that was not the first (neither in the Linux world, nor in the UNIX-world in general), but since its introduction in 2013 it has grown very rapidly in popularity. I believe its popularity can be mainly attributed to its ease of usage and its extendable images ecosystem: Docker Hub.

In fact, Docker (and Kubernetes, a container orchestration solution that incorporates Docker) have become so popular, that they have set a new standard when it comes to organizing systems and automating deployment -- today, in many environments, I have the feeling that it is no longer the question what kind of deployment solution is best for a particular system and organization, but rather: "how do we get it into containers?".

The same thing applies to the "microservices paradigm" that should facilitate modular systems. If I compare the characteristics of microservices with the definition a "software component" by Clemens Szyperski's Component Software book, then I would argue that they have more in common than they are different.

One of the reasons why I think microservices are considered to be a success (or at least considered moderately more successful by some over older concepts, such as web services and software components) is because they easily map to a container, that can be conveniently managed with Docker. For some people, a microservice and a Docker container are pretty much the same things.

Modular software systems have all kinds of advantages, but their biggest disadvantage is that the deployment of a system becomes more complicated as the number of components and dependencies grows. With Docker containers this problem can be (somewhat) addressed in a convenient way.

In this blog post, I will provide my view on Nix and Docker -- I will elaborate about some of their key concepts, explain in what ways they are different and similar, and I will show some use-cases in which both solutions can be combined to achieve interesting results.

Application domains


Nix and Docker are both deployment solutions for slightly different, but also somewhat overlapping, application domains.

The Nix package manager (on the recently revised homepage) advertises itself as follows:

Nix is a powerful package manager for Linux and other Unix systems that makes package management reliable and reproducible. Share your development and build environments across different machines.

whereas Docker advertises itself as follows (in the getting started guide):

Docker is an open platform for developing, shipping, and running applications.

To summarize my interpretations of the descriptions:

  • Nix's chief responsibility is, as its description implies, package management: it provides a collection of software tools that automate the process of installing, upgrading, configuring, and removing computer programs for a computer's operating system in a consistent manner.

    There are two properties that set Nix apart from most other package management solutions. First, Nix is also a source-based package manager -- it can be used as a tool to construct packages from source code and their dependencies, by invoking build scripts in "pure build environments".

    Moreover, it borrows concepts from purely functional programming languages to make deployments reproducible, reliable and efficient.
  • Docker's chief responsibility is much broader than package management -- Docker facilitates full process/service life-cycle management. Package management can be considered to be a sub problem of this domain, as I will explain later in this blog post.

Although both solutions map to slightly different domains, there is one prominent objective that both solutions have in common. They both facilitate reproducible deployment.

With Nix the goal is that if you build a package from source code and a set of dependencies and perform the same build with the same inputs on a different machine, their build results should be (nearly) bit-identical.

With Docker, the objective is to facilitate reproducible environments for running applications -- when running an application container on one machine that provides Docker, and running the same application container on another machine, they both should work in an identical way.

Although both solutions facilitate reproducible deployments, their reproducibility properties are based on different kinds of concepts. I will explain more about them in the next sections.

Nix concepts


As explained earlier, Nix is a source-based package manager that borrows concepts from purely functional programming languages. Packages are built from build recipes called Nix expressions, such as:


with import <nixpkgs> {};

stdenv.mkDerivation {
name = "file-5.38";

src = fetchurl {
url = "ftp://ftp.astron.com/pub/file/file-5.38.tar.gz";
sha256 = "0d7s376b4xqymnrsjxi3nsv3f5v89pzfspzml2pcajdk5by2yg2r";
};

buildInputs = [ zlib ];

meta = {
homepage = https://darwinsys.com/file;
description = "A program that shows the type of files";
};
}

The above Nix expression invokes the function: stdenv.mkDerivation that creates a build environment in which we build the package: file from source code:

  • The name parameter provides the package name.
  • The src parameter invokes the fetchurl function that specifies where to download the source tarball from.
  • buildInputs refers to the build-time dependencies that the package needs. The file package only uses one dependency: zlib that provides deflate compression support.

    The buildInputs parameter is used to automatically configure the build environment in such a way that zlib can be found as a library dependency by the build script.
  • The meta parameter specifies the package's meta data. Meta data is used by Nix to provide information about the package, but it is not used by the build script.

The Nix expression does not specify any build instructions -- if no build instructions were provided, the stdenv.mkDerivation function will execute the standard GNU Autotools build procedure: ./configure; make; make install.
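As a side note, if a package does not follow the GNU Autotools convention, the relevant phases can be overridden. A minimal sketch for a hypothetical package consisting of a single C source file could look as follows:

with import <nixpkgs> {};

stdenv.mkDerivation {
  name = "example-1.0";
  src = ./.;

  # Replace the default Autotools-based build and install steps.
  buildPhase = ''
    $CC -o example example.c
  '';

  installPhase = ''
    mkdir -p $out/bin
    cp example $out/bin
  '';
}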

Nix combines several concepts to make builds more reliable and reproducible.

Foremost, packages managed by Nix are stored in a so-called Nix store (/nix/store) in which every package build resides in its own directory.

When we build the above Nix expression with the following command:


$ nix-build file.nix

then we may get the following Nix store path as output:


/nix/store/6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g-file-5.38

Each entry in the Nix store has a SHA256 hash prefix (e.g. 6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g) that is derived from all build inputs used to build a package.

If we would build file, for example, with a different build script or different version of zlib then the resulting Nix store prefix will be different. As a result, we can safely store multiple versions and variants of the same package next to each other, because they will never share the same name.

Because each package resides in its own directory in the Nix store, rather than global directories that are commonly used on conventional Linux systems, such as /bin and /lib, we get stricter purity guarantees -- dependencies can typically not be found if they have not been specified in any of the search environment variables (e.g. PATH) or provided as build parameters.

In conventional Linux systems, package builds might still accidentally succeed if they unknowingly use an undeclared dependency. When deploying such a package to another system that does not have this undeclared dependency installed, the package might not work properly or not at all.

In simple single-user Nix installations, builds typically get executed in an environment in which most environment variables (including search path environment variables, such as PATH) are cleared or set to dummy values.

Build abstraction functions (such as stdenv.mkDerivation) will populate the search path environment variables (e.g. PATH, CLASSPATH, PYTHONPATH etc.) and configure build parameters to ensure that the dependencies in the Nix store can be found.

Builds are only allowed to write in the build directory or designated output folders in the Nix store.

When a build has completed successfully, its results are made immutable (by removing the write permission bits in the Nix store) and their timestamps are reset to 1 second after the epoch (to improve build determinism).

Storing packages in isolation and providing an environment with cleared environment variables is obviously not a guarantee that builds will be pure. For example, build scripts may still have hard-coded absolute paths to executables on the host system, such as /bin/install and a C compiler may still implicitly search for headers in /usr/include.

To alleviate the problem with hard-coded global directory references, some common build utilities, such as GCC, deployed by Nix have been patched to ignore global directories, such as /usr/include.

When using Nix in multi-user mode, extra precautions have been taken to ensure build purity:

  • Each build runs as an unprivileged user that does not have write access to any directory other than its own build directory and the designated output Nix store paths.
  • On Linux, a build can optionally run in a chroot environment, which completely disables access to global directories during the build process. In addition, the Nix store paths of all declared dependencies are bind mounted, preventing the build process from accessing undeclared dependencies in the Nix store (the chances that you encounter such a build are slim, but still...)
  • On Linux kernels that support namespaces, the Nix build environment will use them to improve build purity.

    The network namespace helps the Nix builder to prevent a build process from accessing the network -- when a build process downloads an undeclared dependency from a remote location, we cannot be sure that we get a predictable result.

    In Nix, only builds that are so-called fixed output derivations (whose output hashes need to be known in advance) are allowed to download files from remote locations, because their output results can be verified.

    (As a sidenote: namespaces are also intensively used by Docker containers, as I will explain in the next section.)
  • On macOS, builds can optionally be executed in an app sandbox, which can also be used to restrict access to various kinds of shared resources, such as network access.

Besides isolation, using hash code prefixes has another advantage: because every build with the same hash code is (nearly) bit-identical, it also enables a nice optimization.

When we evaluate a Nix expression and the resulting hash code corresponds to a Nix store path that is already valid, then we do not have to build the package again -- because it is bit-identical, we can simply return the Nix store path of the package that is already in the Nix store.

This property is also used by Nix to facilitate transparent binary package deployments. If we want to build a package with a certain hash prefix, and we know that another machine or binary cache already has this package in its Nix store, then we can download a binary substitute.

Another interesting benefit of using hash codes is that we can also identify the runtime dependencies that a package needs -- if a Nix store path contains references to other Nix store paths, then we know that these are runtime dependencies of the corresponding package.

Scanning for Nix store paths may sound scary, but there is only a very slim chance that a hash code string represents something else. In practice, it works really well.

For example, the following shell command shows all the runtime dependencies of the file package:


$ nix-store -qR /nix/store/6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g-file-5.38
/nix/store/y8n2b9nwjrgfx3kvi3vywvfib2cw5xa6-libunistring-0.9.10
/nix/store/fhg84pzckx2igmcsvg92x1wpvl1dmybf-libidn2-2.3.0
/nix/store/bqbg6hb2jsl3kvf6jgmgfdqy06fpjrrn-glibc-2.30
/nix/store/5x6l9xm5dp6v113dpfv673qvhwjyb7p5-zlib-1.2.11
/nix/store/6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g-file-5.38

If we query the dependencies of another package that is built from the same Nix packages set, such as cpio:


$ nix-store -qR /nix/store/bzm0mszhvbr6hp4gmar4czsn52hz07q1-cpio-2.13
/nix/store/y8n2b9nwjrgfx3kvi3vywvfib2cw5xa6-libunistring-0.9.10
/nix/store/fhg84pzckx2igmcsvg92x1wpvl1dmybf-libidn2-2.3.0
/nix/store/bqbg6hb2jsl3kvf6jgmgfdqy06fpjrrn-glibc-2.30
/nix/store/bzm0mszhvbr6hp4gmar4czsn52hz07q1-cpio-2.13

When looking at the outputs above, you may notice that both file and cpio share the same kinds of dependencies (e.g. libidn2, libunistring and glibc), with the same hash code prefixes. Because these are the same Nix store paths, they are shared on disk (and in RAM, because the operating system caches the same files in memory), leading to more efficient disk and RAM usage.

The fact that we can detect references to Nix store paths is because packages in the Nix package repository use an unorthodox form of static linking.

For example, ELF executables built with Nix have the store paths of their library dependencies in their RPATH header values (the ld command in Nixpkgs has been wrapped to transparently add these library paths to a binary's RPATH).
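
For illustration (a hedged example: the output shown below is made up, and the exact library closure will differ per system), the RPATH of a Nix-built executable can be inspected with a tool such as patchelf:


$ patchelf --print-rpath /nix/store/6rcg0zgqyn2v1ypd46hlvngaf5lgqk9g-file-5.38/bin/file
/nix/store/5x6l9xm5dp6v113dpfv673qvhwjyb7p5-zlib-1.2.11/lib:/nix/store/bqbg6hb2jsl3kvf6jgmgfdqy06fpjrrn-glibc-2.30/lib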

Python programs (and other programs written in interpreted languages) typically use wrapper scripts that set the PYTHONPATH (or equivalent) environment variables to contain Nix store paths providing the dependencies.
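
Such a wrapper might look roughly like the following sketch (all store paths and names below are placeholders rather than output of a real package):


#!/nix/store/<hash>-bash-4.4-p23/bin/bash -e
# Make the Python dependencies in the Nix store findable, then start the real program
export PYTHONPATH='/nix/store/<hash>-python3.8-requests-2.23.0/lib/python3.8/site-packages'
exec "/nix/store/<hash>-myapp-1.0/bin/.myapp-wrapped" "$@"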

Docker concepts


The Docker overview page states the following about what Docker can do:

When you use Docker, you are creating and using images, containers, networks, volumes, plugins, and other objects.

Although you can create many kinds of objects with Docker, the two most important objects are the following:

  • Images. The overview page states: "An image is a read-only template with instructions for creating a Docker container.".

    More accurately, this means that images are created from build recipes called Dockerfiles. They produce self-contained root file systems containing all necessary files to run a program, such as binaries, libraries, configuration files etc. The resulting image itself is immutable (read-only) and cannot change after it has been built.
  • Containers. The overview gives the following description: "A container is a runnable instance of an image".

    More specifically, this means that the life-cycle (whether a container is in a started or stopped state) is bound to the life-cycle of a root process, that runs in a (somewhat) isolated environment using the content of a Docker image as its root file system.

Besides the object types explained above, there are many more kinds of objects, such as volumes (that can mount a directory from the host file system to a path in the container) and port forwardings from the host system to a container. For more information about these remaining objects, consult the Docker documentation.

Docker combines several concepts to facilitate reproducible and reliable container deployment. To be able to isolate containers from each other, it uses several kinds of Linux namespaces (a small hands-on illustration follows after this list):

  • The mount namespace: this is, in my opinion, the most important namespace. After setting up a private mount namespace, every subsequent mount that we make will be visible in the container, but not to other containers/processes that are in a different mount namespace.

    A private mount namespace is used to mount a new root file system (the contents of the Docker image) with all essential system software and other artifacts to run an application, that is different from the host system's root file system.
  • The Process ID (PID) namespace facilitates process isolation. A process/container with a private PID namespace will not be able to see or control the host system's processes (the host system, however, can still see the container's processes).
  • The network namespace separates network interfaces from the host system. In a private network namespace, a container has one or more private network interfaces with their own IP addresses, port assignments and firewall settings.

    As a result, a service such as the Apache HTTP server in a Docker container can bind to port 80 without conflicting with another HTTP server that binds to the same port on the host system or in another container instance.
  • The Inter-Process Communication (IPC) namespace isolates the ability of processes to communicate with each other via shared memory (the SHM family of functions).
  • The UTS namespace isolates system identifiers, such as the hostname and the NIS domain name.
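
To make the namespace concept a bit more tangible, here is a small hands-on experiment (a hedged illustration that uses the generic unshare tool rather than Docker; the hostnames are placeholders) showing that changing the hostname inside a private UTS namespace does not affect the host system:


$ sudo unshare --uts /bin/sh -c 'hostname container-test; hostname'
container-test
$ hostname
my-host-machine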

Another important concept that containers use is cgroups, which can be used to limit the amount of system resources that containers can consume, such as the amount of RAM.

Finally, to optimize/reduce storage overhead, Docker uses layers and a union filesystem (there are a variety of file system options for this) to combine these layers by "stacking" them on top of each other.

A running container basically mounts an image's read-only layers on top of each other, and keeps the final layer writable so that processes in the container can create and modify files on the system.

Whenever you construct an image from a Dockerfile, each modification operation generates a new layer. Each layer is immutable (it will never change after it has been created) and is uniquely identifiable with a hash code, similar to Nix store paths.

For example, we can build an image with the following Dockerfile that deploys and runs the Apache HTTP server on a Debian Buster Linux distribution:


FROM debian:buster

RUN apt-get update
RUN apt-get install -y apache2
ADD index.html /var/www/html
CMD ["apachectl", "-D", "FOREGROUND"]
EXPOSE 80/tcp

The above Dockerfile executes the following steps:

  • It takes the debian:buster image from Docker Hub as a base image.
  • It updates the Debian package database (apt-get update) and installs the Apache HTTPD server package from the Debian package repository.
  • It uploads an example page (index.html) to the document root folder.
  • It executes the: apachectl -D FOREGROUND command-line instruction to start the Apache HTTP server in foreground mode. The container's life-cycle is bound to the life-cycle of this foreground process.
  • It informs Docker that the container listens to TCP port: 80. Connecting to port 80 makes it possible for a user to retrieve the example index.html page.

With the following command we can build the image:


$ docker build . -t debian-apache

Resulting in the following layers:


$ docker history debian-apache:latest
IMAGE CREATED CREATED BY SIZE COMMENT
a72c04bd48d6 About an hour ago /bin/sh -c #(nop) EXPOSE 80/tcp 0B
325875da0f6d About an hour ago /bin/sh -c #(nop) CMD ["apachectl" "-D" "FO… 0B
35d9a1dca334 About an hour ago /bin/sh -c #(nop) ADD file:18aed37573327bee1… 129B
59ee7771f1bc About an hour ago /bin/sh -c apt-get install -y apache2 112MB
c355fe9a587f 2 hours ago /bin/sh -c apt-get update 17.4MB
ae8514941ea4 33 hours ago /bin/sh -c #(nop) CMD ["bash"] 0B
<missing> 33 hours ago /bin/sh -c #(nop) ADD file:89dfd7d3ed77fd5e0… 114MB

As may be observed, the base Debian Buster image and every change made in the Dockerfile results in a new layer with a new hash code, as shown in the IMAGE column.

Layers and Nix store paths share the similarity that they are immutable and they can both be identified with hash codes.

They are also different -- first, a Nix store path is the result of building a package or a static artifact, whereas a layer is the result of making a filesystem modification. Second, for a Nix store path, the hash code is derived from all inputs, whereas the hash code of a layer is derived from the output: its contents.

Furthermore, Nix store paths are always isolated because they reside in a unique directory (enforced by the hash prefixes), whereas a layer might have files that overlap with files in other layers. In Docker, when a conflict is encountered the files in the layer that gets added on top of it take precedence.

We can construct a second image using the same Debian Linux distribution image that runs Nginx with the following Dockerfile:


FROM debian:buster

RUN apt-get update
RUN apt-get install -y nginx
ADD nginx.conf /etc
ADD index.html /var/www
CMD ["nginx", "-g", "daemon off;", "-c", "/etc/nginx.conf"]
EXPOSE 80/tcp

The above Dockerfile looks similar to the previous, except that we install the Nginx package from the Debian package repository and we use a different command-line instruction to start Nginx in foreground mode.

When building the image, its storage will be optimized -- both images share the same base layer (the Debian Buster Linux base distribution):


$ docker history debian-nginx:latest
IMAGE CREATED CREATED BY SIZE COMMENT
b7ae6f38ae77 2 hours ago /bin/sh -c #(nop) EXPOSE 80/tcp 0B
17027888ce23 2 hours ago /bin/sh -c #(nop) CMD ["nginx" "-g" "daemon… 0B
41a50a3fa73c 2 hours ago /bin/sh -c #(nop) ADD file:18aed37573327bee1… 129B
0f5b2fdcb207 2 hours ago /bin/sh -c #(nop) ADD file:f18afd18cfe2728b3… 189B
e49bbb46138b 2 hours ago /bin/sh -c apt-get install -y nginx 64.2MB
c355fe9a587f 2 hours ago /bin/sh -c apt-get update 17.4MB
ae8514941ea4 33 hours ago /bin/sh -c #(nop) CMD ["bash"] 0B
<missing> 33 hours ago /bin/sh -c #(nop) ADD file:89dfd7d3ed77fd5e0… 114MB

If you compare the above output with the previous docker history output, then you will notice that the bottom layer (last row) refers to the same layer using the same hash code behind the ADD file: statement in the CREATED BY column.

This ability to share the base distribution means that we do not have to store another 114MB Debian Buster image, saving us storage and RAM.

Some common misconceptions


What I have noticed is that quite a few people compare containers to virtual machines (and even give containers that name, incorrectly suggesting that they are the same thing!).

A container is not a virtual machine, because it does not emulate or virtualize hardware -- virtual machines have a virtual CPU, virtual memory, virtual disk etc. that have similar capabilities and limitations as real hardware.

Furthermore, containers do not run a full operating system -- they run processes managed by the host system's Linux kernel. As a result, Docker containers will only deploy software that runs on Linux, and not software that was built for other operating systems.

(As a sidenote: Docker can also be used on Windows and macOS -- on these non-Linux platforms, a virtualized Linux system is used for hosting the containers, but the containers themselves are not separated by using virtualization).

Containers cannot even be considered "light weight virtual machines".

The means to isolate containers from each other only apply to a limited number of potentially shared resources. For example, a resource that cannot be unshared is the system's clock, although this may change in the near future, because in March 2020 a time namespace was added to the newest Linux kernel version. I believe this namespace is not yet offered as a generally available feature in Docker.

Moreover, namespaces, which normally provide separation/isolation between containers, are objects, and these objects can also be shared among multiple container instances (this is an uncommon use case, because by default every container has its own private namespaces).

For example, it is also possible for two containers to share the same IPC namespace -- then processes in both containers will be able to communicate with each other with a shared-memory IPC mechanism, but they cannot do any IPC with processes on the host system or containers not sharing the same namespace.

Finally, certain system resources are not constrained by default unlike a virtual machine -- for example, a container is allowed to consume all the RAM of the host machine unless a RAM restriction has been configured. An unrestricted container could potentially affect the machine's stability as a whole and other containers running on the same machine.

A comparison of use cases


As mentioned in the introduction, when I show people Nix, then I often get a remark that it looks very similar to Docker.

In this section, I will compare some of their common use cases.

Managing services


In addition to building a Docker image, I believe the most common use case for Docker is to manage services, such as custom REST API services (that are self-contained processes with an embedded web server), web servers or database management systems.

For example, after building an Nginx Docker image (as shown in the section about Docker concepts), we can also launch a container instance using the previously constructed image to serve our example HTML page:


$ docker run -p 8080:80 --name nginx-container -it debian-nginx

The above command creates a new container instance using our Nginx image as a root file system and then starts the container in interactive mode -- the command's execution will block and display the output of the Nginx process on the terminal.

If we pass the -d (detached) parameter instead of -it, then the container will run in the background.

The -p parameter configures a port forwarding from the host system to the container: traffic to the host system's port 8080 gets forwarded to port 80 in the container, where the Nginx server listens.
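
Putting these options together, a hedged variant of the command above runs the same container detached in the background while still forwarding port 8080:


$ docker run -d -p 8080:80 --name nginx-container debian-nginx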

We should be able to see the example HTML page, by opening the following URL in a web browser:


http://localhost:8080

After stopping the container, its state will be retained. We can remove the container permanently, by running:


$ docker rm nginx-container

The Nix package manager has no equivalent use case for managing running processes, because its purpose is package management and not process/service life-cycle management.

However, some projects based on Nix do address this concern. Most notably, NixOS -- a Linux distribution built around the Nix package manager that uses a single declarative configuration file to capture a machine's configuration -- generates systemd unit files to manage services' life-cycles.

The Nix package manager can also be used on other operating systems, such as conventional Linux distributions, macOS and other UNIX-like systems. There is no universal solution that allows you to complement Nix with service management support on all platforms that Nix supports.

Experimenting with packages


Another common use case is using Docker to experiment with packages that should not remain permanently installed on a system.

One way of doing this is by directly pulling a Linux distribution image (such as Debian Buster):


$ docker pull debian:buster

and then starting a container in an interactive shell session, in which we install the packages that we want to experiment with:


$ docker run --name myexperiment -it debian:buster /bin/sh
$ apt-get update
$ apt-get install -y file
# file --version
file-5.22
magic file from /etc/magic:/usr/share/misc/magic

The above example suffices to experiment with the file package, but its deployment is not guaranteed to be reproducible.

For example, the result of running my apt-get instructions shown above is file version 5.22. If I would, for example, run the same instructions a week later, then I might get a different version (e.g. 5.23).

The Docker-way of making such a deployment scenario reproducible, is by installing the packages in a Dockerfile as part of the container's image construction process:


FROM debian:buster

RUN apt-get update
RUN apt-get install -y file

We can build the container image with our file package as follows:


$ docker build . -t file-experiment

and then deploy a container that uses that image:


$ docker run --name myexperiment -it file-experiment /bin/sh

As long as we deploy a container with the same image, we will always have the same version of the file executable:


$ docker run --name myexperiment -it file-experiment /bin/sh
# file --version
file-5.22
magic file from /etc/magic:/usr/share/misc/magic

With Nix, generating reproducible development environments with packages is a first-class feature.

For example, to launch a shell session providing the file package from the Nixpkgs collection, we can simply run:


$ nix-shell -p file
$ file --version
file-5.39
magic file from /nix/store/j4jj3slm15940mpmympb0z99a2ghg49q-file-5.39/share/misc/magic

As long as the Nix expression sources remain the same (e.g. the Nix channel is not updated, or NIX_PATH is hardwired to a certain Git revision of Nixpkgs), the deployment of the development environment is reproducible -- we should always get the same file package with the same Nix store path.
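
For example (a hedged sketch in which <rev> is a placeholder for a concrete Nixpkgs Git revision), Nixpkgs can be pinned explicitly on the command line:


$ nix-shell -p file -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/<rev>.tar.gz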

Building development projects/arbitrary packages


As shown in the section about Nix's concepts, one of Nix's key features is to generate build environments for building packages and other software projects. I have shown that with a simple Nix expression consisting of only a few lines of code, we can build the file package from source code and its build dependencies in such a dedicated build environment.

In Docker, only building images is a first-class concept. However, building arbitrary software projects and packages is also something you can do by using Docker containers in a specific way.

For example, we can create a bash script that builds the same example package (file) shown in the section that explains Nix's concepts:


#!/bin/bash -e

mkdir -p /build
cd /build

wget ftp://ftp.astron.com/pub/file/file-5.38.tar.gz

tar xfv file-5.38.tar.gz
cd file-5.38
./configure --prefix=/opt/file
make
make install

tar cfvz /out/file-5.38-binaries.tar.gz /opt/file

Compared to its Nix expression counterpart, the build script above does not use any abstractions -- as a consequence, we have to explicitly write down all the steps required to build the package:

  • Create a dedicated build directory.
  • Download the source tarball from the FTP server.
  • Unpack the tarball.
  • Execute the standard GNU Autotools build procedure: ./configure; make; make install and install the binaries in an isolated folder (/opt/file).
  • Create a binary tarball from the /opt/file folder and store it in the /out directory (that is a volume shared between the container and the host system).

To create a container that runs the build script and to provide its dependencies in a reproducible way, we need to construct an image from the following Dockerfile:


FROM debian:buster

RUN apt-get update
RUN apt-get install -y wget gcc make libz-dev
ADD ./build.sh /
CMD /build.sh

The above Dockerfile builds an image using the Debian Buster Linux distribution, installs all mandatory build utilities (wget, gcc, and make) and library dependencies (libz-dev), and executes the build script shown above.

With the following command, we can build the image:


$ docker build . -t buildenv

and with the following command, we can create and launch the container that executes the build script (and automatically discard it as soon as it finishes its task):


$ docker run -v $(pwd)/out:/out --rm -t buildenv

To make sure that we can keep our resulting binary tarball after the container gets discarded, we have created a shared volume that maps the out directory in our current working directory onto the /out directory in the container.

When the build script finishes, the output directory should contain our generated binary tarball:


$ ls out/
file-5.38-binaries.tar.gz

Although both Nix and Docker can provide reproducible environments for building packages (in the case of Docker, we need to make sure that all dependencies are provided by the Docker image), builds performed in a Docker container are not guaranteed to be pure, because Docker does not take the same precautions that Nix takes:

  • In the build script, we download the source tarball without checking its integrity. This might cause an impurity, because the tarball on the remote server could change (this could happen for non-malicious as well as malicious reasons). A possible mitigation is sketched after this list.
  • While running the build, we have unrestricted network access. The build script might unknowingly download all kinds of undeclared/unknown dependencies from external sites whose results are not deterministic.
  • We do not reset any timestamps -- as a result, when performing the same build twice in a row, the second result might be slightly different because of the timestamps integrated in the build product.
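
As a hedged illustration of the first point, the integrity of the downloaded tarball could be verified manually in the build script (the expected hash below is a placeholder):


$ wget ftp://ftp.astron.com/pub/file/file-5.38.tar.gz
$ echo "<expected-sha256-hash>  file-5.38.tar.gz" | sha256sum -c -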

Coping with these impurities in a Docker workflow is the responsibility of the build script implementer. With Nix, most of it is transparently handled for you.

Moreover, the build script implementer is also responsible for retrieving the build artifact and storing it somewhere, e.g. in a directory outside the container, or uploading it to a remote artifact repository.

In Nix, the result of a build process is automatically stored in isolation in the Nix store. We can also quite easily turn a Nix store into a binary cache and let other Nix consumers download from it, e.g. by installing nix-serve, by using Hydra (the Nix-based continuous integration service) or Cachix, or by manually generating a static binary cache.
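
For instance, a minimal sketch with nix-serve might look like this (the host name and port are assumptions, and in practice you would also have to configure a signing key and the consumer's trusted public keys):


$ nix-serve --port 8080 &

# on a consumer machine that trusts the cache's signing key:
$ nix-build file.nix --option substituters "http://cacheserver:8080 https://cache.nixos.org"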

Beyond the ability to execute builds, Nix has another great advantage for building packages from source code. On Linux systems, the Nixpkgs collection is entirely bootstrapped, except for the bootstrap binaries -- this provides us almost full traceability of all dependencies and transitive dependencies used at build-time.

With Docker you typically do not have such insights -- images get constructed from binaries obtained from arbitrary locations (e.g. binary packages that originate from Linux distributions' package repositories). As a result, it is impossible to get any insights on how these package dependencies were constructed from source code.

For most people, knowing exactly from which sources a package has been built is not considered important, but it can still be useful for more specialized use cases, for example, to determine whether your system is constructed from trusted/audited sources and whether you violate the license of a third-party library.

Combined use cases


As explained earlier in this blog post, Nix and Docker are deployment solutions for slightly different application domains.

There are quite a few solutions developed by the Nix community that can combine Nix and Docker in interesting ways.

In this section, I will show some of them.

Experimenting with the Nix package manager in a Docker container


Since Docker is such a common solution to provide environments in which users can experiment with packages, the Nix community also provides a Nix Docker image, that allows you to conveniently experiment with the Nix package manager in a Docker container.

We can pull this image as follows:


$ docker pull nixos/nix

Then launch a container interactively:


$ docker run -it nixos/nix

And finally, pull the package specifications from the Nix channel and install any Nix package that we want in the container:


$ nix-channel --add https://nixos.org/channels/nixpkgs-unstable nixpkgs
$ nix-channel --update
$ nix-env -f '<nixpkgs>' -iA file
$ file --version
file-5.39
magic file from /nix/store/bx9l7vrcb9izgjgwkjwvryxsdqdd5zba-file-5.39/share/misc/magic

Using the Nix package manager to deliver the required packages to construct an image


In the examples that construct Docker images for Nginx and the Apache HTTP server, I use the Debian Buster Linux distribution as a base image, to which I add the required packages to run the services from the Debian package repository.

This is a common practice when constructing Docker images -- as I have already explained in the section that covers Docker's concepts, package management is a sub problem of the process/service life-cycle management problem, but Docker leaves solving this problem to the Linux distribution's package manager.

Instead of using conventional Linux distributions and their package management solutions, such as Debian, Ubuntu (using apt-get), Fedora (using yum) or Alpine Linux (using apk), it is also possible to use Nix.

The following Dockerfile can be used to create an image that uses Nginx deployed by the Nix package manager:


FROM nixos/nix

RUN nix-channel --add https://nixos.org/channels/nixpkgs-unstable nixpkgs
RUN nix-channel --update
RUN nix-env -f '<nixpkgs>' -iA nginx

RUN mkdir -p /var/log/nginx /var/cache/nginx /var/www
ADD nginx.conf /etc
ADD index.html /var/www

CMD ["nginx", "-g", "daemon off;", "-c", "/etc/nginx.conf"]
EXPOSE 80/tcp

Using Nix to build Docker images


Earlier, I have shown that the Nix package manager can also be used in a Dockerfile to obtain all required packages to run a service.

In addition to building software packages, Nix can also build all kinds of static artifacts, such as disk images, DVD ROM ISO images, and virtual machine configurations.

The Nixpkgs repository also contains an abstraction function to build Docker images that does not require any Docker utilities.

For example, with the following Nix expression, we can build a Docker image that deploys Nginx:


with import <nixpkgs> {};

dockerTools.buildImage {
name = "nginxexp";
tag = "test";

contents = nginx;

runAsRoot = ''
${dockerTools.shadowSetup}
groupadd -r nogroup
useradd -r nobody -g nogroup -d /dev/null
mkdir -p /var/log/nginx /var/cache/nginx /var/www
cp ${./index.html} /var/www/index.html
'';

config = {
Cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
Expose = {
"80/tcp" = {};
};
};
}

The above expression propagates the following parameters to the dockerTools.buildImage function:

  • The name of the image is: nginxexp using the tag: test.
  • The contents parameter specifies all Nix packages that should be installed in the Docker image.
  • The runAsRoot parameter refers to a script that runs as the root user in a QEMU virtual machine. This virtual machine is used to provide the dynamic parts of a Docker image, such as setting up user accounts and configuring the state of the Nginx service.
  • The config parameter specifies image configuration properties, such as the command to execute and which TCP ports should be exposed.

Running the following command:


$ nix-build
/nix/store/qx9cpvdxj78d98rwfk6a5z2qsmqvgzvk-docker-image-nginxexp.tar.gz

produces a compressed tarball that contains all files belonging to the Docker image. We can load the image into Docker with the following command:


$ docker load -i \
/nix/store/qx9cpvdxj78d98rwfk6a5z2qsmqvgzvk-docker-image-nginxexp.tar.gz

and then launch a container instance that uses the Nix-generated image:


$ docker run -p 8080:80/tcp -it nginxexp:test

When we look at the Docker images overview:


$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
nginxexp test cde8298f025f 50 years ago 61MB

There are two properties that stand out when you compare the Nix generated Docker image to conventional Docker images:

  • The first odd property is that the overview says that the image was created 50 years ago. This is explainable: to make Nix builds pure and deterministic, time stamps are typically reset to 1 second after the epoch (January 1st, 1970), to ensure that we always get the same bit-identical build result.
  • The second property is the size of the image: 61MB is considerably smaller than our Debian-based Docker image.

    To give you a comparison: the docker history command-line invocation (shown earlier in this blog post) that displays the layers of which the Debian-based Nginx image consists, shows that the base Linux distribution image consumes 114 MB, the update layer 17.4 MB and the layer that provides the Nginx package is 64.2 MB.

The reason why Nix-generated images are so small is that Nix knows exactly which runtime dependencies are required to run Nginx. As a result, we can restrict the image to only contain Nginx and its required runtime dependencies, leaving all unnecessary software out.

The Debian-based Nginx container is much bigger, because it also contains a base Debian Linux system with all kinds of command-line utilities and libraries, that are not required to run Nginx.

The same limitation also applies to the Nix Docker image shown in the previous sections -- the Nix Docker image was constructed from an Alpine Linux image and contains a small, but fully functional Linux distribution. As a result, it is bigger than the Docker image directly generated from a Nix expression.

Although a Nix-generated Docker image is smaller than most conventional images, one of its disadvantages is that the image consists of only one single layer -- as we have seen in the section about Nix concepts, many services typically share the same runtime dependencies (such as glibc). Because these common dependencies are not in a reusable layer, they cannot be shared.

To optimize reuse, it is also possible to build layered Docker images with Nix:


with import <nixpkgs> {};

dockerTools.buildLayeredImage {
name = "nginxexp";
tag = "test";

contents = nginx;

maxLayers = 100;

extraCommands = ''
mkdir -p var/log/nginx var/cache/nginx var/www
cp ${./index.html} var/www/index.html
'';

config = {
Cmd = [ "${nginx}/bin/nginx" "-g" "daemon off;" "-c" ./nginx.conf ];
Expose = {
"80/tcp" = {};
};
};
}

The above Nix expression is similar to the previous one, but uses dockerTools.buildLayeredImage to construct a layered image.

We can build and load the image as follows:


$ docker load -i $(nix-build layered.nix)

When we retrieve the history of the image, we will see the following:


$ docker history nginxexp:test
IMAGE CREATED CREATED BY SIZE COMMENT
b91799a04b99 50 years ago 1.47kB store paths: ['/nix/store/snxpdsksd4wxcn3niiyck0fry3wzri96-nginxexp-customisation-layer']
<missing> 50 years ago 200B store paths: ['/nix/store/6npz42nl2hhsrs98bq45aqkqsndpwvp1-nginx-root.conf']
<missing> 50 years ago 1.79MB store paths: ['/nix/store/qsq6ni4lxd8i4g9g4dvh3y7v1f43fqsp-nginx-1.18.0']
<missing> 50 years ago 71.3kB store paths: ['/nix/store/n14bjnksgk2phl8n69m4yabmds7f0jj2-source']
<missing> 50 years ago 166kB store paths: ['/nix/store/jsqrk045m09i136mgcfjfai8i05nq14c-source']
<missing> 50 years ago 1.3MB store paths: ['/nix/store/4w2zbpv9ihl36kbpp6w5d1x33gp5ivfh-source']
<missing> 50 years ago 492kB store paths: ['/nix/store/kdrdxhswaqm4dgdqs1vs2l4b4md7djma-pcre-8.44']
<missing> 50 years ago 4.17MB store paths: ['/nix/store/6glpgx3pypxzb09wxdqyagv33rrj03qp-openssl-1.1.1g']
<missing> 50 years ago 385kB store paths: ['/nix/store/7n56vmgraagsl55aarx4qbigdmcvx345-libxslt-1.1.34']
<missing> 50 years ago 324kB store paths: ['/nix/store/1f8z1lc748w8clv1523lma4w31klrdpc-geoip-1.6.12']
<missing> 50 years ago 429kB store paths: ['/nix/store/wnrjhy16qzbhn2qdxqd6yrp76yghhkrg-gd-2.3.0']
<missing> 50 years ago 1.22MB store paths: ['/nix/store/hqd0i3nyb0717kqcm1v80x54ipkp4bv6-libwebp-1.0.3']
<missing> 50 years ago 327kB store paths: ['/nix/store/79nj0nblmb44v15kymha0489sw1l7fa0-fontconfig-2.12.6-lib']
<missing> 50 years ago 1.7MB store paths: ['/nix/store/6m9isbbvj78pjngmh0q5qr5cy5y1kzyw-libxml2-2.9.10']
<missing> 50 years ago 580kB store paths: ['/nix/store/2xmw4nxgfximk8v1rkw74490rfzz2gjp-libtiff-4.1.0']
<missing> 50 years ago 404kB store paths: ['/nix/store/vbxifzrl7i5nvh3h505kyw325da9k47n-giflib-5.2.1']
<missing> 50 years ago 79.8kB store paths: ['/nix/store/jc5bd71qcjshdjgzx9xdfrnc9hsi2qc3-fontconfig-2.12.6']
<missing> 50 years ago 236kB store paths: ['/nix/store/9q5gjvrabnr74vinmjzkkljbpxi8zk5j-expat-2.2.8']
<missing> 50 years ago 482kB store paths: ['/nix/store/0d6vl8gzwqc3bdkgj5qmmn8v67611znm-xz-5.2.5']
<missing> 50 years ago 6.28MB store paths: ['/nix/store/rmn2n2sycqviyccnhg85zangw1qpidx0-gcc-9.3.0-lib']
<missing> 50 years ago 1.98MB store paths: ['/nix/store/fnhsqz8a120qwgyyaiczv3lq4bjim780-freetype-2.10.2']
<missing> 50 years ago 757kB store paths: ['/nix/store/9ifada2prgfg7zm5ba0as6404rz6zy9w-dejavu-fonts-minimal-2.37']
<missing> 50 years ago 1.51MB store paths: ['/nix/store/yj40ch9rhkqwyjn920imxm1zcrvazsn3-libjpeg-turbo-2.0.4']
<missing> 50 years ago 79.8kB store paths: ['/nix/store/1lxskkhsfimhpg4fd7zqnynsmplvwqxz-bzip2-1.0.6.0.1']
<missing> 50 years ago 255kB store paths: ['/nix/store/adldw22awj7n65688smv19mdwvi1crsl-libpng-apng-1.6.37']
<missing> 50 years ago 123kB store paths: ['/nix/store/5x6l9xm5dp6v113dpfv673qvhwjyb7p5-zlib-1.2.11']
<missing> 50 years ago 30.9MB store paths: ['/nix/store/bqbg6hb2jsl3kvf6jgmgfdqy06fpjrrn-glibc-2.30']
<missing> 50 years ago 209kB store paths: ['/nix/store/fhg84pzckx2igmcsvg92x1wpvl1dmybf-libidn2-2.3.0']
<missing> 50 years ago 1.63MB store paths: ['/nix/store/y8n2b9nwjrgfx3kvi3vywvfib2cw5xa6-libunistring-0.9.10']

As you may notice, all Nix store paths are in their own layers. If we would also build a layered Docker image for the Apache HTTP service, we end up using less disk space (because common dependencies such as glibc can be reused), and less RAM (because these common dependencies can be shared in RAM).

Mapping Nix store paths onto layers obviously has limitations -- there is a maximum number of layers that Docker can use (in the Nix expression, I have imposed a limit of 100 layers, recent versions of Docker support a somewhat higher number).

Complex systems packaged with Nix typically have many more dependencies than the number of layers that Docker can mount. To cope with this limitation, the dockerTools.buildLayeredImage abstraction function tries to merge infrequently used dependencies into a shared layer. More information about this process can be found in Graham Christensen's blog post.

Besides the use cases shown in the examples above, there is much more you can do with the dockerTools functions in Nixpkgs -- you can also pull images from Docker Hub (with the dockerTools.pullImage function) and use the dockerTools.buildImage function to use existing Docker images as a basis to create hybrids combining conventional Linux software with Nix packages.
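
For example, a minimal sketch of pulling an existing image from Docker Hub with dockerTools.pullImage could look as follows (the digest and sha256 values are placeholders that you would need to fill in with real values):


with import <nixpkgs> {};

dockerTools.pullImage {
  imageName = "debian";
  imageDigest = "sha256:<digest>";
  sha256 = "<output hash>";
  finalImageTag = "buster";
}

The resulting store path can then, for instance, be passed to the fromImage parameter of dockerTools.buildImage to layer Nix packages on top of a conventional base image.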

Conclusion


In this blog post, I have elaborated about using Nix and Docker as deployment solutions.

What they both have in common is that they facilitate reliable and reproducible deployment.

They can be used for a variety of use cases in two different domains (package management and process/service management). Some of these use cases are common to both Nix and Docker.

Nix and Docker can also be combined in several interesting ways -- Nix can be used as a package manager to deliver package dependencies in the construction process of an image, and Nix can also be used directly to build images, as a replacement for Dockerfiles.

This table summarizes the conceptual differences between Nix and Docker covered in this blog post:

                             | Nix                                                          | Docker
Application domain           | Package management                                           | Process/service management
Storage units                | Package build results                                        | File system changes
Storage model                | Isolated Nix store paths                                     | Layers + union file system
Component addressing         | Hashes computed from inputs                                  | Hashes computed from a layer's contents
Service/process management   | Unsupported                                                  | First-class feature
Package management           | First-class support                                          | Delegated responsibility to a distro's package manager
Development environments     | nix-shell                                                    | Create image with dependencies + run shell session in container
Build management (images)    | dockerTools.buildImage {}, dockerTools.buildLayeredImage {}  | Dockerfile
Build management (packages)  | First-class function support                                 | Implementer's responsibility, can be simulated
Build environment purity     | Many precautions taken                                       | Only images provide some reproducibility, implementer's responsibility
Full source traceability     | Yes (on Linux)                                               | No
OS support                   | Many UNIX-like systems                                       | Linux (real system or virtualized)

I believe the last item in the table deserves a bit of clarification -- Nix works on other operating systems than Linux, e.g. macOS, and can also deploy binaries for those platforms.

Docker can be used on Windows and macOS, but it still deploys Linux software -- on Windows and macOS containers are deployed to a virtualized Linux environment. Docker containers can only work on Linux, because they heavily rely on Linux-specific concepts: namespaces and cgroups.

Aside from the functional parts, Nix and Docker also have some fundamental non-functional differences. One of them is usability.

Although I am a long-time Nix user (since 2007), I can see why Docker is very popular: it is well-known and provides quite an optimized user experience. It does not deviate much from the way traditional Linux systems are managed -- this probably explains why so many users incorrectly call containers "virtual machines", because they manifest themselves as units that provide almost fully functional Linux distributions.

From my own experiences, it is typically more challenging to convince a new audience to adopt Nix -- getting an audience used to the fact that a package build can be modeled as a pure function invocation (in which the function parameters are a package's build inputs) and that a specialized Nix store is used to store all static artifacts, is sometimes difficult.

Both Nix and Docker support reuse: the former by means of using identical Nix store paths and the latter by using identical layers. For both solutions, these objects can be identified with hash codes.

In practice, reuse with Docker is not always optimal -- for frequently used services, such as Nginx and the Apache HTTP server, it is not common practice to manually derive these images from a Linux distribution base image.

Instead, most Docker users will obtain specialized Nginx and Apache HTTP images. The official Docker Nginx images are constructed from Debian Buster and Alpine Linux, whereas the official Apache HTTP images only support Alpine Linux. Sharing common dependencies between these two images is only possible if we use the Alpine Linux-based images for both.

In practice, it happens quite frequently that people run images constructed from all kinds of different base images, making it very difficult to share common dependencies.

Another impractical aspect of Nix is that it works conveniently for software compiled from source code, but packaging and deploying pre-built binaries is typically a challenge -- ELF binaries typically do not work out of the box and need to be patched, or deployed to an FHS user environment in which dependencies can be found in their "usual" locations (e.g. /bin, /lib etc.).
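
For example, patching a pre-built ELF binary typically involves commands along the following lines (a hedged sketch in which the store paths are placeholders for the glibc and library versions actually present in your Nix store):


$ patchelf --set-interpreter /nix/store/<hash>-glibc-2.30/lib/ld-linux-x86-64.so.2 ./prebuilt-program
$ patchelf --set-rpath /nix/store/<hash>-zlib-1.2.11/lib ./prebuilt-program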

Related work


In this blog post, I have restricted my analysis to Nix and Docker. Both tools are useful on their own, but they are also the foundations of entire solution eco-systems. I did not elaborate much about solutions in these extended eco-systems.

For example, Nix does not do any process/service management, but there are Nix-related projects that can address this concern. Most notably, NixOS -- a Linux distribution fully managed by Nix -- uses systemd to manage services.

For Nix users on macOS, there is a project called nix-darwin that integrates with launchd, which is the default service manager on macOS.

There also used to be an interesting cross-over project between Nix and Docker (called nix-docker) combining Nix's package management capabilities, Docker's isolation capabilities, and supervisord's ability to manage multiple services in a container -- it takes a configuration file (that looks similar to a NixOS configuration) defining a set of services, fully generates a supervisord configuration (with all required services and dependencies) and deploys them to a container. Unfortunately, the project is no longer maintained.

Nixery is a Docker-compatible container registry that is capable of transparently building and serving container images using Nix.

Docker is also an interesting foundation for an entire eco-system of solutions. Most notably Kubernetes, a container-orchestrating system that works with a variety of container tools including Docker. docker-compose makes it possible to manage collections of Docker containers and dependencies between containers.

There are also many solutions available to make building development projects with Docker (and other container technologies) more convenient than my file package build example. Gitlab CI, for example, provides first-class Docker integration. Tekton is a Kubernetes-based framework that can be used to build CI/CD systems.

There are also quite a few Nix cross-over projects that integrate with the extended containers eco-system, such as Kubernetes and docker-compose. For example, arion can generate docker-compose configuration files with specialized containers from NixOS modules. KuberNix can be used to bootstrap a Kubernetes cluster with the Nix package manager, and Kubenix can be used to build Kubernetes resources with Nix.

As explained in my comparisons, package management is not something that Docker supports as a first-class feature, but Docker has been an inspiration for package management solutions as well.

Most notably, several years ago I did a comparison between Nix and Ubuntu's Snappy package manager. The latter deploys every package (and all its required dependencies) as a container.

In this comparison blog post, I raised a number of concerns about reuse. Snappy does not have any means to share common dependencies between packages, and as a result, Snaps can be quite disk space and memory consuming.

Flatpak can be considered an alternative and more open solution to Snappy.

I still do not understand why these Docker-inspired package management solutions have not used Nix (e.g. storing packages in isolated folders) or Docker (e.g. using layers) as an inspiration to optimize reuse and simplify the construction of packages.

Future work


In the next blog post, I will elaborate more about integrating the Nix package manager with tools that can address the process/service management concern.

by Sander van der Burg (noreply@blogger.com) at July 29, 2020 08:57 PM

July 28, 2020

Cachix

Upstream caches: avoiding pushing paths in cache.nixos.org

One of the most requested features, so-called upstream caches, was released today. It is enabled by default for all caches, and the owner of the binary cache can disable it via Settings. When you push store paths to Cachix, querying cache.nixos.org adds an overhead of a few hundred milliseconds, but you save storage and possibly minutes by avoiding pushing paths that are already available. Queries to cache.nixos.org are also cached, so that subsequent push operations do not have this overhead.
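
For example, pushing build results to a Cachix cache typically looks like this (mycache is a placeholder for your own binary cache name):


$ nix-build | cachix push mycache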

by Domen Kožar (support@cachix.org) at July 28, 2020 02:30 PM

July 20, 2020

Cachix

Documentation and More Documentation

Documentation is an important ingredient of a successful software project. In the last few weeks I've worked on improving the status quo on two fronts: 1) https://nix.dev is an opinionated guide for developers getting things done using the Nix ecosystem. A few highlights:

  • Getting started repository template with a tutorial for using declarative and reproducible developer environments
  • Setting up GitHub Actions with Nix
  • Nix language anti-patterns to avoid and recommended alternatives

by Domen Kožar (support@cachix.org) at July 20, 2020 02:45 PM

July 08, 2020

Tweag I/O

Setting up Buildkite for Nix-based projects using Terraform and GCP

In this post I’m going to show how to setup, with Terraform, a Buildkite-based CI using your own workers that run on GCP. For reference, the complete Terraform configuration for this post is available in this repository.

  • The setup gives you complete control over how fast your workers are.
  • The workers come with Nix pre-installed, so you won’t need to spend time downloading the same docker container again and again on every push as would usually happen with most cloud CI providers.
  • The workers come with a distributed Nix cache set up. So authors of CI scripts won’t have to bother about caching at all.

Secrets

We are going to need to import two secret resources:

resource "secret_resource" "buildkite_agent_token" {}
resource "secret_resource" "nix_signing_key" {}

To initialize the resources, execute the following from the root directory of your project:

$ terraform import secret_resource.<name> <value>

where:

  • buildkite_agent_token is obtained from the Buildkite site.
  • nix_signing_key can be generated by running:

    nix-store --generate-binary-cache-key <your-key-name> key.private key.public

    The key.private file will contain the value for the signing key. I’ll explain later in the post how to use the contents of the key.public file.

Custom NixOS image

The next step is to use the nixos_image_custom module to create a NixOS image with custom configuration.

resource "google_storage_bucket" "nixos_image" {
  name     = "buildkite-nixos-image-bucket-name"
  location = "EU"
}

module "nixos_image_custom" {
  source      = "git::https://github.com/tweag/terraform-nixos.git//google_image_nixos_custom?ref=40fedb1fae7df5bd7ad9defdd71eb06b7252810f"
  bucket_name = "${google_storage_bucket.nixos_image.name}"
  nixos_config = "${path.module}/nixos-config.nix"
}

The snippet above first creates a bucket nixos_image where the generated image will be uploaded, then it uses the nixos_image_custom module, which handles generation of the image using the configuration from the nixos-config.nix file. The file is assumed to be in the same directory as the Terraform configuration, hence ${path.module}/.

Service account and cache bucket

To control access to different resources we will also need a service account:

resource "google_service_account" "buildkite_agent" {
  account_id   = "buildkite-agent"
  display_name = "Buildkite agent"
}

We can use it to set access permissions for the storage bucket that will contain the Nix cache:

resource "google_storage_bucket" "nix_cache_bucket" {
  name     = "nix-cache-bucket-name"
  location = "EU"
  force_destroy = true
  retention_policy {
    retention_period = 7889238 # three months
  }
}

resource "google_storage_bucket_iam_member" "buildkite_nix_cache_writer" {
  bucket = "${google_storage_bucket.nix_cache_bucket.name}"
  role = "roles/storage.objectAdmin"
  member = "serviceAccount:${google_service_account.buildkite_agent.email}"
}

resource "google_storage_bucket_iam_member" "buildkite_nix_cache_reader" {
  bucket = "${google_storage_bucket.nix_cache_bucket.name}"
  role   = "roles/storage.objectViewer"
  member = "allUsers"
}

The bucket is configured to automatically delete objects that are older than 3 months. We give the service account the ability to write to and read from the bucket (the roles/storage.objectAdmin role). The rest of the world gets the ability to read from the bucket (the roles/storage.objectViewer role).

NixOS configuration

Here is the content of my nixos-config.nix. This NixOS configuration can serve as a starting point for writing your own. The numbered points refer to the notes below.

{ modulesPath, pkgs, ... }:
{
  imports = [
    "${modulesPath}/virtualisation/google-compute-image.nix"
  ];
  virtualisation.googleComputeImage.diskSize = 3000;
  virtualisation.docker.enable = true;

  services = {
    buildkite-agents.agent = {
      enable = true;
      extraConfig = ''
      tags-from-gcp=true
      '';
      tags = {
        os = "nixos";
        nix = "true";
      };
      tokenPath = "/run/keys/buildkite-agent-token"; # (1)
      runtimePackages = with pkgs; [
        bash
        curl
        gcc
        gnutar
        gzip
        ncurses
        nix
        python3
        xz
        # (2) extend as necessary
      ];
    };
    nix-store-gcs-proxy = {
      nix-cache-bucket-name = { # (3)
        address = "localhost:3000";
      };
    };
  };

  nix = {
    binaryCaches = [
      "https://cache.nixos.org/"
      "https://storage.googleapis.com/nix-cache-bucket-name" # (4)
    ];
    binaryCachePublicKeys = [
      "cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY="
      "<insert your public signing key here>" # (5)
    ];
    extraOptions = ''
      post-build-hook = /etc/nix/upload-to-cache.sh # (6)
    '';
  };

  security.sudo.enable = true;
  services.openssh.passwordAuthentication = false;
  security.sudo.wheelNeedsPassword = false;
}

Notes:

  1. This file will be created later by the startup script (see below).
  2. The collection of packages that are available to the Buildkite script can be edited here.
  3. Replace nix-cache-bucket-name by the name of the bucket used for the Nix cache.
  4. Similarly to (3) replace nix-cache-bucket-name in the URL.
  5. Insert the contents of the key.public file you generated earlier.
  6. The file will be created later by the startup script.

Compute instances and startup script

The following snippet sets up an instance group manager which controls multiple (3 in this example) Buildkite agents. The numbered points refer to the notes below.

data "template_file" "buildkite_nixos_startup" { # (1)
  template = "${file("${path.module}/files/buildkite_nixos_startup.sh")}"

  vars = {
    buildkite_agent_token = "${secret_resource.buildkite_agent_token.value}"
    nix_signing_key = "${secret_resource.nix_signing_key.value}"
  }
}

resource "google_compute_instance_template" "buildkite_nixos" {
  name_prefix  = "buildkite-nixos-"
  machine_type = "n1-standard-8"

  disk {
    boot         = true
    disk_size_gb = 100
    source_image = "${module.nixos_image_custom.self_link}"
  }

  metadata_startup_script = "${data.template_file.buildkite_nixos_startup.rendered}"

  network_interface {
    network = "default"

    access_config = {}
  }

  metadata {
    enable-oslogin = true
  }

  service_account {
    email = "${google_service_account.buildkite_agent.email}"

    scopes = [
      "compute-ro",
      "logging-write",
      "storage-rw",
    ]
  }

  scheduling {
    automatic_restart   = false
    on_host_maintenance = "TERMINATE"
    preemptible         = true # (2)
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "google_compute_instance_group_manager" "buildkite_nixos" {
  provider           = "google-beta"
  name               = "buildkite-nixos"
  base_instance_name = "buildkite-nixos"
  target_size        = "3" # (3)
  zone               = "<your-zone>" # (4)

  version {
    name              = "buildkite_nixos"
    instance_template = "${google_compute_instance_template.buildkite_nixos.self_link}"
  }

  update_policy {
    type                  = "PROACTIVE"
    minimal_action        = "REPLACE"
    max_unavailable_fixed = 1
  }
}

Notes:

  1. The file files/buildkite_nixos_startup.sh is shown below.
  2. Because of the remote Nix cache, the nodes can be preemptible (short-lived, never lasting longer than 24 hours), which results in much lower GCP costs.
  3. Changing target_size allows you to scale the system. This is the number of instances that are controlled by the instance group manager.
  4. Insert your desired zone here.

Finally, here is the startup script:

# workaround https://github.com/NixOS/nixpkgs/issues/42344
chown root:keys /run/keys
chmod 750 /run/keys
umask 037
echo "${buildkite_agent_token}" > /run/keys/buildkite-agent-token
chown root:keys /run/keys/buildkite-agent-token
umask 077
echo '${nix_signing_key}' > /run/keys/nix_signing_key
chown root:keys /run/keys/nix_signing_key

cat <<EOF > /etc/nix/upload-to-cache.sh
#!/bin/sh

set -eu
set -f # disable globbing
export IFS=' '

echo "Uploading paths" $OUT_PATHS
exec nix copy --to http://localhost:3000?secret-key=/run/keys/nix_signing_key \$OUT_PATHS
EOF
chmod +x /etc/nix/upload-to-cache.sh

This script uses the Nix post build hook approach for uploading to the cache without polluting the CI script.

Conclusion

The setup allows us to run Nix builds in an environment where Nix tooling is available. It also provides a remote Nix cache which does not require that the authors of CI scripts set it up or, even, be aware of it at all. We use this setup on many of Tweag’s projects and found that both mental and performance overheads are minimal. A typical CI script looks like this:

steps:
  - label: Build and test
    command: nix-build -A distributed-closure --no-out-link

With an up-to-date cache, a build that triggers no rebuilds may finish in literally one second.

July 08, 2020 12:00 AM

June 25, 2020

Tweag I/O

Nix Flakes, Part 2: Evaluation caching

Nix evaluation is often quite slow. In this blog post, we’ll have a look at a nice advantage of the hermetic evaluation model enforced by flakes: the ability to cache evaluation results reliably. For a short introduction to flakes, see our previous blog post.

Why Nix evaluation is slow

Nix uses a simple, interpreted, purely functional language to describe package dependency graphs and NixOS system configurations. So to get any information about those things, Nix first needs to evaluate a substantial Nix program. This involves parsing potentially thousands of .nix files and running a Turing-complete language.

For example, the command nix-env -qa shows you which packages are available in Nixpkgs. But this is quite slow and takes a lot of memory:

$ command time nix-env -qa | wc -l
5.09user 0.49system 0:05.59elapsed 99%CPU (0avgtext+0avgdata 1522792maxresident)k
28012

Evaluating individual packages or configurations can also be slow. For example, using nix-shell to enter a development environment for Hydra, we have to wait a bit, even if all dependencies are present in the Nix store:

$ command time nix-shell --command 'exit 0'
1.34user 0.18system 0:01.69elapsed 89%CPU (0avgtext+0avgdata 434808maxresident)k

That might be okay for occasional use but a wait of one or more seconds may well be unacceptably slow in scripts.

Note that the evaluation overhead is completely independent from the time it takes to actually build or download a package or configuration. If something is already present in the Nix store, Nix won’t build or download it again. But it still needs to re-evaluate the Nix files to determine which Nix store paths are needed.
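
This is easy to observe with nix-instantiate, which evaluates an expression and prints the resulting .drv path without building anything: running it twice in a row costs roughly the same both times, because nothing about the evaluation itself is cached. A small sketch (timings will differ per machine):

# Evaluate (but do not build) a single package twice in a row; both runs pay
# roughly the same evaluation cost.
command time nix-instantiate '<nixpkgs>' -A firefox
command time nix-instantiate '<nixpkgs>' -A firefox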

Caching evaluation results

So can’t we speed things up by caching evaluation results? After all, the Nix language is purely functional, so it seems that re-evaluation should produce the same result, every time. Naively, maybe we can keep a cache that records that attribute A of file X evaluates to derivation D (or whatever metadata we want to cache). Unfortunately, it’s not that simple; cache invalidation is, after all, one of the only two hard problems in computer science.
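
To make the naive idea concrete, a toy cache along those lines might look like the sketch below; default.nix and the attribute hello are made-up examples, and, crucially, nothing here tracks the files that default.nix itself imports:

# Toy (and, as explained next, unsound) evaluation cache: remember that
# attribute A of file X evaluated to derivation D, keyed only on the pair
# (X, A).
cache=~/.cache/naive-eval-cache
mkdir -p "$cache"
key=$(printf 'default.nix#hello' | sha256sum | cut -d' ' -f1)
if [ -f "$cache/$key" ]; then
  cat "$cache/$key"   # cache hit (possibly stale!)
else
  nix-instantiate default.nix -A hello | tee "$cache/$key"
fi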

The reason this didn't work in the past is that Nix evaluation was not hermetic. For example, a .nix file can import other Nix files through relative or absolute paths (such as ~/.config/nixpkgs/config.nix for Nixpkgs) or by looking them up in the Nix search path ($NIX_PATH). So unless we perfectly keep track of all the files used during evaluation, a cached result might be inconsistent with the current input.

(As an aside: for a while, Nix has had an experimental replacement for nix-env -qa called nix search, which used an ad hoc cache for package metadata. It had exactly this cache invalidation problem: it wasn’t smart enough to figure out whether its cache was up to date with whatever revision of Nixpkgs you were using. So it had a manual flag --update-cache to allow the user to force cache invalidation.)

Flakes to the rescue

Flakes solve this problem by ensuring fully hermetic evaluation. When you evaluate an output attribute of a particular flake (e.g. the attribute defaultPackage.x86_64-linux of the dwarffs flake), Nix disallows access to any files outside that flake or its dependencies. It also disallows impure or platform-dependent features such as access to environment variables or the current system type.
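
As a rough illustration of that restriction (exact flag behaviour depends on your Nix version), pure evaluation hides impurities such as environment variables unless you explicitly opt out:

# Under pure (flake-style) evaluation, getEnv yields the empty string; only
# with --impure does the real environment become visible.
nix eval --expr 'builtins.getEnv "HOME"'
nix eval --impure --expr 'builtins.getEnv "HOME"'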

This allows the nix command to aggressively cache evaluation results without fear of cache invalidation problems. Let’s see this in action by running Firefox from the nixpkgs flake. If we do this with an empty evaluation cache, Nix needs to evaluate the entire dependency graph of Firefox, which takes a quarter of a second:

$ command time nix shell nixpkgs#firefox -c firefox --version
Mozilla Firefox 75.0
0.26user 0.05system 0:00.39elapsed 82%CPU (0avgtext+0avgdata 115224maxresident)k

But if we do it again, it’s almost instantaneous (and takes less memory):

$ command time nix shell nixpkgs#firefox -c firefox --version
Mozilla Firefox 75.0
0.01user 0.01system 0:00.03elapsed 93%CPU (0avgtext+0avgdata 25840maxresident)k

The cache is implemented using a simple SQLite database that stores the values of flake output attributes. After the first command above, the cache looks like this:

$ sqlite3 ~/.cache/nix/eval-cache-v1/302043eedfbce13ecd8169612849f6ce789c26365c9aa0e6cfd3a772d746e3ba.sqlite .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE Attributes (
    parent      integer not null,
    name        text,
    type        integer not null,
    value       text,
    primary key (parent, name)
);
INSERT INTO Attributes VALUES(0,'',0,NULL);
INSERT INTO Attributes VALUES(1,'packages',3,NULL);
INSERT INTO Attributes VALUES(1,'legacyPackages',0,NULL);
INSERT INTO Attributes VALUES(3,'x86_64-linux',0,NULL);
INSERT INTO Attributes VALUES(4,'firefox',0,NULL);
INSERT INTO Attributes VALUES(5,'type',2,'derivation');
INSERT INTO Attributes VALUES(5,'drvPath',2,'/nix/store/7mz8pkgpl24wyab8nny0zclvca7ki2m8-firefox-75.0.drv');
INSERT INTO Attributes VALUES(5,'outPath',2,'/nix/store/5x1i2gp8k95f2mihd6aj61b5lydpz5dy-firefox-75.0');
INSERT INTO Attributes VALUES(5,'outputName',2,'out');
COMMIT;

In other words, the cache stores all the attributes that nix shell had to evaluate, in particular legacyPackages.x86_64-linux.firefox.{type,drvPath,outPath,outputName}. It also stores negative lookups, that is, attributes that don’t exist (such as packages).
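
If you want to poke at the cache yourself, the table can be inspected with an ordinary SQLite query (the database file name is the one from the dump above and will differ on your machine):

# List everything the evaluation cache has recorded so far, including the
# entry for the non-existent packages attribute.
sqlite3 ~/.cache/nix/eval-cache-v1/302043eedfbce13ecd8169612849f6ce789c26365c9aa0e6cfd3a772d746e3ba.sqlite \
  'SELECT parent, name, type, value FROM Attributes;'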

The name of the SQLite database, 302043eedf….sqlite in this example, is derived from the contents of the top-level flake. Since the flake’s lock file contains content hashes of all dependencies, this is enough to efficiently and completely capture all files that might influence the evaluation result. (In the future, we’ll optimise this a bit more: for example, if the flake is a Git repository, we can simply use the Git revision as the cache name.)

The nix search command has been updated to use the new evaluation cache instead of its previous ad hoc cache. For example, searching for Blender is slow the first time:

$ command time nix search nixpkgs blender
* legacyPackages.x86_64-linux.blender (2.82a)
  3D Creation/Animation/Publishing System
5.55user 0.63system 0:06.17elapsed 100%CPU (0avgtext+0avgdata 1491912maxresident)k

but the second time it is pretty fast and uses much less memory:

$ command time nix search nixpkgs blender
* legacyPackages.x86_64-linux.blender (2.82a)
  3D Creation/Animation/Publishing System
0.41user 0.00system 0:00.42elapsed 99%CPU (0avgtext+0avgdata 21100maxresident)k

The evaluation cache at this point is about 10.9 MiB in size. The overhead for creating the cache is fairly modest: with the flag --no-eval-cache, nix search nixpkgs blender takes 4.9 seconds.
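
In other words, you can always bypass the cache and compare for yourself; timings will of course differ per machine:

# Force a full re-evaluation, ignoring the evaluation cache.
command time nix search --no-eval-cache nixpkgs blender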

Caching and store derivations

There is only one way in which cached results can become "stale", in a sense. Nix evaluation produces store derivations such as /nix/store/7mz8pkgpl24wyab8nny0zclvca7ki2m8-firefox-75.0.drv as a side effect. (.drv files are essentially a serialization of the dependency graph of a package.) These store derivations may be garbage-collected, in which case the evaluation cache points to a path that no longer exists. Thus, Nix checks whether the .drv file still exists, and if not, falls back to evaluating normally.
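
Whether a particular .drv is actually collected depends on your garbage-collector roots and the keep-derivations setting, but the fallback is easy to observe in principle:

# After a garbage collection that removes the cached .drv files, the first
# evaluation is slow again (and re-creates them); subsequent runs hit the
# cache as before. Whether the .drv is really collected depends on your GC
# roots and the keep-derivations setting.
nix-collect-garbage
command time nix shell nixpkgs#firefox -c firefox --version
command time nix shell nixpkgs#firefox -c firefox --version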

Future improvements

Currently, the evaluation cache is only created and used locally. However, Nix could automatically download precomputed caches, similar to how it has a binary cache for the contents of store paths. That is, if we need a cache like 302043eedf….sqlite, we could first check if it’s available on cache.nixos.org and if so fetch it from there. In this way, when we run a command such as nix shell nixpkgs#firefox, we could even avoid the need to fetch the actual source of the flake!

Another future improvement is to populate and use the cache in the evaluator itself. Currently the cache is populated and consulted in the user interface (that is, the nix command). The command nix shell nixpkgs#firefox will create a cache entry for firefox, but not for the dependencies of firefox; thus a subsequent nix shell nixpkgs#thunderbird won't see a speed improvement even though it shares most of its dependencies. So it would be nice if the evaluator had knowledge of the evaluation cache. For example, the evaluation of thunks that represent attributes like nixpkgs.legacyPackages.x86_64-linux.<package name> could check and update the cache.

Nix Flakes Series

June 25, 2020 12:00 AM