NixOS Planet

June 14, 2017

Typing nix

Things won are done, joy's soul lies in the doing

I couldn't find the time last week to make something presentable out of the typechecker as I promised, but almost all the features are already here − except records, which I'll probably start implementing in a few weeks.

Implementation

State of the typechecker

I finished most of the engine for the language without records, so I'll now work on adding everything that is needed around it, and on handling some constructs that could be ignored from a theoretical point of view − mostly because they can be considered as mere syntactic sugar.

Here are the constructs that need to be considered:

Assert
The assert construct should be typed in a rather simple way, as assert e1; e2 is equivalent to if e1 then e2 else raise "Assertion failed". Well, it is not strictly equivalent, because the exception raised by assert and the one raised by raise aren't exactly the same − see this comment − but for our purposes, it is the same.
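Concretely, the typechecker can treat these two expressions identically (a sketch, using the raise construct as in the examples below):

assert x != null; f x

if x != null then f x else raise "Assertion failed"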

Import
The simplest way to handle the import statement is to simply replace it by a dump of the imported file. This is probably what I will do at first. Later on, it may be useful for performance reasons to type the imported file separately and memoize the result (this is what nix does with the evaluation, IIRC).
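For instance, assuming a hypothetical file lib.nix containing { double = x: x * 2; }, the expression

let lib = import ./lib.nix; in lib.double 21

would be typed exactly as if it read

let lib = { double = x: x * 2; }; in lib.double 21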

Different modes of check

During the design of the type-system, there has been a constant tension between having a safe type-system and being able to typecheck as much existing code as possible.

Eventually, it became clear that both weren't compatible, and that in order to keep the best of both worlds, we needed (at least) two different versions of the type-system: a "permissive" one and a "strict" one. So that's what we decided to do, with an annotation to switch from one to the other. Nothing has been implemented in this direction yet, and the boundary between strict and permissive hasn't been settled − in particular, it will mostly concern records. But an important point is that the gradual type will be treated differently in the two versions.

On the theoretical side

There hasn't been much improvement here, as I am still stuck on the proof of correctness, which is definitely harder than expected (though I must admit I didn't work a lot on it). A PhD student of my advisor is also working on a really similar proof, and hopefully he will find a solution − I don't really have much more time for this, as I need to work on the implementation.

Next week

I'll start drafting an outline for my dissertation; this will allow me to see which points need some more work.

In parallel, I plan to add the type annotations for lists − this will probably deserve its own post, as there is some really elegant stuff borrowed from CDuce that is worth presenting − and to handle the assert and import keywords.

I'm also rewriting the parser − this is already almost done, but some polishing work is still ongoing. The old one was a quick and dirty hand-crafted parser that I wrote in order to be able to start working on the typechecker quickly, but it behaves quite badly, in particular regarding error messages.

June 14, 2017 12:00 AM

June 02, 2017

Typing nix

What is past is prologue

This week has been shared between a lot of easy fun putting together all the pieces that I worked on during the last weeks, and a lot of frustration trying to prove the soundness of the system.

Implementation

After several weeks spent trying to understand the API of CDuce − the language our type-system takes its inspiration from, and whose code we planned to reuse as much as possible to ease the implementation − I eventually understood it well enough to start using it to write the inference engine.

This has been really enjoyable, as all I had to do was plug together a few functions that were already dealing with all the boring low-level stuff, and everything was almost instantly working. I could quickly write a prototype typer for the core lambda-calculus of the language. It is still a little too rough around the edges to be presented, but I should (hopefully) be able to show some nice examples next week.

On the theoretical side

I talked last week about some difficulties with the proof of the soundness of the type-system.

Those difficulties still hold, although some progress has been made. There are a couple of paths that we are currently exploring for this:

  • We may change the semantics of the language we work on a bit (it is currently an approximation of the laziness of nix), replacing it with something closer to actual laziness. This would probably solve our problems, but at the cost of redefining the language with a heavier and less practical formalism.

  • We may use a non-syntactic proof, using something like what is described in this paper, which a colleague from Tweag suggested to me. While mathematically heavy, this kind of proof has the advantage of operating on our simplified semantics, which means we can keep our simple formalism everywhere else. It is also more modular than syntactic proofs, which means that we may prove some extensions of the core system without having to rewrite the full proof.

Next week

Next week will be more or less like this one: I'll continue the implementation using all the nice features offered by CDuce. At the same time, I'll keep working on the proof − I hope it will be solved before next Friday.

Provided I find the time − which is rather unlikely, but who knows? − I'll also try to make an online demonstration of the engine to show the progress of the implementation.

June 02, 2017 12:00 AM

May 26, 2017

Typing nix

Proving and implementing

It's been a month (already!) since my last update on this work. In the meantime, we've begun to prove a subset of the type-system, and (slowly) started the implementation.

Bidirectional typing

One nice feature that we added in order to make our system more powerful is a little touch of bidirectional typing, which allows us to type some expressions that the type-system wasn't powerful enough to handle before.

Here is an example (and in fact the example that motivated this addition in the first place):

let f :: (true → Int → Int) ∧ (false → Bool → Bool) =
  cond: x: if cond then x+1 else not x in f

Although the type annotation corresponds to a type that's perfectly reasonable − f applied to a value of type true and a value of type Int returns an Int, and applied to a value of type false and a value of type Bool returns a Bool − the type-system isn't able to infer it. No matter how we annotate the expression, it can't, because we would have to annotate x as either Int or Bool depending on the value of cond, and our system can't express this.

Without diving into the details, this is now possible. Of course this exact case isn't really exciting, as it is easy to rewrite it as something with roughly the same semantics that was already possible to type (see the sketch below). But the really nice thing is that most of the time, you now only have to annotate the top-level functions, and not every single lambda in the code.
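For the curious, here is what such a rewrite could look like, using the multi-type annotation syntax from an earlier post (a sketch − note that it trades the single top-level annotation for an annotation on every inner lambda, which is exactly what bidirectional typing spares us):

(cond :: true; false):
if cond then ((x :: Int): x + 1) else ((x :: Bool): not x)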

Proving the type-system

The main work of those last weeks has been on establishing a proof of the soundness of (the core of) the type-system.

Saying that a type-system is "sound" intuitively means that it respects the adage that "well-typed programs cannot go wrong". The exact definition depends on what it is intended for, but in our case we can define it as "A well-typed expression will either loop indefinitely, raise an exception, or reduce to a value" (where a value is a constant, a list, a record, a function…).

We've been working on this for the last few weeks, and it is somewhat harder than expected. The classical way of proving soundness is to prove two lemmas known as "subject reduction" and "progress". The first one states that if a well-typed expression e reduces ("evaluates") to another expression e', then e' is also a well-typed expression, with the same type as e (or a more precise one). The second one states that if e is a well-typed expression, then either it is a value or it reduces to some e'. The combination of those two lemmas gives us our soundness theorem.

The fact is that the subject-reduction lemma doesn't hold for our system − for a couple of reasons that I won't detail here to keep things simple, but that I would gladly explain if anyone is interested. We see some possible workarounds for this, but we still have to investigate a little more to ensure that they are viable.

Implementation

We also started the implementation. For this, we will reuse a lot of code from the implementation of the type system of the CDuce language because ours shares a lot with it. In particular, a really big part − maybe the biggest part − of the implementation is the subtyping relation, which is the same in CDuce and in Nix.

Unfortunately the code of CDuce, although really well written, lacks documentation and is sparsely commented, so it took me some time to get into it.

Plans for the future

Apart from trying to keep this weekly newsletter weekly − and I apologize for the whole month of silence − I'll try next week to come up with a clear idea of how to prove the soundness (not a fully written-out proof, but at least the main arguments), and to advance on the implementation. Ideally, I'd like to have a working prototype of a simplified type-checker (without bidirectional typing) for the core lambda-calculus − no lists and no records − within three weeks (but I'm not very good at estimates, so this doesn't mean much).

May 26, 2017 12:00 AM

May 08, 2017

Munich NixOS Meetup

Dive into NixOPS and its container backend


Talk by Johannes Bornhold

NixOPS is the cloud deployment tool of the NixOS ecosystem. It is based on Nix and NixOS and makes it possible to describe a network of machines and additional resources, and to deploy that network based on this description.

The container backend is based on NixOS' built-in container support. The talk will give a short introduction to the usage of NixOPS and the container backend, based on my experiences trying to use the container backend to quickly throw a few containers into a virtual machine and later deploy these containers on a regular server.

The second part will give an introduction to NixOPS internals with a focus on its container backend. This part will cover the internal structure of the NixOPS codebase as well as how declarative and imperative NixOS containers currently work, seen from the NixOPS angle.


If some people are interested we can do a little hackathon afterwards. Proposed topic by Johannes:

• Unify declarative / imperative NixOS containers to extend their usability from the NixOPS perspective. This is the bit I ended up working on, instead of throwing containers out into my virtual machine.
    Some background reading on this topic:
    • Some notes on this PR
    • Early code fragments      

My goal would be to look at some small steps to further explore this approach.


Munich - Germany

Friday, May 26 at 6:00 PM


https://www.meetup.com/Munich-NixOS-Meetup/events/239835348/

May 08, 2017 10:24 PM

April 25, 2017

Typing nix

Gradual types and annotations

This week's post will be about adding gradual types and type annotations to the system.

Last week, we took advantage of the fact that Jeremy Siek, the inventor of gradual typing, was in Paris to work on adding gradual types to the system.

The idea is to use this to type the parts that can't be typed using a traditional type-system − while waiting for them to be rewritten in a typeable way, of course.

This has also been the occasion to make explicit where type annotations will be needed.

About gradual types

(Note that this is just a very quick introduction. If you want something more detailed, I encourage you to read the original blogpost about it.)

The jungle of programming languages is roughly split into two universes: the typed world and the untyped one. The goal of gradual typing is to bridge the gap in between. For this purpose, we add a new type (the "gradual type", often written ?), which is the type of expressions that we do not want (or can't) type.

The type ? can replace every type (for example, if a function expects an argument of type Int, I can pass in a value of type ? instead), and every type can replace the type ? (so if a function expects an argument of type ?, I can pass an Int or a Bool or a function String → (? → Bool) instead).
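Using the annotation syntax introduced later in this post, both directions look like this (a sketch):

(x :: ?): x + 1          # fine: a ? value is used where an Int is expected
(f :: ? → Bool): f 42    # fine: an Int is passed where ? is expected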

This tool is used in various languages such as Hack, Flow or Typescript.

Back to nix

So how can this be useful for nix?

First, let's see where we need to help the (classical) type checker in order to see where and why those types can help.

Type annotations

While a type-checker can be really smart, it may only infer a limited amount of stuff − the mind of a programmer is often way too weird for it − and often needs some hints to understand what is going on.

In the case of nix, there are two places where you will probably need to annotate stuff: lambdas and (recursive) let-bindings. Those are the places where the type-checker would otherwise have to guess the type of variables, which is out of its reach. (In some type-systems, such as the Hindley-Milner type system used in the ML languages, there is no need for such annotations, because the type-system is simple enough to allow working around that impossible guess. Unfortunately, those workarounds do not scale to more complicated type systems − in particular to intersection types.) This doesn't mean that the type-checker won't be able to say anything about unannotated lambdas and let-bindings, just that most of the time the types it would infer would be too imprecise.

Lambdas

In order to type a function λx.e, the type-checker has to type the body e without knowing anything about x, and in doing so deduce the possible types for x. This is possible in ML thanks to unification, but unfortunately wouldn't work in our case.

As a consequence, the programmer will have to explicitly annotate the type of the argument, so that the type-checker can do its job. In fact, the programmer may even write several types for the argument, because function types aren't only arrows τ → σ, but also intersections of arrows ⋀i(τi → σi). So for each type τi of the argument, the type-checker will run to deduce a return type σi. Functions will then have a form like λx:(τ1;...;τn).e.

Recursive let-bindings

Recursive declarations are similar to lambdas in the sense that we use a variable before it is fully defined − hence before it is typed. So for the very same reason, we must annotate the variables that we define recursively.

Including gradual types into this

Those annotations seem like a reasonable trade-off, but they break compatibility with existing untyped code. The obvious solution is to add a default type annotation when none is given by the programmer, and this default must be the gradual type if we want the body to typecheck while the system stays sound.

So an unannotated function λx.e is just syntactic sugar for λx:?.e.

That's nice, but what do we do with it?

The essence of the gradual type is that a value of type ? can be used in place of anything, and any value can be used where the type ? is expected. So in this case, the function will work exactly as it would in the untyped world: the argument can be used as a value of any type inside the body, and any value can be passed as an argument.

(The concept of gradual typing also includes adding explicit casts at the boundaries between the typed and untyped worlds so that runtime type errors appear reliably at the right place, but as we do not touch the nix interpreter itself, this is out of scope for this project).

Working around the lack of polymorphism

The first use of gradual typing is a transitional one: we plan to use it to type legacy code, waiting for it to be rewritten in a typed way.

Another use, which I hope is also a transitional one, is to mitigate the weaknesses of the type system (waiting for it to be improved, of course). In particular, there is no polymorphism for now (because we don't know how to make polymorphism work with records and − ironically − gradual typing). This is rather annoying, mainly because some basic functions (head, tail, fix, …) are polymorphic (and typing them otherwise wouldn't make sense). There are people working on this, and hopefully a solution will land soon, but in the meantime we have to deal with the absence of polymorphism.

For some really basic and important functions, we may hardcode some typing rules to implement the missing polymorphism in an ad-hoc way. But as we can't do this for every function, an acceptable solution is to type the other ones gradually (so for example the identity function could be typed by ? → ?, and the head function by [? ?*] → ?). This is of course rather imprecise in some cases, but it nonetheless allows us to give a useful type to those polymorphic functions.
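For instance, with head typed gradually as [? ?*] → ?, both of the following applications would be accepted, each with the imprecise result type ? (a sketch):

head [ 1 2 3 ]
head [ "a" "b" ]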

Some examples

Let's now see some real nix code. This is an example from the first post of this blog.

Gradual typing of a function

x:
if isInteger x then
  toString x
else if isBool x then
  x
else
  raise "Invalid argument"

If we leave this as such, the type-checker will understand it as:

(x::?):
if isInteger x then
  toString x
else if isBool x then
  x
else
  raise "Invalid argument"

In this case, it will try to type the body under the assumption that x is of type ?, see that the return type may then be String or Bool, and thus type the whole function as ? → (String ∨ Bool).

With a type annotation

Now, suppose the programmer adds a type annotation to the argument:

(x::Int ∨ Bool):
if isInteger x then
  toString x
else if isBool x then
  x
else
  raise "Invalid argument"

The type-checker will now type the body under the assumption that x is of type Int ∨ Bool. As in the previous case, it will find the type String ∨ Bool, and deduce the type (Int ∨ Bool) → (String ∨ Bool) for the function.

That's cool, but we could do better: this is not the most precise type for the function. We would like the type-system to infer the type (Int → String) ∧ (Bool → Bool) instead.

With a multi-type annotation

Remember that somewhere earlier, I said you would sometimes need to annotate several types for the argument. This is intended exactly for those situations. If we just change the type annotation to this:

(x::Int; Bool):
if isInteger x then
  toString x
else if isBool x then
  x
else
  raise "Invalid argument"

the type-checker will try to infer the type of the body twice: a first time with the assumption x::Int (in which case it infers the return type String), and a second time with the assumption x::Bool (in which case it infers the return type Bool). Then it will take the intersection of those two results as the final type of the function, which here is (Int → String) ∧ (Bool → Bool), as expected.

All those reasonings also apply to recursive let-bindings (although the need to annotate several types in a let-binding is rather unlikely to arise).

Further thoughts

Patterns

For simplicity, I have elided everything related to patterns here by assuming that functions are of the form x:e. When the argument is matched against a pattern, it is of course possible to use this to get more information about the domain of the function, even without an annotation.

Concrete syntax for type annotations

For simplicity again, I assumed that the language was extended with type annotations, which obviously are not present in nix. So to keep compatibility, those will probably have to be written in a way that makes the nix parser happy, most probably inside comments (or we can try to add them to the official parser, which would be even better).
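For instance, an annotated lambda might look like this (the concrete syntax is entirely hypothetical):

x /*:: Int*/: x + 1

The nix parser sees a plain lambda with a comment in it, while the typechecker reads the annotation x :: Int.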

Non-recursive let-bindings

Unlike nix, nix-light has two separate let and let rec constructs. This is nice because the non-recursive version is trivial to type and doesn't require any type annotation. During the compilation from nix to nix-light, it is possible to translate let-bindings that do not use recursion into non-recursive ones, so that those let-bindings won't need to be annotated in nix either.
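A sketch of the distinction (the nix-light let rec syntax is hypothetical):

# no recursion: compiled to nix-light's non-recursive let,
# so no annotation on x is needed
let x = 1; in x + x

# genuinely recursive: stays a let rec, so f must be annotated
let f = n: if n == 0 then 1 else n * f (n - 1); in f 5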

April 25, 2017 12:00 AM

April 10, 2017

Typing nix

About records

Nix is all about records. They are everywhere in nixpkgs, and even the semantics of nix seems to have been designed with this idea of records everywhere as a first class construct. (In fact, I think that it would be more correct to think of nix as "json with lambda-calculus on top of it" rather than "a functional language with lists and records").

Thus we have to type records.

What are records?

Records (or "attribute sets" as they are called in nix) are a data structure associating expressions (of an arbitrary type) to strings. More precisely, a record is a finite map from strings (the "labels" of the record) to the set of all expressions (the "values" of the record).

There are two main uses of records:

Static records

The first use of records is when you manipulate some data with a known structure that you want to store in a structured way. In this case, you know (statically) the labels and the type of the value associated to each label.

For example, the fetchurl function from nixpkgs (defined in pkgs/build-support/fetchurl/default.nix) expects arguments of the form:

{
  url = "some_string"; # Or nothing
  urls = [ "a" "list" "of" "strings"]; # Or nothing
  curlOpts = "some_string"; # Or nothing
  name = "some_string"; # Or nothing
  sha256 = "some_string"; # Or nothing
  executable = some_boolean; # Or nothing
}

(we only show some of the fields as the whole definition doesn't matter here)

For that kind of use, the names of the fields do not carry any semantics; they are just there to help the programmer (and in most compiled languages with static records, the names of the fields are dropped at compile-time).

This is the structure of the nixos options set (except that there it is all done at runtime): all the option declarations generate a rigid structure into which the settings must fit. And (except for a few), the actual names of the options don't appear in the result of the evaluation.

Dynamic records (aka hashtables)

Another use of records is when we have some arbitrary data that we want to store with the possibility of retrieving it efficiently and easily, iterating over it, or passing it as a whole to another function... In that case, we may use records as hashtables − or as what functional programmers refer to as "maps" (with the restriction that the keys may only be strings). If the programmer doesn't want to shoot themselves in the foot, all the fields will likely be of the same type.

This is used in particular by some nixos options (the ones of type attrsOf α).

For example, the environment.etc option accepts as an argument a module whose fields will determine which files are copied (or symlinked) to /etc. By default, the label of each field will be used as the file name.
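For instance (a hypothetical entry − the label "resolv.conf" becomes the file name under /etc):

environment.etc."resolv.conf".text = "nameserver 127.0.0.1";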

How do we type them?

Those two uses of records are really different, and require different typing (except of course if someone ever manages to provide some typing rules which unify both and are usable in practice).

The problem is that in nix (and in most untyped languages), both are happily mixed together. They are mixed first because the same construct is used in both cases, with only semantic considerations allowing one to distinguish the intention of the programmer, and second because the same record is sometimes used at the same time as a static and as a dynamic one.

For example, the derivation builtin expects as an argument a record in which some fields ("name", "builder", "system", ...) are expected and have a special meaning, while all the others are just passed as environment variables.
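For example (a sketch − all the concrete names are arbitrary):

derivation {
  # fields with a special meaning:
  name = "example";
  system = "x86_64-linux";
  builder = "/bin/sh";
  args = [ "-c" "echo $someFlag > $out" ];
  # an arbitrary extra field, simply passed to the builder
  # as the environment variable someFlag:
  someFlag = "1";
}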

Fortunately, our formalism is powerful enough that we are still able to do some typing (and some useful typing!). The exact typing rules are on the github repository, but here is an informal writeup:

The base elements are atomic records (records with only one field), which are combined together to form bigger records. The rules for an atomic record r = { e1 = e2; }, where e2 has type τ, are:

  • If we can prove that e1 has value s, where s denotes a constant string (resp. that e1 can only take a finite set of values s1, ..., sn), then r will have type { s: τ; } (resp. { s1: τ; } | ... | { sn: τ; }).

  • If we can't determine a finite set of values for e1 (but we still know that it has type string), then we give the type { _: τ; } to r.
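To illustrate these two rules (a sketch, with the types written informally):

{ foo = 1; }        # : { foo: Int; } − the label is statically known
{ ${e} = 1; }       # : { a: Int; } | { b: Int; } if e can only be "a" or "b",
                    #   and { _: Int; } if e is only known to be a String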

Various combination rules explain how to type the various merges of records, depending on whether we know that the fields are distinct or not.

There are also some rules (not published yet) for several builtins which require ad-hoc typing, such as attrValues (which, applied to a record, returns the list of the values of its fields) or listToAttrs (which, applied to a list of key-value pairs, returns the corresponding record).

A note about nixos "types"

The nixos module system already has a notion of "types", so this new type system may seem redundant, at least for this part of nixpkgs (or may seem to make the old one obsolete).

I don't think either of those is likely to happen, given the possible benefits of a static type system on the one hand, and the fact that the runtime type system is at the core of the implementation of the current module system on the other. I think that if the module system had been implemented on top of a language with a rich enough type system − with type classes or an equivalent in particular − most of the runtime type system would have been useless (if not all of it). But the module system being built around it, replacing it would probably not be worth the benefit.

A small explanation of how both can play together may however be quite useful. In general, embedding a type system into a language isn't really nice for the host type-system, because the guest may prove things that the host can't prove even with some help, and thus will want to bypass the restrictions the host enforces. Fortunately, our type-system should be expressive enough to understand the inner type system, so the actual typing of nixos options can (for the most part) be embedded into the new one. And for the cases where that is not possible, the gradual part of the type system provides a nice way of bypassing the static type-system.

April 10, 2017 12:00 AM

April 09, 2017

Munich NixOS Meetup

Augsburger Linux-Infotag


The Augsburger Linux-Infotag is a one-day conference with around 20 talks and 9 workshops about Linux, open-source software and the digital society. Admission is free.
We will be there with a booth about NixOS :-)
https://www.luga.de/A...

Augsburg - Germany

Saturday, April 22 at 9:30 AM


https://www.meetup.com/Munich-NixOS-Meetup/events/239077440/

April 09, 2017 08:16 AM

Hackathon & Barbecue


Let's do another hackathon! Bring the Nix project you are currently working on or struggling with and discuss it with other people, followed by a barbecue in the evening. Please add the food you will bring to the barbecue to the pad: https://etherpad.wiki...

Augsburg - Germany

Monday, May 1 at 2:00 PM


https://www.meetup.com/Munich-NixOS-Meetup/events/239077247/

April 09, 2017 08:06 AM

April 03, 2017

Typing nix

Defining the language

Before we can work on the type system, the first step is to define the language in a way that will allow us to think about it. In particular, we wanted to simplify it as much as possible by removing everything that could be considered as syntactic sugar.

Nix-light

The language we came up with (which we will call nix-light, for lack of a better name) is (modulo some modifications) a subset of nix. You can find its grammar here and its (not fully written yet) semantics here. The − probably more up-to-date − sources are available here.

The main differences with plain nix are:

  • The various syntaxes (x.y, x."foo" or x.${e}) for accessing a record field are all reduced to the construct x.e, where e is an expression. The same holds for the different ways of defining a record field.

  • The definition of records is also simplified: the notation {e.e1 = y1; e.e2 = y2; …} is rewritten to its more explicit version { e = { e1 = y1; e2 = y2; …}; …}

  • The rec { ... } construct is removed as well, as it can be emulated by a let-binding (let-bindings being recursive).

  • The inherit keyword is also dropped, in favour of explicit "x" = x and "x" = e.x field declarations. As recursive records don't exist anymore, one can write "x" = x in a record definition without causing infinite recursion.
  • Some builtin functions are replaced by operators − the difference being that those functions were first-class in nix, while the operators aren't.

    This is due to the fact that those builtins (most probably) won't be typeable, but only their applications will. For example, the functionArgs builtin can't be given a type by itself, but given a function f, an ad-hoc typing rule for functionArgs will be able to give a type to functionArgs f.

    The list of those operators isn't fixed and will depend on what the type-system can handle.

  • A constructor Cons and a constant nil have been introduced to represent lists in a more conventional manner (see the sketch after this list). This isn't strictly speaking a simplification of the language, but it will most probably be necessary in order to type polymorphic lists, and it is far more pleasant to work with than the opaque (and imho impractical) list definition in nix. Two new patterns have also been introduced to perform matching on lists.
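For instance (the nix-light constructor syntax is hypothetical):

# the nix list
[ 1 2 3 ]
# becomes, in nix-light:
Cons 1 (Cons 2 (Cons 3 nil))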

Of course, this is still a work-in-progress, and all this is expected to change according to the evolutions of the work.

Some will be left behind...

Some people may have noticed that the immediate translation from nix to nix-light is not complete, in the sense that an arbitrary nix expression can't always be translated to an equivalent nix-light expression.

Merging of sub-records

The first problem occurs when the { x.y = z; } shortcut is used with dynamic field names.

For example, if e1 and e2 are arbitrary expressions the record

{
  ${e1}.x = 1;
  ${e2}.y = 2;
}

can't be converted to an expression in nix-light syntax, because we can't statically know whether it would be

{
  e1 = { "x" = 1; };
  e2 = { "y" = 2; };
}

or

{ e1 = { "x" = 1; "y" = 2; }; }

This case can be avoided by using a function like recursiveUpdate (defined in nixpkgs/lib/attrsets.nix), but with the rather annoying consequence of transforming something already hard to type into something even harder − the gist of this being "don't use dynamic record field names if you can avoid it, or the type-checker will bite you".
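A sketch of that workaround, applied to the example above:

lib.recursiveUpdate
  { ${e1} = { x = 1; }; }
  { ${e2} = { y = 2; }; }

When e1 and e2 evaluate to the same name, the two inner records get merged; otherwise they stay separate. But, as said above, this turns something already hard to type into something even harder.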

Recursive records with dynamic fields

As I told, the rec { ... } construct can be expressed using a let construct.

rec {
  l1 = e1;
  …
  ln = en;
}

is equivalent to

let
  l1 = e1;
  …
  ln = en;
in
{
  l1 = l1;
  …
  ln = ln;
}

Once again, the problem arises with dynamic field names:

rec {
  ${e1} = e'1;
  ${e2} = e'2;
}

This can't simply be converted like the previous example, because dynamic names aren't allowed in let-bindings.

My opinion on that problem is that this is something you should never ever do, regardless of typing constraints. IMHO this also holds most of the time for any use of dynamic fields. It would probably have been better if nix didn't allow them and had a proper map data structure for the (rare) cases where one is needed.

So I won't really consider this a problem if the program fails on such cases. But of course, if someone finds a solution to encode this in nix-light, I'll gladly take it.

Some small extras

Semantics

The semantics given for nix-light can quite easily be extended to nix, and can thus serve as a starting point if someone ever wants to formalize the language. Note, however, that this semantics is limited to what interests us for the purpose of type-checking. In particular, the derivation keyword is just considered as a function taking as argument a certain type of record and returning another type of record. The whole "writing to the store" stuff is ignored, as it has no bearing on typing.
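Concretely, derivation is treated as an ordinary function on records, along these lines (a sketch in the record-type notation used earlier; the exact output fields, such as outPath and drvPath, are an assumption of this example):

derivation : { name: String; system: String; _: ? } → { outPath: String; drvPath: String; _: ? }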

Lists

Toying with the nix language is also an excellent opportunity to try new stuff.

The introduction of list constructors and patterns is partially motivated by this (in fact, while the constructors will probably be very useful for typing, the patterns are just here because we find them far more consistent and logical). The goal is to see whether this raises any problems and whether it would be a good addition.

April 03, 2017 12:00 AM

March 31, 2017

Sander van der Burg

Substituting impure version specifiers in node2nix generated package compositions

In a number of previous blog posts, I have described node2nix, a tool that can be used to automatically integrate NPM packages into the Nix packages ecosystem. The biggest challenge in making this integration possible is the fact that NPM does dependency management in addition to build management -- NPM's dependency management properties conflict with Nix's purity principles.

Dealing with a conflicting dependency manager is quite simple from a conceptual perspective -- you must substitute it by a custom implementation that uses Nix to obtain all required dependencies. The remaining responsibilities (such as build management) are left untouched and still have to be carried out by the guest package manager.

Although conceptually simple, implementing such a substitution approach is much more difficult than expected. For example, in my previous blog posts I have described the following techniques:

  • Extracting dependencies. In addition to the package we intend to deploy with Nix, we must also include all its dependencies and transitive dependencies in the generation process.
  • Computing output hashes. In order to make package deployments deterministic, Nix requires that the output hashes of downloads are known in advance. As a result, we must examine all dependencies and compute their corresponding SHA256 output hashes. Some NPM projects have thousands of transitive dependencies that need to be analyzed.
  • Snapshotting versions. Nix uses SHA256 hash codes (derived from all inputs to build a package) to address specific variants or versions of packages whereas version specifiers in NPM package.json configurations are nominal -- they permit version ranges and references to external artifacts (such as Git repositories and external URLs).

    For example, a version range of >= 1.0.3 might resolve to version 1.0.3 today and to version 1.0.4 tomorrow. Translating a version range to a Nix package with a hash code identifier breaks the ability for Nix to guarantee that a package with a specific hash code yields a (nearly) bit identical build.

    To ensure reproducibility, we must snapshot the resolved version of these nominal dependency version specifiers (such as a version range) at generation time and generate the corresponding Nix expression for the resulting snapshot.
  • Simulating shared and private dependencies. In NPM projects, dependencies of a package are stored in the node_modules/ sub folder of the package. Each dependency can have private dependencies by putting them in their corresponding node_modules/ sub folder. Sharing dependencies is also possible by placing the corresponding dependency in any of the parent node_modules/ sub folders.

    Moreover, although this is not explicitly advertised as such, NPM implicitly supports cyclic dependencies and is able to cope with them, because it will refuse to install a dependency in a node_modules/ sub folder if any parent folder already provides it.

    When generating Nix expressions, we must replicate the exact same behaviour when it comes to private and shared dependencies. This is particularly important to cope with cyclic dependencies -- the Nix package manager does not allow them and we have to break any potential cycles at generation time.
  • Simulating "flat module" installations. In NPM versions older than 3.0, every dependency was installed privately by default unless a shared dependency exists that fits within the required version range.

    In newer NPM versions, this strategy has been reversed -- every dependency will be shared as much as possible until a conflict has been encountered. This means that we have to move dependencies as high up in the node_modules/ folder hierarchy as possible, which is an imperative operation -- in Nix this is a problem, because packages cannot be changed after they have been built.

    To cope with flattening, we must compute the implications of flattening the dependency structure in advance at generation time.

With the above techniques it is possible to construct a node_modules/ directory structure that is nearly identical to the one NPM would normally compose, with a high degree of accuracy.

Impure version specifiers


Even if it were possible to reproduce the node_modules/ directory hierarchy with 100% accuracy, another problem remains -- some version specifiers always trigger network communication, regardless of whether the dependencies have been provided, such as:


[
{ "node2nix": "latest" }
, { "nijs": "git+https://github.com/svanderburg/nijs.git#master" }
, { "prom2cb": "github:svanderburg/prom2cb" }
]

When referring to tags or Git branches, NPM is unable to determine to which version a package resolves. As a consequence, it attempts to retrieve the corresponding packages to investigate them, even when a compatible version already exists in the node_modules/ directory hierarchy.

While performing package builds, Nix takes various precautions to prevent side effects from influencing builds including network connections. As a result, an NPM package deployment will still fail despite the fact that a compatible dependency has already been provided.

In the package builder Nix expression provided by node2nix, I used to substitute these version specifiers in the package.json configuration files by a wildcard: '*'. Wildcards used to work fine for old Node.js 4.x/NPM 2.x installations, but with NPM 3.x flat module installations they became another big source of problems -- in order to make flat module installations work, NPM needs to know to which version a package resolves to determine whether it can be shared on a higher level in the node_modules/ folder hierarchy or not. Wildcards prevent NPM from making these comparisons, and as a result, some package deployments fail that did not use to fail with older versions of NPM.

Pinpointing version specifiers


In the latest node2nix I have solved these issues by implementing a different substitution strategy -- instead of substituting impure version specifiers by wildcards, I pinpoint all the dependencies to the exact version numbers to which these dependencies resolve. Internally, NPM addresses all dependencies by their names and version numbers only (this also has a number of weird implications, because it disregards the origins of these dependencies, but I will not go into detail on that).

I got the inspiration for this pinpointing strategy from the yarn package manager (an alternative to NPM developed by Facebook) -- when deploying a project with yarn, yarn pinpoints the installed dependencies in a so-called yarn.lock file so that package deployments become reproducible when a system is deployed for a second time.

The pinpointing strategy will always prevent NPM from consulting external resources (under the condition that we have provided the package by our substitute dependency manager first) and always provide version numbers for any dependency so that NPM can perform flat module installations. As a result, the accuracy of node2nix with newer versions of NPM has improved quite a bit.

Availability


The pinpointing strategy is part of the latest node2nix that can be obtained from the NPM registry or the Nixpkgs repository.

One month ago, I have given a talk about node2nix at FOSDEM 2017 summarizing the techniques discussed in my blog posts written so far. For convenience, I have embedded the slides into this web page:

by Sander van der Burg (noreply@blogger.com) at March 31, 2017 09:25 PM

March 30, 2017

Munich NixOS Meetup

NixOS 17.03 Release Party


NixOS 17.03 "Gorilla" will be the next release scheduled to be released on March 30, 2017. Time to party on March 31, 2017!

Meet other fellow Nix/NixOS users & developers and the 17.03 release manager for some drinks.

We will provide beer, Tschunks and non-alcoholic beverages on our rooftop terrace.

See release notes for more information about major changes and updates at http://nixos.org/nixo....

And of course the regular thank you to Eelco Dolstra for his tireless work on NixOS, Nix and all the projects around them. I'd like to thank Domen Kožar for his help in getting this release out smoothly and his regular work on NixOS, the security team for taking a lot of workload off the release manager by always making sure to keep our systems and packages secure, and also Mayflower for allowing Robin to work on NixOS a lot during working hours.

The art is from http://www.hasanlai.c... released under CC-BY-SA-4.0.

München - Germany

Friday, March 31 at 7:00 PM


https://www.meetup.com/Munich-NixOS-Meetup/events/238567023/

March 30, 2017 08:52 PM

March 28, 2017

Typing nix

Let's type nix!

So this is it, my internship has begun. As promised, I will try to post regular updates on my work on this blog.

Current work

I have been working (with my advisor) on the nix grammar: our goal was to define the simplest version of the grammar (by removing everything that can reasonably be considered syntactic sugar), which will be used as a basis for all the reasoning, and to define the subset of the language that we will support.

Overview of a possible type system for functions

We are of course still far away from an effective type system, but here is an overview of what it could look like (note that this is just an a priori overview and that the actual result may be totally different):

  • The types τ are unions, intersections and differences of basic types (int, bool, ...) and arrow ("function") types (and type variables α, β, … and universally quantified types ∀α.τ). Here "union", "intersection" and "difference" can be interpreted in a set-theoretic way: an expression e is of type τ|σ (τ "union" σ) if it is of type τ or of type σ. Likewise, it is of type τ & σ (τ "inter" σ) if it is of type τ and of type σ at the same time, and it is of type τ \ σ if it is of type τ and not of type σ.

  • A function whose argument can be of type τ or of type τ' and which returns a type σ will have type (τ | τ') -> σ. This type is equivalent to the type (τ -> σ) & (τ' -> σ). This means that this function can be considered as a function from τ to σ or as a function from τ' to σ.

For example, the function

x:
if isInteger x then
  toString (1 - x)
else if isBool x then
  toString (!x)
else
  raise "Invalid argument"

would have type (int|bool) -> string (or (int -> string) & (bool -> string), which is equivalent). Note that although the argument x has type (int|bool) outside the "if", the expressions 1-x and !x are both well typed. This requires the type-checker to be able to infer that in the first branch x can only be of type int (because isInteger x is true), and that in the second it can only be of type bool.

In the same vein, the function

x:
if isInteger x then
  toString x
else if isBool x then
  x
else
  raise "Invalid argument"

would have type (int -> string) & (bool -> bool).

Those types are slightly different from what one may find in an ML-like type system where both functions would be ill-typed, and thus will require a different formalism.

This formalism will also require the addition of gradual typing (i.e. adding a special type ? called the gradual type, which means "I don't know how to type this, so I rely on the programmer not to mess with it"), because not every piece of correct code can be typed, and nix expressions are full of correct but almost impossible to type constructions.

March 28, 2017 12:00 AM

March 20, 2017

Munich NixOS Meetup

NixOS 17.03 Release Sprint


The next stable NixOS release, 17.03 'Gorilla', is going to happen on the 31st of March. The goal of this sprint is to fix critical issues before the release. The release manager will also attend and will be available for guidance and feedback.

• Blocking issues: https://github.com/Ni...

• All 17.03 issues: https://github.com/Ni...

The sprint will be held at the Mayflower office in Munich on Saturday and Sunday starting at 11:00. Drinks will be provided.

The art is from http://www.hasanlai.c... released under CC-BY-SA-4.0.

München - Germany

Saturday, March 25 at 11:00 AM


https://www.meetup.com/Munich-NixOS-Meetup/events/238567006/

March 20, 2017 08:49 PM

March 14, 2017

Sander van der Burg

Reconstructing Disnix deployment configurations

In two earlier blog posts, I have described Dynamic Disnix, an experimental framework enabling self-adaptive redeployment on top of Disnix. The purpose of this framework is to redeploy a service-oriented system whenever the conditions of the environment change, so that the system can still meet its functional and non-functional requirements.

An important category of events that change the environment are machines that crash and disappear from the network -- when a disappearing machine used to host a crucial service, a system can no longer meet its functional requirements. Fortunately, Dynamic Disnix is capable of automatically responding to such events by deploying the missing components elsewhere.

Although Dynamic Disnix supports the recovery of missing services, there is one particular kind of failure I did not take into account. In addition to potentially crashing target machines that host the services of which a service-oriented system consists, the coordinator machine that initiates the deployment process and stores the deployment state could also disappear. When the deployment state gets lost, it is no longer possible to reliably update the system.

In this blog post, I will describe a new addition to the Disnix toolset that can be used to cope with these kinds of failures by reconstructing a coordinator machine's deployment configuration from the meta data stored on the target machines.

The Disnix upgrade workflow


As explained in earlier blog posts, Disnix requires three kinds of deployment models to carry out a deployment process: a services model capturing the components of which a system consists, an infrastructure model describing the available target machines and their properties, and a distribution model mapping services in the services model to target machines in the infrastructure model. By writing instances of these three models and running the following command-line instruction:

$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

Disnix will carry out all activities necessary to deploy the system: building the services and their intra-dependencies from source code, distributing the services and their intra-dependencies, and activating all services in the right order.
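For example, a minimal distribution model mapping one service to one machine could look like this (a sketch; the names nginx and test1 match the manifest example shown below):

{infrastructure}:

{
  nginx = [ infrastructure.test1 ];
}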

When changing any of the models and running the same command-line instruction again, Disnix attempts to upgrade the system by only rebuilding the aspects that have changed, and only deactivating the obsolete services and activating new services.

Disnix (as well as other Nix-related tools) attempts to optimize a redeployment process by only executing the steps that are required to reach a new deployment state. In Disnix, the building and distribution steps are optimized due to the fact that every package is stored in isolation in the Nix store, in which each package has a unique filename with a hash prefix, such as:

/nix/store/acv1y1zf7w0i6jx02kfa6gxyn2kfwj3l-firefox-48.0.2

As explained in a number of earlier blog posts, the hash prefix (acv1y1zf7w0i6jx02kfa6gxyn2kfwj3l...) is derived from all inputs used to build the package, including its source code, build script, and the libraries that it links to. That, for example, means that if we upgrade a system and none of the inputs of Firefox changes, we get an identical hash, and if such a package build already exists, we do not have to build it or transfer it from an external site again.

The building step in Disnix produces a so-called low-level manifest file that is used by tools executing the remaining deployment activities:

<?xml version="1.0"?>
<manifest version="1">
<distribution>
<mapping>
<profile>/nix/store/aiawhpk5irpjqj25kh6ah6pqfvaifm53-test1</profile>
<target>test1</target>
</mapping>
</distribution>
<activation>
<mapping>
<dependsOn>
<dependency>
<target>test1</target>
<container>process</container>
<key>d500194f55ce2096487c6d2cf69fd94a0d9b1340361ea76fb8b289c39cdc202d</key>
</dependency>
</dependsOn>
<name>nginx</name>
<service>/nix/store/aa5hn5n1pg2qbb7i8skr6vkgpnsjhlns-nginx-wrapper</service>
<target>test1</target>
<container>wrapper</container>
<type>wrapper</type>
<key>da8c3879ccf1b0ae34a952f36b0630d47211d7f9d185a8f2362fa001652a9753</key>
</mapping>
</activation>
<targets>
<target>
<properties>
<hostname>test1</hostname>
</properties>
<containers>
<mongo-database/>
<process/>
<wrapper/>
</containers>
<system>x86_64-linux</system>
<numOfCores>1</numOfCores>
<clientInterface>disnix-ssh-client</clientInterface>
<targetProperty>hostname</targetProperty>
</target>
</targets>
</manifest>

The above manifest file contains the following kinds of information:

  • The distribution element section maps Nix profiles (containing references to all packages implementing the services deployed to the machine) to target machines in the network. This information is used by the distribution step to transfer packages from the coordinator machine to a target machine.
  • The activation element section contains elements specifying which service to activate on which machine in the network including other properties relevant to the activation, such as the type plugin that needs to be invoked that takes care of the activation process. This information is used by the activation step.
  • The targets section contains properties of the machines in the network and is used by all tools that carry out remote deployment steps.
  • There is also an optional snapshots section (not shown in the code fragment above) that contains the properties of services whose state need to be snapshotted, transferred and restored in case their location changes.

When a Disnix (re)deployment process completes successfully, Disnix stores the above manifest as a Disnix coordinator Nix profile on the coordinator machine for future reference, with the purpose of optimizing the next upgrade -- when redeploying a system, Disnix will compare the newly generated manifest with the previously deployed instance, and only deactivate services that have become obsolete and activate services that are new, making upgrades more efficient than fresh installations.

Unfortunately, when the coordinator machine storing the manifests gets lost, the deployment manifests get lost as well. As a result, a system can no longer be reliably upgraded -- without deactivating obsolete services, newly deployed services may conflict with services that are already running on the target machines, preventing the system from working properly.

Reconstructible manifests


Recently, I have modified Disnix in such a way that the deployment manifests on the coordinator machine can be reconstructed. Each Nix profile that Disnix distributes to a target machine includes a so-called profile manifest file, e.g. /nix/store/aiawhpk5irpjqj25kh6ah6pqfvaifm53-test1/manifest. Previously, this file only contained the Nix store paths to the deployed services, and was primarily used by the disnix-query tool to display the installed set of services per machine.

In the latest Disnix, I have changed the format of the profile manifest file to contain all the required meta data, so that the activation mappings can be reconstructed on the coordinator machine:

stafftracker
/nix/store/mi7dn2wvwvpgdj7h8xpvyb04d1nycriy-stafftracker-wrapper
process
process
d500194f55ce2096487c6d2cf69fd94a0d9b1340361ea76fb8b289c39cdc202d
false
[{ target = "test2"; container = "process"; _key = "4827dfcde5497466b5d218edcd3326327a4174f2b23fd3c9956e664e2386a080"; } { target = "test2"; container = "process"; _key = "b629e50900fe8637c4d3ddf8e37fc5420f2f08a9ecd476648274da63f9e1ebcc"; } { target = "test1"; container = "process"; _key = "d85ba27c57ba626fa63be2520fee356570626674c5635435d9768cf7da943aa3"; }]

The above code fragment shows a portion of the profile manifest. It has a line-oriented structure in which every 7 lines represent the properties of one deployed service. The first line denotes the name of the service, the second line the Nix store path, the third line the Dysnomia container, the fourth line the Dysnomia type, the fifth line the hash code derived from all its properties, the sixth line whether the attached state must be managed by Disnix, and the seventh line an encoding of the inter-dependencies.

The other portions of the deployment manifest can be reconstructed as follows: the distribution section can be derived by querying the Nix store paths of the installed profiles on the target machines, the snapshots section by checking which services have been marked as stateful, and the targets section can be directly derived from a provided infrastructure model.

With the augmented data in the profile manifests on the target machines, I have developed a tool named disnix-reconstruct that can reconstruct a deployment manifest from all the meta data the manifests on the target machines provide.

I can now, for example, delete all the deployment manifest generations on the coordinator machine:

$ rm /nix/var/nix/profiles/per-user/sander/disnix-coordinator/*

and reconstruct the latest deployment manifest, by running:

$ disnix-reconstruct infrastructure.nix

The above command resolves the full paths to the Nix profiles on the target machines, then downloads their intra-dependency closures to the coordinator machine, reconstructs the deployment manifest from the profile manifests and finally installs the generated deployment manifest.

If the above command succeeds, then we can reliably upgrade a system again with the usual command-line instruction:

$ disnix-env -s services.nix -i infrastructure.nix -d distribution.nix

Extending the self-adaptive deployment framework


In addition to reconstructing deployment manifests that have gone missing, disnix-reconstruct offers another benefit -- the self-adaptive redeployment framework described in the two earlier blog posts is capable of responding to various kinds of events, including redeploying services to other machines when a machine crashes and disappears from the network.

However, when a machine disappears from the network, Disnix no longer knows about its configuration. If such a machine reappears in the network at a later point in time, this could have disastrous results.

Fortunately, by adding disnix-reconstruct to the framework we can solve this issue:


As shown in the above diagram, whenever a change in the infrastructure is detected, we reconstruct the deployment manifest so that Disnix knows which services are deployed to it. Then when the system is being redeployed, the services on the reappearing machines can also be upgraded or undeployed completely, if needed.

The automatic reconstruction feature can be used by providing the --reconstruct parameter to the self adapt tool:


$ dydisnix-self-adapt -s services.nix -i infrastructure.nix -q qos.nix \
--reconstruct

Conclusion


In this blog post, I have described the latest addition to Disnix: disnix-reconstruct that can be used to reconstruct the deployment manifest on the coordinator machine from meta data stored on the target machines. With this addition, we can still update systems if the coordinator machine gets lost.

Furthermore, we can use this addition in the self-adaptive deployment framework to deal with reappearing machines that already have services deployed to them.

Finally, besides developing disnix-reconstruct, I have reached another stable point. As a result, I have decided to release Disnix 0.7. Consult the Disnix homepage for more information.

by Sander van der Burg (noreply@blogger.com) at March 14, 2017 11:02 PM

March 02, 2017

Munich NixOS Meetup

guix/guixSD: Talk & Discussion


John Darrington will give a talk about Guix and GuixSD, a package manager and a Linux distribution that are based on concepts similar to those of Nix/NixOS.

The talk will be in English.

Augsburg - Germany

Wednesday, March 29 at 7:00 PM


https://www.meetup.com/Munich-NixOS-Meetup/events/237831744/

March 02, 2017 08:16 AM