Commit 1641acd8 authored by Simon Tournier

Add draft blog post.

* drafts/ New file.
parent bfd7f52c
title: DRAFT When Docker images become fixed-point
date: 2021-10-22 14:00:00
author: Simon Tournier
tags: reproducibility
Docker images are
right? They lack transparency and it is hard, if not impossible, to tell what is
strawberry and what is whale oil, right? Although containers are an efficient way to ship
*things*, the core question is how these *things* are produced.
The aim of this post is to demonstrate that the issue is not Docker
images themselves. Instead, the concrete question when speaking about
*reproducibility* is: where do the binaries come from, and which tool
supplies them?
This scenario was initially written as a comment when reviewing
## Alice generates
Alice is working on a standard scientific stack using Python. She
stores alongside her project the file `manifest.scm`, containing the
package set, and the file `channels.scm`, containing the state of Guix (in
other words, its version). Having these two files allows her to replay the
exact same computational environment using `guix time-machine`.
Concretely, `manifest.scm` reads,
and [guix describe -f
(list (channel
(name 'guix)
(url "")
"BBB0 2DDF 2CEA F6A8 0D1D E643 A2A0 6DF2 A33A 54FA")))))
So far, so good. Because Alice needs to run this stack on an infrastructure
that runs Docker rather than Guix, she simply packs her
scientific stack with something along these lines,
$ guix pack -f docker --save-provenance -m manifest.scm
The next step depends on the setup. One solution is to load the generated
tarball locally using Docker tools, along these lines,
$ docker load < /gnu/store/6rga6pz60di21mn37y5v3lvrwxfvzcz9-python-python-numpy-docker-pack.tar.gz
Loaded image: python-python-numpy:latest
$ docker images
python-python-numpy latest ea2d5e62b2d2 51 years ago 431MB
then `docker push` it to a convenient registry. The second solution is to
transfer the tarball, like any other data, to the other infrastructure
and run the same Docker commands over there.
For the sake of the demonstration, on the other machine, it just works:
$ docker run -ti python-python-numpy:latest python3
Python 3.8.2 (default, Jan 1 1970, 00:00:01)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> A = np.array([[1,0,1],[0,1,0],[0,0,1]])
>>> _, s, _ = np.linalg.svd(A); s; abs(s[0] - 1./s[2])
array([1.61803399, 1. , 0.61803399])
>>> quit()
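The singular values printed in this session can be cross-checked by hand. The following is a minimal sketch using only the Python standard library, not what NumPy does internally: it relies on the fact that the singular values of `A` are the square roots of the eigenvalues of `AᵀA`, which for this particular matrix reduces to a 2×2 block. The function name `singular_values` is ours, chosen for illustration.

```python
import math

# Matrix from the session: A = [[1,0,1],[0,1,0],[0,0,1]].
# A^T A = [[1,0,1],[0,1,0],[1,0,2]]: the middle row/column decouples with
# eigenvalue 1, leaving the symmetric 2x2 block [[1,1],[1,2]].
def singular_values():
    """Singular values of A, in decreasing order (illustrative helper)."""
    trace, det = 3.0, 1.0  # trace and determinant of [[1,1],[1,2]]
    disc = math.sqrt(trace * trace - 4.0 * det)
    eig_max = (trace + disc) / 2.0
    eig_min = (trace - disc) / 2.0
    return [math.sqrt(eig_max), 1.0, math.sqrt(eig_min)]

s = singular_values()
print([round(x, 8) for x in s])  # → [1.61803399, 1.0, 0.61803399]
print(abs(s[0] - 1.0 / s[2]))    # tiny: the largest value is the inverse of the smallest
```

The largest singular value is the golden ratio and the smallest is its inverse, which is what the `abs(s[0] - 1./s[2])` check in the session probes.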
On a side note, the Docker image is produced directly by Guix. In other words,
Guix manages everything, from the binary packages and all their requirements to
the Docker image itself; no `Dockerfile` is involved. Here, the
Docker image is just one container format among many others. For instance, `guix
pack -f squashfs --save-provenance -m manifest.scm` will generate a
Singularity image (another container format)
with the exact same binaries inside.
## Bob redoes later and elsewhere
Bob works with Alice's Docker image. He needs to run these exact same
versions on another infrastructure using plain relocatable tarballs, for
example. Or he needs to scrutinize how all the binaries in this stack were
produced, because maybe he found a bug and wants to know whether all the results
obtained with this Docker image are correct, or maybe he wants to study
a specific aspect to better understand a specific result. Well, Bob is doing
Science and thus Bob needs transparency.
The files `manifest.scm` and `channels.scm` sadly disappeared a long time ago,
probably at the end of Alice's postdoc. If the Docker image had been
produced with a `Dockerfile`, then game over! Or at least a hard time, depending on
which base image the `Dockerfile` had used; for instance, have a look at
Debian snapshot and
Fortunately, Bob remembers that this Docker image was produced with Guix
(`pack --save-provenance`). Let's get the recipe of the smoothie.
Here is the trick! First, start a container, which makes it easy to export a
plain tarball. Second, extract the embedded Guix profile.
$ docker run -d python-python-numpy:latest python3
$ docker export -o /tmp/re-pack.tar $(docker ps -a --format "{{.ID}}"| head -n1)
$ tar -xf /tmp/re-pack.tar $(tar -tf /tmp/re-pack.tar | grep 'profile/manifest')
$ tree gnu
gnu
└── store
    └── ia1sxr3qf3w9dj7y48rwvwyx289vpfgi-profile
        └── manifest

2 directories, 1 file
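The `tar -tf … | grep` extraction step can also be scripted. Here is a minimal sketch with Python's standard `tarfile` module, assuming the exported archive is the `/tmp/re-pack.tar` from above; the helper name `extract_profile_manifest` is hypothetical and only mirrors the substring match done by `grep 'profile/manifest'`.

```python
import tarfile

def extract_profile_manifest(tar_path, dest="."):
    """Extract every archive member whose path contains 'profile/manifest'.

    Hypothetical helper: it reproduces the substring match of
    `grep 'profile/manifest'` and returns the extracted member names.
    """
    with tarfile.open(tar_path) as tar:
        wanted = [m for m in tar.getmembers() if "profile/manifest" in m.name]
        for member in wanted:
            tar.extract(member, path=dest)
        return [m.name for m in wanted]
```

Calling `extract_profile_manifest("/tmp/re-pack.tar")` from the working directory should leave the same `gnu/store/…-profile/manifest` tree as the shell commands.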
Wow! Is it really a regular profile? Yes, it is!
$ guix package -p gnu/store/ia1sxr3qf3w9dj7y48rwvwyx289vpfgi-profile --export-channels
;; This channel file can be passed to 'guix pull -C' or to
;; 'guix time-machine -C' to obtain the Guix revision that was
;; used to populate this profile.
(name 'guix)
(url "")
"BBB0 2DDF 2CEA F6A8 0D1D E643 A2A0 6DF2 A33A 54FA"))))
$ guix package -p gnu/store/ia1sxr3qf3w9dj7y48rwvwyx289vpfgi-profile --export-manifest
;; This "manifest" file can be passed to 'guix package -m' to reproduce
;; the content of your profile. This is "symbolic": it only specifies
;; package names. To reproduce the exact same profile, you also need to
;; capture the channels being used, as returned by "guix describe".
;; See the "Replicating Guix" section in the manual.
(specifications->manifest
  (list "python" "python-numpy"))
Awesome, isn't it? These last two outputs are equivalent to Alice's
`channels.scm` and `manifest.scm`. In other words, once they are saved as
`new-channels.scm` and `new-manifest.scm`, Bob can run this whenever and
wherever,
$ guix time-machine -C new-channels.scm \
-- pack -f docker --save-provenance -m new-manifest.scm
and it should produce the exact same `docker-pack.tar.gz` as previously. If not,
raise your hand and open a bug.
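One way to check the "exact same" claim is to compare checksums of the original and the regenerated tarballs. The sketch below uses only the standard `hashlib` module; the helper names `sha256_of` and `same_tarball` are ours, and the two paths are whatever store paths the two `guix pack` runs returned.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_tarball(path_a, path_b):
    """True when both files are bit-for-bit identical (same SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

If `same_tarball` returns `False` for Alice's tarball and Bob's rebuild, that is the case worth reporting.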
Join the fun, join us!