Ship one software tree to every node
CVMFS is a read-only, content-addressed filesystem that distributes software over plain HTTP.
Build the stack once on a Stratum-0, publish it, and every compute node mounts /cvmfs/software.ndexr.io on demand: no per-node installs, no rsync, no drift.
How it works
Four tiers. The source is the only writable one — everything below it is a signed, cacheable HTTP mirror.
Stratum-0 (the source)
One authoritative server holds the master copy. You publish into it inside a transaction; it signs and content-addresses every file. This is the only writable tier.
Stratum-1 (replicas)
Read-only HTTP mirrors of the Stratum-0. They survive the source going down and put bytes closer to the cluster. Optional for a single-site setup — add them when you scale out.
Squid proxy
A shared forward-cache in front of the cluster. The first node to ask for a file pulls it over HTTP; every other node gets it from local Squid. This is what makes CVMFS cheap at HPC scale.
Client cache (FUSE)
Each node mounts /cvmfs read-only via autofs + FUSE. Files are fetched on demand, verified against their signed catalog, and cached on local disk. Nothing is pre-staged; nothing is writable.
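To make "content-addressed" concrete: each object is stored under a path derived from its hash, so identical files are stored once and any HTTP layer can cache them. The sketch below shows that mapping, assuming the usual data/<first two hex digits>/<remaining digits> layout. The helper name cvmfs_object_path and the sample hash are illustrative and not part of the CVMFS tooling.

```shell
# Map a content hash to its relative object path (illustrative helper).
# Assumes the data/<2 hex digits>/<remaining digits> layout; the hash is a made-up sample.
cvmfs_object_path() {
    hash=$1
    prefix=$(printf '%s' "$hash" | cut -c1-2)
    rest=$(printf '%s' "$hash" | cut -c3-)
    printf 'data/%s/%s\n' "$prefix" "$rest"
}

cvmfs_object_path 0123456789abcdef
# prints data/01/23456789abcdef
```

Because the path depends only on the content, a Squid or Stratum-1 never needs to invalidate an object: a changed file simply gets a new path.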
Acquire, install & connect a compute node
Run on every machine that should see the stack — login nodes, workers, your laptop.
Add the CERN package repo & install the client
Pull the release package that registers CVMFS's apt or yum repo, then install the client. Run only the block that matches your distribution.
# Debian / Ubuntu (apt)
wget https://cvmrepo.s3.cern.ch/cvmrepo/apt/cvmfs-release-latest_all.deb
sudo dpkg -i cvmfs-release-latest_all.deb
rm -f cvmfs-release-latest_all.deb
sudo apt-get -y update
sudo apt-get -y install cvmfs
# RHEL family (yum)
sudo yum install -y https://cvmrepo.s3.cern.ch/cvmrepo/yum/cvmfs-release-latest.noarch.rpm
sudo yum install -y cvmfs
- CVMFS binary is on PATH.
  command -v cvmfs2
  Expect: /usr/bin/cvmfs2
- Client reports a version.
  cvmfs2 --version
  Expect: Compiler: ... CernVM-FS version 2.x.x
Wire up autofs
This installs the autofs map so anything under /cvmfs/ mounts on first access and unmounts when idle.
sudo cvmfs_config setup
- Setup reports no problems.
  cvmfs_config chksetup
  Expect: OK
- autofs is active.
  systemctl is-active autofs
  Expect: active
Install the ndexr public key
CVMFS verifies every catalog signature against the repository's public key. Drop ours into the keys directory so the client trusts software.ndexr.io.
sudo mkdir -p /etc/cvmfs/keys/ndexr.io
sudo curl -fsSL https://cvmfs.ndexr.io/keys/software.ndexr.io.pub \
-o /etc/cvmfs/keys/ndexr.io/software.ndexr.io.pub
The .pub file is the public half of the key generated on the Stratum-0 in the server section below. It is safe to publish; the private .masterkey never leaves the source.
- Key is present and non-empty.
  test -s /etc/cvmfs/keys/ndexr.io/software.ndexr.io.pub && echo present
  Expect: present
- It is a valid PEM public key.
  head -1 /etc/cvmfs/keys/ndexr.io/software.ndexr.io.pub
  Expect: -----BEGIN PUBLIC KEY-----
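The two key checks can be folded into one reusable test for provisioning scripts. This is a minimal sketch; is_pem_public_key is a hypothetical helper name, and it inspects only the header line, so it catches an empty or HTML error-page download but does not validate the key material itself.

```shell
# Return 0 if the file is non-empty and its first line is a PEM public-key header.
# Illustrative helper; it does not parse or verify the key body.
is_pem_public_key() {
    [ -s "$1" ] || return 1
    [ "$(head -n 1 "$1")" = "-----BEGIN PUBLIC KEY-----" ]
}

# Usage on a client, with the key path from the step above:
# is_pem_public_key /etc/cvmfs/keys/ndexr.io/software.ndexr.io.pub && echo "key looks sane"
```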
Point the client at our server
Two files: a domain config that tells the client where the ndexr.io Stratum lives, and the global default.local that selects which repos to mount and how to cache.
# /etc/cvmfs/domain.d/ndexr.io.conf
CVMFS_SERVER_URL="http://cvmfs.ndexr.io/cvmfs/@fqrn@"
CVMFS_KEYS_DIR=/etc/cvmfs/keys/ndexr.io
CVMFS_USE_GEOAPI=no
# /etc/cvmfs/default.local
CVMFS_REPOSITORIES=software.ndexr.io
CVMFS_CLIENT_PROFILE=single # 'single' = no shared Squid; use a proxy at scale
CVMFS_HTTP_PROXY=DIRECT # or "http://squid.ndexr.io:3128"
CVMFS_QUOTA_LIMIT=10000 # local cache cap, MB
@fqrn@ expands to the fully-qualified repo name. Use CVMFS_CLIENT_PROFILE=single on a laptop or a one-box test; on a real cluster point CVMFS_HTTP_PROXY at a shared Squid so nodes share one cache.
- The effective config resolves our server URL.
  cvmfs_config showconfig software.ndexr.io | grep CVMFS_SERVER_URL
  Expect: CVMFS_SERVER_URL=http://cvmfs.ndexr.io/cvmfs/software.ndexr.io
- The repo is in the mount list.
  cvmfs_config showconfig software.ndexr.io | grep CVMFS_REPOSITORIES
  Expect: CVMFS_REPOSITORIES=software.ndexr.io
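The @fqrn@ substitution is plain text replacement, which is why one domain config can serve every repository under ndexr.io. The sketch below reproduces it outside the client, for example to build the URL for a curl check. expand_fqrn is a hypothetical helper; the real expansion happens inside the CVMFS client.

```shell
# Expand the @fqrn@ placeholder for one repository (illustrative helper).
# Assumes the repo name contains no '|' or '&', which would confuse sed.
expand_fqrn() {
    template=$1
    repo=$2
    printf '%s\n' "$template" | sed "s|@fqrn@|$repo|g"
}

expand_fqrn 'http://cvmfs.ndexr.io/cvmfs/@fqrn@' software.ndexr.io
# prints http://cvmfs.ndexr.io/cvmfs/software.ndexr.io
```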
Verify the mount
With config in place, the first access mounts the repo. These three checks together prove the full chain works: trust anchor, HTTP reachability, signature validation, and FUSE mount.
- Probe succeeds end-to-end (key + network + signature).
  cvmfs_config probe software.ndexr.io
  Expect: Probing /cvmfs/software.ndexr.io... OK
- The tree is readable and non-empty.
  ls /cvmfs/software.ndexr.io
  Expect: x86_64-pc-linux-gnu/ ...
- Mount stats show a revision and the proxy in use.
  cvmfs_config stat software.ndexr.io
  Expect: VERSION REVISION ... PROXY ONLINE
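Since cvmfs_config stat prints a header row followed by a value row, a script can select a field by column name rather than by position. This is a small sketch that assumes that two-line layout and values without embedded spaces. stat_field is a hypothetical helper, not a CVMFS command.

```shell
# Print the value under a named column from header-plus-values output on stdin.
# Illustrative helper; assumes row 1 is the header and row 2 the values,
# with no spaces inside any single value.
stat_field() {
    awk -v want="$1" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == want) col = i; next }
        NR == 2 && col { print $col }
    '
}

# Usage on a client:
# cvmfs_config stat software.ndexr.io | stat_field REVISION
```

Looking fields up by name keeps monitoring scripts working if a client release adds or reorders columns.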
Stand up the Stratum-0 source
Do this once, on the hpc.ndexr.io box, which already owns /gentoo and the build control plane. Apache serves /srv/cvmfs locally; the public cvmfs.ndexr.io nginx vhost reverse-proxies it (CVMFS HTTP is unauthenticated, so it sits outside the admin gate). Everything published here flows to every client above.
Install the server package + Apache
The Stratum-0 needs the cvmfs-server tools and a web server to expose /srv/cvmfs. Keep >50 GiB free on /var/spool/cvmfs for the scratch area.
# Debian / Ubuntu (apt)
wget https://cvmrepo.s3.cern.ch/cvmrepo/apt/cvmfs-release-latest_all.deb
sudo dpkg -i cvmfs-release-latest_all.deb && rm -f cvmfs-release-latest_all.deb
sudo apt-get -y update
sudo apt-get -y install cvmfs-server apache2
# RHEL family (yum)
sudo yum install -y https://cvmrepo.s3.cern.ch/cvmrepo/yum/cvmfs-release-latest.noarch.rpm
sudo yum install -y cvmfs-server httpd
sudo systemctl enable --now httpd
- Server tools report a version.
  cvmfs_server --version
  Expect: cvmfs_server 2.x.x
- The web server answers locally.
  curl -sI http://localhost/cvmfs/ | head -1
  Expect: HTTP/1.1 200 OK
Create the repository
mkfs creates the repo, generates the master + repository keys in /etc/cvmfs/keys/, and leaves an empty, signed /cvmfs/software.ndexr.io ready to publish into. The -o owner runs the publish transactions without root.
sudo cvmfs_server mkfs -o $USER software.ndexr.io
Back up /etc/cvmfs/keys/software.ndexr.io.masterkey immediately and keep it offline. Lose it and no client can ever trust a new revision of this repo. The .pub half is what clients fetch in client step 3.
- The repo is listed locally.
  cvmfs_server list
  Expect: software.ndexr.io (stratum0 / local)
- Both key halves exist.
  ls /etc/cvmfs/keys/software.ndexr.io.{pub,masterkey,crt,key}
  Expect: ...pub ...masterkey ...crt ...key
- Fresh repo is healthy at revision 1.
  cvmfs_server check software.ndexr.io
  Expect: Tag 'trunk' ... revision 1 ... OK
Publish in a transaction
Open a transaction, write into /cvmfs/software.ndexr.io like an ordinary directory, then publish. The new revision is signed and visible to clients atomically; no half-written state is ever served. abort rolls back.
cvmfs_server transaction software.ndexr.io
# ... copy / build software into the tree ...
cp -a build-output/* /cvmfs/software.ndexr.io/
cvmfs_server publish software.ndexr.io # commit + sign
# cvmfs_server abort software.ndexr.io # throw the transaction away
- Publish advanced the revision counter.
  cvmfs_server check software.ndexr.io | grep -i revision
  Expect: ... revision 2 ... OK
- The new content is visible on the source mount.
  ls /cvmfs/software.ndexr.io
  Expect: your published tree
- No transaction is left open.
  cvmfs_server list
  Expect: software.ndexr.io (stratum0 / local), with no 'in transaction' flag
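The transaction, publish and abort pattern lends itself to a wrapper that never leaves a transaction open when the copy step fails. The sketch below is one way to do it, under stated assumptions: publish_or_abort is a hypothetical helper, and it relies on the -f flag of cvmfs_server abort to skip the confirmation prompt.

```shell
# Open a transaction, run the given command, then publish on success
# or abort on failure (illustrative wrapper around cvmfs_server).
publish_or_abort() {
    repo=$1
    shift
    cvmfs_server transaction "$repo" || return 1
    if "$@"; then
        cvmfs_server publish "$repo"
    else
        # -f aborts without asking for confirmation
        cvmfs_server abort -f "$repo"
        return 1
    fi
}

# Usage on the Stratum-0:
# publish_or_abort software.ndexr.io cp -a build-output/. /cvmfs/software.ndexr.io/
```

A failed copy then ends in a clean rollback, so the "no transaction is left open" check keeps passing.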
Keep the whitelist alive
The repo whitelist is short-lived by design. Re-sign on a timer so clients never see an expired trust chain — a weekly cron is plenty.
cvmfs_server resign software.ndexr.io
# /etc/cron.d/cvmfs-resign
0 3 * * 0 root cvmfs_server resign software.ndexr.io
- The whitelist has a future expiry.
  cvmfs_server check software.ndexr.io 2>&1 | grep -i whitelist
  Expect: whitelist valid until <date> OK
- The cron job is installed.
  test -f /etc/cron.d/cvmfs-resign && echo present
  Expect: present
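The expiry can also be read from outside the Stratum-0, which is handy for monitoring from any host that can reach the public URL. The sketch assumes the whitelist is served as .cvmfswhitelist next to the repository and stores its expiry on a line of the form E<YYYYMMDDhhmmss>. Treat both the file layout and the helper name whitelist_expiry as assumptions to verify against your own repo.

```shell
# Print the expiry timestamp (YYYYMMDDhhmmss) from a whitelist read on stdin.
# Assumption: the expiry sits on its own line as E followed by 14 digits.
whitelist_expiry() {
    LC_ALL=C sed -n 's/^E\([0-9]\{14\}\)$/\1/p' | head -n 1
}

# Usage from any host:
# curl -s http://cvmfs.ndexr.io/cvmfs/software.ndexr.io/.cvmfswhitelist | whitelist_expiry
```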
Publish the HPC Gentoo Prefix stack
The whole point: the Gentoo Prefix tree from hpc.ndexr.io is self-contained under one directory. That is exactly what CVMFS distributes best.
On the Stratum-0 (the hpc.ndexr.io box)
The source lives on the same box that runs hpc.ndexr.io. Publish under <triple>/<YYYY.MM>/ so every arch and every build is addressable, and flip current only after a clean publish: old jobs keep resolving while new ones move forward.
REPO=software.ndexr.io
T=$(gcc -dumpmachine); V=$(date +%Y.%m)
DEST=/cvmfs/$REPO/$T/$V
cvmfs_server transaction "$REPO"
# rsync creates only the last path component, so create the parents first
mkdir -p "$DEST/eb"
rsync -a --delete /gentoo/ "$DEST/prefix/" # Gentoo Prefix userland
rsync -a --delete "$EB_PREFIX/software/" "$DEST/eb/software/" # EasyBuild builds
rsync -a --delete "$EB_PREFIX/modules/" "$DEST/eb/modules/" # modulefiles
rsync -a --delete /opt/lmod/ "$DEST/lmod/" # Lmod itself
ln -sfn "$V" "/cvmfs/$REPO/$T/current"
cvmfs_server publish "$REPO"
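Because every node resolves current, a dangling link would break the whole cluster at once. A small guard can refuse to repoint it at a version directory that does not exist. This is a sketch only: flip_current is a hypothetical helper, and its first argument is the per-arch directory, for example /cvmfs/software.ndexr.io/<triple>.

```shell
# Point <base>/current at <version>, but only if <base>/<version> exists.
# Illustrative helper; run it inside an open transaction on the Stratum-0.
flip_current() {
    base=$1
    version=$2
    if [ ! -d "$base/$version" ]; then
        echo "refusing to flip: $base/$version does not exist" >&2
        return 1
    fi
    # Relative target, so the link resolves the same way on every client
    ln -sfn "$version" "$base/current"
}

# Usage inside the transaction shown above:
# flip_current "/cvmfs/$REPO/$T" "$V"
```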
On every compute node
No install step: the node already mounts /cvmfs. Point at current for the running arch, then module load anything EasyBuild built, byte-identical to the source.
ROOT=/cvmfs/software.ndexr.io/$(gcc -dumpmachine)/current
# raw prefix toolchain (GCC 13, OpenBLAS, R, Python)
export PATH="$ROOT/prefix/usr/bin:$ROOT/prefix/bin:$PATH"
# or the EasyBuild module tree via Lmod
source $ROOT/lmod/lmod/init/bash
module use $ROOT/eb/modules/all
module load R
Only the objects that differ between 2026.06 and the next revision move over the wire. A rebuild that touches ten packages costs ten packages of transfer, cluster-wide, not a full re-stage.
The <triple> split lets a second arch publish into the same repo without collision.
Prove the two servers agree
A publish is only real once a client can read the exact revision the source signed. These three reads — at the source, over public HTTP, and on a node — must report the same revision number. If they diverge, you have caught a broken publish, a stale cache, or a reverse-proxy serving an old copy.
| Vantage point | Read the revision | Expect |
|---|---|---|
| 1 · Source (Stratum-0 truth) | cvmfs_server check software.ndexr.io \| grep -i revision | the revision you just published, e.g. revision 2 |
| 2 · HTTP (what the world sees) | curl -s http://cvmfs.ndexr.io/cvmfs/software.ndexr.io/.cvmfspublished \| grep -a '^S' | S2 (the S-line is the published revision) |
| 3 · Client (a real node) | cvmfs_config stat -v software.ndexr.io \| grep -i revision | same number, after cvmfs_config reload if the node cached an older revision |
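The three reads can be reduced to bare numbers and compared mechanically. The sketch below covers the HTTP vantage point and the comparison itself. published_revision and revisions_agree are hypothetical helpers; the parser only relies on the S-line described in the table.

```shell
# Extract the revision from a .cvmfspublished manifest read on stdin.
# The manifest ends in a binary signature, so match only a clean S<digits> line.
published_revision() {
    LC_ALL=C sed -n 's/^S\([0-9][0-9]*\)$/\1/p' | head -n 1
}

# Succeed only when all three revision numbers are non-empty and equal.
revisions_agree() {
    [ -n "$1" ] && [ "$1" = "$2" ] && [ "$2" = "$3" ]
}

# Usage, with the source and client numbers gathered as in the table:
# http_rev=$(curl -s http://cvmfs.ndexr.io/cvmfs/software.ndexr.io/.cvmfspublished | published_revision)
# revisions_agree "$source_rev" "$http_rev" "$client_rev" || echo "revisions diverge"
```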
Content parity, not just revision
A matching revision proves the catalog moved. As a quick content check, diff the directory listing the source published against what a node resolves; matching names confirm both sides resolve the same tree, and the client verifies file contents against the signed catalog when it fetches them.
diff \
<(ssh stratum0 'ls /cvmfs/software.ndexr.io/$(gcc -dumpmachine)/current') \
<(ls /cvmfs/software.ndexr.io/$(gcc -dumpmachine)/current)
# no output = identical listing
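A listing diff compares names only. For an explicit byte-level comparison, hash every file on both sides and diff the manifests. This is a minimal sketch, assuming sha256sum is available on both hosts; tree_manifest is a hypothetical helper and it skips symlinks. Reading a whole tree pulls every file into the client cache, so point it at a subdirectory rather than the full stack.

```shell
# Print "<sha256>  <relative path>" for every regular file under a directory,
# sorted by path so two manifests can be compared with diff.
tree_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2 )
}

# Usage: write one manifest on the source and one on a node, then compare them:
# tree_manifest /cvmfs/software.ndexr.io/<triple>/current/lmod > node.manifest
# diff source.manifest node.manifest
```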
Force a node to the newest revision
Clients poll on a TTL, so a fresh publish is not instant. To pull it now, reload; if a node is wedged on a bad cache, wipe and re-probe.
cvmfs_config reload software.ndexr.io # re-read catalog now
sudo cvmfs_config wipecache # nuke local cache
cvmfs_config probe software.ndexr.io # remount → OK
Command reference
The commands you reach for after it's running.
cvmfs_config probe software.ndexr.io
cvmfs_config showconfig software.ndexr.io
cvmfs_config stat software.ndexr.io
cvmfs_config reload
sudo cvmfs_config wipecache
cvmfs_server resign software.ndexr.io
cvmfs_server list
cvmfs_server check software.ndexr.io