>> Motivation
When using docker containers as an Elixir development environment, bandwidth limitations or frequent network failures can hinder the development flow, because the containers check whether the dependencies are up to date via the mix deps.get command before starting. Retrying on timeouts is frustrating enough, but fetch errors from hex.pm make it worse, and according to my co-workers they have become frequent enough to be a productivity issue. Although hex.pm's historical uptime does not indicate frequent downtime, unreliable internet service during peak or working hours may be the cause. Regardless, it would be nice if the dependencies of the Elixir app could be stored offline and fetched locally to avoid faulty network trips.
(Example repository for this article can be found here.)
2021-09-29 Update: Add offline support for Git dependencies
>> hex.registry
When hex v0.21.0 was released, it came with the ability to self-host hexpm mirrors via the mix hex.registry build command. Assuming an Elixir app, the custom mirror can be built with:
$ cd ~/my_elixir_app
# Install the latest version of hex >= 0.21.0
$ mix local.hex
$ openssl genrsa -out private_key.pem
$ mkdir .hex_repo
# Initialize the hex mirror
# The argument --name=hexpm is important later on
$ mix hex.registry build .hex_repo --name=hexpm --private-key=private_key.pem
* creating .hex_repo/public_key
* creating .hex_repo/tarballs
* creating .hex_repo/names
* creating .hex_repo/versions
The newly created mirror does not have any packages stored yet. To store a single package in this offline repository, it is fetched online with mix hex.package fetch and then moved to the mirror:
$ cd ~/my_elixir_app
# Fetch decimal 2.0.0 package
$ mix hex.package fetch decimal 2.0.0
# Move to .hex_repo/tarballs
$ mv decimal-2.0.0.tar .hex_repo/tarballs/
# Rebuild package indices
$ mix hex.registry build .hex_repo --name=hexpm --private-key=private_key.pem
Fetching all the dependencies of the application is a bit more complicated, as it requires parsing mix.lock to get the list of packages and their corresponding versions. Thankfully, the lock format is just a serialized Elixir map, so it can be loaded via Code.eval_file and then processed with the Enum module in a simple Elixir script. For example, with this sample lock file:
# mix.lock
%{
"decimal": {:hex, :decimal, "1.8.1", "a4ef3f5f3428bdbc0d35374029ffcf4ede8533536fa79896dd450168d9acdf3c", [:mix], [], "hexpm"},
"ecto": {:hex, :ecto, "3.1.7", "fa21d06ef56cdc2fdaa62574e8c3ba34a2751d44ea34c30bc65f0728421043e5", [:mix], [{:decimal, "~> 1.6", [hex: :decimal, repo: "hexpm", optional: false]}, {:jason, "~> 1.0", [hex: :jason, repo: "hexpm", optional: true]}], "hexpm"},
"espec": {:hex, :espec, "1.8.1", "338d2f49cf4038bf617de7fcb0f92396f31a7b2febf506cfb56a6ac9ac18b802", [:mix], [{:meck, "~> 0.8.13", [hex: :meck, repo: "hexpm", optional: false]}], "hexpm"},
"ex_image_info": {:hex, :ex_image_info, "0.2.4", "610002acba43520a9b1cf1421d55812bde5b8a8aeaf1fe7b1f8823e84e762adb", [:mix], [], "hexpm"},
"meck": {:hex, :meck, "0.8.13", "ffedb39f99b0b99703b8601c6f17c7f76313ee12de6b646e671e3188401f7866", [:rebar3], [], "hexpm"},
"mime": {:hex, :mime, "1.3.1", "30ce04ab3175b6ad0bdce0035cba77bba68b813d523d1aac73d9781b4d193cf8", [:mix], [], "hexpm"},
"nimble_parsec": {:hex, :nimble_parsec, "0.5.3", "def21c10a9ed70ce22754fdeea0810dafd53c2db3219a0cd54cf5526377af1c6", [:mix], [], "hexpm"},
"plug": {:hex, :plug, "1.9.0", "8d7c4e26962283ff9f8f3347bd73838e2413fbc38b7bb5467d5924f68f3a5a4a", [:mix], [{:mime, "~> 1.0", [hex: :mime, repo: "hexpm", optional: false]}, {:plug_crypto, "~> 1.0", [hex: :plug_crypto, repo: "hexpm", optional: false]}, {:telemetry, "~> 0.4", [hex: :telemetry, repo: "hexpm", optional: true]}], "hexpm"},
"plug_crypto": {:hex, :plug_crypto, "1.1.2", "bdd187572cc26dbd95b87136290425f2b580a116d3fb1f564216918c9730d227", [:mix], [], "hexpm"},
}
A simple script to process it and call mix hex.package fetch for each package and version would be:
# update_local_repo.ex
{mix_map, []} = Code.eval_file("mix.lock")
# Enable mix commands programmatically
Mix.start()
mix_map
|> Enum.map(fn {package, entry} ->
  # The third element of the tuple is the exact version
  # entry = {:hex, :decimal, "1.8.1", "omittedhash", [:mix], [], "hexpm"}
  {package, elem(entry, 2)}
end)
|> Enum.filter(fn {_package, version} ->
  # Optional, remove git packages since their third element is a revision, not a version
  # entry = {:git, "https://github.com/annkissam/rummage_ecto.git", "gitrev", [branch: "v2.0"]}
  case Version.parse(version) do
    {:ok, _} -> true
    :error -> false
  end
end)
# Optional, sort packages by name to get predictable progress
|> Enum.sort_by(&elem(&1, 0))
# Optional, Task.async_stream is just to parallelize the process. Enum.each also works.
|> Task.async_stream(fn {package, version} ->
  # Run mix hex.package fetch for each package and store in the mirror
  Mix.shell().cmd("mix hex.package fetch #{package} #{version} --output .hex_repo/tarballs/")
end, timeout: :infinity, max_concurrency: 3)
|> Stream.run()
Running the script and rebuilding the mirror afterwards:
$ elixir update_local_repo.ex
ecto v3.1.7 downloaded to .hex_repo/tarballs/ecto-3.1.7.tar
espec v1.8.1 downloaded to .hex_repo/tarballs/espec-1.8.1.tar
decimal v1.8.1 downloaded to .hex_repo/tarballs/decimal-1.8.1.tar
ex_image_info v0.2.4 downloaded to .hex_repo/tarballs/ex_image_info-0.2.4.tar
mime v1.3.1 downloaded to .hex_repo/tarballs/mime-1.3.1.tar
meck v0.8.13 downloaded to .hex_repo/tarballs/meck-0.8.13.tar
nimble_parsec v0.5.3 downloaded to .hex_repo/tarballs/nimble_parsec-0.5.3.tar
plug v1.9.0 downloaded to .hex_repo/tarballs/plug-1.9.0.tar
plug_crypto v1.1.2 downloaded to .hex_repo/tarballs/plug_crypto-1.1.2.tar
$ tree .hex_repo/tarballs
.hex_repo/tarballs
├── decimal-1.8.1.tar
├── ecto-3.1.7.tar
├── espec-1.8.1.tar
├── ex_image_info-0.2.4.tar
├── meck-0.8.13.tar
├── mime-1.3.1.tar
├── nimble_parsec-0.5.3.tar
├── plug-1.9.0.tar
└── plug_crypto-1.1.2.tar
0 directories, 9 files
# Rebuild package indices
$ mix hex.registry build .hex_repo --name=hexpm --private-key=private_key.pem
* creating .hex_repo/packages/decimal
* creating .hex_repo/packages/ecto
* creating .hex_repo/packages/espec
* creating .hex_repo/packages/ex_image_info
* creating .hex_repo/packages/meck
* creating .hex_repo/packages/mime
* creating .hex_repo/packages/nimble_parsec
* creating .hex_repo/packages/plug
* creating .hex_repo/packages/plug_crypto
* updating .hex_repo/names
* updating .hex_repo/versions
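Before serving the mirror, it can be worth confirming that every hex entry in mix.lock actually has a tarball. The following is only a sketch: expected_tarballs and missing_tarballs are hypothetical helper names (not part of hex), and the sed pattern assumes the one-entry-per-line lock format shown earlier.

```shell
#!/bin/sh
# Hypothetical helpers, not part of hex: compare mix.lock against the mirror.

# Print the tarball name expected for every hex entry in a lock file.
# Assumes one entry per line, for example:
#   "decimal": {:hex, :decimal, "1.8.1", ...
# Git entries ({:git, ...}) do not match and are skipped.
expected_tarballs() {
  sed -n 's/^ *"[^"]*": {:hex, :\([A-Za-z0-9_]*\), "\([^"]*\)".*/\1-\2.tar/p' "$1"
}

# Print the expected tarballs that are absent from the tarballs directory.
missing_tarballs() {
  lock_file="$1"
  tarball_dir="$2"
  expected_tarballs "$lock_file" | while read -r tarball; do
    [ -f "$tarball_dir/$tarball" ] || echo "$tarball"
  done
}
```

Running missing_tarballs mix.lock .hex_repo/tarballs after the fetch script should print nothing when the mirror is complete.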
Serving the .hex_repo folder with any static file server like nginx
will turn this into a proper local hexpm mirror. With just
Elixir/Erlang, it can be as simple as:
$ erl -s inets -eval 'inets:start(httpd,[{port,8000},{server_name,"localhost"},{server_root,"."},{document_root,".hex_repo"}]).'
>> hex.repo
Using the local mirror, though, is a bit messy. Remember that the name
of the mirror is specifically --name=hexpm; the reason is to avoid
adding repo: "name_of_repo" to every dependency in mix.exs. To explain
further, assume the name of our mirror is local instead. The mirror
can then be registered through mix hex.repo add:
# Notice name is now local instead of hexpm
$ mix hex.registry build .hex_repo --name=local --private-key=private_key.pem
# Add mirror to hex
$ mix hex.repo add local http://localhost:8000 --public-key=.hex_repo/public_key
# Check if our local mirror is registered
$ mix hex.repo list
Name   URL                    Public key                                          Auth key
hexpm  https://repo.hex.pm    SHA256:O1LOYhHFW4kcrblKAxROaDEzLD8bn1seWbe5tq8TRsk
local  http://localhost:8000  SHA256:yRX8noVK1hcBU1e5FA7yv9fhz3v3wlzzF4PCZhwsVeI
After registering the mirror, every dependency in the application must
add repo: "local", which looks like this:
# mix.exs
defp deps do
  [
    {:ecto, "~> 3.1.7", repo: "local"},
    {:ex_image_info, "~> 0.2.4", repo: "local"},
    {:nimble_parsec, "~> 0.5.0", repo: "local"},
    {:plug, "~> 1.9.0", repo: "local"},
    {:espec, "~> 1.8.1", repo: "local"}
  ]
  # Or more generically
  |> Enum.map(fn {package, version, opts} ->
    {package, version, Keyword.put_new(opts, :repo, "local")}
  end)
end
Either method should work for simpler projects; however, for my
umbrella app with complex dependencies, I get this error with
mix deps.get:
Dependencies have diverged:
* poison (Hex package)
different specs were given for the poison app:
> In apps/my_umbrella_app/mix.exs:
{:poison, "== 3.1.0", [env: :prod, hex: "poison", repo: "local"]}
> In deps/elixir_nsq/mix.exs:
{:poison, "~> 3.0", [env: :prod, repo: "hexpm", hex: "poison"]}
Ensure they match or specify one of the above in your deps and set "override: true"
* conduit (Hex package)
the dependency conduit in apps/my_umbrella_app/mix.exs is overriding a child dependency:
> In apps/my_umbrella_apps/mix.exs:
{:conduit, "== 0.12.10", [env: :prod, hex: "conduit", repo: "local"]}
> In deps/conduit_plugs/mix.exs:
{:conduit, "0.12.10", [env: :prod, repo: "hexpm", hex: "conduit"]}
Ensure they match or specify one of the above in your deps and set "override: true"
** (Mix) Can't continue due to errors on dependencies
The issue here is that the dependency versions and repositories are
not the same, and adding the override: true option does not solve it.
Whether it is a bug or not, I was not able to find a fix for this
issue, nor should I be required to change my version requirements. So
my unsafe workaround is to override the default hexpm repository entry
with the local one:
# Note the repo name is hexpm
$ mix hex.registry build .hex_repo --name=hexpm --private-key=private_key.pem
$ mix hex.repo add hexpm http://localhost:8000 --public-key=.hex_repo/public_key
# Check if our hexpm repo is overridden
$ mix hex.repo list
Name   URL                    Public key                                          Auth key
hexpm  http://localhost:8000  SHA256:HFj7kHFiEFD7c7YyJ9W9MJATcLKZZsW+4/AMs1Znzgo
One minor security issue is that the new local repository signature is not the same as the old online signature. Thankfully, this check can be bypassed with the environment variable HEX_NO_VERIFY_REPO_ORIGIN=1, and fetching dependencies via the local mirror now works:
$ HEX_NO_VERIFY_REPO_ORIGIN=1 mix deps.get
It may report some fetch errors, but it should still work. To revert
to the old repository, download the online public key to a file and
override the hexpm repository entry again:
# Assuming the public key from hex.pm is hex_public_key
$ mix hex.repo add hexpm https://repo.hex.pm --public-key=hex_public_key
Almost all dependencies can be fetched locally except for git
dependencies, since those are fetched with git clone and making a
local git repository is beyond the scope of this article. One naive
solution is to build those dependencies with hex.build, copy them into
the mirror and convert their deps entries from git to standard hexpm;
however, this might not be worthwhile to configure per environment,
especially when building or releasing remotely with continuous
integration tools or edeliver. Until hex provides custom fetchers,
leaving the git dependencies alone is safer and still a win overall.
>> docker
Applying this fix back to the docker development environment, which uses docker-compose, the setup typically looks like this:
# docker-compose.yml
# Simplified for this example
version: '3.1'
services:
  # Any static HTTP server like nginx
  app_deps:
    image: nginx:alpine
    volumes:
      - ./.hex_repo:/usr/share/nginx/html
  # The Elixir (phoenix REST API) application
  app:
    image: bitwalker/alpine-elixir
    command: "mix phx.server"
    # Add the registry in the entrypoint
    entrypoint: /opt/app/docker-entrypoint.sh
    volumes:
      - .:/opt/app
      # Use docker volume to avoid conflicting with docker deps folder
      - mix_deps:/opt/app/deps
      # Cache the builds to avoid recompilation
      - mix_build:/opt/app/_build
    # Make sure the HTTP server is running before this
    depends_on:
      - app_deps
    links:
      - app_deps:app_deps
    environment:
      # Specify the hex repo to register
      HEX_REGISTRY: "http://app_deps:80"
volumes:
  mix_deps:
  mix_build:
In the entrypoint, the local hex mirror is registered if HEX_REGISTRY
is present:
# docker-entrypoint.sh
#!/bin/sh
set -e
# Allow the image to use the local or online mirror
# POSIX test ([ ]) is used because the shebang is /bin/sh, not bash
if [ -z "${HEX_REGISTRY}" ]; then
  echo "Using online hexpm mirror"
  mix hex.repo add hexpm "https://repo.hex.pm" --public-key="hex_public_key"
  unset HEX_NO_VERIFY_REPO_ORIGIN
else
  echo "Using local hexpm mirror: $HEX_REGISTRY"
  mix hex.repo add hexpm "$HEX_REGISTRY" --public-key=".hex_repo/public_key"
  export HEX_NO_VERIFY_REPO_ORIGIN=1
fi
echo "Fetching any missing dependencies..."
mix deps.get
exec "$@"
>> Workflow
Whenever mix.lock or any dependency changes, the mirror should ideally
be updated and rebuilt as well. One approach, especially when
switching branches, is to check whether mix.lock has changed and then
rebuild, which looks like this:
# Check if mix.lock was changed using `md5sum`
touch .mix.lock.old
# Hash via stdin so the file name is not part of the output being compared
OLD_HASH=$(md5sum < .mix.lock.old)
NEW_HASH=$(md5sum < mix.lock)
if [ "$OLD_HASH" != "$NEW_HASH" ]; then
  echo "Updating local mirror"
  mkdir -p .hex_repo/tarballs
  elixir update_local_repo.ex
  mix hex.registry build .hex_repo --name=hexpm --private-key=private_key.pem
  # Remember the lock file the mirror was built from
  cp mix.lock .mix.lock.old
fi
Rather than executing a rebuild per branch or start, perhaps it is
easier to store the local mirror in git, so that the person
responsible for the branch always rebuilds the mirror and pushes it
with the final changes. For my medium-sized project, the size of my
.hex_repo folder is around 10MB, which is an acceptable trade-off:
# Store the mirror and keys in git
git add .hex_repo private_key.pem hex_public_key update_local_repo.ex
git commit -m "feat: store local repository"
# When working on a feature branch
git checkout -b feature/work
# Assuming a new dependency is added after writing the feature
nano mix.exs
git add mix.exs lib/
# Revert to online repository and then pull
mix hex.repo add hexpm "https://repo.hex.pm" --public-key="hex_public_key"
mix deps.get
# Rebuild the repository
elixir update_local_repo.ex
mix hex.registry build .hex_repo --name=hexpm --private-key=private_key.pem
git add .hex_repo
git commit -m "feat: my new feature"
git push
This is not a perfect process, since people can forget to update the mirror even with git hooks. Nonetheless, it minimizes online fetching and puts the burden on the branch creator rather than on the branch's users.
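Since forgetting to rebuild the mirror is the main failure mode here, a hook can at least flag it. A minimal sketch, assuming a pre-commit hook and a hypothetical helper named mirror_is_stale (neither is part of the example repository); it only inspects a list of changed paths, so it cannot tell whether the mirror contents are actually correct:

```shell
#!/bin/sh
# Hypothetical helper for a git hook, not part of hex or the example repository.
# Reads changed file paths on stdin, one per line. Succeeds (exit 0) when
# mix.lock changed but nothing under .hex_repo/ did, i.e. the mirror was
# probably not rebuilt.
mirror_is_stale() {
  awk '
    $0 == "mix.lock"             { lock = 1 }
    index($0, ".hex_repo/") == 1 { mirror = 1 }
    END { exit (lock && !mirror) ? 0 : 1 }
  '
}

# Possible use inside .git/hooks/pre-commit (left commented out here):
#   if git diff --cached --name-only | mirror_is_stale; then
#     echo "mix.lock changed but .hex_repo did not; rebuild the mirror" >&2
#     exit 1
#   fi
```

Feeding the helper a path list keeps it independent of how the list is produced, so the same check could be reused in a post-checkout hook with git diff --name-only between the two revisions.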
>> Git Repo
After this development feature was merged, my co-worker still had issues, especially with the remaining git dependencies. The problem is that the docker containers do not allow external network access for some reason. Because debugging the docker network issue could be hard, caching the git dependencies became mandatory.
For each git (or more precisely GitHub) dependency in mix.exs, the
repository is cloned bare and its path replicated under a .git_repo
folder that will be served via nginx, similar to .hex_repo. Given this
mix.exs:
defp deps do
  [
    {:elixir_nsq,
     git: "https://github.com/wistia/elixir_nsq",
     ref: "b43616a08459451cc5afdcd9839b732cbc1dedfa",
     override: true},
    {:rummage_ecto, git: "https://github.com/annkissam/rummage_ecto.git", branch: "v2.0"}
  ]
end
The construction of the new repository would be:
$ cd ~/my_elixir_app
# Create the git repository
$ mkdir .git_repo
$ cd .git_repo
# Clone each git repository bare
$ git clone --bare https://github.com/annkissam/rummage_ecto.git annkissam/rummage_ecto.git
$ git clone --bare https://github.com/wistia/elixir_nsq.git wistia/elixir_nsq.git
As an Elixir script reading mix.lock:
# update_git_repo.ex
{mix_map, []} = Code.eval_file("mix.lock")
Mix.start()
# Cleanup git repository folder
Mix.shell().cmd("rm -rf .git_repo/*")
mix_map
|> Enum.filter(fn
  # Keep only GitHub git dependencies
  # entry = "elixir_nsq": {:git, "https://github.com/wistia/elixir_nsq", "sharev", [ref: "sharevused"]}
  {_package, {:git, "https://github.com/" <> _path, _rev, _opts}} -> true
  _ -> false
end)
|> Enum.map(fn {package, entry} -> {package, elem(entry, 1)} end)
|> Enum.sort_by(&elem(&1, 0))
|> Task.async_stream(fn {_package, git_url} ->
  "https://github.com/" <> path = git_url
  # Normalize the path to end with .git for bare repositories
  path = if String.ends_with?(path, ".git"), do: path, else: "#{path}.git"
  Mix.shell().cmd("git clone --bare #{git_url} .git_repo/#{path}")
end, timeout: :infinity, max_concurrency: 3)
|> Stream.run()
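The path handling is the part that is easy to get wrong, because the served path must match what git requests from github.com. As a rough shell equivalent of the normalization step in the script above, here is a sketch where bare_repo_path is a hypothetical helper name and only https://github.com/ URLs are handled:

```shell
#!/bin/sh
# Hypothetical helper, equivalent to the path handling in update_git_repo.ex.
# Maps a GitHub HTTPS URL to its bare-repository path under .git_repo.
# Returns non-zero for any URL that is not under https://github.com/.
bare_repo_path() {
  url="$1"
  case "$url" in
    https://github.com/*) path="${url#https://github.com/}" ;;
    *) return 1 ;;
  esac
  # Bare repositories are conventionally stored with a .git suffix
  case "$path" in
    *.git) ;;
    *) path="$path.git" ;;
  esac
  printf '%s\n' "$path"
}
```

A manual clone could then look like git clone --bare "$url" ".git_repo/$(bare_repo_path "$url")".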
Cloning the whole repository can be large or unnecessary, so adding
--single-branch and --branch to the clone, or other filtering options,
may be worth investigating. For the same project, it added a hefty 5MB
for only 5 or so dependencies, which is not ideal. Nonetheless,
serving this .git_repo folder with another nginx container as an alias
to https://github.com is our sneaky strategy, similar to overriding
hexpm. However, it is not as straightforward, as making nginx a
compatible git HTTP server requires git-http-backend and fcgiwrap
integration, which in turn requires a custom nginx configuration.
After some research, the Dockerfile and configuration I landed on are:
# Dockerfile.gitx
FROM nginx:1.21.3-alpine
# Make sure git-http-backend and fcgiwrap are installed
RUN apk add --update git git-daemon apache2-utils fcgiwrap
# Make sure the git repo root is accessible to nginx
RUN mkdir -p /srv/git && chown -R nginx:nginx /srv/git && chmod -R 755 /srv/git
# git-nginx.conf
server {
  # Setup HTTP endpoint
  listen 80;
  server_name github.com;
  # Serve every repository under /srv/git
  root /srv/git;
  location / {
    try_files $uri $uri/ =404;
  }
  location ~ (/.*) {
    client_max_body_size 0;
    include /etc/nginx/fastcgi_params;
    # Found under git-daemon package
    fastcgi_param SCRIPT_FILENAME /usr/libexec/git-core/git-http-backend;
    # Serve every repository under /srv/git
    fastcgi_param GIT_PROJECT_ROOT /srv/git;
    fastcgi_param GIT_HTTP_EXPORT_ALL "";
    fastcgi_param PATH_INFO $1;
    # fcgiwrap socket that runs git-http-backend
    fastcgi_pass unix:/var/run/fcgiwrap.sock-1;
  }
}
This basic configuration just serves every git repository under
/srv/git without authentication, so it is a read-only (clone-only)
server. The tricky part here is setting up the fcgiwrap socket before
nginx runs, so a custom entrypoint is also needed:
# git-entrypoint.sh
#!/bin/bash
echo "Creating fcgiwrap socket"
rm -f /var/run/fcgiwrap.sock-1
fcgiwrap -s unix:/var/run/fcgiwrap.sock-1 &
echo "Waiting on fcgiwrap socket"
sleep 1
chown nginx:nginx /var/run/fcgiwrap.sock-1
chmod 777 /var/run/fcgiwrap.sock-1
echo "Starting nginx"
nginx -g 'daemon off;'
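The fixed sleep 1 in the entrypoint is a guess at how long fcgiwrap needs to create its socket. A minimal sketch of a polling alternative, where wait_for_socket is a hypothetical helper name and the default of 10 attempts is arbitrary:

```shell
#!/bin/sh
# Hypothetical helper: poll until a unix socket exists instead of sleeping
# for a fixed time. Returns 0 once the socket appears, 1 after the given
# number of one-second attempts (default 10).
wait_for_socket() {
  socket_path="$1"
  attempts="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -S is true only if the path exists and is a socket
    if [ -S "$socket_path" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Possible use in git-entrypoint.sh, replacing `sleep 1` (left commented out):
#   fcgiwrap -s unix:/var/run/fcgiwrap.sock-1 &
#   wait_for_socket /var/run/fcgiwrap.sock-1 || exit 1
```

Failing fast when the socket never appears avoids starting nginx in a state where every clone would return a gateway error.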
I initially thought of using OpenRC, but it was less consistent in
starting up, and creating the fcgiwrap socket manually was easier
although slightly less efficient. Regardless, adding HTTPS support
through a proxy is the last but easier step, which just requires a
small nginx update:
# Original HTTP endpoint
server {
  listen 443 ssl;
  server_name github.com;
  # Generated with
  # openssl req -x509 -nodes -days 365 -subj "/C=CA/ST=QC/O=Company, Inc./CN=mydomain.com" -addext "subjectAltName=DNS:mydomain.com" -newkey rsa:2048 -keyout git-selfsigned.key -out git-selfsigned.crt
  ssl_certificate /srv/git-selfsigned.crt;
  ssl_certificate_key /srv/git-selfsigned.key;
  location / {
    proxy_pass http://127.0.0.1:80;
  }
}
Wiring it up in docker-compose.yml:
git_deps:
  build:
    context: ./.git_repo
    dockerfile: ../Dockerfile.gitx
  entrypoint: "sh /srv/git-entry.sh"
  # Expose ports for demonstration
  ports:
    - "22080:80"
    - "22443:443"
  volumes:
    - ./.git_repo:/srv/git
    - ./git-nginx.conf:/etc/nginx/conf.d/default.conf
    - ./git-entrypoint.sh:/srv/git-entry.sh
    - ./git-selfsigned.crt:/srv/git-selfsigned.crt
    - ./git-selfsigned.key:/srv/git-selfsigned.key
app:
  depends_on:
    # Make this app also wait on the git repo
    - git_deps
    - app_deps
  links:
    # Notice how git_deps is mapped as github.com
    # To fetch online, comment this line and GIT_SSL_NO_VERIFY
    - git_deps:github.com
    - app_deps:app_deps
  environment:
    # Also disable check for self signed certificates when cloning
    GIT_SSL_NO_VERIFY: "true"
Because of the new links, https://github.com now points to the local
git HTTPS server from git_deps. On its own this will not work, as
git clone does not accept self-signed certificates; similar to
HEX_NO_VERIFY_REPO_ORIGIN for hex, the check is bypassed with the
GIT_SSL_NO_VERIFY environment variable. Testing it out with
docker-compose up git_deps:
# Outside the container
# With HTTP
$ git clone http://localhost:22080/annkissam/rummage_ecto.git
# With HTTPS
$ git clone https://localhost:22443/wistia/elixir_nsq.git
# Within the container
# docker-compose exec app ash
$ GIT_SSL_NO_VERIFY=true git clone https://github.com/wistia/elixir_nsq.git
While not the most optimal solution, it does resolve my co-worker's
issue, and the development environment is now completely offline aside
from extra build hooks. Remember to commit .git_repo and its
accompanying files. Whenever a git dependency needs to be updated,
delete and re-clone the affected bare repository and commit again.