>> Motivation

When Docker containers serve as an Elixir development environment, limited bandwidth or frequent network failures can stall the development flow, because each container runs mix deps.get on startup to check that the dependencies are up to date. Retrying after timeouts is frustrating enough, and errors when fetching from hex.pm make it worse; in my co-workers' experience these have become frequent enough to hurt productivity. hex.pm's historical uptime does not suggest frequent downtime, so unreliable internet service during peak or working hours is the more likely cause. Regardless, it would be nice if the application's dependencies could be stored offline and fetched locally to avoid faulty network trips.

(Example repository for this article can be found here.)

2021-09-29 Update: Add offline support for Git dependencies

>> hex.registry

Hex v0.21.0 added the ability to self-host a hexpm mirror with the mix hex.registry build command. For an existing Elixir app, a custom mirror can be built with:

$ cd ~/my_elixir_app

# Install the latest version of hex >= 0.21.0
$ mix local.hex

$ openssl genrsa -out private_key.pem
$ mkdir .hex_repo

# Initialize the hex mirror
# The argument --name=hexpm is important later on
$ mix hex.registry build .hex_repo --name=hexpm --private-key=private_key.pem

* creating .hex_repo/public_key
* creating .hex_repo/tarballs
* creating .hex_repo/names
* creating .hex_repo/versions

The newly created mirror does not have any packages stored yet. To store a single package in this offline repository, fetch it online with mix hex.package fetch and then move it into the mirror:

$ cd ~/my_elixir_app

# Fetch decimal 2.0.0 package
$ mix hex.package fetch decimal 2.0.0

# Move to .hex_repo/tarballs
$ mv decimal-2.0.0.tar .hex_repo/tarballs/

# Rebuild package indices
$ mix hex.registry build .hex_repo --name=hexpm --private-key=private_key.pem

Fetching all the dependencies of the application is a bit more complicated, as it requires parsing mix.lock to get the list of packages and their locked versions. Thankfully, the lock file is just an Elixir map literal, so it can be loaded with Code.eval_file and then processed with the Enum module in a simple Elixir script. For example, with this sample lock file:

# mix.lock
%{
  "decimal": {:hex, :decimal, "1.8.1", "a4ef3f5f3428bdbc0d35374029ffcf4ede8533536fa79896dd450168d9acdf3c", [:mix], [], "hexpm"},
  "ecto": {:hex, :ecto, "3.1.7", "fa21d06ef56cdc2fdaa62574e8c3ba34a2751d44ea34c30bc65f0728421043e5", [:mix], [{:decimal, "~> 1.6", [hex: :decimal, repo: "hexpm", optional: false]}, {:jason, "~> 1.0", [hex: :jason, repo: "hexpm", optional: true]}], "hexpm"},
  "espec": {:hex, :espec, "1.8.1", "338d2f49cf4038bf617de7fcb0f92396f31a7b2febf506cfb56a6ac9ac18b802", [:mix], [{:meck, "~> 0.8.13", [hex: :meck, repo: "hexpm", optional: false]}], "hexpm"},
  "ex_image_info": {:hex, :ex_image_info, "0.2.4", "610002acba43520a9b1cf1421d55812bde5b8a8aeaf1fe7b1f8823e84e762adb", [:mix], [], "hexpm"},
  "meck": {:hex, :meck, "0.8.13", "ffedb39f99b0b99703b8601c6f17c7f76313ee12de6b646e671e3188401f7866", [:rebar3], [], "hexpm"},
  "mime": {:hex, :mime, "1.3.1", "30ce04ab3175b6ad0bdce0035cba77bba68b813d523d1aac73d9781b4d193cf8", [:mix], [], "hexpm"},
  "nimble_parsec": {:hex, :nimble_parsec, "0.5.3", "def21c10a9ed70ce22754fdeea0810dafd53c2db3219a0cd54cf5526377af1c6", [:mix], [], "hexpm"},
  "plug": {:hex, :plug, "1.9.0", "8d7c4e26962283ff9f8f3347bd73838e2413fbc38b7bb5467d5924f68f3a5a4a", [:mix], [{:mime, "~> 1.0", [hex: :mime, repo: "hexpm", optional: false]}, {:plug_crypto, "~> 1.0", [hex: :plug_crypto, repo: "hexpm", optional: false]}, {:telemetry, "~> 0.4", [hex: :telemetry, repo: "hexpm", optional: true]}], "hexpm"},
  "plug_crypto": {:hex, :plug_crypto, "1.1.2", "bdd187572cc26dbd95b87136290425f2b580a116d3fb1f564216918c9730d227", [:mix], [], "hexpm"},
}

A simple script to process it and call mix hex.package fetch for each package and version would be:

# update_local_repo.ex
{mix_map , []} = Code.eval_file("mix.lock")

# Enable mix commands programmatically
Mix.start()

mix_map
|> Enum.map(fn {package, entry} ->
  # The third element of each entry tuple is the exact version
  # entry = {:hex, :decimal, "1.8.1", "omittedhash", [:mix], [], "hexpm"}
  {package, elem(entry, 2)}
end)
|> Enum.filter(fn {_package, version} ->
  # Optional, remove git packages since they have revisions rather than versions
  # entry = {:git, "https://github.com/annkissam/rummage_ecto.git", "gitrev", [branch: "v2.0"]}
  # For such an entry, version is "gitrev", which does not parse as a version
  case Version.parse(version) do
    {:ok, _} -> true
    :error -> false
  end
end)
# Optional, sort packages by name to get predictable progress
|> Enum.sort_by(&elem(&1, 0))
# Optional, Task.async_stream is just to parallelize the process. Enum.each also works.
|> Task.async_stream(fn {package, version} ->
  # Run mix hex.package fetch for each package and store in the mirror
  Mix.shell().cmd("mix hex.package fetch #{package} #{version} --output .hex_repo/tarballs/")
end, timeout: :infinity, max_concurrency: 3)
|> Stream.run()

Running the script and rebuilding the mirror afterwards:

$ elixir update_local_repo.ex

ecto v3.1.7 downloaded to .hex_repo/tarballs/ecto-3.1.7.tar
espec v1.8.1 downloaded to .hex_repo/tarballs/espec-1.8.1.tar
decimal v1.8.1 downloaded to .hex_repo/tarballs/decimal-1.8.1.tar
ex_image_info v0.2.4 downloaded to .hex_repo/tarballs/ex_image_info-0.2.4.tar
mime v1.3.1 downloaded to .hex_repo/tarballs/mime-1.3.1.tar
meck v0.8.13 downloaded to .hex_repo/tarballs/meck-0.8.13.tar
nimble_parsec v0.5.3 downloaded to .hex_repo/tarballs/nimble_parsec-0.5.3.tar
plug v1.9.0 downloaded to .hex_repo/tarballs/plug-1.9.0.tar
plug_crypto v1.1.2 downloaded to .hex_repo/tarballs/plug_crypto-1.1.2.tar

$ tree .hex_repo/tarballs

.hex_repo/tarballs
├── decimal-1.8.1.tar
├── ecto-3.1.7.tar
├── espec-1.8.1.tar
├── ex_image_info-0.2.4.tar
├── meck-0.8.13.tar
├── mime-1.3.1.tar
├── nimble_parsec-0.5.3.tar
├── plug-1.9.0.tar
└── plug_crypto-1.1.2.tar

0 directories, 9 files

# Rebuild package indices
$ mix hex.registry build .hex_repo --name=hexpm --private-key=private_key.pem

* creating .hex_repo/packages/decimal
* creating .hex_repo/packages/ecto
* creating .hex_repo/packages/espec
* creating .hex_repo/packages/ex_image_info
* creating .hex_repo/packages/meck
* creating .hex_repo/packages/mime
* creating .hex_repo/packages/nimble_parsec
* creating .hex_repo/packages/plug
* creating .hex_repo/packages/plug_crypto
* updating .hex_repo/names
* updating .hex_repo/versions

Serving the .hex_repo with any static file server like nginx will turn this into a proper local hexpm mirror. With just Elixir/Erlang, it can be as simple as:

$ erl -s inets -eval 'inets:start(httpd,[{port,8000},{server_name,"localhost"},{server_root,"."},{document_root,".hex_repo"}]).'

>> hex.repo

Using the local mirror, though, is a bit messy. Recall that the mirror was deliberately built with --name=hexpm; the reason is to avoid adding repo: "name_of_repo" to every dependency in mix.exs. To explain further, suppose the mirror is named local instead. It can then be registered with mix hex.repo add:

# Notice name is now local instead of hexpm
$ mix hex.registry build .hex_repo --name=local --private-key=private_key.pem

# Add mirror to hex
$ mix hex.repo add local http://localhost:8000 --public-key=.hex_repo/public_key

# Check if our local mirror is registered
$ mix hex.repo list

Name            URL                                                   Public key                                          Auth key
hexpm           https://repo.hex.pm                                   SHA256:O1LOYhHFW4kcrblKAxROaDEzLD8bn1seWbe5tq8TRsk
local           http://localhost:8000                                 SHA256:yRX8noVK1hcBU1e5FA7yv9fhz3v3wlzzF4PCZhwsVeI

After registering the mirror, every dependency in the application needs the repo: "local" option, which looks like this:

# mix.exs
defp deps do
  [
    {:ecto, "~> 3.1.7", repo: "local"},
    {:ex_image_info, "~> 0.2.4", repo: "local"},
    {:nimble_parsec, "~> 0.5.0", repo: "local"},
    {:plug, "~> 1.9.0", repo: "local"},
    {:espec, "~> 1.8.1", repo: "local"}
  ]
  # Or more generically
  |> Enum.map(fn {package, version, opts} ->
    {package, version, Keyword.put_new(opts, :repo, "local")}
  end)
end

Either method should work for simpler projects; for my umbrella app with complex dependencies, however, mix deps.get fails with this error:

Dependencies have diverged:
* poison (Hex package)
  different specs were given for the poison app:

  > In apps/my_umbrella_app/mix.exs:
    {:poison, "== 3.1.0", [env: :prod, hex: "poison", repo: "local"]}

  > In deps/elixir_nsq/mix.exs:
    {:poison, "~> 3.0", [env: :prod, repo: "hexpm", hex: "poison"]}

  Ensure they match or specify one of the above in your deps and set "override: true"
* conduit (Hex package)
  the dependency conduit in apps/my_umbrella_app/mix.exs is overriding a child dependency:

  > In apps/my_umbrella_apps/mix.exs:
    {:conduit, "== 0.12.10", [env: :prod, hex: "conduit", repo: "local"]}

  > In deps/conduit_plugs/mix.exs:
    {:conduit, "0.12.10", [env: :prod, repo: "hexpm", hex: "conduit"]}

  Ensure they match or specify one of the above in your deps and set "override: true"
** (Mix) Can't continue due to errors on dependencies

The issue here is that the same dependency is declared with different version requirements and repositories, and adding the override: true option does not resolve it. Whether or not this is a bug, I was not able to find a fix, nor should I be required to change my version requirements. So my unsafe workaround is to override the default hexpm repository entry with the local one:

# Note the repo name is hexpm
$ mix hex.registry build .hex_repo --name=hexpm --private-key=private_key.pem
$ mix hex.repo add hexpm http://localhost:8000 --public-key=.hex_repo/public_key

# Check if our hexpm repo is overridden
$ mix hex.repo list

Name            URL                                                   Public key                                          Auth key
hexpm           http://localhost:8000                                 SHA256:HFj7kHFiEFD7c7YyJ9W9MJATcLKZZsW+4/AMs1Znzgo

One minor security issue is that the signature of the new local repository is not the same as the old online signature. Thankfully, this check can be bypassed with the environment variable HEX_NO_VERIFY_REPO_ORIGIN=1, and fetching dependencies via the local mirror now works:

$ HEX_NO_VERIFY_REPO_ORIGIN=1 mix deps.get

It may report some fetch errors, but it should still work. To revert to the old repository, download the online public key to a file and override the hexpm repository entry again:

# Assuming the public key from hex.pm is hex_public_key
$ mix hex.repo add hexpm https://repo.hex.pm --public-key=hex_public_key

Almost all dependencies can now be fetched locally. The exception is git dependencies, since those are fetched with git clone and setting up a local git repository is beyond the scope of this article. One naive solution is to build those dependencies with hex.build, copy them into the mirror and convert their deps entries from git to standard hexpm; however, this might not be worth configuring per environment, especially when building or releasing remotely with continuous integration tools or edeliver. Until hex provides custom fetchers, leaving the git dependencies alone is safer and still a win overall.

>> docker

Applying this fix back to the Docker development environment, a docker-compose setup typically looks like this:

# docker-compose.yml
# Simplified for this example
version: '3.1'
services:
  # Any static HTTP server like nginx
  app_deps:
    image: nginx:alpine
    volumes:
      - ./.hex_repo:/usr/share/nginx/html
  # The Elixir (phoenix REST API) application
  app:
    image: bitwalker/alpine-elixir
    command: "mix phx.server"
    # Add the registry in the entrypoint
    entrypoint: /opt/app/docker-entrypoint.sh
    volumes:
      - .:/opt/app
      # Use docker volume to avoid conflicting with docker deps folder
      - mix_deps:/opt/app/deps
      # Cache the builds to avoid recompilation
      - mix_build:/opt/app/_build
    # Start the HTTP server container before this one
    depends_on:
      - app_deps
    links:
      - app_deps:app_deps
    environment:
      # Specify the hex repo to register
      HEX_REGISTRY: "http://app_deps:80"

volumes:
  mix_deps:
  mix_build:

In the entrypoint, the local hex mirror is registered if HEX_REGISTRY is present:

# docker-entrypoint.sh
#!/bin/sh

set -e

# Allow the image to use the local or online mirror
# Plain [ ] is used because the script runs under /bin/sh
if [ -z "${HEX_REGISTRY}" ]; then
  echo "Using online hexpm mirror"
  mix hex.repo add hexpm "https://repo.hex.pm" --public-key="hex_public_key"
  unset HEX_NO_VERIFY_REPO_ORIGIN
else
  echo "Using local hexpm mirror: $HEX_REGISTRY"
  mix hex.repo add hexpm "$HEX_REGISTRY" --public-key=".hex_repo/public_key"
  export HEX_NO_VERIFY_REPO_ORIGIN=1
fi

echo "Installing any missing dependencies..."
mix deps.get

exec "$@"
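
One caveat with this setup: depends_on only controls the order in which containers start, not whether nginx is already accepting requests, so the first mix deps.get can still race the mirror. A minimal sketch of a retry helper that could be defined near the top of the entrypoint follows. The helper name retry, the attempt counts and the wget probe are my own assumptions and not part of the original setup:

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  retry_max=$1
  retry_delay=$2
  shift 2
  retry_attempt=1
  until "$@"; do
    if [ "$retry_attempt" -ge "$retry_max" ]; then
      echo "Giving up after $retry_attempt attempts: $*" >&2
      return 1
    fi
    retry_attempt=$((retry_attempt + 1))
    sleep "$retry_delay"
  done
}

# Possible usage inside docker-entrypoint.sh (assumes wget exists in the image):
#   retry 5 2 wget -q --spider "$HEX_REGISTRY/names"
#   retry 3 1 mix deps.get
```

The names file is used as the probe only because the registry build output above shows it at the root of .hex_repo; any file served by the mirror would do.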

>> Workflow

Whenever mix.lock or any dependency changes, the mirror should ideally be updated and rebuilt as well. One approach, especially useful when switching branches, is to check whether mix.lock has changed and rebuild if so:

# Check if mix.lock was changed using `md5sum`
touch .mix.lock.old

# Hash via stdin so the file name is not part of the compared output
OLD_HASH=$(md5sum < .mix.lock.old)
NEW_HASH=$(md5sum < mix.lock)

if [ "$OLD_HASH" != "$NEW_HASH" ] ; then
    echo "Updating local mirror"
    mkdir -p .hex_repo/tarballs
    elixir update_local_repo.ex
    mix hex.registry build .hex_repo --name=hexpm --private-key=private_key.pem
    # Remember the lock file that the mirror was built from
    cp mix.lock .mix.lock.old
fi

Rather than rebuilding on every branch switch or container start, it may be easier to store the local mirror in git, so that whoever is responsible for a branch rebuilds the mirror and pushes it along with the final changes. For my medium-sized project, the .hex_repo folder is around 10MB, which is an acceptable trade-off:

# Store the mirror and keys in git
git add .hex_repo private_key.pem hex_public_key update_local_repo.ex
git commit -m "feat: store local repository"

# When working on a feature branch
git checkout -b feature/work

# Assuming a new dependency is added after writing the feature
nano mix.exs
git add mix.exs lib/

# Revert to online repository and then pull
mix hex.repo add hexpm "https://repo.hex.pm" --public-key="hex_public_key"
mix deps.get

# Rebuild the repository
elixir update_local_repo.ex
mix hex.registry build .hex_repo --name=hexpm --private-key=private_key.pem

git add .hex_repo
git commit -m "feat: my new feature"

git push

This is not a perfect process since people can forget to update the mirror even with git hooks. Nonetheless, it minimizes online fetching and puts the burden on the branch creator rather than on its users.
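
To catch a forgotten mirror update before it reaches other people, a small check can compare mix.lock against the stored tarballs. The following is only a sketch: the function names are hypothetical, and it assumes the one-entry-per-line mix.lock layout and the name-version.tar file naming shown earlier in this article.

```shell
#!/bin/sh
# list_hex_packages LOCK_FILE
# Prints "name version" for every Hex entry in a mix.lock, one per line.
# Git entries ({:git, ...}) do not match the pattern and are skipped.
list_hex_packages() {
  sed -n 's/^ *"[^"]*": {:hex, :\([^,]*\), "\([^"]*\)".*/\1 \2/p' "$1"
}

# missing_tarballs LOCK_FILE TARBALL_DIR
# Prints the tarball names that mix.lock expects but TARBALL_DIR lacks.
# Returns non-zero when at least one tarball is missing.
missing_tarballs() {
  missing_status=0
  for tarball in $(list_hex_packages "$1" | sed 's/ /-/; s/$/.tar/'); do
    if [ ! -f "$2/$tarball" ]; then
      echo "$tarball"
      missing_status=1
    fi
  done
  return "$missing_status"
}

# Possible usage in a pre-commit hook or CI step:
#   missing_tarballs mix.lock .hex_repo/tarballs || exit 1
```

This only checks that the tarballs exist; it does not verify that the registry indices were rebuilt afterwards.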

>> Git Repo

After this development feature was merged, my co-worker still had issues, especially with the remaining git dependencies. For some reason, the Docker containers on that machine do not allow external network access. Because debugging the Docker network issue could be hard, caching the git dependencies became mandatory.

Each git (more precisely, GitHub) dependency in mix.exs is cloned bare, with its URL path replicated under a .git_repo folder that will be served via nginx, much like .hex_repo. Given this mix.exs:

defp deps do
  [
    {:elixir_nsq,
     git: "https://github.com/wistia/elixir_nsq",
     ref: "b43616a08459451cc5afdcd9839b732cbc1dedfa",
     override: true},
    {:rummage_ecto, git: "https://github.com/annkissam/rummage_ecto.git", branch: "v2.0"}
  ]
end

The construction of the new repository would be:

$ cd ~/my_elixir_app

# Create the git repository
$ mkdir .git_repo
$ cd .git_repo

# Clone each git repository bare
$ git clone --bare https://github.com/annkissam/rummage_ecto.git annkissam/rummage_ecto.git
$ git clone --bare https://github.com/wistia/elixir_nsq.git wistia/elixir_nsq.git

As an Elixir script reading mix.lock:

# update_git_repo.ex
{mix_map , []} = Code.eval_file("mix.lock")

Mix.start()

# Cleanup git repository folder
Mix.shell().cmd("rm -rf .git_repo/*")

mix_map
|> Enum.filter(fn
  # Keep only the GitHub git entries
  # entry = "elixir_nsq": {:git, "https://github.com/wistia/elixir_nsq", "sharev", [ref: "sharevused"]}
  {_package, {:git, "https://github.com" <> _path, _rev, _opts}} -> true
  _ -> false
end)
|> Enum.map(fn {package, entry} -> {package, elem(entry, 1)} end)
|> Enum.sort_by(&elem(&1, 0))
|> Task.async_stream(fn {package, git_url} ->
  "https://github.com/" <> path = git_url

  # Normalize the path so it ends with .git, as is conventional for bare repositories
  path = if(String.ends_with?(path, ".git"), do: path, else: "#{path}.git")

  Mix.shell().cmd("git clone --bare #{git_url} .git_repo/#{path}")
end, timeout: :infinity, max_concurrency: 3)
|> Stream.run()

Cloning a whole repository can be large or unnecessary, so the --single-branch and --branch clone options, or other filtering options, may be worth investigating. For the same project, the clones added a hefty 5MB for only 5 or so dependencies, which is not ideal. Nonetheless, serving this .git_repo folder with another nginx container as an alias of https://github.com is our sneaky strategy, similar to overriding hexpm.
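
As a rough sketch of the --single-branch idea, the helper below builds a bare clone command limited to one branch. The function names and the DRY_RUN switch are hypothetical. Note also that a dependency pinned with ref:, like elixir_nsq above, only works if that commit is reachable from the cloned branch, so this is best reserved for branch: dependencies.

```shell
#!/bin/sh
# mirror_path GIT_URL
# Maps a GitHub URL to its path under .git_repo, always ending in ".git".
mirror_path() {
  mirror_rel=${1#https://github.com/}
  case "$mirror_rel" in
    *.git) echo ".git_repo/$mirror_rel" ;;
    *)     echo ".git_repo/$mirror_rel.git" ;;
  esac
}

# clone_branch_only GIT_URL BRANCH
# Bare-clones only BRANCH of GIT_URL into the mirror folder.
# Set DRY_RUN=1 to print the command instead of running it.
clone_branch_only() {
  set -- git clone --bare --single-branch --branch "$2" "$1" "$(mirror_path "$1")"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Possible usage:
#   clone_branch_only https://github.com/annkissam/rummage_ecto.git v2.0
```

Running it with DRY_RUN=1 first is a cheap way to confirm the target paths match what the Elixir script would create.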

However, it is not that straightforward: turning nginx into a git-compatible HTTP server requires git-http-backend and fcgiwrap, which in turn need a custom nginx configuration. After some research, the Dockerfile and configuration I landed on are:

# Dockerfile.gitx
FROM nginx:1.21.3-alpine

# Make sure git-http-backend and fcgiwrap are installed
RUN apk add --update git git-daemon apache2-utils fcgiwrap

# Make sure the git repo root is accessible to nginx
RUN mkdir -p /srv/git && chown -R nginx:nginx /srv/git && chmod -R 755 /srv/git

# git-nginx.conf

server {
    # Setup HTTP endpoint
    listen       80;
    server_name  github.com;

    # Serve every repository under /srv/git
    root /srv/git;

    location / {
        try_files $uri $uri/ =404;
    }

    location ~ (/.*) {
        client_max_body_size 0;
        include       /etc/nginx/fastcgi_params;
        # Found under git-daemon package
        fastcgi_param SCRIPT_FILENAME     /usr/libexec/git-core/git-http-backend;

        # Serve every repository under /srv/git
        fastcgi_param GIT_PROJECT_ROOT    /srv/git;
        fastcgi_param GIT_HTTP_EXPORT_ALL "";

        fastcgi_param PATH_INFO           $1;

        # fcgiwrap socket used to run git-http-backend
        fastcgi_pass  unix:/var/run/fcgiwrap.sock-1;
    }
}

This basic configuration serves every git repository under /srv/git without authentication, so it is meant as a read-only server for cloning and fetching. The tricky part is setting up the fcgiwrap socket before nginx runs, so a custom entrypoint is needed here as well:

# git-entrypoint.sh
#!/bin/sh

echo "Creating fcgiwrap socket"
rm -f /var/run/fcgiwrap.sock-1
fcgiwrap -s unix:/var/run/fcgiwrap.sock-1 &

echo "Waiting on fcgiwrap socket"
sleep 1
chown nginx:nginx /var/run/fcgiwrap.sock-1
chmod 777 /var/run/fcgiwrap.sock-1

echo "Starting nginx"
nginx -g 'daemon off;'
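
The fixed sleep 1 assumes fcgiwrap creates its socket within a second. A slightly more defensive variant polls for the socket instead. This is a minimal sketch, and the helper name wait_for_path and the 10 second limit are assumptions of mine rather than part of the original entrypoint:

```shell
#!/bin/sh
# wait_for_path PATH MAX_SECONDS
# Hypothetical helper: polls once per second until PATH exists,
# and fails after MAX_SECONDS unsuccessful checks.
wait_for_path() {
  waited=0
  until [ -e "$1" ]; do
    if [ "$waited" -ge "$2" ]; then
      echo "Timed out waiting for $1" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}

# Possible usage in git-entrypoint.sh, replacing the fixed sleep:
#   wait_for_path /var/run/fcgiwrap.sock-1 10 || exit 1
```

Failing fast here makes a broken fcgiwrap start-up visible in the container logs instead of surfacing later as nginx gateway errors.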

I initially thought of using OpenRC, but it was less consistent at starting up; creating the fcgiwrap socket manually was easier, although slightly less efficient. Adding HTTPS support via a proxy is the last and easier step, which just requires a small nginx update:

# Original HTTP endpoint

server {
  listen 443 ssl;
  server_name github.com;

  # Generated with
  # openssl req -x509 -nodes -days 365 -subj "/C=CA/ST=QC/O=Company, Inc./CN=mydomain.com" -addext "subjectAltName=DNS:mydomain.com" -newkey rsa:2048 -keyout git-selfsigned.key -out git-selfsigned.crt
  ssl_certificate      /srv/git-selfsigned.crt;
  ssl_certificate_key  /srv/git-selfsigned.key;

  location / {
    proxy_pass http://127.0.0.1:80;
  }
}

Wiring it up in docker-compose.yml:

git_deps:
  build:
    context: ./.git_repo
    dockerfile: ../Dockerfile.gitx
  entrypoint: "sh /srv/git-entry.sh"
  # Expose ports for demonstration
  ports:
    - "22080:80"
    - "22443:443"
  volumes:
    - ./.git_repo:/srv/git
    - ./git-nginx.conf:/etc/nginx/conf.d/default.conf
    - ./git-entrypoint.sh:/srv/git-entry.sh
    - ./git-selfsigned.crt:/srv/git-selfsigned.crt
    - ./git-selfsigned.key:/srv/git-selfsigned.key
app:
  depends_on:
    # Make this app also wait on the git repo
    - git_deps
    - app_deps
  links:
    # Notice how git_deps is mapped as github.com
    # To fetch online, comment this line and GIT_SSL_NO_VERIFY
    - git_deps:github.com
    - app_deps:app_deps
  environment:
    # Also disable the certificate check so self-signed certificates are accepted when cloning
    GIT_SSL_NO_VERIFY: "true"

Because of the new links, https://github.com now points to the local git HTTPS server from git_deps. On its own this will not work, as git clone rejects self-signed certificates; the check has to be bypassed with the GIT_SSL_NO_VERIFY environment variable, much like HEX_NO_VERIFY_REPO_ORIGIN for hex. Testing it out with docker-compose up git_deps:

# Outside the container
# With HTTP
$ git clone http://localhost:22080/annkissam/rummage_ecto.git

# With HTTPS (the certificate is self-signed, so skip verification)
$ GIT_SSL_NO_VERIFY=true git clone https://localhost:22443/wistia/elixir_nsq.git


# Within the container
# docker-compose exec app ash
$ GIT_SSL_NO_VERIFY=true git clone https://github.com/wistia/elixir_nsq.git

While not the optimal solution, it does resolve my co-worker's issue, and the development environment is now completely offline aside from extra build hooks. Remember to store .git_repo and its accompanying files in git. Whenever a git dependency needs to be updated, delete its bare repository, clone it again and commit the result.