My trial and error with Dockerfile

This is a post on my trial and error in building a Docker image with a Dockerfile
software
Author

Hyoungchul Kim

Published

September 13, 2025

I had a chance to create a replication package for a coding assignment this fall. I decided to use the opportunity to write a Dockerfile that works on most platforms (e.g., macOS and Linux, on both ARM and AMD architectures). This post documents the trial and error that went into building a Docker image from that Dockerfile.

(R specific) Incompatible binary issue with certain packages

If you have used Linux, you are probably familiar with the installation issues that come with R packages: most of them take an awfully long time to install. The reason is simple. CRAN distributes pre-compiled binaries for Windows and macOS, but on Linux packages are installed from source. This is partly because there are so many different Linux distributions. Since Windows and macOS are popular operating systems (Ugh…), it is easier to maintain standard binaries for them, which is not the case for Linux. As a result, Linux users have to compile the source code of each package and build the binary themselves. Since most of the powerful packages rely on C, C++, or Fortran for efficiency, it is no surprise that installation takes a long time. Even worse, compilation sometimes fails because the necessary compilers or system libraries are missing, so you have to install all of those dependencies before compiling the package.

Fortunately, this issue has largely been solved by the Posit (formerly RStudio) Public Package Manager, a service that provides fast installation of binary R packages for Linux. Nowadays, most packages have a Linux binary counterpart, so you can significantly reduce installation time.

The problem, however, is that the package manager is not perfect. I am not exactly sure why, but it seems that a binary might not work on a newer release of a Linux distribution if it was built against different versions of the system libraries. This seems to be the case for packages like sf and stringi.

So how can we solve this? If you use the renv R package as your dependency manager, you can add the following instruction to your Dockerfile to override the repository recorded in the renv.lock file:

ENV RENV_CONFIG_REPOS_OVERRIDE=https://packagemanager.posit.co/cran/__linux__/noble/latest

This instruction overrides the repository so that packages are installed from a repository that matches the Linux distribution you are using. Here I set it to noble, the codename for Ubuntu 24.04 LTS. This goes nicely with the Rocker project and GitHub Actions because their newest images are based on Ubuntu 24.04 LTS.

But there is a caveat: the current version (1.1.5) of renv does not support this feature. This is a known issue (#2127). Fortunately, it has been resolved, but the fixed version is not yet released. For now, you need to use the development version of renv.¹ You can install it with the following instruction:

RUN R -e "install.packages('renv', repos = 'https://posit.r-universe.dev')"
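
The noble part of the override URL is tied to the distribution of the base image. As a minimal sketch, assuming an Ubuntu or Debian style /etc/os-release file that defines VERSION_CODENAME, the URL could be derived instead of hard-coded. The function name ppm_url and its optional file argument are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch: build the Posit Public Package Manager URL for the running distro.
# Assumption: the os-release file defines VERSION_CODENAME (true on Ubuntu).
# ppm_url is a hypothetical helper name, not part of renv or Docker.
ppm_url() {
  os_release="${1:-/etc/os-release}"
  # Source the file in a subshell so its variables do not leak into this shell
  codename="$(. "$os_release" 2>/dev/null && printf '%s' "${VERSION_CODENAME:-}")"
  if [ -z "$codename" ]; then
    echo "could not determine distribution codename" >&2
    return 1
  fi
  printf 'https://packagemanager.posit.co/cran/__linux__/%s/latest\n' "$codename"
}
```

Because a Dockerfile ENV instruction cannot evaluate shell commands, a helper like this would have to run inside a RUN step, for example by exporting RENV_CONFIG_REPOS_OVERRIDE from its output before restoring the renv library.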

Multi-platform issue

Going deeper into reproducibility, you need to be aware that there are two main CPU architectures: amd64 (x86_64) and aarch64 (arm64). The problem is that a Docker image built for one architecture will not run natively on the other. There are also cases where different architectures require different binaries to install certain software. That is, a binary built for amd64 will not work on aarch64, and vice versa.

Why should we care about this? Because both architectures are widely used. amd64 is the common architecture for many types of computers, and aarch64 is common too: if you are on an Apple Silicon Mac, you are using aarch64. So the big problem is that a base Docker image built only for amd64 may not work for people on Apple Silicon Macs.

Fortunately, the solution is simple: build both amd64 and aarch64 images. You can do this locally with the docker buildx command, or you can use this GitHub Actions YAML file:

name: build_docker

on:
  push:
    branches: [ master, main ]

jobs:
  docker:
    runs-on: ubuntu-latest
    env:
      IMAGE_NAME: r_4.5.1   # repo name on Docker Hub: DOCKERHUB_USERNAME/r_4.5.1
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      # Enable QEMU so we can cross-build arm64 on amd64 runners
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      # Set up Buildx builder
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to Docker Hub
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Build and push (multi-arch) base Dockerfile
        uses: docker/build-push-action@v6
        with:
          context: .
          file: ./Dockerfile
          platforms: linux/amd64,linux/arm64   # <-- key line
          push: true
          # tag strategy: latest + branch + branch-<full commit sha>
          tags: |
            ${{ secrets.DOCKERHUB_USERNAME }}/${{ env.IMAGE_NAME }}:latest
            ${{ secrets.DOCKERHUB_USERNAME }}/${{ env.IMAGE_NAME }}:${{ github.ref_name }}
            ${{ secrets.DOCKERHUB_USERNAME }}/${{ env.IMAGE_NAME }}:${{ github.ref_name }}-${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          pull: true
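
For the local route, a minimal sketch of the docker buildx invocation might look like the following. The image name, the builder name multiarch, and the DRY_RUN switch are assumptions made for illustration. On a Linux host you may also need QEMU emulation installed, which is what the QEMU step handles in the workflow above.

```shell
#!/bin/sh
# Sketch: build and push an amd64 + arm64 image from the local machine.
# Hypothetical names: the "multiarch" builder and the DRY_RUN switch.

# Print the command instead of running it when DRY_RUN=1
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

build_multiarch() {
  image="$1"   # e.g. <dockerhub-user>/r_4.5.1
  # Create a builder that can target several platforms, or reuse an existing one
  run docker buildx create --name multiarch --use \
    || run docker buildx use multiarch
  # Same platform list as the workflow; --push uploads the multi-arch image
  run docker buildx build \
    --platform linux/amd64,linux/arm64 \
    --tag "$image:latest" \
    --push .
}
```

Setting DRY_RUN=1 before calling build_multiarch with an image name prints the two docker commands without executing them, which is a cheap way to check the flags before starting a long emulated build.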

Another multi-platform issue: tinytex

I also wanted to install Quarto because I used it to render my assignment PDF. However, I ran into a problem when installing tinytex. Quarto needs a TeX distribution to render PDFs, and tinytex is a popular choice because it is portable and lightweight. Installing tinytex in the amd64 image was fine because all I needed was:

RUN quarto install tinytex

which is a command provided by Quarto. The issue arose when I tried to do the same thing for the aarch64 image. Apparently, the previous command runs a binary built for the amd64 architecture, so it cannot install tinytex in the aarch64 image. To solve this, I added this if statement to the Dockerfile, which installs tinytex manually in the aarch64 image:

# Quarto
ENV QUARTO_VERSION=1.7.32
RUN /rocker_scripts/install_quarto.sh

# --- TinyTeX install (arm64 manual, amd64 via Quarto) ---
RUN set -eux; \
  if [ "$ARCH_TYPE" = "arm64" ]; then \
    wget -qO- "https://yihui.org/tinytex/install-unx.sh" | sh -s - --admin --no-path; \
  else \
    quarto install tinytex; \
  fi

# Set TinyTeX path
ENV PATH="/root/.TinyTeX/bin/aarch64-linux:/root/.TinyTeX/bin/x86_64-linux:${PATH}"

# Set CTAN mirror for tlmgr
ENV TEXLIVE_REPOSITORY="https://ctan.math.illinois.edu/systems/texlive/tlnet"

# Set repo in tlmgr
RUN tlmgr option repository "$TEXLIVE_REPOSITORY"; \
  tlmgr update --self; \
  tlmgr update --all
# -------------------------------------------------------

Note that if you are only building an amd64 image, you don’t need to set the tinytex path. I set it because in the aarch64 image tinytex is installed manually, so the Quarto command does not set the path.
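
The if statement above depends on an ARCH_TYPE variable whose definition is not shown in this post. As an assumption about one way it could be populated, the sketch below maps the output of uname -m to the amd64 and arm64 names Docker uses. normalize_arch is a hypothetical helper. BuildKit also exposes a predefined TARGETARCH build argument (declared with ARG TARGETARCH) that carries the same values.

```shell
#!/bin/sh
# Sketch: translate `uname -m` machine names into Docker architecture names.
# normalize_arch is a hypothetical helper, not taken from the original Dockerfile.
normalize_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Detect the architecture of the machine (or container) running this script
ARCH_TYPE="$(normalize_arch "$(uname -m)")" || ARCH_TYPE=unknown
echo "ARCH_TYPE=$ARCH_TYPE"
```

Inside a RUN step, the same case statement could branch directly on the output of uname -m, which avoids having to pass a variable in at build time.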

tinytex CTAN mirror error

This was the most annoying issue I encountered. When tinytex encounters a TeX package that is not installed on the local system, it uses a CTAN mirror to install it. The problem is that the mirror it picks is very random… So if you don’t set the mirror, it might occasionally connect to a stale one and fail to install the necessary TeX packages. To solve this, you just need to manually set a mirror that seems to be “working.” This is what the TEXLIVE_REPOSITORY and tlmgr option repository lines in the Dockerfile snippet above do.
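
Since a fixed mirror can itself go down, one option is to probe a short list of candidates and use the first one that responds. The sketch below only checks reachability, not whether the mirror is up to date. The probe_url and first_working_mirror names are hypothetical, and the candidate list is left to the caller.

```shell
#!/bin/sh
# Sketch: pick the first CTAN mirror that answers an HTTP request.
# probe_url and first_working_mirror are hypothetical helper names.

# Return 0 if the URL responds successfully within 10 seconds
probe_url() {
  curl --fail --silent --head --location --max-time 10 "$1" >/dev/null 2>&1
}

# Print the first reachable mirror from the arguments; fail if none respond
first_working_mirror() {
  for mirror in "$@"; do
    if probe_url "$mirror/"; then
      echo "$mirror"
      return 0
    fi
  done
  echo "no reachable CTAN mirror found" >&2
  return 1
}
```

The output of first_working_mirror, called with the Illinois mirror and any fallbacks you trust, could then be passed to tlmgr option repository in place of the fixed TEXLIVE_REPOSITORY value.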

If you want to look at the full Dockerfile, you can find it here.

Footnotes

  1. Let’s hope the newer version gets released soon…