GitHub Action: Reclaim The Bytes

Remove unused software to reclaim disk space.

Author

Robrecht Cannoodt

Published

June 16, 2023

In the dynamic world of data science and software development, making efficient use of available resources is not just an option but an absolute necessity. In many of our projects (e.g. OpenPipelines, viash.io, OpenProblems), we build or use so many Docker images as part of our CI workflows that we quickly run out of disk space.

As part of our commitment to empowering developers, we at Data Intuitive are excited to announce the Version 1 release of our GitHub Action, “Reclaim The Bytes”. This GitHub Action frees up disk space on GitHub runners during the build process, helping you make the most of the available space.

Background and Inspiration

The development of “Reclaim The Bytes” was motivated by a desire to enhance the efficiency of workspace environments in GitHub. It was inspired by similar initiatives such as easimon’s maximize-build-space and ThewApp’s free-actions. The discussion thread here provided additional insights that shaped the development of this action.

The action works by executing rm -rf on specific folders, thereby removing unnecessary software and liberating valuable disk space. While the process is simple and quick, it is critical to understand that it might inadvertently delete dependencies required by your job, potentially disrupting your build. For instance, if your build job uses a .NET based tool and the required runtime is deleted, it may affect the smooth execution of your task. Therefore, it is crucial to ascertain which software may or may not be removed for your specific use case.
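To make the mechanism concrete, here is a minimal Python sketch of the idea: delete a directory tree and record how long it took and how much free space changed. It is not the action's own code (the action runs rm -rf directly), and the names remove_and_measure and demo_dir are hypothetical, chosen only for this illustration. The demo operates on a throwaway temporary directory so that nothing real is removed.

```python
import os
import shutil
import tempfile
import time


def remove_and_measure(path):
    """Delete a directory tree; return (seconds elapsed, change in free bytes)."""
    # Measure free space on the filesystem that contains the directory.
    parent = os.path.dirname(os.path.abspath(path))
    free_before = shutil.disk_usage(parent).free
    start = time.perf_counter()
    # Rough Python counterpart of `rm -rf path`; errors are ignored.
    shutil.rmtree(path, ignore_errors=True)
    elapsed = time.perf_counter() - start
    free_after = shutil.disk_usage(parent).free
    return elapsed, free_after - free_before


# Hypothetical demo: create a throwaway directory holding 8 MiB of dummy data.
demo_dir = tempfile.mkdtemp(prefix="reclaim-demo-")
with open(os.path.join(demo_dir, "blob.bin"), "wb") as handle:
    handle.write(b"\0" * (8 * 1024 * 1024))

seconds, freed = remove_and_measure(demo_dir)
print(f"removed {demo_dir} in {seconds:.3f}s, free space changed by {freed / 1e6:.1f} MB")
```

The free-space figure is only approximate, since other processes may write to the same filesystem while the removal runs. On a real runner you would point such a check at a folder you are sure your job does not need.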

Usage

Below is an example usage in a GitHub workflow:

name: My build action requiring more space
on: push

jobs:
  build:
    name: Build my artifact
    runs-on: ubuntu-latest
    steps:
      - name: Reclaim the bytes
        uses: data-intuitive/reclaim-the-bytes@v1
        with:
          remove-hosted-tool-cache: true
          remove-go: false
          remove-codeql: false
          remove-powershell: false
          remove-android-sdk: true
          remove-haskell-ghc: true
          remove-swift: true
          remove-dotnet: true
          remove-docker-images: true
          remove-swap: true

      - name: Checkout
        uses: actions/checkout@v3

      - name: Report free space
        run: |
          echo "Free space:"
          df -h

Measurements

Understanding the balance between the time taken to remove a piece of software and the amount of space freed as a result is essential for optimizing your usage of “Reclaim The Bytes”. We have provided a visualization to help you comprehend this balance better. Below is a summary table that provides data on the software removed, the operating system, the duration in seconds for the removal process, and the amount of space freed in GB.

Software           OS            Duration (s)  Space freed (GB)
android-sdk        ubuntu-20.04  26.8          12
android-sdk        ubuntu-22.04  37.0          13
codeql             ubuntu-20.04  1.0           6
codeql             ubuntu-22.04  1.0           6
docker-images      ubuntu-20.04  18.4          5
docker-images      ubuntu-22.04  17.0          4
dotnet             ubuntu-20.04  5.6           3
dotnet             ubuntu-22.04  2.2           2
go                 ubuntu-20.04  2.2           1
go                 ubuntu-22.04  1.2           2
haskell-ghc        ubuntu-20.04  2.8           5
haskell-ghc        ubuntu-22.04  3.4           5
hosted-tool-cache  ubuntu-20.04  5.0           10
hosted-tool-cache  ubuntu-22.04  3.4           9
powershell         ubuntu-20.04  0.8           1
powershell         ubuntu-22.04  0.6           2
python             ubuntu-20.04  0.0           0
python             ubuntu-22.04  0.4           0
swap               ubuntu-20.04  0.0           0
swap               ubuntu-22.04  0.0           0
swift              ubuntu-20.04  0.6           2
swift              ubuntu-22.04  0.4           2

Additionally, a scatterplot visualizes the same information, providing an intuitive understanding of the trade-off between removal duration and space freed. This way, you can make informed decisions about which software to keep or remove based on your unique requirements and constraints.
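One way to read the trade-off is to rank each removal by space freed per second of removal time. The sketch below does this for the ubuntu-22.04 rows of the table above. The helper gb_per_second is a hypothetical name introduced for this example, and because the table reports space in whole GB, the ratios are rough guides rather than precise figures.

```python
# (software, duration in seconds, space freed in GB) for ubuntu-22.04,
# copied from the table above.
measurements = [
    ("android-sdk", 37.0, 13),
    ("codeql", 1.0, 6),
    ("docker-images", 17.0, 4),
    ("dotnet", 2.2, 2),
    ("go", 1.2, 2),
    ("haskell-ghc", 3.4, 5),
    ("hosted-tool-cache", 3.4, 9),
    ("powershell", 0.6, 2),
    ("python", 0.4, 0),
    ("swap", 0.0, 0),
    ("swift", 0.4, 2),
]


def gb_per_second(duration_s, freed_gb):
    """Space freed per second of removal; 0.0 if nothing was freed or no time was measured."""
    if freed_gb == 0 or duration_s == 0:
        return 0.0
    return freed_gb / duration_s


# Highest payoff per second first.
ranked = sorted(
    measurements,
    key=lambda row: gb_per_second(row[1], row[2]),
    reverse=True,
)
for software, duration_s, freed_gb in ranked:
    print(f"{software:<18} {gb_per_second(duration_s, freed_gb):5.2f} GB/s")
```

By this measure, codeql and swift give the most space back per second, while android-sdk and docker-images free a lot of space but take the longest to remove. The ranking says nothing about whether your build depends on a given tool, so it complements the dependency check rather than replacing it.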

In summary, “Reclaim The Bytes” provides an efficient method for software developers to optimize their GitHub actions and maximize disk space usage. However, it is essential to understand its operation to ensure the successful execution of your build jobs. With this understanding, we hope you will leverage “Reclaim The Bytes” to improve your workflow and make your software development journey smoother and more efficient. Enjoy coding!
