Neil Macy

Cloning Only Part of Your Git Repo

The iOS project that I work on is a few years old and has over 14,000 commits in its commit history. That's a lot of history. The step in our Bitrise workflow that clones the repo takes around 2-3 minutes to clone all of it. But I recently discovered that it doesn't need to. Git supports something called "shallow clones", which let us clone only the part of the repo that we actually need.

How Bitrise Lets Us Use Shallow Clones

By setting the clone_depth in the git-clone step (see the Bitrise docs here), you can tell Bitrise to only fetch a certain number of the most recent commits, instead of the whole history.

For example, we have a workflow that runs our UI tests nightly. It only needs the current commit to run those tests; it needs no history. By setting the clone_depth to 1, we can avoid cloning the unnecessary history and save around 2 minutes on average.

I recently added a load of scripts to help us automate our release process. These scripts do various things on the latest commit on our main branch, and we run them with Bitrise workflows. The longest step by far in most of these workflows is git-clone, since they usually don't actually build the app. As they only need the current commit, they can also be sped up quite considerably by not cloning the whole repo, cutting them down from over 4 minutes on average to under 2 minutes.

Shallow and Partial Clones

This isn't just a Bitrise thing though; it's part of git itself. The clone_depth property tells the git-clone Bitrise step to use a "shallow clone". This means that if you give a depth of 1, it only downloads the latest commit and leaves out all of the history before it.
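
To see the effect outside of Bitrise, here's a minimal sketch using plain git's --depth option, which I'm assuming is the closest command-line equivalent of clone_depth. The throwaway repo, its five empty commits and the directory names are all made up for the demo; it isn't our real project.

```shell
#!/bin/sh
set -eu

# Throwaway workspace so the demo touches nothing real.
work="$(mktemp -d)"

# Hypothetical stand-in for a real project: a repo with five empty commits.
git init -q "$work/origin"
for i in 1 2 3 4 5; do
  git -C "$work/origin" -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q --allow-empty -m "commit $i"
done

# Full clone: downloads the entire history.
git clone -q "file://$work/origin" "$work/full"

# Shallow clone: only the latest commit.
# The file:// URL matters here, because git ignores --depth for plain local paths.
git clone -q --depth 1 "file://$work/origin" "$work/shallow"

# Count the commits reachable from HEAD in each clone.
full_count="$(git -C "$work/full" rev-list --count HEAD)"
shallow_count="$(git -C "$work/shallow" rev-list --count HEAD)"

# Ask git whether the clone has truncated history.
is_shallow="$(git -C "$work/shallow" rev-parse --is-shallow-repository)"

echo "full clone commits:    $full_count"
echo "shallow clone commits: $shallow_count"
echo "shallow repository:    $is_shallow"
```

On a repo this small the time difference is nothing, but the commit counts show what a depth of 1 leaves out.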

Here's an interesting read from Derrick Stolee at GitHub, comparing blobless, treeless and shallow clones: Get up to speed with partial clone and shallow clone. (Don't worry, he explains what those three types of clones mean really well.)
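
For a rough idea of how those three look on the command line, here's a sketch using git's --filter and --depth options on a made-up local repo. The repo, file name and directory names are hypothetical, and I've used --no-checkout so that nothing gets fetched on demand during the demo. It's an illustration of the flags, not a benchmark.

```shell
#!/bin/sh
set -eu

# Throwaway workspace so the demo touches nothing real.
work="$(mktemp -d)"

# Hypothetical source repo: three commits, each changing one file.
git init -q "$work/origin"
for i in 1 2 3; do
  echo "version $i" > "$work/origin/file.txt"
  git -C "$work/origin" add file.txt
  git -C "$work/origin" -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "commit $i"
done

# The serving side has to opt in to filtering, or --filter is ignored.
git -C "$work/origin" config uploadpack.allowFilter true

# Blobless clone: all commits and trees, file contents fetched only when needed.
git clone -q --no-checkout --filter=blob:none "file://$work/origin" "$work/blobless"

# Treeless clone: all commits, trees and file contents fetched only when needed.
git clone -q --no-checkout --filter=tree:0 "file://$work/origin" "$work/treeless"

# Shallow clone: history truncated to the latest commit.
git clone -q --no-checkout --depth 1 "file://$work/origin" "$work/shallow"

# The partial clones keep the full commit history; the shallow clone does not.
blobless_count="$(git -C "$work/blobless" rev-list --count HEAD)"
treeless_count="$(git -C "$work/treeless" rev-list --count HEAD)"
shallow_count="$(git -C "$work/shallow" rev-list --count HEAD)"

echo "blobless clone commits: $blobless_count"
echo "treeless clone commits: $treeless_count"
echo "shallow clone commits:  $shallow_count"
```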

And if you want to RTFM, here's the git documentation for shallow clones and partial clones.

Published on 7 April 2022