Vagrant’s file provisioner is slow but don’t let it get in your way

Originally posted on 2022-06-28

TL;DR

If startup time for vagrant up is critical, don’t use Vagrant’s file provisioner to copy large numbers of files. Instead, mount the directory as a read-only shared directory and copy the files with cp.

The problem

Copying a large number of files to a virtual machine with Vagrant’s file provisioner is very slow. I did some digging and collected some data on when and why this happens, as well as ways to avoid it.

The size of the data in bytes was not a contributing factor in my case. That doesn’t rule out very large files being a problem too; it only means that my data was small (38 MB) and still took a long time.

The number of files being copied appears to be what causes the performance issue: 1,000 files took ~45 seconds and 5,000 files took ~3m10s. That scales slightly better than linearly, which is good, but it’s still far too slow. My guess is that this happens because Vagrant loops through the list of files in interpreted Ruby code instead of handing the copy to a native utility like cp.
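To make the scaling concrete, here is a small Ruby sketch that turns the two timings above into a per-file cost and compares the 5k run against a straight linear extrapolation of the 1k run. The variable names are mine, and the only inputs are the two measurements quoted above (3m10s is taken as 190 seconds).

```ruby
# Timings reported above: file count => seconds spent copying.
timings = { 1_000 => 45.0, 5_000 => 190.0 } # 3m10s = 190 s

# Average cost per file in milliseconds for each run.
per_file_ms = timings.map { |files, secs| [files, (secs / files * 1000).round(1)] }.to_h
per_file_ms.each { |files, ms| puts "#{files} files: #{ms} ms per file" }

# What the 5k run would have cost if it scaled linearly from the 1k run.
linear_5k = timings[1_000] * 5
puts "linear estimate for 5000 files: #{linear_5k} s, measured: #{timings[5_000]} s"
```

That works out to roughly 45 ms per file in the small run and 38 ms per file in the large one, so the per-file overhead dominates no matter how little data each file holds.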

Fix #1 - Move just one ZIP file instead of thousands of small files

I repackaged my files as a ZIP file. Compressing the data took 0.1s. Copying the file to the VM took less than 1 second. Decompressing the file inside the VM took less than 1 second.
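A minimal Vagrantfile sketch of this approach might look like the following. It assumes the archive has already been built on the host as files.zip next to the Vagrantfile, and that the guest image ships with unzip. The box name, archive name, and guest paths are placeholders of mine, not the values from my setup.

```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/focal64" # assumption: any Linux box with unzip installed

  # Upload ONE archive instead of thousands of small files.
  # files.zip is a hypothetical name; it must exist before `vagrant up`.
  config.vm.provision "file", source: "files.zip", destination: "/tmp/files.zip"

  # Unpack inside the guest as the regular SSH user so the files are not owned by root.
  config.vm.provision "shell", privileged: false,
    inline: "unzip -q -o /tmp/files.zip -d /home/vagrant/files"
end
```

Provisioners run in the order they are declared, so the archive is in place before the shell step tries to unpack it.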

In my particular environment I couldn’t assume that the host had a command-line ZIP utility, so I had to try something else.

Fix #2 - Read-only shared directory and native Linux cp

I shared the git directory as a read-only directory (to be extra safe) and then copied it inside the guest using cp -R. Sure enough, it took less than 1 second.
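Here is a rough Vagrantfile sketch of that setup. It assumes a provider whose shared folders accept mount options (VirtualBox does), and the box name and guest paths are placeholders rather than the ones I actually used.

```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/focal64" # assumption: any Linux box works here

  # Share the directory containing the Vagrantfile read-only, so nothing
  # in the guest can modify the host's copy.
  config.vm.synced_folder ".", "/mnt/host-src", mount_options: ["ro"]

  # Copy with the guest's native cp instead of the file provisioner.
  # /home/vagrant/project is a hypothetical destination.
  config.vm.provision "shell", privileged: false, inline: <<-SHELL
    mkdir -p /home/vagrant/project
    cp -R /mnt/host-src/. /home/vagrant/project/
  SHELL
end
```

The trailing /. on the source path makes cp copy the contents of the shared directory, including dotfiles, rather than nesting the directory itself under the destination.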