How I Use Git LFS to Manage Large Git Files?

Introduction

Git is a powerful version control system that allows us to track changes to any file and easily revert to any version. However, when it comes to storing large files, Git struggles because Git is not designed for storing large files (images, videos, music, etc.).

Imagine a 1MB image that has been modified 10 times; Git would store 10 versions of the image, which amounts to 10 MB in the repository! (Although this simplified explanation is not entirely accurate, as Git stores the differences between two versions rather than a complete copy of each version, the challenge of compressing differences between binary files remains.) This can lead to performance issues, wasted storage space, and excessive network resource consumption. Git Large File Storage is an extension of Git specifically designed to address these issues.

How Does Git LFS Solve the Problem?

Git LFS solves the problem by storing the contents of large files on an external server instead of directly in the Git repository. By replacing the actual large files with pointers to the large files, the size of the Git repository does not increase, regardless of how large the files are or how many times they are modified.

In addition to improved storage, syncing large files to the local machine becomes easier, and compared to downloading the entire history of a Git repository, using LFS allows us to request only the currently necessary large files to save resources.

How to Use Git LFS?

1. Download Git LFS

First, you need to install Git LFS (obviously 😐)

Terminal window
git lfs install

2. Configure Files to Use Git LFS

In the Git repository where you want to use Git LFS, choose the file types you want Git LFS to manage (or edit .gitattributes directly). You can configure new file extensions that need LFS intervention at any time. The following setting informs LFS to handle all files with the .png extension.

Terminal window
git lfs track "*.png"

After the configuration, ensure that the changes to .gitattributes are committed to the Git repository.

3. Commit Files

Terminal window
git add foobar.png
git commit -m "Add foobar.png"
git push origin main

If you, like me, use GitHub to host your code, you can simply commit the files as usual, and GitHub will handle the rest for you 🤯. When previewing LFS files on GitHub, there will also be annotations, and the experience is completely consistent with using Git without LFS.

GitHub LFS

If you want to list the files currently managed by LFS, you can run the following command:

Terminal window
git lfs ls-files

You will get a detailed list of LFS-managed files as follows:

071ba8a6fd * fonts/NotoSansTC-Bold.ttf
6b137e2eb5 * fonts/NotoSansTC-Regular.ttf
ba2e34dddb * public/images/brand/default-og.jpg
0866a98641 * public/images/brand/favicon/apple-touch-icon.png
c58814f7fa * public/images/brand/favicon/icon-192

Experience After Using Git LFS

Currently, all large files on the website are hosted through GitHub LFS, significantly simplifying my workflow and improving efficiency.

In the early days of building this site, it required setting up an additional CMS server or using third-party asset hosting services like Cloudinary to bring images into articles. But now, as long as the images are placed in the Git repository, they can be used directly. This is an elegant and streamlined process for a website like WebDong, which is solely managed by developers (GitHub acts like my Git-Based CMS).

Even if your website is not managed by developers, you can still use Git LFS to manage large files, such as some Open Graph images, font files… Having all resources managed in one place is also very convenient.

Things to Note When Using Git LFS

Git LFS Service Is Not Unlimited and Free

You can check bill for Git Large File Storage🔗 and Viewing your Git Large File Storage usage🔗 or the pricing details of LFS providers to learn more.

Taking the entire Web Dong blog as an example, there are a total of 251 LFS files, including various videos, images, and fonts, which have only used 43.2 MB of space. This is, of course, because I have been mindful and compressed these files in advance, but I must say that the monthly 1 GB free quota is already sufficient for many scenarios.

GitHub LFS Usage Quota
Web Dong Dong Blog Git LFS Storage Usage Quota

CI/CD Process Considerations

GitHub LFS Bandwidth Usage Quota
Web Dong Dong Blog Git LFS Bandwidth Usage Quota

Since Git LFS stores large files on additional servers, extra configuration is needed in the CI/CD process to download these LFS files. This point has been practiced in my blog🔗, and there is also additional caching setup to avoid fetching the same LFS files repeatedly🔗.

Conclusion

I wrote this article because the team previously did not have the habit of adopting Git LFS, and some large web assets were still stored directly in the Git repository. With architecture updates and the introduction of a Monorepo structure, it is expected that project size will increase sharply, prompting me to hope to introduce Git LFS early on to prevent the issue of the Git repository exploding.

I believe this technology is particularly suitable for use in certain fields, such as front-end or game development, where interaction with a large number of binary files is required.

Further Reading