Git Worktree and Git LFS


  • Description: Multiple working trees from one repo (git worktree), large-file storage with pointer indirection (git lfs), and rewriting history to purge committed large files (git filter-repo, BFG, legacy filter-branch).
  • My Notion Note ID: K2B-2-4
  • Created: 2026-01-15
  • Updated: 2026-05-19
  • License: Reuse is very welcome. Please credit Yu Zhang and link back to the original on yuzhang.io

1. Overview

Three orthogonal topics tied together by one theme — working with a single repo at scale beyond the basics in Pro Git Ch 1–3.

  • git worktree — check out multiple branches into separate directories from one .git/. Lets you context-switch without stashing.
  • git lfs — keep large binaries out of the main object database. Repo stores pointers; bytes live on an LFS server.
  • Large-file purge — when binaries already landed in history and bloated the repo, rewrite history with git filter-repo / BFG and force-push.

The three together cover what Pro Git Ch 7 (Git Tools) and Ch 10 (Internals) gesture at for advanced repo management.


2. Git Worktree

2.1 Concept

  • Multiple linked working trees sharing one object database (.git/).
  • Main worktree = where .git/ lives. Linked worktrees = extra checkouts elsewhere.
  • Each linked worktree has its own HEAD, index, and checked-out branch — but shared refs, objects, and config.
  • Cheap — no extra clone. New worktree = new directory + ~few KB metadata.

2.2 Subcommands

Command                                Purpose
git worktree add <path> [<branch>]     Create a new linked worktree. If <branch> is omitted, uses a branch named $(basename <path>), creating it off HEAD if it does not exist.
git worktree list                      Show all worktrees (path, HEAD, branch, status flags).
git worktree remove <path>             Delete a worktree (must be clean unless --force).
git worktree prune                     Drop stale metadata for worktrees whose directories were deleted manually.
git worktree move <path> <new-path>    Relocate a worktree.
git worktree lock <path>               Prevent prune/move/remove (e.g. worktree on a removable disk).
git worktree unlock <path>             Reverse lock.
git worktree repair [<path>...]        Fix .git pointer files after manual moves.

2.3 Key flags for add

  • git worktree add -b <new-branch> <path> [<start>] — create new branch + worktree in one step.
  • git worktree add --detach <path> — detached HEAD (throwaway / read-only checkout).
  • git worktree add --no-checkout <path> — skip checkout (set up sparse-checkout first).
  • git worktree add --orphan <path> — unborn branch (no history yet). Requires Git 2.42+ (2023-08).
  • git worktree add --lock --reason "msg" <path> — create already-locked.

2.4 Storage layout

  • .git/worktrees/<name>/ — per-worktree state (HEAD, index, logs, ORIG_HEAD).
  • In the linked worktree root: a .git file (not directory) pointing to gitdir: /path/to/main/.git/worktrees/<name>.
  • $GIT_DIR resolves per-worktree; $GIT_COMMON_DIR resolves to the shared .git/ (refs, objects).
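
The layout above can be inspected directly. This is a sketch in a scratch repo; the directory names and the branch name feature are assumptions made for the example.

```shell
set -e
demo=$(mktemp -d)                          # hypothetical scratch location
git init -q "$demo/repo"
cd "$demo/repo"
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial"

# Linked worktree on a new branch; its metadata name defaults to the path's basename.
git worktree add -q -b feature ../repo-feature

cd ../repo-feature
cat .git                          # a one-line file starting with "gitdir: "
git rev-parse --git-dir           # per-worktree state under .git/worktrees/repo-feature
git rev-parse --git-common-dir    # the shared .git/ of the main worktree
```

Inside the linked worktree, --git-dir and --git-common-dir differ; in the main worktree both resolve to the same .git/ directory.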

2.5 Restrictions

  • Same branch can't be checked out in two worktrees simultaneously (override with --force, but the two checkouts then race on the same branch ref).
  • Main worktree can't be removed with worktree remove — delete the whole repo instead.
  • Submodules in linked worktrees have incomplete support.
  • Bare repos have no main worktree (git clone --bare), so any checkout requires a linked worktree.
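
Two of these restrictions are easy to reproduce. The sketch below uses a scratch repo with a hypothetical branch named topic and hypothetical paths (wt-a, wt-b, bare.git, from-bare); the exact wording of Git's refusal message varies by version, so it is suppressed rather than quoted.

```shell
set -e
demo=$(mktemp -d)                          # hypothetical scratch location
git init -q "$demo/repo"
cd "$demo/repo"
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial"
git branch topic

git worktree add -q ../wt-a topic          # first checkout of "topic" succeeds
if git worktree add -q ../wt-b topic 2>/dev/null; then
    second_checkout="allowed"
else
    second_checkout="refused"              # "topic" is already in use by wt-a
fi
echo "second checkout of topic: $second_checkout"

# A bare clone has no working tree of its own; files only appear in linked worktrees.
git clone -q --bare . "$demo/bare.git"
git -C "$demo/bare.git" worktree add -q "$demo/from-bare" topic
git -C "$demo/bare.git" worktree list
```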

2.6 Common workflows

Hotfix without disturbing in-progress work

# In main worktree, mid-refactor
git worktree add ../repo-hotfix -b hotfix/login-crash origin/main
cd ../repo-hotfix
# fix, commit, push
cd -
git worktree remove ../repo-hotfix
git branch -D hotfix/login-crash   # force-delete; only once the fix is merged and pushed

Build multiple branches simultaneously (CI-like locally)

git worktree add ../repo-v1 v1.x
git worktree add ../repo-v2 v2.x
# Run builds in parallel; no stash needed

Throwaway exploration (detached)

git worktree add --detach ../repo-test HEAD~10
# Poke around at an old commit; remove when done

Cleanup after deleting worktrees manually

rm -rf ../repo-hotfix          # didn't use `worktree remove`
git worktree list              # shows it as prunable
git worktree prune
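
Because prune prints nothing by default, a preview helps. A sketch of the same cleanup in a scratch repo (the paths and the branch name hotfix are hypothetical), using the --dry-run and --verbose flags of git worktree prune:

```shell
set -e
demo=$(mktemp -d)                          # hypothetical scratch location
git init -q "$demo/repo"
cd "$demo/repo"
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial"

git worktree add -q -b hotfix ../repo-hotfix
rm -rf ../repo-hotfix                      # manual delete leaves metadata behind

git worktree prune --dry-run --verbose     # preview: reports the stale entry, changes nothing
git worktree prune --verbose               # drop the stale entry
git worktree list                          # only the main worktree remains
```

Pruning removes only the worktree metadata; the hotfix branch itself is left in place.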

3. Git LFS

3.1 Why LFS

  • Git stores every version of every blob in .git/objects/. Binary files (PSD, video, models, datasets) compress poorly and change wholesale on each edit → repo size explodes.
  • Clones get slow, GC takes hours, hosting providers reject pushes over their pack-size limits.
  • LFS replaces large blobs with text pointers in commits. Real content lives on an LFS server (GitHub, GitLab, self-hosted).
  • On checkout, LFS downloads the actual blob via the OID (SHA-256 of the actual content) stored in the pointer. Transparent to most Git operations.

3.2 Pointer format

A committed .psd under LFS is actually a ~130-byte text file:

version https://git-lfs.github.com/spec/v1
oid sha256:4cac19622fc3ada9c0fdeadb33f88f367b541f038adf30f50e6a30ac8b1ec46e
size 12345

Git tracks this small pointer; the LFS client uploads/downloads the real bytes to/from the LFS endpoint at push/checkout time.
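
To make the indirection concrete, a pointer can be rebuilt by hand from a file's SHA-256 and byte count. This is a sketch, not how the LFS client is implemented; it assumes sha256sum from GNU coreutils is available (on macOS, shasum -a 256 is the usual substitute), and asset.bin is a hypothetical stand-in for a large binary.

```shell
set -e
demo=$(mktemp -d)                          # hypothetical scratch location
printf 'hello' > "$demo/asset.bin"         # tiny stand-in for a large binary

oid=$(sha256sum "$demo/asset.bin" | cut -d ' ' -f 1)   # SHA-256 of the real content
size=$(wc -c < "$demo/asset.bin" | tr -d ' ')          # size in bytes

pointer=$(printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s' "$oid" "$size")
printf '%s\n' "$pointer"
```

For the five-byte file above this prints an oid of 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 and size 5. With the LFS client installed, git lfs pointer --file=<path> should produce the same three lines.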

3.3 Install + initialize

# Install the binary (once per machine)
sudo apt install git-lfs              # Debian/Ubuntu
brew install git-lfs                  # macOS
# or download from https://git-lfs.com

# Wire LFS hooks into the user's global git config (once per user)
git lfs install

git lfs install writes the clean/smudge filters to the user-global git config (~/.gitconfig) and installs pre-push / post-checkout / post-commit / post-merge hooks into the current repo (hooks are always per-repo). --local only changes filter scope — writes filters to the repo's local config instead of global; hook installation is unaffected.

3.4 Track patterns

git lfs track "*.psd"
git lfs track "assets/videos/**"

This writes patterns to .gitattributes:

*.psd filter=lfs diff=lfs merge=lfs -text
assets/videos/** filter=lfs diff=lfs merge=lfs -text

  • Commit .gitattributes so collaborators get the same filter rules.
  • Tracking is not retroactive: files already committed as regular blobs stay regular. Use git lfs migrate to convert them (§ 3.7).
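
Whether a path falls under an LFS rule can be checked with plain git check-attr, even before the file exists. In this sketch the .gitattributes file is written by hand (the same lines git lfs track would write) so that it runs without the LFS client; the file names being queried are hypothetical.

```shell
set -e
demo=$(mktemp -d)                          # hypothetical scratch location
git init -q "$demo/repo"
cd "$demo/repo"

# Same rules as above, written directly instead of via `git lfs track`.
cat > .gitattributes <<'EOF'
*.psd filter=lfs diff=lfs merge=lfs -text
assets/videos/** filter=lfs diff=lfs merge=lfs -text
EOF

# Ask Git which filter applies to each path.
git check-attr filter -- hero.psd assets/videos/intro.mp4 README.md
```

The first two paths report "filter: lfs"; README.md reports "filter: unspecified", meaning it would be committed as a regular blob.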

3.5 Daily workflow

Once git lfs install has run and .gitattributes is committed, day-to-day usage is the same as plain Git:

git add big-design.psd
git commit -m "Add hero design"
git push                          # LFS client uploads blob during push

Clone:

git clone https://github.com/user/repo.git
# LFS files pulled automatically by smudge filter on checkout

If you cloned without LFS installed, run git lfs install then git lfs pull to fetch missing blobs.

3.6 Useful commands

git lfs status                # which LFS files are staged/modified
git lfs ls-files              # all LFS-tracked files in current ref
git lfs track                 # list current track patterns
git lfs untrack "*.psd"       # remove pattern from .gitattributes
git lfs fetch --all           # download all LFS objects for all refs
git lfs pull                  # fetch + checkout LFS content for current ref
git lfs prune                 # delete local LFS objects not in current/recent refs and already pushed
git lfs env                   # diagnostics: endpoint, hooks, version

3.7 Migrating existing files into LFS

For files already committed as normal blobs:

git lfs migrate import --include="*.psd" --include-ref=refs/heads/main
# Rewrites history — old commits get new SHAs
git push --force-with-lease origin main

  • --no-rewrite variant adds a new commit converting current files, with no history rewrite (safer, but doesn't shrink history). It takes an explicit file list (not --include patterns) and ignores --include-ref; the files must already be covered by LFS rules in .gitattributes.
  • Coordinate with the team: a rewrite changes the SHA of every affected commit (see § 4.6).

3.8 Hosting limits (rough)

  • GitHub: included quota 10 GiB storage + 10 GiB/month bandwidth on Free/Pro; 250 GiB + 250 GiB on Team/Enterprise Cloud. Pre-paid data packs retired — overages bill via metered usage. Per-file hard limit: 2 GB Free/Pro, 4 GB Team, 5 GB Enterprise Cloud.
  • GitLab: tied to project storage quota; depends on tier.
  • Self-hosted: any LFS-spec-compatible server (Gitea, LFS Test Server, Artifactory).

4. Removing Large Files from History

When binaries already landed in the repo (and possibly in many commits), tracking them via LFS afterward doesn't shrink the existing history. The history itself has to be rewritten.

4.1 When you need it

  • Committed a >100 MB file → GitHub rejects the push (GitHub-specific hard limit).
  • Committed credentials, large binaries, or a vendored node_modules.
  • Repo size grew to gigabytes; clones take ages.
  • After cleanup, every reflog/dangling reference must also be expired; otherwise GC keeps the blob alive.
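
Before rewriting anything, it helps to see which blobs actually account for the size. A sketch using only core Git plumbing, run against a scratch repo with hypothetical files (notes.txt, big.bin); the awk step assumes paths without spaces.

```shell
set -e
demo=$(mktemp -d)                          # hypothetical scratch location
git init -q "$demo/repo"
cd "$demo/repo"
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'small\n' > notes.txt
head -c 200000 /dev/zero > big.bin         # ~200 KB stand-in for a large binary
git add .
git commit -q -m "add files"
git rm -q big.bin
git commit -q -m "remove big.bin"          # gone from HEAD, still in history

# List every blob reachable from any ref as "<bytes> <path>", largest first.
largest=$(git rev-list --objects --all \
  | git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' \
  | awk '$1 == "blob" { print $2, $3 }' \
  | sort -rn \
  | head -n 5)
printf '%s\n' "$largest"
```

big.bin tops the list even though it no longer exists in the working tree, which is exactly the case that needs a history rewrite.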

4.2 git filter-repo

Modern replacement for filter-branch. Single Python script. Orders of magnitude faster; safer error handling.

Install:

pipx install git-filter-repo          # recommended (avoids PEP 668 errors on modern Python)
brew install git-filter-repo          # macOS
# fallback: drop the single git-filter-repo script onto $PATH

Remove a file from all history:

git filter-repo --path path/to/big.bin --invert-paths

Remove all blobs larger than 10 MB:

git filter-repo --strip-blobs-bigger-than 10M

Remove a directory:

git filter-repo --path secrets/ --invert-paths

filter-repo refuses to run unless it detects a fresh clone (pass --force to override) and removes the origin remote post-rewrite — design choice to make sure you re-add the correct URL and force-push consciously.

4.3 BFG Repo-Cleaner

Java-based alternative. Faster than filter-branch (~10–720x), less flexible than filter-repo but simpler CLI.

# Remove all files > 100M
java -jar bfg.jar --strip-blobs-bigger-than 100M repo.git

# Remove specific file name everywhere
java -jar bfg.jar --delete-files secret.key repo.git

# Replace passwords in all text
java -jar bfg.jar --replace-text passwords.txt repo.git

After BFG, run git reflog expire --expire=now --all && git gc --prune=now --aggressive.

4.4 git filter-branch (legacy — still seen in old docs)

Officially deprecated. Slow and full of footguns. Mentioned only because older guides reference it:

git filter-branch --force --index-filter \
  'git rm --cached --ignore-unmatch path/to/big.bin' \
  --prune-empty --tag-name-filter cat -- --all

Followed by force-push + reflog expire + gc. Prefer filter-repo for anything new.

4.5 Post-cleanup

Steps vary by tool:

  • filter-repo — already runs repack + prune automatically. Does not create refs/original/. Just force-push.
  • BFG — needs git reflog expire --expire=now --all && git gc --prune=now --aggressive (BFG docs recommend --aggressive). No refs/original/ either.
  • filter-branch (legacy) — also leaves refs/original/ backup refs; delete those first, then expire reflog + gc.

Universal sequence covering the legacy case:

# Force-push every branch and tag
git push origin --force --all
git push origin --force --tags

# (filter-branch only) drop the backup refs it leaves behind
git for-each-ref --format='delete %(refname)' refs/original \
  | git update-ref --stdin

# Expire reflog so old blobs become unreachable
git reflog expire --expire=now --all

# GC actually deletes the blobs (plain gc; --aggressive only when BFG docs ask)
git gc --prune=now

# Sanity check size shrunk
git count-objects -vH
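
Why the reflog step has to come before gc can be seen in a scratch repo. This sketch does not run any rewrite tool; it simulates an orphaned commit by deleting a hypothetical side branch named scratch, then checks whether the blob survives each gc.

```shell
set -e
demo=$(mktemp -d)                          # hypothetical scratch location
git init -q "$demo/repo"
cd "$demo/repo"
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial"

# Commit a large file on a side branch, then delete the branch.
git checkout -q -b scratch
head -c 200000 /dev/urandom > big.bin
git add big.bin
git commit -q -m "add big.bin"
blob=$(git rev-parse HEAD:big.bin)
git checkout -q -
git branch -q -D scratch

# No branch points at the commit now, but HEAD's reflog still does.
git gc -q --prune=now
if git cat-file -e "$blob" 2>/dev/null; then before="present"; else before="gone"; fi

# Expire the reflog, then gc again: the blob is finally unreachable.
git reflog expire --expire=now --all
git gc -q --prune=now
if git cat-file -e "$blob" 2>/dev/null; then after="present"; else after="gone"; fi

echo "before reflog expiry: $before, after: $after"
```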

4.6 Coordinating with collaborators

History rewrite changes commit SHAs → every existing clone now diverges from the rewritten remote.

  • Notify everyone before force-push.
  • Easiest recovery for collaborators: delete their clone and re-clone fresh.
  • If they have unmerged local work, rebase it onto the new history (git rebase --onto <new-base> <old-base> <local-branch>).
  • Open PRs/MRs need to be closed and re-opened against the new history.
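
The rebase --onto recovery can be rehearsed locally. The sketch below fakes a rewrite instead of running a real tool: old-main stands for the pre-rewrite history, rewritten for the force-pushed one (built here as a single orphan commit without big.bin), and feature for a collaborator's unmerged work. All branch and file names are hypothetical.

```shell
set -e
demo=$(mktemp -d)                          # hypothetical scratch location
git init -q "$demo/repo"
cd "$demo/repo"
git config user.name "Demo"
git config user.email "demo@example.com"

# Old history: A (app.txt + big.bin) -> B
printf 'v1\n' > app.txt
head -c 100000 /dev/zero > big.bin
git add .
git commit -q -m "A: app + big.bin"
printf 'v2\n' > app.txt
git commit -q -am "B: update app"
git branch -m old-main

# Collaborator's local work on top of the old history.
git checkout -q -b feature
printf 'work\n' > feature.txt
git add feature.txt
git commit -q -m "C: local work"

# Stand-in for the rewritten remote history: same content, no big.bin.
git checkout -q --orphan rewritten old-main
git rm -q --cached big.bin
rm big.bin
git commit -q -m "A+B rewritten without big.bin"

# Replay only the commits after old-main onto the new base.
git rebase -q --onto rewritten old-main feature
git log --oneline feature
```

In a real recovery, <new-base> would typically be the freshly fetched origin/main and <old-base> the commit the local branch originally forked from.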

5. Pitfalls

  • Worktree branch lock-out. Switching to a branch that another worktree already has checked out fails ("already checked out at ..."). Either work on it in that worktree, remove that worktree first, or check out the same commit with a detached HEAD.
  • git worktree prune is silent. By default it does not report which stale entries it dropped. Run git worktree list first, or preview with git worktree prune --dry-run --verbose.
  • LFS without git lfs install on a clone. Working tree gets pointer files (~130-byte text) instead of real assets. Build tooling reading "image" files silently breaks. Always run git lfs install once per user on every machine.
  • LFS migration during active work. git lfs migrate import rewrites history. Same coordination pain as § 4.6.
  • Filter-repo wipes origin. Designed behaviour. Re-add the remote after rewriting and double-check the URL before force-pushing.
  • GC doesn't shrink immediately. Blobs survive until reflog expires + pack repack. The full cleanup sequence in § 4.5 is mandatory; otherwise du -sh .git/ shows no change.
  • .gitattributes not committed. LFS-track on machine A but A's .gitattributes not pushed → machine B commits the file as a regular blob, breaking the LFS contract retroactively. Always commit .gitattributes in the same commit as the first LFS-tracked file.
  • Force-push to shared branches without coordination leaves everyone's local clones diverged from the remote, in the same way as a bad rebase (golden rule, Pro Git 3.6). Worse for history rewrites because every commit SHA changed.

6. References