Delete Sensitive Data From Git - Blog - Luminousmen

Caution: use the material below with care. None of the tools described here should be run by blind copy-pasting; treat the commands as illustrations to adapt to your own repository rather than ready-made scripts. You have been warned. (If you do have any funny or scary stories, please share them in the comments.)

One day, while working on a Git-based project, you might accidentally save personal data such as logins, passwords, or SSH keys directly to the repository. Yes, this sounds like a plot for a horror movie, but unfortunately these situations happen. Sure, you can try deleting the file with git rm, but as you probably already know, the file will still remain in git history.
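To see why git rm alone is not enough, here is a minimal sketch that runs entirely in a throwaway repository. The scratch directory, the fake password, and the commit messages are all made up for illustration.

```shell
# Throwaway demo: `git rm` removes a file from the tip, not from history.
set -eu

demo=$(mktemp -d)                      # hypothetical scratch directory
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit a fake secret, then "delete" it with git rm.
echo "db_password=hunter2" > .secret
git add .secret
git commit -q -m "Add config"
git rm -q .secret
git commit -q -m "Remove secret"

# The file is gone from the working tree...
[ ! -e .secret ] && echo "working tree: .secret is gone"

# ...but the previous commit still serves its content to anyone who asks.
leaked=$(git show HEAD~1:.secret)
echo "history still has: $leaked"
```

Anyone with a clone can run the same git show against the old commit, which is why the history itself has to be rewritten.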

As soon as private data enters the repository, it should be considered compromised. This means that you need to change passwords or keys immediately. Remember: there is no way to determine who could have seen or downloaded the file and when.

Fortunately, there are tools that allow you to completely remove a file from your Git repository history. In this article, we will look at two such tools: git-filter-branch and BFG Repo-Cleaner.

git-filter-branch

The git-filter-branch utility is part of Git and does not require additional installation.

Suppose that we accidentally saved the .secret file in the repository, which stores the password for the database. The project is in active development and a test database is being used, but out of habit, the password is the same as the one you use to log in to the system or anywhere else. Sound familiar? I hope not.

After making sure we have the latest version and that there are no local changes, we can delete the file using git-filter-branch:

git filter-branch --force --index-filter \
  'git rm --cached --ignore-unmatch .secret' \
  --prune-empty --tag-name-filter cat -- --all

The result will be something like this:

Rewrite d402133542d0f0f1578e916b8a350842cc955870 (1327/1333) (216 seconds passed, remaining 0 predicted)
rm '.environment'
Ref 'refs/heads/master' was rewritten
Ref 'refs/remotes/origin/master' was rewritten
...
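Before touching a real project, it may help to rehearse the rewrite on a disposable repository and check the result. The sketch below assumes a made-up two-commit history; FILTER_BRANCH_SQUELCH_WARNING is set only to skip the deprecation notice that newer Git versions print.

```shell
# Rehearsal: rewrite a throwaway repository and verify the file left the history.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the notice (and its delay) in newer Git

repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "db_password=hunter2" > .secret
echo "hello" > README
git add .secret README
git commit -q -m "Initial commit with a secret"
echo "more" >> README
git commit -q -am "Update README"

# Number of commits reachable from HEAD that touch .secret, before the rewrite.
before=$(git rev-list --count HEAD -- .secret)

git filter-branch --force --index-filter \
  'git rm --cached --ignore-unmatch .secret' \
  --prune-empty --tag-name-filter cat -- --all >/dev/null 2>&1

# Same count after the rewrite.
after=$(git rev-list --count HEAD -- .secret)
echo "commits touching .secret: before=$before after=$after"

# filter-branch keeps a backup of the old refs under refs/original/,
# so the old objects are still present in this local repository.
git for-each-ref --format='%(refname)' refs/original
```

The check uses HEAD rather than --all on purpose: --all would also walk the refs/original backup, where the old commits are still reachable.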

This command removes the .secret file from every commit in the repository. To remove a directory instead, add the -r (recursive) switch to git rm:

git filter-branch --force --index-filter \
  'git rm -r --cached --ignore-unmatch folder_to_remove/' \
  --prune-empty --tag-name-filter cat -- --all

If there are other files that need to be removed, run this command for each of them.
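As an alternative to repeating the command, git rm accepts several paths at once, so one pass over the history can cover them all. A minimal sketch, assuming a hypothetical keys/ directory next to .secret; -r is the flag git rm needs in order to descend into a directory.

```shell
# One rewrite pass that drops a file and a directory together.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the notice in newer Git

repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir keys
echo "db_password=hunter2" > .secret
echo "fake-private-key" > keys/id_demo   # made-up key file
echo "hello" > README
git add .secret keys README
git commit -q -m "Commit with several sensitive paths"

# --ignore-unmatch keeps git rm from failing on commits that lack a path.
git filter-branch --force --index-filter \
  'git rm -r --cached --ignore-unmatch .secret keys/' \
  --prune-empty --tag-name-filter cat -- --all >/dev/null 2>&1

# List what is left in the rewritten tip.
remaining=$(git ls-tree -r --name-only HEAD)
echo "$remaining"
```

Since every filter-branch run walks the whole history, batching paths like this should also save time on large repositories.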

Now we need to publish the rewritten history to the remote, which unfortunately requires a force push:

git push origin --force --all
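The effect of the force push can be tried safely against a local bare repository standing in for the real remote. Everything in this sketch (paths, file contents, the remote itself) is a stand-in. Note that --all covers branches only, so rewritten tags would need a separate git push origin --force --tags.

```shell
# Force-push a rewritten history to a local bare "remote".
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the notice in newer Git

work=$(mktemp -d)                        # hypothetical scratch directory
git init -q --bare "$work/remote.git"    # stand-in for the real remote
git init -q "$work/local"
cd "$work/local"
git config user.email "demo@example.com"
git config user.name "Demo"
git remote add origin "$work/remote.git"

echo "db_password=hunter2" > .secret
echo "hello" > README
git add .secret README
git commit -q -m "Initial commit with a secret"
git push -q origin --all

branch=$(git symbolic-ref --short HEAD)  # master or main, depending on Git config
old_tip=$(git rev-parse HEAD)

git filter-branch --force --index-filter \
  'git rm --cached --ignore-unmatch .secret' \
  --prune-empty --tag-name-filter cat -- --all >/dev/null 2>&1
new_tip=$(git rev-parse HEAD)

# The rewritten commits do not descend from the old ones, so a plain push
# would be rejected as non-fast-forward; --force overwrites the remote branch.
git push -q origin --force --all

remote_tip=$(git --git-dir="$work/remote.git" rev-parse "refs/heads/$branch")
echo "old: $old_tip"
echo "new: $new_tip"
echo "remote now at: $remote_tip"
```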

After the rewritten history has been pushed, everyone else who works with this repository needs to rebase their local work onto it, or delete their local repository and clone it again. The latter is better, since there is less chance of shooting yourself in the foot.

Note: to avoid a repeat of this nasty situation, add the file to .gitignore.
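A quick way to confirm that the ignore rule works is git check-ignore, which exits with status 0 when a path is ignored. A small sketch in a scratch repository (the file contents are made up):

```shell
# Ignore .secret and confirm that Git no longer offers to track it.
set -eu

repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q .

echo ".secret" >> .gitignore
echo "db_password=hunter2" > .secret

# check-ignore -q prints nothing and reports the answer via its exit status.
if git check-ignore -q .secret; then ignored=yes; else ignored=no; fi
echo "ignored: $ignored"

# Only .gitignore shows up as untracked; .secret is hidden from status.
status=$(git status --porcelain)
echo "$status"
```

Keep in mind that .gitignore only affects untracked files. A file that is already committed stays tracked until it is removed from the index.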

BFG Repo-Cleaner

BFG Repo-Cleaner is a simpler and faster alternative to git-filter-branch for removing unwanted files from a Git repository.

On macOS it can be installed with Homebrew (you can also build it yourself):

brew install bfg

For example, to remove the same .secret file as in the git-filter-branch example above using BFG:

bfg --delete-files .secret

Example output:

Found 775 objects to protect
Found 161 commit-pointing refs : ...
Protected commits
-----------------
These are your protected commits, and so their contents will NOT be altered:
 * commit 89943765 (protected by 'HEAD')
Cleaning
--------
Found 1726 commits
Cleaning commits:       100% (1726/1726)
Cleaning commits completed in 471 ms.
BFG aborting: No refs to update - no dirty commits found??
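A fuller BFG session usually works on a fresh mirror clone and finishes with a cleanup of the old objects. The sketch below follows that pattern under a few assumptions: the source repository is a local scratch one, the secret has already been deleted from the tip (BFG protects the HEAD commit by default, as the "Protected commits" section of the output shows), and the bfg step is skipped when the tool is not installed.

```shell
# BFG on a mirror clone, followed by cleanup of the unreachable objects.
set -eu

work=$(mktemp -d)                        # hypothetical scratch directory
git init -q "$work/src"                  # stand-in for the real repository
cd "$work/src"
git config user.email "demo@example.com"
git config user.name "Demo"
echo "db_password=hunter2" > .secret
echo "hello" > README
git add .secret README
git commit -q -m "Add config"
git rm -q .secret
git commit -q -m "Remove secret from the tip"

# Work on a bare mirror so that every ref is rewritten.
cd "$work"
git clone -q --mirror "$work/src" repo.git

if command -v bfg >/dev/null 2>&1; then
  bfg --delete-files .secret repo.git
  cd repo.git
  # Drop reflog entries and garbage-collect so the old blobs are really gone.
  git reflog expire --expire=now --all
  git gc --prune=now --aggressive
  # A final `git push` from here would publish the result to the mirror's origin.
else
  echo "bfg not installed; skipping the rewrite step"
fi

is_bare=$(git --git-dir="$work/repo.git" rev-parse --is-bare-repository)
echo "mirror clone is bare: $is_bare"
```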

In addition, BFG can search for passwords or other sensitive strings and replace them with ***REMOVED***. To do this, create a protected.txt file with one search pattern per line (literal strings or regular expressions) and run:

bfg --replace-text protected.txt

This command scans every file in every commit, and any text that matches an entry in protected.txt is replaced with ***REMOVED***.
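For illustration, a protected.txt might look like the one written below. The three entries are invented; the syntax follows the BFG usage notes as far as I know them: a plain line is matched literally, a regex: prefix marks a regular expression, and ==> introduces a custom replacement in place of the default.

```shell
# Write a sample protected.txt for `bfg --replace-text`.
set -eu

cd "$(mktemp -d)"                        # hypothetical scratch directory

# Line 1: literal string, replaced with the default ***REMOVED***
# Line 2: literal string with a custom replacement after ==>
# Line 3: regular expression (regex: prefix) with a custom replacement
cat > protected.txt <<'EOF'
hunter2
old-api-token==>REDACTED_TOKEN
regex:password=\w+==>password=
EOF

rules=$(grep -c '' protected.txt)
echo "wrote $rules rules to protected.txt"

# Then, against the repository to clean:
#   bfg --replace-text protected.txt
```

The file itself contains the secrets it is meant to scrub, so it should live outside the repository and be deleted afterwards.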

If you're worried that deleting files will take hours, don't be. BFG is faster than git-filter-branch in most cases: for medium-sized repositories with tens of thousands of commits, it usually finishes in seconds or minutes. See the BFG vs git-filter-branch speed comparison.

