
Git commands that could break/rewrite the history

Can you provide a list of (all, or the most common of) the operations or commands that can compromise the history in git?

What should be absolutely avoided?

  1. Amending a commit after it has been pushed (git commit / git push / git commit --amend)
  2. Rebasing something that has already been pushed
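
As a minimal sketch of the first item, the script below amends a commit after pushing it and shows why that is a problem. The temporary directory layout, the branch name demo, and the identity used are all made up for illustration.

```shell
set -eu

# Throwaway sandbox: a bare "remote" and a local clone-like repository.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.name "Demo"
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

# Commit and push.
echo one > file.txt
git add file.txt
git commit -q -m "first version"
git push -q origin HEAD:refs/heads/demo
pushed=$(git rev-parse HEAD)

# Amending does not edit the pushed commit; it creates a new commit object.
git commit -q --amend -m "first version (reworded)"
amended=$(git rev-parse HEAD)
echo "pushed:  $pushed"
echo "amended: $amended"

# A plain push is now refused as non-fast-forward. Only a forced push would
# replace what other contributors may already have fetched.
if git push -q origin HEAD:refs/heads/demo 2>/dev/null; then
  push_result=accepted
else
  push_result=rejected
fi
echo "plain push after amend: $push_result"
```

The two hashes differ, which is what "rewriting history" means here: anyone who already fetched the pushed commit now holds a commit that the amended branch no longer contains.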

I would like this question (if it has not already been asked somewhere else) to become some kind of reference on the common operations to avoid in git.

Moreover, I use git reset a lot, but I am not completely aware of the possible damage I could do to the repository (or to other contributors' copies). Can git reset be dangerous?

Kamafeather asked Sep 08 '14 08:09



1 Answer

Note that, starting with Git 2.24 (Q4 2019), the list above might no longer need to include git filter-branch.

git filter-branch is being deprecated (BFG too)

See commit 483e861, commit 9df53c5, commit 7b6ad97 (04 Sep 2019) by Elijah Newren (newren).
(Merged by Junio C Hamano -- gitster -- in commit 91243b0, 30 Sep 2019)

Recommend git-filter-repo instead of git-filter-branch

filter-branch suffers from a deluge of disguised dangers that disfigure history rewrites (i.e. deviate from the deliberate changes).

Many of these problems are unobtrusive and can easily go undiscovered until the new repository is in use.
This can result in problems ranging from an even messier history than what led folks to filter-branch in the first place, to data loss or corruption. These issues cannot be backward compatibly fixed, so add a warning to both filter-branch and its manpage recommending that another tool (such as filter-repo) be used instead.

Also, update other manpages that referenced filter-branch.
Several of these needed updates even if we could continue recommending filter-branch, either due to implying that something was unique to filter-branch when it applied more generally to all history rewriting tools (e.g. BFG, reposurgeon, fast-import, filter-repo), or because something about filter-branch was used as an example despite other more commonly known examples now existing.
Reword these sections to fix these issues and to avoid recommending filter-branch.

Finally, remove the section explaining BFG Repo Cleaner as an alternative to filter-branch.
I feel somewhat bad about this, especially since I feel like I learned so much from BFG that I put to good use in filter-repo (which is much more than I can say for filter-branch), but keeping that section presented a few problems:

  • In order to recommend that people quit using filter-branch, we need to provide them a recommendation for something else to use that can handle all the same types of rewrites.
    To my knowledge, filter-repo is the only such tool. So it needs to be mentioned.
  • I don't want to give conflicting recommendations to users
  • If we recommend two tools, we shouldn't expect users to learn both and pick which one to use; we should explain which problems one can solve that the other can't or when one is much faster than the other.
  • BFG and filter-repo have similar performance
  • All filtering types that BFG can do, filter-repo can also do.
    In fact, filter-repo comes with a reimplementation of BFG named bfg-ish which provides the same user-interface as BFG but with several bugfixes and new features that are hard to implement in BFG due to its technical underpinnings.

While I could still mention both tools, it seems like I would need to provide some kind of comparison and I would ultimately just say that filter-repo can do everything BFG can, so ultimately it seems that it is just better to remove that section altogether.


the operations or commands that can compromise the history in git?

At least with newren/git-filter-repo, you can recover from any history compromised by its usage.

Amongst its stated goals:

More intelligent safety

Writing copies of the original refs to a special namespace within the repo does not provide a user-friendly recovery mechanism. Many would struggle to recover using that.

Almost everyone I've ever seen do a repository filtering operation has done so with a fresh clone, because wiping out the clone in case of error is a vastly easier recovery mechanism.
Strongly encourage that workflow by detecting and bailing if we're not in a fresh clone, unless the user overrides with --force.
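
A minimal sketch of that fresh-clone workflow follows. The repository contents and the keep/ and drop/ directory names are hypothetical, and the filtering step is skipped when git-filter-repo is not installed. The --no-local option is used because a plain clone of a local path may not be recognised as a fresh clone.

```shell
set -eu

# Hypothetical source repository with two directories.
work=$(mktemp -d)
git init -q "$work/origin"
cd "$work/origin"
git config user.name "Demo"
git config user.email "demo@example.com"
mkdir keep drop
echo a > keep/a.txt
echo b > drop/b.txt
git add .
git commit -q -m "add both directories"

# Always filter a fresh clone, never the repository you work in.
git clone -q --no-local "$work/origin" "$work/fresh"
cd "$work/fresh"

if git filter-repo --version >/dev/null 2>&1; then
  # Keep only the history that touches keep/.
  git filter-repo --path keep/
  filtered=yes
else
  echo "git-filter-repo is not installed; skipping the rewrite"
  filtered=no
fi
files=$(git ls-files)
echo "files after filtering: $files"

# Recovery from a bad rewrite is simply: rm -rf "$work/fresh" and clone again.
```

Because the rewrite only ever touches the clone, the original repository in origin/ stays intact whatever happens.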


As mentioned in its documentation, git filter-repo roughly works by running:

git fast-export <options> | filter | git fast-import <options>
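
To make the shape of that pipeline concrete, here is a toy version in which sed stands in for the filter (git filter-repo itself parses the stream properly; this sed one-liner is only an illustration and would not be safe on arbitrary repositories). The repository names and e-mail addresses are made up.

```shell
set -eu

# Hypothetical source repository with one commit by "old@example.com".
work=$(mktemp -d)
git init -q "$work/src"
cd "$work/src"
git config user.name "Old Name"
git config user.email "old@example.com"
echo hello > readme.txt
git add readme.txt
git commit -q -m "initial commit"

# Empty destination repository that receives the rewritten history.
git init -q "$work/dst"

# export | filter | import: rewrite the e-mail on author/committer lines only.
git fast-export --all \
  | sed -e '/^author /s/<old@example\.com>/<new@example.com>/' \
        -e '/^committer /s/<old@example\.com>/<new@example.com>/' \
  | git -C "$work/dst" fast-import --quiet

new_email=$(git -C "$work/dst" log --all -1 --format=%ae)
echo "author e-mail in rewritten history: $new_email"
```

Only the header lines are touched because blob and message payloads in the stream are length-prefixed; changing their size with a text filter would corrupt the import.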

And git fast-export / git fast-import received some improvements with Git 2.24 (Q4 2019).

See commit 941790d, commit 8d7d33c, commit a1638cf, commit 208d692, commit b8f50e5, commit f73b2ab, commit 3164e6b (03 Oct 2019), and commit af2abd8 (25 Sep 2019) by Elijah Newren (newren).
(Merged by Junio C Hamano -- gitster -- in commit 16d9d71, 15 Oct 2019)

For example:

fast-import: allow tags to be identified by mark labels

Signed-off-by: Elijah Newren

Mark identifiers are used in fast-export and fast-import to provide a label to refer to earlier content.

Blobs are given labels because they need to be referenced in the commits where they first appear with a given filename, and commits are given labels because they can be the parents of other commits.

Tags were never given labels, probably because they were viewed as unnecessary, but that presents two problems:

  1. It leaves us without a way of referring to previous tags if we want to create a tag of a tag (or higher nestings).
  2. It leaves us with no way of recording that a tag has already been imported when using --export-marks and --import-marks.

Fix these problems by allowing an optional mark label for tags.
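
As a rough sketch of what this enables, assuming Git 2.24 or later, the hand-written stream below gives a tag the mark :3 and then creates a tag of that tag. The ref names, identity, timestamp, and the emit_data helper are invented for the example.

```shell
set -eu

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

# Helper (hypothetical): print a fast-import "data" section for an ASCII payload.
emit_data() {
  printf 'data %d\n%s\n' "${#1}" "$1"
}

{
  printf 'blob\nmark :1\n'
  emit_data 'hello'
  printf 'commit refs/heads/demo\nmark :2\n'
  printf 'committer Demo <demo@example.com> 1570000000 +0000\n'
  emit_data 'initial commit'
  printf 'M 100644 :1 hello.txt\n\n'
  # The tag gets its own mark label, placed right after the tag name.
  printf 'tag v1\nmark :3\nfrom :2\n'
  printf 'tagger Demo <demo@example.com> 1570000000 +0000\n'
  emit_data 'first tag'
  # A tag of a tag, referring to the earlier tag through its mark.
  printf 'tag v1-outer\nfrom :3\n'
  printf 'tagger Demo <demo@example.com> 1570000000 +0000\n'
  emit_data 'tag of a tag'
} | git fast-import --quiet --export-marks="$work/marks.txt"

inner=$(git rev-parse v1)
outer_target=$(git cat-file -p v1-outer | sed -n 's/^object //p')
echo "v1 tag object:      $inner"
echo "v1-outer points to: $outer_target"
cat "$work/marks.txt"
```

The exported marks file also lists :3, which is what lets a later run with --import-marks know that the tag has already been imported.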

VonC answered Sep 18 '22 05:09