Commits and History
A commit is Git's fundamental unit of work - a permanent snapshot of your entire project at a specific point in time. Understanding what commits contain, how to write good commit messages, and how to navigate history with git log and git diff are essential skills you'll use every day.
Anatomy of a Commit
Every commit in Git stores four pieces of information:
- Tree - a reference to a tree object that captures the state of every file and directory at the moment of the commit
- Parent(s) - a reference to the commit(s) that came immediately before. A root commit has no parent. A merge commit has two (or more) parents.
- Author - who originally wrote the change (name, email, timestamp)
- Committer - who applied the change to the repository (name, email, timestamp)
The author and committer are usually the same person. They differ when someone applies a patch written by another developer, or during cherry-picks and rebases.
A commit also has a message - the human-readable description of what changed and why.
Each commit is identified by a SHA-1 hash - a 40-character hexadecimal string computed from all of the above. Change any part of a commit (the files, the message, the parent, the author) and the hash changes. This makes Git's history tamper-evident: you cannot alter past commits without changing every subsequent hash in the chain.
tree 4b825dc642cb6eb9a060e54bf899d15f7e8c9c2f
parent a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0
author Jane Developer <jane@example.com> 1700000000 -0500
committer Jane Developer <jane@example.com> 1700000000 -0500

Add user authentication module
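To see these fields on a real commit, the sketch below builds a throwaway repository, prints the raw commit object, and then recomputes the object ID from that raw content. The temporary directory, file name, and identity are placeholders chosen for the example, and it assumes git is installed and on the PATH.

```shell
#!/bin/sh
set -eu

# Scratch repository (placeholder path and identity, not part of any real project)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Developer"
git config user.email "jane@example.com"

echo "hello" > README.md
git add README.md
git commit -q -m "Add README"

# Print the raw commit object: tree, author, committer, blank line, message.
# This is a root commit, so there is no parent line.
git cat-file -p HEAD

# Recompute the object ID from the raw content and compare it with HEAD
head_id=$(git rev-parse HEAD)
rehashed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)
echo "HEAD:     $head_id"
echo "rehashed: $rehashed"
```

If the two IDs match, the hash depends on nothing but the content of the commit object, which is why editing any field produces a different commit.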
Writing Good Commit Messages
Commit messages are documentation. Six months from now, when someone (including you) runs git log to understand why a change was made, the commit message is the primary source of context.
The Format
Short summary of the change (50 chars or less)

More detailed explanation if needed. Wrap at 72 characters.
Explain what changed and why, not how (the diff shows how).
Reference issue tracker IDs if applicable.
The Rules
Subject line:
- Keep it to 50 characters or fewer (hard limit: 72)
- Use imperative mood: "Add feature" not "Added feature" or "Adds feature"
- Capitalize the first word
- No period at the end
- Make it specific: "Fix null pointer in user lookup" not "Fix bug"
Body (optional but recommended for non-trivial changes):
- Separate from subject with a blank line
- Wrap at 72 characters
- Explain what changed and why, not how (the diff shows how)
- Reference issue numbers, related commits, or design decisions
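The mechanical rules above can be checked automatically. The following is a minimal sketch, not a complete linter: `check_commit_msg` is a hypothetical helper name, and it only covers subject length, the trailing period, capitalization, and the blank separator line. Imperative mood and specificity still need a human reviewer.

```shell
#!/bin/sh
# check_commit_msg FILE: hypothetical helper that prints one line per
# violated rule and returns non-zero if it found any.
check_commit_msg() {
    subject=$(sed -n '1p' "$1")
    second=$(sed -n '2p' "$1")
    problems=0

    if [ "${#subject}" -gt 72 ]; then
        echo "subject exceeds the 72-character hard limit"; problems=1
    elif [ "${#subject}" -gt 50 ]; then
        echo "subject is longer than 50 characters"; problems=1
    fi
    case $subject in
        *.) echo "subject ends with a period"; problems=1 ;;
    esac
    case $subject in
        [[:upper:]]*) ;;
        *) echo "subject does not start with a capital letter"; problems=1 ;;
    esac
    if [ -n "$second" ]; then
        echo "subject is not followed by a blank line"; problems=1
    fi
    return "$problems"
}

# Try it on one message that breaks the rules and one that follows them
msg=$(mktemp)
printf '%s\n' 'fixed bug.' 'It was broken.' > "$msg"
bad_report=$(check_commit_msg "$msg" || true)
printf '%s\n' 'Fix null pointer in user lookup' '' 'Guard against a missing user row.' > "$msg"
good_report=$(check_commit_msg "$msg" || true)
printf 'bad message:\n%s\n' "$bad_report"
printf 'good message: %s\n' "${good_report:-no problems found}"
rm -f "$msg"
```

Git passes the path of the message file to a commit-msg hook as its first argument, so a function like this could be called from such a hook if a team wants the rules enforced locally.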
Conventional Commits
Many teams use the Conventional Commits format, which adds structured prefixes:
feat: add email notification for failed builds
fix: correct timezone handling in scheduler
docs: update API authentication guide
refactor: extract validation into shared module
test: add integration tests for payment flow
chore: upgrade dependencies to latest patch versions
The prefix tells you at a glance what kind of change this is. Some tools use these prefixes to automatically generate changelogs and determine version bumps.
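As a rough illustration of how a tool might derive a version bump from the prefix, here is a simplified sketch. `bump_for_subject` is a hypothetical helper, the last subject in the loop is invented for the example, and the mapping (feat to minor, fix to patch, a `!` before the colon to major) follows the Conventional Commits specification. It ignores `BREAKING CHANGE` footers, which real tools also inspect.

```shell
#!/bin/sh
# bump_for_subject SUBJECT: hypothetical helper that maps a Conventional
# Commits subject line to the version bump it implies.
bump_for_subject() {
    case $1 in
        *:*) prefix=${1%%:*} ;;      # text before the first colon
        *) echo none; return 0 ;;    # no prefix at all
    esac
    case $prefix in
        [a-z]*!) echo major ;;              # "feat!" or "feat(api)!"
        feat | feat\(*\)) echo minor ;;     # new feature
        fix | fix\(*\)) echo patch ;;       # bug fix
        *) echo none ;;                     # docs, refactor, test, chore, ...
    esac
}

for subject in \
    'feat: add email notification for failed builds' \
    'fix: correct timezone handling in scheduler' \
    'docs: update API authentication guide' \
    'feat(api)!: drop the v1 endpoints'
do
    printf '%-6s %s\n' "$(bump_for_subject "$subject")" "$subject"
done
```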
Viewing History: git log
git log is your primary tool for exploring commit history. It's deeply configurable - you can filter, format, search, and graph the output in dozens of ways.
Basic Usage
# Full log (press q to exit, space to page)
git log
# Compact one-line format
git log --oneline
# Show the last 5 commits
git log -5
# Show graph with branch structure
git log --oneline --graph --all
# Show graph with decorations (branch/tag names)
git log --oneline --graph --all --decorate
Filtering by Date
# Commits from the last week
git log --since="1 week ago"
# Commits between two dates
git log --after="2024-01-01" --before="2024-02-01"
# Relative date display (--oneline hides dates, so use the default format)
git log --date=relative
Filtering by Author
# Commits by a specific author (partial match)
git log --author="Jane"
# Commits by multiple authors
git log --author="Jane\|Bob"
Searching Commit Messages
# Search commit messages for a string
git log --grep="authentication"
# Case-insensitive search
git log --grep="auth" -i
# Commits matching ALL grep patterns (not just any)
git log --grep="fix" --grep="login" --all-match
Searching Code Changes
# Find commits that added or removed the string "TODO" (pickaxe)
git log -S "TODO"
# Same search, limited to files that were modified (skips files added or deleted outright)
git log -S "TODO" --diff-filter=M
# Search with regex in code changes
git log -G "function\s+authenticate"
-S (the pickaxe) finds commits where the number of occurrences of a string changed. -G finds commits where a line matching a regex was added or removed. The pickaxe is faster for exact strings; -G handles patterns.
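The difference is easiest to see on a commit that edits a line containing the string without changing how often the string occurs. The sketch below sets up such a history in a scratch repository; the file name, commit messages, and identity are made up for the example, and it assumes git is available on the PATH.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Developer"
git config user.email "jane@example.com"

printf 'alpha\nbeta\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# The string appears for the first time: the count goes from 0 to 1
printf 'alpha\n# TODO: refactor\nbeta\n' > notes.txt
git commit -q -am "Add a TODO"

# The line is edited, but "TODO" still occurs exactly once
printf 'alpha\n# TODO: refactor the parser\nbeta\n' > notes.txt
git commit -q -am "Reword the TODO"

s_hits=$(git log --format=%s -S "TODO")
g_hits=$(git log --format=%s -G "TODO")

echo "-S finds:"; echo "$s_hits"
echo "-G finds:"; echo "$g_hits"
```

Here `-S` should report only the commit that introduced the string, while `-G` should also report the rewording commit, because that commit's diff contains removed and added lines matching the pattern.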
Formatting Output
# Custom format
git log --format="%h %an %ar %s"
# Format with colors
git log --format="%C(yellow)%h%C(reset) %C(blue)%an%C(reset) %C(green)%ar%C(reset) %s"
Common format placeholders:
| Placeholder | Output |
|---|---|
| %H | Full commit hash |
| %h | Abbreviated hash |
| %an | Author name |
| %ae | Author email |
| %ar | Author date, relative |
| %ai | Author date, ISO format |
| %s | Subject line |
| %b | Body |
| %d | Ref names (branches, tags) |
Following File History
# History of a specific file
git log -- path/to/file.py
# Follow renames (track file across renames)
git log --follow -- path/to/file.py
# Show the patch (actual diff) for each commit
git log -p -- path/to/file.py
# Show stats (files changed, insertions, deletions)
git log --stat
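To make the effect of `--follow` concrete, the sketch below renames a file in a scratch repository and compares the history with and without the flag. The file names, commit messages, and identity are invented for the example, and it assumes git is installed.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Developer"
git config user.email "jane@example.com"

printf 'def login():\n    pass\n' > auth.py
git add auth.py
git commit -q -m "Add auth module"

git mv auth.py authentication.py
git commit -q -m "Rename auth module"

# Without --follow, history stops at the commit that created the new path
without_follow=$(git log --format=%s -- authentication.py)
# With --follow, git continues through the rename to the old path
with_follow=$(git log --format=%s --follow -- authentication.py)

echo "without --follow:"; echo "$without_follow"
echo "with --follow:";    echo "$with_follow"
```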
Comparing Changes: git diff
While git log shows you what happened, git diff shows you exactly what changed. It compares content between any two of the three trees, or between any two commits.
The Three Comparisons
# Working directory vs staging area (unstaged changes)
git diff
# Staging area vs last commit (what will be committed)
git diff --staged # or --cached (identical)
# Working directory vs last commit (all uncommitted changes)
git diff HEAD
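The three comparisons above can be told apart with one staged and one unstaged change. This is a small sketch in a scratch repository; the file names and identity are placeholders, and it assumes git is on the PATH.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Developer"
git config user.email "jane@example.com"

printf 'one\n' > staged.txt
printf 'one\n' > unstaged.txt
git add .
git commit -q -m "Add two files"

# One change is staged, the other is left in the working directory
printf 'two\n' >> staged.txt
git add staged.txt
printf 'two\n' >> unstaged.txt

unstaged=$(git diff --name-only)           # working directory vs staging area
staged=$(git diff --staged --name-only)    # staging area vs last commit
both=$(git diff HEAD --name-only)          # working directory vs last commit

echo "git diff:          $unstaged"
echo "git diff --staged: $staged"
echo "git diff HEAD:     $(echo $both)"
```

Plain `git diff` should list only unstaged.txt, `--staged` only staged.txt, and `git diff HEAD` both files.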
Comparing Commits
# Difference between two commits
git diff a1b2c3d e4f5a6b
# Difference between current commit and two commits ago
git diff HEAD~2 HEAD
# Difference between two branches
git diff main feature/auth
# Only show which files changed (not the content)
git diff --name-only main feature/auth
# Show stats (like git log --stat)
git diff --stat HEAD~3 HEAD
Scoping Diffs to Files
# Diff for a specific file
git diff -- src/auth.py
# Diff for a directory
git diff -- src/
# Staged changes for a specific file
git diff --staged -- src/auth.py
Reading Diff Output
diff --git a/src/auth.py b/src/auth.py
index 4a39281..e4f5a6b 100644
--- a/src/auth.py
+++ b/src/auth.py
@@ -12,6 +12,8 @@ def authenticate(username, password):
     user = db.find_user(username)
     if not user:
         return None
-    if user.check_password(password):
+    if not user.is_active:
+        raise AccountDisabledError(username)
+    if user.check_password(password) and user.is_active:
         return create_session(user)
     return None
The header shows which file changed. Lines starting with - were removed, lines starting with + were added, and lines starting with a space are unchanged context. The @@ line locates the hunk: it starts at line 12 in both versions and covers 6 lines of the old file and 8 lines of the new one. The text after the second @@ names the enclosing function.
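The numbers in an @@ line can be pulled apart mechanically, which helps when reading or scripting against diffs. The sketch below is plain shell and does not call git; `parse_hunk_header` is a hypothetical helper and both sample headers are made up for illustration.

```shell
#!/bin/sh
# parse_hunk_header HEADER: hypothetical helper that splits "@@ -a,b +c,d @@"
# into old_start, old_count, new_start, and new_count.
parse_hunk_header() {
    # Keep only the two ranges, e.g. "40,5 40,6"
    ranges=$(printf '%s\n' "$1" |
        sed -n 's/^@@ -\([0-9,]*\) +\([0-9,]*\) @@.*$/\1 \2/p')
    old=${ranges% *}
    new=${ranges#* }
    old_start=${old%%,*}
    new_start=${new%%,*}
    # Git omits the count when it is exactly 1
    case $old in *,*) old_count=${old#*,} ;; *) old_count=1 ;; esac
    case $new in *,*) new_count=${new#*,} ;; *) new_count=1 ;; esac
}

parse_hunk_header '@@ -40,5 +40,6 @@ def logout(session):'
echo "old file: $old_count lines starting at line $old_start"
echo "new file: $new_count lines starting at line $new_start"
```

For this sample header the hunk covers 5 lines of the old file and 6 of the new one, so the change added one line overall.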
Inspecting a Single Commit: git show
git show displays the details of a specific commit - the message, author, date, and the full diff:
# Show the most recent commit
git show
# Show a specific commit
git show a1b2c3d
# Show only the stat (no diff)
git show --stat a1b2c3d
# Show a specific file as it was in a commit
git show a1b2c3d:src/auth.py
The commit:path syntax is useful for viewing a file at any point in history without checking it out.
Amending the Most Recent Commit
Made a typo in your commit message? Forgot to add a file? git commit --amend lets you modify the most recent commit:
# Fix the commit message
git commit --amend -m "Corrected commit message"
# Add a forgotten file to the last commit
git add forgotten-file.py
git commit --amend --no-edit # Keep the same message
--amend doesn't actually modify the old commit. It creates a new commit with a new hash and moves the branch pointer to it. The old commit becomes unreachable (but can still be found via the reflog for a while).
Only amend unpushed commits
If you've already pushed a commit to a shared branch, amending it rewrites history. Other developers who pulled the original commit will have a diverged history. Amend freely on local branches; use git revert on shared branches instead. The Rewriting History guide covers this in depth.
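The claim that --amend creates a new commit rather than editing the old one can be checked directly. The sketch below amends a commit in a scratch repository and then looks up the original through the reflog; the file name, messages, and identity are placeholders, and it assumes git is installed with reflogs enabled (the default for non-bare repositories).

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Developer"
git config user.email "jane@example.com"

echo "draft" > notes.txt
git add notes.txt
git commit -q -m "Add notse"          # typo on purpose
before=$(git rev-parse HEAD)

git commit -q --amend -m "Add notes"  # fix the message
after=$(git rev-parse HEAD)

# The reflog still remembers where HEAD pointed before the amend
previous=$(git rev-parse 'HEAD@{1}')

echo "before amend: $before"
echo "after amend:  $after"
echo "HEAD@{1}:     $previous"
```

The IDs before and after the amend differ, and `HEAD@{1}` still resolves to the original commit, which is what makes an accidental amend recoverable for a while.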
Commit References
Git provides several ways to reference commits without typing full SHA-1 hashes:
| Reference | Meaning |
|---|---|
| HEAD | The current commit |
| HEAD~1 or HEAD~ | One commit before HEAD (first parent) |
| HEAD~3 | Three commits before HEAD, following first parents |
| HEAD^ | First parent of HEAD (always the same as HEAD~1) |
| HEAD^2 | Second parent of HEAD (only exists for merge commits) |
| main | The commit that the main branch points to |
| v1.0 | The commit that the v1.0 tag points to |
| HEAD@{2} | Where HEAD was two moves ago (from the reflog) |
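The ~ operator follows first parents (the "main line"). The ^ operator selects among multiple parents (relevant for merge commits). HEAD~1 and HEAD^ always name the same commit, the first parent; the operators only diverge with larger numbers: HEAD~2 is the first parent's first parent, while HEAD^2 is the second parent of a merge.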
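A merge commit is the simplest place to watch the two operators behave differently. The following sketch builds a small history with one merge in a scratch repository; the branch names, file names, and identity are chosen for the example, and it assumes git is on the PATH.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch explicitly
git config user.name "Jane Developer"
git config user.email "jane@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base commit"

git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "Add feature file"

git checkout -q main
echo main > main.txt
git add main.txt
git commit -q -m "Add main file"

# --no-ff forces a real merge commit with two parents
git merge -q --no-ff -m "Merge feature" feature

first_parent=$(git rev-parse 'HEAD^1')
tilde_one=$(git rev-parse 'HEAD~1')
second_parent=$(git rev-parse 'HEAD^2')
feature_tip=$(git rev-parse feature)

git log -1 --format='HEAD^1 = %s' 'HEAD^1'
git log -1 --format='HEAD^2 = %s' 'HEAD^2'
git log -1 --format='HEAD~2 = %s' 'HEAD~2'
```

In this history `HEAD^1` and `HEAD~1` resolve to the same commit on main, `HEAD^2` is the tip of feature, and `HEAD~2` walks two first-parent steps back to the base commit.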
Further Reading
- Pro Git - Chapter 2.3: Viewing the Commit History - comprehensive coverage of git log options
- Official git-log documentation - complete reference for all flags and format placeholders
- Official git-diff documentation - complete reference for diff modes
- Conventional Commits - structured commit message specification
- A Note About Git Commit Messages (Tim Pope) - the classic post on commit message formatting