In this article, I discuss the GIT capabilities that we use most often in our day-to-day work.
Background
GIT was originally written by Linus Torvalds. It is a source control system and, in many ways, more than a source control system. The best GIT reference that I have found is this book, which I strongly recommend. In this note, however, I want to limit the scope to the GIT capabilities that we use most often in our day-to-day work.
Install the Latest GIT
The computer that I use for this note is a "Linux Mint 18.3 Cinnamon 64-bit" VM. The latest GIT is not available in the default Ubuntu repositories. In order to install it, I need to add the "Ubuntu Git Maintainers" PPA to the repository list.
sudo add-apt-repository ppa:git-core/ppa
sudo apt-get update
At the time this note was prepared, the latest GIT version was "2.18.0".
We can then install the latest GIT with the following command:
sudo apt-get install git
We can check the version of GIT in use with the following command:
git --version
GIT is available on all major operating systems, and the recent versions have been quite stable. You should not see much difference if you use a different version or another operating system.
The First GIT Repository & ".gitignore" & ".git"
Initiate a GIT Repository
Any directory can become a GIT repository, and it may optionally contain a file named ".gitignore". In my example, I added a minimal ".gitignore" that tells GIT not to ignore any files; the "!" prefix explicitly keeps the ".gitignore" file itself from being ignored.
# Do not ignore anything
!.gitignore
To make the directory into a GIT repository, we can simply issue the following command in the directory:
git init
The "git init" command turned the "git-test" directory into a GIT repository and added the ".git" directory to it.
Add Files and Commit to the Repository
After the repository is initiated, we can add some files to it:
touch A B C
In a GIT repository, we can use the "git status" command to check the status of the files.
GIT recognizes the new files, including the ".gitignore" file, and reports them as untracked. In order to commit the files into the repository, we first need to stage them.
For example, if we want to stage the file "A", we can issue the following command:
git add -- A
If we want to stage all the files, we can issue the following command:
git add .
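We can also use the "-A" option to stage all the changes; when it is run from the top directory of the repository, it is equivalent to the "." option. If we want to stage only the modified and deleted files that GIT already tracks, and leave the new files alone, we can use the "-u" option.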
git add -u
After staging the files, we can make a commit.
git commit -m "First commit"
If we check the status of the repository after the commit, GIT will tell us that there is nothing to commit and the working directory is clean.
If this is the first time that you make a commit, GIT will ask you to provide some information for book-keeping purposes. You can set your name and email in GIT with the following commands:
git config --global user.email "song.li@example.com"
git config --global user.name "Song Li"
If you want your name and email to apply only to the current repository, you can use the following commands:
git config user.email "song.li@example.com"
git config user.name "Song Li"
The ".git" Directory is Everything
GIT keeps all the configurations and commits in the ".git" directory. In order to have a taste of the GIT capability, we can delete all the files except the ".git" directory.
rm -f A
rm -f B
rm -f C
rm -f .gitignore
GIT can tell us that these files have been deleted.
We can recover all the files to their last committed state with the "git checkout" command.
git checkout -- .
If we want to take the "git-test" directory out of GIT control, we can simply delete the ".git" directory; the "git-test" directory is then no longer a GIT repository.
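The delete-and-recover steps above can be replayed safely in a throwaway repository. The following is only a sketch: the temporary directory and the "Example User" identity are placeholders that I made up, and "git status --porcelain" is used simply because its output is easy to count.

```shell
# Sandbox sketch; the directory and the identity below are placeholders.
repo=$(mktemp -d)                       # throwaway directory
cd "$repo"
git init -q
git config user.name "Example User"     # repository-local identity
git config user.email "user@example.com"

touch A B C
git add .
git commit -q -m "First commit"

rm -f A B C                             # delete every tracked file
deleted=$(git status --porcelain | grep -c '^ D')   # one " D" line per deleted file

git checkout -- .                       # restore the files from the last commit
remaining=$(git status --porcelain)     # empty when the working directory is clean
```

After the last command, the three files are back and "git status" has nothing left to report.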
GIT Branches & "git checkout"
Create a New Branch
As with any source control system, people typically work with GIT through branches. We can list the branches in a repository with the "git branch" command.
By default, the "master" branch is created when we initiate the GIT repository (newer GIT versions let you change this name through the "init.defaultBranch" setting). When we are in a GIT branch, we can create a new branch based upon the current working branch.
git branch branch-1
Because the active branch is "master", the initial state of the newly created "branch-1" is exactly the same as the last commit of the "master" branch. As a shortcut, we can also create a branch with the "git checkout" command.
git checkout -b branch-2
The "git checkout -b" command creates a new branch and switches to it immediately.
Now we have created two new branches, and the active working branch is "branch-2".
The "git checkout" a Branch
In GIT terminology, switching to a branch is called checking out a branch. If we want to switch to "branch-1", we can issue the following command:
git checkout branch-1
Delete a Branch
You may want to delete a branch sometimes. Let us first create a branch named "to-be-deleted".
git checkout -b to-be-deleted
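GIT refuses to delete the branch that is currently checked out. We normally switch to another branch and use the "-d" option to delete it. The "-d" option only deletes a branch whose commits have already been merged; the "-D" option forces the deletion otherwise.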
git checkout master
git branch -d to-be-deleted
Working in a GIT Branch
Although we constantly switch branches, most of our time is spent working in a particular branch. In this section, instead of focusing on committing our changes in the branch, I will spend more time on how to revert them.
git checkout -b working-branch
Because we have only made one commit in the whole repository so far, the "working-branch" matches the "master" branch. We have three files in it.
Now let us add one file, delete one file and modify one file in the branch.
rm -f C
touch D
echo "Modified in working-branch" >> A
In order to continue with the rest of the examples, let us stage the changes now.
git add .
Un-stage the Changes Before a Commit
After staging, GIT is ready to commit the staged changes. In this example, we deleted "C" and added "D". Because both files are empty, GIT reports the deletion and the addition as a rename of "C" to "D", which is fine.
If we do not want to commit the deletion of "C", we can issue the following command to un-stage it.
git reset -- C
If we want to take out all the changes from staging, we can issue the following command:
git reset
After calling "git reset", all the files are removed from staging.
Recover the Files to the Last Commit
For an un-staged file, we can recover it to the last committed state by the following command:
git checkout -- A
If we want to recover all the files, we can issue the following command:
git checkout -- .
We can see that "git checkout" does not delete the newly added file. In GIT, a new file that has never been staged or committed is called an untracked file. To delete the untracked files, we can use the following command:
git clean -fd
The "-f" option forces the clean, which GIT requires by default before it deletes anything, and the "-d" option tells GIT to clean the untracked directories as well. If we also want to clean the ignored files and directories, we can add the "-x" option.
git clean -fdx
Regardless of whether a file is staged or not, if you want to set the state of all the files in the branch to the last commit, you can use the following command:
git reset --hard
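To see how "git checkout" and "git clean" divide the work between tracked and untracked files, the commands in this section can be tried in a scratch repository. This is a minimal sketch under my own assumptions: the temporary directory and the identity are placeholders, and I added the "-n" (dry run) option of "git clean" so that we can preview what would be removed before removing it.

```shell
# Sandbox sketch; the directory and the identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

touch A
git add .
git commit -q -m "First commit"

echo "Modified in working-branch" >> A   # change a tracked file
touch D                                   # add an untracked file

git checkout -- .                         # A is restored, D is left alone
untracked_survives=no
[ -e D ] && untracked_survives=yes

preview=$(git clean -nd)                  # dry run: only lists what would be removed
git clean -fdq                            # actually remove the untracked files
```

Running the dry run first is a cheap habit, because "git clean" deletes files that GIT has no copy of.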
Discard a Commit
If we already made a commit, GIT still gives us the opportunity to discard it.
rm -f C
touch D
echo "Modified in working-branch" >> A
git add .
git commit -m "A commit to regret"
After the commit, we can use "git log" to find all the commits on the branch.
git log
To discard the "A commit to regret", we can issue the following command:
git reset --hard c3b2c91f00ff4e4c97ba4484592c5c0284ae198e
We can also use "HEAD~" to refer to the previous commit instead of typing its hash code.
git reset --hard HEAD~
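As a sanity check, the sketch below builds three small commits in a scratch repository, confirms that "HEAD~" names the same commit as the hash that "git log" shows for the previous commit, and then discards the last commit. The directory, the identity, and the commit messages are all placeholders of mine.

```shell
# Sandbox sketch; the directory, identity and messages are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for n in 1 2 3; do
  echo "$n" > A
  git add A
  git commit -q -m "Commit $n"
done

from_log=$(git log -n 1 --skip=1 --format=%H)   # hash of the previous commit
from_tilde=$(git rev-parse HEAD~)               # the same commit, by relative name

git reset -q --hard HEAD~                       # discard "Commit 3"
now=$(git rev-parse HEAD)
content=$(cat A)                                # file content of "Commit 2"
```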
The "HEAD" represents the most recent commit on the current branch, and "HEAD~" represents its parent commit. In the case of a merge, a commit may have two parents: "HEAD^1" is the first parent, and "HEAD^2" is the second parent.
Revert a Published Commit
If you have already "pushed" the commit to the remote repository, it is better to add a new commit that undoes it, so the published commit history is preserved. You can use the "git revert" command.
git revert -n 2870f17ac38e68a0..HEAD
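where "2870f17ac38e68a0" is the hash of the commit that you want to revert to. The "-n" option applies the reverted changes to the working directory and the index without committing them. You can then commit the change and push your new commit to the remote repository.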
Get a File From a Commit
If you want to get a file from a commit or a branch to your current branch, you can use the following command:
git checkout 2870f17ac38e -- path/to/file
where "2870f17ac38e" is the hash of the commit to get the file from.
The "git diff"
GIT provides a nice tool, "git diff", to compare the files in the working directory with their staged state. If no new change has been staged, it effectively compares the files with their last committed state.
git checkout working-branch
echo "AAA" > A
git commit -am "Commit No.1"
echo "BBB" >A
git diff -- A
If we want to compare all the files in the branch, we can use the following command:
git diff
If we want to compare the working directory with another branch, such as the "master" branch, we can use the following command:
git diff master
The "git checkout" vs. "git reset"
GIT has a lot of heavily overloaded commands that can easily confuse us. The "git checkout" and "git reset" commands can be particularly confusing because they are internally related. Without going into the internal implementation, I will only talk about how they are most commonly used.
The "git checkout"
The typical use case of "git checkout" is to check out a branch.
git checkout branch-1
If you want to create a branch and switch to it at the same time, you can use the "-b" option.
git checkout -b a-new-branch
The "git checkout" command can also be used to undo the un-staged changes. For example, the following command restores the file "A" to its staged state or, if nothing is staged, to its last committed state.
git checkout -- A
If you want to discard all the un-staged changes, you can use the following command:
git checkout -- .
The "git reset"
One of the most common use cases of "git reset" is to un-stage a file. For example, we can un-stage the file "A" with the following command:
git reset -- A
If we want to un-stage all the changes, we can issue the following command.
git reset
If we want to completely discard the most recent commit from a branch, we can issue the following command:
git reset --hard HEAD^1
According to the documentation, "git reset" has three commonly used modes:
- The "--soft" mode - Does not touch the index file or the working tree at all (but resets the head to <commit>, just like all modes do). This leaves all your changed files "Changes to be committed", as git status would put it.
- The "--mixed" mode - Resets the index but not the working tree (i.e., the changed files are preserved but not marked for commit) and reports what has not been updated. This is the default action.
- The "--hard" mode - Resets the index and working tree to <commit>. Any changes to tracked files in the working tree since <commit> are discarded.
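The difference between the three modes is easier to see side by side. The sketch below applies each mode to the same two-commit history in a scratch repository and records what is left staged and what is left in the working directory. The directory, the identity, and the file content "v1"/"v2" are placeholders of mine; "git diff --cached --name-only" lists the staged files and "git diff --name-only" lists the un-staged ones.

```shell
# Sandbox sketch; the directory, identity and file content are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > A
git add A
git commit -q -m "First commit"
echo "v2" > A
git commit -q -am "Second commit"
tip=$(git rev-parse HEAD)

git reset -q --soft HEAD~                         # only the branch pointer moves
soft_staged=$(git diff --cached --name-only)      # "A": the change stays staged

git reset -q --hard "$tip"                        # back to the starting point
git reset -q --mixed HEAD~                        # pointer and index move
mixed_staged=$(git diff --cached --name-only)     # empty: nothing is staged
mixed_unstaged=$(git diff --name-only)            # "A": the change stays in the file

git reset -q --hard "$tip"                        # back to the starting point
git reset -q --hard HEAD~                         # pointer, index and files move
hard_content=$(cat A)                             # "v1": the change is gone
```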
The "--mixed" mode is the default, which explains why we can use a plain "git reset" to un-stage all the changes that we do not want to commit.
A Little Bit in the ".git" Directory
As we have seen, GIT keeps all its information in the ".git" directory. Without making the effort to understand every detail of how GIT works, it is still beneficial to take a quick look into the ".git" directory.
The "config" & the "description" Files
In a GIT repository, the "config" file keeps the basic configuration information related to the repository.
For example, in our repository, it has the user name and the email that we set for this repository. The "description" file holds a short description of the repository, which tools such as GitWeb display. When we created the repository, we did not give it a name, but we can give it one by modifying the "description" file from inside the ".git" directory.
echo "git-test" > description
The "HEAD" File & the "refs" Directory
We may be curious to know how GIT knows about all the branches. Let us take a look at the "HEAD" file.
The "HEAD" file is a small file that holds a reference to the current working branch, in the form "ref: refs/heads/working-branch". The branch name "working-branch" corresponds to a file in the "refs/heads" directory.
For each branch in the repository, we can find a file of the same name in the "refs/heads" directory. When we switch branches, GIT modifies the "HEAD" file to point to the new working branch. Now let us take a look at the branch file "master".
The file "master" is a small file that contains only a hash code, which identifies the commit at the tip of the branch. When we check out a branch, GIT can find the commit by the branch name and reconstruct the working directory. It is important to know that different branches can point to the same commit.
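The relationship between the "HEAD" file and the branch files can be checked directly with "cat". The following sketch assumes a freshly created repository in which the branch references are still stored as individual files; the directory and the identity are placeholders of mine, and the branch names are borrowed from this note.

```shell
# Sandbox sketch; the directory and the identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

touch A
git add A
git commit -q -m "First commit"

git checkout -q -b working-branch        # create a branch and switch to it
git branch branch-1                      # create another branch at the same commit

head_file=$(cat .git/HEAD)               # the reference to the current branch
working_hash=$(cat .git/refs/heads/working-branch)
other_hash=$(cat .git/refs/heads/branch-1)
commit_hash=$(git rev-parse HEAD)        # the commit that GIT resolves HEAD to
```

Both branch files contain the same hash code, which is the hash of the only commit in the repository.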
The "objects" Directory & the "git cat-file"
Every commit in the GIT repository and all the versions of the files for all the commits are kept in the "objects" directory. For easy management and retrieval, the objects are spread over subdirectories named after the first two characters of their hash codes.
We can use "git cat-file" to look into the data saved in the "objects" directory. For example, we can explore the data for the commit "c3b2c91f00ff4e4c97ba4484592c5c0284ae198e", which is the HEAD of the "master" branch.
git cat-file -p c3b2c91f00
You may notice that I did not use the full hash code, but only its first few characters. In GIT, a prefix is sufficient in most cases, as long as it identifies the hash code uniquely. The above command tells us that the commit has a tree structure represented by another hash code. Let us further take a look at this tree.
We can see all the files that we have committed in the tree. We can further take a look at the version of the ".gitignore" file in this branch.
git cat-file -p b4e54723341
It is exactly the content of the ".gitignore" file when we initiated the GIT repository. GIT keeps it in the "objects" directory for us.
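The chain from a commit to its tree and from the tree to a file can also be followed without copying hash codes by hand. In the sketch below, which runs in a scratch repository with a placeholder identity, "git rev-parse" resolves the tree of HEAD and the blob of ".gitignore", and "git cat-file -t" reports the type of each object.

```shell
# Sandbox sketch; the directory and the identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf '# Do not ignore anything\n!.gitignore\n' > .gitignore
git add .gitignore
git commit -q -m "First commit"

commit_type=$(git cat-file -t HEAD)             # "commit"
tree_hash=$(git rev-parse 'HEAD^{tree}')        # the tree that the commit points to
tree_type=$(git cat-file -t "$tree_hash")       # "tree"
blob_hash=$(git rev-parse HEAD:.gitignore)      # the file as stored in that tree
blob_type=$(git cat-file -t "$blob_hash")       # "blob"
blob_text=$(git cat-file -p "$blob_hash")       # the committed file content
```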
The "index" File & the "git ls-files"
The index file serves as the pivotal point between the commits and the working directory.
- When we check out a branch, the index is updated to match the commit that the branch points to, and the working directory is updated to match the content.
- When we stage a file, the index is updated for the file, so GIT knows that we have staged the file but have not committed it yet.
- When we make a commit, a new commit is created and all the information in the index is stored in the "objects" directory.
The "index" file is a binary file, so we cannot read its content directly. But we can use the "git ls-files" command to look at it.
git checkout working-branch
git ls-files --stage
We can see all the files have the same hash code except the ".gitignore" because all the files are empty at this time.
echo "Add some content" >> A
git add -- .
If we modify the file "A" and stage it, we can see that the index is updated.
The "git cat-file" can tell us the exact content that we have staged.
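The effect of staging on the index can be captured in a few variables. This sketch, again in a scratch repository with a placeholder identity, reads the hash code that "git ls-files --stage" reports for a file before and after staging a change. The "cut" call relies on the output layout of mode, hash code, and stage number separated by spaces.

```shell
# Sandbox sketch; the directory and the identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

touch A B
git add .
git commit -q -m "First commit"

# Two empty files share one blob, so the index shows the same hash for both.
hash_a_before=$(git ls-files --stage -- A | cut -d' ' -f2)
hash_b=$(git ls-files --stage -- B | cut -d' ' -f2)

echo "Add some content" >> A
git add -- .                                      # staging updates the index entry
hash_a_after=$(git ls-files --stage -- A | cut -d' ' -f2)
staged_text=$(git cat-file -p "$hash_a_after")    # what exactly has been staged
```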
The GIT Merge & Conflicts & "--continue"
With any source control system, you will face merges sooner or later, and GIT is no exception. One of the simplest but constantly overlooked questions is "Who is whom".
Who is Whom?
Thanks go to this note, which makes the "Who is whom" question explicit and clear. In Git, performing a merge requires two steps:
- Check out the branch that should receive the changes.
- Call the "git merge" command with the name of the branch that contains the desired changes.
This clearly answers the question: your current working branch is the one that will be updated after a successful merge.
The Fast Forward Merge
Before making a merge, let us first take a look at the state of our branches.
git show-ref
All my branches point to the same commit, which means that all my branches have exactly the same content. Now let us make some changes in "branch-1" and commit them.
git checkout branch-1
echo "Modified in branch-1" >> A
git add -- .
git commit -m "branch-1 is updated"
Now let us look at the information on the HEAD of "branch-1".
git rev-parse HEAD
git cat-file -p HEAD
Whenever GIT makes a commit, it keeps a record of its parent commit. If we now want to merge "branch-1" into the "master" branch, GIT has sufficient information to complete the merge without looking at the content of each branch.
- If the HEAD of the receiving branch is a parent, or a more distant ancestor along the parent chain, of the HEAD of the desired branch, GIT recognizes that no change has been made in the receiving branch.
- In such a case, GIT simply moves the receiving branch forward to the commit of the desired branch. In GIT terminology, it is called a "fast-forward" merge.
Now let us merge "branch-1" into the "master" branch.
git checkout master
git merge branch-1
In the case of a fast-forward merge, GIT simply updates the HEAD of the "master" branch to the HEAD of "branch-1".
The result on the "master" branch is equivalent to the result of issuing the following command:
git reset --hard f5e3b0658afeb194340620d912e389d9c06f2cd0
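Whether a merge can be a fast-forward can be checked before merging. The sketch below is one possible way to do it, not something the steps above require: "git merge-base --is-ancestor" succeeds when the first commit is an ancestor of the second, and "git merge --ff-only" refuses to do anything other than a fast-forward. The directory and the identity are placeholders, and "git symbolic-ref" is only there to make sure that the first branch is named "master" regardless of the local configuration.

```shell
# Sandbox sketch; the directory and the identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > A
git add A
git commit -q -m "First commit"

git checkout -q -b branch-1
echo "Modified in branch-1" >> A
git commit -q -am "branch-1 is updated"

git checkout -q master
# master has no commits of its own, so it is an ancestor of branch-1.
if git merge-base --is-ancestor master branch-1; then can_ff=yes; else can_ff=no; fi

git merge -q --ff-only branch-1           # move master forward to branch-1
master_hash=$(git rev-parse master)
branch_hash=$(git rev-parse branch-1)
```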
The Three Way Merge
In case changes have been made in both branches, GIT needs to look into the content of each branch. Most commonly, GIT performs a three-way merge: it compares the HEAD commit of each branch with their most recent common ancestor to decide whether a file has been added, deleted, or modified. GIT declares a conflict in the following situations and leaves the decisions to us:
- If a file has been modified in both branches and the content differs on the same line or on adjacent lines
- If a file has been deleted in one branch but modified in the other branch
- If a file has been added in both branches but the content differs on the same line or on adjacent lines
Now let us commit some changes to "branch-1" and "branch-2" to complete a three-way merge.
git checkout branch-1
echo "Modified in branch-1" > A
rm -f B
echo "Added in branch-1" >> D
git add -- .
git commit -m "Prepare for merge with branch-2"
In "branch-1", we modified the file "A", deleted the file "B", and added the file "D" with content.
git checkout branch-2
echo "Modified in branch-2" > A
echo "B is modified" >> B
echo "Added in branch-2" >> D
git add -- .
git commit -m "Prepare for merge with branch-1"
In "branch-2", we modified the file "A", modified the file "B", and added the file "D" with different content. Now let us merge "branch-2" into "branch-1".
git checkout branch-1
git merge --no-commit branch-2
We have conflicts, and we can look at them further with "git status".
In case of a conflict, GIT changes the content of the conflicting file by adding conflict markers to help us make the decision. For example, the following is the content of the file "D".
Due to the conflicts, GIT is unable to make the decisions for us. We need to examine every file to decide what we want to do. If we want to keep the file "B" as modified in "branch-2" and manually resolve the conflicts on "A" and "D", we can issue the following commands:
git checkout --theirs B
echo "Actual merge result for A" > A
echo "Actual merge result for D" > D
Of course, this is an oversimplified conflict resolution. In real situations, you need to use your favorite editor to look at each file closely and make your final decision on each one. After resolving the conflicts, you can commit the merge.
git add -- .
git commit -m "Merge branch 2"
Instead of "git commit", you can also stage the resolved files and use "git merge --continue" to complete the merge.
git merge --continue
If you now examine the HEAD of the branch, you can find that the commit has two parents:
The two parents are the two commits that the new commit was merged from.
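The conflict workflow above can be replayed in a scratch repository with a single conflicting file. This is a reduced sketch rather than the exact session from this note: the directory and the identity are placeholders, "git diff --name-only --diff-filter=U" is my choice for listing the files that are still in conflict, and the parents are counted from the "git cat-file -p" output.

```shell
# Sandbox sketch; the directory and the identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example User"
git config user.email "user@example.com"

echo "Original" > A
git add A
git commit -q -m "First commit"

git checkout -q -b branch-1
echo "Modified in branch-1" > A
git commit -q -am "Prepare for merge with branch-2"

git checkout -q master
git checkout -q -b branch-2
echo "Modified in branch-2" > A
git commit -q -am "Prepare for merge with branch-1"

git checkout -q branch-1
git merge branch-2 > /dev/null 2>&1 || true          # stops with a conflict on A
conflicted=$(git diff --name-only --diff-filter=U)   # files still in conflict

echo "Actual merge result for A" > A                 # resolve the conflict by hand
git add -- A
git commit -q -m "Merge branch-2"

parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
second_parent=$(git rev-parse HEAD^2)                # the commit that was merged in
```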
The GIT Tag
- Git has the ability to tag specific points in a repository’s history as being important. Typically, people use this functionality to mark release points.
- A tag is very much like a branch that doesn’t change, it’s just a pointer to a specific commit.
Lightweight Tags and Annotated Tags
You can create a lightweight tag by the following command:
git tag v01.01.01
You can also include more information in the tag by creating an annotated tag.
git tag -a v01.01.02 -m "This is an annotated tag"
You can list all the tags by the following command:
git tag -l
You can also filter the tags by providing a filter pattern in the command:
git tag -l "v01.01.*"
If you want to see the comments when you list the tags, you can use the "-n" option:
git tag -n
You can delete a tag by the following command:
git tag -d v01.01.02
If you want to see the hash related to the tag, you can use the following command:
git show-ref --tags v01.01.01
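The difference between the two kinds of tags shows up in the type of object that the tag name resolves to. In the sketch below, run in a scratch repository with a placeholder identity, "git cat-file -t" reports "commit" for the lightweight tag and "tag" for the annotated one, and the "^{commit}" suffix follows the annotated tag to the commit that it marks.

```shell
# Sandbox sketch; the directory and the identity below are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

touch A
git add A
git commit -q -m "First commit"

git tag v01.01.01                                    # lightweight tag
git tag -a v01.01.02 -m "This is an annotated tag"   # annotated tag

light_type=$(git cat-file -t v01.01.01)              # "commit": a plain pointer
annotated_type=$(git cat-file -t v01.01.02)          # "tag": a separate tag object
target=$(git rev-parse 'v01.01.02^{commit}')         # the commit behind the tag
```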
Push Tags to the Remote Repository
If you want to push your tag to the remote repository, you can use the following command:
git push origin v01.01.01
The following command will push all the tags to the remote repository:
git push --tags
You can delete a remote tag by the following command:
git push --delete origin v01.01.01
The Repository Size & "reflog" & "git gc"
GIT does a good job of minimizing the size of the repository. In most cases, you do not need to worry about it. But if you are curious, you can take a look at GIT garbage collection. This is the best note that I found on GIT garbage collection, and I recommend that you take a look at it. In order to see how garbage collection works in GIT, let us make a commit and then discard it.
git checkout working-branch
echo "Whatever modification" >> B
git add -- .
git commit -m "Commit to be discarded"
We can find the hash code of this commit by the following command:
git rev-parse HEAD
bffe4e1bfcbbf026e3e9b34dacc66cf18dcb501c
We can issue the following command to discard this commit and revert the branch to the previous commit:
git reset --hard HEAD~
After discarding the commit, the commit "bffe4e1..." is no longer associated with any branch. Ideally, we should be able to garbage collect it.
git gc --prune=now
But if we take a further look, we can find that the commit is still in the repository:
git cat-file -p bffe4e1
The reason why the garbage collection did not remove this commit is that it is still referenced by the log. Whenever a branch tip or the HEAD moves, for example when we make a commit, GIT records the move in a log that we can reference later. It is called the "reflog" in GIT. We need to clear the log before GIT can collect this commit.
git reflog expire --expire=now --all
After clearing the log, we can run "git gc --prune=now" again. We should see that the dangling commit has been cleared from the repository.
If you want to find all the dangling commits and objects, you can use the following command:
git fsck --full
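The whole sequence in this section can be replayed in a scratch repository. The sketch below uses "git cat-file -e", which only reports whether an object exists, to check the discarded commit after each garbage collection. The directory, the identity, and the commit messages are placeholders of mine.

```shell
# Sandbox sketch; the directory, identity and messages are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "kept" > B
git add B
git commit -q -m "First commit"
echo "Whatever modification" >> B
git commit -q -am "Commit to be discarded"
discarded=$(git rev-parse HEAD)

git reset -q --hard HEAD~                   # no branch points to the commit now
git gc -q --prune=now
# The reflog still references the commit, so it survives the first collection.
if git cat-file -e "$discarded" 2>/dev/null; then first_gc=kept; else first_gc=removed; fi

git reflog expire --expire=now --all        # clear the log
git gc -q --prune=now
if git cat-file -e "$discarded" 2>/dev/null; then second_gc=kept; else second_gc=removed; fi
```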
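I could not find the default expiration time of the "reflog" on the GIT website. But according to this link, the default time is 90 days (30 days for entries that are no longer reachable from the current tip of the branch), and both values are configurable through the "gc.reflogExpire" and "gc.reflogExpireUnreachable" settings.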
The Remote Repository
Working with local repositories gives us the opportunity to learn most of GIT, but without a remote repository, you are unable to share your work with your team. Learning to work with a remote repository takes far less effort than what we have spent on the local repository, so I will not spend a lot of time on it. I will just list the most commonly used commands here for completeness. For most people, the first thing to do with a remote repository is the "git clone".
git clone https://github.com/BigMountainTiger/lu-decomposition.git
I created a repository on GitHub named "lu-decomposition"; please feel free to clone it. If you want to find all the remote branches, you can use the "-r" option.
git branch -r
If you have the permission, you can also push your changes to the remote repository. If you created a local branch that the remote repository is not aware of, you can use the following command to publish it to the remote:
git checkout -b new-branch
git push --set-upstream origin new-branch
After publishing the branch, you can simply use "git push" to send your local commits to the remote:
git push
You can use "git pull" to get the commits that other people pushed to the remote:
git pull
If you want to check whether there are any new updates on the remote branch, but do not want to merge them into your local branch, you can use the "git fetch" command:
git fetch
If you want to refresh the information about every remote repository that you have configured, not only the one behind your working branch, you can use the following command:
git remote update
You can delete the remote branch:
git push origin --delete new-branch
Of course, you can always delete your local branch:
git branch -d new-branch
If you want GIT to remember your credentials for the remote repository, you can use the following command. Note that the "store" helper saves the credentials unencrypted in a file on disk:
git config --global credential.helper store
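The publish and delete commands in this section can be practiced without a server, because a bare repository on the local disk can play the role of the remote. The sketch below is built on that assumption: all the paths under the temporary directory, the "seed" repository, and the identity are placeholders that I made up, while the branch name "new-branch" is the one used above.

```shell
# Sandbox sketch; every path and the identity below are placeholders.
base=$(mktemp -d)

# A small repository with one commit, used only to seed the "remote".
git init -q "$base/seed"
cd "$base/seed"
git config user.name "Example User"
git config user.email "user@example.com"
echo "seed" > README
git add README
git commit -q -m "First commit"

git clone -q --bare "$base/seed" "$base/remote.git"   # the stand-in remote
git clone -q "$base/remote.git" "$base/work"          # our working clone
cd "$base/work"
git config user.name "Example User"
git config user.email "user@example.com"

git checkout -q -b new-branch
echo "hello" > A
git add A
git commit -q -m "Work on new-branch"
git push -q --set-upstream origin new-branch          # publish the branch

upstream=$(git rev-parse --abbrev-ref '@{upstream}')              # tracking branch
published=$(git ls-remote --heads origin new-branch | cut -f2)    # as seen remotely

git push -q origin --delete new-branch                # delete the remote branch
after_delete=$(git ls-remote --heads origin new-branch)
```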
Points of Interest
- This is a note on GIT & miscellaneous subjects.
- The best GIT reference that I found is this book, which I strongly recommend. This note just puts the most commonly used GIT commands together.
- I hope you like my postings and I hope this note can help you in one way or the other.
History
- 27th June, 2018: First revision