There are some similarities between Subversion and Git. Both are source-control systems that work on nearly every modern development platform. Both are popular and open source, providing integration points for other development-related tools. Both support branching, merging, and working with files without locking them. Understanding how these systems are similar can be important for those with experience using Subversion who want to learn Git. However, understanding the similarities alone will not be sufficient for learning Git. It is also necessary to understand the differences and to look at some of the features that Git provides for which Subversion has no equivalent.

A Suite of Paradigm Shifts

While there are a number of ideas that transfer from Subversion to Git, there are also a number of ideas in Git that will not have an equivalent. Having a basic understanding of where and how Subversion and Git differ will offer insights into working with Git and why certain actions and commands are necessary.

From Centralized to Distributed

Subversion, like many other popular source-control systems such as CVS and Team Foundation Server, supports collaboration through a centralized repository model. Users who wish to work against that repository will bring a copy of what they need down to their local computer, make the necessary changes, and then commit the changes directly to the repository on the server. This model provides collaboration by having everyone work against the same server, with all of their work being visible to everyone else upon commit. Figure 1 shows this classic model of source control with a centralized server and many users working against the repository on it.

![Figure 1: A centralized version-control system.](https://codemag.com/Article/Image/1105101/Figure 1.tiff)

In contrast to Subversion and centralized version-control systems, distributed systems do not require a central server, though one can be used. Instead, users work against a local repository as shown in Figure 2, not one that is stored on a server. Most actions taken against the repository are done locally, including checking out files to work on, making changes, and committing those files. It is only when the user is ready to share their work that they need to think about any repositories that are not on their computer. At that point, though, the user does not have to push their changes to a central server. They can push changes to other users directly or let others grab the changes themselves.

![Figure 2: A distributed version control system.](https://codemag.com/Article/Image/1105101/Figure 2.tiff)

From Client-server to Peer-to-peer

Aside from working against your local repository, Git provides the ability to connect to any number of remote Git repositories, as shown in Figure 2. In a peer-to-peer setup such as this, commits are transferred directly between the users instead of through a centralized server. Git still supports the use of a shared, central server, but one is not strictly needed.

From a Subversion Repository to Git Remotes

When you do a checkout from a Subversion repository, information about the server is stored in the local file system. This tells the Subversion client where the server is, how to access it, and the most recent state of the repository that was pulled down to the local computer. A Subversion checkout can point to one and only one repository, and all work is done against that repository.

With Git, however, it is possible to configure multiple connections to multiple remote repositories. You do this by using Git remotes. A remote is a set of configuration details that tells Git where the remote repository is, how to access it, and what the most recently known state of the remote repository is (see Figure 3). Git can connect to the configured remotes at any time and there are only a few commands specifically for interacting with remote repositories.

![Figure 3: Information about a Git remote.](https://codemag.com/Article/Image/1105101/Figure 3.tiff)
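
As a rough illustration of one working copy with several remotes, the following sketch configures two of them. The two bare repositories stand in for separate servers, and the names server.git, teammate.git, and teammate are assumptions made for this example only.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"

# Two bare repositories stand in for two different servers or teammates.
git init -q --bare server.git
git init -q --bare teammate.git

git init -q project
cd project

# Each remote is a named set of connection details stored in .git/config.
git remote add origin "$work/server.git"
git remote add teammate "$work/teammate.git"

remote_count=$(git remote | wc -l | tr -d ' ')   # number of configured remotes
origin_url=$(git remote get-url origin)          # where "origin" points
git remote -v                                    # list names with fetch/push URLs
```

A Subversion checkout has no equivalent of the second git remote add line, because it can point to only one repository.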

From Many .svn Folders to a Single .git Folder

A Subversion checkout creates a hidden .svn folder for every folder that it retrieves from the remote repository. Each of these .svn folders houses the necessary information for a Subversion client to know about the remote server, branch, and folder.

There is only one .git folder for a given Git repository. Since Git pays attention to the file system and runs most commands locally, there is no need to store connection and state information about a folder in every folder. Git is able to read the file system and make comparisons to the actual repository, to determine the state of the files in the working folder.

Figure 4 shows an example of a Git repository and working folder layout. The actual repository is housed inside of the .git folder inside of the working directory of the project.

![Figure 4: A folder with a Git repository.](https://codemag.com/Article/Image/1105101/Figure 4.tiff)
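
A quick way to see this for yourself is to count the .git folders in a project with nested directories. This is only a sketch; the folder names project, src/lib, and docs are hypothetical.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"

git init -q project
mkdir -p project/src/lib project/docs

# Count directories named .git anywhere under the project.
git_dirs=$(find project -type d -name .git | wc -l | tr -d ' ')
echo "number of .git folders: $git_dirs"

# From a nested folder, Git still locates the single repository at the top.
top=$(cd project/src/lib && git rev-parse --show-toplevel)
echo "repository root: $top"
```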

From Trunk to Master

Subversion repositories typically have folders called branches, tags, and trunk. The trunk represents the primary location of the content that is being worked on in the repository and is separate from any branches that may exist in the repository. While Subversion does not require a branches, tags, and trunk folder structure, this is the most common way of working. Most Subversion client software expects this or at least looks for this structure, and knows how to work with it.

With Git, the equivalent of trunk is the master branch. Technically, there is no distinction between the master branch and any other branch in the repository. By convention, though, most Git repositories have a master branch that represents the most up-to-date, stable version of the content in the repository. However, there is no requirement for a Git repository to have a branch named master; this is only a convention and a default built into Git.

From Checkout to Clone and Checkout

A Subversion user will check out the trunk or branch that they need to work with. This brings a copy of the trunk or branch down to the working directory structure of the local computer.

A Git user may perform a clone of a remote repository to create a local repository, which will contain the same commits and contents as the remote one. When you perform a clone, the master branch is pulled from the remote repository down to the local repository, by default. You may pull down other branches in addition to or instead of the master branch, as needed. The clone also configures a link to the remote repository, called a Git remote, with a default name of origin, as shown in Figure 3.

Once the clone is finished, the user works against their local repository directly by checking out the branch they wish to work with. However, this does not require a connection to the remote repository. The checkout occurs within the working directory structure of the cloned repository.
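
The clone-then-checkout sequence can be sketched as follows. A local folder named upstream stands in for the remote repository, and the branch name feature is hypothetical. Deleting the upstream folder is only a crude way to simulate the remote being unreachable.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"

# Build a small "remote" repository with two branches.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this demo
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "initial commit"
git branch feature
cd "$work"

# Clone it; Git configures a remote named origin automatically.
git clone -q upstream local
cd local
remote_name=$(git remote)
current=$(git symbolic-ref --short HEAD)

# Make the remote unreachable, then check out another branch anyway.
rm -rf "$work/upstream"
git checkout -q feature           # created from origin/feature, no connection needed
after=$(git symbolic-ref --short HEAD)
echo "remote: $remote_name, was on: $current, now on: $after"
```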

From Server-based Branches to Local Branches

A branch in Subversion is defined on the server. To work against that branch, a developer must check out that branch or “switch” their current checkout to that branch. Once the developer is done working in that branch, they either check out or “switch” over to another branch or the trunk.

A branch in Git can be defined in a user’s local repository. Even when a branch is available on a remote repository, a developer must bring that branch down to their local repository before they can work against it. They must still perform a checkout to switch to that branch and work against it. However, that checkout does not require a connection to any remote repository. Only the fetch or pull to bring the branch to your local repository requires a connection to a remote.

The developer can also create local branches in a Git repository with no need for a remote connection because a Git branch is not automatically shared with a remote repository. This allows a developer to have many local branches that are useful to them, without any other developer knowing about or seeing them.
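
To illustrate that a new branch stays private until it is shared, the sketch below creates a local branch and then asks the remote which branches it has. The repository names and the branch name experiment are assumptions for this example.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"

# A small repository that plays the role of the remote.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this demo
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "initial commit"
cd "$work"

git clone -q upstream local
cd local
git branch experiment             # exists only in the local repository

local_count=$(git branch --list | wc -l | tr -d ' ')
remote_count=$(git ls-remote --heads origin | wc -l | tr -d ' ')
echo "local branches: $local_count, branches on the remote: $remote_count"
```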

With multiple branches available in a local Git repository, it can be easy to forget which branch is being worked with. Fortunately, Git client software, including the Git command line, will tell you which branch you are working against. Figure 5 shows the results of a git status command, which includes the current branch name. It is also possible to configure various command-line shells to display the current branch name. For example, Figure 5 also shows the Git Bash Shell (part of MSysGit, a Git installation for Windows) command-line prompt. It includes the name of the user logged into Windows, the machine name, the current folder, and the current Git branch.

![Figure 5: A Git status command in a Git Bash prompt, showing the current branch.](https://codemag.com/Article/Image/1105101/Figure 5.tiff)
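
If you want the branch name in a script or a custom prompt, one possible approach is to ask Git for the name that HEAD currently refers to. The prompt format below is made up for illustration and is not the one Git Bash uses.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this demo
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "initial commit"
git checkout -q -b mytopic

# Ask Git which branch is currently checked out.
branch=$(git symbolic-ref --short HEAD)

# Build a simple prompt-like string: folder name plus branch name.
prompt="$(basename "$PWD") ($branch)"
echo "$prompt"
```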

From a Folder-like Hierarchy with Commits to a Timeline of Commits

Looking at a Subversion repository structure will reveal what looks like a folder structure where the branches, tags, and trunk are organized. Subversion branches are often organized into subfolders and hierarchies to help the team keep track of them. For example, a team may have a “bugfixes” folder that contains branches for fixing bugs. They may also have a “development” folder for branches that pertain to new development or features. Some teams have branch folders named after individual developers so that the developer can have their own set of branches that no one else needs to worry about.

In Git, there is no hierarchy of folders. Instead, Git maintains the knowledge of each branch through what are called heads. A head is a named pointer to a specific commit, representing the current commit of a given branch. The special name HEAD refers to the branch that is currently checked out. Figure 6 shows the master branch of a Git repository. When you make a commit on the master branch, Git moves the pointer to the new commit.

![Figure 6: The master branch points to a specific commit in the repository.](https://codemag.com/Article/Image/1105101/Figure 6.tiff)
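
The pointer behavior can be observed directly. In this sketch, the commit that master points to is recorded before and after a new commit is made. Empty commits are used only to keep the example short.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this demo
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "first commit"
before=$(git rev-parse master)    # commit that master points to now

git commit -q --allow-empty -m "second commit"
after=$(git rev-parse master)     # master has moved to the new commit
parent=$(git rev-parse 'master^') # the new commit's parent

echo "before: $before"
echo "after:  $after"
```

The new commit records the previous one as its parent, which is how the timeline in Figure 6 is built.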

A team may still have branches for bug fixes and new development efforts, but a repository on a developer’s local machine only needs to contain the branches they care about. The individual developer does not need to bring all branches from all remote repositories down to their local repository. If they do not need the bug fix branches that anyone else is working on, then they do not bring those branches down to their repository. This eliminates much of the need for a hierarchy of folders to organize the repository’s branches.

From Every Action Visible, To Only Sharing What’s Important

With a centralized repository and most actions being done directly against it, a Subversion user is showing everyone on the team what they are doing all the time. Even when a user is working in a Subversion branch, the changes they make are always visible and available to every other user.

However, with most actions being taken against a local repository in Git, a developer only has to show what is important to the rest of the team. A developer can create as many local branches as they need and try out many different ideas. They can also revert changes to grab previous versions or pull in changes from other places to get what they need.

A developer is not required to show their entire history to other developers on the team. They can limit what is shared to only what is important whether that’s a single commit encapsulating an entire feature or a small number of commits that were built from a large number of branches and ideas. This provides a developer with some freedom and flexibility that Subversion may not allow: working in your own manner without worrying about how others are working.

From Commit to Stage and Commit

Both Subversion and Git require you to add a file to the repository so that the repository will know to track it. A commit in Subversion is a single step: the commit with any necessary commit message. However, a commit in Git is a two-step process (with a possible shortcut). Once a file has been added to Git for tracking, any changes made to that file must be staged before they can be committed.

Staging a file takes the current state of that file and places it in a special staging area of Git, where it can be committed or have a few other operations performed on it. Any additional changes that are made to the file must be staged again, or Git will not commit those changes. Once a file or group of files has been staged, a commit with the necessary commit message can be made. Only files that have been staged will be committed. If you stage a file for commit by mistake or otherwise need to remove it from the staging area, you can un-stage it by performing a git reset on the file.
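
The staging rules above can be sketched with a single file. The file name notes.txt and its contents are hypothetical. The example stages one change, makes a further unstaged change, commits, and then shows un-staging with git reset.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this demo
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > notes.txt
git add notes.txt                 # start tracking and stage the file
git commit -q -m "add notes"

echo "two" >> notes.txt
git add notes.txt                 # stage the change that added "two"
echo "three" >> notes.txt         # a later change that is NOT staged
git commit -q -m "commit staged change only"

# The commit contains the staged state, without the later edit.
committed=$(git show HEAD:notes.txt)

git add notes.txt                 # stage "three" ...
git reset -q -- notes.txt         # ... then un-stage it again
staged_after_reset=$(git diff --cached --name-only)   # empty: nothing is staged
unstaged=$(git diff --name-only)                      # the edit is still in the working folder
```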

From Revision Numbers to SHA-1 IDs

A commit to a Subversion branch or trunk will increment the repository’s revision number. Anyone that commits to the repository next will get the next increment of the revision number. This works because there is a single repository that all users work against, and commits can only be made to that repository.

With every user having their own repository, and the possibility of connecting to multiple remote repositories to share code, incremental revision numbers would not be sufficient for Git. There is no guarantee that any specific user would pull any specific commit into their local repository. Instead of an incremental revision number, Git uses a Secure Hash Algorithm (SHA-1) to create a unique identifier for each commit.

The ID of an individual commit is a string of 40 hexadecimal characters produced by the SHA-1 algorithm. Git generates the ID from the commit’s content, ancestry, and other data points, and uses it to verify the integrity of the commit. If the ID does not match what is in the commit, then Git will consider the commit corrupted.
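
One way to convince yourself that the ID is derived from the commit’s data is to feed the raw commit object back through Git’s hashing command and compare the result with the ID Git reports. This is a sketch for exploration, not something you would do in daily work.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"
git init -q project
cd project
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "first commit"

commit_id=$(git rev-parse HEAD)

# The raw commit object lists the tree, any parents, author, committer, and message.
git cat-file commit HEAD

# Hashing that same data as a commit object reproduces the commit ID.
recomputed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)
echo "reported:   $commit_id"
echo "recomputed: $recomputed"
```

Because the parent IDs are part of the hashed data, changing any ancestor would change every ID that follows it.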

When working with Git commands that need a commit ID, you only need to supply enough of the commit ID to find the commit in question, typically the first 6 to 8 characters of the ID. Figure 7 shows several commit IDs from the most recent commits in a repository, by running the git log command.

![Figure 7: Several commits with 40-character commit IDs.](https://codemag.com/Article/Image/1105101/Figure 7.tiff)
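
As a small sketch of abbreviated IDs, the following takes the first 8 characters of a commit ID and asks Git to resolve them back to the full ID. The choice of 8 characters is arbitrary here and works because the prefix is unique in this tiny repository.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"
git init -q project
cd project
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "first commit"

full=$(git rev-parse HEAD)                    # full commit ID
short=$(printf '%s' "$full" | cut -c1-8)      # first 8 characters
resolved=$(git rev-parse "$short")            # Git expands the prefix

echo "short: $short"
echo "full:  $resolved"
```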

From Metadata and Tree Conflicts to Ancestry

Most Subversion users have run into tree conflicts caused by the metadata that Subversion stores with files and folders. Subversion uses this metadata to keep track of where a revision came from and provides some necessary data for functionality, such as reintegrating a branch automatically. However, the metadata is easily broken when multiple users are working on the same files and folders, and when branching and merging come into play. The end result is a tree conflict, which many developers simply ignore or revert, making the problem worse.

Git, by contrast, does not need to store any metadata about individual files and folders to know where they come from. Rather, Git stores ancestry inside of the repository itself, connected to individual commits. It’s the ancestry of the commits that gives a Git log the transit-line look, as shown in Figure 6.

For example, a repository may have two branches: master and dev. If dev was created from master and has commits in it that are not in master, but master has no commits that are not in dev, the two branches will exist on the same timeline, as shown in Figure 8. Performing a merge from dev into master will result in a fast-forward merge as shown in Figure 9. That is, Git will simply move the pointer that represents the head of master up the line to the same point as dev. The end result, shown in Figure 10, is that the two branches are now pointing to the same commit.

![Figure 8: Two branches that have not yet diverged will remain on the same timeline.](https://codemag.com/Article/Image/1105101/Figure 8.tiff)

![Figure 9: A fast-forward merge.](https://codemag.com/Article/Image/1105101/Figure 9.tiff)

![Figure 10: After the fast-forward merge, both branches now point to the same commit.](https://codemag.com/Article/Image/1105101/Figure 10.tiff)
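
The fast-forward scenario in Figures 8 through 10 can be reproduced with a few commands. This sketch uses empty commits for brevity and passes --ff-only, which asks Git to refuse anything other than a fast-forward, so the example fails loudly if the branches have diverged.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this demo
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "base commit on master"
git checkout -q -b dev
git commit -q --allow-empty -m "dev work 1"
git commit -q --allow-empty -m "dev work 2"

git checkout -q master
git merge -q --ff-only dev        # master simply moves up to dev's commit

master_id=$(git rev-parse master)
dev_id=$(git rev-parse dev)
echo "master: $master_id"
echo "dev:    $dev_id"
```

No merge commit is created; both branch names end up on the same commit.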

This ancestry also allows a wide variety of branching strategies to be used. Whereas creating branches of branches in Subversion is typically frowned upon because of the potential complexities in merging, this is a common practice in Git. Having true ancestry stored with each commit means Git knows exactly where a branch is in relation to other branches, and what must be done when merging. However, this does not mean that all merges are simple in Git. A complex branch set can lead to complex merging situations, requiring intervention by the user.

From Username to Real Name and Email

When commits are done against a Subversion repository, the username that you authenticated with is stored with the commit. This lets others know who made what changes when. Git also provides information about who made what commits, but uses your name and email address instead of a username. Since Git is a distributed system and there is no guarantee that any two systems will have the same authentication mechanisms in place, there is no way to ensure a username is available via authentication. Instead, Git stores your name and email in its configuration, which can be set with the following commands:

git config --global user.name "your name"
git config --global user.email your.email@example.com

These commands will store your name and email in Git’s global configuration, meaning your name and email will default to this for any repository you work with.

Note that some Git hosting software and services will send warning or error messages if you do not have your name and email configured. Therefore, you should configure these items immediately after installing Git.
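
To see how the configured identity ends up on a commit, the sketch below sets a name and email and then reads them back from the log. It deliberately omits --global so that only the scratch repository is affected; the name and address are placeholders.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"
git init -q project
cd project

# Repository-level settings; these take precedence over --global for this repository only.
git config user.name "Ada Example"
git config user.email "ada@example.com"

git commit -q --allow-empty -m "identity check"

# %an is the author name and %ae the author email recorded on the commit.
author=$(git log -1 --format='%an <%ae>')
echo "$author"
```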

Working With Git

Having a working knowledge of the differences between Subversion and Git will help in the transition to using Git. However, there are enough differences, and enough of a paradigm shift, that the only true way to learn Git is to use Git.

A Developer and a Local Repository

Working with a local Git repository is fairly straightforward as there is no server to set up. Initializing Git inside of a folder will create a local repository for use, even when files exist in the folder:

cd my-project
git init

The result is a fully functioning Git repository that is ready to take commits. Before a commit can be made, though, Git must know about the files. You do this by adding the files to Git, which is similar to adding files to a Subversion repository. Git also requires that you stage files before you commit them, as shown in Figure 11. Fortunately, the process of adding files to Git and staging them for commit is the same. Therefore, if a new file needs to be added, staged, and committed, it is only a two-step process, the same as staging and committing a file that Git is already tracking:

![Figure 11: Files are changed but not yet staged, so they cannot be committed.](https://codemag.com/Article/Image/1105101/Figure 11.tiff)

git add .
git commit -m "my first commit"

Calling git add . will add any new files and folders, and stage changes to all currently tracked files and folders, making them ready for commit. Figure 12 shows the status of a repository with files that are staged and ready for commit.

![Figure 12: Files are staged and ready for commit.](https://codemag.com/Article/Image/1105101/Figure 12.tiff)

Calling git commit works in a similar manner to calling svn commit for a Subversion repository. In the case of Git, this will commit the staged files. The -m option provides a message for the commit. Figure 13 shows the results of the commit, including the first portion of the commit ID and what changes were made.

![Figure 13: Committing changes that were staged.](https://codemag.com/Article/Image/1105101/Figure 13.tiff)

There is a shortcut to stage and commit files in one step, as well. If changes are made to files that are already being tracked by Git (they have previously been added to the repository), then calling git commit with the -a option will stage and commit the files in one command:

git commit -a -m "my commit message"

Git also allows options to be stacked, meaning the previous command is equivalent to this command:

git commit -am "my commit message"

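Note that this shortcut will not work for files that are not currently tracked by Git. New files must be staged manually with git add before they can be committed. Modifications to, and deletions of, files that Git already tracks are staged automatically by the -a option.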
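
The behavior with untracked files can be checked with a short experiment. The file names tracked.txt and untracked.txt are hypothetical.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"
git init -q project
cd project
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > tracked.txt
git add tracked.txt
git commit -q -m "track one file"

echo "v2" > tracked.txt           # change a file Git already tracks
echo "new" > untracked.txt        # create a file Git does not know about
git commit -q -am "update tracked file"

in_commit=$(git ls-tree --name-only HEAD)                     # files in the new commit
still_untracked=$(git ls-files --others --exclude-standard)   # files left out
echo "committed: $in_commit"
echo "untracked: $still_untracked"
```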

History and Branches in a Repository

As commits are made, Git tracks information along with the files, including the commit ID and the commit’s ancestry. Figure 10 shows several commits that have been made in the master branch of a repository. Work can be done directly against the master branch, or branches can be created to work on something specific such as a new feature, bug fix, or other topic:

git branch mytopic
git checkout mytopic

When you create a branch, the new branch will immediately point to the same commit as the branch it was created from, as shown in Figure 14.

![Figure 14: A new branch starts on the same commit as the branch it came from.](https://codemag.com/Article/Image/1105101/Figure 14.tiff)

Performing a checkout will change the working copy to the branch. Once a branch has been checked out, commits and other actions can be performed against it.

There is a shortcut for creating a branch and performing the checkout at the same time, as well:

git checkout -b mytopic

When commits are made to the branch, but no additional commits have been made to the master, Git will continue to show the master on the same timeline as was previously shown in Figure 8. Only when the master and the topic branch diverge (that is, both the master and the topic branches have differing commits) will Git show them as diverging timelines, as shown in Figure 15.

![Figure 15: Diverging branches.](https://codemag.com/Article/Image/1105101/Figure 15.tiff)

As with Subversion, you can merge branches to bring the changes from one branch into another. When you merge the topic branch into the master, one of two things can happen to the timeline. If no commits have been made to master since the branch was created, Git will, by default, fast-forward the master to the same commit as the topic branch, as shown in Figure 9 and Figure 10. If, however, the master and the topic branch have diverged, then Git will apply the changes that have been made in the branch to the master, resulting in a new commit that represents the merge:

git checkout master
git merge mytopic

Since the topic branch was merged into the master branch, only the master branch’s head will be moved to the new commit, as shown in Figure 16. A developer working with a local repository can create as many branches as they need, and merge them as many times as they need, to complete their work.

![Figure 16: Merging divergent branches.](https://codemag.com/Article/Image/1105101/Figure 16.tiff)
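
The diverged case can be sketched as follows. Each branch gets its own commit touching a different (hypothetical) file, so the merge completes without conflicts and produces a commit with two parents. The --no-edit option simply accepts the default merge message.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this demo
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "base commit"

git checkout -q -b mytopic
echo "topic" > topic.txt
git add topic.txt
git commit -q -m "work on the topic branch"

git checkout -q master
echo "main" > main.txt
git add main.txt
git commit -q -m "work on master"          # the branches have now diverged

topic_before=$(git rev-parse mytopic)
git merge -q --no-edit mytopic             # creates a merge commit on master
topic_after=$(git rev-parse mytopic)

# A merge commit lists one "parent" line per branch that was joined.
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "parents of the merge commit: $parent_count"
```

Only master moves; mytopic still points to the commit it pointed to before the merge.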

A Developer and a Remote Repository

When starting to work from a remote repository and there is not an existing local copy, the first step is to clone the remote one. If the remote repository is hosted via HTTP, you can clone it from the URL directly:

git clone http://my.server.example.com/myproject.git

This will create a local copy of the repository that is hosted at that URL, in a local folder called “myproject”. The clone will pull the contents of the remote master branch down to the local repository. It will also set up a Git remote called origin. Origin is a naming convention that signifies the remote repository that was used to create the local clone. Figure 17 shows a repository being cloned from GitHub via SSH. Figure 18 then shows both the local and remote master branches pointing to the same commit.

![Figure 18: A local and remote master branch, pointing to the same commit.](https://codemag.com/Article/Image/1105101/Figure 18.tiff)

![Figure 17: Cloning a repository from GitHub via SSH.](https://codemag.com/Article/Image/1105101/Figure 17.tiff)

If a local repository exists and only needs to be connected to a remote repository, you can do this by configuring a Git remote that points to the remote repository. You can name the remote nearly anything (it does not have to be origin) as long as the name does not have spaces or special characters other than a select few, such as “-”, “_”, or “.”. You can configure the remote to connect via any protocol that Git supports, and verify the connection to that remote by calling the remote show command, as shown in Figure 19:

![Figure 19: Configuring a remote and checking the remote’s status.](https://codemag.com/Article/Image/1105101/Figure 19.tiff)

git remote add myserver git@example.com:project.git
git remote show myserver

After you have cloned a repository, the master branch from the remote will have been pulled down and checked out. However, when you configure a remote manually, no branches will be pulled down automatically, which means they will not yet be available. To pull branches down and begin using them, you must first fetch the remote branches:

git fetch myserver

Figure 20 shows the results of the fetch and Figure 21 shows the state of the repository after the fetch has been done. Note that the remote branch, “myserver/mytopic”, is now available locally, but does not have a local branch pointing to it. The local master branch was already pointing to the same commit as the remote master branch, and Git uses the commit IDs to know that these branches are the same.

![Figure 20: Fetching the branches from a remote repository.](https://codemag.com/Article/Image/1105101/Figure 20.tiff)

![Figure 21: A remote branch, “mytopic”, has been fetched but has no local branch pointing to it.](https://codemag.com/Article/Image/1105101/Figure 21.tiff)
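
The state described above can be recreated with a local folder standing in for the server. In this sketch, upstream plays the role of the remote repository and project is the local one; both names are assumptions.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"

# A small repository that plays the role of the server.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this demo
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "initial commit"
git branch mytopic
cd "$work"

# A separate local repository with a manually configured remote.
git init -q project
cd project
git remote add myserver "$work/upstream"
git fetch -q myserver             # brings down commits and remote branches

git branch -r                     # lists remote branches such as myserver/mytopic

# The remote branch is known locally, but no local branch exists for it yet.
if git show-ref --verify --quiet refs/remotes/myserver/mytopic; then
  tracking=yes
else
  tracking=no
fi
if git show-ref --verify --quiet refs/heads/mytopic; then
  local_branch=yes
else
  local_branch=no
fi
echo "remote branch fetched: $tracking, local branch exists: $local_branch"
```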

To create a local branch from the remote branch, specify the remote-name/branch-name as the starting point for the local branch. Figure 22 shows the result of creating the local branch from the remote, and how the local branch now points to the same commit:

![Figure 22: Creating a local branch from a remote.](https://codemag.com/Article/Image/1105101/Figure 22.tiff)

git branch sometopic myserver/sometopic

There is no requirement for a local branch to be named the same as a remote branch. If a local branch should have a different name, specify the name as the first parameter after git branch.
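
As an illustration of using a different local name, the following sketch creates a local branch called mywork from a remote branch called sometopic. The name mywork and the folder names are hypothetical.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"

# A small repository that plays the role of the server.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this demo
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "initial commit"
git checkout -q -b sometopic
git commit -q --allow-empty -m "topic work"
cd "$work"

git init -q project
cd project
git remote add myserver "$work/upstream"
git fetch -q myserver

# First parameter: the local name. Second parameter: the starting point.
git branch mywork myserver/sometopic

local_id=$(git rev-parse mywork)
remote_id=$(git rev-parse myserver/sometopic)
echo "mywork:             $local_id"
echo "myserver/sometopic: $remote_id"
```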

After cloning a repository or configuring a remote and pulling down the desired branch, work can be done against the local repository. Commits can be made, branches can be created and merged, etc.

When commits are made and branches are created, they are still local commits and branches; Git does not automatically update the remote repository. Once you are ready to share the commits in the local repository, though, you can push them to the remote repository. For example, if you made the commits against the master branch, as shown in Figure 23, you can push them with the push command, specifying which branch to push:

![Figure 23: Changes made in a local branch.](https://codemag.com/Article/Image/1105101/Figure 23.tiff)

git push myserver master

This will push all changes that have been made to the local master branch out to the remote master branch. The result is that the remote master branch is now pointing to the same commit as the local master, as shown in Figure 24.

![Figure 24: The results of pushing local changes to a remote repository.](https://codemag.com/Article/Image/1105101/Figure 24.tiff)

Note that commit IDs and other commit details created in a local repository stay the same when they are pushed to a remote repository. This is how Git knows what the status of a remote branch is in comparison to a local branch, and in comparison to other branches.
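
A sketch of a push, followed by a check that the commit ID is unchanged on the other side, might look like this. A bare repository named server.git stands in for the remote server.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"

git init -q --bare server.git     # stand-in for the remote repository

git init -q project
cd project
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this demo
git config user.name "Example User"
git config user.email "user@example.com"
git remote add myserver "$work/server.git"

git commit -q --allow-empty -m "local work"   # exists only locally so far
git push -q myserver master                   # share it with the remote

local_id=$(git rev-parse master)
# ls-remote asks the remote itself; the first column is the commit ID.
remote_id=$(git ls-remote myserver refs/heads/master | cut -f1)
echo "local:  $local_id"
echo "remote: $remote_id"
```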

Multiple Developers and Remote Repositories

When multiple developers are working on the same project, they will need to share their changes with each other. In Subversion, you do this through the central repository. Any commits that you make are available to the other developers, whether those commits are on a branch or on the trunk. In Git, however, commits are not always available to others immediately.

You can set up Git so that every person on a team has a remote pointing to every other person on the team, in a peer-to-peer fashion as shown in Figure 2. In this type of setup, commits are generally available when the person you want to pull commits from is available via a network. If you are all working in the same office, then the commits will likely be available immediately. However, if the team is using a shared remote repository, commits are only made available when they are pushed to the shared repository. This configuration allows Git users the same advantages of a centralized repository, like Subversion, while still providing all of the advantages of a distributed source-control system.

Regardless of the repository topology, though, a team that is collaborating using Git will follow the same practices as a developer working in a local repository and one working with a remote repository. Clones will be made, remotes will be configured, and commits and branches will be pushed and pulled to remote repositories. The difference in working with a team is how to handle someone else’s commits being available in a remote repository and the need to pull them down to your machine.

When one developer finishes some work and pushes it up to a remote repository that is shared with the other developers, then the changes will be available for the other developers. Figure 25 shows a remote connection that has commits from another developer, which need to be pulled down to the local machine.

![Figure 25: A local branch is out of date; there are changes on the remote.](https://codemag.com/Article/Image/1105101/Figure 25.tiff)

You pull commits from other developers in a remote repository by using fetch and merge:

git fetch myserver

Once you have pulled down the commits to your local machine, you can merge them into the local branch, as shown in Figure 26:

![Figure 26: A fetch and a merge to bring remote changes to a local branch.](https://codemag.com/Article/Image/1105101/Figure 26.tiff)

git merge myserver/topicbranch

Note that you do the merge from within the local branch, and Git merges the contents of the remote branch that was pulled into the local repository. No changes have been pushed out to the remote repository yet. Once the changes in the local repository are ready, you can push them up to the remote for others to fetch, merge, and work with.

There is a shortcut for bringing remote changes into a local branch: git pull. Doing a pull will perform the fetch and the merge for you, automatically. You can specify which branch to pull manually, or you can let Git figure out which branch/branches to pull based on your repository’s configuration:

git checkout topicbranch
git pull myserver topicbranch
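
To try the pull shortcut end to end, the following sketch passes the remote name and the branch name to git pull as two separate arguments. The folder upstream stands in for the shared repository, and the clone names its remote myserver through the -o option.

```shell
set -e
work=$(mktemp -d)                 # scratch directory for the example
cd "$work"

# A small repository that plays the role of the shared server.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this demo
git config user.name "Teammate"
git config user.email "teammate@example.com"
git commit -q --allow-empty -m "initial commit"
git branch topicbranch
cd "$work"

# Clone it, naming the remote "myserver" instead of the default "origin".
git clone -q -o myserver upstream project
cd project
git config user.name "Example User"
git config user.email "user@example.com"
git checkout -q topicbranch

# Meanwhile, a teammate adds a commit to topicbranch on the remote.
git -C "$work/upstream" checkout -q topicbranch
git -C "$work/upstream" commit -q --allow-empty -m "teammate work"

# Fetch and merge in one step: remote name first, then branch name.
git pull -q myserver topicbranch

local_id=$(git rev-parse topicbranch)
remote_id=$(git -C "$work/upstream" rev-parse topicbranch)
echo "local:  $local_id"
echo "remote: $remote_id"
```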

There is one catch to pushing changes out to a remote repository. When changes are available on the remote, if those changes have not yet been pulled into the local repository and merged into the local branch, Git will reject the push. All commits that exist in the remote repository’s branch must be present in the local repository’s branch before Git will allow the push to occur, as shown in Figure 27. Once the commits have been pulled from the remote repository and merged into the local branch, the push can occur.

![Figure 27: Git will fail a push if there are changes in the remote repository.](https://codemag.com/Article/Image/1105101/Figure 27.tiff)

The requirement to pull all changes first exists because a push that was not up to date with the remote’s commits would require a merge to happen on the remote side. If a conflict occurred during that merge, there might be no way to resolve it. Therefore, Git requires a local branch to be up to date (that is, to contain all commits that are in the remote branch) before it can be pushed to that remote branch.
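The rejected push can be reproduced in a sandbox. This sketch assumes a local bare repository standing in for `myserver`; the directory names, file names, and commit identity are hypothetical. It records whether the first push succeeds, then brings the branch up to date and pushes again.

```shell
# Sandbox sketch: all paths, file names, and identities are hypothetical.
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

sandbox=$(mktemp -d)
git init -q --bare "$sandbox/server.git"   # plays the role of "myserver"

# Publish topicbranch from your repository.
git init -q "$sandbox/mine"
cd "$sandbox/mine"
echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"
git checkout -q -b topicbranch
git remote add myserver "$sandbox/server.git"
git push -q myserver topicbranch

# A teammate adds a commit to the shared branch.
git init -q "$sandbox/teammate"
cd "$sandbox/teammate"
git remote add myserver "$sandbox/server.git"
git fetch -q myserver
git checkout -q -b topicbranch myserver/topicbranch
echo "theirs" > theirs.txt
git add theirs.txt
git commit -q -m "Teammate commit"
git push -q myserver topicbranch

# You commit locally without having pulled the teammate's commit.
cd "$sandbox/mine"
echo "mine" > mine.txt
git add mine.txt
git commit -q -m "My commit"

# The local branch lacks a commit that the remote has, so this push fails.
if git push -q myserver topicbranch 2>/dev/null; then
  first_push="accepted"
else
  first_push="rejected"
fi
echo "first push: $first_push"

# Bring the branch up to date, then push again.
git fetch -q myserver
git merge -q -m "Merge myserver/topicbranch" myserver/topicbranch
git push -q myserver topicbranch
echo "second push: accepted"
```

The second push goes through because the merge commit has the teammate’s commit as one of its parents, so the remote branch can simply move forward.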

Rebase: Another Way to Move Commits Between Branches

Merging is similar between Subversion and Git from a number of perspectives. They both bring changes from one branch into another (or the Trunk in the case of Subversion). They both allow work to be done independently of other changes. They both may run into merge conflicts that need to be resolved with merge tools, and they can both generally work with the same merge conflict resolution tools. Git can even work with TortoiseMerge from TortoiseSVN.

However, there are some significant differences in the options that Git provides for bringing commits from one branch into another. For example, Git provides a few options for bringing commits from one branch into another without doing a merge, such as the rebase command.

A rebase allows changes in one branch to be moved to the end of a set of changes from another branch, without doing a merge.

Rebase: It’s Sort of Like a Merge, But It’s Not a Merge

When you’ve created a topic branch from the master, it can diverge from the master branch, as shown in Figure 15. If a developer using a topic branch needs to keep up to date with the changes that have been occurring in the master branch, they can do a rebase of the topic branch instead of a merge.

A rebase is similar to a merge in that it brings changes from one branch into another. However, it is significantly different in that it does not merge the changes in one fell swoop. Instead, a rebase re-commits all changes from the branch being rebased onto the head of the target branch:

git checkout mytopic
git rebase master

When the rebase occurs, Git will first rewind the topic branch back to the point in the repository history where the branches were on the same commit. Next, the topic branch’s HEAD will be moved to the same commit as the master branch’s HEAD, the same as if the branch had been created from that commit. Git will then re-apply the individual commits that were originally made on the topic branch to the new location of the topic branch. The end result, shown in Figure 28, is that the topic branch appears to have been created from the HEAD commit of the master branch instead of at the original point from which it was created.

![Figure 28: The results of a Git rebase.](https://codemag.com/Article/Image/1105101/Figure 28.tiff)

Figure 29 shows the original state of the repository in comparison to the new state of the repository. Note that the changes from the topic branch have been moved from a separate timeline that branched away from the master to the end of the same timeline as the master. It is as if the topic branch were started and worked on after the changes had been made to the master.

![Figure 29: A side-by-side comparison of before and after a rebase.](https://codemag.com/Article/Image/1105101/Figure 29.tiff)
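To see the before-and-after shape from the figures on a real repository, the following sketch builds a small diverged history and rebases it. The repository location, file names, commit messages, and commit identity are hypothetical; only the `mytopic` and `master` branch names come from the example above.

```shell
# Sandbox sketch: paths, file names, and identities are hypothetical.
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .
# Name the first branch "master" regardless of the local default.
git symbolic-ref HEAD refs/heads/master

echo "a" > a.txt
git add a.txt
git commit -q -m "A on master"

# The topic branch diverges from master with two commits.
git checkout -q -b mytopic
echo "t1" > t1.txt
git add t1.txt
git commit -q -m "T1 on mytopic"
echo "t2" > t2.txt
git add t2.txt
git commit -q -m "T2 on mytopic"

# Meanwhile, master moves on.
git checkout -q master
echo "b" > b.txt
git add b.txt
git commit -q -m "B on master"

# Re-apply the topic commits on top of master's latest commit.
git checkout -q mytopic
git rebase -q master

# The history is now a single line: A, B, T1, T2.
git log --oneline --graph mytopic
```

After the rebase, `git merge-base master mytopic` reports the same commit as `git rev-parse master`, which is another way of saying that the topic branch now starts from the HEAD of master.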

A Rebase Rewrites History

The commits in the mytopic branch on the left side of Figure 29 and the commits in the mytopic branch on the right side of Figure 29 will no longer have the same commit ID. As such, Git will treat them as different commits even though the contents are the same. This has a few implications that are very important and need to be examined before doing a rebase.

When you do a basic merge, the commits that are merged are left alone and a new commit is made to capture the merge. Pushing and pulling commits that have been merged is never dangerous because the commit IDs are static. They will be the same on all machines that receive those commits. Because the IDs are the same, changes made on one machine can be pulled into a repository on another machine. This is part of what makes a distributed source-control system work well.

When you do a rebase, history is rewritten. The commits that were made in the topic branch no longer have the same parent, and the change sets in the commits are now applied to content that is potentially different than it was originally. This means that Git cannot keep the commit IDs that were originally used. Git re-applies the commits to a new timeline and creates new commit IDs for them.
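The change in commit IDs is easy to observe. This sketch, with a hypothetical repository, file names, and commit identity, records the ID of a topic commit before and after a rebase, and uses `git patch-id` to confirm that the change set itself is the same.

```shell
# Sandbox sketch: paths, file names, and identities are hypothetical.
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .
# Name the first branch "master" regardless of the local default.
git symbolic-ref HEAD refs/heads/master

echo "a" > a.txt
git add a.txt
git commit -q -m "A on master"

git checkout -q -b mytopic
echo "t1" > t1.txt
git add t1.txt
git commit -q -m "T1 on mytopic"

git checkout -q master
echo "b" > b.txt
git add b.txt
git commit -q -m "B on master"

git checkout -q mytopic

# Commit ID and a hash of the change set before the rebase.
before=$(git rev-parse mytopic)
patch_before=$(git show "$before" | git patch-id | cut -d' ' -f1)

git rebase -q master

# The same measurements after the rebase.
after=$(git rev-parse mytopic)
patch_after=$(git show "$after" | git patch-id | cut -d' ' -f1)

echo "commit ID before: $before"
echo "commit ID after:  $after"
echo "patch ID before:  $patch_before"
echo "patch ID after:   $patch_after"
```

The two commit IDs differ because the parent of the commit changed, while the two patch IDs match because the change set that was re-applied is the same.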

Even for the simple scenario where you only use a rebase to move the commits of a branch on to the end of another, this can be dangerous. If you bring down a branch from a remote repository, and then rebase that branch against a branch that only exists on the local machine, Git will change the IDs of the commits on the branch that were brought down. If the Git user then tries to push the branch that they rebased back up to the remote, Git will produce an error message saying the branches are out of sync. This happens because the HEAD of the branch in the local repository now points to a commit ID that is different than the branch in the remote repository. Even though the local and the remote branch share the same content in the commits, the commit IDs are not the same and therefore, Git no longer knows that they were once the same commits.
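The scenario above can be sketched in a sandbox. For brevity, the shared branch here is published from the same repository rather than brought down from a teammate, which leaves the remote in the same state. The bare repository standing in for `myserver`, the `localonly` branch, the file names, and the commit identity are all hypothetical.

```shell
# Sandbox sketch: paths, branch and file names, and identities are hypothetical.
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

sandbox=$(mktemp -d)
git init -q --bare "$sandbox/server.git"   # plays the role of "myserver"

git init -q "$sandbox/mine"
cd "$sandbox/mine"
git remote add myserver "$sandbox/server.git"
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

# A branch that also exists on the remote.
git checkout -q -b topicbranch
echo "shared" > shared.txt
git add shared.txt
git commit -q -m "Shared work"
git push -q myserver topicbranch

# A branch that exists only on this machine, started from the base commit.
git checkout -q -b localonly HEAD~1
echo "local" > local.txt
git add local.txt
git commit -q -m "Local-only work"

# Rebase the shared branch onto the local-only branch.
git checkout -q topicbranch
git rebase -q localonly

local_id=$(git rev-parse topicbranch)
remote_id=$(git --git-dir="$sandbox/server.git" rev-parse topicbranch)
echo "local topicbranch:  $local_id"
echo "remote topicbranch: $remote_id"

# The remote's commit is no longer part of the local branch's history.
if git push -q myserver topicbranch 2>/dev/null; then
  push_result="accepted"
else
  push_result="rejected"
fi
echo "push after rebase: $push_result"
```

The "Shared work" commit on the remote and its rebased counterpart in the local repository carry the same change but different IDs, so Git sees a remote commit that the local branch does not contain and refuses the push.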

Rebase Conflicts Have an Advantage and a Disadvantage

When a series of commits is merged, the commits are essentially rolled into one large change set and then applied en masse. There is no choice but to take every change that has occurred at the same time. When the merge is successful, this is not a disadvantage. When a merge fails due to a conflict, though, the user is forced to look at all changes from all commits in the merge resolution process.

A Git rebase is not immune to conflicts, and rebase conflicts are handled similarly to merge conflicts. However, when you do a rebase and conflicts occur, a potential advantage presents itself. Because Git applies every individual commit to the target branch in sequence, it can apply the changes in much smaller increments, as small as a single commit. The conflict can then be as small as an individual commit as well, resulting in far less content for the user to sift through to resolve it.

Unfortunately, applying the individual commits one at a time can also be a disadvantage. Resolving one conflict may cause a commit that has yet to be applied to conflict as well. It is possible, and it does happen, for a rebase to result in a conflict for every commit that is re-applied.
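A rebase that stops on a single commit can be sketched as follows. The repository, the `settings.txt` and `notes.txt` files, the resolved content, and the commit identity are hypothetical, and the conflict is resolved by simply writing the desired content rather than with a merge tool. `GIT_EDITOR` is set to `true` only so that the script never waits for an editor.

```shell
# Sandbox sketch: paths, file names, and identities are hypothetical.
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
export GIT_EDITOR=true   # accept commit messages without opening an editor

repo=$(mktemp -d)
cd "$repo"
git init -q .
# Name the first branch "master" regardless of the local default.
git symbolic-ref HEAD refs/heads/master

echo "greeting = hello" > settings.txt
git add settings.txt
git commit -q -m "Add settings"

# Two topic commits: the first edits the shared line, the second does not.
git checkout -q -b mytopic
echo "greeting = hello from topic" > settings.txt
git commit -q -am "Topic edits greeting"
echo "topic notes" > notes.txt
git add notes.txt
git commit -q -m "Topic adds notes"

# Master edits the same line, which sets up a conflict.
git checkout -q master
echo "greeting = hello from master" > settings.txt
git commit -q -am "Master edits greeting"

git checkout -q mytopic
stops=0
if ! git rebase master >/dev/null 2>&1; then
  stops=$((stops + 1))
  # Only the first topic commit is in conflict; resolve it and move on.
  echo "greeting = hello from master and topic" > settings.txt
  git add settings.txt
  git rebase --continue >/dev/null 2>&1
fi

echo "rebase stopped $stops time(s)"
git log --oneline mytopic
```

Here the rebase pauses only for the commit that touched the conflicting line, and the second topic commit is re-applied without any intervention. Had the second commit also edited `settings.txt`, the rebase could have stopped a second time.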

Rebase Is a Tool You Should Own

There are some scenarios where a rebase makes more sense than a merge, in spite of the potential danger. Rebase is another tool that all Git users should have in their toolbox. Using it without understanding the potential consequences is not a good idea, though. Take the time to understand how a rebase works, what the core options are, and the side effects and potential downfalls.

Git Tools and Integration

One of Subversion’s biggest draws is the size of its community. Subversion has been around for a long time and is a very popular system with a thriving community. There are hundreds, if not thousands, of tools that support Subversion directly, including some very easy-to-use GUI tools and integration with the most popular IDEs.

Git, though relatively young in comparison with Subversion, also has a thriving community supporting it; a community that is also rapidly growing. There are a good number of tools to help integrate Git into the most popular IDEs and other tools, with more being created and refined on a regular basis. As Git continues to grow in popularity, the number of tools and integration points also continues to grow.

Here are some of the more popular tools for working with Git and integrating it into a software developer’s daily work. For a more complete list of Git-based tools, see the Git wiki at https://git.wiki.kernel.org/index.php/Interfaces,_frontends,_and_tools.

MSysGit

URL: http://code.google.com/p/msysgit/

MSysGit is the standard Git installation on Windows-based machines. It brings all the power of Git to the Windows command line, includes a “Git Bash Shell” for those that are comfortable with Linux-style command-line shells, and includes a few GUI tools to help manage Git repositories.

GitK

Included with MSysGit and most other Git installations, GitK is a Git log and history viewer. It can show a repository’s complete history, the remote branches that have been pulled down, file changes, difference logs, and more.

GitX

URL: http://gitx.frim.nl/

GitX is an OSX tool that brings the features of GitK into a native OSX application. It provides a history viewer, allows commits to be performed within it, etc.


Git Extensions

URL: http://code.google.com/p/gitextensions/

Git Extensions is a cross-platform suite of GUI tools to help manage Git repositories. It includes a log viewer (similar to the GitK tool that comes with MSysGit), File History Explorer, and other features that integrate directly into Windows Explorer and Microsoft Visual Studio. It also includes MSysGit in the installer, making it a one-stop shop for Git needs in a Windows environment.

TortoiseGit

URL: http://code.google.com/p/tortoisegit/

TortoiseGit is a Git port of the popular Subversion tool, TortoiseSVN. It integrates directly with Windows Explorer and provides a GUI front end to the most common Git commands.

Tower

URL: http://www.git-tower.com/

Tower is an OSX Git client that includes many features and is similar to GitK and the Git Extensions for Windows. It has a repository manager and history viewer, integrates with other apps, and more.

Smart Git

URL: http://www.syntevo.com/smartgit/index.html

Smart Git is a multi-platform (Windows, OSX, and Linux) Git client that aims to help beginners get started quickly.

Posh-Git

URL: https://github.com/dahlbyk/posh-git

Posh-Git, or “PowerShell Git”, brings several features of the Git Bash Shell from MSysGit into Windows PowerShell, for those who are more comfortable with PowerShell.

Hosting Git Repositories

There are a wide variety of options for hosting a Git repository and making it available to others on private networks or via the Internet. The most basic option is to provide a shared folder on a network drive, or on a developer’s computer. However, there are many other options that provide a more secure hosting platform with additional functionality.
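The shared-folder option can be sketched with a bare repository. In this hypothetical example, a temporary directory stands in for the network share, and the `project.git`, `first-dev`, and `second-dev` names, the file content, and the commit identity are placeholders.

```shell
# Sandbox sketch: paths, names, and identities are hypothetical.
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

sandbox=$(mktemp -d)
share="$sandbox/shared-folder"   # stand-in for a folder on a network drive
mkdir -p "$share"

# A bare repository has no working directory; it exists only to be shared.
git init -q --bare "$share/project.git"

# The first developer publishes the current branch to the shared folder.
git init -q "$sandbox/first-dev"
cd "$sandbox/first-dev"
echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"
git remote add origin "$share/project.git"
git push -q origin HEAD

# A second developer clones straight from the shared folder's path.
git clone -q "$share/project.git" "$sandbox/second-dev"
ls "$sandbox/second-dev"
```

A plain file path is enough for this kind of hosting, which is why it is the most basic option; it also means that access control is limited to whatever the file system itself provides.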

For a more complete list of Git hosting options, see the Git wiki at https://git.wiki.kernel.org/index.php/Interfaces,_frontends,_and_tools.

GitHub

URL: http://github.com/

GitHub is a Git hosting service that provides free and paid hosting options, for open source and private repositories. There is also a “Behind The Firewall” option for internal network hosting. GitHub provides many project management tools including issue tracking, wiki pages, and other collaborative features as well.

In recent years, GitHub has become the standard for hosting open-source projects on the web, and has helped popularize the use of Git around the world.

Gitorious

URL: http://gitorious.org/

Gitorious provides repository hosting for private and open-source projects, and web-based hosting and collaboration tools. It also provides tools for project management to facilitate teamwork and collaboration, such as wiki pages, and has an installable version of its own software for internal network hosting.

Unfuddle

URL: http://unfuddle.com

Unfuddle is a hosted project-management site that provides issue tracking, source control, and other services. It can host both Git and Subversion repositories and has options for both free and paid accounts.

Gitolite

URL: http://github.com/sitaramc/gitolite/

Gitolite is a Unix/Linux-based Git server that uses SSH (Secure Shell) to host Git repositories. It uses a Git repository that is hosted in Gitolite to provide administrative capabilities, including the ability to create other repositories and assign permissions to users.

Git.aspx

URL: https://github.com/jeremyskinner/git-dot-aspx

Git.aspx is a Git repository server that runs on IIS7 and ASP.NET. It is a work in progress, but provides an option for Microsoft-based shops to host Git repositories internally.

Additional Reading and Resources

There is a lot more to learn about Git, how it works, and how to work with it than has been described here. Here are a few resources for continuing down the path of learning Git.

The Official Git Website

URL: http://git-scm.com/

The official Git website provides access to the core Git documentation and many other resources, including links to other Git resources. It is a good place to start and to find information and detailed specifications on how Git works.

Git Immersion

URL: http://gitimmersion.com/

Git Immersion is a website dedicated to getting a new Git user up and running quickly. It walks the reader through the basics of installation, configuration, and working with a Git repository in detail.

Pro Git

URL: http://progit.org

The Pro Git book, published by Apress, is available free of charge through their website. It provides a detailed look at how to work with Git and how Git works, covering nearly every aspect of Git.

Git Ready

URL: http://gitready.com

Git Ready is a website that aims to help people “learn Git one commit at a time”.

Git From a Developer’s Perspective

URL: http://www.code-magazine.com/Article.aspx?QuickID=1008091

Published in the July/August 2010 issue of CODE Magazine, Git From a Developer’s Perspective provides a laundry list of the most common Git commands and how to work with them. It also includes information on working with GitHub repositories for collaborating with other Git users.

Git Reference

URL: http://gitref.org/

Git Reference is a website for quick reference of commonly used Git commands.