Engineering LibreTexts

14-B.3: Git: Version Control


    Software development today is often distributed across many contributors and locations. This article focuses on one technology that supports distributed software development: Git.

    What is Git about?

    • Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
    • Git is built around distributed development: more than one developer can have access to the source code of an application and can make changes to it, which other developers can then see.
    • It was initially designed and developed by Linus Torvalds in 2005 for Linux kernel development.
    • Every Git working directory is a full-fledged repository with complete history and full version-tracking capabilities, independent of network access or a central server.
    • Git allows a team of people to work together, all using the same files, and it helps the team cope with the confusion that tends to happen when multiple people are editing the same files.

    Why Use Version Control Software?

    • Version control software allows the user to have “versions” of a project, which show the changes that were made to the code over time, and allows the user to backtrack if necessary and undo those changes.
    • This ability alone, being able to compare two versions or reverse changes, makes version control invaluable when working on larger projects.
    • In a version control system, changes are saved as they are made: conceptually, each change is a patch that could be applied to one version to make it the same as the next version.
    • In a centralized version control system, all versions are stored on a central server, and individual developers check out files from this server and upload changes back to it. Git instead gives every developer a complete copy of the history, as described under Characteristics of Git.
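
The two abilities named above, comparing versions and reversing a change, can be sketched with a few Git commands. This is only an illustration under stated assumptions: the temporary directory, the file name notes.txt, and the identity "Example User" are hypothetical placeholders, not part of the original text.

```shell
#!/bin/sh
# Sketch: record two versions of a file, compare them, then undo the second.
# Directory, file name, and identity are placeholders chosen for this example.
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

# First version
echo "version 1" > notes.txt
git add notes.txt
git commit --quiet -m "First version of notes.txt"

# Second version
echo "version 2" > notes.txt
git commit --quiet -am "Second version of notes.txt"

# Compare the two versions: prints a patch that turns the first into the second.
git diff HEAD~1 HEAD

# Reverse the latest change by recording a new commit that undoes it.
git revert --no-edit HEAD >/dev/null
restored=$(cat notes.txt)
echo "notes.txt now contains: $restored"
```

Note that git revert does not erase history; it adds a third commit whose effect is to restore the earlier content, so both versions remain available for inspection.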

    Characteristics of Git

    1. Strong support for non-linear development
    • Git supports rapid branching and merging, and includes specific tools for visualizing and navigating a non-linear development history.
    • A major assumption in Git is that a change will be merged more often than it is written.
    • Branches in Git are very lightweight.
    2. Distributed development
    • Git provides each developer a local copy of the entire development history, and changes are copied from one such repository to another.
    • Changes copied in this way can be merged efficiently, in the same way as a locally developed branch.
    3. Compatibility with existing systems/protocols
    • Git has a CVS server emulation, which enables the use of existing CVS clients and IDE plugins to access Git repositories.
    4. Efficient handling of large projects
    • Git is very fast and scalable compared to other version control systems.
    • Fetching version history from a local repository is much faster than fetching it from a remote server.
    5. Data assurance
    • The Git history is stored in such a way that the ID of a particular version (a commit) depends upon the complete development history leading up to that commit.
    • Once published, it is not possible to change the old versions without it being noticed.
    6. Automatic garbage collection
    • Git automatically performs garbage collection when enough loose objects have been created in the repository.
    • Garbage collection can be called explicitly using git gc --prune.
    7. Periodic explicit object packing
    • Git stores each newly created object as a separate file. To save space, it uses packs, which store a large number of objects, delta-compressed among themselves, in a single file (or network byte stream) called a packfile.
    • A corresponding index file is created for each packfile, specifying the offset of each object in the packfile.
    • The process of packing can be computationally expensive.
    • Git allows the expensive pack operation to be deferred until a time when speed does not matter.
    • Git does periodic repacking automatically, but manual repacking can be done with the git gc command.
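
The garbage collection and packing behaviour described in the last two items can be observed directly. The following is a minimal sketch, assuming a throwaway repository; the file name data.txt and the identity are hypothetical. It counts loose objects with git count-objects before and after a manual git gc.

```shell
#!/bin/sh
# Sketch: watch loose objects get packed by a manual "git gc".
# The repository, file name, and identity are placeholders for this example.
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

# Each commit stores new objects (file contents, directory tree, commit)
# as separate "loose" files under .git/objects.
for i in 1 2 3; do
  echo "line $i" >> data.txt
  git add data.txt
  git commit --quiet -m "Add line $i"
done

loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')
echo "loose objects before gc: $loose_before"

# Manual repacking: loose objects are combined into a packfile.
git gc --quiet

loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
packs=$(git count-objects -v | awk '/^packs:/ {print $2}')
echo "loose objects after gc:  $loose_after"
echo "packfiles after gc:      $packs"
```

In a repository this small the saving is negligible; the point is only to show where the loose objects go.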

    How Git Works

    1. A Git repository is a key-value object store where all objects are indexed by their SHA-1 hash value.
    2. All commits, files, tags and filesystem tree nodes are different types of objects living in this repository.
    3. A Git repository is a large hash table with no provision made for hash collisions.
    4. Git specifically works by taking “snapshots” of files.
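
The key-value view of a repository can be explored with Git's low-level commands. This is a sketch under assumptions: the temporary repository, the file name greeting.txt, and the identity are hypothetical. It stores a piece of content, looks it up again by its hash, and shows that commits and trees are objects too.

```shell
#!/bin/sh
# Sketch: a Git repository as a content-addressed object store.
# Repository location, file name, and identity are placeholders.
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

# Store some content; -w writes the object and prints its hash (the key).
blob_id=$(echo "hello" | git hash-object -w --stdin)
# Hashing the same content again yields the same key.
same_id=$(echo "hello" | git hash-object --stdin)
echo "key: $blob_id"

# Look the object up by its key.
obj_type=$(git cat-file -t "$blob_id")
obj_body=$(git cat-file -p "$blob_id")
echo "type: $obj_type, content: $obj_body"

# Commit a file with the same content: commits and trees are objects as well,
# and the file is stored under the key computed above.
echo "hello" > greeting.txt
git add greeting.txt
git commit --quiet -m "Add greeting"
commit_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')
file_id=$(git rev-parse HEAD:greeting.txt)
echo "HEAD is a $commit_type pointing at a $tree_type"
```

Because the key is derived from the content, two files with identical content share a single stored object.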

    Git Repository Structure

    The structure consists of four parts:

    1. Working Directory: This is the local directory where you create the project (write code) and make changes to it.
    2. Staging Area (or index): This is where you place changes before committing them. It lets you review and choose exactly which changes will go into the next commit.
    3. Local Repository: This is where your commits are recorded before you push them to the central repository on GitHub. It is what makes Git a distributed version control system, and it corresponds to the .git folder in the project directory.
    4. Central Repository: This is the main project on the central server; every team member has a copy of it as a local repository.

    All of this structure is managed internally by Git and is transparent to the developer.
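
One way to see the first three parts in action is to follow a single file from the working directory through the staging area into the local repository, checking git status at each step. A minimal sketch, assuming a throwaway repository and a hypothetical file named notes.txt:

```shell
#!/bin/sh
# Sketch: follow one file through working directory, staging area, and
# local repository. Names and identity are placeholders for this example.
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

# 1. Working directory: the file exists but Git is not tracking it yet.
echo "draft" > notes.txt
in_working_dir=$(git status --short)
echo "after editing:    $in_working_dir"

# 2. Staging area: the file is queued for the next commit.
git add notes.txt
in_staging=$(git status --short)
echo "after git add:    $in_staging"

# 3. Local repository: the snapshot is recorded inside the .git folder.
git commit --quiet -m "Add notes.txt"
after_commit=$(git status --short)
echo "after git commit: ${after_commit:-(nothing to report)}"
```

In the short status format, ?? marks an untracked file and A marks a file newly added to the staging area; an empty result means the working directory matches the last commit.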

    Git Command Examples

    To add a file Readme.txt to the staging area to track its changes:

    pbmac@pbmac-server $ git add Readme.txt

    To create a new branch named Testing:

    pbmac@pbmac-server $ git branch Testing

    To switch to branch Testing from master branch:

    pbmac@pbmac-server $ git checkout Testing

    To clone (make a local copy of) a remote repository on your system; the URL must name a repository, not an individual file:

    pbmac@pbmac-server $ git clone https://github.com/pbmac/TestFile.git

    To commit our changes (take a snapshot) and provide a message to remember for future reference:

    pbmac@pbmac-server $ git commit -m "Created a Readme.txt"

    To set basic configuration values such as your name and email (these are stored by Git on your machine, not by GitHub):

    pbmac@pbmac-server $ git config --global user.name "Your name here"

    To create a new local Git repository in the current folder (this creates the .git folder that Git commands use to manage that repository):

    pbmac@pbmac-server $ git init

    To check the history of commits for our reference:

    pbmac@pbmac-server $ git log

    To merge Testing branch with master branch:

    pbmac@pbmac-server $ git merge Testing

    To push the commits on the master branch of our local repository to the server (the central repository):

    pbmac@pbmac-server $ git push -u origin master

    To see what has changed since the last commit; this lists files that are staged and ready to be committed, files that have been modified but not yet staged, and untracked files:

    pbmac@pbmac-server $ git status

     

    Setting Up Repositories

    In the following example, supply your own name and email address as the values of user.name and user.email. Do not leave the placeholder text inside the quotes.

    # Configure global settings
    pbmac@pbmac-server $ git config --global user.name "Your name here"
    pbmac@pbmac-server $ git config --global user.email "your_email@example.com"
    
    # Make a directory for the Git project
    pbmac@pbmac-server $ mkdir newProject
    pbmac@pbmac-server $ cd newProject/
    
    # Initialize the directory as a Git repository
    pbmac@pbmac-server $ git init
    Initialized empty Git repository in /home/pbmac/Development/newProject/.git/
    
    # Create a code file
    pbmac@pbmac-server $ edit file1.cpp
    
    # Add it to the staging area
    pbmac@pbmac-server $ git add *.cpp
    
    # Commit the changes to the local repository
    pbmac@pbmac-server $ git commit -m 'test initial versions'
    [master (root-commit) c9ed3f6] test initial versions
     1 file changed, 0 insertions(+), 0 deletions(-)
     create mode 100644 file1.cpp
    

    Branching and Merging

    Branching deserves particular attention: create a separate branch to develop a feature (or work on a bug) without disturbing the master branch. If it works out, you can merge it back into master; if it doesn’t, you can trash it. Branching is easy, so on big projects it is worth doing often.

    To create a branch called new_feature:

    pbmac@pbmac-server $ git branch new_feature
    

    Then “check it out”:

    pbmac@pbmac-server $ git checkout new_feature 
    

    Make various modifications, and then add and commit. To go back to the master branch, check it out:

    pbmac@pbmac-server $ git checkout master 
    

    To push the branch to GitHub, use this:

    pbmac@pbmac-server $ git push origin new_feature 
    

    If you make changes to the master branch, you’ll want to merge them into your exploratory one:

    pbmac@pbmac-server $ git checkout new_feature
    pbmac@pbmac-server $ git merge master 
    

    If you’re satisfied with your changes in the exploratory branch, merge them into the master:

    pbmac@pbmac-server $ git checkout master
    pbmac@pbmac-server $ git merge new_feature 
    

    If you’re done with the branch and want to delete it:

    pbmac@pbmac-server $ git branch -d new_feature
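
The branching steps above can be run end to end in a throwaway repository. The following is a sketch, not a prescribed workflow: the file name app.txt and the identity are hypothetical, and the script reads the name of the default branch instead of assuming it is master, because newer Git installations may be configured to use a different default.

```shell
#!/bin/sh
# Sketch: create a branch, commit on it, merge it back, and delete it.
# File name and identity are placeholders for this example.
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

# Name of the default branch (master, unless configured otherwise).
main_branch=$(git symbolic-ref --short HEAD)

echo "base" > app.txt
git add app.txt
git commit --quiet -m "Initial commit"

# Create the feature branch and switch to it.
git branch new_feature
git checkout --quiet new_feature

# Work on the feature without touching the main branch.
echo "feature work" >> app.txt
git commit --quiet -am "Add feature work"

# Switch back and merge the finished feature.
git checkout --quiet "$main_branch"
git merge --quiet new_feature

# The branch is fully merged, so -d can delete it safely.
git branch -d new_feature
git log --oneline
```

Because the main branch had no new commits of its own, this merge simply moves it forward to the feature branch's latest commit.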

    Git Collaboration

    One of Git's strengths is the ability for multiple people to collaborate on a project. There are different methods to make collaboration work; one of them is described below. In this example, currentMaster is the known, working branch of the project, newFeature is the branch that holds the new code, and origin is the name of the remote repository on GitHub.

    1. Update the local repository with currentMaster, the known and working branch of the project:
      git pull origin currentMaster
    2. Create a new branch where the new feature code will be added and tested:
      git checkout -b newFeature
    3. Once the changes have been made on this branch, they need to be staged for commit (the dot stages every change in the current directory; you can name specific files instead):
      git add .
    4. Now the changes need to be committed on the local repository:
      git commit -m "Added the new feature as requested by the client"
    5. Upload the changes (including the newFeature branch) to GitHub:
      git push origin newFeature
    6. Go to the main repository on GitHub - the newFeature branch should now be visible there.
    7. Click on the newFeature branch.
    8. Click on the “Pull Request” button - this asks for the code to be merged into the main (currentMaster) branch on GitHub.
    9. Click on “Send Pull Request”, which sends the pull request to the individual on the team in charge of merging changes.
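
The command-line part of this workflow (steps 1 to 5) can be rehearsed without a GitHub account. In the sketch below, a local bare repository named central.git stands in for the repository on GitHub; that stand-in, the remote name origin, the directory names seed and dev, the file app.txt, and the identity are all assumptions made for illustration. The pull request steps happen in the GitHub web interface and are not covered here.

```shell
#!/bin/sh
# Sketch: rehearse the pull/branch/commit/push steps against a local bare
# repository that stands in for GitHub. All names are placeholders.
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the shared repository on GitHub.
git init --quiet --bare central.git

# Seed it with a currentMaster branch holding the known, working code.
git init --quiet seed
cd seed
git config user.name "Example User"
git config user.email "user@example.com"
echo "stable code" > app.txt
git add app.txt
git commit --quiet -m "Initial working version"
git branch -m currentMaster
git remote add origin "$work/central.git"
git push --quiet origin currentMaster
cd "$work"

# A collaborator obtains a copy and follows steps 1 to 5.
git clone --quiet --branch currentMaster central.git dev
cd dev
git config user.name "Example User"
git config user.email "user@example.com"
git pull --quiet origin currentMaster      # step 1: update
git checkout --quiet -b newFeature         # step 2: new branch
echo "new feature" >> app.txt
git add .                                  # step 3: stage
git commit --quiet -m "Added the new feature as requested by the client"  # step 4
git push --quiet origin newFeature         # step 5: upload the branch

# The shared repository now lists both branches.
remote_branches=$(git ls-remote --heads origin)
echo "$remote_branches"
```

With a real GitHub repository, the clone would use the repository's URL instead of a local path, and the newFeature branch would then appear on the website as described in step 6.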

    What is .gitignore

    Files in your working Git repository can be:

    1. Untracked: Files that have not been staged or committed
    2. Tracked: Files that have been staged or committed
    3. Ignored: Files you tell Git to ignore

    There are some files you want Git to ignore and not track in your repository. These include many that are auto-generated or platform-specific, as well as other local configuration files such as:

    1. Files with sensitive information
    2. Compiled code, such as .dll or .class
    3. System files like .DS_Store or Thumbs.db
    4. Files with temporary information such as logs, caches, etc.
    5. Generated files such as dist folders

    If you don't want Git to track certain files in your repository, there is no Git command that makes it ignore them. (You can stop tracking a file that is already tracked with the git rm command, such as git rm --cached <file>.) Instead, you need to use a .gitignore file, a text file that tells Git which files not to track.

    It's easy to create a .gitignore file; just create a text file and name it .gitignore. Remember to add a single dot ( . ) at the beginning of the file name. That's it!
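
As an illustration, the sketch below writes a small .gitignore covering some of the file types listed earlier and then checks its effect. The specific patterns and the file names (Main.java, Main.class, debug.log, dist/bundle.js) are hypothetical examples, not a recommended set for any particular project.

```shell
#!/bin/sh
# Sketch: create a .gitignore and confirm which files Git ignores.
# Patterns and file names are placeholders chosen for this example.
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet

# One pattern per line; lines starting with # are comments.
cat > .gitignore <<'EOF'
# Compiled code
*.class
*.dll
# System files
.DS_Store
Thumbs.db
# Logs and generated output
*.log
dist/
EOF

# Create a mix of files that should and should not be ignored.
touch Main.java Main.class debug.log
mkdir dist
touch dist/bundle.js

# Only files that are not ignored show up as untracked.
visible=$(git status --short)
echo "$visible"

# check-ignore -v reports which pattern causes a file to be ignored.
git check-ignore -v Main.class debug.log
```

The .gitignore file itself is usually committed along with the project so that every collaborator ignores the same files.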

    Adapted from:
    "git/github guide a minimal tutorial" by Karl W. Broman is licensed under CC BY 4.0
    "Version Control with Git - Collaborating" by The Carpentries is licensed under CC BY 4.0
    "Don't ignore .gitignore" by Rajeev Bera, OpenSource.com is licensed under CC BY 4.0
