Monday, March 2, 2015

Professional source code management with Git

If you want to start working on an open-source project, or if you are applying for a new job that requires previous experience with distributed revision control and source code management (SCM), then you will almost certainly be expected to know how to use Git.

Git is a huge monster: it is pretty extensive and can be overwhelming at first. If you don't believe me, just take a look at the Git bash documentation. In fact, please do. It is a great place to start if the following terminology is foreign to you: pushing, pulling, rebasing, branching.

When you join a new open-source project with the intention of contributing to it, the first step is to fork the repository. Forking and then cloning the repository gives you a clean slate to work on. Depending on what you intend to implement or fix (and also depending on the team), it is usually a good idea to create a branch immediately after forking, because a fresh clone of your fork starts out pointing at the master branch.

Making your changes directly on the master branch can be catastrophic. Depending on how much your new code affects existing code (e.g. if you are implementing a totally new function that overrides an existing one), on the size of the project, and on the number of commits (per day/week/month), you will usually run into some merging (and probably political) problems. Creating your own branch also gives you the freedom to implement without having to worry about merging important changes along the way. Ideally you will only have to merge once, at the end.
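
As a minimal sketch of the clone-then-branch step, the commands below build a small local repository as a stand-in for your fork, so the example runs offline. In practice you would pass your fork's URL to git clone instead. The directory names, the user identity and the branch name issue-3-newline-fix are all hypothetical.

```shell
set -e
work=$(mktemp -d)

# Stand-in for your fork (assumption): a local repository with one commit on master.
git init --quiet "$work/fork"
cd "$work/fork"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "hello" > README
git add README
git commit --quiet -m "Initial commit"

# Clone the fork; the clone starts out on master.
git clone --quiet "$work/fork" "$work/project"
cd "$work/project"
start_branch=$(git rev-parse --abbrev-ref HEAD)

# Branch immediately, before touching any code.
git checkout --quiet -b issue-3-newline-fix
current_branch=$(git rev-parse --abbrev-ref HEAD)
echo "Started on $start_branch, now working on $current_branch"
```

From here on, every commit lands on the new branch and master stays untouched.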

All of your changes will be pushed to your own branch, which is merged into the master branch at the end using a pull request. Professional source code management also calls for professional and consistent commit messages. Ideally you will commit only once, with a single commit message that has the following structure:
Title - provides a brief summary of the Issue/Bug/Feature, e.g. Issue 3 - Fixed a new line error in the graphics engine

Description - provides a detailed description of what was fixed, e.g. The class graphicsEngine.js was ignoring CRLFs from Windows machines. This class now takes CRLFs into consideration.

Between the title and the description you leave a blank line (line endings are CRLF on Windows, but ideally only LF).
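
One way to produce a message with this structure from the command line is to pass -m twice to git commit: Git treats each -m value as its own paragraph and separates them with a blank line. The sketch below uses a throwaway repository, and the file contents and user identity are made up for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical change to commit.
echo "// handles CRLF and LF line endings" > graphicsEngine.js
git add graphicsEngine.js

# First -m is the title, second -m is the description.
git commit --quiet \
  -m "Issue 3 - Fixed a new line error in the graphics engine" \
  -m "The class graphicsEngine.js was ignoring CRLFs from Windows machines. This class now takes CRLFs into consideration."

# Read the message back: %s is the title, %b the description, %B the whole message.
title=$(git log -1 --format=%s)
description=$(git log -1 --format=%b)
git log -1 --format=%B
```
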
Unfortunately, a branch does not always end up with a single commit. Sometimes you make mistakes or write commit messages that are not clear enough. Sometimes you commit more than once, because you thought you were ready but clearly missed something in the end. Or maybe you just like committing stuff. To get all your commits into one, you need to squash them. This is done with interactive rebasing (git rebase -i). Rebasing lets you merge all your commits and commit messages into one. It also lets you pick which commits you want merged and which should be left out. It is a great tool for refactoring your commits!
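
To make the squashing step concrete, here is a rough sketch on a throwaway repository with three small commits on a branch. Normally git rebase -i opens an editor in which you change "pick" to "squash" by hand. So that the example runs unattended, it sets GIT_SEQUENCE_EDITOR to a sed command that makes that edit and GIT_EDITOR to true, which accepts the combined commit message as is. Those two stand-ins, the file name and the branch name are assumptions for illustration only.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"

# One commit on master, then three small commits on a feature branch.
echo "base" > notes.txt
git add notes.txt
git commit --quiet -m "Initial commit"
git checkout --quiet -b issue-3-newline-fix
for n in 1 2 3; do
  echo "change $n" >> notes.txt
  git commit --quiet -am "Issue 3 - work in progress $n"
done
before=$(git rev-list --count master..HEAD)

# By hand: run `git rebase -i master` and change "pick" to "squash"
# on every line but the first. The two editor stand-ins do the same here.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase --quiet -i master

after=$(git rev-list --count master..HEAD)
echo "Commits on branch: $before before, $after after squashing"
```

When you do this by hand, the second editor window is where you would tidy the combined message into a single title and description.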

There are many other aspects to professional source code management with Git. This tutorial only covers the tip of the iceberg, but it provides a quick introduction to the power that distributed revision control tools such as Git can offer.

