1.0 BACKGROUND
Software is unique in that it is intangible and highly malleable. It is intangible because we cannot touch it; the best we can do is look at the files. It is malleable because it is so easy to change, whether by design or by mistake, and equally easy to damage. These qualities place demanding requirements on Software Configuration Management.
It is instructive to compare software development with an engineering activity such as building a bridge. The major activities in building a bridge are the study of requirements, project design, structural design, manufacture of structures, construction of the bridge and testing. Software development has corresponding activities: study of requirements, system design, software design, coding and testing. The two are therefore quite similar. Bridge building includes desk activities such as project design, structural design and preparation of drawings, and there are elaborate configuration management systems for the outputs of these activities. In software, almost all work is desk work. Taking a cue from engineering, we need good configuration management methods for the outputs of software-related activities.
The most important activities in the life cycle of a software system are programming, testing and deployment. A software product or system undergoes many revisions during its life cycle. We need to ensure that no later version regresses relative to an earlier one and that multiple developers at multiple locations can work efficiently. Software Configuration Management therefore needs an elaborate system for Source Code Management (SCM). Revision control systems help in managing the source code. Examples are Source Code Control System (SCCS), developed by Marc Rochkind (1972); Revision Control System (RCS), originally developed by Walter F. Tichy (1982); Concurrent Versions System (CVS) (1990); Apache Subversion (SVN) (2000); and Git, originally developed by Linus Torvalds (2005).
2.0 AN EXAMPLE
Consider a hypothetical example of Alice, a solo programmer, developing a Simple Software Product (SSP). Alice periodically releases progressive versions of SSP as she adds new functions to this software and refines the existing ones. The SSP development and release diagram looks like this.
Fig. 1 Project development tree
Each rectangle in the above figure represents the release of an SSP version. The arrows between the rectangles represent the development work between releases.
So, what are the configuration management issues here? The basic source code management issues are relevant even in a project as simple as this.
2.1 Software must not regress
As Alice releases a new software version, her major concern is that none of the functionality offered by the previous version is lost. Because software is highly malleable, some text can easily be deleted by accident during editing, breaking a function of SSP that used to work. Like many others in the software field, Alice uses informal methods for SCM. Before releasing a version, she compares its files with those of the previous version using the diff command,
diff -Nur previous-version new-version > differences-for-new-version
The differences-for-new-version file lists all the changes made to the software since the last release. Alice examines this file carefully and, once satisfied that the changes are intended, goes ahead with the release. She also keeps an archive of each released version, along with documentation such as a description of the work done in that version, in multiple safe places.
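The release check described above can be sketched as a small shell script. This is only an illustration under stated assumptions: the directory names ssp-1.0 and ssp-1.1, the file main.txt and the archive name are hypothetical, and the older tree is passed to diff first so that added lines are marked with "+".

```shell
#!/bin/sh
# Sketch of a manual release check. All names here are hypothetical.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1

# Two tiny stand-in release trees.
mkdir ssp-1.0 ssp-1.1
printf 'hello\n' > ssp-1.0/main.txt
printf 'hello\nworld\n' > ssp-1.1/main.txt

# Older tree first, newer tree second.
# diff exits with 0 if the trees are identical, 1 if they differ, 2 on error.
status=0
diff -Nur ssp-1.0 ssp-1.1 > differences-for-1.1 || status=$?
if [ "$status" -gt 1 ]; then
    echo "diff failed" >&2
    exit 2
fi

# Review the changes before releasing.
cat differences-for-1.1

# Archive the released tree together with the reviewed differences.
tar -czf ssp-1.1.tar.gz ssp-1.1 differences-for-1.1
```

In practice the archive would then be copied to more than one safe location; that step is left out here because it depends on the local setup.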
2.2 Difficulties with Manual SCM
There are some inherent difficulties with manual source code management. It is tedious, requires considerable effort and is prone to errors, so our efficiency in doing it is quite low. Because of the effort involved, there will be reluctance to go through the entire version-making workflow for small changes. Such changes may be held back, which might result in lower end-user satisfaction, or released without going through the version control workflow at all. One might also miss some documentation during a version release. And if something goes drastically wrong with the software, we will be reluctant to go back to an earlier version because we will not know which version to go back to and how much functionality would be lost in the process.
Normally, we are interested in configuration management of only some of the files, such as source files, configuration files and perhaps some data files. The version directory tree also contains many files generated by the system, such as object and executable files. The diff command above finds differences between all the files in the directory tree, so its output might be voluminous. That is another difficulty with manual version control: we cannot easily identify the files that should be put under configuration management.
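One partial remedy, while still working manually, is to tell diff to skip generated files. The sketch below assumes that object files end in .o and uses the -x (exclude) option of diff; the tree and file names are hypothetical.

```shell
#!/bin/sh
# Sketch: leave generated files out of the comparison. Names are hypothetical.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1

mkdir ssp-1.0 ssp-1.1
printf 'int main(void) { return 0; }\n' > ssp-1.0/main.c
printf 'int main(void) { return 1; }\n' > ssp-1.1/main.c
# Stand-ins for generated object files, which differ between the trees.
printf 'old object\n' > ssp-1.0/main.o
printf 'new object\n' > ssp-1.1/main.o

# -x '*.o' skips every file whose base name matches the pattern.
status=0
diff -Nur -x '*.o' ssp-1.0 ssp-1.1 > source-only.diff || status=$?

cat source-only.diff
```

The exclusion list still has to be maintained by hand for every kind of generated file, which is part of the difficulty described above.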
2.3 Working in a team
Once SSP grows beyond a certain size, Alice will need help with development and testing. More people bring in new ideas and specialized expertise. So, after some time, Bob and Carol try SSP, find it interesting and feel that it could do better with some more functions. They discuss this with Alice, and the one-programmer effort grows into a three-person team. The development tree now looks like this.
Fig. 2 Project development tree for the team
Now, work on the project is being done in parallel, and there are multiple branches of development. Alice, being the original developer, is still responsible for the overall progress of the project. She maintains the master branch, which has all the functionality and from which versions of SSP are released. Bob works in parallel: he takes version 1.2 and adds his code to it. After some time, Bob has a version b1, which is version 1.2 plus Bob's work to date. Bob continues his work and finishes with his version b2. In the meantime, Alice has done some more work and released version 1.4, and Carol has started a new branch of development based on version 1.3. Of course, any two versions on different branches share the functionality of their common ancestor, and each adds the functionality of the work done on its own branch.
Merging Versions
Bob's version b2 needs to be merged with the master version so that the additional functionality of b2 becomes available in the next release. Since Bob started with version 1.2, we can find the delta (b2 – 1.2), which is,
diff -Nur version-1.2 version-b2 > delta-b2-1.2
The remaining work is to add delta-b2-1.2 to version 1.4 and arrive at version 1.5, which can then be released. This can be done using a combination of manual editing, the patch command, and examining the differences between versions 1.5 and 1.4, and also between versions 1.4 and 1.2.
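For the simple case where the delta applies cleanly, the step can be sketched as follows. The sketch assumes the delta is computed with the base version 1.2 as the first argument to diff, so that patch can apply it in the forward direction. The file names core.txt and report.txt and their contents are hypothetical.

```shell
#!/bin/sh
# Sketch: add Bob's delta (b2 - 1.2) to Alice's version 1.4 to obtain 1.5.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1

# Version 1.2, the common ancestor.
mkdir version-1.2
printf 'core 1\n' > version-1.2/core.txt
printf 'report 1\n' > version-1.2/report.txt

# Bob's b2: version 1.2 plus a change to report.txt.
cp -R version-1.2 version-b2
printf 'report 1\nreport by bob\n' > version-b2/report.txt

# Alice's 1.4: version 1.2 plus a change to core.txt.
cp -R version-1.2 version-1.4
printf 'core 1\ncore by alice\n' > version-1.4/core.txt

# Delta from the base to Bob's version (diff exits 1 because they differ).
diff -Nur version-1.2 version-b2 > delta-b2-1.2 || true

# Start 1.5 as a copy of 1.4 and apply the delta.
# -p1 strips the leading "version-1.2/" or "version-b2/" from file names.
cp -R version-1.4 version-1.5
patch_status=0
(cd version-1.5 && patch -p1 < ../delta-b2-1.2) || patch_status=$?

# Review what 1.5 adds on top of 1.4 before releasing.
diff -Nur version-1.4 version-1.5 || true
```

Here the final diff between 1.4 and 1.5 should show only Bob's change, which is the check suggested in the text.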
Interestingly, the merge process depends to some extent on the software architecture. If Alice's and Bob's work are completely distinct, that is, the problem has been divided so that there is hardly any coupling between their work, the merge is much easier and can possibly be done mechanically. If it is the other way round, for example if they share some code and variables, the merge becomes more difficult and requires more manual effort.
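The effect of coupling can be seen in a small experiment with diff and patch. The sketch below is hypothetical throughout: shared.txt stands for code that both Alice and Bob touch, and the two "bob" trees represent Bob's work without and with a change to that shared file.

```shell
#!/bin/sh
# Sketch: an independent change merges mechanically, a coupled one does not.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1

# Common ancestor.
mkdir base
printf 'core 1\n' > base/core.txt
printf 'report 1\n' > base/report.txt
printf 'limit = 10\n' > base/shared.txt

# Alice changes core.txt and the shared file.
cp -R base alice
printf 'core 2\n' > alice/core.txt
printf 'limit = 20\n' > alice/shared.txt

# Bob, independent case: he changes only report.txt.
cp -R base bob-independent
printf 'report 2\n' > bob-independent/report.txt

# Bob, coupled case: he also changes the shared file.
cp -R bob-independent bob-coupled
printf 'limit = 30\n' > bob-coupled/shared.txt

diff -Nur base bob-independent > independent.diff || true
diff -Nur base bob-coupled > coupled.diff || true

# Apply each of Bob's deltas to a fresh copy of Alice's tree.
cp -R alice merged-independent
independent_status=0
(cd merged-independent && patch -p1 < ../independent.diff) || independent_status=$?

cp -R alice merged-coupled
coupled_status=0
(cd merged-coupled && patch -p1 < ../coupled.diff) || coupled_status=$?

echo "independent merge status: $independent_status"
echo "coupled merge status: $coupled_status"
```

In the coupled case patch cannot find the line it expects in shared.txt, so it rejects that hunk and leaves Alice's value in place. Deciding between the two values is the manual effort the text refers to.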
Once again, we may note that part of the difficulty arises because SCM is being done manually. If we use an SCM system like Subversion or Git, the tools help in merging and the SCM process becomes much easier. The advantages of accuracy, reduced effort and overall quality make a strong case for using an SCM system for software revision control.
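As a rough illustration of how such a tool helps, the same kind of merge can be sketched with Git. The repository name, file names, commit messages and the branch name bob are hypothetical, and the sketch assumes that git is installed. The name of the initial branch depends on the local Git configuration, so the script reads it rather than assuming it.

```shell
#!/bin/sh
# Sketch: Bob's branch is merged into Alice's main line by Git.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1

git init -q ssp
cd ssp || exit 1
git config user.name "Alice"
git config user.email "alice@example.com"

# Stand-in for version 1.2, the common ancestor.
printf 'core 1\n' > core.txt
printf 'report 1\n' > report.txt
git add core.txt report.txt
git commit -q -m "Version 1.2"
main_branch=$(git symbolic-ref --short HEAD)

# Bob's branch: a change to report.txt.
git checkout -q -b bob
printf 'report 2\n' > report.txt
git commit -q -a -m "Bob: version b2"

# Alice's main line: a change to core.txt.
git checkout -q "$main_branch"
printf 'core 2\n' > core.txt
git commit -q -a -m "Version 1.4"

# Git finds the common ancestor itself and combines both lines of work.
merge_status=0
git merge -q --no-edit bob || merge_status=$?

git --no-pager log --oneline --graph
```

No delta file has to be computed or stored by hand here. The tool records the common ancestor and the history of each branch, and it reports a conflict only when both sides have changed the same part of a file.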