Migrating to Git, Forensic Considerations

https://creativecommons.org/licenses/by/2.0/ Jack Spades

Git has unleashed an unusual number of migrations from legacy tools among a wide variety of companies. One reason we commonly hear is the desire to attract new developer talent. Another is that there is finally an open source version control tool whose advantages, perceived as well as real, are compelling enough to justify a migration.

Often there’s an initial desire to migrate everything to Git and shut off the legacy system. If that system is commercial and requires ongoing licensing fees, the incentive is even stronger. But if it’s an open source tool, there are reasons you might want to pay the maintenance cost of keeping it around.

Easier migrations

One reason is that it’s usually advisable not to attempt a full migration of all history, but instead to pick major baselines and migrate only those. By leaving the legacy system running in read-only mode, you always have the chance to go back and find something in the full history. That leads us to the next topic.

Proof of invention

Many companies face infrequent but high-stakes litigation around intellectual property disputes. Take the example of an algorithm that’s at issue in a lawsuit. You need to prove you were using the algorithm prior to a certain date. Version control systems are ideal for this, but only if you have the complete historical record. Someone noticed that the CVS repository hadn’t been used in three years and deleted it? Oops.

Forensics in a hybrid system

Thinking ahead a few years, we now have all new development done in Git, with our trusty CVS server patiently waiting for the next lawsuit. Consider that an investigation begun today may start in the Git history, and then need to be traced into the read-only CVS history.

This means that you will need to be able to link history backwards in time across your hybrid Git-CVS deployment. In practice, this requirement should be taken into account during the initial migration to Git, because that is when the links between the two systems are easiest to record.
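One way to record such a link, offered only as a rough sketch, is to have every imported baseline commit carry a pointer back to the legacy revision it came from. The Python below builds a commit message ending in "Key: value" lines, in the style of Git commit trailers, and reads them back later. The key names (Legacy-System, Legacy-Module, Legacy-Tag) and both function names are hypothetical, chosen for illustration; they are not part of Git or of any migration tool.

```python
import re

# Matches hypothetical "Legacy-Something: value" lines in a commit message.
_TRAILER_RE = re.compile(r"^(Legacy-[A-Za-z]+):\s*(.+)$")


def baseline_commit_message(summary, system, module, tag):
    """Build a commit message for an imported baseline.

    The trailing "Key: value" lines record where the baseline came from
    in the legacy system. The key names are assumptions for this sketch.
    """
    trailers = [
        f"Legacy-System: {system}",
        f"Legacy-Module: {module}",
        f"Legacy-Tag: {tag}",
    ]
    return summary.strip() + "\n\n" + "\n".join(trailers) + "\n"


def legacy_pointer(message):
    """Return the Legacy-* entries found in a commit message as a dict.

    An empty dict means the commit carries no pointer into the legacy system.
    """
    found = {}
    for line in message.splitlines():
        match = _TRAILER_RE.match(line.strip())
        if match:
            found[match.group(1)] = match.group(2).strip()
    return found


if __name__ == "__main__":
    # Example values are made up for illustration.
    msg = baseline_commit_message(
        "Import release 4.2 baseline", "cvs", "billing", "REL_4_2"
    )
    print(msg)
    print(legacy_pointer(msg))
```

With something like this in place, an investigator who reaches the oldest Git commit for a file can read the pointer and continue the search at the matching tag in the read-only CVS server.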

Imperfect history migration

You might think that doing a full history migration would fix this. In some cases that may be advisable; you should generally migrate enough history that going back into the legacy system is an unusual event. The problem, however, is that perfect fidelity in history migration between SCM systems is rarely possible. The systems differ in capabilities and metadata, and some of those differences have no deterministic translation.

The legacy system remains the definitive system of record for the intellectual property it holds. Further, you may need to trace that history as it spans the legacy and new tools. While its use may be infrequent, you’ll likely be happy you planned ahead during the giddy days of your Git adoption.



2 Responses to “Migrating to Git, Forensic Considerations”

  • It’s not just open source — even a proprietary version management system would be a lot cheaper to keep going as a single VM with one license for someone in the Legal dept. than scaled up for actual production use. You might not even need a paid license for read-only use.

  • True, costs for many types of systems might end up minimal, but if you scale down your install as you suggest, are you also changing the systems in some significant way? For example, are you changing the relationship with other systems that interact with or supplement the SCM data by having a single user?
