Git Is Not Distributed

Everyone knows Git as a “Distributed Version Control System”, or DVCS. There it is: “distributed”, right in the description.

Except that Git is better described as “disconnected.”

The main reason is that true distributed computing systems feature coordinated communication between the distributed nodes. Git, although it can communicate over a WAN to other nodes, has no such coordination of pushes. Pushes are initiated manually or with ad-hoc scripts.

This lack of coordination between related Git repos means that Git is really a disconnected system.

The developer using Git on her laptop is well described as being disconnected. She has no idea what is going on in other repos, particularly the shared repo her teammates push to at the end of a completed task. Note that there’s nothing essentially wrong with the disconnected model here; it’s only the term “distributed” that is at issue.
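To make the laptop scenario concrete, here is a minimal toy model in Python. It is not Git itself: the `Repo` class, its linear commit list, and the developer names are all assumptions made for illustration. It only sketches the behavior described above, where a developer's view of the shared repo stays stale until she explicitly fetches, and a push is rejected when someone else got there first.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Toy repository (hypothetical): a linear list of commit ids
    plus a cached view of the shared repo."""
    name: str
    commits: list[str] = field(default_factory=list)
    remote_view: list[str] = field(default_factory=list)  # last known shared state

    def commit(self, commit_id: str) -> None:
        self.commits.append(commit_id)

    def fetch(self, remote: "Repo") -> None:
        # Explicit and user-initiated: nothing else refreshes remote_view.
        self.remote_view = list(remote.commits)

    def push(self, remote: "Repo") -> bool:
        # Accept only a fast-forward: the remote history must be a prefix of ours.
        if self.commits[: len(remote.commits)] != remote.commits:
            return False
        remote.commits = list(self.commits)
        self.remote_view = list(remote.commits)
        return True


shared = Repo("shared", commits=["c1"])
alice = Repo("alice-laptop", commits=["c1"])
bob = Repo("bob-laptop", commits=["c1"])
for dev in (alice, bob):
    dev.fetch(shared)

bob.commit("b1")
bob_pushed = bob.push(shared)          # shared now holds c1, b1

alice.commit("a1")
stale_view = list(alice.remote_view)   # alice still believes shared is just c1
alice_pushed = alice.push(shared)      # rejected: not a fast-forward

alice.fetch(shared)                    # only now does alice learn about b1
fresh_view = list(alice.remote_view)
```

Nothing in the model notifies `alice` when `bob` pushes; she finds out only when her own push fails or when she chooses to fetch. That is the sense in which the laptop repo is disconnected.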

Enterprise Git

Enterprise Git deployments face different scalability challenges than most other types of projects. They must support large, geographically distributed development teams with good scalability and performance, while also providing business continuity through high availability and disaster recovery. That combination raises a number of questions. How do I provide fast clones at remote sites? How do I recover from hardware or connectivity failures? How do I avoid picking winners and losers in my development organization when I choose the master server location?

Redux

Sometimes master/slave replication is used to provide local read-only mirrors at sites worldwide as an attempt to answer some of these questions. We’ve seen the same pattern in Subversion deployments using tools like svnsync to support these mirrors. This is not a very satisfying solution in practice, as I wrote in “Why svnsync Is Not Good Enough for Enterprise Global Software Development”.

It’s the coordination, stupid

While this title riffs off a famous snowclone, Git without coordinated replication is similar to Subversion with svnsync in terms of being distributed. Git has the capability of svnsync built in: it can already reconcile repositories over a WAN. What it lacks, just as Subversion with svnsync does, is the coordination of the reconciliation and replication process. So Git is no more distributed than svnsync makes Subversion distributed.
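As a rough illustration of what uncoordinated reconciliation looks like in practice, here is a sketch of the kind of ad-hoc script a site might run on a timer to refresh its read-only mirrors. The function names and the idea of passing in a list of mirror directories are assumptions for this example. The only Git behavior relied on is `git -C <dir> remote update --prune`, which fetches from the mirror's configured remotes and prunes refs deleted upstream.

```python
import subprocess
from pathlib import Path


def mirror_update_command(mirror_dir: Path) -> list[str]:
    """Build the git command that refreshes one mirror from its remotes."""
    # `remote update --prune` fetches all refs and drops ones deleted upstream.
    return ["git", "-C", str(mirror_dir), "remote", "update", "--prune"]


def refresh_mirrors(mirror_dirs: list[Path], run=subprocess.run) -> dict[str, bool]:
    """Refresh each mirror in turn and report which ones succeeded.

    `run` is injectable so the loop can be exercised without invoking git.
    """
    results = {}
    for mirror_dir in mirror_dirs:
        completed = run(mirror_update_command(mirror_dir), check=False)
        results[str(mirror_dir)] = completed.returncode == 0
    return results
```

Each mirror is refreshed independently, whenever the script happens to run. Nothing here orders the updates against pushes to the master or guarantees that two mirrors agree at any given moment, which is the missing coordination the argument turns on.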

Making Git Distributed

This is where WANdisco’s patented replication technology steps in to provide 100% data safe and optimally coordinated replication of shared Git and Subversion repositories. While the industry has enjoyed the benefits of SVN MultiSite for years, the recently announced Git MultiSite makes Git, at least between the shared enterprise repositories, finally worthy of truly being called “distributed.”

1 Response to “Git Is Not Distributed”


  • Perhaps “loosely coupled” instead of “disconnected” would be a more accurate description of this Git characteristic?
