We recently released our Git MultiSite product, the newest member of the MultiSite family, built on the unique and proven replication platform powering SVN MultiSite and SVN MultiSite Plus.
I can hear the cries already: Git is multi-site by nature, so why build a multi-site product for Git?
There are definitely common use cases where Git by itself works just fine, just as there are hundreds of thousands of Subversion servers in happy use without MultiSite, mirrors or a replication strategy. In particular, the kinds of open source projects where Subversion and Git were born and bred are generally well served by the stock tools. But recently Git has been joining battle-tested Subversion in the enterprise, where its untamed nature has the potential to become a multi-headed monster instead of a multi-site deployment.
The nature of enterprise and some other types of software development creates additional requirements for availability, data safety and global scalability, stretching the nascent tools of open source into challenging new circumstances. This extreme environment is WANdisco’s natural habitat, where we combine strong support of open source projects with enterprise know-how.
Global Software Development
Enterprise software development has largely become global software development, with virtual teams often assembled for a project based on availability and skill set rather than location. With larger teams and larger projects, cloning over a LAN becomes increasingly preferable to cloning over the WAN. Replication brings the data to the LAN near each developer site, and so replication forms the core technology behind a scalable enterprise Git solution, as in our Git MultiSite product.
High availability and business continuity are also important to large enterprise software development projects. Even seemingly small amounts of downtime can add up to millions of dollars a year in costs, as reported in recent Forrester research. And while developers can continue to commit locally with Git even if their shared repository is down, this runs contrary to the trends of continuous integration and continuous delivery, where it’s important to get changes into the mainline as soon as possible. That’s why Git MultiSite can be configured in a variety of ways to provide seamless failover. When failed nodes come back online, they transparently rejoin the group and “self-heal” by automatically catching up to their replicated peers.
I’ve seen it stated that since cloned Git repositories are all copies of the master repository, disaster recovery is as simple as finding a developer who has pulled recently, then rsyncing that repo into place as the new master shared repo. Now I hear cries from the enterprise administration side of the room: how do we restore server-side configurations, scripts and access control? What does “recently pulled” mean, and how do I find that developer at 3AM? Did that developer pull all the branches or just master? What if someone had just accidentally deleted a branch ref? Can I still recover using a reflog from a clone? Wouldn’t it be better to consider a system that solves all of that with no operator interaction?
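To make the reflog question concrete, here is a minimal sketch (repository and branch names are invented for illustration) of recovering an accidentally deleted branch from the reflog. Note that the reflog lives only in the repository where the commits were actually made; it is not transferred by clone, push or pull, which is one more reason the “just rsync a developer’s clone” recovery plan is fragile.

```shell
# Set up a throwaway repository with a feature branch (names are hypothetical).
git init -q demo && cd demo
git config user.email dev@example.com && git config user.name dev
echo hello > file.txt && git add file.txt && git commit -qm "initial"

git checkout -qb feature
echo more >> file.txt && git commit -aqm "feature work"
FEATURE_SHA=$(git rev-parse feature)   # remember where the branch pointed

git checkout -q -        # back to the default branch
git branch -D feature    # oops: the branch ref is gone

# The commit object still exists locally; find it in the HEAD reflog
# and recreate the ref from the recorded commit.
RECOVERED=$(git reflog | grep "feature work" | head -1 | cut -d' ' -f1)
git branch feature "$RECOVERED"
```

This only works in the repository whose reflog recorded the commit, and only until the reflog entry expires or is garbage-collected, which is exactly why it makes a poor substitute for a replication strategy.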
These are some of the reasons we built a Git MultiSite product despite Git’s reputation for being multi-site by nature. Do any of these needs speak to you?