Scaling Subversion with WANdisco

How many developers do you think a single Apache Subversion server can support? The answers I normally hear range from 300 to 1,000 or so with specialized hardware. That covers the vast majority of development projects, but it falls short of the largest enterprise development needs.

With WANdisco’s patented DConE replication technology, we can support Subversion deployments for the largest enterprise development requirements of 20,000 or more developers. Add built-in HADR (High Availability and Disaster Recovery) and transparent global multi-site capabilities, and now the enterprise has a proven and popular industry standard SCM tool that can support any known size of development project.

What is DConE?

DConE, WANdisco’s core technology, powers our SVN MultiSite products. DConE stands for “Distributed Coordination Engine” and implements a WAN-capable (Wide Area Network) version of the Paxos coordination algorithm, which is then integrated with Subversion to create the SVN MultiSite product. DConE implements a mathematically proven ideal solution; there’s no need to wonder if a more efficient coordination solution exists.

How does DConE achieve this?

One reason is that every developer performs all read and write operations against a local, LAN-resident (Local Area Network) server, even if the replicated server exists in multiple locations across the world on a WAN. It sounds slightly magical, yet it is accurate, to think of this as a single instance of Subversion existing simultaneously on different machines. Distributed computing refers to this as “one copy equivalence.”
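To make “one copy equivalence” concrete, here is a minimal sketch of the idea in Python. It is a toy model, not WANdisco's implementation and not Paxos: the Coordinator class is a hypothetical stand-in that simply hands every commit to every replica in one agreed order, and the site names and commit strings are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Replica:
    """One site-local server holding a full copy of the repository history."""
    name: str
    revisions: List[str] = field(default_factory=list)

    def apply(self, commit: str) -> None:
        # Writes arrive in the globally agreed order and are appended locally.
        self.revisions.append(commit)

    def read_head(self) -> Optional[str]:
        # Reads are answered from local state only; no other site is contacted.
        return self.revisions[-1] if self.revisions else None


class Coordinator:
    """Hypothetical stand-in for a coordination engine.

    It only fixes a single order for commits. A real engine reaches that
    order by agreement between nodes rather than through one central object.
    """

    def __init__(self, replicas: List[Replica]) -> None:
        self.replicas = list(replicas)
        self.agreed_log: List[str] = []

    def submit(self, commit: str) -> int:
        self.agreed_log.append(commit)
        for replica in self.replicas:
            replica.apply(commit)
        return len(self.agreed_log)  # position of the commit in the agreed order


sites = [Replica("site-a"), Replica("site-b"), Replica("site-c")]
engine = Coordinator(sites)
engine.submit("r1: fix build")
engine.submit("r2: add tests")
heads = [site.read_head() for site in sites]
```

Because every replica applies the same commits in the same order, a read at any site returns the same answer, which is what lets each developer treat the local server as if it were the only one.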

Further, even though writes must occur on each machine in the replication group, they take place continuously and in the background. So another reason for greater scalability is that the maximum write load tends to follow the sun, giving the moonlit servers time to catch up. This has the effect of distributing the write load across multiple machines.

Another reason is that Subversion, like most other data repositories, typically serves far more read traffic than write traffic (roughly 95-99% reads). The write data for each server is delivered just once per transaction; the subsequent, much higher read load is served by the local server and creates no additional WAN traffic.
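The effect of that read/write split can be sketched with some back-of-the-envelope arithmetic. The function below is an illustration only: it counts operations rather than bytes, it assumes developers are spread evenly across sites, and the workload figures (one million operations, 97% reads, four sites) are made-up inputs rather than measurements.

```python
def wan_operations(total_ops: int, read_fraction: float, sites: int) -> dict:
    """Compare WAN-crossing operations for two deployment styles.

    Assumptions (illustrative, not measured):
      * operations are spread evenly across all sites
      * replicated: each write is shipped once to every other site,
        and all reads are served locally
      * central: one site hosts the only server, so every operation
        from the other sites crosses the WAN
    """
    reads = round(total_ops * read_fraction)
    writes = total_ops - reads

    replicated = writes * (sites - 1)
    central = round(total_ops * (sites - 1) / sites)

    return {
        "reads": reads,
        "writes": writes,
        "replicated_wan_ops": replicated,
        "central_wan_ops": central,
    }


result = wan_operations(total_ops=1_000_000, read_fraction=0.97, sites=4)
for key, value in result.items():
    print(f"{key}: {value:,}")
```

Under these assumptions the replicated setup sends 90,000 operations over the WAN against 750,000 for the single central server. The gap narrows as the write share or the number of sites grows, so the numbers are worth rerunning with your own workload.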

Not Magic

While impressive, DConE is not magic. The quality of the connection and, ultimately, the speed of light limit how quickly data can be moved from one replication node to another. What DConE delivers is a completely fault-tolerant, mathematically ideal coordination engine for performing WAN-connected replication.
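The speed-of-light limit is easy to put a number on. The sketch below computes a lower bound on propagation delay, assuming light in optical fiber travels at about two thirds of its vacuum speed (a common rule of thumb) and using a hypothetical 10,000 km cable path. Real links are slower, since routing, queuing, and protocol overhead all add to this floor.

```python
C_VACUUM_KM_PER_S = 299_792.458  # speed of light in vacuum
FIBER_FACTOR = 2 / 3             # assumption: rule-of-thumb slowdown in optical fiber


def min_one_way_ms(distance_km: float, medium_factor: float = FIBER_FACTOR) -> float:
    """Lower bound on one-way propagation delay in milliseconds."""
    return distance_km / (C_VACUUM_KM_PER_S * medium_factor) * 1000


distance_km = 10_000  # hypothetical cable path between two replication nodes
one_way = min_one_way_ms(distance_km)
round_trip = 2 * one_way
print(f"one way: {one_way:.1f} ms, round trip: {round_trip:.1f} ms")
```

For this distance the floor is roughly 50 ms one way and 100 ms per round trip, so any step that needs remote nodes to agree costs at least that much, no matter how efficient the coordination algorithm is.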

Becoming Cloud-like

As I wrote in the article “Putting the Cloud into an Eyedropper”, the end result is to give today’s essential applications, originally architected for single machines and co-located teams, a new way forward as multi-machine, multi-master, and multi-site replication groups.

Without this approach, these same applications are often forced into awkward service for globally distributed development teams and modern public, hybrid, and private cloud IT environments. They may employ fragile master/slave replication schemes to scale read traffic, attempt grueling ground-up rewrites, or simply wait to be supplanted by newer technology.

Subversion at Extreme Scale

DConE is an effective way for Subversion to support more developers than previously thought possible, all collaborating on massive enterprise software projects. We would love to hear about your experiences, positive and negative, with extreme Subversion deployments.
