Monthly Archive for July, 2013

Five Things to Avoid in your Enterprise Git Solution

Are you a Git administrator considering a new enterprise Git solution? To help you along the decision path, here are five things to avoid when choosing your Git solution.

No plan for data and business continuity

The data you keep in Git is critical to the successful operation of your business. It has to be secure and highly available. When evaluating a Git solution, if it doesn’t provide excellent protection against hardware failures and other disasters, look elsewhere.

No growth path

Successful companies grow rapidly. Within a year, the size of your team could double thanks to growth and acquisition, and new challenges such as a partnership with a company overseas demand flexibility from your infrastructure. Your Git solution should help you scale to meet these challenges, not make you design your own replication system.

It’s not really Git

Avoid any Git solution that doesn’t use plain old Git repositories under the hood. Otherwise you may end up tied into a proprietary framework, missing out on the tremendous portability that Git offers. Data translation is a necessary evil during migration – not something you should do on a daily basis.

Narrow field of vision

Keeping track of the tens or hundreds of Git repositories you maintain on several servers in multiple locations is a challenge. You want a solution that allows you to see the whole deployment at a glance, rather than one piece at a time. Can you see whether the repositories in your satellite office are up and performing well without calling someone there?

They don’t know Git

When you purchase a Git solution, you want a vendor that will stand behind it and help you get the most out of Git. Do they offer Git support and services? If you can’t pick up the phone and talk to an expert who knows more about Git than you do, what are you paying for?

Learn more about our Git solutions here and our Git services and support here.

Certified Git Binaries Now Available

WANdisco has supplied certified Subversion binaries for years, and now our certified Git binaries are available for several major platforms. And all I can say is…thank goodness.

If you’re a Linux user and your distribution has an out-of-date Git package in its repositories, it can be quite an inconvenience. I’ve built and installed Git from source a number of times, and it requires several third-party packages as dependencies, especially if you want to build the docs. Some of these packages can be hard to find on Ubuntu, and downloading and building everything from scratch requires 15 minutes I can’t afford each time. Now I can just grab a certified, up-to-date package from WANdisco.

SCM administrators have an even bigger challenge: making sure every user has a suitable Git package on their workstation (you may not care all the time, but some Git tools and integrations require newer versions of Git). In this case, you definitely don’t want to build Git from source, or have your users do it themselves. You could end up with a release that has bugs, leaving you to rebuild, pick up the patches and push a new build out to all your users – which can quickly become a maintenance headache.

Now you can take advantage of fully tested and certified, continuously up-to-date Git binaries with all the latest bug fixes. WANdisco also offers support and services packages. Just visit our download page and you’ll find all the information you need to get started.

Subversion and the Mainline Model

Subversion: Built for the Best Branching Model

Over the years Subversion has taken a few lumps for its branching and merging tools, but the latest release has fixed a key problem. It’s also worth remembering that the workflow Subversion supports best is also the best workflow: the mainline model. This model is the recommended approach in the continuous integration and continuous delivery communities. In this article we’ll look at Subversion and the mainline model from a high level.

Most of the workflows that are popular in the Git community can be used for SVN as well; even if the implementation differs, the mainline model is just about the same.

In the mainline model, you have very few long lived branches. Most commits happen directly on trunk, and developers are encouraged to commit frequently. Avoiding long lived feature branches ensures that integration happens sooner rather than later, avoids painful merges, and gives you all the benefit of continuous integration practices.

SVN Mainline Model

The diagram above shows what your branching model looks like if you’re a true believer in the mainline model, meaning you don’t branch until you hit a release point and need to start isolating bug fixes from new development work. Topic branches [1] are used for very short periods of time, primarily to give developers a private area to save work before committing to trunk. Topic branches also are natural points for pre-flight code review and continuous integration.

If this simple diagram looks familiar, it’s because it’s the model that Subversion was built for, matching the familiar trunk/branches/tags layout of a Subversion repository. As of Subversion 1.8, Subversion’s merge engine handles this model well. Merges are primarily done to keep a topic branch up to date, and since a topic branch only lives for a very short time, the merges are easy. Similarly, symmetric merges between trunk and release branches are handled very well.

It’s also worth recalling one of Subversion’s perennial strong points: making branches is cheap and easy, requiring only one command or a few clicks in a Subversion GUI. There’s no overhead with Subversion branches, so developers can make as many private topic branches as they like without incurring any penalties. Making a lot of very small, short-lived topic branches is a much safer practice than working on a few big feature branches. Similarly, you can tag (make a read-only branch) at any point to indicate key milestones. If making a branch is hard in your SCM tool, or you’ve been warned not to make too many branches in order to avoid a performance problem, then you should take a look at Subversion or Git. The open source SCM tools have solved this problem.

Subversion’s merge engine may not be perfect, but it works very well as part of a continuous delivery system. If you have an elaborate branching model with multiple levels and a lot of sideways merges, no merge engine will save you from eventual trouble. Not only will your revision graph look like spaghetti, but you’ll eventually run into bigger workflow and process problems.

And of course, you can scale Subversion to support a large distributed software team using WANdisco’s suite of MultiSite products. The mainline model isn’t limited to a small co-located team anymore.

Trying to figure out the best way to deploy Subversion? WANdisco has a team of SVN experts waiting to help!

Subversion is a registered trademark of the Apache Software Foundation.

[1] The Git community uses the term topic branch. Other names include task branches, development branches, and feature branches. But I think topic branch captures the use case perfectly: a branch that holds one small topic of work.

Subversion 1.8.1 Released

Following June’s long-awaited release of Subversion 1.8, the Apache Software Foundation (ASF) has announced the first update, 1.8.1.

Apache Subversion 1.8.1 is largely a bug-fix release, including fixes for the following:

  • upgrade –  fix notification of 1.7.x working copies
  • resolve –  improve the interactive conflict resolution menu
  • translation updated for German and Simplified Chinese
  • improved error messages when encoding conversion fails
  • update –  fix some tree conflicts not triggering resolver
  • merge –  rename ‘automatic merge’ to ‘complete merge’
  • log –  reduce network usage on repository roots
  • commit –  remove stale entries from wc lock table when deleting
  • wc –  fix crash when target is symlink to a working copy root
  • mod_dav_svn –  better status codes for anonymous user errors
  • mod_dav_svn –  better status codes for commit failures

For a full list of all bug fixes and improvements, see the Apache changelog.

You can download our fully tested, certified binaries for Subversion 1.8.1 free here.

WANdisco’s binaries are a complete, fully tested version of Subversion based on the most recent stable release, including the latest fixes, and undergo the same rigorous quality assurance process that WANdisco uses for its enterprise products that support the world’s largest Subversion implementations.

Using TortoiseSVN?

There is an updated version of TortoiseSVN, fully compatible with Subversion 1.8.1, available for free download now.

Git Workflows and Continuous Delivery

Using MultiSite Replication to Facilitate a Global Mainline

Although Git is a distributed version control system (DVCS), it can support almost any style of software configuration management (SCM) workflow. The lines between the four prominent workflows in the Git user community can be blurry in implementation, but there are important conceptual differences between them.  Understanding these differences is important when considering the use of Git workflows and continuous delivery in your organization.

After an introduction to these workflows, we’ll evaluate how they match up against continuous integration and continuous delivery best practices, and then look at their application with global software development teams.

Workflow Overview

Fork and pull

In this model, a developer forks (clones) a Git repository and works independently on their own server-side copy. When the developer has a change ready to contribute, they ask the upstream maintainers to pull the changes into the original repository.

This model originated in open source projects and is prominent in that community. Contributors to open source projects may not even know one another and rely on a trusted set of upstream maintainers to review any contributions.

Figure 1: Fork and pull workflow

Feature branches

In this model, new branches are made for each feature (also called a task or topic) and are sometimes shared with the master repository. When changes are approved they are merged to the mainline (master) branch.

This model suits many small teams, as they are able to collaborate in a single shared repository yet still isolate new work to an individual or a small group. Functionally it is very similar to fork-and-pull but a feature branch usually has a shorter lifespan than a forked repository.

Figure 2: Feature branch workflow

Mainline model

In this model, most work is committed directly to the trunk (master branch). There are few, if any, long lived branches less stable than the trunk. Long lived branches are sometimes used for release maintenance. Developers are encouraged to commit to the trunk frequently, perhaps daily. Local branches and stashes can be used for pre-flight review and build, but are not promoted to the shared repository.

The mainline model is strongly recommended in continuous integration and continuous delivery paradigms. It encourages very frequent reconciliation of new work, preventing any buildup of merge debt. Following this model, work is merged and up to date on a regular basis and available for testing and possible deployment.

The mainline model scales to large teams in enterprise settings but requires a high level of development discipline. For example, large new features must be decomposed into small incremental changes that can be committed rapidly. Furthermore, incomplete work may be hidden by configuration or feature toggles.
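To make the feature toggle idea concrete, here is a minimal Python sketch. The names FeatureToggles, checkout_total, and bulk_discount are hypothetical, invented purely for illustration. A real project would more likely load its toggles from configuration than hard-code them.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FeatureToggles:
    """Hypothetical in-memory toggle registry; unknown features are off."""
    enabled: frozenset = frozenset()

    def is_on(self, name: str) -> bool:
        return name in self.enabled


def checkout_total(prices: Iterable[float], toggles: FeatureToggles) -> float:
    """Return the order total; the unfinished discount runs only when toggled on."""
    total = sum(prices)
    if toggles.is_on("bulk_discount"):
        # Incomplete feature: already committed to trunk, dormant by default.
        total *= 0.9
    return round(total, 2)


production = FeatureToggles()                                 # toggle off
staging = FeatureToggles(enabled=frozenset({"bulk_discount"}))  # toggle on

print(checkout_total([10.0, 20.0], production))  # 30.0
print(checkout_total([10.0, 20.0], staging))     # 27.0
```

The unfinished code path ships in every build from trunk, but only environments that switch the toggle on will exercise it.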

There is often a fine distinction between practical use of the mainline model and the feature branch workflow. If feature branches are personal, local, and short lived, they are consistent with the mainline model. However, use of a formal promotion process (merge request) versus a pure push can slow down the pace of commits. If every developer commits once a day, all of those commits would need a human review.

Figure 3: Mainline workflow

Git Flow

“Git Flow” is a popular model developed by Vincent Driessen[1]. It recommends a long lived development branch containing work-in-progress, a stable mainline, and feature, hot fix, and release branches as necessary. It is somewhat similar to a mainline model with long lived integration branches and feature branches.

Unlike the mainline model, however, the Git flow model violates some of the precepts of continuous integration. Notably, work may be left on the development branch or feature branches, not integrated with the latest changes on the mainline, for a long period of time. Nonetheless this model is often a comfortable transition for teams new to Git and continuous integration. It may also feel more natural for products with a clear distinction between stable development and production code, as opposed to SaaS products that deliver new changes daily.

Figure 4: Git flow

Application to Continuous Delivery

Continuous delivery indicates that each commit is a potential release candidate. Building on continuous integration principles, each commit is merged into the trunk and subjected to a progressively more difficult series of test and verification steps. For example, a commit may run through a pre-flight build, unit testing, component testing, performance testing, staging deployment, and production deployment. The latter stages are more expensive and time consuming, and may even involve human review. A commit that passes all the stages is available to deploy (but is only deployed when the business is ready). A failure must be addressed as soon as possible.
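The staged progression described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a real continuous integration system: run_pipeline, the commit identifiers, and the stand-in checks are all hypothetical, and each lambda takes the place of a real build or test job.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[str], bool]]


def run_pipeline(commit: str, stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns (releasable, names of the stages that passed).
    """
    passed: List[str] = []
    for name, check in stages:
        if not check(commit):
            return False, passed  # a failure blocks the later, costlier stages
        passed.append(name)
    return True, passed


# Hypothetical stages, cheapest first, mirroring the order in the text.
stages: List[Stage] = [
    ("pre-flight build", lambda c: True),
    ("unit tests", lambda c: True),
    ("component tests", lambda c: "broken" not in c),
    ("performance tests", lambda c: True),
    ("staging deployment", lambda c: True),
]

print(run_pipeline("abc123", stages))
print(run_pipeline("broken-def456", stages))
```

Ordering the stages from cheapest to most expensive means a bad commit is rejected before it consumes the slow, costly stages. A commit that passes every stage is a release candidate, and deploying it remains a separate business decision.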

To view it in another light, continuous delivery tries to reduce isolation by vetting and surfacing new work as quickly as possible. Important new features are not hidden in forks or branches for weeks – they are integrated, tested, and made available to the business as soon as possible.

Workflows

As noted earlier, the mainline model is best suited to continuous delivery and is strongly recommended in the literature. Eliminating long lived development branches ensures that every change is tested and integrated quickly, delivering value to the business frequently. It also enforces good habits like decomposing stories and features into incremental tasks that are less likely to cause breakages.

The fork and pull model can leave changes isolated in other repositories for long periods of time, and often involves a gated promotion process. It is the workflow least suited to rapid development in large enterprise teams.

The feature branch workflow occupies a middle ground. If the feature branches are local and short lived, they effectively serve as private staging areas. The promotion process (merge request) should be automated as much as possible with little human intervention.

Git flow is a workable model but introduces a second long-lived branch, putting distance between development and deployment.

Challenges

Consider adoption of the mainline development model as advocated by the continuous integration paradigm. Committing once a day to the trunk is a sea change for developers used to working on isolated branches (or forks) for long periods of time. Though developers may be skeptical, the risk and discomfort are mitigated by:

  • Running rigorous pre- and post-commit tests, which is practical when you have the latest code and dependencies and can rely on fast continuous integration.

  • Being able to pull updates quickly several times a day.

  • Being able to commit quickly, particularly if a prior commit introduced a breakage and you must fix it or roll back.

Reducing the risk and discomfort of the mainline model imposes several demands of this nature on the SCM system. These demands are even more challenging when you are working with several teams in different locations; you have many more contributors, and the product is assembled from multiple components.

These scaling and infrastructure challenges illuminate the isolation that often arises from working in a large distributed environment. Data may be local to or effectively mastered at one site; and all the complications of working over a WAN will hinder performance and slow down the development tempo.

Global software development on complex projects is common to enterprise software development and complicates the adoption of continuous delivery. In order for a set of large distributed teams to adopt continuous delivery and the mainline model, they must have the tools to overcome data isolation of all kinds:

  • A version control infrastructure that allows a developer at any site full access to the latest source code with the ability to commit frequently.

  • The ability to set up continuous integration (build and test) infrastructure that operates well under heavy load at multiple locations.

  • The support to cope with tens or hundreds of repositories containing the product components, configuration data, environment settings, and other necessary material.

In short, the mainline model reduces isolation introduced by non-optimal codeline models (i.e. new work lingering in long lived branches) to make sure that new work is available quickly. Development teams need the support of a solid SCM infrastructure to adopt the mainline model and avoid the isolation that often comes from working in large distributed teams.

Solving Continuous Delivery Challenges for Global Development with MultiSite Replication

An SCM system that only functions well in a LAN environment under moderate load will not suffice for global development projects. A simple master-slave data replication scheme will not overcome the complexities of operating in a large distributed environment.

Only a true active-active replication system can scale up an SCM system to cope with continuous delivery for a global distributed software organization. With active-active replication as provided by WANdisco’s family of MultiSite products, each node in the system is a peer, usable for any operation at LAN speeds.

  • With an active-active replication system, teams at all sites are first class citizens and can use and access key data with no latency bottlenecks.

  • Likewise, additional peer nodes can handle the load imposed by larger teams of contributors and the associated build and test automation.

  • Since the system is self-healing with automated failover and high availability, there is no risk of down time due to maintenance windows, hardware failures, or network outages.

  • Selective replication means that an administrator can choose which repositories are replicated to which sites. Repositories with production environment data may only be replicated to sites that interact with runtime servers, for example.

  • The MultiSite administration console provides global visibility across all servers and repositories, making it easier to coordinate a product assembled from several components kept in separate repositories.
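One way to picture the selective replication point in the list above is as a mapping from repositories to the sites that hold a replica. The Python sketch below is purely illustrative and is not the MultiSite configuration format or API. The repository names, site names, and the helpers repos_at_site and is_replicated are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical policy: repository name -> sites that receive a replica.
REPLICATION_POLICY: Dict[str, Set[str]] = {
    "product-core": {"site-a", "site-b", "site-c"},
    "product-ui": {"site-a", "site-b", "site-c"},
    "prod-environment": {"site-a"},  # only the site that talks to runtime servers
}


def repos_at_site(site: str, policy: Dict[str, Set[str]]) -> List[str]:
    """List the repositories replicated to a site, sorted by name."""
    return sorted(repo for repo, sites in policy.items() if site in sites)


def is_replicated(repo: str, site: str, policy: Dict[str, Set[str]]) -> bool:
    """True if the policy places a replica of repo at site."""
    return site in policy.get(repo, set())


print(repos_at_site("site-a", REPLICATION_POLICY))
print(repos_at_site("site-c", REPLICATION_POLICY))
print(is_replicated("prod-environment", "site-c", REPLICATION_POLICY))
```

In this sketch the production environment repository is visible only at site-a, while the component repositories are available everywhere.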

Conclusion

Git can support many development workflows. The mainline model is considered optimal for continuous delivery.

The code in the SCM system delivers value to the business when it is available to the customer. Continuous delivery is a set of practices designed to reduce the isolation of the data and get it to customers sooner. Active-active replication fully supports the mainline model and other continuous delivery best practices by making the data available when and where it is needed throughout the delivery pipeline.

Learn more about our Git solutions here and our Git services and support here.

[1] http://nvie.com/posts/a-successful-git-branching-model/

Git Is Not Distributed

Everyone knows Git as a “Distributed Version Control System”, or DVCS. There it is: “distributed”, right in the description.

Except that Git is better described as “disconnected.”

The main reason is that true distributed computing systems feature coordinated communication between the distributed nodes. Git, although it can communicate over a WAN to other nodes, has no such coordination of pushes. Pushes are initiated manually or with ad-hoc scripts.

This lack of coordination between related Git repos means that Git is really a disconnected system.

The developer using Git on her laptop is well described as being disconnected. She has no idea what is going on in other repos, particularly the shared repo her teammates push to at the end of a completed task. Note that there’s nothing essentially wrong with the disconnected model here; it’s only the term “distributed” that is at issue.

Enterprise Git

Enterprise Git deployments face different scalability challenges than most other types of projects. They must support large, geographically distributed development teams with scalability and performance, combined with business continuity through high availability and disaster recovery. This raises a number of questions about supporting large, global Git deployments. How do I provide fast clones at remote sites? How do I recover from hardware or connectivity failures? How do I avoid picking winners and losers in my development organization when I choose the master server location?

Redux

Sometimes master/slave replication is used to provide local read-only mirrors at sites worldwide as an attempt to answer some of these questions. We’ve seen the same pattern in Subversion deployments using tools like svnsync to support these mirrors. This is not a very satisfying solution in practice, as I wrote in “Why svnsync Is Not Good Enough for Enterprise Global Software Development”.

It’s the coordination, stupid

While this title riffs off a famous snowclone, Git without coordinated replication is similar to Subversion with svnsync in terms of being distributed. Git has the capability of svnsync built in: it can already reconcile repositories over a WAN. What it lacks, just as Subversion with svnsync does, is the coordination of the reconciliation and replication process. So Git is no more distributed than svnsync makes Subversion distributed.

Making Git Distributed

This is where WANdisco’s patented replication technology steps in to provide 100% data safe and optimally coordinated replication of shared Git and Subversion repositories. While the industry has enjoyed the benefits of SVN MultiSite for years, the recently announced Git MultiSite makes Git, at least between the shared enterprise repositories, finally worthy of truly being called “distributed.”

Forrester TEI Shows SVN MultiSite Delivers ROI of 357% and Payback Period of 2 Months

Analyst Study Confirms SVN MultiSite Boosts Productivity and Ensures Uptime, Learn More during Webinar

We are proud to announce the results of Forrester’s Total Economic Impact (TEI) Report for SVN MultiSite. The subject of the study, a WANdisco customer, was a Fortune 500 company with annual revenues of over $5 billion. Forrester concluded that SVN MultiSite generated a return on investment (ROI) of 357% with a payback period of less than 2 months.

Significant benefits and cost savings were found in a broad range of areas that Forrester attributed to SVN MultiSite’s ability to provide remote developers with local real-time access to Subversion repositories and the elimination of downtime.

Learn about the report during a webinar Wednesday, July 24 at 10:00 AM Pacific / 1:00 PM Eastern when guest speaker Jean-Pierre Garbani, Forrester Research, Inc., Vice President and Principal Analyst, Infrastructure and Operations will present the findings.

Learn how WANdisco’s SVN MultiSite enables:

  • Reducing application development costs via faster development, build, and release cycles.
  • Eliminating downtime by implementing a failover strategy and removing any single point of failure.
  • Significantly reducing cost of ownership based on a proven open source strategy.

Register here.

Ready to try SVN MultiSite? Register for a free trial today!

SmartSVN 7.6 Release Candidate 1 Issued

Today we launched SmartSVN 7.6, release candidate 1. SmartSVN is the cross-platform graphical client for Apache Subversion.

SmartSVN 7.6 represents a major step forward from 7.5.5 in features as well as performance improvements.

New SmartSVN 7.6 features include:

– Auto-update – there is no need to install new versions manually

– Repository Browser – defined svn:externals are shown as their own entries

– proxy auto-detection

– external tools menu

– OS X retina support

GUI improvements include:

– file/directory input fields – support for ~ on unix-like operating systems

– natural sorting (“foo-9.txt” before “foo-10.txt”)

– more readable colors on Transactions and other panes
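For readers unfamiliar with the term, natural sorting compares runs of digits by numeric value rather than character by character. The Python sketch below shows one common way to do that. It is not SmartSVN’s implementation, and natural_key is a hypothetical helper name.

```python
import re


def natural_key(name: str) -> list:
    """Split a name into text and number chunks so digits compare by value."""
    parts = re.split(r"(\d+)", name)
    # re.split with a capture group alternates text, digits, text, ...
    # so the odd positions are always the digit runs.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


files = ["foo-10.txt", "foo-9.txt", "foo-1.txt"]

print(sorted(files))                   # plain sort: foo-10.txt lands before foo-9.txt
print(sorted(files, key=natural_key))  # natural sort: foo-9.txt before foo-10.txt
```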

SmartSVN 7.6 fixes include:

– speed-search – possible internal error typing Chinese characters

– Revision Graph – errors when deselecting all branches

– Tag Browser – possible internal error

– SVN operations – significant performance improvements

– Check Out – checking out to an already versioned directory appeared to work, then failed later

– Refresh – possible performance problems

For a full list of all improvements and bug fixes, view the changelog.

Have your feedback included in a future version of SmartSVN

Many issues resolved in this release were raised via our dedicated SmartSVN forum, so if you’ve got an issue, or a request for a new feature, head over there and let us know.

You can download Release Candidate 1 for SmartSVN 7.6 from our early access page.

Haven’t yet started with SmartSVN? Claim your free trial of SmartSVN Professional here.

Apache HTTP Server Project Releases Version 2.2.25

WANdisco Subversion Committer Ben Reser Recognized for Contributions

The Apache HTTP Server Project announced the release of version 2.2.25 of the Apache HTTP Server (“httpd”) on July 10th, 2013.

WANdisco Subversion committer Ben Reser was thanked in the official announcement for identifying a Denial of Service vulnerability. The vulnerability, known as CVE-2013-1896, may allow remote users with write access to crash the httpd server hosting Subversion repositories. Subversion administrators are urged to upgrade their installation of httpd to 2.2.25 or the latest 2.4.x release.

WANdisco’s Subversion binary packages for Solaris and Windows operating systems have been updated to include the updated version of httpd. Download them from our website.

According to Apache, 2.2.25 “is principally a security and bugfix release.”  See the official 2.2.25 changelist for a complete list of improvements in this release.

Download Apache HTTP Server 2.2.25 or the latest 2.4.x release here.

Subversion 1.8 Backwards Compatibility

Upgrading to a new version of your SCM system is a big decision, often requiring careful planning by administrators to balance the benefits of new features and capabilities against any compatibility and upgrade concerns.

Fortunately for administrators, the Subversion project has always been very good about trying to maintain backwards compatibility and documenting which features require newer servers and clients. As you’ll notice, some of the most appealing parts of Subversion 1.8 only require a client upgrade.

In the interest of saving time, I’ve summarized the most important information here.

Upgrade highlights

  • Pre-1.8 servers and clients are compatible with 1.8 servers and clients, but not all the new features are available with older servers and clients.

  • You do not need to dump and reload repositories when upgrading the server. However, doing so may give you better performance and a smaller repository size.

  • You do need to upgrade your working copy with the svn upgrade command when you start to use a 1.8 client. If your working copy was created with a pre-1.6 client, start by upgrading to 1.6 or 1.7.

When do I need a 1.8 client?

When do I need a 1.8 server?

Other concerns

  • Upgrading the server is always a bigger decision than upgrading the client. Subversion 1.8 is a new release and you should confirm that it works well in your environment before upgrading the server.

  • Make sure you test compatibility with all of your third party software including continuous integration servers, IDE plugins, and GUIs before upgrading.

Ready to take the leap? You can find certified Subversion 1.8 binaries on the WANdisco web site.

Subversion is a registered trademark of the Apache Software Foundation.