Subversion Blog

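New Features in Subversion 1.9

This is a summary of the webinar titled “What’s New in Subversion 1.9” (click for the replay).

Subversion 1.9 brings enhancements on both the client side and the server side.
New options have been added to svn auth, copy, merge, blame, cleanup, and info. Compatibility with 1.8 working copies is guaranteed, so 1.9 clients can be used alongside 1.8 clients.
1.9 addresses the problem that locks (a mechanism that prevents other users from committing changes to a locked file) did not scale once their number exceeded a few hundred. One improvement applies HTTP pipelining, which is already used for GET requests, to lock requests. Because this is a client-side-only change, you get the benefit simply by upgrading the client to 1.9. To eliminate the overhead of redundant data writes on the server when many locks are taken at once, a multi-lock feature has been added to FSFS. The lock hooks (post-lock, post-unlock) can now handle multiple paths.
FSFS moves to a new format, format 7, in 1.9. Until now, performance improvements relied on caching, but that approach reached its limits as repositories grew. Format 7 uses logical addressing in revision files to make disk access more efficient. In addition, checksums previously did not cover everything in the repository; they now cover all of it, which improves the detection of data corruption. The period during which commits are blocked while a pack is running has also been shortened significantly.
The new back end, FSX, is under development with the goals of improving performance and removing the pack bottleneck; its official release is planned for 1.10.
On the server side, compatibility with existing repositories and clients is also guaranteed.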

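For details, please see the Apache site.

Reference:
WANdisco provides the comprehensive services needed to get the most out of Subversion.
If you are interested, please feel free to contact us (wandisco.japan@wandisco.com).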


About Kenji Ogawa (小川 研之)

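Has been developing WANdisco’s business in Japan since November 2013. Previously worked at NEC on the development of domestic mainframes, Unix, and middleware, and later on scouting Silicon Valley startups, partner management, and offshore development in India.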

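Latest Version, Subversion 1.9, Now Available for Download

The latest version of Subversion includes many new features and bug fixes. It also improves performance and makes more efficient use of network resources. FSFS, which has long been used as the repository back end, has been updated to a new format (format 7; the new FSX back end is still under development). This improves operations such as log and merge.
On September 15 (2:00 AM on the 16th in Japan), the following webinar will introduce the new features.

“What’s New in Subversion 1.9.” Register
(A replay will also be available. If you register via the “Register” link above, you will also receive email about the replay, so please do register. A summary of the webinar will be posted separately on this blog.)
A detailed description is available at: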

http://subversion.apache.org/docs/release-notes/1.9.html
Subversion 1.9 binaries tested by WANdisco can be downloaded from:

http://www.wandisco.com/subversion/os/downloads


Wildcards in Subversion Authorization

Support for wildcards in Subversion authorization rules has been noticeably lacking for many years.  The use cases for using wildcards are numerous and well understood: denying write access to a set of protected file types in all branches, granting access to all sandbox branches in all projects, and so on.

So I’m very pleased to announce that WANdisco is now supporting wildcards for Subversion in our Access Control Plus product.  With this feature you can now easily define path restrictions for Subversion repositories using wildcards.

How does this work given that core Subversion doesn’t support wildcards?  Well, wildcard support is a long-standing feature request in the open source Subversion project, and we picked up word that there was a good design under review.  We asked one of the committers who works for WANdisco to create a patch that we can regression test and ship with our SVN MultiSite Plus and Access Control Plus products until the design lands in the core project.

Besides letting you define rules with wildcards, Access Control Plus does a couple of other clever things.

  • It lets you set a relative priority that determines the ordering of sections in the AuthZ file.  The order is significant when wildcards are in use, as multiple sections may match the same path.
  • It warns you if two rules may conflict because they affect the same path but have different priorities.
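
To make the ordering and conflict ideas concrete, here is a minimal Python sketch. It is not how Access Control Plus or Subversion's AuthZ code works: the `Rule` type, the glob-style pattern syntax, the "higher priority wins" convention, and the sample paths are all assumptions made for illustration. Note that `fnmatchcase` lets `*` match across `/`, which a real implementation may not.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from itertools import combinations


@dataclass(frozen=True)
class Rule:
    pattern: str   # glob-style path pattern (hypothetical syntax)
    priority: int  # in this sketch, the higher number wins
    access: str    # "rw", "r", or "" for no access


def matching_rules(path, rules):
    """Return the rules whose pattern matches path, highest priority first."""
    hits = [r for r in rules if fnmatchcase(path, r.pattern)]
    return sorted(hits, key=lambda r: r.priority, reverse=True)


def effective_access(path, rules):
    """Access granted by the highest-priority matching rule, or None."""
    hits = matching_rules(path, rules)
    return hits[0].access if hits else None


def find_conflicts(paths, rules):
    """List (path, rule_a, rule_b) where two rules with different priorities
    match the same path but grant different access."""
    conflicts = []
    for path in paths:
        for a, b in combinations(matching_rules(path, rules), 2):
            if a.priority != b.priority and a.access != b.access:
                conflicts.append((path, a, b))
    return conflicts


# Hypothetical rules: open sandbox branches, but protect *.key files everywhere.
rules = [
    Rule("/*/branches/sandbox/*", priority=1, access="rw"),
    Rule("/*/branches/*/*.key", priority=2, access=""),
]
paths = ["/proj/branches/sandbox/notes.txt", "/proj/branches/sandbox/server.key"]
```

With these sample rules, `find_conflicts(paths, rules)` flags only the `server.key` path, because both rules match it and they disagree; the priority then decides which rule takes effect.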

This feature will likely be a life saver for Subversion administrators – just contact us and we’ll help you take advantage of it.

SmartSVN 8.6.3 General Access Released!

We’re pleased to announce the latest release of SmartSVN, 8.6.3. SmartSVN is the popular graphical Subversion (SVN) client for Mac, Windows, and Linux. SmartSVN 8.6.3 is available immediately for download from our website.

New Features include:

– Show client certificate option in the SSL tab in Preferences

Fixes include:

– Bug reporting now suggests the email address from the license file

For a full list of all improvements and features please see the changelog.

 

Note for Mac OS X 8.6.2 users:
– If you installed version 8.6.2 as a new download (rather than autoupdating), you will need to download and reinstall 8.6.3 to stop the master password window from constantly reappearing
– You will be required to enter the master password once more after the installation

Contribute to further enhancements

Many of the issues resolved in SmartSVN were raised by our dedicated SmartSVN forum, so if you’ve got an issue or a request for a new feature, head there and let us know.

Get Started

Haven’t yet started using SmartSVN? Get a free trial of SmartSVN Professional now.

If you have SmartSVN and need to update to SmartSVN 8, you can update directly within the application. Read the Knowledgebase article for further information.

Experiences with R and Big Data

The next releases of Subversion MultiSite Plus and Git MultiSite will embed Apache Flume for audit event collection and transmission. We’re taking an incremental approach to audit event collection and analysis, as the throughput at a busy site could generate a lot of data.

In the meantime, I’ve been experimenting with some more advanced and customized analysis. I’ve got a test system instrumented with a custom Flume configuration that pipes data into HBase instead of our Access Control Plus product. The question then is how to get useful answers out of HBase to questions like: What’s the distribution of SCM activity between the nodes in the system?

It’s actually not too bad to get that information directly from an HBase scan, but I also wanted to see some pretty charts. Naturally I turned to R, which led me again to the topic of how to use R to analyze Big Data.

A quick survey showed three possible approaches:

  • The RHadoop packages provided by Revolution Analytics, which include rhbase and rmr (R MapReduce)
  • The SparkR package
  • The Pivotal package that lets you analyze data in Hawq

I’m not using Pivotal’s distribution and I didn’t want to invest time in looking at a MapReduce-style analysis, so that left me with RHBase and SparkR.

Both packages were reasonably easy to install as these things go, and RHBase let me directly perform a table scan and crunch the output data set. I was a bit worried about what would happen once a table scan started returning millions of rows instead of thousands, so I wanted to try SparkR as well.

SparkR let me define a data source (in this case an export from HBase) and then run a functional reduce on it. In the first step I would produce some metric of interest (AuthZ success/failure for some combination of repository and node location) for each input line, and then reduce by key to get aggregate statistics. Nothing fancy, but Spark can handle a lot more data than R on a single workstation. The Spark programming paradigm fits nicely into R; it didn’t feel nearly as foreign as writing MapReduce or HBase scans. Of course, Spark is also considerably faster than normal MapReduce.

Here’s a small code snippet for illustration:

 

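library(SparkR)  # assumes sc is an existing Spark context, e.g. from sparkR.init()
lines <- textFile(sc, "/home/vagrant/hbase-out/authz_event.csv")
# Map each CSV line to a (key, metric) pair; the column positions are assumed
mlines <- lapply(lines, function(line) {
  fields <- strsplit(line, ",")[[1]]
  key <- paste(fields[1], fields[2], sep = ":")  # e.g. repository:node
  metric <- as.numeric(fields[3])                # e.g. 1 for an AuthZ failure
  list(key, metric)
})
# Sum the metrics per key using 2 partitions, then collect them on the driver
parts <- reduceByKey(mlines, "+", 2L)
reduced <- collect(parts)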
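
For readers who don't have a Spark cluster at hand, the same map-then-reduce-by-key pattern can be sketched in plain Python on a single machine. This is only an illustration of the pattern, not of SparkR itself; the three-column CSV layout (repository, node, outcome) and the sample rows are assumptions, not the actual export format.

```python
def to_pair(line):
    """Map step: turn one CSV line into a (key, metric) pair.

    Assumed (hypothetical) columns: repository,node,outcome
    """
    repo, node, outcome = line.strip().split(",")
    # The metric here is 1 for an AuthZ failure and 0 otherwise.
    return (repo, node), 1 if outcome == "failure" else 0


def reduce_by_key(pairs, combine):
    """Reduce step: fold all values that share a key with combine(a, b)."""
    totals = {}
    for key, value in pairs:
        totals[key] = combine(totals[key], value) if key in totals else value
    return totals


# Made-up sample rows standing in for the exported audit events.
sample = [
    "repo1,nodeA,success",
    "repo1,nodeA,failure",
    "repo1,nodeB,failure",
    "repo1,nodeA,failure",
]
failures = reduce_by_key(map(to_pair, sample), lambda a, b: a + b)
```

Here `failures` maps each (repository, node) pair to its failure count. Spark applies the same two steps, but spreads the pairs across partitions and combines the partial results.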

 

In reality, I might use SparkR in a lambda architecture as part of my serving layer and RHBase as part of the speed layer.

It already feels like these extra packages are making Big Data very accessible to the tools that data scientists use, and given that data analysis is driving a lot of the business use cases for Hadoop, I’m sure we’ll see more innovation in this area soon.

Unified Git and Subversion Management

Over the past several years the movement in ALM tools has been away from heavy, inflexible tools towards lighter and more flexible solutions. Developers want and need the freedom to experiment and work quickly without being bound by heavy processes and restrictions.

But, of course, an enterprise still needs some level of management and governance over software development. Now it looks like the pendulum is swinging back towards a useful middle ground – and WANdisco’s new Access Control Plus product strikes that fine balance between flexibility and guidance.

Access Control Plus is flexible because it lets team leaders manage access to their repositories.  Site administrators can set overall policies and make sure that the truly sensitive data stays safe. Access Control Plus provides for any level of delegated team management, letting the team leaders closest to the source code manage their teams and permissions. And with accounts backed by any number of LDAP or Active Directory authorities, the grunt work of account management is automated.

Yet Access Control Plus is still an authoritative resource for security, auditing and reporting. It covers permissions for all of your Subversion and Git repositories at any location. That’s important for a number of reasons:

  • Sanity! You need some form of consistent permission management over your repositories.
  • An audit trail of your inventions. With the new America Invents Act, a comprehensive record of your intellectual property is more important than ever.
  • Regulatory regimes. Whether it’s Sarbanes-Oxley, HIPAA, or PCI, can you prove accurately who was accessing and modifying your IP?  That’s a key concern for compliance officers.
  • DevOps. If you practice configuration as code, then some of your crown jewels are stored in SCM, and need to be managed appropriately.
  • Industry standards. From CMMI to ISO 9000, standard processes and controls are the cost of doing business in certain industries.  Access Control Plus ticks all of the auditing and reporting checkmarks for you.

Combined with SVN MultiSite Plus and Git MultiSite, Access Control Plus is a complete solution for making your valuable digital data highly available and secure. Be proactive – give us a call and figure out how to manage all of your Subversion and Git repositories.

 

The AIA Prior Use Defense and DevOps

Configuration as Highly Valuable Code

As I wrote about earlier, the expanded scope of the ‘prior use’ defense in the America Invents Act (AIA) provides you with an improved defense against patent litigation. If you’ve adopted DevOps and Continuous Delivery, you need to make sure that you have a strong record of how you’re deploying your software, not just how it was developed. After all, some of your secret sauce may well be your deployment process – a clever way of scaling your application on Azure or EC2, or perhaps a sophisticated canary deployment technique.

Proving that your clever deployment tricks were in use at some point in time is just another reason to treat configuration as code and store it in your Git repositories. In order to do that, you need to figure out a couple of key problems:

  • How do you secure the production data while still making less sensitive deployment data available to development teams?
  • How do you prove that your production data was actually in use?
  • How do you manage having Git repositories on production app servers that may be outside your firewall?

WANdisco’s Git Access Control and Git MultiSite provide easy answers to those challenges.  Git Access Control lets you control write access down to the file level, so you can easily let developers modify staging data without giving them access to production data in the same repository. These permissions are applied consistently on every repository, on every server.

Similarly, Git Access Control provides comprehensive audit capabilities so you can see when data was cloned or fetched to a particular server. You can also use these auditing capabilities to satisfy regulatory concerns over access to production environment data.

Finally, Git MultiSite’s flexible replication groups let you securely control where and how a DevOps repository is used. For example, you may want to have the DevOps repository available for full use on internal servers, but only available for clones and pulls on a production server.

If DevOps has taught us anything, it’s that configuration and environment data is as important as source code in many cases. Git Access Control and Git MultiSite give you the control you need to confidently store configuration as code and establish your ‘prior use’ history.

More Rebasing in Subversion

Continuing on from a previous post about rebasing in Subversion, let’s look at a more general example of using rebasing to port commits to a new base branch.

In Git we’ll start with this picture.

I have three branches: master, team, and topic. Now I’d like to get the unique commits (to-1, to-2) on the topic branch and get them back to master cleanly, but I don’t want the intermediate work on the team branch (commit te-1).

So I use rebasing (git rebase --onto master team topic) to take the diffs between team and topic and replay them with master as the new base for the rebased topic branch.

That gives me the clean picture above. At this point it would be trivial to do a fast forward merge of topic to master.

Using much the same techniques as I discussed last time, it’s possible to emulate this capability in Subversion. Here’s my starting point.

Again, I want to get the local commits from topic and make them more easily accessible to trunk without running a regular merge, which would have to go through the team branch.  Here’s the recipe.

  • Make branch topic-prime from the current head of trunk.
  • Run svn mergeinfo to determine the diffs between topic and team (the revs eligible for a merge from topic to team).
  • Run a cherry-pick merge (ignoring ancestry) of each of those revs from topic to topic-prime.
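
The recipe can be sketched as a small Python helper that builds the svn command lines without running them. The branch URLs (`^/trunk`, `^/branches/topic`, and so on), the log message, and the extra `svn switch` step that points a working copy at topic-prime are assumptions for illustration; adjust them to your repository layout.

```python
def mergeinfo_command(topic="^/branches/topic", team="^/branches/team"):
    """Command that lists the revs on topic eligible for a merge to team."""
    return ["svn", "mergeinfo", "--show-revs", "eligible", topic, team]


def parse_eligible_revs(mergeinfo_output):
    """Parse lines such as 'r12' or 'r15*' into sorted revision numbers."""
    revs = []
    for line in mergeinfo_output.splitlines():
        token = line.strip().rstrip("*")  # '*' marks a partially merged rev
        if token.startswith("r") and token[1:].isdigit():
            revs.append(int(token[1:]))
    return sorted(revs)


def rebase_commands(eligible_revs,
                    topic="^/branches/topic",
                    trunk="^/trunk",
                    new_branch="^/branches/topic-prime",
                    wc_path="."):
    """Build the svn commands for the recipe: branch, switch, cherry-pick."""
    cmds = [
        # Step 1: create topic-prime from the current head of trunk.
        ["svn", "copy", trunk, new_branch, "-m", "Create topic-prime from trunk"],
        # Assumed helper step: point the working copy at the new branch.
        ["svn", "switch", new_branch, wc_path],
    ]
    # Step 3: cherry-pick each eligible revision, ignoring ancestry.
    for rev in eligible_revs:
        cmds.append(["svn", "merge", "--ignore-ancestry", "-c", str(rev),
                     topic, wc_path])
    return cmds


# Example with made-up output from the mergeinfo step.
revs = parse_eligible_revs("r12\nr15\n")
commands = rebase_commands(revs)
```

Each list in `commands` could be passed to `subprocess.run`; the merged changes would still need to be reviewed and committed with `svn commit` afterwards.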

Using that recipe gives me this picture:

At this point I could continue working on topic-prime or run a relatively simple merge to trunk. I could have also changed my recipe to run the cherry-pick merges directly onto trunk instead of using a new branch.

In any case, the end result is fairly close to what you’d have in Git, although the process of getting there wasn’t as easy (and I still have the original topic branch lying around).

Git has uncovered a lot of useful tools and techniques, and although it takes a bit of extra work, you can emulate some of these in Subversion. Questions? Give me a ping on Twitter or svnforum.

Authentication and Authorization – Subversion and Git Live 2014

We’ve switched the format of some of the talks for Subversion and Git Live this year – several will be hands-on, giving you the opportunity to test out the subject being discussed rather than just making notes.

One of these talks this year will be delivered by Ben Reser, one of our Subversion committers, on Authentication and Authorization. Ben has been working on Subversion since 2003 and will be discussing:

  • A brief overview of the access control methods Subversion supports.
  • Hands-on setup of a Subversion server with LDAP authentication over HTTP.
  • A look at the performance costs of access control and what you can do to minimize them.
  • How to put your authz configuration file into the repository.

The hands-on portion will follow a hypothetical company as it grows and shifts from a very basic setup to a much more complex one, showing some of the problems it would have along the way and discussing the reasons for each configuration change. The company starts off with a single repository with basic authentication (no path-based authorization) and ends up with multiple repositories, LDAP, and path-based authorization. Eventually we’ll even use the new in-repository authz feature added in 1.8. The configuration improvements along the way will show how to ease administrative burden and improve performance.

The goal with this talk is to have you walking away knowing why you configure Subversion the way you do and how you can make things better for your particular setup, rather than just giving you an example authz file and telling you it’s the ‘right’ way to do things.

If that sounds good to you, why not come see us at Subversion and Git Live 2014? There’s more info about the event here: http://www.wandisco.com/subversion-git-live-2014.

WANdisco Announces Availability of Apache Subversion 1.9.0 alpha binaries

Apache have announced the release of the binaries for Subversion 1.9.0 alpha with a number of significant improvements.

It’s important to note that this is an alpha release and as such is not recommended for production environments. If you’re able to download and test this release in a non production environment though we’d be grateful for any feedback – if you notice anything untoward or even just want to chat or ask about this latest version please drop us a post in our forums.

This release introduces improvements to caching and authentication, some filesystem optimisations for fsx and fsfs, a number of additions to svnadmin commands and improvements to the interactive conflict resolution menus. Other enhancements include:

  • New options for ‘svnadmin verify’:
      • ‘--check-normalization’
      • ‘--keep-going’
  • svnadmin info: prints information about a repository.
  • Additions to ‘svn cleanup’:
      • ‘--remove-unversioned’ and ‘--remove-ignored’
      • ‘--include-externals’ option
      • ‘--quiet’ option
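
As a rough illustration of how these options fit together on the command line, the following Python sketch only builds the argument lists; it does not run anything. The function names, the repository path, and the working copy path are hypothetical, and the option spellings are written with double hyphens as the svn and svnadmin command lines expect (the ignored-files option is spelled `--remove-ignored` in the 1.9 release).

```python
def svnadmin_verify_cmd(repo_path, check_normalization=False, keep_going=False):
    """Build an 'svnadmin verify' command using the new 1.9 options."""
    cmd = ["svnadmin", "verify"]
    if check_normalization:
        cmd.append("--check-normalization")
    if keep_going:
        cmd.append("--keep-going")  # continue after the first error found
    cmd.append(repo_path)
    return cmd


def svn_cleanup_cmd(wc_path=".", remove_unversioned=False, remove_ignored=False,
                    include_externals=False, quiet=False):
    """Build an 'svn cleanup' command using the new 1.9 options."""
    cmd = ["svn", "cleanup"]
    if remove_unversioned:
        cmd.append("--remove-unversioned")
    if remove_ignored:
        cmd.append("--remove-ignored")
    if include_externals:
        cmd.append("--include-externals")
    if quiet:
        cmd.append("--quiet")
    cmd.append(wc_path)
    return cmd


# Hypothetical paths, for illustration only.
verify = svnadmin_verify_cmd("/srv/svn/repo", keep_going=True)
cleanup = svn_cleanup_cmd("wc", remove_unversioned=True, include_externals=True)
```

Because `--remove-unversioned` and `--remove-ignored` delete files from the working copy, it is worth trying them on a scratch checkout first.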

You can see a full list of changes in the release notes here.

To save you the hassle of compiling from source you can download our fully tested, certified binaries free from our website here: http://www.wandisco.com/subversion/download/early-access

WANdisco’s Subversion binaries provide a complete, fully tested version of Subversion based on the most recent stable release, including the latest fixes, and undergo the same rigorous quality assurance process that WANdisco uses for its enterprise products that support the world’s largest Subversion implementations.

Using TortoiseSVN or SmartSVN? As this is an alpha release there’s no compatible version of these Subversion clients yet, but watch this space and we’ll have them ready before the general release of Subversion 1.9.0.