Git Repository Metrics

Managing Git repositories means looking ahead, not just fighting today’s fire. Keeping an eye on key Git repository metrics will keep you a step ahead – and keep your development teams happy. Several metrics are useful predictors: repository size, growth rate, number of references, number of files exceeding a size threshold, and number of operations per day. These metrics help with hardware sizing and with maintaining good performance. For example, you can see whether you need more Git replicas to handle clones and pulls from a new development team, or whether someone is checking in too many large binary files and slowing down the repository.
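To make a couple of these metrics concrete, here is a minimal Python sketch of how repository size and reference count could be read from a local clone with stock Git commands. This is not how Git MultiSite gathers its numbers; the function names are my own, and the sketch assumes git is on the PATH.

```python
import subprocess


def parse_count_objects(output):
    """Parse the 'key: value' lines printed by `git count-objects -v`."""
    stats = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats


def repo_size_kib(stats):
    """Loose objects ('size') plus packed objects ('size-pack'), both in KiB."""
    return stats.get("size", 0) + stats.get("size-pack", 0)


def collect(repo_path):
    """Return size and reference count for one local repository (hypothetical helper)."""
    objects = subprocess.run(
        ["git", "-C", repo_path, "count-objects", "-v"],
        capture_output=True, text=True, check=True,
    ).stdout
    refs = subprocess.run(
        ["git", "-C", repo_path, "for-each-ref", "--format=%(refname)"],
        capture_output=True, text=True, check=True,
    ).stdout
    return {
        "size_kib": repo_size_kib(parse_count_objects(objects)),
        "ref_count": len(refs.splitlines()),
    }
```

Calling collect on a repository path on a schedule and storing the results would give a rough picture of size and reference growth.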

Metrics in the Dashboard

How do you go about collecting this data? Most Git reporting tools focus purely on development metrics like number of commits and developer activity. By contrast, Git MultiSite has some useful metrics built right into its administration console, viewable either graphically or in a list.

Figure: Repository Size and Activity Over Time

Collecting and Viewing Metrics with Graphite

The administration console dashboard gives you a quick snapshot of key metrics over time. However, you may have your own reporting and analysis tools that provide a more elaborate monitoring framework. In that case you can pull data out of Git MultiSite’s REST API and feed it into an external system, giving you complete control over how you use the repository metrics.

As a simple example, let’s look at how to track repository size over time using Graphite. Graphite is an open source tool for storing and charting numeric time-series metrics. Internally it uses a round-robin database that allows for flexible data retention and automatic purging of old data.

Collecting Data

First, I’ll write a script that uses curl to gather the latest repository statistics from Git MultiSite’s REST API, parses out the size of each repository, and feeds it to Graphite using the plaintext protocol.

#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Git MultiSite REST endpoint and the resource that lists repositories
my $ENDPOINT = 'http://gitms1:8082/dcone/';
my $REPOSITORIES = 'repositories/';
# Graphite plaintext listener; set $SERVER to your Graphite host
my $PORT = '2003';
my $SERVER = '';
my $FEED_PREFIX = 'gitms.';
my $FEED_SUFFIX = '.size';

# Fetch the repository list as XML (-s suppresses curl's progress meter)
my $rest_output = `curl -s $ENDPOINT$REPOSITORIES`;

my $ref = XMLin($rest_output, ForceArray => 1);
# Epoch seconds; unlike `date +%s`, time() has no trailing newline
my $date = time;

# Send one "<metric> <value> <timestamp>" line per repository
foreach my $repo (@{ $ref->{repository} }) {
   my $size = $repo->{repoSize}->[0];
   my $name = $repo->{name}->[0];
   my $feed = $FEED_PREFIX . $name . $FEED_SUFFIX;
   print "$name\n$size\n$date\n";
   system("echo \"$feed $size $date\" | nc $SERVER $PORT");
}

I’ll set up a cron job to run this script every 5 minutes. The script will insert a metric called gitms.<repo name>.size for each repository.

Note that there are more efficient ways to send the data to Graphite, such as batching metrics over a single connection or using the pickle protocol, but sending one plaintext line per metric works well for demonstrations.
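As one illustration of a slightly more efficient approach, the Python sketch below parses the same REST response and sends all metrics over a single TCP connection instead of starting nc once per repository. It assumes the response has repository elements containing name and repoSize children, which is what the Perl script's lookups imply; the function names and the sample host in the comment are my own and hypothetical.

```python
import socket
import xml.etree.ElementTree as ET


def parse_repo_sizes(xml_text):
    """Return {repository name: size} from the REST response.

    Assumes <repository> children of the root element, each with <name>
    and <repoSize> children (inferred from the Perl script, not from API docs).
    """
    root = ET.fromstring(xml_text)
    sizes = {}
    for repo in root.findall("repository"):
        name = repo.findtext("name")
        size = repo.findtext("repoSize")
        if name is not None and size is not None:
            sizes[name.strip()] = size.strip()
    return sizes


def plaintext_lines(sizes, timestamp, prefix="gitms.", suffix=".size"):
    """Build one '<metric> <value> <timestamp>' line per repository."""
    return [
        f"{prefix}{name}{suffix} {size} {timestamp}"
        for name, size in sorted(sizes.items())
    ]


def send_to_graphite(lines, host, port=2003):
    """Send all lines over one connection to Graphite's plaintext listener."""
    payload = "".join(line + "\n" for line in lines).encode("ascii")
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(payload)


# Example use (host name is a placeholder):
#   import time
#   lines = plaintext_lines(parse_repo_sizes(xml_text), int(time.time()))
#   send_to_graphite(lines, "graphite.example.com")
```

Repository names containing dots or spaces would need escaping before being used in a metric path; the sketch leaves that out.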

Viewing Data

Next I’ll configure a simple Graphite chart that shows the repository size over time.

Figure: Repository Size

Graphite can also show calculated metrics. Here I’ll look at a chart showing repository growth over time. (Specifically, each point on the chart shows the 7-day delta in repository size: the size at that time minus the size seven days earlier.)
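To spell out what a 7-day delta means, here is a small Python sketch that computes it from a list of (timestamp, size) samples. Inside Graphite a similar series is typically built by combining functions such as timeShift and diffSeries; the exact chart definition used here isn't shown, so treat this as an illustration of the arithmetic only. The function name and sample data are hypothetical.

```python
import bisect

WEEK = 7 * 24 * 3600  # seconds


def weekly_delta(points, window=WEEK):
    """Compute size minus the size one window earlier.

    points: list of (timestamp, size) tuples sorted by timestamp.
    Returns (timestamp, delta) for each point that has a sample at or
    before timestamp - window; earlier points are skipped.
    """
    times = [t for t, _ in points]
    deltas = []
    for t, size in points:
        # Index of the latest sample taken at or before t - window
        i = bisect.bisect_right(times, t - window) - 1
        if i >= 0:
            deltas.append((t, size - points[i][1]))
    return deltas


# Ten daily samples growing by 10 units per day
samples = [(day * 86400, 100 + 10 * day) for day in range(10)]
print(weekly_delta(samples))
```

With steady growth of 10 units per day, every point that has a full week of history behind it reports a delta of 70.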

Figure: Repository Growth

As a Git administrator, I’d keep an eye out for unusual spikes in repository growth. These spikes may indicate an automated build system run amok, or a new project starting up. I may need to take corrective action or start planning for a capacity upgrade.
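If you’d rather not watch the chart by eye, a simple check can flag candidate spikes. The Python sketch below is one possible heuristic, not a feature of Git MultiSite or Graphite: it flags any growth value more than a chosen multiple of the median positive growth. The function name and the factor of 3 are arbitrary assumptions.

```python
from statistics import median


def find_spikes(deltas, factor=3.0):
    """Return (timestamp, delta) entries well above typical growth.

    deltas: list of (timestamp, growth) tuples.
    A value counts as a spike when it exceeds factor times the median
    of the positive growth values.
    """
    positive = [d for _, d in deltas if d > 0]
    if not positive:
        return []
    threshold = factor * median(positive)
    return [(t, d) for t, d in deltas if d > threshold]
```

The threshold would need tuning per repository, since a busy repository and a quiet one have very different normal growth.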

Tools like Graphite are purpose-built for metric storage and charting, so having an easy way to extract data from Git MultiSite using the REST API makes a great integration point.

Get Going

Git MultiSite provides an open and extensible management framework for your Git repositories, along with all the benefits of true active-active replication. If you’re interested in setting up a comprehensive Git monitoring system, ask for advice or start a free trial of Git MultiSite today.
