Tag Archive for 'apache'

Application Specific Data? It’s So 2013

Looking back at the past 10 years of software, the word ‘boring’ comes to mind. The buzzwords were things like ‘web services’ and ‘SOA’. CIOs loved the promise of these things, but they could not deliver. The idea of build once and reuse everywhere really was the ‘nirvana’.

Well it now seems like we can do all of that stuff.

As I’ve said before, Big Data is not a great name because it implies that all we are talking about is a big database with tons of data. Actually that’s only part of the story. Hadoop is the new enterprise applications platform. The key word there is platform. If you could have a single general-purpose data store that could service ‘n’ applications, then the whole notion of database design is over. Think about the new breed of apps on a cell phone, the social media platforms and web search engines. Most of these do this today: data is stored in a general-purpose, non-specific data store and then used by a wide variety of applications. The new phrase for this data store is a ‘data lake’, implying a large quantum of ever-growing and changing data stored without any specific structure.

Talking to a variety of CIOs recently, I found they are very excited by the prospect of both amalgamating data so it can be used and bringing into play data that previously could not be used: unstructured data in a wide variety of formats, like Word documents and PDF files. This also means the barriers to entry are low. Many people believe that adopting Hadoop requires a massive re-skilling of the workforce. It does, but not in the way most people think. Actually getting the data into Hadoop is the easy bit (‘data ingestion’ is the new buzzword). It’s not like the old relational database days, where you first had to model the data using data normalization techniques and then use ETL to put the data into a usable format. With a data lake you simply set up a server cluster and load the data. Creating a data model and using ETL is simply not required.
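As a rough illustration of how little preparation such a load involves, the sketch below prints the commands that would copy every file in a local directory, unchanged, into HDFS. The ‘hdfs dfs -mkdir -p’ and ‘hdfs dfs -put’ commands belong to the standard Hadoop file system shell; the helper name plan_ingest, the local directory and the /data/raw landing path are assumptions made for this example only.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the HDFS shell commands that would load
# every regular file in a local directory into a landing directory as-is.
plan_ingest() {
  src_dir=$1    # local directory holding the raw files (assumption)
  dest_dir=$2   # HDFS landing directory, e.g. /data/raw (assumption)

  # Create the landing directory; -p also creates missing parents.
  printf 'hdfs dfs -mkdir -p "%s"\n' "$dest_dir"

  # One -put per file: no schema and no transformation step.
  for f in "$src_dir"/*; do
    [ -f "$f" ] || continue
    printf 'hdfs dfs -put "%s" "%s/"\n' "$f" "$dest_dir"
  done
}

# Example: review the plan, then pipe it to sh on a machine with a Hadoop client.
#   plan_ingest ./exports /data/raw
#   plan_ingest ./exports /data/raw | sh
```

Printing the plan first keeps the sketch safe to run on a machine without Hadoop installed.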

The real transformation and re-skilling is in application development. Applications are moving to data – today in a client-server world it’s the other way around. We have seen this type of re-skilling before, like moving from COBOL to object-oriented programming.

In the same way that client-server technology disrupted mainframe computer systems, big data will disrupt client-server. We’re already seeing this in the market today. It’s no surprise that the most successful companies in the world today (Google, Amazon, Facebook, etc.) are all actually big data companies. This isn’t a ‘might be’; it’s already happened.

About David Richards

David is CEO, President and co-founder of WANdisco and has quickly established WANdisco as one of the world’s most promising technology companies.

Since co-founding the company in Silicon Valley in 2005, David has led WANdisco on a course for rapid international expansion, opening offices in the UK, Japan and China. David spearheaded the acquisition of Altostor, which accelerated the development of WANdisco’s first products for the Big Data market. The majority of WANdisco’s core technology is now produced out of the company’s flourishing software development base in David’s hometown of Sheffield, England and in Belfast, Northern Ireland.

David has become recognised as a champion of British technology and entrepreneurship. In 2012, he led WANdisco to a hugely successful listing on the London Stock Exchange (WAND:LSE), raising over £24m to drive business growth.

With over 15 years’ executive experience in the software industry, David sits on a number of advisory and executive boards of Silicon Valley start-up ventures. A passionate advocate of entrepreneurship, he has established many successful start-up companies in Enterprise Software and is recognised as an industry leader in Enterprise Application Integration and its standards.

David is a frequent commentator on a range of business and technology issues, appearing regularly on Bloomberg and CNBC. Profiles of David have appeared in a range of leading publications including the Financial Times, The Daily Telegraph and the Daily Mail.

Specialties: IPOs, Startups, Entrepreneurship, CEO, Visionary, Investor, board member, advisor, venture capital, offshore development, financing, M&A

WANdisco Releases New Version of Hadoop Distro

We’re proud to announce the release of WANdisco Distro (WDD) version 3.1.1.

WDD is a fully tested, production-ready version of Apache Hadoop 2 that’s free to download. WDD version 3.1.1 includes an enhanced, more intuitive user interface that simplifies Hadoop cluster deployment. WDD 3.1.1 supports SUSE Linux Enterprise Server 11 (Service Pack 2), in addition to Red Hat and CentOS.

“The number of Hadoop deployments is growing quickly and the Big Data market is moving fast,” said Naji Almahmoud, senior director of global business development, SUSE, a WANdisco Non-Stop Alliance partner. “For decades, SUSE has delivered reliable Linux solutions that have been helping global organizations meet performance and scalability requirements. We’re pleased to work closely with WANdisco to support our mutual customers and bring Hadoop to the enterprise.”

All WDD components are tested and certified using the Apache BigTop framework, and we’ve worked closely with both the open source community and leading big data vendors to ensure seamless interoperability across the Hadoop ecosystem.

“The integration of Hadoop into the mainstream enterprise environment is increasing, and continual communication with our customers confirms their requirements – ease of deployment and management as well as support for market leading operating systems,” said David Richards, CEO of WANdisco. “With this release, we’re delivering on those requirements with a thoroughly tested and certified release of WDD.”

WDD 3.1.1 can be downloaded for free now. WANdisco also offers Professional Support for Apache Hadoop.

Apache Subversion Team Releases 1.7.9 and 1.6.21

The Apache Subversion team has announced two new releases: Subversion 1.7.9 and 1.6.21.

Subversion 1.7.9 improves the error messages for svn:date and svn:author props, and it improves the logic in mod_dav_svn’s implementation of lock, as well as a list of other features and fixes:

  • Doxygen docs now ignore prefixes when producing the index

  • Javahl status api now respects the ignoreExternals boolean

  • Executing unnecessary code in log with limit is avoided

  • A fix for a memory leak in `svn log` over svn://

  • An incorrect authz failure when using neon http library has been fixed

  • A fix for an assertion when rep-cache is inaccessible

More information on Apache Subversion 1.7.9 can be found in the Changes file.

Meanwhile, Subversion 1.6.21 improves memory usage when committing properties in mod_dav_svn, and also improves logic in mod_dav_svn’s implementation of lock, alongside bug fixes including:

  • A fix for a post-revprop-change error that could cancel commits

  • A fix for a compatibility issue with g++ 4.7

More information on Apache Subversion 1.6.21 can be found in the Changes file.

Both versions can be downloaded free via the WANdisco website.

Free Webinar: Enterprise-Enabling Hadoop for the Data Center

We’re pleased to announce that WANdisco will be co-hosting a free Apache Hadoop webinar with Tony Baer, Ovum’s lead Big Data analyst. Ovum is an independent analyst and consultancy firm specializing in the IT and telecommunications industries.

This webinar, ‘Big Data – Enterprise-Enabling Hadoop for the Data Center’, will cover the key issues of availability, performance and scalability and how Apache Hadoop is evolving to meet these requirements.

“This webinar will discuss the importance of availability, performance and scalability,” said Ovum’s Tony Baer. “Ovum believes that for Hadoop to become successfully adopted in the enterprise, that it must become a first class citizen with IT and the data center. Availability, performance and scalability are key issues, and also where there is significant innovation occurring. We’ll discuss how the Hadoop platform is evolving to meet these requirements.”

Topics include:

  • How Hadoop is becoming a first class, enterprise-hardened technology for the data center

  • Hadoop components and the role of reliability and performance in those components

  • Disaster recovery challenges faced by globally distributed organizations and how replication technology is crucial to business continuity

  • The importance of seamless Hadoop migration from the public cloud to private clouds, especially for organizations that require secure 24/7 access with real-time performance

‘Big Data – Enterprise-Enabling Hadoop for the Data Center’ will be held on Tuesday, April 30th at 10:00 am Pacific / 1:00 pm Eastern. Register for this free webinar here.

Introduction to SmartSVN

SmartSVN is a powerful and easy-to-use graphical client for Apache Subversion. There are several clients for Subversion, but here are just a few reasons you should try SmartSVN:

  • It’s cross-platform – SmartSVN runs on Windows, Linux and Mac OS X, so you can continue using the operating system (OS) that works the best for you. It can also be integrated into your OS, via Mac’s Finder Integration or Windows Shell.

  • Everything you need, out of the box – SmartSVN comes complete with all the tools you need to manage your Subversion projects:

  1. Conflict solver – this feature combines the freedom of a general, three-way-merge with the ability to detect and resolve any conflicts that occur during the development lifecycle.

  2. File compare – this allows you to make inner-line comparisons and directly edit the compared files.

  3. Built-in SSH client – allows users to access servers using the SSH protocol. This security-conscious protocol encrypts every piece of communication between the client and the server, for additional protection.

  • A complete view of your project at a glance – the most important files (such as conflicted, modified or missing files) are placed at the top of the file list. SmartSVN also highlights which directories contain local modifications, which directories have been changed in the repository, and whether individual files have been modified locally or in the central repo. This makes it easy to get a quick overview of the state of your project.

  • Fully customizable – maximize productivity by fine-tuning your SmartSVN installation to suit your particular needs: Change keyboard shortcuts, write your own plugin with the SmartSVN API, group revisions to personalize your display, create Change Sets, and alter the context menus and toolbars to suit you. You can learn more about customizing SmartSVN at our ‘5 Ways to Customize SmartSVN’ blog post.

  • Comprehensive bug tracker support – Trac and JIRA are both fully supported.

  • Multitude of support options – SmartSVN users have access to a range of free support, from refcards to blogs and documentation, the SmartSVN forum and a Twitter account maintained by our open source experts. If you need extra support with your SmartSVN installation, expert email support is included with SmartSVN Professional licenses.

Want to learn more about SmartSVN? On April 18th, WANdisco will be holding a free ‘Introduction to SmartSVN’ webinar covering everything you need to get off to a great start with this popular client:

  • Repository basics

  • Checkouts, working folders, editing files and commits

  • Reporting on changes

  • Simple branching

  • Simple merging

This webinar is free so register now.

Subversion Tip of the Week

Tagging and Branching with SmartSVN’s ‘Copy Within Repository’

SmartSVN’s ‘Copy Within Repository’ command allows users to perform pure repository copies, which is particularly useful for quickly creating tags and branches.

To create a repository copy within SmartSVN:

1) Open the ‘Modify’ menu and select ‘Copy within Repository’.

2) From the ‘Copy From’ dropdown menu, select the repository where the source resides.

3) In the ‘Copy From’ textbox, specify the directory being copied. In ‘Source Revision,’ tell SmartSVN whether it should copy the HEAD revision (this is selected by default) or a different revision. Use the ‘Browse’ button if you need more information about the contents of the different directories and/or revisions that make up your project.

4) Select either:

  • Copy To – source is copied into the ‘Directory’ under the filename specified by ‘With Name’

  • Copy Contents Into – the contents of the source are copied directly into the ‘Directory’ under ‘With Name.’

5) Enter the copy’s destination in the ‘Directory’ textbox. You can view the available options by clicking the ‘Browse’ button.

6) Give your copy a name in the ‘With Name’ textbox.

7) The copy is performed directly in the repository, so you’ll need to enter an appropriate commit message.

8) Once you’re happy with the information you’ve entered, hit ‘Copy’ to create your new branch/tag.
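For readers who also use the command-line client, the same kind of pure repository copy can be sketched with ‘svn copy’ and two repository URLs. This is only an illustration: the repository URL, the trunk/tags layout and the helper name svn_tag_command are assumptions, not taken from SmartSVN. The helper prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print the svn command for a URL-to-URL copy (a tag).
# Arguments: repository root URL, tag name, optional source revision (default HEAD).
svn_tag_command() {
  repo_url=$1
  tag_name=$2
  revision=${3:-HEAD}

  # Both source and destination are URLs, so the copy is committed directly
  # to the repository and needs a log message (-m), as in step 7.
  printf 'svn copy -r %s "%s/trunk" "%s/tags/%s" -m "Tag %s"\n' \
    "$revision" "$repo_url" "$repo_url" "$tag_name" "$tag_name"
}

# Example with a placeholder repository URL:
svn_tag_command "https://svn.example.com/repos/project" "release-1.0"
```

Passing a third argument, such as a revision number, corresponds to choosing something other than HEAD under ‘Source Revision’.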

Try SmartSVN Professional free today! Get a free trial at http://www.smartsvn.com/download.

SmartSVN’s Project Settings: Properties

You can easily change how SmartSVN handles all your Apache Subversion projects using the popular, cross-platform client’s ‘global preferences’ settings. However, sometimes you’ll want to be more flexible and change SmartSVN’s settings on a per-project basis.

In this post, we take a closer look at the changes you can make to Subversion’s properties on a project-by-project basis, using SmartSVN’s ‘Project Settings’ menu.

Accessing Project Settings

To access SmartSVN’s Project Settings, open the ‘Project’ menu and select ‘Settings.’ The different options are listed on the dialog box’s left-hand side.

EOL Style

Subversion doesn’t pay attention to a file’s end-of-line (EOL) markers by default, which can be a problem for teams who are collaborating on a document across different operating systems. Different operating systems use different characters to represent EOL in a text file, and some operating systems struggle when they encounter unexpected EOL markers.

The ‘EOL Style’ option specifies the end-of-line style default for your current project. You can choose from:

  • Platform-Dependent/Native – files contain EOL markers native to your operating system.

  • LF (Line Feed) – files contain LF characters, regardless of the operating system.

  • CR+LF (Carriage Return & Line Feed) – files contain CRLF sequences, regardless of the operating system.

  • CR (Carriage Return) – files contain CR characters, regardless of the operating system.

  • As is (no convention) – this is typically the default value of EOL-style.

The ‘In case of inconsistent EOLs’ option allows you to define how SmartSVN should handle files with inconsistent EOLs.

You can learn more about EOL Style at the ‘Subversion Properties: EOL-Style’ blog post.
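On the command line, the corresponding per-file setting is Subversion’s svn:eol-style property, which accepts the values native, LF, CRLF and CR. The sketch below maps the option labels listed above onto those values. The mapping of labels to values and the helper name eol_property_value are assumptions made for illustration, not SmartSVN’s own implementation.

```shell
#!/bin/sh
# Hypothetical helper: translate the option labels listed above into values
# of Subversion's svn:eol-style property.
eol_property_value() {
  case "$1" in
    "Platform-Dependent/Native") echo "native" ;;
    "LF")    echo "LF" ;;
    "CR+LF") echo "CRLF" ;;
    "CR")    echo "CR" ;;
    "As is") echo "" ;;   # no convention: the property is simply left unset
    *) echo "unknown EOL style: $1" >&2; return 1 ;;
  esac
}

# Usage inside a working copy (requires the svn client; README.txt is a placeholder):
#   svn propset svn:eol-style "$(eol_property_value "CR+LF")" README.txt
#   svn propdel svn:eol-style README.txt    # back to "As is"
```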

EOL Style — Native

Usually, text files are stored with their ‘native’ EOL Style in the Subversion repository. However, under certain circumstances, it might be convenient to redefine what ‘native’ means, for example, when you’re working on a project on Windows but frequently uploading it to a Unix server. Open this dialog and choose from Linux/Unix, Mac or Windows.

Keyword Substitution

Keyword Substitution allows you to automatically insert ‘keywords’ into the contents of a file. These keywords are useful for automatically maintaining information that would be too time-consuming to keep updating manually.

You can choose from:

  • Author – the username of the person who created the revision.

  • Date – the UTC time at which the revision was created (note that this is based on the server’s clock, not the client’s).

  • ID – a compressed combination of the keywords ‘Author,’ ‘Date’ and ‘Revision.’

  • Revision – describes the last revision in which the selected file was changed in the repository.

  • URL – a link to the latest version of the file in the repository.

  • Header – similar to ‘ID,’ this is a compressed combination of the other keywords, plus the URL information.

You can find out more about Keyword Substitution at our ‘Exploring SVN Properties’ post.
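Outside SmartSVN, keyword substitution is driven by two things: anchors such as $Id$ placed in the file, and the svn:keywords property naming the keywords to expand. The sketch below creates a throwaway file with a few anchors and counts them; the file contents and the file name in the commented commands are placeholders chosen for this example.

```shell
#!/bin/sh
# Create a throwaway file containing keyword anchors. Subversion only expands
# anchors for keywords listed in the file's svn:keywords property.
demo_file=$(mktemp)
cat > "$demo_file" <<'EOF'
# $Id$
# $Revision$
# $Author$
# Last changed: $Date$
EOF

# Count the lines that carry an anchor of the form $Keyword$.
anchor_count=$(grep -c '\$[A-Za-z]*\$' "$demo_file")
echo "anchors found: $anchor_count"

# For a versioned file in a working copy (requires the svn client; notes.txt
# is a placeholder), the keywords would be switched on like this:
#   svn propset svn:keywords "Id Revision Author Date" notes.txt
#   svn commit -m "Enable keyword substitution" notes.txt
```

Anchors for keywords that are not named in the property are left in the file untouched.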

Learn more about the other options available in SmartSVN’s ‘Project Settings’ dialog by reading our Subversion Tip of the Week post.

ASF Announces Apache Bloodhound as Top-Level Project

WANdisco submitted Bloodhound to the Apache Incubator in December 2011 and our developers have been involved in the Apache Bloodhound project since its inception. So we’re pleased that today the Apache Software Foundation (ASF) officially announced Bloodhound as a Top-Level Project (TLP).

Bloodhound is a Trac-based software development collaboration tool that includes an Apache Subversion repository browser, wiki, and defect tracker. It’s also compatible with the hundreds of free plugins available for Trac, allowing users to customize their experience even further.

“WANdisco received many requests for an issue tracker and at the time, open source options available for integration were limited, which is why we decided to invest in setting one up in the Apache Incubator,” said David Richards, CEO of WANdisco. “WANdisco has been actively supportive of the ASF, and we’re proud to have played a leading role in Bloodhound.”

“When Bloodhound entered the incubator, while it was built on the Trac framework, it was a completely new project,” said Gary Martin, Vice President of Apache Bloodhound and WANdisco developer. “Bloodhound’s strengths lie in its powerful combination of Apache Subversion source control and robust ticket system.”

You can learn more about Apache Bloodhound, and download the latest 0.5.2 release, at the Bloodhound website.

 

WANdisco’s March Roundup

Following the recent issuance of our “Distributed computing systems and system components thereof” patent, which covers the fundamentals of active-active replication over a Wide Area Network, we’re excited to announce the filing of three more patents. These patents involve methods, devices and systems that enhance security, reliability, flexibility and efficiency in the field of distributed computing and will have significant benefits for users of our Hadoop Big Data product line.

“Our team continues to break new ground in the field of distributed computing technology,” said David Richards, CEO for WANdisco. “We are proud to have some of the world’s most talented engineers in this field working for us and look forward to the eventual approval of these most recent patent applications. We are particularly excited about their application in our new Big Data product line.”

Our Big Data product line includes Non-Stop NameNode, WANdisco Hadoop Console and WANdisco Distro (WDD).

This month, we also welcomed Bas Nijjer, who built CollabNet UK from startup to multimillion dollar recurring revenue, to the WANdisco team. Bas Nijjer has a proven track record of increasing customer wins, accelerating revenue and providing customer satisfaction, and he takes on the role of WANdisco Sales Director, EMEA.

“Bas is an excellent addition to our team, with great insight on developing and strengthening sales teams and customer relationships as well as enterprise software,” said David Richards. “His expertise and familiarity with EMEA and his results-oriented attitude will help strengthen the WANdisco team and increase sales and renewals. We are pleased to have him join us.”

If joining the WANdisco team interests you, visit our Careers page for all the latest employment opportunities.

We’ve also posted lots of new content at the WANdisco blog. Users of SmartSVN, our cross-platform graphical Subversion client, can find out how to get even more out of their installation with our ‘Performing a Reverse Merge in SmartSVN’ and ‘Backing Up Your SmartSVN Data’ tutorials. For users running the latest and greatest 7.5.4 release of SmartSVN, we’ve put together a deep dive into the fixes and new functionality in this release with our ‘What’s New in SmartSVN 7.5.4?’ post. If you haven’t tried SmartSVN yet, you can claim your free trial of this release by visiting http://smartsvn.com/download

We also have a new post from James Creasy, WANdisco’s Senior Director of Product Management, where he takes a closer look at the “WAN” in “WANdisco:”

“We’ve all heard about the globalization of the world economy. Every globally relevant company is now highly dependent on highly available software, and that software needs to be equally global. However, most systems that these companies rely on were architected with a single machine in mind. These machines were accessed over a LAN (local area network) by mostly co-located teams.

All that changed, starting in the 1990s with widespread adoption of outsourcing. The WAN computing revolution had begun in earnest.”

You can read “What’s in a name, WANdisco?” in full now.

Also at the blog we address the hot topic of ‘Is Subversion Ready for the Enterprise?’ And, if you need more information on the challenges and available solutions for deploying Subversion in an enterprise environment, be sure to sign up for our free-to-attend ‘Scaling Subversion for the Enterprise’ sessions. Taking place a few times a week, these webinars cover limitations and risks related to globally distributed SVN deployments, as well as free resources and live demos to help you overcome them. Take advantage of the opportunity to get answers to your business-specific questions and live demos of enterprise-class SVN products.

Subversion Tip of the Week

Apache Subversion supports the creation and use of ‘patches’ – text files containing the differences between two files. Patches specify which lines have been removed, added and changed, and are particularly useful when you don’t have write access to a repository. In these instances, you can create a patch file showing the changes between a file as it exists in the repository, and the version in your working copy. Then, you can create a ticket and attach your patch file for someone with repository write access to review and commit the accepted changes to the repository.

To create a patch file, you first need to review the differences between the specific files/revisions you are targeting using the ‘svn diff’ command. In this example, we are examining the differences between the version of the project in our working copy and the central repository.

If you’re satisfied with the differences ‘svn diff’ has identified, run the following command to create a patch:

svn diff > patch_name.diff

All the changes will now be written to a patch on your local machine.

You can now send this patch to a user who does have write access to the repository.
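Before the patch is committed, the person receiving it may want a quick summary of what it changes. The sketch below writes a small sample patch in the unified format that ‘svn diff’ produces and counts the added and removed lines. The file name and contents are made up for illustration. Subversion 1.7 and later can apply such a file with ‘svn patch’.

```shell
#!/bin/sh
# Write a made-up sample patch in the unified diff format used by 'svn diff'.
patch_file=$(mktemp)
cat > "$patch_file" <<'EOF'
Index: hello.txt
===================================================================
--- hello.txt   (revision 12)
+++ hello.txt   (working copy)
@@ -1,3 +1,4 @@
 first line
-old second line
+new second line
+an extra line
 last line
EOF

# Added lines start with a single '+', removed lines with a single '-'.
# The '+++' and '---' header lines are excluded by the second character class.
# (Simplification: a changed line whose own text begins with '+' or '-' is missed.)
added=$(grep -c '^+[^+]' "$patch_file")
removed=$(grep -c '^-[^-]' "$patch_file")
echo "added: $added, removed: $removed"

# To apply a real patch inside a working copy (requires svn 1.7 or later):
#   svn patch patch_name.diff
```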

Creating a Patch Between Revisions

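Alternatively, if you want to create a patch containing the differences between two revisions, first review them by running the following command:

svn diff -r (revision):(revision) (working-copy-location)

Then write the same differences to a patch file:

svn diff -r (revision):(revision) (working-copy-location) > patch_name.diff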

Again, this patch file can now be submitted to someone with write access.
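When the revision numbers come from a script or from user input, it can help to validate them before handing them to ‘svn diff’. The sketch below builds the N:M argument that the -r option expects. The helper name revision_range and the revision numbers in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: check that both revisions are plain numbers and print
# them in the N:M form accepted by 'svn diff -r'.
revision_range() {
  for rev in "$1" "$2"; do
    case "$rev" in
      ''|*[!0-9]*)
        echo "revision numbers must be numeric: '$rev'" >&2
        return 1
        ;;
    esac
  done
  printf '%s:%s\n' "$1" "$2"
}

# Usage inside a working copy (requires the svn client):
#   svn diff -r "$(revision_range 10 15)" . > r10-r15.diff
```

Subversion also accepts revision keywords such as HEAD and dates in braces; this sketch deliberately rejects them to stay simple.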

Want more advice on your Apache Subversion installation? We have a full series of SVN refcards for free download, covering hot topics such as branching and merging, and best practices. You can find out more at www.wandisco.com/svnref