Tag Archive for 'WANdisco'

Why Data Driven Companies Rely on WANdisco Fusion

Hadoop is now clearly gaining momentum. We are seeing more and more customers attempting to deploy enterprise-grade applications. Data protection, governance, performance and availability are top concerns. WANdisco Fusion’s level of resiliency is enabling customers to move out of the lab and into production much faster.

As companies start to scale these platforms and begin the journey to becoming data-driven, they are completely focused on business value and return on investment. WANdisco’s ability to optimize resource utilization by eliminating the need for standby servers resonates well with our partners and customers. These companies are not Google or Facebook. They don’t have an endless supply of hardware and their core business isn’t delivering technology.

As these companies add data from more sources to Hadoop, they are implementing backup and disaster recovery plans and deploying multiple clusters for redundancy. One of our customers, a large bank, is beginning to utilize the cloud for DR.

I’ve met 11 new customers in the past eight days. Five of them have architected cloud into their data lake strategy and are evaluating the players. They are looking to run large data sets in the cloud for efficiency as well as backup and DR.

One of those customers, a leader in IT security, tells me they plan to move their entire infrastructure to the cloud within the next 12 months. They already have 200 nodes in production today, which they expect to double in a year.

Many of our partners are interested in how they can make it easy to onboard data from behind the firewall to the cloud while delivering the best performance. They recognize this is fundamental to a successful cloud strategy.

Companies are already embarking on migrations from one Hadoop platform to another. We’re working with customers on migrations from MapR to HDP, CDH to HDP, CDH to Oracle BDA, and, because we are HCFS compatible, GPFS to IOP. Some of these are petabyte scale.

For many of these companies, WANdisco Fusion’s ability to eliminate downtime, data loss and business disruption is a prerequisite to making that transition. Migration has never been undertaken lightly. I’ve spoken to partners who are unable to migrate their customers due to the required amount of downtime and risk involved.

One customer I met recently completed a large migration to HDP and just last week acquired a company that has a large cluster on Cloudera. We’re talking to them about how we can easily provide a single consistent view of the data. This will allow them to get immediate value from the data they have just acquired. If they choose to migrate completely, they are in control of the timing.

Customers measure their success by time to value. We’re working closely with our strategic partners so that customers don’t have to worry about the nuts and bolts, whatever the distribution and whether the environment is on-prem, cloud, or hybrid, and can concentrate on the business outcome.

Please reach out to me if these use cases resonate and you would like to learn more.

Peter Scott
SVP Business Development

WANdisco Fusion Q&A with Jagane Sundar, CTO

On Tuesday we unveiled our new product: WANdisco Fusion. Ahead of the launch, we caught up with WANdisco CTO Jagane Sundar, who was one of the driving forces behind Fusion.

Jagane joined WANdisco in November 2012 after the firm’s acquisition of AltoStor and has since played a key role in the company’s product development and rollout. Prior to founding AltoStor along with Konstantin Shvachko, Jagane was part of the original team that developed Apache Hadoop at Yahoo!.

Jagane, put simply, what is WANdisco Fusion?

JS: WANdisco Fusion is a wonderful piece of technology that’s built around a strongly consistent transactional replication engine, allowing for the seamless integration of different types of storage for Hadoop applications.

It was designed to help organizations get more out of their Big Data initiatives, answering a number of very real problems facing the business and IT worlds.

And the best part? All of your data centers are active simultaneously: You can read and write in any data center. The result is you don’t have hardware that’s lying idle in your backup or standby data center.

What sort of business problems does it solve?

JS: It provides two new important capabilities for customers. First, it keeps data consistent across different data centers no matter where they are in the world.

And it gives customers the ability to integrate different storage types into a single Hadoop ecosystem. With WANdisco Fusion, it doesn’t matter if you are using Pivotal in one data center, Hortonworks in another and EMC Isilon in a third – you can bring everything into the same environment.

Why would you need to replicate data across different storage systems?

JS: The answer is very simple. Anyone familiar with storage environments knows how diverse they can be. Different types of storage have different strengths depending on the individual application you are running.

However, keeping data synchronized is very difficult if not done right. Fusion removes this challenge while maintaining data consistency.

How does it help future proof a Hadoop deployment?

JS: We believe Fusion will form a critical component of companies’ workflow update procedures. You can update your Hadoop infrastructure one data center at a time, without impacting application availability or having to copy massive amounts of data once the update is done.

This helps you deal with updates from both Hadoop and application vendors in a carefully orchestrated manner.

Doesn’t storage-level replication work as effectively as Fusion?

JS: The short answer is no. Storage-level replication is subject to latency limitations that are imposed by file systems. The result is you cannot really run storage-level replication over long distances, such as a WAN.

Storage-level replication is nowhere near as functional as Fusion: it has to happen at the LAN level and not over a true Wide Area Network.

With Fusion, you have the ability to integrate diverse systems such as NFS with Hadoop, allowing you to exploit the full strengths and capabilities of each individual storage system – I’ve never worked on a project as exciting and as revolutionary as this one.

How did WANdisco Fusion come about?

JS: By getting inside our customers’ data centers and witnessing the challenges they faced. It didn’t take long to notice the diversity of storage environments.

Our customers found that different storage types worked well for different applications – and they liked it that way. They didn’t want strict uniformity across their data centers, but to be able to leverage the strengths of each individual storage type.

At that point we had the idea for a product that would help keep data consistent across different systems.

The result was WANdisco Fusion: a fully replicated transactional engine that makes the work of keeping data consistent trivial. You only have to set it up once and never have to bother with checking if your data is consistent.

This vision of a fully utilized, strongly consistent, diverse storage environment for Hadoop is what we had in mind when we came up with the Fusion product.

You’ve been working with Hadoop for the last 10 years. Just how disruptive is WANdisco Fusion going to be?

JS: I’ve actually been in the storage industry for more than 15 years now. Over that period I’ve worked with shared storage systems, and I’ve worked with Hadoop storage systems. WANdisco Fusion has the potential to completely revolutionize the way people use their storage infrastructure. Frankly, this is the most exciting project I’ve ever been part of.

As the Hadoop ecosystem evolved I saw the need for this virtual storage system that integrates different types of storage.

Efforts to make Hadoop run across different data centers have been mostly unsuccessful. For the first time, we at WANdisco have a way to keep your data in Hadoop systems consistent across different data centers.

This is so exciting because it transforms Hadoop into something that runs in multiple data centers across the world.

Suddenly you have capabilities that even the original inventors of Hadoop didn’t really consider when it was conceived. That’s what makes WANdisco Fusion exciting.