Tag Archive for 'Jagane Sundar'

WANdisco Fusion Q&A with Jagane Sundar, CTO

On Tuesday we unveiled our new product: WANdisco Fusion. Ahead of the launch, we caught up with WANdisco CTO Jagane Sundar, who was one of the driving forces behind Fusion.

Jagane joined WANdisco in November 2012 after the firm’s acquisition of AltoStor and has since played a key role in the company’s product development and rollout. Prior to founding AltoStor along with Konstantin Shvachko, Jagane was part of the original team that developed Apache Hadoop at Yahoo!.

Jagane, put simply, what is WANdisco Fusion?

JS: WANdisco Fusion is a wonderful piece of technology that’s built around a strongly consistent transactional replication engine, allowing for the seamless integration of different types of storage for Hadoop applications.

It was designed to help organizations get more out of their Big Data initiatives, answering a number of very real problems facing the business and IT worlds.

And the best part? All of your data centers are active simultaneously: You can read and write in any data center. The result is you don’t have hardware that’s lying idle in your backup or standby data center.

What sort of business problems does it solve?

JS: It provides two new important capabilities for customers. First, it keeps data consistent across different data centers no matter where they are in the world.

Second, it gives customers the ability to integrate different storage types into a single Hadoop ecosystem. With WANdisco Fusion, it doesn’t matter if you are using Pivotal in one data center, Hortonworks in another and EMC Isilon in a third – you can bring everything into the same environment.

Why would you need to replicate data across different storage systems?

JS: The answer is very simple. Anyone familiar with storage environments knows how diverse they can be. Different types of storage have different strengths depending on the individual application you are running.

However, keeping data synchronized is very difficult if not done right. Fusion removes this challenge while maintaining data consistency.

How does it help future proof a Hadoop deployment?

JS: We believe Fusion will form a critical component of companies’ workflow update procedures. You can update your Hadoop infrastructure one data center at a time, without impacting application availability or having to copy massive amounts of data once the update is done.

This helps you deal with updates from both Hadoop and application vendors in a carefully orchestrated manner.

Doesn’t storage-level replication work as effectively as Fusion?

JS: The short answer is no. Storage-level replication is subject to latency limitations that are imposed by file systems. The result is you cannot really run storage-level replication over long distances, such as a WAN.

Storage-level replication is nowhere near as functional as Fusion: It has to happen at the LAN level and not over a true Wide Area Network.

With Fusion, you have the ability to integrate diverse systems such as NFS with Hadoop, allowing you to exploit the full strengths and capabilities of each individual storage system – I’ve never worked on a project as exciting and as revolutionary as this one.

How did WANdisco Fusion come about?

JS: By getting inside our customers’ data centers and witnessing the challenges they faced. It didn’t take long to notice the diversity of storage environments.

Our customers found that different storage types worked well for different applications – and they liked it that way. They didn’t want strict uniformity across their data centers; they wanted to be able to leverage the strengths of each individual storage type.

At that point we had the idea for a product that would help keep data consistent across different systems.

The result was WANdisco Fusion: a fully replicated transactional engine that makes the work of keeping data consistent trivial. You only have to set it up once and never have to bother with checking if your data is consistent.

This vision of a fully utilized, strongly consistent, diverse storage environment for Hadoop is what we had in mind when we came up with the Fusion product.

You’ve been working with Hadoop for the last 10 years. Just how disruptive is WANdisco Fusion going to be?

JS: I’ve actually been in the storage industry for more than 15 years now. Over that period I’ve worked with shared storage systems, and I’ve worked with Hadoop storage systems. WANdisco Fusion has the potential to completely revolutionize the way people use their storage infrastructure. Frankly, this is the most exciting project I’ve ever been part of.

As the Hadoop ecosystem evolved I saw the need for this virtual storage system that integrates different types of storage.

Efforts to make Hadoop run across different data centers have been mostly unsuccessful. For the first time, we at WANdisco have a way to keep your data in Hadoop systems consistent across different data centers.

This is so exciting because it transforms Hadoop into something that runs in multiple data centers across the world.

Suddenly you have capabilities that even the original inventors of Hadoop didn’t really consider when it was conceived. That’s what makes WANdisco Fusion exciting.

Answers to questions from the Webinar of Dec 11, 2012

 Download the webinar slides here.

Question 1: Are there any special considerations or support of Spring technologies for this (i.e. Spring-Data, Spring-Integration, Spring-Batch)?

Answer: We are continuously looking at technologies that make Hadoop easier to use and program. Spring-Data, Spring-Integration and Spring-Batch show promise. When sufficient momentum is gathered by these projects, we will work with the Spring community to include a tested version of these technologies in the AltoStor Appliance.

Question 2: What is the Hadoop version underneath the appliance?

Answer: Hadoop 2. We intend to remain close to the latest version of Hadoop at any given point in time, allowing for bug fixes and related changes.

Question 3: Will the pricing model be based on the number of NameNodes or the size of the cluster?

Answer: Pricing decisions have not been made yet. We will announce pricing in the first quarter of 2013.

Question 4: Can you comment on how load balancing is resolved across active nodes? Is there a load balance router concept?

Answer: We do not require or depend on any specialized hardware such as load balancers or NFS filers. By “load balancing” we simply mean that application requests (read or write) can be directed to any NameNode based on its proximity to the client or available resources. Thus NameNodes can share the workload and provide higher overall cluster performance compared to an active-standby architecture.
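
To illustrate the idea of directing a request to any NameNode by proximity or available resources, here is a minimal Python sketch. It is not the product's implementation: the `NameNode` record, the `active_requests` load metric, the data center labels, the host names and the `pick_namenode` function are all assumptions made up for this example. It simply prefers a NameNode in the client's own data center and, among those, the least loaded one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameNode:
    """Hypothetical description of one active NameNode (not a Hadoop API)."""
    host: str
    data_center: str
    active_requests: int  # assumed load metric, for illustration only


def pick_namenode(nodes, client_data_center):
    """Prefer NameNodes in the client's data center, then the least loaded."""
    if not nodes:
        raise ValueError("at least one active NameNode is required")
    # Proximity: keep only NameNodes in the same data center as the client.
    local = [n for n in nodes if n.data_center == client_data_center]
    # If none are local, any active NameNode can serve the request.
    candidates = local or list(nodes)
    # Available resources: fewest in-flight requests wins; host breaks ties.
    return min(candidates, key=lambda n: (n.active_requests, n.host))


nodes = [
    NameNode("nn1.example.com", "dc-east", 12),
    NameNode("nn2.example.com", "dc-east", 3),
    NameNode("nn3.example.com", "dc-west", 1),
]
print(pick_namenode(nodes, "dc-east").host)  # nn2.example.com
```

Because every NameNode is active, a client with no local NameNode still gets served: in the sketch it falls back to the least loaded node in any data center.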

Question 5: How does Active-Active replication impact processing time relative to current Hadoop architectures?

Answer: Active-Active replication will result in load balancing of clients across many NameNodes, i.e. fewer clients will be serviced by each NameNode. Since NameNodes share the workload on a busy cluster, you should expect faster response time for clients. Generally, more active NameNodes can perform a proportionally larger amount of work.



About Jagane Sundar

We Just Acquired Big Data / Hadoop Company AltoStor.

I believe that the combination of AltoStor’s expertise and WANdisco’s patented active-active replication technology is the proverbial ‘marriage-made-in-heaven’.  The AltoStor acquisition will enable us to launch products into the highly lucrative Big Data / Hadoop market early next year.

So how lucrative is this market?  Well, I recently read an interesting Wikibon article, “Big Data: Hadoop, Business Analytics and Beyond”, that reiterated what we already knew.  Big Data isn’t a ‘might happen next year’ thing.  No, it’s here today.  To steal a quote from the excellent article: “Make no mistake: Big Data is the new definitive source of competitive advantage across all industries. Enterprises and technology vendors that dismiss Big Data as a passing fad do so at their peril and, in our opinion, will soon find themselves struggling to keep up with more forward-thinking rivals…. For those organizations that understand and embrace the new reality of Big Data, the possibilities for new innovation, improved agility, and increased profitability are nearly endless.”

So why did we acquire AltoStor?

First off, the founders (Dr. Konstantin Shvachko and Jagane Sundar) are really good guys.  This was an ‘old-school’ acquisition.  An initial deal was struck very quickly with a handshake.  Both sides could see very clear value – so doing the deal was incredibly simple.  I love the fact that they wanted stock as consideration – that’s real proof that they see significant long-term value creation rather than short-term gain.

For WANdisco, Big Data is a Big Market.  We can see clear synergy between our unique, patented active-active replication technology and the creation of Hadoop high availability (HA) solutions.  This is one of the reasons why AltoStor was so attractive to us.  They have unique knowledge in the space:

• The AltoStor founders have been working on Hadoop since its inception in 2006 at Yahoo.  Konstantin was part of Doug Cutting’s team that created and implemented Hadoop.  His focus was massive scale, performance and availability of Hadoop – developing the Hadoop Distributed File System (HDFS).  He then went on to eBay where he implemented Hadoop.

• The founders are intimately aware of the problem WANdisco is planning to solve around Hadoop HA and hence understand the value of the solution in large scale Big Data replication over a Wide Area Network.

• Finally, AltoStor is developing a product, slated for release in Q1 2013, that will significantly simplify deployment of Hadoop / Big Data for enterprises.

Following the acquisition we now expect to have products available in the first quarter of 2013.  That’s very good news.

There’s going to be a lot of noise in this space over the coming months and years.  Many will jump on the ‘bandwagon’, making all sorts of lavish claims to be ‘the big data this’ and ‘the big data that’.  It always happens in hype-cycles like this.  In reality most are just companies repurposing existing legacy products and slapping a new label on them.  This is NOT one of those.  We are building from the ground up with unique knowledge and information that only a few in the world have (the amount of brain-power in the room during some of the early design meetings was frightening!).

In 2005, when we founded WANdisco, my peers would tell me that active-active replication over a Wide Area Network was impossible.  Well, we’ve got hundreds of thousands of users using the technology for core development every day.  Applying this technology to Hadoop is groundbreaking and I think it will change the way the industry views network storage.  We like making the impossible possible at WANdisco.


About David Richards

David is CEO, President and co-founder of WANdisco and has quickly established WANdisco as one of the world’s most promising technology companies.

Since co-founding the company in Silicon Valley in 2005, David has led WANdisco on a course for rapid international expansion, opening offices in the UK, Japan and China. David spearheaded the acquisition of AltoStor, which accelerated the development of WANdisco’s first products for the Big Data market. The majority of WANdisco’s core technology is now produced out of the company’s flourishing software development base in David’s hometown of Sheffield, England and in Belfast, Northern Ireland.

David has become recognised as a champion of British technology and entrepreneurship. In 2012, he led WANdisco to a hugely successful listing on the London Stock Exchange (WAND:LSE), raising over £24m to drive business growth.

With over 15 years’ executive experience in the software industry, David sits on a number of advisory and executive boards of Silicon Valley start-up ventures. A passionate advocate of entrepreneurship, he has established many successful start-up companies in Enterprise Software and is recognised as an industry leader in Enterprise Application Integration and its standards.

David is a frequent commentator on a range of business and technology issues, appearing regularly on Bloomberg and CNBC. Profiles of David have appeared in a range of leading publications including the Financial Times, The Daily Telegraph and the Daily Mail.

Specialties: IPOs, Startups, Entrepreneurship, CEO, Visionary, Investor, board member, advisor, venture capital, offshore development, financing, M&A