HDFS Blog


New Webinar Replay: The Future of Big Data for the Enterprise

You may have heard that we’ve just launched the world’s first production-ready Apache Hadoop 2 distro. This WANdisco Distro (WDD) is a fully tested, production-ready version of Apache Hadoop, based on the most recent Hadoop release. We’re particularly excited, as the release of WDD lays the foundation for our upcoming enterprise Hadoop solutions. If you want to find out more about WANdisco’s plans for big data, the replay of our ‘The Future of Big Data for the Enterprise’ webinar is now available.

This webinar is led by WANdisco’s Chief Architect of Big Data, Dr. Konstantin Shvachko, and Jagane Sundar, our Chief Technology Officer and Vice President of Engineering for Big Data. Jagane and Konstantin were part of the original Apache Hadoop team, and have unparalleled expertise in Big Data.

This 30-minute webinar replay covers:

  • The cross-industry growth of Hadoop in the enterprise.
  • The new “Active-Active Architecture” for Apache Hadoop that improves performance.
  • Solving the fundamental issues of Hadoop: usability, high availability, HDFS’s single point of failure and disaster recovery.
  • How WANdisco’s active-active replication technology will alleviate these issues by adding high-availability, data replication and data security to Hadoop, taking a fundamentally different approach to Big Data.

You can watch the full ‘The Future of Big Data for the Enterprise’ replay, along with our other webinars, at our Webinar Replays page.

Design of the Hadoop HDFS NameNode: Part 1 – Request processing

NameNode Design Part 1 – Client Request Processing

HDFS is Hadoop’s file system. It is a distributed file system: it uses a multitude of machines to implement its functionality. Contrast that with NTFS, FAT32, ext3, etc., which are all single-machine file systems.

HDFS is architected such that the metadata, i.e. the information about file names, directories, permissions, etc. is separated from the user data. HDFS consists of the NameNode, which is HDFS’s metadata server, and DataNodes, where user data is stored. There can be only one active instance of the NameNode. A number of DataNodes (a handful to several thousand) can be part of this HDFS served by the single NameNode.

Here is how a client RPC request to the Hadoop HDFS NameNode flows through the NameNode. This pertains to the Hadoop trunk code base on Dec 2, 2012, i.e. a few months after Hadoop 2.0.2-alpha was released.

The Hadoop NameNode receives requests from HDFS clients in the form of Hadoop RPC requests over a TCP connection. Typical client requests include mkdir, getBlockLocations, create file, etc. Remember that HDFS separates metadata from actual file data, and that the NameNode is the metadata server. Hence, these requests are pure metadata requests: no data transfer is involved. The following diagram traces the path of an HDFS client request through the NameNode. The various thread pools used by the system, the locks taken and released by these threads, the queues used, etc. are described in detail in this post.

 

  • As shown in the diagram, a Listener object listens on the TCP port that serves RPC requests from clients. It accepts new connections from clients and adds them to the Server object’s connectionList.
  • Next, a number of RPC Reader threads read requests from the connections in connectionList, decode the RPC requests, and add them to the RPC call queue, Server.callQueue.
  • Now the actual worker threads kick in: these are the Handler threads. They pick up RPC calls from the call queue and process them. For a call that changes the namespace (mkdir, for example), the processing involves the following:
    • First, grab the write lock for the namespace.
    • Change the in-memory namespace.
    • Write to the in-memory FSEdits log (journal).
    • Release the write lock on the namespace. Note that the journal has not been sync’d yet, which means we cannot return success to the RPC client yet.
  • Next, each Handler thread calls logSync. Upon returning from this call, it is guaranteed that the log file modifications have been sync’d to disk. Exactly how this is guaranteed is messy. Here are the details:
    • Every time an edit entry is written to the edits log, a unique txid is assigned to that specific edit. The Handler retrieves this txid and saves it. It is used later to verify whether this specific edit log entry has been sync’d to disk.
    • When logSync is called by a Handler, it first compares the txid of the last sync’d edit log entry with the txid of the edit the Handler has just written. If the Handler’s txid is less than or equal to the last sync’d txid, the edit is already on disk and the Handler can mark the RPC call as complete. If the Handler’s txid is greater than the last sync’d txid, the Handler has to do one of the following:
      • Grab the sync lock and sync all pending transactions.
      • If it cannot grab the sync lock, wait 1000ms and try again in a loop.
    • At this point, the log entry for the transaction made by this Handler has been persisted. The Handler can now mark the RPC as complete.
  • Finally, the single Responder thread picks up completed RPCs and returns the result of each RPC call to the RPC client. Note that the Responder thread uses NIO to asynchronously send responses back to waiting clients. Hence one thread is sufficient.
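
The Handler steps above can be sketched in a few lines. This is a minimal Python model and not Hadoop’s Java code: the names EditLog, log_edit, log_sync and ToyNameNode are made up for illustration, a plain list stands in for the on-disk edits file, and the poll interval is a parameter so that the 1000ms wait can be shortened when experimenting.

```python
import threading
import time


class EditLog:
    """Toy journal: edits are buffered in memory and made durable in batches."""

    def __init__(self):
        self._lock = threading.Lock()  # guards every field below
        self._next_txid = 1
        self._buffer = []              # edits written but not yet sync'd
        self.durable = []              # stands in for the edits file on disk
        self.synced_txid = 0           # highest txid known to be durable
        self._sync_running = False
        self.sync_calls = 0

    def log_edit(self, op):
        """Append an edit to the in-memory journal and return its txid."""
        with self._lock:
            txid = self._next_txid
            self._next_txid += 1
            self._buffer.append((txid, op))
            return txid

    def log_sync(self, my_txid, poll_seconds=1.0):
        """Return only once the edit identified by my_txid is durable."""
        while True:
            with self._lock:
                if my_txid <= self.synced_txid:
                    return  # another Handler's sync already covered this edit
                if not self._sync_running:
                    # Become the syncer and take everything buffered so far.
                    self._sync_running = True
                    batch, self._buffer = self._buffer, []
                    break
            time.sleep(poll_seconds)  # a sync is in flight: poll again later
        # The slow "disk write" runs outside the lock so others can keep logging.
        self.durable.extend(batch)
        with self._lock:
            if batch:
                self.synced_txid = batch[-1][0]
            self._sync_running = False
            self.sync_calls += 1


class ToyNameNode:
    def __init__(self):
        self.namespace_lock = threading.Lock()  # stands in for the namespace write lock
        self.namespace = set()
        self.edit_log = EditLog()

    def mkdir(self, path, poll_seconds=1.0):
        with self.namespace_lock:
            self.namespace.add(path)                         # change the namespace
            txid = self.edit_log.log_edit(("mkdir", path))   # write to the journal
        # The lock is released here, but the edit is not durable yet.
        self.edit_log.log_sync(txid, poll_seconds)
        return txid  # only now is it safe to report success to the client
```

When several threads call mkdir at the same time, one sync can cover the edits of many Handlers, which is why each Handler checks its txid before trying to sync.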

There is one thing about this design that bothers me:

  • The threads that wait for their txid to sync sleep 1000ms, check again, sleep another 1000ms, check again, and keep polling this way. It may make sense to remove the polling mechanism and replace it with an actual wait/notify mechanism.
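
To make the suggestion concrete, here is one possible shape for a notification-based sync, again as a Python sketch rather than Hadoop code. The class name NotifyingEditLog and its methods are hypothetical. Waiting threads block on a condition variable and are woken by the thread that finishes a sync, so nobody sleeps for a fixed interval.

```python
import threading


class NotifyingEditLog:
    """Toy journal where waiters are woken by the syncer instead of polling."""

    def __init__(self):
        self._cond = threading.Condition()  # lock plus wait/notify in one object
        self._next_txid = 1
        self._buffer = []                   # edits written but not yet sync'd
        self.durable = []                   # stands in for the edits file on disk
        self.synced_txid = 0
        self._sync_running = False

    def log_edit(self, op):
        with self._cond:
            txid = self._next_txid
            self._next_txid += 1
            self._buffer.append((txid, op))
            return txid

    def log_sync(self, my_txid):
        with self._cond:
            # Sleep until our edit is durable or nobody else is syncing.
            while my_txid > self.synced_txid and self._sync_running:
                self._cond.wait()
            if my_txid <= self.synced_txid:
                return  # covered by someone else's sync
            self._sync_running = True
            batch, self._buffer = self._buffer, []
        self.durable.extend(batch)  # slow write, done outside the lock
        with self._cond:
            self.synced_txid = batch[-1][0]
            self._sync_running = False
            self._cond.notify_all()  # wake every Handler waiting on this sync
```

The wait sits inside a while loop that rechecks the condition, so a thread that is woken up but whose txid is still not covered simply goes on to become the next syncer.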

That’s all in this writeup, folks.


About Jagane Sundar

We Just Acquired Big Data / Hadoop Company AltoStor.

I believe that the combination of AltoStor’s expertise and WANdisco’s patented active-active replication technology is the proverbial ‘marriage-made-in-heaven’.  The AltoStor acquisition will enable us to launch products into the highly lucrative Big Data / Hadoop market early next year.

So how lucrative is this market? Well, I recently read an interesting Wikibon article, “Big Data: Hadoop, Business Analytics and Beyond”, that reiterated what we already knew. Big Data isn’t a ‘might happen next year’ thing. No, it’s here today. To steal a quote from the excellent article: “Make no mistake: Big Data is the new definitive source of competitive advantage across all industries. Enterprises and technology vendors that dismiss Big Data as a passing fad do so at their peril and, in our opinion, will soon find themselves struggling to keep up with more forward-thinking rivals…. For those organizations that understand and embrace the new reality of Big Data, the possibilities for new innovation, improved agility, and increased profitability are nearly endless.”

So why did we acquire AltoStor?

First off, the founders (Dr. Konstantin Shvachko and Jagane Sundar) are really good guys. This was an ‘old-school’ acquisition. An initial deal was struck very quickly with a handshake. Both sides could see very clear value, so doing the deal was incredibly simple. I love the fact that they wanted stock as consideration: that’s real proof that they see significant long-term value creation rather than short-term gain.

For WANdisco, Big Data is a Big Market. We can see clear synergy between our unique, patented active-active replication technology and the creation of Hadoop high availability (HA) solutions. This is one of the reasons why AltoStor was so attractive to us. They have unique knowledge in the space:

  • The AltoStor founders have been working on Hadoop since its inception in 2006 at Yahoo. Konstantin was part of Doug Cutting’s team that created and implemented Hadoop. His focus was massive scale, performance and availability of Hadoop, developing the Hadoop Distributed File System (HDFS). He then went on to eBay, where he implemented Hadoop.

  • The founders are intimately aware of the problem WANdisco is planning to solve around Hadoop HA and hence understand the value of the solution in large-scale Big Data replication over a Wide Area Network.

  • Finally, AltoStor are developing a product, slated for release in Q1 2013, that will significantly simplify deployment of Hadoop / Big Data for enterprises.

Following the acquisition we now expect to have products available in the first quarter of 2013.  That’s very good news.

There’s going to be a lot of noise in this space over the coming months and years. Many will jump on the ‘bandwagon’, making all sorts of lavish claims to be ‘the big data this’ and ‘the big data that’. It always happens in hype-cycles like this. In reality, most are just companies repurposing existing legacy products and slapping a new label on them. This is NOT one of those. We are building from the ground up with unique knowledge and information that only a few in the world have (the amount of brain-power in the room during some of the early design meetings was frightening!)

In 2005, when we founded WANdisco, my peers would tell me that active-active replication over a Wide Area Network was impossible. Well, we’ve got hundreds of thousands of users using the technology for core development every day. Applying this technology to Hadoop is groundbreaking and I think it will change the way the industry views network storage. We like making the impossible possible at WANdisco.


About David Richards

David is CEO, President and co-founder of WANdisco and has quickly established WANdisco as one of the world’s most promising technology companies. Since co-founding the company in Silicon Valley in 2005, David has led WANdisco on a course for rapid international expansion, opening offices in the UK, Japan and China. David spearheaded the acquisition of AltoStor, which accelerated the development of WANdisco’s first products for the Big Data market. The majority of WANdisco’s core technology is now produced out of the company’s flourishing software development base in David’s hometown of Sheffield, England and in Belfast, Northern Ireland. David has become recognised as a champion of British technology and entrepreneurship. In 2012, he led WANdisco to a hugely successful listing on the London Stock Exchange (WAND:LSE), raising over £24m to drive business growth. With over 15 years' executive experience in the software industry, David sits on a number of advisory and executive boards of Silicon Valley start-up ventures. A passionate advocate of entrepreneurship, he has established many successful start-up companies in Enterprise Software and is recognised as an industry leader in Enterprise Application Integration and its standards. David is a frequent commentator on a range of business and technology issues, appearing regularly on Bloomberg and CNBC. Profiles of David have appeared in a range of leading publications including the Financial Times, The Daily Telegraph and the Daily Mail.