Question 1: Are there any special considerations for, or support of, Spring technologies here (e.g. Spring-Data, Spring-Integration, Spring-Batch)?
Answer: We are continuously looking at technologies that make Hadoop easier to use and program. Spring-Data, Spring-Integration and Spring-Batch show promise. Once these projects gather sufficient momentum, we will work with the Spring community to include a tested version of these technologies in the AltoStor Appliance.
Question 2: What is the Hadoop version underneath the appliance?
Answer: Hadoop 2. We intend to remain close to the latest version of Hadoop at any given point in time, apart from our own bug fixes and related changes.
Question 3: Will the pricing model be based on the number of NameNodes or the size of the cluster?
Answer: Pricing decisions have not been made yet. We will announce pricing in the first quarter of 2013.
Question 4: Can you comment on how load balancing is resolved across active nodes? Is there a load-balancing router concept?
Answer: We do not require or depend on any specialized hardware such as load balancers or NFS filers. By “load balancing” we simply mean that application requests (read or write) can be directed to any NameNode based on its proximity to the client or its available resources. Thus NameNodes can share the workload and provide higher overall cluster performance compared to an active-standby architecture.
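The selection rule described in this answer can be sketched in a few lines of Python. This is only an illustration, not the appliance's actual implementation: the `NameNode` class, its `distance` and `active_requests` fields, the host names, and the choice to rank proximity ahead of load are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class NameNode:
    # All fields are hypothetical and exist only for this illustration.
    host: str
    distance: int         # e.g. network hops from the client (lower is closer)
    active_requests: int  # current load (lower means more spare capacity)


def pick_namenode(nodes):
    """Pick the closest NameNode; break ties by the lightest current load."""
    if not nodes:
        raise ValueError("no NameNodes available")
    return min(nodes, key=lambda n: (n.distance, n.active_requests))


nodes = [
    NameNode("nn1.example.com", distance=2, active_requests=10),
    NameNode("nn2.example.com", distance=1, active_requests=40),
    NameNode("nn3.example.com", distance=1, active_requests=5),
]
chosen = pick_namenode(nodes)
print(chosen.host)  # prints "nn3.example.com"
```

Because every NameNode is active, any entry in the list is a valid target for a read or a write; the rule only decides which one is preferable for a given client.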
Question 5: How does Active-Active replication impact processing time relative to current Hadoop architectures?
Answer: Active-Active replication will result in load balancing of clients across many NameNodes, i.e., fewer clients will be serviced by each NameNode. Since NameNodes share the workload on a busy cluster, you should expect faster response times for clients. Generally, more active NameNodes can perform a proportionally larger amount of work.
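As a rough illustration of "fewer clients per NameNode", the idealized arithmetic looks like this. The sketch assumes clients are spread perfectly evenly and ignores any overhead; the function name and the figure of 1200 clients are made up for the example and are not measurements.

```python
def clients_per_namenode(total_clients, active_namenodes):
    """Idealized even split of clients across active NameNodes (an assumption)."""
    if active_namenodes < 1:
        raise ValueError("need at least one active NameNode")
    return total_clients / active_namenodes


# Hypothetical cluster with 1200 clients and a growing number of active NameNodes.
for n in (1, 2, 4):
    print(n, clients_per_namenode(1200, n))
# prints:
# 1 1200.0
# 2 600.0
# 4 300.0
```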