In the ‘yet another headache for CIOs’ category, here’s an interesting read from the Wall Street Journal on why US companies are going to start building more data centers in Europe soon. In the wake of various cybersecurity threats and some recent political events, national governments are more sensitive to their citizens’ data leaving their area of control. That’s data locality leading to more data centers – and it’ll hit a lot of companies.
Multinational firms are of course affected as they have customer data originating from several areas. But in my mind the jury is out on how big the impact will be. If you’re even a consumer of social media information, do you need a local data center in every area where you’re trying to get that data feed? It’s likely going to take a few years (and probably some legal rulings) before the dust settles.
You can imagine that this new requirement puts a real crimp in Hadoop deployment plans. Do you now need at least a small cluster in each area you do business in? If so, how do you easily keep sensitive data local while still sharing downstream analysis?
This is one of the areas where a geographically distributed HDFS with powerful selective replication capabilities can come to the rescue. For more details, have a listen to the webinar on Hadoop data transfer pipelines that I ran with 451 Research’s Matt Aslett last week.