
Can we store relational data in HDFS?

Tags:

hadoop

hdfs

I am trying to convert an application that has a relational database as its backend. Can I store the data relationally in HDFS as well?

Vaibhav Jain asked Aug 05 '13 08:08



1 Answer

If it is only about storage, you can put anything in HDFS, but that alone won't get you anywhere: HDFS is a file system and has no notion of tables, keys or transactions. First of all, you should not think of Hadoop as a replacement for your RDBMS (which is what you are trying to do here). The two are meant for entirely different purposes. Hadoop is not a good fit for transactional, relational or real-time needs; it was designed for offline batch processing. So it's better to analyze your use case properly before you commit to a decision.

As a suggestion, I would point you to Hive. It provides data warehousing capabilities on top of your existing Hadoop cluster, along with an SQL-like interface (HiveQL) to that warehouse, which will make your life much easier if you are coming from an SQL background. But again, Hive is also a batch processing system and is not a good fit if you need something real-time.
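To give a feel for what "SQL-like" means here, this is a rough sketch that builds a HiveQL `CREATE EXTERNAL TABLE` statement over delimited text files sitting in HDFS. It only produces the statement as a string and does not talk to Hive. The helper name `hive_external_table_ddl`, the `customers` table, its columns and the `/data/customers` path are all made up for illustration.

```python
# Sketch only: builds a HiveQL statement as a string; it does not connect to Hive.

def hive_external_table_ddl(table, columns, location, delimiter=","):
    """Return a CREATE EXTERNAL TABLE statement for delimited text files.

    table, columns and location are hypothetical example values supplied
    by the caller; columns is a list of (name, hive_type) pairs.
    """
    column_sql = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"  {column_sql}\n"
        f")\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


# Hypothetical table exported from the RDBMS as comma-separated files.
ddl = hive_external_table_ddl(
    "customers",
    [("id", "INT"), ("name", "STRING"), ("city", "STRING")],
    "/data/customers",
)
print(ddl)
```

Once a table like this exists, you can query it with familiar `SELECT ... JOIN ... GROUP BY` statements, but each query still runs as a batch job, so don't expect RDBMS-style response times.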

You can have a look at HBase though, as suggested by abhinav. It's a database that runs on top of your Hadoop cluster and gives you random, real-time read/write access to your data. But keep one thing in mind: it's a NoSQL database. It doesn't follow SQL terminology and conventions, so you might find it a bit alien initially. You will have to think about how to store your data in a new, column-oriented style (columns grouped into column families), unlike the row-style storage of your RDBMS. Other than that, it's not hard to set up and use.
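To make the change in storage style a bit more concrete, here is a minimal sketch of how one relational row might map onto an HBase-style row key plus `family:qualifier` cells. It uses plain Python dictionaries as a stand-in and is not the HBase client API. The function `to_hbase_row`, the column family name `info` and the sample rows are assumptions for illustration only.

```python
# Sketch only: plain dictionaries standing in for an HBase table.
# No HBase client library is used here.

def to_hbase_row(record, key_field, family="info"):
    """Map one relational row (a dict) to (row_key, cells).

    Each non-key column becomes a cell named "family:qualifier".
    HBase stores bytes, so values are encoded as UTF-8 strings.
    The family name "info" is an arbitrary choice for this example.
    """
    row_key = str(record[key_field]).encode("utf-8")
    cells = {
        f"{family}:{column}".encode("utf-8"): str(value).encode("utf-8")
        for column, value in record.items()
        if column != key_field and value is not None  # NULLs are simply not stored
    }
    return row_key, cells


table = {}  # row_key -> cells, a stand-in for an HBase table

# Hypothetical rows as they might come out of the RDBMS.
rows = [
    {"id": 1, "name": "Asha", "city": "Pune"},
    {"id": 2, "name": "Ravi", "city": None},
]
for record in rows:
    key, cells = to_hbase_row(record, key_field="id")
    table[key] = cells

# Random read by row key, the access pattern HBase is built around.
print(table[b"1"])
```

Notice that there are no joins or foreign keys in this picture: everything you want to fetch together has to be reachable from the row key, which is usually the main design decision when moving from an RDBMS.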

HTH

Tariq answered Oct 10 '22 21:10
