Apache Drill vs Spark [closed]

Tags: apache-spark, hadoop, bigdata, apache-drill

I have some experience with Apache Spark and Spark SQL. I recently came across the Apache Drill project. Could you describe the most significant differences between them, and the advantages of each? I've already read "Fast Hadoop Analytics (Cloudera Impala vs Spark/Shark vs Apache Drill)", but the topic is still unclear to me.

838

asked Apr 22 '15 07:04

Matzz

1 Answer

Here's an article I came across that discusses some of the SQL technologies: http://www.zdnet.com/article/sql-and-hadoop-its-complicated/

Drill is fundamentally different in both the user's experience and the architecture. For example:

Drill is a schema-free query engine. For instance, you can point it at a directory of JSON or Parquet log files (on your local box, an NFS share, S3, HDFS, MapR-FS, etc.) and run a query. You don't have to load data, create and manage schemas or pre-process the data.
Drill uses a JSON document model internally, which allows it to query data of any structure. A lot of modern data is complex: a record can contain nested structures and arrays, and field names may themselves encode values such as timestamps or web page URLs. Drill lets ordinary BI tools operate on such data without requiring it to be flattened in advance.
Drill works with a variety of non-relational datastores, including Hadoop, NoSQL databases (MongoDB, HBase) and cloud storage. Additional datastores will be added.
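
To make the "point it at a directory and query" idea concrete, here is a minimal sketch in Python that sends SQL to Drill over its REST interface. It assumes a Drill instance on localhost with the default web port (8047) and the default `dfs` storage plugin enabled. The directory `/tmp/logs` and the nested field `client.country` are hypothetical names invented for illustration, not anything from the question.

```python
import json
import urllib.request

# Assumption: Drill is running locally and exposes its REST API on the default port.
DRILL_URL = "http://localhost:8047/query.json"

# Hypothetical example: /tmp/logs is a directory of JSON files whose records
# contain a nested object "client" with a "country" field. No schema is
# declared anywhere; Drill discovers the structure while reading the files.
# Drill needs the table alias (t) to reference nested fields.
SQL = (
    "SELECT t.client.country AS country, COUNT(*) AS hits "
    "FROM dfs.`/tmp/logs` t "
    "GROUP BY t.client.country"
)


def build_drill_request(sql, url=DRILL_URL):
    """Build (but do not send) the POST request that submits a SQL query."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_drill_query(sql, url=DRILL_URL):
    """Send the query to Drill and return the result rows as a list of dicts."""
    with urllib.request.urlopen(build_drill_request(sql, url)) as resp:
        return json.load(resp).get("rows", [])
```

With a Drill instance running, `run_drill_query(SQL)` should return one dict per country. The same SQL can also be pasted into the Drill shell or web console unchanged.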

Drill 1.0 was just released (May 19, 2015). You can easily download it onto your laptop and play with it without any infrastructure (Hadoop, NoSQL, etc.).

112

answered Oct 12 '22 13:10

Tomer Shiran
