I want to test and configure Impala with my own Hadoop 2.2.0 distribution, not Cloudera's. I want to know whether it's possible to use Impala without CDH, because everything I have read says that Impala is CDH dependent. I'm trying to follow the guide in the Impala GitHub repository - https://github.com/cloudera/impala - and I'll make whatever changes I can to get it working. Has anyone already done that, or is it really impossible?

Impala on Hadoop 2.2.0 without CDH?

1 Answer

I think there are two things here that should be addressed separately:

1. Running Impala on non-CDH Hadoop. This is possible, though Cloudera neither tests nor supports it. Other Hadoop distributions do include Impala: MapR's distribution ships Cloudera Impala, and Amazon has announced support for Impala on Elastic MapReduce. Both have tested that it works with their distributions. I assume you're not using MapR either, but my point is just that it is possible.
2. Running Impala on Hadoop 2.2.0. This is also possible: the CDH5 beta 1 release includes Hadoop 2.2.0, so Impala versions 1.2 and higher should work. Do make sure you use the latest version (1.2.3 as of this writing), because the last few minor releases contain a number of important fixes.
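
As a rough pre-flight check, the two version thresholds above (Hadoop 2.2.0, Impala 1.2) can be compared in a few lines of Python. This is only a sketch: the helper names `parse_version` and `hadoop_version` are made up for illustration, it assumes the `hadoop` command is on your PATH and that the first line of `hadoop version` looks like `Hadoop 2.2.0`, and the Impala release string is something you fill in by hand. Passing the check does not prove the two will actually work together.

```python
import re
import subprocess

# Thresholds taken from the answer above.
HADOOP_MINIMUM = (2, 2, 0)
IMPALA_MINIMUM = (1, 2, 0)


def parse_version(text):
    """Return the first dotted version number in text as a tuple of three ints."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    # A missing patch component (e.g. "1.2") is treated as 0.
    return tuple(int(part) for part in match.groups(default="0"))


def hadoop_version():
    """Run `hadoop version` and parse its first output line (assumed format)."""
    result = subprocess.run(
        ["hadoop", "version"], capture_output=True, text=True, check=True
    )
    return parse_version(result.stdout.splitlines()[0])


if __name__ == "__main__":
    impala_release = "1.2.3"  # assumption: the Impala release you plan to build
    print("Hadoop new enough:", hadoop_version() >= HADOOP_MINIMUM)
    print("Impala new enough:", parse_version(impala_release) >= IMPALA_MINIMUM)
```

Comparing tuples of integers rather than raw strings avoids the usual trap where `"1.10"` sorts before `"1.2"`.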

So yeah, it's possible, though it probably won't be a smooth installation and there isn't a lot of help for this use case. Good luck!

151

answered Nov 01 '22 16:11

Matt

Tags:

hadoop

cloudera

impala

BAndrade
