I have two Hive tables with the same structure (schema). What would be an efficient SQL query to concatenate them into a single table with the same structure?

Update: this works quite fast in my case: CREATE TABLE xy AS SELECT * FROM (SELECT * FROM x UNION ALL SELECT * FROM y) tmp;


Hive: Fast concatenate two tables into one?


2 Answers

If you are trying to merge table_A and table_b into a single table, the easiest way is to use the UNION ALL operator. The syntax and use cases are documented here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Union

answered Nov 16 '22 04:11

visakh

UNION ALL is a correct solution, but it might be expensive in terms of resources and time. I'd recommend creating a table with two partitions, one for table A and another for table B. This way, there is no need to merge (or UNION ALL). The merged table is available as soon as both partitions are populated.
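A minimal sketch of the two-partition idea, assuming the source tables are the x and y from the question. The column names (col1, col2), the partition column src, and the HDFS path in the commented alternative are hypothetical and only for illustration; substitute the real schema.

```sql
-- Hypothetical schema: replace col1/col2 with the actual columns of x and y.
CREATE TABLE xy (
  col1 STRING,
  col2 INT
)
PARTITIONED BY (src STRING);  -- src is an assumed partition column name

-- Load each source table into its own partition.
INSERT OVERWRITE TABLE xy PARTITION (src = 'x')
SELECT col1, col2 FROM x;

INSERT OVERWRITE TABLE xy PARTITION (src = 'y')
SELECT col1, col2 FROM y;

-- Alternative that avoids copying data, assuming the files already sit in
-- a directory whose format matches the table definition (path is hypothetical):
-- ALTER TABLE xy ADD PARTITION (src = 'x') LOCATION '/path/to/x';

-- Reading without a partition filter returns rows from both sources.
SELECT col1, col2 FROM xy;
```

Note that the partition column shows up as an extra column in SELECT * results, so the combined table is not strictly identical in structure to x and y; list the columns explicitly if that matters.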

answered Nov 16 '22 03:11

Tauruzzz

Tags:

concatenation

hadoop

hive

DarqMoth
