I have a big data set in Cassandra, which I analyze with Hive and then write to HDFS. Is it possible to group by appName and, depending on the appName, send the data to different HDFS directories? (Please note that the app names are not predefined.)
appName Data
a1 abc
a1 pqr
a1 qwe
a2 my
a2 data
a2 abc
a2 bnm
a3 ewr
a3 asf
a4 abc123
a1 dataset -> /apps/a1
a2 dataset -> /apps/a2
etc.
Dynamic partitions might suit you: https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-DynamicpartitionInsert
You won't be able to choose the exact path in HDFS, but different apps will go to different folders.
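
To make the folder layout concrete, here is a small Python sketch (not Hive itself) that mimics what a dynamic-partition insert would do with the sample rows from the question. It assumes the table location is /apps and the partition column is appName. The function name `partition_layout` is made up for illustration. Hive normally names partition folders `column=value` with the column name in lower case, which is why you get /apps/appname=a1 rather than /apps/a1.

```python
from collections import defaultdict

# Sample rows from the question: (appName, data)
ROWS = [
    ("a1", "abc"), ("a1", "pqr"), ("a1", "qwe"),
    ("a2", "my"), ("a2", "data"), ("a2", "abc"), ("a2", "bnm"),
    ("a3", "ewr"), ("a3", "asf"),
    ("a4", "abc123"),
]


def partition_layout(rows, table_location="/apps", partition_column="appName"):
    """Group rows by app name and map each group to a Hive-style partition folder.

    table_location and partition_column are assumptions for this sketch;
    the folder name follows the usual `column=value` convention.
    """
    layout = defaultdict(list)
    for app_name, data in rows:
        # App names are discovered from the data, not predefined.
        folder = f"{table_location}/{partition_column.lower()}={app_name}"
        layout[folder].append(data)
    return dict(layout)


if __name__ == "__main__":
    for folder, values in partition_layout(ROWS).items():
        print(folder, values)
```

Each distinct appName ends up in its own folder without being listed up front, which matches the "not predefined" requirement. If you need the exact /apps/a1 naming, you would probably have to rename or move the folders after the insert.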