
Beeline-Hive returns CSV with blank rows on top of data


My script does a simple job: it runs SQL from a file and saves the result to CSV.

The code runs, but the CSV output is odd. The data starts at around line 70 rather than at the very beginning of the file.

#!/bin/bash
beeline -u jdbc:hive2:default -n  -p  --silent=true --outputformat=csv2 -f code.sql > "file_$(date +%Y%m%d%H%M).csv"

I would like the file to start at the first row of actual data, without the blank rows above it.

1 blank;blank;blank
2 blank;blank;blank
3 blank;blank;blank
4 attr;attr;attr
5 data;data;data
6 data;data;data
7 data;data;data
8 data;data;data
9 data;data;data
marcin2x4 asked Aug 08 '19 11:08





1 Answer

As a workaround, I delete the empty lines in the next step of my automation:

 sed -i '/^$/d' file.txt 
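
The same filter could also be applied inside the pipeline, so the blank lines never reach the file. This is only a minimal sketch, assuming the unwanted rows are empty or whitespace-only; the function name strip_blank_lines is made up for illustration, and the demo uses stand-in data instead of a real beeline call.

```shell
#!/bin/bash
# Hypothetical helper (not part of beeline): drop lines that are empty
# or contain only whitespace, reading stdin and writing stdout.
strip_blank_lines() {
    sed '/^[[:space:]]*$/d'
}

# Demo with stand-in data that mimics blank rows above the header.
printf '\n\n  \nattr;attr;attr\ndata;data;data\n' | strip_blank_lines
```

In the script from the question, the helper would sit between the beeline command and the redirect, i.e. the beeline output is piped into strip_blank_lines and its output is redirected to the CSV file. This avoids the separate sed -i pass, but it does not explain why beeline emits the blank lines in the first place.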
marcin2x4 answered Sep 30 '22 20:09
