I have a tar.gz file of over 40 GB at https://ghtstorage.blob.core.windows.net/downloads/mysql-2016-06-16.tar.gz. How can I find the number of rows in the CSV file compressed inside this tar.gz without extracting the whole archive to disk, which might take 100+ GB?
If there is only one csv file in that tar.gz, you could do this as a bash one-liner:
tar -zxOf mysql-2016-06-16.tar.gz | wc -l
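It uses tar to extract every file in the archive to standard output (-O is a capital letter O, not a zero) and wc to count the lines. The whole archive is still decompressed, but only as a stream through the pipe, so nothing is written to disk.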
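Since tar can read the archive from standard input, the same pipeline can in principle be fed straight from a download, so the 40 GB file never has to be saved either. A minimal sketch, assuming curl is installed; the function name count_lines_in_targz is made up for this example:

```shell
# count_lines_in_targz (hypothetical helper name):
# reads a .tar.gz stream on standard input and prints the total
# number of lines across all files in the archive.
count_lines_in_targz() {
    # "-f -" tells tar to read the archive from standard input
    tar -zxOf - | wc -l
}

# Possible usage, streaming the download directly (not run here):
#   curl -sL https://ghtstorage.blob.core.windows.net/downloads/mysql-2016-06-16.tar.gz | count_lines_in_targz
```

If the download is interrupted, the count will be wrong or tar will report an error, so for a file this size it may still be safer to download first.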
If there are more files and you only want that one, name it after the archive to count the lines in just that file:
tar -zxOf mysql-2016-06-16.tar.gz mysql-2016-06-16/commit_comments.csv | wc -l
Here's how to list all files in the archive:
tar -ztf mysql-2016-06-16.tar.gz
CSV files usually have a header line, so subtract one from each file's line count to get the number of data rows.
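The header adjustment can be folded into a small function. This is only a sketch: csv_data_rows is a made-up name, and it assumes the file has exactly one header line, ends with a newline, and has no quoted fields containing line breaks (each embedded line break would be counted as an extra row).

```shell
# csv_data_rows ARCHIVE MEMBER (hypothetical helper name):
# prints the number of lines in one file inside a .tar.gz archive,
# minus one for an assumed single header line.
csv_data_rows() {
    archive=$1
    member=$2
    # Stream only the named member to stdout and count its lines
    lines=$(tar -zxOf "$archive" "$member" | wc -l)
    # Subtract the header line
    echo $(( lines - 1 ))
}

# Possible usage (not run here):
#   csv_data_rows mysql-2016-06-16.tar.gz mysql-2016-06-16/commit_comments.csv
```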