I have a large CSV file (7.3 GB; 16,300,000 lines). How can I split it into two files?
Have you taken a look at the split command? See its man page (man split) for more information.
This page contains an example use of this command.
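As a rough sketch of the line-count approach, the script below builds a tiny stand-in file and cuts it in half with split -l. The names sample.csv and part_ are placeholders chosen for illustration, not anything from the question.

```shell
# Small stand-in for the real file: 1 header line + 9 data rows (10 lines).
# sample.csv and the part_ prefix are placeholder names.
{ echo "id,value"; seq 1 9 | sed 's/$/,x/'; } > sample.csv

# Count the lines, then give each output file half of them (rounded up).
total=$(wc -l < sample.csv)
half=$(( (total + 1) / 2 ))

# -l sets the number of lines per output file; the pieces are named
# part_aa, part_ab, and so on.
split -l "$half" sample.csv part_

# Show how many lines landed in each piece.
wc -l part_aa part_ab
```

For the file in the question, half of 16,300,000 lines is 8,150,000, so the equivalent call would be along the lines of split -l 8150000 yourfile.csv part_ (with yourfile.csv standing in for the real name). Note that only the first piece contains the header row.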
Aside: the man -k command is rather useful for finding Unix/Linux commands if you aren't quite sure what the specific command is. Specify a keyword with the man -k command and the system will pull out related commands. E.g.,
% man -k split
will yield:
csplit (1) - split a file into sections determined by context lines
dirsplit (1) - splits directory into multiple with equal size
dpkg-split (1) - Debian package archive split/join tool
gpgsplit (1) - Split an OpenPGP message into packets
pnmsplit (1) - split a multi-image portable anymap into multiple single-image files
ppmtoyuvsplit (1) - convert a portable pixmap into 3 subsampled raw YUV files
split (1) - split a file into pieces
splitdiff (1) - separate out incremental patches
splitfont (1) - extract characters from an ISO-type font.
URI::Split (3pm) - Parse and compose URI strings
wcstok (3) - split wide-character string into tokens
yuvsplittoppm (1) - convert a Y- and a U- and a V-file into a portable pixmap
zipsplit (1) - split a zipfile into smaller zipfiles
split -d -n l/N filename.csv tempfile.part.
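splits the file into N files without splitting lines (use N = 2 for two files). The -n l/N form requires GNU split; -d gives the pieces numeric suffixes, so they are named tempfile.part.00, tempfile.part.01, and so on. As mentioned in the comments above, the header is not repeated in each file.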
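If every piece needs the header row, one possible workaround is to split only the data rows and then prepend the header to each piece. The sketch below assumes the first line is the header and that no quoted field contains an embedded newline; sample.csv, body_, and half_ are placeholder names.

```shell
# Small stand-in for the real file: 1 header line + 9 data rows.
# sample.csv, body_ and half_ are placeholder names.
{ echo "id,value"; seq 1 9 | sed 's/$/,x/'; } > sample.csv

# Save the header and work out how many data rows go into each half.
header=$(head -n 1 sample.csv)
rows=$(( $(wc -l < sample.csv) - 1 ))
half=$(( (rows + 1) / 2 ))

# Split only the data rows (everything after line 1); "-" reads from stdin.
tail -n +2 sample.csv | split -l "$half" - body_

# Prepend the header to each piece, then drop the headerless temporary.
for piece in body_*; do
    { printf '%s\n' "$header"; cat "$piece"; } > "half_${piece#body_}.csv"
    rm "$piece"
done
```

This leaves half_aa.csv and half_ab.csv, each starting with the header line. On a multi-gigabyte file the data is written twice (once by split, once when the header is prepended), so enough free disk space for a temporary extra copy is needed.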