I have a file, say all, with 2000 lines, and I hope it can be split into 4 small files with lines 1~500, 501~1000, 1001~1500, and 1501~2000.
Perhaps I can do this using:
cat all | head -500 >small1
cat all | tail -1500 | head -500 >small2
cat all | tail -1000 | head -500 >small3
cat all | tail -500 >small4
But this way involves calculating line numbers, which may lead to errors when the line count doesn't divide evenly, or when we want to split the file into many small files (e.g. a file all with 3241 lines that we want to split into 7 files, each with about 463 lines).
Is there a better way to do this?
To split a file into pieces, you simply use the split command. By default, the split command uses a very simple naming scheme. The file chunks will be named xaa, xab, xac, etc., and, presumably, if you break up a file that is sufficiently large, you might even get chunks named xza and xzz.
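As a minimal sketch of that naming scheme, splitting the question's 2000-line file with no options produces two pieces of the default 1,000 lines each:

split all
ls x*
# xaa  xab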
We can use sed with the w option to split a file into multiple files. Files can be split by specifying a line address or a pattern.
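For instance, a sketch of the sed approach for the question's 2000-line case, writing to the same small1 through small4 names the question uses. The -n flag suppresses sed's normal output so each line goes only to the file named by its matching w command, and each range needs its own -e expression because the filename after w extends to the end of the expression:

sed -n -e '1,500w small1' \
       -e '501,1000w small2' \
       -e '1001,1500w small3' \
       -e '1501,2000w small4' all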
The basic forms are split -l linenumber and split -b bytes. If you use the -l (a lowercase L) option, replace linenumber with the number of lines you'd like in each of the smaller files (the default is 1,000). If you use the -b option, replace bytes with the number of bytes you'd like in each of the smaller files.
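A quick sketch of both options on the same 2000-line file (the sizes here are just arbitrary examples):

split -l 250 all    # eight 250-line pieces: xaa through xah
split -b 10k all    # 10 KiB pieces, ignoring line boundaries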
When you want to split a file, use split:
split -l 500 all all
will split the file into several files that each have 500 lines. If you want to split the file into 4 files of roughly the same size, use something like:
split -l $(( $( wc -l < all ) / 4 + 1 )) all all
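Applied to the question's second example (3241 lines into 7 files), the same formula reads:

split -l $(( $( wc -l < all ) / 7 + 1 )) all all

which yields six 464-line files plus a 457-line remainder. If your split is the GNU coreutils version, the -n option can also do the division itself: l/7 asks for seven line-based chunks of roughly equal size, without splitting any line across files:

split -n l/7 all all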