I have a folder that contains multiple text files. I'm trying to split all text files at 10000 line per file while keeping the base file name i.e. if filename1.txt contains 20000 lines the output will be filename1-1.txt (10000 lines) and filename1-2.txt (10000 lines).
I tried to use split -10000 filename1.txt
but this is not keeping the base filename and i have to repeat the command for each text file in the folder. I also tried doing for f in *.txt; do split -10000 $f.txt; done
. This didn't work too.
Any idea how can i do this? Thanks.
To split a file into pieces, you simply use the split command. By default, the split command uses a very simple naming scheme. The file chunks will be named xaa, xab, xac, etc., and, presumably, if you break up a file that is sufficiently large, you might even get chunks named xza and xzz.
Right-click the file and select the Split operation from the program's context menu. This opens a new configuration window where you need to specify the destination for the split files and the maximum size of each volume. You can select one of the pre-configured values or enter your own into the form directly.
for f in filename*.txt; do split -d -a1 -l10000 --additional-suffix=.txt "$f" "${f%.txt}-"; done
Or, written over multiple lines:
for f in filename*.txt
do
split -d -a1 -l10000 --additional-suffix=.txt "$f" "${f%.txt}-"
done
How it works:
-d
tells split
to use numeric suffixes
-a1
tells split
to start with only single digits for the suffix.
-l10000
tells split
to split every 10,000 lines.
--additional-suffix=.txt
tells split
to add .txt
to the end of the names of the new files.
"$f"
tells split
the name of the file to split.
"${f%.txt}-"
tells split
the prefix name to use for the split files.
Suppose that we start with these files:
$ ls
filename1.txt filename2.txt
Then we run our command:
$ for f in filename*.txt; do split -d -a1 -l10000 --additional-suffix=.txt "$f" "${f%.txt}-"; done
When this is done, we now have the original files and the new split files:
$ ls
filename1-0.txt filename1-1.txt filename1.txt filename2-0.txt filename2-1.txt filename2.txt
split
If your split does not offer --additional-suffix
, then consider:
for f in filename*.txt
do
split -d -a1 -l10000 "$f" "${f%.txt}-"
for g in "${f%.txt}-"*
do
mv "$g" "$g.txt"
done
done
No need for shell loops, just one simple awk command does it for all files:
awk 'FNR%1000==1{if(FNR==1)c=0; close(out); out=FILENAME; sub(/.txt/,"-"++c".txt)} {print > out}' *
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With