To split a file into pieces, you simply use the split command. By default, split uses a very simple naming scheme: the chunks are named xaa, xab, xac, and so on, and, presumably, if you break up a sufficiently large file, you might even end up with chunks named xza or xzz.
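By way of illustration (bigfile.log is just a placeholder name; GNU split defaults to 1000 lines per piece when no options are given):
split bigfile.log
ls x*
# xaa  xab  xac  ...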
First, right-click the file you want to split into smaller pieces, then select 7-Zip > Add to Archive. Give your archive a name. Under "Split to volumes, bytes", enter the size of the split files you want. There are several preset sizes in the dropdown menu, although they may not match the sizes you need for your large file.
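If you prefer the command line and have the 7z executable installed, the same split-to-volumes behaviour is available through the -v switch (the archive name and the 100m volume size here are just examples):
7z a -v100m archive.7z largefile.bin
This writes archive.7z.001, archive.7z.002, and so on, which 7-Zip can reassemble and extract later.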
To split a file equally into two files, use the '-n' option. Specifying '-n 2' splits the file into two equal pieces.
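A minimal example (bigfile.log and the half_ prefix are placeholder names):
split -n 2 bigfile.log half_
This produces half_aa and half_ab, each roughly half the size of the original. Note that a plain -n 2 splits by bytes, so a line may be cut in the middle; use -n l/2 (described further down) to avoid that.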
If you mean an equal number of lines, split has an option for this:
split --lines=75
If you need to know what that 75 should really be for N equal parts, it is:
lines_per_part = (total_lines + N - 1) / N    (integer division, i.e. rounding up)
where total_lines can be obtained with wc -l.
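For example, with total_lines = 70 and N = 6, lines_per_part = (70 + 6 - 1) / 6 = 75 / 6 = 12 under integer division, which matches what the script below computes.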
See the following script for an example:
#!/usr/bin/bash
# Configuration stuff
fspec=qq.c
num_files=6
# Work out lines per file.
total_lines=$(wc -l <${fspec})
((lines_per_file = (total_lines + num_files - 1) / num_files))
# Split the actual file, maintaining lines.
split --lines=${lines_per_file} ${fspec} xyzzy.
# Debug information
echo "Total lines = ${total_lines}"
echo "Lines per file = ${lines_per_file}"
wc -l xyzzy.*
This outputs:
Total lines = 70
Lines per file = 12
12 xyzzy.aa
12 xyzzy.ab
12 xyzzy.ac
12 xyzzy.ad
12 xyzzy.ae
10 xyzzy.af
70 total
More recent versions of split allow you to specify a number of CHUNKS with the -n/--number option. You can therefore use something like:
split --number=l/6 ${fspec} xyzzy.
(that's ell-slash-six, meaning lines, not one-slash-six).
That will give you roughly equal files in terms of size, with no mid-line splits.
I mention that last point because it gives you roughly the same number of characters in each file, not the same number of lines.
So, if you have one 20-character line and nineteen 1-character lines (twenty lines in total) and split into five files, you most likely won't get four lines in every file.
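You can see this for yourself with a tiny test file (demo.txt and the demo_ prefix are arbitrary names used only for this illustration):
printf 'aaaaaaaaaaaaaaaaaaaa\n' > demo.txt   # one 20-character line
yes a | head -n 19 >> demo.txt               # nineteen 1-character lines
split -n l/5 demo.txt demo_
wc -l demo_*                                 # the per-file line counts come out uneven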
The script isn't even necessary; split(1) supports the wanted feature out of the box:
split -l 75 auth.log auth.log.
The above command splits the file into chunks of 75 lines apiece, and outputs files of the form auth.log.aa, auth.log.ab, ...
Running wc -l on the original file and the output gives:
321 auth.log
75 auth.log.aa
75 auth.log.ab
75 auth.log.ac
75 auth.log.ad
21 auth.log.ae
642 total
A simple solution for a simple question:
split -n l/5 your_file.txt
no need for scripting here.
From the man page, CHUNKS may be:
l/N split into N files without splitting lines
Update
Not all Unix distributions include this flag. For example, it will not work on OS X out of the box. To use it there, you can consider replacing the Mac OS X utilities with the GNU core utilities.
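For example, with Homebrew (Homebrew installs GNU coreutils with a g prefix by default, so the command becomes gsplit):
brew install coreutils
gsplit --number=l/6 qq.c xyzzy.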
split was updated in coreutils release 8.8 (announced 22 Dec 2010) with the --number option to generate a specific number of files. The option --number=l/n generates n files without splitting lines.
coreutils manual
I made a bash script that, given a number of parts as input, splits a file:
#!/bin/sh
# Usage: mysplitXparts.sh <input_file> <number_of_parts>
input="$1"
parts_total="$2"
parts=$((parts_total))

for i in $(seq 0 $((parts_total - 2))); do
    lines=$(wc -l "$input" | cut -f 1 -d" ")
    # n is the remaining lines divided by the remaining parts, rounded up:
    # 1.3 -> 2, 1.6 -> 2, 1 -> 1
    n=$(awk -v lines="$lines" -v parts="$parts" 'BEGIN {
        n = lines / parts;
        rounded = sprintf("%.0f", n);
        if (n > rounded + 0)    # + 0 forces a numeric comparison
            print rounded + 1;
        else
            print rounded;
    }')
    # Take the first n lines as this part, keep the rest for the next round.
    head -$n "$input" > split${i}
    tail -$((lines - n)) "$input" > .tmp${i}
    input=".tmp${i}"
    parts=$((parts - 1))
done

# Whatever is left over becomes the last part.
mv .tmp$((parts_total - 2)) split$((parts_total - 1))
rm .tmp*
I used the head and tail commands, storing the intermediate remainders in temporary files, to split the file.
# the trailing 10 means 10 parts
sh mysplitXparts.sh input_file 10
Or with awk, where perc=0.1 (10%) gives 10 parts and perc=0.334 gives 3 parts:
awk -v size=$(wc -l < input) -v perc=0.1 '{
    # index of the output file this line goes to
    nfile = int(NR / (size * perc));
    # put any leftover lines into the last file instead of starting a new one
    if (nfile >= 1 / perc) {
        nfile--;
    }
    print > ("split_" nfile)
}' input
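As a quick sanity check on the result (split_0, split_1, ... are the files the awk command above writes):
wc -l split_*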