Split file into unequal chunks in Linux

Tags: linux, bash, unix

I want to split a large file (about 17 million lines of strings) into multiple files, with a different number of lines in each chunk. Is it possible to pass an array of line counts to the 'split -l' command, like this:

[
 1=>1000000,
 2=>1000537,
 ...
]

so that each chunk receives the corresponding number of lines?

alpha_cod asked Feb 05 '13 22:02



2 Answers

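Use a compound command, so that every command inside the braces reads from the same redirected input and each head picks up where the previous one stopped: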

{
  head -n 10000 > output1
  head -n   200 > output2
  head -n  1234 > output3
  cat > remainder
} < yourbigfile

This also works with loops:

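{
  i=1
  for n in 10000 200 1234
  do
      # Each head resumes where the previous one stopped reading.
      head -n "$n" > "output$i"
      i=$((i + 1))
  done
  # Whatever is left after the listed chunks.
  cat > remainder
} < yourbigfile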

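This relies on each head leaving the file offset just after the last line it printed, as GNU head does when its input is a regular file. It does not work on OS X, where head reads ahead and discards the extra input.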
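
Where head over-reads like that, a single awk pass is one way to avoid depending on the shared file offset. The following is only a minimal sketch: the helper name split_by_counts, the "remainder" suffix, and the 12-line demo file are assumptions made for illustration, not part of the answer above.

```shell
#!/bin/sh
# Hypothetical helper: split_by_counts FILE PREFIX COUNT...
# Writes PREFIX1, PREFIX2, ... holding COUNT lines each, and any
# leftover lines to PREFIXremainder, in one pass over FILE.
split_by_counts() {
    file=$1 prefix=$2
    shift 2
    awk -v prefix="$prefix" -v counts="$*" '
        BEGIN {
            n = split(counts, size, " ")   # size[1..n] = lines per chunk
            chunk = 1
            left = (n > 0) ? size[1] : 0
            out = prefix chunk
        }
        {
            # Advance to the next chunk whenever the current one is full.
            while (chunk <= n && left == 0) {
                close(out)
                chunk++
                if (chunk <= n) { left = size[chunk]; out = prefix chunk }
            }
            # Past the last requested chunk: everything goes to the remainder.
            if (chunk > n) { print > (prefix "remainder"); next }
            print > out
            left--
        }
    ' "$file"
}

# Demo on a small generated file (12 lines) in a scratch directory.
dir=$(mktemp -d)
seq 1 12 > "$dir/input"
split_by_counts "$dir/input" "$dir/output" 3 2 4
wc -l "$dir"/output*
```

Unlike the head version, a count of 0 creates no file here, and awk keeps only one chunk open at a time because each finished chunk is closed before the next one starts.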

that other guy answered Sep 28 '22 08:09


The split command does not have that capability, so you'll have to use a different tool, or write one of your own.
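
One candidate for such a different tool is csplit, which splits at given line numbers rather than by a fixed chunk size. The sketch below assumes the per-chunk counts are known up front; the counts 3 2 4, the chunk prefix, and the 12-line demo file are illustrative assumptions.

```shell
#!/bin/sh
# Demo input: 12 numbered lines in a scratch directory.
dir=$(mktemp -d)
seq 1 12 > "$dir/input"

# csplit expects the line number at which each new piece starts,
# so convert per-chunk counts into cumulative start lines.
start=1
args=
for n in 3 2 4
do
    start=$((start + n))
    args="$args $start"
done

# -s: do not print piece sizes; -f: prefix for the output file names.
# $args is deliberately unquoted so it splits into separate operands.
csplit -s -f "$dir/chunk" "$dir/input" $args
ls "$dir"
```

The pieces are named chunk00, chunk01, and so on, with the lines after the last listed chunk landing in the final piece. If a start line lies beyond the end of the file, csplit reports an error and by default removes the pieces it created; the -k option keeps them.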

Jim Lewis answered Sep 28 '22 06:09