
Fast deletion of a line at a given index from a file

Tags: shell, sed, awk

I have a HUGE file of 10G. I want to remove line 188888 from this file.

I use sed as follows:

sed -i '188888d' file

The problem is that it is really slow. I understand that this is because of the size of the file, but is there any way to do it faster?

Thanks

Amir asked Nov 29 '25 22:11


2 Answers

Try keeping the script as simple as it already is:

sed -i '188888d' file

Adding a q to quit early does not help here. In {d;q} the d ends the cycle at once, so q is never reached; in {q;d} sed quits at line 188888 and the remaining lines are never copied, which truncates the file.

sed only has to count lines, but it still has to spend the time re-writing the whole file. It would also be worth testing

sed '188888d' file > /path/to/alternate/mountpoint/newFile

where the alternate mountpoint is on a separate disk drive.
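As a rough check of how d and q behave, here is a small sketch on a ten-line stand-in file. The temporary directory, the file names and the line number 4 are hypothetical, chosen only for the demonstration.

```shell
# Hypothetical small stand-in for the 10G file: ten numbered lines.
tmpdir=$(mktemp -d)
seq 1 10 > "$tmpdir/file"

# Delete line 4 while writing to a new file (ideally on another disk).
sed '4d' "$tmpdir/file" > "$tmpdir/newFile"

# With q placed before d, sed prints line 4 and quits,
# so lines 5..10 never reach the output.
sed '4{q;d;}' "$tmpdir/file" > "$tmpdir/truncated"

wc -l < "$tmpdir/newFile"     # 9 lines remain
wc -l < "$tmpdir/truncated"   # only 4 lines survive
```

The second command is there only to show the failure mode; the first one is the form to use.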

Final edit: Ah, one other option would be to edit the data while it is being written, through a pipe

yourLogFileProducingProgram | sed '188888d' > logFile

(no -i here, since sed is reading from the pipe rather than from a file). But this assumes that you know the data you want to delete is always at line 188888. Is that possible?

I hope this helps.

shellter answered Dec 02 '25 04:12


Lines are delimited by the \n character. If the lines vary in length, you cannot calculate the byte offset of a given line from its number; you have to read the file and count newlines.

Finding the line is therefore O(n) in the number of bytes that precede it, and deleting it also means rewriting every byte that follows it, so the whole operation is O(n) where n is the number of bytes in the file.

Parallel algorithms do not help either, because the operation is disk I/O bound; divide and conquer will be even slower.
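To make the counting concrete, here is a minimal sketch that finds where a line starts by reading everything before it, then copies the bytes on either side of it. The four-line sample file and the variable names are hypothetical, and head -c is assumed to be available (it is in GNU and BSD userlands but is not required by POSIX).

```shell
file=$(mktemp); out=$(mktemp)
printf 'alpha\nbeta\ngamma\ndelta\n' > "$file"
n=3   # line to delete

# Reading lines 1..n-1 is the only way to learn where line n starts.
offset=$(head -n "$((n - 1))" "$file" | wc -c | tr -d ' ')

# Size of line n itself, including its trailing newline.
len=$(sed -n "${n}{p;q;}" "$file" | wc -c | tr -d ' ')

# Copy the bytes before line n, then everything after it.
{
  head -c "$offset" "$file"
  tail -c "+$((offset + len + 1))" "$file"
} > "$out"
```

Both halves still have to be written out, so this does not beat O(n); it only shows where the cost comes from.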

If you will do this a lot on the same file, there are ways to preprocess the file and make it faster.

An easy way is to build an index with

line#:offset

And when you want to find a line, do a binary search (O(log n)) in the index for the line number you want, and use the offset to locate the line in the original file.
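A minimal sketch of such an index, assuming \n-terminated lines and a byte-oriented locale. The sample data and variable names are hypothetical, and the lookup below is a plain linear awk scan standing in for the binary search, which would be easier over fixed-width index records.

```shell
data=$(mktemp); index=$(mktemp)
printf 'alpha\nbeta\ngamma\ndelta\n' > "$data"

# Build "line#:offset" records; LC_ALL=C makes length() count bytes,
# and the +1 accounts for each line's trailing newline.
LC_ALL=C awk 'BEGIN { off = 0 } { print NR ":" off; off += length($0) + 1 }' \
  "$data" > "$index"

# Look up the byte offset of line 3 (linear scan, not a binary search).
n=3
offset=$(awk -F: -v n="$n" '$1 == n { print $2; exit }' "$index")

# Seek straight to that byte (tail -c counts from 1) and read one line.
line=$(tail -c "+$((offset + 1))" "$data" | head -n 1)
printf '%s\n' "$line"
```

The index makes locating a line cheap, but deleting it still means rewriting the bytes that follow, and the index has to be rebuilt or adjusted afterwards.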

Desmond Zhou answered Dec 02 '25 05:12


