I have a HUGE file of 10G. I want to remove line 188888 from this file.
I use sed as follows:
sed -i '188888d' file
The problem is that it is really slow. I understand this is because of the size of the file, but is there any way to do it faster?
Thanks
Try
sed -i '188888{d;q}' file

Two caveats, though: d deletes the pattern space and starts the next cycle immediately, so a q placed after it on the same address never actually runs; and if you do make sed quit early (for example with GNU sed's 188888Q), it also stops copying, so everything after that line is lost. Quitting early only helps when you can afford to discard the rest of the file; otherwise you still have to spend the time re-writing the whole file. It would also be worth testing

sed '188888d' file > /path/to/alternate/mountpoint/newFile

where the alternate mountpoint is on a separate disk drive, so the reads and writes do not compete for the same device.
Final edit: one other option would be to delete the line while the file is being written, by filtering the stream through a pipe

yourLogFileProducingProgram | sed '188888d' > logFile

(-i is dropped here: sed is reading from a pipe, not editing a file in place.) But this assumes that the data you want to delete always lands at line 188888; is that possible?
I hope this helps.
Lines in a file are delimited by \n characters; if line lengths are variable, you cannot calculate the byte offset of a given line number directly, you have to count the newlines in front of it.
That will always be O(n), where n is the number of bytes in the file.
Parallel algorithms do not help either, because this operation is disk-I/O bound; divide and conquer would be even slower.
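For reference, the straight O(n) rewrite looks like this in Python (a minimal sketch; the function and file names are placeholders, not an existing tool):

```python
def delete_line(src, dst, lineno):
    """Copy src to dst, skipping the 1-based line `lineno`.

    Streams the whole file once, so it is O(n) in the file size.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for i, line in enumerate(fin, start=1):
            if i != lineno:
                fout.write(line)
```

Pointing dst at a different physical disk, as suggested above for the sed variant, keeps the read and write streams from competing.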
If you will do this often on the same file, there are ways to preprocess it and make lookups faster.
An easy way is to build an index of
line#:offset
pairs. When you want to find a line, do a binary search, O(log n), over the index for the line number you want, then use the offset to seek straight to that line in the original file.
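A minimal sketch of such an index in Python, using sparse checkpoints so the index stays small, with a bisect lookup (all names here are illustrative, not part of any library):

```python
from bisect import bisect_right

def build_index(path, every=1000):
    """Record a (line_number, byte_offset) checkpoint every `every` lines."""
    index = [(1, 0)]  # line 1 starts at byte offset 0
    offset = 0
    with open(path, "rb") as f:
        for n, line in enumerate(f, start=1):
            offset += len(line)
            if n % every == 0:
                index.append((n + 1, offset))
    return index

def read_line(f, index, lineno):
    """Seek to the nearest checkpoint at or before lineno, then skip forward."""
    i = bisect_right(index, (lineno, float("inf"))) - 1
    start, offset = index[i]
    f.seek(offset)
    for _ in range(lineno - start):
        f.readline()
    return f.readline()
```

A full line#:offset table gives O(1) lookups at the cost of one entry per line; the checkpoint spacing trades index size against how far read_line has to scan after seeking.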