Hi I have a bit of a vague question...
I want to build a tool that searches through log files, with the following functionality:
1) Search through the log file until a given log line is found.
2) After finding the line in 1), read forward an unknown number of lines until a condition is met. At that point the data is used to do some computation.
3) After completing 2), return to the line found in 1) and continue through the file from there.
I can do 1) and 2) fairly easily by just looping over each line:
for line in file:
    ...  # check the line
For 3) I was going to use something like file.seek(linenum) and then continue iterating over the lines. Is there a more efficient way to do any of the steps above?
thanks
For files this is easy enough to solve by using tell and seek:
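o = open(myfile)
# read some lines (with o.readline(); in Python 3, tell() raises OSError inside a "for line in o" loop on a text file)
last_position = o.tell()
# read more lines
o.seek(last_position)
# read the same lines again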
Note that, contrary to what your question suggests, seek does not take a line number; it takes a byte offset. For ASCII files a byte offset is also a character offset, but that does not hold for multi-byte encodings such as UTF-8.
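There's no "more efficient" way of doing this, AFAIK. This is extremely efficient from the OS, memory, CPU and disk perspectives. It's a bit clumsy from a programming standpoint, but unfortunately Python does not offer a cheap standard way to clone a file iterator: itertools.tee can duplicate an iterator, but it buffers in memory every line that one copy has consumed and the other has not.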
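To make the three steps from the question concrete, here is a minimal sketch of the tell/seek approach. The function name process_log and the two predicates is_start and is_end are hypothetical, made up for illustration; the "computation" is reduced to recording the pair of lines. It assumes the file is opened in binary mode, so offsets are plain byte counts and lines come back as bytes.

```python
def process_log(path, is_start, is_end):
    """Hypothetical helper illustrating steps 1) to 3) with tell()/seek().

    is_start and is_end are caller-supplied predicates on a bytes line.
    Returns a list of (start_line, end_line) pairs.
    """
    results = []
    # Binary mode: tell() and seek() work with plain byte offsets.
    with open(path, "rb") as f:
        while True:
            # readline() rather than a for loop, so tell() is always usable.
            line = f.readline()
            if not line:  # end of file
                break
            if not is_start(line):  # step 1: look for the start line
                continue

            # Remember the byte offset just after the start line.
            resume_at = f.tell()

            # Step 2: read ahead until the condition is met (or EOF).
            while True:
                ahead = f.readline()
                if not ahead:
                    break
                if is_end(ahead):
                    # Stand-in for the real computation.
                    results.append((line, ahead))
                    break

            # Step 3: jump back and continue right after the start line.
            f.seek(resume_at)
    return results
```

It could be called as process_log("app.log", lambda l: l.startswith(b"START"), lambda l: l.startswith(b"END")), where the file name and markers are placeholders. Note that every start line triggers its own forward scan, so a file with many start lines and distant end conditions is re-read repeatedly.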
This answer implements an efficient line-based reader for huge files: https://stackoverflow.com/a/23646049/34088