
Python - How can I open a file and specify the offset in bytes?

I'm writing a program that will parse an Apache log file periodically to log its visitors, bandwidth usage, etc.

The problem is, I don't want to open the log and parse data I've already parsed. For example:

line1
line2
line3

If I parse that file, I'll save all the lines then save that offset. That way, when I parse it again, I get:

line1
line2
line3 - The log will open from this point
line4
line5

Second time round, I'll get line4 and line5. Hopefully this makes sense...

What I need to know is, how do I accomplish this? Python has the seek() function to specify the offset... So do I just get the filesize of the log (in bytes) after parsing it then use that as the offset (in seek()) the second time I log it?

I can't seem to think of a way to code this >.<

dave asked Jul 21 '10 12:07



2 Answers

You can manage the position in the file with the seek and tell methods of the file class; see https://docs.python.org/2/tutorial/inputoutput.html

The tell method returns the current position in the file, which is the offset to pass to seek the next time you open it.
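
As a rough illustration of how the two methods pair up (a minimal sketch: the file name demo.log and its contents are made up for the example, and the file is opened in binary mode so that the offsets are plain byte counts):

```python
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.log")
with open(path, "wb") as f:
    f.write(b"line1\nline2\nline3\n")

with open(path, "rb") as f:
    first = f.readline()   # consume the first line
    offset = f.tell()      # byte offset just past that line

with open(path, "rb") as f:
    f.seek(offset)         # jump straight to the saved offset
    rest = f.read()        # everything that has not been read yet
```

Here offset ends up being the length in bytes of the first line including its newline, so the second open skips exactly that line.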

luc answered Oct 14 '22 16:10


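# First run: read a line, then remember where we stopped.
log = open('myfile.log')
pos = open('pos.dat', 'w')
print(log.readline())
pos.write(str(log.tell()))  # offset just past the line we read
log.close()
pos.close()

# Later run: jump back to the saved offset and continue from there.
log = open('myfile.log')
pos = open('pos.dat')
log.seek(int(pos.readline()))
print(log.readline())
log.close()
pos.close()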

Of course you shouldn't use it like that - you should wrap the operations up in functions like save_position(myfile) and load_position(myfile), but the functionality is all there.
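
One way the wrapping might look, as a sketch rather than a finished implementation. The names save_position and load_position come from the suggestion above, but their signatures here are my own assumption, as are the read_new_lines helper, the ".pos" side file, and the reset when the log has shrunk. The log is opened in binary mode so the offsets are byte counts.

```python
import os

def load_position(pos_path):
    """Return the saved byte offset, or 0 if nothing usable is stored."""
    try:
        with open(pos_path) as f:
            return int(f.read().strip() or 0)
    except (IOError, ValueError):
        return 0

def save_position(pos_path, offset):
    """Store the byte offset as text in a small side file."""
    with open(pos_path, "w") as f:
        f.write(str(offset))

def read_new_lines(log_path, pos_path=None):
    """Return the lines added since the last call and update the offset."""
    pos_path = pos_path or log_path + ".pos"  # hypothetical naming scheme
    offset = load_position(pos_path)
    if offset > os.path.getsize(log_path):
        offset = 0  # assumption: a smaller file means it was rotated
    with open(log_path, "rb") as log:
        log.seek(offset)
        lines = log.readlines()
        save_position(pos_path, log.tell())
    return [line.decode("utf-8", "replace").rstrip("\r\n") for line in lines]
```

Calling read_new_lines('myfile.log') on each run would then return only what was appended since the previous run. Note that a last line still being written by the server would be returned as-is, so a stricter version might only consume lines that end in a newline.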

Wayne Werner answered Oct 14 '22 14:10