How to find the byte position of a specific line in a file

Tags: linux, bash, command-line

What's the fastest way to find the byte position of a specific line in a file, from the command line?

For example:

$ linepos myfile.txt 13
5283

I'm writing a parser for a CSV file that is several GB in size, and if the parser is halted I'd like to be able to resume from the last position. The parser is in Python, but even iterating over file.readlines() takes a long time, since there are millions of rows in the file. I'd like to simply do file.seek(int(commands.getoutput("linepos myfile.txt %i" % lastrow))), but I can't find a shell command that does this efficiently.

Edit: Sorry for the confusion, but I'm looking for a non-Python solution. I already know how to do this from Python.

asked Feb 04 '14 17:02

Cerin

1 Answer

From @chepner's comment on my other answer:

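position = 0  # or wherever you left off last time
try:
    # Binary mode: len(line) is then a byte count that seek() can reuse.
    # (Calling file.tell() inside the loop is unreliable because of
    # read-ahead buffering, and raises an error in Python 3 text mode.)
    with open('myfile.txt', 'rb') as file:
        file.seek(position)  # zero in base case
        for line in file:
            # process the line
            position += len(line)  # byte offset of the next unprocessed line
except BaseException:  # includes KeyboardInterrupt, i.e. a manual halt
    # position still points at the start of the line that was not finished
    print('exception occurred at position {}'.format(position))
    raise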
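
For the shell-only lookup the question asks about, one possible sketch (not part of the original answer) is to count the bytes of every line before the target one: the total is the offset at which the target line starts. The function name linepos is taken from the question's hypothetical command; the variable names are assumptions for illustration. It still has to read everything before the requested line, since a plain text file keeps no index of line positions, but head stops reading once it has emitted those lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the byte offset at which line N (1-based) starts.
# Usage: linepos FILE N
linepos() {
    local file=$1 line=$2
    # head emits the first N-1 lines; wc -c counts their bytes,
    # which equals the offset of the first byte of line N.
    # tr strips the space padding that some wc implementations add.
    head -n "$((line - 1))" -- "$file" | wc -c | tr -d ' '
}
```

After sourcing the function, linepos myfile.txt 13 prints a number that can be passed straight to file.seek(). Asking for line 1 prints 0, because no lines precede it.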

answered Oct 13 '22 20:10

mhlester
