Reading data blocks from a file in Python

Tags: python, numpy, block

I'm new to Python and am trying to read "blocks" of data from a file. The file looks something like this:

# Some comment
# 4 cols of data --x,vx,vy,vz
# nsp, nskip =           2          10


#            0   0.0000000


#            1           4
 0.5056E+03  0.8687E-03 -0.1202E-02  0.4652E-02
 0.3776E+03  0.8687E-03  0.1975E-04  0.9741E-03
 0.2496E+03  0.8687E-03  0.7894E-04  0.8334E-03
 0.1216E+03  0.8687E-03  0.1439E-03  0.6816E-03


#            2           4
 0.5056E+03  0.8687E-03 -0.1202E-02  0.4652E-02
 0.3776E+03  0.8687E-03  0.1975E-04  0.9741E-03
 0.2496E+03  0.8687E-03  0.7894E-04  0.8334E-03
 0.1216E+03  0.8687E-03  0.1439E-03  0.6816E-03


#          500  0.99999422


#            1           4
 0.5057E+03  0.7392E-03 -0.6891E-03  0.4700E-02
 0.3777E+03  0.9129E-03  0.2653E-04  0.9641E-03
 0.2497E+03  0.9131E-03  0.7970E-04  0.8173E-03
 0.1217E+03  0.9131E-03  0.1378E-03  0.6586E-03

and so on

Now I want to be able to specify and read only one block of data out of these many blocks. I'm using numpy.loadtxt('filename', comments='#') to read the data, but it loads the whole file in one go. I searched online and found that someone has written a patch for the numpy I/O routine that allows reading specific blocks, but it's not in mainstream numpy.

It's much easier to choose blocks of data in gnuplot, but I'd have to write the routine to plot the distribution functions. If I can figure out how to read specific blocks, it would be much easier in Python. Also, I'm moving all my visualization code from IDL and gnuplot to Python, so it would be nice to have everything in Python instead of scattered across multiple packages.

I thought about calling gnuplot from within Python, plotting a block to a table and assigning the output to an array in Python. But I'm just starting out and could not figure out the syntax to do it.

Any ideas or pointers for solving this problem would be a great help.

asked May 09 '12 07:05

toylas

2 Answers

A quick basic read:

>>> def read_blocks(input_file, i, j):
    empty_lines = 0
    blocks = []
    with open(input_file) as f:
        for line in f:
            # Check for empty/commented lines (lines keep their trailing
            # newline, so strip before testing)
            stripped = line.strip()
            if not stripped or stripped.startswith('#'):
                # If 1st one: new block
                if empty_lines == 0:
                    blocks.append([])
                empty_lines += 1
            # Non empty line: add line in current(last) block
            else:
                if not blocks:
                    # File starts with data and no header
                    blocks.append([])
                empty_lines = 0
                blocks[-1].append(line)
    return blocks[i:j + 1]

>>> for block in read_blocks(s, 1, 2):  # s holds the path to the data file
    print('-> block')
    for line in block:
        print(line.rstrip('\n'))  # the line already ends with a newline


-> block
 0.5056E+03  0.8687E-03 -0.1202E-02  0.4652E-02
 0.3776E+03  0.8687E-03  0.1975E-04  0.9741E-03
 0.2496E+03  0.8687E-03  0.7894E-04  0.8334E-03
 0.1216E+03  0.8687E-03  0.1439E-03  0.6816E-03
-> block
 0.5057E+03  0.7392E-03 -0.6891E-03  0.4700E-02
 0.3777E+03  0.9129E-03  0.2653E-04  0.9641E-03
 0.2497E+03  0.9131E-03  0.7970E-04  0.8173E-03
 0.1217E+03  0.9131E-03  0.1378E-03  0.6586E-03
>>>

Each block is a list of text lines, which you can then hand to numpy: numpy.loadtxt accepts a list of strings as well as a file name.
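If you would rather not keep every block in memory, here is a minimal sketch of a lazy variant of the same idea. The names is_data, iter_blocks and read_block are my own, not from any library, and the sample text is a shortened copy of the file in the question. It groups consecutive data lines with itertools.groupby and converts each row to floats as it goes:

```python
from itertools import groupby, islice


def is_data(line):
    """True for lines holding numbers, False for blank or '#' comment lines."""
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith("#")


def iter_blocks(lines):
    """Yield each run of consecutive data lines as a list of rows of floats."""
    for has_data, group in groupby(lines, key=is_data):
        if has_data:
            yield [[float(tok) for tok in line.split()] for line in group]


def read_block(lines, n):
    """Return the n-th block (0-based), or None if there are fewer blocks."""
    return next(islice(iter_blocks(lines), n, None), None)


# Shortened sample in the same layout as the file in the question.
sample = """\
# Some comment

#            1           4
 0.5056E+03  0.8687E-03 -0.1202E-02  0.4652E-02
 0.3776E+03  0.8687E-03  0.1975E-04  0.9741E-03


#            2           4
 0.5057E+03  0.7392E-03 -0.6891E-03  0.4700E-02
""".splitlines()

second = read_block(sample, 1)
print(second)
```

Since iter_blocks only needs an iterable of lines, you should be able to pass an open file object instead of the sample list, in which case reading stops as soon as the requested block has been found. The result is a plain list of lists, so numpy.array(second) turns it into an array.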

answered Oct 10 '22 02:10

Emmanuel

The following code should get you started. You will need the re module for parsing the header lines.

You can open the file for reading using:

f = open("file_name_here")

You can read the file one line at a time by using

line = f.readline()

To jump to the next line that starts with a "#", you can use:

while line and not line.startswith("#"):
    # readline() returns "" at end of file, which stops the loop
    line = f.readline()

To parse a line that looks like "# i j", you could use the following regular expression:

import re

is_match = re.match(r"#\s+(\d+)\s+(\d+)", line)
if is_match:
    i = int(is_match.group(1))  # groups are strings, so convert to int
    j = int(is_match.group(2))

See the documentation for the "re" module for more information on this.
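One thing to watch for with the file in the question: a pattern like the one above that is not anchored at the end of the line would also match a header such as "#          500  0.99999422", reading the second number as 0. A small sketch of a stricter version follows; parse_header and HEADER are hypothetical names of mine, and I am assuming that block headers always hold exactly two integers:

```python
import re

# Matches headers such as "#            1           4" (two integers only).
# The trailing \s*$ rejects lines where more text follows the second integer.
HEADER = re.compile(r"#\s+(\d+)\s+(\d+)\s*$")


def parse_header(line):
    """Return (i, j) as ints for a '# i j' line, or None for anything else."""
    match = HEADER.match(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_header("#            1           4\n"))
print(parse_header("#          500  0.99999422\n"))
print(parse_header("# Some comment\n"))
```

Only the first call returns a pair of integers; the other two lines are treated as ordinary comments.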

To parse a block, you could use the following bit of code:

block = [] # block[i][j] will contain element i,j in your block
while line.strip(): # read until next blank line (or end of file)
    block.append([float(x) for x in line.split()])
    # split() with no argument splits on any run of whitespace;
    # each element is then converted to float
    line = f.readline()

You can then turn your block into a numpy array if you want:

block = np.array(block)

This assumes you have imported numpy as np. If you want to read multiple blocks between i and j, put the code above that reads one block into a function and call it repeatedly.
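Putting the pieces together, a function along these lines might do the job. This is only a sketch under a few assumptions: read_block_after is a name I made up, a block ends at the first blank or comment line, and the header pattern is anchored so that only two-integer headers count. I use io.StringIO here just to have something file-like to test with:

```python
import io
import re

HEADER = re.compile(r"#\s+(\d+)\s+(\d+)\s*$")


def read_block_after(f, i):
    """Return the rows after the first '# i j' header whose first number is i."""
    # Skip ahead to the wanted header.
    for line in f:
        match = HEADER.match(line)
        if match and int(match.group(1)) == i:
            break
    else:
        return None  # reached end of file without finding the header

    rows = []
    for line in f:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            break  # blank or comment line ends the block
        rows.append([float(tok) for tok in stripped.split()])
    return rows


sample = io.StringIO(
    "# Some comment\n"
    "\n"
    "#            1           4\n"
    " 0.5056E+03  0.8687E-03 -0.1202E-02  0.4652E-02\n"
    " 0.3776E+03  0.8687E-03  0.1975E-04  0.9741E-03\n"
    "\n"
    "#            2           4\n"
    " 0.2496E+03  0.8687E-03  0.7894E-04  0.8334E-03\n"
)
block = read_block_after(sample, 2)
print(block)
```

Note that this stops at the first matching header. In the file from the question the same "# 1 4" header appears again under each time step, so you would need to skip to the right time-step line first, or call the function repeatedly on the same open file.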

Hope this helps!

answered Oct 10 '22 04:10

Pascal Bugnion
