
How to read complex data in python?

I'm trying to read data that is not well structured. It looks something like this:

Generated by trjconv : P/L=1/400 t=   0.00000
11214
    1P1     aP1    1  80.48  35.36   4.25
    2P1     aP1    2  37.45   3.92   3.96
    3P2     aP2    3  18.53  -9.69   4.68
    4P2     aP2    4  55.39  74.34   4.60
    5P3     aP3    5  22.11  68.71   3.85
    6P3     aP3    6  -4.13  24.04   3.73
    7P4     aP4    7  40.16   6.39   4.73
    8P4     aP4    8  -5.40  35.73   4.85
    9P5     aP5    9  36.67  22.45   4.08
   10P5     aP5   10  -3.68 -10.66   4.18
Generated by trjconv : P/L=1/400 t=   1000.000
11214
    1P1     aP1    1  80.48  35.36   4.25
    2P1     aP1    2  37.45   3.92   3.96
    3P2     aP2    3  18.53  -9.69   4.68
    4P2     aP2    4  55.39  74.34   4.60
    5P3     aP3    5  22.11  68.71   3.85
    6P3     aP3    6  -4.13  24.04   3.73
    7P4     aP4    7  40.16   6.39   4.73
    8P4     aP4    8  -5.40  35.73   4.85
    9P5     aP5    9  36.67  22.45   4.08
   10P5     aP5   10  -3.68 -10.66   4.18
Generated by trjconv : P/L=1/400 t=   2000.000
11214
    1P1     aP1    1  80.48  35.36   4.25
    2P1     aP1    2  37.45   3.92   3.96
    3P2     aP2    3  18.53  -9.69   4.68
    4P2     aP2    4  55.39  74.34   4.60
    5P3     aP3    5  22.11  68.71   3.85
    6P3     aP3    6  -4.13  24.04   3.73
    7P4     aP4    7  40.16   6.39   4.73
    8P4     aP4    8  -5.40  35.73   4.85
    9P5     aP5    9  36.67  22.45   4.08
   10P5     aP5   10  -3.68 -10.66   4.18
Generated by trjconv : P/L=1/400 t=   3000.000
11214
    1P1     aP1    1  80.48  35.36   4.25
    2P1     aP1    2  37.45   3.92   3.96
    3P2     aP2    3  18.53  -9.69   4.68
    4P2     aP2    4  55.39  74.34   4.60
    5P3     aP3    5  22.11  68.71   3.85
    6P3     aP3    6  -4.13  24.04   3.73
    7P4     aP4    7  40.16   6.39   4.73
    8P4     aP4    8  -5.40  35.73   4.85
    9P5     aP5    9  36.67  22.45   4.08
   10P5     aP5   10  -3.68 -10.66   4.18

It consists of successive frames, each with an updated time. What I showed here is just a sample; the whole file is around 50 GB, so it would be better to read it line by line or in chunks. But I could not figure out how to deal with the headers of each frame. Is there a way to get rid of these headers? For now I have used the following method:

import numpy as np

#define a np.dtype for gro array/dataset (hard-coded for now)
gro_dt = np.dtype([('col1', 'S4'), ('col2', 'S4'), ('col3', int), 
                   ('col4', float), ('col5', float), ('col6', float)])

file = np.genfromtxt('sample.gro', skip_header = 2, dtype=gro_dt)

But it throws the following error when it reaches the next header:

ValueError: Some errors were detected !
    Line #13 (got 7 columns instead of 6)
    Line #14 (got 1 columns instead of 6)
    Line #25 (got 7 columns instead of 6)
    Line #26 (got 1 columns instead of 6)
    Line #37 (got 7 columns instead of 6)
    Line #38 (got 1 columns instead of 6)
Rohit asked Sep 05 '21 15:09



3 Answers

Write an adaptor that strips the periodic headers.

def adapt(f):
    for line in f:
        if line.startswith("Generated"):
            print(line, end='')
            # Consume the following line as well.
            # If your data is well behaved, you can 
            # assume the following line exists and should be
            # skipped, instead of using the try statement.
            try:
                print(next(f), end='')
            except StopIteration:
                pass
            continue
        yield line

with open('sample.gro') as f:
    file = np.genfromtxt(adapt(f), dtype=gro_dt)
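
Because the full file is around 50 GB, it may help to process one frame at a time rather than load everything at once. The following is a minimal stdlib-only sketch of that idea, not a definitive implementation. The name `iter_frames` is hypothetical, and the sketch assumes that every frame starts with a "Generated" header followed by one count line, and that the six columns of a data line are always separated by whitespace (which may not hold for every .gro file).

```python
import io


def iter_frames(lines):
    """Yield (t, rows) for each frame found in an iterable of text lines.

    Assumption: a frame is a "Generated ..." header, one count line,
    then whitespace-separated data lines with exactly six fields.
    """
    t = None
    rows = []
    skip_count_line = False
    for line in lines:
        if line.startswith("Generated"):
            if t is not None:
                yield t, rows          # emit the frame that just ended
            t = float(line.partition("t=")[2])
            rows = []
            skip_count_line = True     # the atom count follows the header
            continue
        if skip_count_line:
            skip_count_line = False
            continue
        fields = line.split()
        if len(fields) != 6:
            continue                   # ignore blank or malformed lines
        rows.append((fields[0], fields[1], int(fields[2]),
                     float(fields[3]), float(fields[4]), float(fields[5])))
    if t is not None:
        yield t, rows                  # emit the final frame


SAMPLE = """Generated by trjconv : P/L=1/400 t=   0.00000
2
    1P1     aP1    1  80.48  35.36   4.25
    2P1     aP1    2  37.45   3.92   3.96
Generated by trjconv : P/L=1/400 t=   1000.000
2
    1P1     aP1    1  80.48  35.36   4.25
    2P1     aP1    2  37.45   3.92   3.96
"""

frames = list(iter_frames(io.StringIO(SAMPLE)))
for t, rows in frames:
    print(t, len(rows))
```

With a real file, the open file object can be passed in place of the `io.StringIO` wrapper, and only one frame's rows are held in memory at a time.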
chepner answered Oct 19 '22 17:10


Since genfromtxt accepts a generator, maybe a converter function like this? (It keeps the t= value from each header as the first column.)

def converter(inf):
    current_t = None
    for line in inf:
        if "trjconv" in line:
            current_t = line.partition("t=")[-1].strip()
        elif line.startswith("  "):
            yield current_t + line


gro_dt = np.dtype(
    [
        ("t", "float"),
        ("col1", "S4"),
        ("col2", "S4"),
        ("col3", int),
        ("col4", float),
        ("col5", float),
        ("col6", float),
    ]
)


with open("sample.gro") as fp:
    file = np.genfromtxt(converter(fp), dtype=gro_dt)

print(file)

The output begins

[(   0., b'1P1', b'aP1',  1, 80.48,  35.36, 4.25)
 (   0., b'2P1', b'aP1',  2, 37.45,   3.92, 3.96)
 (   0., b'3P2', b'aP2',  3, 18.53,  -9.69, 4.68)
 (   0., b'4P2', b'aP2',  4, 55.39,  74.34, 4.6 )
 (   0., b'5P3', b'aP3',  5, 22.11,  68.71, 3.85)
 (   0., b'6P3', b'aP3',  6, -4.13,  24.04, 3.73)
 (   0., b'7P4', b'aP4',  7, 40.16,   6.39, 4.73)
 (   0., b'8P4', b'aP4',  8, -5.4 ,  35.73, 4.85)
 (   0., b'9P5', b'aP5',  9, 36.67,  22.45, 4.08)
 (   0., b'10P5', b'aP5', 10, -3.68, -10.66, 4.18)
 (1000., b'1P1', b'aP1',  1, 80.48,  35.36, 4.25)
 (1000., b'2P1', b'aP1',  2, 37.45,   3.92, 3.96)
 (1000., b'3P2', b'aP2',  3, 18.53,  -9.69, 4.68)
 (1000., b'4P2', b'aP2',  4, 55.39,  74.34, 4.6 )
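
For a file of this size, the tagged lines could also be consumed in fixed-size chunks instead of in one call. The sketch below shows only the chunking step, using the standard library. The names `tag_with_time` and `chunked` are hypothetical, and `tag_with_time` is a restatement of the converter idea above with an added guard for data lines that appear before any header. Each chunk is a plain list of strings, which could then be handed to np.genfromtxt one chunk at a time.

```python
import io
from itertools import islice


def tag_with_time(lines):
    """Prefix each data line with the t= value of the most recent header."""
    current_t = None
    for line in lines:
        if "trjconv" in line:
            current_t = line.partition("t=")[-1].strip()
        elif line.startswith("  ") and current_t is not None:
            yield current_t + line


def chunked(lines, size):
    """Yield successive lists holding at most `size` lines each."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


sample = io.StringIO(
    "Generated by trjconv : P/L=1/400 t=   0.00000\n"
    "2\n"
    "    1P1     aP1    1  80.48  35.36   4.25\n"
    "    2P1     aP1    2  37.45   3.92   3.96\n"
    "Generated by trjconv : P/L=1/400 t=   1000.000\n"
    "2\n"
    "    1P1     aP1    1  80.48  35.36   4.25\n"
    "    2P1     aP1    2  37.45   3.92   3.96\n"
)

chunks = list(chunked(tag_with_time(sample), 3))
for chunk in chunks:
    print(len(chunk), chunk[0].split()[0])
```

A chunk may span a frame boundary, which is harmless here because every line carries its own time value.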
AKX answered Oct 19 '22 15:10


Assuming you want to collect the frame data (I'm not sure that is feasible for 50 GB), the code below does that.

def _is_interesting_line(line_str: str) -> bool:
    # Data lines start with whitespace; header and count lines do not.
    return bool(line_str) and line_str[0].isspace()


data = []
with open('data.txt') as f:
    for line in f:
        if _is_interesting_line(line):
            data.append(line.strip())
        else:
            print(line.strip())
print('result:')
print(data)
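
If holding every line in a list is not practical at this size, one alternative is to stream the data lines into a second, header-free file. This is a minimal sketch of that variation, not the approach used above. The function name `strip_headers` is hypothetical, and it assumes, as the code above does, that data lines always begin with whitespace while header and count lines never do.

```python
import io


def strip_headers(src, dst):
    """Copy only the data lines from src to dst and return how many were kept.

    Assumption: a data line starts with whitespace and is not blank.
    """
    kept = 0
    for line in src:
        if line[:1].isspace() and line.strip():
            dst.write(line)
            kept += 1
    return kept


src = io.StringIO(
    "Generated by trjconv : P/L=1/400 t=   0.00000\n"
    "2\n"
    "    1P1     aP1    1  80.48  35.36   4.25\n"
    "    2P1     aP1    2  37.45   3.92   3.96\n"
    "Generated by trjconv : P/L=1/400 t=   1000.000\n"
    "2\n"
    "    1P1     aP1    1  80.48  35.36   4.25\n"
    "    2P1     aP1    2  37.45   3.92   3.96\n"
)
dst = io.StringIO()
kept = strip_headers(src, dst)
cleaned = dst.getvalue()
print(kept)
```

With real files, `src` and `dst` would be two objects returned by `open()`, one for reading and one for writing. Memory use stays constant because only one line is held at a time, but the time value of each frame is lost in the output.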
balderman answered Oct 19 '22 16:10