Read file of repeated "key=value" pairs into DataFrame

I have a txt file with data in this format. The same three keys (name, grade, class) repeat over and over.

name=1
grade=A
class=B
name=2
grade=D
class=A

I would like to output the data in a table format, for example:

name | grade | class
1    | A     | B
2    | D     | A

I am struggling to set the headers and just loop over the data. What I have tried so far is:

import pandas as pd
from tabulate import tabulate

def myfile(filename):
    # Yield a [key, value] pair for every "key=value" line
    with open(filename) as f:
        for line in f:
            yield line.strip().split('=', 1)

def pprint_df(dframe):
    print(tabulate(dframe, headers="keys", tablefmt="psql", showindex=False))

df = pd.DataFrame(myfile('file1'))
pprint_df(df)

The output from that is

+-------+-----+
| 0     | 1   |
|-------+-----|
| name  | 1   |
| grade | A   |
| class | B   |
| name  | 2   |
| grade | D   |
| class | A   |
+-------+-----+

Not really what I am looking for.

Flenters asked Nov 13 '19 07:11



2 Answers

You can read the file with pandas, split each line on the first "=", and pivot the result so that each key becomes a column:

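import pandas as pd
# Read one "key=value" string per row into column 0
df = pd.read_table(r'file.txt', header=None)
# Split on the first "=" only: column 0 holds the key, column 1 the value
new = df[0].str.split("=", n=1, expand=True)
# Number each occurrence of a key (0, 1, ...) to get a record index
new['index'] = new.groupby(new[0])[0].cumcount()
# One row per record, one column per key
new = new.pivot(index='index', columns=0, values=1)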

Printing new outputs:

0     class grade name
index                 
0         B     A    1
1         A     D    2
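
To make the cumcount and pivot steps less opaque, here is a rough plain-Python sketch of the same idea. It is only an illustration, not what pandas does internally; the names `lines`, `seen` and `rows` are made up for this example, and the sample data is copied from the question.

```python
from collections import Counter

# Sample data from the question, already split into lines
lines = ["name=1", "grade=A", "class=B", "name=2", "grade=D", "class=A"]

seen = Counter()  # how many times each key has appeared so far
rows = {}         # record number -> {key: value}

for line in lines:
    key, value = line.split("=", 1)
    row_number = seen[key]  # plays the role of cumcount()
    seen[key] += 1
    # Placing the value at (row_number, key) plays the role of pivot()
    rows.setdefault(row_number, {})[key] = value

print(rows)
# {0: {'name': '1', 'grade': 'A', 'class': 'B'}, 1: {'name': '2', 'grade': 'D', 'class': 'A'}}
```

The n-th occurrence of every key lands in row n, which is why the repeated block of three lines turns into one table row per block.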

luigigi answered Nov 14 '22 23:11



I know you have enough answers, but here is another way of doing it, using a dictionary of lists:

import pandas as pd
from collections import defaultdict

# Collect the values for each key in file order
d = defaultdict(list)

with open("text_file.txt") as f:
    for line in f:
        # Split on the first "=" only, so values may contain "="
        key, val = line.rstrip('\n').split('=', 1)
        d[key].append(val)

df = pd.DataFrame(d)
print(df)

This gives you the output as:

  name grade class
0    1     A     B
1    2     D     A

Just to get another perspective.
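
One caveat with the dictionary-of-lists approach: if some record is missing a key, the lists end up with different lengths and pd.DataFrame(d) raises a ValueError. A possible workaround, sketched below, is to build one dict per record and start a new record whenever a key repeats. The helper name `parse_records` and the sample data with a missing grade are hypothetical, and the sketch assumes every record begins with the same key.

```python
def parse_records(lines):
    """Group key=value lines into dicts, starting a new dict when a key repeats."""
    records, current = [], {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        key, value = line.split("=", 1)
        if key in current:
            # A repeated key marks the start of the next record
            records.append(current)
            current = {}
        current[key] = value
    if current:
        records.append(current)
    return records

# Hypothetical input: the second record has no grade line
sample = ["name=1", "grade=A", "class=B", "name=2", "class=A"]
records = parse_records(sample)
print(records)
# [{'name': '1', 'grade': 'A', 'class': 'B'}, {'name': '2', 'class': 'A'}]
```

Passing the resulting list to pd.DataFrame(records) gives one row per record, with NaN where a key is missing.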

SSharma answered Nov 14 '22 22:11
