I have a txt file with data in this format. The first 3 lines repeat over and over.
name=1
grade=A
class=B
name=2
grade=D
class=A
I would like to output the data in a table format, for example:
name | grade | class
1 | A | B
2 | D | A
I am struggling to set the headers and just loop over the data. What I have tried so far is:
import pandas as pd
from tabulate import tabulate

def myfile(filename):
    with open(filename) as f:
        for line in f:
            yield line.strip().split('=', 1)

def pprint_df(dframe):
    print(tabulate(dframe, headers="keys", tablefmt="psql", showindex=False))

df = pd.DataFrame(myfile('file1'))
pprint_df(df)
The output from that is
+-------+-----+
| 0     | 1   |
|-------+-----|
| name  | 1   |
| grade | A   |
| class | B   |
| name  | 2   |
| grade | D   |
| class | A   |
+-------+-----+
Not really what I am looking for.
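You can read the file with pandas, split each line on "=", and pivot the keys into columns:
import pandas as pd
# Each line of the file becomes one row in a single column named 0
df = pd.read_table(r'file.txt', header=None)
# After the split, column 0 holds the key and column 1 the value
new = df[0].str.split("=", n=1, expand=True)
# Number each key by occurrence: 0 for the first record, 1 for the second, ...
new['index'] = new.groupby(new[0])[0].cumcount()
# Turn the keys into columns, one row per record
new = new.pivot(index='index', columns=0, values=1)
new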
Outputs (pivot sorts the column labels alphabetically):
0     class grade name
index
0         B     A    1
1         A     D    2
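The key step in the answer above is cumcount, which numbers each repeated key by occurrence so that pivot has a row index to work with. As a rough illustration of that idea only, here is a minimal pure-Python sketch. The helper name rows_from_lines and the in-memory sample list are assumptions made for the example, not part of the original answer:

```python
from collections import Counter

def rows_from_lines(lines):
    """Group key=value lines into rows, numbering each key by occurrence."""
    seen = Counter()  # how many times each key has appeared so far
    rows = {}         # row index -> {key: value}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, value = line.split("=", 1)
        # The n-th occurrence of a key belongs to row n (the cumcount idea)
        rows.setdefault(seen[key], {})[key] = value
        seen[key] += 1
    return [rows[i] for i in sorted(rows)]

# Hypothetical sample standing in for the contents of the text file
sample = ["name=1", "grade=A", "class=B", "name=2", "grade=D", "class=A"]
records = rows_from_lines(sample)
print(records)
```

Each dict in records is one table row, and the keys keep the order in which they appear in the file.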
I know you have enough answers, but here is another way of doing it using a dictionary:
import pandas as pd
from collections import defaultdict

d = defaultdict(list)
with open("text_file.txt") as f:
    for line in f:
        # Split on the first "=" only, so values may contain "="
        key, val = line.split('=', 1)
        d[key].append(val.rstrip('\n'))
df = pd.DataFrame(d)
print(df)
This gives you the output as:
  name grade class
0    1     A     B
1    2     D     A
Just to get another perspective.
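If the goal is the exact "name | grade | class" layout from the question, a dictionary of lists like d above can be printed without pandas. This is only a sketch: format_table is a hypothetical helper, and the literal dictionary stands in for what the file-reading loop would build.

```python
def format_table(columns):
    """Render a dict of equal-length lists as pipe-separated text."""
    headers = list(columns)
    lines = [" | ".join(headers)]
    # zip(*values) walks the columns in parallel, yielding one row at a time
    for row in zip(*columns.values()):
        lines.append(" | ".join(row))
    return "\n".join(lines)

# Hypothetical data matching the sample file in the question
d = {"name": ["1", "2"], "grade": ["A", "D"], "class": ["B", "A"]}
table = format_table(d)
print(table)
```

Note that zip stops at the shortest column, so a file that ends in the middle of a record would silently drop that partial row.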