I have a data set with two columns and I need to change it from this format:
10 1
10 5
10 3
11 5
11 4
12 6
12 2
to this:
10 1 5 3
11 5 4
12 6 2
I need every unique value in the first column to be on its own row.
I am a beginner with Python and beyond reading in my text file, I'm at a loss for how to proceed.
You can use a pandas DataFrame.
import pandas as pd
df = pd.DataFrame({'A':[10,10,10,11,11,12,12],'B':[1,5,3,5,4,6,2]})
print(df)
Output:
    A  B
0  10  1
1  10  5
2  10  3
3  11  5
4  11  4
5  12  6
6  12  2
Let's use groupby and join:
df.groupby('A')['B'].apply(lambda x:' '.join(x.astype(str)))
Output:
A
10    1 5 3
11      5 4
12      6 2
Name: B, dtype: object
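Since the question starts from a text file rather than a hand-built DataFrame, here is a rough sketch of the same groupby and join applied to whitespace-separated input. The io.StringIO object only stands in for the real file, and the column names A and B are arbitrary choices carried over from the example above. With an actual file you would pass its path to read_csv instead.

```python
import io
import pandas as pd

# Stand-in for the text file; with a real file, pass its path to read_csv.
sample = io.StringIO("10 1\n10 5\n10 3\n11 5\n11 4\n12 6\n12 2\n")

# sep=r"\s+" splits on any run of whitespace; the names A and B are arbitrary.
df = pd.read_csv(sample, sep=r"\s+", header=None, names=["A", "B"])

# sort=False keeps the groups in order of first appearance.
grouped = df.groupby("A", sort=False)["B"].apply(lambda x: " ".join(x.astype(str)))

# Build one output line per unique value in the first column.
lines = [f"{key} {values}" for key, values in grouped.items()]
print("\n".join(lines))
```

The lines list can then be written back out with an ordinary file write if the result is needed on disk.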
Using collections.defaultdict (a dict subclass):
import collections

with open('yourfile.txt', 'r') as f:
    d = collections.defaultdict(list)
    for k, v in (l.split() for l in f.read().splitlines()):  # processing each line
        d[k].append(v)  # accumulating values for the same 1st column
    for k, v in sorted(d.items()):  # outputting grouped sequences
        print('%s %s' % (k, ' '.join(v)))
The output:
10 1 5 3
11 5 4
12 6 2
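A variation on the same defaultdict idea, in case the file contains blank lines or the original row order should be kept instead of sorted. This is only a sketch: the function name group_lines is made up for illustration, and it assumes every useful line has exactly two whitespace-separated fields.

```python
from collections import defaultdict

def group_lines(lines):
    """Group second-column values by first-column key, in first-seen order."""
    groups = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:  # skip blank or malformed lines
            continue
        key, value = parts
        groups[key].append(value)
    # Dicts preserve insertion order (Python 3.7+), so no sorting is needed.
    return [f"{key} {' '.join(values)}" for key, values in groups.items()]

# Sample rows from the question, with one blank line thrown in.
sample = ["10 1", "10 5", "10 3", "", "11 5", "11 4", "12 6", "12 2"]
result = group_lines(sample)
print("\n".join(result))
```

An open file object can be passed directly as lines, since iterating over a file yields one line at a time.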