I have a file containing two datasets, which I'd like to read into Python as two columns.
The data is in the form:
xxx yyy xxx yyy xxx yyy
and so on, so I understand that I need to somehow split it up. I'm new to Python (and relatively new to programming), so I've struggled a bit so far. At the moment I've tried to use:
def read(file):
    column1=[]
    column2=[]
    readfile = open(file, 'r')
    a = (readfile.read())
    readfile.close()
How would I go about splitting the contents of the file into column1 and column2?
This is quite simple with the Python module pandas. Suppose you have a data file like this:
>cat data.txt
xxx yyy xxx yyy xxx yyy
xxx yyy xxx yyy xxx yyy
xxx yyy xxx yyy xxx yyy
xxx yyy xxx yyy xxx yyy
xxx yyy xxx yyy xxx yyy
>from pandas import DataFrame
>from pandas import read_csv
>from pandas import concat
>dfin = read_csv("data.txt", header=None, delimiter=r"\s+").add_prefix('X')  # read_csv's prefix= argument was removed in pandas 2.0
> dfin
    X0   X1   X2   X3   X4   X5
0  xxx  yyy  xxx  yyy  xxx  yyy
1  xxx  yyy  xxx  yyy  xxx  yyy
2  xxx  yyy  xxx  yyy  xxx  yyy
3  xxx  yyy  xxx  yyy  xxx  yyy
4  xxx  yyy  xxx  yyy  xxx  yyy
>dfout = DataFrame()
>dfout['X0'] = concat([dfin['X0'], dfin['X2'], dfin['X4']], axis=0, ignore_index=True)
>dfout['X1'] = concat([dfin['X1'], dfin['X3'], dfin['X5']], axis=0, ignore_index=True)
> dfout
     X0   X1
0   xxx  yyy
1   xxx  yyy
2   xxx  yyy
3   xxx  yyy
4   xxx  yyy
5   xxx  yyy
6   xxx  yyy
7   xxx  yyy
8   xxx  yyy
9   xxx  yyy
10  xxx  yyy
11  xxx  yyy
12  xxx  yyy
13  xxx  yyy
14  xxx  yyy
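If you would rather not pull in pandas, the same split can probably be done with plain Python, closer to the function in the question. This is only a sketch: the names split_pairs and read_columns are made up for illustration, and it assumes the file holds an even number of whitespace-separated values that alternate between the two datasets.

```python
def split_pairs(text):
    """Split 'x y x y ...' text into two lists (hypothetical helper)."""
    # str.split() with no argument splits on any whitespace, newlines included.
    tokens = text.split()
    column1 = tokens[0::2]  # 1st, 3rd, 5th, ... value
    column2 = tokens[1::2]  # 2nd, 4th, 6th, ... value
    return column1, column2


def read_columns(path):
    """Read a file and return its two interleaved columns as lists of strings."""
    with open(path, "r") as readfile:  # the file is closed automatically
        return split_pairs(readfile.read())
```

Note that this keeps the values in the order they appear in the file (row by row), whereas the concat version above stacks X0, then X2, then X4. The values come back as strings, so convert them with float() or int() if the data is numeric.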
Hope it helps. Best.