How do I take multiple lists and put them as different columns in a python dataframe? I tried this solution but had some trouble.
Attempt 1:
res = zip(lst1,lst2,lst3)
Attempt 2:
percentile_list = pd.DataFrame({'lst1Tite' : [lst1],
'lst2Tite' : [lst2],
'lst3Tite' : [lst3] },
columns=['lst1Tite','lst1Tite', 'lst1Tite'])
How do I get a 100 row (length of each independent list) by 3 column (three lists) pandas dataframe?
Create pandas DataFrame from Multiple ListsUse column param and index param to provide column & row labels respectively to the DataFrame. Alternatively, you can also add column names to DataFrame and set the index using pandas. DataFrame. set_index() method.
Insert List into Cell Using DataFrame.at() Method. In order to insert the list into the cell will use DataFrame.at() function. For example, I will use the Duration column from the above DataFrame to insert list. at() inserts a list into a specific cell without raising a ValueError.
I think you're almost there, try removing the extra square brackets around the lst
's (Also you don't need to specify the column names when you're creating a dataframe from a dict like this):
import pandas as pd
lst1 = range(100)
lst2 = range(100)
lst3 = range(100)
percentile_list = pd.DataFrame(
{'lst1Title': lst1,
'lst2Title': lst2,
'lst3Title': lst3
})
percentile_list
lst1Title lst2Title lst3Title
0 0 0 0
1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4
5 5 5 5
6 6 6 6
...
If you need a more performant solution you can use np.column_stack
rather than zip
as in your first attempt, this has around a 2x speedup on the example here, however comes at bit of a cost of readability in my opinion:
import numpy as np
percentile_list = pd.DataFrame(np.column_stack([lst1, lst2, lst3]),
columns=['lst1Title', 'lst2Title', 'lst3Title'])
Adding to Aditya Guru's answer here. There is no need of using map. You can do it simply by:
pd.DataFrame(list(zip(lst1, lst2, lst3)))
This will set the column's names as 0,1,2. To set your own column names, you can pass the keyword argument columns
to the method above.
pd.DataFrame(list(zip(lst1, lst2, lst3)),
columns=['lst1_title','lst2_title', 'lst3_title'])
Adding one more scalable solution.
lists = [lst1, lst2, lst3, lst4]
df = pd.concat([pd.Series(x) for x in lists], axis=1)
Just adding that using the first approach it can be done as -
pd.DataFrame(list(map(list, zip(lst1,lst2,lst3))))
There are several ways to create a dataframe from multiple lists.
list1=[1,2,3,4]
list2=[5,6,7,8]
list3=[9,10,11,12]
pd.DataFrame({'list1':list1, 'list2':list2, 'list3'=list3})
pd.DataFrame(data=zip(list1,list2,list3),columns=['list1','list2','list3'])
Adding to above answers, we can create on the fly
df= pd.DataFrame()
list1 = list(range(10))
list2 = list(range(10,20))
df['list1'] = list1
df['list2'] = list2
print(df)
hope it helps !
@oopsi used pd.concat()
but didn't include the column names. You could do the following, which, unlike the first solution in the accepted answer, gives you control over the column order (avoids dicts, which are unordered):
import pandas as pd
lst1 = range(100)
lst2 = range(100)
lst3 = range(100)
s1=pd.Series(lst1,name='lst1Title')
s2=pd.Series(lst2,name='lst2Title')
s3=pd.Series(lst3 ,name='lst3Title')
percentile_list = pd.concat([s1,s2,s3], axis=1)
percentile_list
Out[2]:
lst1Title lst2Title lst3Title
0 0 0 0
1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4
5 5 5 5
6 6 6 6
7 7 7 7
8 8 8 8
...
you can simple use this following code
train_data['labels']= train_data[["LABEL1","LABEL1","LABEL2","LABEL3","LABEL4","LABEL5","LABEL6","LABEL7"]].values.tolist()
train_df = pd.DataFrame(train_data, columns=['text','labels'])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With