I have a dataframe in which one column holds a list at each index. I want to concatenate these lists into one list. I am using
ids = df.loc[0:index, 'User IDs'].values.tolist()
However, this results in
['[1,2,3,4......]']
which is a string. Somehow each value in my list column is of type str. I have tried converting with list() and literal_eval(), but it does not work. list() splits the string into individual characters, e.g. from [12,13,14...] to ['[', '1', '2', ',', '1', '3', ...].
How can I concatenate a pandas column with list values into one list? Kindly help out, I have been banging my head on this for several hours.
To start, you may use this template to concatenate your column values (for strings only): df['New Column Name'] = df['1st Column Name'] + df['2nd Column Name'] + ... Notice that the plus symbol ('+') is used to perform the concatenation.
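As a minimal sketch of that template, assuming a small made-up frame (the column names "first", "last" and "n" are purely illustrative), string columns can be joined with '+', while non-string columns need a cast to str first:

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame({"first": ["Ada", "Alan"], "last": ["Lovelace", "Turing"]})

# '+' on two string columns concatenates them row by row.
df["full"] = df["first"] + " " + df["last"]

# A non-string column has to be cast to str before it can be joined.
df["n"] = [1, 2]
df["label"] = df["first"] + "-" + df["n"].astype(str)

print(df)
```

Note that this joins values across columns within each row; it does not merge the lists stored in a single column, which is what the question asks about.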
By using df.loc[index] = list you can append a list as a row to the DataFrame at a specified index. To add it at the end, use len(df) as the index (this assumes the default integer index). For example, assigning to df.loc[len(df)] adds the list ["Hyperion",27000,"60days",2000] to the end of the pandas DataFrame.
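A short sketch of that row append, assuming a frame with a default 0..n-1 index. The column names and the two starting rows here are made up for illustration; only the appended list comes from the text above:

```python
import pandas as pd

# Hypothetical starting frame; column names and rows are assumptions.
df = pd.DataFrame(
    [["Spark", 20000, "30days", 1000], ["PySpark", 25000, "40days", 2300]],
    columns=["Courses", "Fee", "Duration", "Discount"],
)

new_row = ["Hyperion", 27000, "60days", 2000]

# With a default RangeIndex, len(df) is the first unused label,
# so this assignment enlarges the frame by one row.
df.loc[len(df)] = new_row

print(df)
```

If the index is not the default one, len(df) may already exist as a label, in which case the assignment would overwrite that row instead of appending.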
Consider the dataframe df
import pandas as pd
df = pd.DataFrame(dict(col1=[[1, 2, 3]] * 2))
print(df)
        col1
0  [1, 2, 3]
1  [1, 2, 3]
pandas
simplest answer
df.col1.sum()
[1, 2, 3, 1, 2, 3]
numpy.concatenate
import numpy as np
np.concatenate(df.col1)
array([1, 2, 3, 1, 2, 3])
chain
from itertools import chain
list(chain(*df.col1))
[1, 2, 3, 1, 2, 3]
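To compare the three approaches above side by side, here is a small sketch on a made-up column whose lists have different lengths (the data is an assumption, not from the question). Since sum builds a new list at every step, chain is likely the better choice for long columns:

```python
from itertools import chain

import numpy as np
import pandas as pd

# Hypothetical data with lists of unequal length.
df = pd.DataFrame({"col1": [[1, 2], [3], [4, 5, 6]]})

# pandas: '+' on lists concatenates, so sum() joins them all.
by_sum = df["col1"].sum()

# numpy: returns an ndarray, so convert back if a plain list is needed.
by_numpy = np.concatenate(df["col1"].tolist()).tolist()

# itertools: lazily walks each sublist in turn.
by_chain = list(chain.from_iterable(df["col1"]))

print(by_sum, by_numpy, by_chain)
```

All three give the same flat list here; the numpy version differs only in that np.concatenate itself returns an array rather than a list.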
Response to comments:
I think your column values are strings. You can convert them to lists like this:
from ast import literal_eval
df.col1 = df.col1.apply(literal_eval)
If instead your column holds string values that look like lists
df = pd.DataFrame(dict(col1=['[1, 2, 3]'] * 2))
print(df) # will look the same
        col1
0  [1, 2, 3]
1  [1, 2, 3]
However, pd.Series.sum does not work the same: it concatenates the strings instead of the lists.
df.col1.sum()
'[1, 2, 3][1, 2, 3]'
We need to evaluate the strings as if they are literals and then sum
df.col1.apply(literal_eval).sum()
[1, 2, 3, 1, 2, 3]
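Putting the string case together, here is a hedged sketch that also shows why list() did not help in the question. The helper name to_list is hypothetical; it parses a value only if it is a string, so it should also be safe on a column that already holds real lists:

```python
from ast import literal_eval
from itertools import chain

import pandas as pd

# Hypothetical column of strings that merely look like lists.
df = pd.DataFrame({"col1": ["[1, 2, 3]", "[4, 5]"]})

# list() on a string just splits it into single characters.
chars = list(df["col1"].iloc[0])


def to_list(value):
    """Hypothetical helper: parse strings, pass real lists through."""
    return literal_eval(value) if isinstance(value, str) else value


# Parse every cell, then flatten the resulting lists into one.
parsed = df["col1"].apply(to_list)
flat = list(chain.from_iterable(parsed))

print(chars[:3])
print(flat)
```

literal_eval only accepts Python literals, so a cell that is not a valid literal (for example one containing a literal "..." or unquoted text) would raise an error rather than be parsed.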
If you want to flatten the list, this is a Pythonic way to do it:
import pandas as pd
df = pd.DataFrame({'A': [[1,2,3], [4,5,6]]})
a = df['A'].tolist()
# flatten: for each sublist j in a, take each item i in j
a = [i for j in a for i in j]
print(a)
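As an alternative to the comprehension (not part of the answer above), pandas 0.25 and later also has Series.explode, which puts each list element on its own row. A minimal sketch on the same data:

```python
import pandas as pd

df = pd.DataFrame({"A": [[1, 2, 3], [4, 5, 6]]})

# explode() gives one row per list element; tolist() collects them.
flat = df["A"].explode().tolist()

print(flat)
```

One difference to keep in mind: explode turns an empty list into a NaN entry, whereas the list comprehension simply skips it.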