How to identify whether a pandas column contains lists

I want to identify whether a column in a pandas DataFrame holds a list in every row.

import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3], 'Y': [[34], [37, 45], [48, 50, 57]], 'Z': ['A', 'B', 'C']})

df
Out[160]: 
   X             Y  Z
0  1          [34]  A
1  2      [37, 45]  B
2  3  [48, 50, 57]  C

df.dtypes
Out[161]: 
X     int64
Y    object
Z    object
dtype: object

Since strings also have dtype "object", I can't distinguish between columns that hold strings and columns that hold lists (of integers or strings).

How do I identify that column "Y" holds lists of ints?

Gufran Pathan asked Aug 14 '17 10:08


1 Answer

You can use applymap to get the type of every value, compare the result with list, and then call all to check whether every value in a column is True:

print (df.applymap(type))
               X               Y              Z
0  <class 'int'>  <class 'list'>  <class 'str'>
1  <class 'int'>  <class 'list'>  <class 'str'>
2  <class 'int'>  <class 'list'>  <class 'str'>

a = (df.applymap(type) == list).all()
print (a)
X    False
Y     True
Z    False
dtype: bool

Or, with isinstance (which also accepts subclasses of list):

a = df.applymap(lambda x: isinstance(x, list)).all()
print (a)
X    False
Y     True
Z    False
dtype: bool
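
In pandas 2.1 and later, DataFrame.applymap is deprecated in favour of DataFrame.map. As a minimal sketch that avoids depending on either name, the same check can be run column by column with Series.map. It assumes the df from the question; the variable name is_list_col is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [[34], [37, 45], [48, 50, 57]],
                   'Z': ['A', 'B', 'C']})

# Series.map runs the isinstance check on each element of a column;
# .all() then reduces the column to a single boolean.
is_list_col = df.apply(lambda col: col.map(lambda x: isinstance(x, list)).all())
print(is_list_col)
```

The result is a boolean Series indexed by column name, so it can be used in the same way as a in the snippets above.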

And if you need the list of matching column names:

L = a.index[a].tolist()
print (L)
['Y']
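
The checks above only confirm that each value is a list; they say nothing about what is inside the lists. To answer the "list of int" part of the question, one possible sketch also inspects the items. The helper is_list_of is a hypothetical name introduced here, not a pandas function.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [[34], [37, 45], [48, 50, 57]],
                   'Z': ['A', 'B', 'C']})

def is_list_of(column, item_type):
    """Return True if every value in column is a list whose items are all item_type."""
    return all(
        isinstance(value, list) and all(isinstance(item, item_type) for item in value)
        for value in column
    )

# Collect the names of the columns in which every row is a list of ints
int_list_cols = [name for name in df.columns if is_list_of(df[name], int)]
print(int_list_cols)  # ['Y']
```

Note that an empty list passes the item check vacuously, and that bool is a subclass of int, so a list of booleans would also be accepted.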

If you want to check dtypes instead (note that strings, lists, and dicts all have the object dtype):

print (df.dtypes)
X     int64
Y    object
Z    object
dtype: object

a = df.dtypes == 'int64'
print (a)
X     True
Y    False
Z    False
dtype: bool
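
The dtype check and the element check can be combined: with the default dtypes, Python lists end up in object columns, so the numeric columns can be skipped before looking at individual values. This is a rough sketch under that assumption; object_cols and list_cols are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [[34], [37, 45], [48, 50, 57]],
                   'Z': ['A', 'B', 'C']})

# Keep only the object-dtype columns; the int64 column X is dropped here
object_cols = df.select_dtypes(include='object')

# Among the remaining columns, keep those where every value is a list
list_cols = [name for name in object_cols.columns
             if object_cols[name].map(lambda x: isinstance(x, list)).all()]
print(list_cols)
```

On a wide frame with many numeric columns this avoids running a Python-level check on values that cannot be lists.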

jezrael answered Sep 23 '22 21:09