Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

return default if pandas dataframe.loc location doesn't exist

Tags:

python

pandas

I find myself often having to check whether a column or row exists in a dataframe before trying to reference it. For example I end up adding a lot of code like:

if 'mycol' in df.columns and 'myindex' in df.index: x = df.loc[myindex, mycol] else: x = mydefault 

Is there any way to do this more nicely? For example on an arbitrary object I can do x = getattr(anobject, 'id', default) - is there anything similar to this in pandas? Really any way to achieve what I'm doing more gracefully?

like image 524
fantabolous Avatar asked May 01 '14 06:05

fantabolous


People also ask

What does LOC in Pandas return?

The loc property is used to access a group of rows and columns by label(s) or a boolean array. . loc[] is primarily label based, but may also be used with a boolean array.

What is the difference between LOC [] and ILOC []?

The main difference between pandas loc[] vs iloc[] is loc gets DataFrame rows & columns by labels/names and iloc[] gets by integer Index/position. For loc[], if the label is not present it gives a key error. For iloc[], if the position is not present it gives an index error.

Does Pandas LOC return a copy?

The key concepts that are connected to the SettingWithCopyWarning are views and copies. Some operations in pandas (and numpy as well) will return views of the original data, while other copies.


2 Answers

There is a method for Series:

So you could do:

df.mycol.get(myIndex, NaN) 

Example:

In [117]:  df = pd.DataFrame({'mycol':arange(5), 'dummy':arange(5)}) df Out[117]:    dummy  mycol 0      0      0 1      1      1 2      2      2 3      3      3 4      4      4  [5 rows x 2 columns] In [118]:  print(df.mycol.get(2, NaN)) print(df.mycol.get(5, NaN)) 2 nan 
like image 92
EdChum Avatar answered Sep 21 '22 21:09

EdChum


Python has this mentality to ask for forgiveness instead of permission. You'll find a lot of posts on this matter, such as this one.

In Python catching exceptions is relatively inexpensive, so you're encouraged to use it. This is called the EAFP approach.

For example:

try:     x = df.loc['myindex', 'mycol'] except KeyError:     x = mydefault 
like image 33
FooBar Avatar answered Sep 20 '22 21:09

FooBar