
Value Label in pandas?

Tags:

python

pandas

I am fairly new to pandas and come from a statistics background, and I am struggling with a conceptual problem: pandas has columns, which contain values. But sometimes values have a special meaning, which statistical programs like SPSS or R call "value labels".

Imagine a column rain with two values, 0 (meaning: no rain) and 1 (meaning: raining). Is there a way to assign these labels to those values?

Is there a way to do this in pandas, too? Mainly for plotting and visualisation purposes.

Christian Sauer asked Mar 19 '14 08:03




2 Answers

There's no need to use a map anymore. Since version 0.15, pandas has supported a categorical data type for columns. The stored data usually takes less space, operations on it are often faster, and you can use labels.

I'm taking an example from the pandas docs:

import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6], "raw_grade": ['a', 'b', 'b', 'a', 'a', 'e']})
# Recast the raw grades as a categorical column
df["grade"] = df["raw_grade"].astype("category")

df["grade"]

# Gives this:
Out[124]: 
0    a
1    b
2    b
3    a
4    a
5    e
Name: grade, dtype: category
Categories (3, object): [a, b, e]

You can also rename categories and add categories that are missing from the data, both through the .cat accessor of the column.
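As a rough sketch of how this might look for the rain column from the question: the snippet below casts the 0/1 values to a categorical and renames the categories to labels. The column name rain_label and the extra "unknown" category are assumptions made for illustration only; they do not come from the question or from the pandas docs example.

```python
import pandas as pd

# Hypothetical data for the "rain" column described in the question.
df = pd.DataFrame({"rain": [0, 1, 1, 0, 1]})

# Cast the integer values to a categorical, then rename the
# categories 0 and 1 to human-readable labels.
df["rain_label"] = (
    df["rain"]
    .astype("category")
    .cat.rename_categories({0: "no rain", 1: "raining"})
)

# Add a category that does not occur in the data
# ("unknown" is an illustrative label, not part of the question).
df["rain_label"] = df["rain_label"].cat.add_categories(["unknown"])

print(df["rain_label"].tolist())
print(list(df["rain_label"].cat.categories))
```

The original rain column keeps its integer values, so the numeric data stays available for calculations while rain_label can be used for plotting.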

cd98 answered Oct 19 '22 00:10



You could have a separate dictionary which maps values to labels:

 d={0:"no rain",1:"raining"}

and then you could access the labelled data by doing

 df.rain_column.apply(lambda x:d[x])
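A possibly simpler variant of the same idea, assuming a DataFrame with a column named rain_column as in the line above: Series.map accepts the dictionary directly, so the lambda is not needed. The sample data here is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the answer.
df = pd.DataFrame({"rain_column": [0, 1, 0, 1]})

# Dictionary mapping raw values to their labels.
d = {0: "no rain", 1: "raining"}

# map() looks each value up in the dictionary.
labelled = df["rain_column"].map(d)

# The labelled Series can then be used for summaries or plots.
counts = labelled.value_counts()

print(labelled.tolist())
```

One difference worth knowing: with map, a value that is missing from the dictionary becomes NaN, whereas the lambda version raises a KeyError.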
grasshopper answered Oct 18 '22 23:10
