Value Label in pandas?

Tags:

python

pandas

I am fairly new to pandas and come from a statistics background, and I am struggling with a conceptual problem: pandas has columns, which contain values. But sometimes values have a special meaning, which statistical programs like SPSS or R call "value labels".

Imagine a column rain with two values, 0 (meaning: no rain) and 1 (meaning: raining). Is there a way to assign these labels to those values?

Is there a way to do this in pandas, too? Mainly for plotting and visualisation purposes.

352

asked Mar 19 '14 08:03

Christian Sauer

2 Answers

There's no need to use a map anymore. Since version 0.15, pandas has offered a categorical data type for its columns. For columns with few distinct values, the stored data typically takes less space, operations on it can be faster, and you can use labels.
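To get a rough feel for the space claim, here is a minimal sketch that compares the memory footprint of a plain object column with its categorical version. The data is made up for illustration, and the exact byte counts will vary with your pandas version and platform.

```python
import pandas as pd

# Hypothetical data: 30,000 rows drawn from only three distinct strings.
raw = pd.Series(["a", "b", "e"] * 10_000, dtype=object)
cat = raw.astype("category")

# deep=True also counts the Python string objects themselves,
# not just the pointers to them.
obj_bytes = raw.memory_usage(deep=True)
cat_bytes = cat.memory_usage(deep=True)

print("object column bytes:  ", obj_bytes)
print("category column bytes:", cat_bytes)
```

The saving comes from storing each distinct value once and keeping small integer codes per row, so it shrinks as the number of distinct values approaches the number of rows.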

I'm taking an example from the pandas docs:

import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6], "raw_grade": ['a', 'b', 'b', 'a', 'a', 'e']})
# Recast raw_grade as a categorical variable
df["grade"] = df["raw_grade"].astype("category")

df["grade"]

# Gives this:
Out[124]: 
0    a
1    b
2    b
3    a
4    a
5    e
Name: grade, dtype: category
Categories (3, object): [a, b, e]

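You can also rename categories (for example with Series.cat.rename_categories) and add categories that do not yet occur in the data (Series.cat.add_categories).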
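Applied to the rain column from the question, this might look like the sketch below. The DataFrame, the column names rain and rain_label, and the extra "unknown" category are assumptions made up for illustration; passing a dict to rename_categories needs a reasonably recent pandas.

```python
import pandas as pd

# Hypothetical data: 0 means no rain, 1 means raining.
df = pd.DataFrame({"rain": [0, 1, 1, 0]})

# Convert to a categorical, then rename the categories 0 and 1
# to human-readable labels.
rain_cat = df["rain"].astype("category")
df["rain_label"] = rain_cat.cat.rename_categories({0: "no rain", 1: "raining"})

# Add a category that does not occur in the data yet.
with_extra = df["rain_label"].cat.add_categories(["unknown"])

print(df)
print(with_extra.cat.categories)
```

The integer codes stay available through df["rain_label"].cat.codes, so the labelled column can serve for display while the codes serve for computation.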

answered Oct 19 '22 00:10

cd98

You could have a separate dictionary which maps values to labels:

 d = {0: "no rain", 1: "raining"}

and then you could access the labelled data by doing

 df.rain_column.apply(lambda x: d[x])
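As a self-contained sketch of the dictionary approach, the snippet below uses Series.map, which accepts the dictionary directly and is a common alternative to apply with a lambda. The DataFrame and the column name rain are hypothetical.

```python
import pandas as pd

# Mapping from stored values to display labels.
labels = {0: "no rain", 1: "raining"}

# Hypothetical data for illustration.
df = pd.DataFrame({"rain": [0, 1, 1, 0, 1]})

# map() looks each value up in the dictionary; values missing
# from the dictionary become NaN instead of raising KeyError.
labelled = df["rain"].map(labels)

# Count occurrences per label, e.g. as input for a bar chart.
counts = labelled.value_counts()
print(counts)
```

If matplotlib is installed, counts.plot(kind="bar") should then draw the bars with the labels on the axis rather than 0 and 1.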

answered Oct 18 '22 23:10

grasshopper
