I am looking at the famous Titanic dataset from the Kaggle competition found here: http://www.kaggle.com/c/titanic-gettingStarted/data
I have loaded and processed the data using:
# import required libraries
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
# load the data from the file
df = pd.read_csv('./data/train.csv')
# import the scatter_matrix functionality
from pandas.tools.plotting import scatter_matrix
# define colors list, to be used to plot survived either red (=0) or green (=1)
colors=['red','green']
# make a scatter plot
scatter_matrix(df,figsize=[20,20],marker='x',c=df.Survived.apply(lambda x:colors[x]))
df.info()
How can I add the categorical columns like Sex and Embarked to the plot?
You need to transform the categorical variables into numbers to plot them.
Example (assuming that the column 'Sex' is holding the gender data, with 'M' for males & 'F' for females)
df['Sex_int'] = np.nan
df.loc[df['Sex'] == 'M', 'Sex_int'] = 0
df.loc[df['Sex'] == 'F', 'Sex_int'] = 1
Now all females are represented by 0 & males by 1. Unknown genders (if there are any) will be ignored.
The rest of your code should process the updated dataframe nicely.
after googling and remembering something like the .map() function I fixed it in the following way:
colors=['red','green'] # color codes for survived : 0=red or 1=green
# create mapping Series for gender so it can be plotted
gender = Series([0,1],index=['male','female'])
df['gender']=df.Sex.map(gender)
# create mapping Series for Embarked so it can be plotted
embarked = Series([0,1,2,3],index=df.Embarked.unique())
df['embarked']=df.Embarked.map(embarked)
# add survived also back to the df
df['survived']=target
now I can plot it again...and drop the added columns afterwards.
thanks everyone for responding.....
Here is my solution:
# convert string column to category
df.Sex = df.Sex.astype('category')
# create additional column for its codes
df['Sex_code'] = df_clean.Sex.cat.codes
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With