 

Removing levels from data frame read from CSV file - R

I tried loading the baseball statistics from this link. When I read it from the file using

data <- read.csv("MLB2011.csv")

it seems to be reading all fields as factor values. I tried dropping those factor values by doing:

read.csv("MLB2011.xls", as.is= FALSE)

but the values are still being read as factors. What can I do to have them loaded as plain character values rather than factors?

name_masked asked Feb 07 '13 23:02



1 Answer

You aren't reading a csv file; it is an Excel spreadsheet (.xls format). It contains two worksheets, bat2011 and pitch2011, so read.csv cannot parse it properly.

You could use the XLConnect package to read it:

library(XLConnect)
# load the work book (connect to the file)
wb <- loadWorkbook("MLB2011.xls")


# read in the data from the bat2011 sheet
bat2011 <- readWorksheet(wb, sheet = 'bat2011')

readWorksheet has an argument colTypes which you could use to specify the column types.


Edit

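If you have already saved the sheets as csv files, then as.is = TRUE or stringsAsFactors = FALSE is the correct argument to pass to read.csv. The as.is = FALSE used in the question asks for the conversion to factors rather than turning it off.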
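To illustrate the difference, here is a minimal sketch. It assumes a small stand-in csv written to a temporary file; the column names and values are made up for illustration and are not the real MLB2011 data. Note that since R 4.0.0 the default is already stringsAsFactors = FALSE, so the factor case is requested explicitly here.

```r
# Write a small stand-in csv to a temporary file.
# (Hypothetical data, not the real MLB2011 columns.)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("player,team,hits",
             "Alice,BOS,150",
             "Bob,NYY,142"), csv_path)

# Explicitly request factors, reproducing the behaviour from the question.
as_factors <- read.csv(csv_path, stringsAsFactors = TRUE)

# Either of these keeps text columns as plain character vectors.
as_chars <- read.csv(csv_path, stringsAsFactors = FALSE)
as_is    <- read.csv(csv_path, as.is = TRUE)

# Inspect the class of every column in each result.
sapply(as_factors, class)
sapply(as_chars, class)
sapply(as_is, class)
```

In all three cases the numeric column is still read as a number; only the handling of the text columns changes.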

mnel answered Nov 16 '22 11:11
