
Specifying colClasses in the read.csv

Tags: r, csv, read.csv



You can specify colClasses for just one column by naming it.

So in your example you should use:

data <- read.csv('test.csv', colClasses=c("time"="character"))
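A minimal, self-contained sketch of the named form. It assumes a made-up three-column file: the column names `id` and `value` and the sample rows are hypothetical and only there so the example runs on its own.

```r
# Write a small hypothetical CSV to a temporary file so the example runs anywhere.
tmp <- tempfile(fileext = ".csv")
writeLines(c("time,id,value",
             "0930,1,2.5",
             "1015,2,3.75"), tmp)

# Name only the column you want to control; the classes of the
# remaining columns are still guessed by read.csv.
data <- read.csv(tmp, colClasses = c("time" = "character"))

# Inspect the resulting column classes; 'time' is read as text,
# so values such as "0930" keep their leading zero.
str(data)
```

Names in `colClasses` are matched against the header, so the order of the columns in the file does not matter here.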

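If the colClasses vector is unnamed, it is matched to columns by position, so give one class per imported column (a shorter unnamed vector is recycled). Supposing 'time' is the first column and the rest of your dataset has 5 columns: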

colClasses=c("character",rep("numeric",5))
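As a rough sketch of the positional form, assuming a hypothetical file with `time` first and five numeric columns (the names `v1` to `v5` and the values are invented for illustration):

```r
# Hypothetical six-column file: 'time' followed by five numeric columns.
tmp <- tempfile(fileext = ".csv")
writeLines(c("time,v1,v2,v3,v4,v5",
             "0930,1,2,3,4,5",
             "1015,6,7,8,9,10"), tmp)

# One class per column, in file order.
classes <- c("character", rep("numeric", 5))
data <- read.csv(tmp, colClasses = classes)

# Show the class that each column ended up with.
sapply(data, class)
```

Because the classes are applied by position, this form depends on the column order in the file staying the same.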

Assuming your 'time' column has at least one observation with a non-numeric character and all your other columns contain only numbers, 'read.csv' in R versions before 4.0.0 will by default read 'time' as a 'factor' and the rest of the columns as 'numeric' (or 'integer'). In that case, setting 'stringsAsFactors=FALSE' gives the same result as setting the 'colClasses' manually (from R 4.0.0 on, 'stringsAsFactors' already defaults to FALSE), i.e.,

data <- read.csv('test.csv', stringsAsFactors=FALSE)
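To see the difference for yourself, here is a small sketch on a hypothetical two-column file (the `value` column and the sample rows are assumptions). It passes `stringsAsFactors = TRUE` explicitly so that the factor behaviour shows up regardless of the R version's default.

```r
# Hypothetical file whose 'time' values are clearly non-numeric.
tmp <- tempfile(fileext = ".csv")
writeLines(c("time,value",
             "09:30,1.5",
             "10:15,2.5"), tmp)

# Strings converted to factors (requested explicitly here).
as_factor <- read.csv(tmp, stringsAsFactors = TRUE)

# Two ways of keeping 'time' as plain text.
as_text  <- read.csv(tmp, stringsAsFactors = FALSE)
by_class <- read.csv(tmp, colClasses = c("time" = "character"))

# Compare the column classes produced by each call.
sapply(as_factor, class)
sapply(as_text, class)
sapply(by_class, class)
```

Note that `stringsAsFactors` affects every string column, while a named `colClasses` entry only touches the column you name.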

If you want to refer to names from the header rather than column numbers, you can use something like this:

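fname <- "test.csv"
# Read only the first rows and let R guess each column's class.
headset <- read.csv(fname, header = TRUE, nrows = 10)
classes <- sapply(headset, class)
# Override the guessed class for the columns named here.
classes[names(classes) %in% c("time")] <- "character"
# Re-read the whole file with the adjusted classes.
dataset <- read.csv(fname, header = TRUE, colClasses = classes)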