
Skipping last column in r with read.csv

Tags:

r

csv

read.csv

I read the post "read.csv and skip last column in R" but did not find my answer there, and I tried to ask directly in an Answer ... but that's not the right way (thanks to mjuarez for taking the time to get me back on track).

The original question was:

I have read several other posts about how to import csv files with read.csv but skipping specific columns. However, all the examples I have found had very few columns, and so it was easy to do something like:

 columnHeaders <- c("column1", "column2", "column_to_skip")
 columnClasses <- c("numeric", "numeric", "NULL")
 data <- read.csv(fileCSV, header = FALSE, sep = ",",
                  col.names = columnHeaders, colClasses = columnClasses)

All the answers were good, but they do not work for what I intended to do. So I asked myself and others:

And in one function call, could data <- read_csv(fileCSV)[,(ncol(data)-1)] work?

I've tried, in one line of R, to load into data the first 5 of the 6 columns, that is, everything except the last one. To do so, I would like to use "-" with the column number. Do you think it's possible? How can I do that?

Thanks!

Arthur Camberlein asked Feb 03 '18 13:02



2 Answers

In base R it has to be a two-step operation. Example:

> data <- read.csv("test12.csv")
> data
# 3 columns are returned
          a b c
1 1/02/2015 1 3
2 2/03/2015 2 4

# last column is excluded 
> data[,-ncol(data)]
          a b
1 1/02/2015 1
2 2/03/2015 2

One cannot write data <- read.csv("test12.csv")[,-ncol(data)] in base R, because the right-hand side is evaluated before data is assigned.

But if you know the number of columns in your CSV (say 3 in my case), then you can write:

df <- read.csv("test12.csv")[,-3]
df
          a b
1 1/02/2015 1
2 2/03/2015 2
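
If a single expression at the call site is the goal, one option is to wrap the two steps in a small function. This is a minimal sketch, assuming a hypothetical helper named drop_last_col and a made-up sample file written to a temporary path; the file is still read in full before the last column is dropped.

```r
# Write a small sample file so the sketch is self-contained (sample data is made up).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1/02/2015,1,3", "2/03/2015,2,4"), csv_path)

# Hypothetical helper: drop the last column of the data frame it receives.
# drop = FALSE keeps the result a data frame even if only one column remains.
drop_last_col <- function(df) df[, -ncol(df), drop = FALSE]

# One line at the call site: the file is read once, then trimmed.
trimmed <- drop_last_col(read.csv(csv_path))
print(names(trimmed))
```

Inside the helper, ncol(df) refers to the argument that has already been read, so the "use before it is defined" problem does not arise.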
MKR answered Sep 20 '22 12:09


The right-hand side of an assignment is evaluated first, so this line from the question:

data <- read.csv(fileCSV)[,(ncol(data)-1)]

is trying to use data before it is defined. Also note that the expression above would select only the second-to-last field. To get all but the last field:

data <- read.csv(fileCSV)
data <- data[-ncol(data)]

If you know the name of the last field, say it is lastField, then the following works. Unlike the code above, it does not read the whole file and then remove the last field; it only reads in the fields other than the last. It is also only one line of code.

read.csv(fileCSV, colClasses = c(lastField = "NULL"))
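
As a quick check of the named colClasses form, here is a small sketch. The sample file and its column names (a, b, lastField) are made up for illustration, and it relies on the file having a header row so that the name can be matched.

```r
# Sample file with a header row; the last column is called lastField (made-up data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("a,b,lastField", "1/02/2015,1,3", "2/03/2015,2,4"), csv_path)

# A named colClasses entry applies only to the matching column;
# "NULL" tells read.csv to skip it, and unnamed columns are guessed as usual.
kept <- read.csv(csv_path, colClasses = c(lastField = "NULL"))
print(names(kept))
```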

If you don't know the name of the last field but you do know how many fields there are, say n, then either of these would work:

read.csv(fileCSV)[-n]

read.csv(fileCSV, colClasses = replace(rep(NA, n), n, "NULL"))

Another way to do it without first reading in the last field is to first read in the header and first line to calculate the number of fields (assuming that all records have the same number) and then re-read the file using that.

n <- ncol(read.csv(fileCSV, nrows = 1))

Then use n in one of the two statements above; only the colClasses version avoids reading in the last field.
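
Put together, the two-pass approach could look like the sketch below. The sample file is made up, and the variable names col_classes and without_last are only for illustration; as noted above, it assumes every record has the same number of fields.

```r
# Sample file so the sketch runs on its own (made-up data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1/02/2015,1,3", "2/03/2015,2,4"), csv_path)

# Pass 1: read only the header and the first record to count the fields.
n <- ncol(read.csv(csv_path, nrows = 1))

# Pass 2: NA lets read.csv guess a column's class, "NULL" skips the column.
col_classes <- replace(rep(NA, n), n, "NULL")
without_last <- read.csv(csv_path, colClasses = col_classes)
print(names(without_last))
```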

G. Grothendieck answered Sep 21 '22 12:09