
Importing csv file into R - numeric values read as characters

Tags:

r

I am aware that there are similar questions on this site, but none of them seems to answer my question sufficiently.

This is what I have done so far:

I have a csv file which I open in Excel. I manipulate the columns algebraically to obtain a new column "A". I import the file into R using read.csv() and the entries in column A are stored as factors, but I want them to be stored as numeric. I found this question on the topic:

Imported a csv-dataset to R but the values becomes factors

Following the advice, I include stringsAsFactors = FALSE as an argument in read.csv(). However, as Hong Ooi suggested in the page linked above, this doesn't cause the entries in column A to be stored as numeric values.

A possible solution is to use the advice given in the following page:

How to convert a factor to an integer\numeric without a loss of information?

However, I would like a cleaner solution, i.e. a way to import the file so that the entries of column A are stored as numeric values.

Cheers for any help!

user32259 asked Dec 04 '12 15:12



2 Answers

Whatever algebra you are doing in Excel to create the new column could probably be done more effectively in R.

Please try the following: read the raw file (before any Excel manipulation) into R using read.csv(..., stringsAsFactors=FALSE). [If that does not work, please take a look at ?read.table (which read.csv wraps); however, there may be some other underlying issue.]

For example:

   delim = ","  # or is it "\t" ?
   dec = "."    # or is it "," ?
   myDataFrame <- read.csv("path/to/file.csv", header=TRUE, sep=delim, dec=dec, stringsAsFactors=FALSE)
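To illustrate why the sep and dec settings matter, here is a minimal sketch. The inline data (csv_text) is made up for illustration and assumes a file exported with semicolons as separators and commas as decimal marks; your file may well differ.

```r
# Hypothetical file contents: semicolon-separated, comma as decimal mark
csv_text <- "id;A\n1;3,14\n2;2,72\n"

# With the wrong decimal mark, "3,14" is not recognised as a number,
# so column A is read as character
wrong <- read.csv(text = csv_text, sep = ";", dec = ".", stringsAsFactors = FALSE)

# With the matching decimal mark, column A is read as numeric
right <- read.csv(text = csv_text, sep = ";", dec = ",", stringsAsFactors = FALSE)

print(class(wrong$A))
print(class(right$A))
```

If class() reports "character" for a column you expected to be numeric, a mismatched sep or dec is one of the first things worth checking.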

Then, let's say your numeric column is column 4:

   myDataFrame[, 4]  <- as.numeric(myDataFrame[, 4])  # you can also refer to the column by "itsName" 
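As a rough sketch of what this conversion does, and of how to find the entries that prevent a column from being read as numeric: the vectors below (x and f) are hypothetical stand-ins for a column of your data frame, not values from the question.

```r
# Hypothetical column as read with stringsAsFactors = FALSE
x <- c("1.5", "2", "3,7", "n/a", "4.25")

# as.numeric() turns anything it cannot parse into NA (with a warning)
converted <- suppressWarnings(as.numeric(x))

# Entries that were present before conversion but became NA are the culprits
bad <- x[is.na(converted) & !is.na(x)]
print(bad)

# If the column is a factor instead, convert via character first;
# as.numeric() on a factor returns the internal level codes, not the values
f <- factor(c("10", "20", "10"))
codes <- as.numeric(f)
values <- as.numeric(as.character(f))
print(codes)
print(values)
```

Inspecting the offending entries usually shows what needs fixing at import time, for example a different decimal mark or a non-standard missing-value code.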


Lastly, if you need any help with accomplishing in R the same tasks that you've done in Excel, there are plenty of folks here who would be happy to help you out.

Ricardo Saporta answered Oct 07 '22 01:10


In read.table (and its relatives) it is the na.strings argument which specifies which strings are to be interpreted as missing values (NA). The default value is na.strings = "NA".

If missing values in an otherwise numeric column are coded as something other than "NA", e.g. "." or "N/A", those entries are read as character strings, and then the whole column is converted to character (or to a factor when stringsAsFactors = TRUE).

Thus, if your missing values are coded as something other than "NA", you need to specify them in na.strings.
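A minimal sketch of the effect, using made-up inline data (csv_text) in which missing values are assumed to be coded as "." and "N/A":

```r
# Hypothetical file contents with non-standard missing-value codes
csv_text <- "id,A\n1,3.5\n2,.\n3,N/A\n4,7.25\n"

# Default na.strings = "NA": "." and "N/A" are ordinary strings,
# so column A is read as character
default_read <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Declaring the extra codes lets R treat them as NA and keep A numeric
fixed_read <- read.csv(text = csv_text,
                       na.strings = c("NA", ".", "N/A"),
                       stringsAsFactors = FALSE)

print(class(default_read$A))
print(class(fixed_read$A))
print(fixed_read$A)
```

The vector passed to na.strings should list every code that actually occurs in your file; the three used here are only examples.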

NC maize breeding Jim answered Oct 06 '22 23:10