read.csv blank fields to NA

I have a tab-delimited text file named 'a.txt'. The D column is empty.

    A       B       C       D
    10      20      NaN
    30              40
    40      30      20
    20      NA      20

I want the data frame to look and behave exactly like the text file, with a blank in the 2nd row of the 2nd column.

Unfortunately, read.csv is converting all the blanks and NA to "NA". I want to read NA and NaN as characters.

    b <- read.csv("a.txt", sep = "\t", skip = 0, header = TRUE,
                  comment.char = "", check.names = FALSE, quote = "")

To summarize: I want to reproduce the input values in the output file without modifying them:

If there is a blank in the input, the output should be blank.
If the input has NA or NaN, then the output should also have NA or NaN.

asked Oct 01 '13 21:10

user1631306

2 Answers

Late edit: Re-reading this after the edits and extended comments, I'm wondering if what was needed (or asked for, at least) was pretty much the exact opposite of what I advise below. The request for this:

Unfortunately, read.csv is converting all the blanks and NA to "NA". I want to read NA and NaN as characters.

... might have been satisfied (somewhat paradoxically) with the arguments: colClasses="character", stringsAsFactors=FALSE, na.strings="."

Then any character value, including an empty string, would come in as itself. Arguing against this is the acceptance of the answer that converts empty character values ("") to R NA_character_ values.

Here's a test example with various results:

    sapply(read.csv(text='A\tB\tC\tD\na\t""\tNA\tNaN', sep='\t', na.strings=""), class )
    #        A         B         C         D
    # "factor" "logical"  "factor" "numeric"

    sapply(read.csv(text='A\tB\tC\tD\na\t""\tNA\tNaN', sep='\t', na.strings="x"), class )
    #        A         B         C         D
    # "factor" "logical"  "factor" "numeric"

    sapply(read.csv(text='A\tB\tC\tD\na\t""\tNA\tNaN', sep='\t', na.strings="x",
                    stringsAsFactors=FALSE), class )
    #          A           B           C           D
    #"character"   "logical" "character"   "numeric"

    # Almost the expressed desired result
    sapply(read.csv(text='A\tB\tC\tD\na\t""\tNA\tNaN', sep='\t',
                    colClasses="character", stringsAsFactors=FALSE), class )
    #          A           B           C           D
    #"character" "character" "character" "character"

    # But ... still get a real R <NA>
    read.csv(text='A\tB\tC\tD\na\t""\tNA\tNaN', sep='\t',
             colClasses="character", stringsAsFactors=FALSE)
    #  A B    C   D
    #1 a   <NA> NaN

    # So add all three
    read.csv(text='A\tB\tC\tD\na\t""\tNA\tNaN', sep='\t',
             colClasses="character", stringsAsFactors=FALSE, na.strings=".")
    #  A B  C   D
    #1 a   NA NaN
    # Finally all columns are character and no "real" R NA's
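If the goal really is to write the values back out untouched, a round trip along those lines might look like the sketch below. It is only an illustration under stated assumptions: the inline string `txt` is a made-up stand-in for 'a.txt' (with a trailing tab on each row for the empty D column), and the output goes to a temporary file rather than a real path.

```r
# Hypothetical in-memory copy of the tab-delimited file (assumed layout:
# every data row carries a trailing tab for the empty D column)
txt <- "A\tB\tC\tD\n10\t20\tNaN\t\n30\t\t40\t\n40\t30\t20\t\n20\tNA\t20\t"

# Read every column as character; "." is used as a na.strings value that
# does not occur in the data, so "NA", "NaN" and "" all survive as text
b <- read.delim(text = txt, colClasses = "character", na.strings = ".",
                quote = "", check.names = FALSE)

# Write back without quoting so blanks stay blank
out <- tempfile(fileext = ".txt")
write.table(b, out, sep = "\t", quote = FALSE, row.names = FALSE)

# Compare the written lines with the original lines
identical(readLines(out), strsplit(txt, "\n", fixed = TRUE)[[1]])
```

The trade-off is that nothing is numeric any more, so any arithmetic on the columns would need an explicit as.numeric() on a copy.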

The default for na.strings is just "NA", so you may need to add "NaN". In logical and numeric columns, true blanks ("") are set to missing but spaces (" ") are not:

    b <- read.csv("a.txt", skip = 0,
                  comment.char = "", check.names = FALSE, quote = "",
                  na.strings = c("NA", "NaN", " "))

It's not clear that this is the problem, since your data example is malformed and does not have commas. That may be the fundamental problem, since read.csv assumes comma separation unless sep is set explicitly. Use read.delim, or read.table with sep="\t", if your data is tab-separated.

    b <- read.table("a.txt", sep="\t", skip =0, header = TRUE,
                    comment.char = "",check.names = FALSE, quote="",
                    na.strings=c("NA","NaN", " ") )

    # worked example for csv text file connection

    bt <- "A,B,C
    10,20,NaN
    30,,40
    40,30,20
    ,NA,20"

    b <- read.csv(text=bt, sep=",",
                  comment.char = "",check.names = FALSE, quote="\"",
                  na.strings=c("NA","NaN", " ") )

    b
    #--------------
       A  B  C
    1 10 20 NA
    2 30 NA 40
    3 40 30 20
    4 NA NA 20
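To see the separator issue in isolation, here is a small sketch on an inline, made-up tab-separated string (the name `tsv` and its contents are illustrative, not taken from the original file):

```r
# Illustrative tab-separated text, not the asker's actual file
tsv <- "A\tB\tC\n10\t20\tNaN\n30\t\t40"

# With the default sep="," each whole line is read as a single field
ncol(read.csv(text = tsv))

# read.delim defaults to sep="\t", so the three columns are recovered
ncol(read.delim(text = tsv))
```

The first call should report one column and the second three, which is a quick way to check whether the separator matches the file.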

Example 2:

    bt <- "A,B,C,D
    10,20,NaN
    30,,40
    40,30,20
    ,NA,20"

    b <- read.csv(text=bt, sep=",",
                  comment.char = "",check.names = FALSE, quote="\"",
                  na.strings=c("NA","NaN", " "),
                  colClasses=c(rep("numeric", 3), "logical"))

    b
    #----------------
       A  B  C  D
    1 10 20 NA NA
    2 30 NA 40 NA
    3 40 30 20 NA
    4 NA NA 20 NA
    > str(b)
    'data.frame':   4 obs. of  4 variables:
     $ A: num  10 30 40 NA
     $ B: num  20 NA 30 NA
     $ C: num  NA 40 20 20
     $ D: logi  NA NA NA NA

It's mildly interesting that NA and NaN are not identical for numeric vectors. NaN is returned by operations that have no mathematical meaning (but as noted in the help page you get with ?NaN, the results of operations may depend on the particular OS). Tests of equality are not appropriate for either NaN or NA. There are specific is.* functions for them:

    > Inf*0
    [1] NaN
    > is.nan(c(1,2.2,3,NaN, NA) )
    [1] FALSE FALSE FALSE  TRUE FALSE
    > is.na(c(1,2.2,3,NaN, NA) )
    [1] FALSE FALSE FALSE  TRUE  TRUE  # note the difference

answered Sep 22 '22 11:09

IRTFM

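After reading the csv file, try the following. It will replace the NA values with "".

    b[is.na(b)] <- ""

In numeric columns is.na() is also TRUE for NaN, so this blanks NaN values as well. If only the NaN values should be replaced, note that is.nan() has no data frame method and has to be applied column by column:

    b[do.call(cbind, lapply(b, is.nan))] <- ""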
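A small check on a toy data frame shows how the two predicates behave when used to blank out values. The data below are made up for illustration, and the names `nan_mask`, `only_nan` and `both` are hypothetical.

```r
# Toy data frame; values are illustrative only
b <- data.frame(A = c(10, NaN, NA), B = c("x", NA, "z"),
                stringsAsFactors = FALSE)

# is.na() works on a data frame and flags both NA and NaN
na_mask <- is.na(b)

# is.nan() does not accept a data frame, so build the mask per column
nan_mask <- do.call(cbind, lapply(b, is.nan))

only_nan <- b
only_nan[nan_mask] <- ""   # blanks NaN only; column A becomes character

both <- b
both[na_mask] <- ""        # blanks NA and NaN alike
```

Note that assigning "" into a numeric column converts the whole column to character, so this is best done as the last step before writing the file.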

answered Sep 22 '22 11:09

silly_penguin
