
read.table() reads numeric values as integer in R

I am using Data <- read.table("file", head=TRUE, sep=";") to read my file.

The head of my file looks like this:

         Date     Time Global_active_power Global_reactive_power Voltage Global_intensity
66637 2007-02-01 00:00:00               0.326                 0.128 243.150            1.400
66638 2007-02-01 00:01:00               0.326                 0.130 243.320            1.400
66639 2007-02-01 00:02:00               0.324                 0.132 243.510            1.400
66640 2007-02-01 00:03:00               0.324                 0.134 243.900            1.400
66641 2007-02-01 00:04:00               0.322                 0.130 243.160            1.400
66642 2007-02-01 00:05:00               0.320                 0.126 242.290            1.400
      Sub_metering_1 Sub_metering_2 Sub_metering_3
66637          0.000          0.000              0
66638          0.000          0.000              0
66639          0.000          0.000              0
66640          0.000          0.000              0
66641          0.000          0.000              0
66642          0.000          0.000              0

However, if I try typeof(Data$Global_reactive_power) it shows integer (should be numeric).

I do not understand why this is happening. I tried many methods, but none of them works. Can anyone help me with this?

My file is here: https://d396qusza40orc.cloudfront.net/exdata%2Fdata%2Fhousehold_power_consumption.zip

Ginger asked Feb 12 '23 02:02


2 Answers

It appears that your raw data uses "?" for missing values. I checked with:

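# TRUE for entries that cannot be parsed as numbers
# (as.numeric() returns NA for them, with a warning)
is.not.numeric <- function(x) {
    is.na(as.numeric(as.character(x)))
}

# Show the first few offending entries in the column
head(Filter(is.not.numeric, Data$Global_reactive_power))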

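When R encounters a non-numeric value such as "?" in a column, it cannot parse the column as numeric and reads it as a factor instead (as character in R 4.0.0 and later, where stringsAsFactors defaults to FALSE). To read your data in correctly, try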

Data<-read.table("household_power_consumption.txt", 
    header=TRUE, sep=";", na.strings="?")

Now

class(Data$Global_reactive_power)
# [1] "numeric"

shows that it's numeric. (Note that you should never really need to use typeof. It tells you how the data for an object is stored; it doesn't tell you what the object is. Use class() for that.)
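Here is a small self-contained sketch of the effect, using made-up inline data rather than the real file. The sample values and the variable names are assumptions for illustration. stringsAsFactors = TRUE is set explicitly to mimic the default of R versions before 4.0.0; with the current default, the unparsed column would come back as character rather than factor.

```r
# Made-up sample in the same layout as the real file; "?" marks a missing value.
sample_text <- "Date;Global_reactive_power
2007-02-01;0.128
2007-02-01;?
2007-02-01;0.132"

# Without na.strings, the "?" prevents the column from being parsed as numeric.
# stringsAsFactors = TRUE mimics the default of R versions before 4.0.0.
raw <- read.table(text = sample_text, header = TRUE, sep = ";",
                  stringsAsFactors = TRUE)
class(raw$Global_reactive_power)   # "factor"
typeof(raw$Global_reactive_power)  # "integer"

# Declaring "?" as the missing-value marker lets the column parse as numeric.
fixed <- read.table(text = sample_text, header = TRUE, sep = ";",
                    na.strings = "?", stringsAsFactors = TRUE)
class(fixed$Global_reactive_power)       # "numeric"
typeof(fixed$Global_reactive_power)      # "double"
sum(is.na(fixed$Global_reactive_power))  # 1
```

The "?" row becomes NA in the second data frame, so it still has to be handled (for example with na.rm = TRUE) in later calculations.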

MrFlick answered Feb 15 '23 10:02


Your Global_reactive_power column has some non-numeric entries in it, which causes read.table to turn it into a factor. Note that typeof() of a factor is "integer", because a factor is stored as integer codes.
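To illustrate why a factor reports integer, here is a minimal sketch with a made-up three-element factor (the values are assumptions, not taken from the real file). It also shows a common pitfall when converting such a column back by hand.

```r
# A factor stores integer codes that index into its character levels.
f <- factor(c("0.128", "?", "0.132"))
class(f)    # "factor"
typeof(f)   # "integer"

# as.numeric() on the factor returns the internal codes, not the values.
codes <- as.numeric(f)

# Going through character recovers the values; "?" becomes NA.
# suppressWarnings() hides the "NAs introduced by coercion" warning.
values <- suppressWarnings(as.numeric(as.character(f)))
values   # 0.128 NA 0.132
```

The exact codes depend on how the levels are sorted, which can vary with the locale, so they are not shown here.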

Open your file in a text editor and look for entries that aren't strictly numeric. If your data came from Excel, be sure to remove all formatting from columns (other than dates) before exporting to text.
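For a file too large to inspect comfortably in an editor, the same check can be sketched in R. The helper name non_numeric_entries and the inline sample are hypothetical; the idea is to read every field as character and tabulate the entries that fail numeric conversion.

```r
# Hypothetical helper: tabulate the entries that do not parse as numbers.
non_numeric_entries <- function(values) {
    values <- as.character(values)
    parsed <- suppressWarnings(as.numeric(values))
    table(values[is.na(parsed)], useNA = "ifany")
}

# Made-up inline sample; for a real file, pass its path instead of text = ...
sample_text <- "Date;Global_reactive_power
2007-02-01;0.128
2007-02-01;?
2007-02-01;0.132
2007-02-02;?"

# colClasses = "character" keeps every field as written in the file.
raw <- read.table(text = sample_text, header = TRUE, sep = ";",
                  colClasses = "character")
bad <- non_numeric_entries(raw$Global_reactive_power)
bad   # "?" occurs 2 times
```

Once the offending tokens are known, they can be passed to na.strings when re-reading the file.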

Hong Ooi answered Feb 15 '23 10:02