
indexing a matrix in R

Tags: indexing, r, matrix

I am a novice R user. I have a data set formatted like this:

    Date  Temp  Month
 1-Jan-90 10.56      1
 2-Jan-90 11.11      1
 3-Jan-90 10.56      1
 4-Jan-90 -1.67      1
 5-Jan-90  0.56      1
 6-Jan-90 10.56      1
 7-Jan-90 12.78      1
 8-Jan-90 -1.11      1
 9-Jan-90  4.44      1
10-Jan-90 10.00      1

In R syntax:

datacl <- structure(list(Date = structure(1:10, .Label = c("1990/01/01", 
  "1990/01/02", "1990/01/03", "1990/01/04", "1990/01/05", "1990/01/06", 
  "1990/01/07", "1990/01/08", "1990/01/09", "1990/01/10"), class = "factor"), 
      Temp = c(10.56, 11.11, 10.56, -1.67, 0.56, 10.56, 12.78, 
      -1.11, 4.44, 10), Month = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
      1L, 1L)), .Names = c("Date", "Temp", "Month"), class = "data.frame", row.names = c(NA, 
  -10L))

I would like to subset the data for a particular month, apply a change factor to the temperature, and then save the results. So far I have something like:

idx <- subset(datacl, Month == 1)  # Index
results[idx[,2],1] = idx[,2]+change  # change applied to only index values

But I keep getting an error like:

Error in results[idx[, 2], 1] = idx[, 2] + change: 
  only 0's may be mixed with negative subscripts

Any help would be appreciated.

user1408959 asked May 21 '12 23:05


2 Answers

First, give the change factor a value:

change <- 1

Now, here is how to create an index:

# one approach to subsetting is to create a logical vector: 
jan.idx <- datacl$Month == 1

# alternatively the which function returns numeric indices:
jan.idx2 <- which(datacl$Month == 1)
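
As a small illustration of the two index types, here is a sketch on made-up data (the data frame `toy` and its values are hypothetical, not the asker's data). It shows that the logical vector has one entry per row, while `which()` returns only the positions of the TRUE entries:

```r
# Hypothetical toy data with two months; values are illustrative only
toy <- data.frame(Temp  = c(10.56, -1.67, 4.44, 12.78),
                  Month = c(1L, 2L, 1L, 2L))

lgl.idx <- toy$Month == 1         # logical: one TRUE/FALSE per row
num.idx <- which(toy$Month == 1)  # integer: positions of the TRUE values

lgl.idx
num.idx

# Both forms pick out the same temperatures
same <- identical(toy$Temp[lgl.idx], toy$Temp[num.idx])
same
```

One practical difference: if Month contained NA values, the logical vector would carry those NAs along, whereas which() silently drops them.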

If you want just the subset of data from January:

jandata <- datacl[jan.idx,]
transformed.jandata <- transform(jandata, Temp = Temp + change) 

To keep the entire data frame, but only add the change factor to Jan temps:

datacl$Temp[jan.idx] <- datacl$Temp[jan.idx] + change
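
Since the sample data contains only January, the effect of the in-place update is easier to see on data with more than one month. A minimal sketch, assuming a made-up data frame `toy` and an arbitrary `change` of 1:

```r
# Hypothetical data: two January rows, two February rows
toy <- data.frame(Temp  = c(10.56, -1.67, 4.44, 12.78),
                  Month = c(1L, 1L, 2L, 2L))
change <- 1  # arbitrary value for illustration

jan.idx <- toy$Month == 1

# Only the rows where jan.idx is TRUE are overwritten
toy$Temp[jan.idx] <- toy$Temp[jan.idx] + change

toy  # February temperatures are unchanged
```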
David LeBauer answered Sep 29 '22 19:09


First, note that subset does not produce an index; it returns a subset of your original data frame containing all rows with Month == 1.

So when you write idx[,2], you are selecting the Temp column of that subset, i.e. the temperature values themselves.

results[idx[,2],1] = idx[,2] + change

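But you are then using these temperatures as an index into results, i.e. as row numbers. Values like 10.56 or -1.11 make no sense as row numbers: R truncates fractional indices and treats negative ones as rows to drop, and positive and negative indices cannot be mixed in one call, hence your error. Also, you are selecting the first column of results, which is presumably Date (if results has the same layout as datacl), and trying to assign temperatures to it.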
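
To see how R interprets numbers used as an index, here is a short sketch on a hypothetical character vector `x` (not part of the question). The same rules apply to the row index of a matrix or data frame:

```r
x <- c("a", "b", "c", "d", "e")  # hypothetical vector for illustration

x[2.9]  # fractional index is truncated toward zero, so this is x[2]
x[-1]   # a negative index drops that position

# Mixing positive and negative positions in one index is an error;
# tryCatch captures it so the script keeps running
res <- tryCatch(x[c(2, -1)], error = function(e) e)
inherits(res, "error")
```

The exact wording of the error message can differ between R versions, so the sketch only checks that an error is raised.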

There are a few ways you can do this.

You can create a logical index that is TRUE for a row with Month == 1 and FALSE otherwise like so:

idx <- datacl$Month == 1

Then you can use that index to select the rows in datacl you want to modify (this is what you were trying to do originally, I think):

datacl$Temp[idx] <- datacl$Temp[idx] + change  # or 'results' instead of 'datacl'?

Note that datacl$Temp[idx] selects the Temp column of datacl and then the idx rows.

You could also do

datacl[idx,'Temp']

or

datacl[idx,2]  # as Temp is the second column.

If you only want results to be the subset where Month == 1, try:

results <- subset(datacl, Month == 1)
results$Temp <- results$Temp + change

This works because results already contains only the rows you are interested in, so no further indexing is needed.
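
A minimal sketch of the subset approach on made-up data (the data frame `toy` and the value of `change` are assumptions for illustration). It shows that the subset is a separate copy, so the original data frame is left untouched:

```r
# Hypothetical data with two months interleaved
toy <- data.frame(Temp  = c(10.56, 4.44, -1.67, 12.78),
                  Month = c(1L, 2L, 1L, 2L))
change <- 1  # arbitrary value for illustration

results <- subset(toy, Month == 1)     # copy of the January rows only
results$Temp <- results$Temp + change  # applies to every row of the copy

results   # rows keep their original row names (1 and 3)
toy$Temp  # the original temperatures are unchanged
```

This is the variant to use if you want a separate January table. The logical-index assignment is the better fit if the adjusted values should end up back in the full data frame.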

mathematical.coffee answered Sep 29 '22 18:09