
R - Extract multiple rows from column 1 if certain value appears in column 2

Tags: dataframe, r, rows

I have a question about the extraction of multiple values from a data.frame in R and putting them into a new data.frame.

I have a data.frame that looks like this (df)

PRICE     EVENT
1.50        0
1.70        0
1.65        0
1.20        1
0.90        0
1.70        0
1.55        0 
  .         .
  .         .
1.10        0
1.20        0
1.14        1
0.90        0

My actual data.frame has these two columns and over 300,000 rows. The EVENT column only takes the values 0 or 1 (a 1 indicates that a certain event occurred on that day).

First step of my research: analyze the price when the event occurs. This step is an easy one. I did it with

vector <- df[df$EVENT == 1, "PRICE"]

Now vector contains all the prices for the event days (here: 1.20 and 1.14).

The second step of my research is where it gets interesting:

Now I want not only the prices for the event day, but also the prices for x days before and after the event day, collected in a matrix.

For example: I want the prices from two days before the event to one day after the event (including the event day).

Then the new data.frame I am trying to create would look like

    Event 1               Event n
-2   1.70        ...        1.10
-1   1.65        ...        1.20
 0   1.20        ...        1.14
+1   0.90        ...        0.90

Please keep in mind that the 4-day span [-2:1] is only an example. In my actual research I have to cover a 91-day span [-30:60].

Thanks for the help :)

Bit asked Jan 25 '18 08:01


2 Answers

We can create a matrix of the relevant row numbers, one column per event, and then use it to index PRICE and arrive at your expected output:

# row numbers where the event occurs
event_rows <- which(df$EVENT==1)
# one column per event: rows from 2 days before to 1 day after
mask <- sapply(event_rows, function(x) (x-2):(x+1))
# look up the prices for each column of row numbers
apply(mask, 2, function(x) df$PRICE[x])
#     [,1] [,2]
#[1,] 1.70 1.10
#[2,] 1.65 1.20
#[3,] 1.20 1.14
#[4,] 0.90 0.90

Data

df <- structure(list(PRICE = c(1.5, 1.7, 1.65, 1.2, 0.9, 1.7, 1.55, 
1.1, 1.2, 1.14, 0.9), EVENT = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 
0L, 1L, 0L)), .Names = c("PRICE", "EVENT"), class = "data.frame", row.names = c(NA, 
-11L))
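
The index-matrix approach in this answer assumes every window lies fully inside the data. An event in the first rows would produce zero or negative row numbers, which R either silently drops (0) or rejects (negative mixed with positive), and an event near the end runs past the last row. A minimal sketch of one way to generalize it to any span, padding out-of-range days with NA, could look like this. The helper name event_window and its before/after arguments are hypothetical and not part of any package:

```r
# Toy data from the question (events at rows 4 and 10)
df <- data.frame(
  PRICE = c(1.5, 1.7, 1.65, 1.2, 0.9, 1.7, 1.55, 1.1, 1.2, 1.14, 0.9),
  EVENT = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L)
)

# Hypothetical helper: one column per event, one row per relative day
event_window <- function(price, event, before = 2, after = 1) {
  offsets <- seq(-before, after)
  event_rows <- which(event == 1)
  # matrix of row numbers: offsets down the rows, events across the columns
  idx <- outer(offsets, event_rows, "+")
  # row numbers outside the data become NA, so the lookup returns NA there
  idx[idx < 1 | idx > length(price)] <- NA
  matrix(price[idx],
         nrow = length(offsets),
         dimnames = list(as.character(offsets),
                         paste0("event_", seq_along(event_rows))))
}

win <- event_window(df$PRICE, df$EVENT, before = 2, after = 1)
print(win)
```

For the 91-day span in the question, the call would be event_window(df$PRICE, df$EVENT, before = 30, after = 60). Using outer() builds all row numbers in one step instead of looping over events, which should matter little at 300,000 rows but keeps the code short.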
mtoto answered Nov 07 '22 19:11


For the sake of completeness, here's a base R solution:

# example data
set.seed(123)
df <- data.frame(price = rnorm(100), event = rbinom(100, 1, 0.05))

# create a vector of unique row positions: each event row plus 2 positions before and 1 after
offset <- unique(as.vector(sapply(which(df$event == 1), function(x) (x-2):(x+1))))

# subset data, keeping only positions that exist in df
df[offset[offset > 0 & offset <= nrow(df)],]


         price event
1  -0.56047565     0
2  -0.23017749     1
3   1.55870831     0
20 -0.47279141     0
21 -1.06782371     0
22 -0.21797491     1
23 -1.02600445     0
46 -1.12310858     0
47 -0.40288484     0
48 -0.46665535     1
49  0.77996512     1
50 -0.08336907     0
62 -0.50232345     0
63 -0.33320738     0
64 -1.01857538     1
65 -1.07179123     0
75 -0.68800862     0
76  1.02557137     0
77 -0.28477301     1
78 -1.22071771     0
95  1.36065245     0
96 -0.60025959     0
97  2.18733299     1
98  1.53261063     0

Edit: I didn't see the expected output at first; see @mtoto's answer for that.
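
The subset in this answer loses track of which event each row belongs to, and overlapping windows are merged by unique(). As a rough sketch, assuming a long table is acceptable, each row can be labeled with an event number and a relative day instead. The column names rel_day, event_id and row are made up for illustration:

```r
# same example data as in this answer
set.seed(123)
df <- data.frame(price = rnorm(100), event = rbinom(100, 1, 0.05))

before <- 2
after <- 1
event_rows <- which(df$event == 1)
offsets <- seq(-before, after)

# every (relative day, event) pair; rows repeat when windows overlap
long <- expand.grid(rel_day = offsets, event_id = seq_along(event_rows))
# absolute row number in df for each pair
long$row <- event_rows[long$event_id] + long$rel_day
# drop days that fall before the first or after the last row of df
long <- long[long$row >= 1 & long$row <= nrow(df), ]
long$price <- df$price[long$row]

head(long)
```

From this table, averaging by rel_day (for example with tapply(long$price, long$rel_day, mean)) gives the mean price path around an event.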

LAP answered Nov 07 '22 21:11