
Find the index position of the first non-NA value in an R vector?

Tags:

r

I have a vector that begins with a run of NAs, followed by data. The peculiarity of my data is that the first n non-NA values are probably unreliable, so I would like to replace them with NA.

For example, if I have a vector of length 20, and non-NAs start at index position 4:

> z
 [1]          NA          NA          NA -1.64801942 -0.57209233  0.65137286  0.13324344 -2.28339326
 [9]  1.29968050  0.10420776  0.54140323  0.64418164 -1.00949072 -1.16504423  1.33588892  1.63253646
[17]  2.41181291  0.38499825 -0.04869589  0.04798073

I would like to remove the first 3 non-NA values, which I believe to be unreliable, to give this:

> z
 [1]          NA          NA          NA          NA          NA          NA  0.13324344 -2.28339326
 [9]  1.29968050  0.10420776  0.54140323  0.64418164 -1.00949072 -1.16504423  1.33588892  1.63253646
[17]  2.41181291  0.38499825 -0.04869589  0.04798073

Of course, I need a general solution, since I never know where the first non-NA value starts. How would I go about doing this? That is, how do I find the index position of the first non-NA value?

For completeness, my data is actually arranged in a data frame with many of these vectors as columns, and each column can have a different non-NA starting position. Also, once the data starts, there may be sporadic NAs further down, which prevents me from simply counting the NAs as a solution.

Thomas Browne asked Jul 24 '11 18:07




1 Answer

Use a combination of is.na and which to find the non-NA index locations.

NonNAindex <- which(!is.na(z))
firstNonNA <- min(NonNAindex)

# set the next 3 observations to NA
is.na(z) <- seq(firstNonNA, length.out=3)
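
The question also mentions a data frame whose columns start at different positions, so the same idea can be wrapped in a function and applied column by column. The following is only a sketch: blank_leading, its parameter n, and the sample data are hypothetical names and values chosen for illustration, not part of the original answer. It also guards two edge cases the snippet above does not handle: a column that is entirely NA, and a run that would extend past the end of the vector.

```r
# Hypothetical helper: set the n positions starting at the first
# non-NA value of x to NA, and return the modified vector.
blank_leading <- function(x, n = 3) {
  first <- which(!is.na(x))[1]            # NA if x has no non-NA values
  if (is.na(first) || n < 1) return(x)    # nothing to blank
  last <- min(first + n - 1, length(x))   # do not run past the end of x
  x[first:last] <- NA
  x
}

# Made-up sample vector: data starts at position 4
z <- c(NA, NA, NA, 1.5, 2.5, NA, 3.5, 4.5)
z_clean <- blank_leading(z, n = 3)

# Made-up data frame: each column starts at a different position
df <- data.frame(a = c(NA, 1, 2, 3, 4, 5),
                 b = c(NA, NA, NA, 6, 7, 8))
df[] <- lapply(df, blank_leading, n = 2)  # df[] keeps the data frame structure
```

Like the snippet above, this blanks n consecutive positions, so a sporadic NA inside that window counts towards n.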
Joshua Ulrich answered Sep 25 '22 08:09
