Replace NA with values in another row of same column for each group in R

Tags:

r

I need to replace the NAs in a given column with the non-NA value found in another row of the same group.

Let's say the sample data looks like this:

id   name
 1     a
 1     NA
 2     b
 3     NA
 3     c
 3     NA

Desired output:

id   name
 1     a
 1     a
 2     b
 3     c
 3     c
 3     c

Is there a way to do this in R?

asked Aug 07 '15 13:08

Dheeraj Singh

2 Answers

Here is an approach using dplyr. From the data frame x we group by id and replace NA with the relevant values. I am assuming one unique value of name per id.

x <- data.frame(id = c(1, 1, 2, rep(3, 3)),
                name = c("a", NA, "b", NA, "c", NA),
                stringsAsFactors = FALSE)

library(dplyr)

# Within each id, overwrite name with that group's single non-NA value
x %>%
  group_by(id) %>%
  mutate(name = unique(name[!is.na(name)]))

# Source: local data frame [6 x 2]
# Groups: id

#  id name
#1  1    a
#2  1    a
#3  2    b
#4  3    c
#5  3    c
#6  3    c
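
The unique() call above relies on every id having exactly one distinct non-NA name. Here is a minimal sketch of a more forgiving variant, assuming it is acceptable to take the first non-NA value per group and to leave a group with no value at all as NA. The extra id 4 row is hypothetical, added only to show that case.

```r
library(dplyr)

# Sample data plus a hypothetical id = 4 whose only name is NA
x <- data.frame(id = c(1, 1, 2, 3, 3, 3, 4),
                name = c("a", NA, "b", NA, "c", NA, NA),
                stringsAsFactors = FALSE)

filled <- x %>%
  group_by(id) %>%
  # [1L] on an empty vector yields NA, so all-NA groups stay NA
  mutate(name = name[!is.na(name)][1L]) %>%
  ungroup()
```

If an id really can carry several different names, picking the first one silently hides that, so it may be worth checking the data before filling.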

answered Nov 02 '22 06:11

Whitebeard

We can use data.table to do this. Convert the 'data.frame' to a 'data.table' by reference with setDT(df1). Then, grouped by 'id', we replace 'name' with the first non-NA value of 'name' in that group.

library(data.table)  # v1.9.5+
# := updates 'name' by reference; [1L] keeps the first non-NA value per 'id'
setDT(df1)[, name := name[!is.na(name)][1L], by = id]
df1
#   id name
#1:  1    a
#2:  1    a
#3:  2    b
#4:  3    c
#5:  3    c
#6:  3    c

NOTE: Here I assumed that there is only a single unique non-NA value within each 'id' group.
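
That assumption can be checked before filling. The following is only a sketch in base R, using the sample df1 from this answer. The names n_non_na_values and ambiguous_ids are made up for illustration.

```r
df1 <- data.frame(id = c(1L, 1L, 2L, 3L, 3L, 3L),
                  name = c("a", NA, "b", NA, "c", NA),
                  stringsAsFactors = FALSE)

# Number of distinct non-NA names within each id
n_non_na_values <- tapply(df1$name, df1$id,
                          function(v) length(unique(v[!is.na(v)])))

# ids with more than one candidate value (empty for this sample)
ambiguous_ids <- names(n_non_na_values)[n_non_na_values > 1]
```

An empty ambiguous_ids means that taking the first non-NA value per group loses nothing.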

Another option is to join the dataset with its unique rows by 'id', taken after we order by 'id' and 'name'. Ordering puts the NA rows last within each 'id', so the row kept by unique() carries the non-NA 'name'.

 setDT(df1)
 df1[unique(df1[order(id, name)], by='id'), on='id', name:= i.name][]
 #   id name
 #1:  1    a
 #2:  1    a
 #3:  2    b
 #4:  3    c
 #5:  3    c
 #6:  3    c

NOTE: The on argument is only available in the devel version of data.table. Instructions to install the devel version are here

data

df1 <- structure(list(id = c(1L, 1L, 2L, 3L, 3L, 3L), name = c("a", 
NA, "b", NA, "c", NA)), .Names = c("id", "name"),
class = "data.frame",    row.names = c(NA, -6L))
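
For readers who would rather not load a package, the same per-group fill can be sketched in base R with ave(). This is an alternative illustration, not part of the data.table approach above, and it makes the same assumption of one non-NA value per 'id'. The helper name fill_group is made up.

```r
df1 <- data.frame(id = c(1L, 1L, 2L, 3L, 3L, 3L),
                  name = c("a", NA, "b", NA, "c", NA),
                  stringsAsFactors = FALSE)

# Repeat the first non-NA value across the whole group
fill_group <- function(v) rep(v[!is.na(v)][1L], length(v))

# ave() applies fill_group within each id and keeps the original row order
df1$name <- ave(df1$name, df1$id, FUN = fill_group)
```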

answered Nov 02 '22 08:11

akrun
