Applying a function across columns by extracting similar column names

Tags:

r

My data looks like:

[[1]]
        date germany france germany_mean france_mean germany_sd france_sd
1 2016-01-01      17     25     21.29429    48.57103   30.03026  47.05169

What I am trying to do is apply the following calculation to every data frame in the list using map:

germany_calc = (germany - germany_mean) / germany_sd 
france_calc = (france - france_mean) / france_sd

However, the number of columns can change: here there are two countries, but another list could have 1, 3, or N. The columns always follow the same structure, that is:

"country1", "country2", ... , "countryN", "country1_mean", "country2_mean", ... , "countryN_mean", "country1_sd", "country2_sd", ... , "countryN_sd".

Expected Output (for the first list):

Germany: -0.1429988 =  (17 - 21.29429) / 30.03026 
France: -0.5009603 = (25 - 48.57103) / 47.05169

EDIT: Apologies - expected output:

-0.1429988
-0.5009603

Function:

Scale_Me <- function(x){
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

Data:

    my_list <- list(structure(list(date = structure(16801, class = "Date"), 
    germany = 17, france = 25, germany_mean = 21.2942922374429, 
    france_mean = 48.5710301846855, germany_sd = 30.030258443028, 
    france_sd = 47.0516928425878), class = "data.frame", row.names = c(NA, 
-1L)), structure(list(date = structure(16802, class = "Date"), 
    germany = 9, france = 29, germany_mean = 21.2993150684932, 
    france_mean = 48.5605316914534, germany_sd = 30.0286190461173, 
    france_sd = 47.0543871206842), class = "data.frame", row.names = c(NA, 
-1L)), structure(list(date = structure(16803, class = "Date"), 
    germany = 8, france = 18, germany_mean = 21.2947488584475, 
    france_mean = 48.551889593794, germany_sd = 30.0297291333284, 
    france_sd = 47.0562416513092), class = "data.frame", row.names = c(NA, 
-1L)), structure(list(date = structure(16804, class = "Date"), 
    germany = 3, france = 11, germany_mean = 21.2778538812785, 
    france_mean = 48.5382545766386, germany_sd = 30.0267943793948, 
    france_sd = 47.0607680244109), class = "data.frame", row.names = c(NA, 
-1L)), structure(list(date = structure(16805, class = "Date"), 
    germany = 4, france = 13, germany_mean = 21.2614155251142, 
    france_mean = 48.5214531240057, germany_sd = 30.0269420596686, 
    france_sd = 47.0676011750263), class = "data.frame", row.names = c(NA, 
-1L)), structure(list(date = structure(16806, class = "Date"), 
    germany = 4, france = 9, germany_mean = 21.253196347032, 
    france_mean = 48.5055948249362, germany_sd = 30.0292032528186, 
    france_sd = 47.0737183354519), class = "data.frame", row.names = c(NA, 
-1L)))


asked Nov 09 '19 13:11

user113156

1 Answer

Why not just rbind the thing?

with(do.call(rbind, my_list), 
     cbind(germany=(germany - germany_mean) / germany_sd,
           france=(france - france_mean) / france_sd))
#         germany     france
# [1,] -0.1429988 -0.5009603
# [2,] -0.4095864 -0.4157005
# [3,] -0.4427196 -0.6492633
# [4,] -0.6087181 -0.7976550
# [5,] -0.5748642 -0.7546901
# [6,] -0.5745473 -0.8392283
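
If the set of countries varies, the names hardcoded above could instead be derived from the column names. The following is only a minimal sketch, assuming every country column has matching `_mean` and `_sd` columns as described in the question. `scale_by_suffix` and `example_df` are hypothetical names introduced here for illustration, and the example values are rounded from the first list element.

```r
# Hypothetical helper (not part of the original answer): scale each country
# column by its precomputed <country>_mean and <country>_sd columns.
scale_by_suffix <- function(df) {
  # Recover the country names from the columns ending in "_mean".
  countries <- sub("_mean$", "", grep("_mean$", names(df), value = TRUE))
  # Element-wise arithmetic on equally sized data frames; the result keeps
  # the column names of the first operand, i.e. the country names.
  out <- (df[countries] - df[paste0(countries, "_mean")]) /
    df[paste0(countries, "_sd")]
  names(out) <- countries
  out
}

# Stand-in for one element of my_list (values rounded from the question).
example_df <- data.frame(
  germany = 17, france = 25,
  germany_mean = 21.29429, france_mean = 48.57103,
  germany_sd = 30.03026, france_sd = 47.05169
)

scale_by_suffix(example_df)
```

With the question's data, `lapply(my_list, scale_by_suffix)` should give one scaled data frame per list element. If all elements share the same countries, wrapping that in `do.call(rbind, ...)` stacks them as in the output above. If the countries differ between elements, keeping the result as a list is probably the safer choice.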


answered Oct 20 '22 19:10

jay.sf
