pivot_longer with names_pattern [duplicate]

Tags: r, dplyr, pivot

I am quite new to programming, but I need to write reproducible scripts for large datasets. I hope I have provided a sufficient example.

I have a data frame like this (with 8 more "Nutrients", 5 more "trade elements", and many more years):

Year<-c(1961,1962)
Total_Energy_kcal_Production<-c(5,8)
Total_Energy_kcal_Import<-c(6,1)
Total_Ca_g_Production<-c(3,4)
Total_Ca_g_Import<-c(3,8)
df<-cbind(Year,Total_Energy_kcal_Production, Total_Energy_kcal_Import, Total_Ca_g_Production, Total_Ca_g_Import)

It looks like this:

Year  Total_Energy_kcal_Production   Total_Energy_kcal_Import   Total_Ca_g_Production    Total_Ca_g_Import 
1961   5                              6                          3                       3
1962   8                              1                          4                       8

and I want it to look like this:

Year  Nutrient            Production        Import
1961  Total_Energy_kcal   5                 6
1962  Total_Energy_kcal   8                 1
1961  Total_Ca_g          3                 3 
1962  Total_Ca_g          4                 8

I tried a lot with pivot_longer and names_pattern. I thought this would work, but I do not fully understand the arguments:

df_piv<-df%>%
  pivot_longer(cols = -Year, names_to = "Nutrient", 
              names_pattern = ".*(?=_)")

I get an error message that I cannot interpret:

Error: Can't select within an unnamed vector.
Asked by Annika, Apr 28 '26 11:04


1 Answer

You can provide the names_pattern regex as:

tidyr::pivot_longer(df, 
                    cols = -Year, 
                    names_to = c('Nutrient', '.value'),
                    names_pattern = '(.*)_(\\w+)')

#   Year Nutrient          Production Import
#  <dbl> <chr>                  <dbl>  <dbl>
#1  1961 Total_Energy_kcal          5      6
#2  1961 Total_Ca_g                 3      3
#3  1962 Total_Energy_kcal          8      1
#4  1962 Total_Ca_g                 4      8

This puts everything up to the last underscore into the Nutrient column. The rest of the name (Production or Import) is kept as a column name, which is what the special ".value" entry in names_to requests.
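
To see how the pattern splits each name, here is a minimal base R sketch. It uses sub() instead of tidyr, so it only approximates what pivot_longer does internally, and the variable names (cols, pattern, nutrient, value_col) are made up for illustration.

```r
# Column names from the question (everything except Year)
cols <- c("Total_Energy_kcal_Production", "Total_Energy_kcal_Import",
          "Total_Ca_g_Production", "Total_Ca_g_Import")

# Same pattern as in the pivot_longer call above
pattern <- "(.*)_(\\w+)"

# First capture group: '.*' is greedy, so it runs up to the LAST underscore
nutrient <- sub(pattern, "\\1", cols, perl = TRUE)

# Second capture group: whatever follows that last underscore
value_col <- sub(pattern, "\\2", cols, perl = TRUE)

print(data.frame(column = cols, Nutrient = nutrient, value_col = value_col))
```

The first group is what ends up in the Nutrient column, and the second group supplies the new column names. The attempt in the question, ".*(?=_)", contains no capture group in parentheses, so there is nothing for names_to to map to.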

data

cbind creates a matrix, not a data frame, and pivot_longer expects a data frame. Use data.frame to create the data:

df<-data.frame(Year,Total_Energy_kcal_Production,Total_Energy_kcal_Import, 
               Total_Ca_g_Production, Total_Ca_g_Import)
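
A quick way to check the difference, sketched with two of the columns from the question (the names m, d and d2 are only illustrative):

```r
Year <- c(1961, 1962)
Total_Ca_g_Production <- c(3, 4)
Total_Ca_g_Import <- c(3, 8)

# cbind on plain vectors returns a numeric matrix
m <- cbind(Year, Total_Ca_g_Production, Total_Ca_g_Import)
# data.frame returns a data frame with one column per vector
d <- data.frame(Year, Total_Ca_g_Production, Total_Ca_g_Import)

print(class(m))          # matrix class, not a data frame
print(is.data.frame(m))
print(is.data.frame(d))

# A matrix that already exists can be converted instead of rebuilt
d2 <- as.data.frame(m)
print(names(d2))
```

If the data already arrives as a matrix, as.data.frame(m) should be enough before calling pivot_longer.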
Answered by Ronak Shah, May 01 '26 00:05


