Extract the text within each single quote using R

Tags:

r

I have a dataset with two columns: the first is the month, and the second lists all the firms that applied for tax compensation in that month. Each entry in the second column looks like this:

['firm A','firm B',...]

I want to extract the names of the first three firms that submitted an application in each month and create three new columns from them, one firm per column.

I tried commands like

gsub("^'(.*)'.*", "\\", df)

but it didn't work properly.

Ḥasan asked Oct 11 '25 12:10


1 Answer

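Using base R: remove the brackets and single quotes with gsub(), then let read.csv() split the remaining comma-separated text into columns.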

cbind(df[1], read.csv(text = gsub("[]'[]", "",
  df$tax_compensation), header = FALSE, col.names = paste0("firm_", 1:4)))

-output

  month firm_1 firm_2 firm_3 firm_4
1   Jan firm A firm B firm C firm D
2   Feb firm B firm C firm A firm D
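
To see why this works, it may help to look at the intermediate string for a single entry. The sketch below uses the same pattern as the answer; the variable names `x`, `cleaned` and `parsed` are illustrative and not part of the original code.

```r
# One entry in the same format as the tax_compensation column.
x <- "['firm A','firm B','firm C','firm D']"

# "[]'[]" is a bracket expression matching "]", "'" or "[";
# replacing each match with "" leaves a plain comma-separated string.
cleaned <- gsub("[]'[]", "", x)

# read.csv(text = ...) parses that string as one CSV row without a header,
# so each firm name ends up in its own column (V1, V2, ...).
parsed <- read.csv(text = cleaned, header = FALSE)
```

Because the firm names contain no commas here, splitting on commas is safe; names that themselves contain commas would need a different approach.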

data

df <- structure(list(month = c("Jan", "Feb"),
  tax_compensation = c("['firm A','firm B','firm C','firm D']",
    "['firm B','firm C','firm A','firm D']")),
  class = "data.frame", row.names = c(NA, -2L))
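
If only the first three firms are wanted, as in the question, one possible variant is sketched below. It assumes that some rows might list fewer than three firms (those are padded with NA) and it extracts the quoted substrings directly instead of parsing CSV. The names `quoted`, `first_three` and `result` are illustrative.

```r
# Example data with the same shape as `df` above.
df <- data.frame(
  month = c("Jan", "Feb"),
  tax_compensation = c("['firm A','firm B','firm C','firm D']",
                       "['firm B','firm C','firm A','firm D']"),
  stringsAsFactors = FALSE
)

# Extract every substring enclosed in single quotes; one character vector per row.
quoted <- regmatches(df$tax_compensation,
                     gregexpr("'[^']*'", df$tax_compensation))

# Drop the quotes and keep the first three names per row.
# Indexing with [1:3] yields NA where a row has fewer than three firms.
first_three <- t(vapply(quoted, function(x) {
  gsub("'", "", x, fixed = TRUE)[1:3]
}, character(3)))
colnames(first_three) <- paste0("firm_", 1:3)

# Combine the month column with the three new firm columns.
result <- data.frame(month = df$month, first_three, stringsAsFactors = FALSE)
```

Matching on the quotes rather than splitting on commas also keeps firm names intact if they happen to contain a comma.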
akrun answered Oct 14 '25 07:10