I have a CSV data file that is populated with a lot of raw data, as follows:
data.frame(
id=1:4,
data=c(
"it's a programming language",
"this data is JUNK",
"refer www.google.com",
"check for more information")
)
I need to process this data, checking every row for an ALL CAPS sequence and populating a new column with a 0/1 entry.
The output file should look like this:
id data all_caps
1 it's a programming language 0
2 this data is JUNK 1
3 refer www.google.com 0
4 check for more information 0
How can I achieve this in R? I have been searching for a while now, but I have not found anything useful on processing each row.
Assuming your data.frame is called test:
test$all_caps <- grepl("[A-Z]{2,}",test$data)
id data all_caps
1 1 it's a programming language FALSE
2 2 this data is JUNK TRUE
3 3 refer www.google.com FALSE
4 4 check for more information FALSE
You can turn the TRUE/FALSE values into 0s and 1s by calling as.numeric:
test$all_caps <- as.numeric(grepl("[A-Z]{2,}",test$data))
id data all_caps
1 1 it's a programming language 0
2 2 this data is JUNK 1
3 3 refer www.google.com 0
4 4 check for more information 0
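One thing to watch: "[A-Z]{2,}" matches two or more consecutive capitals anywhere in the string, so a mixed-case word would also be flagged. If you only want whole words in capitals, a word-boundary pattern is one option. The sketch below is only an illustration under a few assumptions: row 5 is a made-up example that is not in the question's data, perl = TRUE is my own choice to keep [A-Z] limited to ASCII capitals, and a temporary file stands in for your real CSV path.

```r
# Data from the question, plus one hypothetical mixed-case row (id 5).
test <- data.frame(
  id = 1:5,
  data = c(
    "it's a programming language",
    "this data is JUNK",
    "refer www.google.com",
    "check for more information",
    "visit McDONALDS today"
  ),
  stringsAsFactors = FALSE
)

# Pattern from the answer: two or more consecutive capitals anywhere.
loose <- as.integer(grepl("[A-Z]{2,}", test$data, perl = TRUE))

# Stricter variant (an assumption about the intent): a whole word in capitals.
strict <- as.integer(grepl("\\b[A-Z]{2,}\\b", test$data, perl = TRUE))

test$all_caps <- strict

# Round trip through a CSV file; the temporary file is a placeholder path.
out_file <- tempfile(fileext = ".csv")
write.csv(test, out_file, row.names = FALSE)
result <- read.csv(out_file, stringsAsFactors = FALSE)
```

Here loose flags both "JUNK" and "McDONALDS", while strict flags only "JUNK". as.integer gives the same 0/1 values as as.numeric but stores them as integers. For your actual data, replace the data.frame with read.csv on your own input file.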