Deleting specific rows from a data frame

Tags:

r

I am working with some US govt data which has a lengthy list of cities and zip codes. After some work, the data is in the following format.

dat1 = data.frame(keyword=c("Bremen", "Brent", "Centreville, AL", "Chelsea, AL", "Bailytown, Alabama", "Calera, Alabama",
              "54023", "54024"), tag=c(rep("AlabamCity",2), rep("AlabamaCityST",2), rep("AlabamaCityState",2), rep("AlabamaZipCode",2)))
dat1

However, some keywords are tagged incorrectly. In the example below, two zip codes are tagged 'AlabamaCityST' and 'AlabamaCityState' instead of 'AlabamaZipCode'. For some reason, the original data set from the government has several zip codes that aren't grouped with the other zip codes.

dat2 = data.frame(keyword=c("Bremen", "Brent", "50143", "Chelsea, AL", "Bailytown, Alabama", "52348",
              "54023", "54024"), tag=c(rep("AlabamCity",2), rep("AlabamaCityST",2), rep("AlabamaCityState",2), rep("AlabamaZipCode",2)))
dat2

I want to know how to go through the entire list of keywords and delete every row whose keyword is numeric (the values are actually stored as character strings) and whose tag is not 'AlabamaZipCode'. The previous data should end up looking like this:

dat3 = data.frame(keyword=c("Bremen", "Brent", "Chelsea, AL", "Bailytown, Alabama", "54023", "54024"), 
          tag=c(rep("AlabamCity",2), rep("AlabamaCityST",1), rep("AlabamaCityState",1), rep("AlabamaZipCode",2)))
dat3

The challenge is that there are some numeric values I want to keep and others I want to delete. Can anyone help?


asked Jul 06 '11 19:07

ATMathew

1 Answer

I think two grepl expressions should do the trick:

> dat2[ !( grepl("City", dat2$tag) &  grepl("^\\d", dat2$keyword) ) , ]
             keyword              tag
1             Bremen       AlabamCity
2              Brent       AlabamCity
4        Chelsea, AL    AlabamaCityST
5 Bailytown, Alabama AlabamaCityState
7              54023   AlabamaZipCode
8              54024   AlabamaZipCode

This eliminates the rows where keyword starts with a digit and tag contains "City".
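If other non-zip tags might not contain "City", a stricter variant may be safer. The following is only a sketch, assuming a numeric keyword means a string made up entirely of digits and that 'AlabamaZipCode' is the only tag allowed to keep one. The names is_numeric_kw, is_zip_tag and cleaned are made up for illustration.

```r
# Rebuild the example data from the question
dat2 <- data.frame(
  keyword = c("Bremen", "Brent", "50143", "Chelsea, AL",
              "Bailytown, Alabama", "52348", "54023", "54024"),
  tag = c(rep("AlabamCity", 2), rep("AlabamaCityST", 2),
          rep("AlabamaCityState", 2), rep("AlabamaZipCode", 2)),
  stringsAsFactors = FALSE
)

# TRUE where the keyword consists only of digits (assumed meaning of "numeric")
is_numeric_kw <- grepl("^[0-9]+$", dat2$keyword)

# TRUE where the row carries the zip code tag
is_zip_tag <- dat2$tag == "AlabamaZipCode"

# Drop rows that are all digits but not tagged as zip codes
cleaned <- dat2[!(is_numeric_kw & !is_zip_tag), ]
cleaned
```

On the example data this keeps the same six rows as the grepl("City", ...) version. It differs only if a mis-tagged zip code sits under a tag without "City" in its name, or if a city keyword merely starts with a digit.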


answered Sep 19 '22 15:09

IRTFM
