I am working on code that takes a set of SQL queries and reduces each query to just its table name.
For example, I have the following queries:
delete from pear.admin where jdjdj
delete from pear.admin_user where blah
delete from ss_pear.admin_user where blah 
I am trying to write a regex that matches all of these patterns. Would that mean creating a list of multiple patterns first and then passing it through str_extract?
I used a regex, but it gives me the following output:
delete from pear.admin 
How do I get rid of the words before the table name? I tried (.*), but nothing seems to work. This is the call I used:
sql_data$table_name <- 
str_extract(sql_data$Full.Sql, "[^_]+\\.[\\w]+\\_[\\w]+")
I am only familiar with the base R regex functions, so here is an option using sub:
queries <- c("delete from pear.admin where jdjdj",
             "delete from pear.admin_user where blah",
             "delete from ss_pear.admin_user where blah")
table_names <- sapply(queries, function(x) {
    sub(".*\\bfrom\\s+(\\S+).*", "\\1", x)
})
table_names
           1                    2                    3 
"pear.admin"    "pear.admin_user" "ss_pear.admin_user" 
This should work reasonably reliably, since, as far as I know, whatever immediately follows the keyword FROM must be a table name.
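Since the question asks about stringr, here is a rough sketch of the same idea using str_match instead of sub. It assumes the stringr package is installed; the name from_pattern is only illustrative. One difference worth knowing: sub returns its input unchanged when the pattern does not match, whereas str_match gives NA, so queries without a FROM clause are easier to spot.

```r
library(stringr)

queries <- c("delete from pear.admin where jdjdj",
             "delete from pear.admin_user where blah",
             "delete from ss_pear.admin_user where blah")

# Capture the first run of non-whitespace characters after the word "from".
# ignore_case = TRUE also accepts FROM / From (an assumption about the data).
from_pattern <- regex("\\bfrom\\s+(\\S+)", ignore_case = TRUE)

# str_match returns a matrix: column 1 is the full match,
# column 2 is the first capture group, i.e. the table name.
table_names <- str_match(queries, from_pattern)[, 2]
table_names
```

To fill the column from the question, the same call could presumably be applied to sql_data$Full.Sql in place of queries.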