I am working on code that takes a set of SQL queries and reduces each one to just its table name.
For example, I have the following queries:
delete from pear.admin where jdjdj
delete from pear.admin_user where blah
delete from ss_pear.admin_user where blah
I am trying to write a regex that matches all of these patterns. Would I do that by creating a list of multiple patterns first and then passing it through str_extract?
I used a regex, but it gives me the following output:
delete from pear.admin
How do I get rid of the words before the table name? I tried (.*), but nothing seems to work.
sql_data$table_name <-
str_extract(sql_data$Full.Sql, "[^_]+\\.[\\w]+\\_[\\w]+")
I am only familiar with the base R regex functions, so here is an option using sub:
queries <- c("delete from pear.admin where jdjdj",
             "delete from pear.admin_user where blah",
             "delete from ss_pear.admin_user where blah")
# sub() is vectorised, so no sapply() is needed.
# Capture the first run of non-whitespace after "from" and keep only that.
table_names <- sub(".*\\bfrom\\s+(\\S+).*", "\\1", queries)
table_names
[1] "pear.admin"         "pear.admin_user"    "ss_pear.admin_user"
This should perform at least somewhat reliably since, as far as I know, whatever immediately follows the keyword FROM must be a table name.
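If the real data is messier than the three examples, the same idea can be wrapped in a small helper. This is only a sketch under a few assumptions of mine: extract_table is a hypothetical name, keywords may be upper case, a query may contain more than one FROM (for example in a subquery), and a query with no FROM should give NA rather than the untouched string. It uses perl = TRUE so that the lazy .*? stops at the first FROM instead of the last one.

```r
# Hypothetical helper, not from the original answer: returns whatever follows
# the first FROM in each query, or NA when there is no FROM at all.
extract_table <- function(sql) {
  # (?s) lets "." match newlines so multi-line queries still match;
  # ".*?" is lazy, so the first "from" wins rather than the last one.
  pattern <- "(?s)^.*?\\bfrom\\s+(\\S+).*$"
  matched <- grepl(pattern, sql, ignore.case = TRUE, perl = TRUE)
  out <- sub(pattern, "\\1", sql, ignore.case = TRUE, perl = TRUE)
  # sub() returns its input unchanged when nothing matches, so blank those out.
  out[!matched] <- NA_character_
  out
}

examples <- c("delete from pear.admin where jdjdj",
              "DELETE FROM ss_pear.admin_user WHERE blah",
              "delete from a.b where id in (select id from c.d)",
              "truncate table pear.admin")
result <- extract_table(examples)
print(result)
```

With the data frame from the question, the call would presumably look like sql_data$table_name <- extract_table(sql_data$Full.Sql). It still assumes the table name directly follows FROM, so a query such as "select * from (select ...)" would return the opening parenthesis and the text attached to it rather than a table name.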