Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

php & mysql: use a table for a filter list for another table

I have two mysql tables. One is a bad words list, the other is the table to compare against the bad words list. Essentially I want to filter out and return a list of rows with domains that do not have ANY occurrence of a word in the bad words table. A few sample tables:

bad words list

+----------+------------------+
| id       | words            |
+----------+------------------+
| 1        | porn             |
| 2        | sex              |
+----------+------------------+

table of domains to compare

+----------+------------------+
| id       | domain           |
+----------+------------------+
| 56       | google.com       |
| 57       | sex.com          |
+----------+------------------+

I want to return results such as

+----------+------------------+
| id       | domain           |
+----------+------------------+
| 56       | google.com       |
+----------+------------------+

A thing to note is that these tables have nothing in common, so I'm not even sure this is the best method. I was using a comparison function in PHP but that seemed to be way too slow over hundreds of thousands of rows to search.

like image 565
Jared Eitnier Avatar asked Dec 27 '12 14:12

Jared Eitnier


People also ask

What PHP stand for?

PHP, originally derived from Personal Home Page Tools, now stands for PHP: Hypertext Preprocessor, which the PHP FAQ describes as a "recursive acronym." PHP executes on the server, while a comparable alternative, JavaScript, executes on the client.

Is PHP a coding?

PHP is a server scripting language, and a powerful tool for making dynamic and interactive Web pages.

Is PHP coding or programming?

PHP is an open-source, server-side programming language that can be used to create websites, applications, customer relationship management systems and more. It is a widely-used general-purpose language that can be embedded into HTML.

Which is better Python or PHP?

Python is better than PHP in long term project. PHP has low learning curve, it is easy to get started with PHP. Compare to PHP Python has lower number of Frameworks. Popular ones are DJango, Flask.


2 Answers

It is possible to get from mysql. like this:

SELECT
    d.*
FROM 
    domains d 
LEFT JOIN
    words w ON(d.domain LIKE CONCAT('%',w.word,'%') ) 
GROUP BY
    d.domain
HAVING
    COUNT(w.id) < 1

but it is not optimal and will get slower and slower with more records in both tables.

like image 147
dev-null-dweller Avatar answered Sep 28 '22 11:09

dev-null-dweller


Data like this typically needs to be pre-calculated at insertion time rather than at fetch time. You should add a column to Domains something like "bad_words boolean default null".

null would mean "don't know" which in some context could be interpretted as "unsafe to show". false means "no bad words" and true means "contains bad words".

Everytime the list of bad words is updated all columns are reset to null and some background work will start to process them again. Probably in another language than sql.

like image 22
Andreas Wederbrand Avatar answered Sep 28 '22 12:09

Andreas Wederbrand