Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get Count for large tables?

Sample Table:

+----+-------+-------+-------+-------+-------+---------------+
| id | col1  | col2  | col3  | col4  | col5  | modifiedTime  |
+----+-------+-------+-------+-------+-------+---------------+
|  1 | temp1 | temp2 | temp3 | temp4 | temp5 | 1554459626708 |
+----+-------+-------+-------+-------+-------+---------------+ 

above table has 50 million records

  1. (col1, col2, col3, col4, col5 these are VARCHAR columns)
  2. (id is PK)
  3. (modifiedTime)

Every column is indexed

For Ex: I have two tabs in my website.

FirstTab - I print the count of above table with following criteria [col1 like "value1%" and col2 like "value2%"]

SeocndTab - I print the count of above table with following criteria [col3 like "value3%"]


As I have 50 million records, the count with those criteria takes too much time to get the result.

Note: I would change records data(rows in table) sometime. Insert new rows. Delete not needed records.

I need a feasible solution instead of querying the whole table. Ex: like caching the older count. Is anything like this possible.

like image 949
vinieth Avatar asked Jan 26 '23 12:01

vinieth


1 Answers

While I'm sure it's possible for MySQL, here's a solution for Postgres, using triggers.

Count is stored in another table, and there's a trigger on each insert/update/delete that checks if the new row meets the condition(s), and if it does, add 1 to the count. Another part of the trigger checks if the old row meets the condition(s), and if it does, subtracts 1.

Here's the basic code for the trigger that counts the rows with temp2 = '5':

CREATE OR REPLACE FUNCTION updateCount() RETURNS TRIGGER AS 
$func$
BEGIN
   IF TG_OP = 'INSERT' OR TG_OP = 'UPDATE' THEN
      EXECUTE 'UPDATE someTableCount SET cnt = cnt + 1 WHERE 1 = (SELECT 1 FROM (VALUES($1.*)) x(id, temp1, temp2, temp3) WHERE x.temp2 = ''5'')'
      USING NEW;
   END IF;
   IF TG_OP = 'DELETE' OR TG_OP = 'UPDATE' THEN
      EXECUTE 'UPDATE someTableCount SET cnt = cnt - 1 WHERE 1 = (SELECT 1 FROM (VALUES($1.*)) x(id, temp1, temp2, temp3) WHERE x.temp2 = ''5'')'
      USING OLD;
   END IF;
   RETURN new;
END
$func$ LANGUAGE plpgsql;

Here's a working example on dbfiddle.

You could of course modify the trigger code to have dynamic where expressions and store counts for each in the table like:

CREATE TABLE someTableCount
(
   whereExpr text,
   cnt INT
);

INSERT INTO someTableCount VALUES ('temp2 = ''5''', 0);

In the trigger you'd then loop through the conditions and update accordingly.

like image 82
MarcinJ Avatar answered Jan 29 '23 02:01

MarcinJ