How to make a table JOIN in ClickHouse faster?

Tags:

clickhouse

I have two tables:

event: id, os
params: id, sx, sy

The tables have a 1-to-1 relation by id. If I execute this query:

select count(*)
from
(select id from event where os like 'Android%')
inner join
(select id from params where sx >= 1024)
using id

it is very slow.

But if all the data is in one table:

select count(*) from event where sx >= 1024 and os like 'Android%'

the query runs very fast.

How can I use joins in ClickHouse effectively? Keeping all the data in one table is not convenient.

asked Nov 02 '16 12:11

Oleg Khamov

2 Answers

I ran into the same problems when joining two huge distributed tables. There are two main problems:

execution time
the memory limit for a query

What works for me is sharding the calculation by id % N using subqueries, then combining the results with UNION ALL.

SELECT count(*)
FROM
(
    SELECT 1
    FROM event
    WHERE id % 2 = 0 AND id IN
    (
        SELECT id
        FROM params
        WHERE id % 2 = 0 AND sx >= 1024
    )
    UNION ALL
    SELECT 2
    FROM event
    WHERE id % 2 = 1 AND id IN
    (
        SELECT id
        FROM params
        WHERE id % 2 = 1 AND sx >= 1024
    )
)

You can increase N in id % N (2 in the example) until you get the performance you need. If the tables use the Distributed engine, replace IN with GLOBAL IN.
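Writing the branches by hand gets tedious once N grows, so here is a minimal Python sketch that generates the same query text for any N. The function name build_sharded_count_query and its parameters are hypothetical, made up for illustration; it only builds a string and does not connect to ClickHouse. Like the query above, it filters only on sx, so add the os condition to the outer WHERE if you need it.

```python
def build_sharded_count_query(n_shards: int, use_global_in: bool = False) -> str:
    """Build the count query above, split into n_shards branches by id % n_shards.

    Hypothetical helper: table and column names (event, params, id, sx)
    are taken from the question; everything else is an assumption.
    """
    if n_shards < 1:
        raise ValueError("n_shards must be at least 1")

    # GLOBAL IN is what the answer suggests for Distributed tables.
    in_op = "GLOBAL IN" if use_global_in else "IN"

    branches = []
    for remainder in range(n_shards):
        # One branch per remainder, each scanning only its slice of ids.
        branches.append(
            "    SELECT 1\n"
            "    FROM event\n"
            f"    WHERE id % {n_shards} = {remainder} AND id {in_op}\n"
            "    (\n"
            "        SELECT id\n"
            "        FROM params\n"
            f"        WHERE id % {n_shards} = {remainder} AND sx >= 1024\n"
            "    )"
        )

    body = "\n    UNION ALL\n".join(branches)
    return f"SELECT count(*)\nFROM\n(\n{body}\n)"


if __name__ == "__main__":
    # Print the 4-way version; paste the output into clickhouse-client.
    print(build_sharded_count_query(4))
```

Start with a small N and raise it only while the query still hits the time or memory limit, since every extra branch scans both tables again.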

answered Sep 24 '22 14:09

Evgeniy Skomorokhov

You can rewrite the query with IN instead of JOIN, like this:

select count(*)
from event 
where os like 'Android%' 
AND id IN (select id from params where sx >= 1024)

answered Sep 23 '22 14:09

uYSIZfoz
