Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it possible to delete old records from clickhouse table?

Tags:

sql

clickhouse

As far as I know, clickhouse allows only inserting new data. But is it possible to delete block older then some period to avoid overflow of HDD?

like image 963
Stepan Yakovenko Avatar asked Sep 16 '18 14:09

Stepan Yakovenko


People also ask

How do I delete data from ClickHouse table?

Clickhouse doesn't have update/Delete feature like Mysql database. But we still can do delete by organising data in the partition.

Can you delete a record from table using view?

Yes, you can insert, update and delete a record in a view but there are some restrictions. Use the following procedure to create a sample to understand how to perform such tasks. Step 1: Create a schema of a table named "Employee" in your Database. Step 4: Select the data from the view.

What happens when you delete a record in a table?

The result of the DELETE statement is the removal of zero or more rows of a table, depending on how many rows satisfy the search condition specified in the WHERE clause. If you omit the WHERE clause from a DELETE statement, SQL removes all the rows from the table.


2 Answers

Altering data using Mutations

See the docs on Mutations feature https://clickhouse.yandex/docs/en/query_language/alter/#mutations.
The feature was implemented in Q3 2018.

Delete data

ALTER TABLE <table> DELETE WHERE <filter expression>

"Dirty" delete all

You always have to specify a filter expression. If you want to delete all the data through Mutation, specify something that's always true, eg.:

ALTER TABLE <table> DELETE WHERE 1=1

Update data

It's also possible to mutate (UPDATE) the similar way

ALTER TABLE <table> UPDATE column1 = expr1 [, ...] WHERE <filter expression>

Mind it's async

Please note that all commands above do not execute the data mutation directly (in sync). Instead they schedule ClickHouse Mutation that is executed independently (async) on background. That is the reason why ALTER TABLE syntax was chosen instead of typical SQL UPDATE/DELETE. You can check unfinished Mutations' progress via

SELECT *
FROM system.mutations
WHERE is_done = 0

...unless

you change mutations_sync settings to

  • 1 so it synchronously waits for current server
  • 2 so it waits for all replicas

Altering data without using Mutations

Theres's TRUNCATE TABLE statement with syntax as follows:

TRUNCATE TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]

This synchronously truncates the table. It will check for table size so won't allow you to delete if table size exceeds max_table_size_to_drop. See docs here:

https://clickhouse.tech/docs/en/sql-reference/statements/truncate/

like image 92
simPod Avatar answered Sep 23 '22 23:09

simPod


Example to create and delete partition

    CREATE TABLE test.partitioned_by_month(d Date, x UInt8) ENGINE = MergeTree 

PARTITION BY toYYYYMM(d) ORDER BY x;

    INSERT INTO test.partitioned_by_month VALUES ('2000-01-01', 1), ('2000-01-02', 2), ('2000-01-03', 3);

INSERT INTO test.partitioned_by_month VALUES ('2000-02-03', 4), ('2000-02-03', 5);

INSERT INTO test.partitioned_by_month VALUES ('2000-03-03', 4), ('2000-03-03', 5);

SELECT * FROM test.partitioned_by_month;

---d------------|-------x-----

 2000-02-03 | 4 

 2000-02-03 | 5 


---d------------|-------x-----

 2000-03-03 | 4 

 2000-03-03 | 5 

---d------------|-------x-----

 2000-01-01 | 1 

 2000-01-02 | 2

 2000-01-03 | 3 

ALTER TABLE test.partitioned_by_month DROP PARTITION 200001;

select * from partitioned_by_month;


---d------------|-------x-----

 2000-03-03 | 4 

 2000-03-03 | 5 

---d------------|-------x-----


 2000-02-03 | 4 

 2000-02-03 | 5 
like image 27
Dipesh Maurya Avatar answered Sep 21 '22 23:09

Dipesh Maurya