
Hive aggregate query returns a stale value from cache

I am running an aggregate query in a Hive session.

hive> select count(1) from table_name;

The first time, it runs a MapReduce job and returns the result. But on subsequent runs later in the day it returns the same count from the cache, even though the table is updated hourly, so the count is wrong.

I tried:

set hive.metastore.aggregate.stats.cache.enabled=false;

set hive.cache.expr.evaluation=false;

set hive.fetch.task.conversion=none;

But no luck. I am using Hive version 1.2.1.2.3.4.29-5. Thanks.

sumitya asked Apr 19 '26 15:04



1 Answer

Disable the use of statistics for computing query results:

set hive.compute.query.using.stats=false;

See also this answer for more details: https://stackoverflow.com/a/41021682/2700344
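
As a rough sketch of how this could look in a session: when hive.compute.query.using.stats is true, Hive can answer simple aggregates such as count(1) from statistics stored in the metastore instead of scanning the data. The snippet below assumes that the stale count comes from such statistics, and that table_name is the unpartitioned table from the question; the ANALYZE step is an optional alternative that was not part of the original answer.

```sql
-- Make Hive scan the data rather than answer from metastore statistics.
set hive.compute.query.using.stats=false;
select count(1) from table_name;

-- Alternative (assumption: table_name is unpartitioned): refresh the stored
-- statistics after each hourly load, so stats-based answers stay current.
analyze table table_name compute statistics;
```

The first option costs a full job on every count. The second keeps the fast stats-based answer, but only helps if the statistics are recomputed after every update to the table.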

leftjoin answered Apr 24 '26 01:04



