I have a problem with a big query that joins ten tables. I'm migrating data from a wide fact table (f1) into a star schema. I begin by populating the dimension tables from f1, and then I populate the new fact table (f2) by joining f1 to the dimension tables to get the corresponding IDs.
Unfortunately I'm getting an error, "inner partition did not fit in memory". From the log I see:
2012-10-18 16:20:31.607 Init Session:0x2aac6c02b250 [EE] <INFO> ENABLE_JOIN_SPILL may allow this query to run, with reduced performance
2012-10-18 16:20:31.607 Init Session:0x2aac6c02b250 [EE] <INFO> Query Retry action: Setting add_vertica_options('EE','ENABLE_JOIN_SPILL');
But that doesn't work either, since later I get:
2012-10-18 16:23:31.138 Init Session:0x2aac6c02b250 [EE] <INFO> Join ((public.owa_search_term_dim x public.page_impressions_with_session) using owa_search_term_dim_projection_node0001 and previous join (PATH ID: 7)) inner partition did not fit in memory; value
2012-10-18 16:23:31.138 Init Session:0x2aac6c02b250 [EE] <INFO> Query Retry action: Swapping join order with override: 1|7|0
This goes on for some time while Vertica apparently tries to find a way to perform the join, but it eventually bails with an error saying the join didn't fit in memory.
Are there any tips on how to minimize the memory needed to perform joins, or any explanation of why spilling to disk isn't working? I can handle a performance hit; I just need to be able to execute the query.
The things I've done to work around this error...
Using these steps, I have always been able to get a result, although sometimes it takes a while. I have one set of queries that pumps data through a series of 14 temp tables and ends in a very small result set, but it takes over 15 minutes to run because of the raw amount of crunching that has to be done.
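As a rough illustration of the temp-table approach, here is a minimal sketch that resolves one dimension per step instead of doing everything in a single large join. It uses Python's built-in sqlite3 module as a stand-in so it can run anywhere; it is not Vertica-specific, and the table and column names (term_dim, page_dim, stage1, stage2, row_id, term, page, hits) are hypothetical. On Vertica, the staging tables would presumably be created with its own temporary-table syntax, and the number of stages depends on how many dimensions you have.

```python
import sqlite3


def build_demo():
    """Create a tiny wide fact table (f1), two dimensions, and an empty f2."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE f1 (row_id INTEGER, term TEXT, page TEXT, hits INTEGER);
        INSERT INTO f1 VALUES (1, 'cats', '/home', 3),
                              (2, 'dogs', '/home', 5),
                              (3, 'cats', '/about', 2);

        -- Dimensions are populated from f1, as in the question.
        CREATE TABLE term_dim (id INTEGER PRIMARY KEY, term TEXT);
        INSERT INTO term_dim (term) SELECT DISTINCT term FROM f1;
        CREATE TABLE page_dim (id INTEGER PRIMARY KEY, page TEXT);
        INSERT INTO page_dim (page) SELECT DISTINCT page FROM f1;

        CREATE TABLE f2 (term_id INTEGER, page_id INTEGER, hits INTEGER);
    """)
    return conn


def migrate_in_stages(conn):
    """Fill f2 by joining to one dimension at a time via temp tables."""
    cur = conn.cursor()
    # Stage 1: resolve only the search-term dimension.
    cur.execute("""
        CREATE TEMP TABLE stage1 AS
        SELECT f.row_id, f.page, f.hits, d.id AS term_id
        FROM f1 AS f JOIN term_dim AS d ON d.term = f.term
    """)
    # Stage 2: resolve the page dimension against the stage 1 result,
    # which no longer carries the term text.
    cur.execute("""
        CREATE TEMP TABLE stage2 AS
        SELECT s.row_id, s.term_id, s.hits, d.id AS page_id
        FROM stage1 AS s JOIN page_dim AS d ON d.page = s.page
    """)
    cur.execute("DROP TABLE stage1")
    # Final step: load the new fact table from the last stage.
    cur.execute("""
        INSERT INTO f2 (term_id, page_id, hits)
        SELECT term_id, page_id, hits FROM stage2
    """)
    cur.execute("DROP TABLE stage2")
    conn.commit()
```

Each stage joins against a single dimension, so only one join's inner side has to be held at a time. The trade-off is extra writes for the intermediate tables, which matches the longer run times described above.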