Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Query's result set is too big

I have a query that can be fast or slow depending on how many records I'm fetching. Here's a table showing the number in my LIMIT clause and the corresponding time it takes to execute the query and fetch the results:

LIMIT | Seconds (Duration/Fetch)
------+-------------------------
   10 |  0.030/  0.0
  100 |  0.062/  0.0
 1000 |  1.700/  0.8
10000 | 25.000/100.0

As you can see, it's fine up to at least 1,000 but 10,000 is really slow, mostly due to a high fetch time. I don't understand why the growth of the fetch time isn't linear but I am grabbing over 200 columns from over 70 tables, so the fact that the result set takes a long time to fetch is not a surprise.

What I'm fetching, by the way, is data on all the accounts at a certain bank. The bank I'm dealing with has about 160,000 accounts so I ultimately need to fetch 160,000 rows from the database.

It's obviously not going to be feasible to try to fetch 160,000 rows at once (at least not unless I can somehow dramatically optimize my query). It seems to me that the biggest chunk I can reasonably grab is 1,000 rows, so I wrote a script that would run the query over and over with a SELECT INTO OUTFILE, limit and offset. Then, at the end, I take all the CSV files I dumped and cat them together. It works but it's slow. It takes hours. I have the script running right now and it's only dumped 43,000 rows in about an hour.

Should I attack this problem at the query optimization level or does the long fetch time suggest I should focus elsewhere? What would you recommend I do?

If you want to see the query you can see it here.

like image 454
Jason Swett Avatar asked Dec 01 '25 03:12

Jason Swett


2 Answers

The answer is going to greatly depend on what you're doing with the data. Querying 215 columns through 29 joins will never be quick for non-trivial record sizes.

If you're trying to display 160,000 records to the user, you should page the results and only fetch one page at a time. This will keep the result set small enough that even a relatively inefficient query will return quickly. In this case, you will also want to examine just how much data the user needs in order to select or manipulate the data. Chances are good that you can pare it down to a handful of fields and some aggregates (count, sum, etc) that will let the user make an informed decision about which records they want to work with. Use LIMIT with an offset to pull single pages of arbitrary size.

If you need to export the data for reporting purposes, ensure that you are only pulling the exact data that the report needs. Eliminate joins where possible and use subqueries where you need an aggregate of child data. You'll want to tune/add indexes for the frequently used joins and criteria. In the case of your provided query, ib.id and the myriad of foreign keys you're joining through. You can leave off boolean columns because there are not enough distinct values to form a meaningful index.

Regardless of what you're trying to accomplish, removing some of the joins and columns will inherently speed up your processing. The amount of heavy lifting that MySQL needs to do to fill that query is your main stumbling block.

like image 156
Karelzarath Avatar answered Dec 03 '25 18:12

Karelzarath


I've restructured your query to hopefully offer significant performance improvement time. By using the STRAIGHT_JOIN tells MySQL to do in the order you've stated (or I've adjusted here). The inner-most, first query "PreQuery" alias STARTS at your criteria of the import bundle and generic import, to the account import to the account... By pre-applying the WHERE clause there (and as you would test, add your LIMIT CLAUSE HERE) you are pre-joining these tables and getting them right out of the way before wasting any time trying to get the customers, address, etc other information going. In the query, I've adjusted the join/left joins to better show the relationship of the underlying linked tables (primarily for anyone else reading in).

As another person noted, what I've done in the PREQUERY could be a basis of "Account.ID" records in a master pre-query list used to go through and page-available. I would be curious to the performance of this to your existing especially at the 10,000 limit range.

The PREQUERY gets unique elements (including the Account ID used downstream, bank, month, year and category), so those tables don't have to be rejoined in the rest of the joining process.

SELECT STRAIGHT_JOIN
      PreQuery.*,
      customer.customer_number,
      customer.name,
      customer.has_bad_address,
      address.line1,
      address.line2,
      address.city,
      state.name,
      address.zip,
      po_box.line1,
      po_box.line2,
      po_box.city,
      po_state.name,
      po_box.zip,
      customer.date_of_birth,
      northway_account.cffna,
      northway_account.cfinsc,
      customer.deceased,
      customer.social_security_number,
      customer.has_internet_banking,
      customer.safe_deposit_box,
      account.has_bill_pay,
      account.has_e_statement,
      branch.number,
      northway_product.code,
      macatawa_product.code,
      account.account_number,
      account.available_line,
      view_macatawa_atm_card.number,
      view_macatawa_debit_card.number,
      uc.code use_class,
      account.open_date,
      account.balance,
      account.affinion,
      northway_account.ytdsc,
      northway_account.ytdodf,
      northway_account.ytdnsf,
      northway_account.rtckcy,
      northway_account.rtckwy,
      northway_account.odwvey,
      northway_account.ytdscw,
      northway_account.feeytd,
      customer.do_not_mail,
      northway_account.aledq1,
      northway_account.aledq2,
      northway_account.aledq3,
      northway_account.aledq4,
      northway_account.acolq1,
      northway_account.acolq2,
      northway_account.acolq3,
      northway_account.acolq4,
      o.officer_number,
      northway_account.avg_bal_1,
      northway_account.avg_bal_2,
      northway_account.avg_bal_3,
      account.maturity_date,
      account.interest_rate,
      northway_account.asslc,
      northway_account.paidlc,
      northway_account.lnuchg,
      northway_account.ytdlc,
      northway_account.extfee,
      northway_account.penamt,
      northway_account.cdytdwaive,
      northway_account.cdterm,
      northway_account.cdtcod,
      account.date_of_last_statement,
      northway_account.statement_cycle,
      northway_account.cfna1,
      northway_account.cfna2,
      northway_account.cfna3,
      northway_account.cfna4,
      northway_account.cfcity,
      northway_account.cfstate,
      northway_account.cfzip,
      northway_account.actype,
      northway_account.sccode,
      macatawa_account.account_type_code,
      macatawa_account.account_type_code_description,
      macatawa_account.advance_code,
      macatawa_account.amount_last_advance,
      macatawa_account.amount_last_payment,
      macatawa_account.available_credit,
      macatawa_account.balance_last_statement,
      macatawa_account.billing_day,
      macatawa_account.birthday_3,
      macatawa_account.birthday_name_2,
      macatawa_account.ceiling_rate,
      macatawa_account.class_code,
      macatawa_account.classified_doubtful,
      macatawa_account.classified_loss,
      macatawa_account.classified_special,
      macatawa_account.classified_substandard,
      macatawa_account.closed_account_flag,
      macatawa_account.closing_balance,
      macatawa_account.compounding_code,
      macatawa_account.cost_center_full,
      macatawa_account.cytd_aggregate_balance,
      macatawa_account.cytd_amount_of_advances,
      macatawa_account.cytd_amount_of_payments,
      macatawa_account.cytd_average_balance,
      macatawa_account.cytd_average_principal_balance,
      macatawa_account.cytd_interest_paid,
      macatawa_account.cytd_number_items_nsf,
      macatawa_account.cytd_number_of_advanes,
      macatawa_account.cytd_number_of_payments,
      macatawa_account.cytd_number_times_od,
      macatawa_account.cytd_other_charges,
      macatawa_account.cytd_other_charges_waived,
      macatawa_account.cytd_reporting_points,
      macatawa_account.cytd_service_charge,
      macatawa_account.cytd_service_charge_waived,
      macatawa_account.date_closed,
      macatawa_account.date_last_activity,
      macatawa_account.date_last_advance,
      macatawa_account.date_last_payment,
      macatawa_account.date_paid_off,
      macatawa_account.ddl_code,
      macatawa_account.deposit_rate_index,
      macatawa_account.employee_officer_director_full_desc,
      macatawa_account.floor_rate,
      macatawa_account.handling_code,
      macatawa_account.how_paid_code,
      macatawa_account.interest_frequency,
      macatawa_account.ira_plan,
      macatawa_account.load_rate_code,
      macatawa_account.loan_rate_code,
      macatawa_account.loan_rating_code,
      macatawa_account.loan_rating_code_1_full_desc,
      macatawa_account.loan_rating_code_2_full_desc,
      macatawa_account.loan_rating_code_3_full_desc,
      macatawa_account.loan_to_value_ratio,
      macatawa_account.maximum_credit,
      macatawa_account.miscellaneous_code_full_desc,
      macatawa_account.months_to_maturity,
      macatawa_account.msa_code,
      macatawa_account.mtd_agg_available_balance,
      macatawa_account.naics_code,
      macatawa_account.name_2,
      macatawa_account.name_3,
      macatawa_account.name_line,
      macatawa_account.name_line_2,
      macatawa_account.name_line_3,
      macatawa_account.name_line_1,
      macatawa_account.net_payoff,
      macatawa_account.opened_by_responsibility_code_full,
      macatawa_account.original_issue_date,
      macatawa_account.original_maturity_date,
      macatawa_account.original_note_amount,
      macatawa_account.original_note_date,
      macatawa_account.original_prepaid_fees,
      macatawa_account.participation_placed_code,
      macatawa_account.participation_priority_code,
      macatawa_account.pay_to_account,
      macatawa_account.payment_code,
      macatawa_account.payoff_principal_balance,
      macatawa_account.percent_participated_code,
      macatawa_account.pmtd_number_deposit_type_1,
      macatawa_account.pmtd_number_deposit_type_2,
      macatawa_account.pmtd_number_deposit_type_3,
      macatawa_account.pmtd_number_type_1,
      macatawa_account.pmtd_number_type_2,
      macatawa_account.pmtd_number_type_6,
      macatawa_account.pmtd_number_type_8,
      macatawa_account.pmtd_number_type_9,
      macatawa_account.principal,
      macatawa_account.purpose_code,
      macatawa_account.purpose_code_full_desc,
      macatawa_account.pytd_number_of_items_nsf,
      macatawa_account.pytd_number_of_times_od,
      macatawa_account.rate_adjuster,
      macatawa_account.rate_over_split,
      macatawa_account.rate_under_split,
      macatawa_account.renewal_code,
      macatawa_account.renewal_date,
      macatawa_account.responsibility_code_full,
      macatawa_account.secured_unsecured_code,
      macatawa_account.short_first_name_1,
      macatawa_account.short_first_name_2,
      macatawa_account.short_first_name_3,
      macatawa_account.short_last_name_1,
      macatawa_account.short_last_name_2,
      macatawa_account.short_last_name_3,
      macatawa_account.statement_cycle,
      macatawa_account.statement_rate,
      macatawa_account.status_code,
      macatawa_account.tax_id_number_name_2,
      macatawa_account.tax_id_number_name_3,
      macatawa_account.teller_alert_1,
      macatawa_account.teller_alert_2,
      macatawa_account.teller_alert_3,
      macatawa_account.term,
      macatawa_account.term_code,
      macatawa_account.times_past_due_01_29,
      macatawa_account.times_past_due_01_to_29_days,
      macatawa_account.times_past_due_30_59,
      macatawa_account.times_past_due_30_to_59_days,
      macatawa_account.times_past_due_60_89,
      macatawa_account.times_past_due_60_to_89_days,
      macatawa_account.times_past_due_over_90,
      macatawa_account.times_past_due_over_90_days,
      macatawa_account.tin_code_name_1,
      macatawa_account.tin_code_name,
      macatawa_account.tin_code_name_2,
      macatawa_account.tin_code_name_3,
      macatawa_account.total_amount_past_due,
      macatawa_account.waiver_od_charge,
      macatawa_account.waiver_od_charge_description,
      macatawa_account.waiver_service_charge_code,
      macatawa_account.waiver_transfer_advance_fee,
      macatawa_account.short_first_name,
      macatawa_account.short_last_name          
FROM
   ( SELECT STRAIGHT_JOIN DISTINCT
         b.name bank,
         ib.YEAR,
         ib.MONTH,
         ip.category,
         Account.ID
         FROM import_bundle ib 
            JOIN generic_import gi ON ib.id = gi.import_bundle_id
               JOIN account_import AI ON gi.id = ai.generic_import_id
                  JOIN Account ON AI.ID = account.account_import_id 
               JOIN import_profile ip ON gi.import_profile_id = ip.id
            JOIN bank b ib.Bank_ID = b.id
      WHERE
            IB.ID = 95
         AND IB.Active = 1
         AND GI.Active = 1
      LIMIT 1000 ) PreQuery
   JOIN Account on PreQuery.ID = Account.ID
      JOIN Customer on Account.Customer_ID = Customer.ID
      JOIN Officer on Account.Officer_ID = Officer.ID
      LEFT JOIN branch ON Account.branch_id = branch.id
      LEFT JOIN cd_type ON account.cd_type_id = cd_type.id
      LEFT JOIN use_class uc ON account.use_class_id = uc.id
      LEFT JOIN account_type at ON account.account_type_id = at.id
      LEFT JOIN northway_account ON account.id = northway_account.account_id
      LEFT JOIN macatawa_account ON account.id = macatawa_account.account_id
      LEFT JOIN view_macatawa_debit_card ON account.id = view_macatawa_debit_card.account_id
      LEFT JOIN view_macatawa_atm_card ON account.id = view_macatawa_atm_card.account_id
      LEFT JOIN original_address OA ON Account.ID = OA.account_id

      JOIN Account_Address AA ON Account.ID = AA.account_id
         JOIN address ON AA.address_id = address.id
            JOIN state ON address.state_id = state.id

      LEFT JOIN Account_po_box APB ON Account.ID = APB.account_id
         LEFT JOIN address po_box ON APB.address_id = po_box.id
            LEFT JOIN state po_state ON po_box.state_id = po_state.id

      LEFT JOIN Account_macatawa_product amp ON account.id = amp.account_id
         LEFT JOIN macatawa_product ON amp.macatawa_product_id = macatawa_product.id
            LEFT JOIN product_type pt ON macatawa_product.product_type_id = pt.id
            LEFT JOIN harte_hanks_service_category hhsc ON macatawa_product.harte_hanks_service_category_id = hhsc.id
            LEFT JOIN core_file_type cft ON macatawa_product.core_file_type_id = cft.id 

      LEFT JOIN Account_northway_product anp ON account.id = anp.account_id
         LEFT JOIN northway_product ON anp.northway_product_id = northway_product.id
like image 32
DRapp Avatar answered Dec 03 '25 18:12

DRapp



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!