In my database I have a zip table with a code column. The user can upload a list of Zip codes and I need to figure out which ones are already in the database. Currently, I do this using the following Hibernate query (HQL):
select zip.code from Zip zip
where zip.code in (:zipCodes)
The value of the :zipCodes parameter is the list of codes uploaded by the user. However, the version of Hibernate I'm using has a bug that limits the size of such list parameters, and on occasion we exceed this limit.
So I need to find another way to figure out which of the (potentially very long) list of Zip codes are already in the database. Here are a few options I've considered:
Rewrite the query using SQL instead of HQL. While this will avoid the Hibernate bug, I suspect the performance will be terrible if there are 30,000 Zip codes that need to be checked.
Split the list of Zip codes into a series of sub-lists and execute a separate query for each sub-list. Again, this will avoid the Hibernate bug, but performance will likely still be terrible.
Use a temporary table, i.e. insert the Zip codes to be checked into a temporary table, then join that to the zip table. It seems the querying part of this solution should perform reasonably well, but the creation of the temporary table and insertion of up to 30,000 rows will not. But perhaps I'm not going about it the right way. Here's what I had in mind, in pseudo-Java code:
/**
 * Indicates which of the Zip codes are already in the database
 *
 * @param zipCodes the zip codes to check
 * @return the codes that already exist in the database
 * @throws IllegalArgumentException if the list is null or empty
 */
List<Zip> validateZipCodes(List<String> zipCodes) {
    try {
        // start transaction
        // execute the following SQL
        CREATE TEMPORARY TABLE zip_tmp
            (code VARCHAR(255) NOT NULL)
            ON COMMIT DELETE ROWS;
        // create SQL string that will insert data into zip_tmp
        // (values are quoted so that codes with leading zeros stay strings)
        StringBuilder insertSql = new StringBuilder();
        for (String code : zipCodes) {
            insertSql.append("INSERT INTO zip_tmp (code) VALUES ('" + code + "');");
        }
        // execute insertSql to insert data into zip_tmp
        // now run the following query and return the result
        SELECT z.*
        FROM zip z
        JOIN zip_tmp zt ON z.code = zt.code;
    } finally {
        // roll back the transaction so that the temporary table is removed,
        // ensuring that concurrent invocations of this method do not
        // interfere with each other
    }
}
Is there a more efficient way to implement this than in the pseudo-code above, or is there another solution that I haven't thought of? I'm using a Postgres database.
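For comparison, the sub-list option from the list above could be sketched roughly as follows. This is only an illustration: the class name ZipChunks, the partition helper, and the chunk size of 2 are hypothetical, and the real chunk size would have to stay below whatever limit the Hibernate bug imposes (which isn't stated here).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits the uploaded codes into bounded sub-lists.
final class ZipChunks {

    /**
     * Splits codes into consecutive sub-lists of at most chunkSize elements.
     * Each sub-list could then be bound to :zipCodes in a separate query.
     */
    static List<List<String>> partition(List<String> codes, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int from = 0; from < codes.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, codes.size());
            // Copy the view so each chunk is independent of the source list
            chunks.add(new ArrayList<>(codes.subList(from, to)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> uploaded =
                Arrays.asList("02134", "10001", "90210", "60601", "73301");
        for (List<String> chunk : partition(uploaded, 2)) {
            // In the real method, the HQL query would run here with this
            // chunk as :zipCodes, and the results would be accumulated.
            System.out.println(chunk);
        }
    }
}
```

With 30,000 codes and a chunk size of, say, 1,000, this would mean 30 round trips to the database, so whether the performance is acceptable is something to measure rather than assume.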
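Load all the Zip codes in the database into a List. Then, on the user-supplied list of Zip codes, call removeAll(databaseList). What remains are the codes that are not yet in the database; call retainAll(databaseList) instead to keep only the codes that already exist.
Problem solved!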
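A minimal sketch of this in-memory approach, assuming the full set of codes has already been loaded (for example with select zip.code from Zip zip and no where clause). The class and method names (ZipLookup, existing, missing) are made up for illustration. Copying the database codes into a HashSet first is my own addition, so that each membership check does not scan the whole list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper comparing uploaded codes against codes loaded from the database.
final class ZipLookup {

    /** Returns the uploaded codes that are present in databaseCodes, in upload order. */
    static List<String> existing(List<String> uploaded, Collection<String> databaseCodes) {
        // A HashSet gives constant-time lookups on average
        Set<String> known = new HashSet<>(databaseCodes);
        List<String> result = new ArrayList<>(uploaded); // leave the caller's list untouched
        result.retainAll(known); // keep only codes already in the database
        return result;
    }

    /** Returns the uploaded codes that are NOT present in databaseCodes, in upload order. */
    static List<String> missing(List<String> uploaded, Collection<String> databaseCodes) {
        Set<String> known = new HashSet<>(databaseCodes);
        List<String> result = new ArrayList<>(uploaded);
        result.removeAll(known); // drop codes already in the database
        return result;
    }

    public static void main(String[] args) {
        // Stand-ins for the uploaded list and the result of the database query
        List<String> uploaded = Arrays.asList("02134", "99999", "10001");
        List<String> databaseCodes = Arrays.asList("10001", "02134", "60601");

        System.out.println("Already in database: " + existing(uploaded, databaseCodes));
        System.out.println("Not in database: " + missing(uploaded, databaseCodes));
    }
}
```

This trades one large list parameter for one unfiltered query, so it only makes sense if the zip table is small enough to hold in memory comfortably.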