I have a Company
model that has many Disclosures
. The Disclosure
has columns named title
, pdf
and pdf_sha256
.
class Company < ActiveRecord::Base
has_many :disclosures
end
class Disclosure < ActiveRecord::Base
belongs_to :company
end
I want to make it unique by pdf_sha256
and if pdf_sha256
is nil
that should be treated as unique.
If it is an Array
, I'll write like this.
companies_with_sha256 = company.disclosures.where.not(pdf_sha256: nil).group_by(&:pdf_sha256).map do |key,values|
values.max_by{|v| v.title.length}
end
companies_without_sha256 = company.disclosures.where(pdf_sha256: nil)
companies = companies_with_sha256 + companeis_without_sha256
How can I get the same result by using ActiveRecord query?
It is possible to do it in one query by first getting a different id
for each different pdf_sha256
as a subquery, then in the query getting the elements within that set of ids by passing the subquery as follows:
def unique_disclosures_by_pdf_sha256(company)
subquery = company.disclosures.select('MIN(id) as id').group(:pdf_sha256)
company.disclosures.where(id: subquery)
.or(company.disclosures.where(pdf_sha256: nil))
end
The great thing about this is that ActiveRecord is lazy loaded, so the first subquery
will not be run and will be merged to the second main query to create a single query in the database. It will then retrieve all the disclosures
unique by pdf_sha256
plus all the ones that have pdf_sha256
set to nil
.
In case you are curious, given a company, the resulting query will be something like:
SELECT "disclosures".* FROM "disclosures"
WHERE (
"disclosures"."company_id" = $1 AND "disclosures"."id" IN (
SELECT MAX(id) as id FROM "disclosures" WHERE "disclosures"."company_id" = $2 GROUP BY "disclosures"."pdf_sha256"
)
OR "disclosures"."company_id" = $3 AND "disclosures"."pdf_sha256" IS NULL
)
The great thing about this solution is that the returned value is an ActiveRecord query, so it won't be loaded until you actually need. You can also use it to keep chaining queries. Example, you can select only the id
instead of the whole model and limit the number of results returned by the database:
unique_disclosures_by_pdf_sha256(company).select(:id).limit(10).each { |d| puts d }
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With