I want to create sparklines that illustrate the number of posts created on my blog in the last 2 weeks. To do this, I need to first generate an array that contains the number of posts created on each day during the period in question.
For example, this array:
[40, 18, 0, 2, 39, 37, 22, 25, 30, 60, 36, 5, 2, 2]
generates this sparkline: (I'm using the Googlecharts wrapper around the Google Charts API)
My question is how to create these arrays. Here's what I'm doing now: (I'm using Searchlogic to do the queries, but it should be understandable even if you've never used it)
history = []
14.downto(1) do |days_ago|
history.push(Post.created_at_after((days_ago + 1).day.ago.beginning_of_day).created_at_before((days_ago - 1).days.ago.beginning_of_day).size)
end
This approach is ugly and slow -- there must be a better way!
This will give you a hash mapping dates to post counts:
counts = Post.count(
:conditions => ["created_at >= ?", 14.days.ago],
:group => "DATE(created_at)"
)
You can then turn this into an array:
counts_array = []
14.downto(1) do |d|
counts_array << (counts[d.days.ago.to_date.to_s] || 0)
end
You need to have your data indexed properly or this will never work efficiently. If you're using a granularity of "day" then it pays to have a Date column. You can then use a standard SQL GROUP BY operation to get the values you need directly.
For example, a migration could be done like:
self.up
add_column :posts, :created_on_date
add_index :posts, :created_on_date
execute "UPDATE posts SET created_on_date=created_at"
end
Then retrieval is really fast since it can exercise the index:
def sparkline_data
self.class.connection.select_values("
SELECT created_on_date, COUNT(id) FROM posts
WHERE created_on_date>DATE_SUB(UTC_TIMESTAMP(), INTERVAL 14 DAY)
GROUP BY created_on_date
").collect(&:to_i)
end
Keep in mind if you're potentially missing a day you'll have to account for that by inserting a zero value into your results. The date is returned here, so you should be able to compute the missing values and fill them in. Typically this is done by iterating over a group of days using collect.
When you need to retrieve a thin slice of data quickly, loading instances of the models will always be a huge bottleneck. Often you need to go directly to SQL if there's no simple way to fetch what you need.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With