Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get an array containing the number of posts created in the past 2 weeks

I want to create sparklines that illustrate the number of posts created on my blog in the last 2 weeks. To do this, I need to first generate an array that contains the number of posts created on each day during the period in question.

For example, this array:

[40, 18, 0, 2, 39, 37, 22, 25, 30, 60, 36, 5, 2, 2]

generates this sparkline: (I'm using the Googlecharts wrapper around the Google Charts API)

My question is how to create these arrays. Here's what I'm doing now: (I'm using Searchlogic to do the queries, but it should be understandable even if you've never used it)

  history = []
  14.downto(1) do |days_ago|
    history.push(Post.created_at_after((days_ago + 1).day.ago.beginning_of_day).created_at_before((days_ago - 1).days.ago.beginning_of_day).size)
  end

This approach is ugly and slow -- there must be a better way!

like image 611
Tom Lehman Avatar asked Jan 22 '23 12:01

Tom Lehman


2 Answers

This will give you a hash mapping dates to post counts:

counts = Post.count(
  :conditions => ["created_at >= ?", 14.days.ago],
  :group => "DATE(created_at)"
)

You can then turn this into an array:

counts_array = []
14.downto(1) do |d|
  counts_array << (counts[d.days.ago.to_date.to_s] || 0)
end
like image 52
Alex Reisner Avatar answered Feb 13 '23 21:02

Alex Reisner


You need to have your data indexed properly or this will never work efficiently. If you're using a granularity of "day" then it pays to have a Date column. You can then use a standard SQL GROUP BY operation to get the values you need directly.

For example, a migration could be done like:

self.up
  add_column :posts, :created_on_date
  add_index :posts, :created_on_date

  execute "UPDATE posts SET created_on_date=created_at"
end

Then retrieval is really fast since it can exercise the index:

def sparkline_data
  self.class.connection.select_values("
    SELECT created_on_date, COUNT(id) FROM posts
      WHERE created_on_date>DATE_SUB(UTC_TIMESTAMP(), INTERVAL 14 DAY)
      GROUP BY created_on_date
  ").collect(&:to_i)
end

Keep in mind if you're potentially missing a day you'll have to account for that by inserting a zero value into your results. The date is returned here, so you should be able to compute the missing values and fill them in. Typically this is done by iterating over a group of days using collect.

When you need to retrieve a thin slice of data quickly, loading instances of the models will always be a huge bottleneck. Often you need to go directly to SQL if there's no simple way to fetch what you need.

like image 25
tadman Avatar answered Feb 13 '23 23:02

tadman