I have a baseball tool that allows users to analyze a player's historical batting stats. For example, how many hits does A-Rod have over the past 7 days during night-time conditions? I want to expand the timeframe so a user can analyze a player's batting stats as far back as 365 days. However, doing so requires some serious performance optimization. Here is my current set of models:
class AtBat < ActiveRecord::Base
  belongs_to :batter
  belongs_to :pitcher
  belongs_to :weather_condition

  ### DATA MODEL ###
  # id
  # batter_id
  # pitcher_id
  # weather_condition_id
  # hit (boolean)
  ##################
end

class BattingStat < ActiveRecord::Base
  belongs_to :batter
  belongs_to :recordable, :polymorphic => true # e.g., Batter, Pitcher, WeatherCondition

  ### DATA MODEL ###
  # id
  # batter_id
  # recordable_id
  # recordable_type
  # hits7
  # outs7
  # at_bats7
  # batting_avg7
  # ...
  # hits365
  # outs365
  # at_bats365
  # batting_avg365
  ##################
end

class Batter < ActiveRecord::Base
  has_many :batting_stats, :as => :recordable, :dependent => :destroy
  has_many :at_bats, :dependent => :destroy
end

class Pitcher < ActiveRecord::Base
  has_many :batting_stats, :as => :recordable, :dependent => :destroy
  has_many :at_bats, :dependent => :destroy
end

class WeatherCondition < ActiveRecord::Base
  has_many :batting_stats, :as => :recordable, :dependent => :destroy
  has_many :at_bats, :dependent => :destroy
end
For the sake of keeping my question at a reasonable length, let me narrate what I am doing to update the batting_stats table instead of copying a bunch of code. Let's start with 7 days.
Steps 1-4 are repeated for other time periods as well -- 15 days, 30 days, etc.
Now imagine how laborious it would be to run a script every day to make these updates if I were to expand the time periods from a manageable 7/15/30 to 7/15/30/45/60/90/180/365.
So my question is: how would you approach getting this to run at the highest level of performance?
A single UPDATE is faster. That is, multiple UPDATE statements turned out to be 5-6 times slower than a single UPDATE.
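To make that concrete, here is a rough sketch of a single set-based UPDATE for the 7-day window, issued straight through the connection so every batting_stats row is touched once instead of once per Ruby object. It assumes at_bats has a created_at timestamp and Postgres-style UPDATE ... FROM syntax, and it collapses the rollup to one row per batter for brevity; none of those details come from the models above.

# Sketch only: created_at is not in the AtBat model shown, and the
# UPDATE ... FROM form is Postgres-flavoured.
ActiveRecord::Base.connection.execute(<<-SQL)
  UPDATE batting_stats
  SET    hits7    = agg.hits,
         outs7    = agg.at_bats - agg.hits,
         at_bats7 = agg.at_bats
  FROM (
    SELECT batter_id,
           COUNT(*)                             AS at_bats,
           SUM(CASE WHEN hit THEN 1 ELSE 0 END) AS hits
    FROM   at_bats
    WHERE  created_at >= now() - interval '7 days'
    GROUP  BY batter_id
  ) agg
  WHERE batting_stats.batter_id = agg.batter_id
SQL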
Best practices for improving SQL UPDATE statement performance: consider the lock escalation mode of the modified table so the update doesn't tie up more resources than necessary, analyze the execution plan to find the query's bottlenecks, and remove redundant indexes on the table.
The fastest way to update a large batch of records is with a MERGE statement; the MERGE approach took 36 seconds to update the records.
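A hedged sketch of what that MERGE (an upsert) could look like for the same 7-day rollup, assuming a database with standard MERGE support (SQL Server, Oracle, PostgreSQL 15+) and, again, a created_at column on at_bats; the date arithmetic and dialect details will need adjusting per database:

# Sketch only: the created_at column and the date arithmetic are
# assumptions; adapt to your database's MERGE dialect.
ActiveRecord::Base.connection.execute(<<-SQL)
  MERGE INTO batting_stats bs
  USING (
    SELECT batter_id,
           COUNT(*)                             AS at_bats,
           SUM(CASE WHEN hit THEN 1 ELSE 0 END) AS hits
    FROM   at_bats
    WHERE  created_at >= CURRENT_DATE - 7
    GROUP  BY batter_id
  ) agg
  ON (bs.batter_id = agg.batter_id)
  WHEN MATCHED THEN
    UPDATE SET hits7 = agg.hits, outs7 = agg.at_bats - agg.hits, at_bats7 = agg.at_bats
  WHEN NOT MATCHED THEN
    INSERT (batter_id, hits7, outs7, at_bats7)
    VALUES (agg.batter_id, agg.hits, agg.at_bats - agg.hits, agg.at_bats);
SQL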
AR isn't really meant to do bulk processing like this. You're probably better off doing your batch updates by dropping into SQL proper and doing an INSERT FROM SELECT (or perhaps using a gem that does this for you).
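For illustration, an INSERT ... SELECT that rebuilds the 7-day weather-condition rollup in one statement might look roughly like this. The created_at column and the numeric cast are assumptions, and in practice the existing rows for that window would be deleted or upserted first rather than blindly re-inserted:

# Sketch only: assumes created_at on at_bats and Postgres-style casting;
# existing batting_stats rows must be cleared or upserted beforehand.
ActiveRecord::Base.connection.execute(<<-SQL)
  INSERT INTO batting_stats
    (batter_id, recordable_id, recordable_type, hits7, outs7, at_bats7, batting_avg7)
  SELECT batter_id,
         weather_condition_id,
         'WeatherCondition',
         SUM(CASE WHEN hit THEN 1 ELSE 0 END),
         COUNT(*) - SUM(CASE WHEN hit THEN 1 ELSE 0 END),
         COUNT(*),
         SUM(CASE WHEN hit THEN 1 ELSE 0 END)::numeric / COUNT(*)
  FROM   at_bats
  WHERE  created_at >= now() - interval '7 days'
  GROUP  BY batter_id, weather_condition_id
SQL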