Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ActiveRecord has_many :through duplicating counter caches on mass assignment

It seems ActiveRecord's counter_cache feature can result in a counter cache being incremented twice. The scenario in which I am seeing this behavior is when I have two models that have a has_many :through relationship with each other through a join model (ie: Teacher has many Student through Classroom). When using the has_many :through generated methods to associate Teacher and Student directly (without manually creating the join record) the count goes up 2 times. Example: teacher.students << Student.create(name: "Bobby Joe") causes teacher.students_count to increment by 2.

Please help me find a solution that mitigates or eliminates this problem while allowing me to continue using built-in counter caching AND mass assignment through the has_many :through relation.

I have spent many hours looking for a solution and have extracted the problem out to a small test app which is the simplest failing example I could create. Any additional details needed to help me solve this should hopefully be below.

Example schema and models:

create_table :teachers do |t|
  t.string  :name
  t.integer :students_count, default: 0
  t.timestamps
end

class Teacher < ActiveRecord::Base
  has_many :classrooms
  has_many :students, :through => :classrooms
end

create_table :students do |t|
  t.string  :name
  t.integer :teachers_count, default: 0
  t.timestamps
end

class Student < ActiveRecord::Base
  has_many :classrooms
  has_many :teachers, :through => :classrooms
end

create_table :classrooms do |t|
  t.references :teacher
  t.references :student
  t.timestamps
end

class Classroom < ActiveRecord::Base
  belongs_to :student, :counter_cache => :teachers_count
  belongs_to :teacher, :counter_cache => :students_count
end

Here is a short rails console session showing the steps taken and the fact that rails is executing two updates to teachers to increment students_count:

1.9.2-p290 :001 > t = Teacher.create(name: "Miss Nice")
  SQL (9.7ms)  INSERT INTO "teachers" ("created_at", "name", "students_count", "updated_at") VALUES (?, ?, ?, ?)  [["created_at", Tue, 28 Feb 2012 03:31:53 UTC +00:00], ["name", "Miss Nice"], ["students_count", 0], ["updated_at", Tue, 28 Feb 2012 03:31:53 UTC +00:00]]
 => #<Teacher id: 1, name: "Miss Nice", students_count: 0, created_at: "2012-02-28 03:31:53", updated_at: "2012-02-28 03:31:53"> 
1.9.2-p290 :002 > t.students << Student.new(name: "Mary Ann")
  SQL (0.3ms)  INSERT INTO "students" ("created_at", "name", "teachers_count", "updated_at") VALUES (?, ?, ?, ?)  [["created_at", Tue, 28 Feb 2012 03:32:12 UTC +00:00], ["name", "Mary Ann"], ["teachers_count", 0], ["updated_at", Tue, 28 Feb 2012 03:32:12 UTC +00:00]]
  SQL (0.3ms)  INSERT INTO "classrooms" ("created_at", "student_id", "teacher_id", "updated_at") VALUES (?, ?, ?, ?)  [["created_at", Tue, 28 Feb 2012 03:32:12 UTC +00:00], ["student_id", 1], ["teacher_id", 1], ["updated_at", Tue, 28 Feb 2012 03:32:12 UTC +00:00]]
  SQL (0.2ms)  UPDATE "students" SET "teachers_count" = COALESCE("teachers_count", 0) + 1 WHERE "students"."id" = 1
  Teacher Load (0.1ms)  SELECT "teachers".* FROM "teachers" WHERE "teachers"."id" = 1 LIMIT 1
  SQL (0.1ms)  UPDATE "teachers" SET "students_count" = COALESCE("students_count", 0) + 1 WHERE "teachers"."id" = 1
  SQL (0.0ms)  UPDATE "teachers" SET "students_count" = COALESCE("students_count", 0) + 1 WHERE "teachers"."id" = 1
  Student Load (0.2ms)  SELECT "students".* FROM "students" INNER JOIN "classrooms" ON "students"."id" = "classrooms"."student_id" WHERE "classrooms"."teacher_id" = 1
 => [#<Student id: 1, name: "Mary Ann", teachers_count: 1, created_at: "2012-02-28 03:32:12", updated_at: "2012-02-28 03:32:12">] 

I've put the entire test app on github if anyone would like to look closer (https://github.com/carlzulauf/test_app). I also created a unit test which demonstrates the issue and fails to pass (https://github.com/carlzulauf/test_app/blob/master/test/unit/classroom_test.rb)

like image 603
Carl Zulauf Avatar asked Feb 28 '12 03:02

Carl Zulauf


1 Answers

So far my research has told me this is probably a bug. Here are some github issues already filed for this problem:

https://github.com/rails/rails/issues/3903

https://github.com/rails/rails/issues/3085

Apparently there is an undocumented automatic counter cache caused by has_many :through relations. So if Teacher.has_many :students, :through => :classrooms then teacher.students << student collection assignments already look for and increment teacher.students_count if that column exists.

If you add Classroom.belongs_to :teacher, :counter_cache => :students_count then an additional callback is triggered when the Classroom model is created, and the column gets incremented twice.

Effective work around: rename the counter cache columns to something else. Student#teacherz_count and Teacher#studentz_count were effective at allowing my test case to pass.

https://github.com/carlzulauf/test_app/commit/707a33f948d5d55a8aa942e825841fdd8a7e7705

I haven't yet been able to find where the problem lies in the ActiveRecord code base, so I won't accept my own answer for a while in case someone knows why has_many :through works this way and where the offending code lives.

Update

I believe I found the offending line of code. Commenting out this line resolves the problem:

https://github.com/rails/rails/blob/889e8bee82ea4f75adb6de5badad512d2c615b7f/activerecord/lib/active_record/associations/has_many_through_association.rb#L53

I can't seem to get edge rails up and running so I can't submit a pull for this bug yet. If someone else is able, please do.

Finding the offending line allowed me to craft a more effective monkey patch in my test app that resolves the issue without renaming any columns.

https://github.com/carlzulauf/test_app/commit/3c421b035bd032b91ff60e3d74b957651c37c7fa

like image 147
Carl Zulauf Avatar answered Nov 16 '22 16:11

Carl Zulauf