Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MSTest Unit tests (VS 2012) vs. SQL Server, intermittent Primary Key collisions

I'm dealing with a very strange problem. Initially, I thought this was a problem with cleaning up test data... but after completely refactoring my test data cleanup code and still seeing the exact same behavior... I'm at a loss.

I have 245 unit test methods in various classes. Each class has its own unique test data, and I initialize those objects, and then in each test method that data is usually inserted into the database and then manipulated with the tests. Each test class has a ClassCleanup method that cleans out all of the test data from the database, and that ClassCleanup is also run on TestInitialize, to ensure everything is cleaned up before any other test method is run.

When I "Run All" using VS 2012 Test Explorer, 22 tests fail. They all fail with some variation of Primary Key constraint violations. Meaning, when they are initializing data for these tests, the data was not cleaned up from a previous test method in that class. If I re-run all tests, I get the same tests failing every time. This is fairly reproducible. No matter how many times I run all tests. These same 27 tests fail with primary key violations.

However, the weird thing is that if I re-run ONLY those failed tests, only 9 tests fail. This is ALSO reproducible, meaning, no matter how many times I run only these 27 previously failing tests, 14 will fail, and the rest will pass. This continues as I run only failed tests, until I get to a point where none of the tests fail. It should also be noted that if I run each test class individually, everything passes.

I know how this looks.

"You obviously are not cleaning up your test data." If that were the case, then I should see the same tests fail on every run, no matter what. Those 27 tests should fail EVERY run, not just when I run everything else.

"You must not have unique primary keys between classes and things aren't being cleaned up." See above. Even IF I had primary keys repeated in my classes (which I don't, because I have personally triple-checked the uniqueness of the primary keys on test data within the various classes), since these tests are not run concurrently on separate threads (which has been verified by logging the ThreadId), the cleanup code for any given test would clean out the duplicated data, regardless.

"You must not be using connection pooling." No, actually I am. And I've verified using SQL Profiler that the requests are definitely pooled. Also, because these tests are not running in parallel, there is only ever the one connection thread.

"You shouldn't use connection pooling." Well, yes, I should, since the underlying codebase supports various web projects, but for the sake of argument I tried running all the tests with connection pooling disabled (using Pooling=false in the connection string) and I get the exact same results. No change in behavior, whatsoever.

"There must be something wrong with your local environment." I get the same results running these tests on other colleagues' dev boxes (which incidentally use SQL 2012), as well. This is not unique to my environment, or even my version of SQL Server.

"You should try running mstest from the command line." Already did that. Same results.

If anyone has encountered something like this, please let me know. I know there must be something simple I'm missing, as that's usually the case with these kinds of problems, but I've covered as many bases as I possibly can in trying to sort this out.

like image 217
Finster Avatar asked Nov 03 '22 00:11

Finster


1 Answers

The following is based off the assumption that your database is in full recovery mode and you do not perform any restores or other trickery during your tests (such as detaching/reattaching the db, etc).

Here is a fairly tedious approach to investigating your problem, but is guaranteed to provide the data needed to figure this out.

  1. Take a Full Backup of the database Do this right before starting the test suite. We're going to be restoring the database so also make sure you've got enough disk space for 2-3 copies of the database files.

  2. Create a Sql Profiler trace For events, select RPC Starting/completed, sql Batch Starting/completed, Sql Statement starting/completed, SP Statement starting/completed, TM:* completed, SQLTransaction, DTCTransaction, and user error message. Capture all the columns.

  3. Reproduce the Issue Run the minimum number of tests to produce a failure. Let the tests finish so you capture all the cleanup code, then stop the profiler trace.

  4. Take a transaction log backup We may need this for point-in-time restores later.

  5. Locate the failure in the trace If you're getting a primary key failure, then it should be easy to track down, just look for the User Error Message. Write down the exact time the error occurred.

  6. Examine the trace for obvious issues Start from the error and work backwards until you find the start of the test that failed. Write down the exact time the setup started for the last failed test. Examine all the sql in this range. Is the sql exactly what you expect? Are the row counts correct? Is the transactionId correct? (The transactionId column should be different for every statement not in a transaction, and the same for every statement inside a transaction). If you have mis-matched BEGIN TRAN/COMMIT TRAN/ROLLBACK TRAN's, the transactionId will let you know.

  7. Restore the DB to right before the failed test setup Restore it to a new database so we can compare the original and the copy. First restore the full backup using "RESTORE DATABASE .... WITH NORECOVERY". Then restore the transaction log backup using "RESTORE LOG .. WTIH STOPAT, RECOVERY" and specify a time immediately before the failed test setup.

  8. Verify the database state Check for test data that may not have been cleaned up. Is everything as it should be? If not, you can restore the database again to an earlier point. You're looking for a point in time just before a test starts where the database is in a good, known state.

  9. Restore the DB to right before the error occurred If you have room, restore to another new DB. Check for the data that caused the PK violation. Would the error occur if you ran the problematic statement again? Verify that it does or doesn't occur.

    • If it doesn't occur, your problem is likely mis-matched transaction handling. If you were missing a COMMIT earlier, you may have had a transaction still open. When you restore with STOPAT, any un-committed transactions would be rolled back. This would also explain how running the tests individually works, but together they fail.
    • If it does occur, then work backwards until you find the issue. You may need to restore the DB multiple times before you figure it out. Your process will be Restore DB, examine trace, examine data, restore to different point, examine trace, examine data, etc.
  10. If after all this you are still at a loss, then you may want to investigate using database snapshots as part of your unit tests. Basically, create the db snapshot, setup and run test, teardown is replaced with a reverting the database back to the snapshot. This will guarantee an identical database before and after each test.

2012 Management Studio has an improved database restore wizard that makes the point in time restores very easy. Good Luck!

like image 74
StrayCatDBA Avatar answered Nov 11 '22 06:11

StrayCatDBA