I'm currently reading Effective Perl Programming (2nd edition). I have come across a piece of code which was described as being poorly written, but I don't yet understand what's so bad about it, or how it should be improved. It would be great if someone could explain the matter to me.
Here's the code in question:
sub sum_values_per_key {
my ( $class, $dsn, $user, $password, $parameters ) = @_;
my %results;
my $dbh =
DBI->connect( $dsn, $user, $password, $parameters );
my $sth = $dbh->prepare(
'select key, calculate(value) from my_table');
$sth->execute();
# ... fill %results ...
$sth->finish();
$dbh->disconnect();
return \%results;
}
The example comes from the chapter on testing your code (p. 324/325). The sentence that has left me wondering about how to improve the code is the following:
Since the code was poorly written and accesses DBI directly, you'll have to create a fake DBI object to stand in for the real thing.
I have probably not understood a lot of what the book has so far been trying to teach me, or I have skipped the section relevant for understanding what's bad practice about the above code... Well, thanks in advance for your help!
Since the chapter is about testing, consider this:
When testing your function, you are also (implicitly) testing DBI. This is why it's bad.
Good testing always only checks one functionality. To guarantee this, it would be required to not use DBI directly, but use a mock object instead. This way, if your test fails, you know it's your function and not something else in another module (like DBI in your example).
I think what Brian was trying to say by "poorly written" is that you do not have a separation between business logic and data access code (and database connection mechanics, while at it).
A correct approach to writing functions is that a function (or method) should do one thing, not 3 things at once.
As a result of this big lump of functionality, when testing, you have to test ALL THREE at the same time, which is difficult (see discussion of using "test SQLite DB" in those paragraphs). Or, as an alternative, do what the chapter was devoted to, and mock the DBI object to test the business logic by pretending that the data access AND DB setup worked a certain way.
But mocking a complicated-behaving object like DBI is very and very complicated to do right.
What if the database is not accessible? What if there's blocking? What if your query has a syntax error? What if the DB connection times out when executing the query? What if...
Good test code tests ALL those error situations and more.
A more correct approach (pattern) for the code would be:
my $dbh = set_up_dbh();
my $query = qq[select key, calculate(value) from my_table];
my $data = retrieve_data($dbh, $query);
# Now, we don't need to test setting up database connection AND data retrieval
my $calc_results = calculate_results($data);
This way, to test the logic in calculate_results (e.g. summing the data), you merely need to mock DATA passed to it, which is very easy (in many cases, you just store several sets of test data in some test config); as opposed to mocking the behavior of a complicated DBI object used to retrieve the data.
There is nothing wrong with using DBI by itself.
The clue is in the fact that this is the testing chapter. I assume the issue being pointed out is that the function opens and closes a database connection itself. It should instead expect a database handle as a parameter and just run queries on it, leaving any concerns about opening and closing a database connection to its caller. That will make the job of the function more narrow, so it makes the function more flexible.
That in turn also makes the function easier to test: just pass it a mock object as a database handle. As it is currently written, you need at least to redefine DBI::connect
to test it, which isn’t hard, but is definitely messy.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With