Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I assemble SQL with object-oriented Perl?

Tags:

oop

sql

perl

I'm currently in charge of a process that seems to be very intimate with the database. My program/script/framework's goal is to make uniformity out of disparate data sources. Using a form of dependency injection, my process at a very high level works fine. The implementation of each data source type is hidden from the highest level business abstraction of what's going on. Great. My questions are two.

1) I have a long paragraph (and it's the length that's bothering me) that assembles an SQL statement in Perl-space of how to translate these different data sources into one, homogeneous end format. So the SQL string always depends on the type of data I'm working with. The WHERE clause depends, the FROM clause depends, the INSERT clause depends, it all depends. It's the high level of depending-ness that's confusing me. How do I model this process in an object-oriented way? MagicObject->buildSQL? That's essentially what I have now, but it feels like all of the parts of the code know too much, hence it's length.

2) If I have a function that does something (builds SQL?), do I pass in the business objects whole and then stringify them at the last minute? Or do I stringify them early and only let my function handle what it needs, as opposed to rendering the objects itself?

Edit: While I don't doubt the importance of ORMs, I do not believe we are yet in the ORM space. Imagine baseball data for the American, National, and Fictional leagues were all stored in wildly different formats with varying levels of normalization. It is the job of my process to read these data sources and put them in one unified, normalized pool. I feel the ORM space of acting on these objects happens after my process. I'm a sort of data janitor, if you will. There are essentially no business objects yet to act on because of the lack of a unified pool, which I create.

Edit^2: It's been brought to my attention that maybe I haven't described the problem space in enough detail. Here's an example.

Imagine you had to make a master database of all the criminals in the United States. Your company's service is selling a product which sits atop and provides access to this data in a clean, unified format.

This data is provided publicly by the 50 states, but in wildly different formats. Some are one file of data, not normalized. Other are normalized tables in CSV format. Some are Excel documents. Some are TSVs. Some records are even provided that are not complete without manual intervention (other, manually created data sources).

The purpose of my project is to make a "driver" for each of the 50 states and make sure the end product of the process is a master database of criminals in a perfect, relation model. Everything keyed correctly, the schema in perfect shape, etc.

like image 841
Mark Canlas Avatar asked Jan 26 '09 16:01

Mark Canlas


1 Answers

You want to look at Fey. I started using it a few months ago on the job, and while the implementation still has rough corners due to young age, the idea behind it is solid. F.ex., take a query lightly adapted from the manual:

my $user = $schema->table( 'user' );
my $q = Fey::SQL
    ->new_select
    ->select( $user->columns( 'user_id', 'username' ) )
    ->from( $user );

Now you could write a function like this:

sub restrict_with_group {
    my ( $q, $table, @group_id ) = @_;
    my $group = $schema->table( 'group' )->alias;
    $q
        ->from( $table, $group )
        ->where( $group->column( 'group_id' ), 'IN', @group_id );
}

This will add an inner join from user to group as well as a WHERE condition. And voila, you can write the following in the main program:

restrict_with_group( $q, $user, qw( 1 2 3 ) );

But this restrict_with_group function will work for any query that that has a foreign key to the group table! To use it, you pass the query you want to restrict and the table to which you want to apply the restriction, as well as the group IDs to which you want to restrict it.

In the end you say $q->sql( $dbh ) and you get back a string of SQL representing the query that you have built up in the $q object.

So basically Fey gives you the abstractive capabilities that native SQL is missing. You can extract reusable aspects from your queries and package them as separate functions.

like image 50
Aristotle Pagaltzis Avatar answered Sep 30 '22 16:09

Aristotle Pagaltzis