
How should I use Perl URI class?

Tags:

oop

url

perl

I need to handle some HTTP URLs in a Perl program, but I am not sure how the URI class should help me.

In particular, I'd like to use the URI class for resolving relative URLs and getting their components. However, the problems are:

  1. I need a function to work with both URI objects and URI strings as arguments (or ensure only one gets passed)

    sub foo_string_or_url {
      my $uri = URI->new(shift);
      ...
    }
    

    Is that the right approach? I don't quite like it, because it stringifies the URI and creates a new object unnecessarily.

  2. Extract the components

    my $host = $uri->host;
    

    This is also problematic, because not all URIs have a host; in particular, if someone passes garbage to the function, this will die().

  3. Resolve a relative URL

    my $new_url = URI::URL->new($uri, $base)->abs;
    

    IIUC, without the ->abs, the result will still stringify to the relative URL (and will not work for HTTP::Requests), am I right? Also, is this guaranteed to return a URI?

How should I handle these problems? The possibilities are:

  • Use ->isa('URI') and ->can("host") all the time
    • Seems error prone and ugly to me
  • Don't use URI class at all and parse URLs using regexes
    • I'd still rather use a library solution than debug my own
  • Wrap URI operations in try { ... } catch { ... }
    • see the first point

Is there a sane, fool-proof way of using the URI classes? Something simple I haven't thought of (in the list above)?

jpalecek asked Feb 11 '12 12:02


1 Answer

I think your question can be summarised: parameter validation is tedious, what do I do about it?

  1. I don't like it, either. This is a matter of differing opinion among developers; others say coercions are the best thing since sliced bread, especially when done automatically by Moose. I argue that allowing only one type of argument simplifies the program. Also, YAGNI applies in the vast majority of cases. Reject wrong types, and employ a helper module such as Params::Validate/MooseX::Method::Signatures/MooseX::Declare to avoid the manual checks shown in your code samples.

  2. This is the desired behaviour. Exception handling mechanisms let you write custom code appropriate for each situation. If you think it's not aesthetically pleasing, remove it and mind the consequences of letting exceptions go unchecked.

    use Try::Tiny;
    my $host;
    try {
        $host = $uri->host;
    } catch {
        warn "Could not determine host for $uri. Message was: $_. Retry/abort/ignore?\n";
        …
    };
    
  3. Yes and yes.
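
The three points can be sketched together as follows. This is only a minimal illustration, assuming the strict "URI objects only" policy from point 1. The helper names `require_uri`, `host_of` and `resolve` are hypothetical, and `URI->new_abs` is used in place of the `URI::URL->new(...)->abs` call from the question.

```perl
use strict;
use warnings;
use Carp qw(croak);
use Scalar::Util qw(blessed);
use URI;

# Hypothetical helper: accept only URI objects and reject everything else.
sub require_uri {
    my ($arg) = @_;
    croak "Expected a URI object"
        unless blessed($arg) && $arg->isa('URI');
    return $arg;
}

# Hypothetical helper: return the host, or undef when the URI's class
# has no host method (for example a scheme-less string or a mailto: URI).
sub host_of {
    my $uri = require_uri(shift);
    return $uri->can('host') ? $uri->host : undef;
}

# Hypothetical helper: resolve $rel against $base and return an
# absolute URI object.
sub resolve {
    my ($rel, $base) = @_;
    return URI->new_abs($rel, $base);
}
```

A caller that receives strings would convert them once at the boundary with `URI->new($string)` and pass objects from then on, so the check in `require_uri` lives in a single place.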

daxim answered Oct 07 '22 04:10