
How can I read Perl data structures from Python?

I've often seen people use Perl data structures in lieu of configuration files; i.e. a lone file containing only:

%config = (
    'color' => 'red',
    'numbers' => [5, 8],
    qr/^spam/ => 'eggs'
);

What's the best way to convert the contents of these files into Python-equivalent data structures, using pure Python? For the time being we can assume that there are no real expressions to evaluate, only structured data.

cdleary asked Dec 23 '08 20:12


3 Answers

Is pure Python a requirement? If not, you can load the file in Perl and convert it to YAML or JSON, then use PyYAML or the json module to load the result in Python.

codelogic answered Sep 26 '22 22:09


I'd just turn the Perl data structure into something else. Without seeing the actual file, I can't rule out some extra work that my solution doesn't cover.

If the only thing in the file is the one variable declaration (so, no 1; at the end, and so on), it can be really simple to turn your %config into YAML:

perl -MYAML -le 'print YAML::Dump( { do shift } )' ./filename

The do returns the value of the last expression it evaluated, so in this little program it returns the list of hash key-value pairs. Functions such as YAML::Dump prefer references, which tell them what the top-level structure is, so I wrap the do in curly braces to build a hash reference. (The path is written as ./filename because Perl 5.26 and later no longer include the current directory in @INC, so do would not find a bare relative name.) For your example, I'd get this YAML output:

---
(?-xism:^spam): eggs
color: red
numbers:
  - 5
  - 8

I don't know how Python will like that stringified regex, though. Do you really have a key that is a regex? I'd be curious to know how that's being used as part of the configuration.
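
On the Python side, a loader will hand back that regex key as a plain string. The following is a minimal sketch of one way to turn such keys back into compiled patterns. The names QR_WRAPPER and revive_regex_keys are made up for this illustration, the wrapper's flags are ignored, and it assumes the pattern body means the same thing in Python's re module as it does in Perl, which will not hold for every regex.

```python
import re

# Perl stringifies qr// as "(?-xism:...)" on older perls and "(?^:...)" on
# Perl 5.14 and later. This pattern captures the part after the wrapper.
QR_WRAPPER = re.compile(r"^\(\?(?:\^[a-z]*|[a-z]*-[a-z]*):(.*)\)$", re.DOTALL)

def revive_regex_keys(data):
    """Return a copy of `data` with stringified Perl regex keys compiled."""
    revived = {}
    for key, value in data.items():
        match = QR_WRAPPER.match(key) if isinstance(key, str) else None
        # Assumption: the wrapper's flags are dropped and the pattern body
        # is valid Python regex syntax as well as Perl syntax.
        revived[re.compile(match.group(1)) if match else key] = value
    return revived

# The mapping a YAML or JSON loader would produce for the output above.
loaded = {"(?-xism:^spam)": "eggs", "color": "red", "numbers": [5, 8]}
config = revive_regex_keys(loaded)
```

After this, config still has the 'color' and 'numbers' entries untouched, and the third key is a compiled pattern whose .pattern attribute is '^spam'.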


If there's extra stuff in the file, life is a bit tougher. There's probably a really clever way around that, but I used the same idea and just hard-coded the name of the variable I wanted.

I tried this on the Perl data structure that the CPAN.pm module uses, and it looks like it came out fine. The only ugliness is that you need to know the variable name in advance. Now that you've seen the error of configuration in Perl code, avoid making the same mistake with Python code. :)

YAML:

perl -MYAML -le 'do shift; print YAML::Dump( $CPAN::Config )' MyConfig.pm

JSON:

perl -MJSON::Any -le 'do shift; my $j = JSON::Any->new; print $j->objToJson( $CPAN::Config )' MyConfig.pm

or

# suggested by JF Sebastian
perl -MJSON -le 'do shift; print to_json( $CPAN::Config )' MyConfig.pm
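
If you want to drive that from Python rather than the shell, something like the following sketch could work. It assumes perl and the JSON module are installed. The function name load_perl_config and its run parameter are hypothetical, added only for this illustration, with run there so the perl call can be swapped out when testing.

```python
import json
import os
import subprocess

def load_perl_config(path, variable="$CPAN::Config", run=subprocess.check_output):
    """Have perl evaluate `path` and return `variable` as Python data.

    `variable` must be a Perl expression yielding a reference, and it is
    pasted into Perl code, so it should only ever come from trusted input.
    `run` is a hypothetical hook so the perl call can be replaced in tests.
    """
    script = "do shift; print to_json( %s )" % variable
    # An absolute path avoids depending on "." being in perl's @INC.
    argv = ["perl", "-MJSON", "-le", script, os.path.abspath(path)]
    return json.loads(run(argv).decode("utf-8"))
```

A call such as load_perl_config("MyConfig.pm") would then return an ordinary dict. For a plain hash like %config, the variable argument would need to be a reference, for example "\\%config" in Python source.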

XML::Simple doesn't work out so well because it treats everything as an attribute, but maybe someone can improve on this:

perl -MXML::Simple -le 'do shift; print XMLout( $CPAN::Config )' MyConfig.pm

brian d foy answered Sep 23 '22 22:09


Not sure what the use case is. Here's my assumption: you're going to do a one-time conversion from Perl to Python.

Perl has this

%config = (
    'color' => 'red',
    'numbers' => [5, 8],
    qr/^spam/ => 'eggs'
);

In Python, it would be

import re

config = {
    'color' : 'red',
    'numbers' : [5, 8],
    re.compile( r"^spam" ) : 'eggs'
}

So, I'm guessing it takes a handful of regex substitutions to replace:

  • %variable = ( with variable = {
  • ); with }
  • key => value with key : value
  • qr/.../ => value with re.compile( r"..." ) : value
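
A rough sketch of those substitutions follows. The function name perl_hash_to_python_source is made up for this illustration. It assumes simple, data-only input like the example: it would mangle strings that contain => or );, and regexes that contain a double quote.

```python
import re

def perl_hash_to_python_source(perl_text):
    """Apply the substitutions listed above to simple, data-only Perl."""
    text = perl_text
    # qr/.../ =>  becomes  re.compile(r"...") :
    text = re.sub(
        r"qr/((?:[^/\\]|\\.)*)/\s*=>",
        lambda m: 're.compile(r"%s") :' % m.group(1),
        text,
    )
    # Every remaining  key => value  becomes  key : value
    text = text.replace("=>", ":")
    # %variable = (  becomes  variable = {
    text = re.sub(r"%(\w+)\s*=\s*\(", r"\1 = {", text)
    # The closing  );  becomes  }
    text = re.sub(r"\)\s*;", "}", text)
    return text

perl_text = """%config = (
    'color' => 'red',
    'numbers' => [5, 8],
    qr/^spam/ => 'eggs'
);
"""
python_source = perl_hash_to_python_source(perl_text)
print(python_source)
```

For a one-time conversion you would save the printed text as a Python module (with import re at the top) and check it by eye before using it.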

However, Python's built-in dict treats a compiled regex as just another key; a lookup with a string will never be matched against it. If you want that behavior, you'd have to write your own subclass of dict and override __getitem__ to check the regex keys separately.

import re

class PerlLikeDict( dict ):
    pattern_type= type(re.compile(""))  # portable name for the compiled-pattern type
    def __getitem__( self, key ):
        if key in self:  # an exact key match wins
            return super( PerlLikeDict, self ).__getitem__( key )
        for k in self:  # otherwise try each regex key against the lookup key
            if type(k) == self.pattern_type:
                if k.match(key):
                    return self[k]
        raise KeyError( "key %r not found" % ( key, ) )

Here's an example of using the Perl-like dict.

>>> import re
>>> pat= re.compile( "hi" )
>>> a = { pat : 'eggs' } # native dict, no features.
>>> x=PerlLikeDict( a )
>>> x['b']= 'c'
>>> x
{<_sre.SRE_Pattern object at 0x75250>: 'eggs', 'b': 'c'}
>>> x['b']
'c'
>>> x['ji']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 10, in __getitem__
KeyError: "key 'ji' not found"
>>> x['hi']
'eggs'

S.Lott answered Sep 26 '22 22:09