Reverse Engineering a Perl script based on a core dump

Question

A friend's server (yes, really. Not mine.) was broken into and we discovered a perl binary running some bot code. We could not find the script itself (probably eval'ed as received over the network), but we managed to create a core dump of the perl process.

Running strings on the core gave us some hints (hostnames, usernames / passwords), but not the source code of the script.

We'd like to know what the script was capable of doing, so we'd like to reverse-engineer the perl code that was running inside that perl interpreter.

Searching around, the closest thing to a perl de-compiler I found is the B::Deparse module which seems to be perfectly suitable for converting the bytecode of the parse-trees back into readable code.

Now, how do I get B::Deparse to operate on a core dump? Or, alternatively, how could I restart the program from the core, load B::Deparse and execute it?

Any ideas are welcome.

Josh Jore · Accepted Answer

ysth asked me on IRC to comment on your question. I've done a fair amount of work "disassembling" compiled perl (see my CPAN page [http://search.cpan.org/~jjore]).

Perl compiles your source to a tree of OP* structs which occasionally have C pointers to SV* which are perl values. Your core dump now has a bunch of those OP* and SV* stashed.
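To make the OP tree idea concrete, here is a minimal sketch that walks the tree of a small sub in a live interpreter and lists the op names. The sub `sample` and the helper `op_names` are made up for illustration, and the exact ops you see vary between perl versions.

```perl
use strict;
use warnings;
use B qw(svref_2object OPf_KIDS);

# Hypothetical sub whose compiled form we want to look at.
sub sample { my $greeting = 'hello'; return length $greeting }

# Recursively collect the name of every op at or below $op.
sub op_names {
    my ($op, $names) = @_;
    return $names if !$$op;    # a null B object wraps the address 0
    push @$names, $op->name;
    if ($op->flags & OPf_KIDS) {
        # Children hang off ->first and are chained through ->sibling.
        for (my $kid = $op->first; $$kid; $kid = $kid->sibling) {
            op_names($kid, $names);
        }
    }
    return $names;
}

my $cv    = svref_2object(\&sample);      # B::CV wrapping the compiled sub
my $names = op_names($cv->ROOT, []);      # ROOT is the top of the OP tree
print join(' ', @$names), "\n";
```

Each name printed corresponds to one OP* struct in memory; those structs are what a core dump still contains.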

The best possible world would be to have a perl module like B::Deparse do the work of interpreting that data for you. It works by using a light interface to perl memory in the B::OP and B::SV classes (documented in B, perlguts, and perlhack). This is unrealistic for you because a B::* object is just a pointer into memory with accessors to decode the struct for our use. Consider:

require Data::Dumper;
require Scalar::Util;
require B;

my $value = 'this is a string';

my $sv      = B::svref_2object( \ $value );
my $address = Scalar::Util::refaddr( \ $value );

local $Data::Dumper::Sortkeys = 1;
local $Data::Dumper::Purity   = 1;
print Data::Dumper::Dumper(
  {
    address => $address,
    value   => \ $value,
    sv      => $sv,
    sv_attr => {
      CUR           => $sv->CUR,
      LEN           => $sv->LEN,
      PV            => $sv->PV,
      PVBM          => $sv->PVBM,
      PVX           => $sv->PVX,
      as_string     => $sv->as_string,
      FLAGS         => $sv->FLAGS,
      MAGICAL       => $sv->MAGICAL,
      POK           => $sv->POK,
      REFCNT        => $sv->REFCNT,
      ROK           => $sv->ROK,
      SvTYPE        => $sv->SvTYPE,
      object_2svref => $sv->object_2svref,
    },
  }
);

which, when run, showed that the B::PV object (which ISA B::SV) is truly just an interface to the in-memory representation of the compiled string 'this is a string'.

$VAR1 = {
          'address' => 438506984,
          'sv' => bless( do{\(my $o = 438506984)}, 'B::PV' ),
          'sv_attr' => {
                         'CUR' => 16,
                         'FLAGS' => 279557,
                         'LEN' => 24,
                         'MAGICAL' => 0,
                         'POK' => 1024,
                         'PV' => 'this is a string',
                         'PVBM' => 'this is a string',
                         'PVX' => 'this is a string',
                         'REFCNT' => 2,
                         'ROK' => 0,
                         'SvTYPE' => 5,
                         'as_string' => 'this is a string',
                         'object_2svref' => \'this is a string'
                       },
          'value' => do{my $o}
        };
$VAR1->{'value'} = $VAR1->{'sv_attr'}{'object_2svref'};

This, however, implies that any code using B::* must actually operate on live memory. Tye McQueen thought he remembered a C debugger which could fully revive a working process given a core dump. My gdb can't.

What gdb can do is dump the contents of your OP* and SV* structs. You would most likely just read the dumped structs to interpret your program's structure. You could, if you wished, use gdb to dump the structs, then synthetically create B::* objects which behave as if they were ordinary and run B::Deparse on them. At root, our deparser and other debug dumping tools are mostly object oriented, so you could just "fool" them by creating a pile of fake B::* classes and objects.
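As a rough sketch of what such a fake object might look like: the class name `Fake::B::PV` and the `%dumped` table are hypothetical, and the field values are simply copied from the Data::Dumper output above rather than from a real core dump. Like a real B object, the fake is a blessed reference to a scalar holding an address, but its accessors read from a lookup table instead of live memory.

```perl
use strict;
use warnings;
use B ();

# Hypothetical table of struct fields, keyed by the address seen in the dump.
# In practice this would be filled in from what gdb printed.
my %dumped = (
    438506984 => { PV => 'this is a string', CUR => 16, LEN => 24, REFCNT => 2 },
);

package Fake::B::PV {
    # Same shape as a real B object: blessed ref to a scalar holding an address.
    sub new { my ($class, $address) = @_; return bless \( my $o = $address ), $class }

    # Accessors look the field up in the table instead of reading memory.
    sub PV     { $dumped{ ${ $_[0] } }{PV} }
    sub CUR    { $dumped{ ${ $_[0] } }{CUR} }
    sub LEN    { $dumped{ ${ $_[0] } }{LEN} }
    sub REFCNT { $dumped{ ${ $_[0] } }{REFCNT} }
}

my $fake = Fake::B::PV->new(438506984);

# B::class() only looks at the last part of the package name.
printf "%s at %d holds '%s' (CUR=%d)\n",
    B::class($fake), $$fake, $fake->PV, $fake->CUR;
```

This only covers one struct type and a handful of accessors. Getting B::Deparse to run would presumably need the same treatment for every OP and SV class it touches, so treat this as an illustration of the idea and not a working tool.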

You may find reading the B::Deparse class's coderef2text method instructive. It accepts a function reference, casts it to a B::CV object, and uses that for input to the deparse_sub method:

require B;
require B::Deparse;
sub your_function { ... }

my $cv = B::svref_2object( \ &your_function );
my $deparser = B::Deparse->new;
print $deparser->deparse_sub( $cv );
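For comparison, here is a small sketch of coderef2text applied to code that only ever existed as an eval'ed string, which is roughly the situation described in the question. The string in `$received` is a made-up stand-in, and this only works inside a live interpreter, not against a core dump.

```perl
use strict;
use warnings;
use B::Deparse ();

# Stand-in for source that arrived over the network and was only eval'ed.
my $received = 'sub { my ($n) = @_; return $n * 2 }';
my $code     = eval $received or die $@;

# coderef2text returns the body of the sub, braces included.
my $text = B::Deparse->new->coderef2text($code);
print "sub $text\n";
```

The recovered text is regenerated from the OP tree, so formatting and pragma lines will differ from the string that was originally eval'ed.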

For gentler introductions to OP* and related ideas, see the updated PerlGuts Illustrated and Optree guts.

blucz · Answer

I doubt there's a tool out there that does this out of the box, so...

1. Find the source code to the version of perl you were running. This should help you understand the memory layout of the perl interpreter. It will also help you figure out if there's a way to take a shortcut here (e.g. if bytecode is preceded by an easy-to-find header in memory or something).
2. Load up the binary + core dump in a debugger, probably gdb.
3. Use the information in the perl source code to guide you in convincing the debugger to spit out the bytecode you're interested in.

Once you have the bytecode, B::Deparse should be able to get you to something more readable.

Tags: reverse-engineering, perl

otmar
