
Which tool should I use for finding out my memory allocation in Perl?

I've slurped in a big file using File::Slurp, but given the size of the file I can see that I must have it in memory twice, or perhaps it's getting inflated by being turned into 16-bit Unicode. How can I best diagnose that sort of problem in Perl?

The file I pulled in is 800 MB in size, and the Perl process that's analysing that data has roughly 1.6 GB allocated at runtime.

I realise that I may be wrong about the cause of the problem, but I'm not sure of the most efficient way to prove or disprove my theory.

Update:

I have eliminated dodgy character encoding from the list of suspects. It looks like I'm copying the variable at some point; I just can't figure out where.

Update 2:

I have now done some more investigation and discovered that it's actually just getting the data from File::Slurp that's causing the problem. I had a look through the documentation and discovered that I can get it to return a scalar_ref, i.e.

my $data = read_file($file, binmode => ':raw', scalar_ref => 1);

Then I don't get the memory inflation, which makes some sense. It is also the most logical way to get the data in my situation.
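To illustrate why the reference form can help, here is a minimal sketch in core Perl. It is not File::Slurp's implementation; `slurp_copy` and `slurp_ref` are hypothetical helpers that only contrast returning the string by value with returning a reference to it. Whether the by-value return really duplicates the buffer depends on the perl version (newer perls with copy-on-write strings may avoid some of the copying).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: returns the contents by value, so the caller
# may end up with a second copy of the buffer (perl-version dependent).
sub slurp_copy {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                 # slurp mode: read the whole file at once
    my $buf = <$fh>;
    close $fh;
    return $buf;
}

# Hypothetical helper: returns a reference, so only the reference is
# handed back and the buffer itself is not duplicated on return.
sub slurp_ref {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;
    my $buf = <$fh>;
    close $fh;
    return \$buf;
}

# Small demo file so the sketch runs anywhere.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "some line of text\n" x 1000;
close $out or die "Cannot close $path: $!";

my $data = slurp_ref($path);
# With a scalar ref, dereference with $$data wherever the string is needed.
printf "read %d bytes\n", length $$data;
```

The same dereferencing (`$$data`) applies to the value returned by read_file when scalar_ref is set.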

The information about looking at what variables exist etc. has generally been helpful though, thanks.

Colin Newell asked Jun 09 '10 14:06



2 Answers

Maybe Devel::DumpSizes and/or Devel::Size can help out? I think the former would be more useful in your case.

Devel::DumpSizes - Dump the name and size in bytes (in increasing order) of variables that are available at a given point in a script.

Devel::Size - Perl extension for finding the memory usage of Perl variables
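As a rough sketch of how Devel::Size might be used here (assuming the module is installed from CPAN; the variables are illustrative stand-ins, not the asker's data), `size` reports the memory a single variable occupies and `total_size` also follows the contents of an array or hash:

```perl
use strict;
use warnings;
use Devel::Size qw(size total_size);

my $text  = 'x' x 1_000_000;          # stand-in for the slurped file
my @lines = ('some line') x 1_000;    # stand-in for data split into lines

# size() of a scalar covers the string buffer plus perl's per-variable
# overhead, so it is expected to be at least as large as length().
printf "scalar: length=%d size=%d bytes\n", length $text, size($text);

# For arrays and hashes pass a reference; total_size() also counts the elements.
printf "array:  size=%d total_size=%d bytes\n",
    size(\@lines), total_size(\@lines);
```

Printing these figures before and after a suspect statement is one way to see which variable accounts for the extra memory.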

Htbaa answered Nov 10 '22 05:11



Here are some generic resources on memory issues in Perl:

  • http://perl.active-venture.com/pod/perldebguts-perlmemory.html
  • Perl memory usage profiling and leak detection?
  • How can I find memory leaks in long-running Perl program?

As for your own theory, the simplest way to prove or disprove it would be to write a simple Perl program that:

  1. Creates a big (100M) file of plain text, probably by just writing the same string to a file in a loop, or, for binary files, by running the dd command via a system() call

  2. Reads the file in using a standard Perl open() and @a = <$fh>;

  3. Measures memory consumption.

Then repeat #2-#3 for your 800M file.

That will tell you if the issue is File::Slurp, some weird logic in your program, or some specific content in the file (e.g. non-ASCII, although I'd be surprised if that ends up being the reason).
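The experiment described above could be sketched roughly as follows. This assumes Linux, where the resident set size can be read from /proc/self/status; `rss_kb` is a hypothetical helper, and the test file is kept to about 10 MB rather than 100 MB so the sketch runs quickly.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: resident memory of this process in kB (Linux only),
# or undef where /proc is not available.
sub rss_kb {
    open my $fh, '<', '/proc/self/status' or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return;
}

# Step 1: create a plain-text test file by writing the same string in a loop.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
my $line = ('abcdefghi' x 11) . "\n";      # 100 bytes per line
print {$out} $line for 1 .. 100_000;       # about 10 MB in total
close $out or die "Cannot close $path: $!";

my $before = rss_kb();

# Step 2: read the file with plain open() and the readline operator.
open my $in, '<', $path or die "Cannot open $path: $!";
my @a = <$in>;
close $in;

# Step 3: measure memory consumption and compare it with the file size.
my $after = rss_kb();
printf "file size: %d bytes, lines read: %d\n", -s $path, scalar @a;
if (defined $before && defined $after) {
    printf "RSS grew by about %d kB\n", $after - $before;
}
```

An array of lines carries per-element overhead, so its footprint is expected to exceed the file size; for a closer comparison with a slurped scalar, read the file with `local $/` into a single string instead.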

DVK answered Nov 10 '22 03:11
