
Memory use of Apply vs Map. Virtual memory use and lock-ups

I needed to find the sum of all pairs of numbers in a long list of pairs. Lots of ways to do this in Mathematica, but I was thinking of using either Plus or Total. Since Total works on lists, Map is the functional programming instrument to use there and Apply at level 1 (@@@) is the one to use for Plus, as Plus takes the numbers to be added as arguments.
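To make the two approaches concrete, here is a minimal sketch on a tiny list (the names `viaPlus` and `viaTotal` are only illustrative and do not appear in the original code). Both forms should give the same sums; they differ only in how the function is applied to each pair.

```mathematica
(* Small stand-in for the large list used below *)
pairs = Tuples[Range[3], {2}];

viaPlus = Plus @@@ pairs;   (* Apply at level 1: Plus[1, 1], Plus[1, 2], ... *)
viaTotal = Total /@ pairs;  (* Map: Total[{1, 1}], Total[{1, 2}], ... *)

viaPlus
(* {2, 3, 4, 3, 4, 5, 4, 5, 6} *)

viaPlus === viaTotal
(* True *)
```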

Here is some demo code (warning: save all your work before executing this!):

pairs = Tuples[Range[6000], {2}]; (* toy example *)

TimeConstrained[Plus @@@ pairs; // Timing, 30]

(* Out[4]= {21.73, Null} *)

Total /@ pairs; // Timing

(* Out[5]= {3.525, Null} *)

You might have noticed that I've added TimeConstrained to the code for Plus. This is a protective measure I included for you because the bare code brought my PC almost to its knees. In fact, the above code works for me, but if I increase the range in the first line to 7000, my computer locks up and never recovers. Nothing works: alt-period, program switching, ctrl-alt-delete, attempts to fire up the process manager from the taskbar, closing the laptop lid to let it sleep, etc. Really nothing.

The problem is caused by the extreme memory use of the Plus @@@ pairs line. While 'pairs' itself takes up about 288 MB, and the list of totals half of that, the Plus line quickly consumes about 7 GB for its calculations. This is the end of my free physical memory and anything bigger causes the use of virtual memory on disk. And Mathematica and/or Windows apparently don't play nice when virtual memory is used (BTW, do MacOS and Linux behave better?). In contrast, the Total line doesn't have a noticeable impact on the memory usage graph.

I have two questions:

  1. Given the equivalence between Plus and Total as stated in the documentation ("Total[list] is equivalent to Apply[Plus,list]." ) how to explain the extreme difference in behavior? I assume this has to do with the differences between Apply and Map, but I'm curious as to the internal mechanisms involved.
  2. I know I can restrict the memory footprint of a command by using MemoryConstrained, but it is a pain to have to use this wherever you suspect Mathematica might usurp all of your system resources. Is there a global setting that I can use to tell Mathematica to use physical memory only (or, preferably, a certain fraction thereof) for all of its operations? This would be extremely helpful, as this behavior has caused a handful of lockups over the last couple of weeks and it's really starting to annoy me.
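For reference, the per-command guard mentioned in question 2 might look like the sketch below. The 1 GB limit and the names `limit` and `result` are assumptions chosen for illustration, and the range is reduced to 1000 so the example stays small.

```mathematica
(* Hypothetical limit chosen for illustration: 1 GB, in bytes *)
limit = 1024^3;

pairs = Tuples[Range[1000], {2}];

(* Gives the list of sums if the evaluation stays under the limit;
   otherwise gives the third argument, here $Failed *)
result = MemoryConstrained[Plus @@@ pairs, limit, $Failed];
```

This protects a single evaluation only, which is exactly the inconvenience the question describes: every risky command has to be wrapped by hand.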
Sjoerd C. de Vries asked Jun 19 '11 21:06



1 Answer

Plus @@@ pairs unpacks the packed array, as turning on packing messages shows:

In[11]:= On["Packing"]
In[12]:= pairs=Tuples[Range[6000],{2}];
In[13]:= TimeConstrained[Plus@@@pairs;//Timing,30]
During evaluation of In[13]:= Developer`FromPackedArray::punpack1: Unpacking array with dimensions {36000000,2}. >>
Out[13]= $Aborted

The following computes the same sums without unpacking, so it uses much less memory:

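On["Packing"]  (* report any unpacking *)
pairs = Tuples[Range[6000], {2}];
a = pairs[[All, 1]]; b = pairs[[All, 2]];  (* split into two column vectors *)
Plus[a, b];  (* Plus is Listable, so this adds the columns element-wise *)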
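As a quick check, which is not part of the original answer, the sketch below runs the column-wise approach on a tiny array and compares it with the other forms. The names `small` and `sums` are illustrative, and `Developer`ToPackedArray` is applied explicitly so the example does not depend on how Tuples happens to return its result. `Total[small, {2}]` is shown as a further alternative that sums each pair without splitting the columns by hand.

```mathematica
small = Developer`ToPackedArray[Tuples[Range[4], {2}]];

a = small[[All, 1]]; b = small[[All, 2]];
sums = a + b;  (* same as Plus[a, b] *)

(* Same values as the Apply form from the question *)
sums === Plus @@@ small
(* True *)

(* Summing at level 2 gives one total per pair *)
Total[small, {2}] === sums
(* True *)

(* Reports whether the result is still a packed array *)
Developer`PackedArrayQ[sums]
```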

You can read more about packing in Mathematica here: http://www.wolfram.com/technology/guide/PackedArrays/
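To get a feel for why unpacking is so costly, a rough sketch (again an illustration, not something from the original answer) is to compare ByteCount for the same data in packed and unpacked form. The names `packed` and `unpacked` are assumptions, and the exact byte counts depend on the Mathematica version and platform, so none are quoted here.

```mathematica
packed = Developer`ToPackedArray[Tuples[Range[1000], {2}]];
unpacked = Developer`FromPackedArray[packed];

(* The two hold identical values ... *)
packed === unpacked
(* True *)

(* ... but the unpacked form stores every integer as a separate
   expression, so it reports a much larger size *)
ByteCount /@ {packed, unpacked}
```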

Joshua Martell answered Sep 27 '22 17:09