I wrote a program to make a list of primes from 2 to a user given number in both python and C. I ran both the programs looking for primes up to the same number and looked at their respective processes in activity monitor. I found that the python implementation used exactly 9 times as much memory as the C implementation. Why does python require so much more memory and why that specific multiple to store the same array of integers? Here are both implementations of the program:
Python version:
import math
import sys
top = int(input('Please enter the highest number you would like to have checked: '))
num = 3
prime_list = [2]
while num <= top:
n = 0
prime = True
while int(prime_list[n]) <= math.sqrt(num):
if num % prime_list[n] == 0:
prime = False
n = 0
break
n = n + 1
if prime == True:
prime_list.append(num)
prime = False
num = num + 1
print("I found ", len(prime_list), " primes")
print("The largest prime I found was ", prime_list[-1])
C version:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <sys/types.h>
#include <unistd.h>
int main(){
int N;
int arraySize = 1;
int *primes = malloc(100*sizeof(int));
int isPrime = 1;
primes[0] = 2;
int timesRealloc = 0;
int availableSlots = 100;
printf("Please enter the largest number you want checked: \n");
scanf("%d", &N);
int j = 0;
int i;
for (i = 3; i <= N; i+=2){
j = 0;
isPrime = 1;
while (primes[j] <= sqrt(i)) {
if (i%primes[j] == 0) {
isPrime = 0;
break;
}
j++;
}
if (isPrime == 1){
primes[arraySize] = i;
arraySize++;
}
if (availableSlots == arraySize){
timesRealloc++;
availableSlots += 100;
primes = realloc(primes, availableSlots*sizeof(int));
}
}
printf("I found %d primes\n", arraySize);
printf("Memory was reallocated %d times\n", timesRealloc);
printf("The largest prime I found was %d\n", primes[(arraySize-1)]);
return 0;
}
>>> import sys
>>> sys.getsizeof(123456)
28
That's 7 times the size of C int
. In Python 3 integers are instances of struct _longobject
a.k.a PyLong
:
struct _longobject {
PyVarObject ob_base;
digit ob_digit[1];
};
where PyVarObject
is
typedef struct {
PyObject ob_base;
Py_ssize_t ob_size;
} PyVarObject;
and PyObject
is
typedef struct _object {
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
} PyObject;
From that we get the following memory usage for that object 123456 in 64-bit Python build:
Py_ssize_t
)&PyLong_Type
(of type PyTypeObject *
Py_ssize_t
)Since 123456 fits in the first 30 bits, this sums up to 28, or 7 * sizeof (int)
That in addition to the fact that each element in a Python list
is a PyObject *
which points the actual object; each of these pointers are 64 bits on 64-bit Python builds; which means that each list element reference alone consumes twice as much memory as a C int
.
Add together 7 and 2 and you get 9.
For more storage-efficient code you can use arrays; with type code 'i'
the memory consumption should be quite close to the C version. array
s have the append
method thanks to which growing an array should be even easier than in C / with realloc
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With