I run the exact same Python function in two ways: once as a PostgreSQL PL/Python function, and once outside PostgreSQL as a regular Python script.
Surprisingly, calling the PL/Python function with select * from pymax7(20000); takes 65 seconds on average, while running the plain script with python myscript.py 20000 takes 48 seconds on average. The averages were computed by running the query and the script 10 times each.
Should such a difference be expected? How does Python inside the PostgreSQL RDBMS (PL/Python) compare with Python outside it in terms of performance?
I'm running PostgreSQL 9.1 and Python 2.7 on 64-bit Ubuntu 12.04.
PostgreSQL PL/Python:
CREATE FUNCTION pymax7 (b integer)
RETURNS float
AS $$
a = 0
for i in range(b):
    for ii in range(b):
        a = (((i+ii)%100)*149819874987)
return a
$$ LANGUAGE plpythonu;
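For reference, one way to take the database-side timing is psql's built-in \timing meta-command, which reports the total round-trip time of each statement:

\timing on
SELECT * FROM pymax7(20000);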
Python:
import time
import sys

def pymax7(b):
    a = 0
    for i in range(b):
        for ii in range(b):
            a = (((i+ii)%100)*149819874987)  # keeping Python busy
    return a

def main():
    numIterations = int(sys.argv[1])
    start = time.time()
    print pymax7(numIterations)
    end = time.time()
    print "Time elapsed in Python:"
    print str((end - start)*1000) + ' ms'

if __name__ == "__main__":
    main()
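For what it's worth, a minimal sketch of how the 10-run average mentioned above could be computed (the driver script itself is hypothetical; myscript.py is the script from the question):

# run_average.py -- hypothetical driver, not part of the original question
import subprocess
import time

times = []
for _ in range(10):
    start = time.time()
    # run the script under test as a separate process
    subprocess.call(['python', 'myscript.py', '20000'])
    times.append(time.time() - start)

print 'average: %.1f s' % (sum(times) / len(times))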
There shouldn't be any difference. Both of your test cases run in about the same time for me: 53 seconds, plus or minus 1.
I did adjust the PL/Python test case to use the same measuring technique as the plain Python test case:
CREATE FUNCTION pymax7a (b integer)
RETURNS float
AS $$
import time
start = time.time()
a = 0
for i in range(b):
    for ii in range(b):
        a = (((i+ii)%100)*149819874987)
end = time.time()
plpy.info("Time elapsed in Python: " + str((end - start)*1000) + ' ms')
return a
$$ LANGUAGE plpythonu;
This would tell you if there is any non-Python overhead involved. FWIW, for me, the difference between what this printed and what psql on the client printed as the total time was consistently less than 1 millisecond.
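If you want to put a number on that overhead directly, a minimal sketch (the function name pynoop is my own, not from the question) is to time a PL/Python function with an essentially empty body, so that the measured time is almost entirely call and result-marshalling overhead rather than Python work:

CREATE FUNCTION pynoop() RETURNS integer AS $$
# no work at all: whatever \timing reports is overhead
return 0
$$ LANGUAGE plpythonu;

\timing on
SELECT pynoop();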