We have a requirement to do some computationally heavy lifting connected with an Oracle database. Until now we have performed our numerical calculations in PL/SQL and largely put up with the lack of performance.
Now that we are implementing a multi-echelon inventory model (https://www.researchgate.net/publication/222409130_Evaluation_of_time-varying_availability_in_multi-echelon_spare_parts_systems_with_passivation), I am much more concerned about performance because of the size of the problem.
I have implemented part of the algorithm in three languages: Fortran (90-2008, compiled with gfortran), VBA in Excel, and PL/SQL, and wrapped a one-million-call test loop around it. Even using the binary_double
data type and native compilation with PLSQL_CODE_TYPE=NATIVE
(both of which yield an improvement), the test code below still takes 37s to run (Oracle XE 11.2). By comparison, VBA takes 16s and Fortran 1.6s on the same hardware.
While it would probably be too much to ask for performance approaching the Fortran figure (though that would obviously be highly desirable), I am surprised that even humble VBA outperforms the PL/SQL.
So my question has two parts: is a performance gap of this size simply inherent to PL/SQL, and if not, what can I do to narrow it?
Also, though I am not directly after a critique of my source code per se, I would be grateful if anyone can spot any obvious optimizations I could incorporate.
The function timeebo is the test function, which I invoke with a simple select timeebo from dual; in SQL Developer.
create or replace FUNCTION gammln(
x IN binary_double)
RETURN binary_double
IS
--Lanczos' approximation to the Log Gamma function
gammln binary_double;
ser binary_double;
tmp binary_double;
BEGIN
tmp := x + 5.5;
tmp :=(x + 0.5) * Ln(tmp) - tmp;
ser := 1.000000000190015;
ser := ser + 76.18009172947146 /(x + 1.0) ;
ser := ser - 86.50532032941677 /(x + 2.0) ;
ser := ser + 24.01409824083091 /(x + 3.0) ;
ser := ser - 1.231739572450155 /(x + 4.0) ;
ser := ser + 1.208650973866179E-03 /(x + 5.0) ;
ser := ser - 5.395239384953E-06 /(x + 6.0) ;
RETURN tmp + Ln(2.5066282746310005 * ser / x) ;
END;
/
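As a cross-check on the constants (outside the question itself), the same six-coefficient Lanczos approximation can be transcribed into Python and compared against the standard library's math.lgamma; the two should agree to roughly ten significant digits, the accuracy usually quoted for this form.

```python
import math

# Direct transcription of the PL/SQL gammln above (same Lanczos coefficients).
def gammln(x):
    tmp = x + 5.5
    tmp = (x + 0.5) * math.log(tmp) - tmp
    ser = 1.000000000190015
    ser += 76.18009172947146 / (x + 1.0)
    ser -= 86.50532032941677 / (x + 2.0)
    ser += 24.01409824083091 / (x + 3.0)
    ser -= 1.231739572450155 / (x + 4.0)
    ser += 1.208650973866179e-03 / (x + 5.0)
    ser -= 5.395239384953e-06 / (x + 6.0)
    return tmp + math.log(2.5066282746310005 * ser / x)

# Agreement with math.lgamma across the argument range the EBO code uses
for x in (1.0, 10.0, 501.0):
    assert abs(gammln(x) - math.lgamma(x)) <= 1e-9 * max(1.0, abs(math.lgamma(x)))
```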
CREATE OR REPLACE FUNCTION PoissonDist(
k IN INTEGER,
lambda IN binary_double)
RETURN binary_double
IS
BEGIN
RETURN Exp((k * Ln(lambda)) - lambda - gammln(k + 1)) ;
END;
/
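For context: PoissonDist evaluates the Poisson pmf in log space, p(k; lambda) = exp(k*ln(lambda) - lambda - lnGamma(k+1)), which avoids overflow in lambda^k and k! for large k. A small Python sketch (substituting the stdlib math.lgamma for gammln) confirms it agrees with the direct formula lambda^k * e^(-lambda) / k!.

```python
import math

def poisson_dist(k, lam):
    # Log-space evaluation, mirroring the PL/SQL PoissonDist
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

# Compare against the textbook form, which would overflow for large k / lambda
for k, lam in ((0, 0.5), (3, 2.0), (20, 15.0)):
    direct = lam ** k * math.exp(-lam) / math.factorial(k)
    assert abs(poisson_dist(k, lam) - direct) <= 1e-12 * max(direct, 1e-300)
```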
CREATE OR REPLACE FUNCTION EBO(
stock IN pls_integer,
pipeline IN binary_double,
accuracy IN binary_double DEFAULT 0.0000000001)
RETURN binary_double
IS
i pls_integer;
EBO binary_double;
term binary_double;
temp binary_double;
PoissonVal binary_double;
peaked BOOLEAN; --Flag the Poisson curve as having peaked
BEGIN
EBO := 0.0;
IF(pipeline = 0.0) THEN
RETURN EBO;
END IF;
--Initialise
i := 1;
peaked := false;
PoissonVal := PoissonDist(stock + 1, pipeline) ; --Get p() value
IF(PoissonVal < accuracy AND floor(pipeline) > stock) THEN --If p() is very small...
i := floor(pipeline) - stock; --Revise i to just below peak of Poisson curve
PoissonVal := PoissonDist(stock + i, pipeline) ; --Get p() value close to peak
temp := PoissonVal *(pipeline / CAST(stock + i + 1 AS binary_double)) ; --Store Poisson value just above peak
LOOP
term := CAST(i AS binary_double) * PoissonVal;
EBO := EBO + term;
i := i - 1; --Work backwards
PoissonVal := PoissonVal *(CAST(stock + i + 1 AS binary_double) / pipeline) ; --Revise Poisson value for next time
EXIT WHEN(term < accuracy OR i = 0) ;
END LOOP;
i := 1 + floor(pipeline) - stock;
PoissonVal := temp;
peaked := true;
END IF;
LOOP
term := CAST(i AS binary_double) * PoissonVal;
EBO := EBO + term;
i := i + 1;
PoissonVal := PoissonVal *(pipeline / CAST(stock + i AS binary_double)) ; --Revise Poisson value for next time
IF(CAST(stock + i AS binary_double) > pipeline) THEN
peaked := true;
END IF;
EXIT WHEN(term < accuracy AND peaked) ;
END LOOP;
IF(EBO < accuracy) THEN
EBO := 0.0;
END IF;
RETURN EBO;
END;
/
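The quantity EBO computes is the expected number of backorders for stock level s and a Poisson-distributed pipeline lambda: EBO(s, lambda) = sum over x > s of (x - s) * p(x; lambda). As a reference (again, a sketch rather than the question's code), here is a plain Python version of the forward loop only, omitting the backward-iteration branch the PL/SQL uses when p(s+1) underflows; it relies on the recurrence p(k+1) = p(k) * lambda / (k+1) to avoid re-evaluating the pmf every iteration.

```python
import math

def ebo(stock, pipeline, accuracy=1e-10):
    # Expected backorders: sum of i * p(stock + i; pipeline) for i = 1, 2, ...
    if pipeline == 0.0:
        return 0.0
    # p(stock + 1), evaluated once in log space
    p = math.exp((stock + 1) * math.log(pipeline) - pipeline
                 - math.lgamma(stock + 2))
    total, i, peaked = 0.0, 1, False
    while True:
        term = i * p
        total += term
        i += 1
        p *= pipeline / (stock + i)   # Poisson recurrence: next pmf value
        if stock + i > pipeline:      # past the mode, terms now shrink
            peaked = True
        if term < accuracy and peaked:
            break
    return total

# Sanity check: with zero stock every pipeline unit is backordered, so EBO = lambda
assert abs(ebo(0, 2.5) - 2.5) < 1e-8
```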
CREATE OR REPLACE FUNCTION timeebo
RETURN binary_double
IS
i pls_integer;
EBOVal binary_double;
acc binary_double;
BEGIN
acc := 0.0;
FOR i IN 1..1000000
LOOP
EBOVal := EBO(500, CAST(i AS binary_double) / 1000.0) ;
acc := acc + EBOVal;
END LOOP;
RETURN acc;
END;
This is not an answer to the OP's question (but it is a "proof of concept" to show how something like what he is doing in PL/SQL can be done in plain SQL). I only use the Answer format because what I will do below doesn't fit in a Comment.
The OP asked to see how the summation of an infinite series up to a desired degree of precision can be done in pure SQL. I will give two examples.
First example:
An infinite series with positive terms: e = sum [ j = 0 to infinity ] ( 1 / factorial(j) ). A trivial upper bound for the error for the sum from 0 to n is the last term added to the series. The recursive query below (which requires Oracle 11.1 or above - actually 11.2 the way I wrote it, with column names in the declaration, but it can be easily changed for 11.1) computes the value of e precise to 38 decimal places (the maximum precision available in Oracle). The inverse-factorial series for e converges very quickly; this only takes 35 steps and runs in less than 0.001 seconds on my old, home-class computer (which is just a big Dell tablet with a keyboard).
Edit: Duh! Only I could post something where e = 3.71828! Even though the recursive query adds ALL the terms (including 1/0!), I started the sum from 1 instead of 0. (Corrected now; the mistake was there before this edit.)
with
rec ( j, s, next_term, err ) as (
select 0, 0, 1, 2
from dual
union all
select j+1, s + next_term, next_term/(j+1), next_term
from rec
where err > power(10, -38) and j < 1000000000
)
select max(j) as steps, round(max(s), 38) as e
from rec
;
STEPS E
----- ----------------------------------------
35 2.71828182845904523536028747135266249776
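The same stopping rule is easy to render in Python with the decimal module, which makes one subtlety of the CTE explicit: because err lags next_term by one row, the first term below the threshold is still added to the sum (which is what pushes the 38th decimal to ...76 rather than ...75).

```python
from decimal import Decimal, getcontext

getcontext().prec = 45            # a few guard digits beyond the 38 we want
eps = Decimal(10) ** -38

j, s, term = 0, Decimal(0), Decimal(1)
while term > eps:                 # err > power(10, -38) in the CTE
    s += term
    j += 1
    term /= j                     # next_term / (j + 1) in the CTE
s += term  # the CTE's err lags one step, so the first sub-threshold term still enters

e_38 = s.quantize(Decimal(10) ** -38)
assert str(e_38) == "2.71828182845904523536028747135266249776"
```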
Second example:
OK, so now let's take an alternating series (where the absolute value of the last term is always an upper bound for the error), and let's take a very slow-converging one:
ln( 2 ) = sum [ j = 1 to infinity ] ( (-1)^(j - 1) / j )
The query below computes ln(2), accurate to five decimal places; here we know beforehand we need 100,000 steps, and the computation on my machine took about 1.1 seconds. (Remember though, this is a VERY slowly converging series.)
with
rec ( j, s, sgn ) as (
select 0, 0, 1
from dual
union all
select j+1, s + sgn / (j+1), -sgn
from rec
where j <= 100000
)
select 100000 as steps, round(s, 5) as ln_2
from rec
where j = 100000
;
STEPS LN_2
------ -------
100000 0.69314
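For reference, the same 100,000-term partial sum in ordinary floating point; for an alternating series the first omitted term, 1/100001, bounds the truncation error, which is why five decimals is the most this many steps can deliver.

```python
import math

# 1 - 1/2 + 1/3 - ... for 100,000 terms; matches the CTE row with j = 100000
s, sgn = 0.0, 1.0
for j in range(1, 100001):
    s += sgn / j
    sgn = -sgn

assert round(s, 5) == 0.69314        # the query's output
assert abs(s - math.log(2)) < 1e-5   # |error| <= first omitted term, 1/100001
```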
To summarise my findings based on the suggestions in the comments above: eliminating the casting has had the largest effect of the suggested refinements. Doing everything in binary_double makes the code run around 3.5x faster, at a hair over ten seconds. Applying simple_double and PRAGMA INLINE then reduces this to around 9.8s on average.
The current version of my code, with the above changes applied, is as follows. The PoissonDist function could be optimized further, but the original question was more about how to make existing code run faster than about optimizing the code per se.
EDIT: The code has been revised to reflect the correct use of PRAGMA INLINE, which reduced the execution time to around 9.5s.
create or replace FUNCTION gammln(
x IN simple_double)
RETURN simple_double
IS
--Lanczos' approximation to the Log Gamma function
ser simple_double := 1.000000000190015d;
tmp simple_double := x + 5.5d;
BEGIN
tmp :=(x + 0.5d) * Ln(tmp) - tmp;
ser := ser + 76.18009172947146d /(x + 1.0d) ;
ser := ser - 86.50532032941677d /(x + 2.0d) ;
ser := ser + 24.01409824083091d /(x + 3.0d) ;
ser := ser - 1.231739572450155d /(x + 4.0d) ;
ser := ser + 1.208650973866179E-03d /(x + 5.0d) ;
ser := ser - 5.395239384953E-06d /(x + 6.0d) ;
RETURN tmp + Ln(2.5066282746310005d * ser / x) ;
END;
/
create or replace FUNCTION PoissonDist(
k IN simple_double,
lambda IN simple_double)
RETURN simple_double
IS
BEGIN
PRAGMA INLINE (gammln, 'YES');
RETURN Exp((k * Ln(lambda)) - lambda - gammln(k + 1d)) ;
END;
/
CREATE OR REPLACE FUNCTION EBO(
stock IN simple_double,
pipeline IN simple_double,
accuracy IN simple_double DEFAULT 0.0000000001)
RETURN simple_double
IS
i simple_double := 1d;
EBO simple_double := 0d;
term simple_double := 0d;
temp simple_double := 0d;
PRAGMA INLINE(PoissonDist, 'YES') ;
PoissonVal simple_double := PoissonDist(stock + 1d, pipeline) ;
peaked BOOLEAN := false; --Flag the Poisson curve as having peaked
BEGIN
IF(pipeline = 0.0d) THEN
RETURN EBO;
END IF;
IF(PoissonVal < accuracy AND floor(pipeline) > stock) THEN --If p() is very small...
i := floor(pipeline) - stock; --Revise i to just below peak of Poisson curve
PRAGMA INLINE(PoissonDist, 'YES') ;
PoissonVal := PoissonDist(stock + i, pipeline) ; --Get p() value close to peak
temp := PoissonVal *(pipeline /(stock + i + 1d)) ; --Store Poisson value just above peak
LOOP
term := i * PoissonVal;
EBO := EBO + term;
i := i - 1d; --Work backwards
PoissonVal := PoissonVal *((stock + i + 1d) / pipeline) ; --Revise Poisson value for next time
EXIT WHEN(term < accuracy OR i = 0d) ;
END LOOP;
i := 1d + floor(pipeline) - stock;
PoissonVal := temp;
peaked := true;
END IF;
LOOP
term := i * PoissonVal;
EBO := EBO + term;
i := i + 1d;
PoissonVal := PoissonVal *(pipeline /(stock + i)) ; --Revise Poisson value for next time
IF((stock + i) > pipeline) THEN
peaked := true;
END IF;
EXIT WHEN(term < accuracy AND peaked) ;
END LOOP;
IF(EBO < accuracy) THEN
EBO := 0.0d;
END IF;
RETURN EBO;
END;
/
create or replace FUNCTION timeebo
RETURN binary_double
IS
i binary_double;
EBOVal binary_double;
acc binary_double;
BEGIN
acc := 0.0d;
FOR i IN 1d..1000000d
LOOP
PRAGMA INLINE (EBO, 'YES');
EBOVal := EBO(500d, i / 1000d) ;
acc := acc + EBOVal;
END LOOP;
RETURN acc;
END;