Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using derivatives as functions in CppAD

I am trying to modify the example here:

# include <cppad/cppad.hpp>
namespace { // ---------------------------------------------------------
// define the template function JacobianCases<Vector> in empty namespace
template <typename Vector>
bool JacobianCases()
{     bool ok = true;
     using CppAD::AD;
     using CppAD::NearEqual;
     double eps99 = 99.0 * std::numeric_limits<double>::epsilon();
     using CppAD::exp;
     using CppAD::sin;
     using CppAD::cos;

     // domain space vector
     size_t n = 2;
     CPPAD_TESTVECTOR(AD<double>)  X(n);
     X[0] = 1.;
     X[1] = 2.;

     // declare independent variables and starting recording
     CppAD::Independent(X);

     // a calculation between the domain and range values
     AD<double> Square = X[0] * X[0];

     // range space vector
     size_t m = 3;
     CPPAD_TESTVECTOR(AD<double>)  Y(m);
     Y[0] = Square * exp( X[1] );
     Y[1] = Square * sin( X[1] );
     Y[2] = Square * cos( X[1] );

     // create f: X -> Y and stop tape recording
     CppAD::ADFun<double> f(X, Y);

     // new value for the independent variable vector
     Vector x(n);
     x[0] = 2.;
     x[1] = 1.;

     // compute the derivative at this x
     Vector jac( m * n );
     jac = f.Jacobian(x);

     /*
     F'(x) = [ 2 * x[0] * exp(x[1]) ,  x[0] * x[0] * exp(x[1]) ]
             [ 2 * x[0] * sin(x[1]) ,  x[0] * x[0] * cos(x[1]) ]
             [ 2 * x[0] * cos(x[1]) , -x[0] * x[0] * sin(x[i]) ]
     */
     ok &=  NearEqual( 2.*x[0]*exp(x[1]), jac[0*n+0], eps99, eps99);
     ok &=  NearEqual( 2.*x[0]*sin(x[1]), jac[1*n+0], eps99, eps99);
     ok &=  NearEqual( 2.*x[0]*cos(x[1]), jac[2*n+0], eps99, eps99);

     ok &=  NearEqual( x[0] * x[0] *exp(x[1]), jac[0*n+1], eps99, eps99);
     ok &=  NearEqual( x[0] * x[0] *cos(x[1]), jac[1*n+1], eps99, eps99);
     ok &=  NearEqual(-x[0] * x[0] *sin(x[1]), jac[2*n+1], eps99, eps99);

     return ok;
}
} // End empty namespace
# include <vector>
# include <valarray>
bool Jacobian(void)
{     bool ok = true;
     // Run with Vector equal to three different cases
     // all of which are Simple Vectors with elements of type double.
     ok &= JacobianCases< CppAD::vector  <double> >();
     ok &= JacobianCases< std::vector    <double> >();
     ok &= JacobianCases< std::valarray  <double> >();
     return ok;
}

I am trying to modify it in the following way:

Let G be the Jacobian jac that is calculated in this example, in the line:

jac = f.Jacobian(x);

and, as in the example, let X be the independent variables. I would like to construct a new function, H, which is a function of jac, i.e. H(jacobian(X)) = something, such that H is autodifferentiable. An example may be H(X) = jacobian( jacobian(X)[0]), i.e. the jacobian of the first element of jacobian(X) w.r.t X (a second derivative of sorts).

The problem is that jac as written here is of type Vector, which is a parameterized type on a raw double, not an AD<double>. To my knowledge, this means the output is not autodifferentiable.

I am looking for some advice on if it is possible to use the Jacobian in a larger operation, and take the Jacobian of that larger operation (not unlike any arithmetic operator) or if this is not possible.

EDIT: This has been put up for a bounty once, but I'm putting it up again to see if there's a better solution, because I think this is important. To be a bit more clear, the elements that the "correct" answer needs are:

a) A means of calculating arbitrary order derivatives.

b) An intelligent way of not having to specify the order of derivatives a priori. If the maximum order derivative must be known at compile time, the order of derivative can't be determined algorithmically. Further, specifying an enormously large order as in the current answer given will lead to memory allocation issues and, I imagine, performance issues.

c) Abstracting the templating of derivative order away from the end-user. This is important, because it can be difficult to keep track of the order of derivatives needed. This is probably something that comes "for free" if b) is solved.

If anybody can crack this, it would be an awesome contribution and an extremely useful operation.

like image 391
user650261 Avatar asked May 30 '18 23:05

user650261


2 Answers

If you want to nest functions, you should nest the AD<> as well. You can nest Jacobians as other functions, for instance see the code snippet below, which is computing the double derivative by nesting Jacobian

#include <cstring>
#include <iostream>      // standard input/output                                                                                                                                                                                      
#include <vector>        // standard vector                                                                                                                                                                                            
#include <cppad/cppad.hpp> // the CppAD package http://www.coin-or.org/CppAD/                                                                                                                                                          

// main program                                                                                                                                                                                                                        
int main(void)
{     using CppAD::AD;           // use AD as abbreviation for CppAD::AD                                                                                                                                                               
  using std::vector;         // use vector as abbreviation for std::vector                                                                                                                                                             
  size_t i;                  // a temporary index                                                                                                                                                                                      


  // domain space vector                                                                                                                                                                                                               
  auto Square = [](auto t){return t*t;};
  vector< AD<AD<double>> > X(1); // vector of domain space variables                                                                                                                                                                   

  // declare independent variables and start recording operation sequence                                                                                                                                                              
  CppAD::Independent(X);

  // range space vector                                                                                                                                                                                                                
  vector< AD<AD<double>> > Y(1); // vector of ranges space variables                                                                                                                                                                   
  Y[0] = Square(X[0]);      // value during recording of operations                                                                                                                                                                    

  // store operation sequence in f: X -> Y and stop recording                                                                                                                                                                          
  CppAD::ADFun<AD<double>> f(X, Y);

  // compute derivative using operation sequence stored in f                                                                                                                                                                           
  vector<AD<double>> jac(1); // Jacobian of f (m by n matrix)                                                                                                                                                                          
  vector<AD<double>> x(1);       // domain space vector                                                                                                                                                                                

  CppAD::Independent(x);
  jac  = f.Jacobian(x);      // Jacobian for operation sequence                                                                                                                                                                        
  CppAD::ADFun<double> f2(x, jac);
  vector<double> result(1);
  vector<double> x_res(1);
  x_res[0]=15.;
  result=f2.Jacobian(x_res);

  // print the results                                                                                                                                                                                                                 
  std::cout << "f'' computed by CppAD = " << result[0] << std::endl;
}

As a side-note, since C++14 or 11 implementing expression templates and automatic differentiation became easier and can be done with much less effort, as shown e.g. in this video towards the end https://www.youtube.com/watch?v=cC9MtflQ_nI (sorry for the poor quality). If I had to implement reasonably simple symbolic operations I would start from scratch with modern C++: you can write simpler code, and you get errors that you can understand easily.

Edit: Generalizing the example to build arbitrary order derivatives can be a template metaprogramming exercice. The snippet below shows it is possible using template recursion

#include <cstring>
#include <iostream>
#include <vector>
#include <cppad/cppad.hpp>

using CppAD::AD;
using std::vector;

template<typename T>
struct remove_ad{
    using type=T;
};

template<typename T>
struct remove_ad<AD<T>>{
    using type=T;
};

template<int N>
struct derivative{
    using type = AD<typename derivative<N-1>::type >;
    static constexpr int order = N;
};

template<>
struct derivative<0>{
    using type = double;
    static constexpr int order = 0;
};

template<typename T>
struct Jac{
    using value_type = typename remove_ad<typename T::type>::type;

    template<typename P, typename Q>
    auto operator()(P & X, Q & Y){

    CppAD::ADFun<value_type> f(X, Y);
    vector<value_type> jac(1);
    vector<value_type> x(1);

    CppAD::Independent(x);
    jac  = f.Jacobian(x);

    return Jac<derivative<T::order-1>>{}(x, jac);
    }

};

template<>
struct Jac<derivative<1>>{
    using value_type = derivative<0>::type;

    template<typename P, typename Q>
    auto operator()(P & x, Q & jac){

    CppAD::ADFun<value_type> f2(x, jac);
    vector<value_type> res(1);
    vector<value_type> x_res(1);
    x_res[0]=15.;
    return f2.Jacobian(x_res);
    }
};

int main(void)
{
    constexpr int order=4;
    auto Square = [](auto t){return t*t;};
    vector< typename derivative<order>::type > X(1);
    vector< typename derivative<order>::type > Y(1);

    CppAD::Independent(X);   
    Y[0] = Square(X[0]);
    auto result = Jac<derivative<order>>{}(X, Y);

    std::cout << "f'' computed by CppAD = " << result[0] << std::endl;
} 
like image 74
Paolo Crosetto Avatar answered Nov 20 '22 20:11

Paolo Crosetto


There is a new feature in CppAD that eliminates the need for AD< AD >, see https://coin-or.github.io/CppAD/doc/base2ad.cpp.htm

like image 21
Bradley Bell Avatar answered Nov 20 '22 21:11

Bradley Bell