Talk Algorithmic Differentiation in Python with Application Examples

Presented by Sebastian F. Walter (Humboldt-Universität zu Berlin) in Scientific track 2010 on 2010/07/10 from 14:00 to 14:30 in room Dussane
Abstract

Sebastian F. Walter / Humboldt-Universitaet zu Berlin

The talk consists of three parts. First, we introduce the fundamentals of Algorithmic Differentiation (AD) with some very basic examples. In particular we address univariate Taylor arithmetic and the reverse mode of AD.
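
To give a flavour of this first part, the sketch below evaluates a Jacobian in the forward mode (univariate Taylor polynomial arithmetic, UTPM) and a gradient in the reverse mode with ALGOPY. The function eval_f and the evaluation point are only illustrative placeholders, and the calls follow ALGOPY's getting-started interface, which may differ slightly between versions:

import numpy
import algopy

def eval_f(x):
    # small illustrative scalar function of three variables
    return x[0] * x[1] * x[2] + x[0] ** 2

x0 = numpy.array([3., 5., 7.])

# forward mode: propagate univariate Taylor polynomials (UTPM) through eval_f
x = algopy.UTPM.init_jacobian(x0)
y = eval_f(x)
print('Jacobian (forward/UTPM):', algopy.UTPM.extract_jacobian(y))

# reverse mode: record the computational graph once, then pull back adjoints
cg = algopy.CGraph()
x = algopy.Function(x0)
y = eval_f(x)
cg.trace_off()
cg.independentFunctionList = [x]
cg.dependentFunctionList = [y]
print('gradient (reverse):', cg.gradient(x0))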

The versatility of AD is then demonstrated with the example of Optimum Experimental Design (OED) of nonlinear models in chemical engineering. That is, we explain how the derivative evaluation of rather complicated objective functions can be decomposed into smaller subproblems and describe how these subproblems can be plugged back together.

To be more explicit, we consider chemical reactions where the underlying model is described by a semi-implicit DAE of the form:

A(t,y(t),z(t),p)  dy/dt(t) = f(t, y(t), z(t), u(t), p)
                        0   = g(t, y(t), z(t), u(t), p)

with initial values y(t_0) = y_0(u(t_0),p).
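
Such a model enters an AD tool simply as a set of plain Python/NumPy functions for A, f, g and the initial values. The two-species reaction terms below are purely hypothetical placeholders, not the model discussed in the talk:

import numpy

def A(t, y, z, p):
    # mass matrix acting on dy/dt
    return numpy.eye(y.size)

def f(t, y, z, u, p):
    # differential right-hand side, e.g. simple reaction kinetics
    return numpy.array([-p[0] * y[0] * u[0],
                         p[0] * y[0] * u[0] - p[1] * y[1]])

def g(t, y, z, u, p):
    # algebraic constraint coupling y and z
    return numpy.array([z[0] - y[0] - y[1]])

def y0(u0, p):
    # consistent initial values y(t_0) = y_0(u(t_0), p)
    return numpy.array([u0[0], 0.0])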

The parameters p are given by nature but unknown; u(t) = u(t; q) are control functions parametrized by the control vector q; y(t) = y(t; u(t), p) are the differential state variables and z(t) = z(t; u(t), p) the algebraic state variables. In an experiment, a measurement function h(t, y(t), z(t), u(t), p) can be observed at the measurement times [t_1, t_2, ..., t_Nmts], where the measurements [eta_1, eta_2, ..., eta_Nmts] are taken (Nmts denotes the number of measurement times).

Generally, the measurements are the outcome of i.i.d. normally distributed random variables whose confidence region is described by a covariance matrix Sigma^2. The "uncertainty" in the measurements eta_i propagates to an "uncertainty" in the parameters p. Employing linear error propagation, one obtains the covariance matrix C of the parameters from the covariance matrix Sigma^2. The standard approach is to minimize the "size" of the covariance matrix C by adjusting the control vector q, i.e. the OED problem reads:

q_* = argmin_q Phi(C)

                    /        /      J1^T J1    J2^T  \^-1  / I \   \
s.t. Phi(C) = trace | (I,0)  |                       |     |   |   |
                    \        \      J2          0    /     \ 0 /   /

J1 = Sigma d [h(t_1, y(t_1), z(t_1), u(t_1), p), ..., h(t_Nmts, y(t_Nmts), z(t_Nmts), u(t_Nmts), p)]/dp
J2 = dr/dp

where r is some constraint function of the constrained least-squares problem and y(t), z(t) are the solution of the above semi-implicit DAE. There are also state and control constraints, but they are not shown here to keep things simple. That is, solving the OED problem with standard NLP solvers requires the derivative evaluation of both matrix-valued functions and the DAE integrator.
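
Written out in NumPy, the matrix-valued part of this objective is a small block-matrix computation of the following kind. The random J1 and J2 below merely stand in for the actual derivative matrices delivered by the DAE integrator:

import numpy

def eval_Phi(J1, J2):
    # Phi(C) = trace( (I,0) [[J1^T J1, J2^T], [J2, 0]]^{-1} [[I],[0]] )
    Np = J1.shape[1]                      # number of parameters p
    Nr = J2.shape[0]                      # number of constraints r
    M = numpy.zeros((Np + Nr, Np + Nr))
    M[:Np, :Np] = numpy.dot(J1.T, J1)
    M[:Np, Np:] = J2.T
    M[Np:, :Np] = J2
    C = numpy.linalg.inv(M)[:Np, :Np]     # upper-left block is the covariance matrix C
    return numpy.trace(C)

# hypothetical stand-ins: 20 measurements, 3 parameters, 1 constraint
J1 = numpy.random.rand(20, 3)
J2 = numpy.random.rand(1, 3)
print(eval_Phi(J1, J2))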

In the third part we take a closer look at the subproblems. A particular focus will be the differentiation of matrix-valued functions. We give some live examples using the Python AD tool ALGOPY.
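
As a taste of these live examples, the following sketch differentiates a typical matrix-valued OED criterion, Phi(J) = trace((J^T J)^{-1}), with respect to the entries of J by propagating univariate Taylor polynomials with ALGOPY's UTPM data type. The random J is again only a placeholder, and the raw UTPM usage shown here is just one of several possible ALGOPY interfaces:

import numpy
import algopy

def eval_Phi(J):
    # a typical matrix-valued OED criterion
    return algopy.trace(algopy.inv(algopy.dot(J.T, J)))

J0 = numpy.random.rand(20, 3)              # placeholder sensitivity matrix

# degree-1 Taylor coefficients, one direction per entry of J
D, P = 2, J0.size
data = numpy.zeros((D, P) + J0.shape)
data[0] = J0                               # zeroth coefficients: evaluation point
for p in range(P):
    data[1, p].flat[p] = 1.0               # first coefficients: unit directions
J = algopy.UTPM(data)

y = eval_Phi(J)
dPhi_dJ = y.data[1].reshape(J0.shape)      # collect the directional derivatives

# analytic check: d/dJ trace((J^T J)^{-1}) = -2 J (J^T J)^{-2}
B = numpy.linalg.inv(numpy.dot(J0.T, J0))
print(numpy.allclose(dPhi_dJ, -2.0 * numpy.dot(J0, numpy.dot(B, B))))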