Adding Derivatives to Your Components

Many optimization algorithms make use of gradients. In our simple example problem, the SLSQP driver estimates the gradient at various times during the solution procedure by performing a local finite-difference step. Calculating the gradient typically involves one or more executions of the objective function depending on the finite difference method that is used. This, of course, means that your model is executed additional times each iteration.

Sometimes the solution process can be sped up by having a component supply its own derivatives. These derivatives may be analytical (as you will see in this example), or they might be estimated by some other means. Additionally, these derivatives can be more accurate than those estimated by finite differencing the component, and they are not dependent on the right choice of a step-size parameter.

In OpenMDAO, derivatives can be specified in the component API by following these two steps:

1. Declare a ``linearize`` method that calculates and saves the Jacobian
   containing the derivatives of the component’s numerical outputs with
   respect to its inputs.
2. Declare a ``provideJ`` method that returns the saved Jacobian along
   with tuples of the input and output names that index it.

Let’s look at an example, using the Paraboloid component, to see how this would work in practice. Starting with the original code, but calling our new class ParaboloidDerivative, we have:

from numpy import array

from openmdao.main.api import Component
from openmdao.lib.datatypes.api import Float

class ParaboloidDerivative(Component):
    """ Evaluates the equation f(x,y) = (x-3)^2 + xy + (y+4)^2 - 3 """

    # set up interface to the framework
    x = Float(0.0, iotype='in', desc='The variable x')
    y = Float(0.0, iotype='in', desc='The variable y')

    f_xy = Float(iotype='out', desc='F(x,y)')

    def execute(self):
        """ Evaluate f(x,y) at the current values of x and y. """

        x = self.x
        y = self.y

        self.f_xy = (x-3.0)**2 + x*y + (y+4.0)**2 - 3.0

The first function we need to add is linearize. This function calculates and saves a matrix (called the Jacobian) of the derivatives of the component’s outputs with respect to its inputs, evaluated at the current state of the model. The paraboloid model has two inputs and one output, so the Jacobian is a 1 by 2 numpy array: one row for the output and one column for each input. This also requires the numpy import seen in the code above. The linearize function doesn’t return the Jacobian but instead stores it on the component.

def linearize(self):
    """Calculate the Jacobian"""

    df_dx = 2.0*self.x - 6.0 + self.y
    df_dy = 2.0*self.y + 8.0 + self.x

    self.J = array([[df_dx, df_dy]])
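
Analytic derivatives are easy to get wrong, so it can be worth sanity-checking them against a finite-difference estimate before relying on them in an optimization. Below is a minimal standalone sketch using plain numpy; the names f, analytic_J, and fd_J are illustrative only and are not part of the OpenMDAO API.

from numpy import array

def f(x, y):
    """The paraboloid itself: f(x,y) = (x-3)^2 + x*y + (y+4)^2 - 3."""
    return (x - 3.0)**2 + x*y + (y + 4.0)**2 - 3.0

def analytic_J(x, y):
    """The same 1 by 2 Jacobian that linearize() stores in self.J."""
    return array([[2.0*x - 6.0 + y, 2.0*y + 8.0 + x]])

def fd_J(x, y, h=1.0e-6):
    """Forward-difference estimate of the Jacobian for comparison."""
    f0 = f(x, y)
    return array([[(f(x + h, y) - f0)/h, (f(x, y + h) - f0)/h]])

print analytic_J(1.0, -2.0)   # [[-6.  5.]]
print fd_J(1.0, -2.0)         # should agree to roughly six digits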

So self.J is the stored Jacobian that we will use later. The linearize function is called once whenever an optimizer asks for the gradient of its workflow. When this component is part of a much larger model, it contributes only a small portion of the full Jacobian. OpenMDAO uses a numerical method developed by Martins and Hwang [1] to solve for the gradient of the full problem. For large problems the full Jacobian could grow to be very large, and it would become impractical to construct the entire thing. Instead, the scheme used here represents the Jacobian as a linear operator, that is, a function that returns the product of the Jacobian with an input vector. In this way, the full Jacobian never needs to be stored. However, since the solution is iterative, a component’s Jacobian needs to be queried multiple times after it is calculated. So we need a method to provide the Jacobian and its ordering:

def provideJ(self):
    """Provide full Jacobian."""

    input_keys = ('x', 'y')
    output_keys = ('f_xy',)

    return input_keys, output_keys, self.J

Here input_keys and output_keys provide an index into the Jacobian so OpenMDAO knows which columns correspond to each input and which rows to each output. For our paraboloid example, if you linearized around the point (0,0) then the Jacobian would look like:

            x      y
   f_xy  -6.0    8.0
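
As a quick interactive check, you can exercise these two methods directly on an instance of the component (the exact array formatting in the output may differ slightly):

    >>> from openmdao.examples.simple.paraboloid_derivative import ParaboloidDerivative
    >>> comp = ParaboloidDerivative()
    >>> comp.x = 0.0
    >>> comp.y = 0.0
    >>> comp.linearize()
    >>> comp.provideJ()
    (('x', 'y'), ('f_xy',), array([[-6.,  8.]]))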

Note that you don’t have to include all of the inputs and outputs in the Jacobian. There is no reason to provide derivatives with respect to inputs that are never connected to other components’ outputs or that are otherwise irrelevant to the gradient.
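
To make the linear-operator idea described above more concrete, here is a small illustrative numpy sketch (not OpenMDAO internals): the framework never needs the assembled Jacobian as long as each component can supply products of its Jacobian with a vector.

from numpy import array, dot

J = array([[-6.0, 8.0]])   # paraboloid Jacobian stored at the point (0, 0)

def apply_J(v):
    """Linear operator: return J*v without exposing J itself.

    For one small component this is trivial, but in a large coupled
    model an iterative solver can build the total gradient from such
    products alone, so the full Jacobian is never assembled.
    """
    return dot(J, v)

print apply_J(array([1.0, 0.0]))   # contribution of input 'x': [-6.]
print apply_J(array([0.0, 1.0]))   # contribution of input 'y': [ 8.]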

The ParaboloidDerivative component can be placed into a model, and the derivatives will be used with no changes required to the OptimizationConstrained or OptimizationUnconstrained assembly at this point. If the driver uses gradients (or Hessians) and can take advantage of the analytical ones you provide, then it will do so. Below is our model, using the new component with derivatives. We put this model in a file called optimization_constrained_derivative.py.

"""
    optimization_constrained.py - Top level assembly for the problem.
"""

# Perform an constrained optimization on our paraboloid component.

from openmdao.main.api import Assembly
from openmdao.lib.drivers.api import SLSQPdriver

from openmdao.examples.simple.paraboloid_derivative import ParaboloidDerivative

class OptimizationConstrained(Assembly):
    """Constrained optimization of the Paraboloid Component."""
    
    def configure(self):
        """ Creates a new Assembly containing a Paraboloid and an optimizer"""
                
        # Create Paraboloid component instances
        self.add('paraboloid', ParaboloidDerivative())

        # Create Optimizer instance
        self.add('driver', SLSQPdriver())
        
        # Driver process definition
        self.driver.workflow.add('paraboloid')
        
        # Optimizer Flags
        self.driver.iprint = 0
        
        # Objective 
        self.driver.add_objective('paraboloid.f_xy')
        
        # Design Variables 
        self.driver.add_parameter('paraboloid.x', low=-50., high=50.)
        self.driver.add_parameter('paraboloid.y', low=-50., high=50.)
        
        # Constraints
        self.driver.add_constraint('paraboloid.x-paraboloid.y >= 15.0')
        
        
if __name__ == "__main__": # pragma: no cover         

    import time
    
    opt_problem = OptimizationConstrained()
    
    tt = time.time()
    opt_problem.run()

    print "\n"
    print "Minimum found at (%f, %f)" % (opt_problem.paraboloid.x, \
                                         opt_problem.paraboloid.y)
    print "Elapsed time: ", time.time()-tt, "seconds"
    
# end optimization_constrained_derivative.py
[1] Martins, J. R. R. A., and Hwang, J. T., “Review and Unification of Methods for Computing Derivatives of Multidisciplinary Computational Models,” AIAA Journal, 2013.

Benchmarking

Sometimes it is useful to know how many times your component executes and how many times it calculates its derivatives. OpenMDAO provides this information for every component through a pair of counters: exec_count is incremented whenever a component executes, and derivative_exec_count is incremented whenever the derivatives are calculated. The following example shows how they can be accessed and used.

    >>> # Paraboloid Model
    >>>
    >>> from openmdao.examples.simple.optimization_constrained import OptimizationConstrained
    >>> model = OptimizationConstrained()
    >>> model.run()
    >>> print model.paraboloid.exec_count
    10
    >>> print model.paraboloid.derivative_exec_count
    0
    >>> # Paraboloid Model with analytical derivatives
    >>>
    >>> from openmdao.examples.simple.optimization_constrained_derivative import OptimizationConstrained
    >>> model = OptimizationConstrained()
    >>> model.run()
    >>> print model.paraboloid.exec_count
    4
    >>> print model.paraboloid.derivative_exec_count
    3

Here we’ve printed the number of function and derivative executions for the paraboloid examples, both without and with analytical derivatives. Because this model is a single simple equation, the advantage of analytical derivatives isn’t evident in a comparison of the clock time, but the number of function executions is much lower when you provide them, at the cost of a small number of derivative evaluations.

This concludes an introduction to OpenMDAO using a simple problem of component creation and execution. The next tutorial introduces a problem with more complexity and presents additional features of the framework.

Finite Difference

If you don’t specify a provideJ function for your component, then OpenMDAO will finite difference it during the calculation of the full-model gradient. OpenMDAO can identify groups of nondifferentiable components and finite difference them together as a single block. It can also detect a nondifferentiable connection between two otherwise differentiable components (e.g., components passing a file or a string) and will include both components in the nondifferentiable block.

At present, the user doesn’t have much direct control over the finite difference, but a step size can be assigned to any Float or Array input as a variable attribute:

x = Float(0.0, iotype='in', desc='The variable x', fd_step=0.01)

The default step size is 1.0e-6. An absolute step of this size will not be adequate if your variable’s typical magnitude is very large or very small, so it is essential to choose fd_step carefully.
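
To see why this matters, consider a variable whose magnitude is around 1.0e12: in double precision, adding the default absolute step of 1.0e-6 to it changes nothing, so the finite-difference derivative would come out as zero. A quick illustration in plain Python, with hypothetical values:

x = 1.0e12
h = 1.0e-6                # the default absolute step

print (x + h) - x         # 0.0 -- the step is lost to roundoff
print (x + x*1.0e-6) - x  # 1000000.0 -- a step scaled to x survives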
