Also in the documentation1:
>>> y = np.array([1, 2, 4, 7, 11, 16], dtype=np.float)
>>> j = np.gradient(y)
>>> j
array([ 1. , 1.5, 2.5, 3.5, 4.5, 5. ])
Gradient is defined as (change in
y)/(change inx).x, here, is the list index, so the difference between adjacent values is 1.At the boundaries, the first difference is calculated. This means that at each end of the array, the gradient given is simply, the difference between the end two values (divided by 1)
Away from the boundaries the gradient for a particular index is given by taking the difference between the the values either side and dividing by 2.
So, the gradient of y, above, is calculated thus:
j[0] = (y[1]-y[0])/1 = (2-1)/1 = 1
j[1] = (y[2]-y[0])/2 = (4-1)/2 = 1.5
j[2] = (y[3]-y[1])/2 = (7-2)/2 = 2.5
j[3] = (y[4]-y[2])/2 = (11-4)/2 = 3.5
j[4] = (y[5]-y[3])/2 = (16-7)/2 = 4.5
j[5] = (y[5]-y[4])/1 = (16-11)/1 = 5
You could find the minima of all the absolute values in the resulting array to find the turning points of a curve, for example.
1The array is actually called x in the example in the docs, I've changed it to y to avoid confusion.
python - Numerical gradient for nonlinear function in numpy/scipy - Stack Overflow
Need help in understanding np.gradient for calculating derivatives
python - Calculating gradient with NumPy - Stack Overflow
Higher order central differences using NumPy.gradient()
Videos
Also in the documentation1:
>>> y = np.array([1, 2, 4, 7, 11, 16], dtype=np.float)
>>> j = np.gradient(y)
>>> j
array([ 1. , 1.5, 2.5, 3.5, 4.5, 5. ])
Gradient is defined as (change in
y)/(change inx).x, here, is the list index, so the difference between adjacent values is 1.At the boundaries, the first difference is calculated. This means that at each end of the array, the gradient given is simply, the difference between the end two values (divided by 1)
Away from the boundaries the gradient for a particular index is given by taking the difference between the the values either side and dividing by 2.
So, the gradient of y, above, is calculated thus:
j[0] = (y[1]-y[0])/1 = (2-1)/1 = 1
j[1] = (y[2]-y[0])/2 = (4-1)/2 = 1.5
j[2] = (y[3]-y[1])/2 = (7-2)/2 = 2.5
j[3] = (y[4]-y[2])/2 = (11-4)/2 = 3.5
j[4] = (y[5]-y[3])/2 = (16-7)/2 = 4.5
j[5] = (y[5]-y[4])/1 = (16-11)/1 = 5
You could find the minima of all the absolute values in the resulting array to find the turning points of a curve, for example.
1The array is actually called x in the example in the docs, I've changed it to y to avoid confusion.
Here is what is going on. The Taylor series expansion guides us on how to approximate the derivative, given the value at close points. The simplest comes from the first order Taylor series expansion for a C^2 function (two continuous derivatives)...
- f(x+h) = f(x) + f'(x)h+f''(xi)h^2/2.
One can solve for f'(x)...
- f'(x) = [f(x+h) - f(x)]/h + O(h).
Can we do better? Yes indeed. If we assume C^3, then the Taylor expansion is
- f(x+h) = f(x) + f'(x)h + f''(x)h^2/2 + f'''(xi) h^3/6, and
- f(x-h) = f(x) - f'(x)h + f''(x)h^2/2 - f'''(xi) h^3/6.
Subtracting these (both the h^0 and h^2 terms drop out!) and solve for f'(x):
- f'(x) = [f(x+h) - f(x-h)]/(2h) + O(h^2).
So, if we have a discretized function defined on equal distant partitions: x = x_0,x_0+h(=x_1),....,x_n=x_0+h*n, then numpy gradient will yield a "derivative" array using the first order estimate on the ends and the better estimates in the middle.
Example 1. If you don't specify any spacing, the interval is assumed to be 1. so if you call
f = np.array([5, 7, 4, 8])
what you are saying is that f(0) = 5, f(1) = 7, f(2) = 4, and f(3) = 8. Then
np.gradient(f)
will be: f'(0) = (7 - 5)/1 = 2, f'(1) = (4 - 5)/(2*1) = -0.5, f'(2) = (8 - 7)/(2*1) = 0.5, f'(3) = (8 - 4)/1 = 4.
Example 2. If you specify a single spacing, the spacing is uniform but not 1.
For example, if you call
np.gradient(f, 0.5)
this is saying that h = 0.5, not 1, i.e., the function is really f(0) = 5, f(0.5) = 7, f(1.0) = 4, f(1.5) = 8. The net effect is to replace h = 1 with h = 0.5 and all the results will be doubled.
Example 3. Suppose the discretized function f(x) is not defined on uniformly spaced intervals, for instance f(0) = 5, f(1) = 7, f(3) = 4, f(3.5) = 8, then there is a messier discretized differentiation function that the numpy gradient function uses and you will get the discretized derivatives by calling
np.gradient(f, np.array([0,1,3,3.5]))
Lastly, if your input is a 2d array, then you are thinking of a function f of x, y defined on a grid. The numpy gradient will output the arrays of "discretized" partial derivatives in x and y.
First of all, some warnings:
- numerical-optimization is hard to do right
- ipopt is very complex software
- combining ipopt with numerical-differentiation sounds like you are asking for trouble, but that depends on your problem of course
- ipopt is almost always based on automatic-differentiation tools and not numerical-differentiation!
And some more:
- as this is a complex task and the state of python + ipopt is not as nice as in some other languages (julia + JuMP for example), it's a bit of work
And some alternatives:
- use pyomo which wraps ipopt and has automatic-differentiation
- use casadi which also wraps ipopt and has automatic-differentiation
- use autograd to automatically calculate gradients on a subset of numpy-code
- then use cyipopt to add those
- scipy.minimize with solvers SLSQP or COBYLA which can do everything for you (SLSQP can use equality and inequality constraints; COBYLA only inequality-constraints, where emulating equality-constraints by
x >= y+x <= ycan work)
Approaching your task with your tools
Your complete example-problem is defined in Test Examples for Nonlinear Programming Codes:

Here is some code, based on numerical-differentiation, solving your test-problem, including the official setup (function, gradients, start-point, bounds, ...)
import numpy as np
import scipy.sparse as sps
import ipopt
from scipy.optimize import approx_fprime
class Problem40(object):
""" # Hock & Schittkowski test problem #40
Basic structure follows:
- cyipopt example from https://pythonhosted.org/ipopt/tutorial.html#defining-the-problem
- which follows ipopt's docs from: https://www.coin-or.org/Ipopt/documentation/node22.html
Changes:
- numerical-diff using scipy for function & constraints
- removal of hessian-calculation
- we will use limited-memory approximation
- ipopt docs: https://www.coin-or.org/Ipopt/documentation/node31.html
- (because i'm too lazy to reason about the math; lagrange and co.)
"""
def __init__(self):
self.num_diff_eps = 1e-8 # maybe tuning needed!
def objective(self, x):
# callback for objective
return -np.prod(x) # -x1 x2 x3 x4
def constraint_0(self, x):
return np.array([x[0]**3 + x[1]**2 -1])
def constraint_1(self, x):
return np.array([x[0]**2 * x[3] - x[2]])
def constraint_2(self, x):
return np.array([x[3]**2 - x[1]])
def constraints(self, x):
# callback for constraints
return np.concatenate([self.constraint_0(x),
self.constraint_1(x),
self.constraint_2(x)])
def gradient(self, x):
# callback for gradient
return approx_fprime(x, self.objective, self.num_diff_eps)
def jacobian(self, x):
# callback for jacobian
return np.concatenate([
approx_fprime(x, self.constraint_0, self.num_diff_eps),
approx_fprime(x, self.constraint_1, self.num_diff_eps),
approx_fprime(x, self.constraint_2, self.num_diff_eps)])
def hessian(self, x, lagrange, obj_factor):
return False # we will use quasi-newton approaches to use hessian-info
# progress callback
def intermediate(
self,
alg_mod,
iter_count,
obj_value,
inf_pr,
inf_du,
mu,
d_norm,
regularization_size,
alpha_du,
alpha_pr,
ls_trials
):
print("Objective value at iteration #%d is - %g" % (iter_count, obj_value))
# Remaining problem definition; still following official source:
# http://www.ai7.uni-bayreuth.de/test_problem_coll.pdf
# start-point -> infeasible
x0 = [0.8, 0.8, 0.8, 0.8]
# variable-bounds -> empty => np.inf-approach deviates from cyipopt docs!
lb = [-np.inf, -np.inf, -np.inf, -np.inf]
ub = [np.inf, np.inf, np.inf, np.inf]
# constraint bounds -> c == 0 needed -> both bounds = 0
cl = [0, 0, 0]
cu = [0, 0, 0]
nlp = ipopt.problem(
n=len(x0),
m=len(cl),
problem_obj=Problem40(),
lb=lb,
ub=ub,
cl=cl,
cu=cu
)
# IMPORTANT: need to use limited-memory / lbfgs here as we didn't give a valid hessian-callback
nlp.addOption(b'hessian_approximation', b'limited-memory')
x, info = nlp.solve(x0)
print(x)
print(info)
# CORRECT RESULT & SUCCESSFUL STATE
Output:
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit http://projects.coin-or.org/Ipopt
******************************************************************************
This is Ipopt version 3.12.8, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).
Number of nonzeros in equality constraint Jacobian...: 12
Number of nonzeros in inequality constraint Jacobian.: 0
Number of nonzeros in Lagrangian Hessian.............: 0
Total number of variables............................: 4
variables with only lower bounds: 0
variables with lower and upper bounds: 0
variables with only upper bounds: 0
Total number of equality constraints.................: 3
Total number of inequality constraints...............: 0
inequality constraints with only lower bounds: 0
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 0
Objective value at iteration #0 is - -0.4096
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 -4.0960000e-01 2.88e-01 2.53e-02 0.0 0.00e+00 - 0.00e+00 0.00e+00 0
Objective value at iteration #1 is - -0.255391
1 -2.5539060e-01 1.28e-02 2.98e-01 -11.0 2.51e-01 - 1.00e+00 1.00e+00h 1
Objective value at iteration #2 is - -0.249299
2 -2.4929898e-01 8.29e-05 3.73e-01 -11.0 7.77e-03 - 1.00e+00 1.00e+00h 1
Objective value at iteration #3 is - -0.25077
3 -2.5076955e-01 1.32e-03 3.28e-01 -11.0 2.46e-02 - 1.00e+00 1.00e+00h 1
Objective value at iteration #4 is - -0.250025
4 -2.5002535e-01 4.06e-05 1.93e-02 -11.0 4.65e-03 - 1.00e+00 1.00e+00h 1
Objective value at iteration #5 is - -0.25
5 -2.5000038e-01 6.57e-07 1.70e-04 -11.0 5.46e-04 - 1.00e+00 1.00e+00h 1
Objective value at iteration #6 is - -0.25
6 -2.5000001e-01 2.18e-08 2.20e-06 -11.0 9.69e-05 - 1.00e+00 1.00e+00h 1
Objective value at iteration #7 is - -0.25
7 -2.5000000e-01 3.73e-12 4.42e-10 -11.0 1.27e-06 - 1.00e+00 1.00e+00h 1
Number of Iterations....: 7
(scaled) (unscaled)
Objective...............: -2.5000000000225586e-01 -2.5000000000225586e-01
Dual infeasibility......: 4.4218750883118219e-10 4.4218750883118219e-10
Constraint violation....: 3.7250202922223252e-12 3.7250202922223252e-12
Complementarity.........: 0.0000000000000000e+00 0.0000000000000000e+00
Overall NLP error.......: 4.4218750883118219e-10 4.4218750883118219e-10
Number of objective function evaluations = 8
Number of objective gradient evaluations = 8
Number of equality constraint evaluations = 8
Number of inequality constraint evaluations = 0
Number of equality constraint Jacobian evaluations = 8
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations = 0
Total CPU secs in IPOPT (w/o function evaluations) = 0.016
Total CPU secs in NLP function evaluations = 0.000
EXIT: Optimal Solution Found.
[ 0.79370053 0.70710678 0.52973155 0.84089641]
{'x': array([ 0.79370053, 0.70710678, 0.52973155, 0.84089641]), 'g': array([ 3.72502029e-12, -3.93685085e-13, 5.86974913e-13]), 'obj_val': -0.25000000000225586, 'mult_g': array([ 0.49999999, -0.47193715, 0.35355339]), 'mult_x_L': array([ 0., 0., 0., 0.]), 'mult_x_U': array([ 0., 0., 0., 0.]), 'status': 0, 'status_msg': b'Algorithm terminated successfully at a locally optimal point, satisfying the convergence tolerances (can be specified by options).'}
Remarks about the code
- We use scipy's approx_fprime which basically was added for all those gradient-based optimizers in scipy.optimize
- As stated in the sources; i did not take care about ipopt's need for the hessian and we used ipopts hessian-approximation
- the basic idea is described at wiki: LBFGS
- I did ignore ipopts need for sparsity structure of the Jacobian of the constraints
- a default-assumption: the default hessian structure is of a lower triangular matrix is used and i won't give any guarantees on what can happen here (bad performance vs. breaking everything)
I think you have some kind of misunderstanding about what is a mathematical function and what is its numerical implementation.
You should define your function as:
def func(x1, x2, x3, x4):
return -x1*x2*x3*x4
Now you want to evaluate your function at specific points, which you can do using the np.mgrid you provided.
If you want to compute your gradient, use copy.misc.derivative(https://docs.scipy.org/doc/scipy/reference/generated/scipy.misc.derivative.html) (watch out the default parameters for dx is usually bad, change it to 1e-5. There is no difference between linear and non-linear gradient for the numerical evaluation, only that for non linear function the gradient won't be the same everywhere.
What you did was with np.gradient was actually to compute the gradient from the point in your array, the definition of your function being hidden by your definition of f, thus not allowing for multiple gradient evaluation at different points. Also using your method makes you dependant of your discretisation step.
Hi, I'm trying to expand my knowledge in Machine Learning, I came across the np.gradient function, I wanted to understand how it relates to Taylor's Series for estimating values. The documentation seemed a bit confusing for novice.
The problem is, that numpy can't give you the derivatives directly and you have two options:
With NUMPY
What you essentially have to do, is to define a grid in three dimension and to evaluate the function on this grid. Afterwards you feed this table of function values to numpy.gradient to get an array with the numerical derivative for every dimension (variable).
Example from here:
from numpy import *
x,y,z = mgrid[-100:101:25., -100:101:25., -100:101:25.]
V = 2*x**2 + 3*y**2 - 4*z # just a random function for the potential
Ex,Ey,Ez = gradient(V)
Without NUMPY
You could also calculate the derivative yourself by using the centered difference quotient.

This is essentially, what numpy.gradient is doing for every point of your predefined grid.
Numpy and Scipy are for numerical calculations. Since you want to calculate the gradient of an analytical function, you have to use the Sympy package which supports symbolic mathematics. Differentiation is explained here (you can actually use it in the web console in the left bottom corner).
You can install Sympy under Ubuntu with
sudo apt-get install python-sympy
or under any Linux distribution with pip
sudo pip install sympy
ยป pip install numdifftools