1. Use numpy.gradient (best option)

Most people want this. numpy.gradient is NumPy's built-in finite-difference approach (2nd-order accurate), and its output has the same shape as the input array.

It uses second-order accurate central differences at the interior points and either first- or second-order accurate one-sided (forward or backward) differences at the boundaries. The returned gradient hence has the same shape as the input array.
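The boundary accuracy can be raised to second order via the edge_order argument; a quick sketch (for a quadratic, the 2nd-order one-sided boundary formula is exact):

```python
import numpy as np

h = 0.1
x = np.arange(0, 1 + h/2, h)
y = x**2  # exact derivative is 2*x

g1 = np.gradient(y, h)                # edge_order=1 (default): O(h) at the edges
g2 = np.gradient(y, h, edge_order=2)  # 2nd-order one-sided at the edges

print(g1[0], g2[0])  # g2[0] matches the exact derivative 2*x[0] = 0 here
```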

2. Use numpy.diff (you probably don't want this)

If you really want something roughly half as accurate: numpy.diff is only 1st-order accurate, and its output is one element shorter than the input. It is faster than numpy.gradient, though (based on some small benchmarks I ran).
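To make the shape difference concrete, a quick sketch using the same data as the examples below:

```python
import numpy as np

dx = 0.1
y = np.array([1, 2, 3, 4, 4, 5, 6])

dydx_grad = np.gradient(y, dx)   # same length as y (7 points)
dydx_diff = np.diff(y) / dx      # forward differences: one element shorter (6 points)

print(len(dydx_grad), len(dydx_diff))  # 7 6
```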

For constant spacing between x samples

import numpy as np
dx = 0.1
y = [1, 2, 3, 4, 4, 5, 6]  # dx constant
np.gradient(y, dx)  # dy/dx, 2nd-order accurate
array([10., 10., 10.,  5.,  5., 10., 10.])

For irregular spacing between x samples

(this is your case)

import numpy as np
x = [.1, .2, .5, .6, .7, .8, .9] # dx varies
y = [1, 2, 3, 4, 4, 5, 6]
np.gradient(y, x) # dy/dx 2nd order accurate
array([10., 8.333..,  8.333.., 5.,  5., 10., 10.])

What are you trying to achieve?

numpy.gradient offers a 2nd-order and numpy.diff a 1st-order finite-difference scheme, and numpy.gradient also handles a non-uniform grid/array. But if you are doing numerical differentiation, a finite-difference formulation specific to your case might serve you better: if you need it, you can reach much higher accuracy (e.g. 8th order), far superior to numpy.gradient.
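For instance, a higher-order scheme can be sketched with the standard 4th-order central stencil (a textbook formula, applied by hand on the interior points):

```python
import numpy as np

h = 0.1
x = np.arange(0, 2*np.pi, h)
y = np.sin(x)  # exact derivative is cos(x)

# 4th-order central difference on the interior:
# f'(x) ~ (-f(x+2h) + 8 f(x+h) - 8 f(x-h) + f(x-2h)) / (12 h)
d4 = (-y[4:] + 8*y[3:-1] - 8*y[1:-3] + y[:-4]) / (12*h)

err4 = np.abs(d4 - np.cos(x[2:-2])).max()
err2 = np.abs(np.gradient(y, h)[2:-2] - np.cos(x[2:-2])).max()
print(err4, err2)  # the 4th-order error is far smaller than np.gradient's
```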

Answer from imbr on Stack Overflow

2 of 4
25

use numpy.gradient()

Please be aware that there are more advanced ways to calculate the numerical derivative than simply using diff. I would suggest using numpy.gradient, as in this example.

import numpy as np
from matplotlib import pyplot as plt

# we sample a sin(x) function
dx = np.pi/10
x = np.arange(0, 2*np.pi, dx)

# we calculate the derivative, with np.gradient
plt.plot(x,np.gradient(np.sin(x), dx), '-*', label='approx')

# we compare it with the exact first derivative, i.e. cos(x)
plt.plot(x,np.cos(x), label='exact')
plt.legend()
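Instead of eyeballing the plot, the approximation quality can also be checked numerically (a small sketch using the same sampling):

```python
import numpy as np

dx = np.pi/10
x = np.arange(0, 2*np.pi, dx)
approx = np.gradient(np.sin(x), dx)

err = np.abs(approx - np.cos(x)).max()
print(err)  # small, dominated by the O(dx**2) truncation error
```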
๐ŸŒ
Medium
medium.com โ€บ @whyamit404 โ€บ understanding-derivatives-with-numpy-e54d65fcbc52
Understanding Derivatives with NumPy | by whyamit404 | Medium
February 8, 2025 - In NumPy, we donโ€™t have a dedicated function for derivatives. Instead, we use np.gradient(). This function calculates the derivative using numerical differentiation. It estimates the rate of change of an array by looking at the differences between adjacent points.
Discussions

Derivative of an array in python? - Stack Overflow
Currently I have two numpy arrays: x and y of the same size. I would like to write a function (possibly calling numpy/scipy... functions if they exist): def derivative(x, y, n = 1): # somethi... More on stackoverflow.com
๐ŸŒ stackoverflow.com
python - How do I compute derivative using Numpy? - Stack Overflow
As of v1.13, non uniform spacing can be specified using an array as the second argument. See the Examples section of this page. 2019-03-29T02:12:05.913Z+00:00 ... NumPy does not provide general functionality to compute derivatives. More on stackoverflow.com
๐ŸŒ stackoverflow.com
python - Derivative of 1D Numpy Array - Stack Overflow
This is the code that I have but it tells me my answer is incorrect. def dfunc(x): ''' Parameters x: 1D numpy array Returns df: 1D numpy array containing first derivatives wrt x ''' # WRITE YOUR CODE HERE df = np.gradient(x) return df ... Why do you think you are doing something wrong? Can you provide example ... More on stackoverflow.com
๐ŸŒ stackoverflow.com
๐ŸŒ
GeeksforGeeks
geeksforgeeks.org โ€บ python โ€บ how-to-compute-derivative-using-numpy
How to compute derivative using Numpy? - GeeksforGeeks
July 23, 2025 - Here we are taking the expression ... derivative derivative = var.deriv() print("Derivative, f(x)'=", derivative) # calculates the derivative of after # given value of x print("When x=5 f(x)'=", derivative(5))...
๐ŸŒ
TutorialsPoint
tutorialspoint.com โ€บ how-to-compute-derivative-using-numpy
How to Compute Derivative Using Numpy?
July 20, 2023 - In this case, the gradient function will compute the derivative of f(x) = x^2 at each point in the domain and return an array representing the values of the derivative at each point.
๐ŸŒ
NumPy
numpy.org โ€บ doc โ€บ stable โ€บ reference โ€บ generated โ€บ numpy.gradient.html
numpy.gradient โ€” NumPy v2.4 Manual
For instance a uniform spacing: >>> x = np.arange(f.size) >>> np.gradient(f, x) array([1. , 1.5, 2.5, 3.5, 4.5, 5. ]) ... >>> x = np.array([0., 1., 1.5, 3.5, 4., 6.]) >>> np.gradient(f, x) array([1. , 3. , 3.5, 6.7, 6.9, 2.5]) For two dimensional arrays, the return will be two arrays ordered ...
๐ŸŒ
Derivative
derivative.ca โ€บ UserGuide โ€บ NumPy
NumPy - Derivative |
For example a CHOP with all its channels and samples can be converted to a NumPy array: # Returns all of the channels in this CHOP a 2D NumPy array # with a width equal to the channel length (the number of samples) # and a height equal to the ...
Top answer
1 of 3
6

This is not a simple problem, but there are a lot of methods that have been devised to handle it. One simple solution is to use finite difference methods. The command numpy.diff() uses finite differencing where you can specify the order of the derivative.

Wikipedia also has a page listing the finite-difference coefficients needed for different derivatives at different accuracies, in case the numpy function doesn't do what you want.

Depending on your application, you can also use scipy.fftpack.diff, which uses a completely different (spectral) technique to do the same thing, though your function needs a well-defined Fourier transform.
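A small sketch of that spectral route, assuming the data is periodic on the sampling window (scipy.fftpack.diff defaults to a period of 2π):

```python
import numpy as np
from scipy.fftpack import diff as spectral_diff

N = 64
x = np.linspace(0, 2*np.pi, N, endpoint=False)  # one full period, endpoint excluded
y = np.sin(x)

dydx = spectral_diff(y)  # spectral derivative; period=2*pi by default
print(np.abs(dydx - np.cos(x)).max())  # near machine precision for smooth periodic data
```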

There are lots and lots of variants (e.g. summation by parts, finite-differencing operators, or operators designed to preserve known evolution constants in your system of equations) on both of the two ideas above. What you should do depends a great deal on the problem you are trying to solve.

The good thing is that a lot of work has been done in this field. The Wikipedia page for Numerical Differentiation has some resources (though it is focused on finite-differencing techniques).

2 of 3
1

The findiff project is a Python package that can do derivatives of arrays of any dimension with any desired accuracy order (of course depending on your hardware restrictions). It can handle arrays on uniform as well as non-uniform grids and also create generalizations of derivatives, i.e. general linear combinations of partial derivatives with constant and variable coefficients.
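What such packages automate can be sketched by hand: finite-difference weights for arbitrary (even unevenly spaced) stencil points come from solving a small linear system built from Taylor expansions. A NumPy-only illustration of that idea (not findiff's actual internals):

```python
import numpy as np
from math import factorial

def fd_weights(offsets, deriv_order):
    """Weights w such that sum(w[j] * f(x + offsets[j])) ~ f^(deriv_order)(x)."""
    offsets = np.asarray(offsets, dtype=float)
    n = len(offsets)
    # Taylor expansion: f(x + o) = sum_k o**k / k! * f^(k)(x),
    # so matching coefficients gives the linear system A @ w = e_deriv_order.
    A = np.array([[o**k / factorial(k) for o in offsets] for k in range(n)])
    b = np.zeros(n)
    b[deriv_order] = 1.0
    return np.linalg.solve(A, b)

h = 0.5
print(fd_weights([-h, 0.0, h], 1))  # central difference: [-1/(2h), 0, 1/(2h)]
```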

Top answer
1 of 9
208

You have four options

  1. Finite Differences
  2. Automatic Derivatives
  3. Symbolic Differentiation
  4. Compute derivatives by hand.

Finite differences require no external tools but are prone to numerical error and, if you're in a multivariate situation, can take a while.

Symbolic differentiation is ideal if your problem is simple enough. Symbolic methods are getting quite robust these days. SymPy is an excellent project for this that integrates well with NumPy. Look at the autowrap or lambdify functions or check out Jensen's blogpost about a similar question.

Automatic derivatives are very cool, aren't prone to numeric errors, but do require some additional libraries (google for this; there are a few good options). This is the most robust but also the most sophisticated and difficult-to-set-up choice. If you're fine restricting yourself to numpy syntax, then Theano might be a good choice.
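As a toy illustration of what automatic differentiation does, here is a forward-mode sketch with dual numbers (the Dual class is a pedagogical stand-in, not any real library's API):

```python
class Dual:
    """Toy dual number val + dot*eps with eps**2 == 0; dot carries the derivative."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):  # product rule baked into multiplication
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

def f(x):
    return x * x + 1  # derivative is 2*x

x = Dual(3.0, 1.0)   # seed: dx/dx = 1
y = f(x)
print(y.val, y.dot)  # f(3) = 10.0, f'(3) = 6.0
```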

Here is an example using SymPy

In [1]: from sympy import *
In [2]: import numpy as np
In [3]: x = Symbol('x')
In [4]: y = x**2 + 1
In [5]: yprime = y.diff(x)
In [6]: yprime
Out[6]: 2โ‹…x

In [7]: f = lambdify(x, yprime, 'numpy')
In [8]: f(np.ones(5))
Out[8]: [ 2.  2.  2.  2.  2.]
2 of 9
82

The most straightforward way I can think of is using numpy's gradient function:

import numpy
x = numpy.linspace(0, 10, 1000)
dx = x[1] - x[0]
y = x**2 + 1
dydx = numpy.gradient(y, dx)

This way, dydx will be computed using central differences and will have the same length as y, unlike numpy.diff, which uses forward differences and returns a vector of size (n-1).
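This is easy to check numerically: central differences are exact for quadratics, so on the interior points the result matches the analytic derivative 2x (a quick sketch):

```python
import numpy as np

x = np.linspace(0, 10, 1000)
dx = x[1] - x[0]
y = x**2 + 1
dydx = np.gradient(y, dx)

# Interior central differences are exact for a quadratic; only the
# one-sided boundary estimates carry an O(dx) error by default.
print(np.abs(dydx[1:-1] - 2*x[1:-1]).max())
```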

๐ŸŒ
NumPy
numpy.org โ€บ doc โ€บ 2.1 โ€บ reference โ€บ generated โ€บ numpy.gradient.html
numpy.gradient โ€” NumPy v2.1 Manual
With a similar procedure the forward/backward approximations used for boundaries can be derived. ... Quarteroni A., Sacco R., Saleri F. (2007) Numerical Mathematics (Texts in Applied Mathematics). New York: Springer. ... Durran D. R. (1999) Numerical Methods for Wave Equations in Geophysical Fluid Dynamics. New York: Springer. ... Fornberg B. (1988) Generation of Finite Difference Formulas on Arbitrarily Spaced Grids, Mathematics of Computation 51, no. 184 : 699-706. PDF. ... >>> import numpy as np >>> f = np.array([1, 2, 4, 7, 11, 16]) >>> np.gradient(f) array([1.
๐ŸŒ
Reddit
reddit.com โ€บ r/learnpython โ€บ need help in understanding np.gradient for calculating derivatives
r/learnpython on Reddit: Need help in understanding np.gradient for calculating derivatives
June 30, 2023

Hi, I'm trying to expand my knowledge in Machine Learning. I came across the np.gradient function, and I wanted to understand how it relates to the Taylor series for estimating values. The documentation seemed a bit confusing for a novice.

Top answer
1 of 2
7
One definition of the derivative is f'(x) = (f(x+h)-f(x))/h where h goes to 0. Computers cannot store infinitely small numbers, so they might set h=1e-6 (that is 0.000001). It's a tradeoff: while we want h to be as small as possible, at some point the errors due to computer precision begin to dominate. Given any function that the computer can calculate, it can approximate the derivative.

import numpy as np
import matplotlib.pyplot as plt

h = 1e-6

def f(x):
    return np.sin(x)

x = np.arange(-2, 2, 0.01)
y = f(x)
dfdx = (f(x+h) - f(x)) / h
plt.plot(x, y)
plt.plot(x, dfdx)
plt.show()

Assuming that the function is reasonably smooth (i.e. the derivative above exists), another definition of the derivative is f'(x) = (f(x+h)-f(x-h))/(2h) where h goes to 0. Going from x-h to x+h means 2 steps; that's the reason for 2h. This works just as well. These methods are named finite differences to contrast with the normal derivative definition, where h is infinitely small. The first one is the forward difference and the second one is called the central difference. The backward difference is (f(x)-f(x-h))/h.

Let's assume we want to write a derivative function. It takes a function f and values of x, and gives back f'(x).

def f(x):
    return np.sin(x)

def d(fun, x, h=1e-6):
    return (fun(x+h) - fun(x)) / h

x = np.arange(-2, 2, 0.01)
y = f(x)
dfdx = d(f, x)
plt.plot(x, y)
plt.plot(x, dfdx)
plt.show()

By passing the function into the function, the derivative function can call fun wherever it needs to get the derivative. Now things become a bit more inconvenient: for some reason we do not know f. We only know y, i.e. f(x) for some values of x. Let's say that x is evenly spaced as usual. Then our best guess for h is not really tiny but identical to the spacing between neighboring x values. With the forward difference we need to take care at the rightmost value, because we cannot just add +h to get a value even further out; instead we use the backward difference there. For values in the middle we use the central difference instead of the forward difference.
def f(x):
    return np.sin(x)

def d(y, h=1):
    dfdx = [(y[1] - y[0]) / h]                  # forward difference at the left edge
    for i in range(1, len(y) - 1):
        dfdx.append((y[i+1] - y[i-1]) / (2*h))  # central difference in the interior
    dfdx.append((y[-1] - y[-2]) / h)            # backward difference at the right edge
    return dfdx

h = 0.01
x = np.arange(-2, 2, h)
y = f(x)
dfdx = d(y, h)
plt.plot(x, y)
plt.plot(x, dfdx)
plt.show()

The implementation above corresponds to np.gradient in the one-dimensional case where the spacing (varargs) is given as a scalar or omitted. When you instead pass the x coordinates as an array, d would use x directly instead of h. However, at that point the formula is more complicated, as they mention in the documentation. Effectively any point has an hd (the forward step size) and an hs (the backward step size), and the formula is not just (f(x+hd)-f(x-hs))/(hd+hs) but the bigger expression given in the documentation, where the values of hd, hs act as some kind of weights. np.gradient is basically the backward, central and forward differences combined. When you have values like f(1), f(2), f(2+h) and want the derivative at 2, the code notices that 2 and 2+h are very close together and puts greater weight on that pair (and mostly ignores f(1)).

The important part so far is that np.gradient, when given a vector with N elements, calculates N one-dimensional derivatives, which is not the typical idea of a gradient. np.gradient does support more dimensions, which might make things clearer. In the 1D case, we essentially go through all values from left to right and consider each value and its direct left and right neighbors to quantify the uptrend or downtrend. In the 2D case, np.gradient still does this, but additionally also walks from top to bottom and does the same. So in 2D it returns 2 arrays, one for left-right and one for top-bottom. The actual definition of the gradient by finite differences in 2D is [(f(x+h,y)-f(x,y))/h, (f(x,y+h)-f(x,y))/h]. These values are indeed returned by np.gradient: the left part is in the first array and the right part in the second array.
Say we are in 2D and want the gradient at x=3 and y=0; then we can plug it into np.gradient like this:

hx = 1e-6
hy = 1e-3
x = [3, 3+hx]
y = [0, 0+hy]
xx, yy = np.meshgrid(x, y)

def f(x, y):
    return x**2 - 2*x*np.sin(y) + 1/x

grad = np.gradient(f(xx, yy), y, x)  # Note the order.
print(grad[1][0,0], grad[0][0,0])    # Note the order. This is dfdx, dfdy.

But if the function f can be calculated by a computer, it makes more sense to just use automatic differentiation instead of finite differences. Automatic differentiation has no h that needs to be chosen carefully; it's always as accurate as possible.

import torch

x = torch.tensor([3.], requires_grad=True)
y = torch.tensor([0.], requires_grad=True)
z = x**2 - 2*x*torch.sin(y) + 1/x
z.backward()
print(x.grad, y.grad)

So what's the deal with the Taylor series? It's just a minor piece in the derivation of that more general expression used by np.gradient. We start by claiming that we can express the derivative by adding together function values in the direct neighborhood:

f'(x) = a f(x) + b f(x+hd) + c f(x-hs)

Given that finite differences do work out, this approach should work as well and generalize the idea. Expand f(x+hd) and f(x-hs) with their series:

f(x+hd) = f(x) + hd f'(x) + hd^2 f''(x)/2 + ...
f(x-hs) = f(x) - hs f'(x) + hs^2 f''(x)/2 + ...

Then plug them in and rearrange:

f'(x) = a f(x) + b f(x) + b hd f'(x) + b hd^2 f''(x)/2 + c f(x) - c hs f'(x) + c hs^2 f''(x)/2
      = (a+b+c) f(x) + (b hd - c hs) f'(x) + (b hd^2 + c hs^2)/2 f''(x)
0 = (a+b+c) f(x) + (b hd - c hs - 1) f'(x) + (b hd^2 + c hs^2)/2 f''(x)

The = in the middle is actually more of an approximately-equal sign. We won't be able to reach 0 for all f(x) as claimed on the left-hand side, but we can get pretty close. We do NOT want to minimize the right-hand side; we want it to reach 0 (it can go below 0 right now). To turn this into a minimization problem, we square it. This way we always get a positive number, and it really becomes a matter of minimization.
We COULD also take the absolute value instead of squaring, but it's a pain to work through and the end result is exactly the same parameters anyway. So we minimize E^2 with

E = (a+b+c) f(x) + (b hd - c hs - 1) f'(x) + (b hd^2 + c hs^2)/2 f''(x)

One requirement for an optimum is that the gradient is 0. In this case we take the derivatives with respect to a, b, c because we want to find the optimal a, b, c. First a reminder of the chain rule: dE^2/dt = 2E dE/dt for whatever t is. Using it is optional but a bit less messy than working everything through individually. In particular we have

dE^2/da = 2E dE/da = 2E f(x)
dE^2/db = 2E dE/db = 2E (f(x) + hd f'(x) + hd^2 f''(x)/2)
dE^2/dc = 2E dE/dc = 2E (f(x) - hs f'(x) + hs^2 f''(x)/2)

We want ALL three of them to be 0 at the same time. This can only happen if E is 0:

0 = (a+b+c) f(x) + (b hd - c hs - 1) f'(x) + (b hd^2 + c hs^2)/2 f''(x)

and we want this to hold for any f, f', f'' at any value of x. The only way for this to happen is if each coefficient is 0, i.e.

a + b + c = 0
b hd - c hs = 1
b hd^2 + c hs^2 = 0

We would need to check the second derivative to make sure that this is a minimum, not a maximum, but given the problem it is fairly clear. So why did we stop exactly after f'' in the Taylor series? Because this way we get exactly 3 unknowns and 3 equations, which is the most convenient to solve. Multiply the second equation by hd, then subtract the third from it:

(b hd^2 - c hs hd) - (b hd^2 + c hs^2) = hd
-c hs^2 - c hs hd = hd
c hs (hs + hd) = -hd
c = -hd/hs/(hs+hd) = -hd^2 / (hs hd (hs+hd))

where the last step is just so it looks exactly like in np.gradient. Insert c into the second equation:

b hd + hd/hs/(hs+hd) hs = 1
b hd + hd/(hs+hd) = 1
b + 1/(hs+hd) = 1/hd
b = 1/hd - 1/(hs+hd)
b = (hs(hs+hd) - hs hd) / [hs hd (hs+hd)]
b = hs^2 / [hs hd (hs+hd)]

From the first equation we know that a = -b - c = (hd^2 - hs^2) / (hs hd (hs+hd)).
So here's your summary:

If you have a function that can be calculated by a computer, use torch or tensorflow or any other framework for automatic differentiation.

If you have a function that can be calculated by a computer but such a framework is not available, np.gradient is still a bad idea because it is inefficient. Note that for the 2D gradient we needed three values: f(x,y), f(x+dx,y), f(x,y+dy). But with np.gradient we would first need to set up arrays where it is almost natural to also include f(x+dx,y+dy), which is not needed for gradient calculations. It's more natural to set up some loop that increments x once, then y once, then z once, and so on. Many solvers in scipy.optimize work with finite differences.

If you have a function that cannot be calculated by a computer, np.gradient may be useful. In practice this means that you have data from some experiment. Even there, the Taylor series plays no role UNLESS the data was taken on an unevenly spaced grid.
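The weights a, b, c derived above can be checked numerically against np.gradient on an unevenly spaced grid (a small sketch for one interior point):

```python
import numpy as np

x = np.array([0.0, 0.1, 0.4, 0.5, 0.9])  # unevenly spaced
y = x**3

g = np.gradient(y, x)

# Recompute one interior point with the derived weights
# f'(x) ~ a f(x) + b f(x+hd) + c f(x-hs).
i = 2
hs = x[i] - x[i-1]   # backward step
hd = x[i+1] - x[i]   # forward step
denom = hs * hd * (hs + hd)
a = (hd**2 - hs**2) / denom
b = hs**2 / denom
c = -hd**2 / denom
manual = a*y[i] + b*y[i+1] + c*y[i-1]

print(manual, g[i])  # identical up to rounding
```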
2 of 2
2
You might enjoy this Stack Overflow post on the same question.