Finite differences require no external tools but are prone to numerical error and, if you're in a multivariate situation, can take a while.
Symbolic differentiation is ideal if your problem is simple enough. Symbolic methods are getting quite robust these days. SymPy is an excellent project for this that integrates well with NumPy. Look at the autowrap or lambdify functions or check out Jensen's blogpost about a similar question.
Automatic derivatives are very cool, aren't prone to numeric errors, but do require some additional libraries (google for this, there are a few good options). This is the most robust but also the most sophisticated/difficult to set up choice. If you're fine restricting yourself to numpy syntax then Theano might be a good choice.
Here is an example using SymPy
In [1]: from sympy import *
In [2]: import numpy as np
In [3]: x = Symbol('x')
In [4]: y = x**2 + 1
In [5]: yprime = y.diff(x)
In [6]: yprime
Out[6]: 2⋅x
In [7]: f = lambdify(x, yprime, 'numpy')
In [8]: f(np.ones(5))
Out[8]: [ 2. 2. 2. 2. 2.]
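As a quick sanity check on the session above, the lambdified derivative can be compared against a central finite difference. This is only a minimal sketch: the step size h = 1e-5 and the sample points are arbitrary choices made for illustration and are not part of the original answer.

```python
import numpy as np
import sympy as sp

x = sp.Symbol('x')
y = x**2 + 1
yprime = sp.diff(y, x)                     # symbolic derivative, 2*x

f = sp.lambdify(x, y, 'numpy')             # numeric version of y
fprime = sp.lambdify(x, yprime, 'numpy')   # numeric version of dy/dx

xs = np.linspace(-2.0, 2.0, 5)             # illustrative sample points
h = 1e-5                                   # illustrative step size
central = (f(xs + h) - f(xs - h)) / (2 * h)

# Largest disagreement between the symbolic and the numerical derivative
max_diff = np.max(np.abs(central - fprime(xs)))
print(fprime(xs))
print(max_diff)
```

For a quadratic the central difference has no truncation error, so any remaining disagreement should come from floating-point rounding alone.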
July 23, 2025 - deriv(): Calculates and gives us the derivative expression · At first, we need to define a polynomial function using the numpy.poly1d() function.
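A short sketch of what that snippet describes, using numpy.poly1d and its deriv() method. The polynomial coefficients are made up for illustration.

```python
import numpy as np

# p(x) = x^3 + 2x^2 + 3x + 4, coefficients listed from the highest power down
p = np.poly1d([1, 2, 3, 4])

dp = p.deriv()        # first derivative: 3x^2 + 4x + 3
d2p = p.deriv(m=2)    # second derivative: 6x + 4

print(dp.coeffs)      # coefficients of the first derivative
print(dp(2.0))        # first derivative evaluated at x = 2
```

Because the result is again a poly1d object, it can be evaluated or differentiated further like the original polynomial.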
Discussions
Need help in understanding np.gradient for calculating derivatives
One definition of the derivative is f'(x) = (f(x+h)-f(x))/h where h goes to 0. Computers cannot store infinitely small numbers, so they might set h=1e-6 (that is 0.000001). It's a tradeoff: we want h to be as small as possible, but at some point the errors due to floating-point precision begin to dominate. Given any function that the computer can calculate, it can approximate the derivative.

    import numpy as np
    import matplotlib.pyplot as plt

    def f(x):
        return np.sin(x)

    h = 1e-6  # step size
    x = np.arange(-2, 2, 0.01)
    y = f(x)
    dfdx = (f(x+h) - f(x)) / h
    plt.plot(x, y)
    plt.plot(x, dfdx)
    plt.show()

Assuming that the function is reasonably smooth (i.e. the derivative above exists), another definition of the derivative is f'(x) = (f(x+h)-f(x-h))/(2h) where h goes to 0. Going from x-h to x+h means 2 steps, that's the reason for 2h. This works just as well, and is usually more accurate for the same h. These methods are called finite differences, in contrast to the normal derivative definition where h is infinitely small. The first one is the forward difference and the second one is called the central difference. The backward difference is (f(x)-f(x-h))/h.

Let's assume we want to write a derivative function. It takes a function f and values of x, and gives back f'(x).

    import numpy as np
    import matplotlib.pyplot as plt

    def f(x):
        return np.sin(x)

    def d(fun, x, h=1e-6):
        return (fun(x+h) - fun(x)) / h

    x = np.arange(-2, 2, 0.01)
    y = f(x)
    dfdx = d(f, x)
    plt.plot(x, y)
    plt.plot(x, dfdx)
    plt.show()

By passing the function into the function, the derivative function can just call fun wherever it wants/needs to get the derivative.

Now things become a bit more inconvenient. For some reason we do not know f. We only know y, i.e. f(x) for some values of x. Let's say that x is evenly spaced as usual. Then our best guess for h is not really tiny but identical to the spacing between neighboring x values. With the forward difference we need to take care at the rightmost value because we cannot just add +h to get a value even further out. Instead we use the backward difference there. For values in the middle we decide to use the central difference instead of the forward difference.
    import numpy as np
    import matplotlib.pyplot as plt

    def f(x):
        return np.sin(x)

    def d(y, h=1):
        dfdx = [(y[1] - y[0]) / h]                   # forward difference at the left edge
        for i in range(1, len(y) - 1):
            dfdx.append((y[i+1] - y[i-1]) / 2 / h)   # central difference in the middle
        dfdx.append((y[-1] - y[-2]) / h)             # backward difference at the right edge
        return dfdx

    h = 0.01
    x = np.arange(-2, 2, h)
    y = f(x)
    dfdx = d(y, h)
    plt.plot(x, y)
    plt.plot(x, dfdx)
    plt.show()

The implementation above corresponds to np.gradient in the one-dimensional case where varargs is given as in case 1 or 2 of the documentation (a single scalar spacing, or one scalar spacing per dimension). The case where varargs is given as in case 3 or 4 (coordinate arrays) would use x directly in d instead of h. However, at that point the formula is more complicated, as they mention in the documentation. Effectively any point has a hd (the forward step size) and a hs (the backward step size), and the formula is not just (f(x+hd)-f(x-hs))/(hd+hs) but instead that bigger expression given in the documentation, where the values of hd and hs act as some kind of weights.

np.gradient is basically backward, central and forward difference combined. When you have values like f(1), f(2), f(2+h) and want the derivative at 2, the code notices that 2 and 2+h are very close together and puts greater weight on that pair (and mostly ignores f(1)).

The important part so far is that np.gradient, when given a vector with N elements, calculates N one-dimensional derivatives, which is not the typical idea of a gradient. np.gradient does support more dimensions, which might make things clearer. In the 1D case, we essentially go through all values from left to right and consider each value and its direct left and right neighbors to quantify the uptrend or downtrend. In the 2D case, np.gradient still does this, but additionally also walks from top to bottom and does the same. So in 2D it returns 2 arrays, one for left-right and one for top-bottom.

The actual definition of the gradient by finite differences is [(f(x+h,y)-f(x,y))/h, (f(x,y+h)-f(x,y))/h] in 2D. These values are indeed returned by np.gradient, with one array per axis: the derivative along axis 0 comes first, then the derivative along axis 1.
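The claim that this edge-plus-central scheme matches np.gradient for evenly spaced data can be checked directly. The sketch below uses a vectorised helper (the name finite_diff_1d is made up here) and assumes np.gradient is called with a scalar spacing and its default edge_order=1.

```python
import numpy as np

def finite_diff_1d(y, h=1.0):
    """Forward difference at the left edge, central difference in the
    middle, backward difference at the right edge (uniform spacing h)."""
    y = np.asarray(y, dtype=float)
    dydx = np.empty_like(y)
    dydx[0] = (y[1] - y[0]) / h
    dydx[1:-1] = (y[2:] - y[:-2]) / (2 * h)
    dydx[-1] = (y[-1] - y[-2]) / h
    return dydx

h = 0.01
x = np.arange(-2, 2, h)
y = np.sin(x)

ours = finite_diff_1d(y, h)
reference = np.gradient(y, h)   # scalar spacing, default edge_order=1

print(np.allclose(ours, reference))
```

Away from the two edges, both results should also sit close to the exact derivative cos(x), since the central difference error shrinks with h squared.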
Say we are in 2D and want the gradient at x=3 and y=0. Then we can plug it into np.gradient like this:

    import numpy as np

    hx = 1e-6
    hy = 1e-3
    x = [3, 3+hx]
    y = [0, 0+hy]
    xx, yy = np.meshgrid(x, y)

    def f(x, y):
        return x**2 - 2*x*np.sin(y) + 1/x

    grad = np.gradient(f(xx, yy), y, x)   # Note the order.
    print(grad[1][0,0], grad[0][0,0])     # Note the order. This is dfdx, dfdy.

But if the function f can be calculated by a computer, it makes more sense to just use automatic differentiation instead of finite differences. Automatic differentiation has no h that needs to be chosen carefully. It's always as accurate as possible.

    import torch

    x = torch.tensor([3.], requires_grad=True)
    y = torch.tensor([0.], requires_grad=True)
    z = x**2 - 2*x*torch.sin(y) + 1/x
    z.backward()
    print(x.grad, y.grad)

So what's the deal with the Taylor series? It's just a minor piece in the derivation of that more general expression used by np.gradient. We just start by claiming that we can express the derivative by adding together function values in the direct neighborhood.

    f'(x) = a f(x) + b f(x+hd) + c f(x-hs)

Given that finite differences do work out, this approach should work as well and generalize the idea. Expand f(x+hd) and f(x-hs) with their series:

    f(x+hd) = f(x) + hd f'(x) + hd^2 f''(x)/2 + ...
    f(x-hs) = f(x) - hs f'(x) + hs^2 f''(x)/2 + ...

Then plug it in and reshape:

    f'(x) = a f(x) + b f(x) + b hd f'(x) + b hd^2 f''(x)/2 + c f(x) - c hs f'(x) + c hs^2 f''(x)/2
          = (a+b+c) f(x) + (b hd - c hs) f'(x) + (b hd^2 + c hs^2)/2 f''(x)
    0 = (a+b+c) f(x) + (b hd - c hs - 1) f'(x) + (b hd^2 + c hs^2)/2 f''(x)

The = in the middle is actually more of an approximately-equal sign. We won't be able to reach 0 for all f(x) as claimed on the left-hand side, but we can get pretty close. We do NOT want to minimize the right-hand side. We want it to reach 0 (it can go below 0 right now). To turn this into a minimization problem, we square it. This way we always get a non-negative number and it really becomes a matter of minimization.
We COULD also take the absolute value instead of squaring, but it's a pain to work this through and the end result is exactly the same parameters anyway.

    To minimize: E^2 with E = (a+b+c) f(x) + (b hd - c hs - 1) f'(x) + (b hd^2 + c hs^2)/2 f''(x)

One requirement for an optimum is that the gradient is 0. In this case we take the derivatives with respect to a, b, c because we want to find the optimal a, b, c. First a reminder of the chain rule: dE^2/dt = 2E dE/dt for whatever t is. It's optional to do this but a bit less messy than working it through individually. In particular we have

    dE^2/da = 2E dE/da = 2E f(x)
    dE^2/db = 2E dE/db = 2E (f(x) + hd f'(x) + hd^2 f''(x)/2)
    dE^2/dc = 2E dE/dc = 2E (f(x) - hs f'(x) + hs^2 f''(x)/2)

We want ALL three of them to be 0 at the same time. This can only happen if E is 0.

    0 := (a+b+c) f(x) + (b hd - c hs - 1) f'(x) + (b hd^2 + c hs^2)/2 f''(x)

and we want this to be 0 for any f, f', f'' for any value of x. The only way for this to happen is if each coefficient is 0, i.e.

    a + b + c = 0
    b hd - c hs = 1
    b hd^2 + c hs^2 = 0

We would need to check the second derivative to make sure that this is a minimum, not a maximum, but given the problem it is fairly clear. So why did we stop exactly after f'' in the Taylor series? It's because this way we get exactly 3 unknowns and 3 equations, which is the most convenient to solve.

Multiply the second equation by hd, then subtract the third from it.

    (b hd^2 - c hs hd) - (b hd^2 + c hs^2) = hd
    -c hs^2 - c hs hd = hd
    c hs (hs + hd) = -hd
    c = -hd/hs/(hs+hd) = -hd^2 / (hs hd (hs+hd))

where the last step is just so it looks exactly like in np.gradient. Insert c into the second equation.

    b hd + hd/hs/(hs+hd) hs = 1
    b hd + hd/(hs+hd) = 1
    b + 1/(hs+hd) = 1/hd
    b = 1/hd - 1/(hs+hd)
    b = (hs (hs+hd) - hs hd) / [hs hd (hs+hd)]
    b = hs^2 / [hs hd (hs+hd)]

From the first equation we know that a = -b-c = (hd^2 - hs^2) / (hs hd (hs+hd)).
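The coefficients a, b, c derived above can be checked numerically against np.gradient on an unevenly spaced grid. This is a sketch under assumptions: the helper name weights and the sample points are invented for illustration, and only interior points are compared because the edges use one-sided formulas.

```python
import numpy as np

def weights(hs, hd):
    """Coefficients a, b, c of f'(x) ~ a f(x) + b f(x+hd) + c f(x-hs)."""
    denom = hs * hd * (hs + hd)
    a = (hd**2 - hs**2) / denom
    b = hs**2 / denom
    c = -hd**2 / denom
    return a, b, c

# Unevenly spaced sample points (illustrative values)
x = np.array([0.0, 0.1, 0.25, 0.3, 0.5, 0.55, 0.8])
y = np.sin(x)

interior = []
for i in range(1, len(x) - 1):
    hs = x[i] - x[i - 1]      # backward step size
    hd = x[i + 1] - x[i]      # forward step size
    a, b, c = weights(hs, hd)
    interior.append(a * y[i] + b * y[i + 1] + c * y[i - 1])
interior = np.array(interior)

reference = np.gradient(y, x)   # coordinates passed as an array
print(np.allclose(interior, reference[1:-1]))
```

When hs equals hd, a becomes 0 and b and c reduce to 1/(2h) and -1/(2h), which is the ordinary central difference.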
So here's your summary:

If you have a function that can be calculated by a computer, use torch or tensorflow or any other framework for automatic differentiation.

If you have a function that can be calculated by a computer but such a framework is not available, np.gradient is still a bad idea because it is inefficient. Note that for the 2D gradient we needed only three values: f(x,y), f(x+dx,y), f(x,y+dy). But with np.gradient we would first need to set up arrays, where it is almost natural to also include f(x+dx,y+dy), which is not needed for gradient calculations. It's more natural to set up a loop that increments x once, then y once, then z once, and so on. Many solvers in scipy.optimize work with finite differences.

If you have a function that cannot be calculated by a computer, np.gradient may be useful. In practice this means that you have data from some experiment. Even there, the concept of a Taylor series plays no role UNLESS the data was taken on an unevenly spaced grid.
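The loop idea from the summary can be sketched as follows: evaluate f once at the point, then once more per coordinate. The function name forward_gradient and the default step h = 1e-6 are assumptions made for this illustration. The test function is the one from the 2D example earlier in the answer.

```python
import numpy as np

def forward_gradient(f, point, h=1e-6):
    """Approximate the gradient of f at `point` with n + 1 evaluations of f."""
    point = np.asarray(point, dtype=float)
    f0 = f(point)                      # evaluated once, reused for every axis
    grad = np.empty_like(point)
    for i in range(point.size):
        shifted = point.copy()
        shifted[i] += h                # step along one coordinate only
        grad[i] = (f(shifted) - f0) / h
    return grad

def f(p):
    x, y = p
    return x**2 - 2 * x * np.sin(y) + 1 / x

g = forward_gradient(f, [3.0, 0.0])
print(g)   # approximates (df/dx, df/dy) at x=3, y=0
```

For comparison, the exact partial derivatives at that point are 6 - 1/9 and -6.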
r/learnpython
June 30, 2023
How do I find the relative maximum in a function?
Take the derivative and find where it equals zero. Also, do your calculus homework using calculus! But seriously, give us more context. You really can use numpy/scipy to get a derivative, or you can scan the function with some arbitrary input, or you can ask questions we can't answer without proper setup. WHY are you trying to find a local maximum? What inputs do you have? What context is this in?
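One way to act on "take the derivative and find where it equals zero" with sampled values is to look for the place where the numerical slope changes sign from positive to negative. This is a rough sketch: the function, the grid, and the use of np.gradient are illustrative assumptions, and the answer is only as precise as the grid spacing.

```python
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 2001)
y = np.sin(x) + 0.5 * x            # illustrative function

dydx = np.gradient(y, x)           # numerical slope at every sample

# A relative maximum sits where the slope goes from positive to non-positive
idx = np.where((dydx[:-1] > 0) & (dydx[1:] <= 0))[0]
x_max = x[idx]
print(x_max)
```

For this function the slope is cos(x) + 0.5, so calculus puts the relative maximum at x = 2*pi/3, which the scan should locate to within one grid step.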
r/Python
January 23, 2014
How To Take Derivatives In Python: 3 Different Types of Scenarios
In this video I show how to properly take derivatives in Python in 3 different types of scenarios. The first scenario is when you have an explicit form for your function, such as f(x)=x^2 or f(x)=e^x sin(x). In such a scenario, the sympy library can be used to take first, second, up to nth derivatives of a function. This comes in handy for complicated functions, but can later on be EXTREMELY useful for computing Lagrange's equations of motion given strange trajectories. The second scenario is when you collect data and want to compute a derivative. In such a scenario, the data is often noisy, and taking a simple derivative will fail since it will amplify the high-frequency component of the data. In such a case, one needs to smooth the data before taking a derivative. The ideal library for managing this is numpy. The third scenario involves functions of an irregular form. By this, I mean that your function can't be written down as simply as "sin(x)" or "e^x". For example, f(x) = "solve an ODE using some complex ODE solver with parameter abserr=x and compute the integral of the answer". In this case, derivatives can't be computed symbolically, but one can use scipy's derivative method to get a good estimate of df/dx at certain values of x.
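The second scenario (noisy data) can be illustrated with synthetic data. The sketch below is not taken from the video: the noise level, the moving-average smoother, and the window length of 51 samples are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed so the run is repeatable

t = np.linspace(0.0, 2.0 * np.pi, 1000)
noisy = np.sin(t) + 0.01 * rng.standard_normal(t.size)   # synthetic measurement

# Naive derivative of the noisy samples
raw_deriv = np.gradient(noisy, t)

# Moving-average smoothing before differentiating
window = 51
kernel = np.ones(window) / window
smooth = np.convolve(noisy, kernel, mode="same")
smooth_deriv = np.gradient(smooth, t)

# RMS error against the true derivative cos(t), skipping the window-distorted edges
inner = slice(window, -window)
raw_err = np.sqrt(np.mean((raw_deriv[inner] - np.cos(t[inner])) ** 2))
smooth_err = np.sqrt(np.mean((smooth_deriv[inner] - np.cos(t[inner])) ** 2))
print(raw_err, smooth_err)
```

Even 1% noise swamps the raw derivative here, because differencing neighbouring samples divides the noise by a small step. A wider window suppresses more noise but also flattens genuine features, so the window length is a tradeoff.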
r/Physics
August 9, 2021
I need help with numpy.gradient
u/jtclimb makes a great point. You need to distinguish between gradient and gradient descent. They both have the word gradient, but one is a property and one is a process. The gradient is the rate of change of a function. You can determine it several different ways. If you have a symbolic function you can use textbook calculus to write the equation for the derivative by hand. That is an exact solution. There are also numerical tricks you can do to approximate the derivative. NumPy's gradient takes small steps to approximate the slope. If you want to minimize a function you need to find the variable values that minimize it. Once again there are different approaches. Gradient descent calculates the gradient of the function at a point in space and uses that to estimate how much to increase or decrease each variable. It does this iteratively until it gets within a tolerance or exceeds a number of iterations. So gradient descent is an algorithm that minimizes a function using gradient information. I highly recommend the free textbook by Martins and Ning. Just google “MDOBook Martins”. Chapter 4 covers unconstrained gradient-based optimization, which includes gradient descent. Chapter 6 explains the different ways to calculate a gradient. FYI if you’re referring to gradient descent for machine learning you’ll need to read more specific texts, but the stuff in chapter 4 will give you a really strong basic understanding.
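To make the property-versus-process distinction concrete, here is a minimal gradient descent sketch. The quadratic test function, the learning rate of 0.1, the tolerance, and the iteration cap are all illustrative assumptions, and the gradient is written out by hand rather than approximated.

```python
import numpy as np

def f(p):
    x, y = p
    return (x - 1.0) ** 2 + 2.0 * (y + 2.0) ** 2   # minimum at (1, -2)

def grad_f(p):
    # The gradient: a property of f at the point p
    x, y = p
    return np.array([2.0 * (x - 1.0), 4.0 * (y + 2.0)])

def gradient_descent(grad, start, learning_rate=0.1, tol=1e-8, max_iter=1000):
    """The process: step against the gradient until it is nearly zero
    or the iteration limit is reached."""
    p = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        g = grad(p)
        if np.linalg.norm(g) < tol:
            break
        p = p - learning_rate * g
    return p

p_min = gradient_descent(grad_f, [5.0, 5.0])
print(p_min, f(p_min))
```

If no hand-written gradient is available, grad could be swapped for a finite-difference or automatic-differentiation routine without changing the descent loop.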
February 8, 2025 - Let’s get into some code to make this clearer. Suppose we want to find the derivative of y = x². Here’s how you can do it:

    import numpy as np

    # Define the range for x from 0 to 10, split into 100 points
    x = np.linspace(0, 10, 100)

    # Define the function y = x^2
    y = x**2

    # Compute the derivative of y with respect to x using np.gradient
    dy_dx = np.gradient(y, x)

    # Print the result
    print(dy_dx)
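Since the exact derivative of x² is 2x, the output of that snippet can be compared against it. The sketch below also tries np.gradient's edge_order=2 option. The comparison itself is an addition for illustration, not part of the quoted article.

```python
import numpy as np

x = np.linspace(0, 10, 100)
y = x**2
exact = 2 * x

first_order_edges = np.gradient(y, x)                  # default edge_order=1
second_order_edges = np.gradient(y, x, edge_order=2)   # second-order one-sided edges

print(np.max(np.abs(first_order_edges[1:-1] - exact[1:-1])))  # interior points
print(abs(first_order_edges[0] - exact[0]))                    # left edge
print(np.max(np.abs(second_order_edges - exact)))              # all points
```

With the default setting the interior values match 2x up to rounding, while the two end points are off by about one grid spacing. With edge_order=2 the end points should match as well, because a second-order formula is exact for a quadratic.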
The value at which the derivative of f was evaluated (after broadcasting with args and step_direction). ... The implementation was inspired by jacobi [1], numdifftools [2], and DERIVEST [3], but the implementation follows the theory of Taylor series more straightforwardly (and arguably naively so).
These examples will showcase the versatility of Python for computing derivatives and highlight the applicability of derivative calculations in different domains. Derivative calculations for different types of functions ... Below are a few code snippets that demonstrate the methods discussed earlier. The functions and input values in these examples can be adapted and modified to specific requirements.

1. Numerical differentiation using central difference

    import numpy as np

    def f(x):
        return np.sin(x)

    def central_difference(f, x, h):
        return (f(x + h) - f(x - h)) / (2 * h)

    x = 0.5
    h = 0.01
    f_prime = central_difference(f, x, h)
    print(f_prime)
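The snippet above fixes h = 0.01. To see how the choice of h affects accuracy, the following sketch compares forward and central differences of sin at x = 1 over a range of step sizes. The evaluation point and the list of steps are illustrative assumptions.

```python
import numpy as np

def forward_diff(f, x, h):
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 1.0
exact = np.cos(x0)    # known derivative of sin at x0

steps = [1e-2, 1e-4, 1e-6, 1e-8, 1e-10, 1e-12]
forward_err = [abs(forward_diff(np.sin, x0, h) - exact) for h in steps]
central_err = [abs(central_diff(np.sin, x0, h) - exact) for h in steps]

for h, fe, ce in zip(steps, forward_err, central_err):
    print(f"h={h:.0e}  forward error={fe:.2e}  central error={ce:.2e}")
```

The errors typically shrink as h decreases and then grow again once floating-point rounding dominates, with the central difference reaching its best accuracy at a larger h than the forward difference.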
July 20, 2023 - To sum it up, NumPy's gradient function provides a straightforward method to calculate derivatives of functions in Python, making it a valuable tool for a range of applications across many fields.
Gradient is calculated only along the given axis or axes The default (axis = None) is to calculate the gradient for all the axes of the input array. axis may be negative, in which case it counts from the last to the first axis. ... A tuple of ndarrays (or a single ndarray if there is only one dimension) corresponding to the derivatives of f with respect to each dimension.
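A small sketch of the axis behaviour described in that documentation excerpt. The grid sizes and the linear test function are illustrative assumptions, chosen so the expected partial derivatives are the constants 3 and 5.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 11)      # coordinates along axis 1 (columns)
y = np.linspace(0.0, 2.0, 21)      # coordinates along axis 0 (rows)
X, Y = np.meshgrid(x, y)           # default 'xy' indexing gives shape (21, 11)
F = 3.0 * X + 5.0 * Y              # dF/dx = 3 and dF/dy = 5 everywhere

dF_dy, dF_dx = np.gradient(F, y, x)        # one array per axis, in axis order
only_dx = np.gradient(F, x, axis=1)        # a single axis returns a single array
only_dx_neg = np.gradient(F, x, axis=-1)   # negative axis counts from the end

print(dF_dx.shape, dF_dy.shape)
print(dF_dx[0, 0], dF_dy[0, 0])
```

Note that the derivative along axis 0 (here y) comes first in the returned sequence, which is why the unpacking order is dF_dy, dF_dx.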
New in version 1.11.0. ... A tuple of ndarrays (or a single ndarray if there is only one dimension) corresponding to the derivatives of f with respect to each dimension.
MyGrad takes this one step further, and provides true drop-in automatic differentiation to NumPy. Install MyGrad into your Python environment. Open your terminal, activate your desired Python environment, and run the following command. ... Let’s jump right in with a simple example of using MyGrad to evaluate the derivative of a function at a specific point.
January 14, 2021 - Then, let’s set the function value in the form of pairs x, y with a step of 0.01 for the range of x from 0 to 4. We’re going to use the scipy derivative to calculate the first derivative of the function. Please don’t write your own code to calculate the derivative of a function until you know why you need it. Scipy provides fast implementations of numerical methods and it is pre-compiled and tested across many use cases.

    import numpy
    import matplotlib.pyplot as plt

    def f(x):
        return x*x

    x = numpy.arange(0, 4, 0.01)
    y = f(x)

    plt.figure(figsize=(10, 5))
    plt.plot(x, y, 'b')
    plt.grid(axis='both')
    plt.show()
March 23, 2022 - NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.[1]
Autograd can automatically differentiate native Python and Numpy code. It can handle a large subset of Python's features, including loops, ifs, recursion and closures, and it can even take derivatives of derivatives of derivatives.
Hi, I'm trying to expand my knowledge in Machine Learning, I came across the np.gradient function, I wanted to understand how it relates to Taylor's Series for estimating values. The documentation seemed a bit confusing for novice.
January 31, 2021 - Polynomial to differentiate. A sequence is interpreted as polynomial coefficients, see poly1d. ... A new polynomial representing the derivative.