These functions, although related, do different things.
np.diff simply takes the differences of array slices along a given axis; used for the n-th difference, it returns an array smaller by n along that axis (which is what you observed in the n=1 case). Please see: https://docs.scipy.org/doc/numpy/reference/generated/numpy.diff.html
np.gradient produces the gradients of an array along each of its dimensions while preserving its shape: https://docs.scipy.org/doc/numpy/reference/generated/numpy.gradient.html Please also observe that np.gradient expects a single input array; your second argument b does not make sense here (it was interpreted as the first non-keyword argument from *varargs, which is meant to describe the spacings between the values of the first argument), hence the results that don't match your intuition.
I would simply use c = np.diff(a) / np.diff(b) and append values to c if you really need c.shape to match a.shape. For instance, you might append zeros if you expect the gradient to vanish close to the edges of your window.
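A minimal sketch of that approach; the arrays a and b here are made-up placeholders for your own data:

```python
import numpy as np

# Hypothetical data: values a sampled at the (non-uniform) positions b
b = np.array([0.0, 1.0, 3.0, 3.5, 5.0])
a = np.array([5.0, 7.0, 4.0, 8.0, 9.0])

# Slope between consecutive samples; one element shorter than a
c = np.diff(a) / np.diff(b)

# Append a zero so c.shape matches a.shape, assuming the gradient
# vanishes close to the edge of the window
c = np.append(c, 0.0)
```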
python - What does numpy.gradient do? - Stack Overflow
Hi, I'm trying to expand my knowledge in machine learning. I came across the np.gradient function and wanted to understand how it relates to the Taylor series for estimating values. The documentation seemed a bit confusing for a novice.
Also, in the documentation¹:
>>> y = np.array([1, 2, 4, 7, 11, 16], dtype=float)
>>> j = np.gradient(y)
>>> j
array([ 1. , 1.5, 2.5, 3.5, 4.5, 5. ])
Gradient is defined as (change in y)/(change in x). x, here, is the list index, so the difference between adjacent values is 1.
At the boundaries, the first difference is calculated. This means that at each end of the array, the gradient given is simply the difference between the two end values (divided by 1).
Away from the boundaries, the gradient for a particular index is given by taking the difference between the values on either side and dividing by 2.
So, the gradient of y, above, is calculated thus:
j[0] = (y[1]-y[0])/1 = (2-1)/1 = 1
j[1] = (y[2]-y[0])/2 = (4-1)/2 = 1.5
j[2] = (y[3]-y[1])/2 = (7-2)/2 = 2.5
j[3] = (y[4]-y[2])/2 = (11-4)/2 = 3.5
j[4] = (y[5]-y[3])/2 = (16-7)/2 = 4.5
j[5] = (y[5]-y[4])/1 = (16-11)/1 = 5
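The hand calculation above can be checked directly against what NumPy returns:

```python
import numpy as np

y = np.array([1, 2, 4, 7, 11, 16], dtype=float)
j = np.gradient(y)

# One-sided differences at the two ends, central differences in between
expected = np.array([1.0, 1.5, 2.5, 3.5, 4.5, 5.0])
```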
You could find the minima of all the absolute values in the resulting array to find the turning points of a curve, for example.
¹ The array is actually called x in the example in the docs; I've changed it to y to avoid confusion.
Here is what is going on. The Taylor series expansion guides us on how to approximate the derivative, given the value at close points. The simplest comes from the first order Taylor series expansion for a C^2 function (two continuous derivatives)...
- f(x+h) = f(x) + f'(x)h + f''(ξ)h^2/2.
One can solve for f'(x):
- f'(x) = [f(x+h) - f(x)]/h + O(h).
Can we do better? Yes indeed. If we assume C^3, then the Taylor expansions are
- f(x+h) = f(x) + f'(x)h + f''(x)h^2/2 + f'''(ξ₁)h^3/6, and
- f(x-h) = f(x) - f'(x)h + f''(x)h^2/2 - f'''(ξ₂)h^3/6.
Subtracting these (both the h^0 and h^2 terms drop out!) and solving for f'(x):
- f'(x) = [f(x+h) - f(x-h)]/(2h) + O(h^2).
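These error orders can be checked numerically. A small sketch using sin(x) (whose derivative, cos(x), is known exactly): shrinking h by a factor of 10 should shrink the O(h) forward-difference error by about 10, but the O(h^2) central-difference error by about 100.

```python
import numpy as np

x = 1.0
exact = np.cos(x)  # exact derivative of sin at x

def fwd_err(h):
    # first-order forward difference
    return abs((np.sin(x + h) - np.sin(x)) / h - exact)

def cen_err(h):
    # second-order central difference
    return abs((np.sin(x + h) - np.sin(x - h)) / (2 * h) - exact)

# Shrink h by 10x: O(h) error drops ~10x, O(h^2) error drops ~100x
fwd_ratio = fwd_err(0.1) / fwd_err(0.01)   # roughly 10
cen_ratio = cen_err(0.1) / cen_err(0.01)   # roughly 100
```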
So, if we have a discretized function defined on equidistant partitions x = x_0, x_0+h (= x_1), ..., x_n = x_0 + h·n, then numpy.gradient will yield a "derivative" array using the first-order estimates at the ends and the better, second-order estimates in the middle.
Example 1. If you don't specify any spacing, the interval is assumed to be 1. So if you call
f = np.array([5, 7, 4, 8])
what you are saying is that f(0) = 5, f(1) = 7, f(2) = 4, and f(3) = 8. Then
np.gradient(f)
will be: f'(0) = (7 - 5)/1 = 2, f'(1) = (4 - 5)/(2*1) = -0.5, f'(2) = (8 - 7)/(2*1) = 0.5, f'(3) = (8 - 4)/1 = 4.
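Running this example confirms those values:

```python
import numpy as np

f = np.array([5, 7, 4, 8], dtype=float)
g = np.gradient(f)   # default spacing h = 1

# Forward difference at the left end, backward at the right end,
# central differences in between
```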
Example 2. If you specify a single spacing, the spacing is uniform but not 1.
For example, if you call
np.gradient(f, 0.5)
this is saying that h = 0.5, not 1, i.e., the function is really f(0) = 5, f(0.5) = 7, f(1.0) = 4, f(1.5) = 8. The net effect is to replace h = 1 with h = 0.5 and all the results will be doubled.
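The doubling can be seen by comparing the two calls directly:

```python
import numpy as np

f = np.array([5, 7, 4, 8], dtype=float)
g1 = np.gradient(f)        # h = 1
g2 = np.gradient(f, 0.5)   # h = 0.5: every estimate doubles
```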
Example 3. Suppose the discretized function f(x) is not defined on uniformly spaced intervals, for instance f(0) = 5, f(1) = 7, f(3) = 4, f(3.5) = 8. Then numpy.gradient uses a messier discretized differentiation formula, and you will get the discretized derivatives by calling
np.gradient(f, np.array([0,1,3,3.5]))
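A sketch of the non-uniform case (passing a coordinate array as the spacing argument requires NumPy 1.13+). The interior values come from a spacing-weighted second-order formula, while the endpoints are plain one-sided differences over the actual spacings:

```python
import numpy as np

f = np.array([5, 7, 4, 8], dtype=float)
x = np.array([0.0, 1.0, 3.0, 3.5])
g = np.gradient(f, x)

# Endpoints are one-sided differences over the actual spacings:
#   g[0]  = (7 - 5) / 1   = 2
#   g[-1] = (8 - 4) / 0.5 = 8
```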
Lastly, if your input is a 2d array, then you are thinking of a function f of x, y defined on a grid. The numpy gradient will output the arrays of "discretized" partial derivatives in x and y.
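For instance, sampling a linear function on a grid (a made-up example) gives constant partials along each axis:

```python
import numpy as np

# z[i, j] = 4*i + j sampled on a 3x4 integer grid
z = np.arange(12, dtype=float).reshape(3, 4)

# np.gradient returns one array per axis: the discretized
# partial derivatives along axis 0 and axis 1
dz_di, dz_dj = np.gradient(z)
# dz_di is 4 everywhere, dz_dj is 1 everywhere
```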
numpy.gradient uses forward, backward, and central differences where appropriate.
Input:
x = np.array([1, 2, 4, 7, 11, 16], dtype=float)
np.gradient(x) # this uses default distance=1
Output:
array([ 1. , 1.5, 2.5, 3.5, 4.5, 5. ])
For the first item it uses forward (current -> next) difference:
- previous number: none
- current (first) number: 1
- next number: 2
(2 - 1) / 1 = 1.
For the last item it uses backward (previous -> current) difference:
- previous number: 11
- current (last) number: 16
- next number: none
(16 - 11) / 1 = 5.
And, for the items in between, the central difference is applied:
- previous number: 1
- current number: 2
- next number: 4
(4 - 1) / 2 = 1.5
- previous number: 2
- current number: 4
- next number: 7
(7 - 2) / 2 = 2.5
...and so on:
(11 - 4) / 2 = 3.5
(16 - 7) / 2 = 4.5
The differences are divided by the sample distance (default=1) for forward and backward differences, but twice the distance for the central difference to obtain appropriate gradients.
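The three-rule recipe above can be reproduced by hand and checked against np.gradient:

```python
import numpy as np

x = np.array([1, 2, 4, 7, 11, 16], dtype=float)

manual = np.empty_like(x)
manual[0] = x[1] - x[0]                # forward difference at the start
manual[-1] = x[-1] - x[-2]             # backward difference at the end
manual[1:-1] = (x[2:] - x[:-2]) / 2    # central differences in between
```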
What you expect as output is what you will get when running np.diff, but one element shorter:
>>> np.diff(arr)
array([ 1., 2., 3., 4., 5.])
np.gradient takes the i-th element and averages the difference between the (i+1)-th and i-th elements with the difference between the i-th and (i-1)-th elements. For the edge values it can only use one point. So the second value, 1.5, comes from averaging (2-1) and (4-2).
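This "average of the adjacent differences" view is equivalent to the central difference, which you can check directly:

```python
import numpy as np

x = np.array([1, 2, 4, 7, 11, 16], dtype=float)
d = np.diff(x)                    # [1, 2, 3, 4, 5]

# Average each pair of neighbouring differences; this reproduces
# the interior (central-difference) values of np.gradient
interior = (d[:-1] + d[1:]) / 2
```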
The $x$-coordinates of your data points are not equally spaced (x[1:] - x[:-1] is not constant), so numpy.gradient with a scalar spacing is not applicable, because it then assumes that the data is equally spaced (http://docs.scipy.org/doc/numpy/reference/generated/numpy.gradient.html). (Newer NumPy versions, 1.13+, do accept the coordinate array itself as the spacing argument.)
Even just forward differences on their own would be better than using an inaccurate value of $\Delta x$ in centered differences.
With forward differences, the small bumps disappear, but the discontinuities remain, so those probably come from the data itself. You can smooth them by constructing a spline interpolant (http://docs.scipy.org/doc/scipy/reference/interpolate.html).
Here's a plot of the derivative evaluated by first performing a spline fit of the data. Compared to the finite-difference approach, the result appears more natural.
# sub_data is assumed to be an (N, 2) array of (x, y) samples;
# x and dydx are the finite-difference results computed earlier
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import UnivariateSpline

spl = UnivariateSpline(sub_data[:, 0], sub_data[:, 1])  # smoothing spline fit
x_range = np.linspace(sub_data[0, 0], sub_data[-1, 0], 1000)
plt.plot(x_range, spl.derivative(1)(x_range), 'b', label='spline')
plt.plot(x, dydx, 'r-', label='finite diff.')
plt.legend(loc='best')
plt.show()

So I struggle to understand which of the commands above makes more sense in the scenario of differentiating velocity with respect to time. My timestamps are 1 second apart, so time would be the vector [1 2 3 4 ...]. Gradient and diff use different approaches to the problem, okay, but which one is "right", since they provide me with different results? In my university we had a class for MATLAB, but we never used these commands, and now I want to get more into the "game", since I will need it in future classes and my bachelor's. It's not directly a "homework" question, but I suppose it makes more sense if it is handled as such.
Edit: for more clarity, I used diff(v)./diff(time) and gradient(v), since the step size is 1 anyway, if I got that right.
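For what it's worth, the same comparison in NumPy (the velocity values here are made up) shows the relationship: diff gives the slope between samples and returns one element fewer, while gradient gives one estimate per sample, and each interior gradient value is the average of the two neighbouring diff values.

```python
import numpy as np

v = np.array([0.0, 2.0, 3.0, 7.0, 8.0])   # hypothetical velocities, 1 s apart
t = np.arange(len(v))                      # timestamps [0, 1, 2, 3, 4]

a_diff = np.diff(v) / np.diff(t)   # slopes between samples, len(v) - 1 values
a_grad = np.gradient(v)            # one estimate per sample
```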