The warning (not error) of

OptimizeWarning: Covariance of the parameters could not be estimated

means that the fit could not determine the uncertainties (variances) of the fitting parameters.

The main problem is that your model function f treats the parameters start and end as discrete values -- they are used as integer locations for the change in functional form. scipy's curve_fit (like every other optimization routine in scipy.optimize) assumes that parameters are continuous variables, not discrete.

The fitting procedure will take small steps in the parameters (typically on the order of the square root of machine precision) to compute a numerical derivative of the residual with respect to each variable (the Jacobian). With values used as discrete variables, these derivatives will be exactly zero, and the fitting procedure will not know how to change the values to improve the fit.
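
To see the failure mode concretely, here is a minimal sketch (a hypothetical f that mimics the question's use of integer cut points, not the OP's exact code) that reproduces the warning:

```python
import warnings
import numpy as np
from scipy.optimize import curve_fit, OptimizeWarning

def f(x, start, end):
    # parameters used as integer cut points: a tiny finite-difference
    # step in `start` or `end` never changes int(start)/int(end),
    # so the numerical Jacobian is exactly zero
    i, j = int(start), int(end)
    y = np.empty_like(x)
    y[:i] = -1.0
    y[j:] = 1.0
    y[i:j] = np.linspace(-1.0, 1.0, j - i)
    return y

xdata = np.linspace(0.0, 1000.0, 1000)
ydata = np.where(xdata < 500, -1.0, 1.0)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    popt, pcov = curve_fit(f, xdata, ydata, p0=[495.0, 505.0])

# the OptimizeWarning fires and pcov comes back filled with np.inf
print([w.category.__name__ for w in caught])
print(pcov)
```

With a zero Jacobian the fit terminates immediately at the starting values, which is exactly the symptom described above.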

It looks like you're trying to fit a step function to some data. Allow me to recommend trying lmfit (https://lmfit.github.io/lmfit-py) which provides a higher-level interface to curve fitting, and has many built-in models. For example, it includes a StepModel that should be able to model your data.

For a slight modification of your data (so that it has a step of finite width), the following script with lmfit can fit it:

#!/usr/bin/python
import numpy as np
from lmfit.models import StepModel, LinearModel
import matplotlib.pyplot as plt

np.random.seed(0)
xdata = np.linspace(0., 1000., 1000)
ydata = -np.ones(1000)
ydata[500:1000] = 1.
# note that a linear step is added here:
ydata[490:510] = -1 + np.arange(20)/10.0
ydata = ydata + np.random.normal(size=len(xdata), scale=0.1)

# model data as Step + Line
step_mod = StepModel(form='linear', prefix='step_')
line_mod = LinearModel(prefix='line_')

model = step_mod + line_mod

# make named parameters, giving initial values:
pars = model.make_params(line_intercept=ydata.min(),
                         line_slope=0,
                         step_center=xdata.mean(),
                         step_amplitude=ydata.std(),
                         step_sigma=2.0)

# fit data to this model with these parameters
out = model.fit(ydata, pars, x=xdata)

# print results
print(out.fit_report())

# plot data and best-fit
plt.plot(xdata, ydata, 'b')
plt.plot(xdata, out.best_fit, 'r-')
plt.show()

which prints out a report of

[[Model]]
    (Model(step, prefix='step_', form='linear') + Model(linear, prefix='line_'))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 49
    # data points      = 1000
    # variables        = 5
    chi-square         = 9.72660131
    reduced chi-square = 0.00977548
    Akaike info crit   = -4622.89074
    Bayesian info crit = -4598.35197
[[Variables]]
    step_sigma:      20.6227793 +/- 0.77214167 (3.74%) (init = 2)
    step_center:     490.167878 +/- 0.44804412 (0.09%) (init = 500)
    step_amplitude:  1.98946656 +/- 0.01304854 (0.66%) (init = 0.996283)
    line_intercept: -1.00628058 +/- 0.00706005 (0.70%) (init = -1.277259)
    line_slope:      1.3947e-05 +/- 2.2340e-05 (160.18%) (init = 0)
[[Correlations]] (unreported correlations are < 0.100)
    C(step_amplitude, line_slope)     = -0.875
    C(step_sigma, step_center)        = -0.863
    C(line_intercept, line_slope)     = -0.774
    C(step_amplitude, line_intercept) =  0.461
    C(step_sigma, step_amplitude)     =  0.170
    C(step_sigma, line_slope)         = -0.147
    C(step_center, step_amplitude)    = -0.146
    C(step_center, line_slope)        =  0.127

and produces a plot of the data in blue with the best fit in red.

Lmfit has lots of extra features. For example, if you want to set bounds on some of the parameter values or prevent some from varying, you can do the following:

# make named parameters, giving initial values:
pars = model.make_params(line_intercept=ydata.min(),
                         line_slope=0,
                         step_center=xdata.mean(),
                         step_amplitude=ydata.std(),
                         step_sigma=2.0)

# now set max and min values for the step amplitude
pars['step_amplitude'].min = 0
pars['step_amplitude'].max = 100

# fix the intercept of the line to be -1.0
# (LinearModel's parameters are `slope` and `intercept`, so with the
# 'line_' prefix the parameter is named line_intercept)
pars['line_intercept'].value = -1.0
pars['line_intercept'].vary = False

# then run fit with these parameters
out = model.fit(ydata, pars, x=xdata)

If you know the model should be Step+Constant and that the constant should be fixed, you could also modify the model to be

from lmfit.models import ConstantModel
# model data as Step + Constant
step_mod = StepModel(form='linear', prefix='step_')
const_mod = ConstantModel(prefix='const_')

model = step_mod + const_mod

pars = model.make_params(const_c=-1,
                         step_center=xdata.mean(),
                         step_amplitude=ydata.std(),
                         step_sigma=2.0)
pars['const_c'].vary = False
Answer from M Newville on Stack Overflow

This answer is way too late, but if you want to stick with curve_fit from scipy, then rewriting the function so that it does not explicitly depend on start and end as cutoff points does the job. For example, for x < start the value -1 can be obtained by shifting x by start and checking its sign, i.e. np.sign(x - start). It then becomes a matter of writing a separate definition for each piece of the function and adding them up into a single function.

import numpy as np
from scipy.optimize import curve_fit

def f(x, start, end):
    left_tail = np.sign(x - start)
    left_tail[left_tail > -1] = 0         # only -1s are needed from here
    right_tail = np.sign(x - end)
    right_tail[right_tail < 1] = 0        # only 1s are needed from here
    rest = 1 / (end-start) * (x - start)
    rest[(rest < 0) | (rest > 1)] = 0     # only the values between 0 and 1 are needed from here
    return left_tail + rest + right_tail  # sum for a single value

popt, pcov = curve_fit(f, xdata, ydata, p0=[495., 505.])

The above function can be written substantially more concisely using np.clip() (which limits the values in an array) to replace the boolean indexing and replacement done above.

import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

def f(x, start, end):
    left_tail = np.clip(np.sign(x - start), -1, 0)
    rest = np.clip(1 / (end-start) * (x - start), 0, 1)
    return left_tail + rest

# sample data (as in the OP)
xdata = np.linspace(0, 1000, 1000)
ydata = np.r_[[-1.]*500, [1]*500]
ydata += np.random.normal(0, 0.25, len(ydata))

# fit function `f` to the data
popt, pcov = curve_fit(f, xdata, ydata, p0=[495., 505.])
print(popt, pcov)

# plot data along with the fitted function
plt.plot(xdata, ydata, 'b-', label='data')
plt.plot(xdata, f(xdata, *popt), 'r-', label='fit')
plt.legend();

# [499.4995098  501.51244722] [[ 1.24195553 -0.25654186]
#  [-0.25654186  0.2538896 ]]

Then, using data constructed in the same way as in the OP, we get coefficients (499.499, 501.51) -- pretty close to (500, 500) -- and the plot shows the fitted ramp tracking the step in the data.


The problem seems to be one of scaling. When I added the Jacobian of the function, an overflow warning appeared. Thus, I divided the data by their maximum values and it worked. The code follows.

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

t = np.array([33.90, 76.95, 166.65, 302.15, 330.11, 429.82, 533.59, 638.19])
t = t/t.max()
y = np.array([0.25, 1.81, 8.32, 11.60, 12.18, 10.12, 9.44, 5.81])
y = y/y.max()


def FFA(t, K1, K2, beta, delta):
    CSM_unif = K2 * t**delta
    return K1 * t**beta * np.exp(-CSM_unif)

def grad(t, K1, K2, beta, delta):
    g1 = t**beta*np.exp(-K2*t**delta)
    g2 = -K1*t**(delta + beta)*np.exp(-K2*t**delta)
    g3 = K1*t**beta*np.exp(-K2*t**delta)*np.log(t)
    g4 = -K1*K2*t**(delta+beta)*np.exp(-K2*t**delta)*np.log(t)
    return np.array([g1, g2, g3, g4]).T


fig, ax1 = plt.subplots()
popt, pcov = curve_fit(FFA, t, y, jac=grad, maxfev=2000)


K1, K2, beta, delta = popt
print("value of K1 = ",K1)
print("value of K2 = ",K2)
print("value of beta = ",beta)
print("value of delta = ",delta)


ax1.plot(t, y, "o")
teval = np.linspace(t.min(), t.max(), 201)
yeval = FFA(teval,*popt)
ax1.plot(teval, yeval, color='red')

ax1.set_xlabel('Time (in s)')
ax1.set_ylabel('Spectral flux density (in Jansky)')
ax1.set(title='Light Curve')
plt.show()
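
As an aside, a hand-written Jacobian like grad above is easy to get wrong, and an error in it can itself break the covariance estimate. A quick sanity check (a sketch reusing the model and gradient from the script, evaluated at arbitrary test values) is to compare it against central finite differences:

```python
import numpy as np

def FFA(t, K1, K2, beta, delta):
    return K1 * t**beta * np.exp(-K2 * t**delta)

def grad(t, K1, K2, beta, delta):
    g1 = t**beta * np.exp(-K2 * t**delta)
    g2 = -K1 * t**(delta + beta) * np.exp(-K2 * t**delta)
    g3 = K1 * t**beta * np.exp(-K2 * t**delta) * np.log(t)
    g4 = -K1 * K2 * t**(delta + beta) * np.exp(-K2 * t**delta) * np.log(t)
    return np.array([g1, g2, g3, g4]).T

t = np.linspace(0.1, 1.0, 5)
p = np.array([1.3, 0.7, 2.0, 1.5])   # arbitrary test point

# central finite differences, one parameter at a time
eps = 1e-6
num = np.empty((t.size, p.size))
for k in range(p.size):
    dp = np.zeros_like(p)
    dp[k] = eps
    num[:, k] = (FFA(t, *(p + dp)) - FFA(t, *(p - dp))) / (2 * eps)

# analytic and numerical Jacobians should agree to ~1e-5 relative
assert np.allclose(grad(t, *p), num, rtol=1e-5, atol=1e-8)
```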

The results indeed show that you have some scaling issues

value of K1 =  858036532.7666308
value of K2 =  21.214387300160098
value of beta =  5.958499311376985
value of delta =  0.3661586949061432

I suggest that you non-dimensionalize your model beforehand, so that all your numbers are of the same order of magnitude.
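
Concretely, because the fit above was done on t/t.max() and y/y.max(), the reported parameters can be mapped back to original units analytically. A sketch using the values printed above (only the algebraic identity is checked here, not the fit itself):

```python
import numpy as np

def FFA(t, K1, K2, beta, delta):
    return K1 * t**beta * np.exp(-K2 * t**delta)

t = np.array([33.90, 76.95, 166.65, 302.15, 330.11, 429.82, 533.59, 638.19])
y = np.array([0.25, 1.81, 8.32, 11.60, 12.18, 10.12, 9.44, 5.81])
t_max, y_max = t.max(), y.max()

# parameters fitted on the scaled data, as printed above
K1_s, K2_s = 858036532.7666308, 21.214387300160098
beta_s, delta_s = 5.958499311376985, 0.3661586949061432

# y/y_max = K1_s * (t/t_max)**beta * exp(-K2_s * (t/t_max)**delta), hence
K1_orig = y_max * K1_s / t_max**beta_s
K2_orig = K2_s / t_max**delta_s

# both parameterizations describe the same curve in original units
lhs = FFA(t, K1_orig, K2_orig, beta_s, delta_s)
rhs = y_max * FFA(t / t_max, K1_s, K2_s, beta_s, delta_s)
assert np.allclose(lhs, rhs)
```

Note how the enormous K1_s collapses to a K1_orig of order 1e-7 once the units are restored -- the scaling issue in a nutshell.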


Update: November 12, 2021

You changed your model, but I will rewrite it as

    y(t) = K1 * t**beta * exp(-K2 * t**(-3))

where the fixed numbers of your expression have been absorbed into K1 and K2. Using this model it works for me.

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

t = np.array([33.90, 76.95, 166.65, 302.15, 330.11, 429.82, 533.59, 638.19, 747.94])
y = np.array([0.25, 1.81, 8.32, 11.60, 12.18, 10.12, 9.44, 5.81, 5.42])
yerr = np.array([0.09, 0.08, 0.14, 0.13, 0.16, 0.11, 0.06, 0.05, 0.06])


def FFA(t, K1, K2, beta):
    return K1 * t**beta * np.exp(-K2 * t**-3)


fig, ax1 = plt.subplots()
popt, pcov = curve_fit(FFA, t, y, sigma=yerr, maxfev=2000)


K1, K2, beta = popt
print("value of K1 = ",K1)
print("value of K2 = ",K2)
print("value of beta = ",beta)


ax1.plot(t, y, "o")
teval = np.linspace(t.min(), t.max(), 201)
yeval = FFA(teval,*popt)
ax1.plot(teval, yeval, color='red')

ax1.set_xlabel('Time (in s)')
ax1.set_ylabel('Spectral flux density (in Jansky)')
ax1.set(title='Light Curve')
plt.show()

The following are the results

value of K1 =  13773.167296442545
value of K2 =  6542145.117006296
value of beta =  -1.1791460005485805

and the covariance matrix is

[[ 4.90022065e+08  3.82085117e+10 -5.62753836e+03]
 [ 3.82085117e+10  4.62424755e+12 -4.33603182e+05]
 [-5.62753836e+03 -4.33603182e+05  6.47333340e-02]]


Why?

You have three problems in your model:

  • Operator precedence is not respected: as pointed out by @jared, you must add parentheses around the term kb * T;
  • The target dataset is normalized so that its maximum is unity, whereas it is its area that should be unitary (it is a distribution); this normalization prevents the model from converging. Adding the missing scale parameter N to the model allows convergence;
  • There is a scale and/or units issue: your Boltzmann constant is expressed in J/K, but we have no clue about the units of B. Recall that the term inside the exponential must be dimensionless. Expressing the constant in eV/K seems to provide a credible temperature.

MCVE

Below is a complete example fitting your data:

import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize, integrate
from sklearn import metrics

#kb = 1.380649e-23    # J/K
kb = 8.617333262e-5   # eV/K
B = 0.3808            # ?

def model(J, T, N):
    return N * (B * (2 * J + 1) / (kb * T) * np.exp(- B * J * (J + 1)/ (kb * T)))

J = np.array([2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48])
N = np.array([0.185,0.375,0.548,0.695,0.849,0.931,0.996,0.992,1.0,0.977,0.927,0.834,0.691,0.68,0.575,0.479,0.421,0.351,0.259,0.208,0.162,0.135,0.093,0.066])

popt, pcov = optimize.curve_fit(model, J, N, p0=[1e5, 1])
#(array([2.51661517e+06, 2.75770698e+01]),
# array([[1.17518427e+09, 3.25544771e+03],
#        [3.25544771e+03, 6.04463335e-02]]))

np.sqrt(np.diag(pcov))
# array([3.42809607e+04, 2.45858361e-01])

Nhat = model(J, *popt)
metrics.r2_score(N, Nhat)  # 0.9940231921241076

Jlin = np.linspace(J.min(), J.max(), 200)
Nlin = model(Jlin, *popt)

fig, axe = plt.subplots()
axe.scatter(J, N)
axe.plot(Jlin, Nlin)
axe.set_xlabel("")
axe.set_ylabel("")
axe.grid()

Regressed values are about T = 2.51e6 ± 0.03e6 and N = 2.76e1 ± 0.02e1, both quoted to two significant figures. The fit is reasonable (R^2 = 0.994), with the fitted curve passing close to the scattered points.

If your Boltzmann constant is expressed in J/K, temperature T is about 1.6e+25 which seems a bit too much for such a variable.
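
That figure follows directly from the change of units: T only appears in the product kb*T, so switching kb from eV/K to J/K rescales the fitted temperature by the ratio of the two constants:

```python
kb_J = 1.380649e-23       # Boltzmann constant in J/K
kb_eV = 8.617333262e-5    # Boltzmann constant in eV/K
T_eV_fit = 2.51661517e6   # temperature fitted with kb in eV/K (see above)

# kb*T is invariant, so the temperature implied with kb in J/K is
T_J_fit = T_eV_fit * kb_eV / kb_J
print(f"{T_J_fit:.2e}")   # about 1.6e+25, as stated above
```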

Check

We can check the area under the curve is indeed unitary:

Jlin = np.linspace(0, 100, 2000)
Nlin = model(Jlin, *popt)
I = integrate.trapezoid(Nlin / popt[-1], Jlin)  # 0.9999992484190978


Try this fit, which works. The original function will not really fit the data: for large x it converges to a, which is fine, and it drops toward smaller x, but only if b is positive. Moreover, a parameter scaling the drop rate is missing. So with a < 0 and b < 0 one gets the right shape, but the curve converges to a negative number, i.e. an offset is missing. Hence the use of y = a + b / ( x + c ).


import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.integrate import cumulative_trapezoid as cumtrapz  # cumtrapz was removed in SciPy 1.14
import numpy as np

#Fitting function
def func( x, a, b, c ):
    return a + b / ( x + c )

#Experimental x and y data points
xData = np.array([
    31875.324, 31876.35, 31877.651, 31878.859, 31879.771, 31880.657,
    31881.617, 31882.343, 31883.099, 31883.758, 31884.489, 31885.311,
    31886.084, 31886.736, 31887.582, 31888.262, 31888.908, 31889.627,
    31890.312, 31890.989, 31891.534, 31892.142, 31892.759, 31893.323,
    31893.812, 31894.397, 31894.928, 31895.555, 31896.211, 31896.797,
    31898.16, 31898.761, 31899.462, 31900.099, 31900.609, 31901.197,
    31901.815, 31902.32, 31902.755, 31903.235, 31903.698, 31904.232,
    31904.776, 31905.291, 31905.806, 31906.182, 31906.533, 31906.843,
    31907.083, 31907.175, 31822.221, 31833.14, 31846.066, 31860.254,
    31875.324
], dtype=float )

yData = np.array([
    7999.026, 7999.483, 8000.048, 8000.559, 8000.937, 8001.298,
    8001.683, 8001.969, 8002.263, 8002.516, 8002.793, 8003.101,
    8003.387, 8003.625, 8003.931, 8004.174, 8004.403, 8004.655,
    8004.892, 8005.125, 8005.311, 8005.517, 8005.725, 8005.913,
    8006.076, 8006.269, 8006.443, 8006.648, 8006.861, 8007.05,
    8007.486, 8007.677, 8007.899, 8008.1, 8008.259, 8008.443,
    8008.636, 8008.793, 8008.929, 8009.077, 8009.221, 8009.386,
    8009.554, 8009.713, 8009.871, 8009.987, 8010.095, 8010.19,
    8010.264, 8010.293, 7956.451, 7969.307, 7981.115, 7991.074,
    7999.026
], dtype=float )

#x values for the fitted function
xFit = np.linspace( 31820, 31950, 150 )

### sorting and shifting to get reasonable values
### scaling also would be a good idea, but I skip this part here
mx=np.mean( xData )
my=np.mean( yData )


data = np.column_stack( ( xData - mx , yData - my ) )
data = np.sort( data, axis=0 )

### getting good starting values using the fact that
### int x y = ( b + a c) x + a/2 x^2 - c * ( int y ) + const
### With two simple and fast numerical integrals, hence, we get a good
### guess for a, b, c from a simple linear fit.

Sy = cumtrapz( data[:, 1], data[:, 0], initial = 0 )
Sxy = cumtrapz( data[:, 0] * data[:, 1], data[:, 0], initial = 0 )
ST = np.array([
    data[:, 0], data[:, 0]**2, Sy, np.ones( len( data ) )
])
S = np.transpose( ST )
A = np.dot( ST, S )
eta = np.dot( ST, Sxy )
sol = np.linalg.solve( A, eta )
a = 2 * sol[1]
c = -sol[2]
b = sol[0] - a * c

print( "a = {}".format( a ) )
print( "b = {}".format( b ) )
print( "c = {}".format( c ) )

sol, cov = curve_fit( func, xData, yData, p0 = (a + my, b, c - mx) )
print( "linear fit vs non-linear fit:" )
print( a + my, b, c - mx )
print( sol )
fig = plt.figure()
ax = fig.add_subplot( 1, 1, 1 )
ax.plot( xData, yData, ls='', marker ='+', label="data")
ax.plot( xFit, func( xFit, a + my, b, c - mx), ls=':', label="linear guess")
ax.plot( xFit, func( xFit, *sol ), label="non-linear fit" )
ax.legend( loc=0 )
plt.show()

Providing

[  8054  -6613  -31754]

and the corresponding plot of the data, the linear guess, and the non-linear fit.
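
The integral identity behind the starting-value trick can be verified on synthetic data. This is a sketch with arbitrary a, b, c (using np.linalg.lstsq instead of the explicit normal equations above, which is equivalent here):

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

# Verify the identity used above for y = a + b/(x + c):
#   int x*y dx = (b + a*c)*x + (a/2)*x**2 - c*(int y dx) + const
a, b, c = 2.0, -3.0, 5.0
x = np.linspace(0.0, 10.0, 2001)
y = a + b / (x + c)

Sy = cumulative_trapezoid(y, x, initial=0)
Sxy = cumulative_trapezoid(x * y, x, initial=0)

# least-squares fit of Sxy against [x, x**2, Sy, 1]
S = np.column_stack([x, x**2, Sy, np.ones_like(x)])
sol, *_ = np.linalg.lstsq(S, Sxy, rcond=None)

# recover the parameters exactly as in the script above
a_est = 2 * sol[1]
c_est = -sol[2]
b_est = sol[0] - a_est * c_est
assert np.allclose([a_est, b_est, c_est], [a, b, c], atol=1e-3)
```

The recovered (a_est, b_est, c_est) match the true values to within the trapezoidal-integration error, which is why the linear fit gives such good starting values for curve_fit.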
