Note that the absolute value in this model, $y = a\,|x-m| + b + \varepsilon$, is more often written as $|x-m|$ or just $\mathrm{abs}(x-m)$.

In the case where $m$ is unknown and you want to estimate it, you can use nonlinear least squares on this.

[In Matlab, see lsqcurvefit, for example]

Alternatively, since for a given value of $m$ you can write it as a linear regression problem, you can treat the problem as a partially linear model*: given some value of $m$ you can estimate $a$ and $b$ by least squares, so you can just optimize the MSE over $m$. Any univariate optimizer should work nicely for that. While $a$ and $b$ can actually be eliminated algebraically, you can avoid the effort of doing that: within the function that calculates the sum of squares of residuals for a given $m$, you just compute the least squares fit of $a$ and $b$ in a regression of $y$ on $|x-m|$, and the residual sum of squares of that fit is the function value.

[In Matlab, see fminsearch as an example of an optimizer; but since it's a sum of squares, you should be able to take advantage of lsqnonlin]

* Not to be confused with partial least squares, which is quite a different thing.
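The profiling idea can be sketched briefly. The following is a minimal Python/NumPy illustration (not from the original answer): the data are simulated with assumed values $m=11$, $a=0.75$, $b=5$ to mirror the example further down, and a plain grid search stands in for the univariate optimizer.

```python
import numpy as np

# Simulated data from y = a*|x - m| + b + noise (values chosen for illustration)
rng = np.random.default_rng(1)
x = rng.uniform(3, 21, 200)
y = 0.75 * np.abs(x - 11) + 5 + rng.normal(0, 0.3, 200)

def profile_sse(m):
    """For fixed m the model is linear in a and b, so fit them by least
    squares and return the residual sum of squares."""
    X = np.column_stack([np.abs(x - m), np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return r @ r

# Any univariate optimizer would do; a fine grid over the data range keeps it simple.
grid = np.linspace(x.min(), x.max(), 2001)
m_hat = grid[np.argmin([profile_sse(m) for m in grid])]

# Recover a and b at the chosen m by one more linear fit.
X = np.column_stack([np.abs(x - m_hat), np.ones_like(x)])
a_hat, b_hat = np.linalg.lstsq(X, y, rcond=None)[0]
```

A proper univariate minimizer (golden section, or `optimize` in R) would find the same minimum with far fewer evaluations; the grid just makes the structure of the profiled objective obvious.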

I don't have Matlab at present, but I can illustrate some of these ideas in R easily enough; it's not hard to do the same kind of thing in Matlab.

First I generated some data with $m=11$, $a=0.75$ and $b=5$:

set.seed(329783)
m=11; a=0.75; b=5
x = runif(100,3,21)
y = a*abs(x-m)+b + rnorm(100,0,.3)

a) If you have moderately good guesses at the parameters, this is simple with a nonlinear least squares program:

> Vfit0 = nls(y~a*abs(x-m)+b,start=list(m=10,a=1,b=4))
>  summary(Vfit0)

Formula: y ~ a * abs(x - m) + b

Parameters:
  Estimate Std. Error t value Pr(>|t|)    
m 11.03212    0.03891   283.5   <2e-16 ***
a  0.74747    0.01122    66.6   <2e-16 ***
b  5.01031    0.05792    86.5   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2827 on 97 degrees of freedom

Number of iterations to convergence: 4 
Achieved convergence tolerance: 2.878e-09

If you don't have good guesses at the parameters (or you want to completely automate it), good estimates are easy enough to construct for this problem.

If you square $y$ and subtract off the minimum (to almost get rid of the constant term involving $b$), the result is approximately quadratic in $x$ (the location of its minimum is not all that sensitive to accuracy in that step). The $x$-value at the minimum of that quadratic should still be near $m$.

The minimum of a quadratic ($a_2x^2+a_1x+a_0$, with $a_2>0$) will be at $x = -a_1/(2a_2)$.

So calculating that for a quadratic fit (in $x$) to $y^2-\min(y^2)$ should give a good estimate of $m$. Then, given that estimate $\hat m$, a linear regression of $y$ on $|x-\hat m|$ gives a good $a$ and $b$. So those should be nice starting values --

Here's how I did that in R:

qmin = function(v) -v[2]/v[3]/2

y0 = y^2-min(y^2)
m0 = qmin( lm( y0~x+I(x^2) )$coefficients ) # start estimate of m
stcoef = lm(y~abs(x-m0))$coefficients # to get start estimates of a and b

Vfit = nls(y~sign(x-m)*a*(x-m)+b,start=list(m=m0,a=stcoef[2],b=stcoef[1])) # sign(x-m)*(x-m) is abs(x-m)

The final fit was the same as above, so I won't repeat it, but the starting values were

         m         a        b
  11.14097 0.7536526 4.984033

which are much closer to the final estimates than my (quite sufficient) original "guess".
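As the answer notes, the same construction is easy in other environments. Here is a hedged Python/NumPy transcription of the start-value recipe (the data are re-simulated, since R's `set.seed` stream isn't reproducible from NumPy, so the numbers will differ slightly from the R run):

```python
import numpy as np

# Re-simulated data from y = 0.75*|x - 11| + 5 + noise, mirroring the R example
rng = np.random.default_rng(42)
x = rng.uniform(3, 21, 100)
y = 0.75 * np.abs(x - 11) + 5 + rng.normal(0, 0.3, 100)

# Quadratic fit (in x) to y^2 - min(y^2); the vertex -a1/(2*a2) estimates m
a2, a1, a0 = np.polyfit(x, y**2 - np.min(y**2), 2)  # highest degree first
m0 = -a1 / (2 * a2)

# Given m0, starting values for a and b come from an ordinary linear
# regression of y on |x - m0|
X = np.column_stack([np.abs(x - m0), np.ones_like(x)])
a_start, b_start = np.linalg.lstsq(X, y, rcond=None)[0]
```

These starting values would then be handed to whatever nonlinear least squares routine is available (`nls` in R, `lsqcurvefit` in Matlab).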

Answer from Glen_b on Stack Exchange