Note that the absolute value in this model, $y = a\,|x-m| + b + \varepsilon$, is more often written as $|x-m|$ or just $\mathrm{abs}(x-m)$.

In the case where $m$ is unknown and you want to estimate it, you can use nonlinear least squares on this.

[In Matlab, see lsqcurvefit, for example]

Alternatively, since for a given value of $m$ you can write it as a linear regression problem, you can treat the problem as a partially linear model*: given some value of $m$ you can estimate $a$ and $b$ by least squares, so you can just optimize the MSE over $m$. Any univariate optimizer should work nicely for that. While $a$ and $b$ can actually be eliminated algebraically, you can avoid the effort of doing that: within the function that calculates the sum of squares of residuals for a given $m$, you just compute the least squares fit of $a$ and $b$ in a regression of $y$ on $|x-m|$, and the residual sum of squares of that fit is the function value.

[In Matlab, see fminsearch as an example of an optimizer; but since it's a sum of squares, you should be able to take advantage of lsqnonlin]

* Not to be confused with partial least squares, which is quite a different thing.
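The profiling idea can be sketched briefly. The following is a minimal Python/NumPy illustration (not from the original answer): the data are simulated with assumed values $m=11$, $a=0.75$, $b=5$ to mirror the example further down, and a plain grid search stands in for the univariate optimizer.

```python
import numpy as np

# Simulated data from y = a*|x - m| + b + noise (values chosen for illustration)
rng = np.random.default_rng(1)
x = rng.uniform(3, 21, 200)
y = 0.75 * np.abs(x - 11) + 5 + rng.normal(0, 0.3, 200)

def profile_sse(m):
    """For fixed m the model is linear in a and b, so fit them by least
    squares and return the residual sum of squares."""
    X = np.column_stack([np.abs(x - m), np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return r @ r

# Any univariate optimizer would do; a fine grid over the data range keeps it simple.
grid = np.linspace(x.min(), x.max(), 2001)
m_hat = grid[np.argmin([profile_sse(m) for m in grid])]

# Recover a and b at the chosen m by one more linear fit.
X = np.column_stack([np.abs(x - m_hat), np.ones_like(x)])
a_hat, b_hat = np.linalg.lstsq(X, y, rcond=None)[0]
```

A proper univariate minimizer (golden section, or `optimize` in R) would find the same minimum with far fewer evaluations; the grid just makes the structure of the profiled objective obvious.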

I don't have Matlab at present, but I can illustrate some of these ideas in R easily enough; it's not hard to do the same kind of thing in Matlab.

First I generated some data with $m=11$, $a=0.75$ and $b=5$:

set.seed(329783)
m=11; a=0.75; b=5
x = runif(100,3,21)
y = a*abs(x-m)+b + rnorm(100,0,.3)

a) If you have moderately good guesses at the parameters, this is simple with a nonlinear least squares program:

> Vfit0 = nls(y~a*abs(x-m)+b,start=list(m=10,a=1,b=4))
>  summary(Vfit0)

Formula: y ~ a * abs(x - m) + b

Parameters:
  Estimate Std. Error t value Pr(>|t|)    
m 11.03212    0.03891   283.5   <2e-16 ***
a  0.74747    0.01122    66.6   <2e-16 ***
b  5.01031    0.05792    86.5   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2827 on 97 degrees of freedom

Number of iterations to convergence: 4 
Achieved convergence tolerance: 2.878e-09

If you don't have good guesses at the parameters (or you want to completely automate it), good estimates are easy enough to construct for this problem.

If you square $y$ and subtract off the minimum (to almost get rid of the constant term involving $b$), the result is approximately quadratic in $x$ (the location of its minimum is not all that sensitive to accuracy in that step). The $x$-value at the minimum of that quadratic should still be near $m$.

The minimum of a quadratic ($a_2x^2+a_1x+a_0$, with $a_2>0$) will be at $x = -a_1/(2a_2)$.

So calculating that for a quadratic fit (in $x$) to $y^2-\min(y^2)$ should give a good estimate of $m$. Then, given that estimate $\hat m$, a linear regression of $y$ on $|x-\hat m|$ gives a good $a$ and $b$. So those should be nice starting values --

Here's how I did that in R:

qmin = function(v) -v[2]/v[3]/2

y0 = y^2-min(y^2)
m0 = qmin( lm( y0~x+I(x^2) )$coefficients ) # start estimate of m
stcoef = lm(y~abs(x-m0))$coefficients # to get start estimates of a and b

Vfit = nls(y~sign(x-m)*a*(x-m)+b,start=list(m=m0,a=stcoef[2],b=stcoef[1])) # sign(x-m)*(x-m) is abs(x-m)

The final fit was the same as above, so I won't repeat it, but the starting values were

         m         a        b
  11.14097 0.7536526 4.984033

which are much closer to the final estimates than my (quite sufficient) original "guess".
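As the answer notes, the same construction is easy in other environments. Here is a hedged Python/NumPy transcription of the start-value recipe (the data are re-simulated, since R's `set.seed` stream isn't reproducible from NumPy, so the numbers will differ slightly from the R run):

```python
import numpy as np

# Re-simulated data from y = 0.75*|x - 11| + 5 + noise, mirroring the R example
rng = np.random.default_rng(42)
x = rng.uniform(3, 21, 100)
y = 0.75 * np.abs(x - 11) + 5 + rng.normal(0, 0.3, 100)

# Quadratic fit (in x) to y^2 - min(y^2); the vertex -a1/(2*a2) estimates m
a2, a1, a0 = np.polyfit(x, y**2 - np.min(y**2), 2)  # highest degree first
m0 = -a1 / (2 * a2)

# Given m0, starting values for a and b come from an ordinary linear
# regression of y on |x - m0|
X = np.column_stack([np.abs(x - m0), np.ones_like(x)])
a_start, b_start = np.linalg.lstsq(X, y, rcond=None)[0]
```

These starting values would then be handed to whatever nonlinear least squares routine is available (`nls` in R, `lsqcurvefit` in Matlab).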

Answer from Glen_b on Stack Exchange