Both are done.
Least squares is easier, and the fact that for independent random variables "variances add" makes it considerably more convenient; for example, the ability to partition variances is particularly handy for comparing nested models. It's somewhat more efficient at the normal (least squares is maximum likelihood there), which might seem to be a good justification -- however, some robust estimators with high breakdown can have surprisingly high efficiency at the normal.
But L1 norms are certainly used for regression problems and these days relatively often.
If you use R, you might find the discussion in section 5 here useful:
https://socialsciences.mcmaster.ca/jfox/Books/Companion/appendices/Appendix-Robust-Regression.pdf
(though the material before it on M-estimation is also relevant, since L1 regression is also a special case of that)
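To make the robustness contrast in the answer above concrete, here is a minimal sketch (my addition, in Python rather than the R discussed above); the data and helper names are made up for illustration, and the L1 fit is approximated by iteratively reweighted least squares. A single gross outlier pulls the OLS slope well away from the true value while barely moving the L1 slope:

```python
import numpy as np

def fit_ols(X, y):
    # ordinary least squares via numpy's lstsq
    return np.linalg.lstsq(X, y, rcond=None)[0]

def fit_lad(X, y, n_iter=100, eps=1e-8):
    # approximate L1 (least absolute deviations) fit by iteratively
    # reweighted least squares: minimizing sum w_i * r_i^2 with
    # w_i = 1/|r_i| approximates minimizing sum |r_i|
    beta = fit_ols(X, y)
    for _ in range(n_iter):
        r = y - X @ beta
        sw = 1.0 / np.sqrt(np.maximum(np.abs(r), eps))  # sqrt of weights
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
X = np.column_stack([np.ones_like(x), x])        # intercept + slope
y = 2.0 + 0.5 * x + rng.normal(0, 0.2, 50)       # true slope 0.5
y[-1] += 30                                      # one gross outlier

b_ols = fit_ols(X, y)
b_lad = fit_lad(X, y)
# the outlier drags the OLS slope away from 0.5; the L1 slope stays close
print("OLS slope:", round(b_ols[1], 3), " L1 slope:", round(b_lad[1], 3))
```

In practice one would use a packaged routine (e.g. `rq` in R's quantreg, which the linked appendix discusses); the IRLS loop here is just the shortest self-contained way to show the idea.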
Answer from Glen_b on Stack Exchange

The original question: When deriving the coefficients for a linear regression, we minimize the sum of squared residuals. I am struggling to understand intuitively why. I know that squaring stops the negative residuals from cancelling out the positive ones. However, why not just use the absolute value? Other answers say it is because of mathematical convenience, and because squaring gives outliers a larger effect on the regression. Are these the true reasons, and are they valid? Why would squaring give outliers a larger effect? Thanks!
You can use the absolute value. This is called the L1 norm and is used for robust regression.
You can read more about it here: http://www.johnmyleswhite.com/notebook/2013/03/22/using-norms-to-understand-linear-regression/
Practically, the math is easier in ordinary least squares regression: you want to minimize the squared residuals, so you can take the derivative, set it equal to 0, and solve. It is easier to differentiate a polynomial than an absolute value, which is not differentiable at zero.
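To spell out the "take the derivative" step (a sketch I am adding, for the simplest one-parameter model through the origin, $y_i = \beta x_i + \varepsilon_i$):

$$\frac{d}{d\beta}\sum_i (y_i - \beta x_i)^2 = -2\sum_i x_i (y_i - \beta x_i) = 0 \quad\Longrightarrow\quad \hat\beta = \frac{\sum_i x_i y_i}{\sum_i x_i^2}.$$

By contrast, $\sum_i |y_i - \beta x_i|$ has a kink wherever a residual hits zero, so you cannot set a derivative to zero and solve in closed form.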
We can do absolute values for regression; it's called L1 regression, and people certainly use it. It's certainly not as convenient as ordinary least squares (L2) regression, but that's what computers are for.
What is substantially more convenient with least squares is the inference (hypothesis tests and CIs), but again that's not such an issue these days; computers can help deal with that too.
However, if the error distribution is close to normal, least squares will be substantially more efficient.
I can't help quoting from Huber, Robust Statistics, p. 10 on this (sorry, the quote is too long to fit in a comment):

"Two time-honored measures of scatter are the mean absolute deviation
$$d_n = \frac{1}{n}\sum_i |x_i - \bar{x}|$$
and the mean square deviation
$$s_n = \sqrt{\frac{1}{n}\sum_i (x_i - \bar{x})^2}.$$
[...] There was a dispute between Eddington (1914, p. 147) and Fisher (1920, footnote on p. 762) about the relative merits of $d_n$ and $s_n$. [...] Fisher seemingly settled the matter by pointing out that for normal observations $s_n$ is about 12% more efficient than $d_n$."

By the relation between the conditional mean and the unconditional mean, a similar argument applies to the residuals.
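Fisher's 12% figure can be checked with a quick simulation (my addition, not part of the quote, in Python): scale $d_n$ by $\sqrt{\pi/2}$ so that both estimators are consistent for $\sigma$ at the normal, then compare their sampling variances:

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 100, 20000
x = rng.normal(0.0, 1.0, size=(reps, n))     # normal samples, true sigma = 1

xbar = x.mean(axis=1, keepdims=True)
s_n = np.sqrt(((x - xbar) ** 2).mean(axis=1))             # root mean square deviation
d_n = np.abs(x - xbar).mean(axis=1) * np.sqrt(np.pi / 2)  # scaled to estimate sigma

# relative efficiency of d_n with respect to s_n at the normal:
# asymptotically (1/2) / (pi/2 - 1), about 0.88
eff = s_n.var() / d_n.var()
print(round(eff, 3))
```

A value near 0.88 means $s_n$ needs roughly 12% fewer observations than $d_n$ for the same precision, matching Fisher's point.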
Answer from Glen_b on Stack Exchange

The next question: I want to interpret the residuals but get them back on the scale of num_encounters.
You can easily calculate them:
mod <- lm(log(num_encounters) ~ log(distance)*sampling_effort, data=df)
res <- df$num_encounters - exp(predict(mod))
In addition to what @Roland suggests, which indeed is correct and works, my confusion was really just about basic high-school logarithm algebra.
Indeed the absolute response residuals (on the scale of the original dependent variable) can be calculated as @Roland says with
mod <- lm(log(num_encounters) ~ log(distance)*sampling_effort, data=df)
res <- df$num_encounters - exp(predict(mod))
If you want to calculate them from the model residuals, you need to take the logarithm subtraction rule into account:
log(a) - log(b) = log(a/b)
The residual is calculated on the scale of the original model. In my case, the model predicts log(num_encounters), so the residual is log(observed) - log(predicted).
What I was trying to do was
exp(resid) = exp(log(obs) - log(pred)) = exp(log(obs/pred)) = obs/pred
which is clearly not the number I was looking for. To get the response residual on the original scale from the model residual, this is what I needed:
obs - obs/exp(resid) = obs - obs/(obs/pred) = obs - pred
So in R code, this is what you could also do:
mod <- lm(log(num_encounters) ~ log(distance)*sampling_effort, data=df)
abs_resid <- df$num_encounters - df$num_encounters/exp(residuals(mod, type="response"))
This resulted in the same numbers as the method described by @Roland, which is of course much easier. But at least I got my brain lined up again.
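The identity above is easy to check numerically. A sketch in Python (the thread's code is R), using made-up data standing in for num_encounters and a simplified model without the sampling_effort term:

```python
import numpy as np

rng = np.random.default_rng(1)
distance = rng.uniform(1, 10, 40)
# made-up positive response playing the role of num_encounters
num_encounters = np.exp(3.0 - 0.8 * np.log(distance) + rng.normal(0, 0.1, 40))

# fit log(y) ~ log(distance) by least squares
X = np.column_stack([np.ones_like(distance), np.log(distance)])
beta, *_ = np.linalg.lstsq(X, np.log(num_encounters), rcond=None)
log_pred = X @ beta
resid = np.log(num_encounters) - log_pred            # residuals on the log scale

# response-scale residuals, two ways
r1 = num_encounters - np.exp(log_pred)                  # obs - pred
r2 = num_encounters - num_encounters / np.exp(resid)    # obs - obs/exp(resid)
print(bool(np.allclose(r1, r2)))
```

Both formulas give the same response-scale residuals, confirming the algebra in the answer.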
I am learning data science through ISLR (page 62). Why do we use RSS $= e_1^2 + e_2^2 + e_3^2 + \dots$ rather than $|e_1| + |e_2| + |e_3| + \dots$, since the absolute value is the actual distance? Will squaring not skew the results?