Applied Statistical Theory: Quantile Regression

This is part two of the ‘applied statistical theory’ series that will cover the bare essentials of various statistical techniques. As analysts, we need to know enough about what we’re doing to be dangerous and explain approaches to others. It’s not enough to say “I used X because the misclassification rate was low.”

Standard linear regression summarizes the average relationship between a set of predictors and the response variable. $latex \beta_1 $ represents the change in the mean value of $latex Y $ given a one unit change in $latex X_1 $. A single slope describes the relationship, so linear regression provides only a partial view of the link between the response variable and the predictors. This is often inadequate when the variance of $latex Y $ is heterogeneous across values of $latex X $. In such cases, we need to examine how the relationship between $latex X $ and $latex Y $ changes across the conditional distribution of $latex Y $. For example, the impact of education on income may be more pronounced for those at higher income levels than for those at lower income levels. Likewise, the effect of prenatal care on mean infant birth weight can be compared to its effect on other quantiles of infant birth weight.

Quantile regression addresses these problems by modeling the different quantiles of the response. The parameter estimates for this technique represent the change in a specified quantile of the response variable produced by a one unit change in the predictor variable. One major benefit of quantile regression is that it does not require the errors to follow a particular distribution, such as the normal.
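
To see what “a specified quantile” means mechanically: for a quantile level $latex \tau $ between 0 and 1, quantile regression picks the coefficients that minimize $latex \sum_i \rho_\tau(y_i - x_i^\top \beta) $, where $latex \rho_\tau(r) = r(\tau - 1\{r < 0\}) $ is the check loss. The symbols $latex \tau $ and $latex \rho_\tau $ are notation introduced for this sketch. With no predictors at all, the minimizer is just a sample quantile, which the rough base R sketch below checks on the mpg column of mtcars. The helper names check_loss and total_loss and the choice of $latex \tau = 0.3 $ are made up for illustration.

```r
# Check (pinball) loss for quantile level tau: positive residuals are
# weighted by tau, negative residuals by (1 - tau).
check_loss <- function(r, tau) {
  r * (tau - (r < 0))
}

# Illustrative data: the mpg column from the built-in mtcars data set.
y <- mtcars$mpg
tau <- 0.3  # arbitrary quantile level, chosen only for this sketch

# Total check loss when a single constant q is used as the prediction.
total_loss <- function(q) sum(check_loss(y - q, tau))

# With no predictors, a minimizer is always attained at an observed value,
# so it is enough to evaluate the loss at each distinct data point.
candidates <- sort(unique(y))
q_hat <- candidates[which.min(sapply(candidates, total_loss))]

q_hat                       # constant that minimizes the check loss
quantile(y, tau, type = 1)  # inverse-ECDF sample quantile, for comparison
```

The two printed values should agree. Adding predictors replaces the constant q with a linear function of X, which is the problem rq() solves.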

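library(quantreg)

head(mtcars)

frmla <- mpg ~ .
u <- seq(.02, .98, by = .02)

mm_all <- rq(frmla, data = mtcars, tau = u)    # for a series of quantiles
mm_med <- rq(frmla, data = mtcars, tau = 0.50) # for the median

# Coefficient table for the median fit, with bootstrapped standard errors
summary(mm_med, se = "boot")

# plot() needs the summary of the multi-quantile fit; called on a
# single-quantile summary it stops with an xy.coords error
summ <- summary(mm_all, se = "boot")
plot(summ)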
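
With 32 rows and ten predictors, the full model is hard to read at a glance. As a smaller sketch of how to interpret the estimates, the block below fits a single predictor at three quantile levels and sets the slopes next to the least squares slope. The one-predictor formula, the three levels, and the object names taus, fit and cf are choices made here for illustration, not part of the analysis above.

```r
library(quantreg)

# Assumption for illustration: a single predictor (wt, weight in 1000 lbs)
# instead of the full mpg ~ . model, and three arbitrary quantile levels.
taus <- c(0.25, 0.50, 0.75)
fit <- rq(mpg ~ wt, data = mtcars, tau = taus)

# coef() on a multi-quantile fit returns one column per quantile level;
# each wt entry is the estimated change in that quantile of mpg for a
# one-unit (1000 lb) increase in weight.
cf <- coef(fit)
cf

# Ordinary least squares slope, for comparison with the quantile slopes.
coef(lm(mpg ~ wt, data = mtcars))
```

If the wt slopes differ noticeably across the three columns, that is the kind of pattern a single mean slope would hide.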

6 thoughts on “Applied Statistical Theory: Quantile Regression”

  1. > plot(summ)
    Error in xy.coords(x, y, xlabel, ylabel, log) :
    ‘x’ is a list, but does not have components ‘x’ and ‘y’

    1. When you specify a single quantile, it’s not possible to see how the relationship between X and Y changes across quantiles. There’s only one quantile after all, not a sequence that covers the range of possible values taken by Y.

      Try just this:

      frmla <- mpg ~ .
      u=seq(.02,.98,by=.02)
      mm = rq(frmla, data=mtcars, tau=u) # for a series of quantiles
      summ <- summary(mm, se = "boot")
      plot(summ)

  2. Your plot doesn’t work here. It produces:
    Error in xy.coords(x, y, xlabel, ylabel, log) :
    ‘x’ is a list, but does not have components ‘x’ and ‘y’
