Semiparametric Regression in R

A. INTRODUCTION When building statistical models, the goal is to define a compact and parsimonious mathematical representation of some data-generating process. Many of these techniques require that one make assumptions about the data or how the analysis is specified. For example, Autoregressive Integrated Moving Average (ARIMA) models require that the time series is …

Packages for Getting Started with Time Series Analysis in R

A. Motivation During the recent RStudio Conference, an attendee asked the panel about the tidyverse's lack of support for time series data. As someone who has spent the majority of their career on time series problems, I found this somewhat surprising because R already has a great suite of tools for …

Examining Website Pathing Data Using Markov Chains

A Markov model can be used to examine a stochastic process describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Let’s define a stochastic process that takes on a finite number of possible values, which are nonnegative integers. Each state represents its value …

Statistical Reading Rainbow

For those of us who received statistical training outside of statistics departments, that training often emphasized procedures over principles. This meant that we learned about various statistical techniques and how to perform analyses in a particular statistical software package, but glossed over the mechanisms and mathematical statistics underlying these practices. While that training methodology (hereafter referred to …

Weekly R-Tips: Visualizing Predictions

Let’s say that we estimated a linear regression model on time series data with lagged predictors. The goal is to estimate sales as a function of inventory, search volume, and media spend from two months ago. After using the lm function to perform linear regression, we predict sales using values from two months ago. If …

Applied Statistical Theory: Quantile Regression

This is part two of the ‘applied statistical theory’ series that will cover the bare essentials of various statistical techniques. As analysts, we need to know enough about what we’re doing to be dangerous and explain approaches to others. It’s not enough to say “I used X because the misclassification rate was low.” Standard linear …

Applied Statistical Theory: Belief Networks

Applied statistical theory is a new series that will cover the basic methodology and framework behind various statistical procedures. As analysts, we need to know enough about what we’re doing to be dangerous and explain approaches to others. It’s not enough to say “I used X because the misclassification rate was low.” At the same …