Examining the Tweeting Patterns of Prominent Crossfit Gyms

A. Introduction The growth of Crossfit has been one of the biggest developments in the fitness industry over the past decade. Promoted as both a physical exercise philosophy and also as a competitive fitness sport, Crossfit is a high-intensity fitness program incorporating elements from several sports and exercise protocols such as high-intensity interval training, Olympic weightlifting, plyometrics, powerlifting, gymnastics, strongman, and so forth. Now with over 10,000 Crossfit affiliated gyms (boxes) throughout the United States, the market has certainly become more saturated and gyms must initiate more unique marketing strategies to attract new members. In this post, I will investigate how some prominent Crossfit boxes are utilizing Twitter to engage with consumers. While Twitter is a great platform for news and entertainment, it is usually not the place for customer acquisition given the lack of targeted messaging. Furthermore, unlike platforms like Instagram,Twitter is simply not an image/video centric tool where followers can view accomplishments from their favorite fitness heroes, witness people

read more Examining the Tweeting Patterns of Prominent Crossfit Gyms

Packages for Getting Started with Time Series Analysis in R

A. Motivation During the recent RStudio Conference, an attendee asked the panel about the lack of support provided by the tidyverse in relation to time series data. As someone who has spent the majority of their career on time series problems, this was somewhat surprising because R already has a great suite of tools for visualizing, manipulating, and modeling time series data. I can understand the desire for a ‘tidyverse approved’ tool for time series analysis, but it seemed like perhaps the issue was a lack of familiarity with the available toolage. Therefore, I wanted to put together a list of the packages and tools that I use most frequently in my work. For those unfamiliar with time series analysis, this could a good place to start investigating R’s current capabilities. B. Background Time series data refers to a sequence of measurements that are made over time at regular or irregular intervals with each observation being a single dimension. An

read more Packages for Getting Started with Time Series Analysis in R

Data.Table by Example – Part 3

For this final post, I will cover some advanced topics and discuss how to use data tables within user generated functions. Once again, let’s use the Chicago crime data. Let’s start by subseting the data. The following code takes the first 50000 rows within the dat dataset, selects four columns, creates three new columns pertaining to the data, and then removes the original date column. The output was saved as to new variable and the user can see the first few columns of the new data table using brackets or head function. We can now do some intermediate calculations and suppress their output by using braces. In this simple sample case, we have taken the mean value1 for each month, subtracted it from value2, and then squared that result. The output show the final calculation in the brackets, which is the result from squaring. Note that I also ordered the results by month with the chaining process. That’s all very

read more Data.Table by Example – Part 3

Data.Table by Example – Part 2

In part one, I provided an initial walk through of some nice features that are available within the data.table package. In particular, we saw how to filter data and get a count of rows by the date. Let us now add a few columns to our dataset on reported crimes in the city of Chicago. There are many ways to do do this but they involve the use of the := operator. Since data.table updates values by reference, we do not need to save the results as another variable. This is a very desirable feature. You can also just use the traditional base R solution for adding new columns as data.tables are also data frames. In any case, we now have three new columns with randomly selected values between 1 and 50. We can now look to summarize these values and see how they differ across the primary arrest type and other categorical variables. The above code allows us to

read more Data.Table by Example – Part 2

Data.Table by Example – Part 1

For many years, I actively avoided the data.table package and preferred to utilize the tools available in either base R or dplyr for data aggregation and exploration. However, over the past year, I have come to realize that this was a mistake. Data tables are incredible and provide R users with a syntatically concise and efficient data structure for working with small, medium, or large datasets. While the package is well documented, I wanted to put together a series of posts that could be useful for those who want to get introduced to the data.table package in a more task oriented format. For this series of posts, I will be working with data that comes from the Chicago Police Department’s Citizen Law Enforcement Analysis and Reporting system. This dataset contains information on reported incidents of crime that occured in the city of Chicago from 2001 to present. You can use the wget command in the terminal to export it as

read more Data.Table by Example – Part 1