Blog Posts

Is There A Shortage of Recruiters? A Look at H1B Proposals

While recently looking at submissions for H1B visas, I noticed that a not insignificant number of companies had submitted proposals to receive visas for persons who would be working as recruiters. While I know very little about the labor market for recruiters, I wanted to examine that data to see which companies were applying for Continue reading Is There A Shortage of Recruiters? A Look at H1B Proposals

The Rise of Powerlifting in the United States: A Data-Driven Look at Trends and Patterns

If you are looking to hire an analytics professional, please send me a message at [email protected]. I am looking for new opportunities and am available to start ASAP. Powerlifting in the United States has seen explosive growth in recent years, both in popularity and participation. National federations like USA Powerlifting have reported a sharp increase in membership, with Continue reading The Rise of Powerlifting in the United States: A Data-Driven Look at Trends and Patterns

SQL Cheat Sheet

I’ve been putting together a basic SQL cheat sheet that can be used as a reference guide. Here is a series of common procedures that should be of use to anyone who uses SQL to extract data. No explanations are provided, as they should largely be known to the end user.

Powerlytics: Impact of Age, Gender, and Body Weight on Total Weight Lifted in Powerlifting Meets

A. Background The Open Powerlifting initiative attempts to create an accurate and open archive of all powerlifting meet data throughout the world. As someone who recently started competing again after a six-year break from powerlifting, I often mess around with the Open Powerlifting data as it’s of personal interest. Most of the analysis that Continue reading Powerlytics: Impact of Age, Gender, and Body Weight on Total Weight Lifted in Powerlifting Meets

Turning Data Into Awesome With sqldf and pandasql

Both R and Python possess libraries for using SQL statements to interact with data frames. While both languages have native facilities for manipulating data, the sqldf and pandasql libraries provide a simple and elegant interface for conducting tasks using an intuitive framework that’s widely used by analysts. R and sqldf sqldf("SELECT COUNT(*) FROM Continue reading Turning Data Into Awesome With sqldf and pandasql

Packages for Getting Started with Time Series Analysis in R

A. Motivation During the recent RStudio Conference, an attendee asked the panel about the tidyverse's lack of support for time series data. As someone who has spent the majority of their career on time series problems, I found this somewhat surprising because R already has a great suite of tools for Continue reading Packages for Getting Started with Time Series Analysis in R

Extract Google Trends Data with Python

Anyone who has regularly worked with Google Trends data has had to deal with the slightly tedious task of grabbing keyword-level data and reformatting the spreadsheet provided by Google. After looking for a seamless way to pull the data, I came upon the PyTrends library on GitHub, and sought to put together some quick Continue reading Extract Google Trends Data with Python

Writing Functions in R: Example One

A. Background In previous posts, I covered a number of useful functions and packages for writing reusable code. I wanted to expand on that information by providing a working example of how to put together a function. In particular, I will walk through the process of generating a function that executes evaluation of a time Continue reading Writing Functions in R: Example One