Using the csv module Using the pandas module
My previous post covered the basics of logistic regression. We must now examine the model to understand how well it fits the data and generalizes to other observations. The evaluation process involves the assessment of three distinct areas – goodness of fit, tests of individual predictors, and validation of predicted values – in order to … Continue reading Logistic Regression in R – Part Two
Logistic regression is used to analyze the relationship between a dichotomous dependent variable and one or more categorical or continuous independent variables. It specifies the likelihood of the response variable as a function of various predictors. The model is expressed as $\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$, where $\beta$ refers to the parameters and $x_i$ represents the independent variables. The $\log\left(\frac{p}{1-p}\right)$, or log … Continue reading Logistic Regression in R – Part One
The command line can be a scary place for people who are traditionally accustomed to using point-and-click mechanisms for executing tasks on their computer. While the idea of interacting with files and software via text may seem like a terrifying concept, the terminal is a powerful tool that can boost productivity and provide users with … Continue reading The Command Line is Your Friend: A Quick Introduction
I don’t normally work with personally identifiable information such as email addresses. However, the recent data dump from Ashley Madison got me thinking about how I’d examine a data set composed of email addresses. What are the characteristics of an email that I’d look to extract? How would I perform that task in R? Here’s some … Continue reading Examining Email Addresses in R
Not too long ago, I was on the job market looking for work as an applied statistician or data scientist within the online marketing industry. One thing I’ve come to expect with almost every company is some sort of homework assignment or challenge where a spreadsheet would be presented along with some guidelines on … Continue reading Homework during the hiring process…no thanks!
My previous post introduced the DOT language and how it can be used to create flowcharts. For part two, I sought to partially reproduce a more demanding visualization to highlight how Graphviz could be used. The original graphic was taken from the website for the Python scikit-learn library and provides a quick reference guide on … Continue reading Graphviz by Example: Part Two
Introduction Graphviz is an open-source software package developed by AT&T Labs for generating directed graphs and flowcharts. Outputs are created using DOT, a plain-text graph description language that is part of the Graphviz package. Graphviz is a powerful application that allows users to create appealing flowcharts without getting hung up on the layout or positioning of the nodes. … Continue reading Graphviz by Example: Part One