Using the Best Tool for the Job

Automation with Bash and SQLite

As programmers, we often have a go-to language or library that we want to use for every problem. This is especially true of data analysts and data scientists whose backgrounds may not be in computer science. In fact, I'll be the first to admit I wish everything could be done …

more ...

Using SQLite at Home Pt 2: Better alternatives for data exploration

DB4S Plot

This post is the second in a two-part series on setting up and using SQLite at home (read Part 1 here). I was inspired to address this topic by the number of posts on the Codecademy Forums from learners asking about using SQLite locally on their own computers. Although I feel Codecademy is a great way to get started learning a new programming language, one of its weak points has always been transitioning its learners to coding offline. Hopefully these posts will serve as a definitive reference for getting beginners set up to explore and interact with SQLite databases on their own computers. more ...


Using SQLite at Home Pt 1: CLI

Example SQLite CLI

This post is the first in a two-part series on setting up and using SQLite at home (read Part 2 here). I was inspired to address this topic by the number of posts on the Codecademy Forums from learners asking about using SQLite locally on their own computers. Although I feel Codecademy is a great way to get started learning a new programming language, one of its weak points has always been transitioning its learners to coding offline. Hopefully these posts will serve as a definitive reference for getting beginners set up to explore and interact with SQLite databases on their own computers. more ...


Code, Coronavirus and Canceled School


Let's face it: for many of us parents, the coronavirus (COVID-19) pandemic couldn't have come at a worse time. Not only do we have a legitimate fear of dying every time we step out of the house into a public place, we also have to face the reality that schools around the U.S. will be closed for over a month more ...


NYC SAT Analysis Pt 2: Visualization and Analysis

Tableau Visualization

For the second half of this project, our goal is to understand how differences in school, borough and district relate to a school's mean SAT score and whether these differences actually have any meaningful correlation to the test scores. We will utilize several visualizations, some basic correlation analysis and hypothesis testing to help us discover trends and examine the significance of those trends.

more ...

NYC SAT Analysis Pt 1: Data Cleaning

Data Table

This project takes a look at the 2012 SAT scores from New York City high schools, and explores insights and correlations found in related data. For ease of reading and to allow for a more in-depth discussion of the data cleaning processes, this project is split into two separate Jupyter notebooks: the first for data cleaning/preparation, and the second covering visualization and analysis.

more ...

First Post


I've never been an avid blogger. Perhaps because it's tedious to come up with material for a general-purpose blog, or perhaps because I never felt I had any novel material to contribute to society as a whole. The deeper I've delved into the world of data science, however, the more I've seen how integral blogging has become to the expansion and success of this field. Blog posts are where machine learning pioneers first share their findings, and simultaneously where budding data scientists look for others in their same position: students floundering their way through the ever-expanding technological and mathematical barriers to entry. More than once, I've found myself in the same shoes: scouring blogs for tips on career changes, statistical concepts, applicable code and success stories.

Now, more than two years after my initial foray into the world of Python, I'm taking the advice of fast.ai more ...