Author: Matt Needham

Blogging with Flask

Right before the NBA season began, I laid out my predictions for how many wins each team would achieve this season. To track the progress of these predictions as well as the overall NBA rankings, I planned to blog on a weekly basis. While I could have gone with a standard WordPress blog (like this one), I wanted to easily run and access Python code. To do that, I created a blog with Flask: MNBA.

Flask is “a microframework for Python”. It is my favorite way to integrate Python into a website. The tutorial they offer, Flaskr, walks through creating a mini-blog, so setting up the foundations of my website MNBA was essentially an adaptation of that tutorial. For hosting I use PythonAnywhere.com, which is cheap and does a great job of hosting Flask-based websites. The layout comes from a publicly available template that uses the Pure.css style.

I have a Python script that runs each week. It pulls current data about NBA teams (such as wins, losses, points scored, and points allowed), runs some simple calculations (to get my desired statistics, such as Pythagorean expected wins), and writes this data to a MySQL database. Then I use Matplotlib to graph recent changes in expected wins. Each post pairs a short writeup of my thoughts on whatever is going on in the NBA with a table view of the MySQL data for that week.
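A rough sketch of the Pythagorean expected-wins calculation is below. The function name, the made-up point totals, and especially the exponent of 14 are illustrative assumptions (values in that neighborhood are commonly quoted for basketball), not necessarily what the weekly script uses.

```python
def pythagorean_wins(points_for, points_against, games=82, exponent=14.0):
    """Expected wins over `games` from total points scored and allowed.

    The exponent is an assumption; different analysts use different values.
    """
    scored = points_for ** exponent
    allowed = points_against ** exponent
    win_pct = scored / (scored + allowed)
    return games * win_pct


# Made-up season-to-date totals for one team.
expected = pythagorean_wins(points_for=2200, points_against=2100)
print(round(expected, 1))
```

A team that scores exactly as many points as it allows comes out at half the schedule, 41 wins over 82 games, whatever exponent is chosen.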

I have created a few other Flask sites I plan to write about in the future. While there was a bit of a learning curve because it is different from a simple Python script on your computer, after some experience with Flask it is easy to get a new site up and running.

Creating a Google Calendar training plan

My wife and I are both training for a triathlon right now. She found a training plan online that she wanted to follow, but thought she would actually stick to it if it was on her Google Calendar. With a little Python, I was able to do this in a couple of hours.

The first step was getting the training plan from the web. This was fairly straightforward to accomplish with BeautifulSoup, a popular Python package for web scraping (extracting data from a web site). I then moved the data into a Pandas DataFrame, which I find easy to work with. The data came in a daily format, even though some days had two workouts at two different times. So I parsed through the data to look for “a.m.” or “p.m.” to split the workouts into their respective times.
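Here is a minimal sketch of that a.m./p.m. split using only the standard library. The layout of the day text ("a.m.: ... p.m.: ...") and the function name are assumptions for illustration; the real page may have been formatted differently.

```python
import re

# Matches an "a.m." or "p.m." marker plus any colon or spaces after it.
MARKER = re.compile(r"\b([ap])\.m\.[:\s]*", re.IGNORECASE)


def split_workouts(day_text):
    """Return a list of (time_of_day, workout) pairs for one day's text.

    Text with no marker comes back as a single pair with time_of_day None.
    """
    parts = MARKER.split(day_text)
    workouts = []
    lead = parts[0].strip()
    if lead:
        workouts.append((None, lead))
    # After a split on one capturing group, odd indexes hold the captured
    # letter ("a" or "p") and the following index holds the workout text.
    for i in range(1, len(parts), 2):
        label = parts[i].lower() + ".m."
        workouts.append((label, parts[i + 1].strip()))
    return workouts


# Hypothetical day entry with two workouts.
print(split_workouts("a.m.: Swim 1,500 yards. p.m.: Run 30 minutes."))
```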

There were some oddities in how the page was laid out. For example, the text of a swim workout would refer to a warmup type by number, and the numbered warmups were listed at the bottom of the page. I had to make a few adjustments to handle quirks like this, but it was not too much work.

After exporting this into a CSV file and opening it in Excel, I added a column for the date (setting the race day appropriately and working backwards). Morning and evening workouts were given times of 6am and 5pm respectively. Google makes the final step of creating a new calendar from this CSV file easy.
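I did the date column by hand in Excel, but the same step could be sketched in Python. The example below assumes each workout is stored as a number of days before the race, and it writes the Subject, Start Date, and Start Time columns that Google Calendar's CSV import reads. The race day, the sample workouts, and the function names are made up for illustration.

```python
import csv
import io
from datetime import date, timedelta

# Morning workouts go at 6am and evening workouts at 5pm.
START_TIMES = {"a.m.": "6:00 AM", "p.m.": "5:00 PM"}
FIELDS = ["Subject", "Start Date", "Start Time"]


def build_calendar_rows(workouts, race_day):
    """workouts: list of (days_before_race, time_of_day, description)."""
    rows = []
    for days_before, time_of_day, description in workouts:
        day = race_day - timedelta(days=days_before)  # work backwards
        rows.append({
            "Subject": description,
            "Start Date": day.strftime("%m/%d/%Y"),
            "Start Time": START_TIMES[time_of_day],
        })
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical race day and plan entries.
rows = build_calendar_rows(
    [(2, "a.m.", "Swim 1,500 yards"), (2, "p.m.", "Run 30 minutes")],
    race_day=date(2018, 6, 10),
)
print(to_csv(rows))
```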

In about two hours, I was able to turn this web page training plan into a Google Calendar!

 

Playing with NBA stats

I am a huge fan of the NBA and typically take an analytics approach to sports. I have my own NBA blog where my main focus this season has been on projecting teams’ win totals. In the future I plan to blog here about how I built that website and my win projection tools with Python. This post is about a smaller project I recently put together.

This project was my first time using nba_py, a Python wrapper for the NBA.com API. This means I can directly access stats from NBA.com in my Python scripts. Just as a matter of curiosity, I was wondering: which player had the best game against each team? For example, which player had the best game against the elite Houston Rockets? Who had the best game against the injury-laden Memphis Grizzlies? As a measure of best game I used Game Score, a statistic created by basketball analytics pioneer John Hollinger. Game Score attempts to put a single number on a player’s contribution across major stats like points, assists, rebounds, etc. Here are the all-time best Game Scores.
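For reference, here is Hollinger's Game Score formula written as a small Python function. The argument names and the sample stat line are my own for illustration; they are not the column names returned by nba_py.

```python
def game_score(*, pts, fgm, fga, ftm, fta, oreb, dreb, stl, ast, blk, pf, tov):
    """Hollinger's Game Score for a single player's box-score line."""
    return (
        pts
        + 0.4 * fgm
        - 0.7 * fga
        - 0.4 * (fta - ftm)  # penalty for missed free throws
        + 0.7 * oreb
        + 0.3 * dreb
        + stl
        + 0.7 * ast
        + 0.7 * blk
        - 0.4 * pf
        - tov
    )


# Made-up stat line: 30 points on 10-of-20 shooting and 8-of-10 free throws.
line = dict(pts=30, fgm=10, fga=20, ftm=8, fta=10, oreb=2, dreb=8,
            stl=2, ast=5, blk=1, pf=3, tov=4)
print(round(game_score(**line), 1))
```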

As a first run, I used last season’s data. Using nba_py, I went through every game and calculated the Game Score for each player. Then, I set aside the best Game Score from each game along with the player and opponent. With this list of the best Game Scores, I then went through each team as the opponent and set aside the best Game Score. Below is the list of the best Game Scores against each team. What really stood out to me was how Russell Westbrook had the best Game Score against five different teams – no wonder he won the MVP award!

GAME_SCORE  PLAYER_NAME            OPPONENT
54.5        Devin Booker           BOS
51.5        Jimmy Butler           CHA
48.7        James Harden           NYK
48.6        Damian Lillard         UTA
46.9        James Harden           PHI
46.2        Klay Thompson          IND
45.6        Russell Westbrook      DEN
44.2        Russell Westbrook      ORL
43.6        Russell Westbrook      POR
43.2        Isaiah Thomas          MEM
42.5        Kyrie Irving           ATL
42.4        James Harden           CLE
41.4        Russell Westbrook      PHX
41.0        Damian Lillard         MIA
40.1        Kevin Durant           OKC
39.9        Kyrie Irving           NOP
39.7        Anthony Davis          GSW
39.0        Anthony Davis          LAL
38.2        Eric Bledsoe           TOR
38.0        Giannis Antetokounmpo  WAS
37.5        Kawhi Leonard          HOU
37.4        Stephen Curry          LAC
37.3        Giannis Antetokounmpo  CHI
36.4        Anthony Davis          MIN
35.2        Damian Lillard         DAL
34.5        Brook Lopez            MIL
34.0        Marc Gasol             DET
33.7        Jimmy Butler           BKN
33.0        Russell Westbrook      SAC
30.8        Nikola Jokic           SAS
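The "best Game Score against each team" step can be sketched in plain Python as below. This is an illustration rather than my actual script. The first two sample rows are taken from the table above, while the "Player A" and "Player B" rows are made up to show lower scores being discarded.

```python
def best_game_per_opponent(games):
    """games: iterable of (game_score, player_name, opponent) tuples.

    Returns one tuple per opponent, keeping only the highest Game Score
    recorded against that opponent, sorted from highest score to lowest.
    """
    best = {}
    for score, player, opponent in games:
        if opponent not in best or score > best[opponent][0]:
            best[opponent] = (score, player)
    return sorted(
        ((score, player, opponent) for opponent, (score, player) in best.items()),
        reverse=True,
    )


games = [
    (31.2, "Player A", "BOS"),      # made-up row
    (54.5, "Devin Booker", "BOS"),
    (48.7, "James Harden", "NYK"),
    (29.9, "Player B", "NYK"),      # made-up row
]
for row in best_game_per_opponent(games):
    print(*row)
```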

Searching lots of code for one small line

After some reshuffling of my team at work, I ended up being responsible for our websites. While most of the day-to-day work was simply making edits through the WordPress interface, this also meant maintaining the back-end code that had been developed by contractors. I did not have a good sense of the architecture or structure of the source code and was not in touch with the original developers.

By inspecting the site I could see which element or function I wanted to edit, but I did not always know which file I would find it in. I put together a little Python script to help me out.

It’s a function that takes the string you are looking for, for example a PHP function or CSS class, and a file extension to search through, for example “.php”. Run it in the parent directory you want to start searching through and it will look through all files, including files in subdirectories. It then returns a list of all files containing your string.

Check it out on my GitHub.
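For a sense of the approach, here is a minimal sketch of that kind of search using only the standard library. It is not the script from my GitHub; the function name and parameters are illustrative.

```python
import os


def find_files_containing(search_string, extension, root="."):
    """Return paths under `root` that end in `extension` and contain the string."""
    matches = []
    # os.walk visits `root` and every subdirectory below it.
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(extension):
                continue
            path = os.path.join(dirpath, filename)
            # Ignore undecodable bytes so one odd file does not stop the search.
            with open(path, encoding="utf-8", errors="ignore") as handle:
                if search_string in handle.read():
                    matches.append(path)
    return matches


if __name__ == "__main__":
    # Hypothetical search: every .php file mentioning a made-up function name.
    for path in find_files_containing("my_theme_setup", ".php"):
        print(path)
```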

Smart Pardot lists

My first real Python project solved an integration problem. At IHS we use Pardot as our email marketing platform because it is a product of Salesforce. It neatly connects with the Salesforce database which keeps records clean and avoids duplicate records.

While Pardot can access data on individuals in Salesforce, it can’t access all data. My colleagues and I found Pardot could not access information highly relevant to how we were emailing.

In Salesforce, we represent our programs and events as “campaigns” and the individuals tied to them are “campaign members”. When emailing folks about an upcoming program, Pardot is able to automatically pull the list of campaign members tied to a particular campaign. Cool! But it can’t access a field, “type”, which we use to distinguish between attendees, speakers, staff, etc.

So I wrote a workaround. You can find it here on GitHub. Included in the Readme is sample code. All you need to do is import the function, pass in connections to Pardot & Salesforce, your Salesforce object & query, and your Pardot list.

Here are the nuts and bolts of how it works:

  1. Connect to Salesforce via simple-salesforce, a Python package for Salesforce’s REST API
  2. Use simple-salesforce to run a query that finds the individuals we are looking for in Salesforce and pulls their Pardot IDs. For example, speakers for all upcoming weekend seminars
  3. Connect to Pardot via pypardot4, a Python package for Pardot’s API. I will write more about this in a future post, but I took this existing package and upgraded it for compatibility with the latest version of the API
  4. Create a Pardot list using the list of Pardot IDs pulled from Salesforce
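A rough sketch of steps 2 and 4, not the actual code on GitHub. It assumes the Salesforce connection behaves like simple-salesforce's client, whose query_all() returns a dict with a "records" list. The field name Pardot_Id__c, the query, and the add_to_list callable (standing in for whatever pypardot4 call adds a prospect to a list) are hypothetical. A stand-in class replaces the real connection so the sketch runs without credentials.

```python
def pardot_ids_from_salesforce(sf, soql, id_field):
    """Step 2: run a SOQL query and collect the Pardot ID on each record."""
    result = sf.query_all(soql)
    # Skip records that have no Pardot ID.
    return [rec[id_field] for rec in result["records"] if rec.get(id_field)]


def fill_pardot_list(pardot_ids, add_to_list):
    """Step 4: add each ID to the list via a caller-supplied function."""
    for pardot_id in pardot_ids:
        add_to_list(pardot_id)
    return len(pardot_ids)


class FakeSalesforce:
    """Stand-in for a real connection; returns made-up records."""

    def query_all(self, soql):
        return {"records": [
            {"Name": "Speaker One", "Pardot_Id__c": "101"},
            {"Name": "Speaker Two", "Pardot_Id__c": None},
            {"Name": "Speaker Three", "Pardot_Id__c": "103"},
        ]}


added = []  # stands in for the Pardot list
ids = pardot_ids_from_salesforce(
    FakeSalesforce(),
    "SELECT Name, Pardot_Id__c FROM CampaignMember",  # hypothetical query
    id_field="Pardot_Id__c",
)
fill_pardot_list(ids, added.append)
print(added)
```

Passing the connection and the add function in as arguments keeps the logic easy to test without touching either API.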

There you have it. You can schedule it to run often to ensure the data is up to date. While I created this script to access one Salesforce field not accessible in Pardot, it basically opens up any Salesforce field. For example, I have queried a Salesforce custom object for applications and pushed the data into fields on Pardot prospects.

As an aside, it is baffling to me that you can’t access Salesforce Campaign and Campaign Member data in Pardot. But that’s just one of my many complaints about how much stronger the data integration should be between the two products.

What I do with Python

Over the past year I have become an everyday user of the programming language Python. I use it both at work and for personal projects. At work, I’m responsible for marketing analytics and primarily deal with the Salesforce database and Salesforce’s marketing automation platform Pardot. I use Python to solve problems not possible out of the box. Outside of work, I mainly use Python to analyze sports statistics.

I am not an expert programmer but I have programmed on and off for over a decade. I find Python easy to use, adept at data manipulation, and effective for accessing APIs. APIs are methods for communicating with particular services or products. For example, I built a tool at work that uses APIs to get data from accounts on YouTube, Google Analytics, Facebook, Twitter, and Pardot on a regular basis.

I first learned the basics of Python for free on Codecademy and then learned more by completing the Data Science with Python track on DataCamp, which cost me something like $60. The Codecademy course was a fine introduction, although it wasn’t enough knowledge to do any projects truly on your own. More difficult for me than learning the actual Python language was learning how to set up a working environment. For most projects I now use Thonny, which I highly recommend to other beginners. For web apps and scripts that run daily, I use PythonAnywhere.

While the DataCamp course involved lots of applied learning, taking on projects at work and for fun that use Python was what really solidified my knowledge. Here is a list of projects I have completed with Python:

I’m planning to keep blogging weekly to dive deeper into each of the projects and the various tools I have used along the way.
