Best performance of all time? — An analysis of NBA RAPTOR player data using Databricks (part 1)

Chris Borg
Dec 22, 2021

Recently, FiveThirtyEight published a look into their new NBA statistic, the RAPTOR metric. Because it is presented as the “new metric for the modern NBA,” I was curious how useful it would be for analyzing historical NBA player data. Conveniently, FiveThirtyEight also publishes RAPTOR data going all the way back to 1976. Let’s dive in.

To start with something straightforward, I asked two questions: what were the best single-season performances of all time, and are NBA players better now than they were in the past?

Hypothesis: Steph Curry just made his 2,974th career 3-pointer, setting a new league record. If records like this are still falling, the best players today are likely to be better than the players who came before them.

To perform this analysis, I chose Databricks, partly as an opportunity to learn PySpark.

1. I signed up for the Community Edition of Databricks and spun up a cluster.

Figure 1: Initializing our Databricks cluster is as simple as giving it a name (elvis).

2. I uploaded nba_raptor.csv (found here) to the file system.

3. I loaded the CSV into a PySpark DataFrame and applied a few operations:

  • drop any NaN values
  • filter on players who have played more than 1000 minutes in a season (this was a quick way to filter out RAPTOR scores of players who only played a few games in a season)
  • order by raptor_total (our performance metric), highest first, and limit to the top 200 player seasons.
  • Note: The DataFrame that is returned includes some player-season combinations that align with our intuition (Steph in 2016, MJ in ’91)
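The cleaning steps above can be sketched in plain Python so they run without a cluster. This is a minimal stand-in, not the notebook code from the analysis: the column names (player_name, season, mp, raptor_total) are assumptions about the CSV layout, and the rows are made-up toy values. In PySpark, the equivalent steps would likely map onto the DataFrame methods dropna, filter, orderBy, and limit.

```python
import csv
import io
import math

# Toy rows with made-up values; the column names are assumptions
# about the CSV layout, not taken from the real file.
SAMPLE_CSV = """player_name,season,mp,raptor_total
Player A,1991,3000,9.5
Player B,2016,2700,10.1
Player C,2005,400,12.0
Player D,1988,2500,
Player E,2019,1500,4.2
"""


def top_seasons(csv_text, min_minutes=1000, limit=200):
    """Return the highest-rated player seasons after cleaning and filtering."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            minutes = float(row["mp"])
            raptor = float(row["raptor_total"])
        except (TypeError, ValueError):
            continue  # drop rows with missing or unparseable values
        if math.isnan(minutes) or math.isnan(raptor):
            continue  # drop explicit NaN values
        if minutes <= min_minutes:
            continue  # keep only seasons with more than min_minutes played
        rows.append({
            "player_name": row["player_name"],
            "season": int(row["season"]),
            "mp": minutes,
            "raptor_total": raptor,
        })
    # Highest raptor_total first, then keep the top `limit` seasons.
    rows.sort(key=lambda r: r["raptor_total"], reverse=True)
    return rows[:limit]


top = top_seasons(SAMPLE_CSV)
for r in top:
    print(r["player_name"], r["season"], r["raptor_total"])
```

Note that Player C has the highest score in the toy data but is removed by the minutes filter, which is exactly the small-sample case the 1000-minute cutoff is meant to exclude.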

4. To compare player performances, I used PCA to reduce the feature space to 2 dimensions so that each player season could be plotted as a single point.
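For readers new to PCA, here is a rough sketch of the idea in plain Python: center the features, build the covariance matrix, find its two strongest directions, and project each row onto them. Everything in it is an assumption made for illustration: the post does not say which features or which PCA implementation were used (on Databricks, pyspark.ml.feature.PCA would be the more likely choice), the toy feature rows are made up, and the features are centered but not rescaled.

```python
import math


def pca_2d(rows, iters=500):
    """Project rows (lists of floats) onto their first two principal components."""
    n, d = len(rows), len(rows[0])
    # Center each feature column on zero.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(x[a] * x[b] for x in centered) / (n - 1) for b in range(d)]
           for a in range(d)]

    def top_eigenvector(m):
        # Power iteration: repeatedly apply m and renormalize.
        # The uneven start vector avoids being orthogonal to common directions.
        v = [float(j + 1) for j in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
        for _ in range(iters):
            w = [sum(m[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm == 0:
                break
            v = [c / norm for c in w]
        # Rayleigh quotient gives the matching eigenvalue.
        eigval = sum(v[a] * sum(m[a][b] * v[b] for b in range(d))
                     for a in range(d))
        return v, eigval

    pc1, ev1 = top_eigenvector(cov)
    # Deflation: subtract the first component so the next-strongest remains.
    deflated = [[cov[a][b] - ev1 * pc1[a] * pc1[b] for b in range(d)]
                for a in range(d)]
    pc2, _ = top_eigenvector(deflated)
    return [(sum(x[j] * pc1[j] for j in range(d)),
             sum(x[j] * pc2[j] for j in range(d))) for x in centered]


# Made-up feature rows (three hypothetical per-season stats), illustration only.
features = [
    [8.0, 2.0, 20.0],
    [6.5, 3.5, 18.0],
    [7.0, 1.0, 17.0],
    [3.0, 6.0, 15.0],
    [2.5, 5.5, 14.0],
    [4.0, 4.0, 16.0],
]
points = pca_2d(features)
for x, y in points:
    print(round(x, 3), round(y, 3))
```

Each input row becomes one (x, y) pair, which is what makes a 2D scatter plot of player seasons possible.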

5. I plotted the PCA results, colored the points by year, and sized them by raptor_total. One of the first things I noticed is that the most recent performances (yellow) sit closer to the performances of the 1980s (purple) than to those of the years in between: a possible sign that modern players are more similar to players of the ’80s than to the ones they grew up playing with. Maybe some of the old-school game is coming back?
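One way to check a visual impression like this numerically is to average the PCA coordinates per decade and compare the distances between those averages. The sketch below is not part of the original analysis: the helper name decade_centroids is hypothetical, and the (season, x, y) triples are made-up values chosen only to show the mechanics, so the printed distances say nothing about the real data.

```python
import math
from collections import defaultdict


def decade_centroids(points):
    """Average the 2D coordinates of all points that share a decade."""
    groups = defaultdict(list)
    for season, x, y in points:
        groups[season // 10 * 10].append((x, y))
    return {
        decade: (sum(x for x, _ in pts) / len(pts),
                 sum(y for _, y in pts) / len(pts))
        for decade, pts in groups.items()
    }


# Made-up (season, pc1, pc2) triples, for illustration only.
toy_points = [
    (1985, 1.0, 1.0), (1988, 1.2, 0.8),
    (2003, -2.0, -1.0), (2006, -1.8, -1.2),
    (2016, 0.9, 0.7), (2019, 1.1, 0.9),
]

centroids = decade_centroids(toy_points)
# Euclidean distance from the 2010s centroid to every other decade.
for decade in sorted(centroids):
    if decade != 2010:
        print(decade, round(math.dist(centroids[2010], centroids[decade]), 3))
```

Run against the real PCA output, a smaller distance to the 1980s centroid than to the decades in between would support the reading of the plot, while a larger one would argue against it.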

More to come!
