How Much does Crowd Noise Affect NBA Players?

Posted on August 11, 2020 in NBA-Basketball

Premise

Basketball is back in the bubble! But it's been drastically different with virtual fans and the lack of crowd noise. I was curious to see the effect that crowd noise had on the players, and the most "controlled" aspect of the game I could think of was free throws. Therefore I wanted to see if player's free throw percentages improved due to the lack of crowd noise or not.

I used a matched pairs t-test to see whether a "bubble" effect existed on the NBA player's free throw percentage pre-bubble vs in the bubble. One of the disadvantages of the matched pairs t-test was that this treated the pre-bubble data as a single measurement point and the in-bubble data as another measurement point, but the data was actually binomial data.

Therefore, I also used a test of proportions to see whether any players deviated from their pre-bubble free throw shooting percentages

Load the data

I first loaded the box score data that was scraped from basketballreference.com using the code here

I then created dictionaries of each player's pre-bubble / bubble free throws made and free throw attempts. Thankfully no players in the NBA have the same name

import os
from os import path
import pickle
import pandas as pd
from scipy import stats
import numpy as np
import pylab
from statsmodels.stats.proportion import proportions_ztest
import matplotlib.pyplot as plt

def RepresentsInt(s):
    try:
        int(s)
        return True
    except ValueError:
        return False

working_dir = os.getcwd() + "/ft_data"
# Get all pickle files that match basic box score in the 2020 season
pre_bubble_box_scores = dict()
bubble_box_scores = dict()
for file in os.listdir(working_dir):
    if file.endswith(".pickle") and "basic_box_score" in file and "2020" in file:
        with open(working_dir + "/" + file, 'rb') as handle:
            b = pickle.load(handle)
            if 'july' in file or 'august' in file:
                bubble_box_scores.update(b)
            else:
                pre_bubble_box_scores.update(b)

pre_bubble_players_ft_attempts = dict()
pre_bubble_players_ft_made = dict()
debug = list()
for key, boxscore in pre_bubble_box_scores.items():
    ft_attempts = boxscore['FTA']
    ft_made = boxscore['FT']
    for index, value in ft_attempts.items():
        if RepresentsInt(value):
            if int(value) != 0:
                if index not in pre_bubble_players_ft_attempts.keys():
                    pre_bubble_players_ft_attempts[index] = int(ft_attempts[index])
                    pre_bubble_players_ft_made[index] = int(ft_made[index])
                else:
                    pre_bubble_players_ft_attempts[index] = pre_bubble_players_ft_attempts[index] + int(ft_attempts[index])
                    pre_bubble_players_ft_made[index] = pre_bubble_players_ft_made[index] + int(ft_made[index])

bubble_players_ft_attempts = dict()
bubble_players_ft_made = dict()
for key, boxscore in bubble_box_scores.items():
    ft_attempts = boxscore['FTA']
    ft_made = boxscore['FT']
    for index, value in ft_attempts.items():
        if RepresentsInt(value):
            if int(value) != 0:
                if index not in bubble_players_ft_attempts.keys():
                    bubble_players_ft_attempts[index] = int(value)
                    bubble_players_ft_made[index] = int(ft_made[index])
                else:
                    bubble_players_ft_attempts[index] = bubble_players_ft_attempts[index] + int(value)
                    bubble_players_ft_made[index] = bubble_players_ft_made[index] + int(ft_made[index])

Create the data frame

Next, I created a dataframe with the columns desired and ran through the players in the bubble to populate the the dataframe. I only took players who had taken at least 20 free throws in the pre-bubble as well as bubble.

  • Player
  • Pre-Bubble Free Throws Made
  • Pre-Bubble Free Throws Attempted
  • Bubble Free Throws Made
  • Bubble Free Throws Attempted
  • P-value of the difference between the two proportions
  • Pre-bubble Free Throw %
  • Bubble Free Throw %
  • Differences of Pre-bubble vs Bubble

Analysis

Test of Proportions

We had a total of 56 players who as of 8/10/2020 had taken more than 20 free throws in the bubble. Only 4 of them had significantly different free throw percentages at a 0.05 level

  • Jarret Allen - increase of .213
  • Mike Conley - increase of .205
  • Michael Porter - increase of .193
  • Russell Westbrook - decreased of .184
ft_pct_df = pd.DataFrame(columns = ['player', 'pre-bubble-made', 'pre-bubble-att', 'bubble-made', 'bubble-att', 'p-val'])

for player in bubble_players_ft_attempts.keys():
    if player in pre_bubble_players_ft_attempts.keys() and pre_bubble_players_ft_attempts[player] > 20 and bubble_players_ft_attempts[player] > 20:
        count = np.array([pre_bubble_players_ft_made[player], bubble_players_ft_made[player]])
        nobs = np.array([pre_bubble_players_ft_attempts[player], bubble_players_ft_attempts[player]])
        stat, pval = proportions_ztest(count, nobs)

        ft_pct_df = ft_pct_df.append({'player': player,
                                      'pre-bubble-made': pre_bubble_players_ft_made[player],
                                      'pre-bubble-att': pre_bubble_players_ft_attempts[player],
                                      'bubble-made': bubble_players_ft_made[player],
                                     'bubble-att': bubble_players_ft_attempts[player],
                                      'p-val': pval}, ignore_index=True)

ft_pct_df['pre-ft-pct'] = (ft_pct_df['pre-bubble-made']/ft_pct_df['pre-bubble-att']).astype(float)
ft_pct_df['bubble-pct'] = (ft_pct_df['bubble-made']/ft_pct_df['bubble-att']).astype(float)
ft_pct_df['ft-pct-diff'] = (ft_pct_df['bubble-pct'] - ft_pct_df['pre-ft-pct']).astype(float)
ft_pct_df = ft_pct_df.reindex(ft_pct_df['ft-pct-diff'].abs().sort_values(ascending=False).index)
ft_pct_df.round(3)
player pre-bubble-made pre-bubble-att bubble-made bubble-att p-val pre-ft-pct bubble-pct ft-pct-diff
52 Jarrett Allen 147 237 20 24 0.038 0.620 0.833 0.213
1 Mike Conley 93 117 22 22 0.020 0.795 1.000 0.205
44 Michael Porter 33 43 24 25 0.038 0.767 0.960 0.193
35 Russell Westbrook 269 346 19 32 0.020 0.777 0.594 -0.184
15 Khem Birch 41 67 18 24 0.224 0.612 0.750 0.138
38 Kristaps Porziņģis 177 228 44 49 0.054 0.776 0.898 0.122
11 Dwight Howard 89 180 17 28 0.267 0.494 0.607 0.113
31 Brook Lopez 98 121 23 25 0.183 0.810 0.920 0.110
2 Rudy Gobert 228 367 26 36 0.231 0.621 0.722 0.101
41 Goran Dragić 170 221 20 23 0.270 0.769 0.870 0.100
16 Caris LeVert 116 161 20 32 0.279 0.720 0.625 -0.095
22 Dario Šarić 79 95 25 27 0.223 0.832 0.926 0.094
48 Ivica Zubac 107 141 14 21 0.365 0.759 0.667 -0.092
29 Khris Middleton 178 196 26 26 0.107 0.908 1.000 0.092
5 J.J. Redick 147 163 18 22 0.236 0.902 0.818 -0.084
42 Nikola Jokić 218 268 26 29 0.267 0.813 0.897 0.083
17 Ja Morant 204 265 22 26 0.372 0.770 0.846 0.076
14 Terrence Ross 136 161 21 23 0.386 0.845 0.913 0.068
9 LeBron James 239 343 22 35 0.406 0.697 0.629 -0.068
3 Brandon Ingram 283 330 30 38 0.265 0.858 0.789 -0.068
39 Jimmy Butler 408 490 27 30 0.333 0.833 0.900 0.067
46 Chris Paul 225 250 24 25 0.328 0.900 0.960 0.060
18 Jonas Valančiūnas 123 168 19 24 0.534 0.732 0.792 0.060
25 Gordon Hayward 94 111 29 32 0.393 0.847 0.906 0.059
30 Giannis Antetokounmpo 361 570 27 47 0.422 0.633 0.574 -0.059
26 Jaylen Brown 159 216 17 25 0.550 0.736 0.680 -0.056
49 Joel Embiid 311 382 35 46 0.386 0.814 0.761 -0.053
36 James Harden 619 719 56 62 0.351 0.861 0.903 0.042
0 Donovan Mitchell 249 290 27 30 0.531 0.859 0.900 0.041
19 Damian Lillard 389 438 45 53 0.402 0.888 0.849 -0.039
53 Rudy Gay 99 113 21 23 0.616 0.876 0.913 0.037
43 Monte Morris 51 62 18 21 0.715 0.823 0.857 0.035
34 Derrick White 150 175 28 34 0.614 0.857 0.824 -0.034
37 Luka Dončić 369 491 43 55 0.621 0.752 0.782 0.030
4 Zion Williamson 98 152 16 26 0.773 0.645 0.615 -0.029
12 Nikola Vučević 114 146 17 21 0.765 0.781 0.810 0.029
45 Shai Gilgeous-Alexander 254 317 29 35 0.699 0.801 0.829 0.027
20 Carmelo Anthony 113 134 20 23 0.746 0.843 0.870 0.026
7 Kawhi Leonard 311 350 38 44 0.624 0.889 0.864 -0.025
24 Jerome Robinson 20 29 15 21 0.851 0.690 0.714 0.025
21 Devin Booker 405 442 47 50 0.561 0.916 0.940 0.024
27 Jayson Tatum 225 279 24 29 0.783 0.806 0.828 0.021
33 DeMar DeRozan 338 401 38 44 0.718 0.843 0.864 0.021
55 Malcolm Brogdon 145 162 21 24 0.767 0.895 0.875 -0.020
32 De'Aaron Fox 215 306 26 36 0.807 0.703 0.722 0.020
54 Fred VanVleet 140 166 25 29 0.797 0.843 0.862 0.019
10 Kyle Kuzma 96 130 18 25 0.848 0.738 0.720 -0.018
47 Danilo Gallinari 234 264 19 21 0.797 0.886 0.905 0.018
28 Marcus Smart 106 127 18 22 0.849 0.835 0.818 -0.016
6 Paul George 165 187 20 23 0.858 0.882 0.870 -0.013
40 Bam Adebayo 236 342 28 40 0.898 0.690 0.700 0.010
50 Pascal Siakam 220 275 17 21 0.916 0.800 0.810 0.010
8 Anthony Davis 386 457 58 68 0.860 0.845 0.853 0.008
13 Aaron Gordon 137 203 14 21 0.939 0.675 0.667 -0.008
51 Kyle Lowry 255 296 33 38 0.907 0.861 0.868 0.007
23 Rui Hachimura 92 111 19 23 0.975 0.829 0.826 -0.003

Matched Pairs T-Test

The effect size of the bubble on free throw percentage increased these 34 players free throw percentage on average by 2.6%. Note that this treats all players equally, regardless of how many free throws they've attempted because we group by each player

The matched pair's test assumes that the observations are independent of one another and that the dependent variable sould be approximately normally distributed. Plotting a kernal density estimator and a q-q plot, we see that we have approximate normality. The shapiro-wilks test for normality also confirms the assumption of approximate normality

A matched pair's t-test shows a p-value of .01, so we do have enough evidence to reject the null hypothesis that the effect of the bubble on a player's free throw percentage is 0.

fig, ax = plt.subplots()
ax = ft_pct_df['ft-pct-diff'].plot.kde()
fig.suptitle('KDE of Free Throw Difference\n Pre-Bubble to Bubble')
plt.show()

stats.probplot(ft_pct_df['ft-pct-diff'], dist="norm", plot=pylab)
pylab.show()

print('P-value of Shapiro-Wilks Test for Normality = ' + str(round(stats.shapiro(ft_pct_df['ft-pct-diff'])[1],3)))

print('Effect size of the bubble = ' + str(round(ft_pct_df['ft-pct-diff'].mean(),3)))
pval = stats.ttest_rel(ft_pct_df['pre-ft-pct'], ft_pct_df['bubble-pct'])[1]
print('P-value of Matched Pairs T-Test = ' + str(round(pval,3)))

!png

png

P-value of Shapiro-Wilks Test for Normality = 0.426
Effect size of the bubble = 0.026
P-value of Matched Pairs T-Test = 0.013

Conclusion

While the data (as of 8/10/2020) only shows that 4 players have a significant difference in their free throw percentage before the bubble and in the bubble, using a matched pairs t-test, we can conclude that a player's free throw percentage increases in the bubble. The immediate hypothesis would be that this is because of the reduced crowd noise, but other factors are at play here such as how much a player cares about the game (Lakers are in coast mode since locking up the top seed), etc.

As the season continues we will have many more players taking free throws and we will continue to analyze the data.