How Much Does Crowd Noise Affect NBA Players?
Posted on August 11, 2020 in NBA-Basketball
Premise
Basketball is back in the bubble! But it's been drastically different, with virtual fans and no crowd noise. I was curious about the effect crowd noise has on players, and the most "controlled" aspect of the game I could think of was free throws. So I wanted to see whether players' free throw percentages improved without the crowd noise.
I used a matched pairs t-test to see whether a "bubble" effect existed on NBA players' free throw percentages, pre-bubble vs. in the bubble. One disadvantage of the matched pairs t-test is that it treats each player's pre-bubble percentage as a single measurement and his in-bubble percentage as another, when the underlying data is actually binomial (makes out of attempts).
Therefore, I also used a test of proportions to see whether any individual players deviated from their pre-bubble free throw percentages.
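To make the test of proportions concrete, here is a minimal sketch of a pooled two-proportion z-test written out by hand. The function name `two_proportion_ztest` is my own illustration, not part of any library; it is intended to mirror what a pooled-variance z-test computes, and the example numbers are Mike Conley's totals from the results table in the Analysis section.

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_ztest(made_a, att_a, made_b, att_b):
    """Two-sided pooled z-test for equality of two proportions (illustrative helper)."""
    p_a = made_a / att_a
    p_b = made_b / att_b
    # Under the null hypothesis both samples share one success probability,
    # estimated by pooling makes and attempts.
    pooled = (made_a + made_b) / (att_a + att_b)
    se = sqrt(pooled * (1 - pooled) * (1 / att_a + 1 / att_b))
    z = (p_a - p_b) / se
    p_value = 2 * norm.sf(abs(z))
    return z, p_value

# Mike Conley: 93 of 117 pre-bubble, 22 of 22 in the bubble
z, p_value = two_proportion_ztest(93, 117, 22, 22)
print(round(z, 2), round(p_value, 3))
```

Because the test relies on a normal approximation, it is least trustworthy when one of the samples is small or a proportion sits at 0 or 1, which is worth keeping in mind for the bubble samples.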
Load the data
I first loaded the box score data that was scraped from basketballreference.com using the code here.
I then created dictionaries of each player's pre-bubble and bubble free throws made and free throws attempted, keyed by player name. Thankfully, no players in the NBA have the same name.
import os
from os import path
import pickle
import pandas as pd
from scipy import stats
import numpy as np
import pylab
from statsmodels.stats.proportion import proportions_ztest
import matplotlib.pyplot as plt
def RepresentsInt(s):
    try:
        int(s)
        return True
    except ValueError:
        return False
working_dir = os.getcwd() + "/ft_data"
# Get all pickle files that match basic box score in the 2020 season
pre_bubble_box_scores = dict()
bubble_box_scores = dict()
for file in os.listdir(working_dir):
    if file.endswith(".pickle") and "basic_box_score" in file and "2020" in file:
        with open(working_dir + "/" + file, 'rb') as handle:
            b = pickle.load(handle)
        if 'july' in file or 'august' in file:
            bubble_box_scores.update(b)
        else:
            pre_bubble_box_scores.update(b)
# Accumulate each player's pre-bubble free throws made and attempted
pre_bubble_players_ft_attempts = dict()
pre_bubble_players_ft_made = dict()
debug = list()
for key, boxscore in pre_bubble_box_scores.items():
    ft_attempts = boxscore['FTA']
    ft_made = boxscore['FT']
    for index, value in ft_attempts.items():
        # Skip non-numeric entries and games with no attempts
        if RepresentsInt(value):
            if int(value) != 0:
                if index not in pre_bubble_players_ft_attempts.keys():
                    pre_bubble_players_ft_attempts[index] = int(value)
                    pre_bubble_players_ft_made[index] = int(ft_made[index])
                else:
                    pre_bubble_players_ft_attempts[index] = pre_bubble_players_ft_attempts[index] + int(value)
                    pre_bubble_players_ft_made[index] = pre_bubble_players_ft_made[index] + int(ft_made[index])
# Same accumulation for games played in the bubble
bubble_players_ft_attempts = dict()
bubble_players_ft_made = dict()
for key, boxscore in bubble_box_scores.items():
    ft_attempts = boxscore['FTA']
    ft_made = boxscore['FT']
    for index, value in ft_attempts.items():
        if RepresentsInt(value):
            if int(value) != 0:
                if index not in bubble_players_ft_attempts.keys():
                    bubble_players_ft_attempts[index] = int(value)
                    bubble_players_ft_made[index] = int(ft_made[index])
                else:
                    bubble_players_ft_attempts[index] = bubble_players_ft_attempts[index] + int(value)
                    bubble_players_ft_made[index] = bubble_players_ft_made[index] + int(ft_made[index])
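The pre-bubble and bubble loops do the same accumulation, so they could be folded into one helper. The sketch below is only an illustration: `accumulate_free_throws` is a hypothetical name, and the toy box scores are made-up numbers. It assumes, as the loading code suggests, that each box score is a DataFrame indexed by player name with string-valued 'FT' and 'FTA' columns, and that rows for players who did not play hold text rather than numbers.

```python
from collections import defaultdict
import pandas as pd

def accumulate_free_throws(box_scores):
    """Sum free throws made and attempted per player across box scores."""
    attempts = defaultdict(int)
    made = defaultdict(int)
    for boxscore in box_scores.values():
        ft_made = boxscore['FT']
        for player, value in boxscore['FTA'].items():
            # Assumption: non-numeric cells mark players who did not play
            try:
                fta = int(value)
            except ValueError:
                continue
            if fta == 0:
                continue
            attempts[player] += fta
            made[player] += int(ft_made[player])
    return dict(made), dict(attempts)

# Toy box scores with made-up numbers, indexed by (hypothetical) player name
toy = {
    'game1': pd.DataFrame({'FT': ['3', '0', 'Did Not Play'],
                           'FTA': ['4', '0', 'Did Not Play']},
                          index=['Player A', 'Player B', 'Player C']),
    'game2': pd.DataFrame({'FT': ['5', '1'], 'FTA': ['6', '2']},
                          index=['Player A', 'Player B']),
}
made, attempts = accumulate_free_throws(toy)
print(made)
print(attempts)
```

Calling the helper once on `pre_bubble_box_scores` and once on `bubble_box_scores` would replace both loops, at the cost of returning plain dictionaries rather than filling the four named ones in place.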
Create the data frame
Next, I created a dataframe with the columns listed below and looped over the players in the bubble to populate it. I only kept players who had taken more than 20 free throws both pre-bubble and in the bubble.
- Player
- Pre-Bubble Free Throws Made
- Pre-Bubble Free Throws Attempted
- Bubble Free Throws Made
- Bubble Free Throws Attempted
- P-value of the difference between the two proportions
- Pre-bubble Free Throw %
- Bubble Free Throw %
- Differences of Pre-bubble vs Bubble
Analysis
Test of Proportions
We had a total of 56 players who, as of 8/10/2020, had taken more than 20 free throws in the bubble. Only 4 of them had significantly different free throw percentages at the 0.05 level:
- Jarrett Allen - increase of .213
- Mike Conley - increase of .205
- Michael Porter - increase of .193
- Russell Westbrook - decrease of .184
rows = []
for player in bubble_players_ft_attempts.keys():
    if player in pre_bubble_players_ft_attempts.keys() and pre_bubble_players_ft_attempts[player] > 20 and bubble_players_ft_attempts[player] > 20:
        count = np.array([pre_bubble_players_ft_made[player], bubble_players_ft_made[player]])
        nobs = np.array([pre_bubble_players_ft_attempts[player], bubble_players_ft_attempts[player]])
        # Two-sample z-test for equal proportions (pre-bubble vs bubble)
        stat, pval = proportions_ztest(count, nobs)
        rows.append({'player': player,
                     'pre-bubble-made': pre_bubble_players_ft_made[player],
                     'pre-bubble-att': pre_bubble_players_ft_attempts[player],
                     'bubble-made': bubble_players_ft_made[player],
                     'bubble-att': bubble_players_ft_attempts[player],
                     'p-val': pval})
# DataFrame.append was removed in pandas 2.0, so build the frame from a list of rows
ft_pct_df = pd.DataFrame(rows, columns = ['player', 'pre-bubble-made', 'pre-bubble-att', 'bubble-made', 'bubble-att', 'p-val'])
ft_pct_df['pre-ft-pct'] = (ft_pct_df['pre-bubble-made']/ft_pct_df['pre-bubble-att']).astype(float)
ft_pct_df['bubble-pct'] = (ft_pct_df['bubble-made']/ft_pct_df['bubble-att']).astype(float)
ft_pct_df['ft-pct-diff'] = (ft_pct_df['bubble-pct'] - ft_pct_df['pre-ft-pct']).astype(float)
# Sort by the absolute size of the change, largest first
ft_pct_df = ft_pct_df.reindex(ft_pct_df['ft-pct-diff'].abs().sort_values(ascending=False).index)
ft_pct_df.round(3)
index | player | pre-bubble-made | pre-bubble-att | bubble-made | bubble-att | p-val | pre-ft-pct | bubble-pct | ft-pct-diff |
---|---|---|---|---|---|---|---|---|---|
52 | Jarrett Allen | 147 | 237 | 20 | 24 | 0.038 | 0.620 | 0.833 | 0.213 |
1 | Mike Conley | 93 | 117 | 22 | 22 | 0.020 | 0.795 | 1.000 | 0.205 |
44 | Michael Porter | 33 | 43 | 24 | 25 | 0.038 | 0.767 | 0.960 | 0.193 |
35 | Russell Westbrook | 269 | 346 | 19 | 32 | 0.020 | 0.777 | 0.594 | -0.184 |
15 | Khem Birch | 41 | 67 | 18 | 24 | 0.224 | 0.612 | 0.750 | 0.138 |
38 | Kristaps Porziņģis | 177 | 228 | 44 | 49 | 0.054 | 0.776 | 0.898 | 0.122 |
11 | Dwight Howard | 89 | 180 | 17 | 28 | 0.267 | 0.494 | 0.607 | 0.113 |
31 | Brook Lopez | 98 | 121 | 23 | 25 | 0.183 | 0.810 | 0.920 | 0.110 |
2 | Rudy Gobert | 228 | 367 | 26 | 36 | 0.231 | 0.621 | 0.722 | 0.101 |
41 | Goran Dragić | 170 | 221 | 20 | 23 | 0.270 | 0.769 | 0.870 | 0.100 |
16 | Caris LeVert | 116 | 161 | 20 | 32 | 0.279 | 0.720 | 0.625 | -0.095 |
22 | Dario Šarić | 79 | 95 | 25 | 27 | 0.223 | 0.832 | 0.926 | 0.094 |
48 | Ivica Zubac | 107 | 141 | 14 | 21 | 0.365 | 0.759 | 0.667 | -0.092 |
29 | Khris Middleton | 178 | 196 | 26 | 26 | 0.107 | 0.908 | 1.000 | 0.092 |
5 | J.J. Redick | 147 | 163 | 18 | 22 | 0.236 | 0.902 | 0.818 | -0.084 |
42 | Nikola Jokić | 218 | 268 | 26 | 29 | 0.267 | 0.813 | 0.897 | 0.083 |
17 | Ja Morant | 204 | 265 | 22 | 26 | 0.372 | 0.770 | 0.846 | 0.076 |
14 | Terrence Ross | 136 | 161 | 21 | 23 | 0.386 | 0.845 | 0.913 | 0.068 |
9 | LeBron James | 239 | 343 | 22 | 35 | 0.406 | 0.697 | 0.629 | -0.068 |
3 | Brandon Ingram | 283 | 330 | 30 | 38 | 0.265 | 0.858 | 0.789 | -0.068 |
39 | Jimmy Butler | 408 | 490 | 27 | 30 | 0.333 | 0.833 | 0.900 | 0.067 |
46 | Chris Paul | 225 | 250 | 24 | 25 | 0.328 | 0.900 | 0.960 | 0.060 |
18 | Jonas Valančiūnas | 123 | 168 | 19 | 24 | 0.534 | 0.732 | 0.792 | 0.060 |
25 | Gordon Hayward | 94 | 111 | 29 | 32 | 0.393 | 0.847 | 0.906 | 0.059 |
30 | Giannis Antetokounmpo | 361 | 570 | 27 | 47 | 0.422 | 0.633 | 0.574 | -0.059 |
26 | Jaylen Brown | 159 | 216 | 17 | 25 | 0.550 | 0.736 | 0.680 | -0.056 |
49 | Joel Embiid | 311 | 382 | 35 | 46 | 0.386 | 0.814 | 0.761 | -0.053 |
36 | James Harden | 619 | 719 | 56 | 62 | 0.351 | 0.861 | 0.903 | 0.042 |
0 | Donovan Mitchell | 249 | 290 | 27 | 30 | 0.531 | 0.859 | 0.900 | 0.041 |
19 | Damian Lillard | 389 | 438 | 45 | 53 | 0.402 | 0.888 | 0.849 | -0.039 |
53 | Rudy Gay | 99 | 113 | 21 | 23 | 0.616 | 0.876 | 0.913 | 0.037 |
43 | Monte Morris | 51 | 62 | 18 | 21 | 0.715 | 0.823 | 0.857 | 0.035 |
34 | Derrick White | 150 | 175 | 28 | 34 | 0.614 | 0.857 | 0.824 | -0.034 |
37 | Luka Dončić | 369 | 491 | 43 | 55 | 0.621 | 0.752 | 0.782 | 0.030 |
4 | Zion Williamson | 98 | 152 | 16 | 26 | 0.773 | 0.645 | 0.615 | -0.029 |
12 | Nikola Vučević | 114 | 146 | 17 | 21 | 0.765 | 0.781 | 0.810 | 0.029 |
45 | Shai Gilgeous-Alexander | 254 | 317 | 29 | 35 | 0.699 | 0.801 | 0.829 | 0.027 |
20 | Carmelo Anthony | 113 | 134 | 20 | 23 | 0.746 | 0.843 | 0.870 | 0.026 |
7 | Kawhi Leonard | 311 | 350 | 38 | 44 | 0.624 | 0.889 | 0.864 | -0.025 |
24 | Jerome Robinson | 20 | 29 | 15 | 21 | 0.851 | 0.690 | 0.714 | 0.025 |
21 | Devin Booker | 405 | 442 | 47 | 50 | 0.561 | 0.916 | 0.940 | 0.024 |
27 | Jayson Tatum | 225 | 279 | 24 | 29 | 0.783 | 0.806 | 0.828 | 0.021 |
33 | DeMar DeRozan | 338 | 401 | 38 | 44 | 0.718 | 0.843 | 0.864 | 0.021 |
55 | Malcolm Brogdon | 145 | 162 | 21 | 24 | 0.767 | 0.895 | 0.875 | -0.020 |
32 | De'Aaron Fox | 215 | 306 | 26 | 36 | 0.807 | 0.703 | 0.722 | 0.020 |
54 | Fred VanVleet | 140 | 166 | 25 | 29 | 0.797 | 0.843 | 0.862 | 0.019 |
10 | Kyle Kuzma | 96 | 130 | 18 | 25 | 0.848 | 0.738 | 0.720 | -0.018 |
47 | Danilo Gallinari | 234 | 264 | 19 | 21 | 0.797 | 0.886 | 0.905 | 0.018 |
28 | Marcus Smart | 106 | 127 | 18 | 22 | 0.849 | 0.835 | 0.818 | -0.016 |
6 | Paul George | 165 | 187 | 20 | 23 | 0.858 | 0.882 | 0.870 | -0.013 |
40 | Bam Adebayo | 236 | 342 | 28 | 40 | 0.898 | 0.690 | 0.700 | 0.010 |
50 | Pascal Siakam | 220 | 275 | 17 | 21 | 0.916 | 0.800 | 0.810 | 0.010 |
8 | Anthony Davis | 386 | 457 | 58 | 68 | 0.860 | 0.845 | 0.853 | 0.008 |
13 | Aaron Gordon | 137 | 203 | 14 | 21 | 0.939 | 0.675 | 0.667 | -0.008 |
51 | Kyle Lowry | 255 | 296 | 33 | 38 | 0.907 | 0.861 | 0.868 | 0.007 |
23 | Rui Hachimura | 92 | 111 | 19 | 23 | 0.975 | 0.829 | 0.826 | -0.003 |
Matched Pairs T-Test
On average, the bubble increased these 56 players' free throw percentages by 2.6 percentage points. Note that this treats all players equally, regardless of how many free throws they've attempted, because each player contributes a single difference.
The matched pairs test assumes that the observations are independent of one another and that the paired differences are approximately normally distributed. Plotting a kernel density estimate and a Q-Q plot, we see approximate normality. The Shapiro-Wilk test for normality (p = 0.426) also gives no evidence against this assumption.
The matched pairs t-test gives a p-value of 0.013, so we have enough evidence at the 0.05 level to reject the null hypothesis that the bubble has no effect on a player's free throw percentage.
# Kernel density estimate of the per-player differences
fig, ax = plt.subplots()
ft_pct_df['ft-pct-diff'].plot.kde(ax=ax)
fig.suptitle('KDE of Free Throw Difference\n Pre-Bubble to Bubble')
plt.show()
# Q-Q plot of the differences against a normal distribution
stats.probplot(ft_pct_df['ft-pct-diff'], dist="norm", plot=plt)
plt.show()
print('P-value of Shapiro-Wilks Test for Normality = ' + str(round(stats.shapiro(ft_pct_df['ft-pct-diff'])[1],3)))
print('Effect size of the bubble = ' + str(round(ft_pct_df['ft-pct-diff'].mean(),3)))
pval = stats.ttest_rel(ft_pct_df['pre-ft-pct'], ft_pct_df['bubble-pct'])[1]
print('P-value of Matched Pairs T-Test = ' + str(round(pval,3)))
P-value of Shapiro-Wilks Test for Normality = 0.426
Effect size of the bubble = 0.026
P-value of Matched Pairs T-Test = 0.013
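As a sanity check on what the matched pairs test is doing, the sketch below shows that it is equivalent to a one-sample t-test on the per-player differences, which is why the normality check is applied to the differences and why every player counts once regardless of attempts. The percentages here are made-up values for five hypothetical players, not data from the table.

```python
import numpy as np
from scipy import stats

# Made-up free throw percentages for five hypothetical players
pre = np.array([0.62, 0.80, 0.77, 0.78, 0.61])
bubble = np.array([0.70, 0.85, 0.75, 0.83, 0.66])

# One difference per player, each weighted equally
diff = bubble - pre

paired = stats.ttest_rel(bubble, pre)
one_sample = stats.ttest_1samp(diff, 0.0)

# The same statistic by hand: mean difference over its standard error
t_manual = diff.mean() / (diff.std(ddof=1) / np.sqrt(len(diff)))

print(paired.statistic, one_sample.statistic, t_manual)
print(paired.pvalue, one_sample.pvalue)
```

All three t statistics should agree up to floating point error, and the two p-values should match.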
Conclusion
While the data (as of 8/10/2020) shows only 4 players with a significant difference between their pre-bubble and bubble free throw percentages, the matched pairs t-test suggests that, on average, players' free throw percentages have increased in the bubble. The immediate hypothesis is that this is due to the reduced crowd noise, but other factors are at play, such as how much a player cares about the game (the Lakers are in coast mode since locking up the top seed).
As the season continues, many more players will take free throws, and we will continue to analyze the data.