Hello everyone,
Here is the background of my project:
I ran an OLS robust regression on National Basketball Association player data. The dependent variable is a statistic called Offensive RPM. The independent variables are the totals of field goals made and attempted in various shooting zones, as well as total minutes.
The regression as a whole is statistically significant, as is each independent variable. Broadly interpreted, the results confirm what's already known: making more shots on fewer attempts increases a player's Offensive RPM, and taking more shots with fewer makes decreases it. In other words, efficient basketball is good.
However, the coefficient for corner three-point shot attempts is almost the same as the coefficient for midrange shot attempts. The figures below are scaled up by a factor of 100 to make typing and reading easier: -1.25 (midrange attempts) and -1.17 (corner three attempts).
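To make the comparison concrete, here is a rough sketch of the kind of specification described above, followed by a Wald test of whether the two attempt coefficients actually differ. The variable names (orpm, mid_fgm, mid_fga, corner3_fgm, corner3_fga, minutes) are placeholders rather than the names in the actual dataset, only two shooting zones are shown, and vce(robust) assumes that "robust" here means heteroskedasticity-robust standard errors.

```stata
* Placeholder variable names; substitute the ones in the actual dataset.
* orpm    = Offensive RPM (dependent variable)
* *_fgm   = field goals made in a zone, *_fga = field goals attempted
* minutes = total minutes played
* Assumption: "robust" means heteroskedasticity-robust standard errors.
regress orpm mid_fgm mid_fga corner3_fgm corner3_fga minutes, vce(robust)

* Wald test of the null that the two attempt coefficients are equal
test mid_fga = corner3_fga
```

If the test does not reject equality, the gap between -1.25 and -1.17 is within sampling noise for this sample.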
This goes against the accepted norms and beliefs of the NBA statistics community, because midrange shots have less value than three-point shots, especially corner threes. Three points is greater than two points, and the corner three is the shortest three-point shot on the court. I see two possible explanations:
1. Attempting corner three-point shots is not as beneficial as the norm suggests.
2. The fact that so many NBA players do not even attempt corner three-point shots, generating a lot of 0s in the sample, is forcing/pulling the two variables outward and making the risk-reward ratio look much larger than it really is.
When it comes to adding weights or trying to control for these potential issues, I get in a bit over my head. The zero values aren't an issue of "missing data," but rather a result of players simply not taking that many shots.
In all, what I'm asking is whether I should do something to address these potential issues and, if so, what I should do in Stata to adjust my robust OLS regression.