
Fun With Math 2015-16 Season


  • Fun With Math 2015-16 Season

    You may or may not recall I posted a model similar to this one last year. This post is crazy long so bail now if you don't care about stats/predictions. Or for other reasons, your call.

    ---

    I did this last year and thought it would be worth doing again, to give some statistical context on the team's performance so far, and what we can expect for the rest of the year if that performance holds up.

    I'm going to go over the process again, hopefully more clearly than last time, or as a refresher for those who want it. I started right from the first 10 games last year, and although it was interesting to see the numbers level out quickly from that first sample, which was obviously too small (and provided hilariously optimistic projections), I've decided to start later and do more consistent (every 10 games) updates this year. With 22 games in the books, we've got a decent sample (roughly a quarter of the season), and a nice set-up for updates at the 32, 42, 52, 62, 72 and 82 game marks. The sample size impact is two-fold - not only do we have more Raptors performances to check, we also have a better understanding of how good their opponents are, as they all have more games played as well.

    For each game, we want to assign a value to how well the Raptors played. We want to take into consideration their opponent's typical offensive and defensive performance. We want to take into account whether one team or the other (or both) are on the second game of a back-to-back. We want to take into account whether the team is playing at home or on the road.

    So, we come up with an expected performance for a particular game assuming the Raptors play like a purely average team. We start with the opponent's full season offensive rating (points scored per 100 possessions) and defensive rating (points allowed per 100 possessions). I update these after each game and apply them retroactively - so the Raptors' early season win against the Pacers looks a lot more impressive now than it did at the time, and their performance value for that game has gone up as the Pacers have improved their season. There are pitfalls to this approach (teams sometimes actually are better or play better in different parts of the year) but there are pitfalls to adjusting it as well, as playing a very good team that has played a lot of very good opponents in their past 10 games (for example) does not mean it will be an easy game.

    So, we have an opponent ORTG and DRTG, which become an expected DRTG and ORTG for the Raptors that game. Then we adjust for home court and back-to-backs. Statistically, over the past couple of seasons, home court has been worth a 1.4 point boost to both ORTG and DRTG for the home team. So we take the expected ORTG and add 1.4 points if we are the home team (subtract 1.4 points if we are on the road). The opposite applies for DRTG (subtract 1.4 points at home, add 1.4 points on the road).

    We also need to adjust for back-to-backs. These are rarer, but have nearly as much impact as home court. If the Raptors are on the second night of a back-to-back, their expected ORTG is decreased by 1.2 points and their expected DRTG increased by 1.2 points. If the opponent is on the second night of a back-to-back, the reverse applies. When both teams are on the second night of a back-to-back, the two adjustments cancel out and no adjustment is applied.

    So we boil it down to this for each game:

    Expected ORTG = Opponent DRTG + 1.4 (if home) - 1.4 (if away) + 1.2 (if opponent on B2B) - 1.2 (if Raps on B2B)
    Expected DRTG = Opponent ORTG - 1.4 (if home) + 1.4 (if away) - 1.2 (if opponent on B2B) + 1.2 (if Raps on B2B)
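The two formulas above can be sketched in Python as follows. This is only an illustration: the function name, argument names, and the sample opponent ratings are made up, and only the 1.4 and 1.2 adjustments come from the post.

```python
HOME_COURT = 1.4  # points per 100 possessions, value quoted in the post
B2B = 1.2         # back-to-back adjustment, value quoted in the post


def expected_ratings(opp_ortg, opp_drtg, home, raps_b2b=False, opp_b2b=False):
    """Return (expected ORTG, expected DRTG) for an average team in this game."""
    # Net schedule adjustment in the Raptors' favour (negative = against them).
    adj = HOME_COURT if home else -HOME_COURT
    if opp_b2b:
        adj += B2B
    if raps_b2b:
        adj -= B2B
    # Offence is measured against the opponent's defence, and vice versa.
    return opp_drtg + adj, opp_ortg - adj


# Hypothetical opponent (ORTG 102.0, DRTG 104.0), road game, Raptors on a B2B.
exp_ortg, exp_drtg = expected_ratings(102.0, 104.0, home=False, raps_b2b=True)
print(round(exp_ortg, 1), round(exp_drtg, 1))
```

When both teams are on a back-to-back, the two 1.2 terms cancel and only the home court term remains.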

    Now we have an expected offensive and defensive rating for each game based entirely on the opponent and the schedule. Nothing to do with how the Raptors play as a team. That part comes next.

    The game is played at this point. This is the interesting, fun part. As such we will skip over it.

    Once the game is over, we have two numbers - the Raptors' actual ORTG in the game, and their actual DRTG in the game. We compare those to the expected ORTG and DRTG from before to end up with an offensive performance (O-Perf) and defensive performance (D-Perf) - relative to expectations. It looks like this:

    Offensive performance = Game ORTG - Expected ORTG
    Defensive performance = Expected DRTG - Game DRTG
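As a quick sketch of that comparison, here is the same calculation in Python. The game and expected ratings below are invented numbers for illustration, not taken from an actual Raptors game.

```python
def performance(game_ortg, game_drtg, exp_ortg, exp_drtg):
    """Return (O-Perf, D-Perf); positive is better than expected on both ends."""
    o_perf = game_ortg - exp_ortg
    d_perf = exp_drtg - game_drtg  # flipped so that allowing fewer points is positive
    return o_perf, d_perf


# Hypothetical game: expected 101.4 / 104.6, actual 108.0 / 101.0.
o_perf, d_perf = performance(108.0, 101.0, 101.4, 104.6)
print(round(o_perf, 1), round(d_perf, 1))
```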

    Note that the defensive performance subtraction pair is the opposite of the offensive performance so that a good result for either value is positive. Makes it easier to make a single-glance judgment of a game. Bigger numbers are better, simple.

    We do this for every game. So we end up with a bunch of data points, two for each game, O-Perf and D-Perf.

    Now we can do some data manipulation (very simple stuff for the most part). We can find the average performance numbers. That tells us a lot right away. The average O-Perf and D-Perf describe how well the Raptors have played on each end relative to their schedule. We can take those performance values and apply them to the league average ORTG and DRTG to see what the Raptors' ORTG and DRTG would be right now if they had played a perfectly average schedule (opponent net rating of 0, same number of home games versus road games, back-to-back balance).

    True ORTG = League Average ORTG + O-Perf
    True DRTG = League Average DRTG - D-Perf

    We can immediately use this as an adjusted measure of the team so far. Pythagorean wins are a quick way to judge a team's performance - they translate point differential (or efficiency differential as we are using here) to a winning percentage that is typical for that differential.

    Pythagorean Winning % = (TrueORTG^14) / (TrueORTG^14 + TrueDRTG^14)

    That winning percentage is easily converted into a projected 82 game record for the full season.
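For anyone who wants to check the arithmetic, here is a rough Python sketch of the True rating and Pythagorean steps. The function names are made up for illustration. The 104.2 and 99.6 inputs are the True ORTG and True DRTG reported in the results further down this post.

```python
def true_ratings(league_avg_ortg, league_avg_drtg, avg_o_perf, avg_d_perf):
    """Schedule-adjusted ratings: league average shifted by average performance."""
    return league_avg_ortg + avg_o_perf, league_avg_drtg - avg_d_perf


def pythagorean_win_pct(true_ortg, true_drtg, exponent=14):
    """Pythagorean expectation with the exponent of 14 used in the post."""
    return true_ortg**exponent / (true_ortg**exponent + true_drtg**exponent)


win_pct = pythagorean_win_pct(104.2, 99.6)
projected_wins = win_pct * 82
print(f"{win_pct:.1%} -> {projected_wins:.1f} wins over 82 games")
```

With those inputs the sketch lands on roughly 65% and 54 wins, matching the figures quoted later in the post.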

    That approach makes an adjustment for the teams you have already played, and assumes an average schedule from that point onwards. Which is fine. But we can do more to be able to account for future schedule, as well as to let us consider small stretches of the season, even make an attempt to predict individual games.

    We need, as it turns out, only one other data manipulation. We already have the average performances. Now we find the standard deviations of those performances. That lets us know how consistent or how wild the performance tends to be night to night. This matters as a team that is very good on average but wildly inconsistent will lose games they shouldn't (and steal a few they should lose). A team that is very poor on average but wildly inconsistent will win games they shouldn't (and drop even a few of the ones they should win). The better you are, the more important it is to be consistent. If a team outscores their opponent by 10 points a night on average, they'd much rather do that every night than alternate blowing people out by 30 and losing by 10 the next game.

    This plays into our predictions of individual games. We assume that the Raptors' performances fit a normal (bell curve) distribution, with most of the games clustered around the average and fewer and fewer games happening the further from the average performance you get. The only two values needed to define a bell curve are the average value and the standard deviation.

    We could treat offence and defence separately, but the probability calculation gets a little unwieldy, so for simplicity's sake (I know, I know...) we'll combine the offensive and defensive performances into a single "net performance." And we'll take the average of the Raptors' net performances and the standard deviation of their net performances and use those two values to build a bell curve.

    Average Perf = Average O-Perf + Average D-Perf

    The standard deviation (SD Perf) is more complex but is easily calculated in Excel.
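If you'd rather not use Excel, the same two values can be sketched with Python's standard library. The per-game numbers below are invented purely for illustration. The net performance for each game is taken as O-Perf plus D-Perf, which is how the results section later in this post describes it, and the sample standard deviation is an assumption (the post doesn't say which Excel function was used).

```python
from statistics import mean, stdev

# Hypothetical per-game performances, NOT the Raptors' actual numbers.
o_perfs = [5.0, -3.0, 8.0, 1.0]
d_perfs = [2.0, 4.0, -6.0, 3.0]

# Net performance for each game: offence and defence added together.
net_perfs = [o + d for o, d in zip(o_perfs, d_perfs)]

avg_perf = mean(net_perfs)
sd_perf = stdev(net_perfs)  # sample SD; use pstdev() for the population version

print(avg_perf, round(sd_perf, 2))
```

Note that the average of the net performances is the same as the average O-Perf plus the average D-Perf, but the standard deviation has to be computed from the per-game net values.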

    Now, the value of a bell curve is that it predicts what proportion of a data set falls below a certain value within that data set. For example, how many of the Raptors' performances will fall below average. I can tell you right now the value for that is 50% - half of their performances will fall above average and half below. Other values depend on how far the threshold you are looking at is from the average, in units of the standard deviation. So if the average is a +1 performance, and you want to know how often they will have a 0 performance or worse, and your standard deviation is 5, you find the value (it is pretty searchable online - they are always the same no matter what your average and SD are) below -0.2 SD's from the average. That value is 42%. 42% of the time the team will break even or worse given those values.
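Rather than looking the value up in a table, the example above (average of +1, standard deviation of 5, threshold of 0) can be checked with a short Python sketch using the standard library's `NormalDist`:

```python
from statistics import NormalDist

average = 1.0    # average performance from the example above
sd = 5.0         # standard deviation from the example above
threshold = 0.0  # "break even or worse"

# How many standard deviations the threshold sits from the average.
z = (threshold - average) / sd

# Proportion of a standard bell curve that falls at or below that z value.
p_at_or_below = NormalDist().cdf(z)

print(z, f"{p_at_or_below:.0%}")
```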

    In any case, with an average value and a standard deviation, we can now start predicting individual games. Applying the Raptors' average O-Perf and D-Perf to the Expected ORTG and Expected DRTG for that game will give an expected final score margin (per 100 possessions played).

    Expected Score Margin = (Expected ORTG + Average O-Perf) - (Expected DRTG - Average D-Perf)

    Then we can say that to win the game, the Raptors need to have a positive score margin at the end of the game - any score margin above zero will be a win. Any below zero will be a loss. So, we find out how badly they would have to play (what performance they need to post relative to their average) to end up with a score margin of 0. That's easy - the combined performance would be the negative of the expected score margin. So if the expected score margin is +1 (the Raptors are predicted to win by 1 point) then the performance needed to get a score margin of 0 is -1 (they need to play 1 point worse than their average).

    Performance Needed to Win = -1 * (Expected Score Margin)

    Then we translate that performance needed to a standard deviation value.

    SD Performance Needed to Win = Performance Needed to Win / (SD Perf)

    Then we check to see what percentile value is assigned to that SD value. That percentile will give us the odds the Raptors fail to break even in net rating - also known as their chance to lose the game. Their chance to win the game is simply 100% minus that chance to lose.
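Putting the last few steps together, a single-game win probability might be sketched like this in Python. The function name and the example inputs are hypothetical, and the percentile lookup is done with the standard library's `NormalDist` instead of a table.

```python
from statistics import NormalDist


def win_probability(exp_ortg, exp_drtg, avg_o_perf, avg_d_perf, sd_perf):
    """Chance of a positive score margin, assuming normally distributed performance."""
    expected_margin = (exp_ortg + avg_o_perf) - (exp_drtg - avg_d_perf)
    perf_needed = -1 * expected_margin     # performance, relative to average, for a margin of 0
    sd_needed = perf_needed / sd_perf      # same thing in units of standard deviation
    p_loss = NormalDist().cdf(sd_needed)   # odds of falling at or below break-even
    return 1.0 - p_loss


# Hypothetical game: expected ratings of 101.4 / 104.6, with illustrative
# average performances of +2.8 / +1.9 and a standard deviation of 9.9.
p_win = win_probability(101.4, 104.6, 2.8, 1.9, 9.9)
print(f"{p_win:.0%}")
```

An expected margin of exactly zero comes out as a coin flip, which is a handy sanity check.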

    Then we have a bunch of games with odds to win. Take all the games for a sample, or even for the whole season, and add up all those percentages, and you get the total number of wins expected for that sample, based on the Raptors' performance so far.
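The summing step is the simplest one. A tiny Python sketch, using made-up win probabilities in place of real schedule output:

```python
# Hypothetical win probabilities for a five-game stretch (not real predictions).
win_probs = [0.62, 0.48, 0.71, 0.55, 0.39]

# Expected wins is just the sum of the individual win probabilities.
expected_wins = sum(win_probs)

print(f"{expected_wins:.2f} expected wins in {len(win_probs)} games")
```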

    ---

    Now, to the actual thing. The Raptors' performances so far:



    These yield an average O-Perf of +2.8 and an average D-Perf of +1.9, so above average in both categories. Those translate to a True ORTG of 104.2 and a True DRTG of 99.6. The True ORTG of 104.2 lies close to our overall ORTG of 104.1 on the season - we've played some defensively average teams thus far. But our True DRTG is 1 point better than our season DRTG of 100.6 - good for 10th in the league instead of tied for 12th.

    The True ORTG and DRTG combine for a pythagorean win percentage of 65%, and a prediction of 54 wins on the year.

    Taking the other approach, we need a standard deviation - our standard deviation for performance works out to 9.9, while our average performance is +3.7 (simply the two average performances on offence and defence added together).

    Below you can see a sample of the next 10 games.



    The total predicted wins for this approach are more reasonable and suggest a more difficult schedule is coming - this model predicts 48 wins at this time. It says we should have 12.6 wins so far - pretty bang on our 12 wins. It predicts 5.8 wins in the next 10 games.

    Let me know if there are any improvements you think I should make to the model - last year I implemented the home court advantage and back to back numbers after suggestions.
    twitter.com/dhackett1565

  • #2
    The only missing piece in your work is the DNA.

    Drake Night Algorithm

    I should add....nice work. Don't take my reply to mean I don't think it's a great, informative post.
    Last edited by Mess; Wed Dec 9, 2015, 04:49 PM.
    Two beer away from being two beers away.



    • #3
      I need to read this a few times to actually absorb and understand the logic, but much appreciation for putting this together. Keep this kind of stuff coming. It's great when one takes into consideration as many variables as possible (B2B, Home vs away) and keeps the analysis dynamic (adjusting for improving teams), so that wins/losses are more reflective.

      Like I said, I need to read a few more times. But good stuff.



      • #4
        Awesome as always Dan. I know it's a lot of work but have you thought about doing it league wide? It'd be a pain to set up the first time but you could feed it game logs through the season from basketball-reference.com.
