A New Madness to March: Creating a Bracket Optimization Model (Part Two)

Submitted by mgoDAB on March 13th, 2023 at 10:42 PM

Read part one: HERE. TL;DR for part one: For the last several years, I've worked on building a bracket optimization model for March Madness designed to maximize my odds of winning pool(s). Part one gets into the nuts and bolts of exactly how it works, but essentially it consists of three main components:

  • (1) an ensemble model of Kenpom, Torvik, and BPI rating systems to derive win probabilities for every possible matchup and examine percentage chances for every team to advance to each round of the tournament;
  • (2) public selection data from ESPN used to simulate competing bracket entries across different size pools and identify misalignment in value; and
  • (3) an input page whereby I create my own bracket and calculate the probability that such bracket wins a pool, measuring it relative to tens of thousands of simulations of tournament results and bracket entries I would compete against in pool sizes of 10, 25, 50, 80, and 100 entries. 
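For readers who think better in code than in spreadsheets, the three components above can be sketched as a toy Python simulation. This is not the actual workbook (which lives in Excel). The four-team field, the ratings, the public pick shares, the logistic link in `win_prob`, and the two-round scoring are all made-up assumptions chosen to keep the example small.

```python
import random

# All numbers below are invented for illustration; none come from the model.
RATINGS = {"A": 28.0, "B": 24.0, "C": 22.0, "D": 15.0}       # stand-in for blended ratings
SEMIS = (("A", "D"), ("B", "C"))                              # semifinal pairings
PUBLIC_SEMI = {"A": 0.90, "D": 0.10, "B": 0.55, "C": 0.45}    # public pick share per semifinal
PUBLIC_TITLE = {"A": 0.55, "B": 0.20, "C": 0.20, "D": 0.05}   # public champion pick share
POINTS = (10, 20)                                             # points per correct semifinal / title pick


def win_prob(team, opponent, scale=10.0):
    """Assumed logistic link from rating gap to win probability."""
    return 1.0 / (1.0 + 10 ** (-(RATINGS[team] - RATINGS[opponent]) / scale))


def play(team, opponent, rng):
    """Simulate one game and return the winner."""
    return team if rng.random() < win_prob(team, opponent) else opponent


def simulate_result(rng):
    """One simulated tournament: (semi 1 winner, semi 2 winner, champion)."""
    w1 = play(*SEMIS[0], rng)
    w2 = play(*SEMIS[1], rng)
    return (w1, w2, play(w1, w2, rng))


def sample_public_entry(rng):
    """One competing bracket drawn from public pick shares, kept internally consistent."""
    champ = rng.choices(list(PUBLIC_TITLE), weights=list(PUBLIC_TITLE.values()))[0]
    picks = []
    for pair in SEMIS:
        if champ in pair:
            picks.append(champ)  # the picked champion must also win its semifinal
        else:
            picks.append(rng.choices(pair, weights=[PUBLIC_SEMI[t] for t in pair])[0])
    return (picks[0], picks[1], champ)


def score(entry, result):
    """Points earned by an entry against a simulated result."""
    semis = sum(POINTS[0] for pick, actual in zip(entry[:2], result[:2]) if pick == actual)
    return semis + (POINTS[1] if entry[2] == result[2] else 0)


def pool_win_prob(my_entry, pool_size, n_sims=2000, seed=0):
    """Share of simulations in which my_entry finishes first (ties split evenly)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        result = simulate_result(rng)
        mine = score(my_entry, result)
        rivals = [score(sample_public_entry(rng), result) for _ in range(pool_size - 1)]
        best = max(rivals, default=-1)
        if mine > best:
            total += 1.0
        elif mine == best:
            total += 1.0 / (1 + rivals.count(best))
    return total / n_sims


for size in (10, 25, 50):
    print(size, round(pool_win_prob(("A", "B", "A"), size), 3))
```

The point of the sketch is the shape of the calculation: simulate the tournament, simulate the field you are competing against, score everyone, and count how often your bracket comes out on top at each pool size.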

This is in pretty big contrast to Seth's tool, which gets into the nitty-gritty of injuries, player stats, roster construction, team stylistic stats, recent performance, etc. Rather, my model is a little more centered on game theory (i.e., how do I stack the odds most in my favor to win a simple betting game like a March Madness bracket pool).

 

Happy March Madness. I'm eager to once again make my bracket optimization model available after receiving a lot of positive feedback last year. I've won money from the past three tournaments using this model for my pools, and every year I look forward to making updates to make it even more powerful. You can read my post from last year to get the crash course on how it works, but for this post I mostly want to highlight the yearly updates I've made. The major things I wanted to improve upon were:

  • Have the model be more intuitive and the interface be better visually, specifically the Bracket Entry tab where the user enters their specific selections.
  • Have the model run faster in order to accommodate a greater number of competing bracket entry simulations. Last year the model supported 100,000 competing bracket entry simulations, while this year's edition supports double the amount at 200,000. 
  • Add more functionality to improve decision-making, specifically the new Pools tab.

The main tab you'll work in is Bracket Entry, where, same as last year, you make your own selections through a series of dropdown lists. By doing so, you can observe how your odds of winning a pool change across different pool sizes, summarized in the table at the top of the page. In last year's edition of the model, this tab felt a little clunky, so I made a number of edits to enhance the user experience (including custom error messages when a selection needs to be updated).

In terms of new functionality, the major addition to this year's model is the Pools tab, which allows you to record specific brackets you've submitted in the Bracket Entry tab. Just click the "Record this bracket entry in the 'Pools' tab" button, as shown in the picture above. Following that, you can enter (1) your wager amount and (2) the pool size in order to derive specific win probabilities and expected payouts across multiple bracket submissions. If a single bracket submission is like a "stock", the Pools tab acts like a "portfolio manager" by measuring expected returns and ensuring your odds are maximized and your Champion/Final 4/etc. picks are well diversified.
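As a rough illustration of the "portfolio" idea, here is a minimal Python sketch of the bookkeeping involved. It assumes a winner-take-all pool whose pot equals wager times pool size, which may not match how your pool pays out. The `PoolEntry` class, the `summarize` helper, and every number in the usage example are hypothetical, not taken from the workbook.

```python
from dataclasses import dataclass


@dataclass
class PoolEntry:
    label: str        # which pool this bracket was submitted to
    champion: str     # champion pick, tracked to check diversification
    wager: float      # entry fee
    pool_size: int    # number of entries in the pool
    win_prob: float   # probability this bracket wins the pool (from simulation)

    @property
    def expected_payout(self):
        # Assumption: winner takes the whole pot of wager * pool_size.
        return self.win_prob * self.wager * self.pool_size

    @property
    def expected_profit(self):
        return self.expected_payout - self.wager


def summarize(entries):
    """Aggregate expected returns and champion exposure across all entries."""
    total_wager = sum(e.wager for e in entries)
    total_profit = sum(e.expected_profit for e in entries)
    champions = {}
    for e in entries:
        champions[e.champion] = champions.get(e.champion, 0) + 1
    return {
        "total_wager": total_wager,
        "expected_profit": total_profit,
        "expected_roi": total_profit / total_wager,
        "champions": champions,
    }


# Hypothetical usage: two brackets in two different pools.
portfolio = [
    PoolEntry("office", "Houston", 10.0, 25, 0.08),
    PoolEntry("family", "UCLA", 20.0, 10, 0.15),
]
print(summarize(portfolio))
```

A bracket is only worth entering on these terms if its win probability beats 1 / pool size, which is the break-even point under the winner-take-all assumption.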

If you are interested in using this tool, please leave your email in the comment section (or send me one directly) and I will be happy to share via [email protected]. FYI - It is a large file, and you will need to be a little Excel savvy in order to use it. But I do include directions and a glossary to help you familiarize yourself with how it works.

Best luck with your pickings! Early indications suggest teams like Houston, UCLA, Tennessee, and UConn are some of the most favorable top-seeded picks to advance deeper into the tournament, while teams like Alabama (heavy public favorite) and Kansas (not rated as highly by Kenpom/Torvik/BPI relative to the public's perception) look a little less attractive. I'll leave the rest of the picks to you!

- mgoDAB

 

Also, here's a data dump for fun.

E[PAP] is expected points against the public. I elaborate on this statistic in last year's post and explain its usage in my model. Essentially, it represents the expected incremental point value gained above the national average bracket entry by making a certain selection. I've found this to be the best statistic when identifying strategic picks designed to maximize my odds of winning pools.
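The exact formula is in last year's post, so treat the following only as one plausible reading of the definition above, not as the calculation the workbook performs. The assumption: E[PAP] for a pick is the expected points you earn from it minus the expected points the average public entry earns from that same team. The function name and the example inputs are hypothetical; the round point values are the standard ESPN ones.

```python
# Standard ESPN scoring by round (round 1 = first round, round 6 = championship).
ROUND_POINTS = {1: 10, 2: 20, 3: 40, 4: 80, 5: 160, 6: 320}


def expected_points_against_public(round_number, advance_prob, public_share):
    """Assumed definition: my expected points minus the average public entry's.

    advance_prob: modeled chance the team wins its game in this round.
    public_share: fraction of public brackets picking the team to win that game.
    """
    points = ROUND_POINTS[round_number]
    my_expected = points * advance_prob
    public_expected = points * public_share * advance_prob
    return my_expected - public_expected


# Hypothetical example: a team with a 20% title chance picked by 10% of the public.
print(expected_points_against_public(6, 0.20, 0.10))
```

Under this reading, a pick gains value when the team is likely to advance and few public brackets have it, which is the misalignment the model is looking for.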

Comments

4th phase

March 14th, 2023 at 9:26 AM ^

Thanks for doing this.

This is the same type of model that Ed Feng uses. Optimizing picks based on pool size / public picks.

 

Is E[PAP] cumulative or per round?

mgoDAB

March 14th, 2023 at 5:52 PM ^

The E[PAP] figures shown above are per round, not cumulative. The numbers increase as you move on to later rounds because the point values of the later rounds are higher than earlier rounds (e.g. first round in a standard ESPN format is 10 points, while the championship is 320 points). 
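To make the per-round versus cumulative distinction concrete, here is a small Python sketch using the standard ESPN point values. The `cumulative` helper is just an illustrative name; it turns any list of per-round figures into running totals.

```python
from itertools import accumulate

# Standard ESPN points for a correct pick in each of the six rounds.
ROUND_POINTS = [10, 20, 40, 80, 160, 320]


def cumulative(per_round_values):
    """Running total of per-round values, e.g. per-round E[PAP] for one team."""
    return list(accumulate(per_round_values))


# Points a single team is worth if you pick it correctly through each round.
print(cumulative(ROUND_POINTS))  # [10, 30, 70, 150, 310, 630]
```

Because each round is worth double the one before it, the championship pick alone carries more points than the five earlier rounds of that team's run combined (320 versus 310).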

mgoDAB

March 14th, 2023 at 6:25 PM ^

Interesting. I would still urge you to use the Bracket Entry tab to see how your specific bracket performs relative to all the simulations (particularly the 100-entry simulation). E[PAP] is a tool but not the toolbox, so to speak.

As an interesting case study, in 2019 I was in a 120-person pool (or thereabouts). This was before my model included the simulations component, and I was relying solely on E[PAP]. Turns out that I got 3 of my Final 4 picks correct (MSU, UVa, and Texas Tech). But before the Final 4 games even began, I realized I had a zero percent chance of finishing in the top 3 of the pool (and winning money). That's because I was so focused on E[PAP] in the earlier rounds, and I got unnecessarily aggressive/risky with my picks and diluted my score. Had I picked fewer upsets in earlier rounds, I could have been in the money.

There is a bit of a breakeven point when it comes to selecting picks based on E[PAP]. It's a great way to set your bracket apart with some strategic selections, but it can definitely become counterproductive if you get too upset-happy.

If you do want to do a full E[PAP] bracket, try building your bracket backwards. Start with your champion, championship game, Final 4, etc. This'll help keep you from getting too upset happy.
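Here is a minimal Python sketch of what "building backwards" could look like as a greedy procedure. It is an illustration under assumptions, not how the workbook does it: `build_backwards` is a hypothetical helper, `value[team][round]` stands in for whatever per-round figure you are using (such as E[PAP]), and the four-team example values are invented.

```python
def build_backwards(teams, value, winner=None):
    """Greedy top-down bracket fill.

    teams: team names in bracket order (length must be a power of two).
    value: value[team][rnd] is the value of picking team to win its round-rnd game.
    Returns {round: [picked winners, in bracket order]}.
    """
    n_rounds = len(teams).bit_length() - 1
    picks = {r: [] for r in range(1, n_rounds + 1)}

    def fill(sub, rnd, forced):
        if rnd == 0:
            return
        # A team already picked to win a later round must win here too.
        best = forced if forced is not None else max(sub, key=lambda t: value[t][rnd])
        picks[rnd].append(best)
        half = len(sub) // 2
        for side in (sub[:half], sub[half:]):
            fill(side, rnd - 1, best if best in side else None)

    fill(teams, n_rounds, winner)
    return picks


# Hypothetical values: D looks better than A in round 1, but A is the champion pick.
value = {
    "A": {1: 1, 2: 5},
    "D": {1: 2, 2: 1},
    "B": {1: 1, 2: 3},
    "C": {1: 3, 2: 2},
}
print(build_backwards(["A", "D", "B", "C"], value))
```

In the example, A is locked in as champion first, so the tempting early-round pick of D over A is never made. That is the mechanism that keeps a backwards build from getting too upset-happy.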

4th phase

March 15th, 2023 at 11:29 AM ^

Yeah, I think to maximize you have to start at the championship and work backwards, since those are the highest values. What I see in your spreadsheet is that the max has to include a Final 4 of Alabama, Tennessee, Houston, UCLA, with Houston over Alabama in the championship. Elite 8 has to add UConn, Kansas St, Creighton, Iowa St.

So as a sanity check, that's a final four of the top 2 overall seeds, a 2 and a 4. The Elite 8 is 6/8 top 4 seeds, with the only weirdness being two 6s in Creighton and Iowa St. 

 

The sweet 16 is where we get crazy. 5 SDSU, 10 Utah St., 12 Drake, 7 Texas A&M, 9 FAU, 10 USC OR 7 MSU (makes no difference), 8 Arkansas, and 11 Nevada. That's 4 double digit seeds and two 1 seeds going down. 

 

For picking the 1st rd, E[PAP] is too "noisy", or doesn't change significantly enough, to impact decision making.

HarmonHowardWoodson

March 14th, 2023 at 1:48 PM ^

Interesting that Tennessee is still rated highly after their starting guard tore his ACL a couple weeks ago.

"Best luck with your pickings! Early indications suggest teams like Houston, UCLA, Tennessee, and UConn are some of the most favorable top-seeded picks to advance deeper into the tournament."

mgoDAB

March 14th, 2023 at 3:07 PM ^

This is a great example of how something like Seth’s tool can serve as a complement to mine. I don’t specifically adjust the team ratings for injuries any more than Kenpom/Torvik/BPI would have. I thought of also including evanmiya.com’s ratings for this year’s edition of the model, but didn’t get around to it. I believe he makes manual adjustments for injuries, but even then Tenn is #4 in his rankings currently.
 

Tennessee has been seen as a darling all season by the computers, buoyed by their defense. They rank in the top 5 in nearly all computer rating systems, including NET.

Eng1980

March 17th, 2023 at 1:55 AM ^

Awesome response. I have been filling in brackets since 1980. That means some brackets bust on the first game, and sometimes you have 7 of the final 8 (only to have 1 of the final 4). It is fun, but after a while it really gets boring knowing that the success of your bracket depends more on luck than skill.

Wally Llama

March 15th, 2023 at 10:48 PM ^

Thanks for sharing your hard work! I'm geeking out on this!

The only question I have so far is about how this tool accounts for different scoring systems. The tool uses a built-in 10-20-40-80-160-320 system. I assume that a 1-1-1-1-1-1 system would have a different optimal selection, with heavier emphasis on first round upsets and little penalty for missing the champion.

Can this feature be added for future releases?

4th phase

March 16th, 2023 at 1:06 PM ^

Too late now, but to do that, you go to the "Bracket Winners" sheet then look across row 44. In that row you see a bunch of values that go 320, 160, 160, 80, etc. Change every value in row 44 to 1.

Then go to the "Sim" sheet and look at row 8 from column EJ to column GT. Change all those values to 1 as well. 
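To see why the scoring system matters so much, here is a small Python sketch that computes what share of a bracket's total available points sits in each round under a given scheme. It only assumes the standard 64-team structure (32, 16, 8, 4, 2, 1 games per round); `round_weights` is an illustrative helper, not something in the spreadsheet.

```python
GAMES_PER_ROUND = [32, 16, 8, 4, 2, 1]  # 64-team bracket, first round through championship


def round_weights(points_per_round):
    """Fraction of all available points that each round accounts for."""
    totals = [games * pts for games, pts in zip(GAMES_PER_ROUND, points_per_round)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


standard = round_weights([10, 20, 40, 80, 160, 320])
flat = round_weights([1, 1, 1, 1, 1, 1])

print([round(w, 3) for w in standard])
print([round(w, 3) for w in flat])
```

Under the doubling system every round is worth the same total (320 points each), so the single championship game carries one sixth of the bracket. Under a flat system the first round holds 32 of the 63 available points and the champion pick just 1, which is why the optimal picks would shift toward the early rounds.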

Walmart Wolverine

March 16th, 2023 at 10:43 AM ^

After using this tool and Seth's and Ken Pom (redundant) and Vegas odds, I have all Big 10 teams losing in the first two rounds.

Then took another look at MSU and flipped them to a loss tomorrow because that will be more entertaining and my expert analysis is that whichever team plays better between them and USC will win.