The Honest Hypocrite: Simulating the winner of the 2015 Superbowl


Tuesday, 16 March 2010

March Madness Simulations - Game winning probability schemes

Posted on 20:18 by kajal singh
After my recent success with simulating Playoff Fantasy Football, I wanted to apply the same approach to the NCAA Basketball playoffs, known as March Madness. Given the amount of data analysis that I have done over the years (2009, 2009, 2008, 2007, 2007, 2006, 2006), which even enabled me to win one year, I figured that a simulation might help.

My simulation matches up the teams that play in the NCAA bracket and uses one of the schemes below to generate a win probability for a Monte Carlo simulation of the games between the teams.

Probability scheme 1: Sagarin ratings only

The first scheme simply uses the Sagarin ratings to create a probability of team 1 winning: Probability = Team 1 Sagarin / (Team 1 Sagarin + Team 2 Sagarin). I use the Predictor Sagarin rating because that is what he suggests for predicting the score and outcome of a game. For each game I draw a random number from 0 to 1; if it is less than the probability above, team 1 wins, otherwise it's team 2.
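As a rough illustration of the formula and the random draw described above, here is a minimal Python sketch. The function names and the example ratings (95.0 and 70.0) are made up for illustration; they are not actual Sagarin values and this is not the code behind the charts in this post.

```python
import random

def sagarin_win_probability(rating_1, rating_2):
    """Scheme 1: team 1's rating as a share of the two ratings combined."""
    return rating_1 / (rating_1 + rating_2)

def simulate_game(rating_1, rating_2, rng=random):
    """Return 1 if team 1 wins the simulated game, otherwise 2."""
    p_team_1 = sagarin_win_probability(rating_1, rating_2)
    # A uniform draw in [0, 1) below the probability means team 1 wins.
    return 1 if rng.random() < p_team_1 else 2

if __name__ == "__main__":
    # Hypothetical ratings, chosen only to show how flat the ratio is.
    print(round(sagarin_win_probability(95.0, 70.0), 3))
    print(simulate_game(95.0, 70.0, random.Random(2010)))
```

Because the ratings enter only as a ratio, even a large gap in rating moves the probability only a little away from 50%.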



I calculated every team's probability of winning vs. every other team and then plotted this against the difference in seeds. A -15 means a 1 seed played a 16 seed. This scheme results in probabilities that only vary from about 58% at a seed difference of 15 down to about 50% for even seeds. Unfortunately, no 16 seed has ever beaten a number 1 seed, so this scheme leaves the games too evenly matched and does not reflect the history of outcomes in the tournament.

Simulation results with this scheme show the number of simulations out of 1000 that a given seed was the champion. The actual history is here. The results in the chart show far too high a probability that weaker, high-numbered seeds are the champion in these simulations.

A histogram of the teams with seeds and the number of times they are champions in 10,000 simulations shows that Kansas is the most likely winner, but the spread of the data even includes the unlikely play-in winner at the 16 seed as a champion. This simulation is unrealistic.
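The champion counts described above come from playing the whole bracket many times. A minimal sketch of that loop might look like the following; the four-team bracket, the team names and the ratings are hypothetical placeholders, and the single-elimination pairing (adjacent entries play each other) is an assumption about how the bracket list is ordered.

```python
import random
from collections import Counter

def sagarin_probability(team_1, team_2):
    """Scheme 1 probability that team_1 beats team_2."""
    return team_1["rating"] / (team_1["rating"] + team_2["rating"])

def play_tournament(bracket, win_probability, rng):
    """Play one single-elimination bracket and return the champion.

    bracket is a list of teams in bracket order whose length is a
    power of two; adjacent entries meet in each round.
    """
    alive = list(bracket)
    while len(alive) > 1:
        next_round = []
        for i in range(0, len(alive), 2):
            team_1, team_2 = alive[i], alive[i + 1]
            team_1_wins = rng.random() < win_probability(team_1, team_2)
            next_round.append(team_1 if team_1_wins else team_2)
        alive = next_round
    return alive[0]

def count_champions(bracket, win_probability, n_sims=1000, seed=0):
    """Count how often each team wins the title over n_sims brackets."""
    rng = random.Random(seed)
    return Counter(
        play_tournament(bracket, win_probability, rng)["name"]
        for _ in range(n_sims)
    )

if __name__ == "__main__":
    # Hypothetical four-team bracket: 1 plays 16, then 8 plays 9.
    bracket = [
        {"name": "Team A", "seed": 1, "rating": 95.0},
        {"name": "Team D", "seed": 16, "rating": 70.0},
        {"name": "Team B", "seed": 8, "rating": 84.0},
        {"name": "Team C", "seed": 9, "rating": 83.0},
    ]
    print(count_champions(bracket, sagarin_probability))
```

Passing the probability function in as an argument means the same loop can be reused with any of the schemes in this post.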

Probability scheme 2: Seed difference and tournament history only

Another approach is to use the seeds of the teams in the tournament. With 25 years or so of data, I captured the number of times a favorite beat an underdog based on the seed difference. For instance, a 16 seed has never beaten a 1 seed, while 8 vs. 9 seed games are almost 50/50. I use the data from 25 years of the round of 64, round of 32 and round of 16, and then fit a line assuming that even seeds are 50/50 and that a seed difference of 15 (1 vs. 16) will result in a favorite win 99.07% of the time. That figure represents 1 upset in 108 games: although this upset has never occurred in 26 years of data, it will happen someday, and that could be as soon as this year. Thus (26*4+3) wins/(27*4) attempts is 99.07%.
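The two anchor points above (50% at even seeds, 99.07% at a seed difference of 15) define a straight line. Here is a small sketch of that line, assuming simple linear interpolation between the two anchors; the function name and the sign convention (seed of team 1 minus seed of team 2, so -15 is a 1 seed against a 16 seed) are my own choices for the example.

```python
# Favorite's win probability at a seed difference of 15:
# 26 years * 4 games, plus 3 of this year's 4 games, out of 27 * 4 games.
P_DIFF_15 = (26 * 4 + 3) / (27 * 4)

def seed_win_probability(seed_1, seed_2):
    """Probability that team 1 wins, from the seed difference alone.

    Linear in the seed difference: 50% for equal seeds, P_DIFF_15
    for the favorite when the seeds differ by 15.
    """
    seed_diff = seed_1 - seed_2  # negative when team 1 is the favorite
    return 0.5 - (P_DIFF_15 - 0.5) * seed_diff / 15

if __name__ == "__main__":
    for seeds in [(1, 16), (8, 9), (5, 5), (16, 1)]:
        print(seeds, round(seed_win_probability(*seeds), 4))
```

Note that two teams with the same seed always come out at exactly 50% under this scheme, whatever their ratings.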

I did not use the fitted line in the curve above because of its unrealistic probabilities at high seed differences. While this approach captures the history, I feel it neglects the variation between similarly seeded teams as reflected in the Sagarin ratings. Additionally, the history shows pretty wide variations in outcome.

Simulation results with this scheme show the number of simulations out of 1000 that a given seed was the champion. These results are more similar to the historical outcomes, but the matchups between evenly seeded teams will be tossups that ignore the differences as determined by the Sagarin ratings.

A histogram of the teams with seeds and the number of times they are champions in 10,000 simulations shows that Kentucky is the most likely winner, with low-numbered seeds favored to be champions, but I fear that it neglects the difference in teams as represented by the Sagarin ratings. This simulation is unrealistic.


Probability scheme 3: Sagarin ratings scaled by seed difference and tournament history

The final approach combines the two by scaling the average of the Sagarin ratings probability by the expected probability due to seeds as predicted by historical performance. Thus the average probability for teams at a given seed difference follows the historical line. In practice, I add the residuals from the line fitted through the Sagarin rating probabilities to the line fitted by setting the probability at a seed difference of 15 to 99.07% and at even seeds to 50%.
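To make the residual step concrete, here is a sketch under several assumptions: the Sagarin line is an ordinary least-squares fit of the scheme 1 probability against seed difference over every ordered pair of teams, the result is clipped to the range 0 to 1 (my addition, the text does not say how out-of-range values are handled), and the teams and ratings are invented for the example.

```python
P_DIFF_15 = (26 * 4 + 3) / (27 * 4)  # 99.07% for the favorite at 1 vs. 16

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def history_line(seed_diff):
    """Scheme 2 line: 50% at even seeds, P_DIFF_15 at a difference of -15."""
    return 0.5 - (P_DIFF_15 - 0.5) * seed_diff / 15

def combined_probabilities(teams):
    """Map (team 1, team 2) to a win probability for team 1.

    teams is a list of (name, seed, rating) tuples with at least
    two different seeds.
    """
    matchups = []
    for i, (name_1, seed_1, rating_1) in enumerate(teams):
        for j, (name_2, seed_2, rating_2) in enumerate(teams):
            if i != j:
                p_sagarin = rating_1 / (rating_1 + rating_2)
                matchups.append((name_1, name_2, seed_1 - seed_2, p_sagarin))
    slope, intercept = fit_line([m[2] for m in matchups],
                                [m[3] for m in matchups])
    probabilities = {}
    for name_1, name_2, seed_diff, p_sagarin in matchups:
        # How far this matchup sits above or below the Sagarin trend line.
        residual = p_sagarin - (slope * seed_diff + intercept)
        p = history_line(seed_diff) + residual
        probabilities[(name_1, name_2)] = min(1.0, max(0.0, p))  # assumed clip
    return probabilities

if __name__ == "__main__":
    # Hypothetical teams: (name, seed, rating).
    teams = [("A", 1, 95.0), ("B", 1, 93.0), ("C", 8, 84.0), ("D", 16, 70.0)]
    for pair, p in sorted(combined_probabilities(teams).items()):
        print(pair, round(p, 4))
```

In this sketch two teams with the same seed no longer come out at exactly 50%: the better-rated one gets a small edge from its residual, while the 1 vs. 16 matchup stays close to the historical 99%.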

Thus the probabilities reflect the historical data with a more realistic and very rare chance of 16 seeds beating 1 seeds but with the Sagarin ratings to sort between evenly matched teams.

Simulation results with this scheme show the number of simulations out of 1000 that a given seed was the champion. The results are similar to those of the seed difference and history scheme above, but now the Sagarin ratings are included.

A histogram of the teams with seeds and the number of times they are champions in 10,000 simulations shows that Duke is the most likely winner, and low-numbered seeds are still favored, as is true historically. This is the simulation scheme we will proceed with.
Posted in March Madness, NCAA Basketball, simulations, statistics | No comments