Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best Schema to represent NCAA Basketball Bracket

Tags:

What is the best database schema to represent an NCAA mens basketball bracket? Here is a link if you aren't familiar: http://www.cbssports.com/collegebasketball/mayhem/brackets/viewable_men

I can see several different ways you could model this data, with a single table, many tables, hard-coded columns, somewhat dynamic ways, etc. You need a way to model both what seed and place each team is in, along with each game and the outcome (and possibly score) of each. You also need a way to represent who plays who at what stage in the tournament.

In the spirit of March Madness, I thought this would be a good question. There are some obvious answers here, and the main goal of this question is to see all of the different ways you could answer it. Which way is best could be subjective to the language you are using or how exactly you are working with it, but try to keep the answers db agnostic, language agnostic and fairly high level. If anyone has any suggestions on a better way to word this question or a better way to define it let me know in the comments.

like image 881
Ryan Guill Avatar asked Mar 16 '09 12:03

Ryan Guill


People also ask

Who is the most accurate NCAA bracket Predictor?

Heat Check CBB bracketologist Lukas Harkins is one of the most accurate bracketologists in all of college basketball media. Entering the 2021-22 season, Harkins ranks No. 2 in accuracy among more than 130 bracketologists on BracketMatrix.com.

What is the closest to a perfect NCAA bracket?

The odds of a perfect 63-game NCAA bracket can be as high as 1 in 9.2 quintillion — though those are the perfect bracket odds if every game was a 50-50 coin flip.

Who has the best bracket for March Madness?

A neuropsychologist from Columbus, Ohio, Gregg Nigl was perhaps also being modest in his estimation of how in 2019 he managed to achieve what stands as the longest verifiable perfect streak in March Madness bracket history, reaching 49 games before going bust.


2 Answers

The natural inclination is to look at a bracket in the order the games are played. You read the traditional diagram from the outside in. But let's think of it the other way around. Each game is played between two teams. One wins, the other loses.

Now, there's a bit more to it than just this. The winners of a particular pair of games face off against each other in another game. So there's also a relationship between the games themselves, irrespective of who's playing in those games. That is, the teams that face off in each game (except in the first round) are the winners of two earlier games.

So you might notice that each game has two "child games" that precede it and determine who faces off in that game. This sounds exactly like a binary tree: each root node has at most two child nodes. If you know who wins each game, you can easily determine the teams in the "parent" games.

So, to design a database to model this, you really only need two entities: Team and Game. Each Game has two foreign keys that relate to other Games. The names don't matter, but we would model them as separate keys to enforce the requirement that each game have no more than two preceding games. Let's call them leftGame and rightGame, to keep with the binary tree nomenclature. Similarly, we should have a key called parentGame that tracks the reverse relationship.

Also, as I noted earlier, you can easily determine the teams that face off in each game by looking at who won the two preceding games. So you really only need to track the winner of each game. So, give the Game entity a winner foreign key to the Team table.

Now, there's the small matter of seeding the bracket. That is, modeling the match-ups for the first round games. You could model this by having a Game for each team in the overall competition where that team is the winner and has no preceding games.

So, the overall schema would be:

Game:
    winner: Team
    leftGame: Game
    rightGame: Game
    parentGame: Game
    other attributes as you see fit

Team:
    name
    other attributes as you see fit

Of course, you would add all the other information you'd want to the entities: location, scores, outcome (in case the game was won by forfeit or some other out of the ordinary condition).

like image 143
Alex Avatar answered Sep 23 '22 19:09

Alex


For a RDBMS, I think the simplest approach that's still flexible enough to accommodate the majority of situations is to do the following:

  • Teams has [team-id (PK)], [name], [region-id (FK to Regions)], [initial-seed]. You will have one entry for each team. (The regions table is a trivial code table with only four entries, one for each NCAA region, and is not listed here.)

  • Participants has [game-id (FK to Games)], [team-id (FK to Teams)], [score (nullable)], [outcome]. [score] is nullable to reflect that a team might forfeit. You will have typically have two Participants per Game.

  • Games has [game-id (PK)], [date], [location]. To find out which teams played in a game, look up the appropriate game-id in the Participants table. (Remember, there might be more than two teams if someone dropped out or was disqualified.)

To set up the initial bracket, match the appropriate seeds to each other. As games are played, note which team has outcome = Winner for a particular game; this team is matched up against the winner of another game. Fill in the bracket until there are no more winning teams left.

like image 26
John Feminella Avatar answered Sep 25 '22 19:09

John Feminella