Seeding Proposal
Some will argue that the RPI and SOS of Tennessee deserve the second position; others will argue that winning a very tough conference trumps losing in the second round. The next four, and their order, are very much in doubt. The contenders:
I think this decision is sound, at least until there is evidence that the regionals can sell out, or come close. However, while the committee has admitted that the decision includes location, they haven't articulated how they make the trade-offs between the "S-curve" and the location considerations. I'm going to propose a formal way to reflect the trade-offs. I'll emphasize now that I don't propose any formula to be applied slavishly. In the same way that the RPI rankings aren't the final rankings, but the starting point for discussion, I simply propose the following approach as a starting point, and I'd urge the committee to make movements if necessary, but always look at the implied "cost" of the move.
The usual way to weigh two factors not on the same basis is to design a cost function, and then to identify the options with the lowest costs. If you are trying to decide between two job opportunities that are comparable except for location and salary, you have to decide whether the advantage of one location outweighs the higher salary in the other (if both location and salary are better for one choice, there is nothing to compare). Whether implicitly or explicitly, you end up deciding that $x per year is worth more or less than the advantage of the other location.
In the same way, I want to assign a cost to each location, and check to see if it offsets the "cost" of moving someone from their natural seeding. With job opportunities, the natural cost function is dollars, and the difficult part is translating location preferences into dollars. With seeding decisions, I'm choosing to express the cost function as seeding position. If a team is naturally seeded fifth, but you end up assigning it to the location that "should" go to the seventh seed, that is a "cost" of two. (Only moves down incur a cost - moving up is good, but it doesn't balance a move down by a different team.) Location is a little harder to quantify. First, while it is easy to calculate distances, we need to calibrate distances to seeding "displacement". Second, if one team has to travel 300 miles, and another 310 miles, it would be silly to treat the second as a meaningfully longer distance. Small differences should be considered equivalent. I'll propose the following - there are four different "distances" a team and its fans have to travel.
Obviously, these aren't hard and fast distinctions. Some people will drive further than 500 miles, some will fly to closer locations. But I think these distances are a reasonable first approximation of how fans decide. Many fans will show up at a game a short drive away, because they can drive and return the same day. A long drive is a higher level of commitment, probably requiring a hotel stay. A short flight is the next level of commitment, requiring a plane ticket, a hotel stay and maybe a car rental. Those elements are all present in the long flight, but the plane fare is likely to be more expensive, and the timing may require an extra day or two away from home and work. I'd also like to work time zones into the formula - changing a couple of time zones is more "costly" than flying the same distance in the same time zone, but I'll work on that as a future refinement.
Here is the table of distance for each of the eleven candidates for the top two seeds:
| Location | School | Miles to Greensboro | Miles to Dayton | Miles to Dallas | Miles to Fresno |
| Storrs | UConn | 659 | 761 | 1650 | 3055 |
| Durham | Duke | 56 | 489 | 1164 | 2644 |
| Knoxville | TN | 286 | 303 | 843 | 2308 |
| Chapel Hill | UNC | 53 | 486 | 1161 | 2641 |
| Columbus | OSU | 406 | 75 | 1040 | 2368 |
| College Park | MD | 324 | 490 | 1343 | 2784 |
| Palo Alto | Stanford | 2739 | 2397 | 1704 | 167 |
| Tempe | AZ St | 2089 | 1806 | 1063 | 602 |
| Norman | OK | 1147 | 872 | 190 | 1472 |
| Nashville | Vanderbilt | 465 | 326 | 663 | 2128 |
| Lafayette | Purdue | 620 | 181 | 934 | 2227 |
Here are the point values for each location, using the following point system:
| Location | School | Greensboro | Dayton | Dallas | Fresno |
| Storrs | UConn | 2 | 2 | 3 | 3 |
| Durham | Duke | 0 | 1 | 2 | 3 |
| Knoxville | TN | 1 | 1 | 2 | 3 |
| Chapel Hill | UNC | 0 | 1 | 2 | 3 |
| Columbus | OSU | 1 | 0 | 2 | 3 |
| College Park | MD | 1 | 1 | 2 | 3 |
| Palo Alto | Stanford | 3 | 3 | 3 | 0 |
| Tempe | AZ St | 3 | 3 | 2 | 2 |
| Norman | OK | 2 | 2 | 1 | 2 |
| Nashville | Vanderbilt | 1 | 1 | 2 | 3 |
| Lafayette | Purdue | 2 | 1 | 2 | 3 |
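The point values in the table are consistent with a simple bucketing of mileage into the four travel categories (short drive, long drive, short flight, long flight). The sketch below is one way to express that. The cutoffs `SHORT_DRIVE_MAX` and `SHORT_FLIGHT_MAX` are assumptions inferred from the two tables, not figures stated in this proposal; only the 500-mile driving limit is mentioned in the text.

```python
# Hypothetical cutoffs in miles. Only the 500-mile figure appears in the text;
# the other two are guesses that happen to reproduce the table.
SHORT_DRIVE_MAX = 175    # assumed: 167 miles scores 0, 181 miles scores 1
LONG_DRIVE_MAX = 500     # rough driving limit mentioned in the text
SHORT_FLIGHT_MAX = 1500  # assumed: 1472 miles scores 2, 1650 miles scores 3


def location_points(miles):
    """Map a travel distance in miles to location points (0 is best, 3 is worst)."""
    if miles <= SHORT_DRIVE_MAX:
        return 0  # short drive
    if miles <= LONG_DRIVE_MAX:
        return 1  # long drive
    if miles <= SHORT_FLIGHT_MAX:
        return 2  # short flight
    return 3      # long flight


# A few distances taken from the distance table.
for school, site, miles in [
    ("Duke", "Greensboro", 56),
    ("Purdue", "Dayton", 181),
    ("OK", "Fresno", 1472),
    ("UConn", "Dallas", 1650),
]:
    print(school, site, miles, location_points(miles))
```

Any cutoffs in the gaps between adjacent table values would work equally well; the exact numbers matter less than the idea that small differences in mileage fall into the same bucket.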
Let's start with the easy part. The first four seeds are likely to be Duke, UNC, TN and UConn. Assigning them to regionals by following the "S-curve" (the highest seed gets the best location) works out fine.
| Team | Natural | Assigned | Location | Seeding Points | Location Points | Total Points |
| Duke | 1 | 1 | Greensboro | 0 | 0 | 0 |
| UNC | 2 | 2 | Dayton | 0 | 1 | 1 |
| Tenn | 3 | 3 | Dallas | 0 | 2 | 2 |
| UConn | 4 | 4 | Fresno | 0 | 3 | 3 |
| Total | | | | | | 6 |
The natural seed is the seed they deserve based upon the committee seeding without geographic considerations. The assigned seed is the same in this case, indicating that the committee did not have to make any moves. Because everyone's assigned seed matches their natural seed, there are zero penalty points for seeding.
The location points are based on the table. Duke gets to go close to home, and that location is a zero point location. UNC gets the next location choice. They cannot go to Greensboro, because Duke is there, but there is a one point location available. Assign all the locations, calculate the location points and add up the points. In this case the point total is six. (The goal is to minimize the point total, and no other assignment of these teams to locations produces a lower total.)
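The scoring rule described so far can be written down in a few lines. This is only a sketch: the function name `seeding_points` and the tuple layout are illustrative choices, and the location points are copied from the table rather than computed.

```python
def seeding_points(natural_seed, assigned_seed):
    """Seeding cost: one point per position moved down, nothing for moving up."""
    return max(0, assigned_seed - natural_seed)


# (team, natural seed, assigned seed, location points at the assigned regional)
top_four = [
    ("Duke", 1, 1, 0),   # Greensboro
    ("UNC", 2, 2, 1),    # Dayton
    ("Tenn", 3, 3, 2),   # Dallas
    ("UConn", 4, 4, 3),  # Fresno
]

total = sum(seeding_points(nat, asg) + loc for _, nat, asg, loc in top_four)
print(total)  # 6

# A natural fifth seed placed in the seventh seed's slot costs two points;
# a team moved up earns no credit.
print(seeding_points(5, 7))  # 2
print(seeding_points(7, 5))  # 0
```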
For the sake of discussion, assume the next four seeds, in order, are:
5. Maryland
6. Ohio State
7. Stanford
8. Arizona State
If the committee follows the "S-curve" without modification, the fifth seed is matched up with the four seed, etc.
This produces the following table:
| S-curve | Natural | Assigned | Location | Seeding Points | Location Points | Total Points |
| Maryland | 5 | 5 | Fresno | 0 | 3 | 3 |
| OSU | 6 | 6 | Dallas | 0 | 2 | 2 |
| Stanford | 7 | 7 | Dayton | 0 | 3 | 3 |
| AZ St | 8 | 8 | Greensboro | 0 | 3 | 3 |
| Total | | | | | | 11 |
The seeding points are at a minimum, because everyone is assigned to their natural seed, but the location points are very high. No one is close to home, everyone has to fly, and three out of four have to take a long flight. Can we do better?
Yes, we can.
Consider the following assignments:
| Better | Natural | Assigned | Location | Seeding Points | Location Points | Total Points |
| Maryland | 5 | 6 | Dallas | 1 | 2 | 3 |
| OSU | 6 | 7 | Dayton | 1 | 0 | 1 |
| Stanford | 7 | 5 | Fresno | 0 | 0 | 0 |
| AZ St | 8 | 8 | Greensboro | 0 | 3 | 3 |
| Total | | | | | | 7 |
In this case, we move Maryland out of its "natural" 5 seed to a 6 seed. It means they will face a tougher opponent, but they get to play closer to home. For Maryland the move is a wash - the gain in location points offsets the loss in seeding points. But it allows us to improve the overall results. OSU "pays" one seeding point, but gains two location points. Stanford gains three location points, and AZ State is unchanged. The committee has modified the "S-curve" assignment, but the total drops from 11 points to 7.
I haven't actually written a program to solve for the best option; I am doing it by eye. The general concept is to identify the seeding decision that minimizes the sum of seeding points and location points.
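With only four teams per seed line there are just 24 possible assignments, so a brute-force search is enough to check the by-eye answer. The following is a minimal sketch, not a proposed tool for the committee: the names `total_points` and `best_assignments` are hypothetical, the location points are copied from the table above, and the seed-to-regional pairing is the S-curve pairing used in this example.

```python
from itertools import permutations

# Location points copied from the points table.
LOCATION_POINTS = {
    "Maryland": {"Greensboro": 1, "Dayton": 1, "Dallas": 2, "Fresno": 3},
    "OSU":      {"Greensboro": 1, "Dayton": 0, "Dallas": 2, "Fresno": 3},
    "Stanford": {"Greensboro": 3, "Dayton": 3, "Dallas": 3, "Fresno": 0},
    "AZ St":    {"Greensboro": 3, "Dayton": 3, "Dallas": 2, "Fresno": 2},
}
NATURAL_SEED = {"Maryland": 5, "OSU": 6, "Stanford": 7, "AZ St": 8}
# S-curve pairing: seed 5 joins seed 4 in Fresno, seed 6 joins seed 3 in Dallas, etc.
SEED_LOCATION = {5: "Fresno", 6: "Dallas", 7: "Dayton", 8: "Greensboro"}


def total_points(assignment):
    """Sum seeding and location points; assignment maps team -> assigned seed."""
    total = 0
    for team, seed in assignment.items():
        total += max(0, seed - NATURAL_SEED[team])  # only moves down cost points
        total += LOCATION_POINTS[team][SEED_LOCATION[seed]]
    return total


def best_assignments():
    """Try every way of handing out the four seeds; return the lowest total and all ties."""
    teams = list(NATURAL_SEED)
    scored = []
    for seeds in permutations(sorted(SEED_LOCATION)):
        assignment = dict(zip(teams, seeds))
        scored.append((total_points(assignment), assignment))
    best = min(score for score, _ in scored)
    return best, [a for score, a in scored if score == best]


best, options = best_assignments()
print(best)
for option in options:
    print(option)
```

For these four teams the search confirms that 7 is the minimum. It also finds a second assignment with the same total (Maryland to Greensboro and AZ State to Dallas), which is the kind of tie the committee would still have to settle by judgment.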
Obviously, there are other choices for the #2 seeds, and the assignment to location will depend on who they are.
I'll reiterate - I wouldn't propose letting a computer make the final selection, but this algorithm is a good way of starting the process, and a good way of measuring the impact of alternatives.
Charlie Creme proposes the following teams as #2 seeds:
I've listed them in order of RPI.
If you follow the strict S-curve, you get the following table:
| S-curve | Natural | Assigned | Location | Seeding Points | Location Points | Total Points |
| Maryland | 5 | 5 | Fresno | 0 | 3 | 3 |
| Oklahoma | 6 | 6 | Dallas | 0 | 1 | 1 |
| Vanderbilt | 7 | 7 | Dayton | 0 | 1 | 1 |
| Stanford | 8 | 8 | Greensboro | 0 | 3 | 3 |
| Total | | | | | | 8 |
If you use Creme's locations:
| Creme | Natural | Assigned | Location | Seeding Points | Location Points | Total Points |
| Maryland | 5 | 5 | Fresno | 0 | 3 | 3 |
| Oklahoma | 6 | 7 | Dayton | 1 | 2 | 3 |
| Vanderbilt | 7 | 8 | Greensboro | 1 | 1 | 2 |
| Stanford | 8 | 6 | Dallas | 0 | 3 | 3 |
| Total | | | | | | 11 |
I don't see a lot of rationale behind the Creme decision. He leaves Maryland in its natural position, with a long way to travel, then moves the other teams around in ways that increase their travel distance.
Can we do better? Yes:
| Better | Natural | Assigned | Location | Seeding Points | Location Points | Total Points |
| Maryland | 5 | 7 | Dayton | 2 | 1 | 3 |
| Oklahoma | 6 | 6 | Dallas | 0 | 1 | 1 |
| Vanderbilt | 7 | 8 | Greensboro | 1 | 1 | 2 |
| Stanford | 8 | 5 | Fresno | 0 | 0 | 0 |
| Total | | | | | | 6 |
In this option, we've penalized Maryland by moving them to a tougher seed, but moved them much closer to home. According to our seeding/location tradeoff, this is a wash. Why would we do that? Because it allows us to do better with the other teams: compared with Creme's placement, Oklahoma moves closer and Stanford much closer.
This page last updated on 11 March 2007.