I've updated the Graded Stakes results and the Divisional Standings, as well as calculated the speed figure bonus points for winning races throughout the season. You can find the file at the link below (via Google Drive).
The whole discussion of assigning points based on the grade of the race is something I've grappled with all season long. On the one hand, the grades should represent the correct hierarchy of race quality during any given year. On the other hand, as we've discussed many times in the past, not all graded races are made alike. Some Grade 1s turn out to be weaker than a Grade 2, and so on. Most of the time, the Grade 1s are the top races, but not always.
One thing that continues to bother me about the graded stakes system (and the assignment of those grades) is that races have their grades based on the quality of the fields in the past, not the quality of the actual field that runs in the race in the current year. This seems counter-intuitive to me. Take this year's Grade 1 Santa Anita Oaks as a prime example: the race was a Grade 1 (as it typically is), but this year's running turned out to be quite soft when compared to all other graded stakes races for 3YO fillies during the season. Among the 3YO Filly races, the winning speed figure from the SA Oaks was the slowest of any Grade 1 in the division this year and ranked 32nd of 39 within the division overall. Based on the figs, the race was a Grade 1 in name only.
If we truly want to operate a grading system that accurately reflects the quality of the race in a particular year, wouldn't it make more sense to apply the grade at the end of the season, when we can actually determine how strong a race was in comparison to all other races? Perhaps we don't want to assign grades purely based on speed figures, since there is certainly an element of subjectivity with the numbers. Perhaps field size needs to be considered, or the performance of the horses exiting the race throughout the year. Regardless, basing the grade of a race on prior runnings is somewhat useless, since the results of one year have little to do with the quality of the field the next. Some races are quite prestigious and will almost always draw high quality fields, but not always.
Let's go back to the 3YO Filly races from 2012 and the breakdown of G1, G2 and G3 designated events. There were 12 Grade 1s, 12 Grade 2s and 15 Grade 3s. If we were to rank the races based on the comparative speed figures of the winner we'd come up with the following top 12 races in the division:
G1: Alabama, Acorn, Gazelle, CC American Oaks, Cotillion
G2: Indiana Oaks, Delaware Oaks, Forward Gal, Raven Run
G3: Honeybee, Eight Belles, Victory Ride
Only five of the 12 3YO Filly Grade 1 races ranked in the top 12 overall based on speed figures and, as mentioned above, the Santa Anita Oaks ranked the worst of all the Grade 1s.
So what does all this mean? Well, for one, it suggests that pre-assigning a grade to a race (and then having that grade matter so much when it comes to things like pedigree and auctions) doesn't accurately account for the relative strength or weakness of that specific race. Over the course of several years or a decade it's possible to separate the top races from the rest of the pack, but is that what the grading system is trying to accomplish? Are grades supposed to indicate past race strength or current? Given the importance of black type races in the sales catalogs, I would think current strength is the more important designation. Perhaps we need a dual grading system: an overall ten-year trend and a current-year grade.
Willa B Awesome wins the 2012 Grade 1 Santa Anita Oaks, something that will be proudly displayed on her pedigree for the rest of her career, yet we have at least some evidence to indicate that her victory was anything but Grade 1 quality.
Turning to the current divisional standings, another component that I've thrown into the mix with the bonus is a sliding scale based on how other races match up to the top rated race. The goal with this exercise is to put a race in its proper context within the division.
When I started this project I had planned on simply assigning a bonus based on where the average speed figures fell - top 25% get one bonus, top 15% get another, and top 10% get the biggest bonus. The more I thought about it, the more I had serious problems with that process. First, while that bonus structure would separate some of the performances, I would still be lumping races together in groups instead of allowing them to stand on their own merits. Second, and inextricably related, assigning blanket bonus points for the top 5% (or 10%, or whatever) diminishes the importance of a big, dominating race by a horse.
In order to try and separate out the performances I came up with a crude sliding scale - the top performance in every division receives a 50 point bonus and all other winners receive a bonus based on how their win compares to the top horse's. For example, let's look at the 3YO Filly and 3YO Colts divisions and the top five races, as ranked by comparative speed figures.
[The spreadsheets with the rate and bonus calculations aren't included in the above link due to size limitations; I've got so many formulas working that the sheet is starting to bog down. I'll link to the final sheet at the end of the season with simply the values to limit the size of the file.]
|Horse|Race|Total|Rate|Bonus|
|---|---|---|---|---|
|Grace Hall|Indiana Oaks|105.0|0.972|48.61|
|Grace Hall|Delaware Oaks|104.0|0.963|48.15|
|On Fire Baby|Honeybee|94.0|0.870|43.52|
|I’ll Have Another|Preakness|153.0|0.860|42.98|
|Fort Loudon|Carry Back|149.0|0.837|41.85|
The total is simply a sum of the inverse rank of each horse based on the speed figures. For example, let's say there are 10 races within the division and one horse/race had the top speed figure under all three figure systems that I'm using (Beyer, Bris and Equibase); that would be 10 points for each top figure and 30 total points. The "Rate" is simply the total points divided by the total of the top performer. The top performance has a perfect rate of 1.000, and the rates of the races that rank below it reflect how far they deviate from the top.
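To make the arithmetic concrete, here is a minimal sketch of the total and rate calculation in Python. It is an illustration under stated assumptions, not the actual spreadsheet: the function names, the four races, and their figures are all made up, and the tie rule (tied figures share the higher point value) is a guess, since ties aren't addressed above.

```python
def inverse_rank_points(figures):
    """Map each race to its inverse rank within one figure system.

    With n races, the highest figure earns n points and the lowest earns 1.
    Tie handling is an assumption: tied figures share the higher point value.
    """
    n = len(figures)
    ordered = sorted(figures.values(), reverse=True)
    # ordered.index(fig) is the 0-based position of the first race with that figure
    return {race: n - ordered.index(fig) for race, fig in figures.items()}


def totals_and_rates(systems):
    """Sum inverse-rank points across figure systems, then divide by the top total."""
    totals = {}
    for figures in systems.values():
        for race, points in inverse_rank_points(figures).items():
            totals[race] = totals.get(race, 0) + points
    top = max(totals.values())
    return {race: (total, total / top) for race, total in totals.items()}


# Made-up figures for four hypothetical races, for illustration only.
systems = {
    "Beyer":    {"Race A": 100, "Race B": 95, "Race C": 90, "Race D": 85},
    "Bris":     {"Race A": 102, "Race B": 99, "Race C": 93, "Race D": 94},
    "Equibase": {"Race A": 108, "Race B": 101, "Race C": 97, "Race D": 96},
}

results = totals_and_rates(systems)
for race, (total, rate) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{race}: total={total}, rate={rate:.3f}")
```

In this toy division Race A sweeps all three systems, so it earns 4 + 4 + 4 = 12 points and a rate of 1.000; every other race is measured against that 12.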
The bonus points begin at 50 for the top performance in each division and are reduced based on the rate difference all the way down to the lowest winning performance in the division. The lowest performance could receive 10 points, 3 points, 1 point; there is no set amount of points since the bonus is based completely on how that number relates to the top rate.
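The numbers in the table are consistent with the bonus being a straight linear scale: 50 times the unrounded rate. The sketch below checks that reading against the five rows shown. The function name is hypothetical, and the top totals of 108 (fillies) and 178 (colts) are not stated anywhere above; they are back-solved from the published rates, so treat them as an assumption.

```python
TOP_BONUS = 50.0  # bonus awarded to the top performance in each division


def sliding_bonus(total, top_total, top_bonus=TOP_BONUS):
    """Return (rate, bonus) for a race, scaling the bonus linearly by rate."""
    rate = total / top_total
    return rate, top_bonus * rate


# Totals come from the table. The top totals (108 and 178) are inferred
# by back-solving from the published rates; they are an assumption.
rows = [
    ("Grace Hall", "Indiana Oaks", 105.0, 108.0),
    ("Grace Hall", "Delaware Oaks", 104.0, 108.0),
    ("On Fire Baby", "Honeybee", 94.0, 108.0),
    ("I'll Have Another", "Preakness", 153.0, 178.0),
    ("Fort Loudon", "Carry Back", 149.0, 178.0),
]

computed = []
for horse, race, total, top_total in rows:
    rate, bonus = sliding_bonus(total, top_total)
    computed.append((round(rate, 3), round(bonus, 2)))
    print(f"{horse:<18} {race:<14} rate={rate:.3f} bonus={bonus:.2f}")
```

Under those inferred top totals, each row reproduces the rate and bonus in the table to the displayed precision, which is why the lowest winner in a division has no fixed floor: its bonus is simply whatever 50 times its rate works out to.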
The "rate" and "bonus" are useful for measuring the performances within the division but can also be used to assess the relative strength or weakness of a division as compared to a different division, at least in terms of the top performance. For example, looking at the 3YO Fillies vs. 3YO Colts, we note that Questing's Alabama and Paynter's Haskell rank as the top performance in each division. However, Paynter's victory, when compared to his rivals, was stronger than Questing's. Grace Hall's Indiana and Delaware Oaks produced rates of 0.972 and 0.963, not that far off from the top effort. On the colts side, Bodemeister's and Trinniberg's races had rates of 0.933 and 0.921, a bigger gap than on the fillies side.
(As a quick aside, based on this analysis, Royal Delta was not only the best older female, she was the best by far. She had three of the top four performances on the season and her top race, the Fleur de Lis, was miles better than anyone else.)
One other piece of analysis I did using this "system" was to sort every single graded stakes race on the season - regardless of surface, age, sex - and simply compare the winning speed figures. Oddly enough, the results were quite intuitive when you looked at how the numbers fell.
Overall, the highest rated race based on compared/ranked speed figures was Shackleford's win in the Met Mile. Out of the 476 graded stakes races run (so far) in 2012, his Met Mile earned the 4th highest Beyer and Bris figures and the 12th highest Equibase figure. That race was just a millimeter better than Calibrachoa's win in the Tom Fool and Fort Larned's win in the Classic. Flat Out's win in the Jockey Club Gold Cup came back as the 4th best race overall. Generally speaking, the top performances tended to come in the Classic, Dirt Mile and Sprint categories. Groupie Doll's win in the Breeders' Cup Filly & Mare Sprint was 10th overall; she was the only female in the top 20, and she appeared twice. (Her win in the Humana Distaff ranked 20th.)
So, those are the best performances. What about the bottom of the spectrum? Appropriately, most of the bottom 20 performances are juvenile races; this makes perfect sense given the still developing and sometimes dubious nature of many juvenile graded stakes races. However, there are three non-juvenile races in the bottom 20: Dayatthespa's victory in the Appalachian (459th of 476), Volcat's win in the Virginia Oaks (463rd of 476), and Toccetive's win in the Canadian Derby (467th of 476).
The lowest rated race on the year is a tie between Blueeyesintherein's win in the Debutante and Moonwalk's victory in the Jessamine.
A few other factoids at this point - below are the lowest and highest figures overall in graded stakes races. Interestingly enough, none of the three figure makers in this analysis agreed on the highest or lowest race on the year.
Lowest Beyer (66): Gold Edge - AW Lassie
Lowest Bris (82): Volcat - Virginia Oaks
Lowest Equibase (69): Blueeyesintherein - Debutante
Highest Beyer (117): Fort Larned - BC Classic; Wise Dan - Ben Ali
Highest Bris (117): Flat Out - Jockey Club Gold Cup
Highest Equibase (129): Nates Mineshaft - New Orleans Hcp.
As I've been thinking about changes or elements I'd like to implement into this process in the future, I've come across several things that could strengthen the analysis.
- Pull the speed figures for each of the top three runners and assign a bonus simply on best performances, regardless of whether it's a winning performance or a second, third, etc. A horse that loses by a nose in the best race of the year is missing some of the credit that should come along with a big race. Unfortunately, there's a big issue preventing me from adding figs for the top three finishers: while I can find the top 3 published figs from Beyer and Equibase, the Bris numbers are much harder to come by.
If I can figure out a way to add in top 3 figs from Bris (and perhaps I'll need to just take a guess based on beaten lengths), I think that would strengthen the analysis.
- I would like to add a field size component although I'm not exactly sure how to incorporate such a number or if it would add any utility.
- I'd also like to come up with some kind of quality of field component whereby a race might be able to accumulate bonus points based on the subsequent success of the field. The only problem with that is races early in the year are at an extreme disadvantage to those run at the end of the year.
- Finally, if I can add the figures for the top 3 finishers I might just do away with the 35, 25, 15, etc. points for Win, Place, Show for G1, G2 and G3 races. The reasoning for that would be to eliminate the "graded bias" and simply look at the quality of the race itself.
With just a handful of graded stakes events left on the calendar, I wouldn't expect any of the numbers to change dramatically but it's possible we could see some movement in the juvenile division based on a couple of stakes out at Hollywood Park.