Here is what I have so far to spur some discussion.
Data Set:
January 1, 2015 through most of Saturday's figures. My assistant just submitted a few tracks for that day, and those aren't included. I have several more years I can add, but I kept the sample shorter for now to save time.
Dirt sprints, >6f and <7f
Every race has a 1/4 mile figure, a 1/2 mile figure, and a final time figure; none of them is missing because of a timing error or a suspected timing error.
This gave a total of 48,851 races.
Definitions:
Gap1: 1/4 mile pace figure - 1/2 mile pace figure
- Hot: >8
- Even: >=-8 and <=8
- Cold: <-8
Gap2: 1/2 mile pace figure - final time figure
- Hot: >4
- Even: >=-4 and <=4
- Cold: <-4
Race Labels:
Hot: Gap1 and Gap2 Hot
Solid: Gap1 or Gap2 Hot, the other Even
Even: Gap1 and Gap2 Even
Mild: Gap1 or Gap2 Even, the other Cold
Cold: Gap1 and Gap2 Cold
Peak: Gap1 Cold, Gap2 Hot
Valley: Gap1 Hot, Gap2 Cold
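
To make the rules above concrete, here is a rough Python sketch of how a single race could be labeled. The function and parameter names (classify_gap, label_race, quarter_fig, half_fig, final_fig) are my own placeholders, not anything from the actual database; only the thresholds and label names come from the definitions above.

```python
def classify_gap(gap, threshold):
    """Return Hot, Even, or Cold for one gap, using a symmetric threshold."""
    if gap > threshold:
        return "Hot"
    if gap < -threshold:
        return "Cold"
    return "Even"  # covers -threshold <= gap <= threshold


def label_race(quarter_fig, half_fig, final_fig):
    """Label a race shape from its three figures (hypothetical argument names)."""
    gap1 = classify_gap(quarter_fig - half_fig, 8)  # Gap1 uses +/-8
    gap2 = classify_gap(half_fig - final_fig, 4)    # Gap2 uses +/-4

    if (gap1, gap2) == ("Cold", "Hot"):
        return "Peak"
    if (gap1, gap2) == ("Hot", "Cold"):
        return "Valley"
    if gap1 == gap2:
        return gap1  # Hot, Even, or Cold on both gaps
    if "Hot" in (gap1, gap2):
        return "Solid"  # one Hot, the other Even
    return "Mild"  # one Even, the other Cold


# Made-up figures, purely for illustration
print(label_race(100, 90, 85))  # Gap1 = 10, Gap2 = 5
print(label_race(90, 100, 90))  # Gap1 = -10, Gap2 = 10
```

Note that a gap landing exactly on the threshold (8 for Gap1, 4 for Gap2) counts as Even, per the >= and <= in the definitions.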
These are the totals:
Label    Count   Share of races
Hot      14253   0.29
Solid    14959   0.31
Even      7676   0.16
Mild      8050   0.16
Cold      1422   0.03
Valley    1985   0.04
Peak       506   0.01
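
For anyone who wants to check the arithmetic, this short Python snippet recomputes each label's share from the counts above and confirms they add up to the 48,851 races in the data set. The variable names are just placeholders I picked.

```python
# Counts per race label, copied from the totals above
counts = {
    "Hot": 14253,
    "Solid": 14959,
    "Even": 7676,
    "Mild": 8050,
    "Cold": 1422,
    "Valley": 1985,
    "Peak": 506,
}

total = sum(counts.values())

# Share of all races for each label, rounded to two decimals
shares = {label: round(n / total, 2) for label, n in counts.items()}

print(total)
for label, share in shares.items():
    print(label, counts[label], share)
```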
There is a lot to chew on here, I hope. We can break it down by specific distances later and add more data, surfaces, etc. But this is a pretty good start IMO. I'm happy to hear any suggestions on the labels, but coming up with them isn't easy, trust me!
The next step should be to see what wins in each of these race shapes and compare that to the general population.