Last year the UCI (international cycling union) changed the rules for the World Hourrecord. This change sparked a lot of interest from great riders in the cycling community. The goal of the hourrecord is to ride the farthest distance in one hour. This is usually done on a Velodrome, a wooden track of (normally) 250m.
Since the rule change there have been several attempts at the records, including that of Jack Bobridge. After his attempt I got a hold of his laptimes and I thought I could make a nice visualization with it.
The dataset contains the lapnumber, lapname and laptime. Knowing the track length (250m) I could then add variables like average laptime and average speed. Knowing the current records (51852m) I could add other interesting variables like the average laptime for the record and required laptimes. This last one is particularity interesting, more on that later.
hourrecord <- 51852 lapdistance <- 250 allhourdata <- allhourdata %>% mutate(Actual.laptimes.cummulative = cumsum(Actual.laptimes)) %>% mutate(Average.actual.laptimes = Actual.laptimes.cummulative/Id) %>% mutate(Target.laptimes=(60*60/(hourrecord/lapdistance))) %>% mutate(Target.laptimes.cummulative = cumsum(Target.laptimes)) %>% mutate(Difference = Actual.laptimes - Target.laptimes) %>% mutate(Difference.cummulative = cumsum(Difference)) %>% mutate(Projected.distance = (Id*lapdistance)/Actual.laptimes.cummulative*3.6) %>% mutate(Required.laptimes.for.record = ((60*60)-Actual.laptimes.cummulative)/((hourrecord-(lapdistance*Id))/lapdistance)) %>% mutate(Required.speed = 250/Required.laptimes.for.record*3.6) %>% mutate(laptime.to.slow = ifelse(Actual.laptimes < Required.laptimes.for.record, 0, Actual.laptimes))
I started of designing a static visualization but soon realized an animation would far better suit the dataset and the activity the rider performed. I needed to make a loop to create a plot for each lap. I used two variables so I could generate just a few plots when needed.
begin_ani <- 1 end_ani <- nrow(allhourdata) for (i in begin_ani:end_ani){ ## Code below comes here }
The first part of the loop was to select the data to create one plot for that specific lap. I wanted to show the laptimes he had done up until that lap and show the status of the recordattempt in that lap. I wanted to show the following metrics:
- Points for each laptime
- A line representing the current record
- A line representing the current average laptime/speed
- A line representing the laptimes/speed needed for the remaining laps to beat the record
To do that I filtered the dataset to get the data up until that lap (for the points) and selected the variables for that lap and create new data frames to display the lines. The color coding I added to be able to show if he was on track to beat the record( green) or not (red).
var <- allhourdata[i,] hourdata <- allhourdata %>% filter(Id <= i) avg_color <- ifelse(var[,10]<hourrecord/1000, "#800000", "#006600") hline_avg <- data.frame(Id=1:206, avg=as.numeric(var[,5])) hline_req <- data.frame(Id=1:206, req=as.numeric(var[,11])) hline_rec <- data.frame(Id=1:206, rec=as.numeric(var[,6]))
Now the interesting part, creating the plot. I’m a big fan of ggplot2 because of the countless possibilities to customize. I did borrow some of the layout from examples on the internet.
Step 1: Setting up the plot with basic themes, colors and margins.
gg <- ggplot() gg <- gg + theme_bw() + theme(panel.background=element_rect(fill="#F0F0F0")) + theme(plot.background=element_rect(fill="#F0F0F0")) + theme(panel.border=element_rect(colour="#F0F0F0")) + theme(panel.grid.major=element_line(colour="#D0D0D0",size=.5)) gg <- gg + ggtitle(paste("Hourrecord attempt Jack Bobridge: ", as.character(var[,2]))) + theme(plot.title=element_text(face="bold",hjust=-.08,vjust=2,colour="#3C3C3C",size=20)) + theme(legend.position="none") + ylab("Lap time") + xlab("Lap") + theme(axis.text.x=element_text(size=11,colour="#535353",face="bold")) + theme(axis.text.y=element_text(size=11,colour="#535353",face="bold")) + theme(axis.title.y=element_text(size=11,colour="#535353",face="bold",vjust=1.5)) + theme(axis.title.x=element_text(size=11,colour="#535353",face="bold",vjust=-.5)) gg <- gg + theme(plot.margin = unit(c(1.5, 1.5, 1, 1), "cm"))
Step 2: Adding the laptimes. I also added an annotation with the laptime.
## Laptimes gg <- gg + geom_point(data=hourdata, aes(x=Id, y=Actual.laptimes), colour="#000099", size=3) + annotate("text",x=as.numeric(var[,1])+2,y=as.numeric(var[,3]),label=round(as.numeric(var[,3]),digits=3),colour="#000099", hjust = 0)
Step 3: Adding the lines. Each lines is coded separately.
## Required Laptimes gg <- gg + geom_line(data=hline_req, aes(x=Id, y=req), colour=avg_color, size=1, alpha=1/2) + annotate("text",x=206+2,y=as.numeric(var[,11]),label=round(var[,12],digits=3),colour=avg_color, hjust = 0, alpha=1/2) + annotate("text",x=0,y=as.numeric(var[,11]),label=round(var[,11],digits=3),colour=avg_color, hjust = 1, alpha=1/2) + annotate("text",x=206,y=as.numeric(var[,11])+0.075,label="Required for record",colour=avg_color, hjust = 1, alpha=1/2) ## Current Record line gg <- gg + geom_line(data=hline_rec, aes(x=Id, y=rec),colour="#909090", size=1, linetype=2) + annotate("text",x=206+2,y=as.numeric(var[,6]),label="51.852",colour="#909090", hjust = 0) + annotate("text",x=0,y=as.numeric(var[,6]),label=round(var[,6],digits=3),colour="#909090", hjust = 1) ## Average speed/laptime line gg <- gg + geom_line(data=hline_avg, aes(x=Id, y=avg),colour=avg_color, size=1, alpha=1/2) + annotate("text",x=206+2,y=as.numeric(var[,5]),label=round(var[,10],digits=3),colour=avg_color, hjust = 0, alpha=1/2) + annotate("text",x=0,y=as.numeric(var[,5]),label=round(var[,5],digits=3),colour=avg_color, hjust = 1, alpha=1/2) + annotate("text",x=1,y=as.numeric(var[,5])+0.075,label="Average",colour=avg_color, hjust = 0, alpha=1/2)
Step 4: Changing the scale of the plot. In other words, zooming in on the range where most of the laptimes are. I’ve zoomed in to -5 laps to create some room for the laptimes.
gg <- gg + scale_x_continuous(minor_breaks=0,breaks=seq(0,200,50),limits=c(-5,216)) + scale_y_continuous(minor_breaks=0,breaks=seq(16, 18.5, 0.5)) + #,limits=c(16, 18.5)) + coord_cartesian(ylim=c(15.9, 18.6)) + ## zoom only, no cutoff data theme(axis.ticks=element_blank())
Step 5: Show and save the laptimes.
print(gg) ggsave(file=paste("pr",i,".jpg", sep=""))
Step 6: Here you could add code to create the animated GIF. I tried it but ran out of memory on my PC, so I did it by hand.
files = sprintf('pr%d.png', begin_ani:end_ani) im.convert(files, output = 'hourrecord.gif')
The result is the animated gif below. I’m pleased with the result (I ‘m sure the code can be better).
The general opinion after the record attempt was that it was a close call. But in the animation you can see that it all lost at 3/4 of the hour. The important variable in this is the required laptime, that is what he needs to ride to make it to the record. He does get to that laptime from around 100 laps. He did a great effort and was probably seriously hurting and that is what people saw and probably what made them think he was still able to make it until a few minutes before the end. But the data shows that was not the case.
Update 9-2-2015: Yesterday another attempt was made by Rohan Dennis. He made it and set a new World Record wth a distance of 52.491 meters.
You can find the dataset and code on GitHub.