It’s late and I have no intention of staying up past midnight for another night only to be woken by my darling, love-of-my-life three-year-old at 6 o’clock wanting to play Angry Birds.
All that said, I have been messing around with collecting various Opta data on Deshorn Brown and DeAndre Yedlin for our upcoming Saturday podcast, which is going to talk specifically about limited data analysis and cross-positional rankings. Without giving away too much, it gave me an idea for a plus-minus system that could potentially work if balanced correctly.
There are basic good things and bad things that occur on the pitch during the 90. Most of those things, those that involve the ball, are recorded by Opta. If you could somehow weight those events and assign a value to each of them, then you could potentially have a system that inherently grades players for how they perform on the ball across the board.
What if you valued a key pass at one-third of a goal, given that a third of key passes lead to goals? Maybe a clearance or block is worth whatever the associated average is for preventing goals from being scored.
Somewhere there could be a point scale that is attributed based upon how many times those events occur on average between goal scoring events and in game states.
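To make that concrete, here is a minimal sketch of what such a point scale might look like. Only the key-pass weight comes from the one-third figure above. The other weights are placeholders I made up, and the event names are hypothetical labels, not Opta's actual event types.

```python
from collections import Counter

# Hypothetical weights in "goal units". Only the key-pass value comes from
# the text above; the rest are placeholders to be replaced by measured averages.
EVENT_WEIGHTS = {
    "goal": 1.0,
    "key_pass": 1.0 / 3.0,
    "clearance": 0.05,   # placeholder
    "block": 0.05,       # placeholder
    "turnover": -0.10,   # placeholder
}

def plus_minus(events, weights=EVENT_WEIGHTS):
    """Sum weighted on-ball events; unknown event types count as zero."""
    counts = Counter(events)
    return sum(weights.get(name, 0.0) * n for name, n in counts.items())

# One player's on-ball events in a match (invented for illustration).
match_events = ["key_pass", "key_pass", "key_pass", "turnover", "clearance"]
print(round(plus_minus(match_events), 2))
```

Swapping in weights that depend on game state would only mean passing a different `weights` table for each state.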
Matty has a nice little utility that he’ll be presenting soon regarding game states. It should, to undersell it a bit, be grand and possibly allow for these events to be contextualized and valued in a more meaningful capacity.
I don’t know–maybe I’m drunk with coffee and in need of sleep.
I think we need to start measuring distances passed and dribbled, and maybe distances of passes received. The trick is to figure out how to weight the negative and wide ball movement compared to the movement straight at the goal. Working through these concepts is the purpose of my blog, so please stop by and leave your thoughts on what I have posted so far. Also important is a player’s rate of turnovers. I think we can apply the concept of an unforced error to soccer as well. Being able to say a player had 600 yards of passing in a game, 250 yards of dribbling, a turnover rate of 0.28, and 1 error, in addition to the traditional stats like goals, shots, assists, tackles, and blocked shots, would be very informative and insightful into player performance.
Since every ball movement on the field translates into movement on an (x,y) grid, we can calculate how much each ball movement actually moves the ball downfield or towards goal. There are two measures I can think of that are important as far as distances on the field go: one measuring straight down the field and straight across it, and the other as a series of arcs around the goal mouth measuring distance from the goal line in the goal to the ball. I think both are important and so we need to come up with some sort of averaged distance using both measures for each pass and each dribbling action. I also think keeping track of the negative and wide movement is important too since this motion helps open up spaces in other parts of the field. Obviously it’s more important to move the ball closer to goal, because your team isn’t going to score unless you get the ball near the goal. But you can’t just go straight down the field without mixing it up by kicking it out wide or occasionally turning it around to relieve pressure.
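Here is a rough sketch of those two measures for a single pass or dribble. The pitch size, the attacking direction, and the simple 50/50 average are all assumptions on my part, not anything measured.

```python
import math

# Assumed pitch size in yards; the attack moves toward x = PITCH_LENGTH.
PITCH_LENGTH = 115.0
PITCH_WIDTH = 75.0
GOAL = (PITCH_LENGTH, PITCH_WIDTH / 2.0)  # centre of the goal line

def ball_movement(start, end):
    """Break one ball movement from start to end (x, y) into distance measures."""
    x0, y0 = start
    x1, y1 = end
    downfield = x1 - x0              # negative for a backward pass
    lateral = abs(y1 - y0)           # wide movement, in either direction
    # Arc measure: how much closer to the goal mouth the ball ends up.
    toward_goal = math.dist(start, GOAL) - math.dist(end, GOAL)
    return {
        "total": math.dist(start, end),
        "downfield": downfield,
        "lateral": lateral,
        "toward_goal": toward_goal,
        "blended": (downfield + toward_goal) / 2.0,  # assumed equal weighting
    }

# A square ball from near the corner to the front of goal.
print(ball_movement((110.0, 10.0), (110.0, 37.5)))
```

The square ball in the example gains nothing downfield but a lot on the arc measure, which is the kind of case where the two measures disagree and an average of both seems useful.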
I think with the right data, the value of a wide pass, for instance, could be estimated. The issue is context. Getting too specific on context means that each situation lives in its own sample size of 1. Not being specific enough means that there are confounding variables at play.
I think knowing how many defenders are behind the ball for each pass and dribble, and the proximity of the closest defender to the ball, would provide good, quantitative context without getting too specific.
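A minimal sketch of those two context numbers, assuming defender positions were available as (x, y) points, which on-ball event data alone would not supply. I am also assuming the attack moves toward larger x, so a defender "behind the ball" is one with a larger x than the ball.

```python
import math

def pass_context(ball, defenders):
    """Return (defenders behind the ball, distance to the nearest defender)."""
    # Defenders positioned between the ball and the goal line they defend.
    behind_ball = sum(1 for d in defenders if d[0] > ball[0])
    # Nearest defender in any direction; None if no defenders are given.
    nearest = min((math.dist(ball, d) for d in defenders), default=None)
    return behind_ball, nearest

# Invented positions for illustration.
ball = (50.0, 30.0)
defenders = [(60.0, 30.0), (40.0, 30.0), (53.0, 34.0)]
print(pass_context(ball, defenders))
```

Two small numbers per pass should keep the buckets coarse enough that each one still holds a decent sample.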
I like your thought about how the (x,y) grid system would need to account for cases where the ball is close to goal. “Forward” doesn’t have the same meaning anymore in the attacking third.