Individual Defensive Statistics: Which Ones Matter and Top 10 MLS Defenders

When a car breaks down, a mechanic’s job is to tell you what caused the failure. He or she can generally pinpoint the problem to a specific part reaching the end of its useful life. But have you ever asked a mechanic why your car is working fine? Or which part deserves the most credit for your car running smoothly? Of course not. That would be a waste of everyone’s time. There are many parts to a car and all are doing their job as designed. We never ask why when things are going well.

The same dilemma exists in assessing soccer defenders. After all, most of how we assess defenders has to do with what goals were not scored. And when all the parts of the defenses are working as designed, goals are avoided. But which defenders deserve the credit when goals aren’t scored? It’s like the pointless car question, which parts of the car deserve the most credit when the car runs smoothly?

To even begin this conversation we need to take stock of what data exists for soccer defenders. And just to be clear, I am going to steer clear of looking at a defender’s offensive capability. I want to focus solely on defensive statistics. Whoscored is the only site that offers a collection of defensive statistics, and here is what they have and their definitions.

  • Blocked Shot: Prevention by an outfield player of an opponent’s shot reaching the goal
  • Clearance: Action by a defending player that temporarily removes the attacking threat on their goal/that effectively alleviates pressure on their goal
  • Interception: Preventing an opponent’s pass from reaching their teammates
  • Offside Won: The last man to step up to catch an opponent in an offside position
  • Tackle: Dispossessing an opponent, whether the tackling player comes away with the ball or not

These are the defensive-oriented statistics offered by Whoscored that are tracked at the individual player level. Of course, the other vital defensive statistic is shots conceded but those can’t be attributed to any one player. So then, do any of these statistics matter? First there are a couple of assumptions to iron out.

A defender should be judged by the rate at which he accumulates statistics. So to get to that number we need to adjust these statistics to account for the time that the opponent has the ball. For example, Player A who averages 5 clearances per game might be better than Player B who averages 6 clearances if Player A’s opposition had the ball 20% less often. That would mean player A made more clearances given the opportunities provided to him. So I will adjust all metrics by opposition possession.
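As a rough sketch of that adjustment, the snippet below rescales a per-game count to a common baseline of opponent possession. The function name, the 50% baseline, and the reading of "20% less often" as 40% versus 50% opponent possession are assumptions made for illustration; they are not taken from the original analysis, which works per minute of opponent possession.

```python
def possession_adjusted_rate(per_game_count, opp_possession_share, baseline_share=0.5):
    """Rescale a per-game count to a baseline share of opponent possession.

    opp_possession_share is the fraction of possession the opponents had (0 to 1).
    baseline_share is an assumed common reference point, not a value from the article.
    """
    if not 0 < opp_possession_share <= 1:
        raise ValueError("opp_possession_share must be in (0, 1]")
    return per_game_count * baseline_share / opp_possession_share


# Player B's opponents hold the ball 50% of the time (assumed);
# Player A's opponents hold it 20% less often, i.e. 40% of the time.
player_a = possession_adjusted_rate(5, 0.40)
player_b = possession_adjusted_rate(6, 0.50)

print(f"Player A adjusted clearances: {player_a:.2f}")
print(f"Player B adjusted clearances: {player_b:.2f}")
```

Under these assumed shares, Player A's five clearances scale up past Player B's six, which is the point of the example in the text.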

Since I am trying to assess what goals are not scored, I am going to look at the numbers at the team level first. It is only at the team level that goals can be attributed. After that analysis I will attempt to attribute value to the individual metrics.

sources: whoscored, mlssoccer.com

Here are tackles per game per minute of opponent possession plotted against goals against. Tackles show the strongest correlation of all the variables. In fact, tackles have a slightly stronger correlation with goals against than shots conceded do. Here is a look at shots conceded per minute of opponent possession.

sources: whoscored.com, mlssoccer.com

The two points to the far left represent the LA Galaxy and Sporting Kansas City. They appear adept at limiting shots on goal per minute of opposition possession. They also stand out when looking at offsides won.

Rather than show every graph, here is a table of the defensive statistics, their level of impact and the R squared of the impact in predicting goals against.

Statistic            Goals Avoided per Unit   R squared
Clearances           -0.041                   27.1%
Interceptions        -0.036                   15.1%
Tackles              -0.077                   39.4%
Offsides Won         -0.113                   16.0%
Blocks % of Shots    -0.017                   0.3%

Offsides won is the most impactful of the statistics (has the greatest slope) but there is a weaker correlation than Tackles or Clearances–in other words, there are greater deviations from the trend line. It’s interesting to see that Blocks as a percent of shots has almost no impact on goals allowed.
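For readers who want to reproduce this kind of slope and R squared, here is a minimal sketch of a single-variable least-squares fit. It assumes each statistic was regressed on its own against goals against, which the mention of a trend line suggests but the article does not spell out. The function name is hypothetical and no team data is included.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs.

    Returns (slope, intercept, r_squared). With one predictor,
    R squared equals the squared correlation between xs and ys.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("xs and ys must each vary")
    slope = sxy / sxx                      # change in goals against per unit of the statistic
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)  # share of variance explained by the line
    return slope, intercept, r_squared
```

The slope corresponds to the "Goals Avoided per Unit" column and `r_squared` to the "R squared" column, so a steep slope with a low R squared (as with offsides won) means a large estimated effect with a lot of scatter around the line.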

This is interesting, but what to make of it all? In an ideal world we could compile these statistics into a meaningful metric in order to compare players. The most obvious way to do that statistically would be to run a multivariate regression using all of the statistics.  The trouble with the result is that the statistics end up not being statistically significant predictors when mashed together. So developing a score from these metrics would be a bit of a fool’s errand.

The other option would be to ignore the predictive strength of the variables and just use the goals avoided results as a scalar, multiply them by each player’s statistics, add them up and compile a score. In this case the resulting score would be something we can relate to, as we could say that this player avoids x number of goals per game. However, this would make offsides won the statistic with the greatest importance despite the fact that its correlation is not strong.

To factor in the correlation we could leave the realm of sound statistical practice. We could multiply the goals avoided scalar by the R squared. We could then turn that into an index with the highest metric (tackles) equaling 1. If we did that, here is the resulting table and values for each metric.

Statistic            Goals Avoided per Unit   R squared   GApU x R2   Index
Clearances           -0.041                   27.1%       -0.011      0.37
Interceptions        -0.036                   15.1%       -0.005      0.18
Tackles              -0.077                   39.4%       -0.030      1.00
Offsides Won         -0.113                   16.0%       -0.018      0.60
Blocks % of Shots    -0.017                   0.3%        0.000       0.00
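The arithmetic behind the last two columns can be sketched in a few lines. The slopes and R squared values are copied from the table; the variable names are only illustrative.

```python
# (goals avoided per unit, R squared) for each statistic, from the table above
STATS = {
    "Clearances": (-0.041, 0.271),
    "Interceptions": (-0.036, 0.151),
    "Tackles": (-0.077, 0.394),
    "Offsides Won": (-0.113, 0.160),
    "Blocks % of Shots": (-0.017, 0.003),
}

# Weight each slope by how well the statistic explains goals against
weighted = {name: slope * r2 for name, (slope, r2) in STATS.items()}

# The most negative product is the strongest metric; scale it to 1
strongest = min(weighted.values())
index = {name: round(value / strongest, 2) for name, value in weighted.items()}

for name in STATS:
    print(f"{name:<18} {weighted[name]:7.3f} {index[name]:5.2f}")
```

Run as written, this gives the same index values as the table, with tackles at 1.00.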

Tackles would be the most important statistic followed by offsides won and then clearances and interceptions. It turns out blocked shots have no material value in estimating goals against.

Before I use these numbers to reveal the top 10 MLS defenders, here are the caveats. Obviously this ranking is missing a few vital elements of defending in soccer. The first major omission is positioning. Often a defender being in the right position forces an offense to not make a pass that would increase their chance of scoring. There is no measurement for that but obviously a defender out of position is not a valuable defender. Clearances, interceptions, tackles and offsides won are clear indicators that the player was probably in position to make the play, and they indicate the player succeeding in making the necessary play. But offensive attempts avoided are clearly missing.

The other major omission is the offensive play of the defender. A defender who defends well and represents an offensive threat is that much more valuable. But I’m not trying to solve for that here. I leave that for the subject of another post to integrate passing and offensive numbers to build a better score for defenders.

Here are the top 10 MLS defenders based on this score, using statistics through last week, for players with a minimum of four appearances.

Rank  Name                 Team                 Tackles  Intercepts  Off Won  Clears  Defender Score
1     José Gonçalves       New England Rev.     1.6      2.4         2.0      11.2    7.376
2     Giancarlo Gonzalez   Columbus Crew        2.1      2.9         1.9      9.3     7.203
3     Norberto Paparatto   Portland Timbers     1.8      4.8         1.3      9.3     6.885
4     Carlos Bocanegra     CD Chivas USA        1.5      3.6         2.1      8.9     6.701
5     Andrew Farrell       New England Rev.     2.9      2.4         0.3      8.3     6.583
6     Jamison Olave        New York Red Bulls   1.9      3.1         1.7      6.7     5.957
7     Victor Bernardez     San Jose Quakes      1.5      2.8         0.7      9.5     5.939
8     Matt Hedges          FC Dallas            1.5      3.9         0.9      8.5     5.887
9     Eric Avila           CD Chivas USA        4.0      2.4         0.8      2.3     5.763
10    Chris Schuler        Real Salt Lake       1.8      2.8         0.5      8.3     5.675
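The scores in the table appear to be a weighted sum of each player's per-game figures, using the rounded index values as weights. That reading is inferred from the numbers rather than stated in the text, so treat the sketch below as one plausible reconstruction. The function and key names are hypothetical.

```python
# Index values from the earlier table, used here as weights (assumed reading)
WEIGHTS = {
    "tackles": 1.00,
    "interceptions": 0.18,
    "offsides_won": 0.60,
    "clearances": 0.37,
}


def defender_score(per_game):
    """Weighted sum of a player's per-game defensive statistics."""
    return sum(WEIGHTS[stat] * per_game[stat] for stat in WEIGHTS)


# Per-game figures for José Gonçalves, copied from the table above
goncalves = {
    "tackles": 1.6,
    "interceptions": 2.4,
    "offsides_won": 2.0,
    "clearances": 11.2,
}

print(f"José Gonçalves: {defender_score(goncalves):.3f}")
```

With these weights the sketch reproduces the 7.376 listed for Gonçalves, and the same arithmetic matches the other rows checked (for example 7.203 for Gonzalez and 5.763 for Avila).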

I find it comforting that, for a new metric, José Gonçalves, MLS Defender of the Year in 2013, tops the list. There’s a big drop between the top 2 defenders and Paparatto. There’s also another cliff after Andrew Farrell. But hey, it’s a start.

I hope this was an enlightening ride through the mechanics of defending from a soccer perspective. The next time you’re watching a game, don’t just focus on the breakdowns. Also look for what makes the defense successful.

Passing: An oddity in how it’s measured in Soccer (Part I)

In my passion to better understand how soccer is statistically tracked I’ve come across what I would call an oddity about the general characterization of “passing” in the world’s greatest sport.

Here’s the deal – go to Squawka.com, whoscored.com, reference the “Stats” tab on mlssoccer.com, or review Golazo information, and you’ll notice they all provide passing information.

My intent is not to dig deep into passing details – not yet, anyway. We’ll get there in another article to follow after I get permission from OPTA to reference their F-24 definitions within their Appendices. For now here’s a simple question I have as a statistical person working on soccer analysis.

What is the number of passes I should use for teams and which denominator is the right number for total passes by both teams to help determine possession percentages?

In the MLS Chalkboard you can clearly see and count passes – here’s an example from a game this past week.

An important filter to note – the major term ‘Distribution’ is not to be clicked in creating this filter – all that is clicked is ‘successful pass and unsuccessful pass’; note also that some details are provided on the types of passes  – we’ll get there in another article.

Bottom line is that the MLS Chalkboard identifies 309 successful passes and 125 unsuccessful passes for a total of 434 passes attempted.

On the MLS Stat sheet (one tab over, but linked here) the number of passes for Chivas is 369; that number doesn’t match the Chalkboard’s total, successful or unsuccessful figures.

For Golazo, for that same game here’s their total: 369 Passes total with 75% accuracy meaning the total successful passes was 277 and unsuccessful passes totaled 92.  Not the same either.

For Squawka.com here’s their total:
Successful = 270 /// headers (8), throughballs (2), passes (239), long balls (21) and supposedly crosses (0)
Unsuccessful = 86 /// passes (52), headers (14), long balls (20), no unsuccessful crosses or throughballs logged here?! Yet the MLS chalkboard indicates 26 unsuccessful crosses!
All told that is 356 passes; those figures don’t match the other data sources.

For whoscored.com here’s their total: Short ball = 323, Long ball = 52, Through ball = 2, Cross = 35, for a total of 412 passes – again that figure doesn’t match the other data sources.

So what’s the right total?  Here’s a table to compare showing the source of data and the total passes submitted for statistical folks like us to leverage in our analysis.

MLS Chalkboard 434
MLS Statistics 369
Golazo (same as MLS Stats) 369
Squawka 356
Whoscored 412
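To make the size of the disagreement concrete, here is a small sketch that rebuilds each source's total from the component figures quoted above and reports the gap between the largest and smallest. The dictionary keys are just labels chosen for illustration.

```python
# Component figures as quoted in the text for the same match
squawka_successful = {"headers": 8, "through balls": 2, "passes": 239, "long balls": 21, "crosses": 0}
squawka_unsuccessful = {"passes": 52, "headers": 14, "long balls": 20}
whoscored = {"short ball": 323, "long ball": 52, "through ball": 2, "cross": 35}

totals = {
    "MLS Chalkboard": 309 + 125,  # successful + unsuccessful
    "MLS Statistics": 369,
    "Golazo": 369,
    "Squawka": sum(squawka_successful.values()) + sum(squawka_unsuccessful.values()),
    "Whoscored": sum(whoscored.values()),
}

largest = max(totals.values())
smallest = min(totals.values())
spread = largest - smallest

for source, total in totals.items():
    print(f"{source:<15} {total}")
print(f"Spread: {spread} passes ({spread / largest:.1%} of the largest total)")
```

The gap between the Chalkboard and Squawka comes to 78 passes, roughly 18% of the Chalkboard total, for one team in one match.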

Observations:

I have no idea what ‘right’ looks like here but here’s what I’ve done to work through this issue.

I chose one source, the MLS Chalkboard, to gather and analyze statistics on passing and possession and all other things available from that data source – where other information is not offered there I reference the MLS Stats tab and Formation tab.

Why did I choose the Chalkboard?  Because it provides additional detail that shows more clarity on all the other types of passes that occur in a game.

For example, if you scroll down on the Chalkboard link and select Set-Pieces you’ll see that Throw-ins are included in the successful passing totals – by definition a Throw-in is a pass as it travels from one player to another.

So my recommendation, if interested, is to track Major League Soccer statistics using the MLS Chalkboard first – it’s harder but seems to be the best one at this time.

I’m not sure why the MLS Chalkboard, Golazo, Whoscored and Squawka all had different team passing statistics; given that, it is likely they all have different individual player statistics as well. I asked a representative from OPTA about it, and their response is provided below:

“The difference between the different websites could be down to a few things. Either they take different levels of data from us, or they take the same feed but only use a chosen set of information from each feed to display their own take on each game.”

By the way – I did try to find a reasonable definition of a pass in soccer; here’s some of that information before final thoughts… note: they are all different and Wikipedia proves, by its definition, why it’s a pretty useless source for information… for them a pass in soccer must travel on the ground – no kidding – here’s their definition up front:

“Passing the ball is a key part of association football. The purpose of passing is to keep possession of the ball by maneuvering it on the ground between different players and to advance it up the playing field.”

Other definitions get pretty detailed – it is what it is apparently – complicated…

Passing Definition: About.com World Soccer.

When the player in possession kicks the ball to a teammate. Passes can be long or short but must remain within the field of play.

Soccer Dictionary: Note there are numerous definitions provided in this link so offering up a specific link is troublesome so I will cut and paste those definitions below:

Cross, diagonal: Usually applied in the attacking third of the field to a pass played well infield from the touch-line and diagonally forward from right to left or left to right.
Cross, far-post: A pass made to the area, usually beyond the post, farthest from the point from which the ball was kicked.
Cross, flank (wing): A pass made from near to a touch-line, in the attacking third of the field, to an area near to the goal.
Cross, headers: 64% of all goals from crosses are scored by headers.
Cross, mid-goal: A pass made to the area directly in front of the goal and some six to twelve yards from the goal-line.
Pass, chip: A pass made by a stabbing action of the kicking foot to the bottom part of the ball to achieve a steep trajectory and vicious back spin on the ball.
Pass, flick: A pass made by an outward rotation of the kicking foot, contact on the ball being made with the outside of the foot.
Pass, half-volley: A pass made by the kicking foot making contact with the ball at the moment the ball touches the ground.
Pass, push: A pass made with the inside of the kicking foot.
Pass, swerve: A pass made by imparting spin to the ball, thereby causing it to swerve from either right to left or left to right. Which way the ball swerves depends on whether contact with the ball is made with the outside or the inside of the kicking foot.
Pass, volley: A pass made before the ball touches the ground.
Passing: When a player kicks the ball to his teammate.
Through pass: A pass sent to a teammate to get him/her the ball behind his defender; used to penetrate a line of defenders. This pass has to be made with perfect pace and accuracy so it beats the defense and allows attackers to collect it before the goalkeeper.

Ducksters.com offers up a Glossary and Terms for Soccer; here’s what they define a pass as being… this one is geared more towards teaching players about the various types of passes and the skill they will need to execute them.

Direct Passes – The first type of soccer pass you learn is the direct pass. This is when you pass the ball directly to a teammate. A strong firm pass directly at the player’s feet is best. You want to make it easy for your teammate to handle, but not take too long to get there.

Passes to Open Spaces – Passing into space is an important concept in making passes in soccer. This is when you pass the ball to an area where a teammate is running. You must anticipate both the direction and speed of your teammate as well as the opponents. Good communication and practice is key to good passes into space.

Wall Passes (One-Twos) – Now we are getting into more complex passing. You can think of a wall pass as bouncing a ball off of a wall to yourself. Except in this case the wall is a teammate. In a wall pass you pass the ball to a teammate who immediately passes the ball back to you into open space. This helps to keep the defense off balance. This is a difficult maneuver and takes a lot of practice, but the results will make it worth the effort.

Long Passes – Sometimes you will have the opportunity to get the ball up the field quickly to an open teammate. A long pass can be used. On a long pass you kick the ball differently than with other shorter passes. You use an instep kick where you kick the soccer ball with your instep or on the shoelaces. To do this you plant your non-kicking foot a few inches from the ball. Then, with your kicking leg swinging back and bending at the knee, snap your foot forward with your toe pointed down and kick the ball with the instep of your foot.

Backward Pass – Sometimes you will need to pass the ball backward. This is done all the time in professional soccer. There is nothing wrong with passing the ball back in order to get your offense set up and maintain control of the ball.

Now that’s probably not ‘every’ definition available but they pretty much say the same thing apart from ‘on-the-ground’ by Wikipedia – a pass is a transfer of the ball from one player to another…

In closing… 

As noted earlier – I’m not really sure what right looks like but I remain convinced that all these organizations are well-intentioned in offering up free statistics for others to use, be it for analysis, fantasy league or simply to check it out.

In my own effort to develop more comprehensive measurements and indicators a standardized source of data for the MLS would be beneficial – if the intent for MLS is to endorse OPTA then there remains a conflict as Golazo clearly does not use the same data filters as the Chalkboard.

My vote is, and will remain, to keep the Chalkboard; and then, MLS, consider ways, as OPTA (Perform Group) is now, to improve it for more beneficial analysis.

Here is Part II  – where I peel back a wee bit more – consider these phrases, successful crosses, launches, key passes, through-balls, throw-ins and more, as ASA continues its venture into Soccer Analysis in America.

Here are a few paraphrased thoughts from other folks who offer up articles on ASA about this issue on passing statistics:

Jared Young – The massive difference in pass data between sites is troubling and disturbing;   I’ve been primarily using whoscored.com and golazo for my numbers so I may have to explore other options.

Cris Pannullo – Major League Soccer should take an initiative and define what pass means in their league; it is surprising that they haven’t given how popular things like fantasy sports are; people eat statistics up in this country.

All the best, Chris

You can follow me on twitter @chrisgluckpwp

Possession Confusion

Consider every conversation ever had about soccer tactics. I would bet 99.9% of them touched on one specific subject: possession. Whether it’s the men’s league team you play for, or the club team you cheer for, isn’t more possession always a good thing? I can’t answer that question confidently, but I will explore it.

The first obstacle to analyzing and discussing possession in MLS is the data itself. We get our data from Opta, and this is what Opta defines as possession:

During the game, the passes for each team are totaled up, and then each team’s total is divided by the game total to produce a percentage figure which shows the percentage of the game that each team has accrued in possession of the ball.

“Possession” in Opta’s data is thus a measure of the proportion of completed passes in a match for each team, not a proportion of time. A lot of short, quick passes will accrue possession for a team that may only have the ball for a matter of seconds. This isn’t necessarily bad or good. It is what it is, and we’ll work with it.
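As a quick sketch of that definition, possession here is just each team's share of the match's total passes. The function name and the pass counts in the example are hypothetical and only there to show the arithmetic.

```python
def pass_share_possession(team_a_passes, team_b_passes):
    """Return each team's possession percentage as its share of total passes."""
    total = team_a_passes + team_b_passes
    if total <= 0:
        raise ValueError("at least one pass is needed to compute possession")
    team_a_pct = 100 * team_a_passes / total
    return team_a_pct, 100 - team_a_pct


# Hypothetical match: Team A attempts 450 passes, Team B attempts 300
a_pct, b_pct = pass_share_possession(450, 300)
print(f"Team A: {a_pct:.1f}%  Team B: {b_pct:.1f}%")
```

Note that nothing in this calculation involves the clock, which is why a team stringing together many short passes can post a high possession figure without holding the ball for long.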

Montreal’s Paradox

If you have listened to our podcasts or read through our stuff, you will have heard us talk about shot ratios a lot. That’s how many shots a team gets divided by how many shots it allows its opponents. A shot ratio of 1.5, for example, means that a team gets one-and-a-half times as many shots as its opponents. When soccer teams create extra opportunities for themselves, it generally leads to more goals and more points in the standings. And then there’s Montreal.

The Montreal Impact has been something of a Cinderella story this season, at least statistically. Leading up to its matchup with the Chicago Fire on Saturday, the Impact had recorded the second-worst shot attempt ratio in the entire league. Montreal had earned just 61 shot attempts with 28 on target to its opponents’ 95 shot attempts with 32 on target.  Yet somehow, the Impact had maintained a positive goal differential (+2) and the second-most points per match right behind FC Dallas.
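For anyone following along with a calculator, the shot ratios implied by those pre-Chicago numbers work out as below. The function name is just a label for the division described earlier.

```python
def shot_ratio(shots_for, shots_against):
    """Shots a team takes divided by shots it allows."""
    if shots_against == 0:
        raise ValueError("shots_against must be positive")
    return shots_for / shots_against


# Montreal's totals before the Chicago match, as quoted above
attempts_ratio = shot_ratio(61, 95)
on_target_ratio = shot_ratio(28, 32)

print(f"Shot attempt ratio:   {attempts_ratio:.2f}")
print(f"Shots on goal ratio:  {on_target_ratio:.2f}")
```

Both ratios sit below 1.0, meaning Montreal was being outshot on both measures, with the gap much wider on total attempts than on shots on target.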

Against Chicago, Montreal not only won on the scoreboard two-nil, it also won the shooting and possession battles. But that is a rare feat this year for the Impact, and it’s worth posing the question: Has Montreal been lucky this season, or does it do things that shot ratios and possession just can’t explain?

Using just shots on goal for now, I regressed goal scoring ratios against shot ratios to see how teams “should do,” as if shots on goal were the only thing that mattered. Even this early in the season, the regression was not all that bad (R2 = 0.4). It also said that Montreal’s 0.94 shot ratio should lead to about the same goal ratio.* Well that makes sense. If you generate roughly the same number of shots on target as your opponents, you should score about the same number of goals. The Impact, however, has scored nine goals to its opponents’ five, a 1.8 ratio, or +4 differential, if you prefer.

An obvious thing to consider is finishing rate. Despite being outshot, the Impact players finish their attempts with goals more than twice as efficiently as opponents do. That ratio is the best in the league. My first instinct is that the Impact has been somewhat lucky, and that opponents will start to finish with more frequency. But there are two possible explanations I want to explore first before waving the cliché luck flag: the quality of opportunities for Montreal and the quality of opportunities for its opponents.

Harrison talked a little bit about Montreal’s counter-attacking style during a recent podcast, and there’s a possibility that the Impact’s style allows low-quality opportunities to its opponents, leading to higher-percentage opportunities for itself on the counter attack. (Before we investigate, it should be noted that Montreal’s schedule has featured teams that average out to be, well, league-average when it comes to finishing.)

Let’s take Saturday’s match against the Fire as an example of the tools I’m using. Check out the Opta chalkboard for yourself here, and you can see from where teams are shooting and scoring by clicking the appropriate boxes for team and statistic of interest. During this particular game, I have Montreal down for 16 scoring attempts, nine from outside the box, six inside, and one from right on the edge. Both its goals were scored from inside the box (though you could argue one was on the edge). Chicago, on the other hand, earned 11 attempts, ripping seven of those from outside the box, just two from inside, and two from the edge of the box. Chicago did not score. I did this for each of Montreal’s seven games this season.

Obviously things like angle matter, too, but I’m not going to pull out my protractor for this one. Here’s the breakdown for Montreal and its opponents on the season:

              Attempts              Goals                 Finishing
Stat          Montreal  Opponents   Montreal  Opponents   Montreal  Opponents
Inside Box    40        45          6         4           15.0%     8.9%
Outside Box   31        56          3         1           9.7%      1.8%
On Edge       6         5           0         0           0.0%      0.0%
Total         77        106         9         5           11.7%     4.7%
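The finishing percentages in the breakdown are simply goals divided by attempts. The sketch below recomputes the totals from the location rows and also works out the share of attempts taken inside the box, which is a derived figure and not one reported in the table. The names are illustrative.

```python
# (attempts, goals) by shot location, copied from the breakdown above
SHOTS = {
    "Montreal": {"Inside Box": (40, 6), "Outside Box": (31, 3), "On Edge": (6, 0)},
    "Opponents": {"Inside Box": (45, 4), "Outside Box": (56, 1), "On Edge": (5, 0)},
}


def finishing_rate(attempts, goals):
    """Goals per attempt; zero when there were no attempts."""
    return goals / attempts if attempts else 0.0


def summarize(by_location):
    """Total attempts, goals, finishing rate and inside-the-box share for one side."""
    attempts = sum(a for a, _ in by_location.values())
    goals = sum(g for _, g in by_location.values())
    return {
        "attempts": attempts,
        "goals": goals,
        "finishing": finishing_rate(attempts, goals),
        "inside_share": by_location["Inside Box"][0] / attempts,
    }


summary = {side: summarize(rows) for side, rows in SHOTS.items()}
finishing_ratio = summary["Montreal"]["finishing"] / summary["Opponents"]["finishing"]

for side, s in summary.items():
    print(f"{side}: {s['attempts']} attempts, {s['goals']} goals, "
          f"{s['finishing']:.1%} finishing, {s['inside_share']:.1%} of attempts inside the box")
print(f"Finishing ratio (Montreal / opponents): {finishing_ratio:.2f}")
```

On these numbers Montreal finishes at about two and a half times its opponents' rate and takes roughly 52% of its attempts inside the box, against roughly 42% for opponents.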

 

Montreal earns more shots inside the box than outside, and that might very well be a product of its system and players, rather than just dumb luck. While the Impact is being outshot in total, perhaps that stat is skewed slightly by shot selection. Montreal’s system seems to create a greater proportion of opportunities in the box. I would still expect some regression from Montreal this season back toward the middle of the standings—as its shot ratios are not favorable even after adjusting for quality—but perhaps not as far as a simple shot model would suggest.

*One might note that Montreal’s attempts ratio is quite a bit worse than its shots-on-goal ratio, which isn’t even that good to begin with. It is apparently too early in the season for attempts ratios to explain much of anything with certainty, but shots models from past seasons suggest Montreal’s goal scoring ratio should probably be even worse than even-ish. That is, if shots aren’t broken down by quality.

Big and Small Data

We talk and we talk about the need for more information to solve some of the problems and general questions that we have as a collective community within Soccer Analytics. Today I ran across a general post about Big Data, and the revolution of really small data. It led me back to thinking about some of the discussions that Matthias (apparently he has a real name), Keith (the missing guy in the podcasts), and I have had outside of the podcasting realms. It’s not always about waiting to develop thoughts or theories until you have data, but making do with what you have at your current disposal and developing theories that later–with further advances–you can prove or disprove.

Just as we now find it ludicrous to talk of “big software” – as if size in itself were a measure of value – we should, and will one day, find it equally odd to talk of “big data”. Size in itself doesn’t matter – what matters is having the data, of whatever size, that helps us solve a problem or address the question we have.  – Rufus Pollock

I’m not saying that anyone is or is not doing this… it just seemed really profound after a cup of coffee and two shots of espresso, so I thought I’d mention it.

Opta loosens the chains a bit

opta

Look, it’s late, you’ll have to forgive the hack job JPEG above. I have no idea why I’m up besides the fact that I don’t have to go to work in the morning. But with the upswing of free time, I’m just perusing the internet and generally reviewing information that I often don’t find time to cruise through. While sifting through data and spending my time nodding off to sleep at my keyboard, I came across Opta’s playground site where they are “opening up the database.”

I’m not sure how new this is or if it is just something I missed. But I know it wasn’t available the last time I was around. It’s a basic request for nerds like me (and possibly you…) to submit data requests.

An understatement would be to call this development “cool.”

A lot of data within Soccer is closed off and generally leaves a lot to be desired. Being a guy that used to write a lot about baseball, it would be awful, strictly speaking from my perspective, to write about a player if the information provided were as lacking as modern-day soccer data.

It’s safeguarded and looked after as if it was top secret defense information. To be fair, I actually think that some of that information is kept more secure than defense information. But that’s not really the subject. Having the ability to submit an e-mail request for specific data is exciting. It’s a marked improvement over the current status quo.

Sure, you could complain about the fact that they only accept one application in all categories per email address, but who cares? It’s an improvement, and here at ASA, that’s what we’re all about. Improvement. And soccer. And beer. So that’s not what we’re all about. But it’s part of what we’re about.