Should away teams be more aggressive?

[Image: Second-half shot chart, HOU v POR, April 2014]

The Portland Timbers traveled to Houston on Sunday in desperate need of three points to get out of the cellar in the Western Conference. They played well in the first half, outshooting the Dynamo 8–7 en route to a 1–1 tie while dominating possession. Then Portland came out in the second half the way many away teams do with the score tied: conservatively. The second-half shot charts to the right give an indication of the change in strategy.

 

This conjured up a question that constantly bugs me: should away teams go for wins more often when tied in the second half? Let’s get right to the data. Here is a chart summarizing the offensive aggression of away teams during gamestates when the score is tied and the teams are playing with the same number of players. The data present the proportion of each total earned by the away team in both the first and second halves.

2013 – 2014 Goals% xGoals% Shots%
1st Half 44.8% (266) 42.3% (282.9) 43.4% (2948)
2nd Half 34.8% (184) 37.4% (168.6) 39.7% (1654)
P-value 0.017 n/a 0.007

The away team consistently garners 42% to 45% of these primary offensive stats during the first half, and then drops down to the 35%-to-40% range in the second half. For the proportions of goals and shots, those differences are statistically significant (there is no simple test for xGoals%, but it is probably statistically significant as well).
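For readers who want to check the significance claim, here is a minimal sketch of a pooled two-proportion z-test. It rests on two assumptions not stated in the post: that the numbers in parentheses are the totals for both teams combined, and that the test is one-sided (first-half share greater than second-half share). The function name is illustrative only.

```python
import math

def one_sided_p_value(p1, n1, p2, n2):
    """One-sided p-value for H1: p1 > p2, pooled two-proportion z-test."""
    # Pooled proportion under the null hypothesis that both halves are equal.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))

# Away-team share and assumed totals, read from the table above.
goals_p = one_sided_p_value(0.448, 266, 0.348, 184)
shots_p = one_sided_p_value(0.434, 2948, 0.397, 1654)
print(f"Goals%: p = {goals_p:.3f}")
print(f"Shots%: p = {shots_p:.3f}")
```

Under those assumptions the results come out close to the 0.017 and 0.007 in the table. The inputs are rounded percentages, so small differences from the original calculation are to be expected.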

My instinct is that away teams are capable of playing in the second half as they do in the first half, and that these discrepancies are a product of conscious decision making by away coaches and players. Teams likely change strategy in the second half to preserve a tie. Playing more openly would ostensibly increase the chances of both a loss and win, while decreasing the chances of a tie. However, I would think based on the data above that it would increase the chances of a win more so than the chances of a loss. Since a win would earn the away team an extra two points, while a loss would cost it just one, my gut says teams should go for it more often.
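The points argument can be put as simple arithmetic. With three points for a win and one for a draw, turning some draw probability into extra win probability (worth +2 points each) and extra loss probability (worth -1 each) pays off whenever the added win chance is more than half the added loss chance. The probabilities below are purely hypothetical, chosen only to illustrate the break-even point; they are not estimates from the data.

```python
def expected_points(p_win, p_draw, p_loss):
    """Expected league points under 3-1-0 scoring."""
    assert abs(p_win + p_draw + p_loss - 1.0) < 1e-9
    return 3 * p_win + 1 * p_draw

# Hypothetical outcome probabilities for an away team tied at halftime.
conservative = expected_points(p_win=0.20, p_draw=0.50, p_loss=0.30)
aggressive = expected_points(p_win=0.28, p_draw=0.38, p_loss=0.34)

# Win chance rose by 0.08 and loss chance by 0.04, so the gain is
# 2 * 0.08 - 1 * 0.04 = 0.12 expected points.
gain = aggressive - conservative
print(f"conservative: {conservative:.2f}, aggressive: {aggressive:.2f}, gain: {gain:+.2f}")
```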

Are away teams playing conservatively because mindless soccer conventionality tells them that it’s okay to get one point on the road? Is this the self-detrimental risk aversion that plagues coaches in other sports, or are these numbers missing something that could justify the conservative play?

I can’t say that I’ve proven anything, but these data suggest the former.


MLS Top 50: Total Shots Created

I’ve briefly mentioned the stat Total Shots Created before. Basically, it’s how frequently a player contributes to the moment leading to an attempt on goal. I like it a lot as a way of crediting individual players for their contributions to the team’s efforts. Obviously there are other elements of a match that are also important and lead to definitive events with predictive value (i.e., other things that players can do to help a team win). However, shots are among the more valuable numbers available. There is also the little fact that everyone loves goals. Goals are awesome and invoke celebrations. Shot deflections, all-out blocked shots, and midfield recoveries hardly elicit the same reaction from friends, but they arguably hold nearly as much individual performance value to the team.

With all the emphasis on shots and individual production, there is another number worth mentioning: %Tsh (Percentage of Team Shots). It’s simply the percentage of the team’s total shots that a player was involved in creating, not just the ones he took himself.
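As a quick sketch of how the columns in the tables below appear to fit together: the rows are consistent with shots created (SH-C) being shots plus key passes plus assists, ShC-90 being that total scaled to 90 minutes, and %Tsh being that total divided by team shots. These formulas are inferred from the table values rather than stated in the post, and the function names are illustrative.

```python
def shots_created(shots, key_passes, assists):
    """SH-C: shots taken plus key passes plus assists (inferred from the table)."""
    return shots + key_passes + assists

def shc_per_90(sh_c, minutes):
    """ShC-90: shots created per 90 minutes played."""
    return sh_c * 90 / minutes

def pct_team_shots(sh_c, team_shots):
    """%Tsh: share of the team's total shots the player helped create."""
    return 100 * sh_c / team_shots

# Federico Higuain's row: 18 shots, 17 key passes, 2 assists, 448 minutes, 71 team shots.
sh_c = shots_created(18, 17, 2)
print(sh_c, round(shc_per_90(sh_c, 448), 2), round(pct_team_shots(sh_c, 71), 2))
```

For Higuain this gives 37 shots created, 7.43 per 90 minutes, and 52.11% of Columbus’ shots, matching his row.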

This time around, the list comprises the top 50 players in shot creation, based on the shots they’ve taken, the assists they’ve been credited with, and the other shots they’ve created with their passing. Players below are sorted by %Tsh.

Player Club POS GP GS MINS G A SHTS KP SH-C ShC-90 Total Team Shots %Tsh
Federico Higuain CLB F 5 5 448 4 2 18 17 37 7.43 71 52.11%
Pedro Morales VAN M 6 4 407 1 2 14 14 30 6.63 67 44.78%
Fabian Espindola DC F 5 5 441 1 2 9 13 24 4.90 55 43.64%
Mauro Diaz DAL M 6 6 515 2 3 10 14 27 4.72 62 43.55%
Robbie Keane LA F 4 4 360 3 1 17 11 29 7.25 68 42.65%
Mauro Rosales CHV M 6 6 540 0 3 9 15 27 4.50 64 42.19%
Lloyd Sam NY M 6 6 531 0 3 11 16 30 5.08 74 40.54%
Landon Donovan LA M-F 4 4 360 0 2 12 12 26 6.50 68 38.24%
Giles Barnes HOU M 5 5 437 0 1 21 5 27 5.56 71 38.03%
Diego Valeri POR M 6 6 518 1 0 18 14 32 5.56 85 37.65%
Erick Torres CHV F 6 6 524 5 0 18 6 24 4.12 64 37.50%
Thierry Henry NY F 4 4 360 1 0 19 8 27 6.75 74 36.49%
Shea Salinas SJ M 0 360 0 3 2 18 23 5.75 69 33.33%
Alvaro Saborio RSL F 6 6 540 3 0 18 4 22 3.67 66 33.33%
Felipe Martins MTL M 6 6 536 1 2 18 12 32 5.37 97 32.99%
Justin Mapp MTL M 6 6 540 0 3 11 18 32 5.33 97 32.99%
Gilberto TOR F 4 4 333 0 0 12 8 20 5.41 62 32.26%
Michael Bradley TOR M 343 1 0 5 15 20 5.25 62 32.26%
Deshorn Brown COL F 5 4 366 1 0 16 4 20 4.92 62 32.26%
Teal Bunbury NE F 6 6 540 0 1 14 9 24 4.00 75 32.00%
Clint Dempsey SEA M 4 3 303 6 1 20 5 26 7.72 84 30.95%
Obafemi Martins SEA F 6 6 531 1 4 10 12 26 4.41 84 30.95%
Quincy Amarikwa CHI F 6 5 475 3 1 14 11 26 4.93 84 30.95%
Eddie Johnson DC F 5 5 441 0 0 11 6 17 3.47 55 30.91%
Graham Zusi KC F-M 4 4 360 1 2 8 14 24 6.00 78 30.77%
Diego Fagundez NE M-F 6 6 539 0 0 19 4 23 3.84 75 30.67%
Darren Mattocks VAN F 6 6 490 1 2 11 7 20 3.67 67 29.85%
Chris Wondolowski SJ F-M 4 4 360 3 0 17 3 20 5.00 69 28.99%
Joao Plata RSL F 3 3 207 2 2 9 8 19 8.26 66 28.79%
Leo Fernandes PHI F 5 3 326 2 1 12 8 21 5.80 74 28.38%
Dom Dwyer KC F 5 4 340 2 0 19 3 22 5.82 78 28.21%
Brad Davis HOU M 311 0 2 3 15 20 5.79 71 28.17%
Marco Di Vaio MTL F 3 3 270 1 1 22 4 27 9.00 97 27.84%
Fabian Castillo DAL F 6 6 539 2 0 15 2 17 2.84 62 27.42%
Will Johnson POR M 6 6 539 1 0 18 5 23 3.84 85 27.06%
Maurice Edu PHI M 6 6 540 2 1 11 8 20 3.33 74 27.03%
Hector Jimenez CLB M 5 5 433 0 2 6 10 18 3.74 71 25.35%
Lamar Neagle SEA F 6 5 416 1 2 13 6 21 4.54 84 25.00%
Mike Magee CHI F 4 4 360 1 2 13 6 21 5.25 84 25.00%
Will Bruin HOU F 5 5 449 3 1 11 5 17 3.41 71 23.94%
Bernardo Anor CLB M 5 5 416 2 0 13 4 17 3.68 71 23.94%
Kenny Miller VAN F 6 5 447 3 1 9 6 16 3.22 67 23.88%
Cristian Maidana PHI M 5 4 293 0 2 8 7 17 5.22 74 22.97%
Vincent Nogueira PHI M 6 6 540 1 1 10 6 17 2.83 74 22.97%
Michel DAL M-D 6 3 312 3 1 9 4 14 4.04 62 22.58%
Vicente Sanchez COL F 3 2 193 4 0 6 8 14 6.53 62 22.58%
Darlington Nagbe POR F-M 6 6 490 0 1 5 13 19 3.49 85 22.35%
Juninho LA M 4 4 358 0 2 7 6 15 3.77 68 22.06%
Benny Feilhaber KC M 5 5 449 1 1 7 9 17 3.41 78 21.79%
Kyle Beckerman RSL M 6 6 540 2 2 7 5 14 2.33 66 21.21%

 

The list below is sorted by ShC-90, shots created per 90 minutes. The one stipulation I would make is to be careful when looking at some of the numbers below. Guys like Justin Meram end up at the top of the list after playing just 58 minutes and scoring a goal in that short time. This leads to incorrect perceptions of certain players, and to horrible, trite narratives like “Justin Meram is the most underrated player ever.” That might be true, but probably not. Just look out for small sample sizes.
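One common way to guard against small samples is to require a minimum number of minutes before ranking by a per-90 rate. The sketch below uses four rows from the table that follows; the 180-minute cutoff is a hypothetical choice for illustration, not a threshold used in the post.

```python
# (player, minutes, shots created), taken from the table below.
players = [
    ("Justin Meram", 58, 8),
    ("Yannick Djalo", 56, 7),
    ("Marco Di Vaio", 270, 27),
    ("Joao Plata", 207, 19),
]

MIN_MINUTES = 180  # hypothetical cutoff, roughly two full matches

def rank_by_shc90(rows, min_minutes):
    """Rank players by shots created per 90, dropping anyone below the cutoff."""
    qualified = [
        (name, shc * 90 / minutes)
        for name, minutes, shc in rows
        if minutes >= min_minutes
    ]
    return sorted(qualified, key=lambda row: row[1], reverse=True)

ranked = rank_by_shc90(players, MIN_MINUTES)
for name, rate in ranked:
    print(f"{name}: {rate:.2f}")
```

With that cutoff, Meram and Djalo drop out and Di Vaio and Plata lead the list.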

 

Player Club POS GP GS MINS G A SHTS KP SH-C ShC-90 Total Team Shots %Tsh
Justin Meram CLB M 5 0 58 1 1 5 2 8 12.41 71 11.27%
Yannick Djalo SJ M 2 0 56 0 0 6 1 7 11.25 69 10.14%
Marco Di Vaio MTL F 3 3 270 1 1 22 4 27 9.00 97 27.84%
Joao Plata RSL F 3 3 207 2 2 9 8 19 8.26 66 28.79%
Clint Dempsey SEA M 4 3 303 6 1 20 5 26 7.72 84 30.95%
Federico Higuain CLB F 5 5 448 4 2 18 17 37 7.43 71 52.11%
Robbie Keane LA F 4 4 360 3 1 17 11 29 7.25 68 42.65%
Kekuta Manneh VAN F 6 1 167 1 0 11 2 13 7.01 67 19.40%
Thierry Henry NY F 4 4 360 1 0 19 8 27 6.75 74 36.49%
Pedro Morales VAN M 6 4 407 1 2 14 14 30 6.63 67 44.78%
Vicente Sanchez COL F 3 2 193 4 0 6 8 14 6.53 62 22.58%
Landon Donovan LA M-F 4 4 360 0 2 12 12 26 6.50 68 38.24%
Graham Zusi KC F-M 4 4 360 1 2 8 14 24 6.00 78 30.77%
Dillon Serna COL M 2 1 106 0 1 5 1 7 5.94 62 11.29%
Dom Dwyer KC F 5 4 340 2 0 19 3 22 5.82 78 28.21%
Leo Fernandes PHI F 5 3 326 2 1 12 8 21 5.80 74 28.38%
Brad Davis HOU M 311 0 2 3 15 20 5.79 71 28.17%
Shea Salinas SJ M 0 360 0 3 2 18 23 5.75 69 33.33%
Giles Barnes HOU M 5 5 437 0 1 21 5 27 5.56 71 38.03%
Diego Valeri POR M 6 6 518 1 0 18 14 32 5.56 85 37.65%
Gilberto TOR F 4 4 333 0 0 12 8 20 5.41 62 32.26%
Felipe Martins MTL M 6 6 536 1 2 18 12 32 5.37 97 32.99%
Justin Mapp MTL M 6 6 540 0 3 11 18 32 5.33 97 32.99%
Mike Magee CHI F 4 4 360 1 2 13 6 21 5.25 84 25.00%
Michael Bradley TOR M 343 1 0 5 15 20 5.25 62 32.26%
Cristian Maidana PHI M 5 4 293 0 2 8 7 17 5.22 74 22.97%
Lloyd Sam NY M 6 6 531 0 3 11 16 30 5.08 74 40.54%
Chris Wondolowski SJ F-M 4 4 360 3 0 17 3 20 5.00 69 28.99%
Quincy Amarikwa CHI F 6 5 475 3 1 14 11 26 4.93 84 30.95%
Deshorn Brown COL F 5 4 366 1 0 16 4 20 4.92 62 32.26%
Fabian Espindola DC F 5 5 441 1 2 9 13 24 4.90 55 43.64%
Jermain Defoe TOR F 3 3 242 3 0 11 2 13 4.83 62 20.97%
Mauro Diaz DAL M 6 6 515 2 3 10 14 27 4.72 62 43.55%
Lamar Neagle SEA F 6 5 416 1 2 13 6 21 4.54 84 25.00%
Kelyn Rowe NE M 2 2 179 0 0 6 3 9 4.53 75 12.00%
Mauro Rosales CHV M 6 6 540 0 3 9 15 27 4.50 64 42.19%
Obafemi Martins SEA F 6 6 531 1 4 10 12 26 4.41 84 30.95%
Steven Lenhart SJ F 3 3 258 0 0 9 3 12 4.19 69 17.39%
Erick Torres CHV F 6 6 524 5 0 18 6 24 4.12 64 37.50%
Saer Sene NE M 6 4 286 0 0 8 5 13 4.09 75 17.33%
Michel DAL M-D 6 3 312 3 1 9 4 14 4.04 62 22.58%
Teal Bunbury NE F 6 6 540 0 1 14 9 24 4.00 75 32.00%
Juan Luis Anangono CHI F 6 1 113 1 0 5 0 5 3.98 84 5.95%
Sal Zizzo KC F 5 4 367 0 2 9 5 16 3.92 78 20.51%
Dwayne De Rosario TOR M 5 3 254 0 0 10 1 11 3.90 62 17.74%
Bradley Wright-Phillips NY F 5 2 278 1 0 9 3 12 3.88 74 16.22%
Diego Fagundez NE M-F 6 6 539 0 0 19 4 23 3.84 75 30.67%
Will Johnson POR M 6 6 539 1 0 18 5 23 3.84 85 27.06%
David Texeira DAL F 5 2 211 1 0 6 3 9 3.84 62 14.52%
Marco Pappa SEA M 4 2 165 0 0 6 1 7 3.82 84 8.33%

 

Overall, we’re still just getting used to this statistic, but it seems like it could help dig a little deeper into valuing those players that don’t always directly put the goal in the back of the net, but still play a key role in their teams’ abilities to do so.

Location Adjusted Total Shots Ratio

Millionaire Malcolm Forbes was famous for his quote, “He who dies with the most toys wins.” And while that might not be the most moral mantra for life, sports fans have a hard time arguing with the logic. After all, a game is about runs, points or goals, and after enough of those it’s about shiny trophy cases. But in the world of sports analysis there is no such victory in the absolute. Analysts need to explain how those runs, points or goals came about. In the world of soccer especially, there is never a complete answer. Goals are exceedingly rare, so explaining how they grace us with their presence mathematically is difficult, to say the least. We’re happy with higher R-squareds and other such geeky descriptive metrics. Have you ever seen a trophy case filled with strong correlations? Nope, all we get is a little blog post, and if we’re lucky, some twitter praise. Still, we search….

One of the more popular explanations for winning in soccer is Total Shots Ratio, which calculates the percentage of shots taken by a team in games played by that team. A 60% TSR means that a given team took 60% of the total shots fired in the games they played. The logic isn’t all that difficult to wrap your head around. If you can take more shots than your opponent you are likely to score more goals. For the English Premier League, TSR explains 68% of the variance in the point table, which is impressive for one statistic. TSR happens to be less important in MLS.
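In code, the definition above is a one-liner. The shot totals here are made up to reproduce the 60% example.

```python
def total_shots_ratio(shots_for, shots_against):
    """TSR: a team's share of all shots taken in its games."""
    return shots_for / (shots_for + shots_against)

# Hypothetical totals: 60 shots taken, 40 conceded.
tsr = total_shots_ratio(60, 40)
print(f"TSR = {tsr:.0%}")
```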

data sources: AmericanSoccerAnalysis, mlssoccer.com


In MLS, TSR explains just 37% of the variance in points. This is likely due to the lower finishing rates in MLS compared to the EPL, which render shots less effective. But there are probably a number of other reasons why TSR is less predictive of points in MLS. A larger percentage of teams employ counterattacking strategies, which have significant impacts on finishing rates and would in turn alter the effectiveness of TSR. But what if the shots were weighted to account for their location? It would be logical to assume that better teams take better shots and make things more difficult for opposing shooters. But does that logic actually manifest itself when predicting points? ASA’s Expected Goals 1.0 worked pretty well, so a TSR adjusted for shot locations ought to work better than the original TSR.

The first thing required is a fair weighting of shots by location. To get that, I divided the finishing rate from each location by the average finishing rate. Here is the resulting table for adjusting the value of shots.

Location Weighting
1 3.14
2 1.79
3 0.72
4 0.54
5 0.24

For the sake of simplicity I have collapsed zones 5 and 6 into a fifth zone. This table illustrates that a shot from zone 1, inside the 6-yard box, is actually worth 3.14 average shots, while a shot from zone 5 is worth just 0.24 average shots. Adjusting all of the shots in MLS in 2013 yields the following result when attempting to predict table points.
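Here is a rough sketch of how the weights could be applied. The zone weights come from the table above. The per-zone shot counts are invented for illustration, and the function names are hypothetical.

```python
# Zone weights from the table above (zone 5 includes the original zone 6).
ZONE_WEIGHTS = {1: 3.14, 2: 1.79, 3: 0.72, 4: 0.54, 5: 0.24}

def weighted_shots(shots_by_zone):
    """Convert raw shot counts per zone into 'average shot' equivalents."""
    return sum(ZONE_WEIGHTS[zone] * count for zone, count in shots_by_zone.items())

def location_adjusted_tsr(team_shots, opponent_shots):
    """TSR computed on location-weighted shots instead of raw counts."""
    team = weighted_shots(team_shots)
    opponent = weighted_shots(opponent_shots)
    return team / (team + opponent)

# Hypothetical teams with 30 shots each, so the raw TSR is exactly 50%.
team = {1: 2, 2: 10, 3: 5, 4: 8, 5: 5}
opponent = {1: 1, 2: 6, 3: 6, 4: 10, 5: 7}

raw_tsr = sum(team.values()) / (sum(team.values()) + sum(opponent.values()))
adjusted_tsr = location_adjusted_tsr(team, opponent)
print(f"raw TSR: {raw_tsr:.3f}, location-adjusted TSR: {adjusted_tsr:.3f}")
```

In this made-up case the two teams take the same number of shots, but the first team shoots from closer in, so its adjusted ratio rises above 50%.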

data sources: AmericanSoccerAnalysis, mlssoccer.com


You can tell just from eyeballing the dispersion of the data points that the location-adjusted TSR aligns better with points, and the R-squared agrees: there is a 17-percent increase in R-squared. Not just the pure volume of shots but also the location of those shots is vital to predicting points in MLS. It would be interesting to see if location is equally important in the EPL, where TSR is already such a strong predictor.

For the curious, the New York Red Bulls were the team that was best at getting better shots than their opponent. Their TSR improved from 47% to 52% when adjusting for shot location. Real Salt Lake actually took the biggest hit. Their TSR was 53% and their location-adjusted TSR dropped to 48%.

It’s only one season’s worth of data, but such an impressive increase in the ability to explain the variance in point totals confirms that location does matter, and that teams are rewarded for taking better shots themselves while pushing their opponents farther out from goal. And perhaps soccer analysts have another statistical toy to add to the toy box: Location-Adjusted Total Shots Ratio.

In Defense of the San Jose Earthquakes and American Soccer

Note: This is part II of the post using a finishing rate model and the binomial distribution to analyze game outcomes. Here is part I.

As if American soccer fans weren’t beaten down enough by the elimination of three MLS clubs from the CONCACAF Champions League, Toluca coach Jose Cardozo questioned the growth of American soccer and criticized the strategy the San Jose Earthquakes employed during Toluca’s penalty-kick win last Wednesday. Mark Watson’s team clearly packed it in defensively and looked to play “1,000 long balls” on the counterattack. It certainly doesn’t make for beautiful, fluid soccer, but was it a smart strategy? Are the Earthquakes really worthy of the criticism?

Perhaps it’s fitting that Toluca is almost 10,000 feet above sea level, because from that altitude the strategy did look like a disaster. Toluca controlled the ball for 71.8% of the match and ripped off 36 shots to the Earthquakes’ 10. It does appear that San Jose was lucky to be sitting at 1-1 at the end of the match. The fact that Toluca scored just one goal from those 36 shots must have been down to either bad luck or great defense, right? Or could it possibly have been expected?

The prior post examined using the binomial distribution to predict goals scored, and one of the takeaways was that finishing rates and expected goals scored in a match decline as shots increase, as seen below. This is a function of what I’ll call “defensive density,” or basically how many players a team commits to defense. When more players are committed to defending, the offense has the ball more and ultimately takes more shots. But because of that defensive density, the offense is less likely to score on each shot.


Data source: American Soccer Analysis

Mapping that curve to an expected goals chart, you can see that the Earthquakes’ expected goals are not that different from Toluca’s, despite the extreme shot differential.


Data sources: American Soccer Analysis, Golazo

Given this shot distribution, let’s apply the binomial distribution model to determine the probability of San Jose advancing to the semifinals of the Champions League. I’m going to use the actual shots and the expected finishing rates to model the outcomes. The actual shots taken can be controlled through Mark Watson’s strategy, but it’s best to use expected finishing rates to simulate what outcomes the Earthquakes were striving for. Going into the match, the Earthquakes needed a 1-1 draw to force a shootout. Any better result would have seen them advance, and anything worse would have seen them eliminated.

Inputs:

Toluca Shots: 36

Toluca Expected Finishing Rate: 3.6%

San Jose Shots: 10

San Jose Expected Finishing Rate: 11.2%

Outcomes:

Toluca Win: 39.6%

Toluca 0-0 Draw: 8.3%

Toluca 1-1 Draw: 13.9% x 50% PK Toluca = 6.9%

Total Probability Toluca Advances = 54.9%

 

San Jose Win: 32.3%

2-2 or higher Draw = 5.8%

San Jose 1-1 Draw: 13.9% x 50% PK San Jose = 6.9%

Total Probability San Jose Advances = 45.1%
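The calculation above can be sketched in a few lines. This version assumes each team’s goals follow an independent binomial distribution with the listed shots and finishing rates, and applies the tie-break rules as the outcomes list describes them: a 0-0 draw sends Toluca through, a 1-1 draw goes to a 50/50 shootout, and a higher-scoring draw sends San Jose through. The function names are illustrative, not the author’s.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k goals from n shots at finishing rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def advancement_odds(home_shots, home_rate, away_shots, away_rate, home_pk=0.5):
    """Return (home advances, away advances) under the tie-break rules above."""
    home_win = away_win = draw_00 = draw_11 = draw_high = 0.0
    for h in range(home_shots + 1):
        for a in range(away_shots + 1):
            prob = binomial_pmf(h, home_shots, home_rate) * binomial_pmf(a, away_shots, away_rate)
            if h > a:
                home_win += prob
            elif a > h:
                away_win += prob
            elif h == 0:
                draw_00 += prob
            elif h == 1:
                draw_11 += prob
            else:
                draw_high += prob
    home_advances = home_win + draw_00 + home_pk * draw_11
    away_advances = away_win + draw_high + (1 - home_pk) * draw_11
    return home_advances, away_advances

toluca, san_jose = advancement_odds(36, 0.036, 10, 0.112)
print(f"Toluca advances: {toluca:.1%}, San Jose advances: {san_jose:.1%}")
```

With the rounded inputs listed above, this lands within about a percentage point of the figures in the post. The small gap is presumably down to rounding in the published finishing rates.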

 

The odds of San Jose advancing with that strategy are clearly not as bad as the 10,000-foot level might indicate. Counterattacking soccer certainly isn’t pretty, but it wouldn’t still exist if it weren’t considered a solid strategy.

It’s difficult, but we can also try to simulate what a “normal” possession-based strategy might have looked like in Toluca. In MLS this year, the home team averages 52.5% possession and 15.1 shots per game. In Liga MX play, Toluca is averaging only about 11.4 shots per game, so they are not a prolific shooting team. They are finishing at an excellent 15.2%, which could be the reason San Jose attempted to pack it in defensively. The away team in MLS is averaging 10.4 shots per game. If we assume that a more possession-oriented strategy would have resulted in a typical MLS game, then we have the following expected goals outcomes.


Data sources: American Soccer Analysis, Golazo

Notice the expected goal differential is actually worse for San Jose by 0.05 goals. Though that may not be statistically significant, at the very least we can say that San Jose’s strategy was not ridiculous.

Re-running the expected outcomes with the above scenario reveals that San Jose advances 43.3% of the time, compared with 45.1% under the counterattacking approach. A strategy that raised the probability of advancing by 1.8 percentage points did not deserve any criticism, and definitely not such harsh criticism. It shows that the Earthquakes probably weren’t wrong in their approach to the match. And if we had factored in a higher finishing rate for Toluca, the probabilities would favor the counterattack strategy even more.

Even though the US struck out again in the CONCACAF Champions League, Americans don’t need to take abuse for their style of play. After all, soccer is about winning and, in the case of a tie, advancing. We shouldn’t be ashamed or criticized when we do whatever it takes to move on.

 

Predicting Goals Scored using the Binomial Distribution

Much is made of the use of the Poisson distribution to predict game outcomes in soccer. Much less attention is paid to the use of the binomial distribution. The reason is a matter of convenience. To predict goals using a Poisson distribution, “all” that is needed is the expected goals scored (lambda). To use the binomial distribution, you need to know both the number of shots taken (n) and the rate at which those shots are turned into goals (p). But if you have sufficient data, it may be a better way to analyze certain tactical decisions in a match. First, let’s examine whether the binomial distribution is actually dependable as a model framework.

Here is the chart that shows how frequently a certain number of shots were taken in an MLS match.

source data: AmericanSoccerAnalysis


The chart resembles a binomial distribution with right skew with the exception of the big bite taken out of the chart starting with 14 shots. How many shots are taken in a game is a function of many things, not the least of which are tactical decisions made by the club. For example it would be difficult to take 27 shots unless the opposing team were sitting back and defending and not looking to possess the ball. Deliberate counterattacking strategies may very well result in few shots taken but the strategy is supposed to provide chances in a more open field.

Out of curiosity, let’s look at the average shot location by shots taken to see if there are any clues about the influence of tactics. To estimate this, I looked at expected goals per shot for each shot total. This does not have any direct influence on the binomial analysis, but it could come in useful when we look for applications.


source data: AmericanSoccerAnalysis

The average MLS finishing rate was just over 10 percent in 2013. You can see that, at more than 10 shots per game, the expected finishing rate stays constant right at that 10-percent rate. This indicates that above 10 shots, the location distribution of those shots is typical of MLS games. However, at fewer than 10 shots you can see that the expected goal scoring rate dips consistently below 10%. This indicates that teams that take fewer shots in a game also take those shots from worse locations on average.

The next element in the binomial distribution is the actual finishing rate by number of shots taken.


source data: AmericanSoccerAnalysis

Here it’s plain that the number of shots taken has a dramatic impact on the accuracy rate of each shot. This speaks to the tactics and pace of play involved in taking different shot amounts. A team able to squeeze off more than 20 shots is likely facing a packed box and a defense less interested in ball possession. What’s fascinating then is that teams that take few shots in a game have a significantly higher rate of success despite the fact that they are taking shots from farther out. This indicates that those teams are taking shots with significantly less pressure. This could indicate shots taken during a counterattack where the field of play is more wide open.

Combining the finishing accuracy model curve with number of shots we can project expected goals per game based on number of shots taken.

[Chart: expected goals by shots taken]

What’s interesting here is that the expected number of goals scored plateaus at about 18 shots and begins to decline after 23 shots. This, of course, must be a function of the intensity of the defense they are facing for those shots because we know their shot location is not significantly different. This model is the basis by which I will simulate tactical decisions throughout a game in Part II of this post.

Now we have the two key pieces to see if the binomial distribution is a good predictor of goals scored using total shots taken and finishing rate by number of shots taken. As a refresher, since most of us haven’t taken a stat class in a while, the probability mass function of the binomial distribution looks like the following:

$$P(X = k) = \binom{n}{k} \, p^k (1 - p)^{n - k}$$

(source: Wikipedia)

Where:

n is the number of shots

p is the probability of success in each shot

k is the number of successful shots

Below I compare the actual distribution to the binomial distribution using 13 shots (since 13 is the modal number of shots in 2013’s data set), assuming a 10.05% finishing rate.

source data: AmericanSoccerAnalysis, Finishing Rate model


The binomial distribution underpredicts scoring 2 goals and overpredicts all other outcomes. Overall the expected goals are close (1.369 actual to 1.362 binomial). The Poisson is similar to the binomial, but the average error of the binomial is 12% lower than that of the Poisson.
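To make the two candidate models concrete, here is a small sketch that tabulates the binomial and Poisson probabilities for 13 shots at the 10.05% finishing rate quoted above. Setting the Poisson mean to n times p is an assumption made here for a like-for-like comparison. The actual goal distribution from the charts is not reproduced, so this shows only the model side.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Probability of exactly k goals from n shots at finishing rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k goals when goals arrive with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 13, 0.1005
lam = n * p  # assumption: Poisson mean matched to the binomial mean

print("goals  binomial  poisson")
for k in range(5):
    print(f"{k:>5}  {binomial_pmf(k, n, p):8.3f}  {poisson_pmf(k, lam):7.3f}")
```

The two columns are close but not identical: the binomial caps goals at the number of shots and has slightly less variance, which is where any difference in fit against the actual data would come from.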

If we take the average of these distributions between 8 and 13 shots (where the sample size is greater than 40) the bumps smooth out.

source data: AmericanSoccerAnalysis, Finishing Rate model


The binomial distribution seems to do well at projecting the actual number of goals scored in a game, and the average binomial error is 23% lower than with the Poisson. Looking individually at shot totals from 7 to 16, the binomial has 19% lower error if we just observe goal outcomes 0 and 1. But so what? Isn’t it nearly impossible to predict the number of shots a team will take in a game? It is. But there may be tactical decisions, like counterattacking, where we can look at shots taken and determine whether the strategy was correct. And a model where the final stage of estimation is governed by the binomial distribution appears to be a compelling model for that analysis. In part II I will explore some possible applications of the model.

Jared Young writes for Brotherly Game, SB Nation’s Philadelphia Union blog. This is his first post for American Soccer Analysis, and we’re excited to have him!

North American Soccer League and its 2013 First Half

The last 12 months have been rather eventful for the North American Soccer League (NASL). A league that once folded before some of us were born has begun to become somewhat relevant again.

Even putting aside the excitement surrounding the return of the New York Cosmos to professional soccer, a team steeped in US soccer history, one sees how well the league fared against some of the MLS clubs. NASL sides knocked out two of the big dogs, the LA Galaxy (2-0, Carolina RailHawks) and Seattle Sounders FC (1-0, Tampa Bay Rowdies), this past year.

Add to that the league’s expansion plans outside of New York. This past year it has added Indianapolis, Jacksonville and Oklahoma City to its growing portfolio. These were shrewd moves to get toeholds in two cities that have limited professional sports and to strengthen its ties in Florida, what with three soccer cities in South Florida and four in the Southeastern region.

The league is obviously poised for a positive return.

Living in Tampa for the next few months, I plan on taking in at least one match (this weekend in their Derby game vs. Fort Lauderdale) and checking out the scene.

Okay, there is the narrative. Let’s take a look at the table and some numbers:

[Table image: Shot info]

[Table image: Advanced shot info]

[Table image: Table data]

Okay, my new friends here in Tampa won’t like this very much but Fort Lauderdale should have finished much higher in the table. The disparity in the table between Minnesota and the Strikers is amazing considering the shot data. Though, between expected points and PDO, maybe United FC finished about where they should expect.

There is a surprising amount of data in these supplied match reports. I know it may not seem like it, but there is. The time-stamped shots can give us a bit more insight into the context of the shots. While we still can’t get to know some of the players (outside of the goal scorers) as well, it helps us get to know the teams as a whole within that league.

You can say what you want, but I love the idea of NASL growing and becoming legit competition with MLS. I love USL, NASL and MLS playing in the Open Cup, and I love seeing the sport grow in the country.

I’ll continue to throw NASL data out as I collect it. With my new city having an NASL team and a derby game this weekend, I thought it a great time to put this stuff out there. Now talk among yourselves…

PDO: Week 22 Rankings

I dropped the ball a bit last week not updating the tables. Here is how they look as of this past weekend’s results.

Team Shots Against GA Sv% SoT GF SH% TSR Points Games PPG PDO
Portland Timbers 89 20 77.53% 101 30 29.70% 0.532 34 21 1.62 1072
New England Rev. 85 19 77.65% 84 22 26.19% 0.497 30 21 1.43 1038
New York Red Bulls 92 27 70.65% 88 29 32.95% 0.489 35 22 1.59 1036
Houston Dynamo 83 20 75.90% 81 22 27.16% 0.494 30 20 1.5 1031
Salt Lake 102 24 76.47% 121 32 26.45% 0.543 37 22 1.68 1029
Dallas 109 27 75.23% 98 27 27.55% 0.473 32 21 1.52 1028
Vancouver Whitecaps 92 29 68.48% 98 32 32.65% 0.516 32 21 1.52 1011
Philadelphia Union 97 30 69.07% 102 32 31.37% 0.513 34 22 1.55 1004
Seattle Sounders FC 80 22 72.50% 76 21 27.63% 0.487 28 19 1.47 1001
Colorado Rapids 92 24 73.91% 91 23 25.27% 0.497 34 23 1.48 992
Montreal Impact 92 29 68.48% 105 31 29.52% 0.533 35 20 1.75 980
Columbus Crew 99 27 72.73% 94 23 24.47% 0.487 23 21 1.1 972
Kansas City 63 21 66.67% 103 29 28.16% 0.620 36 22 1.64 948
San Jose Earthquakes 109 33 69.72% 87 21 24.14% 0.444 27 22 1.23 939
CD Chivas USA 118 37 68.64% 69 17 24.64% 0.369 17 21 0.81 933
L.A. Galaxy 76 27 64.47% 108 30 27.78% 0.587 33 22 1.5 923
Toronto FC 77 29 62.34% 69 17 24.64% 0.473 17 21 0.81 870
Chicago Fire 85 30 64.71% 103 20 19.42% 0.548 25 20 1.25 841
DC 93 35 62.37% 62 8 12.90% 0.400 10 21 0.48 753

Again, Portland, even with their loss, retains the title of luckiest club in MLS by PDO*. Meanwhile, New England continues to mystify, as they pretty much pulled that win together with duct tape, spit and some wood glue. Is Jay Heaps really MacGyver? I’m going to guess no, though as we talked about on the podcast, home field advantage not only helps to place pressure on the ref, but it may also encourage more aggression from the home side. One can only wonder if Jay Heaps is able to simulate this effect with a stirring pep talk prior to a road match against a terrible team.

However, just like how Chivas and Toronto have been largely unaffected this season, likely due to some terrible play and a limited talent base, you have to wonder if we are seeing many of these clubs performing at their true rates. I don’t think you can completely attribute RSL’s finishing success to luck when defensively they have some great pieces and offensively they, again, have some great pieces.

As we watch the year unfold, it’s going to be rather interesting to see which of these clubs end up with playoff spots at season’s end.

 

*PDO here is based on shots on target, not total attempts. 
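For anyone wanting to rebuild the table, the columns are consistent with PDO being 1000 times the sum of save percentage and shooting percentage, both measured on shots on target, and with TSR here also being computed from shots on target. A minimal sketch, with illustrative function names:

```python
def pdo(goals_against, sot_against, goals_for, sot_for):
    """PDO on a shots-on-target basis: 1000 * (save% + shooting%)."""
    save_pct = 1 - goals_against / sot_against
    shooting_pct = goals_for / sot_for
    return round(1000 * (save_pct + shooting_pct))

def tsr_on_target(sot_for, sot_against):
    """Share of all shots on target taken by the team."""
    return sot_for / (sot_for + sot_against)

# Portland's row above: 89 shots on target against, 20 GA, 101 SoT, 30 GF.
print(pdo(20, 89, 30, 101), round(tsr_on_target(101, 89), 3))
# DC's row above: 93 shots on target against, 35 GA, 62 SoT, 8 GF.
print(pdo(35, 93, 8, 62), round(tsr_on_target(62, 93), 3))
```

These reproduce the 1072 and 753 PDO values and the 0.532 and 0.400 TSR values listed for Portland and DC.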

PDO: Week 20 Update

Last week, we talked about PDO…a lot. We likely will continue to talk about PDO and monitor it through the season. After games played this past weekend here are the up-to-date rankings. I know that Matthias usually just updates his page on Monday, but I’m actually going to make these a post so that when I want to do a week-by-week investigation later on this off-season, it saves me time. Because it’s all about me.

Team SA GA GA% Sv% SF GF SH% TSR Points Games PPG PDO
Portland Timbers 83 18 0.22 0.78 93 30 0.32 0.528 33 19 1.74 1106
New England Rev. 75 16 0.21 0.79 72 22 0.31 0.490 24 18 1.33 1092
Real Salt Lake 87 18 0.21 0.79 112 32 0.29 0.563 37 20 1.85 1079
New York Red Bulls 83 24 0.29 0.71 79 29 0.37 0.488 31 20 1.55 1078
Seattle Sounders FC 73 20 0.27 0.73 61 21 0.34 0.455 24 17 1.41 1070
Houston Dynamo 76 19 0.25 0.75 76 22 0.29 0.500 29 19 1.53 1039
Vancouver Whitecaps 81 26 0.32 0.68 91 32 0.35 0.529 32 19 1.68 1031
FC Dallas 105 27 0.26 0.74 96 27 0.28 0.478 31 20 1.55 1024
Colorado Rapids 81 22 0.27 0.73 80 23 0.29 0.497 27 20 1.35 1016
Philadelphia Union 88 30 0.34 0.66 93 32 0.34 0.514 30 20 1.50 1003
Columbus Crew 90 23 0.26 0.74 89 23 0.26 0.497 23 19 1.21 1003
Montreal Impact 91 29 0.32 0.68 97 31 0.32 0.516 31 18 1.72 1001
Sporting Kansas City 56 19 0.34 0.66 92 29 0.32 0.622 33 20 1.65 976
L.A. Galaxy 70 24 0.34 0.66 100 30 0.30 0.588 30 20 1.50 957
San Jose Earthquakes 105 32 0.30 0.70 84 21 0.25 0.444 24 21 1.14 945
CD Chivas USA 107 35 0.33 0.67 63 17 0.27 0.371 14 19 0.74 943
Toronto FC 77 27 0.35 0.65 59 17 0.29 0.434 13 18 0.72 937
Chicago Fire 77 28 0.36 0.64 91 20 0.22 0.542 21 18 1.17 856
DC United 81 29 0.36 0.64 56 8 0.14 0.409 10 19 0.53 785
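
For anyone who wants to check the derived columns, here is a minimal Python sketch. The formulas are an assumption inferred from the rows above, not something spelled out in the post: Sv% = 1 - GA/SA, SH% = GF/SF, TSR = SF/(SF + SA), and PDO = 1000 × (Sv% + SH%). The function name `derived_stats` is made up for illustration.

```python
def derived_stats(sa, ga, sf, gf, points, games):
    """Rebuild the derived columns from the raw counts in one table row.

    Assumed formulas (inferred from the table, not stated in the post):
      Sv% = 1 - GA/SA, SH% = GF/SF, TSR = SF/(SF + SA),
      PDO = 1000 * (Sv% + SH%), PPG = Points/Games.
    """
    sv_pct = 1 - ga / sa          # share of opponent shots that were saved
    sh_pct = gf / sf              # share of own shots that became goals
    return {
        "sv_pct": sv_pct,
        "sh_pct": sh_pct,
        "tsr": sf / (sf + sa),    # team's share of all shots in its games
        "ppg": points / games,
        "pdo": round(1000 * (sv_pct + sh_pct)),
    }


# Portland's row: SA 83, GA 18, SF 93, GF 30, 33 points in 19 games.
portland = derived_stats(83, 18, 93, 30, 33, 19)
print(portland["pdo"])             # 1106
print(round(portland["tsr"], 3))   # 0.528
print(round(portland["ppg"], 2))   # 1.74
```

Under those assumptions the Portland row comes back exactly as listed, which is at least a sign that the table was built this way.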

This week you see Montreal continue to sit somewhere rather neutral in the luck department. That’s an interesting situation after reading Richard Whittall’s weekly analytic piece on the Canadian club yesterday, even more so when considering some of the screaming by the press and the cries about possibly replacing Marco Schallibaum at the helm. In fact, I kind of think it’s downright crazy. I wouldn’t consider the Impact to be a Supporters’ Shield contender (that’s just me), but that also doesn’t mean they won’t be. Their points-per-match total is third in MLS, and they still have one or two games in hand on the clubs ahead of them.

Speaking of the East, the New York Red Bulls continue their rise up the luck charts. That’s something to consider after they defeated the Impact 4-0 this week and all the talk about “finally coming together.” Remember this graphic is about luck, not about talent. That is to say, be careful about high and lofty dreams, east siders. I can see the Red Bulls struggling to retain that first place position.

Another riser, this one out west, is Vancouver. They are on their way up with the recent performances of Kenny Miller, Camilo and Brad Knighton. Their 1.68 points-per-game average has them quietly (or, of late, not so quietly) contending for a top-3 playoff position ahead of Dallas, LA and Seattle. It’s something to take note of, to see whether they are truly overachieving and just on a hot streak, or finally hitting a much-needed groove.

Lastly, on the subject of FC Dallas, I think it’s interesting how they’ve held pretty firm with a PDO over 1000. Expect them to continue to regress over the next few weeks. The number of shots that they are allowing to reach Raul Fernandez is quite surprising, and the fact that they are producing an above-average save% makes me question how much longer they’ll stick around. Though, admittedly, much of that has been due to George John being MIA. His return from the hamstring strain will be crucial to stopping attacks before they get to the keeper.

Montreal’s Paradox

If you have listened to our podcasts or read through our stuff, you will have heard us talk about shot ratios a lot. That’s how many shots a team gets divided by how many shots it allows its opponents. A shot ratio of 1.5, for example, means that a team gets one-and-a-half times as many shots as its opponents. When soccer teams create extra opportunities for themselves, it generally leads to more goals and more points in the standings. And then there’s Montreal.

The Montreal Impact has been something of a Cinderella story this season, at least statistically. Leading up to its matchup with the Chicago Fire on Saturday, the Impact had recorded the second-worst shot attempt ratio in the entire league. Montreal had earned just 61 shot attempts with 28 on target to its opponents’ 95 shot attempts with 32 on target.  Yet somehow, the Impact had maintained a positive goal differential (+2) and the second-most points per match right behind FC Dallas.

Against Chicago, Montreal not only won on the scoreboard two-nil, it also won the shooting and possession battles. But that is a rare feat this year for the Impact, and it’s worth posing the question: Has Montreal been lucky this season, or does it do things that shot ratios and possession just can’t explain?

Using just shots on goal for now, I regressed goal scoring ratios against shot ratios to see how teams “should do,” as if shots on goal were the only thing that mattered. Even this early in the season, the regression was not all that bad (R² = 0.4). It also said that Montreal’s 0.94 shot ratio should lead to about the same goal ratio.* Well, that makes sense. If you generate roughly the same number of shots on target as your opponents, you should score about the same number of goals. The Impact, however, has scored nine goals to its opponents’ five, a 1.8 ratio, or a +4 differential, if you prefer.
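
For readers curious what a regression like this involves, here is a rough Python sketch. The post doesn’t say exactly how the model was specified, so this assumes a plain one-variable least-squares fit of goal ratio on shot ratio; the function name `fit_line` is hypothetical, and no league data is included.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus R^2.

    xs would be each team's shot ratio and ys its goal ratio
    (an assumed setup; the post does not give the exact specification).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y move together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2: share of the variation in y that the fitted line accounts for.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared
```

Given per-team shot ratios and goal ratios, the expected goal ratio for a team is `intercept + slope * shot_ratio`, and the returned `r_squared` is the kind of figure quoted above.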

An obvious thing to consider is finishing rate. Despite being outshot, the Impact players finish their attempts with goals more than twice as efficiently as opponents do. That ratio is the best in the league. My first instinct is that the Impact has been somewhat lucky, and that opponents will start to finish with more frequency. But there are two possible explanations I want to explore first before waving the cliché luck flag: the quality of opportunities for Montreal and the quality of opportunities for its opponents.

Harrison talked a little bit about Montreal’s counter-attacking style during a recent podcast, and there’s a possibility that the Impact’s style allows low-quality opportunities to its opponents, leading to higher-percentage opportunities for itself on the counter attack. (Before we investigate, it should be noted that Montreal’s schedule has featured teams that average out to be, well, league-average when it comes to finishing.)

Let’s take Saturday’s match against the Fire as an example of the tools I’m using. Check out the Opta chalkboard for yourself here, and you can see from where teams are shooting and scoring by clicking the appropriate boxes for team and statistic of interest. During this particular game, I have Montreal down for 16 scoring attempts: nine from outside the box, six inside, and one from right on the edge. Both its goals were scored from inside the box (though you could argue one was on the edge). Chicago, on the other hand, earned 11 attempts, ripping seven of those from outside the box, just two from inside, and two from the edge of the box. Chicago did not score. I did this for each of Montreal’s seven games this season.

Obviously things like angle matter, too, but I’m not going to pull out my protractor for this one. Here’s the breakdown for Montreal and its opponents on the season:

Stat Attempts (Montreal) Attempts (Opponents) Goals (Montreal) Goals (Opponents) Finishing (Montreal) Finishing (Opponents)
Inside Box 40 45 6 4 15.0% 8.9%
Outside Box 31 56 3 1 9.7% 1.8%
On Edge 6 5 0 0 0.0% 0.0%
Total 77 106 9 5 11.7% 4.7%

 

Montreal earns more shots inside the box than outside, and that might very well be a product of its system and players, rather than just dumb luck. While the Impact is being outshot in total, perhaps that stat is skewed slightly by shot selection. Montreal’s system seems to create a greater proportion of opportunities in the box. I would still expect some regression from Montreal this season back toward the middle of the standings—as its shot ratios are not favorable even after adjusting for quality—but perhaps not as far as a simple shot model would suggest.
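
The arithmetic behind that breakdown is easy to redo. Here is a small Python sketch using only the attempt and goal counts from the table above; the dictionary layout and the `summarize` helper are just my own illustration.

```python
# (attempts, goals) by shot location, copied from the breakdown table.
shots = {
    "Montreal":  {"inside": (40, 6), "outside": (31, 3), "edge": (6, 0)},
    "Opponents": {"inside": (45, 4), "outside": (56, 1), "edge": (5, 0)},
}


def summarize(zones):
    """Total attempts and goals, overall finishing rate, and the share
    of attempts taken from inside the box."""
    attempts = sum(a for a, _ in zones.values())
    goals = sum(g for _, g in zones.values())
    return {
        "attempts": attempts,
        "goals": goals,
        "finishing": goals / attempts,
        "inside_share": zones["inside"][0] / attempts,
    }


for side, zones in shots.items():
    s = summarize(zones)
    print(side, s["attempts"], s["goals"],
          f"{s['finishing']:.1%}", f"{s['inside_share']:.1%}")
```

By this count, Montreal takes about 52% of its attempts from inside the box against roughly 42% for its opponents, and its overall finishing rate is about two and a half times theirs.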

*One might note that Montreal’s attempts ratio is quite a bit worse than its shots-on-goal ratio, which isn’t even that good to begin with. It is apparently too early in the season for attempts ratios to explain much of anything with certainty, but shots models from past seasons suggest Montreal’s goal scoring ratio should probably be even worse than even-ish. That is, if shots aren’t broken down by quality.