Brazil is a better team than you think.

Posted on June 29, 2014 by Matthias Kullowatz

Admittedly, it hasn’t felt like Brazil has played all that well this World Cup. The referee seemingly made its two-goal victory over Croatia a more relaxed finish than it should have been; against Mexico, the fourth-place team from CONCACAF, it only managed a draw; and Cameroon was just low-hanging fruit. The host team then took a lot of flak for its play in the Round of 16 against Chile, especially for its performance after halftime. Indeed, Brazil conceded a silly goal on a defensive giveaway, and Chile had chances to win that game.

But I’m here to tell you that Brazil has played better than it has looked. Too often, it seems, the scorelines heavily influence our praise and criticism of what’s happening on the field.

Brazil dominated Group A in terms of Expected Goal Differential (xGD), and recorded the second-highest tally of any team during the group stage. Brazil’s 1.05 xGD during even (tied) gamestates ranked fifth among the 32 teams. You might have expected better from the hosts, but most teams only played about 130 minutes in such gamestates. That’s a big enough sample size to get a general idea of which are the best teams, but too small a sample to split hairs over the top five.

Croatia – June 12th

Against the Croats, a penalty awarded to Fred on what appeared to be a dive marred what was actually a solid performance by Brazil. Up to that controversial call, Brazil had earned 1.4 Expected Goals (xGoals) to Croatia’s 0.4, dominating in quantity and quality of shots. Even after taking the lead on the penalty, Brazil still edged Croatia in xGoals the rest of the way, 0.30 to 0.24—a differential that matches what we’d expect of teams that were leading in this tournament.

Mexico – June 17th

Mexico is a better team than their last-second World Cup qualification (and that commentator) would suggest. It led the CONCACAF Hexagonal (the Hex!) in shot ratios and is currently ranked 13th in the world in the Soccer Power Index (though some of that improved ranking is because of their tie against Brazil). Despite a disappointing 0 – 0 tie on the scoreboard, Brazil’s 1.4 xGoals again dwarfed that of its opponents. Mexico totaled just 0.5 xGoals.

Cameroon – June 23rd

There’s not much to say about this one. Brazil’s 1.9 xGD against Cameroon was the third highest discrepancy thus far in the tournament, trailing only France’s drubbing of Honduras and Germany’s handling of Portugal. It should be noted that both France and Germany enjoyed a man advantage for the majority of those games.

Chile – June 28th

For Chile, the scoreboard and their well-developed rapport with the woodwork are clear indications that they could have won this game. However, the opportunity creation department informs us that Brazil probably should have won, as it did. 94 percent of this game’s shots were taken during an even gamestate, either 0 – 0 or 1 – 1, and Brazil outpaced Chile during that time by a full expected goal. Even after halftime, when Brazil looked disorganized and sloppy, it still edged Chile 1.1-to-0.7 in xGoals.

Perhaps Brazil has not “looked” the part of tournament favorites during its first four games, but its shot creation numbers suggest it is definitely playing like one of the best teams. Add that to their pre-tournament resume, throw in the home-field advantage that’s not going away anytime soon, and there is little doubt that Brazil is still the favorite to win this World Cup—maybe not with a majority of the probability, but definitely with a plurality.

World Cup Statistics

Posted on June 27, 2014 by Matthias Kullowatz

We have begun rolling out World Cup statistics in the same format as those we provide for MLS. Scroll over “World Cup 2014” along the top bar to check it out!

In the Team Stats Tables, one may observe that the recently-eliminated Spain outshot its opponents, and a much higher proportion of its possession occurred in the attacking third than that of its opponents.

Our team-by-team Expected Goals data shows that England played better than its results would suggest, earning more dangerous opportunities than its opponents. It was a matter of inches for Wayne Rooney a few times there…

SHOT MAP: Wayne Rooney has now missed with his right foot, his left foot & with a header. An English hat-trick. #ENG pic.twitter.com/HfR86YL23w

— Squawka Football (@Squawka) June 19, 2014

Finishing data suggests that Lionel Messi has made the most of his opportunities—surprise, surprise—but did you know that none of Thomas Muller’s seven shots were assisted?

And despite giving up a tournament-high seven goals in the group stages, our Goalkeeping Data actually suggests that Honduran goalkeeper Noel Valladares performed admirably—especially considering the onslaught of shots he faced that were worth a tournament-most 0.4 goals per shot on target.

USA versus Ghana: Gamestates Analysis

Posted on June 17, 2014 by Matthias Kullowatz

In analyzing MLS shot data, I have learned that, with small sample sizes, how a team plays when the game is tied is a strong indication of how well it will do in future games. The US Men’s National Team spent just four-and-a-half minutes tied Monday evening, the epitome of small sample sizes. In case you were curious, the US generated two shots during that time worth about 0.13 goals. Ghana did not generate a shot over those 4.5 minutes.

The next most-important gamestate for a team is being ahead. With at least 17 games of data in MLS, knowing how well a team did when it was leading becomes an important piece of information for predicting that team’s future success. Almost 95 minutes were spent with the US in the lead, a time in which the USMNT took six shots worth 0.5 goals to Ghana’s 21 shots worth 1.7 goals.

Though MLS is definitely far below the level of even a USA-versus-Ghana match, I think a lot of the statistics from our MLS database still apply. I wrote a few weeks back about how away teams that were satisfied with the current gamestate went overboard with their conservative play. I think that could apply to the World Cup, as well. By most statistical accounts, USA versus Ghana was a fairly even matchup going in, yet the US played an annoyingly conservative style after going up a goal early. It gave up a majority of possession to Ghana in the attacking third, completing just 81 passes to Ghana’s 171 in that zone, not to mention the US being tripled up in Expected Goals when it was ahead.

Granted, Expected Goals likely overestimates the losing team’s chances of scoring. But not by much. In even gamestates in MLS, we see that teams are expected to score 1.29 goals per game, and they actually score 1.30 goals per game. Virtually no difference. However, when teams are behind they are expected to score 1.79 goals per game, yet they only score about 1.60, an 11-percent drop. This discrepancy is likely due in large part to the leading team’s defense being more packed in and capable of blocking shots. Indeed, teams that are losing have their shots blocked 27 percent of the time, while teams that are winning only have their shots blocked 22 percent of the time.

All that was simply to say that Ghana’s 1.7 Expected Goals are still representative of a team that was in control, and too much control for my comfort level. Even if we assume it was really about 1.5 Expected Goals against a defensive-minded American side, that still triples the USA’s shot potential. Either the US strategy was overly conservative, or Ghana is really that much better. I’d like to believe in the former, but it’s picking the lesser of two evils.
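The discount applied above can be reproduced with a few lines of arithmetic. This is only a sketch of the reasoning, using the MLS figures quoted in the post (1.79 expected versus about 1.60 actual) and the match totals (1.7 for Ghana, 0.5 for the US while it led). The function names are made up for illustration.

```python
# MLS per-game figures quoted in the post for the gamestate being discounted.
EXPECTED_PER_GAME = 1.79
ACTUAL_PER_GAME = 1.60


def conversion_ratio(expected, actual):
    """Share of expected goals that actually turned into goals."""
    return actual / expected


def discounted_xg(raw_xg, ratio):
    """Scale a raw Expected Goals total by an observed conversion ratio."""
    return raw_xg * ratio


ratio = conversion_ratio(EXPECTED_PER_GAME, ACTUAL_PER_GAME)
drop_pct = (1 - ratio) * 100            # roughly an 11-percent drop
ghana_adjusted = discounted_xg(1.7, ratio)
usa_xg = 0.5                            # US Expected Goals while in the lead

print(round(drop_pct), round(ghana_adjusted, 1), round(ghana_adjusted / usa_xg, 1))
# 11 1.5 3.0
```

Even after the discount, the ratio stays at about three to one.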

It just doesn’t make sense to me to play conservatively to maintain the status quo. It invariably leads to massive discrepancies in Expected Goals, and too often allows the opposition an easier way to come back.

Sporting KC still has edge in the capital

Posted on May 31, 2014 by Matthias Kullowatz

If you come in from a certain angle, you can hype this evening’s DC United-Sporting KC game as the Eastern Conference’s clash of the week. The two teams enter this game tied for the second seed with two of the best goal differentials in the conference. With DCU playing at home, and Sporting missing half its team, the edge would appear to go to United. But not so fast.

Despite being inseparable by points, DCU and Sporting are about as far apart as two teams can be by Expected Goal Differential. Sporting sits atop the league at +0.62 per game,* while DCU is ahead of only San Jose with -0.33. If we look to even gamestates—during only those times when the score was tied and the teams were playing 11-on-11—the chasm between them grows even wider. Sporting’s advantage over DCU in Even xGD is more than 1.5 goals per game.*

To this point, as early as it is in the season, I have found that winners are best predicted by Even xGD, rather than overall goal differential. Though the sample size of shots is smaller for each team in these scenarios, the information is less clouded by the various tactics that are employed when one team goes ahead, or when one team loses a player.

Of course, Sporting will be missing the likes of Graham Zusi, Matt Besler, and Lawrence Olum, as they have for the past three games. The loss of those key players has mostly coincided with their current four-game winless stretch, and it would be tempting to argue that they are not in form. However, over those last three games, Sporting’s overall xGD is +0.27 per game,* and its Even xGD is +0.68.*

Making predictions in sports is generally just setting oneself up for failure—especially in a sport where there are three outcomes—but I will say this. Sporting is likely better than the +180 betting line I’m seeing this morning.

*I use the phrase “per game” for simplicity, but xGD is actually calculated on a per-minute basis in our season charts. Per game implies per 96 minutes, which is the average length of an MLS game.
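As a rough sketch of that footnote, a per-minute differential can be rescaled to a 96-minute "game" as below. The function name and the sample totals are hypothetical; only the 96-minute figure comes from the footnote.

```python
MINUTES_PER_GAME = 96  # average length of an MLS game, per the footnote


def xgd_per_game(xg_for, xg_against, minutes_played):
    """Convert raw Expected Goals totals into xGD per 96 minutes."""
    per_minute = (xg_for - xg_against) / minutes_played
    return per_minute * MINUTES_PER_GAME


# Hypothetical totals: 10.0 xG for, 5.0 xG against over 960 minutes.
example = xgd_per_game(10.0, 5.0, 960)
print(round(example, 2))
```

The same function works for a subset of minutes, such as even gamestates only, as long as the totals and minutes come from the same subset.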

Calculating Expected Goals 2.0

Posted on May 8, 2014 by Matthias Kullowatz

I wrote a post similar to this a while back, outlining the process for calculating our first version of Expected Goals. This is going to be harder. Get out your TI-89 calculators, please. (Or you can just use my Expected Goals Cheatsheet).

Expected Goals is founded on the idea that each shot has a certain probability of going in based on some important details about that shot. If we add up all the probabilities of a team’s shots, that gives us its Expected Goals. Our goal is that this metric conveys the quality of opportunities a team earns for itself. For shooters and goalkeepers, the details about the shot change a little bit, so pay attention.

The formulas are all based on a logistic regression, which allows us to sort out the influence of each shot’s many details all at once. The formula changes slightly each week because we base the regression on all the data we have, including each week’s new data, but it won’t change by much.

Expected Goals for a Team

Start with -0.19
Subtract 0.95 if the shot was headed (0.0 if it was kicked or othered).
Subtract 0.74 if the shot was taken from a corner kick (by Opta definition)
Subtract one of the following amounts for the shot’s location:

Zone 1 through Zone 6 (the per-zone amounts are not shown here; the example below uses 2.37 for zone 3)

Now you have what are called log odds of that shot going in. To find the odds of that shot going in, put the log odds in an exponent over the number “e”.

Finally, to find the estimated probability of that shot going in, take the odds and divide by 1 + odds.

Example: Shot from zone 3, header, taken off a corner kick:

-0.19 – 0.95 – 0.74 – 2.37 = -4.25

e^(-4.25) = .0143

.0143 / (1 + .0143) = 0.014 or a 1.4% chance of going in.

A team that took one of these shots would earn 0.014 expected goals.
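The steps above can be sketched in a few lines of Python. This is an illustration, not the site's actual code. The coefficients are the ones quoted in the post, and the zone amount is passed in as an argument because only the zone 3 value (2.37) is given in the example.

```python
import math

# Team-model coefficients as quoted in the post (they shift slightly each week).
INTERCEPT = -0.19
HEADER_PENALTY = 0.95
CORNER_PENALTY = 0.74


def team_shot_probability(zone_penalty, headed=False, from_corner=False):
    """Estimated probability that a shot goes in, per the team model.

    zone_penalty is the location amount to subtract for the shot's zone.
    """
    log_odds = INTERCEPT - zone_penalty
    if headed:
        log_odds -= HEADER_PENALTY
    if from_corner:
        log_odds -= CORNER_PENALTY
    odds = math.exp(log_odds)          # "e" raised to the log odds
    return odds / (1.0 + odds)         # odds -> probability


# Worked example from the post: zone 3, header, off a corner kick.
p = team_shot_probability(2.37, headed=True, from_corner=True)
print(round(p, 3))  # 0.014
```

A team's Expected Goals for a game would then be the sum of this probability over all of its shots.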

Expected Goals for Shooter

Start with -0.28
Subtract 0.83 if the shot was headed (0.0 if it was kicked or othered).
Subtract 0.65 if the shot was taken from a corner kick (by Opta definition).
Add 2.54 if the shot was a penalty kick.
Add 0.71 if the shot was taken on a fastbreak (by Opta definition).
Add 0.16 if the shot was taken from a set piece (by Opta definition).
Subtract one of the following amounts for the shot’s location:

Zone 1: 0.0
Zone 2: 1.06
Zone 3: 2.32
Zone 4: 2.61
Zone 5: 3.48
Zone 6: 2.99

Now you have what are called log odds of that shot going in. To find the odds of that shot going in, put the log odds in an exponent over the number “e”.

Finally, to find the estimated probability of that shot going in, take the odds and divide by 1 + odds.

Example: A penalty kick

-0.28 + 2.54 – 1.06 = 1.2

e^(1.2) = 3.320

3.320/ (1 + 3.320) = 0.769 or a 76.9% chance of going in.

A player that took a penalty would gain an additional 0.769 Expected Goals. If he missed, then he would be underperforming his Expected Goals by 0.769.
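Here is a minimal sketch of the shooter model, using the coefficients quoted above. It assumes the six location amounts are listed in zone order (zone 1 first). The function and argument names are invented for illustration.

```python
import math

# Location amounts in the order listed in the post (assumed to be zones 1-6).
SHOOTER_ZONE_PENALTY = {1: 0.0, 2: 1.06, 3: 2.32, 4: 2.61, 5: 3.48, 6: 2.99}


def shooter_shot_probability(zone, headed=False, from_corner=False,
                             penalty=False, fastbreak=False, set_piece=False):
    """Estimated probability of scoring, per the shooter model."""
    log_odds = -0.28 - SHOOTER_ZONE_PENALTY[zone]
    if headed:
        log_odds -= 0.83
    if from_corner:
        log_odds -= 0.65
    if penalty:
        log_odds += 2.54
    if fastbreak:
        log_odds += 0.71
    if set_piece:
        log_odds += 0.16
    # Equivalent to odds / (1 + odds) with odds = e^(log odds).
    return 1.0 / (1.0 + math.exp(-log_odds))


pk = shooter_shot_probability(zone=2, penalty=True)
print(round(pk, 3))       # 0.769

# Goals above expectation for a single missed penalty: 0 goals minus 0.769 xG.
print(round(0 - pk, 3))   # -0.769
```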

Expected Goals for Goalkeeper

*These are calculated only from shots on target.

Start with 1.61
Subtract 0.72 if the shot was headed (0.0 if it was kicked or othered).
Add 1.58 if the shot was a penalty kick.
Add 0.42 if the shot was taken from a set piece (by Opta definition).
Subtract one of the following amounts for the shot’s location:

Zone 1: 0.0
Zone 2: 1.10
Zone 3: 2.57
Zone 4: 2.58
Zone 5: 3.33
Zone 6: 3.21

Subtract 1.37 if the shot was taken toward the middle third of the goal (horizontally).
Subtract 0.29 if the shot was taken at the lower half of the goal (vertically).
Add 0.35 if the shot was taken outside the width of the six-yard box and was directed toward the far post.

Now you have what are called log odds of that shot going in. To find the odds of that shot going in, put the log odds in an exponent over the number “e”.

Finally, to find the estimated probability of that shot going in, take the odds and divide by 1 + odds.

Example: Shot from zone 2, kicked toward lower corner, from the run of play.

1.61 – 1.10 – 0.29 = 0.22

e^(0.22) = 1.246

1.246/ (1 + 1.246) = 0.555 or a 55.5% chance of going in.

A keeper that faced one of these shots would gain an additional 0.555 Expected Goals against. If he saved it, then he would be outperforming his Expected Goals by 0.555.
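The keeper model can be sketched the same way, with a small helper showing how a keeper's running tally might be kept. The coefficients are the ones quoted above. The helper name `goals_saved_above_expected` and the `(probability, conceded)` data layout are assumptions made for this example.

```python
import math

# Location amounts from the keeper model, zones one through six.
KEEPER_ZONE_PENALTY = {1: 0.0, 2: 1.10, 3: 2.57, 4: 2.58, 5: 3.33, 6: 3.21}


def keeper_shot_probability(zone, headed=False, penalty=False, set_piece=False,
                            middle_third=False, lower_half=False,
                            wide_far_post=False):
    """Estimated probability that a shot on target goes in (keeper model)."""
    log_odds = 1.61 - KEEPER_ZONE_PENALTY[zone]
    if headed:
        log_odds -= 0.72
    if penalty:
        log_odds += 1.58
    if set_piece:
        log_odds += 0.42
    if middle_third:
        log_odds -= 1.37
    if lower_half:
        log_odds -= 0.29
    if wide_far_post:
        log_odds += 0.35
    return 1.0 / (1.0 + math.exp(-log_odds))


def goals_saved_above_expected(shots):
    """shots: list of (probability, conceded) pairs for shots on target."""
    expected = sum(p for p, _ in shots)
    conceded = sum(1 for _, went_in in shots if went_in)
    return expected - conceded


# Worked example from the post: zone 2, kicked toward a lower corner, run of play.
p = keeper_shot_probability(zone=2, lower_half=True)
print(round(p, 3))                                      # 0.555
print(round(goals_saved_above_expected([(p, False)]), 3))  # 0.555 if saved
```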

Frequently Asked Questions

1. Why a regression model? Why not just subset each shot in a pivot table by its type across all variables?

I think a lot of information–degrees of freedom we call it–would be lost if I were to partition each shot into a specific type by location, pattern of play, body part, and for keepers, placement. The regression gets more information about, say, headed shots in general, rather than “headed shots from zone 2 off corner kicks,” of which there are far fewer data points.

2. Why don’t you include info about penalty kicks in the team model?

Penalty kicks are not earned in a stable manner. Teams that get lots of PK’s early in the season are no more likely to get additional PK’s later in the season. Since we want this metric to be predictive at the team level, including penalty kicks would cloud that prediction for teams that have received an extreme number of PK’s thus far.

3. The formula looks quite a bit different for shooters versus for keepers. How is that possible since one is just taking a shot on the other?

There are a few reasons for this. The first is that the regression model for keepers is based only on shots on target. It is meant only to assess their ability to produce quality saves. A different data set leads to different regression results. Also, we are now accounting for the shooter’s placement. It is very possible that corner kicks are finished less often than shots from other patterns of play because they are harder to place. By including shot placement information in the keeper model, the information about whether the shot came off a corner is now no longer needed for assessing the keeper’s ability.

4. Why don’t you include placement for shooters, then?

We wish to assess a shooter’s ability to create goals beyond what’s expected. Part of that skill is placement. When a shooter has recorded more goals than his expected goals, it indicates a player that is outperforming his expectation. It could be because he places well, or that he is deceptive, or he is good at getting opportunities that are better than what the model thinks. In any case, we want the expected goals to reflect the opportunities earned, and thus the actual goals should help us to measure finishing ability to some extent.

Looking for the model-busting formula

Posted on April 18, 2014 by Matthias Kullowatz

Well that title is a little contradictory, no? If there’s a formula to beat the model then it should be part of the model and thus no longer a model buster. But I digress. That article about RSL last week sparked some good conversation about figuring out what makes one team’s shots potentially worth more than those of another team. RSL scored 56 goals (by their own bodies) last season, but were only expected to score 44, a 12-goal discrepancy. Before getting into where that came from, here’s how our Expected Goals data values each shot:

Shot Location: Where the shot was taken
Body part: Headed or kicked
Gamestate: xGD is calculated in total, and also specifically during even gamestates when teams are most likely playing more, shall we say, competitively.
Pattern of Play: What the situation on the field was like. For instance, shots taken off corner kicks have a lower chance of going in, likely due to a packed 18-yard box. These things are considered, based on the Opta definitions for pattern of play.

But these exclude some potentially important information, as Steve Fenn and Jared Young pointed out. I would say, based on their comments, that the two primary hindrances to our model are:

How to differentiate between the “sub-zones” of each zone. As Steve put it, was the shot from the far corner of Zone 2, more than 18 yards from goal? Or was it from right up next to zone 1, about 6.5 yards from goal?
How clean a look the shooter got. A proportion of blocked shots could help to explain some of that, but we’re still missing the time component and the goalkeeper’s positioning. How much time did the shooter have to place his shot and how open was the net?

Unfortunately, I can’t go get a better data set right now so hindrance number 1 will have to wait. But I can use the data set that I already have to explore some other trends that may help to identify potential sources of RSL’s ability to finish. My focus here will be on their offense, using some of the ideas from the second point about getting a clean look at goal.

Since we have information about shot placement, let’s look at that first. I broke down each shot on target by which sixth of the goal it targeted to assess RSL’s accuracy and placement. Since the 2013 season, RSL is second in the league in getting its shots on goal (37.25%), and among those shots, RSL places the ball better than any other team. Below is a graphic of the league’s placement rates versus those of RSL over that same time period. (The corner shots were consolidated for this analysis because it didn’t matter to which corner the shot was placed.)

RSL obviously placed shots where the keeper was least likely to be: the corners. That’s a good strategy, I hear. If I include shot placement in the model, RSL’s 12-goal difference in 2013 completely evaporates. This new model expected them to score 55.87 goals in 2013, almost exactly the 56 they scored.

Admittedly, it isn’t earth-shattering news that teams score by shooting at the corners, but I still think it’s important. In baseball, we sometimes assess hitters and pitchers by their batting average on balls in play (BABIP), a success rate during specific instances only when the ball is contacted. It’s obvious that batters with higher BABIPs will also have higher overall batting averages, just like teams that shoot toward the corners will score more goals.

But just because it is obvious doesn’t mean that this information is worthless. On the contrary, baseball’s sabermetricians have figured out that BABIP takes a long time to stabilize, and that a player who is outperforming or underperforming his BABIP is likely to regress. Now that we know that RSL is beating the model due to its shot placement, this raises the question: do accuracy and placement stabilize at the team level?

To some degree, yes! First, there is a relationship between a team’s shots on target totals from the first half of the season and the second half of the season. Between 2011 and 2013, the correlation coefficient for 56 team-seasons was 0.29. Not huge, but it does exist. Looking further, I calculated the differences between teams’ expected goals in our current model and teams’ expected goals in this new shot placement model. The correlation from first half to second half on that one was 0.54.
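For readers who want to try this kind of stability check, here is a minimal sketch of a first-half versus second-half correlation. It uses a hand-rolled Pearson coefficient, and the sample numbers are invented purely for illustration; they are not the MLS data behind the 0.29 and 0.54 figures.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical values: one entry per team-season, e.g. shots on target
# in the first 17 games and in the last 17 games.
first_half = [70, 82, 65, 90, 77]
second_half = [74, 79, 70, 85, 69]

print(round(pearson(first_half, second_half), 2))
```

A value near zero would suggest the stat does not carry over from one half of the season to the next, while a value closer to one would suggest a repeatable skill.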

To summarize, getting shots on goal can be repeated to a small degree, but where those shots are placed in the goal can be repeated at the team level. There is some stabilization going on. This gives RSL fans hope that at least some of this model-busting is due to a skill that will stick around.

Of course, that still doesn’t tell us why RSL is placing shots well as a team. Are their players more skilled? Or is it the system that creates a greater proportion of wide-open looks?

Seeking details that may indicate a better shot opportunity, I will start with assisted shots. A large proportion of assisted shots may indicate that a team will find open players in front of net more often, thus creating more time and space for shots. However, an assisted shot is no more likely to go in than an unassisted one, and RSL’s 74.9-percent assist rate is only marginally better than the league’s 73.1 percent, anyway. RSL actually scored about six fewer goals than expected on assisted shots, and six more goals than expected on unassisted shots. It becomes apparent that we’re barking up the wrong tree here.*

Are some teams more capable of not getting their shots blocked? If so, then those teams would likely finish better than the league average. One little problem with this theory is that RSL gets its shots blocked more often than the league average. Plus, in 2013, blocked shot percentages from the first half of the season had a (statistically insignificant) negative correlation to blocked shots in the second half of the season, suggesting strongly that blocked shots are more influenced by randomness and the defense, rather than by the offense which is taking the shots.

Maybe some teams get easier looks by forcing rebounds and following them up efficiently. Indeed, in 2013 RSL led the league in “rebound goals scored” with nine, where a rebounded shot is one that occurs within five seconds of the previous shot. That beat their expected goals on those particular shots by 5.6 goals. However, earning rebounds does not appear to be much of a skill, and neither does finishing them. The correlation between first-half and second-half rebound chances was a meager–and statistically insignificant–0.13, while the added value of a “rebound variable” to the expected goals model was virtually unnoticeable. RSL could be the best team at tucking away rebounds, but that’s not a repeatable league-wide skill. And much of that 5.6-goal advantage is explained by the fact that RSL places the ball well, regardless of whether or not the shot came off a rebound.
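The rebound definition used above (a shot within five seconds of the previous shot) is simple to apply to a list of shot times. The sketch below assumes one team's shots, given as timestamps in seconds in ascending order, and treats "within five seconds" as a gap of five seconds or less. Both are assumptions about the data, not something the post specifies.

```python
REBOUND_WINDOW_SECONDS = 5


def flag_rebounds(shot_times):
    """Return one boolean per shot: True if it follows the previous shot
    by REBOUND_WINDOW_SECONDS or less.

    shot_times: one team's shot timestamps in seconds, ascending order.
    """
    flags = []
    previous = None
    for t in shot_times:
        is_rebound = previous is not None and (t - previous) <= REBOUND_WINDOW_SECONDS
        flags.append(is_rebound)
        previous = t
    return flags


# Hypothetical shot times (seconds from kickoff).
print(flag_rebounds([100, 103, 250, 256, 258]))
# [False, True, False, False, True]
```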

Jared did some research for us showing that teams that get an extremely high number of shots within a game are less likely to score on each shot. It probably has something to do with going for quantity rather than quality, and possibly playing from behind and having to fire away against a packed box. While that applies within a game, it does not seem to apply over the course of a season. Between 2011 and 2013, the correlation between a team’s attempts per game and finishing rate per attempt was virtually zero.

If RSL spends a lot of time in the lead and very little time playing from behind–true for many winning teams–then its chances may come more often against stretched defenses. RSL spent the fourth most minutes in 2013 with the lead, and the fifth fewest minutes playing from behind. In 2013, there was a 0.47 correlation between teams’ abilities to outperform Expected Goals and the ratio of time they spent in positive versus negative gamestates.

If RSL’s boost in scoring comes mostly from those times when they are in the lead, that would be bad news since their Expected Goals data in even gamestates was not impressive then, and is not impressive now. But if the difference comes more from shot placement, then the team could retain some of its goal-scoring prowess. 8.3 goals of that 12-goal discrepancy I’m trying to explain in 2013 came during even gamestates, when perhaps their ability to place shots helped them to beat the expectations. But the other 4-ish additional goals likely came from spending increased time in positive gamestates. It is my guess that RSL won’t be able to outperform their even gamestate expectation by nearly as much this season, but at this point, I wouldn’t put it past them either.

We come to the unsatisfying conclusion that we still don’t know exactly why RSL is beating the model. Maybe the players are more skilled, maybe the attack leaves defenses out of position, maybe it spent more time in positive gamestates than it “should have.” And maybe RSL just gets a bunch of shots from the closest edge of each zone. Better data sets will hopefully sort this out someday.

*This doesn’t necessarily suggest that assisted shots have no advantage. It could be that assisted shots are more commonly taken by less-skilled finishers, and that unassisted shots are taken by the most-skilled finishers. However, even if that is true, it wouldn’t explain why RSL is finishing better than expected, which is the point of this article.

ASA Podcast XLIV: The One Where We Talk About What We Write About

Posted on April 17, 2014 by guy

Harrison and Matty discuss their two most recent articles, respectively about Harrison’s Shots Created per 90 statistic and Matty’s obsessive need to put RSL down because its players are more gooder at soccer than he is. It’s a short one, perfect for your commute!

Real Salt Lake: Perennial Model Buster?

Posted on April 11, 2014 by Matthias Kullowatz

If you take a look back at 2013’s expected goal differentials, probably the biggest outlier was MLS Cup runner up Real Salt Lake. Expected to score 0.08 fewer goals per game than its opponents, RSL actually scored 0.47 more goals than its opponents. That translates to a discrepancy of about 19 unexplained goals for the whole season. This year, RSL finds itself second in the Western Conference with a goal differential of a massive 0.80. However, like last year, the expected goal differential is lagging irritatingly behind at –0.77.
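The "about 19 goals" figure follows from scaling the per-game gap up to a full season. Here is a quick sketch of that arithmetic. It assumes a 34-game regular season, which the post does not state explicitly.

```python
GAMES_PER_SEASON = 34  # assumption: MLS regular-season games per team in 2013


def season_goal_discrepancy(actual_gd_per_game, expected_gd_per_game, games):
    """Unexplained goals over a season: per-game gap times games played."""
    return (actual_gd_per_game - expected_gd_per_game) * games


# RSL 2013: +0.47 actual versus -0.08 expected goal differential per game.
print(round(season_goal_discrepancy(0.47, -0.08, GAMES_PER_SEASON), 1))  # 18.7
```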

There are two extreme explanations for RSL’s discrepancy in observed versus expected performance, and while the truth probably lies in the middle, I think it’s valuable to start the discussion at the extremes and move in from there.

It could be that RSL plays a style and has the personnel to fool my expected goal differential statistic. Or, it could be that RSL is one lucky son of a bitch. Or XI lucky sons of bitches. Whatever.

Here are some ways that a team could fool expected goal differential:

It could have the best fucking goalkeeper in the league.
It could have players that simply finish better than the league average clip in each defined shot type.
It could have defenders that make shots harder than they appear to be in each defined shot type–perhaps by forcing attackers onto their weak feet, or punching attackers in the balls whilst winding up.
That’s about it.

We ~~know~~ are pretty sure that RSL does indeed have the best goalkeeper in the league, and Will and I estimated Nick Rimando’s value at anywhere between about six and eight goals above average* during the 2013 season. That makes up a sizable chunk of the discrepancy, but still leaves at least half unaccounted for.

The finishing ability conversation is still a controversial one, but that’s where we’re likely to see the rest of the difference. RSL scored 56 goals (off their own bodies rather than those of their opponents), but were only expected to score about 44. That 12-goal difference can be conveniently explained by their five top scorers–Alvaro Saborio, Javier Morales, Ned Grabavoy, Olmes Garcia, and Robbie Findley–who scored 36 goals between them while taking shots valued at 25.8 goals. (see: Individual Expected Goals, and yes it’s biased to look at just the top five goal scorers, but read on.)

Here’s the catch, though. Using the sample of 28 players that recorded at least 50 shots last season and at least 5 shots this season, the correlation coefficient for the goals above expectation statistic is –0.43. It’s negative. Basically, players that were good last year have been bad this year, and players that were bad last year have been good this year. That comes with some caveats–and if the correlation stays negative then that is a topic fit for another whole series of posts–but for our purposes here it suggests that finishing isn’t stable, and thus finishing isn’t really a reliable skill. The fact that RSL players have finished well for the last 14 months means very little for how they will finish in the future.

Since I said there was a third way to fool expected goal differential (defense), I should point out that once we account for Rimando, RSL’s defense allowed about as many goals as expected. Thus the primary culprits of RSL’s ability to outperform expected goal differential have been Nick Rimando and its top five scorers. So now we can move on to the explanation on the other extreme, luck.

RSL has been largely lucky, using the following definition of lucky: Scoring goals they can’t hope to score again. A common argument I might expect is that no team could be this “lucky” for this long. If you’re a baseball fan, I urge you to read my piece on Matt Cain, but if not, here’s the point. 19 teams have played soccer in MLS the past two seasons. The probability that at least one of them gets lucky for 1.2 seasons worth of games is actually quite high. RSL very well may be that team–on offense, anyway.
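The "at least one of them gets lucky" argument can be made concrete with a back-of-the-envelope calculation. The sketch below assumes each team's luck is independent and that any single team has some small chance of a run this extreme. The 5 percent figure is purely hypothetical, chosen only to show the shape of the argument.

```python
def prob_at_least_one(p_single, n_teams):
    """Chance that at least one of n independent teams hits an outcome
    that each team individually hits with probability p_single."""
    return 1.0 - (1.0 - p_single) ** n_teams


# Hypothetical: a 5% chance per team of a lucky run this extreme, 19 teams.
print(round(prob_at_least_one(0.05, 19), 2))
```

Even with a small per-team probability, the chance that somebody in a 19-team league looks this lucky comes out well above one half under these assumptions.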

Unless RSL’s top scorers are all the outliers–which is not impossible, but unlikely–then RSL is likely in for a rude awakening, and a dogfight for a playoff spot.

*Will’s GSAR statistic is actually Goals Saved Above Replacement, so I had to calibrate.

ASA Podcast XLIII: The one where Matty Makes the Call

Posted on April 11, 2014 by guy

Hey everyone, here is our latest ~~terrible~~ exhilarating podcast for your listening pleasure. The delay this week in posting was largely due to us switching to ‘Mixcloud’ for the foreseeable hosting future as we move away from our current site and into a domain of our own. Admittedly, we ate up a good 15 minutes in the start of the podcast talking about the Seattle-Portland match, but you saw that coming…right? The rest of the podcast is also solid, and perhaps more importantly, less Cascadia-specific, so don’t give up on it just because of that segment!

Montreal and Philadelphia Swap Young Strikers

Posted on April 4, 2014 by guy

Okay, I’m sure that by now, given you follow our site, you’ve been made aware that the Philadelphia Union (an underrated team in my opinion) traded their young 20-year-old striker Jack McInerney to the Montreal Impact for their young 22-year-old striker Andrew Wenger. The trade has a very Matt Garza for Delmon Young feel to it, leaving me with an odd taste in my mouth. Are the Montreal Impact selling low on Andrew Wenger? It’s presumable, at the very least, that they know something that we don’t about him and his nature. The question becomes, then, is that assessment accurate?

Obviously the idea of a poacher is one that is met with a bit of contention: how do you measure being in the “right place at the right time” for an individual? However, assessing the 86 shots taken by ‘JackMac’ in the 2013 season, courtesy of digging around on the MLS Chalkboards, we know that no fewer than 57 of them came from inside the 18-yard box. It’s obvious that he’s a player that can get the ball in advantageous locations. Already this season he’s put together 12 shots, and 11 of them have come inside the 18-yard box, with 6 coming directly in front of goal. He’s been appropriately tagged on Twitter as a “fox in the box” (hold the sexual innuendos), and I think the term poacher probably comes naturally with that association. Unfortunately, that term may imply that he’s more lucky than good. I’m not sure I entirely buy that approach.

Meanwhile, with everyone’s attention focused on McInerney, audaciously stamped as ‘The American Chicharito’ and already called into the USMNT camp for training during the Gold Cup, people are forgetting about Wenger and the potential that once made him a #1 overall MLS draft pick. Back in 2012, Wenger was painted as a potent and rising talent in MLS and was named to MLSSoccer.com’s 24 under 24 roster, coming in 7th overall. Just one year later McInerney jumped onto the list himself, rocketing to 4th overall, while Wenger was left off. The perpetual “what have you done for me lately?” seemed to come out in these rankings.

Wenger, despite all his talent, has run into a slew of injury-related setbacks the last two seasons; it’s not so much a failure to perform. The talent is still there, and I fully expect John Hackworth to tinker in an effort to get as much out of him as possible. The easy narrative here might just be that he is returning home to “revitalize his career” or something like that. Instead, I think Philadelphia possibly got an undervalued piece in this move.

Of the 31 shots Wenger has taken over the last two years, 24 came from inside the 18-yard box, a higher percentage than JackMac’s. Accordingly, you can see above with xGpSH (expected goals per shot) that Wenger’s average shot has been more likely to become a goal than his counterpart’s. Now, understand that this all comes with the requisite small-sample-size admission. Wenger has played less than half as much as McInerney and has taken fewer than half as many shots. However, estimates based on their current shot-creation performances put them near the level of Eddie Johnson, Will Bruin and Chris Rolfe in years past.
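For anyone who wants to check the "higher percentage" claim, here is a quick sketch of the arithmetic using only the shot counts quoted in this post. The function name `in_box_share` and the dictionary layout are just illustrative choices, and McInerney's 57 is a floor ("no fewer than"), so his share is a lower bound.

```python
# Shot counts as quoted in the post (from the MLS Chalkboards).
# McInerney's in-box count is "no fewer than 57", so treat it as a lower bound.
shots = {
    "McInerney (2013)": {"total": 86, "in_box": 57},
    "Wenger (last two years)": {"total": 31, "in_box": 24},
}


def in_box_share(total, in_box):
    """Fraction of a player's shots taken from inside the 18-yard box."""
    return in_box / total


for player, counts in shots.items():
    share = in_box_share(counts["total"], counts["in_box"])
    print(f"{player}: {share:.1%} of shots from inside the box")
```

That works out to roughly 66% for McInerney against roughly 77% for Wenger, with the small-sample caveat for Wenger's 31 shots very much in force.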

Creating shots isn’t everything. Creating shots in important positions is something. As we attempt to analyze the value of certain events on the pitch and how certain players are responsible for those events, we’ll see some things and maybe understand how to assess performances. It’s easy to overreact to certain things that come with doing this type of analysis, such as McInerney, Wenger, Bruin and Rolfe all averaging about 4.0 shots created per game individually. That seems rather important, but there is additional data missing. How much was each shot that they created worth? What other attributes do they bring to the match? This is just a simple breakdown of two players, comparing how they’ve impacted their respective clubs.

Personally, looking at all of this data, I’m of the mindset that Montreal got the better player. However, it’s extremely close, and that isn’t taking into account the rosters they are joining or how they might be utilized on the pitch with their new teams (4-3-3 concerns vs. 4-4-2 placement). I would say at this time the difference between the two is that one is younger and has more experience. That might be a bit of a simplistic approach, but honestly both create shots the same way in the same space. McInerney does so at a higher rate, but Wenger has made up for taking fewer shots by taking advantage of his more experienced partner, Marco Di Vaio, and feeding him opportunities.

This may be one of the more interesting trades in recent memory. I’m fascinated to watch what happens next and how each of these two players develops. Their career arcs will go a long way in providing the narrative for this trade, and I’m not so certain that this is as one-sided as some people might think. Referencing baseball again, the Tampa Bay (then Devil) Rays were largely regarded as having “sold low” on Delmon Young. We can now see, looking over the past decade, that he never managed to put together all those tools that we once believed he had. The lesson being: don’t be too quick to judge Philadelphia. This isn’t necessarily something that can be evaluated after just a single season, and time will reveal the significance of this day.

American Soccer Analysis

Numbers.

Category Archives: Expected Goals
