Brazil is a better team than you think.

Admittedly, it hasn’t felt like Brazil has played all that well this World Cup. The referee seemingly made its two-goal victory over Croatia a more relaxed finish than it should have been; against Mexico, the fourth-place team from CONCACAF, it only managed a draw; and Cameroon was just low-hanging fruit. The host team then took a lot of flak for its play in the Round of 16 against Chile, especially for its performance after halftime. Indeed, Brazil conceded a silly goal on a defensive giveaway, and Chile had chances to win that game.

But I’m here to tell you that Brazil has played better than it has looked. Too often, it seems, the scorelines heavily influence our praise and criticism of what’s happening on the field.

Brazil dominated Group A in terms of Expected Goal Differential (xGD), and recorded the second-highest tally of any team during the group stage. Brazil’s 1.05 xGD during even (tied) gamestates ranked fifth among the 32 teams. You might have expected better from the hosts, but most teams only played about 130 minutes in such gamestates. That’s a big enough sample size to get a general idea of which are the best teams, but too small a sample to split hairs over the top five.

Croatia – June 12th

Against the Croats, a penalty awarded to Fred on what appeared to be a dive marred what was actually a solid performance by Brazil. Up to that controversial call, Brazil had earned 1.4 Expected Goals (xGoals) to Croatia’s 0.4, dominating in quantity and quality of shots. Even after taking the lead on the penalty, Brazil still edged Croatia in xGoals the rest of the way, 0.30 to 0.24—a differential that matches what we’d expect of teams that were leading in this tournament.

Mexico – June 17th

Mexico is a better team than its last-second World Cup qualification (and that commentator) would suggest. It led the CONCACAF Hexagonal (the Hex!) in shot ratios and is currently ranked 13th in the world in the Soccer Power Index (though some of that improved ranking is because of its tie against Brazil). Despite a disappointing 0 – 0 tie on the scoreboard, Brazil’s 1.4 xGoals again dwarfed that of its opponent. Mexico totaled just 0.5 xGoals.

Cameroon – June 23rd

There’s not much to say about this one. Brazil’s 1.9 xGD against Cameroon was the third-highest discrepancy thus far in the tournament, trailing only France’s drubbing of Honduras and Germany’s handling of Portugal. It should be noted that both France and Germany enjoyed a man advantage for the majority of those games.

Chile – June 28th

For Chile, the scoreboard and their well-developed rapport with the woodwork are clear indications that they could have won this game. However, the opportunity creation department informs us that Brazil probably should have won, as it did. 94 percent of this game’s shots were taken during an even gamestate, either 0 – 0 or 1 – 1, and Brazil outpaced Chile during that time by a full expected goal. Even after halftime, when Brazil looked disorganized and sloppy, it still edged Chile 1.1-to-0.7 in xGoals.

Perhaps Brazil has not “looked” the part of tournament favorites during its first four games, but its shot creation numbers suggest it is definitely playing like one of the best teams. Add that to their pre-tournament resume, throw in the home-field advantage that’s not going away anytime soon, and there is little doubt that Brazil is still the favorite to win this World Cup—maybe not with a majority of the probability, but definitely with a plurality.

The Manaus effect, or lack thereof

During the United States’ game against Germany on Thursday, it was hard to go 10 minutes without hearing Ian Darke or Taylor Twellman mention Manaus and its effect on the players. The US Men’s National team played its previous game against Portugal in the “Jungle City,” as did Italy, England, Croatia and Cameroon, before each dropping three points in their next games.

Business Insider pointed out that those first four teams to play in Manaus lost by a combined score of 10 – 3, though it conceded the tiny sample size. A Washington Post article cited the same statistics, and pondered the possibility of a curse in Manaus. The Independent, based in the United Kingdom, noted on June 24th that each of the seven teams that played in Manaus lost its next game. That was confusing since only six teams had played in Manaus to that point, and only four of those had actually played a “next game.” But whatever. #stats

Graham Zusi, Sporting Kansas City’s All-star midfielder and starter for the USMNT, wasn’t having any of it, stating “I don’t think it was that bad to be honest. When it got down to it, at night it cooled off and the humidity wasn’t as bad. I think after about 24 hours the bodies felt great.”

Hugh Laurie would tell us that everybody lies, especially athletes on record, but there might be something to Zusi’s statement. Below is a chart depicting the average temperature, humidity and heat index for each game site. The weather stats were taken from Weather Underground at the beginning of the second half of each game.

City            Games  Temp (°F)  Humidity  Heat Index
Fortaleza           4       82.4       62%        85.4
Salvador            4       80.2       73%        83.5
Manaus              4       79.3       81%        82.1
Natal               4       78.5       83%        80.7
Cuiaba              4       78.4       66%        79.1
Brasilia            4       77.5       43%        78.7
Sao Paulo           4       69.4       55%        78.3
Belo Horizonte      4       76.1       40%        77.8
Recife              4       77.5       86%        77.8
Rio De Janeiro      4       75.7       71%        76.6
Porto Alegre        4       65.8       71%        75.5
Curitiba            4       64.0       79%        71.5

It’s reasonable to theorize that more extreme environments take their toll on the human body, even professional athletes. But if we’re going to get serious here, we need to consider all locales that were exceptionally uncomfortable. Manaus actually ranked third in average heat index, and had a lower average humidity than fourth-place Natal. Italy and England were the first to play in Manaus on June 14th and sparked the notion that it was a hell hole. But while they were duking it out in Manaus, Costa Rica and Uruguay were playing in Fortaleza, number one on that list up there. Though it was less humid to start the second half in Fortaleza, it was actually hotter, and Fortaleza’s halftime heat index beat that of Manaus by a few points, 87.3 to 84.6.
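
The ranking claims are easy to check against the chart. The short Python sketch below simply re-enters the chart's averages and sorts them by heat index; the variable and function names are mine, for illustration only.

```python
# Average second-half conditions per host city, copied from the chart above:
# (city, temperature, relative humidity in %, heat index)
SITES = [
    ("Fortaleza", 82.4, 62, 85.4),
    ("Salvador", 80.2, 73, 83.5),
    ("Manaus", 79.3, 81, 82.1),
    ("Natal", 78.5, 83, 80.7),
    ("Cuiaba", 78.4, 66, 79.1),
    ("Brasilia", 77.5, 43, 78.7),
    ("Sao Paulo", 69.4, 55, 78.3),
    ("Belo Horizonte", 76.1, 40, 77.8),
    ("Recife", 77.5, 86, 77.8),
    ("Rio De Janeiro", 75.7, 71, 76.6),
    ("Porto Alegre", 65.8, 71, 75.5),
    ("Curitiba", 64.0, 79, 71.5),
]


def rank_by(sites, column):
    """Return city names ordered from the highest to lowest value in `column`."""
    ordered = sorted(sites, key=lambda row: row[column], reverse=True)
    return [row[0] for row in ordered]


heat_rank = rank_by(SITES, 3)                      # column 3 is the heat index
humidity = {row[0]: row[2] for row in SITES}

print(heat_rank.index("Manaus") + 1)               # Manaus's place by heat index
print(humidity["Natal"] > humidity["Manaus"])      # was Natal more humid on average?
```

By these averages Manaus sits third, behind Fortaleza and Salvador, and Natal was the more humid of the two.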

It turns out that teams which most recently played in Natal, Salvador or Fortaleza—the other three extreme locations—did alright. Those teams outscored their opponents by a combined five goals. That makes it hard to believe that the conditions of Manaus were responsible for the downfalls of Italy, England and Croatia, though that still leaves the possibility of a non-weather-related curse.

To make this a legit study, there are some other factors we need to control for, and that is why God invented linear regression. Using ESPN’s (Nate Silver’s) Soccer Power Index, I controlled for each team’s overall ability, and then I measured the effects of extra rest and past-game heat index on the goal differential outcome. The output is below:

Variable                      Estimate  P-Value
Intercept                        -0.37    15.8%
SPI Ratings Differential          1.01     0.1%
Additional Days Rest (home)      -0.21    68.1%
Heat Index Differential           0.01    74.3%

If you’re not a linear regression kind of person, then basically what that chart up there says is that neither the heat index of the teams’ past games nor any rest discrepancy seemed to matter during this tournament. At least not in terms of goal differential. But we know that goal differential is finicky, and Expected Goals are a better indicator of team performance. Good thing we’ve got our World Cup Expected Goals data up and running! If we measure team performance by some Expected Goal Differential statistics (xGD), then we get these linear regression outputs:

Expected Goal Diff            Estimate  P-Value
Intercept                        -0.06    61.9%
SPI Ratings Differential          0.46     0.1%
Additional Days Rest (home)       0.13    57.0%
Heat Index Differential           0.00    79.7%

Even Expected Goal Diff       Estimate  P-Value
Intercept                         0.01    88.4%
SPI Ratings Differential          0.34     0.2%
Additional Days Rest (home)       0.17    37.0%
Heat Index Differential           0.00    86.3%

Again, regardless of whether we look at overall xGD or even-gamestate xGD, there are no statistically significant effects due to extreme heat index figures from past matches. Expected Goals data are obviously not a direct measurement of how heat impacts the athletes’ bodies, but they should be a stable representation of the teams’ relative strengths during a match.
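
For readers who would rather see the first regression table as a formula, here is a rough Python sketch. The coefficients come straight from the goal-differential output above; the function name, argument names and example inputs are hypothetical, and since the scale of the SPI differential isn't spelled out here, treat the numbers as illustration only.

```python
# Coefficients from the goal-differential regression output above.
INTERCEPT = -0.37
B_SPI = 1.01     # SPI Ratings Differential
B_REST = -0.21   # Additional Days Rest (home)
B_HEAT = 0.01    # Heat Index Differential


def predicted_goal_diff(spi_diff, rest_diff, heat_diff):
    """Fitted goal differential for one match under the linear model."""
    return INTERCEPT + B_SPI * spi_diff + B_REST * rest_diff + B_HEAT * heat_diff


# Hypothetical comparison: the same matchup with and without a
# 10-point gap in past-game heat index (inputs are made up).
base = predicted_goal_diff(0.5, 0, 0)
hot = predicted_goal_diff(0.5, 0, 10)
print(round(hot - base, 2))
```

Even a 10-point heat index gap moves the fitted goal differential by only about a tenth of a goal, and with a p-value of 74.3% that estimate can't be distinguished from zero.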

The Swiss were the last team (that is still in the tournament) to play in an 80+ heat index environment, but I wouldn’t expect that to matter much based on what I’ve shown above. What will matter is that Argentina is much better. Talent has trumped the heat index so far this World Cup.

World Cup Statistics

We have begun rolling out World Cup statistics in the same format as those we provide for MLS. Scroll over “World Cup 2014” along the top bar to check it out!

In the Team Stats Tables, one may observe that the recently-eliminated Spain outshot its opponents, and a much higher proportion of its possession occurred in the attacking third than that of its opponents.

Our team-by-team Expected Goals data shows that England played better than its results would suggest, earning more dangerous opportunities than its opponents. It was a matter of inches for Wayne Rooney a few times there…

Finishing data suggests that Lionel Messi has made the most of his opportunities—surprise, surprise—but did you know that none of Thomas Muller’s seven shots were assisted?

And although he gave up a tournament-high seven goals in the group stage, our Goalkeeping Data actually suggests that Honduran goalkeeper Noel Valladares performed admirably, especially considering that the onslaught of shots he faced was worth a tournament-most 0.4 goals per shot on target.

USA versus Ghana: Gamestates Analysis

In analyzing MLS shot data, I have learned that, with small sample sizes, how a team plays when the game is tied is a strong indication of how well it will do in future games. The US Men’s National Team spent just four-and-a-half minutes tied Monday evening, the epitome of small sample sizes. In case you were curious, the US generated two shots during that time worth about 0.13 goals. Ghana did not generate a shot over those 4.5 minutes.

The next most-important gamestate for a team is being ahead. With at least 17 games of data in MLS, knowing how well a team did when it was leading becomes an important piece of information for predicting that team’s future success. Almost 95 minutes were spent with the US in the lead, a time in which the USMNT took six shots worth 0.5 goals to Ghana’s 21 shots worth 1.7 goals.
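
A quick bit of arithmetic on those numbers, sketched in Python with names of my own choosing, separates shot quantity from shot quality while the US was ahead.

```python
# Shots and Expected Goals while the US led, from the paragraph above.
usa = {"shots": 6, "xg": 0.5}
ghana = {"shots": 21, "xg": 1.7}


def xg_per_shot(team):
    """Average chance quality: Expected Goals divided by shots taken."""
    return team["xg"] / team["shots"]


usa_quality = xg_per_shot(usa)
ghana_quality = xg_per_shot(ghana)
shot_volume_ratio = ghana["shots"] / usa["shots"]

print(round(usa_quality, 3), round(ghana_quality, 3))
print(round(shot_volume_ratio, 1))
```

On these rounded totals the average chance was worth about 0.08 goals for both teams, so the gap is a matter of volume: Ghana took 3.5 times as many shots.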

Though MLS is definitely far below the level of even a USA-versus-Ghana match, I think a lot of the statistics from our MLS database still apply. I wrote a few weeks back about how away teams that were satisfied with the current gamestate went overboard with their conservative play. I think that could apply to the World Cup, as well. By most statistical accounts, USA versus Ghana was a fairly even matchup going in, yet the US played an annoyingly conservative style after going up a goal early. It gave up a majority of possession to Ghana in the attacking third, completing just 81 passes to Ghana’s 171 in that zone, not to mention being tripled up in Expected Goals when it was ahead.

Granted, Expected Goals likely overestimates the losing team’s chances of scoring. But not by much. In even gamestates in MLS, we see that teams are expected to score 1.29 goals per game, and they actually score 1.30 goals per game. Virtually no difference. However, when teams are behind they are expected to score 1.79 goals per game, yet they only score about 1.60, an 11-percent drop. This discrepancy is likely due in large part to the leading team’s defense being more packed in and capable of blocking shots. Indeed, teams that are losing have their shots blocked 27 percent of the time, while teams that are winning only have their shots blocked 22 percent of the time.

All that was simply to say that Ghana’s 1.7 Expected Goals are still representative of a team that was in control—too much control for my comfort level. Even if we assume it was really about 1.5 Expected Goals against a defensive-minded American side, that still triples the USA’s shot potential. Either the US strategy was overly conservative, or Ghana is really that much better. I’d like to believe in the former, but it’s picking between the lesser of two evils.
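
Here is roughly how that 1.5 figure falls out, as a small Python sketch. It assumes the MLS expected-versus-actual gap quoted above can be applied as a flat discount to Ghana's total, which is a simplification; the variable names are mine.

```python
# MLS figures quoted above: goals expected vs. goals actually scored per game.
EXPECTED_PER_GAME = 1.79
ACTUAL_PER_GAME = 1.60

conversion = ACTUAL_PER_GAME / EXPECTED_PER_GAME   # share of xG that became goals
drop = 1 - conversion

ghana_xg_raw = 1.7   # Ghana's Expected Goals while trailing
usa_xg = 0.5         # USA's Expected Goals while leading

# Assumption: the same flat discount applies to this World Cup match.
ghana_xg_adjusted = ghana_xg_raw * conversion

print(round(drop * 100))                      # percent drop
print(round(ghana_xg_adjusted, 1))            # discounted Ghana xG
print(round(ghana_xg_adjusted / usa_xg, 1))   # multiple of the USA's xG
```

The discount works out to about 11 percent, which trims Ghana to roughly 1.5 Expected Goals, still about three times the US total.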

It just doesn’t make sense to me to play conservatively to maintain the status quo. It invariably leads to massive discrepancies in Expected Goals, and too often allows the opposition an easier way to come back.

Portland Timbers: Comeback Kids?

I watched the Timbers go down 2 – 0 in the first half Wednesday night against FC Dallas before leaving disgusted for my indoor game. At halftime of my game, I noticed that Portland had come back to tie. Two common occurrences for the Timbers this year have been comebacks and ties, so perhaps it shouldn’t have been that surprising.

The Timbers have played nearly 400 minutes this season from behind (a quarter of their time spent on the field), which has given them plenty of time to win back the home crowd after early goals conceded. In all that time spent losing (nearly four games’ worth) Portland has outscored its opponents 13-to-4. That’s like four straight 3 – 1 wins. Even though most teams perform better when playing from behind, that still ranks Portland second in the league behind Vancouver (see chart below).

This raises the question: is Portland actually one of the best teams when facing a deficit, or might this be a product of some random variation? To the stats!

It turns out, Portland also does well by Expected Goals in losing gamestates. In fact, relative to the league, the Timbers are the best at generating quality and quantity of opportunities in these situations with an expected goal differential of +1.4. We know Expected Goals to be more stable, and thus it is probably a truer indication of what to expect in the future. Check out the chart below, scaled on a per 96-minute basis (basically, per game).

xGD When Losing

Team         GF    GA    GD   xGF   xGA   xGD  GD Rank  xGD Rank
POR         3.1   1.0   2.2   2.5   1.1   1.4        2         1
FCD         2.0   0.9   1.1   1.9   0.8   1.2        6         2
SEA         2.3   1.3   1.0   1.6   0.7   1.0        8         3
LA          1.8   0.0   1.8   1.8   0.9   1.0        3         4
NYRB        2.0   1.0   1.0   1.8   1.0   0.8        9         5
TOR         2.3   1.1   1.1   1.9   1.2   0.7        7         6
SJ          1.6   0.7   0.9   1.6   1.0   0.6       10         7
PHI         1.6   1.6   0.0   1.8   1.3   0.5       14         8
CHI         3.0   1.5   1.5   1.5   1.0   0.5        4         9
SKC         1.3   0.9   0.4   1.7   1.3   0.4       12        10
DCU         2.0   0.7   1.3   1.2   0.9   0.3        5        11
CLB         0.9   0.5   0.5   1.5   1.3   0.2       11        12
COL         2.7   2.3   0.4   1.6   1.5   0.1       13        13
MTL         0.8   1.8  -1.0   1.4   1.3   0.1       16        14
RSL         1.6   2.6  -1.0   1.6   1.5   0.0       17        15
NE          0.5   1.4  -0.9   1.4   1.3   0.0       15        16
CHV         0.6   2.9  -2.3   1.3   1.4   0.0       19        17
VAN         3.1   0.4   2.7   1.3   1.5  -0.1        1        18
HOU         0.8   2.5  -1.7   1.1   1.7  -0.6       18        19
Averages    1.8   1.3   0.5   1.6   1.2   0.4
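
The per-96-minute scaling can be sanity-checked against the raw totals quoted earlier for Portland (13 goals for and 4 against while trailing). The Python sketch below uses 400 minutes as a stand-in for "nearly 400," so it is an approximation rather than the exact calculation behind the chart; the function name is mine.

```python
MINUTES_PER_GAME = 96  # the chart's "basically, per game" scaling


def per_96(total, minutes):
    """Scale a raw count to a per-96-minute rate."""
    return total / minutes * MINUTES_PER_GAME


# Assumption: "nearly 400" minutes spent losing is rounded to 400 here.
minutes_losing = 400
gf = per_96(13, minutes_losing)   # goals for while trailing
ga = per_96(4, minutes_losing)    # goals against while trailing

print(round(gf, 1), round(ga, 1), round(gf - ga, 1))
```

With that approximation the rates come out to about 3.1 for, 1.0 against and a 2.2 differential, matching Portland's row.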

But wait! Hold the bus. There is one major confounding factor that we can control for here. Home field advantage. The Timbers have oddly found themselves frequently facing deficits at home, which means that a large portion of their time spent losing is spent in the friendly confines of Providence Park in downtown Portland. In fact, the Timbers lead the league in minutes spent losing at home–a weird stat, to be sure. Here’s the same chart, but for teams losing at home.

xGD When Losing at Home

Team         GF    GA    GD   xGF   xGA   xGD  GD Rank  xGD Rank
SJ          3.3   0.8   2.5   3.5   0.5   3.0        5         1
NYRB        3.2   1.6   1.6   2.6   0.6   2.1        7         2
POR         3.6   1.0   2.6   3.0   1.0   2.1        4         3
FCD         2.8   0.0   2.8   2.1   0.4   1.7        3         4
COL         3.6   3.6   0.0   2.1   0.8   1.3       14         5
TOR         3.8   0.0   3.8   2.5   1.3   1.3        2         6
SEA         1.6   0.5   1.1   1.6   0.6   1.0        8         7
CHI         2.5   1.6   0.8   1.5   0.6   0.9       10         8
LA          0.9   0.0   0.9   1.8   1.0   0.8        9         9
NE          0.0   1.2  -1.2   1.4   0.6   0.7       16        10
CLB         0.8   0.4   0.4   1.7   1.0   0.7       13        11
PHI         2.4   1.7   0.7   1.9   1.3   0.6       11        12
VAN         5.1   0.0   5.1   1.5   0.9   0.6        1        13
MTL         0.7   1.5  -0.7   1.8   1.4   0.4       15        14
DCU         1.9   1.3   0.6   1.0   0.9   0.1       12        15
SKC         2.1   0.0   2.1   1.3   1.2   0.1        6        16
HOU         1.5   2.9  -1.5   1.7   1.6   0.1       17        17
RSL         0.0   1.8  -1.8   0.5   0.8  -0.3       18        18
CHV         0.0   3.8  -3.8   1.0   2.1  -1.0       19        19
Averages    2.1   1.3   0.8   1.8   1.0   0.8

Even when I control for home field advantage, we still see the Timbers among the best teams at playing from behind, averaging 2.1 more expected goals than their opponents per 96 minutes. Is it the coaching? The players’ mentalities? The raucous home turf on West Burnside? Luck? I don’t know, but I know it’s happening.

World Cup Bracket Challenge

MLSsoccer.com has been kind enough to provide us with blank, virtual brackets for the World Cup. We here at ASA have been kind enough to provide prizes to the person who picks best.

Here’s the deal. Like us on Facebook to get info about our group’s name and password, then fill out your bracket via the link above, and you could win stuff for free! The prize includes copies of the books Soccernomics and The Numbers Game. If you win and you already have those books, I’m sure we can find something comparable that you like.

Oh, by the way, the World Cup starts next Thursday. Get your shit together.

Some Facts about Corner Kicks

I find myself getting worked up for corner kicks, whether it’s anxiety because the opponent is about to whip one into the Timbers’ box or hope because the Timbers are about to do the same to the opponent. There is a chance that any corner kick will result in a goal, so perhaps my feelings are justified. However, statistics don’t find corner kicks nearly as exciting.

In 2013, our data shows that the 3,185 corner kicks taken led to just 1,110 shots, 258 of which were on target and 80 of which found the back of the net. That means that only about one-third of corner kicks ever produced shots, and the finishing rate on those shots was just 7.2 percent, compared to the league’s typical finishing rate of about 10 percent in 2013. Though shots from corners tended to be struck closer to goal, they also tended to be taken with the head, which is the least efficient body part for finishing. In the end, just one of 40 corner kicks could be found in the back of the net (2.5%).
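
For anyone who wants to retrace that arithmetic, here is a minimal Python sketch using only the 2013 totals quoted above. The variable names are mine, and the shot rate treats each shot as coming from a separate corner, which is a simplification.

```python
# 2013 league-wide corner kick totals from the paragraph above.
corners, shots, goals = 3185, 1110, 80

shot_rate = shots / corners     # shots per corner kick
finish_rate = goals / shots     # share of those shots that scored
goal_rate = goals / corners     # share of corners that ended in a goal

print(f"{shot_rate:.1%} {finish_rate:.1%} {goal_rate:.1%}")
print(round(corners / goals))   # corner kicks taken per goal scored
```

That works out to a shot on roughly 35 percent of corners, a 7.2 percent finishing rate, and one goal for about every 40 corners.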

For comparison’s sake, let’s take a look at how often other possessions lead to goals. Thanks to Alex at Tempo-Free Soccer, we can estimate that an average team gets about 4,500 possessions in a season. Here’s how those 4,500 possessions end for a league-average team.

End in…          Possessions  Shots  Shots/Poss  Goals  Goals/Poss  Finish%
Corner                   170     60       0.353    4.3       0.025     7.2%
Attacking 3rd*         2,030    375       0.180   40.0       0.019    10.7%

We can see that, while corner kicks produced about twice as many shots per possession as typical attacking-third possessions, they only led to about 25% more goals per possession due to packed boxes and low finishing rates. But not all attacking-third possessions are equal, and it seems as though many of the possessions that lead to corners come from attacking-third possessions that are deeper in the opponent’s territory. As the attacking-third possessions get closer and closer to goal, they probably become more dangerous than corner kicks. It may be more correct to say that teams don’t earn corners, but rather, they settle for corners.
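
The comparison can be roughed out from the table's season totals, as in the Python sketch below. The inputs are the table's rounded figures, so the recomputed rates differ slightly from the per-possession columns shown there; the names are mine.

```python
# Season totals for a league-average team, from the table above.
corner = {"possessions": 170, "shots": 60, "goals": 4.3}
attacking_third = {"possessions": 2030, "shots": 375, "goals": 40.0}


def per_possession(row, key):
    """Rate of `key` (shots or goals) per possession of this type."""
    return row[key] / row["possessions"]


shot_ratio = per_possession(corner, "shots") / per_possession(attacking_third, "shots")
goal_ratio = per_possession(corner, "goals") / per_possession(attacking_third, "goals")

print(round(shot_ratio, 2), round(goal_ratio, 2))
```

From these rounded inputs, corners yield a bit under twice the shots per possession but only around 1.3 times the goals, which lines up with the rough figures above.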

These numbers aren’t as precise as I’d like, but they still sobered me up a little for corner kicks. But no promises that I can keep my cool if the Timbers are facing a corner in the waning seconds of stoppage time.

*Teams typically lose possession on bad passes about 89 times per match, and about 43.5 of those instances occur in the attacking third according to OPTA data. This led to a 48.8-percent estimate of possessions ending in the final third.