Wearing a plaid flannel shirt and a backward Astros cap, Daren Willman went upstairs in his home just outside Houston one Sunday afternoon this summer. He sat down behind a tabletop desk in an office decorated with framed sports jerseys and flicked on a TV across the room. “I’m at work,” he said.
For the next two hours, Willman toggled between major-league baseball games while monitoring his Twitter feed. This is essentially what he does every day during the season. After a few minutes, he noticed that a Pittsburgh radio commentator, David Todd, had criticized the Pirates’ center fielder, Andrew McCutchen. “Another misplay by McCutchen,” Todd tweeted. “Have to make that play.”
Willman pulled up numbers on his laptop. Then he retweeted Todd’s comment to his own nearly 27,000 followers — including fans, some journalists and employees from every big-league team — and added information that only he and a few others with access to a technological tool called Statcast could produce: Based on the ball’s flight and where McCutchen started the play, a fielder makes a putout in that situation 62 percent of the time. When Todd saw Willman’s response, he tweeted back a question about another ball that had also eluded McCutchen, in the first inning. Within moments, Willman had the pertinent Statcast data on his screen. “That one is caught 74 percent of the time,” he responded.
Major League Baseball Advanced Media, a company owned jointly by all 30 franchises, introduced Statcast before the 2015 season. BAM, as it is known, was created in 2001 as a sort of in-house tech start-up to help standardize baseball’s presence online, where teams maintained their own sites of variable quality. Since then, its tracking of pitch speeds and home-run distances has become hugely popular with fans on M.L.B.’s website and app. As something of a side effect, the service has been producing valuable insights for the executives who run teams.
BAM’s latest breakthrough comes from combining the same radar system that records pitch locations with two groupings of three high-definition cameras. Together, the technologies generate three-dimensional snapshots of every movement on a baseball field, some 40,000 frames per second converted into digital data. That electronic output, captured from all games, is accessible to each major-league team after the last out every night.
Statcast churns out an overwhelming amount of information; describing the movements of all the players on the field during a routine ground out, according to Willman, who is an analyst at BAM, can fill the equivalent of 21,000 rows on a spreadsheet. Teams are still learning how to dig through the digital sediment for usable knowledge. For now, they rely mostly on Willman and a couple of other colleagues to serve as electronic archaeologists. The nuggets extracted are typically fan-friendly conversation-starters, like which outfielders reach the most fly balls or where Bryce Harper’s hardest hits tend to end up. They are disseminated haphazardly — through Twitter, a dedicated podcast and enhanced game broadcasts on MLB.TV, whose announcers receive a steady flow of Statcast-generated talking points.
Some of what they learn, invariably, involves pitching and hitting. But Statcast’s current impact pales in comparison to its potential achievement: the quantification of how well fielders play their positions, which baseball watchers have been trying to do without success since the sport’s beginning. The 2016 regular season, which ends this week, was only Statcast’s second. Eventually, teams will figure out how to use it to gauge fielding with the same acuity they bring to other aspects of the game, which have been scrutinized since baseball’s analytic revolution began in the late 1990s.
Once they do, Willman believes, there will be an upheaval in the way ballplayers are valued, from roster decisions to salary structure to postseason awards. Future stars are out there, he knows, some of them in the guise of ordinary players. A forward-thinking team has the opportunity to start stockpiling them now, before the rest of baseball even figures out who they are.
In the 1860s, the English-born sportswriter Henry Chadwick, who grew up watching cricket, started using metrics like hits, at-bats and putouts to measure how baseball players were performing. As the practice spread, batting champions were crowned and earned-run averages calculated. For much of the 20th century, the Sunday sports sections during the baseball season listed the statistics of every regular player in the majors. Historical comparisons became possible once “The Official Encyclopedia of Baseball” began appearing in 1951.
In 1977, Bill James, a nighttime security guard at a food-packaging plant who had a keen curiosity and plenty of downtime, published his first “Baseball Abstract,” a series of articles that tried to explain with greater mathematical precision what was actually happening on the field. By the 1990s, Billy Beane’s Oakland A’s were mining such data for competitive advantage, using unconventional measures — on-base percentage, for example, or how many pitches a hitter sees during a typical at-bat — to find value others were missing. Eventually every team established its own analytics department.
None of the new formulations accurately measured defensive prowess. Even Beane’s extremely thorough analysts were unable to extend their methodology to fielding, mostly because they couldn’t assess what they couldn’t quantify. When you categorize someone as a good hitter, you can cite all sorts of statistics — traditional measurements like batting average and slugging percentage and more recent inventions like runs created and OPS (on-base percentage plus slugging percentage). But declare that someone is a good fielder, and usually within moments the discussion devolves into “Did you see that catch?”
It should be axiomatic that a run prevented is as important as a run scored. Yet perhaps because accurately validating defensive contributions is so difficult, baseball people tend not to think that way. “It starts when you’re young,” says Tampa Bay’s center fielder Kevin Kiermaier, who is known for his acrobatic catches. “When you’re done with a Little League game and your parents say, ‘How’d you do?’ you say, ‘Two for four with a double.’ Not ‘I held a guy to a single with a great throw.’ ”
If you’re an outstanding hitter, a manager will keep you in the lineup no matter how you field. If you’re an outstanding fielder — well, you’d better be able to hit a little too. “There’s a huge difference in perception between a guy hitting .200 and .240,” Kiermaier told me. But that huge difference really only amounts to four hits out of 100 at-bats. On the other hand, if you were to replace Kiermaier with a subpar outfielder for 100 at-bats by the opposition, it’s very likely that his team would surrender more than four additional hits. And that doesn’t take into account extra bases taken by runners, extra batters faced by pitchers and plenty of other hidden consequences.
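The arithmetic behind that comparison is easy to check. The sketch below is only an illustration: the helper names are made up for this example, and the two averages are the ones Kiermaier cites.

```python
def expected_hits(batting_average: float, at_bats: int = 100) -> float:
    """Hits implied by a batting average over a given number of at-bats."""
    return batting_average * at_bats


def extra_hits(low_avg: float, high_avg: float, at_bats: int = 100) -> float:
    """Additional hits the higher average produces over the same at-bats."""
    gap = expected_hits(high_avg, at_bats) - expected_hits(low_avg, at_bats)
    # Round away floating-point noise so the result reads cleanly.
    return round(gap, 6)


# A .200 hitter versus a .240 hitter over 100 at-bats.
gap_per_100 = extra_hits(0.200, 0.240)
print(gap_per_100)
```

Seen this way, the argument for defense is simply that a fielder who takes away more than that many hits per 100 opposing at-bats has closed the gap.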
Baseball’s analytics experts understand that fielding has eluded them. “Everything we do is trying to predict the future,” says Zack Scott, director of major-league operations for the Red Sox. “We build predictive models. And that’s far easier to do with hitting.” After graduating with a statistics degree from the University of Vermont in 1999, Scott went to work at Diamond Mind, a baseball simulation game. Its inventor, Tom Tippett, had been doing rudimentary analysis of outfielders’ range. Scott, whose job was to produce player ratings, tried to advance the discipline. Using hitters’ spray charts, which showed where each ball was hit, he could figure out where batted balls went and if they were caught. But he had no idea where the fielders were standing when the pitch was thrown. And even if he could estimate how often a fielder made a certain play, he had no idea why. “Instincts?” Scott asks. “Speed? The routes he was taking to the ball? You couldn’t answer that question.”
The Red Sox hired Scott in late 2003. “We felt like we had an advantage because we were into this stuff so early,” Scott says. “We had the most experience using defensive metrics. But we weren’t where we wanted to be.” So when a 22-year-old named Jackie Bradley Jr. was promoted to the majors in Boston in 2013 with a reputation as the best defensive outfielder in its organization, he was judged the same way as every fielder before him: “The eye test,” as John Farrell, Boston’s manager, puts it.
Bradley hit well during his college and minor-league careers, but he struggled in Boston. He hit .189 in 37 games in 2013, and .198 the following season. In 2014, Bradley was the starting center fielder. “But he was at .180, .190,” Farrell says. “And that was subjectively outweighing the defense. I didn’t have the ability to say, ‘What is he saving us defensively?’ ” Farrell adds: “I remember thinking that if he hit .240, he’s an everyday player. But I knew it was arbitrary. We just didn’t know.”
Bradley may be the ideal Statcast outfielder. Unlike Kiermaier, he’s rarely the fastest player on the field. You seldom see him on SportsCenter making spectacular catches. But he frequently seems to be in just the right place to get balls that look like certain hits when they leave the bat. “His instincts are ridiculous,” Scott says. “It’s as if he immediately knows where the landing spot of the ball will be, and he puts his head down and runs there. You don’t see many guys do that.”
At one game in Fenway Park in June, I saw Bradley play against the White Sox. In the second inning, Tim Anderson, the shortstop for the White Sox, hit a long fly headed over Bradley’s head. Bradley turned and raced toward the wall without bothering to track the ball. He arrived at the spot with enough time to stop, turn back toward the plate and make the catch while almost standing still. To anyone buying a hot dog and glancing up at the last moment, it would have seemed a routine play. But the Statcast radar and cameras, I knew, had captured it as it unfolded. If I’d been able to access the information, it could have told me what made the play special — or whether it was special at all.
For the rest of the game, I watched Bradley on every pitch. Occasionally he began moving even before the batter made contact. His sense of where the ball would go seemed uncanny. When I asked Willman about this, he looked up the data on all the balls hit in Bradley’s direction this season. Then he compared them with all the balls hit to every other center fielder.
Bradley proved to be far from the fastest runner, and while his teammates claim that he usually takes the shortest routes to get to where fly balls will land, it turns out that his are far from the most efficient. In several directions, including heading straight back to get a ball, he ranked below the league average in that regard. But the quickness of Bradley’s first step on all batted balls was near the top among outfielders; on balls that resulted in outs, he was the best in baseball. On some plays, Statcast showed that Bradley was moving before the ball was even hit — exactly what I thought I was seeing in Fenway.
To track a moving object like a ball, you typically set two cameras perpendicular to each other. Then a computer program hunts through the images from each camera for something that, based on its shape and movement, seems likely to be a ball. By triangulating the two views, you can figure out where that ball was at a given moment, and then where it went. In 2006, BAM started using that approach to gauge the velocity, movement, location and spin rate of every pitch. It called the system PitchF/X. But PitchF/X doesn’t work nearly as well with forms that don’t have a predictable shape, like people. And when you try to follow those forms as they come together and then split apart in seemingly random patterns, which happens in baseball on nearly every play, it doesn’t work well at all.
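The triangulation step can be sketched in a few lines. This is a simplified, flat two-dimensional illustration and not the actual PitchF/X method: the camera positions, the function names and the assumption that each camera reports a clean bearing to the ball are all hypothetical.

```python
import math


def bearing(camera, target):
    """Angle (radians, counterclockwise from +x) at which a camera sees a target."""
    return math.atan2(target[1] - camera[1], target[0] - camera[0])


def triangulate(cam_a, bearing_a, cam_b, bearing_b):
    """Intersect two sight lines; return (x, y), or None if they are parallel."""
    ax, ay = cam_a
    bx, by = cam_b
    dax, day = math.cos(bearing_a), math.sin(bearing_a)
    dbx, dby = math.cos(bearing_b), math.sin(bearing_b)
    # Solve cam_a + t * dir_a == cam_b + s * dir_b for t via the 2-D cross product.
    denom = dax * dby - day * dbx
    if abs(denom) < 1e-12:
        return None
    t = ((bx - ax) * dby - (by - ay) * dbx) / denom
    return (ax + t * dax, ay + t * day)


# Two cameras looking across the scene at right angles (assumed positions).
CAM_A = (0.0, -10.0)
CAM_B = (-10.0, 0.0)

# Simulate what each camera would report for a ball at two moments.
ball_then, ball_now = (4.0, 3.0), (4.5, 3.6)
fix_then = triangulate(CAM_A, bearing(CAM_A, ball_then), CAM_B, bearing(CAM_B, ball_then))
fix_now = triangulate(CAM_A, bearing(CAM_A, ball_now), CAM_B, bearing(CAM_B, ball_now))

# Where the ball was, and how far it moved between the two snapshots.
moved = math.dist(fix_then, fix_now)
print(fix_then, fix_now, moved)
```

The sketch skips the hard part the paragraph describes, which is deciding which blob in each image is the ball in the first place. That is the step that breaks down when the tracked objects are players rather than a ball.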
In 2012, at the International Broadcasters Conference in Amsterdam, the Swedish company Hego (now ChyronHego) introduced a technology it had patterned after the human visual system. “It could see the depth of the field inherently,” recalls Joe Inzerillo, M.L.B.’s chief technology officer. The new technology couldn’t pick up balls, which were too small and moving too fast. But Inzerillo believed it would have no trouble tracking players.
BAM had been trying to capture data on player movements ever since its start more than a decade before. At the Amsterdam conference, Inzerillo sensed that he and his team were close. He proposed the idea of trying to integrate the Hego system with another technology so they could follow the ball and the players simultaneously. It wouldn’t be PitchF/X, he knew; in a large space like the entire field, the prospect of the ball getting lost entirely amid the background clutter would increase drastically. But he wondered about the modified Doppler radar system, called TrackMan, that was starting to be used to measure the trajectory of thrown and batted balls. “Radar, on the other hand, does not really see the background,” Inzerillo says. “And one unit can cover the field pretty well. We literally sat down and sketched it out on a piece of paper and figured out how these two systems could talk to each other.”
Hego’s two camera pods have now been installed, 70 to 150 feet apart, along the third-base line in all but two major-league ballparks. (In Boston and Milwaukee they’re along the first-base line because of architectural quirks.) The TrackMan system, adapted from one used for missile defense, traces the ball as it would any moving object. Statcast is programmed to layer the information generated by one atop the other, creating a representation of what’s happening on the field. Usually it works.
Sometimes, though, it doesn’t. Nearly three years since an initial trial run in the Arizona Fall League, Statcast is still committing rookie errors. Chopped grounders that bound high into the air elude the radar. So do high pop-ups. The system is accurate at the middle of the field, less so toward the foul lines. And even when the technologies are in sync, glitches can occur. Willman showed me an example: an out made by the Boston right fielder Mookie Betts earlier this season, a play categorized by the Statcast database as one that’s made successfully only 5 percent of the time. When Willman called up the archived video from the home telecast, I expected to see Betts diving across the outfield for a sinking liner or leaping against the wall to pull a home run from the stands. Instead, this was a fly hit directly at him. “Easy play for Mookie,” the announcer said. Willman shook his head. “That’s one that got lost in the radar,” he said.
Even if the vast majority of the Statcast data is accurate, its sheer volume — millions of lines of digital output from every day of the baseball season — remains difficult to process. Merely coming up with a program to unpack the pages of computer coding is beyond the wherewithal of most teams. “It’s nice to say that we have the technology, the means of capturing data,” says Jeff Bridich, the Colorado Rockies’ general manager, who played baseball at Harvard. “But now we’re plowing through it, trying to understand it. What does all of this mean?”
That’s not to say Statcast isn’t already having an influence. Not every franchise can risk $72.5 million on an untested outfielder, which is what Boston did with Rusney Castillo, a Cuban defector who played 99 games in the outfield for the Red Sox over the last three seasons before being sent back to the minors for good. Instead, some teams hire young analysts to crunch data. They take a chance on unproven technologies. And like Beane’s A’s a generation ago, they try to find an edge.
The Tampa Bay Rays have one of baseball’s lowest payrolls. Baseball insiders also assume the team has the largest analytics department, though because of the team’s secretiveness, nobody can say for certain. “They have people who are just out of school, smart people,” Scott says. “They’re not paying them much — it’s kind of the sweatshop model. They’re just cranking it out. The risk is whether the data they’re using is real, that they’re actually learning what they think they’re learning.” When I visited Tropicana Field this summer, none of the Tampa analysts were allowed to talk with me. But I asked the manager, Kevin Cash, about Statcast. “It has become very popular here,” he admitted. “You’re able to compare a lot of things that you couldn’t compare before.”
When Kiermaier was struggling offensively last year, Cash wanted him to understand how well he was nonetheless playing. “What he was doing was helping us win games more than anyone picking up a newspaper could tell,” he said. So Cash did something that every big-league manager might be doing in five years. Using Wins Above Replacement, a somewhat arbitrary (but increasingly popular) statistic that amalgamates the output of various categories into a single number, he showed Kiermaier how well a particular All-Star outfielder was hitting. Then he used Statcast data to quantify Kiermaier’s value into an approximate defensive equivalent. “I said, ‘Here’s what you’re doing — not with home runs, not with batting average, just on defense,’ ” Cash said. “ ‘You’re impacting our club in a huge way, just like he’s impacting their club.’ ” Relieved of the pressure to hit, Kiermaier loosened up. He finished the season at .263. He also won the Gold Glove Award as the best center fielder in the American League.
While Kiermaier was slumping, Jackie Bradley Jr. was struggling to stay in the majors. As of early August last year, his batting average stood at .102. That was not only well below the standard that Farrell had set for him to stay in Boston’s lineup; it was on pace for the worst batting average by a position player ever recorded.
Fortunately for Bradley, the Red Sox — who would finish last in their division for the second consecutive season — were looking ahead. That month, they hired the former Expos, Marlins and Tigers executive Dave Dombrowski as their new president. Though Dombrowski considered Bradley a gifted outfielder, he had never seen him play more than a few games at a time. Lacking the statistical means to judge just how good he was, he hoped to find out by watching him in center field for the rest of the season. So Bradley remained in the Boston lineup.
Unexpectedly, he began to hit. From Aug. 6, shortly before Dombrowski was hired, to Sept. 7, Bradley went 39 for 92, hit seven homers and drove in 32 runs. He finished the season with a respectable .249 batting average. “The defense allowed him the opportunity to grow into an everyday player,” Farrell says. This season, Bradley has done even better as a batter. He had exceeded his 2015 output in every major offensive category by the end of June. At one point, he recorded a hit in 29 consecutive games. He was voted into the All-Star Game as a starting outfielder. By late September, he was hitting .276, with 26 home runs, and was closing in on 100 R.B.I.
I was in Houston visiting Willman the weekend before the All-Star Game in July. As he drove me to the airport, he chuckled at the perception of Bradley as a breakout star. Like Kiermaier, Bradley had been contributing to his team even when he wasn’t hitting. It’s just that nobody understood quite how much.
Before he was hired by M.L.B. Advanced Media in January, Willman worked as a software developer for the Harris County district attorney. He also created the popular statistical website Baseball Savant, which he still runs, now under the MLB.com umbrella. The insights he offered were so acute that several franchises interviewed him for positions in their analytics departments. All of them wanted him to take a pay cut, which astounded him. “Think of how good the Red Sox could be right now if they used this data,” he said. “They have the money. They could spend $1 million and hire five guys like me, who really understand baseball and really know the technology. How much is a win worth? Five million? More? If we give you three or four wins, that $1 million pays for us several times over.”
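Willman’s back-of-the-envelope case can be worked through directly. The sketch below takes his figures at face value: the $5 million value of a win is his own guess, posed as a question, and the function name is invented for this illustration.

```python
def payback_multiple(cost: float, value_per_win: float, wins: int) -> float:
    """How many times over the added wins repay the cost of the analysts."""
    return (value_per_win * wins) / cost


ANALYST_BUDGET = 1_000_000   # five analysts, per Willman's quote
VALUE_PER_WIN = 5_000_000    # Willman's guess, not an established figure

low = payback_multiple(ANALYST_BUDGET, VALUE_PER_WIN, wins=3)
high = payback_multiple(ANALYST_BUDGET, VALUE_PER_WIN, wins=4)
print(low, high)
```

On those assumptions the spending would return between 15 and 20 times its cost, which is consistent with his claim that it “pays for us several times over.”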
When Willman played college baseball at Texas Lutheran, he was a talented defensive center fielder. But the pros had no interest in him. “I just didn’t hit well enough,” he said. “Even if I was good enough to be the best defensive outfielder in the majors, there was no way to quantify that. So all they could look at was my hitting.”
The same approach to evaluating talent that might have undervalued Willman then seems to be undervaluing him now. But Willman is confident that his skills will be appreciated, at least in hindsight, as Statcast transforms his sport. “Where it’s really going to end up is, players are going to start getting paid a ton more money because they play great defense and everybody realizes it,” he said. As he talked, his phone dinged every few moments, signaling that his work had been retweeted. “There will be a whole new baseball revolution based on information that we are just starting to get.”