They're Watching You at Work - Don Peck - The Atlantic - 2 views
-
Predictive statistical analysis, harnessed to big data, appears poised to alter the way millions of people are hired and assessed.
-
By one estimate, more than 98 percent of the world’s information is now stored digitally, and the volume of that data has quadrupled since 2007.
-
The application of predictive analytics to people’s careers—an emerging field sometimes called “people analytics”—is enormously challenging, not to mention ethically fraught
- ...52 more annotations...
-
By the end of World War II, however, American corporations were facing severe talent shortages. Their senior executives were growing old, and a dearth of hiring from the Depression through the war had resulted in a shortfall of able, well-trained managers. Finding people who had the potential to rise quickly through the ranks became an overriding preoccupation of American businesses. They began to devise a formal hiring-and-management system based in part on new studies of human behavior, and in part on military techniques developed during both world wars, when huge mobilization efforts and mass casualties created the need to get the right people into the right roles as efficiently as possible. By the 1950s, it was not unusual for companies to spend days with young applicants for professional jobs, conducting a battery of tests, all with an eye toward corner-office potential.
-
But companies abandoned their hard-edged practices for another important reason: many of their methods of evaluation turned out not to be very scientific.
-
this regime, so widespread in corporate America at mid-century, had almost disappeared by 1990. “I think an HR person from the late 1970s would be stunned to see how casually companies hire now,”
-
Many factors explain the change, he said, and then he ticked off a number of them: Increased job-switching has made it less important and less economical for companies to test so thoroughly. A heightened focus on short-term financial results has led to deep cuts in corporate functions that bear fruit only in the long term. The Civil Rights Act of 1964, which exposed companies to legal liability for discriminatory hiring practices, has made HR departments wary of any broadly applied and clearly scored test that might later be shown to be systematically biased.
-
about a quarter of the country’s corporations were using similar tests to evaluate managers and junior executives, usually to assess whether they were ready for bigger roles.
-
Aptitude, skills, personal history, psychological stability, discretion, loyalty—companies at the time felt they had a need (and the right) to look into them all. That ambit is expanding once again, and this is undeniably unsettling. Should the ideas of scientists be dismissed because of the way they play a game? Should job candidates be ranked by what their Web habits say about them? Should the “data signature” of natural leaders play a role in promotion? These are all live questions today, and they prompt heavy concerns: that we will cede one of the most subtle and human of skills, the evaluation of the gifts and promise of other people, to machines; that the models will get it wrong; that some people will never get a shot in the new workforce.
-
Knack makes app-based video games, among them Dungeon Scrawl, a quest game requiring the player to navigate a maze and solve puzzles, and Wasabi Waiter, which involves delivering the right sushi to the right customer at an increasingly crowded happy hour. These games aren’t just for play: they’ve been designed by a team of neuroscientists, psychologists, and data scientists to suss out human potential. Play one of them for just 20 minutes, says Guy Halfteck, Knack’s founder, and you’ll generate several megabytes of data, exponentially more than what’s collected by the SAT or a personality test. How long you hesitate before taking every action, the sequence of actions you take, how you solve problems—all of these factors and many more are logged as you play, and then are used to analyze your creativity, your persistence, your capacity to learn quickly from mistakes, your ability to prioritize, and even your social intelligence and personality. The end result, Halfteck says, is a high-resolution portrait of your psyche and intellect, and an assessment of your potential as a leader or an innovator.
-
When the results came back, Haringa recalled, his heart began to beat a little faster. Without ever seeing the ideas, without meeting or interviewing the people who’d proposed them, without knowing their title or background or academic pedigree, Knack’s algorithm had identified the people whose ideas had panned out. The top 10 percent of the idea generators as predicted by Knack were in fact those who’d gone furthest in the process.
-
What Knack is doing, Haringa told me, “is almost like a paradigm shift.” It offers a way for his GameChanger unit to avoid wasting time on the 80 people out of 100—nearly all of whom look smart, well-trained, and plausible on paper—whose ideas just aren’t likely to work out.
-
He has encouraged the company’s HR executives to think about applying the games to the recruitment and evaluation of all professional workers.
-
scoring distance from work could violate equal-employment-opportunity standards. Marital status? Motherhood? Church membership? “Stuff like that,” Meyerle said, “we just don’t touch”—at least not in the U.S., where the legal environment is strict. Meyerle told me that Evolv has looked into these sorts of factors in its work for clients abroad, and that some of them produce “startling results.”
-
consider the alternative. A mountain of scholarly literature has shown that the intuitive way we now judge professional potential is rife with snap judgments and hidden biases, rooted in our upbringing or in deep neurological connections that doubtless served us well on the savanna but would seem to have less bearing on the world of work.
-
We may like to think that society has become more enlightened since those days, and in many ways it has, but our biases are mostly unconscious, and they can run surprisingly deep. Consider race. For a 2004 study called “Are Emily and Greg More Employable Than Lakisha and Jamal?,” the economists Sendhil Mullainathan and Marianne Bertrand put white-sounding names (Emily Walsh, Greg Baker) or black-sounding names (Lakisha Washington, Jamal Jones) on similar fictitious résumés, which they then sent out to a variety of companies in Boston and Chicago. To get the same number of callbacks, they learned, they needed to either send out half again as many résumés with black names as those with white names, or add eight extra years of relevant work experience to the résumés with black names.
-
a sociologist at Northwestern, spent parts of the three years from 2006 to 2008 interviewing professionals from elite investment banks, consultancies, and law firms about how they recruited, interviewed, and evaluated candidates, and concluded that among the most important factors driving their hiring recommendations were—wait for it—shared leisure interests.
-
Lacking “reliable predictors of future performance,” Rivera writes, “assessors purposefully used their own experiences as models of merit.” Former college athletes “typically prized participation in varsity sports above all other types of involvement.” People who’d majored in engineering gave engineers a leg up, believing they were better prepared.
-
the prevailing system of hiring and management in this country involves a level of dysfunction that should be inconceivable in an economy as sophisticated as ours. Recent survey data collected by the Corporate Executive Board, for example, indicate that nearly a quarter of all new hires leave their company within a year of their start date, and that hiring managers wish they’d never extended an offer to one out of every five members on their team
-
In the late 1990s, as these assessments shifted from paper to digital formats and proliferated, data scientists started doing massive tests of what makes for a successful customer-support technician or salesperson. This has unquestionably improved the quality of the workers at many firms.
-
In 2010, however, Xerox switched to an online evaluation that incorporates personality testing, cognitive-skill assessment, and multiple-choice questions about how the applicant would handle specific scenarios that he or she might encounter on the job. An algorithm behind the evaluation analyzes the responses, along with factual information gleaned from the candidate’s application, and spits out a color-coded rating: red (poor candidate), yellow (middling), or green (hire away). Those candidates who score best, I learned, tend to exhibit a creative but not overly inquisitive personality, and participate in at least one but not more than four social networks, among many other factors. (Previous experience, one of the few criteria that Xerox had explicitly screened for in the past, turns out to have no bearing on either productivity or retention
-
the idea that hiring was a science fell out of favor. But now it’s coming back, thanks to new technologies and methods of analysis that are cheaper, faster, and much-wider-ranging than what we had before
-
Gone are the days, Ostberg told me, when, say, a small survey of college students would be used to predict the statistical validity of an evaluation tool. “We’ve got a data set of 347,000 actual employees who have gone through these different types of assessments or tools,” he told me, “and now we have performance-outcome data, and we can split those and slice and dice by industry and location.”
-
Evolv’s tests allow companies to capture data about everybody who applies for work, and everybody who gets hired—a complete data set from which sample bias, long a major vexation for industrial-organization psychologists, simply disappears. The sheer number of observations that this approach makes possible allows Evolv to say with precision which attributes matter more to the success of retail-sales workers (decisiveness, spatial orientation, persuasiveness) or customer-service personnel at call centers (rapport-building)
-
There are some data that Evolv simply won’t use, out of a concern that the information might lead to systematic bias against whole classes of people
-
When Xerox started using the score in its hiring decisions, the quality of its hires immediately improved. The rate of attrition fell by 20 percent in the initial pilot period, and over time, the number of promotions rose. Xerox still interviews all candidates in person before deciding to hire them, Morse told me, but, she added, “We’re getting to the point where some of our hiring managers don’t even want to interview anymore”
-
what most excites him are the possibilities that arise from monitoring the entire life cycle of a worker at any given company.
-
Mullainathan expressed amazement at how little most creative and professional workers (himself included) know about what makes them effective or ineffective in the office. Most of us can’t even say with any certainty how long we’ve spent gathering information for a given project, or our pattern of information-gathering, never mind know which parts of the pattern should be reinforced, and which jettisoned. As Mullainathan put it, we don’t know our own “production function.”
-
What begins with an online screening test for entry-level workers ends with the transformation of nearly every aspect of hiring, performance assessment, and management.
-
I turned to Sandy Pentland, the director of the Human Dynamics Laboratory at MIT. In recent years, Pentland has pioneered the use of specialized electronic “badges” that transmit data about employees’ interactions as they go about their days. The badges capture all sorts of information about formal and informal conversations: their length; the tone of voice and gestures of the people involved; how much those people talk, listen, and interrupt; the degree to which they demonstrate empathy and extroversion; and more. Each badge generates about 100 data points a minute.
-
he tried the badges out on about 2,500 people, in 21 different organizations, and learned a number of interesting lessons. About a third of team performance, he discovered, can usually be predicted merely by the number of face-to-face exchanges among team members. (Too many is as much of a problem as too few.) Using data gathered by the badges, he was able to predict which teams would win a business-plan contest, and which workers would (rightly) say they’d had a “productive” or “creative” day. Not only that, but he claimed that his researchers had discovered the “data signature” of natural leaders, whom he called “charismatic connectors” and all of whom, he reported, circulate actively, give their time democratically to others, engage in brief but energetic conversations, and listen at least as much as they talk.
-
His group is developing apps to allow team members to view their own metrics more or less in real time, so that they can see, relative to the benchmarks of highly successful employees, whether they’re getting out of their offices enough, or listening enough, or spending enough time with people outside their own team.
-
Torrents of data are routinely collected by American companies and now sit on corporate servers, or in the cloud, awaiting analysis. Bloomberg reportedly logs every keystroke of every employee, along with their comings and goings in the office. The Las Vegas casino Harrah’s tracks the smiles of the card dealers and waitstaff on the floor (its analytics team has quantified the impact of smiling on customer satisfaction). E‑mail, of course, presents an especially rich vein to be mined for insights about our productivity, our treatment of co-workers, our willingness to collaborate or lend a hand, our patterns of written language, and what those patterns reveal about our intelligence, social skills, and behavior.
-
people analytics will ultimately have a vastly larger impact on the economy than the algorithms that now trade on Wall Street or figure out which ads to show us. He reminded me that we’ve witnessed this kind of transformation before in the history of management science. Near the turn of the 20th century, both Frederick Taylor and Henry Ford famously paced the factory floor with stopwatches, to improve worker efficiency.
-
“The quantities of data that those earlier generations were working with,” he said, “were infinitesimal compared to what’s available now. There’s been a real sea change in the past five years, where the quantities have just grown so large—petabytes, exabytes, zetta—that you start to be able to do things you never could before.”
-
People analytics will unquestionably provide many workers with more options and more power. Gild, for example, helps companies find undervalued software programmers, working indirectly to raise those people’s pay. Other companies are doing similar work. One called Entelo, for instance, specializes in using algorithms to identify potentially unhappy programmers who might be receptive to a phone cal
-
He sees it not only as a boon to a business’s productivity and overall health but also as an important new tool that individual employees can use for self-improvement: a sort of radically expanded The 7 Habits of Highly Effective People, custom-written for each of us, or at least each type of job, in the workforce.
-
the most exotic development in people analytics today is the creation of algorithms to assess the potential of all workers, across all companies, all the time.
-
The way Gild arrives at these scores is not simple. The company’s algorithms begin by scouring the Web for any and all open-source code, and for the coders who wrote it. They evaluate the code for its simplicity, elegance, documentation, and several other factors, including the frequency with which it’s been adopted by other programmers. For code that was written for paid projects, they look at completion times and other measures of productivity. Then they look at questions and answers on social forums such as Stack Overflow, a popular destination for programmers seeking advice on challenging projects. They consider how popular a given coder’s advice is, and how widely that advice ranges.
-
The algorithms go further still. They assess the way coders use language on social networks from LinkedIn to Twitter; the company has determined that certain phrases and words used in association with one another can distinguish expert programmers from less skilled ones. Gild knows these phrases and words are associated with good coding because it can correlate them with its evaluation of open-source code, and with the language and online behavior of programmers in good positions at prestigious companies.
-
having made those correlations, Gild can then score programmers who haven’t written open-source code at all, by analyzing the host of clues embedded in their online histories. They’re not all obvious, or easy to explain. Vivienne Ming, Gild’s chief scientist, told me that one solid predictor of strong coding is an affinity for a particular Japanese manga site.
-
Gild’s CEO, Sheeroy Desai, told me he believes his company’s approach can be applied to any occupation characterized by large, active online communities, where people post and cite individual work, ask and answer professional questions, and get feedback on projects. Graphic design is one field that the company is now looking at, and many scientific, technical, and engineering roles might also fit the bill. Regardless of their occupation, most people leave “data exhaust” in their wake, a kind of digital aura that can reveal a lot about a potential hire.
-
professionally relevant personality traits can be judged effectively merely by scanning Facebook feeds and photos. LinkedIn, of course, captures an enormous amount of professional data and network information, across just about every profession. A controversial start-up called Klout has made its mission the measurement and public scoring of people’s online social influence.
-
Now the two companies are working together to marry pre-hire assessments to an increasing array of post-hire data: about not only performance and duration of service but also who trained the employees; who has managed them; whether they were promoted to a supervisory role, and how quickly; how they performed in that role; and why they eventually left.
-
Over time, better job-matching technologies are likely to begin serving people directly, helping them see more clearly which jobs might suit them and which companies could use their skills. In the future, Gild plans to let programmers see their own profiles and take skills challenges to try to improve their scores. It intends to show them its estimates of their market value, too, and to recommend coursework that might allow them to raise their scores even more. Not least, it plans to make accessible the scores of typical hires at specific companies, so that software engineers can better see the profile they’d need to land a particular job
-
Knack, for its part, is making some of its video games available to anyone with a smartphone, so people can get a better sense of their strengths, and of the fields in which their strengths would be most valued. (Palo Alto High School recently adopted the games to help students assess careers.) Ultimately, the company hopes to act as matchmaker between a large network of people who play its games (or have ever played its games) and a widening roster of corporate clients, each with its own specific profile for any given type of job.
-
When I began my reporting for this story, I was worried that people analytics, if it worked at all, would only widen the divergent arcs of our professional lives, further gilding the path of the meritocratic elite from cradle to grave, and shutting out some workers more definitively. But I now believe the opposite is likely to happen, and that we’re headed toward a labor market that’s fairer to people at every stage of their careers
-
For decades, as we’ve assessed people’s potential in the professional workforce, the most important piece of data—the one that launches careers or keeps them grounded—has been educational background: typically, whether and where people went to college, and how they did there. Over the past couple of generations, colleges and universities have become the gatekeepers to a prosperous life. A degree has become a signal of intelligence and conscientiousness, one that grows stronger the more selective the school and the higher a student’s GPA, that is easily understood by employers, and that, until the advent of people analytics, was probably unrivaled in its predictive powers.
-
the limitations of that signal—the way it degrades with age, its overall imprecision, its many inherent biases, its extraordinary cost—are obvious. “Academic environments are artificial environments,” Laszlo Bock, Google’s senior vice president of people operations, told The New York Times in June. “People who succeed there are sort of finely trained, they’re conditioned to succeed in that environment,” which is often quite different from the workplace.
-
because one’s college history is such a crucial signal in our labor market, perfectly able people who simply couldn’t sit still in a classroom at the age of 16, or who didn’t have their act together at 18, or who chose not to go to graduate school at 22, routinely get left behind for good. That such early factors so profoundly affect career arcs and hiring decisions made two or three decades later is, on its face, absurd.
-
I spoke with managers at a lot of companies who are using advanced analytics to reevaluate and reshape their hiring, and nearly all of them told me that their research is leading them toward pools of candidates who didn’t attend college—for tech jobs, for high-end sales positions, for some managerial roles. In some limited cases, this is because their analytics revealed no benefit whatsoever to hiring people with college degrees; in other cases, and more often, it’s because they revealed signals that function far better than college history,
-
Google, too, is hiring a growing number of nongraduates. Many of the people I talked with reported that when it comes to high-paying and fast-track jobs, they’re reducing their preference for Ivy Leaguers and graduates of other highly selective schools.
-
This process is just beginning. Online courses are proliferating, and so are online markets that involve crowd-sourcing. Both arenas offer new opportunities for workers to build skills and showcase competence. Neither produces the kind of instantly recognizable signals of potential that a degree from a selective college, or a first job at a prestigious firm, might. That’s a problem for traditional hiring managers, because sifting through lots of small signals is so difficult and time-consuming.
-
all of these new developments raise philosophical questions. As professional performance becomes easier to measure and see, will we become slaves to our own status and potential, ever-focused on the metrics that tell us how and whether we are measuring up? Will too much knowledge about our limitations hinder achievement and stifle our dreams? All I can offer in response to these questions, ironically, is my own gut sense, which leads me to feel cautiously optimistic.
-
Google’s understanding of the promise of analytics is probably better than anybody else’s, and the company has been changing its hiring and management practices as a result of its ongoing analyses. (Brainteasers are no longer used in interviews, because they do not correlate with job success; GPA is not considered for anyone more than two years out of school, for the same reason—the list goes on.) But for all of Google’s technological enthusiasm, these same practices are still deeply human. A real, live person looks at every résumé the company receives. Hiring decisions are made by committee and are based in no small part on opinions formed during structured interviews.