Can We Track COVID-19 and Protect Privacy at the Same Time?

Will Americans allow the government to track them via their phones, in order to beat back a deadly pandemic? Photograph by Tom Brenner / Reuters

Caroline Buckee, a top epidemiologist at Harvard’s T. H. Chan School of Public Health, has devoted her professional life to studying malaria and other infectious diseases. As news of a novel coronavirus emerged from China, Buckee realized that her area of expertise—how infectious diseases evolve as they move through vulnerable populations—would be valuable to health-care workers and elected officials as the virus spread across the globe. “The methods and the tools are the same, and epidemiological models are easily adapted,” Buckee told me. “But, for many of us, like me, we work with endemic pathogens. COVID-19 is new. There is so much we don’t know.” Since the most urgent imperative was to “flatten the curve” of infections, it was crucial to know where public-health strategies like stay-at-home orders were working and where they were not. Buckee quickly assembled a consortium of infectious-disease researchers to make the data accessible to policymakers—data that they did not yet have.

At just about the same time, Ian Allen, a former marine and C.I.A. paramilitary officer, cold-called Harvard’s School of Public Health and asked if there was anything that his new company, Camber Systems, could do to help with the pandemic. Soon afterward, Allen was connected with Buckee, the associate director of the School of Public Health’s Center for Communicable Disease Dynamics. Buckee had created the COVID-19 Mobility Data Network, a network of epidemiologists from universities around the world, to try to track the efficacy of social-distancing measures. Allen agreed to provide Buckee with the software to query and scrub data collected by tech companies and use it to track the coronavirus’s spread without violating Americans’ privacy. “I wasn’t really expecting ever to hear back, assuming that Harvard, of all places, would have all the resources they’d ever need,” Allen told me, while standing in a field in rural Virginia as his son shot at tin cans with a BB gun. (Like many parents, Allen has been homeschooling his children during the pandemic; this was geometry class.) “Caroline asked me if we could help aggregate location data. Just aggregating the data and anonymizing it in the right way to protect privacy would take some of the burden off of her.” Allen reached out to a handful of data firms, including Unacast, Kochava, and X-Mode. All agreed to provide their data for free.

Camber Systems, of which Allen is the C.E.O., is a year-old startup that, among other things, hopes to offer federal, state, and local government agencies ways to use commercially harvested location data to improve their operations without violating privacy laws. Shortly before the pandemic, Allen and his business partner, Navin Vembar, a mathematician who served as the chief technology officer of the General Services Administration, were searching for potential clients, talking with officials in Madison, Wisconsin, about using location data to shore up tourism and distribute the city’s public resources equitably. Assisting Buckee’s COVID-19 Mobility Data Network was the kind of project they envisioned when they launched their company with Hangar, a venture-capital firm that funds companies that use technology in the public interest.

By presenting aggregated location data in an accessible and searchable format for epidemiologists studying COVID-19, the project would enable researchers and policymakers to see how members of the public move around their communities. When paired with other metrics, such as the number of new infections or mortality rates, the data would guide policymakers as they grappled with when and where to lift stay-at-home orders. Facebook is also supplying the network with data. According to Buckee, agreement among the various data sets gives researchers confidence in the trends they are seeing. “One data set is not going to show what’s going on,” she said.

Location data are the bread and butter of “ad tech.” They let marketers know you recently shopped for running shoes, are trying to lose weight, and have an abiding affection for kettle corn. Apps on cell phones emit a constant trail of longitude and latitude readings, making it possible to follow consumers through time and space. Location data are often triangulated with other, seemingly innocuous slivers of personal information, so many, in fact, that a number of data brokers claim to have around five thousand data points on almost every American. It’s a lucrative business: by at least one estimate, the data-brokerage industry is worth two hundred billion dollars. Though the data are often anonymized, a number of studies have shown that they can be easily unmasked to reveal identities: names, addresses, phone numbers, and any number of intimacies. As Buckee knew, public-health surveillance, which serves the community at large, has always bumped up against privacy, which protects the individual. But, in the past, public-health surveillance was typically conducted by contact tracing, with health-care workers privately interviewing individuals to determine their health status and trace their movements. It was labor-intensive, painstaking, memory-dependent work, and, because of that, it was inherently limited in scope and often incomplete or inefficient. (At the start of the pandemic, there were only twenty-two hundred contact tracers in the country.)

Digital technologies, which work at scale, instantly provide detailed information culled from security cameras, license-plate readers, biometric scans, drones, G.P.S. devices, cell-phone towers, Internet searches, and commercial transactions. They can be useful for public-health surveillance in the same way that they facilitate all kinds of spying by governments, businesses, and malign actors. South Korea, which reported its first COVID-19 case at about the same time as the United States, has achieved dramatically lower rates of infection and mortality by tracking citizens with the virus via their phones, car G.P.S. systems, credit-card transactions, and public cameras, in addition to a robust disease-testing program. Israel enlisted Shin Bet, its secret police, to repurpose its terrorist-tracking protocols. China programmed government-installed cameras to point at infected people’s doorways to monitor their movements.

As unlikely as it may seem that such privacy-compromising measures will be adopted in the United States, the Trump Administration reportedly summoned tech executives to the White House to discuss sharing data with the government. Not much is known about the meeting. The Administration has classified all its discussions about COVID-19, and it later denounced Politico for reporting that the White House was in talks with tech firms to create a national coronavirus surveillance system. Last week, Gizmodo reported that Palantir, a secretive data-analytics firm owned by the conservative billionaire and Trump backer Peter Thiel, has a contract from the Trump Administration to build a database to track the spread of the virus. Palantir is best known for its work with the N.S.A. and ICE, where its software is used to track undocumented immigrants. (Other private surveillance companies, most notably the Israeli firm NSO, are also pitching COVID-19 tracing to governments around the world.)

“We’re all too familiar with the historical record of crises, where new powers in the hands of governments and corporations lead to them holding on to them indefinitely,” Adam Schwartz, a lawyer with the digital-rights group the Electronic Frontier Foundation, told me. Schwartz pointed out that most of the sweeping investigative powers given to the intelligence community after the 9/11 terrorist attack are still in place nearly two decades later. As Senator Maria Cantwell wrote, on April 9th, in her opening remarks for a paper hearing by the Senate Committee on Commerce, Science, and Transportation on the role of Big Tech during the pandemic, “Rights and data surrendered temporarily during an emergency can become very difficult to get back.”

As difficult as it is now to look ahead, lawmakers like Cantwell and privacy advocates like Schwartz are asking us to think about how much privacy we are willing to sacrifice to combat a rampaging virus. If we accept government data tracking, the surveillance necessary to curtail COVID-19 could become a permanent fixture in our lives. It’s an unknowable trade-off. “In this particular case, if we have technology for minimizing harm, we have a moral obligation to use it,” Marcello Ienca, a bioethicist at the Swiss university ETH Zurich, told me. “But we have to merge it with the best available technology in the areas of cybersecurity and privacy.” To do this right, Ienca added, the public-health experts need to work with privacy advocates.

Buckee agrees. “We’ve spent several years thinking about privacy,” she said, referring to herself and her Harvard colleagues, who worry that “some of the companies offering their services might not be totally aware of the debate around this.” Buckee has made sure that the Mobility Data Network Web site prominently features the group’s strict privacy policies. She will soon publish a paper in a medical journal (Vembar, the C.T.O. of Camber Systems, is a co-author) that includes, among other things, privacy-preserving best practices that she believes should be used during the pandemic. “We have to be explicit about what happens to this data,” she said. “It’s not as simple as deleting a file.” Because the location data Camber Systems receives from data brokers can be exploited to expose personal information, Allen and Vembar do not release it to Buckee’s researchers. Instead, they aggregate and spike the data sets with probabilistic math to create noise that makes it difficult for anyone to zero in on particular individuals. And, if any of their data sets rely on a small number of devices, they eliminate them, since those can be revealing as well.
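For readers who want to see what that kind of processing can look like, here is a minimal sketch of the three steps described above: aggregating device pings by area, dropping areas that rest on too few devices, and adding random noise to what remains. It is an illustration only, not Camber Systems’ actual pipeline. The threshold, the noise scale, the choice of Laplace noise, and every function name are assumptions made for the example.

```python
import random
from collections import defaultdict

# Hypothetical settings; the article does not say what values Camber uses.
MIN_DEVICES = 10   # areas with fewer distinct devices are dropped
NOISE_SCALE = 2.0  # scale of the random noise added to each count


def laplace_noise(scale, rng):
    """Draw Laplace noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def aggregate_counts(pings, min_devices=MIN_DEVICES,
                     noise_scale=NOISE_SCALE, rng=None):
    """Turn (device_id, region) pings into noisy per-region device counts."""
    rng = rng or random.Random()

    # Step 1: aggregate. Count each device once per region.
    devices_by_region = defaultdict(set)
    for device_id, region in pings:
        devices_by_region[region].add(device_id)

    released = {}
    for region, devices in devices_by_region.items():
        # Step 2: suppress regions that rely on a small number of devices.
        if len(devices) < min_devices:
            continue
        # Step 3: add noise so no single device's presence is certain.
        noisy = len(devices) + laplace_noise(noise_scale, rng)
        released[region] = max(0, round(noisy))
    return released


if __name__ == "__main__":
    sample = [(f"device-{i}", "County A") for i in in_range] if False else \
             [(f"device-{i}", "County A") for i in range(50)] + \
             [(f"device-{i}", "County B") for i in range(3)]
    # County B has only three devices, so it never appears in the output.
    print(aggregate_counts(sample))
```

Only the noisy, area-level counts leave the function; the device identifiers stay behind, which is the point of doing the aggregation before anything is handed to researchers.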

The result of their work is Camber Systems’ “Social Distancing Reporter.” A page on the COVID-19 Mobility Data Network Web site, it lets the public see, county by county, where people are moving around, or not, in predictable ways, like going to work, and unpredictable ways, like gathering in parks or travelling out of state. Because of privacy concerns, only Buckee’s researchers are allowed to dig through data at the census-tract level, which is more granular. “One reason we don’t go public with data below the county level is that, if there are poor neighborhoods where people are moving around a lot because they have to go to essential jobs, we don’t want to put a target on them,” Vembar said. Buckee told me that they want to be mindful that even anonymous data could lead to discrimination against people who live in certain neighborhoods.

Studying mobility data isn’t about determining individuals’ risk levels; instead, it asks whether public-health risks are being appropriately managed. That’s especially useful for policymakers and health-care professionals, but, in the face of a virus that has us eyeing one another warily over our masks, understanding our personal vulnerability is the first thing most of us want to know. It would be easy to figure out if we’ve been in contact with someone who has tested positive for COVID-19 if we were required to wear an identifying badge that displayed our infection status. But that goes against Americans’ expectations of civil liberties and privacy rights, even in a pandemic. Constant digital surveillance as we move through the day might accomplish the same thing, but it clearly violates our privacy. Even less comprehensive monitoring with digital technology that updates traditional contact tracing by alerting us when we’ve been near a person who is COVID-19-positive “may fail to protect data, or can be misused or extended far beyond [its] initial purpose,” members of a pan-European consortium warned recently. Members of the group, known as DP-3T, are among a handful of technologists around the world building “proximity tracing” apps that aim to preserve privacy. They say that the software they have developed requires participants to opt in, does not store personal information, and will experience a “graceful dismantling” after the pandemic is over. The group has posted its code on the Internet, where it is available for free to health authorities.

The privacy-conscious contact-tracing software that has got the most attention, though, does not yet exist. On April 10th, the day after Cantwell’s hearing, Google and Apple announced that they were collaborating on a new software interface similar to the one developed by the academics in Europe. It will use low-level Bluetooth signals to alert anyone whose Android device or iPhone has come near a phone owned by an infected person in the past two weeks. Participation will be voluntary, and the companies say that no identifying data will be exchanged or stored. If a user is diagnosed with the coronavirus, it is up to them to inform the app, which then notifies anyone whose phone has been near that phone. It also assumes that the phones are being carried by their owners.
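The mechanics described above can be sketched in a few lines. This is a deliberately simplified toy, not the Google-Apple interface or the DP-3T code: it assumes that each phone broadcasts a fresh random identifier each day, keeps a private list of the identifiers it hears, and checks that list on the device against identifiers that a diagnosed user chooses to publish. The class, its method names, and the once-a-day rotation are all assumptions made for illustration.

```python
import secrets

WINDOW_DAYS = 14  # "the past two weeks"


class Phone:
    """Toy model of one participating phone (hypothetical design)."""

    def __init__(self):
        self.sent = {}   # day -> random identifier broadcast that day
        self.heard = []  # (day, identifier) pairs picked up from nearby phones

    def broadcast(self, day):
        # A fresh random value each day; it carries no name or location.
        if day not in self.sent:
            self.sent[day] = secrets.token_hex(16)
        return self.sent[day]

    def hear(self, day, identifier):
        # Stored only on this phone.
        self.heard.append((day, identifier))

    def report_positive(self, today):
        # Voluntary step: a diagnosed user publishes their recent identifiers.
        return {ident for day, ident in self.sent.items()
                if 0 <= today - day <= WINDOW_DAYS}

    def was_exposed(self, published, today):
        # Matching happens locally, against identifiers heard in the window.
        return any(ident in published and 0 <= today - day <= WINDOW_DAYS
                   for day, ident in self.heard)


if __name__ == "__main__":
    alice, bob, carol = Phone(), Phone(), Phone()
    bob.hear(1, alice.broadcast(1))      # Bob's phone was near Alice's on day 1
    published = alice.report_positive(10)
    print(bob.was_exposed(published, 10), carol.was_exposed(published, 10))
```

Even in this toy version, the weak links the designers acknowledge are visible: nothing happens unless the diagnosed user chooses to report, and the phones, not the people, are what get matched.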

Will this work? Technically, yes: phones can communicate with one another. (As they do, third-party apps are likely to grab their location data, too.) Bluetooth, though, is notoriously glitchy. Will the app report contact between people who live in adjoining apartments because Bluetooth penetrates walls? What about people who are out for a walk in the open air? “Such false positives will both waste valuable resources in terms of testing people who were not actually exposed as well as cause people to turn off the app,” Susan Landau, a professor of cybersecurity at Tufts University, told me.

If, somehow, the technology itself can be made to work well enough, will it make a difference to public health? That’s unclear. The efficacy of contact tracing hinges on the existence of widespread, accessible, free testing. So far, that is not something health authorities have been able to offer in this country. It also depends on people owning smartphones. (Additionally, Android users need the right phone.) And it requires a significant number of those phone owners to choose to participate in the program. It’s not known yet what that number is. If too few people participate, the app is likely to generate a false sense of security. Sharing one’s medical status is also voluntary—another weak link. Landau also pointed out that “contact-tracing apps cannot handle asymptomatic carriers. As long as asymptomatic carriers are a significant portion of the population, the contact-tracing apps create a serious risk of missing the fact that someone has been exposed.”

The Google-Apple application interface will be released in May to public-health agencies. In June, the companies plan to add the software to their operating systems. It’s too early to know how widely it will be adopted, if it will assuage people’s fears, whether it will improve health outcomes—or if, as the Cambridge University researcher Ross Anderson said, it’s just an expression of “do-something-itis.” Yet watching Big Tech, which has often played fast and loose with users’ personal information, embrace Buckee’s belief that public health and civic health should not be mutually exclusive is an encouraging development. “The rate of scientific collaboration and output right now is just astonishing,” she told me the first time we talked. A few days later, when I checked in with Buckee, she said, “It’s absolutely exhausting, often quite emotional, and completely all-consuming. I go to sleep thinking about COVID-19. I wake up thinking about COVID-19. It’s the same for everyone I know working on this. All my colleagues check on each other, we try to go outside at least once a day to walk and clear our heads, but usually we end up on a call while we’re walking. I don’t know how we can sustain this pace, but there is so much to do.”

