Microsoft scientists, in an article published this week in the Journal of Oncology Practice, demonstrated that by analyzing large samples of search engine queries, they may, in some cases, be able to identify internet users who are suffering from pancreatic cancer, even before they have received a diagnosis of the disease.
This is an example of extremely useful research performed by a private company using data that it collects about individuals. Such research faces an important challenge. The ethical review standards that govern academic research using human subject data frequently do not apply in the private context. Without such standards and safeguards, public confidence in research of this type will erode. In today’s big-data world, we need effective ethical review processes in the private sector, as well as in the academic one. Some companies are starting to respond to this challenge.
Studying data about people is not new. It has been a central occupation of researchers in almost every field of scientific endeavor. Whether to seek the causes of disease, to improve the safety of transportation, to understand human behavior or simply to improve general scientific knowledge, researchers have designed experiments, sought the express consent of individuals when warranted and proceeded with their studies.
The Common Rule, a federal regulation, sets standards for such research when it is supported by government funds, and makes these studies subject to approval by university Institutional Review Boards (IRBs), which provide ethical guidance and decide whether the research may proceed.
Increasingly, the data sets sought by researchers are not the limited experiments conducted on a group of volunteers recruited for a study. Large data sets are often available in the public domain, made public by users themselves on websites and services or by open data projects making government data accessible. Other large data sets sought by researchers include the wide range of consumer information held privately by companies of every size. Companies themselves regularly study this data, usually to improve their own services and test new features and often to publish general research valuable to the broader scientific community. The opportunities for breakthroughs can be unexpected and potentially lifesaving, as the recent Microsoft study demonstrates.
None of these data sets are subject to the Common Rule and IRB oversight, either because they aren’t linked to federal funding, or because the ethical guidelines of IRBs do not consider public data or data already collected for business purposes to be the types of "experiments" on humans that require review. In many cases, such as a private company testing which layout of a website is appealing to web shoppers, such oversight is certainly unnecessary. But other new research, whether conducted by corporate researchers at leading social media companies or by academic researchers analyzing data sets obtained from government sources or from the open web, has generated public debate.
The publicity around the Facebook "Emotional Contagion" study, which sought to understand the effect of posts by social media users on their friends, helped bring the research ethics question to a broad audience. But many academic or corporate researchers had long been struggling to find the right frameworks for ethical review of the vast amount of research taking place today beyond the traditional academic context.
The questions about such research are numerous and important. If a new type of research review process is needed, who should be subject to it? Startups don’t have the resources to staff special review committees, and major corporations often have hundreds of tests of different kinds happening at any given time. Should only research intended for scientific publication be subject to review, or should general product improvement work be reviewed as well? Who should staff these review committees, and how do they fit in with privacy and security reviews, which often look at related issues? What factors should be assessed to determine the ethics of a project? Are they universal? Or should they differ from culture to culture? What benefits are valuable enough that researchers should allow risks to users, if ever?
Companies are starting to provide some answers. Today, Facebook’s ethics and policy staff published an important paper that provides a detailed overview of the company’s research review process. Informed by consultations with a wide range of experts, the Facebook process details the specific steps taken by the company to review its internal research work, and is an important step forward for corporate research ethics.
More is happening, in academic and corporate circles, but it can’t happen quickly enough. The Center for Democracy and Technology recently published a report describing the internal research ethics process at Fitbit. And academics, advocates and researchers in every field are continuing to work through the wide range of issues to be considered for such review processes to become widespread and meaningful.
The stakes are high. If companies feel that working with academics to conduct research is too uncertain or risky, innovative work will continue, but it will remain confidential and protected within companies, and unavailable to a wide research audience.
By the same token, if members of the public feel that they cannot trust the type of big-data research that the private sector is able to carry out today, this could pose serious obstacles for this type of research.
Protecting consumer data, while ensuring that it can be used safely and responsibly for scientific research that may yield the next breakthroughs in knowledge, is an ethical challenge we need to meet.
Dennis Hirsch is the faculty director of the Ohio State University Program on Data and Governance, and is a professor of law at the Moritz College of Law. Reach him @OSU_Law.
This article originally appeared on Recode.net.