Anthony Philippakis is a physician, geneticist, and data scientist. He is currently a cardiologist at Brigham and Women’s Hospital, a venture partner at GV (formerly Google Ventures), chief data officer at Broad Institute of Harvard and MIT, and a founding member of the RARE-X board. His team at Broad developed the data platform that is being used by RARE-X.
Philippakis believes that, as the cost of genomic sequencing has fallen and the amount of genomic data has proliferated, we are at an inflection point where data can be harnessed to transform the diagnosis of rare diseases and the development of therapies to treat them.
RARE-X editor Daniel S. Levine spoke to Philippakis about why he believes data sharing is critical to improving the lives of rare disease patients, what challenges need to be overcome, and the potential he sees for RARE-X. Edited excerpts follow.
Q: Why is data sharing critical to improving diagnosis and treatment?
A: Medicine is in the midst of the data revolution with the ability to generate large data sets on individuals—for rare or common diseases, or cancer. I don’t think it’s unique to a disease population, but the ability to deeply genotype and phenotype people in order to understand the underlying mechanisms of disease is crucial. These are efforts that are very expensive, require a large amount of funding, and there’s no one group that has a monopoly on good ideas.
Given the social resources that need to be invested in generating these human subjects research cohorts, it’s imperative that the data be opened up to the world and shared. There are some great precedents of fields within medicine that have moved much faster than others. One example I would point to is human genetics and the Human Genome Project. There was a public effort and a private effort. The public effort pretty much had as a rallying cry that the data should be available to all. It was hard to walk that back later.
There’s been a very strong tradition within genomics of sharing data. When you look at what’s happened in human genetics for the last 15 years, there aren’t too many periods in science like it, where we’ve gone from having fewer than 10 positions in the genome that we could reliably associate with common disease, like coronary artery disease or diabetes, to today, where we have something more like 80,000 or 100,000, depending on how you count. This is just an incredible window of progress that happened because data sharing was possible.
Now the foil to this is fields like electronic health records or epidemiology, where data sharing is far from the norm. Epidemiologists, in my experience, want to be buried with their data, in contrast to the kind of early great discoveries, like the Framingham Heart Study, where we discovered the risk factors for heart disease or that smoking causes cancer and heart disease. There are just fewer breakthroughs in that community, and nobody would say that it is one of the fast-moving branches of science.
The main reason for data sharing is accelerating progress towards making new therapies by enabling an ecosystem of people with good ideas.
Q: What are the challenges from a technical point of view that need to be addressed?
A: The technical challenges are not so hard; the social challenges are much greater. To the extent that there are technical challenges, they are around seeing standards created and reduced to practice, but even there they are less than the social challenges. One of the biggest social phobias is this idea of the “great database in the sky” or “Facebook for genomes.” There are a lot of different words that get used, but it’s this idea that there will be this one monolithic data platform out there that controls all of the world’s data and is all-powerful. And if you didn’t build it, then you are going to be behind forever. There are a lot of pre-formed antibodies to this idea and people are very allergic to it, both because people don’t want it to happen and because there are a lot of technical challenges to it, with data stored around the world and restrictions on health data and how it is moved around the world.
It’s almost certainly the case that we’re not going to ever have one group control it all. And what everyone strongly believes is that what we need looks much more like the web or the Internet, which is kind of decentralized and federated and a lot of groups are building it. What that requires are standards. We couldn’t have the Internet without TCP/IP and we couldn’t have the web without HTML. These are the kinds of things where there really are more social challenges than there are technical challenges.
Q: What are the biggest obstacles to data sharing today?
A: In general, the researchers who generate data don’t want to share it. It is much better for them. It’s very unlikely that you have both the operational excellence to generate big data sets and are the best group in the world to analyze them. It’s a lot to pull together in one place. In general, the people who generate the data sets want to keep them to themselves so that they have privileged access to them.
There are three bodies that can fix this. The first is funders, who can simply demand, as a condition of funding, that if you generate data you share it. The second is publishers, who can demand data sharing as a condition of publication. The third body is more in the clinical realm, but I have a lot of hope for them: the payers. If you are either CMS or United Healthcare, and you’re paying for cancer genome sequencing, it is actually in your best interest to say that the quid pro quo of me paying for this test is that the results become available to accredited researchers to accelerate disease research.
The reason why it’s good for them is then you can’t have a group that has a data monopoly. There’s a little bit of a positive feedback where the more data you have, the more accurate your diagnosis. You could create a monopoly of data and then charge more, so the payers could undermine that.
Q: Why did you become involved in RARE-X?
A: By training I’m a cardiologist. Most of my time is spent as the chief data officer at the Broad. I lead a large team of software engineers. When I was a cardiologist, I was focused on a lot of rare genetic cardiovascular conditions. I’ve always cared very much about rare diseases. I got to know Nicole Boice [founder of Global Genes and founder of RARE-X] through data sharing circles, and she asked me to be on her scientific advisory board, and then I chaired Global Genes’ SAB for several years. One of the big unmet needs is registries. The opportunity to help support her and see the creation of just-add-water registries that would empower rare disease foundations is very exciting.
Q: What role do you see RARE-X playing and what do you hope to accomplish through it?
A: There are a few different things that we hope to accomplish. There is an alignment of interests around the importance of registries, where everyone agrees the need is very great. On the research side it becomes a tool to collect data. Because these diseases are rare, it’s often hard to understand their pathogenesis, their natural history, and things like that. That’s one group of people who really want to see registries come together. Pharmaceutical companies also very much want to see registries come together because in order to run a clinical trial, you really need to have a registry. Otherwise you spend so much money trying to find the patients that you don’t have enough funding to then also run a trial. For that reason, a lot of these foundations are very invested in seeing registries created. Everyone wants them and there have been a lot of startups that have tried to come into existence, but so far none of them have succeeded in capturing hearts and minds.
The reason why is that it’s very hard. You need to bring four things together, and there has yet to be a group that brings all four together. The first is you need an organization that can engage patients and be authentic, say we’re on your team, and capture the patients’ hearts and minds. This is where Nicole is probably the best in the world. Then you need an organization that can make commercial-grade software that has all the security and compliance with HIPAA and CLIA and all the other acronyms that are needed, and also has the great user experience and feel of a consumer technology product. You need the scientific expertise to design the study, which is hard. And finally, you need an incentive structure that is aligned with patients.
I spend part of my time as a venture capitalist. At various points, I wondered if we should try to start a company. I always decided it wasn’t a good idea. The business model for a company is inevitably selling data on patients or selling access to them in terms of enrolling them for a clinical trial. It’s not good for data sharing. It’s not good for patients with rare diseases. And then there’s the social side: as soon as you have investors and CEOs who are getting rich if the company is successful, you have them set against all of these families from across America that have very challenging lives. It’s just a very unsavory social dynamic. This is a big reason why this industry has never succeeded.
What I hope with RARE-X, which is one of the neat opportunities, is that you bring in people like Nicole and the people on her team who know how to engage patients and be on their team. And then my group is a software development shop that makes professional software and works with a lot of big tech companies. By being at the Broad, I collaborate closely with a lot of experts in rare disease and pull them in. And then finally, both Nicole and I are invested in this being a nonprofit and we’re not trying to make money on it. I hope what will happen is that we’ll be able to be in a place where we can essentially give every disease foundation an amazing registry where they can set it up very quickly, figure out where the funding will come from to do the sequencing and other assays on the patients, and do a lot to accelerate both rare disease basic research and also clinical trials.