The proteins in our bodies (the proteome) can tell us a lot about our health, such as whether we have, or are likely to develop, certain diseases.
However, analyzing a patient’s entire proteome is time-consuming and costly.
A new study by an interdisciplinary team of scientists, engineers and physicians from MIT, the Koch Institute for Integrative Cancer Research at MIT, Harvard Medical School, Seer, and other organizations demonstrates how this could change.
Combining the protein chemistry of a panel of engineered nanoparticles with machine learning, they analyzed the proteome in an unbiased, unconstrained manner, and with a depth, breadth, and speed not previously possible.
Rapid access to the comprehensive, scalable information this technology enables could have numerous uses, potentially letting academia and industry use proteomics data in much the same way that genomics data is used today.
In the study, published in Nature Communications, the research team provided one proof-of-concept application: accurate detection of early-stage lung cancer.
MIT Sloan Prof. Vivek Farias, a member of the research team, says, “Just as high throughput DNA sequencing revolutionized genomic research, we believe this technology could prove to be a similar game changer for proteomic research.
As such, it could open up new avenues to predict, diagnose, and treat disease.”
Team member and Harvard Medical School Professor Omid Farokhzad, an MIT Sloan alumnus and physician-scientist at Brigham and Women’s Hospital, agrees. “To really understand what is happening in the human body, we need to look at proteins.
Up until now, we have not been able to do that at a large scale. Our technology paves the way for an exciting and powerful new platform for discovery of new biomarkers for treatment and diagnosis of disease.”
The foundational technology for this work originated in part from the laboratory of Dr. Farokhzad, who is currently on sabbatical leave and serving as CEO of Seer, Inc., which is commercializing this technology.
David H. Koch Institute Professor Robert Langer, who was also on the research team and is head of the Langer Lab and a faculty member at the Koch Institute at MIT, notes, “The exploitation of bio-nano interfaces for capture and interrogation of the proteome provides a unique application of nanotechnology, which offers potential new opportunities, such as early prediction of cancer and understanding the proteome of people who are asymptomatic for COVID-19.”
Historically, it has been prohibitively difficult to analyze the proteome to inform human health and disease. The researchers developed a more efficient way to study the proteome. By identifying proteins that stick to the surface of nanoparticles in plasma, they could generate comprehensive proteomic information on a single patient in a rapid timeframe.
“The data from the approach – which reveals which proteins stick to which nanoparticles – is akin to a hazy photograph of the patient’s proteome. We used a number of machine learning approaches – some known and others we developed ourselves – to de-noise that photograph,” says Farias.
“That de-noised data can then be used to many ends, including diagnosis of disease.”
In their study, the team was able to detect non-small-cell lung cancer in human plasma samples at the disease’s earliest stages with high accuracy, says Farias.
This technology could also be used to better understand a number of human diseases, including infectious diseases like COVID-19. Farokhzad explains, “Current research on coronavirus is mostly on the viral genome or proteins, but there is also a critical component of a patient’s response in determining the course of disease.
We could use our technology to observe physical changes in proteins as part of the patient’s response to the infection, and this could guide development of novel diagnostics and therapeutics.”
Farokhzad hopes to enable access to proteomic information at a similar scale to that of genomic information. “A human genome is sequenced about every 30 seconds.
We don’t know what all of that information means – out of the approximately 500 million genetic variations that have been identified, only approximately 30,000 are clinically validated.
However, when we put proteomic information together with genomic information, our knowledge will grow exponentially and that can fundamentally change the way we address healthcare.”