It’s a logical conclusion that expanding data sources in clinical trials can lead to more therapies, but does this additional data speed up or slow down a clinical trial? In the past, clinical trials used only structured, clinically sourced data, which was relatively easy to organize and mine. Clinical trials today are more complex and draw on data from a plethora of sources, such as mHealth devices for remote monitoring of trial patients, mobile health applications, biomarker data, clinically sourced data and more. If the end goal is getting more therapies to market, and currently only 20-30 percent of drugs make it to market, the question becomes: can expanding our data sources and improving the aggregation and management of both unstructured and structured data improve the likelihood of regulatory approval and get safe and effective drugs into the hands of patients? The answer is “yes,” and we believe that the combination of advanced technology and the emerging role of the data scientist holds the key.
If we can get to a place where larger and more comprehensive datasets are made available to pharmaceutical companies and their clinical researchers, we can move beyond the questions of what data to collect and how to use it, and on to more meaningful questions: What new theory can I prove regarding the population of this trial? Should my study continue based on new insights I’ve received? What new patterns can I uncover to lead me to a new hypothesis? The answer to these questions lies in technology, specifically technology with advanced metadata management capabilities that offers the flexibility and scalability needed to handle all real-world data in the format, size, and frequency required as clinical trials evolve.
Today, data from various domains is exploding, giving us a more comprehensive picture of clinical studies that can improve our decision making. There are tremendous challenges in aggregating, storing and preparing the data for high-speed analysis because it is no longer just structured data. It is both structured and unstructured data coming from a growing number of systems, devices and publications, such as EMRs, medical devices and documented research. This influx, while valuable, makes the data difficult to capture and analyze at different intervals throughout a study.
Take, for example, a biopharmaceutical company focused on cancer drug development. To improve its studies and time to market, its study teams want the ability to quickly combine and analyze data collected through the clinical trial along with genomics data, so they can see whether the patients admitted into the trial are the right people based on their genes and their likelihood of responding to the treatment. Armed with this information early in the process, study teams can determine whether they are on the right track or need to course correct by selecting patients more precisely. This correlation between study data and genomics enables biopharmaceutical companies to better identify the exact type of patient who has a better chance of responding positively to the drug. It is a big step toward precision medicine.
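To make the idea concrete, here is a minimal sketch in Python of what combining enrollment data with genomics results might look like. Everything in it is an assumption made for illustration: the `Enrollment` record, the `summarize_biomarker_match` function, the patient IDs and the single true/false biomarker flag are hypothetical and do not describe any particular company's system or data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enrollment:
    """Hypothetical enrollment record from a clinical trial system."""
    patient_id: str
    site: str


def summarize_biomarker_match(enrolled, marker_status):
    """Join enrollment records to genomic results by patient ID.

    marker_status maps patient_id -> True/False for a hypothetical
    biomarker. Patients with no genomic result are reported separately
    rather than being counted as negative.
    """
    positive, negative, missing = [], [], []
    for record in enrolled:
        status = marker_status.get(record.patient_id)
        if status is None:
            missing.append(record.patient_id)
        elif status:
            positive.append(record.patient_id)
        else:
            negative.append(record.patient_id)

    tested = len(positive) + len(negative)
    return {
        "positive": positive,
        "negative": negative,
        "missing": missing,
        # Share of tested patients who carry the biomarker (None if nobody was tested).
        "positive_share": len(positive) / tested if tested else None,
    }


# Illustrative data only; not drawn from any real study.
enrolled = [
    Enrollment("P001", "Site A"),
    Enrollment("P002", "Site A"),
    Enrollment("P003", "Site B"),
    Enrollment("P004", "Site B"),
]
marker_status = {"P001": True, "P002": False, "P003": True}

summary = summarize_biomarker_match(enrolled, marker_status)
print(summary)
```

A study team could run a summary like this at intervals during enrollment. A low biomarker-positive share, or a long list of patients without genomic results, would be an early signal to revisit recruitment criteria.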
At first glance, the notion of combining multiple data types for better outcomes may seem time-consuming, daunting and not worth the additional effort, as it could slow down the time to market. In reality, combining these invaluable data types and making that information available throughout a study actually increases the probability of a drug making it to market faster. Today’s technological advances and the emerging role of the data scientist are giving us new levels of analysis and the ability to predict outcomes in near real-time. Innovations such as machine learning and artificial intelligence are providing us with new algorithmic techniques that can assist in identifying patterns that support enhanced and automated decision making along the drug development path. For example, historical data about a clinical trial’s ability to recruit suitable patients can be used to predict the probability of future recruitment success. The combination of qualitative and descriptive data now enables researchers to identify similar groups of patients who are best suited for a new trial. By knowing exactly who you’re looking for, you not only shorten site selection and patient recruitment, you also increase the likelihood of positive results.
Historically, the difficulty of combining structured and unstructured data for clinical trial decision making has caused errors and delays in setting up and completing successful trials. However, with new aggregation, storage and analysis solutions, combined with artificial intelligence and machine learning, we can now detect trends and negative signals much earlier in clinical trials. Not only does this compress the critical path for drug development and improve safety, it increases our chances of creating a more robust pipeline of disease-treating therapies, all at a faster pace than ever before.
However, even with all of our technological advances, the human role is still critical. In fact, with the growing amount of data and data types, a fairly new role has emerged: the data scientist, or data revolutionist. While clinical researchers continue to focus their daily efforts on creating successful trials, the data scientist focuses on directing the overall clinical data quality and management activities that support the progression of the drug development pipeline. This emerging role is responsible for connecting the various dots across the organization to ensure the right data makes it into the right hands at the right time. While the role is defined by each biopharmaceutical company’s needs, ultimately the data scientist’s job is to be a leader in both science and complex data management, quality and visualization, from source to submission. So even as we continue to see more innovation in automating decision making, growing datasets and the management of these technological advances will still require a human touch.
When we combine today’s technological innovations and the new data scientist role, and apply them to the clinical world, we see a huge opportunity for positive impact. Not only does the insight from more data from different sources enhance the questions we ask of our clinical trials and fill a much-needed disease-fighting pipeline of new drugs, it also allows us to uncover hidden relationships that can precipitate new hypotheses and provoke new, potentially life-saving questions that we never thought to ask before. With the right technology and people, new sources of data promise to speed, not slow, clinical trials, allowing biopharmaceutical companies to bring more life-saving drugs to market faster.