Creating innovative medicines through collaboration between AI and humans
The pharmaceutical industry was built on data utilization long before society became aware of the importance of digital transformation. For example, in drug discovery research, the search for novel drug seeds*1 has relied on large amounts of experimental data. We have also been working on new computation-based sciences such as bioinformatics, which elucidates life phenomena, and chemoinformatics, which promotes chemical research. Pharmaceutical companies have long analyzed clinical trial data, and in recent years they have also begun to use real-world data. The use of medical big data is a new way to find innovative treatment methods.
Kazuhisa Tsunoyama, Ph.D., Senior Director of Advanced Informatics & Analytics (AIA)*2 Japan, AIA division, who originally studied computational biochemistry and molecular evolution, draws on his own experience as a data scientist in life science when he says, “Pharmaceutical companies need to further accelerate the movement toward the information industry.”
"Drug discovery must be based on scientific evidence. Not only do we have to check if the drug is effective, but we also have to find out if there are any side effects. Scientific data is the basis for these judgments."
Tsunoyama talks about the use of scientific data:
“Looking at drug discovery research and development in sequence, the first step of using data is to find the target biology. To find drugs that act on the target, we then analyze the data on modalities such as small molecules or antibodies. In clinical trials to determine the therapeutic efficacy and whether there are any adverse events, the vast amount of data from participating subjects is also analyzed.”
“Even after a new drug has been approved and is being used to treat patients, post-marketing surveillance is conducted to analyze the data and see whether there are any side effects that could not be observed during the clinical trials. Epidemiological analysis of data on drug prescriptions, procedures, diagnoses and treatments is also active, and efforts are being made to use these data to find diseases with unmet medical needs and to identify new drug targets by studying the causes of such needs.”
“I was originally studying computer science. I wanted to analyze something by computer, so I selected computational biochemistry and molecular evolution, because life is considered to be the most complex system. In particular, in the field of molecular evolution, I researched the mechanism of evolution using a gene sequence database that was established by international collaboration. The whole human genome sequencing research was conducted worldwide, led by the U.S. in the late 1990s, and the importance of data science began to be recognized widely in the life science field. I believe many pharmaceutical companies stepped up their efforts toward the information industry during this period.”
In 1999, Tsunoyama joined Yamanouchi Pharmaceutical Co., Ltd.*3, the predecessor of Astellas. Since then, he has worked on research that uses microarray technology, which can comprehensively measure the expression levels of thousands to tens of thousands of genes in a short period of time, to analyze the relationship between gene expression status and disease. He has also explored new therapeutic targets based on the deciphered whole human genome sequence, so-called genome-based drug discovery research. “Genome-based drug discovery is one example of data utilization, and the pharmaceutical companies that have improved their data analysis capabilities and used them to find new treatments have gained and maintained stronger competitiveness,” recalls Tsunoyama.
*1 Drug seeds: a seed of a new drug that is thought to be effective in treating a certain disease
*2 Advanced Informatics & Analytics (AIA): Please see "Digital Transformation" page for the purpose of establishing the AIA division and examples of initiatives using data, informatics and analytics.
*3 Yamanouchi Pharmaceutical Co., Ltd.: Astellas was established in 2005 through the merger of Yamanouchi Pharmaceutical Co., Ltd. and Fujisawa Pharmaceutical Co., Ltd.
Using data science to solve the challenges the pharmaceutical industry faces
Data utilization is accelerating in the pharmaceutical industry, including Astellas. The pharmaceutical industry is digitizing at a scale and speed that make it one of the most information-intensive industries. This trend is not unrelated to the significant challenges the pharmaceutical industry faces.
In recent years, the types of diseases and treatments for patients have become increasingly diverse and subdivided, and there is a need for precision medicine to select the best treatment based on detailed data, including individual genetic information. As a result, we will move toward using data more effectively than ever before.
What is required of pharmaceutical companies is to deliver new medicines to patients faster, with higher efficacy and greater reliability. To achieve this, Astellas is actively working on data utilization and is trying to find new value by integrating data analysis technology with human intelligence.
“There is a saying attributed to Isaac Newton: ‘If I have seen further it is by standing on the shoulders of Giants.’ This is a metaphor for discovering something based on the accumulated discoveries of our predecessors. In this sense, I feel it is our mission to stand on the shoulders of the giants of data and AI, and to promote science and business based on data science,” explains Tsunoyama.
“For example, AI suggests many options to us, which stimulates human creativity and leads to new possibilities and directions. In other words, we hope to promote advanced science and business through the collaboration of AI and humans. This will lead not only to the development of new medicines, but also to increased productivity at our company and, eventually, to value for patients.”
Astellas launched AIA in 2019 to fully implement digital transformation (DX)
It was around 2015 that Astellas began to make full-scale efforts to use data effectively. In the Drug Discovery Research division in Japan, members who were keenly interested in searching for new drug candidates in data from various life science fields came together and started the cross-organizational Big Data Initiative to apply medical big data, which was attracting attention at the time, to drug discovery. In the U.S. and Europe, the Real World Informatics division was established to make effective use of real-world healthcare data.
“One of the driving forces behind digital transformation at Astellas is AIA, which was established in 2019. We integrated our Big Data Initiative activities in Japan and the Real World Informatics divisions in the U.S. and Europe into a single organization,” explains Tsunoyama.
“My group in AIA collects internal and external data: molecular biology data such as omics data*4, which includes human genome sequence information; text data such as literature; and data on modalities and technologies, biology, diseases, industry trends, medical economics, clinical trials, and so forth. We analyze these data using information processing technologies such as data science, machine learning, and AI. Based on the knowledge we obtain, we test hypotheses, conduct experiments to validate the hypotheses we have gained, and discover modalities or develop technologies. Once we obtain new ideas this way, we add the data needed for further analysis. By repeating these processes, we aim to achieve more advanced science.”
*4 Omics data: data from the comprehensive analysis and research of molecules in living organisms. For example, genomics (gene + omics) data for all genes, proteomics (protein + omics) data for all proteins, etc.
Approaches to drug discovery and precision medicine based on human genomics data
In the past, decoding the human genome sequence required a huge budget, but the cost has dropped significantly, so pharmaceutical companies are actively investing in acquiring human genome sequence data. The integration of human genomics data with real-world data is bound to bring great value.
“Pharmaceutical companies are actively investing in acquiring human genome sequence data because the selection of genetically supported therapeutic targets leads to a higher probability of success in clinical trials. Astellas uses a wide variety of molecular biology data and collaborates with medical institutions and research organizations in Japan and overseas that promote cutting-edge human genetics research, including the UK government-owned Genomics England, to advance the identification of causal genes for monogenic diseases. We are also conducting research into the causes of so-called common diseases*5 in cooperation with biobanks in various countries. By integrating these findings and applying our expertise in drug discovery, we will be able to discover new therapeutic targets and biomarkers, and provide new treatment methods for many patients,” says Tsunoyama.
*5 Common diseases: diseases with high prevalence such as hypertension and hyperlipidemia
Effective extraction of important knowledge from vast amounts of information
“We are also working on text mining to analyze the information contained in medical and life science literature, patents, research funding records, and many other kinds of text data. We have developed our own dictionary of 1.5 million terms because proteins and diseases have many alternative names and abbreviations (notation variants). For term identification, we have developed an algorithm that can identify all the terms in our dictionary in the abstract of a single paper in 0.005 seconds. Many other technologies are also used to extract important knowledge from textual information.”
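As an illustration only, dictionary-based term identification of the kind described above can be sketched in a few lines of Python. This is not the Astellas algorithm, whose details are not given here; the tiny `SYNONYMS` dictionary, the function names, and the greedy longest-match strategy are all assumptions chosen to show how alternative names and abbreviations can be mapped to one canonical term.

```python
import re

# Hypothetical toy dictionary: surface form -> canonical term.
# A production dictionary would hold far more entries.
SYNONYMS = {
    "tumor necrosis factor alpha": "TNF",
    "tnf-alpha": "TNF",
    "tnf": "TNF",
    "rheumatoid arthritis": "Rheumatoid arthritis",
    "ra": "Rheumatoid arthritis",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping hyphens."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def build_index(synonyms):
    """Map each surface form's token tuple to its canonical term."""
    index = {tuple(tokenize(form)): name for form, name in synonyms.items()}
    max_len = max(len(key) for key in index)
    return index, max_len


def find_terms(text, index, max_len):
    """Greedy longest-match scan; returns (matched form, canonical term) pairs."""
    tokens = tokenize(text)
    hits = []
    i = 0
    while i < len(tokens):
        # Try the longest possible phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in index:
                hits.append((" ".join(key), index[key]))
                i += n
                break
        else:
            i += 1  # no dictionary term starts at this token
    return hits


index, max_len = build_index(SYNONYMS)
abstract = "TNF-alpha inhibitors are used in rheumatoid arthritis (RA)."
for form, name in find_terms(abstract, index, max_len):
    print(form, "->", name)
```

Because each lookup is a hash-table access on a short token tuple, the scan time grows with the length of the abstract rather than with the size of the dictionary, which is one plausible reason this style of matching scales to very large term lists.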
"We can analyze the trends of competitors and related companies based on the huge amount of text information in the medical/life science field, and we hope to use this technology to enhance our own research capabilities," says Tsunoyama.
In addition, we are trying to extract useful knowledge from real-world data, such as electronic medical records and insurance claim data. By conducting this analysis, we will be able to understand the treatment flow of "health - pre-disease - onset - treatment - prognosis," deepen our understanding of disease and treatment, and identify unmet medical needs. Our goal is to create a system that can provide optimal treatment methods that enable precision medicine.
Approaches to drug candidate discovery using AI and robotics
AIA, together with the Drug Discovery Research division, is working on AI-driven drug candidate discovery. By building an automated system that integrates AI and robotics, we aim to enable a rapid search for modalities with high pharmacological activity and to shorten the research period. We have started to build a cycle that finds higher-quality candidates efficiently by repeating four steps: molecular design based on AI prediction of compound properties, automated synthesis by AI and robots, automated biological testing, and improvement of the AI property prediction using the resulting data.
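The shape of such a design, synthesis, testing, and retraining cycle can be illustrated with a deliberately simplified Python sketch. Everything in it is hypothetical: compounds are represented by plain integers, `assay` stands in for a real biological test, and `predict` is a toy nearest-neighbor model rather than the property prediction AI described in this article. It only shows how each round of measured results feeds the next round of design.

```python
def assay(x):
    """Stand-in for an automated assay: hidden activity that peaks at x = 7."""
    return -(x - 7) ** 2


def predict(x, measured):
    """Toy surrogate model: reuse the activity of the nearest measured compound."""
    nearest = min(measured, key=lambda m: abs(m - x))
    return measured[nearest]


def run_cycle(candidates, seed, rounds, batch_size):
    """Repeat design -> make -> test -> learn and return the best compound found."""
    measured = {x: assay(x) for x in seed}  # initial training data
    for _ in range(rounds):
        untested = [x for x in candidates if x not in measured]
        if not untested:
            break
        # Design: rank untested candidates by predicted activity.
        ranked = sorted(untested, key=lambda x: predict(x, measured), reverse=True)
        # Make and test: "synthesize" the top candidates and assay them.
        for x in ranked[:batch_size]:
            measured[x] = assay(x)
        # Learn: predict() reads from `measured`, so new results are
        # used automatically in the next round.
    best = max(measured, key=measured.get)
    return best, measured


best, measured = run_cycle(list(range(11)), seed=[0, 10], rounds=3, batch_size=2)
print("best candidate:", best, "activity:", measured[best])
```

This sketch always picks the candidates with the highest predicted activity. A real system would typically also balance exploration against exploitation and let researchers review the proposed designs.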
Tsunoyama explains, “We are developing a system in which AI learns from approximately 650,000 data points owned by Astellas to predict physical properties, pharmacokinetic characteristics, and toxicity, and then designs the structures of candidate compounds that are expected to have drug efficacy, based on approximately 20 million compound structure conversion data. We have also introduced equipment to synthesize the designed compounds, and are building a cycle to connect all the processes smoothly. We are also operating an automated assay system*6 to investigate pharmacological activity.”
“When we used these systems in multiple research projects, we could immediately reduce the search time. In one part of the design process, we could reduce the time by up to 90% compared to the previous process. We have also been able to identify highly active compounds more quickly by repeating the process of using the AI activity prediction model to design compounds in a short time, synthesizing them in a semi-automated high-throughput synthesis system, and then re-training the AI on the assay results,” says Tsunoyama of the results. We were able to demonstrate that research time can be shortened by combining human judgment with AI predictions to move research forward ahead of schedule.
Automated systems can be operated remotely. By enabling remote-controlled compound synthesis and assays, we expect to be able to proceed with drug discovery research without delay, even during the COVID-19 pandemic. Astellas will continue to develop this technology, as part of our work style reform, as a technology that society will need in the future.
*6 Automated assay system: equipment that uses robots to automate biological tests
Maximize value through collaboration with partners
"These activities will not be handled by Astellas alone, but can be further expanded and accelerated through collaboration and cooperation with various partners," says Tsunoyama. "I would like to work more creatively with various partners.
“As we move forward with data-driven drug discovery, we would like to actively collaborate with partners who have highly original data or creative data analysis technology. At the same time, Astellas must make further progress to be chosen as their partner.
“The R&D activities of pharmaceutical companies require long-term efforts. It is not unusual for a project to take about 10 years to complete. It takes a long time to reach the final goal.
“I think my desire to contribute to the health of people around the world is what keeps me working patiently. We are looking for partners who share our goals and can continue to work towards the final common goal.
“In the future, AI and robotics will become commonplace. We should let AI and robots do more of what they can do, and use the time this frees up for creative endeavors. In this new era of data-driven innovation, please join us in our creative work to contribute to the health of people around the world."