Using data to create a quality-first approach to Industry 4.0
Industry 4.0 within the pharmaceutical industry is all about connecting machines and systems with advanced technologies to digitally transform how experiments are completed, and ultimately how value is created.
The key word to emphasize here is “quality.” First and foremost, organizations must prioritize how the quality of data, experiments and other insights will be maintained in this new way of working. This can be done by investing time and effort in technologies, such as automation and digitalization tools, that directly support the quality of the operation’s end result. For organizations looking to embrace the realities of Industry 4.0, here are some key considerations for keeping quality at the core of the process, through automation, digitalization and analysis.
Automating the workflow
When automating a scientist’s or researcher’s workflow, two manufacturing models commonly come into play: distributed manufacturing and continuous manufacturing.
In the laboratory setting, distributed manufacturing relies on a network of physically dispersed facilities coordinated through information technology. As the manufacturing process moves from raw material procurement to finished product distribution, organizations operating under this model engage external manufacturers to conduct specific unit operations.
In the case of synthetic products, such as small molecule drugs, each synthesis operation may be conducted by a separate contract manufacturing organization, which leaves room for error and data discrepancies. Because of this disconnected nature of distributed operations, there is a trend toward continuous manufacturing: instead of running large-scale batch operations, scientists and researchers along the supply chain run ongoing operations that fill requirements in real time. Continuous manufacturing also supports rapid technology transfer, since implementing and validating a manufacturing process at a new site requires a significantly smaller equipment commissioning effort.
It currently takes at least 10 years for a new medicine to complete the journey from initial discovery to the marketplace, and the average cost to research and develop each successful drug is estimated at $2.6 billion. That is exactly why the approach to continuous manufacturing is centered on automation. The technology leveraged within an automated ecosystem can ensure high-quality operations, reduce associated costs, and shorten development times. In addition, hundreds of thousands of data points are collected on an ongoing basis using this method: every molecule, every compound, and every insight is accessible to the scientist conducting the research. To make sense of this vast amount of information, the data must be in a digital form so it can be easily read and consumed by machines.
Digitalizing the data
Human brains can only reason about so many dimensions at a time, whereas a machine can consider a vastly larger number of dimensions and parameters when looking for correlation patterns.
For a typical drug program, getting to a single lead candidate can take three to five years and may involve the synthesis and analysis of 2,000 to 3,000 molecules. With artificial intelligence (AI) and machine learning (ML), scientists have been able to arrive at a lead candidate from a pool of just 400 compounds.
This reduction in the number of compounds that a scientist has to manually assemble and analyze in the laboratory has also reduced the number of experiments that need to be run. Characterization data can also be paired with process information, enabling correlations a human wouldn’t normally find.
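To make this concrete, below is a minimal sketch, in Python, of pairing characterization results with process parameters and scanning for correlations; the column names and values are hypothetical placeholders rather than any particular vendor schema.

```python
# A minimal sketch of pairing characterization results with process
# parameters and scanning for correlations. All columns and values
# are illustrative placeholders.
import pandas as pd

# Process parameters recorded per batch (illustrative values)
process = pd.DataFrame({
    "batch_id": ["B01", "B02", "B03", "B04", "B05"],
    "reaction_temp_C": [60, 65, 70, 75, 80],
    "residence_time_min": [12, 10, 9, 8, 7],
})

# Characterization results for the same batches (illustrative values)
characterization = pd.DataFrame({
    "batch_id": ["B01", "B02", "B03", "B04", "B05"],
    "purity_pct": [97.1, 97.8, 98.2, 98.9, 99.3],
    "impurity_x_pct": [0.9, 0.7, 0.6, 0.4, 0.3],
})

# Join on batch identity, then compute pairwise correlations between
# process parameters and measured quality attributes.
paired = process.merge(characterization, on="batch_id")
correlations = paired.drop(columns="batch_id").corr()
print(correlations["purity_pct"].sort_values(ascending=False))
```

Even this toy example shows the idea: once both data types share a common key, every process parameter can be screened against every quality attribute automatically.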
Ultimately, the goal of digitalizing data is to make the data consumable so that trial-and-error efforts in the R&D process can be eliminated or drastically reduced, allowing hardware and software to work in tandem to gather the insights needed.
To understand the quality that can be achieved when enabling hardware and software to work in tandem, consider a high-level design of an integrated system.
- First, sensors, probes, and ex-situ sampling units are connected to flow cells (for continuous processes) or batch vessels (for batch processes);
- Then, the manufacturing process parameters are carefully monitored through these instruments during process execution;
- Finally, the data generated from these sensors, probes, and samples are fed to a machine that determines whether operating parameters need to be adjusted in real time, as sketched below.
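As an illustration of that feedback loop, here is a minimal sketch in Python; the sensor read-out, control limits and adjustment rule are hypothetical placeholders, not any specific control system’s API.

```python
# A minimal sketch of the monitor -> evaluate -> adjust loop described above.
# The sensor function, control limits and adjustment rule are hypothetical.
import random
import time

TEMP_LOW_C, TEMP_HIGH_C = 58.0, 62.0   # illustrative control limits

def read_reactor_temperature() -> float:
    """Stand-in for a real sensor/probe read-out on a flow cell or batch vessel."""
    return random.uniform(56.0, 64.0)

def adjust_setpoint(current_temp: float) -> str:
    """Decide in (near) real time whether an operating parameter needs adjusting."""
    if current_temp < TEMP_LOW_C:
        return "increase heating"
    if current_temp > TEMP_HIGH_C:
        return "reduce heating"
    return "hold"

for _ in range(5):                      # a handful of cycles for demonstration
    temp = read_reactor_temperature()
    action = adjust_setpoint(temp)
    print(f"temperature={temp:.1f} C -> action: {action}")
    time.sleep(0.1)                     # in practice, the probe's sampling interval
```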
The initial step for data intended for use in AI and ML is normalization and standardization. The data must be consistent for direct comparison, which is more of a challenge in some areas than in others given how complex scientific data can be. While chemical structure information can be normalized relatively easily by selecting a particular format (e.g., SMILES, InChI or *.mol), analytical data sits at the other end of the spectrum: it is generated by a variety of vendors, across different techniques, and for different purposes. All of this makes the data disparate and challenging to normalize, so organizations should standardize and normalize any data intended for use in data science projects.
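As a small illustration of structure normalization, the sketch below converts differently written representations of the same molecule to one canonical form, assuming the open-source RDKit toolkit is available; the input strings are purely illustrative.

```python
# A minimal sketch of normalizing chemical structure representations to a
# single canonical form, assuming the open-source RDKit toolkit is installed.
from rdkit import Chem

# Two different-looking SMILES strings for the same molecule (toluene)
raw_inputs = ["Cc1ccccc1", "c1ccccc1C"]

canonical = set()
for smiles in raw_inputs:
    mol = Chem.MolFromSmiles(smiles)       # parse the incoming representation
    if mol is None:
        continue                           # skip records that fail to parse
    canonical.add(Chem.MolToSmiles(mol))   # emit RDKit's canonical SMILES

print(canonical)   # a single canonical string, allowing direct comparison
```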
When planning for the laboratory of the future, laboratory execution systems, experimental data capture systems, and decision support systems must effectively capture the “digital twin” of instrumental methods of analysis. These digitalized laboratories must not only allow on-demand access to analytical data but also provide data provenance, so that sample genealogy is preserved within datasets.
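One way to picture preserving sample genealogy is a record structure that carries lineage alongside results; the sketch below is a minimal illustration, and all field names are hypothetical.

```python
# A minimal sketch of keeping provenance (which sample came from which batch,
# prepared by whom, measured on which instrument) attached to each dataset.
# Field names are hypothetical placeholders.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SampleRecord:
    sample_id: str
    parent_batch_id: str                    # genealogy: the batch this sample came from
    parent_sample_id: Optional[str] = None  # genealogy: a preceding sample, if any
    prepared_by: str = ""
    instrument_id: str = ""
    method_id: str = ""                     # pointer to the digital method description
    results: dict = field(default_factory=dict)

# A dilution prepared from a stock sample keeps a pointer to its parent,
# so the lineage of any result can be traced back through the dataset.
stock = SampleRecord("S-001", parent_batch_id="B-042", prepared_by="analyst_1",
                     instrument_id="HPLC-07", method_id="assay_v3",
                     results={"purity_pct": 98.7})
dilution = SampleRecord("S-001a", parent_batch_id="B-042", parent_sample_id="S-001",
                        prepared_by="analyst_1", instrument_id="HPLC-07",
                        method_id="assay_v3", results={"purity_pct": 98.6})
print(dilution.parent_sample_id, "->", stock.sample_id)
```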
Additionally, laboratories should aim for data interoperability, so that data formats do not require specific software to consume and visualize. They should also be able to process data on demand, so that formatting and automated processing of large datasets is convenient, fast and reproducible. Laboratories that can associate digital representations of analysis, both automated and human-initiated, with the underlying measurements ensure that interpretations of results are stored alongside the experimental data.
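As a rough illustration of interoperability, the sketch below writes results and their interpretation to a vendor-neutral, self-describing format (plain JSON here); the structure and field names are hypothetical and do not represent any published standard.

```python
# A minimal sketch of exporting analytical results plus their interpretation
# to a vendor-neutral format so no specific software is needed to read them.
# All field names and values are illustrative placeholders.
import json

record = {
    "sample_id": "S-001",
    "technique": "HPLC-UV",
    "data": {
        "retention_time_min": [1.2, 3.4, 5.9],
        "peak_area": [10234, 874512, 4120],
    },
    "interpretation": {
        "assigned_peaks": {"3.4": "API", "5.9": "impurity X"},
        "analyst_comment": "Main peak within expected retention window.",
        "automated": False,
    },
}

with open("S-001_hplc.json", "w", encoding="utf-8") as fh:
    json.dump(record, fh, indent=2)   # any downstream tool can reprocess this on demand
```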
Software can be selected for such an endeavor that assures data integrity, enhances regulatory compliance and drives innovation. With the automation and digitalization of data in place, the analysis of that data can begin.
Analyzing the data
Although data may be digital, relevant datasets are often disassociated from each other or stored and managed in disparate digital silos, accessible only through domain-specific applications. AI and ML are a key part of assembling a full-picture view of the data and associating experimental results with chemical provenance.
If scientists have a digital representation of data, as opposed to a document summary, they can compare batch information to the digital specification. By analyzing the datasets exposed by applications, scientists can investigate why a certain correlation is occurring.
For clinical batches, having data accessible in one place is a key requirement for analysis by scientists. The analysis stage is also how impurities are found and drug recalls avoided. For instance, observing compositional variance before any “out-of-specification” results appear helps identify problems before manufacturing batches fall outside release specifications. This is achieved by having the digital representations of batch information, process information, and characterization results all in one digital dataset. The automation and digitalization stages also help optimize processes and give scientists the information they need to conclude an experiment with high-performing workflows.
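As a simple illustration, the sketch below compares batch results against a digital specification and flags drift before anything goes out of specification; the limits, alert threshold and values are hypothetical.

```python
# A minimal sketch of checking batch characterization results against a
# digital specification and flagging drift before an out-of-specification
# result occurs. Limits and values are illustrative placeholders.
specification = {"impurity_x_pct": {"max": 0.50, "alert": 0.40}}

batches = [
    {"batch_id": "B-101", "impurity_x_pct": 0.22},
    {"batch_id": "B-102", "impurity_x_pct": 0.31},
    {"batch_id": "B-103", "impurity_x_pct": 0.43},   # trending up, still in spec
]

for batch in batches:
    value = batch["impurity_x_pct"]
    limits = specification["impurity_x_pct"]
    if value > limits["max"]:
        status = "OUT OF SPECIFICATION"
    elif value > limits["alert"]:
        status = "alert: approaching specification limit"
    else:
        status = "within specification"
    print(f'{batch["batch_id"]}: impurity_x={value:.2f}% -> {status}')
```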
Digitalization is a major way Industry 4.0 is being realized by companies across industries. By connecting different pieces of hardware with software, and coupling that with data, machines can draw conclusions that advance the experiment, in real time. These advancements are also making the laboratory of the future a more collaborative and interactive experience for scientists: AI and ML guide experimental design, improving efficiency and drawing meaningful insights from vast amounts of data. Scientists are accelerating innovation in R&D through digitalization efforts that streamline workflows and demonstrate data integrity, and they are reducing the risk and expense of developing new chemicals with faster, cheaper, and more reliable structure characterization.
Industry 4.0 is no longer just an idea. A wealth of tools and resources is available for organizations to realize the efficiency and productivity this ecosystem offers.