Difference between revisions of "Journal:NIMS-OS: An automation software to implement a closed loop between artificial intelligence and robotic experiments in materials science"

From LIMSWiki
(Saving and adding more.)
(Saving and adding more.)
Line 80: Line 80:
The experimental condition is expressed as a real-valued vector <math>\mathbf{x}_{i} \in \mathbb{R}^{d}</math>. This condition is prepared with information such as the compositions and structures of materials and the processes required to synthesize them. If the number of candidates for the experimental conditions is ''N'', the dataset for candidates is defined as <math>D = \{\mathbf{x}_{i}\}_{i = 1,\ldots,N}</math>. The initial candidates file is created from this dataset ''D''. An example of a candidates file with ''l'' objective functions is presented in Figure 3. All the candidates of ''D'' are written in the first ''d'' columns, and this part must contain no empty cells. The next ''l'' columns are used for the objective function values. At the initial stage, all cells in this part are empty because no experiments have yet been performed for any of the experimental conditions.


[[File:Fig3 Tamura SciTechAdvMatMeth2023 3-1.jpeg|900px]]
{{clear}}
{|
| style="vertical-align:top;" |
{| border="0" cellpadding="5" cellspacing="0" width="900px"
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;" |<blockquote>'''Figure 3.''' (Top panels) Examples of the candidates file at the initial stage and after some experiments, shown for the case ''N'' = 9. (Bottom panels) Examples of the list of descriptors for different types of search space. If a continuous parameter space is considered, <math>D = \{\mathbf{x}_{i}\}_{i = 1,\ldots,N}</math> consists of the discretized parameters. When the search space is a combination of materials, <math>D = \{\mathbf{x}_{i}\}_{i = 1,\ldots,N}</math> consists of bit strings in which a material that is used is represented by 1 and a material that is not used is represented by 0. Furthermore, materials descriptors obtained from compositions with tools such as magpie [36–37] and molecular fingerprints obtained with tools such as RDKit [38] can be used as <math>D = \{\mathbf{x}_{i}\}_{i = 1,\ldots,N}</math>.</blockquote>
|-
|}
|}
In NIMS-OS, some promising conditions are selected from among those listed in the candidates file using AI models (available algorithms are described in the next section). When the values of objective functions are obtained by performing experiments, the objective functions in the candidates file are updated accordingly. That is, when the experiments are completed for ''M'' experimental conditions, results are entered in the ''l'' objective function columns for only those ''M'' conditions. Thus, at the next step, the experimental conditions are selected from among the remaining <math>N - M</math> candidates.
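The layout described above can be sketched in a few lines of Python. This is an illustration only, not code from NIMS-OS: the helper names <tt>write_candidates</tt> and <tt>untested_rows</tt>, the header row, and the two example process parameters are assumptions. The grid of parameter values plays the role of a discretized continuous search space.

```python
import csv
import itertools


def write_candidates(path, descriptor_names, grids, num_objectives):
    """Write one row per grid point: d descriptor columns, then l empty objective columns."""
    header = list(descriptor_names) + [f"objective{j + 1}" for j in range(num_objectives)]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)  # assumed header row
        for point in itertools.product(*grids):
            writer.writerow(list(point) + [""] * num_objectives)


def untested_rows(path, num_objectives):
    """Return the indices of rows whose objective columns are all still empty."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))[1:]  # skip the header row
    return [i for i, row in enumerate(rows)
            if all(cell == "" for cell in row[-num_objectives:])]
```

For example, two parameters with three grid values each give ''N'' = 9 candidates, all of which are untested at the initial stage; after ''M'' rows are filled in, <tt>untested_rows</tt> returns the remaining ''N'' − ''M'' indices.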
==Modules in NIMS-OS==
In this section, we introduce the modules included in NIMS-OS for AI algorithms and robotic systems. In the present work, we prepared four and two types of modules as AI algorithms and robotic systems, respectively.
===AI algorithms===
To select promising experimental conditions, three types of AI algorithms are implemented as standard in NIMS-OS. In addition, random exploration can be selected. Each algorithm is briefly explained in this subsection. In the future, more algorithms will be made available.
====Bayesian optimization: PHYSBO====
Bayesian optimization (BO) is an optimization technique using [[machine learning]] (ML) prediction. In this method, by using Gaussian process regression, the value of an objective function is predicted when the experimental conditions are input. The next promising experimental conditions are then selected based on the prediction values. Here, because the Gaussian process can evaluate not only the mean value of the prediction but also its variance, an acquisition function defined by mean and variance can be used to make the selection. In NIMS-OS, BO can be performed using the Python package PHYSBO. [32] PHYSBO supports single- and multi-objective optimizations and can calculate multiple proposals. Note that the number of objective functions is recommended to be no more than three, because the computational time becomes excessively large for higher values. In NIMS-OS, Thompson sampling is used to define the acquisition function for rapid calculation. The key point in using PHYSBO is that the exploration always maximizes the objective functions. Thus, when a material with smaller property values is sought, the sign of the objective function values must be inverted (that is, the values are multiplied by -1).
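Because the search maximizes every objective, a property that should be small can be handled by flipping its sign before the value is stored. A minimal sketch follows; the helper <tt>to_maximization</tt> and its flag list are hypothetical and are not part of PHYSBO or NIMS-OS.

```python
def to_maximization(values, minimize_flags):
    """Flip the sign of each objective that should be minimized.

    values:         measured objective values for one experimental condition
    minimize_flags: one boolean per objective, True if smaller is better
    """
    return [-v if minimize else v for v, minimize in zip(values, minimize_flags)]
```

For instance, if the first objective should be large and the second small, <tt>to_maximization([1.5, 0.2], [False, True])</tt> returns <tt>[1.5, -0.2]</tt>, and these are the values that would be written to the candidates file.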
====Boundless objective-free exploration: BLOX====
BLOX is a Python package that performs boundless objective-free exploration. It is based on an algorithm designed to select the next experimental conditions so as to achieve uniform sampling in the space of the objective functions. For materials science, curious materials can be found using BLOX. Specifically, BLOX trains ML models to predict objective functions from experimental conditions. Experimental conditions that realize uniform sampling in the space of objective functions are found based on the Stein discrepancy evaluated using the prediction results. In NIMS-OS, a modified version of the BLOX algorithm that can propose multiple candidates is implemented. To select multiple candidates, after the experimental condition with the largest Stein discrepancy is selected, the next condition is selected while the predicted values of the already selected condition are treated as if they were measured values. This procedure is iterated to obtain multiple proposals. In NIMS-OS, random forest regression is used as a prediction model. Although BLOX can handle any number of objective functions, it is recommended that the number of objective functions be limited to three or four, because exploration in more dimensions requires more time. BLOX has been used to search chemical spaces [33] and to explore superhard materials. [39]
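The batch procedure described above (select one condition, treat its prediction as if it had been measured, then select again) can be sketched as follows. This is not the BLOX implementation: the Stein discrepancy is replaced here by a much simpler stand-in score, the distance from a predicted point to the nearest known point in objective space, and the function names are hypothetical. Only the greedy loop structure is meant to mirror the text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two objective vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def propose_batch(predicted, observed, num_proposals):
    """Greedily pick candidates whose predicted objectives fill gaps in objective space.

    predicted: dict mapping candidate index -> predicted objective vector
    observed:  list of objective vectors that have already been measured
    """
    if not observed:
        raise ValueError("at least one measured objective vector is required")
    known = [tuple(o) for o in observed]
    remaining = dict(predicted)
    proposals = []
    while remaining and len(proposals) < num_proposals:
        # Stand-in novelty score (assumption): distance to the nearest known point.
        best = max(remaining,
                   key=lambda idx: min(euclidean(remaining[idx], k) for k in known))
        proposals.append(best)
        # Treat the prediction of the selected condition as if it were measured.
        known.append(tuple(remaining.pop(best)))
    return proposals
```

The effect of the "treat as measured" step is that a second candidate predicted to land right next to the first selection is no longer attractive, so the batch spreads out over the objective space.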
====Phase diagram construction: PDC====
PDC is a Python package that can create a detailed phase diagram with a small number of experiments. To investigate a phase diagram, PDC proposes promising experimental conditions for the next experiment by using active learning. Specifically, uncertainty sampling based on the label propagation method finds uncertain points in the phase diagram, and these uncertain points are proposed for the next experiments. PDC was developed to propose multiple experimental conditions for batch experiments. [40] In NIMS-OS, the least confident score is used as an uncertainty score to evaluate uncertain points. Note that, for PDC, the objective function is the phase name or an index of phases, and thus only a one-dimensional objective function can be specified in the candidates file. PDC has been used to create new phase diagrams for the growth conditions of thin film [41] and to determine large and small areas of creep phenomena in polymer materials. [42]
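As a small illustration of the least confident score, the sketch below ranks candidate points by one minus their largest phase probability. The phase probabilities are assumed to come from some estimator such as label propagation; the function names are hypothetical, and the way PDC actually assembles a batch may differ.

```python
def least_confident_scores(probabilities):
    """Least confident score per point: 1 minus the largest phase probability."""
    return [1.0 - max(p) for p in probabilities]


def most_uncertain(probabilities, num_proposals):
    """Return the indices of the points with the highest least confident scores."""
    scores = least_confident_scores(probabilities)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:num_proposals]
```

A point whose phases are predicted with probabilities 0.5 and 0.5 is therefore proposed before one predicted with 0.9 and 0.1, which matches the idea of sampling near phase boundaries.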
====Random exploration: RE====
In RE, the next candidate experimental condition is selected randomly. This approach can be used to generate initial data before executing AI algorithms when no experimental data have yet been recorded. Furthermore, it can be used to generate baseline data for comparison when new AI algorithms are developed.
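Random exploration amounts to drawing, without replacement, from the conditions that have no objective values yet. A minimal sketch, assuming the untested row indices are already known; the function name and the <tt>seed</tt> argument are illustrative and do not reflect the NIMS-OS interface.

```python
import random


def random_exploration(untested_indices, num_proposals, seed=None):
    """Pick up to num_proposals distinct untested conditions uniformly at random."""
    rng = random.Random(seed)  # a seed makes comparison runs reproducible
    pool = list(untested_indices)
    return rng.sample(pool, min(num_proposals, len(pool)))
```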
===Robotic experiments===
The module for robotic experiments comprises two Python scripts. The first script creates input files for robotic experiments according to the experimental conditions selected by the AI and commands a robot to begin the experiment. The second script analyzes the experimental results when the experiments are finished and updates the candidates file. At present, two types of modules are implemented in NIMS-OS: STAN and NAREE.
====Standard module for robotic experiments: STAN====
The STANdard module (STAN) is a virtual implementation of the procedure for conducting robotic experiments. Thus, NIMS-OS can be run virtually using this module, even without a robotic device. In this module, the following steps are executed:
# Create the input files for the robotic experiments in an appropriate folder according to the experimental conditions selected by the AI. In this standard module, we simply create a text file with a date as its name.
# Send a signal to the robotic system to begin the experiments. Depending on the machine, various cases can be considered, such as sending a start signal via serial communication. In this standard module, we assume that the experiments are begun by storing the inputend.txt file in the specified folder.
# Wait until the robotic experiments are completed. This step includes various operations, such as receiving signals from the robot when the experiment is finished. This standard module assumes that the robot outputs an outputend.txt file to indicate that the experiment is finished, and NIMS-OS continues waiting until this file appears.
# Read the files of experimental results and extract the values of objective functions. Here, the case of simply reading results.csv, which contains the objective function values, is implemented.
# Update the candidates file according to the values extracted in (4).
Steps (1) and (2) are performed by preparation_input.py, and analysis_output.py conducts steps (3)-(5). In practice, for use with actual robotic systems, new modules can be created according to this standard module.
Additionally, this module can also facilitate closed-loop materials exploration between AI and experiments for processes that are time-consuming and cannot be partially automated. The procedure is as follows: When the proposals.csv file is generated, NIMS-OS automatically enters a sleep mode until experimental results are obtained. Based on the information in proposals.csv, the corresponding manual experiments are conducted. Once the objective function values are obtained through the experiments, a results.csv file is created, containing the objective function values corresponding to each line in proposals.csv. The results.csv file, along with an empty file named outputend.txt, is stored in the specified folder where the experimental results are output. Subsequently, NIMS-OS restarts and generates a new proposals.csv file.
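The file-based handshake described above can be sketched as follows. The file names results.csv and outputend.txt come from the text; everything else is an assumption for illustration, including the function names, the polling interval, and the layout of results.csv (one row of objective values per proposal, no header), so the sample files shipped with NIMS-OS should be consulted for the exact format.

```python
import csv
import os
import time


def wait_for_file(path, poll_seconds=1.0, timeout=None):
    """Block until path exists. Return True, or False if timeout seconds pass first."""
    start = time.monotonic()
    while not os.path.exists(path):
        if timeout is not None and time.monotonic() - start > timeout:
            return False
        time.sleep(poll_seconds)
    return True


def submit_manual_results(output_folder, objective_rows):
    """Write results.csv, then the empty outputend.txt that signals completion."""
    os.makedirs(output_folder, exist_ok=True)
    with open(os.path.join(output_folder, "results.csv"), "w", newline="") as f:
        csv.writer(f).writerows(objective_rows)
    # Create the marker last so a waiting process never reads a half-written results file.
    open(os.path.join(output_folder, "outputend.txt"), "w").close()
```

In a manual workflow, the experimenter would call something like <tt>submit_manual_results</tt> after measuring the conditions listed in proposals.csv, while the waiting side polls for outputend.txt before reading the results.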
====NIMS automated robotic electrochemical experiments (NAREE) system: NAREE====
As a robotic system for materials science, the NIMS Automated Robotic Electrochemical Experiments (NAREE) system [13,35] can be used in NIMS-OS. NAREE comprises a liquid-handling dispenser, an electrochemical measurement unit, and a robotic arm. By using a microplate-based electrochemical cell equipped with electrodes, the performance of electrolytes prepared by mixing solutions with the liquid-handling dispenser is electrochemically evaluated in a high-throughput manner. This module was developed according to the procedures of the previously described STAN.
==Usage of the NIMS-OS Python version==
===Install===
NIMS-OS is written in the Python 3 programming language (version 3.6 or higher is required), and it can be installed via PyPI as follows:
:<tt>$ python3 -m pip install nimsos</tt>
If this installation is successful, the following packages are also installed or updated automatically:
* Cython
* matplotlib
* numpy
* physbo
* scikit-learn
* scipy
===Basic usage===
We show a small example program (Program 1) in which PHYSBO is performed. In this program, assuming no experimental results in the candidates file, random exploration is performed in the first cycle.
====Assignment of parameters and candidates file====
First, the parameters for closed-loop experiments are defined. For example, when the number of objective functions is two, the number of proposals for each cycle is two, and the number of cycles is three, we define this in the code as follows:
:<tt>ObjectivesNum = 2</tt><br />
:<tt>ProposalsNum = 2</tt><br />
:<tt>CyclesNum = 3</tt>
Next, we specify a .csv file containing the candidates of experimental conditions, which is prepared as described under "Preparation of candidates for experimental conditions":
:<tt>candidates_file = "./candidates.csv"</tt>
The name of the file that will contain the experimental conditions selected by the AI is as follows:
:<tt>proposals_file = "./proposals.csv"</tt>
We specify the folder name where the input files for the robotic experiments are stored and the folder name where the results from the experiments are output, respectively, as follows:
:<tt>input_folder = "./EXPInput"</tt><br />
:<tt>output_folder = "./EXPOutput"</tt>
====Execution of AI====
<tt>nimsos.selection</tt> is a class to select the next experimental conditions with the help of the AI. For example, <tt>nimsos.selection</tt> is used as follows:
:<tt>nimsos.selection(method = "PHYSBO",
:input_file = candidates_file,
:output_file = proposals_file,
:num_objectives = ObjectivesNum,
:num_proposals = ProposalsNum)</tt>
The <tt>method</tt> parameter of this class (Program 1) selects the module for AI algorithms. For <tt>method</tt>, "PHYSBO" (Bayesian optimization), "BLOX" (objective-free search), "PDC" (phase diagram construction), or "RE" (random exploration) can be specified. The experimental conditions are selected from the rows of <tt>input_file</tt> that do not yet have objective function values, and the selected conditions are output to <tt>output_file</tt>. For <tt>num_objectives</tt>, the number of objectives is input, and the number of proposals is specified as <tt>num_proposals</tt>. In general, many hyperparameters must be considered to use the AI, but they are determined automatically in NIMS-OS. Note that if there are no experimental results in the candidates file, only "RE" can be used. For "PHYSBO," "BLOX," and "PDC," some values of objective functions must already be stored in the candidates file.
[[File:Prog1 Tamura SciTechAdvMatMeth2023 3-1.jpeg|700px]]
{{clear}}
{|
| style="vertical-align:top;" |
{| border="0" cellpadding="5" cellspacing="0" width="700px"
|-
  | style="background-color:white; padding-left:10px; padding-right:10px;" |<blockquote>'''Program 1.''' Small example of NIMS-OS for Bayesian optimization.</blockquote>
|-
|}
|}
====Preparation of input files for robotic experiments and execution of experiments====
<tt>nimsos.preparation_input</tt> is a class to prepare the input files for robotic experiments and send the start message to the robot. For example, <tt>nimsos.preparation_input</tt> is used as follows.
:<tt>nimsos.preparation_input(machine = "STAN",
:input_file = proposals_file,
:input_folder = input_folder)</tt>
The <tt>machine</tt> parameter selects the module of robotic experiments. For <tt>machine</tt>, "STAN," which is the standard module for this procedure, or "NAREE" (NIMS automated robotic electrochemical experiments) can be used. For <tt>input_file</tt>, the file containing the experimental conditions selected by the AI is specified. In addition, <tt>input_folder</tt> specifies the folder on the computer where the input files for robotic experiments are stored. In the <tt>nimsos.preparation_input</tt> module, the two functions <tt>make_machine_file()</tt> and <tt>send_message_machine()</tt> should be modified depending on the robotic systems used. The former creates the input files for robotic experiments from selected experimental conditions, whereas the latter sends the message to begin the robotic experiments.
====Analysis of output files from experiments and update of candidates file====
<tt>nimsos.analysis_output</tt> is a class used to analyze the experimental results and update the candidates file. For example, <tt>nimsos.analysis_output</tt> is used as follows:
:<tt>nimsos.analysis_output(machine = "STAN",
:input_file = proposals_file,
:output_file = candidates_file,
:num_objectives = ObjectivesNum,
:output_folder = output_folder)</tt>
The parameter of <tt>machine</tt> is the same as that found in the <tt>nimsos.preparation_input</tt> module, which selects the module for robotic experiments. Here, "STAN" and "NAREE" can be selected. For <tt>input_file</tt>, the experimental conditions selected by the AI are specified, and <tt>output_file</tt> is the name of the candidates file. The file specified by <tt>output_file</tt> is updated by this module. In addition, for <tt>num_objectives</tt>, the number of objectives is input. For <tt>output_folder</tt>, the folder in the computer where the results from robotic experiments are output is specified. In the <tt>nimsos.analysis_output</tt> module, two functions <tt>extract_objectives()</tt> and <tt>recieve_exit_message()</tt> should be modified depending on the robot systems. The former extracts the values of objective functions from the output files of robotic experiments, and the latter receives the message when the robotic experiments are finished. If "NAREE" is selected, <tt>objectives_info</tt> should be specified as a dictionary indicating which objective function is extracted from the experimental results.
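Since Program 1 is available here only as an image, the following is a reconstruction of how the three calls quoted above might be combined into a loop, not a copy of the published program. The wrapper <tt>run_closed_loop</tt> and its <tt>api</tt> argument are assumptions introduced so that the sketch can be exercised without a robot; the keyword arguments are those shown in the text.

```python
def run_closed_loop(api, cycles, num_objectives, num_proposals,
                    candidates_file, proposals_file,
                    input_folder, output_folder, machine="STAN"):
    """Run select -> prepare and execute -> analyze for the given number of cycles.

    api is expected to expose selection, preparation_input, and analysis_output
    with the keyword arguments quoted in the text (for example, the nimsos package).
    """
    for cycle in range(cycles):
        # No objective values exist before the first experiments, so start with "RE".
        method = "RE" if cycle == 0 else "PHYSBO"
        api.selection(method=method,
                      input_file=candidates_file,
                      output_file=proposals_file,
                      num_objectives=num_objectives,
                      num_proposals=num_proposals)
        api.preparation_input(machine=machine,
                              input_file=proposals_file,
                              input_folder=input_folder)
        api.analysis_output(machine=machine,
                            input_file=proposals_file,
                            output_file=candidates_file,
                            num_objectives=num_objectives,
                            output_folder=output_folder)
```

With the package installed, the call would presumably look like <tt>run_closed_loop(nimsos, CyclesNum, ObjectivesNum, ProposalsNum, candidates_file, proposals_file, input_folder, output_folder)</tt> after <tt>import nimsos</tt>, using the parameters defined earlier in this section.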
===Visualization of the results===
Figures of the results are obtained by using <tt>nimsos.visualization</tt>. Before this module is used, a new folder named "fig" must be prepared in the same folder where the main script is stored; the figures are output to this folder. <tt>nimsos.visualization.plot_history</tt> and <tt>nimsos.visualization.plot_distribution.plot</tt> create figures for the history and distributions of objective functions, respectively. These modules are useful when using AI algorithms other than PDC. In contrast, <tt>nimsos.visualization.plot_phase_diagram.plot</tt> creates the predicted phase diagram when PDC is used as an AI algorithm.
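The "fig" folder mentioned above can be created from the main script itself before any plotting call is made. A minimal sketch using only the standard library; the helper name <tt>ensure_fig_folder</tt> is hypothetical.

```python
from pathlib import Path


def ensure_fig_folder(script_dir="."):
    """Create the "fig" folder next to the main script if it does not exist yet."""
    fig_dir = Path(script_dir) / "fig"
    fig_dir.mkdir(parents=True, exist_ok=True)
    return fig_dir
```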
==Usage of the NIMS-OS GUI version==





Revision as of 17:44, 18 September 2023

Full article title NIMS-OS: An automation software to implement a closed loop between artificial intelligence and robotic experiments in materials science
Journal Science and Technology of Advanced Materials: Methods
Author(s) Tamura, Ryo; Tsuda, Koji; Matsuda, Shoichi
Author affiliation(s) The University of Tokyo, National Institute for Materials Science
Primary contact Email: tamura dot ryo at nims dot go dot jp
Year published 2023
Volume and issue 3(1)
Article # 2232297
DOI 10.1080/27660400.2023.2232297
ISSN 2766-0400
Distribution license Creative Commons Attribution 4.0 International
Website https://www.tandfonline.com/doi/full/10.1080/27660400.2023.2232297
Download https://www.tandfonline.com/doi/pdf/10.1080/27660400.2023.2232297 (PDF)

Abstract

NIMS-OS (NIMS Orchestration System) is a Python library created to realize a closed loop of robotic experiments and artificial intelligence (AI) without human intervention for automated materials exploration. It uses various combinations of modules to operate autonomously. Each module acts as an AI for materials exploration or a controller for robotic experiments. As AI techniques, Optimization Tools for PHYSics Based on Bayesian Optimization (PHYSBO), BoundLess Objective-free eXploration (BLOX), phase diagram construction (PDC), and random exploration (RE) methods can be used. Moreover, a system called NIMS Automated Robotic Electrochemical Experiments (NAREE) is available as a set of robotic experimental equipment. Visualization tools for the results are also included, which allows users to check the optimization results in real time. Newly created modules for AI and robotic experiments can be added easily to extend the functionality of the system. In addition, we developed a graphical user interface (GUI)-driven application to control NIMS-OS. To demonstrate the operation of NIMS-OS, we consider an automated exploration for new electrolytes. NIMS-OS is available at https://github.com/nimsos-dev/nimsos.

Keywords: NIMS-OS, robotic experiments, artificial intelligence, electrochemistry, materials informatics

Introduction

The integration of robotic experiments and artificial intelligence (AI) is essential to realize automated materials exploration. If an AI system can take on some information tasks conventionally performed by human researchers, robotic systems can then execute the required physical tasks and experiments for materials exploration can proceed automatically. Such a platform may be expected to discover many novel materials and lead to substantial innovation in materials science. In recent years, significant progress has been made in the development of AI techniques and robotic devices suitable for materials exploration.

Since the launch of the Materials Genome Initiative [1], AI techniques have been actively used for materials exploration. [2–4] In general, materials exploration can be regarded as the problem of finding optimal materials within a materials search space. The elements to be used in the search space must be configured, along with its composition range, process parameter range, and so forth. To solve this problem, black-box optimization methods are useful [5], and various methods have been developed and applied to fit various needs. Bayesian optimization (BO) is among the most frequently used methods in materials science. [6–8] In this method, promising materials can be selected in the materials search space using the predictions of their properties and the uncertainty of these predictions evaluated by Gaussian process regression. Using BO, various real materials, such as Li-ion conductive materials [9], multilayered metamaterials [10], halide perovskite [11], superalloys [12], and electrolytes [13] have been explored. BO is also used for the automated analysis of materials [14–15]. In addition, many methods have been proposed for black-box optimization in materials exploration, such as genetic algorithms [16–17], Monte Carlo tree search [18], rare event sampling [19], and algorithms using an Ising machine. [20–22] In the future, many more innovative methods are expected to be developed.

Robotic experiments have progressed to realize laboratory automation of chemical analysis and high-throughput screening in the field of biology. [23–26] Various types of automated analyzers and pipetting devices have been developed, and robotic arms have been used as a transport system to connect these systems. Moreover, robotic technology has been used to explore novel materials, such as thin-film materials [27–28], battery electrolytes [13,29], and photocatalysts. [30] These studies used BO to automate the proposal of promising experimental conditions. This enables a closed loop of robotic experiments and AI that can perform automated materials exploration without human intervention. This approach involves some key advantages, such as the ability to generate materials data of uniform quality and the absence of human error. In contrast, at present, robotics systems are limited in their ability to perform complex material synthesis tasks that require the skills of experts. Thus, further innovation in robotic devices will be important.

In addition to AI and robotic technologies, the control systems and software used to interlink them are also an important element to realize a closed loop without human intervention. Generally, different AI algorithms should be used depending on the motivation of a materials exploration task. Furthermore, the procedure to control the devices depends on the nature and characteristics of the robotic systems used. Therefore, control software has thus far been developed on a case-by-case basis for different AI algorithms and robotic systems.

In this study, we developed NIMS-OS (NIMS Orchestration System) to realize a closed loop between AI models and robotic experiments, with the aim of establishing a generic control software system. Although this software was written in the Python programming language, we also developed a graphical user interface (GUI)-driven version to improve operability after installation. NIMS-OS treats each AI algorithm and each robotic system as separate modules (see Figure 1). This enables the implementation of a closed loop with any combination of these modules. If modules for new AI algorithms or robotic systems are prepared, new closed-loop systems can be easily controlled via NIMS-OS. One of the advantages of developing such generic control software is the establishment of technical standards for automated materials exploration. For AI algorithms, we determined standard formats for the input and output. Algorithms created according to the standard format can be immediately tested using any currently available robotic system. Specifically, we developed a standard format in which all the experimental conditions to be explored are listed in advance, and the appropriate experimental conditions that have not yet been tested are selected from the list by AI algorithms. The advantage of this approach is that it enables automated materials exploration utilizing materials databases. When utilizing materials databases, the compositional and structural information needs to be converted into materials descriptors, which serve as the materials search space. However, this search space generated from the materials databases cannot be solely defined by a continuous parameter space, and it requires a selection from the pre-listed descriptors. Of course, optimization of continuous parameters can still be handled approximately by preparing a list of grid points that discretize the continuous parameters. 
Furthermore, we expect this work to contribute to the development of new AI algorithms for automated materials exploration. For robotic systems, we expect modules developed based on NIMS-OS to increase the commonality of operational procedures, leading to cost reductions as new robotic experimental devices are introduced. Note that ChemOS [31] is similar to NIMS-OS; it was developed as an automation system in the field of chemistry. ChemOS specializes in BO within a defined continuous or discretized parameter space and includes several default modules for various BO methods. On the other hand, NIMS-OS offers the capability to perform automated materials explorations not only within a defined parameter space but also utilizing materials databases. In addition to BO, NIMS-OS also incorporates several default implementations of black-box optimization methods to deal with different motivations in materials explorations.

Fig1 Tamura SciTechAdvMatMeth2023 3-1.jpeg

Figure 1. Image of the combinations of AI algorithms and robotic systems via NIMS-OS.

Let us briefly introduce the specifications of NIMS-OS. First, a candidates file listing experimental conditions as a materials search space should be prepared in advance. A closed loop is formed according to the following three steps (see Figure 2):

  • Step 1: Select promising experimental conditions from the candidates file using an AI model.
  • Step 2: Create an input file for the robotic experiments and execute the experiments.
  • Step 3: Analyze the output from the experiments and update the candidates file based on the experimental results.


Fig2 Tamura SciTechAdvMatMeth2023 3-1.jpeg

Figure 2. Procedures in NIMS-OS and the role of each Python script.

Currently, the following AI algorithms are available as modules for Step 1: (i) Optimization Tools for PHYSics Based on Bayesian Optimization (PHYSBO) [32], (ii) BoundLess Objective-free eXploration (BLOX) [33], and (iii) phase diagram construction (PDC) [34]. In addition, random exploration (RE) can be selected. The appropriate algorithm is chosen according to the purpose of the materials exploration. For Steps 2 and 3, a STANdard module (STAN) is provided for robotic experiments, which enables operation checks even without devices, along with a module for NIMS Automated Robotic Electrochemical Experiments (NAREE) [13,35]. We plan to continue developing additional modules for this system.

The remainder of this study is organized as follows. The next section describes the preparation of a candidates file storing experimental conditions, followed by an introduction to the available modules for the AI and robotic experiments in NIMS-OS. Then, the usage of the Python code and of the GUI version is explained. As a demonstration, the results of an autonomous electrolyte exploration via a closed-loop approach using PHYSBO and NAREE in NIMS-OS are described. Finally, this work concludes with some discussion and suggests some important avenues for further research.

Preparation of candidates for experimental conditions

A major feature of NIMS-OS is that a data file listing candidate experimental conditions is prepared in advance (we refer to this data file as a candidates file). In general, because there are many candidates, conducting experiments in all possible conditions is impractical. Thus, automated materials exploration proceeds by selecting promising experimental conditions from these listed candidates. This makes the closed-loop strategy more generalizable. That is, a variety of exploration motivations and robotic systems can be handled by NIMS-OS.

The experimental condition is expressed as a real-valued vector <math>\mathbf{x}_{i} \in \mathbb{R}^{d}</math>. This condition is prepared with information such as the compositions and structures of materials and the processes required to synthesize them. If the number of candidates for the experimental conditions is N, the dataset for candidates is defined as <math>D = \{\mathbf{x}_{i}\}_{i = 1,\ldots,N}</math>. The initial candidates file is created from this dataset D. An example of a candidates file with l objective functions is presented in Figure 3. All the candidates of D are written in the first d columns, and this part must contain no empty cells. The next l columns are used for the objective function values. At the initial stage, all cells in this part are empty because no experiments have yet been performed for any of the experimental conditions.


Fig3 Tamura SciTechAdvMatMeth2023 3-1.jpeg

Figure 3. (Top panels) Examples of the candidates file at the initial stage and after some experiments, shown for the case N = 9. (Bottom panels) Examples of the list of descriptors for different types of search space. If a continuous parameter space is considered, <math>D = \{\mathbf{x}_{i}\}_{i = 1,\ldots,N}</math> consists of the discretized parameters. When the search space is a combination of materials, <math>D = \{\mathbf{x}_{i}\}_{i = 1,\ldots,N}</math> consists of bit strings in which a material that is used is represented by 1 and a material that is not used is represented by 0. Furthermore, materials descriptors obtained from compositions with tools such as magpie [36–37] and molecular fingerprints obtained with tools such as RDKit [38] can be used as <math>D = \{\mathbf{x}_{i}\}_{i = 1,\ldots,N}</math>.

In NIMS-OS, some promising conditions are selected from among those listed in the candidates file using AI models (available algorithms are described in the next section). When the values of objective functions are obtained by performing experiments, the objective functions in the candidates file are updated accordingly. That is, when the experiments are completed for M experimental conditions, results are entered in the l objective function columns for only those M conditions. Thus, at the next step, the experimental conditions are selected from among the remaining <math>N - M</math> candidates.

Modules in NIMS-OS

In this section, we introduce the modules included in NIMS-OS for AI algorithms and robotic systems. In the present work, we prepared four and two types of modules as AI algorithms and robotic systems, respectively.

AI algorithms

To select promising experimental conditions, three types of AI algorithms are implemented as standard in NIMS-OS. In addition, random exploration can be selected. Each algorithm is briefly explained in this subsection. In the future, more algorithms will be made available.

Bayesian optimization: PHYSBO

Bayesian optimization (BO) is an optimization technique using machine learning (ML) prediction. In this method, Gaussian process regression predicts the value of an objective function for given experimental conditions. The next promising experimental conditions are then selected based on the predicted values. Because the Gaussian process can evaluate not only the mean value of the prediction but also its variance, an acquisition function defined by mean and variance can be used to make the selection. In NIMS-OS, BO can be performed using the Python package PHYSBO. [32] PHYSBO supports single- and multi-objective optimization and can calculate multiple proposals per cycle. Note that the number of objective functions is recommended to be no more than three, because the computational time becomes excessively large with more objectives. In NIMS-OS, Thompson sampling is used to define the acquisition function for rapid calculation. The key point in using PHYSBO is that the exploration is performed to maximize the objective functions. Thus, if a material with a smaller property value is sought, the sign of that objective function must be reversed (that is, its values are multiplied by −1) before they are stored.
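Because the search maximizes every objective, a property that should be minimized has to be stored with its sign reversed. The helper below is a small sketch of that conversion; the function name and the `minimize` flags are hypothetical and not part of NIMS-OS or PHYSBO.

```python
def to_maximization(values, minimize):
    """Flip the sign of each objective that should be minimized.

    values:   measured objective values for one experiment.
    minimize: one boolean per objective, True if smaller is better.
    """
    return [-v if m else v for v, m in zip(values, minimize)]
```

For example, if the first objective should be maximized and the second minimized, the second value is negated before it is written to the candidates file.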

Boundless objective-free exploration: BLOX

BLOX is a Python package that performs boundless objective-free exploration. It is based on an algorithm that selects the next experimental conditions so as to sample the space of the objective functions uniformly. For materials science, curious materials can be found using BLOX. Specifically, BLOX trains ML models to predict objective functions from experimental conditions. Experimental conditions that realize uniform sampling in the space of objective functions are found based on the Stein discrepancy evaluated using the prediction results. In NIMS-OS, a modified version of the BLOX algorithm that can propose multiple candidates is implemented. To select multiple candidates, after the experimental condition with the largest Stein discrepancy is selected, another condition is selected while the predicted values of the already selected condition are treated as if they were measured values. This procedure is iterated to obtain multiple proposals. In NIMS-OS, random forest regression is used as the prediction model. Although BLOX can handle any number of objective functions, it is recommended that the number of objective functions be limited to three or four, because exploration in more dimensions requires more time. BLOX has been used to search chemical spaces [33] and to explore superhard materials. [39]
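The iterative selection of multiple candidates can be sketched as a greedy loop. Note that this sketch does not compute the Stein discrepancy: as a simple stand-in, it scores each candidate by the distance from its predicted objective values to the nearest already observed point. The predictions are passed in directly rather than produced by a random forest, and all names are hypothetical.

```python
import math


def novelty(point, observed):
    """Stand-in score: distance to the nearest observed objective point."""
    return min(math.dist(point, obs) for obs in observed)


def greedy_batch(predictions, observed, num_proposals):
    """Pick several candidates one at a time.

    predictions: dict mapping candidate index -> predicted objective tuple.
    observed:    non-empty list of measured objective tuples.
    """
    observed = list(observed)      # work on a copy
    remaining = dict(predictions)
    chosen = []
    for _ in range(min(num_proposals, len(remaining))):
        best = max(remaining, key=lambda i: novelty(remaining[i], observed))
        chosen.append(best)
        # Treat the prediction as if it had been measured, then repeat.
        observed.append(remaining.pop(best))
    return chosen
```

Adding each selected prediction to the observed set is what keeps the second proposal from landing right next to the first one.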

Phase diagram construction: PDC

PDC is a Python package that can create a detailed phase diagram with a small number of experiments. To investigate a phase diagram, PDC proposes promising experimental conditions for the next experiment by using active learning. Specifically, uncertainty sampling based on the label propagation method finds uncertain points in the phase diagram, and these uncertain points are proposed for the next experiments. PDC was developed to propose multiple experimental conditions for batch experiments. [40] In NIMS-OS, the least confident score is used as an uncertainty score to evaluate uncertain points. Note that, for PDC, the objective function is the phase name or an index of phases, and thus only a one-dimensional objective function can be specified in the candidates file. PDC has been used to create new phase diagrams for the growth conditions of thin film [41] and to determine large and small areas of creep phenomena in polymer materials. [42]
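The least confident score itself is simple to state: for each untested point, it is one minus the largest predicted phase probability. The sketch below assumes the per-phase probabilities are already available (in PDC they would come from label propagation, which is not reproduced here); the function names are hypothetical.

```python
def least_confident_scores(probabilities):
    """Return 1 - max(class probability) for each untested point.

    probabilities: dict mapping candidate index -> list of phase probabilities.
    """
    return {i: 1.0 - max(p) for i, p in probabilities.items()}


def propose_uncertain(probabilities, num_proposals):
    """Return the indices of the most uncertain points, most uncertain first."""
    scores = least_confident_scores(probabilities)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:num_proposals]
```

Points deep inside a phase region have one dominant probability and a score near zero, so the proposals tend to concentrate near predicted phase boundaries.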

Random exploration: RE

In RE, the next candidate experimental conditions are selected randomly. This approach can be used to generate initial data before executing AI algorithms when no experimental data have yet been recorded. It can also be used to generate baseline data for comparison when new AI algorithms are developed.
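Random exploration amounts to drawing conditions without replacement from the candidates that have not yet been tested. A minimal sketch, with a hypothetical function name and an optional seed for reproducibility:

```python
import random


def random_exploration(untested_indices, num_proposals, seed=None):
    """Draw distinct candidate indices uniformly at random."""
    rng = random.Random(seed)
    pool = list(untested_indices)
    k = min(num_proposals, len(pool))  # never ask for more than exist
    return sorted(rng.sample(pool, k))
```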

Robotic experiments

The module for robotic experiments comprises two Python scripts. The first script creates input files for robotic experiments according to the experimental conditions selected by the AI and commands a robot to begin the experiment. The second script analyzes the experimental results when the experiments are finished and updates the candidates file. At present, two types of modules are implemented in NIMS-OS: STAN and NAREE.

Standard module for robotic experiments: STAN

The STANdard module (STAN) is a virtual implementation of the procedure for conducting robotic experiments. Thus, NIMS-OS can be run virtually using this module, even without a robotic device. In this module, the following steps are executed:

  1. Create the input files for the robotic experiments in an appropriate folder according to the experimental conditions selected by the AI. In this standard module, we simply create a text file with a date as its name.
  2. Send a signal to the robotic system to begin the experiments. Depending on the machine, various cases can be considered, such as sending a start signal via serial communication. In this standard module, we assume that the experiments are begun by storing the inputend.txt file in the specified folder.
  3. Wait until the robotic experiments are completed. This step includes various operations, such as receiving signals from the robot when the experiment is finished. This standard module assumes that the robot outputs outputend.txt file to indicate that the experiment is finished, and NIMS-OS continues waiting until this file appears.
  4. Read the files of experimental results and extract the values of objective functions. Here, the case of simply reading results.csv, which contains the objective function values, is implemented.
  5. Update the candidates file according to the values extracted in (4).

Steps (1) and (2) are performed by preparation_input.py, and analysis_output.py conducts steps (3)-(5). In practice, for use with actual robotic systems, new modules can be created according to this standard module.
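The file-based handshake in steps (1) to (4) can be sketched with the standard library alone. This is not the NIMS-OS implementation: the function names, the timestamp format of the input file name, and the layout of results.csv (one row of objective values per proposal, no header) are assumptions. Step (5), updating the candidates file, is omitted.

```python
import csv
import os
import time
from datetime import datetime


def prepare_input(proposals, input_folder):
    """Steps 1-2: write the selected conditions, then drop the start flag."""
    os.makedirs(input_folder, exist_ok=True)
    # A text file named after the current date and time (format assumed).
    name = datetime.now().strftime("%Y%m%d%H%M%S") + ".txt"
    with open(os.path.join(input_folder, name), "w") as f:
        for row in proposals:
            f.write(",".join(str(v) for v in row) + "\n")
    # An empty inputend.txt signals that the experiments may begin.
    open(os.path.join(input_folder, "inputend.txt"), "w").close()


def wait_and_read(output_folder, poll_seconds=1.0, timeout=None):
    """Steps 3-4: wait for outputend.txt, then read results.csv."""
    flag = os.path.join(output_folder, "outputend.txt")
    start = time.monotonic()
    while not os.path.exists(flag):
        if timeout is not None and time.monotonic() - start > timeout:
            raise TimeoutError("outputend.txt did not appear")
        time.sleep(poll_seconds)
    with open(os.path.join(output_folder, "results.csv"), newline="") as f:
        return [[float(v) for v in row] for row in csv.reader(f) if row]
```

For a real instrument, the flag files would typically be replaced by whatever start and finish signals the machine supports, such as serial communication.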

Additionally, this module can also facilitate closed-loop materials exploration between AI and experiments for processes that are time-consuming and cannot be partially automated. The procedure is as follows: When the proposals.csv file is generated, NIMS-OS automatically enters a sleep mode until experimental results are obtained. Based on the information in proposals.csv, the corresponding manual experiments are conducted. Once the objective function values are obtained through the experiments, a results.csv file is created, containing the objective function values corresponding to each line in proposals.csv. The results.csv file, along with an empty file named outputend.txt, is stored in the specified folder where the experimental results are output. Subsequently, NIMS-OS restarts and generates a new proposals.csv file.

NIMS automated robotic electrochemical experiments (NAREE) system: NAREE

As a robotic system for materials science, the NIMS Automated Robotic Electrochemical Experiments (NAREE) system [13,35] can be used in NIMS-OS. NAREE comprises a liquid-handling dispenser, an electrochemical measurement unit, and a robotic arm. By using a microplate-based electrochemical cell equipped with electrodes, the performance of electrolytes prepared by mixing solution by a liquid handling dispenser is electrochemically evaluated in a high-throughput manner. This module was developed according to the procedures of the previously described STAN.

Usage of the NIMS-OS Python version

Install

NIMS-OS is written in the Python 3 programming language (version 3.6 or higher is required), and it can be installed via PyPI as follows:

$ python3 -m pip install nimsos

If this installation is successful, the following packages are also installed or updated automatically:

  • Cython
  • matplotlib
  • numpy
  • physbo
  • scikit-learn
  • scipy

Basic usage

We show a small example program (Program 1) in which Bayesian optimization is performed with PHYSBO. In this program, assuming that the candidates file contains no experimental results, random exploration is performed in the first cycle.

Assignment of parameters and candidates file

First, the parameters for closed-loop experiments are defined. For example, when the number of objective functions is two, the number of proposals for each cycle is two, and the number of cycles is three, we define these parameters in the code as follows:

ObjectivesNum = 2
ProposalsNum = 2
CyclesNum = 3

Next, we specify a .csv file containing the candidates of experimental conditions, which is prepared as described under "Preparation of candidates for experimental conditions":

candidates_file = "./candidates.csv"

The name of the file that will contain the experimental conditions selected by the AI is as follows:

proposals_file = "./proposals.csv"

We specify the folder name where the input files for the robotic experiments are stored and the folder name where the results from the experiments are output, respectively, as follows:

input_folder = "./EXPInput"
output_folder = "./EXPOutput"

Execution of AI

nimsos.selection is a class to select the next experimental conditions with the help of the AI. For example, nimsos.selection is used as follows:

nimsos.selection(method = "PHYSBO",
                 input_file = candidates_file,
                 output_file = proposals_file,
                 num_objectives = ObjectivesNum,
                 num_proposals = ProposalsNum)

The method parameter of this class (Program 1) selects the module for AI algorithms. For method, "PHYSBO" (Bayesian optimization), "BLOX" (objective-free search), "PDC" (phase diagram construction), or "RE" (random exploration) can be specified. The experimental conditions are selected from the rows of input_file that do not yet have objective function values, and the selected conditions are written to output_file. For num_objectives, the number of objectives is input, and the number of proposals is specified as num_proposals. In general, many hyperparameters must be considered when using AI, but they are determined automatically in NIMS-OS. Note that if there are no experimental results in the candidates file, only "RE" can be used. For "PHYSBO," "BLOX," and "PDC," some values of objective functions must already be stored in the candidates file.
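The rule that only "RE" is usable while the candidates file holds no results can be checked before each cycle. The helper below is a sketch with a hypothetical name; it assumes the candidates file has one header row and that the objective values occupy the last num_objectives columns.

```python
import csv


def choose_method(candidates_file, num_objectives, ai_method="PHYSBO"):
    """Return "RE" while no objective values exist, else the AI method."""
    with open(candidates_file, newline="") as f:
        rows = list(csv.reader(f))[1:]  # assumes one header row
    has_results = any(
        all(cell.strip() != "" for cell in row[-num_objectives:])
        for row in rows if row
    )
    return ai_method if has_results else "RE"
```

The returned string could then be passed as the method argument when selecting the next conditions.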



Program 1. Small example of NIMS-OS for Bayesian optimization.

Preparation of input files for robotic experiments and execution of experiments

nimsos.preparation_input is a class to prepare the input files for robotic experiments and send the start message to the robot. For example, nimsos.preparation_input is used as follows.

nimsos.preparation_input(machine = "STAN",
                         input_file = proposals_file,
                         input_folder = input_folder)

The machine parameter selects the module for robotic experiments. For machine, "STAN," which is the standard module for this procedure, or "NAREE" (NIMS automated robotic electrochemical experiments) can be specified. For input_file, the file containing the experimental conditions selected by the AI is specified. In addition, the folder on the computer where the input files for robotic experiments are stored is specified as input_folder. In the nimsos.preparation_input module, the two functions make_machine_file() and send_message_machine() should be modified depending on the robotic system used. The former creates the input files for robotic experiments from the selected experimental conditions, whereas the latter sends the message to begin the robotic experiments.

Analysis of output files from experiments and update of candidates file

nimsos.analysis_output is a class used to analyze the experimental results and update the candidates file. For example, nimsos.analysis_output is used as follows:

nimsos.analysis_output(machine = "STAN",
                       input_file = proposals_file,
                       output_file = candidates_file,
                       num_objectives = ObjectivesNum,
                       output_folder = output_folder)

The parameter of machine is the same as that found in the nimsos.preparation_input module, which selects the module for robotic experiments. Here, "STAN" and "NAREE" can be selected. For input_file, the experimental conditions selected by the AI are specified, and output_file is the name of the candidates file. The file specified by output_file is updated by this module. In addition, for num_objectives, the number of objectives is input. For output_folder, the folder in the computer where the results from robotic experiments are output is specified. In the nimsos.analysis_output module, two functions extract_objectives() and recieve_exit_message() should be modified depending on the robot systems. The former extracts the values of objective functions from the output files of robotic experiments, and the latter receives the message when the robotic experiments are finished. If "NAREE" is selected, objectives_info should be specified as a dictionary indicating which objective function is extracted from the experimental results.
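Putting the three steps together, one cycle of the closed loop is selection, input preparation, and output analysis, with random exploration in the first cycle when the candidates file is still empty. The sketch below shows only this control flow; the three steps are passed in as callables so that it runs without NIMS-OS installed. The function name and its arguments are hypothetical, and in practice the callables would wrap nimsos.selection, nimsos.preparation_input, and nimsos.analysis_output with the arguments shown earlier.

```python
def run_closed_loop(select, prepare, analyze, num_cycles, ai_method="PHYSBO"):
    """Run the select -> prepare -> analyze sequence once per cycle.

    select:  callable taking the method name ("RE", "PHYSBO", ...).
    prepare: callable that writes inputs and starts the experiment.
    analyze: callable that waits for results and updates the candidates file.
    Returns the list of methods used, one per cycle.
    """
    used = []
    for cycle in range(num_cycles):
        # No results exist before the first cycle, so start with "RE".
        method = "RE" if cycle == 0 else ai_method
        select(method)
        prepare()
        analyze()
        used.append(method)
    return used
```

Keeping the loop independent of the concrete steps also makes it easy to swap "STAN" for another machine module without touching the control flow.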

Visualization of the results

Figures of the results can be obtained by using nimsos.visualization. Before this module is used, a folder named "fig" must be created in the same folder where the main script is stored; the figures are output to this folder. nimsos.visualization.plot_history and nimsos.visualization.plot_distribution.plot create figures for the history and distributions of objective functions, respectively. These modules are useful when using AI algorithms other than PDC. In contrast, nimsos.visualization.plot_phase_diagram.plot creates the predicted phase diagram when PDC is used as the AI algorithm.
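Since the "fig" folder has to exist before any figure is written, it can be created at the start of the main script. A minimal sketch with a hypothetical helper name:

```python
import os


def ensure_fig_folder(script_dir="."):
    """Create the "fig" folder next to the main script if it is missing."""
    path = os.path.join(script_dir, "fig")
    os.makedirs(path, exist_ok=True)  # no error if it already exists
    return path
```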

Usage of the NIMS-OS GUI version

References

Notes

This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. In the original, there are multiple instances of citing research work using the last name of the last author listed, rather than the last name of the first author listed; this may have been a product of Japanese culture tending to read text from right to left. For this version, the last name of the first author was used to be consistent with research norms.