Large scientific facilities – such as nanoscale centers, neutron facilities, and X-ray sources, including the Advanced Photon Source (APS) at the U.S. Department of Energy’s Argonne National Laboratory – are becoming more capable and more complex. While the increased capabilities can help scientists learn more about the materials they study, the facilities are also becoming more challenging to understand and operate. Now one group of researchers has found that generative artificial intelligence (AI) can help scientists plan and run their experiments.
The researchers looked at large language models (LLMs), such as the one that underlies OpenAI’s ChatGPT, to see whether they could aid users of scientific facilities. They created the Context-Aware Language Model for Science (CALMS), which adds facility-specific information to LLMs, and compared how it performed next to LLMs that had not been enhanced. They also compared the performance using the proprietary ChatGPT and an open-source LLM.
At a facility such as the APS, a DOE Office of Science user facility, researchers from different institutions can perform a wide variety of experiments at different beamlines to study an array of questions, from microelectronics to nuclear materials. The visiting researchers tend not to be experts on the beamlines, so APS staff must spend time helping them set up and run their experiments. The visitors’ inexperience can also lead to errors, which then require the experiment to be repeated.
CALMS uses retrieval-augmented generation (RAG), which supplements pretrained LLMs with additional documentation. That provides researchers with two opportunities. First, they can ask it to help them design their experiment and suggest what tests can be run at which beamline. Then, once they’re onsite, they can ask the model how to operate the beamline they’re working with.
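The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not the CALMS implementation: the article does not say how CALMS retrieves documents, so the word-overlap scoring here is a stand-in, the facility notes in DOCS are invented placeholders rather than real APS documentation, and the names retrieve and build_prompt are hypothetical. The resulting prompt would then be passed to whichever LLM is in use.

```python
import math
import re
from collections import Counter

# Invented placeholder notes; NOT real APS documentation.
DOCS = [
    "Beamline A supports coherent diffraction imaging of nanocrystals.",
    "Beamline B is used for powder diffraction of battery materials.",
    "To open the shutter at Beamline A, use the shutter panel in the control software.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(query, docs, k=2):
    """Prepend the retrieved context to the user's question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("Which beamline handles powder diffraction of battery materials?", DOCS))
```

A production system would more likely use learned text embeddings and a vector store in place of word counts, but the flow is the same: find the most relevant facility documents, then hand them to the model alongside the question.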
Models with context provided answers to queries that non-augmented models could not. The researchers also compared OpenAI’s GPT-3.5 with Vicuna, an open-source model. On several benchmarks, GPT-3.5 outperformed Vicuna. The researchers say, however, that both proprietary and open-source LLMs have improved since their original tests, and the performance difference between the two types of models is now minimal.
Both types of models, however, continue to suffer from hallucinations, in which the LLM makes up an answer that is not factual. That will require users to double-check the results they get from their queries.
The researchers also explored automating some tasks with LLMs. For instance, APS users taking diffraction measurements often have to maneuver an area detector to a particular position, determined by the energy of the beam and the material being measured. This requires them to look up lattice information in a materials database, then enter that information into the beamline interface.
The researchers found they could automate both parts of the process, having the LLM find the lattice information and then pass it to a controller that could move the detector. They envision the same approach being applied to more complex activities involving many more than two steps, such as automating the process of synthesizing a new material at Argonne’s Center for Nanoscale Materials, also a DOE Office of Science user facility.
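The two-step chain (look up lattice information, then drive the detector) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the article does not describe the calculation or the control interface, so this example uses Bragg's law for a cubic lattice, a one-entry dictionary in place of a materials database, and a hypothetical MockDetector class in place of real beamline controls. In the workflow described above, an LLM would decide when to call each tool; here the calls are simply made in order.

```python
import math

# h*c in keV * Angstrom, used to convert photon energy to wavelength.
HC_KEV_ANGSTROM = 12.398419843

# Stand-in for a materials database (hypothetical): cubic lattice constants in Angstrom.
LATTICE_DB = {"Si": 5.431}


def lookup_lattice(material):
    """Tool 1: fetch the cubic lattice constant for a material."""
    return LATTICE_DB[material]


def two_theta_deg(a, hkl, energy_kev):
    """Scattering angle 2-theta (degrees) from Bragg's law for a cubic lattice."""
    h, k, l = hkl
    d = a / math.sqrt(h * h + k * k + l * l)   # interplanar spacing
    wavelength = HC_KEV_ANGSTROM / energy_kev  # Angstrom
    s = wavelength / (2 * d)                   # sin(theta)
    if s > 1:
        raise ValueError("reflection not reachable at this energy")
    return 2 * math.degrees(math.asin(s))


class MockDetector:
    """Hypothetical stand-in for a beamline detector controller."""

    def __init__(self):
        self.two_theta = 0.0

    def move_to(self, angle_deg):
        self.two_theta = angle_deg


def position_detector(material, hkl, energy_kev, detector):
    """Chain both tools: look up the lattice, compute the angle, move the detector."""
    a = lookup_lattice(material)
    angle = two_theta_deg(a, hkl, energy_kev)
    detector.move_to(angle)
    return angle


if __name__ == "__main__":
    det = MockDetector()
    print(position_detector("Si", (1, 1, 1), 10.0, det))
```

For the silicon (111) reflection at 10 keV, this places the mock detector at roughly 22.8 degrees. Keeping the lookup and the move as separate functions mirrors the idea of giving the model discrete tools that it can chain into longer procedures.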
The CALMS model could prove useful not only at the APS, but at any complex science facility. While standard scientific instruments usually come with a set of operating instructions, scientific user facilities with unique equipment can benefit from LLMs with context. – Neil Savage
See: M.H. Prince¹, H. Chan¹, A. Vriza¹, T. Zhao¹, V.K. Sastry¹, Y. Luo¹, M.T. Dearing¹, R.J. Harder¹, R.K. Vasudevan², M.J. Cherukara¹, “Opportunities for retrieval and tool augmented large language models in scientific facilities,” npj Comput. Mater. 10, 251 (2024).
Author affiliations: ¹Argonne National Laboratory; ²Oak Ridge National Laboratory.
Work performed at the Center for Nanoscale Materials and Advanced Photon Source, both U.S. Department of Energy Office of Science User Facilities, was supported by the U.S. DOE, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357. This research used resources of the Argonne Leadership Computing Facility, a U.S. Department of Energy (DOE) Office of Science user facility at Argonne National Laboratory and is based on research supported by the U.S. DOE Office of Science-Advanced Scientific Computing Research Program, under Contract No. DE-AC02-06CH11357. M.J.C. and H.C. also acknowledge support from the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences Data, Artificial Intelligence, and Machine Learning at DOE Scientific User Facilities program under Award Number 34532.
The U.S. Department of Energy's APS at Argonne National Laboratory is one of the world’s most productive x-ray light source facilities. Each year, the APS provides high-brightness x-ray beams to a diverse community of more than 5,000 researchers in materials science, chemistry, condensed matter physics, the life and environmental sciences, and applied research. Researchers using the APS produce over 2,000 publications each year detailing impactful discoveries and solve more vital biological protein structures than users of any other x-ray light source research facility. APS x-rays are ideally suited for explorations of materials and biological structures; elemental distribution; chemical, magnetic, and electronic states; and a wide range of technologically important engineering systems from batteries to fuel injector sprays, all of which are the foundations of our nation’s economic, technological, and physical well-being.
Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation's first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America's scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC, for the U.S. DOE Office of Science.
The U.S. Department of Energy's Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit the Office of Science website.