BER Structural Biology and Imaging Resources
Synchrotron, Neutron, and Cryo-EM

AI Accelerates Autonomous Discovery at DOE Synchrotron and Neutron Facilities

With gpCAM, an experiment that used to take over 8 hours can now be completed within 30 minutes.

Depiction of the inner workings of the algorithm inside gpCAM, a software tool developed by researchers at Lawrence Berkeley National Laboratory’s (LBNL) CAMERA facility to facilitate autonomous scientific discovery. [Courtesy LBNL.]

As instruments at U.S. Department of Energy (DOE) Office of Science user facilities have become more powerful, the volume and complexity of data have also grown. To make full use of modern instruments and facilities, researchers are exploring new ways to decrease the amount of data required for scientific discovery and address data acquisition rates that people can no longer keep pace with.

A promising route lies in the emerging field of autonomous discovery, where algorithms learn from a comparatively small amount of input data and decide for themselves which steps to take next. This approach enables faster and more efficient exploration of multidimensional parameter spaces with minimal human intervention.

To meet this challenge, “gpCAM,” a mathematical, algorithmic, and software approach to enable autonomous discovery, was developed by Marcus Noack, a research scientist at the Center for Advanced Mathematics for Energy Research Applications (CAMERA), a Lawrence Berkeley National Laboratory (LBNL) center jointly funded by DOE’s Office of Basic Energy Sciences (BES) and Office of Advanced Scientific Computing Research.

A July 2021 Nature Reviews Physics paper describes the successful deployment of gpCAM at various user facilities and resources. One such resource is the Berkeley Synchrotron Infrared Structural Biology (BSISB) Imaging Program. BSISB is an infrared beamline funded by DOE’s Office of Biological and Environmental Research at the Advanced Light Source, a BES user facility at LBNL.

“gpCAM was designed to assist decision-making processes in any kind of experiment, with minimal effort,” Noack says. With gpCAM, an experiment that used to take over 8 hours can now be completed within 30 minutes. “This is a game changer,” says co-author and LBNL staff scientist Peter Zwart, who worked with a team to further develop gpCAM for biogeochemical and geobiological applications.

The overall approach uses automated beamline measurements to speed data collection, combined with the decision-making algorithm, gpCAM, to analyze complex, multidimensional data sets and make experimental decisions in real time. The algorithm uses a Gaussian process to model a system’s behavior and then determines the best next measurements to make while an experiment is still running.
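The loop described above can be illustrated with a small, self-contained sketch. It is not gpCAM's API or the authors' implementation. It is a minimal NumPy example that assumes a one-dimensional parameter space, a fixed squared-exponential kernel, and a simple "measure where the model is most uncertain" rule. The `instrument` function, the kernel settings, and all other names are hypothetical stand-ins chosen for illustration.

```python
import numpy as np

# Assumed, fixed hyperparameters (a real tool would typically fit these to the data).
LENGTH_SCALE = 0.1
SIGNAL_VAR = 1.0
NOISE_VAR = 1e-4


def rbf_kernel(a, b):
    """Squared-exponential covariance between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return SIGNAL_VAR * np.exp(-0.5 * (d / LENGTH_SCALE) ** 2)


def gp_posterior(x_train, y_train, x_query):
    """Posterior mean and variance of a zero-mean Gaussian process at x_query."""
    K = rbf_kernel(x_train, x_train) + NOISE_VAR * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = SIGNAL_VAR - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off


def instrument(x):
    """Hypothetical stand-in for a beamline measurement at position x."""
    return np.sin(6.0 * x) + 0.5 * np.cos(15.0 * x)


def autonomous_loop(n_steps=12, n_candidates=200):
    """Repeatedly measure the candidate point with the largest posterior variance."""
    candidates = np.linspace(0.0, 1.0, n_candidates)
    x_meas = np.array([0.0, 1.0])          # two seed measurements
    y_meas = instrument(x_meas)
    for _ in range(n_steps):
        _, var = gp_posterior(x_meas, y_meas, candidates)
        x_next = candidates[np.argmax(var)]  # decision made while the run continues
        x_meas = np.append(x_meas, x_next)
        y_meas = np.append(y_meas, instrument(x_next))
    mean, var = gp_posterior(x_meas, y_meas, candidates)
    return x_meas, y_meas, mean, var


if __name__ == "__main__":
    x_meas, y_meas, mean, var = autonomous_loop()
    print("measured positions:", np.round(np.sort(x_meas), 3))
    print("largest remaining variance:", float(var.max()))
```

Choosing the point of maximum posterior variance is only one possible decision rule. An experiment hunting for hot spots might instead weight the predicted signal as well as the uncertainty, so that measurements concentrate in regions of interest rather than spreading evenly.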

BSISB director Hoi-Ying Holman and colleagues worked with the CAMERA team to customize and deploy gpCAM for biogeochemical and biological applications, including chemical imaging of geobiological samples. “Finding targeted biogeochemical activity hot spots in a sample is like looking for oases in a desert without directions,” Holman says. “We wasted a lot of time looking at areas in the sample that aren’t useful.”

Automation in data collection combined with machine learning enables scientists to more quickly and thoroughly explore parameter spaces, efficiently measure high-value datasets, and optimize the use of instruments and facilities, accelerating scientific discovery.

References

Noack, M.M., Zwart, P.H., Ushizima, D.M. et al. Gaussian processes for autonomous data acquisition at large-scale synchrotron and neutron facilities. Nat Rev Phys (2021). [DOI: 10.1038/s42254-021-00345-y]