
COVID-19 Machine Learning Tool Assimilates Research Papers

April 28, 2020
Online AI tool uses text mining algorithms to scan and make sense of hundreds of new papers every day.

The volume of literature produced on the topic of COVID-19 is daunting, so much so that scientists can’t keep up and need help finding relevant papers and drawing connections among them.

Enter COVIDScholar.com. The search engine uses natural language processing techniques to scan and search the literature, synthesize findings, draw insights and make connections.

A group of materials scientists at Lawrence Berkeley National Laboratory (Berkeley Lab), who usually spend their time researching high-performance materials for thermoelectrics or battery cathodes, built the text mining tool. Their quest to develop text and data mining techniques that can help answer high-priority questions related to COVID-19 stems from the White House’s March 16 call to action.

At the time, the COVID-19 Open Research Dataset (CORD-19) of scholarly literature about COVID-19, SARS-CoV-2 and the coronavirus group was the most extensive machine-readable coronavirus literature collection available for data and text mining, with more than 29,000 articles.

Once the Berkeley Lab team set to work, its prototype was up and running within a week; after a month the tool had collected more than 61,000 research papers. About 8,000 were specifically about COVID-19, and the balance covered related topics, such as other viruses and pandemics in general. The team estimates that 200 new articles on the coronavirus are published every day. “Within 15 minutes of the paper appearing online, it will be on our website,” said Amalie Trewartha, a postdoctoral fellow who is one of the lead developers.

Ready for Public Use

The tool went live this week when the Berkeley Lab team released an upgraded version that allows the user to search for “related papers” and sort articles using machine-learning-based relevance tuning. COVIDScholar will also recommend similar abstracts and automatically sort papers in subcategories, such as testing or transmission dynamics, allowing users to do specialized searches.

The developers built automated scripts to grab new papers (including preprint papers), clean them up and make them searchable. At the most basic level, COVIDScholar acts as a simple search engine, albeit a highly specialized one that, according to the developers, is the largest single-topic literature collection on COVID-19.
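The article does not describe how those scripts work internally, so the following is only a minimal sketch of the general grab, clean and index pattern, not the team's code. The class name PaperIndex, the helper functions and the sample abstracts are all hypothetical, and a plain inverted index stands in for whatever search backend COVIDScholar actually uses.

```python
import re
from collections import defaultdict

# Hypothetical helpers for illustration only; not taken from COVIDScholar.
TAG_RE = re.compile(r"<[^>]+>")                    # HTML-like markup left in scraped text
TOKEN_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")  # keeps terms like "sars-cov-2" whole


def clean(raw):
    """Strip HTML-like tags and collapse runs of whitespace."""
    text = TAG_RE.sub(" ", raw)
    return " ".join(text.split())


def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return TOKEN_RE.findall(text.lower())


class PaperIndex:
    """A toy inverted index: token -> set of paper IDs containing it."""

    def __init__(self):
        self.papers = {}
        self.postings = defaultdict(set)

    def add(self, paper_id, raw_abstract):
        """Clean a newly fetched abstract and make it searchable."""
        text = clean(raw_abstract)
        self.papers[paper_id] = text
        for token in tokenize(text):
            self.postings[token].add(paper_id)

    def search(self, query):
        """Return sorted IDs of papers that contain every query token."""
        tokens = tokenize(query)
        if not tokens:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted(hits)


index = PaperIndex()
index.add("p1", "<p>Transmission   dynamics of <i>SARS-CoV-2</i></p>")
index.add("p2", "Testing strategies for COVID-19")
print(index.search("sars-cov-2"))
print(index.search("covid-19 testing"))
```

A production system would add ranking, deduplication and scheduled fetching from each source; the sketch covers only the clean-and-index step.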

Next Steps

The team of artificial intelligence experts will now train its algorithms to look for unnoticed connections between concepts. “You can use the generated representations for concepts from the machine learning models to find similarities between things that don’t actually occur together in the literature, so you can find things that should be connected but haven’t been yet,” said John Dagdelen, a UC Berkeley graduate student and Berkeley Lab researcher who is one of the lead developers.
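One common way to compare such learned representations is cosine similarity between concept vectors. The sketch below assumes that approach for illustration; the article does not say which similarity measure the team uses. The three-dimensional vectors and the concept names are made-up placeholders, not output from any real model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(concept, embeddings, top_n=3):
    """Rank all other concepts by cosine similarity to the given one."""
    target = embeddings[concept]
    scored = [
        (other, cosine_similarity(target, vector))
        for other, vector in embeddings.items()
        if other != concept
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical toy vectors; real models typically use hundreds of dimensions.
TOY_EMBEDDINGS = {
    "concept_a": [0.9, 0.1, 0.0],
    "concept_b": [0.8, 0.2, 0.1],
    "concept_c": [0.0, 0.1, 0.9],
}

for name, score in most_similar("concept_a", TOY_EMBEDDINGS):
    print(name, round(score, 3))
```

Two concepts can score as close neighbors here even if they never appear in the same paper, because similarity is computed on the vectors, not on co-occurrence counts. That is the kind of unlinked pairing Dagdelen describes.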

Further on, the team plans to work with researchers in Berkeley Lab’s Environmental Genomics and Systems Biology Division and UC Berkeley’s Innovative Genomics Institute to improve COVIDScholar’s algorithms. The idea is to synthesize systems in a way that will allow researchers to discover new connections within their data, said Dagdelen.

Not From Left Field

The entire tool runs on the supercomputers of the National Energy Research Scientific Computing Center (NERSC), a DOE Office of Science user facility located at Berkeley Lab. The online search engine and portal are powered by the Spin cloud platform at NERSC.

Chalk up the speed with which the team was able to iterate ideas to experience. The group spent three years doing natural language processing for materials science and built a similar tool, called MatScholar, a project supported by the Toyota Research Institute and Shell.

Last year the team published a paper in Nature that showed how an algorithm with no training in materials science could recommend materials for functional applications several years before their discovery.

About the Author

Rehana Begg | Editor-in-Chief, Machine Design

As Machine Design’s content lead, Rehana Begg is tasked with elevating the voice of the design and multi-disciplinary engineer in the face of digital transformation and engineering innovation. Begg has more than 24 years of editorial experience and has spent the past decade in the trenches of industrial manufacturing, focusing on new technologies, manufacturing innovation and business. Her B2B career has taken her from corporate boardrooms to plant floors and underground mining stopes, covering everything from automation & IIoT, robotics, mechanical design and additive manufacturing to plant operations, maintenance, reliability and continuous improvement. Begg holds an MBA, a Master of Journalism degree, and a BA (Hons.) in Political Science. She is committed to lifelong learning and feeds her passion for innovation in publishing, transparent science and clear communication by attending relevant conferences and seminars/workshops. 

