UMD Researchers Receive USDA Funding to Use Big Data Analytics and Machine Learning to Integrate Microbial Genomics with Food Safety Risk Assessment

Models and tools developed will advance next-generation food safety risk assessments to improve risk management of foodborne illness and better protect public health

Image Credit: Kevin Ku

November 13, 2020 Samantha Watters

The University of Maryland (UMD) recently received a grant from the United States Department of Agriculture National Institute of Food and Agriculture (USDA-NIFA) to develop a next-generation food safety risk assessment model by combining emerging techniques in food safety and machine learning. With vast genomic data on foodborne pathogens such as Salmonella now available through whole genome sequencing, there is new potential to substantially improve public health through more specific food safety risk assessments that can better predict the risk of outbreaks and guide strong risk management decisions at the policy level. However, big data analytics approaches such as artificial intelligence (AI) and machine learning have yet to be leveraged in food safety to integrate this genomic data with the pathogen characteristics of interest to risk assessors. With this new funding, UMD is paving the way to a more robust food safety risk assessment model that combines computational techniques, genomic and microbial data, and machine learning to improve the management of foodborne illness and better protect public health.

Abani Pradhan, associate professor in Nutrition and Food Science, UMD

“I am so excited about this grant award, as it gives us the resources to conduct cutting-edge research on food safety and risk assessment, which is very important in controlling foodborne diseases and protecting public health,” says Abani Pradhan, associate professor in Nutrition and Food Science at UMD and lead investigator of this work. “AI as an emerging technology can take advantage of big data available in the agriculture and food sectors and has the potential to integrate food production, processing, food safety risk factors, and genomic data that can transform public health strategies to prevent foodborne diseases and rapidly respond to outbreaks.”

The development and application of big data analytic techniques like AI and machine learning have gained significant momentum across various fields, including transportation, manufacturing, healthcare, and finance. However, according to Pradhan, applications in agriculture and food are just starting to get attention, and the simultaneous progress of bioinformatics and genomic data presents a clear opportunity to advance food safety risk assessment. Working with co-investigators Jianghong Meng, director of the UMD Joint Institute for Food Safety and Applied Nutrition (JIFSAN) and the Center for Food Safety and Security Systems (CFS3), and Hector Corrada Bravo, associate professor in Computer Science at UMD, as well as collaborator Marc Allard of the U.S. Food and Drug Administration (FDA), Pradhan and his team are poised to lead the way in combining these tools to develop a next-generation quantitative microbial risk assessment (QMRA).

“Risk assessment is an important area that has great significance to food safety,” explains Pradhan. “It is a holistic approach to food safety and is the umbrella under which all food safety information can be organized systematically to control the risk of foodborne diseases by providing a scientifically-sound basis for informed risk management and policy decisions.”

Food safety risk assessment generally involves the scientific evaluation of adverse public health effects resulting from exposure to foodborne hazards. QMRA specifically uses mathematical and statistical models to understand, predict, and prevent risks presented by foodborne pathogens like Salmonella, E. coli, Listeria, and many more. According to the Centers for Disease Control and Prevention (CDC), foodborne illnesses affect an estimated 1 in 6 Americans every year and in extreme cases cause hospitalization and even death, making foodborne disease outbreaks a serious public health concern.

QMRA can be used to predict the behavior and transmission of pathogens across the food production, processing, and supply chain, identify areas in the chain that could lead to contamination, and estimate the probability and consequence of adverse public health effects in the event that tainted products are consumed. QMRA models incorporate uncertainties and variabilities in different steps of the farm-to-fork pathway in order to obtain an accurate indicator of the risk posed by the foodborne pathogen.
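To make the idea concrete, here is a minimal sketch of how a QMRA-style model can carry uncertainty and variability through the farm-to-fork pathway using Monte Carlo simulation. It is an illustration only and not the UMD team's model: every distribution, parameter value, and name below (`simulate_risk`, the contamination, growth, and cooking ranges, the dose-response parameter `r`) is an assumption invented for the example. The one standard ingredient is the exponential dose-response form, P(illness) = 1 - exp(-r x dose), which is commonly used in microbial risk assessment.

```python
import math
import random
import statistics


def simulate_risk(n_iterations=10_000, seed=0):
    """Return simulated per-serving illness probabilities.

    All inputs are hypothetical placeholders, not measured values.
    """
    rng = random.Random(seed)
    serving_g = 100.0  # assumed serving size in grams
    risks = []
    for _ in range(n_iterations):
        # Initial contamination, log10 CFU per gram (varies between servings).
        log_conc = rng.gauss(-1.0, 1.0)
        # Growth during storage and transport, in log10 units.
        log_growth = rng.uniform(0.0, 2.0)
        # Reduction from cooking, in log10 units: triangular(low, high, mode).
        log_reduction = rng.triangular(3.0, 7.0, 5.0)
        # Ingested dose in CFU after all pathway steps.
        dose = serving_g * 10 ** (log_conc + log_growth - log_reduction)
        # Uncertain dose-response parameter, sampled on a log scale.
        r = 10 ** rng.uniform(-4.0, -2.0)
        # Exponential dose-response: 1 - exp(-r * dose), computed stably.
        risks.append(-math.expm1(-r * dose))
    return risks


risks = simulate_risk()
mean_risk = statistics.mean(risks)
p95_risk = sorted(risks)[int(0.95 * len(risks))]
print(f"mean risk per serving: {mean_risk:.2e}")
print(f"95th percentile risk:  {p95_risk:.2e}")
```

Because each pathway step is drawn from a distribution rather than fixed at a single value, the output is a distribution of risk rather than one number. Narrowing any input range, for example with serovar-specific data, narrows the resulting estimate.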

As Pradhan explains, this is why integrating genomic data using big data analytics techniques is so important. “The sheer abundance of information by including molecular and genomic data available should increase the robustness of disease risk estimates by reducing the sources of uncertainty and variability in the QMRA model. This is important because there are so many different species of each foodborne pathogen, and even within the same species, there are different variations or types called serovars.”

For this project, Pradhan and his team are focusing on Salmonella to develop a framework that can later be applied to other foodborne pathogens, in part because Salmonella has so much available data and so many different variations to account for. Salmonella enterica specifically is a species of this pathogen that causes about 1.2 million cases of foodborne illness each year and consists of over 2,500 serovars with highly variable characteristics. How resistant the pathogen is to heat stress or antimicrobials, how infectious it is, and how quickly it grows and spreads are all characteristics that can be partially explained by genomic data.

“The idea is to connect that genetic information with the characteristics of the pathogen to bridge the gap between the genes and the food safety aspects for consumers,” says Pradhan. “If we can use machine learning tools to understand the linkages between genotypes [genetic information] and phenotypes [physical traits], based upon that we can determine which serovars are the most concerning so that we can focus our experimental work on those types and further strengthen our models to create a risk assessment that provides a more robust and complete picture of the risk for risk mitigation.”
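As a rough illustration of linking genotypes to phenotypes, the sketch below predicts a trait from gene presence/absence profiles using a k-nearest-neighbors vote. The article does not say which machine learning methods the team will use, so k-nearest neighbors is simply a stand-in chosen for brevity. The isolates, the four gene columns, the labels, and the function names are all hypothetical and do not represent real Salmonella data.

```python
from collections import Counter

# Hypothetical training set: each isolate is a tuple of gene presence (1)
# or absence (0) for four made-up genes, paired with an observed phenotype.
TRAINING = [
    ((1, 1, 0, 1), "heat_resistant"),
    ((1, 1, 1, 1), "heat_resistant"),
    ((1, 0, 0, 1), "heat_resistant"),
    ((0, 0, 1, 0), "heat_sensitive"),
    ((0, 1, 1, 0), "heat_sensitive"),
    ((0, 0, 0, 0), "heat_sensitive"),
]


def hamming(a, b):
    """Count the positions where two gene profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def predict_phenotype(genotype, k=3):
    """Predict a phenotype by majority vote of the k most similar isolates."""
    nearest = sorted(TRAINING, key=lambda item: hamming(genotype, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Classify a new, unlabeled isolate from its gene profile alone.
print(predict_phenotype((1, 1, 0, 1)))
```

In a real study the profiles would span thousands of genes or variants, and predictions like these would be used to prioritize which serovars merit laboratory experiments.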

This cutting-edge work is just one of the ways that Pradhan and his lab, along with colleagues at JIFSAN and CFS3, focus on innovation for the future of food safety. “We have continued to conduct innovative and interdisciplinary research to improve the safety of the global food supply and public health by integrating experimental and field data with mathematical modeling, and developing predictive and risk models that will aid in guiding stakeholders like policy makers, government agencies, and the food industry,” says Pradhan. “I always try to be innovative and focus on the future, and big data analytics and machine learning are the future of predictive modeling and food safety risk assessment. I’m super excited about this work because it ties together all these different components to improve public health.”

This work is funded by the United States Department of Agriculture National Institute of Food and Agriculture (USDA-NIFA), Award #2020-67017-30785.