Saurabh Paul, Christos Boutsidis, et al.
JMLR
Omics and multi-omics data analysis is a vast field in which data from multiple omics sources, including genomics, transcriptomics, proteomics and metabolomics, can be subjected to a wide array of analytical and statistical techniques, depending on the data and hypothesis at hand. Constructing and executing these complex omics workflows requires expert domain input, multi-step manual curation and computational expertise, which often makes it one of the most time-consuming, challenging and interdisciplinary tasks within an omics experiment. Agentic AI systems are LLM-driven systems that autonomously plan, reason and dynamically call tools or functions, and they have demonstrated a powerful capability for planning and executing complex workflows. These capabilities suggest that Agentic AI has the potential to significantly benefit this field by i) automating repetitive tasks, ii) enhancing decision making, such as workflow, software or parameter selection, iii) removing coding and computational prerequisites, iv) improving data management and v) reducing human error. We therefore present an Agentic AI system capable of complex omics workflow construction and execution. It is equipped with domain-specific tools, including BLAST, DESeq2, domain database querying and more, and with a ‘domain-expert’, a concept which incorporates domain-specific knowledge via vector databases and retrieval-augmented generation (RAG), guiding the agent to plan omics workflows more accurately and to prevent LLM hallucinations. We demonstrate the capability of our system with several omics workflows, ranging from differential gene expression analysis to more complex biomedical foundation model inference tasks, including cell-type annotation and T-cell receptor (TCR)-epitope binding prediction.
Successful execution of these workflows demonstrates that domain-guided Agentic AI systems are capable of automated and accurate scientific reasoning, planning and execution, indicating strong potential for this technology to enhance omics analysis in terms of scalability, reproducibility and scientific accuracy.