Genome-Bench: A Scientific Reasoning Benchmark from Real-World Expert Discussions

Abstract

In this short report, we present an automated pipeline tailored to the genomics domain and introduce Genome-Bench, a new benchmark constructed from over a decade of scientific forum discussions on genome engineering. Our pipeline transforms raw forum interactions into a reinforcement-learning-friendly multiple-choice question format, yielding 3000+ high-quality question-answer pairs that span foundational biology, experimental troubleshooting, tool usage, and beyond. To our knowledge, this is the first end-to-end pipeline for teaching LLMs to reason from scientific discussions, and it shows promising potential for generalization to scientific domains beyond biology. The dataset is available at https://huggingface.co/datasets/Mingyin0312/Genome-Bench.
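
As a rough illustration of why a multiple-choice format suits reinforcement learning, the sketch below scores a model response against a single question with an exact-match reward. It is not the authors' pipeline: the `MCQ` record, its field names, the `Answer: <letter>` response convention, and the placeholder question are all assumptions made for this example, and the actual dataset schema may differ.

```python
import re
from dataclasses import dataclass


@dataclass
class MCQ:
    """Hypothetical multiple-choice record; field names are assumptions."""
    question: str
    options: dict[str, str]  # option letter -> option text
    answer: str              # letter of the correct option


def format_prompt(item: MCQ) -> str:
    """Render the question and its options as a plain-text prompt."""
    lines = [item.question]
    lines += [f"{key}. {text}" for key, text in sorted(item.options.items())]
    lines.append("Finish with 'Answer: <letter>'.")
    return "\n".join(lines)


def reward(item: MCQ, response: str) -> float:
    """Return 1.0 if the response's final 'Answer: <letter>' matches, else 0.0."""
    matches = re.findall(r"answer:\s*([a-z])\b", response, flags=re.IGNORECASE)
    if not matches:
        return 0.0  # no parseable answer counts as incorrect
    return 1.0 if matches[-1].lower() == item.answer.lower() else 0.0


# Placeholder item, not taken from the dataset.
item = MCQ(
    question="Placeholder question text?",
    options={"a": "first option", "b": "second option",
             "c": "third option", "d": "fourth option"},
    answer="b",
)
print(format_prompt(item))
print(reward(item, "Some reasoning here. Answer: b"))
```

Because each question has one verifiable correct letter, the reward can be computed automatically without a learned judge, which is the property that makes this format convenient for reinforcement learning.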
