MOOSE-Star (ICML 2026): 7B model + 108K-paper dataset for scientific hypothesis discovery

Disclosure first: I work on the community team at MiroMind.
One of our researchers just dropped the full MOOSE-Star collection on Hugging Face — a 7B model post-trained for scientific hypothesis discovery, plus the dataset behind it. Paper accepted at ICML 2026.

🤗 Collection: https://huggingface.co/collections/ZonglinY/moose-star-models-and-data

Inside:

  • MS-IR-7B / MS-HC-7B / MS-7B: 7B models for inspiration retrieval, hypothesis composition, and joint use. Base: DeepSeek-R1-Distill-Qwen-7B.
  • TOMATO-Star: 108,717 NCBI papers decomposed into (background, hypothesis, inspirations), every inspiration anchored to a real citation. Covers biology, chemistry, medicine, medical imaging, psychology, and cognitive science. ~38,400 A800 GPU-hours of preprocessing went into building it (see the loading sketch after this list).
  • Strict temporal split for evaluation: train ≤ Sep 2025, test = Oct 2025 (after the base model's knowledge cutoff).
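
To get a feel for the (background, hypothesis, inspirations) decomposition, here's a minimal loading sketch. Heads up: the dataset repo id "ZonglinY/TOMATO-Star" and the field names are my assumptions, not confirmed by the post; check the collection page for the real ones.

```python
# Minimal sketch: peek at one decomposed paper from TOMATO-Star.
# Assumed (unverified): repo id "ZonglinY/TOMATO-Star" and fields
# "background" / "hypothesis" / "inspirations".
from datasets import load_dataset

ds = load_dataset("ZonglinY/TOMATO-Star", split="train")

example = ds[0]
print(example["background"])     # research context the paper starts from
print(example["hypothesis"])     # the paper's central hypothesis
for insp in example["inspirations"]:
    print(insp)                  # each one anchored to a real cited paper
```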

Inspiration retrieval accuracy

| Model | IR accuracy |
|---|---|
| Random Selection | 6.70% |
| DeepSeek-R1-Distill-Qwen-7B (base) | 28.42% |
| Claude Sonnet 4.6 | 45.02% |
| DeepSeek-R1 | 45.11% |
| Gemini-3 Flash | 51.44% |
| GPT-5.4 | 51.50% |
| MS-7B (7B, joint IR + HC) | 54.34% |
| MS-IR-7B (7B, IR-only) | 54.37% |
| Gemini-3 Pro | 54.89% |

Locally: it's a standard DeepSeek-R1-Distill-Qwen-7B fine-tune, so anything that runs the base model runs this (llama.cpp / vLLM / SGLang all work). ~14 GB at fp16, single-24GB-card territory. Apache-2.0 code, CC-BY-4.0 data.
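
If you want to poke at it with vLLM, here's a minimal sketch. The model id "ZonglinY/MS-7B" and the free-form prompt wording are my assumptions; the GitHub repo has the actual documented prompt format.

```python
# Minimal sketch, not the official usage: model id and prompt format
# are assumptions (see the repo for the real ones).
from vllm import LLM, SamplingParams

llm = LLM(model="ZonglinY/MS-7B", dtype="float16")  # ~14 GB of weights at fp16
params = SamplingParams(temperature=0.6, max_tokens=2048)

prompt = (
    "Background: <paste the research background>\n"
    "Candidate papers: <paste titles/abstracts>\n"
    "Select the best inspiration papers and compose a hypothesis."
)
out = llm.generate([prompt], params)
print(out[0].outputs[0].text)
```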

Stress-test it! Questions and feedback are welcome below.

📄 https://arxiv.org/abs/2603.03756
💻 https://github.com/ZonglinY/MOOSE-Star

