Provide.ai - Page 26

Grounding Multi-Hop Reasoning in Structural Causal Models via Group Relative Policy Optimization

/ May 6, 2026

arXiv:2605.01482v1 Announce Type: new
Abstract: Multi-Hop Fact Verification (MHFV) necessitates complex reasoning across disparate evidence, posing significant challenges for Large Language Models (LLMs) which often suffer from hallucinations and frac…

cs.AI

MSEarth: A Multimodal Benchmark for Earth Science Phenomenon Discovery with MLLMs

/ May 6, 2026

arXiv:2505.20740v3 Announce Type: replace
Abstract: The rapid advancement of multimodal large language models (MLLMs) offers new opportunities for complex scientific challenges, yet their application in earth science, especially at the graduate level, r…

cs.CL

EduCoder: An Open-Source Annotation System for Education Transcript Data

/ May 6, 2026

arXiv:2507.05385v5 Announce Type: replace
Abstract: We introduce EduCoder, a domain-specialized tool designed to support utterance-level annotation of educational dialogue. While general-purpose text annotation tools for NLP and qualitative research a…

cs.AI

MILD: Mediator Agent System with Bidirectional Perception and Multi-Layered Alignment for Human-Vehicle Collaboration

/ May 6, 2026

arXiv:2605.01507v1 Announce Type: new
Abstract: Prior studies report that partial driving automation can increase the cognitive demands on human drivers. This effect largely arises from human drivers’ lack of transparent insight into the vehicle’s int…

cs.AI, cs.CR

When Alignment Isn’t Enough: Response-Path Attacks on LLM Agents

/ May 6, 2026

arXiv:2605.02187v1 Announce Type: cross
Abstract: Bring-Your-Own-Key (BYOK) agent architectures let users route LLM traffic through third-party relays, creating a critical integrity gap: a malicious relay can modify an aligned LLM response after gener…

cs.AI

Multi-Agent Reasoning Improves Compute Efficiency: Pareto-Optimal Test-Time Scaling

/ May 6, 2026

arXiv:2605.01566v1 Announce Type: new
Abstract: Advances in inference methods have enabled language models to improve their predictions without additional training. These methods often prioritize raw performance over cost-effective compute usage. Howe…

cs.AI, cs.CL

LitVISTA: A Benchmark for Narrative Orchestration in Literary Text

/ May 6, 2026

arXiv:2601.06445v2 Announce Type: replace-cross
Abstract: Computational narrative analysis aims to capture rhythm, tension, and emotional dynamics in literary texts. Existing large language models can generate long stories but overly focus on causal c…

cs.LG, q-bio.QM

A-CODE: Fully Atomic Protein Co-Design with Unified Multimodal Diffusion

/ May 6, 2026

arXiv:2605.03360v1 Announce Type: cross
Abstract: We present A-CODE, a fully atomic unified one-stage protein co-design model that simultaneously refines discrete atom types and continuous atom coordinates. Unlike predominant two-stage methods that ca…

cs.CV

First Shape, Then Meaning: Efficient Geometry and Semantics Learning for Indoor Reconstruction

/ May 6, 2026

arXiv:2605.03463v1 Announce Type: new
Abstract: Neural Surface Reconstruction has become a standard methodology for indoor 3D reconstruction, with Signed Distance Functions (SDFs) proving particularly effective for representing scene geometry. A varie…

cs.AI, cs.CE

Beyond Isolated Investor: Predicting Startup Success via Roleplay-Based Collective Agents

/ May 6, 2026

arXiv:2512.22608v3 Announce Type: replace
Abstract: Due to the high value and high failure rates of startups, predicting their success is a critical challenge. Existing approaches typically model startup success from a single decision-maker’s perspect…