RewardBench 2: Advancing Reward Model Evaluation

/ April 24, 2026

arXiv:2506.01937v2 Announce Type: replace
Abstract: Reward models are used throughout the post-training of language models to capture nuanced signals from preference data and provide a training target for optimization across instruction following, rea…

cs.CL, cs.CY

Mapping the Political Discourse in the Brazilian Chamber of Deputies: A Multi-Faceted Computational Approach

/ April 24, 2026

arXiv:2604.21897v1 Announce Type: new
Abstract: Analyses of legislative behavior often rely on voting records, overlooking the rich semantic and rhetorical content of political speech. In this paper, we ask three complementary questions about parliame…

cs.AI, cs.CL

GiVA: Gradient-Informed Bases for Vector-Based Adaptation

/ April 24, 2026

arXiv:2604.21901v1 Announce Type: new
Abstract: As model sizes continue to grow, parameter-efficient fine-tuning has emerged as a powerful alternative to full fine-tuning. While LoRA is widely adopted among these methods, recent research has explored …

cs.CL

Evaluation of Automatic Speech Recognition Using Generative Large Language Models

/ April 24, 2026

arXiv:2604.21928v1 Announce Type: new
Abstract: Automatic Speech Recognition (ASR) is traditionally evaluated using Word Error Rate (WER), a metric that is insensitive to meaning. Embedding-based semantic metrics are better correlated with human perce…

cs.CL

Dr. Assistant: Enhancing Clinical Diagnostic Inquiry via Structured Diagnostic Reasoning Data and Reinforcement Learning

/ April 24, 2026

arXiv:2601.13690v2 Announce Type: replace
Abstract: Clinical Decision Support Systems (CDSSs) provide reasoning and inquiry guidance for physicians, yet they face notable challenges, including high maintenance costs and low generalization capability. …

cs.CL

Improving Clinical Diagnosis with Counterfactual Multi-Agent Reasoning

/ April 24, 2026

arXiv:2603.27820v2 Announce Type: replace
Abstract: Clinical diagnosis is a complex reasoning process in which clinicians gather evidence, form hypotheses, and test them against alternative explanations. In medical training, this reasoning is explicit…

cs.CL

Why Supervised Fine-Tuning Fails to Learn: A Systematic Study of Incomplete Learning in Large Language Models

/ April 24, 2026

arXiv:2604.10079v3 Announce Type: replace
Abstract: Supervised Fine-Tuning (SFT) is the standard approach for adapting large language models (LLMs) to downstream tasks. However, we observe a persistent failure mode: even after convergence, models ofte…

cs.CL, cs.IR

From Past To Path: Masked History Learning for Next-Item Prediction in Generative Recommendation

/ April 24, 2026

arXiv:2509.23649v2 Announce Type: replace-cross
Abstract: Generative recommendation, which directly generates item identifiers, has emerged as a promising paradigm for recommendation systems. However, its potential is fundamentally constrained by the …

cs.CL, cs.LG

Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning

/ April 24, 2026

arXiv:2512.05591v2 Announce Type: replace-cross
Abstract: Large language model post-training relies on reinforcement learning to improve model capability and alignment quality. However, the off-policy training paradigm introduces distribution shift, w…

cs.CL, cs.CR

CI-Work: Benchmarking Contextual Integrity in Enterprise LLM Agents

/ April 24, 2026

arXiv:2604.21308v1 Announce Type: cross
Abstract: Enterprise LLM agents can dramatically improve workplace productivity, but their core capability, retrieving and using internal context to act on a user’s behalf, also creates new risks for sensitive i…