Provide.ai - Page 86

Reinforcement Learning for LLM Post-Training: A Survey

/ May 4, 2026

arXiv:2407.16216v3 Announce Type: replace
Abstract: Large language models (LLMs) trained via pretraining and supervised fine-tuning (SFT) can still produce harmful and misaligned outputs, or struggle in domains like math and coding. Reinforcement lear…

cs.LG

The Power of Order: Fooling LLMs with Adversarial Table Permutations

/ May 4, 2026

arXiv:2605.00445v1 Announce Type: new
Abstract: Large Language Models have achieved remarkable success and are increasingly deployed in critical applications involving tabular data, such as Table Question Answering. However, their robustness to the st…

cs.CL

FinSafetyBench: Evaluating LLM Safety in Real-World Financial Scenarios

/ May 4, 2026

arXiv:2605.00706v1 Announce Type: new
Abstract: Large language models (LLMs) are increasingly applied in financial scenarios. However, they may produce harmful outputs, including facilitating illegal activities or unethical behavior, posing serious co…

cs.LG, cs.NI

TURBOTEST: Learning When Less is Enough through Early Termination of Internet Speed Tests

/ May 4, 2026

arXiv:2510.21141v2 Announce Type: replace-cross
Abstract: Internet speed tests are indispensable for users, ISPs, and policymakers, but their static flooding-based design imposes growing costs: a single high-speed test can transfer hundreds of MB, and…

cond-mat.mtrl-sci, cs.CE, cs.LG

Probabilistic Predictions of Process-Induced Deformation in Carbon/Epoxy Composites Using a Deep Operator Network

/ May 4, 2026

arXiv:2512.13746v5 Announce Type: replace-cross
Abstract: Fiber reinforcement and polymer matrix respond differently to manufacturing conditions due to mismatch in coefficient of thermal expansion and matrix shrinkage during curing of thermosets. Thes…

cs.AI, cs.LG

Scalable Context-Aware Graph Attention for Unsupervised Anomaly Detection in Large-Scale Mobile Networks

/ May 4, 2026

arXiv:2605.00482v1 Announce Type: cross
Abstract: Mobile network operators must monitor thousands of heterogeneous network elements across the radio access network and the packet core, each exposing high-dimensional KPI time series. The scale and cost…

cs.AI, cs.CL

Directed Social Regard: Surfacing Targeted Advocacy, Opposition, Aid, Harms, and Victimization in Online Media

/ May 4, 2026

arXiv:2605.00776v1 Announce Type: cross
Abstract: The language in online platforms, influence operations, and political rhetoric frequently directs a mix of pro-social sentiment (e.g., advocacy, helpfulness, compassion) and anti-social sentiment (e.g….

cs.CL

Reward Modeling from Natural Language Human Feedback

/ May 4, 2026

arXiv:2601.07349v3 Announce Type: replace
Abstract: Reinforcement Learning with Verifiable reward (RLVR) on preference data has become the mainstream approach for training Generative Reward Models (GRMs). Typically in pairwise rewarding tasks, GRMs ge…

cs.CL

BanglaSocialBench: A Benchmark for Evaluating Sociopragmatic and Cultural Alignment of LLMs in Bangladeshi Social Interaction

/ May 4, 2026

arXiv:2603.15949v3 Announce Type: replace
Abstract: Large Language Models have demonstrated strong multilingual fluency, yet fluency alone does not guarantee socially appropriate language use. In high-context languages, communicative competence requir…

cs.CL

SCOPE: Planning for Hybrid Querying over Clinical Trial Data

/ May 4, 2026

arXiv:2604.25120v2 Announce Type: replace
Abstract: We study clinical trial table reasoning, where answers are not directly stored in visible cells but must be reasoned from semantic understanding through normalization, classification, extraction, or …