SRA: Span Representation Alignment for Large Language Model Distillation
arXiv:2605.01205v1
Abstract: Cross-Tokenizer Knowledge Distillation (CTKD) enables knowledge transfer between a large teacher language model and a smaller student, even when the two employ different tokenizers. While existing approaches mainly…