- Provide.ai - Page 524

Training LLMs for Multi-Step Tool Orchestration with Constrained Data Synthesis and Graduated Rewards

/ March 27, 2026

arXiv:2603.24709v1 Announce Type: cross
Abstract: Multi-step tool orchestration, where LLMs must invoke multiple dependent APIs in the correct order while propagating intermediate outputs, remains challenging. State-of-the-art models frequently fail o…

cs.AI, cs.CV

TAG-MoE: Task-Aware Gating for Unified Generative Mixture-of-Experts

/ March 27, 2026

arXiv:2601.08881v2 Announce Type: replace
Abstract: Unified image generation and editing models suffer from severe task interference in dense diffusion transformer architectures, where a shared parameter space must compromise between conflicting obje…

cs.AI, cs.CL, cs.SE

SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks

/ March 27, 2026

arXiv:2603.24755v1 Announce Type: cross
Abstract: Software development is iterative, yet agentic coding benchmarks overwhelmingly evaluate single-shot solutions against complete specifications. Code can pass the test suite but become progressively har…

cs.CV

MedOpenClaw: Auditable Medical Imaging Agents Reasoning over Uncurated Full Studies

/ March 27, 2026

arXiv:2603.24649v1 Announce Type: new
Abstract: Currently, evaluating vision-language models (VLMs) in medical imaging tasks oversimplifies clinical reality by relying on pre-selected 2D images that demand significant manual labor to curate. This setu…

cs.AI, cs.CV, cs.LG, cs.MM, cs.SD

SAVe: Self-Supervised Audio-visual Deepfake Detection Exploiting Visual Artifacts and Audio-visual Misalignment

/ March 27, 2026

arXiv:2603.25140v1 Announce Type: new
Abstract: Multimodal deepfakes can exhibit subtle visual artifacts and cross-modal inconsistencies, which remain challenging to detect, especially when detectors are trained primarily on curated synthetic forgerie…

Artificial Intelligence, Government, Industry, Laws and Regulations, Markets

European Parliament delays implementation of parts of the EU AI Act

/ March 27, 2026

The European Parliament’s Thursday vote to delay parts of the EU AI Act adds more uncertainty to the already chaotic AI compliance universe. But analysts say that CIOs must proceed as though the compliance rules are in effect.

…

Apple, Computers and Peripherals, IT Operations, Mac, MacOS

Hexnode CEO: MacBook Neo forces IT to rethink its budget laptop strategy

/ March 26, 2026

Apple’s MacBook Neo challenges what we expect from budget laptops. Accompanied by shrewd enterprise-focused moves, the new model gives Apple a chance to convert hitherto resistant IT purchasers to adopt its platfo…

Commercial Providers, Computer Components, Computers, Computers and Peripherals, CPUs and Processors, Dell, Desktop PCs, HP, Intel, Laptops, Vendors and Providers

Enterprise laptops adopt Intel’s new Core Ultra Series 3 chips

/ March 26, 2026

Intel’s Core Ultra Series 3 processors with Intel vPro, built for business PCs, are off to a fast start, already powering more than 125 designs, including newly announced systems from Dell and HP, the company said.

Unveiled thi…

Research Blog

GroundedPlanBench: Spatially grounded long-horizon task planning for robot manipulation

/ March 26, 2026

Vision-language models (VLMs) use images and text to plan robot actions, but they still struggle to decide what actions to take and where to take them. Most systems split these decisions into two steps: a VLM generates a plan in natural language, and a separate model translates it into executable actions. This approach often breaks […]

The post GroundedPlanBench: Spatially grounded long-horizon task planning for robot manipulation appeared first on Microsoft Research.