Archon

Browse and search harvested arxiv metadata.

1273993 results (page 120 of 50960)

CombiMOTS: Combinatorial Multi-Objective Tree Search for Dual-Target Molecule Generation

2604.23307 cs.LG 2026-04-25 PDF (arxiv)

Thibaud Southiratn, Bonil Koo, Yijingxiu Lu, Sun Kim

Dual-target molecule generation, which focuses on discovering compounds capable of interacting with two target proteins, has garnered significant attention due to its potential for improving therapeutic efficiency, safety and resistance mitigation. Existing approaches face two critical challenges. First, by simplifying the complex dual-target optimization problem to scalarized combinations of indi…

Summation-by-parts operators for general function spaces: optimal nodes

2604.23306 math.NA 2026-04-25 PDF (arxiv)

Nicholas Hale, Charis Harley, Prince Nchupang, Jan Nordström

Gauss-Lobatto quadrature nodes and weights are optimal for closed summation-by-parts (SBP) formulations based on polynomial approximation spaces in the sense that for a prescribed function space they yield an SBP operator of minimal dimension. We show that the same principle extends to general (possibly non-polynomial) function spaces: an associated generalised Gauss-Lobatto quadrature provides th…

gateau: an observation simulator for ground-based submillimeter astronomy with integral field units and kinetic inductance detectors

2604.23305 astro-ph.IM 2026-04-25 PDF (arxiv)

A. Moerman, N. Soshnin, S. A. Brackenhoff, S. O. Dabironezare, K. Karatsu, L. H. Marting, S. A. H. de Rooij, M. Roos, B. R. Brandl, A. Endo

Submillimeter (submm) integral field units (IFUs) utilising kinetic inductance detectors (KIDs) are a promising instrument architecture for the study of galaxies, galaxy clusters, and the large-scale structure of the Universe. In order to design successful experiments targeting these science cases, several aspects such as instrument design, observation and calibration strategies, and data reductio…

Proteus: Shapeshifting Desktop Visualizations for Mobile via Multi-level Intelligent Adaptation

2604.23299 cs.HC 2026-04-25 PDF (arxiv)

Can Liu, Sizhe Cheng, Feng Liang, Zhibang Jiang, Lingru Huang, Kavinda Athapaththu, Yong Wang

With the rise of mobile-first consumption, users increasingly engage with data visualizations on mobile devices. However, the vast majority of existing visualizations are originally authored for desktop environments. Due to significant differences in viewport size and interaction paradigms, directly scaling desktop charts often results in illegible text, information loss, and interaction failures.…

$\mathcal{S}^2$IT: Stepwise Syntax Integration Tuning for Large Language Models in Aspect Sentiment Quad Prediction

2604.23296 cs.CL 2026-04-25 PDF (arxiv)

Bingfeng Chen, Chenjie Qiu, Yifeng Xie, Boyan Xu, Ruichu Cai, Zhifeng Hao

Aspect Sentiment Quad Prediction (ASQP) has seen significant advancements, largely driven by the powerful semantic understanding and generative capabilities of large language models (LLMs). However, while syntactic structure information has been proven effective in previous extractive paradigms, it remains underutilized in the generative paradigm of LLMs due to their limited reasoning capabilities…

Human-1 by Josh Talks: A Full-Duplex Conversational Modeling Framework in Hindi using Real-World Conversations

2604.23295 cs.CL 2026-04-25 PDF (arxiv)

Bhaskar Singh, Shobhit Banga, Pranav Sharma

Full-duplex spoken dialogue systems can model natural conversational behaviours such as interruptions, overlaps, and backchannels, yet such systems remain largely unexplored for Indian languages. We present the first open, reproducible full-duplex spoken dialogue system for Hindi by adapting Moshi, a state-of-the-art duplex speech architecture, using a custom Hindi tokeniser and training on 26,000…

An Analysis of Active Learning Algorithms using Real-World Crowd-sourced Text Annotations

2604.23290 cs.LG 2026-04-25 PDF (arxiv)

Varun Totakura, Ankita Singh, Yushun Dong, Shayok Chakraborty

Active learning algorithms automatically identify the most informative samples from large amounts of unlabeled data and tremendously reduce human annotation effort in inducing a machine learning model. In a conventional active learning setup, the labeling oracles are assumed to be infallible, that is, they always provide correct answers (in terms of class labels) to the queried unlabeled instances…

MetaErr: Towards Predicting Error Patterns in Deep Neural Networks

2604.23289 cs.CV 2026-04-25 PDF (arxiv)

Varun Totakura, Shayok Chakraborty

Due to the unprecedented success of deep learning, it has become an integral component in several multimedia computing applications in today's world. Unfortunately, deep learning systems are not perfect and can fail, sometimes abruptly, without prior warning or explanation. While reducing the error rate of deep neural networks has been the primary focus of the multimedia community, the problem of p…

Towards Agentic Test-Driven Quality Assurance for 6G Networks

2604.23285 cs.NI 2026-04-25 PDF (arxiv)

Christos Tranoris, Besiana Agko, Kostis Trantzas, Irene Denazi

This work proposes an agentic, intent-driven end-to-end (E2E) orchestration framework that integrates intent co-creation with a Test-Driven Quality Assurance paradigm. In this framework, autonomous agents iteratively refine a user's initial intent into a confirmed, auditable specification. Furthermore, the system automatically derives validation tests from these intents before provisioning, direct…

Au-M-ol: A Unified Model for Medical Audio and Language Understanding

2604.23284 cs.CL 2026-04-25 PDF (arxiv)

Meizhu Liu, Nistha Mitra, Paul Li, Amine Abdaoui, Adam Ledyard, Tao Sheng

In this work, we present Au-M-ol, a novel multimodal architecture that extends Large Language Models (LLMs) with audio processing. It is designed to improve performance on clinically relevant tasks such as Automatic Speech Recognition (ASR). Au-M-ol has three main components: (1) an audio encoder that extracts rich acoustic features from medical speech, (2) an adaptation layer that maps audio feat…

Revisable by Design: A Theory of Streaming LLM Agent Execution

2604.23283 cs.LG 2026-04-25 PDF (arxiv)

Zhiyuan Zhai, Ming Li, Xin Wang

Current LLM agents operate under an implicit but universal assumption: execution is a transaction -- the user submits a request, the agent works in isolation, and only upon completion does the dialogue resume. This forces users into a binary choice: wait for a potentially incorrect output, or interrupt and lose all progress. We reject this assumption and propose the stream paradigm, in which agent…

Bridging the Pose-Semantic Gap: A Cascade Framework for Text-Based Person Anomaly Search

2604.23282 cs.CV 2026-04-25 PDF (arxiv)

Zequn Xie, Guijin Luo, Chuxin Wang, Sihang Cai, Tao Jin, Zhou Zhao, Yixuan Tang

Text-based person anomaly search retrieves specific behavioral events from surveillance archives using natural-language queries. Although recent pose-aware methods align geometric structures well, they face a fundamental Pose-Semantic Gap: semantically different actions can share similar skeletal geometries. While Multimodal Large Language Models (MLLMs) can reduce this ambiguity, using them for l…

Contrastive Learning for Multimodal Human Activity Recognition with Limited Labeled Data

2604.23281 cs.LG 2026-04-25 PDF (arxiv)

Long Jing, Zhixiong Yang, Yajun Zhang, Xinlong Feng

Human activity recognition serves as the foundation for various emerging applications. In recent years, researchers have used collaborative sensing of multi-source sensors to capture complex and dynamic human activities. However, multimodal human activity sensing typically encounters highly heterogeneous data across modalities and label scarcity, resulting in an application gap between existing so…

AI Identity: Standards, Gaps, and Research Directions for AI Agents

2604.23280 cs.AI 2026-04-25 PDF (arxiv)

Takumi Otsuka, Kentaroh Toyoda, Alex Leung

AI agents are now running real transactions, workflows, and sub-agent chains across organizational boundaries without continuous human supervision. This creates a problem no current infrastructure is equipped to solve: how do you identify, verify, and hold accountable an entity with no body, no persistent memory, and no legal standing? We define AI Identity as the continuous relationship between w…

Active Inference: A method for Phenotyping Agency in AI systems?

2604.23278 cs.AI 2026-04-25 PDF (arxiv)

Philip Wilson, Axel Constant, Mahault Albarracin, Nicolás Hinrichs, Jasmine Moore, Daniel Polani, Karl Friston

The proliferation of agentic artificial intelligence has outpaced the conceptual tools needed to characterize agency in computational systems. Prevailing definitions mainly rely on autonomy and goal-directedness. Here, we argue for a minimal notion open to principled inspection given three criteria: intentionality as action grounded in beliefs and desires, rationality as normatively coherent actio…

From Similarity to Structure: Training-free LLM Context Compression with Hybrid Graph Priors

2604.23277 cs.CL 2026-04-25 PDF (arxiv)

Yitian Zhou, Chaoning Zhang, Jiaquan Zhang, Zhenzhen Huang, Jinyu Guo, Sung-Ho Bae, Lik-Hang Lee, Caiyan Qin, Yang Yang

Long-context large language models remain computationally expensive to run and often fail to reliably process very long inputs, which makes context compression an important component of many systems. Existing compression approaches typically rely on trained compressors, dense retrieval-style selection, or heuristic trimming, and they often struggle to jointly preserve task relevance, topic coverag…

Lightweight and Production-Ready PDF Visual Element Parsing

2604.23276 cs.CV 2026-04-25 PDF (arxiv)

Meizhu Liu, Yassi Abbasi, Matthew Rowe, Michael Avendi, Paul Li

PDF documents contain critical visual elements such as figures, tables, and forms whose accurate extraction is essential for document understanding and multimodal retrieval-augmented generation (RAG). Existing PDF parsers often miss complex visuals, extract non-informative artifacts (e.g., watermarks, logos), produce fragmented elements, and fail to reliably associate captions with their correspon…

SemiGDA: Generative Dual-distribution Alignment for Semi-Supervised Medical Image Segmentation

2604.23274 cs.CV 2026-04-25 PDF (arxiv)

Kaiwen Huang, Yi Zhou, Yizhe Zhang, Jingxiong Li, Tao Zhou

Semi-supervised learning addresses label scarcity and high annotation costs in medical image segmentation by exploiting the latent information in unlabeled data to enhance model performance. Traditional discriminative segmentation relies on segmentation masks, neglecting feature-level distribution constraints. This limits robust semantic representation learning and adaptive modeling of unlabeled d…

Modular Sensory Stream for Integrating Physical Feedback in Vision-Language-Action Models

2604.23272 cs.RO 2026-04-25 PDF (arxiv)

Jimin Lee, Huiwon Jang, Myungkyu Koo, Jungwoo Park, Jinwoo Shin

Humans understand and interact with the real world by relying on diverse physical feedback beyond visual perception. Motivated by this, recent approaches attempt to incorporate physical sensory signals into Vision-Language-Action models (VLAs). However, they typically focus on a single type of physical signal, failing to capture the heterogeneous and complementary nature of real-world interactions…

A Hierarchical Ensemble Inference Pipeline for Robust White Blood Cell Classification Under Domain Shifts

2604.23271 cs.CV 2026-04-25 PDF (arxiv)

Ruyi Dai, Tingkwong Ng, Hao Chen

Automated white blood cell (WBC) classification is essential for scalable leukaemia screening. However, real-world deployment is challenged by domain shifts caused by staining protocols, scanner characteristics, and inter-laboratory variability, which often degrade model performance. The White Blood Cell Classification Challenge (WBCBench) at ISBI 2026 aims to advance robust WBC recognition, with …

CAP-CoT: Cycle Adversarial Prompt for Improving Chain of Thoughts in LLM Reasoning

2604.23270 cs.AI 2026-04-25 PDF (arxiv)

Shuxu Chen, Yitian Zhou, Jiaquan Zhang, Haoyu Bian, Aming Wu, Sungyoung Lee, Chaoning Zhang, Hyundong Shin

Chain-of-Thought (CoT) prompting has emerged as a simple and effective way to elicit step-by-step solutions from large language models (LLMs). However, CoT reasoning can be unstable across runs on long, multi-step problems, leading to inconsistent answers for an unchanged task. Most prior work focuses on improving the forward reasoning chain within a single pass, with less attention to iterative and …

WSINDy for Model Predictive Control with Applications to Fusion, Drones, and Chaos

2604.23269 math.DS 2026-04-25 PDF (arxiv)

Cristian López, Mckenna Partridge, Sebastian De Pascuale, Jeremy Lore, Andrew Christlieb, Stephen Becker, David M. Bortz

The control of complex dynamical systems remains a fundamental challenge in science and engineering, where strong nonlinearities, the presence of noise, and computational constraints often pose significant obstacles to traditional control approaches. Recent advances in data-driven methods, particularly system identification techniques, have emerged as a powerful alternative by providing fast, parsimoni…

LatentBurst: A Fast and Efficient Multi Frame Super-Resolution for Hexadeca-Bayer Pattern CIS images

2604.23268 cs.CV 2026-04-25 PDF (arxiv)

Sangwook Baek, Vin Van Duong, Karam Park, Pilkyu Park

This paper introduces a novel multi-frame super-resolution network (MFSR) for burst hexadeca-Bayer pattern Contact Image Sensor (CIS) images, which includes demosaicing, denoising, multi-frame fusion, and super-resolution. Designing a high-quality reconstruction network poses several challenges: 1) Unlike the Bayer color filter array (CFA) pattern, it is hard to interpolate hexadeca-Bay…

Fine-tuning vs. In-context Learning in Large Language Models: A Formal Language Learning Perspective

2604.23267 cs.CL 2026-04-25 PDF (arxiv)

Bishwamittra Ghosh, Soumi Das, Till Speicher, Qinyuan Wu, Mohammad Aflah Khan, Deepak Garg, Krishna P. Gummadi, Evimaria Terzi

Large language models (LLMs) operate in two fundamental learning modes - fine-tuning (FT) and in-context learning (ICL) - raising key questions about which mode yields greater language proficiency and whether they differ in their inductive biases. Prior studies comparing FT and ICL have yielded mixed and inconclusive results due to inconsistent experimental setups. To enable a rigorous comparison,…

The Blockchain Execution Dilemma: Optimizing Revenue XOR Fair Ordering

2604.23266 cs.DC 2026-04-25 PDF (arxiv)

Artjom Pugatsov, Can Umut Ileri, Jérémie Decouchant

The successive generations of consensus algorithms have progressively shifted the performance bottleneck of blockchains to the execution layer. While recent works address this by parallelizing transaction execution, they often overlook the critical role of transaction sequencing. Historically, transaction ordering was left to validator discretion, a practice prone to Maximal Extractable Value (MEV…
