Archon

Browse and search harvested arxiv metadata.

Learning to Credit the Right Steps: Objective-aware Process Optimization for Visual Generation

2604.19234 cs.CV 2026-04-21 PDF (arxiv)

Rui Li, Ke Hao, Yuanzhi Liang, Haibin Huang, Chi Zhang, Yun Gu, XueLong Li

Reinforcement learning, particularly Group Relative Policy Optimization (GRPO), has emerged as an effective framework for post-training visual generative models with human preference signals. However, its effectiveness is fundamentally limited by coarse reward credit assignment. In modern visual generation, multiple reward models are often used to capture heterogeneous objectives, such as visual q…

Adaptive Slicing-Assisted Hyper Inference for Enhanced Small Object Detection in High-Resolution Imagery

2604.19233 cs.CV 2026-04-21 PDF (arxiv)

Francesco Moretti, Yi Jin, Guiqin Mario

Deep learning-based object detectors have achieved remarkable success across numerous computer vision applications, yet they continue to struggle with small object detection in high-resolution aerial and satellite imagery, where dense object distributions, variable shooting angles, diminutive target sizes, and substantial inter-class variability pose formidable challenges. Existing slicing strateg…

Reliable Remote Inference from Unreliable Components: Joint Communication and Computation Limits

2604.19231 cs.IT 2026-04-21 PDF (arxiv)

Zhenyu Liu, Yi Ma, Rahim Tafazolli

Classical information theory typically assumes reliable receiver-side processing. We study remote inference when communication is noisy and the receiver itself is built from unreliable components under a finite redundancy budget. Under a committed/no-bypass receiver closure, task-relevant information can affect the final estimate only by passing through a budgeted collection of vulnerable primitiv…

Preconditioners for the Onsager-Stefan-Maxwell equations for multicomponent diffusion

2604.19230 math.NA 2026-04-21 PDF (arxiv)

Kars Knook, Aaron Baier-Reinio, Patrick E. Farrell

The Onsager-Stefan-Maxwell (OSM) equations are an important model of mass transport in multicomponent flows with multiple chemical species. They describe the coupling of diffusive fluxes between species, accounting for their interactions through frictional and thermodynamic driving forces. In this work we propose an augmented Lagrangian preconditioner and prove its discretization-robustness for a …

Exact Quadratic Penalty Function for Symplectic Eigenvalue Problem

2604.19229 math.OC 2026-04-21 PDF (arxiv)

Jiaqi Wang, Nachuan Xiao, Xin Liu

The symplectic eigenvalue problem for symmetric positive-definite (spd) matrices plays a crucial role in various scientific fields, including quantum mechanics and control theory. This paper introduces a trace-penalty minimization method, which transforms the symplectic eigenvalue problem into the unconstrained minimization of the trace-penalty function. We prove the equivalence between the penalt…

iCoRe: An Iterative Correlation-Aware Retriever for Bug Reproduction Test Generation

2604.19224 cs.SE 2026-04-21 PDF (arxiv)

Junyi Wang, Jialun Cao, Zhongxin Liu

Automatically generating bug reproduction tests (BRT) from issue descriptions is crucial for software maintenance. LLM-based approaches have shown great potential for this task. Their effectiveness heavily relies on retrieving high-quality context from the codebase. The retrieval phase of existing approaches relies on either traditional methods like BM25 or LLM-driven strategies. LLM-based retriev…

UAF: A Unified Audio Front-end LLM for Full-Duplex Speech Interaction

2604.19221 cs.AI 2026-04-21 PDF (arxiv)

Yadong Li, Guoxin Wu, Haiping Hou, Biye Li

Full-duplex speech interaction, as the most natural and intuitive mode of human communication, is driving artificial intelligence toward more human-like conversational systems. Traditional cascaded speech processing pipelines suffer from critical limitations, including accumulated latency, information loss, and error propagation across modules. To address these issues, recent efforts focus on the …

Thinking Before Matching: A Reinforcement Reasoning Paradigm Towards General Person Re-Identification

2604.19218 cs.CV 2026-04-21 PDF (arxiv)

Quan Zhang, Jingze Wu, Jialong Wang, Xiaohua Xie, Jianhuang Lai, Hongbo Chen

Learning identity-discriminative representations with multi-scene generality has become a critical objective in person re-identification (ReID). However, mainstream perception-driven paradigms tend to fit identities from massive annotated data rather than understand identity-causal cues, which yields representations that are fragile against multiple disruptions. In this work, ReID-R is proposed as …

Sherpa.ai Privacy-Preserving Multi-Party Entity Alignment without Intersection Disclosure for Noisy Identifiers

2604.19219 cs.CR 2026-04-21 PDF (arxiv)

Daniel M. Jimenez-Gutierrez, Enrique Zuazua, Georgios Kellaris, Joaquin Del Rio, Oleksii Sliusarenko, Xabi Uribe-Etxebarria

Federated Learning (FL) enables collaborative model training among multiple parties without centralizing raw data. There are two main paradigms in FL: Horizontal FL (HFL), where all participants share the same feature space but hold different samples, and Vertical FL (VFL), where parties possess complementary features for the same set of samples. A prerequisite for VFL training is privacy-preservi…

Attention-based Multi-modal Deep Learning Model of Spatio-temporal Crop Yield Prediction with Satellite, Soil and Climate Data

2604.19217 cs.CV 2026-04-21 PDF (arxiv)

Gopal Krishna Shyam, Ila Chandrakar

Crop yield prediction is one of the most important challenges for world food security and policy-making decisions. Conventional forecasting techniques are limited in accuracy because they rely on static data sources that do not reflect the dynamic and intricate relationships among environmental variables over time [5,13]. This…

An Object-Centered Data Acquisition Method for 3D Gaussian Splatting using Mobile Phones

2604.19216 cs.CV 2026-04-21 PDF (arxiv)

Yuezhe Zhang, Luqian Bai, Mengting Yu, Lei Wei, Shuai Wan, Yifan Zhang

Data acquisition through mobile phones remains a challenge for 3D Gaussian Splatting (3DGS). In this work we target the object-centered scenario and enable reliable mobile acquisition by providing on-device capture guidance and recording onboard sensor signals for offline reconstruction. After the calibration step, the device orientations are aligned to a baseline frame to obtain relative poses, a…

Conceptual Design and Analysis of a NanoTug Swarm for Active Debris Removal

2604.19214 astro-ph.EP 2026-04-21 PDF (arxiv)

F. Alnaqbi, S. Biktimirov, G. Gaias

This paper investigates a swarm-based concept in which a number of nanosatellites, referred to as NanoTugs, are deployed by a mother spacecraft to capture and cooperatively stabilize and de-orbit space debris. The study focuses on the stabilization and de-orbiting phases of the mission, where each NanoTug is equipped with thrusters to perform the de-orbiting maneuver. An analytical method is devel…

The Logical Expressiveness of Topological Neural Networks

2604.19212 cs.LG 2026-04-21 PDF (arxiv)

Amirreza Akbari, Amauri H. Souza, Vikas Garg

Graph neural networks (GNNs) are the standard for learning on graphs, yet they have limited expressive power, often expressed in terms of the Weisfeiler-Leman (WL) hierarchy or within the framework of first-order logic. In this context, topological neural networks (TNNs) have recently emerged as a promising alternative for graph representation learning. By incorporating higher-order relational str…

ClawNet: Human-Symbiotic Agent Network for Cross-User Autonomous Cooperation

2604.19211 cs.AI 2026-04-21 PDF (arxiv)

Zhiqin Yang, Zhenyuan Zhang, Xianzhang Jia, Jun Song, Wei Xue, Yonggang Zhang, Yike Guo

Current AI agent frameworks have made remarkable progress in automating individual tasks, yet all existing systems serve a single user. Human productivity rests on the social and organizational relationships through which people coordinate, negotiate, and delegate. When agents move beyond performing tasks for one person to representing that person in collaboration with others, the infrastructure f…

VLTI-GRAVITY observations of blazars

2604.19210 astro-ph.GA 2026-04-21 PDF (arxiv)

Talvikki Hovatta, Elina Lindfors, Heidi Korhonen, Preeti Kharb, Markus Wittkowski, Aaron Labdon, Tapio Pursimo, Kaj Wiik

Parsec-scale jets of blazars have so far been spatially resolved only in mm- and submm wavelengths, where very long baseline interferometry can be used to obtain milliarcsecond-scale images of the jets. We have attempted to spatially resolve the near-infrared emission in jet-dominated blazars for the first time. We used the VLTI-GRAVITY instrument to obtain milliarcsecond-scale near-infrared inter…

Forage V2: Knowledge Evolution and Transfer in Autonomous Agent Organizations

2604.19837 cs.AI 2026-04-21 PDF (arxiv)

Huaqing Xie

Autonomous agents operating in open-world tasks -- where the completion boundary is not given in advance -- face denominator blindness: they systematically underestimate the scope of the target space. Forage V1 addressed this through co-evolving evaluation (an independent Evaluator discovers what "complete" means) and method isolation (Evaluator and Planner cannot see each other's code). V2 extend…

Audio Spoof Detection with GaborNet

2604.19209 cs.SD 2026-04-21 PDF (arxiv)

Waldek Maciejko

One direction of development in feature extraction from audio signals is based on processing raw samples in the time domain. Such an approach appears to be effective, especially in the era of neural networks. An example is SincNet. In this solution, the core of the neural network layer is a set of sinc functions that are convolved with the input signal. Due to the finite length of sinc func…

When Can We Trust Deep Neural Networks? Towards Reliable Industrial Deployment with an Interpretability Guide

2604.19206 cs.CV 2026-04-21 PDF (arxiv)

Hang-Cheng Dong, Yuhao Jiang, Yibo Jiao, Lu Zou, Kai Zheng, Bingguo Liu, Dong Ye, Guodong Liu

The deployment of AI systems in safety-critical domains, such as industrial defect inspection, autonomous driving, and medical diagnosis, is severely hampered by their lack of reliability. A single undetected erroneous prediction can lead to catastrophic outcomes. Unfortunately, there is often no alternative but to place trust in the outputs of a trained AI system, which operates without an intern…

Demonstrating Online Schema Alignment in Decentralized Knowledge Graphs Querying

2604.19205 cs.DB 2026-04-21 PDF (arxiv)

Bryan-Elliott Tam, Pieter Colpaert, Ruben Taelman

Decentralized Knowledge Graphs querying enables integrating distributed data without centralization, but is highly sensitive to vocabulary heterogeneity. Query issuers cannot realistically anticipate all vocabulary mismatches, especially when alignment rules are local, scoped, or discovered at runtime. We present an online schema alignment approach for Link Traversal Query Processing (LTQP) that d…

Auditing LLMs for Algorithmic Fairness in Casenote-Augmented Tabular Prediction

2604.19204 cs.CY 2026-04-21 PDF (arxiv)

Xiao Qi Lee, Ezinne Nwankwo, Angela Zhou

LLMs are increasingly being considered for prediction tasks in high-stakes social service settings, but their algorithmic fairness properties in this context are poorly understood. In this short technical report, we audit the algorithmic fairness of LLM-based tabular classification on a real housing placement prediction task, augmented with street outreach casenotes from a nonprofit partner. We au…

SketchFaceGS: Real-Time Sketch-Driven Face Editing and Generation with Gaussian Splatting

2604.19202 cs.GR 2026-04-21 PDF (arxiv)

Bo Li, Jiahao Kang, Yubo Ma, Feng-Lin Liu, Bin Liu, Fang-Lue Zhang, Lin Gao

3D Gaussian representations have emerged as a powerful paradigm for digital head modeling, achieving photorealistic quality with real-time rendering. However, intuitive and interactive creation or editing of 3D Gaussian head models remains challenging. Although 2D sketches provide an ideal interaction modality for fast, intuitive conceptual design, they are sparse, depth-ambiguous, and lack high-f…

Cascaded Code Editing: Large-Small Model Collaboration for Effective and Efficient Code Editing

2604.19201 cs.SE 2026-04-21 PDF (arxiv)

Chaozheng Wang, Zezhou Yang, Shuzheng Gao, Cuiyun Gao, Zongjie Li, Yichen Li, Ting Peng, Hailiang Huang, Yuetang Deng, Michael R. Lyu

Code editing constitutes a fundamental practice in software development, wherein developers modify existing codebases according to natural language requirements. Accurate code editing necessitates a comprehensive understanding of both the existing codebase and the modification requirements. Although large language models (LLMs) have demonstrated promising performance in code editing tasks, they su…

Cosmological constraints on TeV-scale dark matter subcomponents decaying between recombination and reionisation

2604.19198 astro-ph.CO 2026-04-21 PDF (arxiv)

Markus R. Mosbech, Cristina Benso, Felix Kahlhoefer

The Dark Ages and the Cosmic Dawn are an untapped well of information about the particle physics properties of dark matter, which may become accessible with future radio telescopes able to probe the 21-cm signal from atomic hydrogen. In this work we study the impact on cosmological observables of a dark matter subcomponent composed of TeV-scale particles that decay into electrons, photons or neutr…

Benchmarking Vision Foundation Models for Domain-Generalizable Face Anti-Spoofing

2604.19196 cs.CV 2026-04-21 PDF (arxiv)

Mika Feng, Pierre Gallin-Martel, Koichi Ito, Takafumi Aoki

Face Anti-Spoofing (FAS) remains challenging due to the requirement for robust domain generalization across unseen environments. While recent trends leverage Vision-Language Models (VLMs) for semantic supervision, these multimodal approaches often demand prohibitive computational resources and exhibit high inference latency. Furthermore, their efficacy is inherently limited by the quality of the u…

How Far Are Video Models from True Multimodal Reasoning?

2604.19193 cs.CV 2026-04-21 PDF (arxiv)

Xiaotian Zhang, Jianhui Wei, Yuan Wang, Jie Tan, Yichen Li, Yan Zhang, Ziyi Chen, Daoan Zhang, Dezhi YU, Wei Xu, Songtao Jiang, Zuozhu Liu

Despite remarkable progress toward general-purpose video models, a critical question remains unanswered: how far are these models from achieving true multimodal reasoning? Existing benchmarks fail to address this question rigorously, as they remain constrained by straightforward task designs and fragmented evaluation metrics that neglect complex multimodal reasoning. To bridge this gap, we introdu…
