Archon

Browse and search harvested arxiv metadata.

1273993 results (page 125 of 50960)

Reducing Detail Hallucinations in Long-Context Regulatory Understanding via Targeted Preference Optimization

2604.23113 cs.SI 2026-04-25 PDF (arxiv)

Yang Liu, Bin Chong, Yuhan Lin, Chongyang Zhang, Hao Zheng, Ziyi Zhang, Jiayu Liang, Ran Ran, Qian Li, Kefu Xu

Large language models (LLMs) frequently produce detail hallucinations when processing long regulatory documents, including subtle errors in threshold values, units, scopes, obligation levels, and conditions that preserve surface plausibility while corrupting safety-critical parameters. We formalize this phenomenon through a fine-grained Detail Error Taxonomy of five error types and i…

Conditional Imputation for Within-Modality Missingness in Multi-Modal Federated Learning

2604.23112 cs.LG 2026-04-25 PDF (arxiv)

Wugeng Zheng, Ziwen Kan, Katie Wang, Chen Chen, Song Wang

Multimodal Federated Learning (MMFL) enables privacy-preserving collaborative training, but real-world clinical applications often suffer from within-modality missingness caused by sensor intermittency or irregular sampling. Existing methods implicitly represent unobserved data via architectural alignment or missing embeddings, often failing to recover the true distribution and yielding sub-optima…

Mixture of Heterogeneous Grouped Experts for Language Modeling

2604.23108 cs.CL 2026-04-25 PDF (arxiv)

Zhicheng Ma, Xiang Liu, Zhaoxiang Liu, Ning Wang, Yi Shen, Kai Wang, Shuming Shi, Shiguo Lian

Large Language Models (LLMs) based on Mixture-of-Experts (MoE) are pivotal in industrial applications for their ability to scale performance efficiently. However, standard MoEs enforce uniform expert sizes, creating a rigidity that fails to align computational costs with varying token-level complexity. While heterogeneous expert architectures attempt to address this by diversifying expert sizes, th…

MOCA: A Transformer-based Modular Causal Inference Framework with One-way Cross-attention and Cutting Feedback

2604.23107 stat.ML 2026-04-25 PDF (arxiv)

Lei Wang, Debashis Ghosh

Causal effect estimation from observational data requires careful adjustment for confounding. Classical estimators such as inverse probability weighting and augmented inverse probability weighting are effective under favorable model specification, but may become unstable when treatment assignment and outcome mechanisms are complex, non-linear, and high-dimensional. Machine learning and representat…

No Test Cases, No Problem: Distillation-Driven Code Generation for Scientific Workflows

2604.23106 cs.SE 2026-04-25 PDF (arxiv)

Siddeshwar Raghavan, Tanwi Mallick

Existing multi-agent Large Language Model (LLM) frameworks for code generation typically use execution feedback and improve iteratively using Input/Output (I/O) test cases. However, this does not work for scientific workflows, where I/O test cases do not exist, and generating them requires solving the very problem at hand. To address this, we introduce MOSAIC, a training-free multi-agent framework…

Transferable Physical-World Adversarial Patches Against Object Detection in Autonomous Driving

2604.23105 cs.CV 2026-04-25 PDF (arxiv)

Zihui Zhu, Ziqi Zhou, Yichen Wang, Lulu Xue, Minghui Li, Shengshan Hu

Deep learning drives major advances in autonomous driving (AD), where object detectors are central to perception. However, adversarial attacks pose significant threats to the reliability and safety of these systems, with physical adversarial patches representing a particularly potent form of attack. Physical adversarial patch attacks pose severe risks but are usually crafted for a single model, yi…

Rank One Completion for Higher Order Tensors

2604.23104 math.NA 2026-04-25 PDF (arxiv)

Linghao Zhang, Ioana Dumitriu, Jiawang Nie

We study the rank one completion problem for tensors of arbitrary orders. The notion of rank one determinable tensors is introduced. We explore its properties and propose a recursive algorithm for computing rank one tensor completion. This algorithm only requires solving linear systems and computing singular vectors. In the absence of noise, it produces a unique rank one completion under some assu…

UHECR doublets and their conditional association with nearby radio galaxies

2604.23103 astro-ph.HE 2026-04-25 PDF (arxiv)

Victor Barbosa Martins

The origin of ultra-high-energy cosmic rays (UHECRs) remains a fundamental question in astroparticle physics. While localized $3\sigma$ correlations with active galactic nuclei and starburst galaxies have been reported using time-integrated analyses, we propose and implement a spatiotemporal multiplet search method utilizing a pre-defined fixed window of 3 degrees and 15 days, a kinematic filter desig…

Unstable Rankings in Bayesian Deep Learning Evaluation

2604.23102 cs.LG 2026-04-25 PDF (arxiv)

Qishi Zhan, Minxuan Hu, Guansu Wang, Jiaxin Liu, Liang He

Standard evaluations of Bayesian deep learning methods assume that metric estimates are reliable, but we show this assumption fails under data scarcity. Method rankings are not only unreliable at small $n$, but also dataset-dependent in ways that point estimates cannot reveal: the same method comparison yields $P(\mathrm{MCD} \prec \mathrm{Ensemble}) = 1.000$ at $n = 50$ on one dataset and remains…

Compressed Traffic Assignment with the Augmented Lagrangian Method

2604.23101 math.OC 2026-04-25 PDF (arxiv)

Xuesong Zhou, Peiheng Li, Yuchao Li, Dimitri Bertsekas

We consider large-scale traffic assignment problems and develop a path-based compression framework. In particular, we partition paths into major and minor paths according to a set of nominal flows and a prescribed threshold, and retain the major paths explicitly. For the minor paths, we introduce a low-dimensional representation based on a truncated singular value decomposition of the minor path-l…

From Language to Logic: Bridging LLMs & Formal Representations for RTL Assertion Generation

2604.23100 cs.CR 2026-04-25 PDF (arxiv)

Nowfel Mashnoor, Hadi Kamali, Kimia Azar

SystemVerilog Assertions (SVA) are essential for formal verification of digital hardware, yet their manual creation demands significant expertise in both the design under verification and temporal logic. Recent studies have explored using large language models (LLMs) to automate SVA generation, but existing approaches suffer from incorrect signal references, missing timing constraints, and lack of…

ProEval: Proactive Failure Discovery and Efficient Performance Estimation for Generative AI Evaluation

2604.23099 cs.LG 2026-04-25 PDF (arxiv)

Yizheng Huang, Wenjun Zeng, Aditi Kumaresan, Zi Wang

Evaluating generative AI models is increasingly resource-intensive due to slow inference, expensive raters, and a rapidly growing landscape of models and benchmarks. We propose ProEval, a proactive evaluation framework that leverages transfer learning to efficiently estimate performance and identify failure cases. ProEval employs pre-trained Gaussian Processes (GPs) as surrogates for the performan…

On the hull of linearized polynomial codes

2604.23097 cs.IT 2026-04-25 PDF (arxiv)

Daniele Bartoli, Giovanni Giuseppe Grimaldi, Pantelimon Stănică

Motivated by entanglement-assisted quantum error-correcting codes, where the hull dimension determines the number of required pre-shared entangled pairs, we study hulls of two families of $\mathbb{F}_q$-linear codes defined by $q$-polynomial operators over $\mathbb{F}_{q^m}$. Our main tool is a unified Gram-matrix method. For image codes $\mathcal{C}(\boldsymbol{\alpha})=\operatorname{im}\Phi_{\boldsymbol{\alpha}}…

INSIGHT: Indoor Scene Intelligence from Geometric-Semantic Hierarchy Transfer for Public Safety

2604.23095 cs.CV 2026-04-25 PDF (arxiv)

Alexander Nikitas Dimopoulos, Joseph Grasso, John Beltz

Indoor environments lack the spatial intelligence infrastructure that GPS provides outdoors; first responders arriving at unfamiliar buildings typically have no machine-readable map of safety equipment. Prior work on 3D semantic segmentation for public safety identified two barriers: scarcity of labeled indoor training data and poor recognition of small safety-critical features by native point-clo…

Toward Real-World Adoption of Portrait Relighting via Hybrid Domain Knowledge Fusion

2604.23094 cs.CV 2026-04-25 PDF (arxiv)

Qian Huang, Mayoore Selvarasa Jaiswal, Zhen Zhong, Rochelle Pereira, Jianyuan Min

The real-world adoption of portrait relighting is hindered by dataset domain gaps, camera sensitivity, and computational costs. We address these challenges with Hybrid Domain Knowledge Fusion, a paradigm that fuses the specialized strengths of synthetic, One-Light-at-A-Time (OLAT), and real-world datasets into a compact model. Our approach features specialized prior models hardened by domain-aware…

Analysis of a Septuple Open Cluster System and Its Extended Family in Gaia DR3

2604.23093 astro-ph.GA 2026-04-25 PDF (arxiv)

Muhammad Akmal Husain, Ferdinand, Mochamad Ikbal Arifyanto, Muhammad Irfan Hakim

A rare multiple open cluster system has been analyzed using Gaia DR3 astrometry and photometry data. Using Agglomerative Hierarchical clustering and Bayesian-HDBSCAN, we identify a compact core consisting of seven known open clusters and two additional components, including a new candidate, forming a nine-member association. Membership probabilities are refined through statistical modeling, combin…

Channel Adaptation for EEG Foundation Models: A Systematic Benchmark Across Architectures, Tasks, and Training Regimes

2604.23091 cs.LG 2026-04-25 PDF (arxiv)

Kuntal Kokate, Bruno Aristimunha, Dung Truong, Arnaud Delorme

Scaling EEG foundation models requires pooling data across heterogeneous electrode montages, a prerequisite both for larger pretraining corpora and for downstream deployment. We present the first systematic comparison of four channel adaptation methods (Conv1d projection, spherical spline interpolation (SSI), source-space decomposition, and Riemannian re-centering) across five pretrained EEG found…

Towards Automated Ontology Generation from Unstructured Text: A Multi-Agent LLM Approach

2604.23090 cs.AI 2026-04-25 PDF (arxiv)

Abid Talukder, Maruf Ahmed Mridul, Oshani Seneviratne

Automatically generating formal ontologies from unstructured natural language remains a central challenge in knowledge engineering. While large language models (LLMs) show promise, it remains unclear which architectural design choices drive generation quality and why current approaches fail. We present a controlled experimental study using domain-specific insurance contracts to investigate these q…

Code Broker: A Multi-Agent System for Automated Code Quality Assessment

2604.23088 cs.SE 2026-04-25 PDF (arxiv)

Samer Attrah

We present Code Broker, a multi-agent system built with the Google Agent Development Kit (ADK) that analyses Python code from files, local directories, or GitHub repositories and generates actionable quality assessment reports. The system employs a hierarchical five-agent architecture in which a root orchestrator coordinates a sequential pipeline agent, which in turn dispatches three specialised agents…

Using Importance Sampling to Estimate $p$-values in All-Subset Meta-Analysis, with Applications to Single-Cell eQTL Mapping

2604.23085 stat.ME 2026-04-25 PDF (arxiv)

Samuel Anyaso-Samuel, Thong Luong, Fei Qin, Jiyeon Choi, Kai Yu, Paul S. Albert, Jianxin Shi

Pooling genome-wide association studies of multiple related traits can substantially increase power for detecting genetic variants with pleiotropic effects. ASSET, which exhaustively searches all subsets of studies for association signals, has been widely used to detect modest effects and improve interpretability. Under a normality assumption, ASSET computes p-values via an analytic approximation …

Multi-Viewpoint Observation of a Failed Prominence Eruption on the Sun

2604.23084 astro-ph.SR 2026-04-25 PDF (arxiv)

Tingyu Gou, Katharine K. Reeves, Peter R. Young, Astrid M. Veronig, Xingyao Chen, Sijie Yu, Bin Chen, Bin Zhuang

Solar eruptions are sudden ejections of coronal mass and magnetic fields accompanied by intense energy release. The eruptive structure does not always erupt successfully, but sometimes fails to escape the Sun after initiation. The failure of an eruption, however, provides an invaluable opportunity for understanding the intricate mechanism of eruptions. We present a comprehensive observation of a f…

Turtle shell clustering: A mixture approach to discriminative clustering with applications to flow cytometry and other data

2604.23083 stat.ML 2026-04-25 PDF (arxiv)

Mackenzie R. Neal, Paul D. McNicholas, Arthur White

Generative approaches to clustering provide information on geometric properties of clusters, whereas discriminative approaches provide boundaries between clusters. Ideas from both approaches are incorporated to present a fully unsupervised, probabilistic, and discriminative clustering method via a regularized mutual information objective function, wherein a mixture of mixtures of Gaussian and unif…

Visual Accessibility in a Virtual Kitchen: Effects of Open Shelving on Performance, Cognitive Load, and Experience in Older Adults with and without MCI

2604.23081 cs.HC 2026-04-25 PDF (arxiv)

Ibrahim Bilau, Eunhwa Yang, Hyeokhyen Kwon, Stacie Smith, Bruce Walker, Hui Cai, Ece Erdogmus, Omobolanle Ogunseiju

This study examines how visual accessibility through cabinet design influences task performance, cognitive load, physical activity level, motivation, and user experience in a virtual kitchen among older adults with and without mild cognitive impairment (MCI). Seventeen older adults (7 with MCI, 10 without) completed a repeated-measures item retrieval task under two conditions, closed cabinets and …

Usable Agent Discovery for Decentralized AI Systems

2604.23080 cs.MA 2026-04-25 PDF (arxiv)

Patrizio Dazzi, Emanuele Carlini, Matteo Mordacchini, Saul Urso

Large-scale agentic systems run on distributed infrastructures where many software agents share physical hosts and are discovered via peer-to-peer mechanisms. Discovery must handle node-level churn from failures and host departures and agent-level churn from demand-driven activation, deactivation, and state changes. Their interaction reshapes classic trade-offs between structured and unstructured …

From Pixels to Explanations: Interpretable Diabetic Retinopathy Grading with CNN-Transformer Ensembles, Visual Explainability and Vision-Language Models

2604.23079 cs.CV 2026-04-25 PDF (arxiv)

Pir Bakhsh Khokhar, Carmine Gravino, Fabio Palomba, Sule Yildirim Yayilgan, Sarang Shaikh

The quality of diabetic retinopathy (DR) screening relies on the ability to correctly grade severity; however, many deep-learning (DL) classifiers cannot be easily interpreted in the clinical context. This study presents a methodology that combines strong discriminative models with multimodal explanations, converting retinal pixels into clinically interpretable outputs. Using the APTOS 2019 benchm…
