Archon

Browse and search harvested arxiv metadata.

1273993 results (page 133 of 50960)

Can QPP Choose the Right Query Variant? Evaluating Query Variant Selection for RAG Pipelines

2604.22661 cs.IR 2026-04-24 PDF (arxiv)

Negar Arabzadeh, Andrew Drozdov, Michael Bendersky, Matei Zaharia

Large Language Models (LLMs) have made query reformulation ubiquitous in modern retrieval and Retrieval-Augmented Generation (RAG) pipelines, enabling the generation of multiple semantically equivalent query variants. However, executing the full pipeline for every reformulation is computationally expensive, motivating selective execution: can we identify the best query variant before incurring dow…

RealBench: A Repo-Level Code Generation Benchmark Aligned with Real-World Software Development Practices

2604.22659 cs.SE 2026-04-24 PDF (arxiv)

Jia Li, Hongyi Deng, Yiran Zhang, Kechi Zhang, Tianqi Shao, Tiankuo Zhao, Weinan Wang, Zhi Jin, Ge Li, Yang Liu, Yingtao Fang, Yihong Dong

Writing code requires significant time and effort in software development. To automate this process, researchers have made substantial progress using Large Language Models (LLMs) for code generation. Many benchmarks like HumanEval and EvoCodeBench have been created to evaluate LLMs by requiring them to generate code from natural language requirements. However, in enterprise applications and team d…

PASR: Pose-Aware 3D Shape Retrieval from Occluded Single Views

2604.22658 cs.CV 2026-04-24 PDF (arxiv)

Jiaxin Shi, Guofeng Zhang, Wufei Ma, Naifu Liang, Adam Kortylewski, Alan Yuille

Single-view 3D shape retrieval is a fundamental yet challenging task that is increasingly important with the growth of available 3D data. Existing approaches largely fall into two categories: those using contrastive learning to map point cloud features into existing vision-language spaces and those that learn a common embedding space for 2D images and 3D shapes. However, these feed-forward, holist…

A Non-Invasive Alternative to RFID: Self-Sufficient 3D Identification of Group-Housed Livestock

2604.22657 cs.CV 2026-04-24 PDF (arxiv)

Shiva Paudel, TsungCheng Tsai, Dongyi Wang

Accurate identification of individual farm animals in group-housed environments is a cornerstone of precision livestock management. However, current industry standards rely heavily on Radio Frequency Identification (RFID) ear tags, which are invasive, prone to loss, and restricted by the spatial limitations of antenna fields. In this paper, we propose a non-intrusive, vision-based identification s…

Associativity-Peakiness Metric for Contingency Tables

2604.22655 cs.LG 2026-04-24 PDF (arxiv)

Naomi E. Zirkind, William J. Diehl

For the use case of comparing the performance of clustering algorithms whose output is a contingency table, a single performance metric for contingency tables is needed. Such a metric is vital for comparative performance analysis of clustering algorithms. A survey of publicly available literature did not show the presence of such a metric. Metrics do exist for vector pairs of truth values and pred…

What People See (and Miss) About Generative AI Risks: Perceptions of Failures, Risks, and Who Should Address Them

2604.22654 cs.HC 2026-04-24 PDF (arxiv)

Megan Li, Wendy Bickersteth, Ningjing Tang, Parv Kapoor, Khinezin Win, Peter Zhong, Jason I. Hong, Lorrie Faith Cranor, Hoda Heidari, Hong Shen

Despite growing concerns about the risks of Generative AI (GenAI), there is limited understanding of public perceptions of these risks and their associated failure modes -- defined as recurring patterns of sociotechnical breakdown across the GenAI lifecycle that contribute to risks of real-world harm. To address this gap, we present a survey instrument, validated with eight subject matter experts …

Verifier Warnings Do Not Improve Comprehensibility Prediction

2604.22653 cs.SE 2026-04-24 PDF (arxiv)

Nadeeshan De Silva, Martin Kellogg, Oscar Chaparro

Proponents of software verification suggest that code simplicity is linked to the effort to verify code, hypothesizing that formal verifiers produce fewer false positive warnings and require less manual intervention when analyzing simpler code. A recent meta-analysis study found empirical support for this hypothesis: a small correlation between the sum of verifier warnings and human-derived code c…

A dataset of early blockchain-registered AI agents on Ethereum

2604.22652 cs.DB 2026-04-24 PDF (arxiv)

Yulin Liu

This study presents a structured dataset of blockchain-registered artificial intelligence agents under the ERC-8004 standard on Ethereum. The dataset integrates on-chain identity records, minting transactions, transfer events, reputation summaries, and individual feedback records, together with resolved off-chain metadata where available. Data were collected from Ethereum mainnet using Web3 RPC qu…

From core to envelope: revealing the deep dynamics of stars with two convective zones

2604.22651 astro-ph.SR 2026-04-24 PDF (arxiv)

Sylvain N. Breton, Allan Sacha Brun, Rafael A. García

On the Hertzsprung-Russell diagram, F-type solar pulsators connect the Sun to intermediate mass stars located on the instability strip. With respect to lower mass stars, they are structurally peculiar in the sense that they are constituted of three distinct dynamical layers: a small convective core, a deep radiative interior, and a shallow convective envelope. Current asteroseismic techniques only…

Triple-Phase Sequential Fusion Network for Hepatobiliary Phase Liver MRI Synthesis

2604.22904 eess.IV 2026-04-24 PDF (arxiv)

Qiuli Wang, Xinhuan Sun, Fengxi Chen, Yongxu Liu, Jie Cheng, Lin Chen, Jiafei Chen, Yue Zhang, Xiaoming Li, Wei Chen

Gadoxetate disodium-enhanced MRI is essential for the detection and characterization of hepatocellular carcinoma. However, acquisition of the hepatobiliary phase (HBP) requires a prolonged post-contrast delay, which reduces workflow efficiency and increases the risk of motion artifacts. In this study, we propose a Triple-Phase Sequential Fusion Network (TriPF-Net) to synthesize HBP images by lever…

Structure-Guided Diffusion Model for EEG-Based Visual Cognition Reconstruction

2604.22649 cs.NE 2026-04-24 PDF (arxiv)

Yongxiang Lian, Yueyang Cang, Pingge Hu, Yuchen He, Li Shi

Objective: Decoding visual information from electroencephalography (EEG) is an important problem in neuroscience and brain-computer interface (BCI) research. Existing methods are largely restricted to natural images and categorical representations, with limited capacity to capture structural features and to differentiate objective perception from subjective cognition. We propose a Structure-Guided…

Radiative feedbacks as drivers for quasi-periodic-oscillation activity in black-hole X-ray binaries

2604.22643 astro-ph.HE 2026-04-24 PDF (arxiv)

Apostolos Mastichiadis

Black-hole X-ray binaries (BHXRBs) in the hard and hard-intermediate spectral states commonly exhibit prominent type-C quasi-periodic oscillations (QPOs) in their X-ray power spectra. Despite extensive observational and theoretical efforts, the physical mechanism responsible for these oscillations has not yet been firmly established. The disk-corona system in BHXRBs is radiatively coupled, as hard…

Preconditioning of a hybridizable discontinuous Galerkin method for the coupled Stokes--Darcy system

2604.22641 math.NA 2026-04-24 PDF (arxiv)

Esteban Henríquez, Miroslav Kuchta, Jeonghun J. Lee, Sander Rhebergen

We propose parameter-robust preconditioners for the statically condensed linear system arising from a hybridizable discontinuous Galerkin discretization of the coupled Stokes--Darcy system. The design strategy relies on first applying the operator-preconditioning framework [Numer. Linear Algebra Appl., 18(1):1--40, 2011] to construct a preconditioner for the non-condensed discretization. This is d…

Quality-Driven Selective Mutation for Deep Learning

2604.22640 cs.SE 2026-04-24 PDF (arxiv)

Zaheed Ahmed, Emmanuel Charleson Dapaah, Philip Makedonski, Jens Grabowski

Mutants support testing and debugging in two roles: (i) as test goals and (ii) as substitutes for real faults. Hard-to-kill mutants provide better guidance for test improvement, while realism is essential when mutants are used to simulate real bugs. Building on these roles, selective mutation for deep learning (DL) aims to reduce the cost of mutant generation and execution by choosing operator con…

Adversarial Malware Generation in Linux ELF Binaries via Semantic-Preserving Transformations

2604.22639 cs.CR 2026-04-24 PDF (arxiv)

Lukáš Hrdonka, Martin Jureček

Malware development and detection have undergone significant changes in recent years as modern concepts, such as machine learning, have been used for both adversarial attacks and defense. Despite intensive research on Windows Portable Executable (PE) files, there is minimal work on Linux Executable and Linkable Format (ELF). In this work, we summarize the academic papers submitted in this field an…

Variability of Sagittarius A* at 3 GHz on minute-scale with MeerKAT

2604.22638 astro-ph.GA 2026-04-24 PDF (arxiv)

K. Kaur, I. Rammala-Zitha, A. Basu, G. Witzel, M. Wielgus, V. Balakrishnan, E. D. Barr, A. Brunthaler, S. Buchner, D. J. Champion, M. Hoeft, S. Khan, H. -R. Klöckner, C. König, M. Kramer, V. Venkatraman Krishnan, Y. K. Ma, S. A. Mao, P. V. Padmanabh, S. Ranchod, S. S. Sridhar, J. D. Wagenveld, R. S. Wharton, O. Wucknitz

The supermassive black hole Sagittarius A* (Sgr A*) exhibits temporal and spectral variability across the electromagnetic spectrum. However, variability at radio frequencies below ~ 5 GHz for timescales shorter than a day remains largely unexplored. We investigate the variability of Sgr A* at 2.79 GHz on short timescales (1 min), to probe an under-explored regime of its emission process. Through p…

CLVAE: A Variational Autoencoder for Long-Term Customer Revenue Forecasting

2604.22636 stat.ML 2026-04-24 PDF (arxiv)

Jeffrey Näf, Riana Valera Mbelson, Markus Meierer

Predicting customers' long-term revenue from sparse and irregular transaction data is central to marketing resource allocation in non-contractual settings, yet existing approaches face a trade-off. Traditional probabilistic customer base models deliver robust long-horizon forecasts by imposing strong structural assumptions, while flexible machine-learning models often require substantial training …

Constraints on the Primordial Black Hole Abundance using Pulsar Parameter Drifts

2604.22634 astro-ph.CO 2026-04-24 PDF (arxiv)

Yan-Chen Bi, Yu-Mei Wu, Qing-Guo Huang

Primordial black holes (PBHs) provide a compelling interpretation for the binary black holes (BBHs) observed by ground-based gravitational-wave (GW) detectors, especially for those BBHs in the theoretical mass gap. In the early Universe, the scalar perturbations required to produce such PBHs inevitably generate scalar-induced GWs (SIGWs). These SIGWs peak in the sub-nanohertz band, and manifest se…

Mixed Membership sub-Gaussian Models

2604.22633 stat.ML 2026-04-24 PDF (arxiv)

Huan Qing

The Gaussian mixture model is widely used in unsupervised learning, owing to its simplicity and interpretability. However, a fundamental limitation of the classical Gaussian mixture model is that it forces each observation to belong to exactly one component. In many practical applications, such as genetics, social network analysis, and text mining, an observation may naturally belong to multiple c…

Identifying and typifying demographic unfairness in phoneme-level embeddings of self-supervised speech recognition models

2604.22631 cs.CL 2026-04-24 PDF (arxiv)

Felix Herron, Solange Rossato, Alexandre Allauzen, François Portet

Modern automatic speech recognition (ASR) systems have been observed to function better for certain speaker groups (SGs) than others, despite recent gains in overall performance. One potential impediment to progress towards fairer ASR is the lack of a more nuanced understanding of the types of modeling errors that speech encoder models make, and in particular the difference between the structure of embeddings…

Detecting Concept Drift in Evolving Malware Families Using Rule-Based Classifier Representations

2604.22629 cs.CR 2026-04-24 PDF (arxiv)

Tomáš Kalný, Martin Jureček, Mark Stamp

This work proposes a structural approach to concept drift detection in malware classification using decision tree rulesets. Classifiers are trained across temporal windows on the EMBER2024 dataset, and drift is quantified by comparing extracted rule representations using feature importance, prediction agreement, activation stability, and coverage metrics. These metrics are correlated with both acc…

Cloud to Edge: Benchmarking LLM Inference On Hardware-Accelerated Single-Board Computers

2604.24785 cs.AR 2026-04-24 PDF (arxiv)

Harri Renney, Fouad Trad, Michael Mattarock, Zena Wood

Large language models (LLMs) are becoming increasingly capable at small parameter scales. At the same time, conventional cloud-centric deployment introduces challenges around data privacy, latency, and cost that are acute in operational technology and defence environments. Advances in model distillation, quantisation, and affordable edge accelerators now make local LLM inference on single-board co…

The Exact Replica Threshold for Nonlinear Moments of Quantum States

2604.22627 quant-ph 2026-04-24 PDF (arxiv)

Shuai Zeng

Joint measurements on multiple copies of a quantum state provide access to nonlinear observables such as $\operatorname{tr}(\rho^t)$, but whether replica number marks a sharp information-theoretic resource boundary has remained unclear. For every fixed order $t\ge 3$, existing protocols show that $\lceil t/2\rceil$ replicas already suffice for polynomial-sample estimation of $\operatorname{tr}(\rho^t)$,…

From graphemic dependence to lexical structure: a Markovian perspective on Dante's Commedia

2604.22626 cs.CL 2026-04-24 PDF (arxiv)

Angelo Maria Sabatini

This study investigates the structural organisation of Dante's Divina Commedia through a symbolic representation based on vowel-consonant (V/C) encoding. Modelling the resulting sequence as a four-state Markov chain yields a parsimonious index of graphemic memory, capturing the balance between persistence and alternation patterns. Across the poem, this index exhibits a slight but consistent incr…

On the Complementarity of Quantum and Classical Features: Adaptive Hybrid Quantum-Classical Feature Fusion for Breast Cancer Classification

2604.22903 cs.CV 2026-04-24 PDF (arxiv)

Yasmin Rodrigues Sobrinho, João Renato Ribeiro Manesco, João Paulo Papa

The integration of quantum machine learning with classical deep learning offers promising avenues for medical image analysis by mapping data into high-dimensional Hilbert spaces. However, effectively unifying these distinct paradigms remains challenging due to common optimization asymmetries. In this paper, a novel hybrid quantum-classical architecture for breast cancer diagnosis based on a dual-b…
