Archon

Browse and search harvested arxiv metadata.

Cyber Defense Benchmark: Agentic Threat Hunting Evaluation for LLMs in SecOps

2604.19533 cs.CR 2026-04-21 PDF (arxiv)

Alankrit Chona, Igor Kozlov, Ambuj Kumar

We introduce the Cyber Defense Benchmark, a benchmark for measuring how well large language model (LLM) agents perform the core SOC analyst task of threat hunting: given a database of raw Windows event logs with no guided questions or hints, identify the exact timestamps of malicious events. The benchmark wraps 106 real attack procedures from the OTRF Security-Datasets corpus - spanning 86 MITRE…

BEAT: Tokenizing and Generating Symbolic Music by Uniform Temporal Steps

2604.19532 cs.SD 2026-04-21 PDF (arxiv)

Lekai Qian, Haoyu Gu, Jingwei Zhao, Ziyu Wang

Tokenizing music to fit the general framework of language models is a compelling challenge, especially considering the diverse symbolic structures in which music can be represented (e.g., sequences, grids, and graphs). To date, most approaches tokenize symbolic music as sequences of musical events, such as onsets, pitches, time shifts, or compound note events. This strategy is intuitive and has pr…

Hypergraph Mining via Proximity Matrix

2604.19531 cs.SI 2026-04-21 PDF (arxiv)

Junhao Bian, Yilin Bi, Tao Zhou

Hypergraphs serve as an effective tool widely adopted to characterize higher-order interactions in complex systems. The most intuitive and commonly used mathematical instrument for representing a hypergraph is the incidence matrix, in which each entry is binary, indicating whether the corresponding node belongs to the corresponding hyperedge. Although the incidence matrix has become a foundational…

Calibrating Scientific Foundation Models with Inference-Time Stochastic Attention

2604.19530 cs.LG 2026-04-21 PDF (arxiv)

Akash Yadav, Taiwo A. Adebiyi, Ruda Zhang

Transformer-based scientific foundation models are increasingly deployed in high-stakes settings, but current architectures give deterministic outputs and provide limited support for calibrated predictive uncertainty. We propose Stochastic Attention, a lightweight inference-time modification that randomizes attention by replacing softmax weights with normalized multinomial samples controlled by a …

Cosmic evolution of the [CII]-to-molecular gas relation

2604.19529 astro-ph.GA 2026-04-21 PDF (arxiv)

Cédric Accard, Florent Renaud, Katarina Kraljic, Diana Ismail, Matthieu Béthermin, Oscar Agertz

The [CII] 158 $μ$m line is widely used to trace star formation and the gas contents of high-redshift galaxies. However, it remains unclear under which physical conditions it reliably traces the molecular reservoir, and whether a unique conversion factor $α_{\rm [CII]}$ can be applied across cosmic time. We investigate the evolution of the relation between the [CII] luminosity and molecular gas mas…

Revisiting RaBitQ and TurboQuant: A Symmetric Comparison of Methods, Theory, and Experiments

2604.19528 cs.LG 2026-04-21 PDF (arxiv)

Jianyang Gao, Yutong Gou, Yuexuan Xu, Jifan Shi, Yongyi Yang, Shuolin Li, Raymond Chi-Wing Wong, Cheng Long

This technical note revisits the relationship between RaBitQ and TurboQuant under a unified comparison framework. We compare the two methods in terms of methodology, theoretical guarantees, and empirical performance, using a reproducible, transparent, and symmetric setup. Our results show that, despite the claimed advantage of TurboQuant, TurboQuant does not provide a consistent improvement over R…

Evaluating LLM-Generated Obfuscated XSS Payloads for Machine Learning-Based Detection

2604.19526 cs.CR 2026-04-21 PDF (arxiv)

Divyesh Gabbireddy, Suman Saha

Cross-site scripting (XSS) remains a persistent web security vulnerability, especially because obfuscation can change the surface form of a malicious payload while preserving its behavior. These transformations make it difficult for traditional and machine learning-based detection systems to reliably identify attacks. Existing approaches for generating obfuscated payloads often emphasize syntactic…

Revac: A Social Deduction Reasoning Agent

2604.19523 cs.AI 2026-04-21 PDF (arxiv)

Mihir Shriniwas Arya, Avinash Anish, Aditya Ranjan

Social deduction games such as Mafia present a unique AI challenge: players must reason under uncertainty, interpret incomplete and intentionally misleading information, evaluate human-like communication, and make strategic elimination decisions. Unlike deterministic board games, success in Mafia depends not on perfect information or brute-force search, but on inference, memory, and adaptability i…

GenerativeMPC: VLM-RAG-guided Whole-Body MPC with Virtual Impedance for Bimanual Mobile Manipulation

2604.19522 cs.RO 2026-04-21 PDF (arxiv)

Marcelino Julio Fernando, Miguel Altamirano Cabrera, Jeffrin Sam, Yara Mahmoud, Konstantin Gubernatorov, Dzmitry Tsetserukou

Bimanual mobile manipulation requires a seamless integration between high-level semantic reasoning and safe, compliant physical interaction - a challenge that end-to-end models approach opaquely and classical controllers lack the context to address. This paper presents GenerativeMPC, a hierarchical cyber-physical framework that explicitly bridges semantic scene understanding with physical control …

Singularities in phase separation models: a spectral element approach for the nonlocal Cahn-Hilliard equation

2604.19521 math.NA 2026-04-21 PDF (arxiv)

Andrés Miniguano-Trujillo, Andrea Poiatti, Maurizio Grasselli, Benjamin Goddard, John Pearson

The nonlocal Cahn-Hilliard equation provides a natural extension of the classical model for phase separation by incorporating long-range interactions through a singular convolution kernel. While this formulation admits a rich existence and regularity theory, its numerical approximation remains challenging: discretisation of the nonlocal term leads to dense operators, and the singularity of the ker…

SimDiff: Depth Pruning via Similarity and Difference

2604.19520 cs.AI 2026-04-21 PDF (arxiv)

Yuli Chen, Shuhao Zhang, Fanshen Meng, Bo Cheng, Jiale Han, Qiang Tong, Xiulei Liu

Depth pruning improves the deployment efficiency of large language models (LLMs) by identifying and removing redundant layers. A widely accepted standard for this identification process is to measure the similarity between layers using cosine distance. However, we find that methods relying solely on this one-dimensional heuristic can exhibit unpredictable performance and even catastrophic collapse…

Accelerating Optimization and Machine Learning through Decentralization

2604.19518 cs.LG 2026-04-21 PDF (arxiv)

Ziqin Chen, Zuang Wang, Yongqiang Wang

Decentralized optimization enables multiple devices to learn a global machine learning model while each individual device only has access to its local dataset. By avoiding the need for training data to leave individual users' devices, it enhances privacy and scalability compared to conventional centralized learning, where all data has to be aggregated to a central server. However, decentralized op…

PRADAS: PRior-Assisted DAta Splitting for False Discovery Rate Control

2604.19517 stat.ME 2026-04-21 PDF (arxiv)

Yuanchuan Guo, Buyu Lin, Jun S. Liu

In the FDR-controlling literature, mirror statistics offer a flexible alternative to $p$-value based procedures. When prior information is available, however, it is unclear how to incorporate mirror statistics in a principled way, and the standard equal split used by data-splitting methods can be inefficient. In this paper, we characterize a broader class of mirror statistics for any fixed splitti…

From Experience to Skill: Multi-Agent Generative Engine Optimization via Reusable Strategy Learning

2604.19516 cs.AI 2026-04-21 PDF (arxiv)

Beining Wu, Fuyou Mao, Jiong Lin, Cheng Yang, Jiaxuan Lu, Yifu Guo, Siyu Zhang, Yifan Wu, Ying Huang, Fu Li

Generative engines (GEs) are reshaping information access by replacing ranked links with citation-grounded answers, yet current Generative Engine Optimization (GEO) methods optimize each instance in isolation, unable to accumulate or transfer effective strategies across tasks and engines. We reframe GEO as a strategy learning problem and propose MAGEO, a multi-agent framework in which coordinated …

Constructive Approaches to Perception-Aware Lossy Source Coding: Information-Theoretic Guidelines

2604.19515 cs.IT 2026-04-21 PDF (arxiv)

Ali Hussein, Jun Chen, Chao Tian, S. Sandeep Pradhan

Perception-aware lossy source coding has attracted significant recent interest. It augments the classical distortion criterion with an explicit perception constraint, thereby enabling more refined control over fidelity and perceptual quality. Despite rapid progress, the diversity of rate-distortion-perception formulations and their underlying assumptions remains poorly understood by many practitio…

When Graph Structure Becomes a Liability: A Critical Re-Evaluation of Graph Neural Networks for Bitcoin Fraud Detection under Temporal Distribution Shift

2604.19514 cs.LG 2026-04-21 PDF (arxiv)

Saket Maganti

The consensus that GCN, GraphSAGE, GAT, and EvolveGCN outperform feature-only baselines on the Elliptic Bitcoin Dataset is widely cited but has not been rigorously stress-tested under a leakage-free evaluation protocol. We perform a seed-matched inductive-versus-transductive comparison and find that this consensus does not hold. Under a strictly inductive protocol, Random Forest on raw features ac…

Defining Robust Ultrasound Quality Metrics via an Ultrasound Foundation Model

2604.19512 eess.IV 2026-04-21 PDF (arxiv)

Ziyang Huang, Bingyan Li, Chen Ma, Tianyi Liu, Yihui Zhai, Hong Xu, Yi Guo, Zeju Li, Yuanyuan Wang

Clinicians lack a principled framework to quantify diagnostic utility in ultrasound reconstructions. Existing standards like PSNR and VGG-LPIPS are inadequate, failing to account for modality-specific physics or the structural nuances of acoustic imaging. We close this gap with a TinyUSFM-based evaluation framework featuring two distinct metrics: TinyUSFM-uLPIPS, a full-reference perceptual distan…

Evaluating Histogram Matching for Robust Deep learning-Based Grapevine Disease Detection

2604.19510 cs.CV 2026-04-21 PDF (arxiv)

Ruben Pascual, Inés Hernández, Salvador Gutiérrez, Javier Tardaguila, Pedro Melo-Pinto, Daniel Paternain, Mikel Galar

Variability in illumination is a primary factor limiting deep learning robustness for field-based plant disease detection. This study evaluates Histogram Matching (HM), a technique that transforms the pixel intensity distribution of an image to match a reference profile, to mitigate this in grapevine classification, distinguishing among healthy leaves, downy mildew, and spider mite damage. We prop…

Assessing VLM-Driven Semantic-Affordance Inference for Non-Humanoid Robot Morphologies

2604.19509 cs.RO 2026-04-21 PDF (arxiv)

Jess Jones, Raul Santos-Rodriguez, Sabine Hauert

Vision-language models (VLMs) have demonstrated remarkable capabilities in understanding human-object interactions, but their application to robotic systems with non-humanoid morphologies remains largely unexplored. This work investigates whether VLMs can effectively infer affordances for robots with fundamentally different embodiments than humans, addressing a critical gap in the deployment of th…

Bangla Key2Text: Text Generation from Keywords for a Low Resource Language

2604.19508 cs.CL 2026-04-21 PDF (arxiv)

Tonmoy Talukder, G M Shahariar

This paper introduces Bangla Key2Text, a large-scale dataset of 2.6 million Bangla keyword-text pairs designed for keyword-driven text generation in a low-resource language. The dataset is constructed using a BERT-based keyword extraction pipeline applied to millions of Bangla news texts, transforming raw articles into structured keyword-text pairs suitable for supervised learning. To…

Market Dynamics, Governance and Open Research Metadata in the AI Era

2604.19507 cs.DL 2026-04-21 PDF (arxiv)

Daniel W. Hook

The debate about scholarly knowledge infrastructure has long been framed as a contest between openness and commercial enclosure. This framing distorts both policy and practice. The real tension lies between the persistent cost of producing and refining structured metadata under deep technological friction, and the differentiated demands distinct communities place on data quality, focus and granula…

Enhancing Unsupervised Keyword Extraction in Academic Papers through Integrating Highlights with Abstract

2604.19505 cs.IR 2026-04-21 PDF (arxiv)

Yi Xiang, Chengzhi Zhang

Automatic keyword extraction from academic papers is a key area of interest in natural language processing and information retrieval. Although previous research has mainly focused on utilizing abstract and references for keyword extraction, this paper focuses on the highlights section - a summary describing the key findings and contributions, offering readers a quick overview of the research. Our …

Cyclic Equalizability Characterized by Parikh Vectors

2604.19504 math.CO 2026-04-21 PDF (arxiv)

Sarunyu Thongjarast, Sarit Pasiphol, Suthee Ruangwises

Cyclic equalizability is a notion introduced by Shinagawa and Nuida in 2025, in the study of card-based cryptography. Informally, a collection of words is cyclically equalizable if, by inserting the same letters at the same positions in all words, they can be transformed into words that are cyclic shifts of one another. Shinagawa and Nuida showed that two binary words of equal length are cyclicall…

ReaLB: Real-Time Load Balancing for Multimodal MoE Inference

2604.19503 cs.DC 2026-04-21 PDF (arxiv)

Yingping Wang, Yi Wu, Xiangyu Wu, Junwei Cui, Weilin Cai, Zhijiang Guo, Jiayi Huang

Mixture-of-Experts (MoE) architectures are widely used in modern large language models and multimodal models. However, inference efficiency is often limited by highly dynamic and skewed expert workloads across different modalities. During the prefill stage with large batch sizes, vision tokens frequently dominate the input sequences. Under expert parallelism (EP), this leads to severe load imbalan…

Beyond Rating: A Comprehensive Evaluation and Benchmark for AI Reviews

2604.19502 cs.CL 2026-04-21 PDF (arxiv)

Bowen Li, Haochen Ma, Yuxin Wang, Jie Yang, Yining Zheng, Xinchi Chen, Xuanjing Huang, Xipeng Qiu

The rapid adoption of Large Language Models (LLMs) has spurred interest in automated peer review; however, progress is currently stifled by benchmarks that treat reviewing primarily as a rating prediction task. We argue that the utility of a review lies in its textual justification--its arguments, questions, and critique--rather than a scalar score. To address this, we introduce Beyond Rating, a h…
