Archon

Browse and search harvested arxiv metadata.

1273993 results (page 126 of 50960)

Adopting State-of-the-Art Pretrained Audio Representations for Music Recommender Systems

2604.23077 cs.IR 2026-04-25 PDF (arxiv)

Yan-Martin Tamm, Anna Aljanaki

Over the years, the Music Information Retrieval (MIR) research community has released various models pretrained on large amounts of music data. Transfer learning showcases the proven effectiveness of pretrained backend models for a broad spectrum of downstream tasks, including auto-tagging and genre classification. However, MIR papers generally do not explore the efficiency of pretrained models for Mu…

Rejection Sampling is Optimal for Relative Entropy Coding

2604.23076 cs.IT 2026-04-25 PDF (arxiv)

Spencer Hill, Fady Alajaji, Tamás Linder, Gergely Flamich

In relative entropy coding, a sender aims to design a stochastic code such that, on input $X \sim P_X$, the receiver can generate a sample $Y \sim P_{Y \mid X}$. It is a standard result that (1) this requires at least $I(X; Y)$ bits, (2) the lower bound is achievable within a logarithmic gap, and (3) this gap cannot be reduced in general. The necessity of the gap suggests that the mutual informati…

A Lightweight Toggleable Adhesion Prototype for Multirotor UAV Landing on Tilting Platforms

2604.23074 cs.RO 2026-04-24 PDF (arxiv)

Teighin Nordholt, Melissa Greeff

Autonomous multirotor landings on uncrewed surface vessels (USVs) are critical for persistent maritime operations but remain challenging due to wave-induced tilt, wind disturbances, and limited landing area. Many existing approaches exhibit small pose tolerance for reliable landing. This paper presents a lightweight toggleable adhesion mechanism to improve landing reliability. The system uses a mo…

RL Token: Bootstrapping Online RL with Vision-Language-Action Models

2604.23073 cs.LG 2026-04-24 PDF (arxiv)

Charles Xu, Jost Tobias Springenberg, Michael Equi, Ali Amin, Adnan Esmail, Sergey Levine, Liyiming Ke

Vision-language-action (VLA) models can learn to perform diverse manipulation skills "out of the box," but achieving the precision and speed that real-world tasks demand requires further fine-tuning -- for example, via reinforcement learning (RL). We introduce a lightweight method that enables sample-efficient online RL fine-tuning of pretrained VLAs using just a few hours of real-world practice. …

Analytica: Soft Propositional Reasoning for Robust and Scalable LLM-Driven Analysis

2604.23072 cs.AI 2026-04-24 PDF (arxiv)

Junyan Cheng, Kyle Richardson, Peter Chin

Large language model (LLM) agents are increasingly tasked with complex real-world analysis (e.g., in financial forecasting, scientific discovery), yet their reasoning suffers from stochastic instability and lacks a verifiable, compositional structure. To address this, we introduce Analytica, a novel agent architecture built on the principle of Soft Propositional Reasoning (SPR). SPR reframes compl…

Learning the Weather-Grid Nexus via Weather-to-Voltage (W2V) Predictive Modeling

2604.23070 eess.SY 2026-04-24 PDF (arxiv)

Sol Lim, Min-Seung Ko, Farnaz Safdarian, Hao Zhu

This paper proposes a weather-to-voltage (W2V) predictive modeling framework to learn the underlying weather-grid nexus. Unlike existing approaches on weather-informed grid operations, our proposed W2V model can achieve the joint analysis of weather and grid states, and further leverage this coupling to enhance grid-aware weather forecasting (GAWF) as a key application. To achieve this end-to-end …

ContextWeaver: Selective and Dependency-Structured Memory Construction for LLM Agents

2604.23069 cs.CL 2026-04-24 PDF (arxiv)

Yating Wu, Yuhao Zhang, Sayan Ghosh, Sourya Basu, Anoop Deoras, Jun Huan, Gaurav Gupta

Large language model (LLM) agents often struggle in long-context interactions. As the agent accumulates more interaction history, context management approaches such as sliding window and prompt compression may omit earlier structured information that later steps rely on. Recent retrieval-based memory systems surface relevant content but still overlook the causal and logical structure needed for mu…

Probabilistic Hazard Analysis Framework with Stochastic Optimal Control for Deteriorating Civil Infrastructure Systems

2604.23068 eess.SY 2026-04-24 PDF (arxiv)

Sudhir P. Jodha, Konstantinos G. Papakonstantinou

The safety and resilience of civil infrastructure systems are increasingly threatened by compounded risks from various hazard events and structural deterioration due to environmental stressors. This study presents a comprehensive risk-informed, life-cycle optimization framework that extends the Performance-Based Earthquake Engineering (PBEE) and probabilistic seismic loss estimation paradigms by c…

Training a General Purpose Automated Red Teaming Model

2604.23067 cs.CR 2026-04-24 PDF (arxiv)

Aishwarya Padmakumar, Leon Derczynski, Traian Rebedea, Christopher Parisien

Automated methods for red teaming LLMs are an important tool to identify LLM vulnerabilities that may not be covered in static benchmarks, allowing for more thorough probing. They can also adapt to each specific LLM to discover weaknesses unique to it. Most current automated red teaming methods are intended for tackling safety and content moderation. Thus, they make use of content safety models as…

Urban Flood Observations (UFO): A hand-labeled training and validation dataset of post-flood inundation

2604.23066 cs.CV 2026-04-24 PDF (arxiv)

Rohit Mukherjee, Hannah K. Friedrich, Beth Tellman, Ariful Islam, Zhijie Zhang, Jonathan Giezendanner, Upmanu Lall, Venkataraman Lakshmi

Urban flooding affects lives and infrastructure worldwide. Mapping inundation in complex urban environments from satellite imagery remains challenging due to limited spatial resolution, infrequent acquisitions, and cloud cover. We present Urban Flood Observations (UFO), a global, hand-labeled dataset of post-flood inundation in diverse urban settings. UFO comprises 215 image chips (1024 by 1024 pi…

What Should Frontier AI Developers Disclose About Internal Deployments?

2604.23065 cs.CY 2026-04-24 PDF (arxiv)

Jacob Charnock, Raja Mehta Moreno, Justin Miller, William L. Anderson

Frontier AI developers are increasingly deploying highly capable models internally to automate AI R&D, but these deployments currently face limited external oversight. It is essential, therefore, that developers provide evidence that internally deployed models are safe. While recent work has highlighted the risks of internal deployments and proposed broad approaches to transparency and governance,…

Efficient primal-dual algorithm for imaging applications with matrix stacking, applied to DBT image reconstruction

2604.23063 math.OC 2026-04-24 PDF (arxiv)

Emil Y. Sidky, John Paul Phillips, Zheng Zhang, Dan Xia, Ingrid S. Reiser, Xiaochuan Pan

The primal-dual hybrid gradient (PDHG) algorithm for solving convex optimization problems that arise in tomographic imaging is revisited. In particular, simplification of the selection of step-size parameters is developed for optimization problems with multiple terms, each containing a linear transform subject to splitting. This simplification maintains algorithm efficiency while avoiding massive …

C-MORAL: Controllable Multi-Objective Molecular Optimization with Reinforcement Alignment for LLMs

2604.23061 cs.LG 2026-04-24 PDF (arxiv)

Rui Gao, Youngseung Jeon, Swastik Roy, Morteza Ziyadi, Xiang 'Anthony' Chen

Large language models (LLMs) show promise for molecular optimization, but aligning them with selective and competing drug-design constraints remains challenging. We propose C-Moral, a reinforcement learning post-training framework for controllable multi-objective molecular optimization. C-Moral combines group-based relative optimization, property score alignment for heterogeneous objectives, and c…

Learning to Trust AI and Data-driven models in Data Assimilation through a Multifidelity Ensemble Gaussian Mixture Filter Framework

2604.23060 cs.CE 2026-04-24 PDF (arxiv)

Andrey A. Popov

AI and data-driven models have large potential for data assimilation applications by creating fast and accurate forecasts. Their tendency to produce spurious, inaccurate, nonphysical results -- hallucination -- however, raises a serious question about their long-term use, and such models can be categorized as untrustworthy methods. Theory-driven methods on the other hand are slow, but are capable of staying ph…

Implicit Framing in Obstetric Counseling Notes: A Grounded LLM Pipeline on a VBAC-Eligible Cohort

2604.23059 cs.CL 2026-04-24 PDF (arxiv)

Baris Karacan, Barbara Di Eugenio, Patrick Thornton, Joanna Tess, Subhash Kumar Kolar

Clinical framing -- the linguistic manner in which clinical information is presented -- can influence patient understanding and decision-making, with important implications for healthcare outcomes. Obstetrics is a high-stakes domain in which physicians counsel patients on delivery mode choices such as vaginal birth after cesarean (VBAC) and repeat cesarean section (RCS), yet counseling language re…

The Security Cost of Intelligence: AI Capability, Cyber Risk, and Deployment Paradox

2604.23058 econ.GN 2026-04-24 PDF (arxiv)

Sukwoong Choi

Firms are deploying more capable AI systems, but organizational controls often have not kept pace. These systems can generate greater productivity gains, but high-value uses require broader authority exposure -- data access, workflow integration, and delegated authority -- when governance controls have not yet decoupled capability from authority exposure. We develop an analytical model in which a …

Don't Make the LLM Read the Graph: Make the Graph Think

2604.23057 cs.AI 2026-04-24 PDF (arxiv)

Yuqi Sun, Tianqin Meng, George Liu, Yashraj Panwar, Lakshya Chaudhry, Munasib Ilham, Aman Chadha

We investigate whether explicit belief graphs improve LLM performance in cooperative multi-agent reasoning. Through 3,000+ controlled trials across four LLM families in the cooperative card game Hanabi, we establish four findings. First, integration architecture determines whether belief graphs provide value: as prompt context, graphs are decorative for strong models and beneficial only for weak m…

K-Score: Kalman Filter as a Principled Alternative to Reward Normalization in Reinforcement Learning

2604.23056 cs.LG 2026-04-24 PDF (arxiv)

Zixuan Xia, Quanxi Li

We propose a simple yet effective alternative to reward normalization in policy gradient reinforcement learning by integrating a 1D Kalman filter for online reward estimation. Instead of relying on fixed heuristics, our method recursively estimates the latent reward mean, smoothing high-variance returns and adapting to non-stationary environments. This approach incurs minimal overhead and requires…

DeepImagine: Learning Biomedical Reasoning via Successive Counterfactual Imagining

2604.23054 cs.CL 2026-04-24 PDF (arxiv)

Youze Zheng, Jianyou Wang, Yuhan Chen, Matthew Feng, Longtian Bao, Hanyuan Zhang, Maxim Khan, Aditya K. Sehgal, Christopher D. Rosin, Umber Dube, Ramamohan Paturi

Predicting the outcomes of prospective clinical trials remains a major challenge for large language models. Prior work has shown that both traditional correlational predictors, such as random forests and logistic regression, and strong commercial LLMs achieve limited performance on this task. In this paper, we propose DeepImagine, a framework for teaching LLMs biomedical reasoning through successi…

ML-Guided Primal Heuristics for Mixed Binary Quadratic Programs

2604.23053 cs.LG 2026-04-24 PDF (arxiv)

Weimin Huang, Natalie M. Isenberg, Ján Drgoňa, Draguna L Vrabie, Bistra Dilkina

Mixed Binary Quadratic Programs (MBQPs) are an important and complex set of problems in combinatorial optimization. As solving large-scale combinatorial optimization problems is challenging, primal heuristics have been developed to quickly identify high-quality solutions within a short amount of time. Recently, a growing body of research has also used machine learning to accelerate solution method…

Evaluating Temporal Consistency in Multi-Turn Language Models

2604.23051 cs.CL 2026-04-24 PDF (arxiv)

Yash Kumar Atri, Steven L. Johnson, Tom Hartvigsen

Language models are increasingly deployed in interactive settings where users reason about facts over time rather than in isolation. In such scenarios, correct behavior requires models to maintain and update implicit temporal assumptions established earlier in a conversation. We study this challenge through the lens of temporal scope stability: the ability to preserve, override, or transfer time-s…

A Decoupled Human-in-the-Loop System for Controlled Autonomy in Agentic Workflows

2604.23049 cs.AI 2026-04-24 PDF (arxiv)

Edward Cheng, Jeshua Cheng

AI agents are increasingly deployed to execute tasks and make decisions within agentic workflows, introducing new requirements for safe and controlled autonomy. Prior work has established the importance of human oversight for ensuring transparency, accountability, and trustworthiness in such systems. However, existing implementations of Human-in-the-Loop (HITL) mechanisms are typically embedded wi…

The Impact of Documentation on Test Engagement in Pull Requests in OSS

2604.23048 cs.SE 2026-04-24 PDF (arxiv)

Teal Amore, Nathan Berman, Siyuan Jiang

Automated testing is crucial for maintaining open-source software quality. However, motivating contributors to include tests for code changes remains a challenge. While existing interventions, such as code coverage metrics and reviewer feedback, are often reactive and applied only after a pull request is opened, this study investigates whether documentation on testing can serve as a proactive meas…

Shape of Memory: a Geometric Analysis of Machine Unlearning in Second-Order Optimizers

2604.23046 cs.LG 2026-04-24 PDF (arxiv)

Kennon Stewart

We argue that current definitions of machine unlearning are underspecified for second-order optimizers. We compare first-order and second-order learners for their ability to handle the data deletion task with varying degrees of eigendecomposition to mimic the loss model memory. While both first and second-order methods realign with the ideal counterfactual in terms of performance and gradient, the …

A Differentiable Framework for Global Circulation Model Precipitation Bias Correction

2604.23045 cs.LG 2026-04-24 PDF (arxiv)

Kamlesh Sawadekar, Seth McGinnis, Peijun Li, Chaopeng Shen

Systematic biases in Global Circulation Model (GCM) outputs limit their direct applicability in regional planning, necessitating bias correction. Correcting precipitation is particularly challenging due to its non-Gaussian distribution, intermittent nature, and non-linear extremes. However, traditional statistical methods cannot learn from big data and easily address systematic biases in the GCMs,…
