-
MetaCloak-JPEG: JPEG-Robust Adversarial Perturbation for Preventing Unauthorized DreamBooth-Based Deepfake Generation
The rapid progress of subject-driven text-to-image synthesis, and in particular DreamBooth, has enabled a consent-free deepfake pipeline: an adversary needs only 4-8 publicly available face images to fine-tune a personalized diffusion model and produce photorealistic harmful content. Current adversarial face-protection systems -- PhotoGuard, Anti-DreamBooth, and MetaCloak -- perturb user images to…
-
A differentiable software suite for accelerated simulation of turbulent flows
We present IncompressibleNavierStokes.jl, an open-source Julia package for solving the incompressible Navier-Stokes equations on staggered Cartesian grids. The package features matrix-free, hardware-agnostic kernels that are compiled from a single source for multi-threaded CPU or GPU execution, and hand-written adjoint kernels for all discrete operators, enabling efficient reverse-mode automatic …
-
Symbolic Synthesis for LTLf+ Obligations
We study synthesis for obligation properties expressed in LTLf+, the extension of LTLf to infinite traces. Obligation properties are positive Boolean combinations of safety and guarantee (co-safety) properties and form the second level of the temporal hierarchy of Manna and Pnueli. Although obligation properties are expressed over infinite traces, they retain most of the simplicity of LTLf. In par…
-
OGER: A Robust Offline-Guided Exploration Reward for Hybrid Reinforcement Learning
Recent advancements in Reinforcement Learning with Verifiable Rewards (RLVR) have significantly improved Large Language Model (LLM) reasoning, yet models often struggle to explore novel trajectories beyond their initial latent space. While offline teacher guidance and entropy-driven strategies have been proposed to address this, they often lack deep integration or are constrained by the model's in…
-
HybridGen: Efficient LLM Generative Inference via CPU-GPU Hybrid Computing
As modern LLMs support thousands to millions of tokens, KV caches grow to hundreds of gigabytes, stressing memory capacity and bandwidth. Existing solutions, such as KV cache pruning and offloading, alleviate these but underutilize hardware by relying solely on either GPU or CPU for attention computing, and by considering only limited CPU local memory for KV cache storage. We propose HybridGen, an eff…
-
Towards Better Static Code Analysis Reports: Sentence Transformer-based Filtering of Non-Actionable Alerts
Static code analysis (SCA) tools are widely used as effective ways to detect bugs and vulnerabilities in software systems. However, the reports generated by these tools often contain a large number of non-actionable findings, which can overwhelm developers to the point of ignoring them altogether -- this phenomenon is known as "alert fatigue". In this paper, we combat alert fatigue by proposing ST…
-
BBP transition and the leading eigenvector of the spiked Wigner model with inhomogeneous noise
The spiked Wigner ensemble is a prototypical model for high-dimensional inference. We study the spectral properties of an inhomogeneous rank-one spiked Wigner model in which the variance of each entry of the noise matrix is itself a random variable. In the high-dimensional limit, we derive exact equations for the spectral edges, the outlier eigenvalue, and the distribution of the components of the…
-
A Census of Na D-traced neutral ISM and outflows at $0.6<z<4$
We present a statistical census of the Na D-traced neutral interstellar medium (ISM) and outflows in 309 galaxies at $0.6<z<4$ using JWST/NIRSpec medium-resolution grating spectroscopy from the SMILES, JADES, Blue Jay, and Aurora surveys. After subtracting the stellar continuum, we model the Na D $λλ5890, 5896$ Å and detect neutral ISM absorption in 76 galaxies. Of the Na D-traced ISM detections, 8…
-
IDOBE: Infectious Disease Outbreak forecasting Benchmark Ecosystem
Epidemic forecasting has become an integral part of real-time infectious disease outbreak response. While collaborative ensembles composed of statistical and machine learning models have become the norm for real-time forecasting, standardized benchmark datasets for evaluating such methods are lacking. Further, there is limited understanding of the performance of these methods for novel outbreaks with …
-
Joint Scheduling of Multi-Band Radar Sensing and DNN Inference for Cross-Stage Parallelism
This paper studies end-to-end latency minimization for a multi-band radar sensing and deep neural network (DNN) inference pipeline. Unlike conventional stage-wise designs that treat radar sensing and DNN inference as two sequential stages, the proposed framework exploits cross-stage parallelism by allowing the inference branch associated with a sensed band to start as soon as that band completes s…
-
LLM Safety From Within: Detecting Harmful Content with Internal Representations
Guard models are widely used to detect harmful content in user prompts and LLM responses. However, state-of-the-art guard models rely solely on terminal-layer representations and overlook the rich safety-relevant features distributed across internal layers. We present SIREN, a lightweight guard model that harnesses these internal features. By identifying safety neurons via linear probing and combi…
-
UDM-GRPO: Stable and Efficient Group Relative Policy Optimization for Uniform Discrete Diffusion Models
The Uniform Discrete Diffusion Model (UDM) has recently emerged as a promising paradigm for discrete generative modeling; however, its integration with reinforcement learning remains largely unexplored. We observe that naively applying GRPO to UDM leads to training instability and marginal performance gains. To address this, we propose UDM-GRPO, the first framework to integrate UDM with RL. Our method is…
-
S2H-DPO: Hardness-Aware Preference Optimization for Vision-Language Models
Vision-Language Models (VLMs) have demonstrated remarkable progress in single-image understanding, yet effective reasoning across multiple images remains challenging. We identify a critical capability gap in existing multi-image alignment approaches: current methods focus primarily on localized reasoning with pre-specified image indices ("Look at Image 3 and..."), bypassing the essential skills …
-
An adaptive discretization algorithm for locally optimal experimental design with constraints
We develop a novel iterative algorithm for locally optimal experimental design under constraints, such as budget or performance constraints. It is an adaptive discretization algorithm. In every iteration, a discretized version of the constrained-design problem is solved and then the discretization is adaptively refined by adding an approximate violator of a suitable sufficient $\varepsilon$-optimality condi…
-
Different Paths to Harmful Compliance: Behavioral Side Effects and Mechanistic Divergence Across LLM Jailbreaks
Open-weight language models can be rendered unsafe through several distinct interventions, but the resulting models may differ substantially in capabilities, behavioral profile, and internal failure mode. We study behavioral and mechanistic properties of jailbroken models across three unsafe routes: harmful supervised fine-tuning (SFT), harmful reinforcement learning with verifiable rewards (RLVR)…
-
MASS-RAG: Multi-Agent Synthesis Retrieval-Augmented Generation
Large language models (LLMs) are widely used in retrieval-augmented generation (RAG) to incorporate external knowledge at inference time. However, when retrieved contexts are noisy, incomplete, or heterogeneous, a single generation process often struggles to reconcile evidence effectively. We propose MASS-RAG, a multi-agent synthesis approach to retrieval-augmented generation that structu…
-
Document-as-Image Representations Fall Short for Scientific Retrieval
Many recent document embedding models are trained on document-as-image representations, embedding rendered pages as images rather than the underlying source. Meanwhile, existing benchmarks for scientific document retrieval, such as ArXivQA and ViDoRe, treat documents as images of pages, implicitly favoring such representations. In this work, we argue that this paradigm is not well-suited for text-…
-
Learning the Riccati solution operator for time-varying LQR via Deep Operator Networks
We propose a computational framework for replacing the repeated numerical solution of differential Riccati equations in finite-horizon Linear Quadratic Regulator (LQR) problems by a learned operator surrogate. Instead of solving a nonlinear matrix-valued differential equation for each new system instance, we construct offline an approximation of the associated solution operator mapping time-depend…
-
Physics-Informed Neural Networks for Maximizing Quantum Fisher Information in Time-Dependent Many-Body Systems
Quantum Fisher Information (QFI) sets the ultimate precision limit for parameter estimation and is therefore a central quantity in quantum metrology. In time-dependent many-body systems, however, maximizing QFI is a highly non-trivial task due to the combined effects of non-commutativity, control complexity, and the exponential growth of the Hilbert space. In this work, we present a physics-inform…
-
Bayesian experimental design: grouped geometric pooled posterior via ensemble Kalman methods
Bayesian experimental design (BED) for complex physical systems is often limited by the nested inference required to estimate the expected information gain (EIG) or its gradients. Each outer sample induces a different posterior, creating a large and heterogeneous set of inference targets. Existing methods have to sacrifice either accuracy or efficiency: they either perform per-outer-sample posteri…
-
NOEMA3D: Spatially resolved dust, CO, and [C I] in massive star-forming main sequence galaxies at cosmic noon
We present a spatially resolved study of cold molecular gas and dust in ten main-sequence galaxies at z=1.1-1.6, using observations of CO(4-3), CO(3-2), [C I](1-0), and dust continuum from the NOEMA3D survey. We find a widespread presence of spatially extended molecular gas and dust, with sizes comparable to those of the stellar disk, in contrast to those of central-dominated starburst galaxies at sim…
-
NOEMA3D: Resolving radial gas flows in disk galaxies at z~1.1-1.6 with high-resolution CO observations
We present NOEMA3D, a unique high-resolution study of purely molecular gas kinematics at $z \sim 1.1$ to 1.6, providing a dedicated view of cold gas dynamics at the late stages of the peak epoch of cosmic star formation. Using deep ($\gtrsim 20$ hr on source per target) IRAM-NOEMA CO observations of 10 massive ($10.45 < \log(M^*/M_\odot) < 11.43$) main-sequence galaxies, complemented by high-resol…
-
Moving beyond Principles: Identifying Actionable AI Fairness Practices
Because artificial intelligence (AI) increasingly mediates organizational work, fairness has become a critical governance challenge. Existing frameworks often prioritize abstract ethical principles rather than fairness-specific ones and lack actionable guidance across the entire AI lifecycle. This study addresses the principles-to-practice gap in AI fairness governance. We develop actionable AI fa…
-
QRAFTI: An Agentic Framework for Empirical Research in Quantitative Finance
We introduce a multi-agent framework intended to emulate parts of a quantitative research team and support equity factor research on large financial panel datasets. QRAFTI integrates a research toolkit for panel data with MCP servers that expose data access, factor construction, and custom coding operations as callable tools. It can help replicate established factors, formulate and test new signal…
-
Missingness-Adaptive Factor Identification in High-Dimensional Data
Determining the number of factors in high-dimensional factor models remains a fundamental challenge, particularly when data are incomplete. This paper introduces the concept of identifiable factors, those that can be reliably recovered despite missing observations, and proposes the Missingness-Adaptive Thresholding Estimator (MATE). To our knowledge, MATE is the first missingness-adaptive framewor…