-
How Hard is it to Decide if a Fact is Relevant to a Query?
We consider the following fundamental problem: given a database D, a Boolean conjunctive query (CQ) q, and a fact f in D, decide whether f is relevant to q with respect to D, i.e., whether f belongs to a minimal subset S of D such that S |= q. Despite being of central importance to query answer explanation, the combined complexity of deciding query relevance has not been studied in detail, leaving open what makes th…
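The relevance definition in this abstract can be illustrated with a brute-force sketch. It is not the paper's method, only a direct reading of the definition: the minimal subsets S with S |= q are exactly the inclusion-minimal images of homomorphisms from q into D. The encoding is an assumption made for illustration: a fact is a pair `(relation, tuple_of_constants)`, a query is a list of atoms of the same shape, variables are strings starting with `?`, and the function names are hypothetical.

```python
from itertools import product


def is_var(term):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(term, str) and term.startswith("?")


def homomorphism_images(db, query):
    """Yield the image (a frozenset of facts) of every homomorphism from query into db."""
    variables = sorted({t for _, args in query for t in args if is_var(t)})
    domain = sorted({c for _, args in db for c in args})
    # Exhaustive search over all assignments: exponential in the number of variables.
    for values in product(domain, repeat=len(variables)):
        h = dict(zip(variables, values))
        image = frozenset(
            (rel, tuple(h.get(t, t) for t in args)) for rel, args in query
        )
        if image <= db:
            yield image


def minimal_supports(db, query):
    """Return the inclusion-minimal subsets S of db with S |= query."""
    images = set(homomorphism_images(db, query))
    return {s for s in images if not any(other < s for other in images)}


def is_relevant(fact, db, query):
    """A fact is relevant if it belongs to at least one minimal support."""
    return any(fact in s for s in minimal_supports(db, query))


# q = exists x, y, z: R(x, y) and R(y, z)
query = [("R", ("?x", "?y")), ("R", ("?y", "?z"))]
db = {("R", ("a", "a")), ("R", ("a", "b"))}

print(is_relevant(("R", ("a", "a")), db, query))
print(is_relevant(("R", ("a", "b")), db, query))
```

In this toy database, R(a, b) takes part in a satisfying assignment (x = a, y = a, z = b), but the resulting support {R(a, a), R(a, b)} strictly contains {R(a, a)}, which already satisfies q. Checking that a fact appears in some match is therefore not enough; minimality has to be tested as well.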
-
Utility-Aware Data Pricing: Token-Level Quality and Empirical Training Gain for LLMs
Traditional data valuation methods based on ``row-count $\times$ quality coefficient'' paradigms fail to capture the nuanced, nonlinear contributions that data makes to Large Language Model (LLM) capabilities. This paper presents a dynamic data valuation framework that transitions from static accounting to utility-based pricing. Our approach operates on three layers: (1) token-level information de…
-
Graviton propagation in ghost-free massive gravity
We consider the ghost-free dRGT massive gravity with two of its three possible mass terms. This theory has five gravitational degrees of freedom. On Minkowski spacetime these modes have helicity-2, -1 and -0 and propagate on the Minkowski lightcone in the high-frequency limit. However for a general background the degrees of freedom corresponding to the helicity-1 and -0 modes have characteristics …
-
Trust as a Situated User State in Social LLM-Based Chatbots: A Longitudinal Study of Snapchat's My AI
Social chatbots based on large language models are increasingly embedded in everyday platforms, yet how users develop trust in these systems over time remains unclear. We present a four-week longitudinal qualitative survey study (N = 27) of trust formation in Snapchat's My AI, a socially embedded conversational agent. Our findings show that trust is shaped by perceived ability, conversational beha…
-
From Local to Cluster: A Unified Framework for Causal Discovery with Latent Variables
Latent variables pose a fundamental challenge to causal discovery and inference. Conventional local methods focus on direct neighbors but fail to provide macro level insights. Cluster level methods enable macro causal reasoning but either assume clusters are known a priori or require causal sufficiency. Moreover, directly applying single variable causal discovery methods to cluster level problems …
-
A Model-Driven Approach to Database Migration with a Unified Data Model
Database migration is a key task in software modernization, increasingly involving transformations across heterogeneous data models such as relational and NoSQL systems. Existing approaches are typically designed for specific source-target combinations, which limits their applicability in multi-model environments. This paper proposes a generic database migration approach based on the U-Schema un…
-
Computational Control of Nonlinear Partial Differential Equations Using Machine Learning
The numerical reconstruction of controls for nonlinear partial differential equations remains a challenging and relatively underdeveloped problem, despite the extensive literature on control theory. While recent works have introduced constructive approaches for semilinear wave and heat equations, the design of reliable computational methods for approximating control functions continues to raise si…
-
Distance-Misaligned Training in Graph Transformers and Adaptive Graph-Aware Control
Graph Transformers can mix information globally, but this flexibility also creates failure modes: some tasks require long-range communication while others are better served by local interaction. We study this through a synthetic node-classification benchmark on contextual stochastic block model graphs, where labels are generated by a controllable mixture of local and far-shell signals. We define d…
-
Introducing Background Temperature to Characterise Hidden Randomness in Large Language Models
Even when decoding with temperature $T=0$, large language models (LLMs) can produce divergent outputs for identical inputs. Recent work by Thinking Machines Lab highlights implementation-level sources of nondeterminism, including batch-size variation, kernel non-invariance, and floating-point non-associativity. In this short note we formalize this behavior by introducing the notion of \emph{backgr…
-
Maximization of the efficiency of the first Dirichlet eigenfunction and improved eigenvalue inequalities
We study the efficiency of the first Dirichlet eigenfunction $u$ on bounded convex domains $Ω\subset \mathbb{R}^N$, defined as the ratio between the mean value of $u$ on $Ω$ and its maximum value. By exploiting improved log-concavity estimates, we establish new sharp lower bounds for the first eigenvalue $λ_1$ and upper bounds for the efficiency in terms of the geometry of the domain, refining cla…
-
SpaMEM: Benchmarking Dynamic Spatial Reasoning via Perception-Memory Integration in Embodied Environments
Multimodal large language models (MLLMs) have advanced static visual--spatial reasoning, yet they often fail to preserve long-horizon spatial coherence in embodied settings where beliefs must be continuously revised from egocentric observations under environmental change. We introduce SpaMEM (Spatial Memory from Action Sequences), a large-scale diagnostic benchmark that isolates the mechanics of s…
-
Hidden Failure Modes of Gradient Modification under Adam in Continual Learning, and Adaptive Decoupled Moment Routing as a Repair
Many continual-learning methods modify gradients upstream (e.g., projection, penalty rescaling, replay mixing) while treating Adam as a neutral backend. We show this composition has a hidden failure mode. In a high-overlap, non-adaptive 8-domain continual LM, all shared-routing projection baselines collapse close to vanilla forgetting (12.5--12.8 vs. 13.2). A 0.5% replay buffer is the strongest sh…
-
Near-deterministic loading of optical tweezer arrays via repulsive barricade potentials
Optical tweezers are a powerful tool for creating defect-free arrays of atoms and molecules, enabling advances in quantum simulation, computation, and precision metrology. However, the achievable array size is limited by the initial loading fraction, typically $50\,\%$ for atoms and $35\,\%$ for molecules. Here, we propose a general scheme for enabling multiple loading cycles by protecting trapped…
-
Robust Fuzzy local k-plane clustering with mixture distance of hinge loss and L1 norm
K-plane clustering (KPC), hyperplane clustering, and mixture regression all essentially fall within the same class of problems. This problem can be conceptualized as clustering in relatively high-dimensional K subspaces or K linear manifolds. Traditional KPC or fuzzy KPC models demonstrate a pronounced susceptibility to outliers, as they presuppose that the projection distance between data points …
-
StackFeat RL: Reinforcement Learning over Iterative Dual Criterion Feature Selection for Stable Biomarker Discovery
Feature selection in high-dimensional genomic data ($d \gg n$) demands methods that are simultaneously accurate, sparse, and stable. Existing approaches either require manual threshold specification (mRMR, stability selection), produce unstable selections under data perturbation (Lasso, Boruta), or ignore biological structure entirely. We introduce StackFeat-RL, a meta-learning framework that opti…
-
Quantifying and Mitigating Self-Preference Bias of LLM Judges
LLM-as-a-Judge has become a dominant approach in automated evaluation systems, playing critical roles in model alignment, leaderboard construction, quality control, and so on. However, the scalability and trustworthiness of this approach can be substantially distorted by Self-Preference Bias (SPB), which is a directional evaluative deviation in which LLMs systematically favor or disfavor their own…
-
Quiescent fractions in high-redshift galaxy groups reflect their hot-or-cold state of gas accretion
Cold accretion and quenching are closely related aspects of galaxy evolution, as sustained gas supply is required to maintain star formation. High-redshift galaxy groups therefore provide a valuable laboratory for testing how the thermal state of accreting gas relates to the emergence of quiescence. We measure quiescent fractions in a sample of 16 spectroscopically confirmed galaxy groups at $1.6<…
-
Enhancing a gamified tool for UML modeling education
Unified Modeling Language (UML) Use Case and Class Diagrams are fundamental modeling notations in Software Engineering (SE) education due to their importance for requirements and model-based engineering, yet their relevance is underestimated by students, who tend to dismiss the topic as secondary. Gamification has been adopted to make modeling education more appealing, but existing tools focus alm…
-
Nature of point defects in bulk hexagonal diamond
Hexagonal diamond (HD), an exotic carbon allotrope recently synthesized in bulk form, exhibits superior mechanical properties compared to cubic diamond (CD) and holds promise for advanced industrial and quantum applications. Using first-principles calculations, we systematically investigate intrinsic defects, extrinsic dopants, and defect complexes in HD. Our study shows that VC dominates intrins…
-
Multi-User ISAC with Heterogeneous Unknown Parameters: Optimal Beamforming based on Distribution Information
This paper studies an integrated sensing and communication (ISAC) system where a multi-antenna base station (BS) communicates with multiple single-antenna users in the downlink and senses the unknown and random angle information of a target based on its prior distribution information and the received echo signals. We focus on a challenging scenario with heterogeneous unknown parameters where the t…
-
Conformalized Super Learner
The Super Learner (SL) is a widely used ensemble method that combines predictions from a library of learners based on their predictive performance. Interval predictions are of considerable practical interest because they allow uncertainty in predictions produced by an individual learner or an ensemble to be quantified. Several methods have been proposed for constructing interval predictions based …
-
Region Matters: Efficient and Reliable Region-Aware Visual Place Recognition
Visual Place Recognition (VPR) determines a query image's geographic location by matching it against geotagged databases. However, existing methods struggle with perceptual aliasing caused by irrelevant regions and inefficient re-ranking due to rigid candidate scheduling. To address these issues, we introduce FoL++, a method combining robust discriminative region modeling with adaptive re-ranking.…
-
HFS-TriNet: A Three-Branch Collaborative Feature Learning Network for Prostate Cancer Classification from TRUS Videos
Transrectal ultrasound (TRUS) imaging is a cost-effective and non-invasive modality widely used in the diagnosis of prostate cancer. Computer-aided diagnosis (CAD) relying on TRUS images has been extensively investigated recently. Compared to static images, TRUS video provides richer spatial-temporal information, which makes it a promising alternative for improving the accuracy and robustness o…
-
Fixed-phase Resonance Tracking for Fast Nonlinear Resonant Ultrasound Spectroscopy
Nonlinear Resonant Ultrasound Spectroscopy (NRUS) experiments that rely on repeated sampling of resonance curves are inherently sensitive to measurement protocol due to evolution of material parameters caused by fast and slow dynamic effects. We introduce a model-assisted discrete-time resonance tracking method that maintains a system at its instantaneous resonance condition without the need to ac…
-
Pack only the essentials: Adaptive dictionary learning for kernel ridge regression
One of the major limits of kernel ridge regression (KRR) is that storing and manipulating the kernel matrix K_n for n samples requires O(n^2) space, which rapidly becomes infeasible for large n. Nyström approximations reduce the space complexity to O(nm) by sampling m columns from K_n. Uniform sampling preserves KRR accuracy (up to epsilon) only when m is proportional to the maximum degree of free…
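For orientation, the textbook form of the Nyström approximation mentioned in this abstract can be written as follows. The symbols I, C, and W are introduced here for illustration and are not taken from the abstract, and the paper's own construction may weight or regularize the sampled columns differently.

```latex
% I: index set of the m sampled columns (assumed notation)
\[
C = K_n[:, I] \in \mathbb{R}^{n \times m}, \qquad
W = K_n[I, I] \in \mathbb{R}^{m \times m}, \qquad
\widetilde{K}_n = C\, W^{+} C^{\top} \approx K_n ,
\]
% W^{+} is the Moore-Penrose pseudoinverse of W.
```

Only C and W need to be stored, which takes O(nm + m^2) space; this is O(nm) whenever m is at most n.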