-
Extrinsic geometry and Hamiltonian analysis of symmetric teleparallel gravity
We analyze the properties of foliations in presence of non-metricity, deriving the generalized Gauss-Codazzi relations in full generality. These results are employed to study the teleparallel framework of non-metric geometry, obtaining constraints on the extrinsic and intrinsic tensors. In particular, an extrinsic symmetric two-tensor plays the role of the extrinsic curvature in Riemannian geometr…
-
Co-Refine: AI-Powered Tool Supporting Qualitative Analysis
Qualitative coding relies on a researcher's application of codes to textual data. As coding proceeds across large datasets, interpretations of codes often shift (temporal drift), reducing the credibility of the analysis. Existing Computer-Assisted Qualitative Data Analysis (CAQDAS) tools provide support for data management but offer no workflow for real-time detection of these drifts. We present C…
-
DebugRepair: Enhancing LLM-Based Automated Program Repair via Self-Directed Debugging
Automated Program Repair (APR) has benefited from the code understanding and generation capabilities of Large Language Models (LLMs). Existing feedback-based APR methods iteratively refine candidate patches using test execution feedback and have shown promising results. However, most rely on outcome-level failure symptoms, such as stack traces, which show how failures are observed but fail to expo…
-
Large Language Models Exhibit Normative Conformity
The conformity bias exhibited by large language models (LLMs) can pose a significant challenge to decision-making in LLM-based multi-agent systems (LLM-MAS). While many prior studies have treated "conformity" simply as a matter of opinion change, this study introduces the social psychological distinction between informational conformity and normative conformity in order to understand LLM conformit…
-
HalluAudio: A Comprehensive Benchmark for Hallucination Detection in Large Audio-Language Models
Large Audio-Language Models (LALMs) have recently achieved strong performance across various audio-centric tasks. However, hallucination, where models generate responses that are semantically incorrect or acoustically unsupported, remains largely underexplored in the audio domain. Existing hallucination benchmarks mainly focus on text or vision, while the few audio-oriented studies are limited in …
-
Rethinking Scale: Deployment Trade-offs of Small Language Models under Agent Paradigms
Despite the impressive capabilities of large language models, their substantial computational costs, latency, and privacy risks hinder their widespread deployment in real-world applications. Small Language Models (SLMs) with fewer than 10 billion parameters present a promising alternative; however, their inherent limitations in knowledge and reasoning curtail their effectiveness. Existing research…
-
IndiaFinBench: An Evaluation Benchmark for Large Language Model Performance on Indian Financial Regulatory Text
We introduce IndiaFinBench, to our knowledge the first publicly available evaluation benchmark for assessing large language model (LLM) performance on Indian financial regulatory text. Existing financial NLP benchmarks draw exclusively from Western financial corpora (SEC filings, US earnings reports, and English-language financial news), leaving a significant gap in coverage of non-Western regulat…
-
Debiased neural operators for estimating functionals
Neural operators are widely used to approximate solution maps of complex physical systems. In many applications, however, the goal is not to recover the full solution trajectory, but to summarize the solution trajectory via a scalar target quantity (e.g., a functional such as time spent in a target range, time above a threshold, accumulated cost, or total energy). In this paper, we introduce DOPE …
-
TEMPO: Scaling Test-time Training for Large Reasoning Models
Test-time training (TTT) adapts model parameters on unlabeled test instances at inference time, which continuously extends capabilities beyond the reach of offline training. Despite initial gains, existing TTT methods for large reasoning models (LRMs) plateau quickly and do not benefit from additional test-time compute. Without external calibration, the self-generated reward signal increasingly drifts as the policy mo…
-
Location Not Found: Exposing Implicit Local and Global Biases in Multilingual LLMs
Multilingual large language models (LLMs) have minimized the fluency gap between languages. This advancement, however, exposes models to the risk of biased behavior, as knowledge and norms may propagate across languages. In this work, we aim to quantify models' inter- and intra-lingual biases, via their ability to answer locale-ambiguous questions. To this end, we present LocQA, a test set contain…
-
Community Detection with the Canonical Ensemble
Network community detection is usually considered an unsupervised learning problem. Given a network, the aim is to partition it using some general-purpose algorithm. In this paper we instead treat community detection as a hypothesis testing problem. Given a network, we examine the evidence for specific community structure in the observed network compared to a null model. To do this we define an…
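The hypothesis-testing view can be illustrated with a minimal sketch. This is not the paper's canonical-ensemble construction; as a stand-in null model it simply permutes community labels and asks how often a random labelling matches the observed modularity of a planted partition.

```python
import numpy as np

rng = np.random.default_rng(0)

def modularity(A, labels):
    """Newman modularity of a partition of an undirected graph (adjacency A)."""
    k = A.sum(axis=1)
    two_m = k.sum()
    same = labels[:, None] == labels[None, :]
    return ((A - np.outer(k, k) / two_m) * same).sum() / two_m

# Two 5-node cliques joined by a single bridge edge: strong planted structure.
A = np.zeros((10, 10))
A[:5, :5] = 1
A[5:, 5:] = 1
np.fill_diagonal(A, 0)
A[4, 5] = A[5, 4] = 1

labels = np.array([0] * 5 + [1] * 5)   # the planted two-community partition
q_obs = modularity(A, labels)

# Null distribution: modularity of the same graph under random label permutations.
q_null = np.array([modularity(A, rng.permutation(labels)) for _ in range(500)])
p_value = (q_null >= q_obs).mean()     # evidence against the "no structure" null
```

A small p-value indicates the planted split scores far better than label-shuffled alternatives, which is the hypothesis-testing framing in miniature.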
-
Orthogonal reparametrization of the Nelson-Siegel-Svensson interest rate curve model: conditioning, diagnostics, and identifiability
The Nelson-Siegel-Svensson (NSS) interest rate curve model yields a separable nonlinear least-squares problem whose inner linear block is often ill-conditioned because the basis functions become nearly collinear. We analyze this instability via an exact orthogonal reparametrization of the design matrix: a thin QR decomposition produces orthogonal linear parameters for which, conditional on the non…
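The thin-QR idea can be sketched in a few lines, assuming the standard NSS basis with the decay parameters held fixed; the decay values and synthetic yields below are illustrative, not taken from the paper.

```python
import numpy as np

def nss_design(tau, lam1, lam2):
    """Standard NSS basis columns at maturities tau, for fixed decays lam1, lam2."""
    x1, x2 = tau / lam1, tau / lam2
    f1 = (1 - np.exp(-x1)) / x1          # slope factor
    f2 = f1 - np.exp(-x1)                # first curvature factor
    f3 = (1 - np.exp(-x2)) / x2 - np.exp(-x2)  # second curvature factor
    return np.column_stack([np.ones_like(tau), f1, f2, f3])

tau = np.linspace(0.25, 30.0, 60)
# Nearly equal decays make the two curvature columns almost collinear.
X = nss_design(tau, lam1=1.4, lam2=1.5)

# Thin QR: Q has orthonormal columns, so the rotated linear parameters
# c = R @ beta are solved in a perfectly conditioned basis.
Q, R = np.linalg.qr(X)
y = X @ np.array([0.03, -0.01, 0.02, 0.01])  # synthetic observed yields
c = Q.T @ y                    # orthogonal linear parameters
beta = np.linalg.solve(R, c)   # map back to the original NSS coefficients

print(np.linalg.cond(X) > np.linalg.cond(Q))  # True: the rotated block is well-conditioned
```

The ill-conditioning is confined to the triangular back-substitution through `R`; the inner least-squares solve itself happens in the orthonormal basis.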
-
Mass Matrix Assembly on Tensor Cores for Implicit Particle-In-Cell Methods
Matrix-multiply-accumulate (MMA) units, or tensor cores, are now widespread across modern computing architectures. Yet, their use for particle-grid operators remains limited. In implicit particle methods, mass-matrix assembly is a reduction-dominated kernel in which weighted outer products of interpolation weights are accumulated over particle support. We show that this operation can be reformulat…
-
When Transparency Falls Short: Auditing Platform Moderation During a High-Stakes Election
During major political events, social media platforms encounter increased systemic risks. However, it is still unclear if and how they adjust their moderation practices in response. The Digital Services Act Transparency Database provides, for the first time, an opportunity to systematically examine content moderation at scale, allowing researchers and policymakers to evaluate platforms' compliance a…
-
Spatio-temporal modelling of electric vehicle charging demand
Accurate forecasting of electric vehicle (EV) charging demand is critical for grid management and infrastructure planning. Yet the field continues to rely on legacy benchmarks, such as the Palo Alto (2020) dataset, that fail to reflect the scale and behavioral diversity of modern charging networks. To address this, we introduce a novel large-scale longitudinal dataset collected across Scotland (20…
-
Beyond Semantic Similarity: A Component-Wise Evaluation Framework for Medical Question Answering Systems with Health Equity Implications
The use of Large Language Models (LLMs) to support patients in addressing medical questions is becoming increasingly prevalent. However, most of the measures currently used to evaluate the performance of these models in this context only measure how closely a model's answers semantically match reference answers, and therefore do not provide a true indication of the model's medical accuracy or of the health equity r…
-
A Lagrangian framework for canonical analysis of the Holst model with $β = 0$
We perform a canonical analysis of the Holst model for General Relativity, within the framework laid out in arXiv:2401.07307 and arXiv:2010.07725, distinguishing our approach by setting the Barbero parameter to $β = 0$ and leaving the lapse and shift functions unconstrained. The $β = 0$ choice is of particular interest because it is viable across all dimensions, providing a necessary foundation for e…
-
Explicit Trait Inference for Multi-Agent Coordination
LLM-based multi-agent systems (MAS) show promise on complex tasks but remain prone to coordination failures such as goal drift, error cascades, and misaligned behaviors. We propose Explicit Trait Inference (ETI), a psychologically grounded method for improving coordination. ETI enables agents to infer and track partner characteristics along two established psychological dimensions--warmth (e.g., t…
-
Designing Transparent AI-Mediated Language Support for Intergenerational Family Communication
Intergenerational linguistic differences pose challenges to effective and intimate family communication. This paper presents GenSync, a chat-based interface that supports intergenerational understanding through different forms of translation visibility. We conducted a controlled within-subjects study with 16 family dyads (32 participants), comparing three conditions: no translation, black-box tran…
-
Scheduling Analysis of UAV Flight Control Workloads on Raspberry Pi 5 Using PREEMPT_RT Linux
Modern UAV architectures increasingly aim to unify high-level autonomy and low-level flight control on a single General-Purpose Operating System (GPOS). However, complex multi-core System-on-Chips (SoCs) introduce significant timing indeterminism due to shared resource contention. This paper performs an architectural analysis of the PREEMPT_RT Linux kernel on a Raspberry Pi 5, specifically isolati…
-
HarDBench: A Benchmark for Draft-Based Co-Authoring Jailbreak Attacks for Safe Human-LLM Collaborative Writing
Large language models (LLMs) are increasingly used as co-authors in collaborative writing, where users begin with rough drafts and rely on LLMs to complete, revise, and refine their content. However, this capability poses a serious safety risk: malicious users could jailbreak the models by filling incomplete drafts with dangerous content, forcing them into generating harmful outputs. In this paper, w…
-
Sparsification of Precoding Codebooks for PAPR Reduction via Grassmannian Representations
In this letter, we propose a sparsification method for precoding codebooks that reduces the peak-to-average power ratio (PAPR) while preserving the achievable rate. By exploiting the fact that precoder matrices lie on the Grassmann manifold, we formulate a codebook design problem that enables sparsification without modifying the existing feedback mechanism. We develop two sparsification approaches…
-
Symplectic Error of Implicit Symplectic Integrators: A Qualitative Structural Analysis
We study how inexact nonlinear solvers lead to a loss of exact symplecticity in the Symplectic Euler (SE) and Störmer-Verlet (SV) schemes when applied to general nonseparable Hamiltonian systems. These schemes are implicit and require nonlinear solvers in practice. Here, we consider a fixed number $M$ of fixed-point iterations (FPI). While SE is exactly symplectic under exact solves, a finite $M$ …
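The effect is easy to see numerically. The sketch below uses an illustrative nonseparable Hamiltonian (not one from the paper), truncates the fixed-point iteration of Symplectic Euler at $M$ steps, and measures how far the one-step map's Jacobian determinant departs from 1; for one degree of freedom, symplecticity is exactly area preservation.

```python
import numpy as np

# Illustrative nonseparable Hamiltonian: H(q, p) = 0.5 * (q**2 + 1) * p**2
def dH_dq(q, p): return q * p**2
def dH_dp(q, p): return (q**2 + 1) * p

def se_step(q, p, h, M):
    """Symplectic Euler step; the implicit equation for the new momentum is
    solved with M fixed-point iterations (exact only as M -> infinity)."""
    p_new = p
    for _ in range(M):
        p_new = p - h * dH_dq(q, p_new)
    q_new = q + h * dH_dp(q, p_new)
    return q_new, p_new

def symplectic_defect(q, p, h, M, eps=1e-6):
    """|det(J) - 1| for the one-step map, with J estimated by central differences."""
    def step(z):
        return np.array(se_step(z[0], z[1], h, M))
    z = np.array([q, p])
    J = np.zeros((2, 2))
    for i in range(2):
        dz = np.zeros(2)
        dz[i] = eps
        J[:, i] = (step(z + dz) - step(z - dz)) / (2 * eps)
    return abs(np.linalg.det(J) - 1.0)

# The symplecticity defect shrinks as the iteration count M grows.
d2 = symplectic_defect(0.7, 0.4, h=0.1, M=2)
d8 = symplectic_defect(0.7, 0.4, h=0.1, M=8)
print(d8 < d2)
```

With the solver truncated at small $M$, the map is measurably non-area-preserving; increasing $M$ drives the defect down toward the finite-difference noise floor.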
-
Effective Traveling for Metric Instances of the Traveling Thief Problem
The Traveling Thief Problem (TTP) is a multi-component optimization problem that captures the interplay between routing and packing decisions by combining the classical Traveling Salesperson Problem (TSP) and the Knapsack Problem (KP). The TTP has gained significant attention in the evolutionary computation literature and a wide range of approaches have been developed over the last 10 years. Judgi…
-
Warmth and Competence in the Swarm: Designing Effective Human-Robot Teams
As groups of robots increasingly collaborate with humans, understanding how humans perceive them is critical for designing effective human-robot teams. While prior research examined how humans interpret and evaluate the abilities and intentions of individual agents, social perception of robot teams remains relatively underexplored. Drawing on the competence-warmth framework, we conducted two studi…