-
Objective Shaping with Hard Negatives: Windowed Partial AUC Optimization for RL-based LLM Recommenders
Reinforcement learning (RL) effectively optimizes Large Language Model (LLM)-based recommenders by contrasting positive and negative items. Empirically, training with beam-search negatives consistently outperforms random negatives, yet the mechanism is not well understood. We address this gap by analyzing the induced optimization objective and show that: (i) Under binary reward feedback, optimizin…
-
Measuring and Mitigating Persona Distortions from AI Writing Assistance
Hundreds of millions of people use artificial intelligence (AI) for writing assistance. Here, we evaluated how AI writing assistance distorts writer personas - their perceived beliefs, personality, and identity. In three large-scale experiments, writers (N=2,939) wrote political opinion paragraphs with and without AI assistance. Separate groups of readers (N=11,091) blindly evaluated these paragra…
-
Numerical homogenization for indefinite time-harmonic Maxwell equations
We propose a novel numerical homogenization method based on the edge multiscale approach for solving indefinite time-harmonic Maxwell equations in heterogeneous media with large wavenumber. Numerical methods for these equations in homogeneous media with high wavenumber are particularly challenging due to the so-called pollution effect: the mesh size must be significantly smaller than the reciproca…
-
Decoding High-Dimensional Finger Motion from EMG Using Riemannian Features and RNNs
Continuous estimation of high-dimensional finger kinematics from forearm surface electromyography (EMG) could enable natural control for hand prostheses, AR/XR interfaces, and teleoperation. However, the complexity of human hand gestures and the entanglement of forearm muscles make accurate recognition intrinsically challenging. Existing approaches typically reduce task complexity by relying on cl…
-
CGC: Compositional Grounded Contrast for Fine-Grained Multi-Image Understanding
Although Multimodal Large Language Models (MLLMs) have advanced rapidly, they still face notable challenges in fine-grained multi-image understanding, often exhibiting spatial hallucination, attention leakage, and failures in object constancy. In addition, existing approaches typically rely on expensive human annotations or large-scale chain-of-thought (CoT) data generation. We propose Composition…
-
Catheter Monitoring in Intelligent Endovascular Navigation Systems: Interactive Simulations and Mixed Reality for Enhanced Navigational Awareness
Purpose: Developing and testing a framework that integrates real-time catheter shape reconstruction, interactive simulations, and mixed reality visualization to enable accurate monitoring of catheter-vessel interactions during endovascular navigation. Methods: A finite element model (FEM) of the venous pathway from the right femoral vein to the inferior vena cava was generated from computed tomo…
-
Deep Learning for Model Calibration in Simulation of Itaconic Acid Production
In this study, deep learning is used to estimate kinetic parameters for modeling itaconic acid production based on real batch experiments conducted at different agitation speeds and reactor scales. Two deep learning strategies, namely direct deep learning (DDL) and generative conditional flow matching (CFM), are compared and benchmarked against nonlinear regression as a reference method. Compared w…
-
On the equivalence of semidefinite programming and zero-sum semidefinite games
By results of Dantzig (1951) and Adler (2013), computing the optimal solutions of a linear program is equivalent to finding optimal strategies in zero-sum bimatrix games. Dantzig's original result was incomplete, in the sense that the reduction of a linear program to a zero-sum game did not work for all possible linear programs. We show that, under a natural constraint qualification requiring ei…
-
FedSPDnet: Geometry-Aware Federated Deep Learning with SPDnet
We introduce two federated learning frameworks for the classical SPDnet model operating on symmetric positive definite (SPD) matrices with Stiefel-constrained parameters. Unlike standard Euclidean averaging, which violates orthogonality, our approach preserves geometric structure through two efficient aggregation strategies: ProjAvg, projecting arithmetic means onto the Stiefel manifold, and RLAvg…
-
MTT-Bench: Predicting Social Dominance in Mice via Multimodal Large Language Models
Understanding social dominance in animal behavior is critical for neuroscience and behavioral studies. In this work, we explore the capability of Multimodal Large Language Models (MLLMs) to analyze raw behavioral video of mice and predict their dominance hierarchy. We introduce MTT-Bench, a novel benchmark comprising annotated videos of pairwise mouse interactions for Mouse Tube Test analysis. Buil…
-
Point & Grasp: Flexible Selection of Out-of-Reach Objects Through Probabilistic Cue Integration
Selecting out-of-reach objects is a fundamental task in mixed reality (MR). Existing methods rely on a single cue or deterministically fuse multiple cues, leading to performance degradation when the dominant cue becomes unreliable. In this work, we introduce a probabilistic cue integration framework that enables flexible combination of multiple user-generated cues for intent inference. Inspired by…
-
Radial evolution of Alfvén wave Parametric Decay Instability in the near-Sun solar wind: Effects of Temperature Anisotropy
Parametric decay instability (PDI) of Alfvén waves is thought to play an important role in the dissipation of large-amplitude Alfvén waves and in the heating of magnetized plasmas. Temperature anisotropy is frequently observed by spacecraft, including Parker Solar Probe (PSP), in the near-Sun solar wind, yet its impact on PDI in the near-Sun solar wind has been understudied. We calculate the ma…
-
Citation-Driven Multi-View Training for Patent Embeddings: QaECTER and Sophia-Bench
Patent retrieval underpins critical decisions in innovation, examination, and IP strategy, yet progress has been hampered by the absence of benchmarks that reflect the diversity of real-world search scenarios. We address this gap with two contributions. First, we introduce Sophia-Bench, a large-scale patent retrieval benchmark comprising 10,000 queries and 75,000 corpus documents stratified across …
-
Non static exponential turnpike property for optimal control problems with symmetries and boundary conditions
Optimal control problems with symmetries often admit a non-stationary turnpike property called trim turnpike, which characterizes the convergence of optimal solutions to certain symmetry-induced trajectories called trim primitives. In this paper we establish an exponential trim turnpike property for a class of optimal control problems with structural properties related to Abelian Lie group symmetr…
-
Magnetic Indoor Localization through CNN Regression and Rotation Invariance
Indoor positioning is an essential technology for a wide range of applications in GNSS-denied environments, including indoor navigation and IoT systems. Combining convolutional neural networks (CNNs) and magnetic field-based features offers a low-cost, infrastructure-free solution for precise positioning. While magnetic fingerprints are a promising approach for indoor positioning, models trained o…
-
Holo360D: A Large-Scale Real-World Dataset with Continuous Trajectories for Advancing Panoramic 3D Reconstruction and Beyond
While feed-forward 3D reconstruction models have advanced rapidly, they still exhibit degraded performance on panoramas due to spherical distortions. Moreover, existing panoramic 3D datasets are predominantly collected with 360 cameras fixed at discrete locations, resulting in discontinuous trajectories. These limitations critically hinder the development of panoramic feed-forward 3D reconstructio…
-
AI-based experts' knowledge visualization of cultural heritage: A case study of Terracotta Warriors
Advancements in 3D modeling, digital display technologies, and the growing availability of digital cultural heritage data have significantly improved the accuracy of heritage depictions and expanded opportunities for analysis. However, while many studies focus on presenting specific cultural heritage figurines, an often overlooked aspect is the visualization of the Terracotta Warriors as a unified enti…
-
Improving Driver Drowsiness Detection via Personalized EAR/MAR Thresholds and CNN-Based Classification
Driver drowsiness is a major cause of traffic accidents worldwide, posing a serious threat to public safety. Vision-based driver monitoring systems often rely on fixed Eye Aspect Ratio (EAR) and Mouth Aspect Ratio (MAR) thresholds; however, such fixed values frequently fail to generalize across individuals due to variations in facial structure, illumination, and driving conditions. This paper prop…
-
Time-Frequency Pilot Sequence Design and LoS Delay-Doppler Estimation
We present a novel framework for line-of-sight (LoS) delay-Doppler (DD) estimation in dense scattering propagation environments. We present two time-frequency (TF) domain pilot sequences inspired by the Zadoff-Chu sequence that exhibit desirable autocorrelation properties. Further, we present a twisted convolution-based approach for LoS DD estimation directly from the TF-domain received signal, av…
-
Contrastive Semantic Projection: Faithful Neuron Labeling with Contrastive Examples
Neuron labeling assigns textual descriptions to internal units of deep networks. Existing approaches typically rely on highly activating examples, often yielding broad or misleading labels by focusing on dominant but incidental visual factors. Prior work such as FALCON introduced contrastive examples -- inputs that are semantically similar to activating examples but elicit low activations -- to sh…
-
All Eyes on the Workflow: Automated and Efficient Event Discovery from Video Streams
Disciplines such as business process management and process mining aid organizations by discovering insights about processes on the basis of recorded event data. However, an obstacle to process analysis is data multi-modality: for instance, data in video form are not directly interpretable as events. In this work, we present SnapLog, an approach to extract event data from videos by converting fram…
-
Test Design and Review Argumentation in AI-Assisted Test Generation
AI assistants can increasingly generate and evolve test cases. The challenge is no longer merely to produce them, but also to help engineers understand why a generated artefact exists and what supports it. Existing work has focused on classifying testing techniques, linking requirements to tests and structuring system assurance arguments, but it does not explicitly represent the argumentation behi…
-
Error of discretization of Caputo fractional derivative in weighted spaces
We establish uniform error bounds for the L1 discretization of the Caputo fractional derivative of functions from a weighted Sobolev space with weight belonging to the Muckenhoupt class. We show how our framework works for several examples of weights that belong to the Muckenhoupt class. As an application, we show the convergence of the L1 scheme for the fractional ODE. Finally, we verif…
-
The manifold of unitary and symmetric matrices: characterization, Riemannian optimization and application to BD-RIS design
This paper proposes and analyzes Riemannian optimization algorithms on the manifold of unitary and symmetric matrices, denoted $\mathcal{U}_s$, which naturally models the scattering matrices of passive and reciprocal devices such as beyond-diagonal reconfigurable intelligent surfaces (BD-RISs). Despite its relevance, the geometry of $\mathcal{U}_s$ has remained largely unexplored, and existing BD-RI…
-
DM-ASR: Diarization-aware Multi-speaker ASR with Large Language Models
Multi-speaker automatic speech recognition (ASR) aims to transcribe conversational speech involving multiple speakers, requiring the model to capture not only what was said, but also who said it and sometimes when it was spoken. Recent Speech-LLM approaches have shown the potential of unified modeling for this task, but jointly learning speaker attribution, temporal structure, and lexical recognit…