-
Confidence is Not Reliability: Rethinking MC Dropout in Brain Tumour Segmentation
Glioma segmentation in multiparametric MRI is a critical component of treatment planning. A segmentation model that fails silently on treatment-critical sub-regions represents a patient safety risk that overlap-based metrics such as Dice scores cannot expose. We ask whether voxel-level uncertainty estimation via Monte Carlo (MC) Dropout can reliably identify segmentation errors in clinically criti…
-
Does VLA Even Know the Basics? Measuring Commonsense and World Knowledge Retention in Vision-Language-Action Models
Embodied Vision-Language-Action (VLA) models are typically obtained by fine-tuning powerful pretrained VLMs on robotics data, yet it is unclear how much commonsense and factual knowledge they retain after adaptation. Failures on knowledge-sensitive tasks are ambiguous, conflating missing knowledge with poor generalization of low-level control. We introduce Act2Answer, a lightweight protocol that a…
-
A New Methodology for Classifying Eclipsing Binaries with Kepler Data and Deep Learning
We present a new method for the automated classification of eclipsing binaries into contact, detached, and semi-detached types using Kepler data. Phase-folded light curves are generated and chi-square vs. box size plots are constructed by comparing flux values to the median flux, revealing distinct class patterns. These patterns were first modelled using a polynomial damped sinusoidal function, w…
-
Risk Stratification for ICU Delirium using Pervasive Ambient Sensing Information
Delirium is a common and serious complication in the Intensive Care Unit (ICU), associated with increased morbidity, prolonged hospital stays, and higher healthcare costs. Despite its prevalence, early prediction and prevention remain challenging. Environmental factors such as ambient sound and light may influence the onset of delirium, yet they are often overlooked in risk assessments. In this st…
-
A Potential Black Hole Mimicker From Non-Minimal Coupling
We present a class of horizonless, regular ultra-compact objects arising in a theory of gravity which allows curvature-fluid coupling. The non-minimal interaction between fluid variables and the Ricci scalar generates a vacuum-like equation of state in the interior, while the exterior remains exactly Schwarzschild. The two spacetimes are glued through a shell at the junction. The interior metric i…
-
Correct Yourself, Keep My Trust: How Self-Correction and Social Connection Shape Credibility in Social Chatbots
When social chatbots make mistakes, and they do, how they recover determines whether users trust them again. Social chatbots are increasingly integrated into everyday life, yet they remain prone to generating convincing but inaccurate information. The social connection they build with users makes such errors particularly consequential. We conducted a between-subjects experiment (N=120) comparing t…
-
Direct Tests of Black Hole Accretion Rate Prescriptions: I. Bondi Accretion at Different Scales
We present spatially resolved parsec-scale measurements of nuclear conditions (gas density and kinetic temperature) relevant for black hole accretion rate predictions in the Seyfert 2 galaxy NGC 1068. We inject these parameters into the prescription for a Bondi-like accretion model, then compare the resulting accretion rate prediction to the empirical accretion rate derived from hard X-ray observ…
-
The first detection of dense gas in a massive main-sequence galaxy at cosmic noon
Dense gas is the direct fuel for star formation, but measuring it has long been difficult at z>2, especially in typical star-forming main-sequence galaxies. In this work, we report the first detection of HNC (J = 5–4) and CN (N = 4–3) emission in a massive main-sequence galaxy, BX610, at z=2.21. The velocity-integrated emission of HNC(5–4)+CN(4–3) is concentrated in the galactic centre, coinci…
-
NeSyCat Torch: A Differentiable Tensor Implementation of Categorical Semantics for Neurosymbolic Learning
Neurosymbolic semantics is fragmented: classical, fuzzy, probabilistic and neural systems each define truth by their own inductive rules. NeSyCat, extending ULLER, subsumes them under a single inductive definition of truth, parametric in a strong monad and an aggregation structure on truth-values. NeSyCat has so far lacked an account of predicates and functions learned by neural networks. We pro…
-
A Unified Framework for Efficient Remote Sensing Visual Question Answering: Adapting Dual, Hybrid, and Encoder-Decoder Architectures
Visual Question Answering (VQA) in the Remote Sensing (RS) domain presents unique challenges due to the high resolution, multi-scale object distribution, and semantic complexity of aerial imagery. While general-domain Foundation Models have achieved remarkable success, their direct application to RSVQA is hindered by massive domain shifts and the computationally prohibitive nature of full fine tun…
-
Improved proper motion and gravity tests with PSR J1913+1102
PSR J1913+1102 is a highly asymmetric double neutron star system and an excellent laboratory for testing scalar-tensor gravity theories, as well as a potential progenitor analogue of GW170817 that will merge in 470 Myr. We present an updated timing analysis combining 13 years of historical Arecibo observations and new FAST measurements, using two approaches to model dispersion-measure variations. …
-
Reconstruction Limits for Repeated Differentially Private Aggregates: A Cramer-Rao Perspective on Query Geometry
Repeated differentially private (DP) releases are often evaluated by transcript length or cumulative privacy accounting. We show that these quantities do not by themselves determine local reconstruction risk. For Gaussian-calibrated repeated statistical queries, the key object is the nuisance-profiled Fisher geometry of the release sequence: repetition helps only when new releases create identifia…
-
TurboServe: Serving Streaming Video Generation Efficiently and Economically
Streaming video generation is emerging as a new serving workload in which users interact with long-lived sessions that generate video progressively, chunk by chunk. Unlike offline video generation or typical LLM serving, streaming video generation must preserve session state across active and idle periods, repeatedly schedule ongoing sessions, and deliver each chunk under a tight latency target. T…
-
Beyond Algorithms: Conceptual Innovation in Medical Imaging AI
Artificial intelligence has driven rapid progress in medical imaging research, producing increasingly sophisticated algorithms and steady improvements on benchmark tasks. However, this algorithm-centric trajectory has also revealed a growing imbalance: while computational methods advance rapidly, the conceptual foundations that define imaging tasks, evaluation metrics, and clinical meaning sometim…
-
Scoring Backends Matter More Than Pooling: A Systematic Study of Training-Free Anomalous Sound Detection under Domain Shift
Training-free anomalous sound detection (ASD) scores a test clip against a memory bank of normal embeddings from a frozen pretrained audio encoder. Recent work attributes domain-shift robustness mainly to how frame-level features are pooled over time; the scoring backend applied on top of the pooled embedding has received far less systematic attention. Using a single frozen BEATs encoder on the DC…
-
A Mixed-Reality Testbed for Autonomous Vehicles
We propose a mixed-reality, hardware-in-the-loop (HIL) testbed for autonomous vehicles that seamlessly integrates a physical testbed of mobile robots with a high-fidelity simulation environment. The virtual simulation enables the creation of diverse, safety-critical driving scenarios to validate state-of-the-art perception, planning, and control algorithms, while augmenting simulations with physic…
-
Trade-offs in Medical LLM Adaptation: An Empirical Study in French QA
The development of large language models (LLMs) has led to an increased focus on their adaptation to specialized domains and languages, yet the effectiveness of domain adaptation strategies remains unclear. We present a study of medical domain adaptation using French medical question-answering (QA) as a case study. We compare continual pretraining (CPT), supervised fine-tuning (SFT), and their com…
-
Shape Sensing of Continuum Robots using Direct Laser Writing
Continuum robots offer a promising approach for minimally invasive and natural-orifice surgical procedures due to their inherent compliance and dexterity. However, this flexibility also makes estimating the current shape of the robot challenging. Several approaches have been used to reconstruct the shape of these robots, including imaging, optical sensing, magnetic sensing, and resistive sensing. …
-
Structured Inference with Large Language Gibbs
The knowledge encoded in large language models (LLMs) can serve as a substrate for structured reasoning over variables describing a complex world, but accessing this knowledge in a probabilistically coherent manner poses a difficult inference problem. We propose Large Language Gibbs, a scheme for structured probabilistic inference that uses conditional distributions of an LLM as transition operato…
-
Digital Speech Acts Retain Control of Copyright with People, Not Platforms
Legal precedents protect computer code as copyrightable expression. They have enabled centralized digital platforms, operating from corporate servers that hold all user data, to construct private governance regimes through the interaction of copyright, contract, and technical architecture: people who create virtually all platform value must surrender effective copyright control through Terms o…
-
Detecting Hidden ML Training With Zero-Overhead Telemetry
Hardware-enabled monitoring of GPU workloads underpins many proposals for AI compute governance, but if developers can defeat monitoring mechanisms, such schemes are unworkable. We evaluate the adversarial robustness of GPU workload classification using only zero-overhead, privacy-preserving NVML telemetry: content-agnostic signals that observe physical effects of computation without accessing mod…
-
A Multi-Domain Benchmark for Detecting AI-Generated Text-Rich Images from GPT-Image-2
Text-rich images often contain privacy-sensitive, transactional, or decision-relevant information. As recent multimodal image generation models become increasingly capable of synthesizing realistic textual content and structured visual designs, detecting AI-generated text-rich images has become an important challenge for digital trust and content authenticity. Existing benchmarks, however, largely…
-
CABLE: Cloud-Assisted Bandwidth-efficient LMM-based Encoding for V2X Systems
Cloud-hosted large multimodal models (LMMs) can provide strong open-vocabulary perception for Vehicle-to-Everything systems, but naively transmitting full-resolution frames from edge to cloud causes severe communication overhead and high cloud-side prefill latency. We present CABLE, a cloud-assisted bandwidth-efficient LMM-based encoding framework for edge-cloud perception. CABLE propagates the pr…
-
DreamReasoner-8B: Block-Size Curriculum Learning for Diffusion Reasoning Models
Block diffusion language models accelerate decoding through parallel block-wise denoising, yet whether they can be reliably scaled for long chain-of-thought (CoT) reasoning remains unresolved. To this end, we develop DreamReasoner-8B, an open-source block diffusion reasoning model, and conduct a systematic study of how training and inference block sizes affect long-CoT reasoning. Our analysis reve…
-
X+Slides: Benchmarking Audience-Conditioned Slide Generation
Automatically generating slide decks from source documents is an important application of large language models (LLMs). Existing benchmarks primarily assess slide completeness and technical depth, while overlooking the target audience as a critical real-world factor. For instance, specialists demand rigorous proofs, whereas decision-makers prioritize actionable conclusions. To bridge this gap, we …