-
An Answer is just the Start: Related Insight Generation for Open-Ended Document-Grounded QA
Answering open-ended questions remains challenging for AI systems because it requires synthesis, judgment, and exploration beyond factual retrieval, and users often refine answers through multiple iterations rather than accepting a single response. Existing QA benchmarks do not explicitly support this refinement process. To address this gap, we introduce a new task, document-grounded related insig…
-
PREF-XAI: Preference-Based Personalized Rule Explanations of Black-Box Machine Learning Models
Explainable artificial intelligence (XAI) has predominantly focused on generating model-centric explanations that approximate the behavior of black-box models. However, such explanations often overlook a fundamental aspect of interpretability: different users require different explanations depending on their goals, preferences, and cognitive constraints. Although recent work has explored user-cent…
-
Mask World Model: Predicting What Matters for Robust Robot Policy Learning
World models derived from large-scale video generative pre-training have emerged as a promising paradigm for generalist robot policy learning. However, standard approaches often focus on high-fidelity RGB video prediction, which can result in overfitting to irrelevant factors, such as dynamic backgrounds and illumination changes. These distractions reduce the model's ability to generalize, ultimate…
-
Frequency-Forcing: From Scaling-as-Time to Soft Frequency Guidance
While standard flow-matching models transport noise to data uniformly, incorporating an explicit generation order - specifically, establishing coarse, low-frequency structure before fine detail - has proven highly effective for synthesizing natural images. Two recent works offer distinct paradigms for this. K-Flow imposes a hard frequency constraint by reinterpreting a frequency scaling variable a…
-
IR-Flow: Bridging Discriminative and Generative Image Restoration via Rectified Flow
In image restoration, single-step discriminative mappings often lack fine details due to expectation learning, whereas generative paradigms suffer from inefficient multi-step sampling and noise-residual coupling. To address this dilemma, we propose IR-Flow, a novel image restoration method based on Rectified Flow that serves as a unified framework bridging the gap between discriminative and generativ…
-
MMControl: Unified Multi-Modal Control for Joint Audio-Video Generation
Recent advances in Diffusion Transformers (DiTs) have enabled high-quality joint audio-video generation, producing videos with synchronized audio within a single model. However, existing controllable generation frameworks are typically restricted to video-only control. This restricts comprehensive controllability and often leads to suboptimal cross-modal alignment. To bridge this gap, we present M…
-
Exploring Language-Agnosticity in Function Vectors: A Case Study in Machine Translation
Function vectors (FVs) are vector representations of tasks extracted from model activations during in-context learning. While prior work has shown that multilingual model representations can be language-agnostic, it remains unclear whether the same holds for function vectors. We study whether FVs exhibit language-agnosticity, using machine translation as a case study. Across three decoder-only mul…
-
Learning Hybrid-Control Policies for High-Precision In-Contact Manipulation Under Uncertainty
Reinforcement learning-based control policies have been frequently demonstrated to be more effective than analytical techniques for many manipulation tasks. Commonly, these methods learn neural control policies that predict end-effector pose changes directly from observed state information. For tasks like inserting delicate connectors which induce force constraints, pose-based policies have limite…
-
MedFlowSeg: Flow Matching for Medical Image Segmentation with Frequency-Aware Attention
Flow matching has recently emerged as a principled framework for learning continuous-time transport maps, enabling efficient deterministic generation without relying on stochastic diffusion processes. While generative modeling has shown promise for medical image segmentation, particularly in capturing uncertainty and complex anatomical variability, existing approaches are predominantly built upon …
-
Resolved UV-Optical HST Imaging and Spectral Energy Distribution Modeling of Nearby BAT Active Galactic Nuclei
We use high-resolution UV-to-optical imaging from the Hubble Space Telescope (HST) to construct spatially resolved spectral energy distributions (SEDs) for seven nearby ($z<0.07$) hard (14--195$\,$keV) X-ray-selected broad-line active galactic nuclei (AGN) with $L_{\rm bol}=10^{43.26}-10^{45.34}\,\rm{erg\,s^{-1}}$. The high spatial resolution of HST, which physically resolves structures on the sca…
-
InHabit: Leveraging Image Foundation Models for Scalable 3D Human Placement
Training embodied agents to understand 3D scenes as humans do requires large-scale data of people meaningfully interacting with diverse environments, yet such data is scarce. Real-world motion capture is costly and limited to controlled settings, while existing synthetic datasets rely on simple geometric heuristics that ignore rich scene context. In contrast, 2D foundation models trained on intern…
-
Budgeted Online Influence Maximization
We introduce a new budgeted framework for online influence maximization, considering the total cost of an advertising campaign instead of the common cardinality constraint on a chosen influencer set. Our approach better models the real-world setting where the cost of influencers varies and advertisers want to find the best value for their overall social advertising budget. We propose an algorithm …
-
Multi-Cycle Spatio-Temporal Adaptation in Human-Robot Teaming
Effective human-robot teaming is crucial for the practical deployment of robots in human workspaces. However, optimizing joint human-robot plans remains a challenge due to the difficulty of modeling individualized human capabilities and preferences. While prior research has leveraged the multi-cycle structure of domains like manufacturing to learn an individual's tendencies and adapt plans over re…
-
HardNet++: Nonlinear Constraint Enforcement in Neural Networks
Enforcing constraint satisfaction in neural network outputs is critical for safety, reliability, and physical fidelity in many control and decision-making applications. While soft-constrained methods penalize constraint violations during training, they do not guarantee constraint adherence during inference. Other approaches guarantee constraint satisfaction via specific parameterizations or a proj…
-
Abstract null hypersurfaces and characteristic initial value problems in General Relativity
This thesis is framed within the field of Mathematical Relativity and is organized into six chapters. After an introduction to the topic in Chapter 1, Chapter 2 reviews and further develops the formalism of hypersurface data, which provides the unifying framework for the entire thesis. In Chapter 3 we study the characteristic Cauchy problem from a fully detached perspective. Chapter 4 is devoted t…
-
Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language
At present, executable visual workflows have emerged as a mainstream paradigm in real-world industrial deployments, offering strong reliability and controllability. However, in current practice, such workflows are almost entirely constructed through manual engineering: developers must carefully design workflows, write prompts for each step, and repeatedly revise the logic as requirements evolve-ma…
-
ECLASS-Augmented Semantic Product Search for Electronic Components
Efficient semantic access to industrial product data is a key enabler for factory automation and emerging LLM-based agent workflows, where both human engineers and autonomous agents must identify suitable components from highly structured catalogs. However, the vocabulary mismatch between natural-language queries and attribute-centric product descriptions limits the effectiveness of traditional re…
-
From Top-1 to Top-K: A Reproducibility Study and Benchmarking of Counterfactual Explanations for Recommender Systems
Counterfactual explanations (CEs) provide an intuitive way to understand recommender systems by identifying minimal modifications to user-item interactions that alter recommendation outcomes. Existing CE methods for recommender systems, however, have been evaluated under heterogeneous protocols, using different datasets, recommenders, metrics, and even explanation formats, which hampers reproducib…
-
Modelling time-order effects in haptic perception with a Bayesian dynamical framework
Perceptual judgments of sequential stimuli are systematically biased by prior expectations and by the temporal structure of sensory input. In haptic discrimination tasks, these effects often manifest as time-order asymmetries, whereby the perceived difference between two stimuli depends on their presentation order. Here, we introduce a dynamical Bayesian model that accounts for these biases by com…
-
Pilot-Free Predictive Multi-User Beamforming via Sensing Management in Cell-Free Networks
This paper presents a sensing management framework for integrated sensing and communications (ISAC) within cell-free massive multiple-input multiple-output (MIMO) systems to reduce pilot-based channel state information (CSI) acquisition overhead. Conventional communication systems rely on frequent channel estimation procedures that impose significant signaling overhead, consuming valuable time-f…
-
Disentangling Damage from Operational Variability: A Label-Free Self-Supervised Representation Learning Framework for Output-Only Structural Damage Identification
Damage identification is a core task in structural health monitoring. In practice, however, its reliability is often compromised by confounding non-damage effects, such as variations in excitation and environmental conditions, which can induce changes comparable to or larger than those caused by structural damage. To address this challenge, this study proposes a self-supervised label-free disentan…
-
An AI Agent Execution Environment to Safeguard User Data
AI agents promise to serve as general-purpose personal assistants for their users, which requires them to have access to private user data (e.g., personal and financial information). This poses a serious risk to security and privacy. Adversaries may attack the AI model (e.g., via prompt injection) to exfiltrate user data. Furthermore, sharing private data with an AI agent requires users to trust a…
-
Pause or Fabricate? Training Language Models for Grounded Reasoning
Large language models have achieved remarkable progress on complex reasoning tasks. However, they often implicitly fabricate information when inputs are incomplete, producing confident but unreliable conclusions -- a failure mode we term ungrounded reasoning. We argue that this issue arises not from insufficient reasoning capability, but from the lack of inferential boundary awareness -- the abili…
-
FEPLB: Exploiting Copy Engines for Nearly Free MoE Load Balancing in Distributed Training
Fine-grained, per-micro-batch load balancing is essential for efficient Mixture-of-Experts (MoE) training, yet every prior dynamic scheduling scheme pays for it with extra communication that is hard to hide, especially on modern bulk-transfer backends such as DeepEP. We make a simple but consequential observation: on the NVIDIA Hopper architecture the NVLink Copy Engine can move data between intra…
-
A Dual Perspective on Synthetic Trajectory Generators: Utility Framework and Privacy Vulnerabilities
Human mobility data are used in numerous applications, ranging from public health to urban planning. Human mobility is inherently sensitive, as it can contain information such as religious beliefs and political affiliations. Historically, it has been proposed to modify the information using techniques such as aggregation, obfuscation, or noise addition, to adequately protect privacy and eliminate …