arXiv paper list - stat.ML updates on arXiv.org

#1 One Permutation Is All You Need: Fast, Reliable Variable Importance and Model Stress-Testing

Authors: Albert Dorador

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.13892

Abstract:
Reliable estimation of feature contributions in machine learning models is essential for trust, transparency and regulatory compliance, especially when models are proprietary or otherwise operate as black boxes. While permutation-based methods are a standard tool for this task, classical implementations rely on repeated random permutations, introducing computational overhead and stochastic instability. In this paper, we show that by replacing multiple random permutations with a single, deterministic, and optimal permutation, we achieve a method that retains the core principles of permutation-based importance while being non-random, faster, and more stable. We validate this approach across nearly 200 scenarios, including real-world household finance and credit risk applications, demonstrating improved bias-variance tradeoffs and accuracy in challenging regimes such as small sample sizes, high dimensionality, and low signal-to-noise ratios. Finally, we introduce Systemic Variable Importance, a natural extension designed for model stress-testing that explicitly accounts for feature correlations. This framework provides a transparent way to quantify how shocks or perturbations propagate through correlated inputs, revealing dependencies that standard variable importance measures miss. Two real-world case studies demonstrate how this metric can be used to audit models for hidden reliance on protected attributes (e.g., gender or race), enabling regulators and practitioners to assess fairness and systemic risk in a principled and computationally efficient manner.
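For context, the classical baseline this abstract contrasts itself with (repeated random permutations) can be sketched as follows. This is a minimal illustration only, not the paper's single-permutation method; the function name `permutation_importance`, the squared-error loss, the toy linear model, and the default `n_repeats` are all assumptions made for the example.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Classical permutation importance: mean increase in MSE after shuffling one column."""
    rng = np.random.default_rng(seed)
    base_loss = np.mean((predict(X) - y) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Break the link between feature j and the target by shuffling it.
            X_perm[:, j] = rng.permutation(X[:, j])
            increases.append(np.mean((predict(X_perm) - y) ** 2) - base_loss)
        scores[j] = np.mean(increases)
    return scores

# Toy black box (hypothetical): depends strongly on feature 0, weakly on 1, not on 2.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
predict = lambda X: 3.0 * X[:, 0] + 0.5 * X[:, 1]
y = predict(X)
importance = permutation_importance(predict, X, y)
```

The repeated inner loop is the source of the computational overhead and run-to-run variability that the abstract says a single deterministic permutation avoids.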

#2 Maximum Mean Discrepancy with Unequal Sample Sizes via Generalized U-Statistics

Authors: Aaron Wei, Milad Jalali, Danica J. Sutherland

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.13997

Abstract:
Existing two-sample testing techniques, particularly those based on choosing a kernel for the Maximum Mean Discrepancy (MMD), often assume equal sample sizes from the two distributions. Applying these methods in practice can require discarding valuable data, unnecessarily reducing test power. We address this long-standing limitation by extending the theory of generalized U-statistics and applying it to the usual MMD estimator, resulting in a new characterization of the asymptotic distributions of the MMD estimator with unequal sample sizes (particularly outside the proportional regimes required by previous partial results). This generalization also provides a new criterion for optimizing the power of an MMD test with unequal sample sizes. Our approach preserves all available data, enhancing test accuracy and applicability in realistic settings. Along the way, we give much cleaner characterizations of the variance of MMD estimators, revealing something that might be surprising to those in the area: while zero MMD implies a degenerate estimator, it is sometimes possible to have a degenerate estimator with nonzero MMD as well; we give a construction and a proof that it does not happen in common situations.
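As a point of reference, the usual unbiased MMD estimator is already well defined for samples of different sizes m and n; what the abstract addresses is its asymptotic theory in that setting. The sketch below only computes the estimator. The Gaussian kernel, the bandwidth of 1.0, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased estimate of squared MMD; x has m rows, y has n rows, m != n allowed."""
    m, n = len(x), len(y)
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    # Within-sample terms exclude the diagonal (i == j) pairs.
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2.0 * kxy.mean()

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(300, 2))
y_shifted = rng.normal(1.0, 1.0, size=(120, 2))  # different distribution
y_same = rng.normal(0.0, 1.0, size=(120, 2))     # same distribution as x
mmd_shifted = mmd2_unbiased(x, y_shifted)
mmd_same = mmd2_unbiased(x, y_same)
```

All 300 and 120 points are used; nothing is discarded to equalize the sample sizes.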

#3 On the Hardness of Conditional Independence Testing In Practice

Authors: Zheng He, Roman Pogodin, Yazhe Li, Namrata Deka, Arthur Gretton, Danica J. Sutherland

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.14000

Abstract:
Tests of conditional independence (CI) underpin a number of important problems in machine learning and statistics, from causal discovery to evaluation of predictor fairness and out-of-distribution robustness. Shah and Peters (2020) showed that, contrary to the unconditional case, no universally finite-sample valid test can ever achieve nontrivial power. While informative, this result (based on "hiding" dependence) does not seem to explain the frequent practical failures observed with popular CI tests. We investigate the Kernel-based Conditional Independence (KCI) test - of which we show the Generalized Covariance Measure underlying many recent tests is nearly a special case - and identify the major factors underlying its practical behavior. We highlight the key role of errors in the conditional mean embedding estimate for the Type-I error, while pointing out the importance of selecting an appropriate conditioning kernel (not recognized in previous work) as being necessary for good test power but also tending to inflate Type-I error.

#4 Weighted Conformal Prediction Provides Adaptive and Valid Mask-Conditional Coverage for General Missing Data Mechanisms

Authors: Jiarong Fan, Juhyun Park, Thi Phuong Thuy Vo, Nicolas Brunel

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.14221

Abstract:
Conformal prediction (CP) offers a principled framework for uncertainty quantification, but it fails to guarantee coverage when faced with missing covariates. In addressing the heterogeneity induced by various missing patterns, Mask-Conditional Valid (MCV) Coverage has emerged as a more desirable property than Marginal Coverage. In this work, we adapt split CP to handle missing values by proposing a preimpute-mask-then-correct framework that can offer valid coverage. We show that our method provides guaranteed Marginal Coverage and Mask-Conditional Validity for general missing data mechanisms. A key component of our approach is a reweighted conformal prediction procedure that corrects the prediction sets after distributional imputation (multiple imputation) of the calibration dataset, making our method compatible with standard imputation pipelines. We derive two algorithms, and we show that they are approximately marginally valid and MCV. We evaluate them on synthetic and real-world datasets. They significantly reduce the width of prediction intervals relative to standard MCV methods, while maintaining the target guarantees.

#5 Improving the Accuracy of Amortized Model Comparison with Self-Consistency

Authors: Šimon Kucharský, Aayush Mishra, Daniel Habermann, Stefan T. Radev, Paul-Christian Bürkner

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.14308

Abstract:
Amortized Bayesian inference (ABI) offers fast, scalable approximations to posterior densities by training neural surrogates on data simulated from the statistical model. However, ABI methods are highly sensitive to model misspecification: when observed data fall outside the training distribution (generative scope of the statistical models), neural surrogates can behave unpredictably. This poses a challenge in model comparison settings, where multiple statistical models are considered, of which at least some are misspecified. Recent work on self-consistency (SC) provides a promising remedy to this issue, accessible even for empirical data (without ground-truth labels). In this work, we investigate how SC can improve amortized model comparison conceptualized in four different ways. Across two synthetic and two real-world case studies, we find that approaches for model comparison that estimate marginal likelihoods through approximate parameter posteriors consistently outperform methods that directly approximate model evidence or posterior model probabilities. SC training improves robustness when the likelihood is available, even under severe model misspecification. The benefits of SC for methods without access to analytic likelihoods are more limited and inconsistent. Our results suggest practical guidance for reliable amortized Bayesian model comparison: prefer parameter posterior-based methods and augment them with SC training on empirical datasets to mitigate extrapolation bias under model misspecification.

#6 Continual Learning at the Edge: An Agnostic IIoT Architecture

Authors: Pablo García-Santaclara, Bruno Fernández-Castro, Rebeca P. Díaz-Redondo, Carlos Calvo-Moa, Henar Mariño-Bodelón

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.14311

Abstract:
The exponential growth of Internet-connected devices has presented challenges to traditional centralized computing systems due to latency and bandwidth limitations. Edge computing has evolved to address these difficulties by bringing computations closer to the data source. Additionally, traditional machine learning algorithms are not suitable for edge-computing systems, where data usually arrives in a dynamic and continual way. However, incremental learning offers a good solution for these settings. We introduce a new approach that applies the incremental learning philosophy within an edge-computing scenario for the industrial sector with a specific purpose: real-time quality control in a manufacturing system. By applying continual learning, we reduce the impact of catastrophic forgetting and provide an efficient and effective solution.

#7 From STLS to Projection-based Dictionary Selection in Sparse Regression for System Identification

Authors: Hangjun Cho, Fabio V. G. Amaral, Andrei A. Klishin, Cassio M. Oishi, Steven L. Brunton

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.14404

Abstract:
In this work, we revisit dictionary-based sparse regression, in particular, Sequential Threshold Least Squares (STLS), and propose a score-guided library selection to provide practical guidance for data-driven modeling, with emphasis on SINDy-type algorithms. STLS is an algorithm to solve the $\ell_0$ sparse least-squares problem, which relies on splitting to efficiently solve the least-squares portion while handling the sparse term via proximal methods. It produces coefficient vectors whose components depend on both the projected reconstruction errors, here referred to as the scores, and the mutual coherence of dictionary terms. The first contribution of this work is a theoretical analysis of the score and dictionary-selection strategy. This can be understood in both the original and weak SINDy regimes. Second, numerical experiments on ordinary and partial differential equations highlight the effectiveness of score-based screening, improving both accuracy and interpretability in dynamical system identification. These results suggest that integrating score-guided methods to refine the dictionary more accurately may, in some cases, help SINDy users enhance robustness in data-driven discovery of governing equations.
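The STLS iteration mentioned in the abstract alternates a least-squares fit with hard thresholding of small coefficients. A minimal sketch of that standard loop follows; it does not include the paper's score-guided library selection. The threshold of 0.1, the iteration count, the polynomial library, and the toy dynamics are assumptions made for the example.

```python
import numpy as np

def stls(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequential thresholded least squares for a single state variable."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0  # hard thresholding handles the sparsity term
        active = ~small
        if not active.any():
            break
        # Refit by least squares on the surviving dictionary columns only.
        xi[active] = np.linalg.lstsq(theta[:, active], dxdt, rcond=None)[0]
    return xi

# Toy data (hypothetical): dx/dt = -2 x + 0.5 x^3 with small measurement noise.
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 200)
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])  # library [1, x, x^2, x^3]
dxdt = -2.0 * x + 0.5 * x**3 + 0.01 * rng.normal(size=x.size)
xi = stls(theta, dxdt)
```

In this toy case the constant and quadratic terms are pruned and the remaining coefficients land close to -2 and 0.5; with a poorly chosen library or threshold the result can differ, which is the situation dictionary screening is meant to help with.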

#8 LLmFPCA-detect: LLM-powered Multivariate Functional PCA for Anomaly Detection in Sparse Longitudinal Texts

Authors: Prasanjit Dubey, Aritra Guha, Zhengyi Zhou, Qiong Wu, Xiaoming Huo, Paromita Dubey

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.14604

Abstract:
Sparse longitudinal (SL) textual data arises when individuals generate text repeatedly over time (e.g., customer reviews, occasional social media posts, electronic medical records across visits), but the frequency and timing of observations vary across individuals. These complex textual data sets have immense potential to inform future policy and targeted recommendations. However, because SL text data lack dedicated methods and are noisy, heterogeneous, and prone to anomalies, detecting and inferring key patterns is challenging. We introduce LLmFPCA-detect, a flexible framework that pairs LLM-based text embeddings with functional data analysis to detect clusters and infer anomalies in large SL text datasets. First, LLmFPCA-detect embeds each piece of text into an application-specific numeric space using LLM prompts. Sparse multivariate functional principal component analysis (mFPCA) conducted in the numeric space forms the workhorse to recover primary population characteristics, and produces subject-level scores which, together with baseline static covariates, facilitate data segmentation, unsupervised anomaly detection and inference, and enable other downstream tasks. In particular, we leverage LLMs to perform dynamic keyword profiling guided by the data segments and anomalies discovered by LLmFPCA-detect, and we show that cluster-specific functional PC scores from LLmFPCA-detect, used as features in existing pipelines, help boost prediction performance. We support the stability of LLmFPCA-detect with experiments and evaluate it on two different applications using public datasets, Amazon customer-review trajectories, and Wikipedia talk-page comment streams, demonstrating utility across domains and outperforming state-of-the-art baselines.

#9 Modular connectivity in neural networks emerges from Poisson noise-motivated regularisation, and promotes robustness and compositional generalisation

Authors: Daoyuan Qian, Qiyao Liang, Ila Fiete

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.13707

Abstract:
Circuits in the brain commonly exhibit modular architectures that factorise complex tasks, resulting in the ability to compositionally generalise and reduce catastrophic forgetting. In contrast, artificial neural networks (ANNs) appear to mix all processing, because modular solutions are difficult to find as they are vanishing subspaces in the space of possible solutions. Here, we draw inspiration from fault-tolerant computation and the Poisson-like firing of real neurons to show that activity-dependent neural noise, combined with nonlinear neural responses, drives the emergence of solutions that reflect an accurate understanding of modular tasks, corresponding to acquisition of a correct world model. We find that noise-driven modularisation can be recapitulated by a deterministic regulariser that multiplicatively combines weights and activations, revealing rich phenomenology not captured in linear networks or by standard regularisation methods. Though the emergence of modular structure requires sufficiently many training samples (exponential in the number of modular task dimensions), we show that pre-modularised ANNs exhibit superior noise-robustness and the ability to generalise and extrapolate well beyond training data, compared to ANNs without such inductive biases. Together, our work demonstrates a regulariser and architectures that could encourage modularity emergence to yield functional benefits.

#10 Time-aware UNet and super-resolution deep residual networks for spatial downscaling

Authors: Mika Sipilä, Sabrina Maggio, Sandra De Iaco, Klaus Nordhausen, Monica Palma, Sara Taskinen

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.13753

Abstract:
Satellite data of atmospheric pollutants are often available only at coarse spatial resolution, limiting their applicability in local-scale environmental analysis and decision-making. Spatial downscaling methods aim to transform the coarse satellite data into high-resolution fields. In this work, two widely used deep learning architectures, the super-resolution deep residual network (SRDRN) and the encoder-decoder-based UNet, are considered for spatial downscaling of tropospheric ozone. Both methods are extended with a lightweight temporal module, which encodes observation time using either sinusoidal or radial basis function (RBF) encoding, and fuses the temporal features with the spatial representations in the networks. The proposed time-aware extensions are evaluated against their baseline counterparts in a case study on ozone downscaling over Italy. The results suggest that, while only slightly increasing computational complexity, the temporal modules significantly improve downscaling performance and convergence speed.
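The two time encodings named in the abstract can be sketched in a few lines. The abstract does not give the exact formulation, so everything here is an assumption for illustration: the annual period of 365.25 days, the number of harmonics, and the RBF centres and width.

```python
import math

def sinusoidal_encoding(t, period=365.25, n_harmonics=3):
    """Encode time t as (sin, cos) pairs at integer multiples of the base frequency."""
    features = []
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * k * t / period
        features.extend([math.sin(angle), math.cos(angle)])
    return features

def rbf_encoding(t, centers, width):
    """Encode time t as Gaussian bumps centred at fixed reference times."""
    return [math.exp(-((t - c) ** 2) / (2.0 * width ** 2)) for c in centers]

day = 45.0  # day of year (hypothetical observation time)
sin_features = sinusoidal_encoding(day)
rbf_features = rbf_encoding(day, centers=[0.0, 91.0, 182.0, 273.0], width=30.0)
```

The sinusoidal features are periodic by construction, whereas this plain RBF version is not; a network would then fuse either feature vector with its spatial representation.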

#11 Dropout Neural Network Training Viewed from a Percolation Perspective

Authors: Finley Devlin, Jaron Sanders

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.13853

Abstract:
In this work, we investigate the existence and effect of percolation in training deep Neural Networks (NNs) with dropout. Dropout methods are regularisation techniques for training NNs, first introduced by G. Hinton et al. (2012). These methods temporarily remove connections in the NN, randomly at each stage of training, and update the remaining subnetwork with Stochastic Gradient Descent (SGD). The process of removing connections from a network at random is similar to percolation, a paradigm model of statistical physics. If dropout were to remove enough connections such that there is no path between the input and output of the NN, then the NN could not make predictions informed by the data. We study new percolation models that mimic dropout in NNs and characterise the relationship between network topology and this path problem. The theory shows the existence of a percolative effect in dropout. We also show that this percolative effect can cause a breakdown when training NNs without biases with dropout; and we argue heuristically that this breakdown extends to NNs with biases.

#12 Understanding the Gain from Data Filtering in Multimodal Contrastive Learning

Authors: Divyansh Pareek, Sewoong Oh, Simon S. Du

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.14230

Abstract:
The success of modern multimodal representation learning relies on internet-scale datasets. Due to the low quality of a large fraction of raw web data, data curation has become a critical step in the training pipeline. Filtering using a trained model (i.e., teacher-based filtering) has emerged as a successful solution, leveraging a pre-trained model to compute quality scores. To explain the empirical success of teacher-based filtering, we characterize the performance of filtered contrastive learning under the standard bimodal data generation model. Denoting $\eta\in(0,1]$ as the fraction of data with correctly matched modalities among $n$ paired samples, we utilize a linear contrastive learning setup to show a provable benefit of data filtering: $(i)$ the error without filtering is upper and lower bounded by $\frac{1}{\eta \sqrt{n}}$, and $(ii)$ the error with teacher-based filtering is upper bounded by $\frac{1}{\sqrt{\eta n}}$ in the large $\eta$ regime, and by $\frac{1}{\sqrt{n}}$ in the small $\eta$ regime.

#13 Randomized multi-class classification under system constraints: a unified approach via post-processing

Authors: Evgenii Chzhen (LMO, CELESTE), Mohamed Hebiri (LAMA), Gayane Taturyan (LAMA, IMT)

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.14246

Abstract:
We study the problem of multi-class classification under system-level constraints expressible as linear functionals over randomized classifiers. We propose a post-processing approach that adjusts a given base classifier to satisfy general constraints without retraining. Our method formulates the problem as a linearly constrained stochastic program over randomized classifiers, and leverages entropic regularization and dual optimization techniques to construct a feasible solution. We provide finite-sample guarantees for the risk and constraint satisfaction for the final output of our algorithm under minimal assumptions. The framework accommodates a broad class of constraints, including fairness, abstention, and churn requirements.

#14 A variational Bayes latent class approach for EHR-based patient phenotyping in R

Authors: Brian Buckley, Adrian O'Hagan, Marie Galligan

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.14272

Abstract:
The VBphenoR package for R provides a closed-form variational Bayes approach to patient phenotyping using Electronic Health Records (EHR) data. We implement a variational Bayes Gaussian Mixture Model (GMM) algorithm using closed-form coordinate ascent variational inference (CAVI) to determine the patient phenotype latent class. We then implement a variational Bayes logistic regression, where we determine the probability of the phenotype in the supplied EHR cohort, the shift in biomarkers for patients with the phenotype of interest versus a healthy population and evaluate predictive performance of binary indicator clinical codes and medication codes. The logistic model likelihood applies the latent class from the GMM step to inform the conditional.

#15 Trunc-Opt vine building algorithms

Authors: Dániel Pfeifer, Edith Alice Kovács

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.14399

Abstract:
Vine copula models have become highly popular and practical tools for modelling multivariate probability distributions due to their flexibility in modelling different kinds of dependences between the random variables involved. However, their flexibility comes with the drawback of a high-dimensional parameter space. To tackle this problem, truncated vine copulas were introduced by Kurowicka (2010) (Gaussian case) and Brechmann and Czado (2013) (general case). Truncated vine copulas contain conditionally independent pair copulas after the truncation level. So far, in the general case, truncated vine constructing algorithms started from the lowest tree in order to encode the largest dependences in the lower trees. The novelty of this paper starts from the observation that a truncated vine is determined by the first tree after the truncation level (see Kovács and Szántai (2017)). This paper introduces a new score for fitting truncated vines to given data, called the Weight of the truncated vine. Then we propose a completely new methodology for constructing truncated vines. We prove theorems which motivate this new approach. While earlier algorithms did not use conditional independences, we give algorithms for constructing and encoding truncated vines which do exploit them. Finally, we illustrate the algorithms on real datasets and compare the results with well-known methods included in R packages. Our method generally compares favorably to previously known methods.

#16 Learning the score under shape constraints

Authors: Rebecca M. Lewis, Oliver Y. Feng, Henry W. J. Reeve, Min Xu, Richard J. Samworth

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.14624

Abstract:
Score estimation has recently emerged as a key modern statistical challenge, due to its pivotal role in generative modelling via diffusion models. Moreover, it is an essential ingredient in a new approach to linear regression via convex $M$-estimation, where the corresponding error densities are projected onto the log-concave class. Motivated by these applications, we study the minimax risk of score estimation with respect to squared $L^2(P_0)$-loss, where $P_0$ denotes an underlying log-concave distribution on $\mathbb{R}$. Such distributions have decreasing score functions, but on its own, this shape constraint is insufficient to guarantee a finite minimax risk. We therefore define subclasses of log-concave densities that capture two fundamental aspects of the estimation problem. First, we establish the crucial impact of tail behaviour on score estimation by determining the minimax rate over a class of log-concave densities whose score function exhibits controlled growth relative to the quantile levels. Second, we explore the interplay between smoothness and log-concavity by considering the class of log-concave densities with a scale restriction and a $(\beta,L)$-Hölder assumption on the log-density for some $\beta \in [1,2]$. We show that the minimax risk over this latter class is of order $L^{2/(2\beta+1)}n^{-\beta/(2\beta+1)}$ up to poly-logarithmic factors, where $n$ denotes the sample size. When $\beta < 2$, this rate is faster than could be obtained under either the shape constraint or the smoothness assumption alone. Our upper bounds are attained by a locally adaptive, multiscale estimator constructed from a uniform confidence band for the score function. This study highlights intriguing differences between the score estimation and density estimation problems over this shape-constrained class.

#17 Bias-Variance Trade-off for Clipped Stochastic First-Order Methods: From Bounded Variance to Infinite Mean

Authors: Chuan He

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.14686

Abstract:
Stochastic optimization is fundamental to modern machine learning. Recent research has extended the study of stochastic first-order methods (SFOMs) from light-tailed to heavy-tailed noise, which frequently arises in practice, with clipping emerging as a key technique for controlling heavy-tailed gradients. Extensive theoretical advances have further shown that the oracle complexity of SFOMs depends on the tail index $\alpha$ of the noise. Nonetheless, existing complexity results often cover only the case $\alpha \in (1,2]$, that is, the regime where the noise has a finite mean, while the complexity bounds tend to infinity as $\alpha$ approaches $1$. This paper tackles the general case of noise with tail index $\alpha\in(0,2]$, covering regimes ranging from noise with bounded variance to noise with an infinite mean, where the latter case has been scarcely studied. Through a novel analysis of the bias-variance trade-off in gradient clipping, we show that when a symmetry measure of the noise tail is controlled, clipped SFOMs achieve improved complexity guarantees in the presence of heavy-tailed noise for any tail index $\alpha \in (0,2]$. Our analysis of the bias-variance trade-off not only yields new unified complexity guarantees for clipped SFOMs across this full range of tail indices, but is also straightforward to apply and can be combined with classical analyses under light-tailed noise to establish oracle complexity guarantees under heavy-tailed noise. Finally, numerical experiments validate our theoretical findings.
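The bias-variance trade-off of clipping can be seen in a small numerical experiment. This is only an illustration of the clipping operator on scalar gradients, not the paper's analysis or algorithm; the true gradient of 1.0, the standard Cauchy noise (tail index 1, so the mean does not exist), and the two clipping thresholds are assumptions chosen for the example.

```python
import math
import random
import statistics

def clip(g, tau):
    """Scalar gradient clipping: rescale g so that |g| <= tau."""
    return g * min(1.0, tau / abs(g)) if g != 0 else 0.0

random.seed(0)
true_grad = 1.0
# Noisy gradients: true gradient plus standard Cauchy noise via the inverse CDF.
samples = [true_grad + math.tan(math.pi * (random.random() - 0.5)) for _ in range(20000)]

def summarize(tau):
    """Return (empirical bias, empirical variance) of the clipped gradients."""
    clipped = [clip(g, tau) for g in samples]
    return statistics.fmean(clipped) - true_grad, statistics.pvariance(clipped)

bias_small, var_small = summarize(2.0)    # aggressive clipping
bias_large, var_large = summarize(200.0)  # mild clipping
```

A small threshold keeps the variance bounded by the square of the threshold but pulls the average away from the true gradient; a large threshold reduces that bias at the cost of much higher variance.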

#18 General Formulation and PCL-Analysis for Restless Bandits with Limited Observability

Authors: Keqin Liu, Qizhen Jia

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2307.03034

Abstract:
In this paper, we consider a general observation model for restless multi-armed bandit problems. The operation of the player is based on the past observation history that is limited (partial) and error-prone due to resource constraints or environmental or intrinsic noise. By establishing a general probabilistic model for dynamics of the observation process, we formulate the problem as a restless bandit with an infinite high-dimensional belief state space. We apply the achievable region method with partial conservation law (PCL) to the infinite-state problem and analyze its indexability and priority index (Whittle index). Finally, we propose an approximation process to transform the problem into one to which the AG algorithm of Niño-Mora (2001) for finite-state problems can be applied. Numerical experiments show that our algorithm has excellent performance.

#19 Near-Optimal Algorithms for Omniprediction

Authors: Princewill Okoroafor, Robert Kleinberg, Michael P. Kim

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2501.17205

Abstract:
Omnipredictors are simple prediction functions that encode loss-minimizing predictions with respect to a hypothesis class $H$, simultaneously for every loss function within a class of losses $L$. In this work, we give near-optimal learning algorithms for omniprediction, in both the online and offline settings. To begin, we give an oracle-efficient online learning algorithm that achieves $(L,H)$-omniprediction with $\tilde O (\sqrt{T \log |H|})$ regret for any class of Lipschitz loss functions $L \subseteq L_\mathrm{Lip}$. Quite surprisingly, this regret bound matches the optimal regret for \emph{minimization of a single loss function} (up to a $\sqrt{\log(T)}$ factor). Given this online algorithm, we develop an online-to-offline conversion that achieves near-optimal complexity across a number of measures. In particular, for all bounded loss functions within the class of Bounded Variation losses $L_\mathrm{BV}$ (which include all convex, all Lipschitz, and all proper losses) and any (possibly-infinite) $H$, we obtain an offline learning algorithm that, leveraging an (offline) ERM oracle and $m$ samples from $D$, returns an efficient $(L_{\mathrm{BV}},H,\varepsilon(m))$-omnipredictor for $\varepsilon(m)$ scaling near-linearly in the Rademacher complexity of a class derived from $H$ by taking convex combinations of a fixed number of elements of $\mathrm{Th} \circ H$.

#20 Misspecification-robust amortised simulation-based inference using variational methods

Authors: Matthew O'Callaghan, Kaisey S. Mandel, Gerry Gilmore

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2509.05724

Abstract:
Recent advances in neural density estimation have enabled powerful simulation-based inference (SBI) methods that can flexibly approximate Bayesian inference for intractable stochastic models. Although these methods have demonstrated reliable posterior estimation when the simulator accurately represents the underlying data generative process (DGP), recent work has shown that they perform poorly in the presence of model misspecification. This poses a significant issue for their use in real-world problems, due to simulators always misrepresenting the true DGP to a certain degree. In this paper, we introduce robust variational neural posterior estimation (RVNP), a method which addresses the problem of misspecification in amortised SBI by bridging the simulation-to-reality gap using variational inference and error modelling. We test RVNP on multiple benchmark tasks, including using real data from astronomy, and show that it can recover robust posterior inference in a data-driven manner without adopting hyperparameters or priors governing the misspecification influence.

#21 Defining latent spaces by example: optimisation over the outputs of generative models

Authors: Samuel Willis, Alexandru I. Stere, Dragos D. Margineantu, Henry T. Oldroyd, John A. Fozard, Carl Henrik Ek, Henry Moss, Erik Bodin

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2509.23800

Abstract:
Modern generative AI models like diffusion and flow matching can sample from rich data distributions, but many downstream tasks - such as experimental design or creative content generation - require a higher level of control than unconstrained sampling. Here, the challenge is to efficiently identify outputs that are both probable under the model and satisfy task-specific constraints. Often, the evaluation of samples is expensive and lacks gradients - a setting known as black-box optimisation. In this work, we allow black-box optimisation on top of diffusion and flow matching models for the first time by introducing surrogate latent spaces: non-parametric, low-dimensional Euclidean embeddings that can be extracted from any generative model without additional training. The axes can be defined via examples, providing a simple and interpretable approach to define custom latent spaces that express intended features and are convenient to use in downstream tasks. Our proposed representation is Euclidean and has controllable dimensionality, permitting direct application of standard optimisation algorithms. We demonstrate that our approach is architecture-agnostic, incurs almost no additional computational cost over standard generation, and generalises across modalities, including images, audio, videos, and structured objects like proteins.

#22 Curiosity-Driven Development of Action and Language in Robots Through Self-Exploration

Authors: Theodore Jerome Tinker, Kenji Doya, Jun Tani

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2510.05013

Abstract:
Infants acquire language with generalization from minimal experience, whereas large language models require billions of training tokens. What underlies efficient development in humans? We investigated this problem through experiments wherein robotic agents learn to perform actions associated with imperative sentences (e.g., push red cube) via curiosity-driven self-exploration. Our approach integrates active inference with reinforcement learning, enabling intrinsically motivated developmental learning. The simulations reveal key findings corresponding to observations in developmental psychology. i) Generalization improves drastically as the scale of compositional elements increases. ii) Curiosity improves learning through self-exploration. iii) Rote pairing of sentences and actions precedes compositional generalization. iv) Simpler actions develop before complex actions depending on them. v) Exception-handling induces U-shaped developmental performance, a pattern like representational redescription in child language learning. These results suggest that curiosity-driven active inference accounts for how intrinsically motivated sensorimotor-linguistic learning supports scalable compositional generalization and exception handling in humans and artificial agents.

#23 Theoretical Guarantees of Learning Ensembling Strategies with Applications to Time Series Forecasting

Authors: Hilaf Hasson, Danielle C. Maddix, Yuyang Wang, Gaurav Gupta, Youngsuk Park

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2305.15786

Abstract:
Ensembling is among the most popular tools in machine learning (ML) due to its effectiveness in minimizing variance and thus improving generalization. Most ensembling methods for black-box base learners fall under the umbrella of "stacked generalization," namely training an ML algorithm that takes the inferences from the base learners as input. While stacking has been widely applied in practice, its theoretical properties are poorly understood. In this paper, we prove a novel result, showing that choosing the best stacked generalization from a (finite or finite-dimensional) family of stacked generalizations based on cross-validated performance does not perform "much worse" than the oracle best. Our result strengthens and significantly extends the results in Van der Laan et al. (2007). Inspired by the theoretical analysis, we further propose a particular family of stacked generalizations in the context of probabilistic forecasting, each one with a different sensitivity for how much the ensemble weights are allowed to vary across items, timestamps in the forecast horizon, and quantiles. Experimental results demonstrate the performance gain of the proposed method.
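
The selection step described above (pick one stacker from a finite family by cross-validated performance) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' method: the family here is a hypothetical one, two base learners combined with a single weight that is ridge-shrunk toward 0.5 with strength `lam`, and the data are synthetic. The names `fit_weight`, `cv_error`, and `select_stacker` are invented for this sketch.

```python
import random

def fit_weight(a, b, y, lam):
    """Weight w on learner a (1 - w on b), shrunk toward 0.5 with strength lam.

    Closed form of: minimise sum (y - (w*a + (1-w)*b))**2 + lam * (w - 0.5)**2.
    """
    num = sum((ai - bi) * (yi - bi) for ai, bi, yi in zip(a, b, y)) + 0.5 * lam
    den = sum((ai - bi) ** 2 for ai, bi in zip(a, b)) + lam
    return num / den if den > 0 else 0.5

def cv_error(a, b, y, lam, k=5):
    """Mean squared error of the lam-stacker under k-fold cross-validation."""
    n = len(y)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        w = fit_weight([a[i] for i in train], [b[i] for i in train],
                       [y[i] for i in train], lam)
        total += sum((y[i] - (w * a[i] + (1 - w) * b[i])) ** 2 for i in test)
    return total / n

def select_stacker(a, b, y, lams):
    """Return the lam with the lowest cross-validated error, plus all errors."""
    errors = {lam: cv_error(a, b, y, lam) for lam in lams}
    return min(errors, key=errors.get), errors

# Synthetic data (assumption): learner a is less noisy than learner b.
rng = random.Random(0)
truth = [rng.gauss(0, 1) for _ in range(200)]
pred_a = [t + rng.gauss(0, 0.3) for t in truth]
pred_b = [t + rng.gauss(0, 0.9) for t in truth]

best_lam, errors = select_stacker(pred_a, pred_b, truth, [0.0, 1.0, 10.0, 100.0, 1000.0])
final_w = fit_weight(pred_a, pred_b, truth, best_lam)
print(best_lam, round(final_w, 3))
```

The shrinkage strength plays the role of a "sensitivity" knob only loosely; the paper's family varies weights across items, timestamps, and quantiles, which this sketch does not attempt.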

#24 Calculating the maximum number of maximum cliques for simple graphs

Authors: Dániel Pfeifer

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2307.14120

Abstract:
A simple graph on $n$ vertices may contain many maximum cliques. But how many can it potentially contain? We will define prime and composite graphs, and we will show that if $n \ge 15$, then the graphs with the maximum number of maximum cliques have to be composite. Moreover, we will show an edge bound from which we will prove that if any factor of a composite graph has $\omega(G_i) \ge 5$, then it cannot have the maximum number of maximum cliques. Using this we will show that the graph that contains $3^{\lfloor n/3 \rfloor}c$ maximum cliques has the largest number of maximum cliques on $n$ vertices, where $c\in\{1,\frac{4}{3},2\}$, depending on $n \bmod 3$.
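
The closing formula can be evaluated with exact integer arithmetic. The sketch below is illustrative only: the function name `max_clique_count_bound` is hypothetical, and the assignment of $c$ to residues ($1$ for $n \equiv 0$, $\frac{4}{3}$ for $n \equiv 1$, $2$ for $n \equiv 2 \pmod 3$) is an assumption that the abstract does not spell out; it is chosen to match the classical Moon-Moser count of maximal cliques.

```python
def max_clique_count_bound(n: int) -> int:
    """Evaluate 3**floor(n/3) * c with c in {1, 4/3, 2} chosen by n mod 3.

    The residue-to-c mapping is an assumption (see the text above).
    """
    if n < 2:
        raise ValueError("this sketch assumes n >= 2")
    q, r = divmod(n, 3)
    if r == 0:
        return 3 ** q            # c = 1
    if r == 1:
        return 4 * 3 ** (q - 1)  # c = 4/3, kept as an exact integer
    return 2 * 3 ** q            # c = 2

for n in (15, 16, 17):
    print(n, max_clique_count_bound(n))
```

Writing the $n \equiv 1$ case as $4 \cdot 3^{\lfloor n/3 \rfloor - 1}$ avoids the fraction $\frac{4}{3}$ and keeps the result an integer.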

#25 QCircuitBench: A Large-Scale Dataset for Benchmarking Quantum Algorithm Design

Authors: Rui Yang, Ziruo Wang, Yuntian Gu, Tianyi Chen, Yitao Liang, Tongyang Li

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2410.07961

Abstract:
Quantum computing is an emerging field recognized for the significant speedup it offers over classical computing through quantum algorithms. However, designing and implementing quantum algorithms pose challenges due to the complex nature of quantum mechanics and the necessity for precise control over quantum states. Despite the significant advancements in AI, there has been a lack of datasets specifically tailored for this purpose. In this work, we introduce QCircuitBench, the first benchmark dataset designed to evaluate AI's capability in designing and implementing quantum algorithms using quantum programming languages. Compared with using AI to write conventional code, this task is fundamentally more complicated because of its highly flexible design space. Our key contributions include: 1. A general framework which formulates the key features of quantum algorithm design for Large Language Models. 2. Implementations for quantum algorithms from basic primitives to advanced applications, spanning 3 task suites, 25 algorithms, and 120,290 data points. 3. Automatic validation and verification functions, allowing for iterative evaluation and interactive reasoning without human inspection. 4. Promising potential as a training dataset through preliminary fine-tuning results. We observed several interesting experimental phenomena: LLMs tend to exhibit consistent error patterns, and fine-tuning does not always outperform few-shot learning. In all, QCircuitBench is a comprehensive benchmark for LLM-driven quantum algorithm design, and it reveals limitations of LLMs in this domain.

#26 A Lipschitz spaces view of infinitely wide shallow neural networks

Authors: Francesca Bartolucci, Marcello Carioni, José A. Iglesias, Yury Korolev, Emanuele Naldi, Stefano Vigogna

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2410.14591

Abstract:
We revisit the mean field parametrization of shallow neural networks, using signed measures on unbounded parameter spaces and duality pairings that take into account the regularity and growth of activation functions. This setting directly leads to the use of unbalanced Kantorovich-Rubinstein norms defined by duality with Lipschitz functions, and of spaces of measures dual to those of continuous functions with controlled growth. These make transparent the need for total variation and moment bounds or penalization to obtain existence of minimizers of variational formulations, under which we prove a compactness result in strong Kantorovich-Rubinstein norm, and in the absence of which we show several examples demonstrating undesirable behavior. Further, the Kantorovich-Rubinstein setting enables us to combine the advantages of a completely linear parametrization and ensuing reproducing kernel Banach space framework with optimal transport insights. We showcase this synergy with representer theorems and uniform large data limits for empirical risk minimization, and in proposed formulations for distillation and fusion applications.

#27 Semiparametric inference for impulse response functions using double/debiased machine learning

Authors: Daniele Ballinari, Alexander Wehrli

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2411.10009

Abstract:
We introduce a double/debiased machine learning estimator for the impulse response function in settings where a time series of interest is subjected to multiple discrete treatments, assigned over time, which can have a causal effect on future outcomes. The proposed estimator can rely on fully nonparametric relations between treatment and outcome variables, opening up the possibility to use flexible machine learning approaches to estimate impulse response functions. To this end, we extend the theory of double machine learning from an i.i.d. to a time series setting and show that the proposed estimator is consistent and asymptotically normally distributed at the parametric rate, allowing for semiparametric inference for dynamic effects in a time series setting. The properties of the estimator are validated numerically in finite samples by applying it to learn the impulse response function in the presence of serial dependence in both the confounder and observation innovation processes. We also illustrate the methodology empirically by applying it to the estimation of the effects of macroeconomic shocks.

#28 Universal Approximation with Softmax Attention

Authors: Jerry Yao-Chieh Hu, Hude Liu, Hong-Yu Chen, Weimin Wu, Han Liu

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2504.15956

Abstract:
We prove that with linear transformations, both (i) two-layer self-attention and (ii) one-layer self-attention followed by a softmax function are universal approximators for continuous sequence-to-sequence functions on compact domains. Our main technique is a new interpolation-based method for analyzing attention's internal mechanism. This leads to our key insight: self-attention is able to approximate a generalized version of ReLU to arbitrary precision, and hence subsumes many known universal approximators. Building on these, we show that two-layer multi-head attention alone suffices as a sequence-to-sequence universal approximator. In contrast, prior works rely on feed-forward networks to establish universal approximation in Transformers. Furthermore, we extend our techniques to show that (softmax-)attention-only layers are capable of approximating various statistical models in-context. We believe these techniques hold independent interest.

#29 Minimax Rates of Estimation for Optimal Transport Map between Infinite-Dimensional Spaces

Authors: Donlapark Ponnoprat, Masaaki Imaizumi

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2505.13570

Abstract:
We investigate the estimation of an optimal transport map between probability measures on an infinite-dimensional space and reveal its minimax optimal rate. Optimal transport theory defines distances within a space of probability measures, utilizing an optimal transport map as its key component. Estimating the optimal transport map from samples finds several applications, such as simulating dynamics between probability measures and functional data analysis. However, some transport maps on infinite-dimensional spaces require exponential-order data for estimation, which undermines their applicability. In this paper, we investigate the estimation of an optimal transport map between infinite-dimensional spaces, focusing on optimal transport maps characterized by the notion of $\gamma$-smoothness. Consequently, we show that the minimax risk is of polynomial order in the sample size even in the infinite-dimensional setup. We also develop an estimator whose estimation error matches the minimax optimal rate. With these results, we obtain a class of reasonably estimable optimal transport maps on infinite-dimensional spaces and a method for their estimation. Our experiments validate the theory and practical utility of our approach with application to functional data analysis.

#30 A Computable Measure of Suboptimality for Entropy-Regularised Variational Objectives

Authors: Clémentine Chazal, Heishiro Kanagawa, Zheyang Shen, Anna Korba, Chris. J. Oates

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2509.10393

Abstract:
Several emerging post-Bayesian methods target a probability distribution for which an entropy-regularised variational objective is minimised. This increased flexibility introduces a computational challenge, as one loses access to an explicit unnormalised density for the target. To mitigate this difficulty, we introduce a novel measure of suboptimality called 'gradient discrepancy', and in particular a 'kernel' gradient discrepancy (KGD) that can be explicitly computed. In the standard Bayesian context, KGD coincides with the kernel Stein discrepancy (KSD), and we obtain a novel characterisation of KSD as measuring the size of a variational gradient. Outside this familiar setting, KGD enables novel sampling algorithms to be developed and compared, even when unnormalised densities cannot be obtained. To illustrate this point several novel algorithms are proposed and studied, including a natural generalisation of Stein variational gradient descent, with applications to mean-field neural networks and predictively oriented posteriors presented. On the theoretical side, our principal contribution is to establish sufficient conditions for desirable properties of KGD, such as continuity and convergence control.

#31 TempoPFN: Synthetic Pre-training of Linear RNNs for Zero-shot Time Series Forecasting

Authors: Vladyslav Moroshan, Julien Siems, Arber Zela, Timur Carstensen, Frank Hutter

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2510.25502

Abstract:
Foundation models for zero-shot time series forecasting face challenges in efficient long-horizon prediction and reproducibility, with existing synthetic-only approaches underperforming on challenging benchmarks. This paper presents TempoPFN, a univariate time series foundation model based on linear Recurrent Neural Networks (RNNs) pre-trained exclusively on synthetic data. The model uses a GatedDeltaProduct architecture with state-weaving for fully parallelizable training across sequence lengths, eliminating the need for windowing or summarization techniques while maintaining robust temporal state-tracking. Our comprehensive synthetic data pipeline unifies diverse generators, including stochastic differential equations, Gaussian processes, and audio synthesis, with novel augmentations. In zero-shot evaluations on the Gift-Eval, fev-bench and Chronos-ZS benchmarks, TempoPFN achieves top-tier competitive performance, outperforming all existing synthetic-only approaches and surpassing the majority of models trained on real-world data, while being more efficient than existing baselines by leveraging fully parallelizable training and inference. We open-source our complete data generation pipeline and training code, providing a reproducible foundation for future research.

#32 Doubly Wild Refitting: Model-Free Evaluation of High Dimensional Black-Box Predictions under Convex Losses

Authors: Haichen Hu, David Simchi-Levi

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2511.18789

Abstract:
We study the problem of excess risk evaluation for empirical risk minimization (ERM) under general convex loss functions. Our contribution is an efficient refitting procedure that computes the excess risk and provides high-probability upper bounds under the fixed-design setting. Assuming only black-box access to the training algorithm and a single dataset, we begin by generating two sets of artificially modified pseudo-outcomes termed wild responses, created by stochastically perturbing the gradient vectors with carefully chosen scaling. Using these two pseudo-labeled datasets, we then refit the black-box procedure twice to obtain two corresponding wild predictors. Finally, leveraging the original predictor, the two wild predictors, and the constructed wild responses, we derive an efficient excess risk upper bound. A key feature of our analysis is that it requires no prior knowledge of the complexity of the underlying function class. As a result, the method is essentially model-free and holds significant promise for theoretically evaluating modern opaque machine learning systems -- such as deep neural networks and generative models -- where traditional capacity-based learning theory becomes infeasible due to the extreme complexity of the hypothesis class.

#33 GraphBench: Next-generation graph learning benchmarking

Authors: Timo Stoll, Chendi Qian, Ben Finkelshtein, Ali Parviz, Darius Weber, Fabrizio Frasca, Hadar Shavit, Antoine Siraudin, Arman Mielke, Marie Anastacio, Erik Müller, Maya Bechler-Speicher, Michael Bronstein, Mikhail Galkin, Holger Hoos, Mathias Niepert, Bryan Perozzi, Jan Tönshoff, Christopher Morris

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.04475

Abstract:
Machine learning on graphs has recently achieved impressive progress in various domains, including molecular property prediction and chip design. However, benchmarking practices remain fragmented, often relying on narrow, task-specific datasets and inconsistent evaluation protocols, which hampers reproducibility and broader progress. To address this, we introduce GraphBench, a comprehensive benchmarking suite that spans diverse domains and prediction tasks, including node-level, edge-level, graph-level, and generative settings. GraphBench provides standardized evaluation protocols -- with consistent dataset splits and performance metrics that account for out-of-distribution generalization -- as well as a unified hyperparameter tuning framework. Additionally, we benchmark GraphBench using message-passing neural networks and graph transformer models, providing principled baselines and establishing a reference performance. See www.graphbench.io for further details.

#34 Reliable Statistical Guarantees for Conformal Predictors with Small Datasets

Authors: Miguel Sánchez-Domínguez, Lucas Lacasa, Javier de Vicente, Gonzalo Rubio, Eusebio Valero

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.04566

Abstract:
Surrogate models (including deep neural networks and other machine learning algorithms in supervised learning) are capable of approximating arbitrarily complex, high-dimensional input-output problems in science and engineering, but require a thorough data-agnostic uncertainty quantification analysis before these can be deployed for any safety-critical application. The standard approach for data-agnostic uncertainty quantification is to use conformal prediction (CP), a well-established framework to build uncertainty models with proven statistical guarantees that do not assume any shape for the error distribution of the surrogate model. However, since the classic statistical guarantee offered by CP is given in terms of bounds for the marginal coverage, for small calibration set sizes (which are frequent in realistic surrogate modelling that aims to quantify error at different regions), the potentially strong dispersion of the coverage distribution around its average negatively impacts the relevance of the uncertainty model's statistical guarantee, often obtaining coverages below the expected value, resulting in a less applicable framework. After providing a gentle presentation of uncertainty quantification for surrogate models for machine learning practitioners, in this paper we bridge the gap by proposing a new statistical guarantee that offers probabilistic information for the coverage of a single conformal predictor. We show that the proposed framework converges to the standard solution offered by CP for large calibration set sizes and, unlike the classic guarantee, still offers relevant information about the coverage of a conformal predictor for small data sizes. We validate the methodology in a suite of examples, and implement an open access software solution that can be used alongside common conformal prediction libraries to obtain uncertainty models that fulfil the new guarantee.
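
The dispersion of coverage for small calibration sets can be seen in a short simulation of standard split conformal prediction. This is a sketch of the classic marginal guarantee only, not the new guarantee proposed in the paper. It assumes uniform nonconformity scores on [0, 1], so that the coverage of a fitted predictor equals its threshold; the function names are hypothetical.

```python
import math
import random

def conformal_threshold(scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]

def simulate_coverage(n_cal, alpha, n_trials=2000, seed=0):
    """Coverage of one conformal predictor per simulated calibration set."""
    rng = random.Random(seed)
    coverages = []
    for _ in range(n_trials):
        scores = [rng.random() for _ in range(n_cal)]
        q = conformal_threshold(scores, alpha)
        # Scores are uniform on [0, 1], so P(new score <= q) = min(q, 1).
        coverages.append(min(q, 1.0))
    return coverages

alpha = 0.1
coverages = simulate_coverage(n_cal=20, alpha=alpha)
mean_coverage = sum(coverages) / len(coverages)
frac_below = sum(c < 1 - alpha for c in coverages) / len(coverages)
print(f"mean coverage: {mean_coverage:.3f}")
print(f"fraction of predictors with coverage < {1 - alpha}: {frac_below:.3f}")
```

With 20 calibration points the average coverage stays at or above the nominal level, yet a sizeable share of individual predictors falls below it, which is the gap between marginal and per-predictor guarantees that the abstract refers to.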

#35 Unsupervised Learning of Density Estimates with Topological Optimization

Authors: Sunia Tanweer, Firas A. Khasawneh

Published: Wed, 17 Dec 2025 00:00:00 -0500

Link: https://arxiv.org/abs/2512.08895

Abstract:
Kernel density estimation is a key component of a wide variety of algorithms in machine learning, Bayesian inference, stochastic dynamics and signal processing. However, the unsupervised density estimation technique requires tuning a crucial hyperparameter: the kernel bandwidth. The choice of bandwidth is critical as it controls the bias-variance trade-off by over- or under-smoothing the topological features. Topological data analysis provides methods to mathematically quantify topological characteristics, such as connected components, loops, voids et cetera, even in high dimensions where visualization of density estimates is impossible. In this paper, we propose an unsupervised learning approach using a topology-based loss function for the automated and unsupervised selection of the optimal bandwidth and benchmark it against classical techniques -- demonstrating its potential across different dimensions.
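
The effect of bandwidth on topological features can be illustrated in one dimension by counting the modes (local maxima) of a Gaussian kernel density estimate. This is a toy sketch, not the topology-based loss proposed in the paper: the data, the grid, and the function names `kde` and `count_modes` are all assumptions made for illustration.

```python
import math

def kde(x, data, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    norm = len(data) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm

def count_modes(data, h, lo=-5.0, hi=5.0, steps=2000):
    """Count strict local maxima of the KDE on a uniform grid."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    dens = [kde(x, data, h) for x in grid]
    return sum(1 for i in range(1, steps) if dens[i - 1] < dens[i] > dens[i + 1])

# Toy data (assumption): two tight clusters of three points each.
data = [-2.1, -2.0, -1.9, 1.9, 2.0, 2.1]
modes_by_bandwidth = {h: count_modes(data, h) for h in (0.02, 0.5, 3.0)}
print(modes_by_bandwidth)
```

A very small bandwidth places a separate mode on every sample (under-smoothing), a moderate one recovers the two clusters, and a large one merges everything into a single mode (over-smoothing).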

stat.ML updates on arXiv.org

📋 Paper Title List