Distributionally robust Q-learning
Shen and Jiang [13] considered the distributionally robust chance constraint where the reference distribution in the Wasserstein ball is a Gaussian distribution. Peng et al. [11] …

Jan 19, 2024 · To mitigate the potential harms of model misspecification, various forms of distributionally robust optimization have been applied. Although many of these …
Oct 14, 2024 · Minimax Q-learning Control for Linear Systems Using the Wasserstein Metric. Stochastic optimal control usually requires an explicit dynamical model with probability distributions, which are difficult to obtain in practice. In this work, we consider the linear quadratic regulator (LQR) problem of unknown linear systems and adopt a …

Sep 14, 2024 · Robust Constrained Reinforcement Learning. Constrained reinforcement learning is to maximize the expected reward subject to constraints on utilities/costs. However, the training environment may not be the same as the test one, due to, e.g., modeling error, adversarial attack, non-stationarity, resulting in severe performance …
Confidence Regions in Wasserstein Distributionally Robust Estimation. Jose Blanchet, Karthyek Murthy, Nian Si. Biometrika 109.2 (2022): 295-315.

...

A Finite Sample Complexity Bound for Distributionally Robust Q-learning. Shengbo Wang, Nian Si, Jose Blanchet, Zhengyuan Zhou. Artificial Intelligence and Statistics Conference (AISTATS) 2023.
Jan 27, 2024 · Based on the standard Q-learning, we propose distributionally robust Q-learning with the single trajectory (DRQ) and its average-reward variant named differential DRQ. We provide asymptotic convergence guarantees and experiments for both settings, demonstrating their superiority in the perturbed environments against the non-robust ones.

In this paper, we address the chance-constrained safe Reinforcement Learning (RL) problem using the function approximators based on Stochastic Model Predictive Control …
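The distributionally robust Q-learning snippets above replace the usual Bellman target with a worst-case target over a set of plausible transition distributions. The following is a minimal sketch of that idea only, not the DRQ algorithm from the cited paper: it assumes a known nominal transition model `P`, a KL-divergence ball of radius `rho` around each nominal next-state distribution, and a grid search over the dual variable. All function and parameter names are hypothetical.

```python
import numpy as np

# Hypothetical helper names; not taken from any of the cited papers.

def kl_robust_value(p, v, rho, lambdas=None):
    """Lower bound on min E_q[v] over {q : KL(q || p) <= rho}.

    Uses the dual  sup_{lam > 0}  -lam * log E_p[exp(-v / lam)] - lam * rho,
    evaluated on a grid of lam values. Every grid point is itself a valid
    lower bound, so the maximum over the grid is one too.
    """
    p = np.asarray(p, dtype=float)
    v = np.asarray(v, dtype=float)
    if lambdas is None:
        lambdas = np.logspace(-3, 3, 601)
    support = p > 0                      # states outside the support cannot be reached
    p, v = p[support], v[support]
    v_min = v.min()                      # shift so every exponent is <= 0 (stability)
    z = -(v[None, :] - v_min) / lambdas[:, None]
    log_mgf = np.log(np.sum(p[None, :] * np.exp(z), axis=1))
    dual = v_min - lambdas * log_mgf - lambdas * rho
    return float(dual.max())


def robust_q_iteration(P, R, gamma, rho, iters=200):
    """Model-based robust Q-iteration with a KL ball around each P[s, a].

    P has shape (S, A, S) and R has shape (S, A). With rho = 0 this is
    approximately ordinary Q-value iteration (up to the grid resolution).
    """
    S, A = R.shape
    Q = np.zeros((S, A))
    for _ in range(iters):
        V = Q.max(axis=1)                # greedy state values from the previous sweep
        for s in range(S):
            for a in range(A):
                Q[s, a] = R[s, a] + gamma * kl_robust_value(P[s, a], V, rho)
    return Q
```

A larger `rho` gives a more pessimistic target, so the robust Q-values can only decrease relative to the nominal ones. A sample-based method such as DRQ would have to estimate this worst-case term from observed transitions rather than from a known `P`.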
May 25, 2024 · We use distributionally-robust optimization for machine learning to mitigate the effect of data poisoning attacks. We provide performance guarantees for the trained model on the original data (not including the poison records) by training the model for the worst-case distribution on a neighbourhood around the empirical distribution …
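The snippet above describes training against the worst-case distribution in a neighbourhood of the empirical distribution. As a minimal sketch of what such a worst case looks like, assume the neighbourhood is a KL-divergence ball of radius `rho` (an assumption; the snippet does not say which divergence or metric the authors use). The worst-case expected loss then has the dual form inf over lam > 0 of lam * rho + lam * log E_P[exp(loss / lam)], which the hypothetical function below evaluates on a grid.

```python
import numpy as np

def kl_worst_case_mean(losses, rho, lambdas=None):
    """Upper bound on max E_q[loss] over {q : KL(q || p_hat) <= rho}.

    p_hat is the empirical (uniform) distribution over `losses`.
    Assumption: a KL ball; other neighbourhoods (e.g. Wasserstein) have
    different duals. The grid over lam is a hypothetical, crude solver.
    """
    losses = np.asarray(losses, dtype=float)
    if lambdas is None:
        lambdas = np.logspace(-3, 3, 2001)
    m = losses.max()                     # shift so every exponent is <= 0 (stability)
    z = (losses[None, :] - m) / lambdas[:, None]
    # lam * log mean(exp(loss / lam)), computed via the shifted exponent
    scaled_log_mgf = m + lambdas * np.log(np.mean(np.exp(z), axis=1))
    dual = lambdas * rho + scaled_log_mgf
    return float(dual.min())
```

With `rho = 0` the bound is close to the plain empirical mean, and as `rho` grows it approaches the largest observed loss, so minimizing this quantity instead of the mean puts more weight on the hardest samples.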
Feb 22, 2024 · Combined with maximum entropy framework, we derive a distributionally robust variant of Soft Q-learning that enjoys efficient practical implementation and produces policies with robust behaviour ...

Distribution-robust optimization provides the basis for creating machine-learning models that can derive the sum of related data distributions. ... Bukljaš, Mihaela, Kristijan Rogić, …

Feb 7, 2024 · This paper presents a distributionally robust Q-Learning algorithm (DrQ) which leverages Wasserstein ambiguity sets to provide idealistic probabilistic out-of-sample safety guarantees during online learning. First, we follow past work by separating the constraint functions from the principal objective to create a hierarchy of machines which …

Aug 1, 2024 · A two-stage distributionally robust optimal distributed generation allocation model considering DR flexible adjustment is formulated to maximize the annual profit of …

Reliable Machine Learning via Structured Distributionally Robust Optimization. Data sets used to train machine learning (ML) models often suffer from sampling biases and underrepresent marginalized groups. Standard machine learning models are trained to ... While modern large-scale data sets often consist of heterogeneous …

In this paper, we study communication efficient distributed algorithms for distributionally robust federated learning via periodic averaging with adaptive sampling. In contrast to standard empirical risk minimization, due to the minimax structure of the underlying optimization problem, a key difficulty arises from the fact that the global ...

Distributionally Robust Reinforcement Learning. Elena Smirnova, Elvis Dohmatob, Jeremie Mary. Abstract: Real-world applications require RL algorithms to act safely. During learning process, it is likely that the agent executes sub-optimal actions that may lead to unsafe/poor states of the system. Exploration is particularly brittle in high-dimensional …