
Control Variate

RyanLee_ljx · About 1 minute · ML


Introduction to Control Variate

Target

Reduce the variance of a random variable $X$.

Method

Generate an alternative random variable $Y$ such that:

  • $\mathbb{E}(Y) = \mathbb{E}(X)$
  • $\text{Var}(Y) < \text{Var}(X)$

Approach and Proof

The control variate $Y$ is defined as:

$$Y = X + b(C - \mathbb{E}(C)) \tag{1}$$

Where:

  • $C$ is any other random variable (different from $X$) with known $\mathbb{E}(C)$
  • $b \in \mathbb{R}$ is a constant
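
Both target properties follow directly from equation (1). Unbiasedness:

$$\mathbb{E}(Y) = \mathbb{E}(X) + b\big(\mathbb{E}(C) - \mathbb{E}(C)\big) = \mathbb{E}(X)$$

For the variance:

$$\text{Var}(Y) = \text{Var}(X) + b^2\,\text{Var}(C) + 2b\,\text{Cov}(X, C)$$

which is a quadratic in $b$, minimized at

$$b^* = -\frac{\text{Cov}(X, C)}{\text{Var}(C)}, \qquad \text{Var}(Y^*) = \big(1 - \rho_{X,C}^2\big)\,\text{Var}(X)$$

so any nonzero correlation between $X$ and $C$ strictly reduces the variance.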

Key Requirements

  1. $C$ must be such that $\mathbb{E}(C)$ is known a priori
  2. $C$ should be an inherent part of the simulation output for $X$ (generated "for free" with $X$)
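
A minimal numerical sketch of the recipe (my own illustration, not from any cited source): estimate $\mathbb{E}[e^U]$ for $U \sim \text{Uniform}(0,1)$, using $C = U$ with known $\mathbb{E}(C) = 0.5$, and fitting $b$ from the same sample.

```python
import math
import random

random.seed(0)

def control_variate_estimate(n=100_000):
    # X = exp(U), U ~ Uniform(0, 1); true mean is e - 1 ≈ 1.71828.
    # Control variate C = U, with E(C) = 0.5 known a priori and
    # generated "for free" alongside X (requirement 2 above).
    xs, cs = [], []
    for _ in range(n):
        u = random.random()
        xs.append(math.exp(u))
        cs.append(u)
    mean_x = sum(xs) / n
    mean_c = sum(cs) / n
    # Sample estimate of the optimal coefficient b* = -Cov(X, C) / Var(C).
    cov_xc = sum((x - mean_x) * (c - mean_c) for x, c in zip(xs, cs)) / n
    var_c = sum((c - mean_c) ** 2 for c in cs) / n
    b = -cov_xc / var_c
    # Y = X + b (C - E(C)) from equation (1).
    ys = [x + b * (c - 0.5) for x, c in zip(xs, cs)]
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return mean_y, var_x, var_y

mean_y, var_x, var_y = control_variate_estimate()
```

Because $e^U$ and $U$ are strongly correlated, the fitted $b$ removes most of the variance while leaving the mean unchanged.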

Control Variate in IMTSP

Problem Formulation

View MTSP as a bilevel optimization problem:

Upper Level:

  • Optimize the allocation network $f(\theta)$ (city-agent assignment)
  • Optimize the surrogate network $s(\gamma)$

Lower Level:

  • Optimize the TSP solver $g(\mu)$ (single-agent routing)

$$\min L = \max D(g(h(f(\theta))), \mu)$$

Notation Breakdown

| Component | Description |
| --- | --- |
| $f(\theta)$ | Allocation network (parameters $\theta$) |
| $h(\cdot)$ | Sampling function |
| $g(\cdot)$ | TSP solver |
| $\mu$ | TSP solver parameters |
| $D(\cdot)$ | Euclidean distance cost function |

Gradient Estimation

Challenge 1: Non-Differentiability

Solution: Log-Derivative Trick

Implement gradient computation through:

  1. Allocation network $f$
  2. TSP solver $g$ and parameters $\mu$

Compact form:

$$\min L = \max D(g(h(f(\theta))), \mu)$$
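
The log-derivative (score-function) trick itself, written for a sample $\pi \sim P(\theta)$ drawn by $h$ from the allocation distribution (standard identity; notation adapted to this post):

$$\nabla_\theta \mathbb{E}_{\pi \sim P(\theta)}[D(\pi)] = \mathbb{E}_{\pi \sim P(\theta)}\big[D(\pi)\,\nabla_\theta \log P(\pi;\theta)\big]$$

which follows from $\nabla_\theta P = P\,\nabla_\theta \log P$, and moves the gradient off the non-differentiable sampling step and onto the log-probability of the allocation.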


Proof of Gradient Interchange

Under regularity conditions:

$$\nabla_\theta L = \nabla_\theta \mathbb{E}[D(g(h(f(\theta))), \mu)]$$

Rewritten as:

$$\nabla_\theta L = \mathbb{E}[\nabla_\theta D(g(h(f(\theta))), \mu)]$$


Variance Reduction

Challenge 2: High Variance

::: success Solution: Control Variate
Introduce a surrogate network to:

  1. Provide a control variate for the allocation network
  2. Minimize single-sample gradient variance
:::
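
A toy sketch of why a baseline-style control variate cuts the variance of a score-function gradient (my own illustration: a categorical distribution, fixed rewards, and a constant baseline stand in for the allocation network, tour cost, and surrogate output):

```python
import math
import random

random.seed(1)

def softmax(theta):
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    s = sum(exps)
    return [e / s for e in exps]

def grad_log_p(theta, k):
    # ∇_θ log p(k; θ) for a softmax-parameterized categorical: e_k - p
    p = softmax(theta)
    return [(1.0 if i == k else 0.0) - p[i] for i in range(len(theta))]

def score_function_grads(theta, reward, baseline, n=20_000):
    # Single-sample estimator g = (r(k) - b) · ∇_θ log p(k; θ).
    # A constant baseline b leaves the mean unchanged, since
    # E[∇_θ log p] = 0 (the score function has zero expectation).
    p = softmax(theta)
    out = []
    for _ in range(n):
        k = random.choices(range(len(theta)), weights=p)[0]
        out.append([(reward[k] - baseline) * g for g in grad_log_p(theta, k)])
    return out

theta = [0.0, 0.0, 0.0]
reward = [1.0, 2.0, 3.0]

plain = score_function_grads(theta, reward, baseline=0.0)
with_cv = score_function_grads(theta, reward, baseline=2.0)  # b ≈ E[r]

def mean_var(samples, dim=0):
    vals = [s[dim] for s in samples]
    m = sum(vals) / len(vals)
    v = sum((x - m) ** 2 for x in vals) / len(vals)
    return m, v

m_plain, v_plain = mean_var(plain)
m_cv, v_cv = mean_var(with_cv)
```

Both estimators target the same gradient (component 0 is $-1/3$ here), but subtracting a baseline close to the expected reward shrinks the per-sample variance substantially; IMTSP's surrogate network plays the role of a learned, input-dependent version of this baseline.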

Surrogate Network Design

Input: Allocation matrix $P(\theta)$
Output: Maximum tour length $L'$

$$\mathbb{E}(C) = \mathbb{E}[\nabla_\theta \log P(\theta)]$$
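
One way to read this (my reading, following the control-variate recipe in equation (1)): since the score function has zero expectation, $\mathbb{E}[\nabla_\theta \log P(\theta)] = 0$, a term $L'\,\nabla_\theta \log P(\theta)$ built from the surrogate's output can be subtracted from the score-function estimator without introducing bias:

$$\nabla_\theta L \approx \big(D - L'\big)\,\nabla_\theta \log P(\theta)$$

The closer the surrogate's prediction $L'$ tracks the actual maximum tour cost $D$, the more strongly the two terms are correlated and the smaller the single-sample gradient variance.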
