
Control Variate

RyanLee_ljx · About 1 minute · ML


Introduction to Control Variate

Target

Reduce the variance of a random variable $X$.

Method

Generate an alternative random variable $Y$ such that:

  • $\mathbb{E}(Y) = \mathbb{E}(X)$
  • $\text{Var}(Y) < \text{Var}(X)$

Approach and Proof

The control variate $Y$ is defined as:

$$Y = X + b(C - \mathbb{E}(C)) \tag{1}$$

Where:

  • $C$ is any other random variable (different from $X$) with known $\mathbb{E}(C)$
  • $b \in \mathbb{R}$ is a constant
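
Both target properties follow directly from equation (1). Unbiasedness:

$$\mathbb{E}(Y) = \mathbb{E}(X) + b\big(\mathbb{E}(C) - \mathbb{E}(C)\big) = \mathbb{E}(X)$$

For the variance:

$$\text{Var}(Y) = \text{Var}(X) + b^2\,\text{Var}(C) + 2b\,\text{Cov}(X, C)$$

which is a quadratic in $b$, minimized at

$$b^* = -\frac{\text{Cov}(X, C)}{\text{Var}(C)}, \qquad \text{Var}(Y^*) = \big(1 - \rho_{X,C}^2\big)\,\text{Var}(X)$$

so any nonzero correlation between $X$ and $C$ strictly reduces the variance.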

Key Requirements

  1. $C$ must be such that $\mathbb{E}(C)$ is known a priori
  2. $C$ should be an inherent part of the simulation output for $X$ (generated "for free" with $X$)
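
A minimal numerical sketch of the recipe (my own illustration, not from any cited source): estimate $\mathbb{E}[e^U]$ for $U \sim \text{Uniform}(0,1)$, using $C = U$ with known $\mathbb{E}(C) = 0.5$, and fitting $b$ from the same sample.

```python
import math
import random

random.seed(0)

def control_variate_estimate(n=100_000):
    # X = exp(U), U ~ Uniform(0, 1); true mean is e - 1 ≈ 1.71828.
    # Control variate C = U, with E(C) = 0.5 known a priori and
    # generated "for free" alongside X (requirement 2 above).
    xs, cs = [], []
    for _ in range(n):
        u = random.random()
        xs.append(math.exp(u))
        cs.append(u)
    mean_x = sum(xs) / n
    mean_c = sum(cs) / n
    # Sample estimate of the optimal coefficient b* = -Cov(X, C) / Var(C).
    cov_xc = sum((x - mean_x) * (c - mean_c) for x, c in zip(xs, cs)) / n
    var_c = sum((c - mean_c) ** 2 for c in cs) / n
    b = -cov_xc / var_c
    # Y = X + b (C - E(C)) from equation (1).
    ys = [x + b * (c - 0.5) for x, c in zip(xs, cs)]
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return mean_y, var_x, var_y

mean_y, var_x, var_y = control_variate_estimate()
```

Because $e^U$ and $U$ are strongly correlated, the fitted $b$ removes most of the variance while leaving the mean unchanged.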

Control Variate in IMTSP

Problem Formulation

View MTSP as a bilevel optimization problem:

Upper Level:

  • Optimize the allocation network $f(\theta)$ (city-agent assignment)
  • Optimize the surrogate network $s(\gamma)$

Lower Level:

  • Optimize the TSP solver $g(\mu)$ (single-agent routing)

$$\min L = \max D(g(h(f(\theta))), \mu)$$

Notation Breakdown

| Component | Description |
| --- | --- |
| $f(\theta)$ | Allocation network (parameters $\theta$) |
| $h(\cdot)$ | Sampling function |
| $g(\cdot)$ | TSP solver |
| $\mu$ | TSP solver parameters |
| $D(\cdot)$ | Euclidean distance cost function |

Gradient Estimation

Challenge 1: Non-Differentiability

Solution: Log-Derivative Trick

Implement gradient computation through:

  1. Allocation network $f$
  2. TSP solver $g$ and parameters $\mu$

Compact form:

$$\min L = \max D(g(h(f(\theta))), \mu)$$
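
The log-derivative (score-function) trick itself, written for a sample $\pi \sim P(\theta)$ drawn by $h$ from the allocation distribution (standard identity; notation adapted to this post):

$$\nabla_\theta \mathbb{E}_{\pi \sim P(\theta)}[D(\pi)] = \mathbb{E}_{\pi \sim P(\theta)}\big[D(\pi)\,\nabla_\theta \log P(\pi;\theta)\big]$$

which follows from $\nabla_\theta P = P\,\nabla_\theta \log P$, and moves the gradient off the non-differentiable sampling step and onto the log-probability of the allocation.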


Proof of Gradient Interchange

Under regularity conditions:

$$\nabla_\theta L = \nabla_\theta \mathbb{E}[D(g(h(f(\theta))), \mu)]$$

Rewritten as:

$$\nabla_\theta L = \mathbb{E}[\nabla_\theta D(g(h(f(\theta))), \mu)]$$


Variance Reduction

Challenge 2: High Variance

::: success Solution: Control Variate
Introduce a surrogate network to:

  1. Provide a control variate for the allocation network
  2. Minimize single-sample gradient variance
:::
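
A toy sketch of why a baseline-style control variate cuts the variance of a score-function gradient (my own illustration: a categorical distribution, fixed rewards, and a constant baseline stand in for the allocation network, tour cost, and surrogate output):

```python
import math
import random

random.seed(1)

def softmax(theta):
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    s = sum(exps)
    return [e / s for e in exps]

def grad_log_p(theta, k):
    # ∇_θ log p(k; θ) for a softmax-parameterized categorical: e_k - p
    p = softmax(theta)
    return [(1.0 if i == k else 0.0) - p[i] for i in range(len(theta))]

def score_function_grads(theta, reward, baseline, n=20_000):
    # Single-sample estimator g = (r(k) - b) · ∇_θ log p(k; θ).
    # A constant baseline b leaves the mean unchanged, since
    # E[∇_θ log p] = 0 (the score function has zero expectation).
    p = softmax(theta)
    out = []
    for _ in range(n):
        k = random.choices(range(len(theta)), weights=p)[0]
        out.append([(reward[k] - baseline) * g for g in grad_log_p(theta, k)])
    return out

theta = [0.0, 0.0, 0.0]
reward = [1.0, 2.0, 3.0]

plain = score_function_grads(theta, reward, baseline=0.0)
with_cv = score_function_grads(theta, reward, baseline=2.0)  # b ≈ E[r]

def mean_var(samples, dim=0):
    vals = [s[dim] for s in samples]
    m = sum(vals) / len(vals)
    v = sum((x - m) ** 2 for x in vals) / len(vals)
    return m, v

m_plain, v_plain = mean_var(plain)
m_cv, v_cv = mean_var(with_cv)
```

Both estimators target the same gradient (component 0 is $-1/3$ here), but subtracting a baseline close to the expected reward shrinks the per-sample variance substantially; IMTSP's surrogate network plays the role of a learned, input-dependent version of this baseline.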

Surrogate Network Design

Input: Allocation matrix $P(\theta)$
Output: Maximum tour length $L'$

$$\mathbb{E}(C) = \mathbb{E}[\nabla_\theta \log P(\theta)]$$
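
One way to read this (my reading, following the control-variate recipe in equation (1)): since the score function has zero expectation, $\mathbb{E}[\nabla_\theta \log P(\theta)] = 0$, a term $L'\,\nabla_\theta \log P(\theta)$ built from the surrogate's output can be subtracted from the score-function estimator without introducing bias:

$$\nabla_\theta L \approx \big(D - L'\big)\,\nabla_\theta \log P(\theta)$$

The closer the surrogate's prediction $L'$ tracks the actual maximum tour cost $D$, the more strongly the two terms are correlated and the smaller the single-sample gradient variance.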
