Metropolis Walker
The Boltzmann weight describes the likelyhood of finding a physical system in a state \(`s`\) at a given temperature \(`T`\):
p(s) \propto e^{-E(s)/k_B T}
where \(`E(s)`\) is the energy of the system in state \(`s`\) (in the context of optimization, this is often referred to as the "cost function"). The actual value of \(`p(s)`\) could be obtained via normalization:
p(s) = \frac{e^{-E(s)/k_BT}}{Z},\quad Z = \sum_{s'} e^{-E(s')/k_B T}
where the normalizing denominator \(`Z`\) is called the "partition function" and the sum is over all possible states of the system. For configuration spaces of interesting optimization problems, \(`Z`\) is typically intractable (that is, the number of possible configurations is too large to compute it).
NOTE: The lower the temperature \(`T`\), the highter the weight-discrepancy between "low energy" and "high energy" states. In the limit \(`T \to 0`\), the weight of all but the lowest energy state vanishes and we find the system in this "ground state" with probability \(`1`\).
Randomized Sampling
The metropolis algorithm describes a markov chain approach for randomized sampling of a system's states with their sampling probability converging to the Boltzmann weight. To extend the markov chain, a random next state is proposed and accepted with probability
p(s\to s') =
\begin{cases}
e^{-(E(s')-E(s))/k_BT},& \text{if } E(s') > E(s)\\
1, & \text{otherwise.}
\end{cases}
That is, we always accept a move to a lower-energy state, but only allow increases depending on the energy difference \(`\Delta_E`\) and temperature \(`T`\). This sampling approach can be proven to converge to the Boltzmann distribution in the long-term limit.
Implementation
Metropolis sampling is implemented as a specialization of the
Walker
base class:
:::cpp linenums=False
template <class Model>
class Metropolis : public Walker<Model> {
public:
/// Decide whether to accept a given cost increase.
bool accept(double cost_diff) override {
return cost_diff <= 0 ||
Walker<Model>::random_number_generator_->uniform() <
exp(-cost_diff * beta_);
}
/// Set the sampling temperature (also updates beta_).
void set_temperature(double temperature);
};
where the selection of a random next state and conditional application
of the proposed transition is implemented in the Walker
base class.
NOTE: For optimization problems it is customary to set the Boltzmann constant to \(`k_B\equiv 1`\) and work with the inverse temperature \(`\beta=1/T`\).
Usage
The Metropolis
class is used by solvers which rely on Boltzmann
sampling, such as ParallelTempering
,
SimulatedAnnealing
and
PopulationAnnealing
.
Instantiation: The class of the model to be sampled is specified
as a template argument to the Metropolis
class. This defines both the model
used for cost function evaluation and the type of state (Model::State_T
) in
the markov chain.
Preparation: Metropolis
instance initialization requires the following
steps prior to calls to the make_step()
method (inherited from Walker
):
set_model(&your_model)
must be called with a pointer to an instance of the model to use (not owned).set_random_number_generator(&your_rng)
must be called with a pointer to the random number generator to use (now owned).set_temperature(double)
must be called with the desired sampling temperature (the default value is \(`T=1`\)).init()
must be invoked after the model and rng have been assigned (and beforemake_step()
is called).
Both your_model
and your_rng
are "not owned", meaning they must be
instantiated by the caller and must outlive the last call to make_step()
NOTE: This delegation of ownership for the Model
and
RandomNumberGenerator
allows the caller to manage the number of instances
for each (e.g., avoid multiple models; rng's as needed for multi-threading).
Introspection: As a Walker
, each metropolis instance keeps track of the
last state in the markov chain and caches its current energy. In particular,
the calls to state()
and cost()
are \(`O(1)`\).
Example
#!cpp
#include "random/generator.h"
#include "markov/metropolis.h
#include "solver/test/test_model.h"
int main() {
// Prepare objects not owned by metropolis.
TestModel model;
common::RandomNumberGenerator rng;
rng.seed(42);
// Initialize the metropolis instance.
Metropolis<TestModel> metropolis;
metropolis.set_model(&model);
metropolis.set_rng(&rng)
metropolis.set_temperature(0.1);
metropolis.init();
// Attempt 50 metropolis steps
for (int i = 0; i < 50; i++) {
metropolis.make_step();
std::cout << metropolis.state() << " " << metropolis.cost() << std::endl;
}
return 0;
}