# Introduction: Combinatorial Optimization and the Ising Model#

This note introduces some background information on combinatorial optimization and optimization calculations with the Ising model. To learn how to run OpenJij, see the next note.

## What is Combinatorial Optimization?#

An optimization problem is a problem of finding the best solution while satisfying specific constraints. In particular, in the case of combinatorial optimization problems, the problem is to find the best combination among the possible.

Examples are shown below.

## Examples of Combinatorial Optimization#

### Knapsack Problem#

Suppose a man finds $N$ treasures, each with a value of $c_i\ (i \in\{0,1,\dots,N-1\})$ and a weight of $w_i$. Determining which treasures to put in his knapsack of capacity $W$ to maximize the total value is called the knapsack problem.

Above we considered how to successfully pack in one single knapsack. When multiple knapsacks exist, the problem of finding the minimum number of knapsacks to hold all $N$ treasures is called the bin packing problem. Packing random-sized rectangular boxes into a container so that each box does not overlap, as in loading a truck, is called the rectangular packing problem. Another similar problem called the Kepler conjecture is to find the proper way to pack spheres into an infinitely large box.

### Traveling Salesman Problem#

Suppose a salesman travels once to each of the $N$ cities to sell a product and then returns to the city he started. The problem to find the route with the shortest distance to reduce the cost of travel is called the traveling salesman problem.

The traveling salesman problem with time windows is the traveling salesman problem where the salesman has a fixed amount of time in each city for business meetings, and has specific appointment times to be in specific cities.

Also, in the traveling salesman problem, there is only one salesman, but in the vehicle routing problem, for example, we consider a situation in which multiple vehicles travel around a city. This can be expressed as having $M$ vehicles to deliver from a warehouse to $N$ destinations.

### Graph Coloring Problems#

In this section, we consider the simplest graph coloring problem, the vertex coloring problem. The vertex coloring problem requires assigning a color to each vertex in such a way that colors on adjacent vertices are different and the number of colors used is minimized. If the given graph is planar, this problem is related to the famous four-color theorem, which states that any map can be painted so that adjacent areas are of different colors.

Also, Sudoku, one of the most famous puzzles, is a game in which $9$ of $9\times 9$ squares must be filled in appropriately so that the same number does not cover the vertical, horizontal, and $3\times 3$ areas of a $9\times 9$ square. Sudoku can be solved as a kind of graph coloring problem.

As seen from the above examples, many combinatorial optimization problems examine combinations and sequences, and there are also many problems related to the real world.

In this note, we will discuss combinatorial optimization　using the Ising model, but various other algorithms have been studied in the context of combinatorial optimization for a wide variety of problems .

# Optimization Calculations using Ising Model#

## Ising Model#

The Ising model was originally introduced as a theoretical model to understand the behavior of magnetic materials. The Hamiltonian $H_J$ of the Ising model is written as follows:

$H_J = \sum_{i>j}J_{ij}\sigma_i\sigma_j + \sum_i h_i \sigma_i$

where $\sigma_i\in \{-1,1\}$ is a binary variable representing the $i$-th spin. The $J_{ij}$ represents the strength of the interatction between the spins, and $h_i$ represents the strength of the property that each spin wants to take either 1 or -1. Please refer to textbooks and other sources for more information on physics . This Hamiltonian corresponds to the cost function in an optimization problem. In other words, optimization calculations using the Ising model replace the combinatorial optimization problem of finding the optimal combination with the problem of finding a spin configuration that minimizes the Hamiltonian of the Ising model.

## Quadoratic Unconstraint Binary Optimization (QUBO) Formulation#

In order to solve a combinatorial optimization problem using the Ising model, the combinatorial optimization problem must be reduced to the form $H_J$. However, it is usually more straightforward to formulate the problem using a binary variable $q_i$ that takes values of either 0 or 1 rather than a spin variable $\sigma_i$ that takes values of either -1 or 1, in the context of mathematical optimization. This binary variable $q_i$ is related to the spin variable $\sigma_i$ as follows:

$q_i = \frac{\sigma_i + 1}{2}$

The Ising model is then rewritten as the following:

$H_Q = \sum_{i>j}Q_{ij}q_iq_j + \sum_i Q_{ii} q_i$

This is called the Quadratic Unconstraint Binary Optimization (QUBO) formulation. The spin variable $\sigma_i$ and the binary variable $q_i$ are equivalent models in the sense that they are mutually transformable. QUBO formulation is used in many Ising model-based optimization calculations.

### QUBO Formulation of the Traveling Salesman Problem#

As an example of Ising model-based optimization, we show the QUBO formulation of the traveling salesman problem below. Consider a binary variable $q_{it} \in \{0,1\}$ that is 1 when in city $i$ at time $t$ and 0 otherwise. Since we want to minimize the sum of the distance to go around cities, so we let $d_{ij}$ be the distance between cities $i$ and $j$, and the objective function is:

$\sum_{ijt}d_{ij}q_{it}q_{j(t+1)\mod n}$

Here, since this problem only involves one salesman visiting each city once, the following two constraint conditions apply: at a given time, the salesman can only be in one city, and in a given city, the salesman can only visit a city once.

$\sum_i q_{it} = 1,\forall t\\ \sum_t q_{it} = 1,\forall i$

We have formulated the mathematical model in binary variables, and next we reduce form to $H_Q$. To combine the constraints into a single Hamiltonian, a method called the penalty method is often used. Roughly speaking, the penalty method is a method of incorporating the effect of constraints into the objective function by adding an expression squared by the constraints to the objective function. Therefore, by using the penalty method, the final objective function can be written as:

$H_Q = \sum_{ijt}d_{ij}q_{it}q_{j(t+1)\mod n} + \sum_t A_t(\sum_i q_{it} - 1)^2 + \sum_i B_i(\sum_t q_{it} - 1)^2$

By transforming this formula, we finally form $H_Q$. Here, $A_t,B_i$ are the weight coefficients for each constraint, and if these coefficients are large, it will be easier to satisfy the constraints, but minimization of the objective function tends to be ignored. On the other hand, if the weights are set too small, minimization of the objective function will be performed, but it will be difficult to obtain a solution that satisfies the constraints. Thus, for problems involving constraints, these weight coefficients must be set appropriately.

Discussed was about the traveling salesman problem, and other formulations for specific optimization problems are also treated in this tutorial later on. For further reference, there are well-known papers such as  on how to formulate various NP-complete problems.

## Simulated Annealing#

We have discussed how to formulate a combinatorial optimization problem using the Ising model. However, the most important question is how to obtain a state that minimizes the cost function. We will now discuss the simulated annealing method, which is the basis of Ising optimization. Simulated annealing is a meta-heuristic algorithm inspired by the physics-based process of gradually lowering the temperature of a material, as the word “annealing” implies . The Geman-Geman theorem states that the simulated annealing method can reach an optimal solution if the temperature is lowered slowly enough . However, in practical use, the temperature cannot be lowered as slowly as this theory requires. Therefore, various solutions, including the optimal solution, are obtained stochastically.

The detailed algorithm will then be described. Let a certain spin configuration $\vec{\sigma}= \{\sigma_0,\sigma_1,\dots,\sigma_n \}$ be obtained and the value of the cost function be $H(\vec{\sigma})$. We then slightly change this spin configuration to obtain a new spin configuration $\vec{\sigma}_\mathrm{new}$. Any method can be used to obtain the new spin configuration, but one commonly used method is called a single spin flip, in which one spin is chosen and flipped. Another method called the Swendsen and Wang algorithm can also be used . The value of the cost function of the new spin configuration, $H(\vec{\sigma}_\mathrm{new})$, is calculated and if the value of the cost function is reduced by even a small amount, the new spin configuration is accepted; if the value of the cost function increases instead, the new spin configuration is accepted with probability $\exp(-(H(\vec{\sigma}_\mathrm{new}) - H(\vec{\sigma}))/T)$. Here, $T$ is a parameter corresponding to the temperature. Depending on its magnitude, the probability of accepting a spin configuration that increases the value of the cost function changes. The simulated annealing method is the algorithm that updates the new spin configuration by slowly decreasing $T$.

In summary, the simulated annealing method slowly lowers the temperature $T$ while updating the spin configuration with the following probability:

$P(\vec{\sigma}_\mathrm{new}|\vec{\sigma}) = \begin{cases} 1 & (H(\vec{\sigma}_\mathrm{new}) \leq H(\vec{\sigma})) \\ \exp(-(H(\vec{\sigma}_\mathrm{new}) - H(\vec{\sigma}))/T) & (H(\vec{\sigma}_\mathrm{new}) > H(\vec{\sigma})) \end{cases}$

The stochastic behavior of this cost function is a physics-based representation of the behavior of a transition to another state under the influence of thermal fluctuations.

The simulated annealing method prevents getting stuck in a local solution by probabilistically accepting not only states that lower the cost function, but also states that raise the cost function. The degree to which a state with an increasing cost function is acceptable depends on $T$. However, the appropriate temperature range depends on the problem to be solved, so it is always necessary to design an appropriate temperature range for each problem. OpenJij introduces a mechanism to automatically adjust the temperature.

## Quantum Annealing#

The simulated annealing method was an algorithm to avoid getting stuck in a local solution by using the effects of thermal fluctuations. In the world described by quantum mechanics, which describes physics in a very small, microscopic world, there exists a stochastic effect other than thermal fluctuation called quantum fluctuation. Quantum annealing uses the effect of quantum fluctuations instead of thermal fluctuations to perform optimization calculations. The Hamiltonian used in quantum annealing is:

$H(s) = \left(\sum_{i>j}J_{ij}\sigma^z_i\sigma^z_j + \sum_i h_i \sigma^z_i\right)-\Gamma(s)\sum_i \sigma_i^x$

Here, $s$ represents a kind of time, and the degree of influence of each term is adjusted by $\Gamma(s)$. Specifically, $\Gamma(s)$ is usually set to be very large at the initial time, and $\Gamma(s)$ is gradually made smaller. This lets the coefficients adjust to swap the influence of the two terms. The details of why we do it this way are left to textbooks and other sources , but the second term represents the superposition state of all possible spin configurations, and the first term represents the Hamiltonian of the original optimization problem. Therefore, from the superposition of all spin configurations caused by quantum effects, we gradually add the influence of the Hamiltonian we want to optimize, and finally adjust the spin state so that it exists at the optimal value in the Hamiltonian we want to optimize.

## Simulated Quantum Annealing#

Because quantum annealing uses pure quantum effects, it is difficult to perform large-scale calculations on classical computers. On the other hand, there is a computational method that mimics this quantum effect called simulated quantum annealing. Derivations are referred to textbooks and other sources . In this method, instead of performing quantum annealing for $H(s)$, we use the simulated annealing method on the Hamiltonian as follows:

$\tilde{H} = \frac{1}{L}\left(\sum_{i>j}\sum_{l=1}^{L}J_{ij}\sigma_{il}^z\sigma_{jl}^z + \sum_{i}\sum_{l=1}^{L}h_i\sigma_{il}^z\right)\\ \quad - \frac{T}{2}\log \coth\left(\frac{\Gamma}{TL}\sum_i\sum_{l=1}^L\sigma^z_{ik}\sigma^z_{ik+1}\right)$

where $L$ is a quantity called the Trotter number, and theoretically this method gives results corresponding to quantum annealing when $L$ is sufficiently large.