By Amirhossein Rajabi, Carsten Witt, in Proc. of GECCO 2020.

**Abstract**

Recent theoretical research has shown that self-adjusting and self-adaptive mechanisms can provably outperform static settings in evolutionary algorithms for binary search spaces. However, the vast majority of these studies focuses on unimodal functions which do not require the algorithm to flip several bits simultaneously to make progress. In fact, existing self-adjusting algorithms are not designed to detect local optima and do not have any obvious benefit to cross large Hamming gaps.

We suggest a mechanism called stagnation detection that can be added as a module to existing evolutionary algorithms (both with and without prior self-adjusting schemes). Added to a simple (1+1)EA, we prove an expected runtime on the well-known **Jump** benchmark that corresponds to an asymptotically optimal parameter setting and outperforms other mechanisms for multimodal optimization like heavy-tailed mutation. We also investigate the module in the context of a self-adjusting (1+)EA and show that it combines the previous benefits of this algorithm on unimodal problems with more efficient multimodal optimization.

To explore the limitations of the approach, we additionally present an example where both self-adjusting mechanisms, including stagnation detection, do not help to find a beneficial setting of the mutation rate. Finally, we investigate our module for stagnation detection experimentally.

# The Paper Story

## Self-adjusting Mechanisms

Self-adjusting Evolutionary Algorithms can learn good or even near-optimal parameters settings.

- Examples of popular self-adjusting EAs:

** GA with 1/5-rule:
**
if then
if then

** EA with two rates**
Create offspring by flipping each bit with prob.
Create offspring by flipping each bit with prob.
With prob. , replace with the strength that the best offspring has been created with (but with probability do a random step).

These self-adjusting schemes work since they in some sense measure **how successful **different parameter choices are compared to each other. There is basically **no signal** for the self-adjusting scheme in a local optimum. Mostly, classical self-adjusting algorithms would take many unsuccessful steps in such situations and suggest to set the mutation rate to its minimum although that might not be the best choice to leave the local optimum. On multimodal problems, however, **increasing the rate** is often beneficial and existing self-adjusting EAs perform poorly in this situation.

Most self-adjusting EAs perform poorly in local optima.

We are going to address this shortcoming by proposing **Stagnation Detection.**

**Stagnation Detection** is an *efficient* solution.

## Multimodal Functions

**Definition of "gap": **Given a fitness function

**Multimodal functions**have at least one point such that .

Example: In multimodal function, for all points such that .

## Stagnation Detection Idea

Assume the current search point is such that .

Improvement time of **(1+1) EA** with different **mutation strengths **** (mutation probability ):
β β
β β (Optimum but requiring the gap size).**

But in the real world, we don't know how large the gaps are. But hopefully we can find out how large the gaps are not. But How?

Assume that , and the algorithm doesn't make any improvement within steps, then with high probability the gap size is larger or much larger than .

We use this fact to develop the stagnation detection mechanism β

Starting from strength , the strength is increased from to when the counter exceeds the threshold . . If , If , .

- Parameter

## Details

The parameter that you can see in the threshold value can be used to control the probability of failing to find an improvement at the βrightβ strength. if is set to the number of fitness values of the underlying function ,In other words , then the probability of ever missing an improvement at the right strength is sufficiently small throughout the run. We recommend at least if nothing is known about the range of .

## EA with Stagnation Detection (EA)

Let's use the **Stagnation Detection** module in **classical (1+1) EA **and call it **SD-(1+1)EA.**

## Escaping from Local Optima

**Theorem:** SD-EA escapes from the local optimum in
, where .

## Proof idea

- :
**Not escaping**from the local optimum in phase . - Lower bound β

## SD- EA on JUMP

**Theorem:** The expected runtime of the SD-EA on with satisfies

## Proof idea

- SD-EA and EA behaviour identical on most unimodal functions except for an error event of small probability.

## General Bound on Fitness Function

- = the number of points of visited.
- where

**Theorem:** The expected runtime of the SD-EA is at most
,
moreover,
.

- Use this theorem to calculate the upper bound that the
**Stagnation Detection mechanism**has on any fitness function.

## SASD-EA

- EA with two-rate and Stagnation Detection (SASD-EA)
- This algorithm behaves like
__EA with two rates__on unimodal points and like__EA with stagnation detection__on local optima. - The threshold is since in each iteration the algorithm produces .

Two states:

**S1** β EA with two rates.

**S2 β** EA with stagnation detection.

**SASD-****EA Idea**
- **S1** at the beginning.
- Shift to **S2** where the counter exceeds the threshold.
- Shift to **S1** where a strict improvement is observed.

## Pseudocode

## Experimental Results on

We carried out an experiment that compares different algorithms on jump function with gap size of 4 and different n. The Numbers are averaged over 1000 independent runs of the algorithms.

- This result is showing the performance of Stagnation Detection on Jump function as a benchmark in practice.

## Where Self-adaptation Fails

- In the second part of the paper, we present a function where stagnation detection and other self-adjusting schemes cannot find its global optimum in polynomial time. Because measuring the number of successes does not hint on the location of the global optimum.

## Observation

The crucial observation for the function is that strength 1 makes it more likely to flip exactly one specific bit than strength 2.However, to flip specific two bits, the strength 2 is asymptotically optimal and clearly better than strength 1. The fitness function will account for this. It leads to a trap at a local optimum if a certain number of one-bit flips is exceeded before a certain minimum number of two-bit flips has happened; However, otherwise, the process is on track to the global optimum.

- Ratio of successfully achieved global optimum where over 1000 runs.

The optimization time of different algorithms on :
- **Theorem 4.1:** EA with strength w.h.p.
- **Theorem 4.1:** EA with strength for any constant w.h.p.
- **Theorem 4.2:** SD-EA at least w.h.p.
- **Theorem 4.3:** SASD-EA at least w.h.p.

- The proofs of all theorems are available in the arXiv version.

## Summary

- We have designed and analyzed self-adjusting EAs for multimodal optimization with
**Stagnation Detection**. - Stagnation Detection can be added to existing EAs without essentially changing their behavior on unimodal (sub)problems.
- (1+1)EA with stagnation detection (SD-(1+1)EA) optimizes the
**Jump**function with in asymptotically optimal time . - We have proved a
**general upper bound**for multimodal functions. - We have presented a function, on which all of our investigated self-adjusting EAs provably fail to be efficient.

Thank you. if you have any question please don't hesitate to contact us.