Optimization based on Multi-Meme Memetic Algorithm

doi:10.21203/rs.3.rs-4942743/v1


Research Article


This work is licensed under a CC BY 4.0 License


In this paper, a new learning automata-based multi-meme memetic algorithm, obtained by combining learning automata (LA) with a memetic algorithm (MA), is proposed for optimization problems. The algorithm is composed of two parts: a genetic section and a memetic section. The genetic section operates on an irregular cellular learning automaton (ICLA), a generalization of cellular learning automata (CLA) in which the restriction to a rectangular grid structure is removed. The memetic section consists of a pool of memes; each meme corresponds to a particular local search method and is represented by a set of LAs from which the history of that local search method can be extracted. To show the superiority of the proposed algorithm over several well-known algorithms, a number of computer experiments have been conducted on the vertex coloring problem. The results show that the new algorithm performs better than the other methods in terms of running time and the number of colors required.

Irregular Cellular Learning Automata (ICLA)

Memetic Algorithm (MA)

Multi-meme MA

Learning automata (LA) are based on the general scheme of reinforcement learning. An LA enables an agent to learn from its interaction with an environment: it selects actions through a stochastic process and applies them to a random, unknown environment. By repeatedly acting and receiving stochastic reinforcement signals from the environment, it learns the best action. These stochastic responses indicate how favorable the selected actions are, and the LA adjusts its action-selection mechanism in favor of the most promising actions accordingly [2, 3].

In this paper, we propose a multi-meme memetic algorithm based on learning automata for optimization problems, and we use it to solve the vertex coloring problem. Vertex coloring is a well-known combinatorial optimization problem in graph theory with many applications, such as timetabling, air traffic flow management, register allocation, and scheduling. Let $G=(V,E)$ be a graph, where $V$ is the set of $|V|=n$ vertices and $E$ is the set of $|E|=m$ edges. A k-coloring of $G$ is a function $\Phi:V\to\Gamma$, where $\Gamma=\{1,2,\dots,k\}$ is a set of integers, each representing one color. A coloring is feasible if the two endpoints of every edge have different colors, that is, $\Phi(u)\ne\Phi(v)$ for all $[u,v]\in E$; otherwise the coloring is infeasible. If the endpoints $u$ and $v$ of an edge have the same color, the vertices $u$ and $v$ are in conflict. The set of all vertices that are in conflict forms the conflict set $V^{C}$. The minimum number of colors required for a feasible coloring is called the chromatic number $\chi(G)$, and a graph $G$ is said to be k-chromatic if $\chi(G)=k$. A feasible k-coloring can be viewed as a partition of the vertex set into k disjoint sets, called color classes and denoted by $C=\{C_1,\dots,C_k\}$. The k-coloring problem is NP-complete for general graphs, and the chromatic number problem is NP-hard [1]. The vertex coloring problem can be posed as a decision problem or as an optimization problem. The decision version asks whether a feasible k-coloring exists for a given k; the optimization version asks for the smallest k for which the graph can be feasibly colored.
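The definitions of feasibility and of the conflict set $V^{C}$ can be illustrated with a minimal Python sketch. The function names, the edge-list representation, and the use of a dictionary for $\Phi$ are assumptions made for illustration only; they are not taken from the paper.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[int, int]


def conflict_set(edges: Iterable[Edge], coloring: Dict[int, int]) -> Set[int]:
    """Return V^C: every vertex that shares its color with at least one neighbor."""
    conflicts: Set[int] = set()
    for u, v in edges:
        if coloring[u] == coloring[v]:
            conflicts.update((u, v))
    return conflicts


def is_feasible(edges: Iterable[Edge], coloring: Dict[int, int]) -> bool:
    """A coloring is feasible exactly when the conflict set is empty."""
    return not conflict_set(edges, coloring)


def num_colors(coloring: Dict[int, int]) -> int:
    """Number of distinct colors (non-empty color classes) used by the coloring."""
    return len(set(coloring.values()))


# Hypothetical example: a cycle on four vertices.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
good = {0: 1, 1: 2, 2: 1, 3: 2}   # feasible 2-coloring
bad = {0: 1, 1: 1, 2: 2, 3: 1}    # vertices 0, 1 and 3 are in conflict
```

In this representation the decision version amounts to asking whether some coloring with `num_colors(coloring) <= k` passes `is_feasible`, while the optimization version searches for the smallest such k.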

The proposed algorithm combines learning automata (LA) for local search (exploitation) with a memetic algorithm (MA) for global search (exploration). It is composed of two parts: a genetic section and a memetic section. The genetic section operates on an irregular cellular learning automaton (ICLA), a generalization of cellular learning automata (CLA) in which the restriction to a rectangular grid structure is removed. In the genetic section, the input graph is modeled by an isomorphic ICLA in which each vertex of the graph is associated with a cell of the ICLA, and each cell is equipped with a learning automaton. In each cell, the action set of the learning automaton is the set of colors with which the corresponding vertex can be colored. The operation of the genetic section consists of a number of stages, and each stage consists of two steps. In the first step, the learning automaton of each cell chooses one of its actions (colors) at random according to its action probability vector, so that each cell and its neighbors locally generate a new coloring. The selected action is penalized if the local coloring is infeasible or if it uses more colors than the best coloring generated so far. In the second step, the new coloring replaces the current chromosome if its genetic fitness is better. The genetic fitness function counts the number of edges whose endpoints belong to the same color class.

The memetic section of the proposed algorithm consists of a pool of memes; each meme corresponds to a particular local search method and is represented by a set of LAs from which the history of that local search method can be extracted. The action probability vectors of a meme (the history) are updated by a learning algorithm according to the reinforcement signal received from the genetic section after the corresponding local search method has been applied. The update depends on the result of applying the local search method to the current chromosome: if the color of a vertex is the same before and after the local search, the selected action of the learning automaton of the corresponding gene is rewarded; otherwise it is penalized. This process continues until the desired result is obtained. We compare the results of the proposed algorithm with those of the best-known vertex coloring algorithms: TPA [4], AMACOL [5], ILS [6], CHECKCOL [7], GLS [8], and CLAVCA [9]. The results show the superiority of the new method over the other algorithms in terms of running time and the number of colors required. This paper is organized as follows. Section 2 gives an overview of related work on the vertex coloring problem. Learning automata (LA) and cellular learning automata (CLA) are described in Section 3. The proposed algorithm is described in Section 4. Section 5 covers implementation considerations, simulation results, and a comparison with other algorithms to highlight the contributions of the new algorithm. Finally, conclusions and future work are discussed in Section 6.

A wide variety of approaches to the vertex coloring problem have been reported in the literature. These approaches are classified as exact or approximation techniques. The first exact algorithm for computing the chromatic number of a graph was proposed in [10]; it finds an optimal solution in time $O(2.4423^{n})$. An $O(2.40231^{n})$ algorithm for computing the chromatic number was proposed in [11]. Faster algorithms have been reported when the number of colors is fixed. An $O(1.3289^{n})$ algorithm for the k-coloring problem with k = 3 was given in [12], and an $O(1.7504^{n})$ algorithm for k = 4 was proposed in [13]. Finally, algorithms with running times $O(2.1020^{n})$ and $O(2.3289^{n})$ were proposed for k = 5 [11] and k = 6 [14], respectively. Because the vertex coloring problem is NP-hard, exact techniques can only be used for small graphs, whereas many applications involve very large graphs. On the other hand, a near-optimal coloring is sufficient in many applications. Hence, a number of approximation algorithms have been proposed in the literature for finding near-optimal solutions to the vertex coloring problem.

A two-phase local search algorithm for the graph coloring problem is proposed in [4]. The algorithm generates candidate solutions by alternately executing two interacting components: a stochastic and a deterministic local search. In the stochastic phase, which is based on biased random sampling, feasible colorings are created. Then, in the deterministic phase, each vertex is assigned a color in such a way that the solution penalty is minimized.

An iterated local search (ILS) algorithm, based on a random walk in the space of local optima, is proposed in [6]. The algorithm generates candidate solutions in three steps: 1) a walk is built by iteratively perturbing a locally optimal solution; 2) a new locally optimal solution is obtained by applying a local search algorithm; 3) an acceptance criterion is used to select the solution from which the search continues.

A priority-based local search algorithm called CHECKCOL is proposed in [7]. CHECKCOL reduces running time by avoiding unnecessary searches in large parts of the graph that make no progress toward a solution. To do this, it introduces the notion of a checkpoint, which forces the algorithm to stop at certain steps, release all of its memory, and then restart the local search. In this algorithm, each vertex of the graph is dynamically assigned a priority. The priority is used to define an effective long-term memory scheme, which is integrated with the short-term memory scheme implied by the checkpoints.

In [15], a tabu search algorithm was proposed for the vertex coloring problem. In this algorithm, the vertices of the graph are partitioned into several blocks, and a different color is assigned to each block at each iteration. The resulting solutions may be feasible or infeasible. In the next step, to create a feasible coloring, a set of neighbors restricted by a tabu list is generated for each vertex, and a new solution is obtained by applying a local search algorithm. The tabu list prevents premature convergence of the algorithm.

A hybrid algorithm based on a genetic algorithm and tabu search is proposed in [16] for the vertex coloring problem. The hybrid algorithm maintains a population of solutions and uses a crossover operator in which each vertex of the child chromosome inherits its color from one of the parent chromosomes. The mutation operator is replaced by a tabu search method. The hybrid algorithm is improved by first selecting k-independent sets of vertices and then applying the hybrid algorithm to the remaining vertices.

A genetic local search algorithm is proposed in [17] for the vertex coloring problem. This algorithm introduces a new crossover operator based on the union of independent sets (UIS), which is combined with a tabu search method. The algorithm was tested on several data sets and compared with the best-known methods; the comparison showed the superiority of that algorithm.

An adaptive memory algorithm for the k-coloring problem, called AMACOL, was proposed in [5]. AMACOL is a hybrid evolutionary algorithm in which color classes derived from colorings generated during earlier stages of the search are stored in a central memory. At each generation, the central memory is used to generate new solutions, which are then improved by a local search method. Finally, the central memory is updated with the solutions obtained.

A hybrid evolutionary algorithm called HEA was proposed in [18] for the graph coloring problem. HEA uses the DSATUR construction heuristic to initialize the population of chromosomes. At each generation, new chromosomes are generated by first recombining two parent chromosomes and are then improved by a local search method. The greedy partition crossover (GPX) is used as the crossover operator in HEA. GPX generates a new chromosome by alternately selecting color classes from each parent. The new chromosome generated by GPX is then improved by tabu search and replaces the worse parent in the current population.

In this section, learning automata (LA) are introduced briefly. We then present cellular learning automata (CLA), a combination of cellular automata and learning automata, and irregular cellular learning automata (ICLA), in which the restriction to a rectangular grid structure in CLA is removed.

Learning Automata

A learning automaton [2] is an adaptive decision-making unit that learns to determine the optimal action from a set of actions through repeated interactions with an unknown random environment. At each instant, it selects an action according to a probability distribution and applies it to the environment. The environment evaluates the action and sends a reinforcement signal back to the automaton. The automaton processes this response and updates its action probability vector. By repeating this process, the automaton learns to choose the optimal action so that the average penalty received from the environment is minimized. The environment is represented by a triple $\langle\underline{\alpha},\underline{\beta},\underline{c}\rangle$, where $\underline{\alpha}=\{\alpha_1,\dots,\alpha_r\}$ is the finite set of inputs, $\underline{\beta}=\{\beta_1,\dots,\beta_m\}$ is the set of values that the reinforcement signal can take, and $\underline{c}=\{c_1,\dots,c_r\}$ is the set of penalty probabilities, where each element $c_i$ of $\underline{c}$ is associated with one input action $\alpha_i$. When the penalty probabilities are constant, the random environment is said to be stationary; if they vary with time, it is called non-stationary. Depending on the nature of the reinforcement signal, there are three types of environment: P-model, Q-model, and S-model. Environments in which the output can take only one of the two values 0 or 1 are referred to as P-model environments. In a Q-model environment, the reinforcement signal takes one of a finite number of values in the interval $[a,b]$. When the output of the environment is a continuous random variable in the interval $[a,b]$, the environment is referred to as S-model. The relationship between the learning automaton and the random environment is shown in Fig. 1.

There are two main families of learning automata [3]: fixed-structure and variable-structure learning automata. A variable-structure learning automaton is represented by a triple $\langle\underline{\beta},\underline{\alpha},T\rangle$, where $\underline{\beta}$ is the set of inputs, $\underline{\alpha}$ is the set of output actions, and $T$ is the learning algorithm used to modify the action probability vector. The learning algorithm is the critical factor affecting the performance of a variable-structure learning automaton. Suppose the learning automaton selects action $\alpha_i(n)\in\underline{\alpha}$ at instant $n$ according to the action probability vector $\underline{p}(n)$, whose components are $P_j(n)$. The action probability vector is updated by the learning algorithm given in Eq. (1) if the selected action $\alpha_i(n)$ is rewarded by the random environment, and as given in Eq. (2) if it is penalized. Here $a$ and $b$ denote the reward and penalty parameters, and $r$ is the number of actions available to the learning automaton.

$$P_j(n+1)=\begin{cases}P_j(n)+a\left[1-P_j(n)\right] & j=i\\ (1-a)\,P_j(n) & \forall j,\ j\ne i\end{cases}\tag{1}$$

$$P_j(n+1)=\begin{cases}(1-b)\,P_j(n) & j=i\\ \dfrac{b}{r-1}+(1-b)\,P_j(n) & \forall j,\ j\ne i\end{cases}\tag{2}$$

If $a=b$, the recurrence equations (1) and (2) are called the linear reward-penalty ($L_{R-P}$) algorithm; if $a\gg b$, they are called linear reward-ε-penalty ($L_{R\epsilon P}$); and if $b=0$, they are called linear reward-inaction ($L_{R-I}$). In the latter case, the action probability vector remains unchanged when the selected action is penalized by the environment.
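The update rules (1) and (2) can be sketched in Python as follows. This is a minimal illustration, assuming a P-model environment and using the standard redistribution term $b/(r-1)$ in the penalty case, which keeps the vector a probability distribution. The function names and the list-based representation are illustrative and not part of the original paper.

```python
import random
from typing import List


def reward(p: List[float], i: int, a: float) -> List[float]:
    """Eq. (1): increase the probability of the rewarded action i, scale the rest by (1 - a)."""
    return [pj + a * (1.0 - pj) if j == i else (1.0 - a) * pj
            for j, pj in enumerate(p)]


def penalize(p: List[float], i: int, b: float) -> List[float]:
    """Eq. (2): decrease the probability of the penalized action i and
    give each of the other r - 1 actions an extra b / (r - 1)."""
    r = len(p)
    return [(1.0 - b) * pj if j == i else b / (r - 1) + (1.0 - b) * pj
            for j, pj in enumerate(p)]


def select_action(p: List[float], rng: random.Random) -> int:
    """Draw an action index according to the action probability vector."""
    return rng.choices(range(len(p)), weights=p, k=1)[0]


# With b = 0 the penalty update leaves the vector unchanged (L_{R-I});
# with a = b the pair of updates is the L_{R-P} scheme.
```

Both updates preserve the sum of the probabilities: in Eq. (1) the mass removed from the other actions equals the mass added to action i, and in Eq. (2) the mass $b\,P_i(n)$ removed from action i, together with the rescaling of the others, adds up to exactly $b$ spread over $r-1$ actions.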

Learning automata have a vast variety of applications in combinatorial optimization problems [19–22], computer networks [21, 23–26], queuing theory [27, 28], signal processing [29, 30], information retrieval [31, 32], adaptive control [33–35], neural networks engineering [36, 37] and pattern recognition [38–40].

Cellular learning automata (CLA)

Cellular learning automata (CLA), a combination of cellular automata (CA) and learning automata (LA), is a mathematical model for solving problems. Because a CLA can learn, the model is superior to CA, and because a CLA is equipped with a collection of LAs that interact with each other, it is superior to a single LA. The basic idea of CLA is to use LAs to adjust the state transition probabilities of the CA. The operation of a CLA can be described as follows. In the first step, the state of every cell is determined on the basis of the action probability vector of the LA residing in that cell. The initial value of this state may be chosen randomly or on the basis of past experience. In the second step, the rule of the CA determines the reinforcement signal to the LA residing in each cell. Finally, each LA updates its action probability vector on the basis of the supplied reinforcement signal and the chosen action. This process continues until the desired result is obtained. A d-dimensional CLA is defined formally below.

A $d$-dimensional cellular learning automaton is a structure $\mathcal{A}=(Z^{d},\underline{\Phi},A,N,\mathcal{F})$, where

$Z^{d}$ is a lattice of d-tuples of integer numbers. $\underline{\Phi}$ is a finite set of states. $A$ is the set of LAs, each of which is assigned to one cell of the CA. $N=\{\bar{x}_1,\bar{x}_2,\dots,\bar{x}_{\bar{m}}\}$ is a finite subset of $Z^{d}$ called the neighborhood vector, where $\bar{x}_i\in Z^{d}$. $\mathcal{F}:\underline{\Phi}^{\bar{m}}\to\underline{\beta}$ is the local rule of the cellular learning automaton, where $\underline{\beta}$ is the set of values that the reinforcement signal can take. It computes the reinforcement signal for each LA from the current actions selected by its neighboring LAs.

Irregular Cellular Learning Automata (ICLA)

An irregular cellular learning automaton (ICLA) is a CLA in which the restriction to a rectangular grid structure is removed. Many applications, such as wireless networks and graph-related problems, cannot be adequately modeled with a CLA. An ICLA is defined as an undirected graph in which each node represents a cell equipped with an LA. The LA residing in each cell selects its action on the basis of its action probability vector. The local rule, together with the actions selected by a particular LA and its neighboring LAs, determines the reinforcement signal to that LA (the local rule in ICLA is the same as the local rule in CLA).

In this section, we propose a learning automata-based multi-meme memetic algorithm, called LAMACOL, for solving the vertex coloring problem. Before the coloring process starts, a preprocessing algorithm may be applied to reduce a graph $G$ to a graph $G'$ such that a feasible k-coloring of $G$ can be derived by construction rules from any feasible k-coloring of $G'$. The preprocessing algorithm consists of two reduction rules presented in [41].

Rule 1: Remove every vertex of $G$ whose degree is less than k. If the degree of a vertex $u$ is less than k, at least one color not used by its adjacent vertices can be assigned to $u$ without breaking feasibility.
Rule 2: Remove any vertex $v\in V$ for which there is a vertex $u\in V$, $u\ne v$ and $[u,v]\notin E$, such that $u$ is connected to every vertex to which $v$ is connected. In this case, any color that can be assigned to $u$ can also be assigned to $v$.

These two rules can be applied in any order and if one rule applies, it may make possible further reductions by the other rule. Hence, in the preprocessing stage, the two rules are applied iteratively until no vertex can be removed anymore.
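The iterative application of the two rules can be sketched as follows. This is a minimal Python illustration under stated assumptions: the graph is given as an adjacency dictionary, the name `reduce_graph` is hypothetical, and only the reduction is shown. The construction rules that extend a coloring of $G'$ back to $G$ are not specified in this section and are therefore omitted.

```python
from typing import Dict, Set


def reduce_graph(adj: Dict[int, Set[int]], k: int) -> Dict[int, Set[int]]:
    """Apply Rule 1 and Rule 2 repeatedly until no vertex can be removed."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            # Rule 1: the degree of v is less than k.
            low_degree = len(adj[v]) < k
            # Rule 2: some non-adjacent vertex u is connected to every neighbor of v.
            dominated = any(
                u != v and u not in adj[v] and adj[v] <= adj[u]
                for u in adj
            )
            if low_degree or dominated:
                for w in adj[v]:
                    adj[w].discard(v)
                del adj[v]
                changed = True
    return adj
```

Because removing a vertex lowers the degree of its neighbors and can create new dominated vertices, the outer loop repeats until a full pass removes nothing, matching the iterative scheme described above.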

The proposed algorithm is composed of two parts: a genetic section and a memetic section. In each iteration, evolution is performed in the genetic section and local search is performed in the memetic section. The two sections are described in what follows.

Genetic section for proposed algorithm

The genetic section consists of a population of chromosomes of size 1 (N = 1). In this section, an ICLA isomorphic to the input graph is generated: each vertex $v_i$ of the graph is associated with a cell $c_i$ of the ICLA, which is equipped with a learning automaton $LA_i$. The action set of $LA_i$ is the set of colors with which cell $c_i$ can be colored. The resulting ICLA can be modeled by a pair $\langle\underline{A},\underline{a}\rangle$, where $\underline{A}=\{LA_1,LA_2,\dots,LA_n\}$ is the set of learning automata corresponding to the vertex set of the graph and $\underline{a}=\{\underline{a}_1,\underline{a}_2,\dots,\underline{a}_n\}$ denotes the action sets of the learning automata. Here $\underline{a}_i=\{a_i^{1},a_i^{2},\dots,a_i^{\Delta_i+1}\}$ is the set of actions (colors) that can be taken by $LA_i$, that is, the set of colors with which vertex $v_i$ can be colored, $n$ is the number of vertices, and $\Delta_i$ is the degree of vertex $v_i$. It has been shown in [42] that an arbitrary graph can be colored with at most $\Delta+1$ colors, where $\Delta$ denotes the maximum degree of the graph. It follows that vertex $v_i$ and its neighboring vertices can be colored with at most $\Delta_i+1$ colors in the worst case. That is why, in the genetic section of the proposed algorithm, the action set of $LA_i$ (corresponding to the color set of vertex $v_i$) is composed of $\Delta_i+1$ actions (colors).
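As a small illustration of this construction, the following Python sketch builds one action probability vector per vertex, with $\Delta_i+1$ entries, and lets every automaton draw a color. It assumes that each vector starts out uniform, which is how the first stage of the genetic section initializes it; the adjacency-dictionary representation and the names `build_icla` and `sample_coloring` are hypothetical.

```python
import random
from typing import Dict, List, Set


def build_icla(adj: Dict[int, Set[int]]) -> Dict[int, List[float]]:
    """One automaton per vertex: a uniform vector over Delta_i + 1 colors."""
    return {v: [1.0 / (len(nbrs) + 1)] * (len(nbrs) + 1)
            for v, nbrs in adj.items()}


def sample_coloring(icla: Dict[int, List[float]],
                    rng: random.Random) -> Dict[int, int]:
    """Every automaton draws a color in 1..Delta_i + 1 from its own vector."""
    return {v: rng.choices(range(1, len(p) + 1), weights=p, k=1)[0]
            for v, p in icla.items()}


# Hypothetical example: a star with center 0 and three leaves.
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
icla = build_icla(adj)                       # center has 4 actions, each leaf has 2
coloring = sample_coloring(icla, random.Random(1))
```

Sizing each action set by the local degree rather than by the global maximum degree keeps the search space of low-degree vertices small.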

The genetic section operates in a number of stages. Stage t consists of two steps. In the first step, the learning automaton of each cell chooses one of its actions (colors) at random according to its action probability vector, whose entries are initially set to $\frac{1}{\Delta_i+1}$. The actions (colors) chosen by the set of learning automata form a new candidate coloring of the graph. The number of distinct colors selected by the learning automaton at cell $c_i$ and by its neighboring cells is called the color-degree of $c_i$ and is denoted by $D_i$. The action (color) selected by $LA_i$ at cell $c_i$ is penalized if the same action (color) is also selected by the learning automaton of at least one neighboring cell, or if the color-degree $D_i$ is greater than the cell's dynamic threshold $T_i$. The dynamic threshold $T_i$, which retains the minimum value of $D_i$ seen so far for cell $c_i$, can be initially set to $\Delta_i+1$. Otherwise, the action (color) selected by $LA_i$ is rewarded. In the second step of stage t, the new solution, formed by combining the selected actions (colors) of all LAs, replaces the current chromosome if its genetic fitness is better than that of the current chromosome. Otherwise, all automata penalize their selected actions. The genetic fitness function is based on the number of edges whose endpoints belong to the same color class. Formally, the genetic fitness of the only chromosome at stage t is $GF(t)=1/\left(1+\sum_{i=1}^{k}|E_i|\right)$, where $|E_i|$ is the number of edges with both endpoints in color class $C_i$.
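The genetic fitness and the local reward/penalty decision of the first step can be sketched in Python as below. This is only an illustration under assumptions: the graph is an adjacency dictionary with integer vertex labels, the thresholds $T_i$ are supplied by the caller, and the function names are hypothetical. How and when $T_i$ is lowered is left out because the text does not fix those details.

```python
from typing import Dict, Set


def genetic_fitness(adj: Dict[int, Set[int]], coloring: Dict[int, int]) -> float:
    """GF = 1 / (1 + number of edges whose endpoints share a color class)."""
    conflicting_edges = sum(
        1 for u in adj for v in adj[u]
        if u < v and coloring[u] == coloring[v]   # count each undirected edge once
    )
    return 1.0 / (1.0 + conflicting_edges)


def color_degree(adj: Dict[int, Set[int]], coloring: Dict[int, int], i: int) -> int:
    """D_i: number of distinct colors chosen by cell i and its neighbors."""
    return len({coloring[i]} | {coloring[j] for j in adj[i]})


def is_penalized(adj: Dict[int, Set[int]], coloring: Dict[int, int],
                 thresholds: Dict[int, int], i: int) -> bool:
    """Penalize LA_i if a neighbor chose the same color or if D_i exceeds T_i."""
    clash = any(coloring[j] == coloring[i] for j in adj[i])
    return clash or color_degree(adj, coloring, i) > thresholds[i]
```

A feasible coloring has no conflicting edges and therefore reaches the maximum fitness of 1, while each additional conflicting edge lowers the fitness.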

Memetic section for proposed algorithm

The memetic section consists of a pool of memes of size L, where L is the number of local search methods. Each local search method corresponds to one meme in the pool, and each meme stores the history of the effect of its corresponding local search method. Each meme is composed of n learning automata, each of which corresponds to a gene of the current chromosome; specifically, the j-th automaton corresponds to the j-th gene. The j-th learning automaton has $\Delta_j+1$ actions, corresponding to the $\Delta_j+1$ possible colors of vertex $v_j$. The history of the effect of the i-th local search method at stage t is represented by the action probability vectors of the learning automata in the i-th meme, as given in Eq. (3).

$$\underline{M}^{i}(t)=\left[\underline{M}_1^{i}(t),\ \underline{M}_2^{i}(t),\ \dots,\ \underline{M}_n^{i}(t)\right]\tag{3}$$

where

$$\underline{M}_j^{i}(t)=\left[M_{j1}^{i}(t),\ M_{j2}^{i}(t),\ \dots,\ M_{j(\Delta_j+1)}^{i}(t)\right]^{\prime},\quad 1\le j\le n,\quad\text{and}\quad\sum_{p=1}^{\Delta_j+1}M_{jp}^{i}(t)=1\quad\forall i,j.$$

$M_{jp}^{i}(t)$ denotes the probability that action p of the j-th learning automaton in meme i (corresponding to color p for gene j) is selected. The probability of each action (color) of the j-th learning automaton is initially set to $\frac{1}{\Delta_j+1}$. The action probability vectors of a meme (the history) are updated by a learning algorithm according to the reinforcement signal received from the genetic section after the corresponding local search method has been applied. The aim of the set of learning automata of a meme is to find those colors of the genes of the chromosome that result in a better solution. That is, the learning automata of a meme collectively find a set of actions (colors) that minimizes the average penalty received from the environment (the genetic section).

$MF_{\beta}(t)=\prod_{j=1}^{n}M_{jp}^{\beta}(t)$, where p is the action (color) selected by automaton j in meme β, is the probability that the current chromosome is generated by applying local search method β at stage t. $MF_{\beta}(t)$ is referred to as the memetic fitness of the current chromosome after applying meme β at stage t. The memetic fitness changes when the action probability vectors of the learning automata in the meme are updated. The update depends on the result of applying the corresponding local search method to the current chromosome: if the value of gene j is the same before and after applying local search method β, the action of the j-th learning automaton of meme β that corresponds to the value of the j-th gene is rewarded; otherwise it is penalized. It has been shown in [43] that applying a local search method β to the current chromosome is beneficial if $\frac{(1-MF_{\beta}(t))^{2}}{MF_{\beta}(t)}>\frac{1-GF(t)}{GF(t)}$. The proposed algorithm runs for a number of iterations (generations); in each one, evolution is performed in the genetic section and then local search is performed in the memetic section. This process continues until, for every automaton in the ICLA, the choice probability of at least one action (color) exceeds a pre-specified threshold $\pi$.
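The memetic fitness and the benefit condition quoted from [43] can be evaluated as in the following Python sketch. It assumes that a meme is stored as a list of probability vectors (one per gene) and that colors are encoded as 0-based indices into those vectors; both choices, as well as the function names and the handling of a vanishing product, are illustrative assumptions rather than details given in the paper.

```python
from typing import Sequence


def memetic_fitness(meme: Sequence[Sequence[float]],
                    chromosome: Sequence[int]) -> float:
    """MF_beta(t): product over genes j of the probability that automaton j
    of the meme assigns to the color currently held by gene j."""
    mf = 1.0
    for probs, color in zip(meme, chromosome):
        mf *= probs[color]
    return mf


def local_search_is_beneficial(mf: float, gf: float) -> bool:
    """Condition from [43]: (1 - MF)^2 / MF > (1 - GF) / GF."""
    if mf <= 0.0:
        # Assumption: a vanishing product (e.g. floating-point underflow for
        # large n) makes the left-hand side unbounded, so the test passes.
        return True
    return (1.0 - mf) ** 2 / mf > (1.0 - gf) / gf
```

For graphs with many vertices the product of n probabilities becomes extremely small, so an implementation may prefer to accumulate logarithms; that refinement is not shown here.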

The relationship between the genetic section and the memetic section of the proposed algorithm is shown in Fig. 2.

Different local search methods have been proposed for the vertex coloring problem in the literature [8]. Next, we briefly review the five local search methods used in the proposed algorithm.

1-exchange local search

The most frequently used local search is the 1-exchange, in which one vertex $v_i$ is selected at random and moved from its current color class $C_j$ into a different color class $C_l$, where $l\ne j$ and $l,j\in\{1,\dots,\Delta_i+1\}$.

Restricted 1-exchange local search

The restricted 1-exchange local search is a 1-exchange that changes only the color class of vertices that are in conflict, since only these modifications can lead to an increase of the fitness function.
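Both variants can be sketched with a single Python function. This is an illustrative sketch, not the authors' implementation: the adjacency-dictionary representation, the function name `one_exchange`, and the choice of a uniformly random target color are assumptions.

```python
import random
from typing import Dict, Set


def one_exchange(adj: Dict[int, Set[int]], coloring: Dict[int, int],
                 rng: random.Random, restricted: bool = False) -> Dict[int, int]:
    """Move one randomly chosen vertex into a different color class.

    With restricted=True only vertices that are in conflict are candidates.
    """
    if restricted:
        candidates = [v for v in adj
                      if any(coloring[u] == coloring[v] for u in adj[v])]
    else:
        candidates = list(adj)
    if not candidates:
        return dict(coloring)            # nothing to move
    v = rng.choice(candidates)
    # Colors available to v are 1..Delta_v + 1, excluding its current color.
    palette = [c for c in range(1, len(adj[v]) + 2) if c != coloring[v]]
    if not palette:
        return dict(coloring)            # isolated vertex already using color 1
    new_coloring = dict(coloring)
    new_coloring[v] = rng.choice(palette)
    return new_coloring
```

On a feasible coloring the restricted variant has no candidates and returns the coloring unchanged, which reflects the observation that only moves of conflicting vertices can increase the fitness.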

Swap local search

In the swap local search, exactly one vertex $v\in V^{C}$ exchanges its color class with another vertex $u\in V$.

Cyclic exchange local search

An extension of the 1-exchange and swap local searches is the cyclic exchange local search, which is a sequence of 1-exchange moves. A cyclic exchange of length m acts on a sequence of distinctly colored vertices $(u_1,\dots,u_m)$. For simplicity, we denote the color class of any $u_i$, $i=1,\dots,m$, by $C_i$. The cyclic exchange local search moves every vertex $u_i$, $i=1,\dots,m$, from $C_i$ into $C_{i+1}$, with the convention $C_{m+1}=C_1$. A cyclic exchange does not change the cardinality of the color classes involved in the move. Figure 3 (a) gives an example of the cyclic exchange local search method.
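The cyclic exchange can be sketched directly from this definition. The dict-based coloring and the names `cyclic_exchange` and `class_sizes` are assumptions for illustration; how the vertex sequence is selected is left to the caller.

```python
from collections import Counter

def cyclic_exchange(coloring, vertices):
    """Move each u_i into the class of u_{i+1}; u_m moves into the class of u_1."""
    colors = [coloring[u] for u in vertices]
    if len(set(colors)) != len(colors):
        raise ValueError("vertices must be distinctly colored")
    new_coloring = dict(coloring)
    m = len(vertices)
    for i, u in enumerate(vertices):
        new_coloring[u] = colors[(i + 1) % m]  # convention C_{m+1} = C_1
    return new_coloring

def class_sizes(coloring):
    """Number of vertices in each color class."""
    return Counter(coloring.values())

coloring = {"a": 0, "b": 1, "c": 2, "d": 0}
moved = cyclic_exchange(coloring, ["a", "b", "c"])
print(moved)
print(class_sizes(moved) == class_sizes(coloring))
```

The final comparison checks the property stated above: every class involved loses one vertex and gains one, so its cardinality is unchanged.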

Path exchange local search

The path exchange local search is the same as the cyclic exchange local search except that $u_m$ remains in $C_m$. Hence the sequence of exchanges is not closed, and the cardinalities of $C_1$ and $C_m$ are modified. Figure 3 (b) gives an example of the path exchange local search method.

The pseudocode of the proposed algorithm is given in Fig. 4.
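A path exchange can be sketched in the same style. The dict-based coloring and the names `path_exchange` and `class_sizes` are assumptions for illustration; the vertices passed in are taken to be distinctly colored, as in the definition.

```python
from collections import Counter

def path_exchange(coloring, vertices):
    """Move each u_i (i < m) into the class of u_{i+1}; u_m stays in C_m."""
    colors = [coloring[u] for u in vertices]
    if len(set(colors)) != len(colors):
        raise ValueError("vertices must be distinctly colored")
    new_coloring = dict(coloring)
    for i in range(len(vertices) - 1):
        new_coloring[vertices[i]] = colors[i + 1]
    return new_coloring

def class_sizes(coloring):
    """Number of vertices in each color class."""
    return Counter(coloring.values())

coloring = {"a": 0, "b": 1, "c": 2, "d": 0}
moved = path_exchange(coloring, ["a", "b", "c"])
print(class_sizes(coloring), class_sizes(moved))
```

In this example the class of the first vertex shrinks by one and the class of the last vertex grows by one, while the intermediate class keeps its size.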

In this section, several experiments are conducted to show the efficiency of the proposed algorithm. The results of the proposed algorithm are compared with those of the best-known vertex coloring algorithms: TPA [4], AMACOL [5], ILS [6], CHECKCOL [7], GLS [8] and CLAVCA [9]. The obtained results show the superiority of the new algorithm over the other algorithms in terms of running time and the number of colors required for coloring the benchmarks. In these experiments, the learning rate of the learning automata is set to 0.1, and the algorithm terminates when, for every automaton in the ICLA, the choice probability of at least one action (color) is 0.95 or greater. The experiments use a database of hard-to-color benchmarks: DSJ graphs [44], Leighton graphs [45] and WAP graphs [4]. The characteristics of the benchmark graphs are given in Table 1, Table 3 and Table 5, respectively. Table 2, Table 4 and Table 6 compare the results obtained by the different algorithms. In these tables, for each algorithm, the first column gives the number of colors required for coloring the graph (color) and the second column gives the running time of the algorithm in seconds (time).
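The stopping rule with these parameter values can be sketched as follows. The convergence test follows the text; the `reward` function is an assumption, namely the standard linear reward step with learning rate a, since this section does not spell out the update rule. Both function names are hypothetical.

```python
def reward(probs, chosen, a=0.1):
    """Linear reward step (assumed rule): raise probs[chosen], shrink the rest."""
    return [p + a * (1.0 - p) if i == chosen else (1.0 - a) * p
            for i, p in enumerate(probs)]

def has_converged(action_probs, threshold=0.95):
    """True when every automaton has some action with probability >= threshold."""
    return all(max(probs) >= threshold for probs in action_probs)

# Count consecutive rewards needed by one two-action automaton starting at 0.5.
probs, steps = [0.5, 0.5], 0
while not has_converged([probs]):
    probs = reward(probs, chosen=0)
    steps += 1
print(steps, round(probs[0], 4))
```

Under this assumed update, the gap 1 - p shrinks by a factor of 0.9 per reward, so an automaton starting from a uniform two-action vector needs 22 consecutive rewards of the same action to reach the 0.95 threshold.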

The first class of benchmark graphs on which the proposed algorithm is tested is the DSJ graphs. This class consists of graphs with a variable number of vertices n, in which each of the n(n-1)/2 possible edges is generated independently at random with probability p. DSJ graphs are denoted as DSJCn.p, where n ∈ {125, 250, 500, 1000} and p ∈ {0.1, 0.5, 0.9}. The characteristics of the DSJ benchmark graphs are given in Table 1.

Table 1

The characteristics of DSJ benchmark graphs
Graph Name	Number of vertices	Number of edges	Density
DSJC125.1	125	736	0.09
DSJC125.5	125	3891	0.5
DSJC125.9	125	6961	0.9
DSJC250.1	250	3218	0.1
DSJC250.5	250	15668	0.5
DSJC250.9	250	27897	0.9
DSJC500.1	500	12458	0.1
DSJC500.5	500	62624	0.5
DSJC500.9	500	112437	0.9
DSJC1000.1	1000	49629	0.1
DSJC1000.5	1000	249826	0.5
DSJC1000.9	1000	449449	0.9
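The random-graph model behind the DSJ instances, and the density column of Table 1, can be illustrated with a short sketch. The function names are hypothetical, and the generator below only follows the model described in the text; it does not reproduce the actual benchmark files, which were produced with their own random seeds. The density is taken here as the number of edges divided by n(n-1)/2, which appears consistent with the values in Table 1.

```python
import random

def random_graph(n, p, seed=None):
    """Include each of the n(n-1)/2 possible edges independently with probability p."""
    rng = random.Random(seed)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

def density(n, num_edges):
    """Fraction of the n(n-1)/2 possible edges that are present (n >= 2 assumed)."""
    return num_edges / (n * (n - 1) / 2)

# Edge counts taken from Table 1.
print(round(density(125, 3891), 2))     # DSJC125.5
print(round(density(1000, 449449), 2))  # DSJC1000.9

# A freshly generated graph with n = 125 and p = 0.5 (not the benchmark instance).
edges = random_graph(125, 0.5, seed=0)
print(len(edges), round(density(125, len(edges)), 2))
```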

Table 2 shows the results of the simulation experiments conducted on the DSJ benchmark graphs. Comparing the results reported in Table 2, we find that, in most cases, GLS outperforms the others in terms of running time. AMACOL always selects the smallest color-set for coloring the graphs, and CHECKCOL does so on all instances except DSJC1000.1 and DSJC1000.9, while their running times are among the longest. GLS uses the largest number of colors, and AMACOL the smallest. Comparing LAMACOL with GLS, the running time of LAMACOL is comparable to that of GLS, while its color-set is never larger and, on the larger instances, markedly smaller than that of GLS. On the other hand, comparing the results of the proposed algorithm with those of AMACOL, we find that, in most cases, the color-sets chosen by LAMACOL are as small as those of AMACOL, while the running time of LAMACOL is considerably shorter than that of AMACOL.

Table 2

A performance comparison of the proposed algorithm and the most effective coloring algorithms on DSJ benchmarks graphs
Graph Name	TPA		AMACOL		ILS		CHECKCOL		GLS		CLAVCA		LAMACOL
Graph Name	color	time	color	time	color	time	color	time	color	time	color	time	color	time
DSJC125.1	5	0	5	0	5	0	5	0	5	0	5	0	5	0.01
DSJC125.5	19	289	17	125	17	2	17	110	18	0	17	12	17	10.5
DSJC125.9	44	5	44	57	44	0	44	4	44	0	44	19	44	16.3
DSJC250.1	8	10	8	12	8	0	8	28	9	0	8	7	9	6.5
DSJC250.5	30	3282	28	64	28	34	28	557	30	1	28	27	28	22.3
DSJC250.9	72	155	72	2604	72	6	72	182	73	6	73	32	72	45.6
DSJC500.1	12	0	12	9	13	0	12	4	13	0	12	20	13	14.5
DSJC500.5	48	124	48	326	50	106	48	1789	52	81	48	42	48	34.8
DSJC500.9	127	1268	126	1710	128	82	126	2045	130	154	126	70	126	84.6
DSJC1000.1	21	28	20	969	21	6	21	142	22	1	21	39	22	24.5
DSJC1000.5	84	2386	84	9235	91	303	84	7025	93	546	84	88	85	102.2
DSJC1000.9	226	3422	224	4937	228	2245	226	12545	234	1621	224	119	224	98.5

Leighton benchmark graphs are the second class on which the proposed algorithm is tested. Leighton graphs are random graphs of density below 0.25, constructed by first partitioning the vertices into k distinct sets representing the color classes and then assigning edges only between vertices that belong to different sets. The chromatic number of these graphs is guaranteed to be k by embedding cliques of sizes ranging from 2 to k into the graph. The Leighton benchmark graphs are denoted as le450_kx, where 450 is the number of vertices, k is the chromatic number of the graph and $x\in\{a,b,c,d\}$ is a letter used to distinguish different graphs with the same characteristics, with the c and d graphs having a higher edge density than the a and b ones. The characteristics of the Leighton benchmark graphs are given in Table 3.

Table 3

The characteristics of Leighton benchmark graphs
Graph Name	Number of vertices	Number of edges	Density
Le450_15a	450	8168	0.08
Le450_15b	450	8169	0.08
Le450_15c	450	16680	0.17
Le450_15d	450	16750	0.17
Le450_25c	450	17343	0.17
Le450_25d	450	17425	0.17
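The partition-based construction described for the Leighton graphs can be illustrated with a simplified sketch. This is not Leighton's generator: it only plants a k-partition and adds edges between different sets with an assumed uniform probability p, and it omits the clique-embedding step that fixes the chromatic number at exactly k. The function name and parameters are hypothetical.

```python
import random

def planted_partition_graph(n, k, p, seed=None):
    """Return (part, edges) where edges only join vertices of different sets.

    part[v] is the planted color class of vertex v (balanced assignment assumed).
    Clique embedding, as used in the real Leighton graphs, is omitted.
    """
    rng = random.Random(seed)
    part = [v % k for v in range(n)]
    edges = [(u, v) for u in range(n) for v in range(u + 1, n)
             if part[u] != part[v] and rng.random() < p]
    return part, edges

part, edges = planted_partition_graph(450, 15, 0.09, seed=0)
# The planted partition is a proper coloring with at most k colors.
print(all(part[u] != part[v] for u, v in edges), len(edges))
```

Because no edge joins two vertices of the same set, the planted partition is always a proper k-coloring, so the chromatic number of such a graph is at most k.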

Table 4 shows the results of the experiments conducted on the Leighton benchmark graphs. Comparing the results reported in Table 4, we observe that, in most cases, ILS outperforms the other algorithms in terms of running time. It can also be seen that CHECKCOL, in most cases, selects the smallest color-set for coloring the benchmark graphs, while its running time is the worst. Comparing the results of LAMACOL with the minimum number of colors required for coloring the benchmark graphs, we observe that the size of the color-set produced by the proposed algorithm is equal to this minimum on every instance except le450_25c, where it uses 26 colors. It can also be seen that the proposed algorithm significantly outperforms CHECKCOL in terms of running time.

Table 4

A performance comparison of the proposed algorithm and the most effective coloring algorithms on Leighton benchmarks graphs
Graph Name	TPA		AMACOL		ILS		CHECKCOL		GLS		CLAVCA		LAMACOL
Graph Name	color	time	color	time	color	time	color	time	color	time	color	time	color	time
Le450_15a	15	1444	15	345	15	0	15	2145	15	2	15	23	15	46.2
Le450_15b	15	1655	15	345	15	0	15	2756	15	0	15	18	15	16.5
Le450_15c	15	82	15	2	15	19	15	4534	15	6	15	30	15	12.3
Le450_15d	15	34	15	4	15	20	15	4576	15	8	15	32	15	14.7
Le450_25c	26	44	26	93	26	2	25	3477	26	18	25	30	26	8.8
Le450_25d	26	22	26	10	26	1	25	4524	26	2	25	46	25	18.5

The third class of benchmark graphs on which the proposed algorithm is tested is the WAP graphs, which arise in the design of transparent optical networks, where each vertex corresponds to a light path in the network and edges correspond to intersecting paths. These graphs are denoted by WAP0ma, where m ∈ {1,2,…,8}. They have between 905 and 5231 vertices, and all instances have a clique of size 40. The characteristics of the WAP benchmark graphs are given in Table 5.

Table 5

The characteristics of WAP benchmark graphs
Graph Name	Number of vertices	Number of edges	Density
WAP01a	2368	110871	0.04
WAP02a	2464	111742	0.04
WAP03a	4730	286722	0.03
WAP04a	5231	294902	0.02
WAP06a	947	43571	0.1
WAP07a	1809	103368	0.06
WAP08a	1870	104176	0.06

Table 6 shows the results of the experiments conducted on the WAP benchmark graphs. Comparing the results given in Table 6, we find that TPA and GLS outperform AMACOL, ILS and CHECKCOL in terms of the number of colors required for coloring the WAP benchmark graphs. It can also be seen that ILS outperforms the others in terms of running time. The results reported in Table 6 show that, in most cases, both the running time and the size of the color-sets produced by the proposed algorithm are smaller than those of the best reported results.

Table 6

A performance comparison of the proposed algorithm and the most effective coloring algorithms on WAP benchmarks graphs
Graph Name	TPA		AMACOL		ILS		CHECKCOL		GLS		CLAVCA		LAMACOL
Graph Name	color	time	color	time	color	time	color	time	color	time	color	time	color	time
Wap01a	42	245	45	345	44	2	44	568	42	55	42	16	42	12.5
Wap02a	41	1618	44	802	43	251	43	486	41	160	41	28	41	16.5
Wap03a	44	17	53	245	46	365	46	689	44	782	43	51	44	38.4
Wap04a	43	95	48	45	44	484	44	23	43	834	43	47	43	36.4
Wap06a	41	348	44	545	42	1	42	25	41	8	40	3	41	2.5
Wap07a	42	541	45	89	44	1	44	182	42	215	40	14	41	16.5
Wap08a	42	200	45	446	43	56	44	22	42	41	42	26	42	18.4

In this paper, a new learning automata-based memetic algorithm was proposed for optimization problems. The proposed algorithm is composed of two parts, a genetic section and a memetic section. Evolution is performed in the genetic section and local search is performed in the memetic section. The genetic section operates based on the irregular cellular learning automata (ICLA), which is a generalization of cellular learning automata (CLA) in which the restriction of a rectangular grid structure is removed. In the genetic section, a new solution for coloring the graph is generated, in which each vertex chooses its color based on the colors selected by its adjacent vertices. In the memetic section, local searches are applied to the new solution generated in the genetic section, and the histories of the local searches are saved in a set of memes. To show the efficiency of the proposed algorithm, several computer simulations were conducted on hard-to-color benchmark graphs. The obtained results showed that the proposed algorithm outperforms the well-known algorithms in terms of both the required number of colors and the running time.

Author Contribution

M. Rezapoor Mirsaleh wrote the main manuscript text and prepared figures with support from M.R. Meybodi. All authors reviewed the manuscript.

Karp, R.M.: Reducibility among combinatorial problems. In: Miller, R.E., Thatcher, J.W. (eds.) Complexity of Computer Computations, pp. 85–103. Plenum (1972)
Narendra, K.S., Thathachar, M.A.L.: Learning automata: an introduction. Prentice-Hall, Inc. (1989)
Thathachar, M.A.L., Sastry, P.S.: Varieties of learning automata: an overview. IEEE Trans. Syst. Man. Cybernetics Part. B: Cybernetics. 32, 711–722 (2002)
Caramia, M., Dell’Olmo, P.: Embedding a novel objective function in a two-phased local search for robust vertex coloring. Eur. J. Oper. Res. 189, 1358–1380 (2008)
Galinier, P., Hertz, A., Zufferey, N.: An adaptive memory algorithm for the k-coloring problem. Discrete Appl Math. 156, 267–279 (2008)
Lourenço, H.R., Martin, O., Stützle, T.: Iterated local search. In: Glover, F., Kochenberger, G. (eds.) Handbook of Metaheuristics, pp. 321–353 (2002)
Caramia, M., Dell’Olmo, P., Italiano, G.F.: CHECKCOL: Improved local search for graph coloring. J. Discrete Algorithms. 4, 277–298 (2006)
Chiarandini, M., Dumitrescu, I., Stützle, T.: Stochastic local search algorithms for the graph colouring problem. In: Handbook of Approximation Algorithms and Metaheuristics, pp. 1–63 (2007)
Akbari Torkestani, J., Meybodi, M.R.: A cellular learning automata-based algorithm for solving the vertex coloring problem. Expert Syst. Appl. 38, 9237–9247 (2011)
Lawler, E.L.: A note on the complexity of the chromatic number problem. Inform. Process. Lett. 5, 66–67 (1976)
Byskov, J.M.: Exact algorithms for graph colouring and exact satisfiability. Oper. Res. Lett. 32, 547–556 (2004)
Beigel, R., Eppstein, D.: 3-coloring in time O(1.3289^n). J. Algorithms. 54, 168–204 (2005)
Byskov, J.M.: Enumerating maximal independent sets with applications to graph colouring. Oper. Res. Lett. 32, 547–556 (2004)
Rezapoor Mirsaleh, M., Meybodi, M.R.: LA-MA: A new memetic model based on learning automata. In: Proceedings of the 18th National Conference of Computer Society of Iran, Tehran, Iran (2013)
Hertz, A., de Werra, D.: Using tabu search techniques for graph coloring. Computing. 39, 345–351 (1987)
Fleurent, C., Ferland, J.A.: Genetic and hybrid algorithms for graph coloring. Ann. Oper. Res. 63, 437–461 (1996)
Dorne, R., Hao, J.: A new genetic local search algorithm for graph coloring (1998)
Galinier, P., Hao, J.-K.: Hybrid evolutionary algorithms for graph coloring. J. Comb. Optim. 3, 379–397 (1999)
Amin, A., Jahanshahi, M., Meybodi, M.R.: Improved Learning-Automata-Based Clustering Method for Controlled Placement Problem in SDN. Appl. Sci. 13, 18: 10073 (2023). https://doi.org/10.3390/app131810073
Khaksar Manshad, M., Meybodi, M.R., Salajegheh, A.: New Cellular Learning Automata as a framework for online link prediction problem. J. Exp. Theor. Artif. Intell. (2023). https://doi.org/10.1080/0952813X.2023.2188261
Akbari Torkestani, J., Meybodi, M.R.: Learning automata-based algorithms for finding minimum weakly connected dominating set in stochastic graphs. Int. J. Uncertain. Fuzziness Knowledge-Based Syst. 18, 721–758 (2010)
Akbari Torkestani, J.: An adaptive focused Web crawling algorithm based on learning automata. Appl. Intell. 37, 586–601 (2012)
Akbari Torkestani, J., Meybodi, M.R.: An efficient cluster-based CDMA/TDMA scheme for wireless mobile ad-hoc networks: A learning automata approach. J. Netw. Comput. Appl. 33, 477–490 (2010)
Akbari Torkestani, J., Meybodi, M.R.: Mobility-based multicast routing algorithm for wireless mobile Ad-hoc networks: A learning automata approach. Comput. Commun. 33, 721–735 (2010)
Akbari Torkestani, J., Meybodi, M.R.: An intelligent backbone formation algorithm for wireless ad hoc networks based on distributed learning automata. Comput. Netw. 54, 826–843 (2010)
Jahanshahi, M., Dehghan, M., Meybodi, M.: LAMR: learning automata based multicast routing protocol for multi-channel multi-radio wireless mesh networks. Appl. Intell. 38, 58–77 (2013)
Meybodi, M.R.: Learning automata and its application to priority assignment in a queueing system with unknown characteristics. Ph.D. thesis, Department of Electrical Engineering and Computer Science, University of Oklahoma, Norman, Oklahoma, USA (1983)
Tsetlin, M.L.: Automaton theory and modeling of biological systems, vol. 102. Academic Press, New York (1973)
Hashim, A., Amir, S., Mars, P.: Application of learning automata to data compression. Adapt. Learn. Syst., pp. 229–234 (1986)
Manjunath, B., Chellappa, R.: Stochastic learning networks for texture segmentation. In: Twenty-Second Asilomar Conference on Signals, Systems and Computers, pp. 511–516 (1988)
Oommen, B.J., Hansen, E.: List organizing strategies using stochastic move-to-front and stochastic move-to-rear operations. SIAM J. Comput. 16, 705–716 (1987)
Oommen, B.J., Ma, D.C.Y.: Deterministic learning automata solutions to the equipartitioning problem. IEEE Trans. Comput. 37, 2–13 (1988)
Frost, G.P.: Stochastic optimisation of vehicle suspension control systems via learning automata. Ph.D. thesis, Department of Aeronautical and Automotive Engineering, Loughborough University, Loughborough (1998)
Howell, M., Frost, G., Gordon, T., Wu, Q.: Continuous action reinforcement learning applied to vehicle suspension control. Mechatronics. 7, 263–276 (1997)
Unsal, C., Kachroo, P., Bay, J.S.: Multiple stochastic learning automata for vehicle path control in an automated highway system. IEEE Trans. Syst. Man. Cybernetics Part. A: Syst. Hum. 29, 120–128 (1999)
Beigy, H., Meybodi, M.R.: A learning automata-based algorithm for determination of the number of hidden units for three-layer neural networks. Int. J. Syst. Sci. 40, 101–118 (2009)
Meybodi, M.R., Beigy, H.: Neural network engineering using learning automata: determining of desired size of three layer feed forward neural networks. J. Fac. Eng. 34, 1–26 (2001)
Oommen, B.J., Croix, D.S.: String taxonomy using learning automata. IEEE Trans. Syst. Man. Cybernetics Part. B: Cybernetics. 27, 354–365 (1997)
Barto, A.G., Jordan, M.I.: Gradient following without back-propagation in layered networks. In: 1st Int. Conference Neural Nets, San Diego (1987)
Thathachar, M., Phansalkar, V.V.: Learning the global maximum with parameterized learning automata. IEEE Trans. Neural Networks. 6, 398–406 (1995)
Cheeseman, P., Kanefsky, B., Taylor, W.M.: Where the really hard problems are. In: Proceedings of IJCAI’91, pp. 331–337 (1991)
Galinier, P., Hertz, A.: A survey of local search methods for graph coloring. Comput. Oper. Res. 33, 2547–2562 (2006)
Rezapoor Mirsaleh, M., Meybodi, M.R.: A New Criteria for Creating Balance Between Local and Global Search in Memetic Algorithms. Iran. J. Electr. Comput. Eng. (IJECE). 12, 31–37 (2014)
Johnson, D.S., Aragon, C.R., McGeoch, L.A., Schevon, C.: Optimization by simulated annealing: an experimental evaluation; part II, graph coloring and number partitioning. Oper. Res. 39, 378–406 (1991)
Leighton, F.T.: A graph coloring algorithm for large scheduling problems. J. Res. Natl. Bur. Stand. 84, 489–506 (1979)

No competing interests reported.

Reviewers invited by journal
27 Aug, 2024
Editor assigned by journal
21 Aug, 2024
Submission checks completed at journal
21 Aug, 2024
First submitted to journal
20 Aug, 2024
