
Rubik’s Cube

Ibanescu Diana Dudeanu Ermoghen Cracaoanu Sergiu

Abstract

There are many algorithms for solving a scrambled Rubik's Cube. It is not known how many moves are the minimum required to solve any instance of the Rubik's Cube. When discussing the length of a solution, there are two common ways to measure it. The first is to count the number of quarter turns; the second is to count the number of face turns. A move like a half turn of the front face is counted as 2 moves in the quarter-turn metric and as only 1 turn in the face-turn metric. A center square always remains a center square no matter how you turn the cube.

This paper presents an evolutionary approach that tries to solve Rubik's Cube. A chromosome represents a chain of moves that takes the cube from one state to another. The cube starts from the final (solved) state. We apply a number of randomly chosen transformations to this final state, and the resulting state becomes the initial state.

The operators used by the algorithm are mutation and crossover. How these operators are applied depends on the stage of the algorithm; they are used to maintain a balance between exploration and exploitation. If a local optimum is reached, exploration is increased to avoid getting stuck.

The fitness function depends on the number of squares that are in their final position and on the number of crosses the cube has. In fact, two fitness functions are used.

The implemented algorithm does not find the solution every time, but it offers an alternative that can find a solution in a short time, depending on the input cube.

Contents

I. Introduction

II. Deterministic algorithms

II.1. Simple solution

II.2. Ortega's Method

III. Genetic Algorithm

III.1. Cube representation

III.2. The components of the genetic algorithm

IV. Evolutionary computing SDK's

IV.1. ECJ

IV.2. Origin

V. Experiments

I. Introduction

Rubik's cube is a cube whose edge length is approximately 56 mm (2.2 in). It consists of 26 smaller cubes; one side is made up of 3x3 such cubes. If you take an ordinary cube and cut it as you would a Rubik's cube, you are left with 27 smaller cubes; the extra piece comes from the center of the cube, and this center is not present in a Rubik's cube. In fact, the smaller pieces of the Rubik's cube are not really cube-shaped. We call these 26 small pieces cubbies.

There are 6 cubbies in the middle of the individual sides; these are rigidly interconnected by a six-armed spatial cross (they merely rotate around their axles and keep the other cubbies from falling off the cube). I will call those cubbies centers. The centers are connected to the cross by means of screws and springs, which makes the other cubbies hold together nice and tightly. The center cubbies have a colored sticker on their external sides, which determines the color of each face. The colors are usually white, red, blue, yellow, orange, and green; sometimes a brownish color replaces red. The original Rubik's cube (as well as all of the recently produced Rubik-brand cubes) has white opposite yellow, red opposite orange, and blue opposite green. This coloring has a certain logic, called "plus yellow": the side opposite a primary color (white, red, blue) is obtained by adding yellow (giving yellow, orange, green). Another common coloring instead tries to combine opposite sides with maximum contrast. Each coloring can further have two variants due to mirror reflection.

We also have 12 edge cubbies, or just edges. Those cubbies have two stickers of different colors. All color combinations occur except those of colors that lie on opposite faces.

The last type of cubbie is the corner cubbie, or just corner; there are eight of them. These have three differently colored stickers on their mutually orthogonal sides.

Labeling of moves

As you surely know, the Rubik's cube has 6 sides, and each side can be rotated by a certain angle. This rotation is called a move (or a turn). In order to be able to perform other moves afterwards, you should rotate by 0 (360), 90, 180, or -90 (270) degrees. The 90 and -90 degree turns are called quarter turns. The 180-degree turn is called a double turn (or sometimes, a bit misleadingly, a face turn). Of course, if you rotate a side, you rotate one third of the cube (9 cubbies); this slab is called a layer.

Having a cube in front of you, the individual layers are rotated thus:

• A layer facing towards you is the front layer, and is labeled as F (Front).

• A layer facing away from you is the back layer and is labeled as B (Back).

• A layer which is on top is the up layer and is labeled as U (Up).

• A layer which is at bottom is the down layer and is labeled as D (Down).

• A layer which is on your right is the right layer and is labeled as R (Right).

• A layer which is on your left is the left layer and is labeled as L (Left).

• A layer which is between the left and right layer is labeled as M (Middle).

• A layer which is between the top and bottom layer is labeled as E (Equator).

• A layer which is between the front and back layers is labeled as S (Standing).

The move is labeled by the symbol of the layer you are turning (F, B, U, D, L, R). A symbol by itself labels a clockwise rotation of the layer. For a counterclockwise rotation, the layer symbol is followed by an apostrophe (a single quote). For a 180-degree rotation, the layer symbol is followed by the digit '2' or an exponent ("to the second power"). Turning the M, E, and S layers is called a slice turn. For an M turn the direction is top-down, for M' bottom-up. For an E turn the direction is left-right, for E' right-left. For an S turn the direction is clockwise as seen from the front, for S' counterclockwise. For a clearer understanding, here is an example:

For example F2 U' R M' means: Rotate the front layer by a half-turn (180 degrees in any direction), then rotate the up layer counterclockwise by a quarter-turn (90 degrees), then rotate the right layer clockwise by a quarter-turn (90 degrees) and finally rotate the middle layer in a bottom-up direction by a quarter-turn.
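To make this notation concrete, here is a small Java sketch (not from the original paper; the class and method names are illustrative) that models a move as a layer plus a turn amount and parses a sequence such as "F2 U' R M'":

import java.util.ArrayList;
import java.util.List;

// Minimal model of the move notation described above (illustrative sketch).
public class MoveNotation {

    // The nine turnable layers named in the text.
    public enum Layer { F, B, U, D, L, R, M, E, S }

    // A move is a layer plus a turn amount: 1 = clockwise quarter turn,
    // 2 = half turn, 3 = counterclockwise quarter turn (an apostrophe).
    public static class Move {
        public final Layer layer;
        public final int quarterTurns;

        public Move(Layer layer, int quarterTurns) {
            this.layer = layer;
            this.quarterTurns = quarterTurns;
        }

        @Override
        public String toString() {
            return layer + (quarterTurns == 2 ? "2" : quarterTurns == 3 ? "'" : "");
        }
    }

    // Parses a space-separated sequence such as "F2 U' R M'".
    public static List<Move> parse(String sequence) {
        List<Move> moves = new ArrayList<>();
        for (String token : sequence.trim().split("\\s+")) {
            Layer layer = Layer.valueOf(token.substring(0, 1));
            int turns = 1;
            if (token.endsWith("2")) turns = 2;       // half turn
            else if (token.endsWith("'")) turns = 3;  // counterclockwise
            moves.add(new Move(layer, turns));
        }
        return moves;
    }

    public static void main(String[] args) {
        System.out.println(parse("F2 U' R M'")); // prints [F2, U', R, M']
    }
}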

II. Deterministic algorithms

II.1. Simple solution

This method uses very few sequences that you need to memorize in order to solve the cube. Although quite a few sequences are provided in this solution, most of them are intuitive steps. The method is split into two main steps: solve all eight corners of the cube, then solve all twelve edges (and centers) while keeping the corners solved. This method is an alternative with many advantages:

• Smaller amount of unintuitive sequences (nowadays often computer-generated)

• Generally shorter sequences

• Higher "symmetry" of solution (you solve cube evenly)

• You do not break what you solved in previous steps as much (easier recovery from mistakes)

• It is quite efficient with respect to its simplicity

• You can achieve very good times just by practice using the very basic method.

• You can scale up the method incrementally to gain speed and efficiency

1. Solve Four Bottom Corners

We will start by solving four corners of the cube that share one color (in this case we will select white). This step can be solved intuitively. Select one corner with a white sticker and turn the whole cube so the white sticker of this corner is facing down. You have solved one of the 4 corners this way. Now look for other corners with a white sticker and put them into the bottom layer using the appropriate one of the following sequences. Solve the corners one by one. When searching for the next corner to solve, you may freely turn the top layer to put the corners into the position in which you can apply the sequence.

Pay attention to aligning the side colors of the corners as well; if they do not also match, the corners are not in the correct places. The orange and green colors are just an example; there can be other color combinations (like blue-red, green-red, ...). The cubbie on top, bottom sticker on the front side:

The cubbie on top, bottom sticker on the right side:

If you do not see any situation being similar to one of the first two above (remember that you can freely turn top layer to position the corner to the top-right-front position), the corners are in positions that are more difficult to solve. The following sequences will help you to transform such positions into the ones you should be familiar with already.

The bottom sticker on the top:

The cubbie on bottom, bottom sticker on the front side:

The cubbie on bottom, bottom sticker on the right side:

One possible way to remember the last two sequences is "bring white sticker to the top, put it back (inside layer you just turned), reverse the first step".

2. Place Four Top Corners

To solve the four top corners you will need to temporarily destroy the 4 bottom corners. The question is: how do we destroy and restore the bottom corners so that the top corners become solved? The simplest idea is to remove one bottom corner from its position (using one of the sequences given earlier) and solve it back in a different way. Let us look at an example showing the removal and restoration of the front-right-bottom corner. Remove, position the top layer, and restore the corner (shown applied to a solved cube):

If you look at the result you may notice that the top corners have changed. Two corners are twisted (orange-blue-yellow and red-blue-yellow) and two are swapped (the top-right ones). If we carefully select how to turn the whole cube before applying this corner sequence, so that it affects the right corners, we can solve the top corners with just this one sequence! In this step we will only move the corners to their correct positions, ignoring the way they are twisted. Thus our task is quite simple: apply the corner sequence (possibly several times) to place the corners in their correct positions (use the colors of the side stickers of the bottom corners to find the right ones). As you can notice, the corner sequence swaps the top-right-front and top-right-back corners. You just need to turn the top layer and/or the whole cube (keeping the top layer facing up) to a position where swapping these two top-right corners will place at least one corner in a correct position. When turning the top layer to place the corners, you can always reach one of the following cases:

• All corners are in their positions (although probably twisted) - this step is finished.

• If two adjacent corners can be correctly positioned by turning the top face, then only one swap of the other two corners is necessary (make sure that you turn the cube so that these two corners are in the top-right positions when applying the sequence).

• If two (diagonally) opposite corners can be correctly positioned by turning the top face, then perform a swap of any two top corners and you will obtain the previous situation.

3. Twist Four Top Corners

Now we are able to position the top corners using one (quite simple) corner sequence explained in the previous text, so there is no magic here up to this point. Let us try to follow the same approach for twisting the corners. We can twist (two) corners using the previous corner sequence; however, it also moves corners, which is not good for this step.

(Just a reminder that in this step we want to twist corners and NOT move them, because they were already positioned in the previous step.) The idea behind the corner sequence was to make some change and then undo it in a different way, so that other parts of the cube change while everything solved before remains solved. Let us apply the same idea in this step using our corner sequence: swap two corners using the corner sequence and swap them back from a different angle using the same corner sequence. If we can do so, the corners will be in their correct positions (swap + swap back = nothing), but they will be somehow twisted. Let us try that, but first I must say that swapping two corners back from a different angle requires a left-right mirror of the corner sequence, which is shown below. Mirror version of the corner sequence (shown applied to a solved cube):

Now you can try the presented idea of doing and redoing (in a different way) the corner swap to twist top corners: Normal corner sequence, turn cube, mirrored corner sequence (shown applied to a solved cube):

You can see that this new twist sequence leaves all corners in their original positions and twists only two of them: the top-left-front corner is twisted clockwise and the top-left-back corner counter-clockwise. It is not difficult to twist all top corners into any orientation using this twist sequence. When the twist sequence is applied repeatedly to the corners, as soon as three corners become oriented correctly, the remaining corner cannot be twisted incorrectly.

Examples: The following examples show you what angle to choose for the twist sequence. After one application the orientation of the top corners changes, and you will get another case that is also shown below. When followed correctly, you will not get into an infinite cycle. Two twisted, facing opposite sides - apply from this angle

Two twisted, facing opposite sides - apply from this angle

Two twisted, facing the same side - apply from this angle

Two twisted, facing adjacent sides - apply from this angle

Three twisted, clock-wise - apply from this angle

Three twisted, counter clock-wise - apply from this angle

Four twisted, facing opposite sides - apply from this angle

Four twisted, facing three sides - apply from this angle

4. Solve Three Ledges

To solve the Ledges (which stands here for left-side edges - those with white stickers in the pictures) you use the following simple sequences.

Ledge in bottom-front:

Ledge in front-bottom:

Ledge in top-right:

Ledge in right-top:

Ledge in top-left (flipped in its place):

5. Solve Four Ridges

Ridges are right-side edges, which have yellow stickers in the pictures. Ridge in bottom-front:

Ridge in front-bottom:

Ridge in top-left:

Ridge in left-top:

Ridge in top-right (flipped in its place):

6. Solve Last Ledge

Ledge in bottom-front:

Ledge in bottom-back:

7. Flip Midges

The edges in the ring (the midges) usually need to be flipped before you can proceed to the following step of positioning them. How do you know which ones need to be flipped? There is a simple rule for spotting incorrectly oriented edges: look at two colors - the color of an edge sticker (choose either one of the two) and the color of the center adjacent to the chosen edge sticker. If the colors are the same or opposite (red + orange or blue + green), the edge is fine; otherwise it is flipped. There can only be zero, two, or four flipped edges. Two top midges flipped:

8. Place Midges

Three midges in forward cycle:

Three midges in backward cycle:

Two top midges and two bottom midges swapped:

Two and two midges diagonally swapped:

II.2. Ortega's Method

This solution method is designed to solve Rubik's cube and to solve it quickly, efficiently, and without having to memorize a lot of sequences. For ease and speed of execution, turns are mostly restricted to the top, right, and front faces, and center and middle slices. Strong preference is given to the right face, since it is one of the easiest faces to turn for many people. Yet all sequences are minimal (or very close to minimal) by the slice-turn metric. This solution method orients cubbies before positioning them. The idea is that it is easier to permute cubbies after they've been oriented than before orienting them, because once the cubbies have been oriented, the facelet colors that determine their permutation make easily identifiable patterns on the cube. Orienting cubbies, whether done before or after positioning them, is always easy because orientation requires focusing on only one face color and on the patterns that that color makes on the cube. For middle-slice edges on the last layer, permuting cubbies after they've been oriented is a very simple affair, thus reinforcing this principle.

1. Orient Top Corners

You should be able to manage this on your own. Do not worry about positions - all corners will be permuted in step 3. For the greatest speed and efficiency, try to do this in one look. For smoother cubing you should try to orient these corners on the bottom face, because the next step can then be done faster (no cube rotation afterwards, easier look-ahead). (average number of turns for this step ... 5)

2. Orient Bottom Corners

Rotate the whole cube so that the bottom face becomes the top face. Orient the corners depending on which of the seven patterns below you see:

T pattern:

R U R' U' F' U' F

L (F) pattern:

F R' F' U' R' U R

MI pattern:

R U R' U R U2 R'

PI pattern:

R U R2 F' R'2 U R'

U pattern:

R' F' U' F U R

H pattern:

R2 U2 R' U2 R2

(average number of turns for this step ... 7)

3. Position All Corners

A pair here represents two adjacent corners on the top or bottom layer. Such a pair is considered to be solved correctly if the two corners are positioned correctly relative to each other. A solved pair will be easy to identify because the two adjacent facelets on the side (not top or bottom) will be of the same color. A layer can have only zero, one, or four correct pairs.

The number and location of correct pairs can be quickly identified by merely looking at two adjacent side faces (that is, not top or bottom). For a given layer, if you see one correct pair and one incorrect pair, then there is only one correct pair on that layer. If you see two correct pairs, then all four pairs are correct. If you see no correct pairs but both pairs consist of opposite colors, then there are no correct pairs on that layer. If you see no correct pairs and only one pair consisting of opposite colors, then there is one correct pair on that layer, and it is opposite to the pair with the opposite colors. Proceed with one of the following sequences depending on how many solved pairs you have:

0 (no pairs solved):

R2 F2 R2

1 (bottom-back pair solved):

R' U R' B2 R U' R

2 (top-back and bottom-back pairs solved):

R2 U F2 U2 R2 U R2

4 (bottom pairs solved):

5 (bottom and top-back pairs solved):

R U' R F2 R' U R F2 R2

(average number of turns for this step ... 8) At this point, align corners and position centers. The cube is now fully symmetric except for edges. Pick the new top and bottom face depending on what will make solving top and bottom edges easiest. Steps 4 and 5 can be combined, although this requires monitoring more cubbies simultaneously and may not yield a speed gain or a reduction in number of movements.

4. Solve Three Top Edges

In order to do this step efficiently, you need not position the centers and align the corners in the previous step. Instead, you can solve the first (or the first two opposite) top edges using one or two turns while ignoring centers, and then solve the top center together with another top edge. (average number of turns for this step ... 9)

5. Solve Three Bottom Edges

To reduce the number of turns required, you can combine this and the following step when solving the third bottom edge. There are several possible cases that are easy to find and very efficient. In addition, you should force yourself to look ahead in this step and try to prevent slower cases from occurring. (average number of turns for this step ... 12)

6. Solve One More Top or Bottom Edge

Often you can solve the last top or bottom edge already in the previous step, thus omitting this step and reducing turns and time. (average number of turns for this step ... 4) At this point, the last top or bottom edge will either be in the middle layer, in position but not oriented, or solved. Depending on the case, proceed as follows to solve that last edge (if necessary) while orienting the middle layer edges.

7. Solve Last Top Edge and Orient Middle Edges

a) Top Edge in Middle Layer

Position the "notch" at top-right and the edge cubbie at left-front, with the facelet carrying the top color on the left face. If the edge cubbie is twisted, mirror vertically (top-right becomes bottom-right, and right-face turns go in the opposite direction). As shown in the diagram, a pink-marked edge is oriented correctly - O - if the pink facelet's color matches the color of the adjacent or opposite center. Otherwise the edge is oriented incorrectly (flipped) - X.

OOO

E R E R' E' R' E R

OOX

E R' E' R' E R' E' R'

OXO

E' R E R2 E R

OXX

R' E' R E R' E R

XOO

R' E' R E R' E R

XOX

R' E R' E' R' E' R E R E' R

XXO

R' E' R' E R E R

XXX

R' E R' E R' E' R2 E' R

b) U Edge Twisted in Its Position

There will be 1 or 3 twisted edges in the middle layer. Front-right twisted:

R U2 R' E2 R2 E' R' U2 R'

front-right not twisted:

R' E' R' E' R' E' R'

c) U Edge Solved

There will be 0, 2, or 4 edges twisted in the middle layer. 2 adjacent (front-left and front-right):

R2 F M F' R2 F M' F'

2 opposite (front-left and back-right):

F M F' R2 F M' F' R2

4:

R F2 R2 E R E2 R F2 R2 E R

(average number of turns for this step ... 9)

8. Position Midges

Send front-right to back-left, back-left to back-right, and back-right to front-right:

R2 E' R2

Exchange centers with opposites:

M2 E' M2

Exchange front-right with back-right, front-left with back-left:

R2 E2 R2

(average number of turns for this step ... 4)

Average number of turns for this method ... 58

III. Genetic Algorithm

The Rubik's Cube problem consists of finding a minimal chain of moves that solves the cube. Imagine that, after applying a chain of moves, the cube arrives in a new state. The genetic algorithm used in this paper is in fact a chain of genetic algorithms: a genetic algorithm is run for every state the cube passes through, and all of these runs put together constitute the solution of the problem. The chain that brings the cube from one state to a better state is selected by the genetic algorithm. In this way the cube passes through the best states, so the chances of reaching the final state are higher. The overall procedure is:

Initialization:
begin
    t := 0;
    Generate(P(t));
    Evaluate(P(t));
    SortPopulation(P(t));
    fitness_new = average(fitness);
end

Iteration:
do
    t := t + 1;
    fitness_old = fitness_new;
    Mutation();
    Crossover();
    P(t) = P(t) + Mutation() + Crossover();
    Evaluate(P(t));
    SortPopulation(P(t));
    Selection(P(t));
    fitness_new = average(fitness);
while (Math.abs(fitness_old - fitness_new) > Math.pow(10, -7));

III.1. Cube representation

The algorithm represents each of the six faces of the cube as an array of 27 bits stored in a BitSet; every one of the nine cells holds a color encoded in 3 bits. By using a BitSet we minimize memory consumption, which matters because we have to keep a whole set of possible solutions in memory. The twist is the only operation on the cube. For every face there are three possible twists: 90 degrees clockwise, 180 degrees, and 90 degrees counter-clockwise.
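As a rough illustration of this encoding (a sketch, not the authors' actual code; the color numbering is an assumption), the following Java fragment packs the nine cells of one face into a 27-bit java.util.BitSet, three bits per cell:

import java.util.BitSet;

// Illustrative sketch of the 27-bit face encoding: 9 cells, 3 bits per cell.
public class Face {
    // Assumed color codes 0..5 (e.g. 0 = white, 1 = yellow, 2 = red,
    // 3 = orange, 4 = blue, 5 = green); the paper does not fix an order.
    private final BitSet bits = new BitSet(27);

    public void setCell(int cell, int color) {        // cell in 0..8, color in 0..5
        for (int b = 0; b < 3; b++)
            bits.set(cell * 3 + b, ((color >> b) & 1) == 1);
    }

    public int getCell(int cell) {
        int color = 0;
        for (int b = 0; b < 3; b++)
            if (bits.get(cell * 3 + b)) color |= (1 << b);
        return color;
    }
}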

III.2. The components of the genetic algorithm

Coding the solutions

A solution (chromosome) represents a chain of moves that takes the cube from one state to another.

Initialization

To establish which state will be the initial one, the cube performs some moves, chosen at random, starting from the final state. This is done to guarantee that, starting from this state and applying suitable moves, the cube is able to reach the final state. This randomly chosen state becomes the initial state and the starting point of the genetic algorithm. The genetic algorithm then tries to find a way to reach the final state by applying chains of moves. Every chain of moves brings the cube into intermediate (transition) states.
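A minimal sketch of this initialization step, under the assumption that moves are encoded as integers 0..17 (6 faces times 3 twist amounts) and that the cube exposes an applyMove operation:

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative scrambling step: random moves applied to the solved state.
public class Scrambler {

    // Minimal assumed cube interface; the paper's cube uses BitSet faces.
    public interface Cube {
        void applyMove(int move);   // move encoded as 0..17 (6 faces * 3 twists)
    }

    public static List<Integer> scramble(Cube cube, int numMoves, Random rnd) {
        List<Integer> applied = new ArrayList<>();
        for (int i = 0; i < numMoves; i++) {
            int move = rnd.nextInt(18);   // pick one of the 18 face turns at random
            cube.applyMove(move);
            applied.add(move);
        }
        return applied;   // the random scramble; the cube now holds the initial state
    }
}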

The fitness function

Every individual in the population is evaluated in order to measure its performance. The fitness function used in this algorithm counts how many cells are in their final position; the algorithm stops when all cells are in place. In addition, the number of crosses gives a boost to the fitness value, acknowledging the importance of these particular patterns in the most successful algorithms.
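The following Java sketch illustrates such a fitness function; the exact weighting of the cross bonus and the reading of a "cross" as a face whose center and four edge cells share one color are assumptions, since the paper does not fix them:

// Illustrative fitness: correctly placed cells plus a bonus per completed cross.
public class CubeFitness {
    // solvedColors[f][c] is the target color of cell c on face f;
    // currentColors has the same layout for the cube being evaluated.
    public static int evaluate(int[][] currentColors, int[][] solvedColors, int crossBonus) {
        int fitness = 0;
        for (int face = 0; face < 6; face++) {
            for (int cell = 0; cell < 9; cell++)
                if (currentColors[face][cell] == solvedColors[face][cell])
                    fitness++;                       // cell already in its final position
            if (hasCross(currentColors[face]))
                fitness += crossBonus;               // reward the cross pattern
        }
        return fitness;
    }

    // Assumed meaning of a "cross": the center and the four edge cells of a face
    // share one color (cells are numbered 0..8 row by row, so 4 is the center).
    private static boolean hasCross(int[] face) {
        int center = face[4];
        return face[1] == center && face[3] == center
            && face[5] == center && face[7] == center;
    }
}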

Mutation

The mutation operator consists of randomly changing moves in the chromosome. The number of genes of an individual that take part in mutation varies from one individual to another. A parameter alpha ∈ [0, 1] is drawn and used to decide whether mutation is applied to a gene (if alpha < pm) or not (if alpha > pm). The same parameter also establishes whether an individual participates in mutation at all. The parameter pm denotes the probability of mutation and depends on various factors such as the iteration number, the fitness dispersion and the crossover rate. The initial value of pm is pm = 0.7 * N, where N is the total number of chromosomes in the population. The chromosome that results from this process is added to the current population without erasing the initial chromosome; this way, the risk of losing possibly good solutions in future generations is eliminated.
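A rough Java sketch of this mutation rule (the integer move encoding and the per-gene use of pm as a probability are assumptions drawn from the description above):

import java.util.Arrays;
import java.util.Random;

// Illustrative mutation: each gene is replaced by a random move with probability pm,
// and the mutant is returned as a new chromosome (the parent is kept unchanged).
public class MoveMutation {
    public static int[] mutate(int[] chromosome, double pm, Random rnd) {
        int[] child = Arrays.copyOf(chromosome, chromosome.length);
        for (int g = 0; g < child.length; g++) {
            double alpha = rnd.nextDouble();      // alpha in [0, 1]
            if (alpha < pm)
                child[g] = rnd.nextInt(18);       // replace the move with a random one
        }
        return child;                             // added to the population alongside the parent
    }
}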

Crossover

Let pc be the probability of crossover. The mating pool for crossover, S, is formed by the best 40% of the individuals. From this pool, the number of individuals that participate in the crossover operation is 0.4 · N · pc. The initial value of pc is pc = 0.2 * N, where N is the total number of chromosomes in the population.

Two parents, denoted x1 and x2, are selected from the pool S. Given a value a ∈ [0, 1], two descendants are formed from the two parents using a single cut point.
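A minimal one-cut-point crossover sketch consistent with this description; selection of the parents from the pool S is omitted, and the exact role of the value a is not modeled here:

import java.util.Random;

// Illustrative one-point crossover on two move chains of equal length (>= 2).
public class MoveCrossover {
    public static int[][] crossover(int[] x1, int[] x2, Random rnd) {
        int cut = 1 + rnd.nextInt(x1.length - 1);   // cut point strictly inside the chromosome
        int[] child1 = new int[x1.length];
        int[] child2 = new int[x2.length];
        for (int g = 0; g < x1.length; g++) {
            child1[g] = (g < cut) ? x1[g] : x2[g];  // head of x1, tail of x2
            child2[g] = (g < cut) ? x2[g] : x1[g];  // head of x2, tail of x1
        }
        return new int[][] { child1, child2 };
    }
}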

Selection for the next iteration

Applying the mutation and crossover operations creates new individuals that are added to the current population. This population is then sorted in decreasing order of fitness, so that the first individual has the best fitness and the last one the worst. Selection keeps the first N individuals, where N is the size of the initial population.
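A short sketch of this sort-and-truncate selection (the Individual container is an assumed helper type):

import java.util.Comparator;
import java.util.List;

// Illustrative selection: sort decreasingly by fitness and keep the best N.
public class TruncationSelection {
    public static class Individual {
        public int[] genes;
        public double fitness;
    }

    public static List<Individual> select(List<Individual> population, int n) {
        // Best fitness first, worst last.
        population.sort(Comparator.comparingDouble((Individual i) -> i.fitness).reversed());
        // Keep only the first n individuals (a view into the sorted list).
        return population.subList(0, Math.min(n, population.size()));
    }
}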

Stop condition

The algorithm is allowed to run for 200 iterations. If it does not reach the final state, it chooses the best individual found so far, i.e., the one that brings the cube into the best state. Another genetic algorithm then starts from this state, trying to bring the cube into a still better state.

Mutation versus crossover rate

Observing the behavior of the cube, there are situations in which it arrives at a state where only two or three cells are out of position, and no matter how many mutations and crossovers are applied it cannot escape this local maximum. The solution in this situation is to increase exploration and decrease exploitation: the mutation rate is increased and the crossover rate is decreased in order to move away from the local maximum. At every iteration the value fitness_old is computed, representing the average fitness at the previous iteration, as well as the value fitness_new, representing the average fitness at the current iteration. When |fitness_old − fitness_new| < dt, where dt is a tolerance threshold (dt = 10^-7), the mutation rate is increased by 0.7 and the crossover rate is decreased by 0.2.
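The adaptation rule can be sketched as follows; the clamping of pc to stay non-negative is an added assumption, since the paper does not say how far the rates may drift:

// Illustrative adaptation of the mutation and crossover rates near a local optimum.
public class RateAdaptation {
    static final double DT = 1e-7;   // tolerance threshold dt

    // Returns {pm, pc} after the update.
    public static double[] adapt(double pm, double pc,
                                 double fitnessOld, double fitnessNew) {
        if (Math.abs(fitnessOld - fitnessNew) < DT) {
            pm += 0.7;                 // more exploration: raise the mutation rate
            pc -= 0.2;                 // less exploitation: lower the crossover rate
            pc = Math.max(pc, 0.0);    // assumed clamp to keep the rate non-negative
        }
        return new double[] { pm, pc };
    }
}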

Improvements of the algorithm

The algorithm uses a tabu search memory in order to avoid cycling situations in which two moves cancel each other. This improves the performance of the algorithm, because it forces the search to avoid recently visited states.
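One simple way to realize such a tabu check (a sketch assuming moves are encoded as face * 3 + twist, so the cancelling move is easy to compute):

// Illustrative tabu check that rejects a move which cancels the previous one.
public class TabuMemory {
    private int lastMove = -1;   // -1 means no move played yet

    // Moves are encoded as face * 3 + twist, twist in {0: CW, 1: 180, 2: CCW}.
    private static int inverse(int move) {
        int face = move / 3, twist = move % 3;
        return face * 3 + (twist == 1 ? 1 : 2 - twist);  // CW <-> CCW; 180 is its own inverse
    }

    public boolean isAllowed(int move) {
        return lastMove < 0 || move != inverse(lastMove);
    }

    public void record(int move) {
        lastMove = move;
    }
}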

Enhancement

One enhancement for solving the cube is bidirectional search. Let s0 be the initial state and g the final state. The genetic algorithm generates chromosomes of length m. Two genetic algorithms are used: one starts from the initial state, the other from the final state, and they evolve independently until a common state is reached. A correlation function compares the states generated by the forward and backward searches. If any paths from the forward and backward searches cross, the path from s0 to g is determined. If no paths cross, the correlation function finds the two closest states, and the cube passes into states s1 and g1, until the sequence that joins the two searches is found.
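A very small sketch of the meeting test in such a bidirectional search; representing states as canonical strings and ignoring the correlation function are simplifications, not the paper's method:

import java.util.HashSet;
import java.util.Set;

// Illustrative meeting test: the forward search (from the scrambled state) and the
// backward search (from the solved state) meet when they produce a common cube state.
public class BidirectionalMeet {
    public static boolean pathsCross(Set<String> forwardStates, Set<String> backwardStates) {
        Set<String> common = new HashSet<>(forwardStates);
        common.retainAll(backwardStates);   // states reached from both directions
        return !common.isEmpty();
    }
}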

IV. Evolutionary computing SDK’s

IV.1. ECJ

ECJ is a research EC system written in Java. It models iterative evolutionary processes using a series of pipelines arranged to connect one or more subpopulations of individuals with selection, breeding (such as crossover) and mutation operators. It was designed to be highly flexible, with nearly all classes (and all of their settings) dynamically determined at runtime by a user-provided parameter file. All structures in the system are arranged to be easily modifiable. Even so, the system was designed with an eye toward efficiency. The ECJ source is licensed under the Academic Free License 3.0, except for the MersenneTwister and MersenneTwisterFast Java classes, which are distributed under their own license.

General Features:

- GUI with charting
- Platform-independent checkpointing and logging
- Hierarchical parameter files
- Multithreading
- Mersenne Twister Random Number Generators
- Abstractions for implementing a variety of EC forms

Vector (GA/ES) Representations:

- Fixed-Length and Variable-Length Genomes
- Arbitrary representations
- Ten pre-done vector application problem domains (rastrigin, sum, rosenbrock, sphere, step, noisy-quartic, booth, griewangk, nk, hiff)

Other Representations:

- Multiset-based genomes in the rule package, for evolving Pitt-approach rulesets or other set-based representations

EC Features:

- Asynchronous island models over TCP/IP
- Master/Slave evaluation over multiple processors, with support for generational, asynchronous steady-state, and coevolutionary distribution
- Genetic Algorithms/Programming style Steady State and Generational evolution, with or without Elitism
- Evolutionary-Strategies style (mu,lambda) and (mu+lambda) evolution
- Very flexible breeding architecture
- Many selection operators
- Multiple subpopulations and species
- Inter-subpopulation exchanges
- Reading populations from files
- Single- and Multi-population coevolution
- SPEA2 multiobjective optimization
- Particle Swarm Optimization
- Differential Evolution
- Spatially embedded evolutionary algorithms
- Hooks for other multiobjective optimization methods

An example: Build a Genetic Algorithm for the MaxOnes Problem

We will build an evolutionary computation system that uses:

- Generational evolution
- A GA-style selection and breeding mechanism (a pipeline of tournament selection, then crossover, then mutation)
- A single, non-coevolutionary population
- A simple, floating-point fitness value (no multiobjective fitness stuff)
- A fixed-length vector representation (MaxOnes uses a vector of bits)
- Only one thread of execution
- Only one process (no island models or other such funkiness)

The example was tested on a UNIX system.

Create an app subdirectory and parameters file

Go into the ec/app directory and create a directory called tutorial1. In this directory, create a file called tutorial1.params. The params file is where we will specify parameters which direct ECJ to do an evolutionary run. ECJ parameters guide practically every aspect of ECJ's operation, down to the specific classes to be loaded for various functions.

ECJ's top-level object is ec.Evolve. Evolve has only one purpose: to initialize a subclass of ec.EvolutionState, set it up, and get it going. The entire evolutionary system is contained somewhere within the EvolutionState object or a sub-object hanging off of it. The EvolutionState object stores a lot of top-level global evolution parameters and several important top-level objects which define the general evolution mechanism. Some of the parameters include:

- The number of generations to run
- Whether or not we should quit when we find an ideal individual, or go on to the end of generations

Some of the top-level objects inside EvolutionState include:

- A subclass of ec.Initializer, responsible for creating the initial population.
- An ec.Population, created initially by the Initializer. A Population stores an array of ec.Subpopulations. Each Subpopulation stores an array of ec.Individuals, plus an ec.Species which specifies how the Individuals are to be created and bred. We'll be using a Population with just a single Subpopulation.
- A subclass of ec.Evaluator, responsible for evaluating individuals.
- A subclass of ec.Breeder, responsible for breeding individuals.
- A subclass of ec.Exchanger, responsible for exchanging individuals among subpopulations or among different processes. Our version of Exchanger won't do anything at all.
- A subclass of ec.Finisher, responsible for cleaning up when the system is about to quit. Our Finisher won't do anything at all.
- A subclass of ec.Statistics, responsible for printing out statistics during the run.
- An ec.util.Output facility, responsible for logging messages. We use this instead of System.out.println(...), because Output makes certain guarantees about checkpointing, thread-safeness, etc., and can also prematurely quit the system for us if we send it a fatal or error message.
- An ec.util.ParameterDatabase. The ParameterDatabase stores all the parameters loaded from our params file and other parameter files, and is used to help the system set itself up.
- One or more ec.util.MersenneTwisterFast random number generators, one per thread. Since we're using only one thread, we'll only have one random number generator.

Define Parameters for the Evolve object

Let's begin by defining some basic parameters in our params file which the Evolve class uses. Since Evolve (oddly, given its name) isn't involved in evolution, these parameters are mostly administrative stuff. Add the following parameters to your tutorial1.params file:

verbosity = 0
flush = true
store = true

Most of the things ECJ prints out to the terminal are messages. A message is a string which is sent to the Output facility to be printed and logged. Messages can take several forms, though you'll usually see: plain-old messages, warnings, errors, and fatal errors. A fatal error causes ECJ to quit as soon as it is printed and logged. An ordinary error raises an error flag in the Output facility; ECJ can wait after a string of errors before it finally quits (giving you more debugging information). Warnings and messages do not quit ECJ.

The verbosity parameter tells ECJ what kinds of things it should print to the screen: a verbosity of 0 says ECJ should print everything to the screen, no matter how inconsequential. The verbosity can be changed. The flush parameter tells ECJ whether or not it should immediately attempt to flush messages to the screen as soon as it logs them; generally, you'd want this to be true. The store parameter tells ECJ whether or not it should store messages in memory as it logs them. Unless you have an absolutely gargantuan number of messages, this should probably be true. Next add:

breedthreads = 1
evalthreads = 1
seed.0 = 4357

This tells ECJ whether or not it should be multithreaded. If you're running on a single-processor machine, it rarely makes sense to be multithreaded (in fact, it's generally slower). breedthreads tells the Breeder how many threads to spawn when breeding. evalthreads tells the Evaluator how many threads to spawn when evaluating. Each thread will be given its own unique random number generator. You should make sure that these generators have different seeds from one another. The generator seeds are seed.0, seed.1, ..., up to seed.n where n = max(breedthreads, evalthreads) - 1. Since we have only one thread, we only need one random number generator. 4357 is a good initial seed for the generator: but remember that if you run your evolution twice with the same seed, you'll get the same results! So change your seed for each run. If you'd like the system to automatically change the seed to an arbitrary seed each time you run, you can base the seed on the current wall clock time. You do this by saying seed.0 = time.

Next let's define our evolution state. The simple package defines lots of basic generational evolution stuff, and we can borrow liberally from it for most of our purposes. We'll start by using its EvolutionState subclass, ec.simple.SimpleEvolutionState. We do this by defining a final parameter which Evolve uses to set stuff up:

state = ec.simple.SimpleEvolutionState

Define Parameters for the SimpleEvolutionState object

SimpleEvolutionState defines a simple, generational, non-coevolutionary evolution procedure. The procedure is as follows:

1. Call the Initializer to create a Population.
2. Call the Evaluator on the Population, replacing the old Population with the result.
3. If the Evaluator found an ideal Individual, and if we're quitting when we find an ideal individual, then go to Step 9.
4. Else if we've run out of generations, go to Step 9.
5. Call the Exchanger on the Population (asking for a Pre-breeding Exchange), replacing the old Population with the result.
6. Call the Breeder on the Population, replacing the old Population with the result.
7. Call the Exchanger on the Population (asking for a Post-breeding Exchange), replacing the old Population with the result.
8. Increment the generation count, and go to Step 2.
9. Call the Finisher on the population, then quit.

In between any of these steps, there are hooks to call the Statistics object so it can update itself and print out statistics information. Since our Exchanger will do nothing, steps 5 and 7 won't do anything at all. SimpleEvolutionState can work with a variety of Initializers, Evaluators, Breeders, Exchangers, Finishers, and Populations. But to keep things simple, let's use the basic ones which go along with it nicely. Here are some parameters which will direct SimpleEvolutionState to load these classes:

pop = ec.Population
init = ec.simple.SimpleInitializer
finish = ec.simple.SimpleFinisher
breed = ec.simple.SimpleBreeder
eval = ec.simple.SimpleEvaluator
stat = ec.simple.SimpleStatistics
exch = ec.simple.SimpleExchanger

SimpleInitializer makes a population by loading an instance of the class given by the pop parameter (in this case ec.Population) and telling it to populate itself randomly. Populations, by the way, can also load themselves from files (see the Subpopulation documentation). The SimpleEvaluator evaluates each individual in the population independently. The SimpleStatistics just reports basic statistical information on a per-generation basis. The SimpleExchanger and SimpleFinisher do nothing at all. Additionally, there are some more parameters that SimpleEvolutionState needs:

generations = 200
quit-on-run-complete = true
checkpoint = false
prefix = ec
checkpoint-modulo = 1

generations is the number of generations to run. quit-on-run-complete tells us whether or not we should quit ECJ when it finds an ideal individual; otherwise it will continue until it runs out of generations.

checkpoint tells ECJ that it should perform checkpointing every checkpoint-modulo generations, using a Gzipped checkpoint file whose name begins with the prefix specified in prefix. Checkpointing saves out the state of the entire evolutionary process to a file; you can then start from that point by launching ECJ on that checkpoint file. If you have a long run and expect that the power might go out or the system might be shut down, you may want to checkpoint. Otherwise don't do it, it's an expensive thing to do.

Define the Statistics File

SimpleStatistics requires a file to write out to. Let's tell it that it should write out to a file called out.stat, located right where the user launched ECJ (that's what the $ is for):

stat.file = $out.stat

How do we know that SimpleStatistics needs a file? Because it says so. A great many objects in ECJ have parameter bases. The parameter base is passed to the object when it is created, and is prefixed to its parameter names. That way, for example, you could conceivably create two different Statistics objects, pass them different bases, and they'd be able to load different parameters. Some ECJ objects also have a default base which defines a secondary parameter location that the object will look for if it can't find a parameter it needs at its standard parameter base. This allows some objects to all use the same default parameters, but specialize only on certain ones. SimpleStatistics doesn't have a default base. It's too high-level an object to need one. The base for our SimpleStatistics object is stat. Usually the bases for objects correspond with the parameter name that specified what class they were supposed to be. For SimpleStatistics, for example, the class-specifying parameter was stat = ec.simple.SimpleStatistics, hence stat is the base, and the SimpleStatistics' output filename is at stat.file. If no file is specified, by the way, SimpleStatistics will just output statistics to the screen.

Define the Population parameters

We begin by telling ECJ that the Population will have only one Subpopulation, and we'll use the default Subpopulation class for subpopulation #0:

pop.subpops = 1
pop.subpop.0 = ec.Subpopulation

Note that Population, like Statistics, also uses parameter bases (in this case its base is pop). Similarly, Subpopulation #0 has a parameter base. It will be, you guessed it, pop.subpop.0. Let's define some stuff about Subpopulation #0:

pop.subpop.0.size = 100
pop.subpop.0.duplicate-retries = 0
pop.subpop.0.species = ec.vector.VectorSpecies

We've first stated that the size of the subpopulation is going to be 100 individuals. Also, when initializing themselves, subpopulations can guarantee that they won't duplicate individuals: they do this by generating

an individual over and over again until it's different from its peers. By default we're telling the system not to bother to do this, duplicates are fine. As mentioned earlier, every Subpopulation has an associated ec.Species which defines features of the Individuals in the Subpopulation: specifically, how to create them and how to breed them. This is the first representation-specific object we've seen so far: ec.vector.VectorSpecies defines a particular kind of Species that knows how to make BitVectorIndividuals, which are the kind of individuals we'll be using. Other kinds of individuals require their own special Species classes.

Define the Representation

Species hold a prototypical Individual which they clone multiple times to create new Individuals for that Species. This is the first place you will see the notion of prototypes in ECJ, a concept that's used widely. A prototype is an object which can be loaded once from the parameter files, and set up, then cloned repeatedly to make lots of customized copies of itself. In ECJ, Individuals are prototypes. The parameters for ec.Species are where the individual is specified:

pop.subpop.0.species.ind = ec.vector.BitVectorIndividual

Here we stipulate that the kind of individual used is an ec.vector.BitVectorIndividual, which defines an Individual that holds a vector of boolean values. VectorSpecies also holds various parameters that all individuals of that species will abide by:

pop.subpop.0.species.genome-size = 100
pop.subpop.0.species.crossover-type = one
pop.subpop.0.species.crossover-prob = 1.0
pop.subpop.0.species.mutation-prob = 0.01

This stipulates that our individuals will be vectors of 100 bits, that their "default" crossover will be one-point crossover, that if we use the default crossover we will use it 100% of the time to breed individuals (as opposed to 0% direct copying), and finally that if we use the "default" mutation, then each bit will have a 1% probability of getting bit-flipped, independent of other bits. We'll get to the "default" crossover and mutation in a second, but first note that VectorSpecies is a Prototype, and Prototypes almost always have default parameter bases to fall back on. The default parameter base for VectorSpecies is vector.species (see the VectorSpecies documentation). For example, instead of explicitly saying that all individuals in the species used in subpopulation #0 of the population are supposed to have a genome-size of 100, we could have simply said that all individuals belonging to any VectorSpecies have a genome size of 100 unless otherwise stipulated. We say it like this: vector.species.genome-size = 100.

Define the Fitness

Fitnesses are similarly defined:

pop.subpop.0.species.fitness = ec.simple.SimpleFitness

Every Individual has some kind of fitness attached to it, defined by a subclass of ec.Fitness. Fitnesses are not built into Individuals; and instances of the same Individual subclass can have different kinds of Fitnesses if you see fit. Fitnesses are prototypes just like Individuals are: each Species instantiates one Fitness subclass, called the prototypical Fitness, and uses that class to clone copies which are attached to new Individuals the Species has created. Here we say that we will use ec.simple.SimpleFitness as our fitness class. SimpleFitness defines fitness values from 0.0 inclusive to infinity exclusive, where 0.0 is the worst possible fitness, and infinity is better than the ideal fitness. You can define the ideal fitness to any value greater than 0, we'll get to that later.

Define the Breeding Procedure

ECJ has a very flexible breeding mechanism called a breeding pipeline. It's not actually a pipeline per se: it's really a tree. The leaves of the tree are selection methods, responsible for picking individuals from the old population. Nonleaf nodes of the tree are breeding operators, which take individuals handed to them by their child nodes, breed them, and send them to their parent node. The root of the tree then hands completely-bred individuals to be added to the new population.

We will define a breeding pipeline which does the following. First, it picks two individuals from the population and hands them to be crossed over. The crossover operator then hands the individuals to a mutation operator to be mutated. The mutation operator then hands the individuals off to be placed into the new population. The tree thus has a mutation operator at the root, with one child (the crossover operator). The crossover operator has two children, each selection methods.

For a mutation operator we will use ec.vector.breed.VectorMutationPipeline. This operator requests Individuals of its sole child source (the crossover operator), then mutates all of them. It mutates them by simply calling the default mutation method defined in the Individuals themselves. If you want some non-default mutation method (like vector inversion), you'll need to define your own BreedingPipeline subclass to do the custom mutation. Similarly, for a crossover operator we will use ec.vector.breed.VectorCrossoverPipeline. This operator requests one Individual from each of its two child sources (in this case, the selection methods), then crosses them over and returns both of them at once. This pipeline does its crossover simply by calling the default crossover method defined in the Individuals themselves. Once again, if you want a special kind of crossover not stipulated in the defaults, you'll need to define your own BreedingPipeline subclass to do the special crossover. Lastly, for our selection methods, let's use ec.select.TournamentSelection, which defines basic tournament selection.

The root of the pipeline is defined by the parameter pop.subpop.0.species.pipe, and everything else derives its base off of it in a hierarchical fashion:

pop.subpop.0.species.pipe = ec.vector.breed.VectorMutationPipeline
pop.subpop.0.species.pipe.source.0 = ec.vector.breed.VectorCrossoverPipeline
pop.subpop.0.species.pipe.source.0.source.0 = ec.select.TournamentSelection
pop.subpop.0.species.pipe.source.0.source.1 = ec.select.TournamentSelection

Because the default mutation and crossover probabilities and types were defined as part of the BitVectorIndividuals, we don't need to stipulate those parameters here. But one thing is left: we have to define the tournament size for our TournamentSelection to be 2. We could explicitly define sizes for each of the selection operators as follows:

pop.subpop.0.species.pipe.source.0.source.0.size = 2
pop.subpop.0.species.pipe.source.0.source.1.size = 2

...but TournamentSelection (and all selection methods and breeding pipeline operators) is a Prototype, and so it has a default base we could simply use instead:

select.tournament.size = 2

Define the Problem

So far, we've managed to define the high-level evolutionary process, administrative details, representation, and complete breeding procedure without writing a single drop of code. But not any more. Now we have to write the object that's actually responsible for assessing the fitness of our Individuals. This object is called a Problem, and it is specified as a parameter in our Evaluator. We will create a Problem subclass called ec.app.tutorial1.MaxOnes which will take an Individual, evaluate it, and hand it back. Before we do so, we have one more self-explanatory parameter to define:

eval.problem = ec.app.tutorial1.MaxOnes

Now close the tutorial1.params file and open a new file (also in the tutorial1 directory) called MaxOnes.java. In the file, write:

package ec.app.tutorial1;
import ec.*;
import ec.simple.*;
import ec.util.*;
import ec.vector.*;

public class MaxOnes extends Problem implements SimpleProblemForm
    {

First, Problem defines a setup method, which you can override (remember to call super.setup(...)) to set up the prototypical Problem from a parameter file. Your Problem will be a clone of this prototypical Problem. Second, it defines the method clone(), which is used to make (deep) copies of the Problem. Java's clone() method doesn't deep-clone by default; so if you have an object holding (for example) an array inside it, and clone the object, the array isn't cloned. Instead both objects now point to the same array. ECJ instead calls clone() on you, and you're responsible for cloning yourself properly. Since we're not defining any instance data that needs to be loaded from a parameter file or specially cloned, we don't even need to touch these methods.

So what methods do we actually need to implement? As it turns out, Problem doesn't actually define any methods for evaluating individuals. Instead, there are special interfaces which various Evaluators use that you must implement. SimpleEvaluator requires that its Problems implement the ec.simple.SimpleProblemForm interface. This interface defines two methods, evaluate (required) and describe (optional). evaluate takes an individual, evaluates it somehow, sets its fitness, and marks it evaluated. describe takes an individual and prints out to a log some information about how the individual operates (maybe a map of it running around, or whatever you'd like). describe is called when the statistics wants to print out "special" information about the best individual of the generation or of the run, and it's not necessary. We'll leave it blank. Here's the first part of the evaluate method:

    // ind is the individual to be evaluated.
    // We are given state and threadnum primarily so we
    // have access to a random number generator
    // (in the form: state.random[threadnum])
    // and to the output facility
    public void evaluate(final EvolutionState state,
                         final Individual ind,
                         final int threadnum)
        {
        if (ind.evaluated) return;   // don't evaluate the individual if it's
                                     // already evaluated

Individuals contain two main pieces of data: evaluated, which indicates that they've been evaluated already, and fitness, which stores their fitness object. Continuing:

        if (!(ind instanceof BitVectorIndividual))
            state.output.fatal("Whoa! It's not a BitVectorIndividual!!!", null);
        BitVectorIndividual ind2 = (BitVectorIndividual) ind;

First we check to see if ind is a BitVectorIndividual -- otherwise something has gone terribly wrong. If something's wrong, we issue a fatal error through the state's Output facility. Messages (like fatal) all have one or two additional arguments where you can specify a Parameter that caused the fatal error, because it's very common to issue a fatal error on loading something from the ParameterDatabase and discovering it's incorrectly specified. Since this fatal error doesn't have anything to do with any specific parameter we know about, we pass in null. Continuing:

        int sum = 0;
        for(int x = 0; x < ind2.genome.length; x++)
            sum += (ind2.genome[x] ? 1 : 0);

VectorIndividuals all have an array called genome. The type of this array (int, boolean, etc.) varies depending on the subclass. For BitVectorIndividual, genome is a boolean array. We're simply counting the number of trues in it. Continuing:

        if (!(ind2.fitness instanceof SimpleFitness))
            state.output.fatal("Whoa! It's not a SimpleFitness!!!", null);
        ((SimpleFitness) ind2.fitness).setFitness(state,
            // ...the fitness...
            (float)(((double)sum) / ind2.genome.length),
            // ...is the individual ideal? Indicate here...
            sum == ind2.genome.length);
        ind2.evaluated = true;
        }

Note that Fitness itself doesn't actually contain any methods for setting the fitness, only for getting the fitness. This is because different Fitness subtypes operate differently. In order to set a fitness, we must assume that it's some particular Fitness, in this case, SimpleFitness. Just in case, we double-check first. [If you're just hacking something up fast and you know that you're using a given kind of Individual and a given kind of Fitness, the double-checking is probably unnecessary, but if you change your Individual or Fitness in your parameters, your code may break in an icky way of course.] SimpleFitness defines a fitness-setting method where you provide the EvolutionState object, the fitness value you want to set (for SimpleFitness, this is between 0 inclusive and infinity exclusive, 0 being worst and infinity being better than the best), and a flag indicating whether or not this fitness is the ideal fitness. We do exactly this. Lastly, we mark the individual as evaluated, and we're done! We finish out with an empty version of the describe method, since we don't have anything special to say about individuals:

    public void describe(final Individual ind,
                         final EvolutionState state,
                         final int threadnum,
                         final int log,
                         final int verbosity)
        {
        }

}

Run the program

Close the MaxOnes.java file and compile it. If you're inside the tutorial1 directory, you can run it by calling:

java ec.Evolve -file tutorial1.params

As of ECJ 8, the ideal individual will be discovered in Generation 102, at least on Java VMs which obey strict math (Microsoft's VM does not). The system dumps its statistics into the out.stat file as you requested. Look in the file and note the style of the statistics that SimpleStatistics uses. The last few lines of the file look like this:

Generation: 101
Best Individual:
Evaluated: true
Fitness: 0.99
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Generation: 102
Best Individual:
Evaluated: true
Fitness: 1.0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Best Individual of Run:
Evaluated: true
Fitness: 1.0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

If you'd like the information in a columnar format instead, and don't care what the best individuals look like, you might try using ec.simple.SimpleShortStatistics instead of SimpleStatistics. You can of course modify your parameter file, but it might be easier to simply override a parameter on the command line: java ec.Evolve -file tutorial1.params -p stat=ec.simple.SimpleShortStatistics

...the last few lines look like this:

100 0.9550000005960464 0.99 0.99
101 0.9568999993801117 0.99 0.99
102 0.9563999980688095 1.0 1.0

These columns are: the generation number, the mean fitness of the first subpopulation in that generation, the best fitness of the first subpopulation in that generation, and the best fitness of the first subpopulation so far in the run. You can turn on even more statistics-gathering in most Statistics objects by setting stat.gather-full = true. More than one Statistics object can be defined for a run as well, though that's outside the scope of this tutorial. Remember, you can also change the random number generator seed: java ec.Evolve -file tutorial1.params -p seed.0=4
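If you find yourself reusing the same overrides, they can also be collected into a small parameter file of their own instead of being typed on the command line each time. The sketch below is only an illustration; the file name tutorial1.short.params is hypothetical, while the parameter names come from the text above and the parent.0 chaining mechanism is the same one used by the Origin parameter files later in this section:

# tutorial1.short.params -- hypothetical convenience file
# load the tutorial parameters, then override a few of them
parent.0 = tutorial1.params

# columnar statistics instead of SimpleStatistics
stat = ec.simple.SimpleShortStatistics
stat.gather-full = true

# a different random number generator seed
seed.0 = 4

You would then run: java ec.Evolve -file tutorial1.short.params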

IV.2. Origin

Origin is a Java-based software development platform for developing distributed evolutionary computation and genetic programming applications. It is based on ECJ, a research evolutionary computation framework developed by Dr. Sean Luke of George Mason University, and on the Parabon Frontier Grid Platform. Like ECJ, most of Origin's behavior is dynamically configured at runtime by parameter files, and most Origin classes can be easily subclassed or replaced to extend Origin's operation. Evolutionary computation is a problem-solving method inspired by biological evolutionary processes. A random population of individuals - possible solutions to a problem - is generated and executed, and those individuals that produce better results are favored to breed additional solutions, while poorly performing individuals are eventually eliminated. Evolutionary computation is attractive for problems where the range of possible solutions (the "solution space") is too large to exhaustively test all possible solutions, and where the goal is to determine a "pretty good" solution rather than the best possible one. ORIGIN is licensed under the ORIGIN End User License Agreement.

1. The Parabon Frontier Grid Platform

The Parabon Frontier Grid Platform eliminates traditional computing limitations. Built for extreme scalability, Frontier can employ a virtually unlimited number of computers, providing supercomputation without a supercomputer. It is also inexpensive: Frontier harnesses the unused capacity of existing computers (desktops, servers, clusters, etc.), so the cost of computation is extremely low, and it is a pay-per-use service, so you pay only for the capacity you need, when you need it. Introduced in 2000 as the first commercially available grid computing solution, Frontier gives access and control through a simple, browser-based "dashboard"; guarantees task execution by shielding applications from unreliable computing resources and networking; supports most common computing platforms and is easily adaptable to others; and has powerful applications available today and many more in development, thanks to the Frontier Software Development Kit (SDK) and a rich suite of other Frontier development tools.

Built from the ground up with the most advanced mobile code security capabilities, Frontier has safeguards for both providers and users. It's the most secure grid computing platform on the market.

2. The Origin Distributed Grid Models

Grid computing (or the use of computational grids) is the application of several computers to a single problem at the same time, usually a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data. One of the main grid computing strategies is to use software to divide and apportion pieces of a program among several computers, sometimes up to many thousands. Grid computing can also be thought of as distributed and large-scale cluster computing, as well as a form of network-distributed parallel processing. It can be small (confined to a network of computer workstations within a corporation, for example) or it can be a large, public collaboration across many companies or networks.

The Origin distributed grid models use the Frontier® Grid Platform to distribute the work of an evolutionary run across the machines of a Frontier Grid. An evolutionary run is broken into units of work called tasks, and these tasks are sent to a Frontier Grid Server for scheduling. The server forwards tasks to host computers running the Frontier Grid Engine for execution. When a task completes, its results are returned to the Origin application on the local machine. A properly-written Origin application can run without modification either conventionally on a local machine or as a distributed application on a Frontier Grid.

The grid is best suited to evolutionary problems that have some combination of large populations (thousands to millions of individuals) and computationally expensive evaluation functions. Because of the latency inherent in distributing tasks over the grid, an evolutionary run with a small population may run faster locally. Since changing between conventional and grid models usually requires modifying only a few parameter settings, you can take advantage of this to test your evolutionary model by running Origin locally, then scale up to a much larger population running on the Frontier grid.

Origin supports two grid models: master/slave, where individual fitness evaluations are distributed over the grid, and Opportunistic Evolution (OE), which combines remote generational evolution with an asynchronous steady-state local model. The master/slave model can be used with almost any generational evolutionary model and is the simplest grid model to work with: a properly written generational model will produce identical results in master/slave mode as when running locally. The OE model makes optimal use of the grid and is well suited to very large populations or long evolutionary runs.

The Master/Slave Model

In the Origin Master/Slave model, individual fitness evaluations are distributed over the grid, while selection and breeding are performed locally. To increase performance, multiple individuals are sent to each Frontier task, but the individuals' fitness values are computed independently. The computed fitness values are returned to the Origin application and selection and breeding proceed as usual, followed by launching another set of Frontier tasks to compute the next generation's fitnesses. Most Origin generational evolutionary programs can be changed to a master/slave model with the following parameter file:

parent.0 = original-parameter-file
parent.1 = ${origin.parameter.dir}/com/parabon/ec/eval/master.origin.params

This loads the original program parameters and then the Origin master/slave model parameters. Any parameter settings specific to the master/slave model should also go in this file. You can then switch between master/slave and local mode by specifying the new parameter file instead of original-parameter-file on the Origin command line. Because of the fixed overhead of distributing tasks over the Frontier grid and contention for compute resources with other grid users, each task instance should be sent enough individuals to require at least several minutes to evaluate. On the other hand, reducing the number of individuals sent to each task, and thus increasing the number of tasks used to evaluate each generation, will distribute evaluations over a greater number of grid machines. The optimum balance between the amount of work per task and the number of tasks is usually determined empirically, based on earlier evolutionary runs for the problem. Origin provides two parameters to control the number of individuals sent to each task: job-size (or eval.masterproblem.max-jobs-per-slave) and eval.masterproblem.max-data-per-slave. job-size sets the maximum number of individuals ("jobs" in ECJ terminology) per task; this should be based on the average time required to compute an individual's fitness, so reduce this value as fitness evaluation times increase. eval.masterproblem.max-data-per-slave is best suited to GP problems where individuals grow in size over generations: as individuals get larger and take longer to evaluate, Origin will reduce the number of individuals sent to each task. If grid tasks are failing due to "out of memory" exceptions, decrease this value.
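As a concrete illustration, a master/slave wrapper for the earlier MaxOnes tutorial might look like the sketch below. The file names and the job-size value are hypothetical; the parent.1 path and the parameter names come from the text above:

# maxones.master.params -- hypothetical master/slave wrapper (sketch only)
# load the original program parameters, then the Origin master/slave model
parent.0 = tutorial1.params
parent.1 = ${origin.parameter.dir}/com/parabon/ec/eval/master.origin.params

# assumption: send at most 250 individuals ("jobs") to each Frontier task
job-size = 250

Switching back to a local run is then just a matter of pointing the Origin command line at tutorial1.params instead of this wrapper file.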

The Opportunistic Evolution Model

Opportunistic Evolution (OE) combines remote generational evolution with a local steady-state evolutionary model. OE distributes groups of individuals to remote tasks, where these subpopulations are evolved for a specified number of generations. The remote tasks return their final individuals to the local Origin application, which merges them into its population, then selects and breeds a new subset of individuals to be evolved by a remote task. Unlike the Master/Slave model, where all remote tasks must complete before the next generation is evaluated, OE is asynchronous: as each task returns, the OE application merges that task's individuals into the population, breeds new individuals, and launches another task. To change an existing steady-state evolutionary model to an Opportunistic Evolution model, use a parameter file like this:

parent.0 = original-parameter-file
parent.1 = ${origin.parameter.dir}/ec/steadystate/steadystate.origin.params
generations = local-evaluations
slave.generations = remote-generations

slave.generations is the number of generations each remote task should evolve, while generations is the number of times the population size is returned from remote tasks; e.g. if generations is 10 and the population size is 50,000, the Origin application would complete after 500,000 individuals have been returned from remote evolution. OE also supports the same parameters as Master/Slave for controlling the number of individuals sent to each task, as well as the standard steady-state evolutionary model parameters.
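For illustration only, an OE wrapper for a hypothetical steady-state run with a population of 50,000 might look like this (the file names and numeric values are assumptions chosen to match the worked example above: with generations = 10, the run stops after 10 x 50,000 = 500,000 individuals have come back from remote tasks):

# myproblem.oe.params -- hypothetical Opportunistic Evolution wrapper (sketch only)
parent.0 = myproblem.steadystate.params
parent.1 = ${origin.parameter.dir}/ec/steadystate/steadystate.origin.params

# stop after the population size has been returned 10 times (500,000 individuals)
generations = 10

# each remote task evolves its subpopulation for 50 generations before returning
slave.generations = 50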

V. Experiments

The results of some experiments with our algorithm are shown in the tables below. All experiments used a population of 100 chromosomes, evolved for 200 generations in each of at most 5 runs. Times are shown in seconds. Ten tests were made for each situation, and we compared the success rates of unidirectional and bidirectional search.

For unidirectional search:

Initial random moves   Success rate   Length of solutions (chromosome length)
4                      81%            2, 4, 6, 8, 9
5                      50%            4, 5, 7, 8
6                      30%            7, 10

The variation of the mutation and crossover rates over the generations is shown in the figure below:

[Figure: mutation and crossover rates plotted against generation (1 to 91); vertical axis from 0 to 120.]

For bidirectional search:

Initial random moves   Success rate   Mean success time (s)   Mean failure time (s)
4                      60%            5                       202
5                      40%            4                       232
10                     0%             -                       225