Download an executable jar file containing the Traveller
release "epsilon" Java 1.4 source code, a makefile, the byte
compiled classes, a copy of this document with accompanying
screenshots, some licenses, and other documentation.
Traveller.jar
[I've completely given up trying to present Traveller here
as an applet. Sorry about that. Applets don't work using
standard mechanisms in my non-standard MS-Windows 98/Cygwin
development environment, and I have no computer to test my
web site over whose operation I have control. Writing
software you cannot test properly is just a fast way to a
bad reputation and a nervous breakdown.
Those interested in running Traveller in their own Java
runtime environment are still invited to download the jar
file and enjoy themselves, where it starts out looking like
this. [Screenshots for this document for the epsilon
release of Traveller are from a "nested grid" layout, run a
couple of times to show different features. The problem
size is 337 cities, the population size is 40 members. This
is Traveller's most difficult current layout pattern, much
harder than random layouts because its fitness landscape is
full of very deep local optima potholes, and so far I only
know that it cannot be completely solved in 117,000+ seconds
at the current state of Traveller's heuristics, though I
ran out of patience with only two more fixes needed, easily
determined by eye in this regular layout pattern.]
One would think, as crucial as applet presentation is to the
successful evangelizing of Java, that how to make Java
applets run from browser windows would be as well documented
and supported in Sun's documention as is the Java language
API, but such is emphatically not the case. Grrrr.]
What is this Traveller applet?
Traveller is a Java language application which sets up and
runs a choice of many genetic algorithm heuristics to solve
a particular variety of the famous Traveling Salesman
Problem, while letting the user have a lot of control over
the algorithms used and the problem characteristics.
The spelling of the name "Traveller".
It is best to lay this controversial issue to rest at once.
Scott Robert Ladd
(henceforth "SRL" or "Scott") wrote the original of this
demo as an applet for the ("Blind" version of the)
Symmetrical Euclidean Traveling Salesman Problem, and he
used an alternative spelling "Traveller", for which he has
apparently received endless grief. He came back, on his
spiffy Coyote Gulch web
site (from which Traveller comes), to defend the spelling as
an accepted alternative. Well, yes and no. You will find it
in many dictionaries. It is an acceptable spelling under
British rules (which earlier followed the current American
rules), to double the "l", but the American rules double the
consonant when adding a suffix, only when
-
The word to which the suffix is to be added ends in a single
consonant.
-
There is a voiced vowel before and after the consonant when
the suffix is added.
-
The stress falls on the last (or only) syllable before the
suffix is added.
Which latter is not the case for the word "travel".
Nevertheless, it is Scott's original work, and Scott's
choice, so until he changes his mind, the spelling stays as
Scott and the British do it, and it is we Yanks who will
have to show the stiff upper lip in these trying times.
Besides, the spelling is now so embedded in the existing
code class names and such, it would be a chore to rip it out
at this point, and then the British would just declare war
in retaliation for the insult to their orthography, and... .
What is the status of the Traveller source code?
Scott's original code was published under the GNU Public
License, or GPL. My additions and modifications are
freeware, since I'm not into legalisms. Under the Berne
copyright convention, both Scott and I hold copyright on the
code we separately wrote, automatically. [What the status
is of the stuff mishmashed together I haven't a clue, best
take the conservative approach and assume it is all GPLed.]
My license of my copyright to you is that you may do
whatever you want with my code, whether for commercial or
private use, as long as you don't try to hold me
responsible for the consequences. Credit to me as the
author would be polite, but is not required. Scott's
license to you is contained in the GPL; the most recent
version of the GPL is available online from the Free
Software Foundation, at
GNU Public
License or by surface mail request from:
Free Software Foundation, Inc.
59 Temple Place - Suite 330
Boston, MA 02111-1307, USA.
and also included in the files COPYING and COPYING.LIB (two
considerably different licenses) in the Traveller.jar file.
For purposes of contacting me about my copyright (please
don't), I am most recently living homeless at
Kent Paul Dolan
General Delivery
Clovis, California, United States of America 93612-9999
but I am more usually known and reliably contacted since
1988 as
xanthian@well.com
or known as "xanthian", or since 1986 as "Kent, the man from
xanth", but henceforth will be known here as "KPD".
Glossary
Traveller uses some unfamiliar words, and some familiar
words in specialized ways. Here are the most important of
them.
- city
- codon
- node
- vertex
-
The SETSP (see next section) involves travels through a set
of Euclidean (x,y) coordinate pairs laid out on a two
dimensional coordinate system. From the point of view of
the human "traveling salesman" model, these are "cities".
From the point of view of the graph which the cities and
paths connecting them create, these are "nodes" or
"vertices". From the point of view of a genetic algorithm,
these smallest individual components of the genome
are "codons". In particular, for Traveller, those codons
contain a number which is the city's name.
- chromosome
- genome
- tour
-
The aggregate of genetic information which a genetic
algorithm manipulates in a single reproducing entity, and
which its designer uses to describe the problem in term of
its domain of discourse is precisely a "genome", or more
colloquially a "chromosome". From the viewpoint of the
human "traveling salesman" model, the Traveller genome
encodes the salesman's "tour". This tour, and thus the
genome, is independent of the description that starts with
one particular one of its cities/nodes; it is like a ring,
with no distinguished beginning or end, and this realization
becomes important when implementing genetic algorithms.
- edge
- link
- path
-
The connection by a straight line from one coordinate pair
of a tour to the next is an "edge" from the point of view of
the graph which the nodes define, occasionally a "link" or
"path" from the point of view of the traveler and city
model. What is important to Traveller and to the solution
of the SETSP is the length of this link, and even more the
sum of the lengths of all the links in the current tour,
which sum is the genome's fitness.
What is the Symmetrical Euclidean Traveling Salesman Problem
(SETSP), and why should I care?
Briefly but carefully expressed, the SETSP is this. In the
two dimensional Cartesian coordinate system plane, with the
usual Euclidean distance metric, and distances between
coordinate pairs the same in both directions, given a set of
coordinate pair locations, conceptually "cities", and an
agent, conceptually "a traveler"; then, find for that
traveler the closed circuit path that goes through each of
those cities once and only once, and which has the shortest
total length.
This problem is interesting for several reasons.
-
It is a simplified model of several very fundamental and
commercially important real world problems, such as vehicle
routing problems in operations research, or conductive lead
routing issues in integrated circuit design.
-
It is a visually striking problem, in which human intuition
can often beat the solution time of the best existing
computer programs for modest sized problems.
-
It is a well-studied problem, under constant review by
mathematicians and other researchers for over a century.
[The earliest reference to something identifiably the TSP
seems to date to the 1830's.]
-
It is the "poster child" problem of the research area known
as computational complexity of algorithms. This is so
because of a well known theorem saying that solutions to the
SETSP can be transformed to solutions of other
computationally complex problems at an acceptable
("polynomial function") cost.
-
Which gives the game away; the SETSP is one of the very hard
problems known as "NP Complete", whose best known
closed-form solutions all require non-polynomial
(exponential) time to solve with respect to the problem
size, in the SETSP case sized by the number of elements
(here, "cities") in the problem. Solving even one subset of
this class of problems, such as the traveling salesman
subset, in polynomial time, conceptually solves them "all"
(or at least a large subset of classes of them for which
this has been proven) in polynomial time, though quite
possibly with some much higher order polynomial. This
polynomial time solution is the holy grail being sought for
all this long time by computational complexity researchers.
-
Notice that there are two possible goals for Traveller and
similar code. For the researcher, producing the exact
solution is most interesting. For the integrated circuit
designer, producing a very good solution suffices, there is
no particular payoff in wringing out the last one millionth
of the conductive lead length at the expense of much longer
run times.
So just how hard is "very hard"?
I use 79 cities as my benchmark point, for sentimental
reasons. I call it my "particle death of the
universe"-sized problem (or my "the end of the matter"-sized
problem).
My computer runs at a half-a-gigaHertz clock cycle, and, I
think, does one floating-point operation per cycle. To
evaluate the fitness of an SETSP tour ("genome") takes (at
least) seven floating-point operations ("flops") per city.
My computer is made of matter, largely consisting of
protons. All the protons in the universe are expected to
have disintegrated in 10^100 years (That's a 1 with 100
zeros after it). The universe is currently about
15,000,000,000 years old, a submicroscopic beginning on such
a time-span.
The brute force exhaustive search approach to solving the
SETSP is to form every possible permutation of the cities
(in no particular order of creating the permutations so that
no fitness evaluation shortcuts are available), evaluate the
path length ("fitness") for each permutation, and select the
one with the shortest path length ("best fitness").
In attempting to solve a 79 city problem by the brute force
exhaustive search approach, my computer, sadly, would fail,
as it would see all its protons disintegrate first. Solving
that problem at the speed of my computer would require just
under 9*10^101 years, or just 90 times longer than it will
take the last proton in the universe to disintegrate.
Maybe a bigger computer would help? [Excuse me if I drop a
digit or two answering, the last time I read the needed
numbers was a very long time ago.] Well, if all the matter
in the entire universe, around 10^70 protons and neutrons,
were turned into computers like my computer, it would be
sufficient to make about 10^46 of them. If a 79 city SETSP
problem instance were partitioned evenly among them and a
massively parallel brute force solution attempted, granted
not all the protons would have disintegrated after 10^56
years, but probably all interest in the problem solution
would have disintegrated by then.
So, perhaps the brute force approach is not the best choice.
My best approach to solving the SETSP so far, finds the exact
solution for a 79 city case, occasionally in as little as 40
minutes, though more often in a couple of hours, using a mix
of genetic algorithms and good approximate solution initial
seed genomes. [ UPDATE: as of 2002/03/25, Traveller solves
80 cities with a population of 80 members and "best
practice" settings (on an even grid, so I know the right
answer) in 61 seconds including setup time. Looks like I'm
making progress.]
Paid professionals, using gangs of computers working in
concert, and much more sophisticated algorithms, have
exactly solved problems of around 15,000 cities in under a
year, as their current "high water mark". Literally cities,
in this case, they were working from a modern German map.
The problems of similar complexity to the SETSP, prevalent in
integrated circuit design, feature millions of way-points,
and are not exactly solvable in general in commercially
useful times by any known algorithm. Simulated annealing is
the heuristic algorithm of choice, last I read (and works
well enough that we now have computers that sit on our laps
and run half a gigaHertz).
What is the Blind Traveling Salesman Problem, and why did
Scott choose it?
The traveling salesman (or here, the algorithm which
represents that salesman) is blind when the only
information available to the genetic algorithm part of the
code for use about the path lengths is the total length of
the circuit, not any of the city to city distances that make
up that circuit.
This version of the problem is particularly suitable for
genetic algorithm approaches, which insist on a single
fitness metric. Also, it removes the complexity of various
algorithms that attempt to improve performance by taking
advantage of more local information.
Unlike the purity of Scott's version, mine makes available
many "sighted cheats", to be turned on at the user's option.
Providing good starting seed genomes, as with the minimal
spanning tree seeds, is obviously one such cheat. All of my
permutation algorithms are technically cheats because they
save work by not calculating the full circuit fitness, but
they could in concept just as easily compute and work with
the full fitness, so their approach isn't as big a cheat.
The "optimize near a point" group of heuristics are completely
sighted cheats, they take advantage of knowing city
locations to pick the cities nearest a point in the
play-field to try to optimize some characteristic of the
tour near just those few cities (they should be as awesome
as they seem, too; they is certainly complicated enough
compared to the lean beauty of Scott's Cyclic Crossover,
say). Similarly, when implemented, local optimization
seeds, probably from 2-OPT or 3-OPT (whatever those are)
approaches, when I can steal some working code, or a greedy
algorithm, when I feel like bothering to code one, or that
spiffy relaxation of an ellipse to a solution I've seen
several places on the Net, [done, see "Elastic Net" below]
would all be sighted cheats providing good seeds.
So why does Traveller use a genetic algorithm instead of
doing a direct solution?
Because no one in all that research time has ever
found a polynomial time direct solution, and it is
quite possible that no such direct solution even exists.
Where no direct solution algorithm exists, but the problem
needs solving anyway, computer programmers and other
researchers and practitioners employ heuristics,
methods that seem to work well most of the time but cannot
be proved to work all of the time, and sometimes don't.
Every computer weather forecast you've ever seen was done
with a heuristic, for example. No direct computational
solution to the weather forecasting problem in feasible time
is known to exist. The less understood the problem domain
is, the chancier use of heuristics becomes, thus hurricanes
arrive unexpectedly at your picnics.
However, the programming concept of genetic algorithms is
that they help us find solutions, quickly anyway, for which
no fast, direct solution is known, and where we don't very
well understand the problem domain. Genetic algorithms are
often used for problems whose solution is not well
understood, and they often perform very well, in
commercially successful ways. However, genetic algorithms
themselves are not very well understood.
Understanding them better is one of the goals of Traveller.
What a genetic algorithm does is to imitate that
highly successful technique seen in nature, evolution by
natural selection. Because computer algorithms' generations
can run so much faster than the generation times of nature,
this very inefficient process that takes millions of years
in nature to create a higher brow line, finds solutions much
more quickly in a computer. Typically genetic algorithms
employ these parts.
-
A fitness function, some way of deciding
that one candidate solution is better than another is. In
nature, this function is defined as survival to
spawn a next generation. In a computer genetic algorithm,
the fitness function is used to determine whether a
genome survives into the next generation, or is used for
determining which genomes get to participate in reproduction
in the current generation (or both), at the programmer's
whim.
-
One or more mutation mechanisms, that
create modified next generation candidates irrespective of
fitness, and usually without involving other candidates. In
nature, this might be radiation or chemical influence. In
nature as in the computer, most mutations are less fit than
their parents, so this mechanism is seemingly very wasteful,
but programmers care even less than uncaring nature does
about the culls.
-
One or more reproduction mechanisms, that
create new candidates with combinations of characteristics
of one or more old candidates, probably biasing which ones
get to participate in reproduction based on their fitness.
In nature, this competition for the right to reproduce
occurs in many forms, and reproduction can be sexual or
asexual or both, depending on species. Genetic algorithms
use similar approaches.
-
Some filtering mechanism that selects
fitter individuals with higher probability to survive to the
next generation. In nature this is typically competition
for food, territory, mates, or other scarce resources. In
genetic algorithms, it is some application of the fitness
function. Notice that this is not the same as a guarantee,
either in nature or in a computer program, that the fittest
do survive. Good and bad luck still have full scope
of operation. It is sufficient for evolution to take place
that the fittest have a better chance to survive and
reproduce than do the less fit.
The purposes of the SRL version of the applet Traveller.
Scott wrote Traveller as code to be part of a textbook. It
is meant to be of a tutorial nature, efficient in the small
with fast generation times, but not blindingly fast to find
a solution, not industrial strength, not meant for huge
problem sizes, and not the perfect algorithm (no one has
found one, and if one existed, it would probably be too
complicated for use in a textbook). Keep this in mind when
trying to evaluate Scott's version of Traveller.
[After my hack, slash, and burn attacks changing his code,
Traveller still doesn't have any of those features, they
were in conflict with my goals too, but my version no longer
would serve Scott's purposes at all, it is much too big and
complicated for one thing,
[As of this moment of writing during construction of the
Traveller "epsilon" source code release, the Unix "wc()"
command claims that Traveller contains 97 source code files,
of 33,039 lines of code or 897,299 source code bytes, while
this current document in HTML source code form just passed a
quarter megabyte in size.]
and by design is never "finished"
for another, so any comparison of the two implementations
needs to take into account their quite different goals.]
The purposes of the KPD version of the applet Traveller.
I do research programming in computational
complexity as a hobby, and the SETSP is the natural
choice of the hobbyist programmer. I also needed to learn
Java — Sun's
platform independent programming language, to have any
chance of ever being employed again. Stealing Scott's
working code (for which I'd lusted for a couple of years,
there were so many things I wanted to change) gave
me a chance to start with a functioning framework into which
I could introduce new code of the type I understood, while
learning more from types new to me as I modified them.
I love to mentor. This code is meant to allow others, just
by playing with the applet controls, to gain an intuition
and a deeper understanding of how Genetic Algorithms, and
this SETSP genetic algorithm in particular, perform both when
things are going well, and when problems occur. I wrote
this code mostly to teach myself to have that deeper
understanding, a gut level understanding of genetic
algorithms.
I have no faith in single trick approaches to genetic
algorithms. We use a genetic algorithm when we don't
understand the problem domain well enough to code an
efficient direct solution of the problem, and there is no
more reason to suspect that we understand the problem domain
well enough to select a single heuristic that will work for
all data sets.
I do have "faith" in kitchen sink approaches to genetic
algorithms, and want to test that faith by showing that
genetic algorithms perform better when a mix of heuristics,
metaheuristics, and seeding methods are used simultaneously.
Thus there are as many algorithms as I could reasonably
steal or create embedded in Traveller, with a control panel
to turn them on or off as desired. This kitchen sink
approach is not original with me, but goes back to early
numerical algorithm research literature I read back in the
1960s, long since lost to me. [Other hobbyists, lusting to
see their favorite SETSP heuristic running in Traveller's
mix, are invited to enter negotiations to send me source
code; don't just send it without asking, my email box is
small.]
[I have experience of another kind with genetic algorithms,
too, which can stand as an object lesson to other
programmers. The genetic algorithm concept and resulting
mechanism is so immensely powerful, that adding heuristics
which are full of bugs and don't do at all what is expected,
still can improve the performance of the mix. The
genetic algorithm in nature, after all, runs quite
adequately off the input of mutation and blind chance, so
handing the computer genetic algorithm mechanism a source of
genomes that differ in some new way or according to some new
distribution of changes, is equally important with handing
it a heuristic that is well designed and lovingly crafted.
Thus, the rest of the story is that you cannot trust that
your algorithm is working as you planned, just because the
genetic mechanism can use its output to solve the given
(SETSP or other class of) problems, and that such a seductive
"testing technique" proves exactly that your heuristic
doesn't break the genome badly enough to make it unusable,
and nothing much else.]
I was originally coding this application in MITs StarLogo
for Java, but that language ran out of steam for me. I am
intentionally creating something very complex, and that
language has hard-coded limitations on application
complexity that started to warp my coding into an
unmaintainable mess. It is great for other purposes, and I
still use it, but not for an endlessly growing single
application. Needing to get away from StarLogo, and not
ready to try to write a C++ GUI with the limited tool
libraries at my disposal, I decided to turn to Java with its
rich abstract windowing interface class collection, and
found Scott's applet waiting to be abused.
The Traveller application on the screen.
Traveller has become visually a plethora of framed windows
[some that persist, and some pop-up windows that exist only
as long as the corresponding parts of the code are running],
into which I divided Traveller's screen interface, as my
additions overwhelmed Scott's original one window design.
The Traveller application now consists of the following main
parts.
-
Identifying titles, with various
cutesy sub-titles, in each of the window title bars.
-
A Salesman Makes A
Splash pop-up "splash panel" (term of
art) window that identifies the application and claims
copyrights on it for Scott and for me (of the sort that
permit others to use the code without cost), plus pointing
the user at the web sites where more current versions might
be found. Notice in the information there that my code is
built off of Scott's Traveller 1.2.1 release, and be warned
that he has released subsequent versions with which I have
not (yet) tried (and probably at this point never will try)
to synchronize my code, now vastly different from his
original. Scott had put several of my bug-fixes into his
code as of his release 1.3.0, though, which is collegial and
heartwarming of him.
-
A small Salesman's Doorbells
window containing the main run control pushbuttons and also
the configuration control pushbuttons all in one convenient
place.
While it might seem more natural to keep the configuration
buttons with the configuration panel, there are now
two configuration windows, instead of one
configuration panel, so that wouldn't work any more. Since
the various pushbutton enabled or disabled states comprise
an implicit state machine for Traveller, it is easier for
the user to understand this application state if all the
buttons are in one window together so that the changes
indicating "button enabled" or "button disabled" to each
when one is pushed are more noticable. Here is what the
buttons mean and do.
-
Stop
Stop the genetic algorithm at the end of the next
generation, putting the applet into a mode where the
algorithm may be continued on the existing problem setup
with the Resume button. The Change Configuration button is
enabled. The Reset button is enabled. The Create New
Cities button is enabled. The Test Again button is enabled.
Because Traveller doesn't respond fully to the Stop button
until the end of a whole generation, I added a
Salesman's
Progress persistent pop-up window that
displays the genome-by-genome progress of Traveller through
a single generation, and "generation complete" at the end of
a generation, so that the user can gain some idea of when it
is safe to go forward modifying stuff.
Changing, in particular, the valuator panel values
before Traveller comes to a complete stop generates
scads of error conditions and messages. If you are
sufficiently unlucky, it will also crash Traveller (and
possibly your operating system if you run one of the tender
ones crashable by application software). If you are running
a sufficiently slow setup of Traveller (large problem and
population sizes) that waiting until the end of a Traveller
generation is unacceptable, use the tools your operating
systems provides to end an application program early (Unix's
shells' kill() command, MS-Windows Task Manager, or similar
tool) to stop Traveller cleanly, rather than trying to
change Traveller's setup in mid-generation). Then, when the
dust has settled, start Traveller over from scratch.
-
Resume
Pick up the running of the genetic algorithm from where it
was paused with the Stop button, (say to recapture some CPU
cycles on a personal computer for a while for some other
compute-bound task), correcting times to omit from the
status accounting the time during which Traveller was
paused. When Resume has been pressed, most other buttons
are disabled as described for the Start button.
-
Start
Start the genetic algorithm running, from a fresh starting
point, either when the program is first started, or after a
Reset, Create New Cities, or Set Configuration has had time
to take effect. The Start button disables almost all the
controls (see the checkbox descriptions for exceptions)
except the Stop and Test Again buttons.
-
Test Again
This is a convenience button. It does the equivalent of
pressing in order the Stop, Create New Cities, and Start
buttons. I use this when running lots of short, identically
set-up problems in a row for statistics gathering purposes.
After it is pressed, pretty much everything else is disabled
as described for the Start button.
-
Reset
Sets up the same problem with the same cities (but a
different state in the random number generator, so you don't
get the same run), ready to run again when the Start button
is pressed. This is interesting for checking variations in
execution times or solution success by selected genetic
algorithm heuristics and metaheuristics on a single problem
data set. After Reset does its work, the applet is still in
the state as described for the Stop button, except that the
Start instead of the Resume button is enabled.
-
Create New Cities
Sets up a new problem with a new set of city locations, but
otherwise with an unchanged configuration from the previous
run. This is interesting for testing a configuration setup
on a series of identically-sized problems. I use it for
statistics gathering. After Create New Cities does its
work, the applet is still in the state as described for the
Stop button, except that the Start instead of the Resume
button is enabled.
-
Change Configuration
This button enables the two configuration windows'
components for changes.
-
Set Configuration
This button disables the components in the two configuration
windows for changes, and stores the settings the user has
created into the program's control variables. [There are a
few exceptional components that stay enabled always, see the
checkbox descriptions for details.]
-
Default Configuration
Traveller has a default set of valuator and checkbox states
at startup time, this restores them, as a shortcut
alternative to putting them all back in place by hand. The
configuration windows' components are still enabled after
this button is pressed, for possible further user changes.
-
Enable All Heuristics
With so many heuristics, turning them on and off has become
a chore. These three "Heuristics" buttons provide
shortcuts. This first shortcut turns on all the heuristics
checkboxes.
-
Disable All Heuristics
This second shortcut turns off all the heuristics
checkboxes.
-
Randomize Heuristics
What fun! This third shortcut turns on a randomly chosen
subset of all the heuristics checkboxes.
-
A Salesman's
Route window containing a graphics canvas
within which the GA progress is given visual expression.
The canvas background is color "black". Inside the frame of
the canvas, a ten pixel black border is left vacant around
the area within which the conceptual traveler travels.
Three things can be seen on this canvas.
-
The city locations appear as small ovals in color "cyan".
-
The path of the traveler circuit for the best-fitness genome
found so far appears in color "yellow".
-
Optionally depending on some of the control checkbox
settings, the paths of all the other traveler circuits for
genomes currently in the population appear in color "red".
-
Various genome seeder pop-up windows,
of the style of the Salesman's Route window, showing
progress of those seeders that take more than some nominal
amount of time and have some interesting behavior to
display. All the seeder windows stay open until all the
seeders are done. Associated with the set of seeders is a
small pop-up ephemeral counter window,
Salesman's Wild
Oats (sorry about that, couldn't resist),
with a count of the number of "seeds sown" up to the
population size selected by the user, which also stays open
until all the seed genomes have been selected.
The following seeders (besides random seeds,
which didn't seem worth a display window to this author)
have been implemented with their own pop-up windows so far.
-
A temporary pop-up
Minimal Spanning
Tree window containing a graphics canvas
within which the minimal spanning tree population genome
seeding progress is given visual expression. This window
closes by itself as soon as all seeders have finished their
tasks.
The canvas background is color "black". Inside the frame of
the canvas, a ten pixel black border is left vacant around
the area within which the minimal spanning tree is built and
then used. Three things can be seen on this canvas.
-
A temporary pop-up Elastic
Net window containing a graphics canvas
within which the elastic net population genome seeding
progress is given visual expression. This window closes by
itself as soon as all seeders have finished their tasks.
The canvas background is color "black". Inside the frame of
the canvas, a ten pixel black border is left vacant around
the area within which the elastic net is built and
then used. Three things can be seen on this canvas.
-
The city locations appear as small ovals in color "cyan".
-
The waypoint nodes of the elastic net graph as it is being
constructed appear as small ovals in color "pink". These
will be seen to move across the canvas as the waypoints are
attracted to the location of the city nodes, by which means
the elastic net is deformed to approximate a traveling
salesman's tour through the city nodes.
-
The edges of the elastic net graph as it is being
constructed appear in color "green". These, of course, move
with the waypoints. The elastic net graph starts as a
circle in the middle of the canvas, and distorts as time
passes until it approximates a traveling salesman's tour
through the city nodes. Invisibly to the user, sample
genomes are captured from the elastic net as it distorts, by
taking in order around the elastic net the cities closest to
sequential waypoints (it's more complicated than that, but
that is the essense of what happens). As the elastic net
distorts, the order of closest cities changes, thus sampling
at many stages is fruitful and produces varying genomes at
different points during the distortion of the elastic net.
-
Various temporary pop-up visual
debugging windows, styled like the
Salesman's Route window. There is one of these for each
heuristic, and the set of them is controlled by the "Enable
Visual Debugging" checkbox in the "Salesman's Skills"
window, described below. Each has the Java class name of its
corresponding heuristic as its window frame title bar entry.
With Traveller's current richness of heuristics, it is quite
possible to enable more heuristics than you can possibly
find room on your screen to accomodate the corresponding
visual debugging windows. You can take two attitudes toward
this. Be serious, and use these windows as real debugging
tools, only turning on as many heuristics as you can
separately panel visual debugging windows, usually four on
my hardware. Be frivolous, and use these windows for fun
and to amuse or terrify your friends. The windows are
equipped with a "window to front" mechanism, so that the
window for the currently running heuristic always pops to
the front for viewing (right through any other application
you might want to see instead). You can either leave the
set of them stacked where they open, in the screen upper
left corner, and just watch what the top one is doing while
it is visible, or you can drag them and drop them messily
all over one another across your screen, and have them
popping forward like the moles pop up in a Whack-a-Mole
game, leaving usually several visible at once.
In general, the visual debugging windows show a "setup"
stage to show the genome or genome pair with which the
heuristic begins, one or a series of ongoing effort "step"
stages, and a "done" stage when the heuristic has developed
its final answer. These behave slightly differently for
asexual or consultive reproducers, which have one
progenitor, and for sexual reproducers, which have two.
In the former, less complex asexual (or consultive, it works
the same) reproducer case, the setup stage shows the parent
genome in red. The step stages show the current candidate
genome ("on top") in green, the next earlier generation
candidate genome (rarely, it turns out, visible) ("in the
middle") in pink, and the next earlier candidate genome
differing from the middle genome, ("on the bottom") in red.
Only when two successive internal steps of a multiple step
heuristic produce different candidate genomes does the
middle one show.
In the latter, more complex, sexual reproducer case, the
setup stage shows one of the genome parents pair in pink,
the other in red. The step stages show the genome parent
pair in the same colors, and overlaying them (so that only
unmatched edges of one or the other parent show through) the
current trial child genome in green. The done stage shows
the parent genome pair in the same colors, and overlaying
them, the product child genome in yellow/gold. Unless the
done stage shows all three colors, one or another parent got
propagated forward to the next generation (and so is
completely hidden by the identical child genome drawn after
and so on top of it), no different child was produced.
In either case, the shift from a green foreground genome to
a gold foreground genome indicates that the individual
heuristic has reached its "done" stage. For the simpler
Traveller heuristics, which have only one internal step, the
step stage isn't too interesting, nor is it visible for too
long. In the more complex heuristics, the slow running
ones, watching the heuristic run is worth the price of
admission to Traveller in and of itself. Things fly by just
a little too fast to see the details of the individual
steps, but for some heuristics, a beautiful general pattern
of changes emerges.
It is well worth mentioning that Traveller runs
much slower with visual debugging enabled, but you
expected that, right? Besides the heavy computational
demands of drawing all those genomes, each heuristic pauses a
full second for you to admire it at its "done" stage before
allowing another heuristic to kick off its processing, when
visual debugging is enabled. [Update: Now Traveller just
runs slower with visual debugging on, and the
windows are much more lively, since I got smart enough not
to redraw visual debugging window contents for heuristic
loop steps that led to unchanged images and genomes.]
-
A Salesman's
Workload window for setting the Traveller
numerical values which are under user control. Its
components are automatically disabled when Traveller is
running after the Resume, Start, or Test Again buttons have
been pushed. The Stop button re-enables its enabler, the
Change Configuration button. This window contains several
parts.
-
A control panel title line —
"Values Configuration:".
-
A set of labels and text valuators with
associated push-buttons, for entering numerical data used to
control Traveller.
Besides their associated labels, the valuators consist of
nine pieces, four decrement push-buttons, the current value
text field display, and four increment push buttons. The
push-buttons are labeled "-M", "-C", "-X", -I", "+I", "+X",
+"C", "+M" for changes in the unit of interest, either units
or hundredths, expressed in Roman Numerals (because they can
be one character wide, so the push-buttons can be nicely of
equal width).
-
Number of Cities
Number of Cities is the "N" in the computational complexity
expression of problem difficulty, and the SETSP problem
complexity rises as N! or "N factorial", very fast growth
indeed. Number of Cities has a minimum of three cities,
for which any permutation is a solution. Fairly modest two
digit numbers provide problems complex enough that the
universe would burn out and die before an approach of simply
forming every possible path and evaluating their fitnesses
to choose the shortest could finish running on the world's
fastest computers. Start this value small and ease up
gradually. Remember, changing N from 19 to 20 isn't making
the problem one unit harder, it is making the problem
twenty times harder.
-
Population Size
The population size chooses how many travelers (the usual
term of art is "genomes", or in Scott's term, chromosomes)
are going to be working and competing and mutating and
breeding and cross-breeding to solve the problem. It has a
minimum value of three, catering to the inver-over algorithm
in particular, which needs self and at least two other
genomes with potentially differing fitnesses to succeed.
Big populations find solutions in fewer generations, but not
necessarily in less time, since big populations eat a lot of
processing resources. Population sizes from half to four
times the number of cities chosen seem to work well.
Big populations are pretty much a waste of time if neither
crossover heuristics nor the one consultive heurisic are
enabled; each genome is then being solved as an independent
problem, so, if this is what you want to test, probably you
should set the population size as small as allowed. [There
is some argument for large populations, though: with lots of
chances, maybe at least one of them won't get stuck in a
local optimum for unbearable amounts of time, and so will
produce a solution in reasonable time. Feel free to explore
this and other issues, that exploration is exactly why
Traveller exists!]
Remember too, that Java is a resource hog, and some
combinations of number of cities and population size will
drive the problem from real memory to virtual memory. At
that point it is time to find something else to do while the
problem runs: get married, raise a family, choose a burial
plot. Virtual memory spillover creates that kind of waiting
times. Alternately, those combinations will cause the Java
interpreter to freeze up completely, taking tender operating
systems along with it.
Use good sense. On my particular set-up, a city count of
around a thousand, and a population size of around five
hundred, or vice versa, is pretty much at the ragged edge of
the possible. Such a problem runs for several days
from a purely random genome seeding before anything
resembling a solution emerges.
[UPDATE: Now that I've finally, as of release gamma, managed
to virtualize the size order N squared city distances array
in TravellerWorld.java, I can bring up much larger city
counts, though the product of population times city count is
still a limiting factor. Bringing up a 10,000 city problem
with a population of 20(!) with a minimal spanning tree
seeding took 724 seconds of setup, almost all of that MST
overhead, the first generation with its extra processing
requirements took 214 seconds, and subsequent generations
with all available heuristics enabled took around 154
seconds each. Average starting tours were around 37,700
units long, over fifty times shorter than the randomly
seeded case (below). The Salesman's Route window's canvas
was about 50% city-cyan colored when it opened, and
presumably lots of cities' integer pixel coordinates drawing
locations (as opposed to their double precision floating
point exact locations used to compute tour solutions) were
identical, so that their dots overlaid, underemphasizing the
crowding.
With random seeding, the generation times were about the
same (279 and 283 [I wonder why the difference in the
numbers compared to the MST seeded case; perhaps more pixel
painting time for the longer paths?], average starting path
lengths were in the 1,980,000 range, but the setup time was
only 9 seconds, and the canvas was solid yellow instead of
speckly cyan.
I'm not claiming that Traveller can do anything
useful with such a monster (though by the fifth
generation the randomly seeded case had found an improved
best genome ever), nor, despite Moore's law, am I likely
ever to see a laptop computer where that would be true, but,
like teaching a dog to dance, getting Traveller to do this
at all was still quite an accomplishment. Traveller can now
at least contain SETSP problems of the order of
magnitude of the largest irregular layout ones ever solved
exactly.]
-
Mutation Rate 0.01%
This controls how large a proportion of the population is
mutated and put into the new population without regard to
fitness, from 0.00% to 100.00%. A 2.0% mutation rate is a
very large one, which is why the valuator provides steps of
one one-hundredth of a percent. Mutation puts entropy back
into the population that the rest of the heuristics are
designed to remove. Sometimes, the algorithms get stuck on
a local optimum and cannot escape it to find the global
optimum, in which case adding some mutation for the next run
can provide the needed nudge. It is a good idea to try
doing some runs with mutation set to zero for a while before
adding any, though, to see how well Traveller can do. Many
heuristics are too tender to abide much mutation, and the
average fitness using them hovers far away from the optimum
solution if the mutation rate is set high. Mutation also
interferes with how long adaptive permutation effort, new
with Traveller's delta release, can afford to wait before it
decides to up the effort level.
-
A Salesman's
Skills window for enabling or disabling
the various features of Traveller that are under user
control. Its components are automatically disabled when
Traveller is running after the Resume, Start, or Test Again
buttons have been pushed. The Stop button re-enables its
enabler, the Change Configuration button. This window
contains several parts.
-
A control panel title line —
"Features Configuration:".
-
A large set of labels and checkboxes for
turning on or off various features of Traveller.
-
Layout Controls
Layout controls affect the locations of the cities through
which the traveler does his tour. Some layouts have known
exact solutions, the solutions of others can only be
estimated by eye.
-
Circular Layout?
The circular layout puts the cities on the circumference of
a circle in random positions. It is interesting because it
has a known and easily verified solution, and is thus useful
for knowing when a run is complete, for taking statistics of
the performance of various heuristics and configurations. I
fixed a bug in Scott's implementation that both displayed
and located the cities on rounded integer grid locations;
this can end up with a solution length longer than the
circle's circumference due to taxicab right angle turns to
walk around the polygon, an unsatisfactory situation indeed.
By keeping the actual city locations as double precision
reals, and only the display locations as long integers, I
fixed this artifact within the limits of detectability.
This and the next five checkboxes are a set of radio
buttons; you cannot turn one off, you must turn a
different one on, instead, so that there will always be
exactly one valid selection.
-
Even Grid Layout?
The smallest integer square at least as large as the number
of cities is computed, a grid with that many locations is
provided, and then the user selected number of cities is
laid out top to bottom left to right in a compact layout on
that grid until all cities have locations, which may or may
not mean the grid is filled. Even numbers of points can
always be connected by a path that has only horizontal and
vertical "unit" segments, odd numbers of points can always
be connected by a path that has all unit length steps except
for one diagonal step. Thus this is interesting as another
layout with a known solution. Thanks to Onno Waalewijn in
Norway for this suggestion.
-
Fractal Layout?
I have the C code for this [see my computational complexity
web page at the URL above for a link to the Fractal TSP web
pages somewhere in the Portuguese-speaking world], it just
needs porting, and it may end up as three choices, the
original research paper provides three. The layout is built
with an L-System parser, using the program input format for
the FRACTINT freeware program, forming a fractal pattern
with an easily computed and visually obvious minimum
solution. The research paper described heuristic approaches
that treated fractal layout cases with a million and a half
cities. I don't expect Traveller will be doing that on any
hardware I'll ever own. The eventual implementation will
again be to form the smallest layout that will at least
contain the number of cities selected by the user, and then
fill that until we run out of cities.
-
Nested Grid Layout?
This is similar to the even grid layout, and also sparked by
Onno's suggestion. Here, each square of the original grid
has a smaller square nested inside it. There is a visually
obvious best solution, but I lack words to describe it.
This is a very challenging layout for Traveller, much harder
than randomly placed points. It nearly inevitably gets
locked on a local optimum with a death grip. Until I
finally got simulated annealing working as a metaheuristic
wrapper around the various heuristics, there were many cases
of this layout I'd never seen Traveller bring to an exact
solution. The 52 point fully-populated-grid case, with a
population of 35, was my first big breakthrough, in a mere
19235 seconds. [Traveller has made some rather stunning
progress; as of release epsilon, Traveller, just tested,
finds an exact solution for that case in 22 seconds.] In
general, the solution should always consist of horizontal
lines, vertical lines, or lines running at a 45 degree
angle, but almost always, a diagonal at another angle sneaks
in and won't go away. This is by far the most interesting
layout to watch run, so far.
-
Poly Spiral Layout?
Here, during layout, a polygon is spun while shrinking, to
provide inward spiraling trails of cities; at this writing
there are five arms of spirals, it may differ when you see
it. Choosing an odd number makes the problem more difficult,
because for one arm the traveler cannot step to the next one
over upon reaching the center and continue back out. This
is one of the controls that makes me wish I had more screen
room or better understood popups in Java. It would be nice
for the user to be able to set the number of polygon sides
and the spiral turning rate. As is, though, the traveler
best path has some unexpected behaviors as numbers of cities
grow large, which I'll leave it to your pleasure to
discover.
Try this first with 50 cities and a population of maybe
three times that size, to see what I was expecting to see
happen. Next, try this with 1000 cities, a population size
of 250, and the minimal spanning tree seed heuristic to see
a pretty picture and the serendipitous result. Don't wait
to see it run to completion; running right now as I type
this (release gamma), it runs a generation in about 30
seconds, and has made only four improvements in the last 150
minutes.
-
Random Layout?
This is the original layout for the problem. It doesn't
have some easy way to check that the solution is optimum,
except experience and the human eye. It is fun to watch
because of all the cartoons a random number generator will
draw for you on its way to a solution.
-
Closed Path?
[This checkbox and capability has been removed; it was too
messy to try to keep all heuristics capable of doing both
open path and closed path calculations; now all Traveller
problem instances are the classical closed path case.]
-
Perturb Cities?
To the possible objection that a gridded layout is somehow
"too easy" [as came up on the Net from several functional
innumerates in response to Scott's circular layout], this
checkbox comes to the rescue. Turn it on, and an invisible
but sufficient fuzz is applied to the city's x and y
coordinates, so that no two distances between cities should
be the same, the differences are sufficient to change
the evaluation of fitness, but the (now unique) solution is
among the subset of expected solutions for the unperturbed
case, and thus still easily recognized.
-
Sexual Reproducer Heuristics
Sexual reproducer heuristics are those which create child
genomes as a mix of the codon arrangements of two parent
genomes. Because the SETSP involves a permutation genome,
all reproducer heuristics must maintain the permutation
invariant. This is especially hard to design for useful
sexual reproducer heuristics, which is in part why there are
fewer of them than of asexual ones.
-
Cyclic Xover?
-
type
Cyclic crossover is a sexual reproduction heuristic, where
two parent genomes contribute a portion of their data to
produce a new child genome which is a mixture of the two.
This is a classical "blind" heuristic, not dependent on
local city to city distance information, but only on global
"whole genome" fitness to achieve progress.
-
capability
Cyclic crossover, mixed with a bit of mutation, is a
self-sufficient TSP solver.
-
intent
Using the property of permutations known as cycles,
trade one cycle of nodes between the parent genomes "in
place", while preserving the internal order of the cycle in
the resulting child genome as read "left to right".
-
mechanism
Pick an arbitrary starting point in one parent genome, and
an arbitrary orientation of the two genomes to each other.
Swap codons at the matched locations in the genomes. Then,
for the codon which just moved into the first genome, find
the duplicate copy it made redundant, and perform a similar
swap at that location. Continue doing this until the mate
of the first codon selected gets moved to the initial
parent. This completes a cycle; if fewer than the
whole of both genomes got swapped (a possible degenerate
case), then a crossover has been accomplished, and
each parent is now comprised of a set of its original codons
in their original order, intercalated with a set of codons
from the other parent, retaining the order they had in that
parent. The more fit resulting genome is then propagated as
the sole product of the union.
-
special considerations
This is the cyclic crossover as described on Scott's site in
his genetic algorithms write-up. I mended it to cater to a
TSP permutation genome's being logically a ring, with
fitness invariant under rotation or reflection, not an array
with two ends as it was treated in Scott's original code.
Those changes removed some artifacts and made it a stronger
heuristic. Looking at its usual result in the visual
debugging window, you wonder how it ever helps solve the
TSP, but it is amazingly successful despite appearances. It
helps to remember that the essense of genetic algorithms is
culling, to see why. Producing mostly garbage is
OK as long as a heuristic coughs up the occasional pearl. I
also (as of Traveller release "epsilon") mended this
heuristic not to put forward (the copy of) the one of the
two parents selected on the basis of fitness (but instead
the parent selected in sequential progression), in the
degenerate case where the entire genomes were just swapped
without effective change, to prevent counterproductive
losses in population diversity which the putting forward the
(usually more fit) fitness selected parent's clone was
causing.
-
Edge Preserving Xover?
-
type
Edge preserving cyclic crossover is another sexual
reproduction heuristic. This is a "sighted cheat" because
the post-crossover bin flipping pass makes use of local
city-to-city distances in the interior of the genome in its
operation.
-
capability
Edge Preserving Cyclic Crossover is immensely powerful at
homogenizing a population, too powerful for its own good, in
fact. Even mixed with fairly heavy mutation, it just smashes
the population diversity quickly to zero, so that it needs
to be mixed with other heuristics that promote, rather than
eliminate, population diversity. It is one of the
heuristics which cannot usually solve a TSP instance alone,
but profoundly speeds up solutions by heuristics which can,
but which need help organizing parts of solutions into whole
solutions.
-
intent
Some time ago a Net correspondent and I talked about his
edge preserving crossover web site, and I noticed to him
that, as in Scott's case, he needed to treat the genomes as
rings that could slide past one another or reverse
directions before crossing over. He thanked me profusely
and went off to make the change, but I never heard back,
lost his web page URL, and forgot the details of his
algorithm. The idea stuck in my head, though, since
arguably solving the SETSP consists exactly of preserving the
correct set of edges for the solution and accumulating
correct edges until they are all in place.
-
mechanism
[There is a whole separate HTML document in the
Traveller.jar file describing this complicated heuristic, so
I'll just summarize here.] I've invented a new-to-me edge
preserving cyclic crossover, now a part of Traveller. It combines
Scott's cyclic crossover with a "binning" concept, where
edges or sequences of edges held in common by both parents
become a bin of cities kept in their same order, and the
thing that cyclic crossover crosses is bin contents instead
of individual cities. The mechanism is fairly complex.
First, genomes are converted to edge lists for each parent,
with edges in a canonical form with the lower sort-ordered
codon first. A copy of each edgelist is then sorted
alphabetically by edge-start codon, then edge-end codon, in
canonical edge form. The sorted copies of the edgelists are
then easily walked to detect edges which occur in both
genomes. These are then flagged several places, including
the unsorted copies of the edgelists. Individual edges
which are shared are then (carefully, false positives are
easily evoked) analyzed to detect common sequences of shared
edges (which may occur in the same or in reversed order
between the two genomes). Longest shared common sequences
of edges, or individual codons where they are not part of
any shared edge, then are treated as bins; as a
result of the analysis, it is guaranteed that exactly the
same bins, though in different order and orientation, occur
in each parent genome. The bins are then used to perform a
cyclic bin-wise crossover, just as described for the cyclic
crossover heuristic, but where whole bins replace individual
codons at each swap. Lastly, since common subsequences of
shared edges may have been reversed with respect to the
opposite parent genome during crossover, a(n affordable to
compute) subset of all possible flipped orders of the common
subsequences of edges is analyzed, and the best choice of
flips is kept as the child; of the two children thus
produced, the one having superior fitness is propagated as
the sole offspring of the crossover.
-
special considerations
The result of this crossover is a child genome that contains
all the edges or sequences of edges shared between the
original parents, in some (probably new) arrangement, and a
mix (on average) of half the remaining codon arrangement
from each parent. The edge preserving cyclic crossover
algorithm now lives under this checkbox, much like Piglet
lives under the name of "Trespassers Will" [short for
"Trespassers William", Piglet says].
-
Ordered Xover?
-
type
Ordered crossover is another sexual reproduction heuristic.
It is also a variant of "two point crossover". This is a
classical "blind" heuristic, not dependent on local city to
city distance information, but only on global "whole genome"
fitness to achieve progress.
-
capability
Ordered crossover, mixed with a bit of mutation, is a
self-sufficient TSP solver.
-
intent
Ordered crossover trades contiguous blocks of codons in a
swap between parent genomes, in the hope that a useful bit
of missing tour excellence can thus be passed to fit with
excellence left behind in the opposite partner.
-
mechanism
First, the two parent genomes are put into general
orientation with respect to one another. Then for a same
length matched marked block of codons, the algorithm swaps
them between parent genomes, doing a bunch of cleanups
afterward to restore the requirement that the resulting
genome be a permutation genome. The difference between this
heuristic and partial match crossover lies not in the two
point crossover used, which is identical, but in the
cleanups. Ordered crossover does its cleanups by sliding
all the unaffected codons to remove the gaps (caused by
crossover's making some codons duplicates) "to one end",
then filling in the missing codons (missing because they
just got swapped to the other parent) in their original
order into the contiguous empty space then made available.
Scott's site has a good description of this mechanism, read
it there. Here is a picture showing an ordered crossover
step by step, with two views of the same crossover shown,
the style Scott uses on the left, and a simpler one taking
advantage of the fact that permutation genomes are rings,
not arrays, and so functionally invariant under rotation,
shown on the right.
|
SRL Ordered Crossover, a Two Point Crossover variant
|
|
|
Show the three codon crossover region in the middle.
|
Rotate the view left first to put the crossover region at
one end.
|
|
Original genomes with crossover region marked.
|
AB|CDE|FGHIJK
GK|HBA|JIDECF
|
CDE|FGHIJKAB
HBA|JIDECFGK
|
|
Lift out matches for incoming codons, set them aside in the
order in which they'll be used.
|
HBA
--|CDE|FG-IJK
GK|HBA|JI---F
CDE
|
HBA
CDE|FG-IJK--
HBA|JI---FGK
CDE
|
|
Slide unaffected codons left-with-wrapping to make a hole
into which to move the local swap codons.
|
HBA
--|CDE|FGIJK-
--|HBA|JIF---
CDE
|
HBA
CDE|FGIJK---
HBA|JIFGK---
CDE
|
|
Slide local swap codons left-with-wrapping into the hole
just created to make room for the remote swap codons.
|
HBA
DE|---|FGIJKC
GK|---|JIFHBA
CDE
|
HBA
---|FGIJKCDE
---|JIFGKHBA
CDE
|
|
Drop the remote swap codons into place in the space created.
|
DE|HBA|FGIJKC
GK|CDE|JIFHBA
|
HBA|FGIJKCDE
CDE|JIFGKHBA
|
|
KPD Ordered Crossover, a variant on a variant
|
|
|
Show only the view with the swap region rotated to the left
end.
|
|
Original genomes with crossover region marked.
|
CDE|FGHIJKAB
HBA|JIDECFGK
|
|
Set everything aside, making room for a copy. In that copy
assign a left area for incoming codons from the crossover
region of the other genome, assign a right area for all the
other codons from the current genome.
|
---|-------- CDE|FGHIJKAB
---|-------- HBA|JIDECFGK
|
|
Walk each genome left to right. If the codon encountered is
one of the incoming swap region codons, instead of copying
it at all (we'll get its equivalent eventually), copy the
next codon in order from the donor parent's swap region into
the next available codon position in our own swap region.
Otherwise, copy the codon to the next available codon
position in our own non-swap region. Notice again in the
diagram to the right that this means that when a swap codon
is encountered in our own codons, the codon written to the
child is not necessarily the codon we just encountered, as
flagged by the asterisks in the cases where this is
occurring. In the case where it is the same, that is just by
coincidence. [To make the diagram easier to follow, codons
on the right are lowercased as those positions are used.]
|
---|C------- cDE|FGHIJKAB nonswap
---|H------- hBA|JIDECFGK nonswap
---|CD------ cdE|FGHIJKAB nonswap
---|HB------ hbA|JIDECFGK nonswap
---|CDE----- cde|FGHIJKAB nonswap
---|HBA----- hba|JIDECFGK nonswap
---|CDEF---- cde|fGHIJKAB nonswap
---|HBAF---- hba|jIDECFGK nonswap
---|CDEFG--- cde|fgHIJKAB nonswap
---|HBAJI--- hba|jiDECFGK nonswap
H--|CDEFG--- cde|fghIJKAB swap
C--|HBAJI--- hba|jidECFGK swap *
H--|CDEFGI-- cde|fghiJKAB nonswap
CD-|HBAJI--- hba|jideCFGK swap *
H--|CDEFGIJ- cde|fghijKAB nonswap
CDE|HBAJI--- hba|jidecFGK swap *
H--|CDEFGIJK cde|fghijkAB nonswap
CDE|HBAJIF-- hba|jidecfGK nonswap
HB-|CDEFGIJK cde|fghijkaB swap *
CDE|HBAJIFG- hba|jidecfgK nonswap
HBA|CDEFGIJK cde|fghijkab swap *
CDE|HBAJIFGK hba|jidecfgk nonswap
|
Notice just which and how much order was
preserved in Scott's version. The unaffected codons
retained their order among themselves, modulo collapsing the
holes created by the removed codons; the local swap codons
retained their contiguous order, and the remote swap codons
retained their contiguous order as well. Ordered crossover
lives up to its name!
In my version, "all" of the order in non-swapped codons were
retained, but at the expense of not ending up with the same
final result that Scott does. The result, in this example,
which I don't think is merely a coincidence, is
that my nonswapped codons as a set are in the same order as
Scott's nonswapped codons as a set, just with an internal
rotation as if they were a subring within the main genome
ring.
-
special considerations
This is almost the ordered crossover as described on Scott's
site in his genetic algorithms write-up. I mended it to
cater to a TSP permutation genome's really being a ring,
with fitness invariant under rotation or reflection, not an
array with two ends as in Scott's original code, which
removed some artifacts and made it a stronger heuristic. In
the process, I simplified it a bit, so that the output of
the crossover is not the same as in Scott's original
version (and the illustration above is of both versions,
since I wrote it using Scott's documentation), but the code
still adheres to the spirit of Scott's write-up, just avoids
some problems in things he forgot to discuss there.
-
Partial Match Xover?
-
type
Partial match crossover is another sexual reproduction
heuristic. It is also a variant of two point crossover.
This is a classical "blind" heuristic, not dependent on
local city to city distance information, but only on global
"whole genome" fitness to achieve progress.
-
capability
Partial match crossover, mixed with a bit of mutation, is a
self-sufficient TSP solver.
-
intent
Partial match crossover has the same intent as for ordered
crossover: hope to capture a substring of excellently
ordered codons from one parent, and move them to another
parent whose excellence so far is concentrated elsewhere.
-
mechanism
First, the two parent genomes are put into general
orientation with respect to one another. Then for a same
length matched marked block of codons, the algorithm swaps
them between parent genomes, doing a bunch of cleanups
afterward to restore the requirement that the resulting
genome be a permutation genome. The difference between this
heuristic and ordered crossover lies not in the two point
crossover used, which is identical, but in the cleanups.
Partial match crossover does its cleanups by removing from
the first parent the codons made redundant by the codons
copied from the second parent in the crossover step, then,
for paired codons from the crossover region, replacing the
ones just removed as redundant, with the opposite numbers
gone missing by being swapped into the second parent, in
place in the first parent at the locations where the
redundant codons originally resided (i.e., there is no slide
step as for ordered crossover). Scott's site has a good
description of this mechanism, read it there. The only
caveat is that his site's description fails to consider what
happens when a city-codon occurs in both sides of the
crossover region, but the Traveller code does make that
complication work correctly.
-
special considerations
This is the partial match crossover as described on Scott's
site in his genetic algorithms write-up. I mended it to
cater to a TSP permutation genome's really being a ring,
with fitness invariant under rotation or reflection, not an
array with two ends as in Scott's original code, which
removed some artifacts and made it a stronger heuristic.
-
Rolling Xover?
-
type
Rolling crossover is a sexual reproducer heuristic. This
may be a new crossover technique for permutation genomes, it
is at least new to the current author. It is a multipoint
crossover with the number of crossover points determined by
the data rather than set in advance, and with some crossover
opportunities taken and some not, depending on their ability
to contribute to improved fitness. This is a "sighted
cheat" because the entire crossover point selection and
donor parent selection is making use of local city-to-city
distances in the interior of the genome in its operation.
-
capability
Rolling crossover is much too powerful a population
diversity remover to be safely used as a stand alone
heuristic [if it cannot find a point at which codons seen
lists match to make a swap so that it can accomplish a
change, it just returns a child whose genome matches the
fitter parent, and population diversity cannot withstand
much of that powerful a selection agent], though since it is
blazingly fast and obsessive about improving genomes [the
child, if different from both parents, is always at least as
fit as the less fit parent, and very often, though not
always, more fit than the fitter parent], given sufficient
mutation and patience, it is capable of solving relatively
small problems in begrudgingly acceptable time. It works
much better in a mix, where it helps by accomplishing its
design goal of preventing fitness improvements from being
lost until rediscovered when unrelated fitness improvements
(ones elsewhere along the tour) promote a different genome.
-
intent
Rolling crossover attempts very directly to satisfy the wish
"gee, I'd like to get all those improvements I see
disappearing and reappearing as tours alternate on the
screen into a genome together before some of them got lost
in a cull."
-
mechanism
Rolling crossover attempts a multipoint crossover, walking
along parent genomes marking which codons (cities) have been
encountered in each, and, whenever the lists match, stopping
and grabbing for the child the more fit of the two genome
extensions since the last match point. Before doing the
walk, it picks an arbitrary starting point in one genome,
arbitrarily reverses or doesn't reverse the other genome,
and then rotates the other genome to match codons with the
codon chosen in the first genome, following which it walks
off in an arbitrary forward or reverse direction. This
isn't quite a general orientation crossover, but it was
thought that starting at a matching codon would improve the
chances of finding a second matchpoint before walking the
whole of both genome rings. If the latter occurs,
effectively no crossover occurs, just selection of one or
the other parent to clone as a child. Traveller by policy
only applies fitness bias to one of two sexual reproduction
participants,the other is just the next sequential member of
the population to come up in order in the Population class'
reproduction loop. When rolling crossover was allowed to
return the fitness-bias-selected parent unchanged as its
output, the population diversity would crash to nothing
frighteningly fast. Now by policy, rolling crossover when
it cannot find a crossover point internal to the genomes'
lengths, returns the sequentially selected parent,
preserving population diversity much more effectively in
this case where nothing new is to be added to the population
mix.
-
special considerations
Where parent genomes exhibit good city clustering, as after
a minimal spanning tree seeding, rolling crossover has the effect of
choosing the better clusters for the child genome wherever a
genome subset match can be found. Because
of potential fitness losses where the clusters join derived
from opposite parents, this may not always succeed in
producing a child superior to both parents, but the child is
at least as fit as either parent, usually at least a blend
of parent fitnesses if a match point between the ends of the
genomes was found, and is more often than not more fit than
either parent. This heuristic decreases population
diversity somewhat quickly, and so it is a good match for a
relatively high level of mutation, up in the one-half
percent to one percent range. It is not especially powerful
when used alone with mutation, for one thing because a
completely random genome rarely finds match points against
another equally random one except at the beginning and end
of the genomes. It needs other heuristics to provide the
improvements and global organization which it then blends
together very nicely. Unlike the other crossover
heuristics, that pretty much show you just sources and
results in their visual debugging windows, rolling crossover
is munged to show you the child genome as it is being built
up, which, in the case of a long string of identically
valued codon slots, gives a nice "rolling" visual effect.
-
Consultive Reproducer Heuristic
There exists (at least in Traveller) only one of these. It
makes modifications strictly within one genome, but it
consults with the genomes of the remainder of the population
to decide which modifications to make.
-
Inver-over?
-
type
Inver-over creates a new type of reproduction heuristic,
neither sexual nor asexual. I call it a consultive
reproducer. It doesn't do crossover with another genome, it
works on and modifies a single genome, but it steals ideas
for how to rearrange its own genome from the entire
population, biased in selecting whom to rob by the fitnesses
of the potential victims. This is a classical "blind"
heuristic, not dependent on local city to city distance
information, but only on global "whole genome" fitness to
achieve progress.
-
capability
Inver-over functions completely stand-alone to solve SETSP
problems, since it includes its own mutation capability. It
is the fastest completely Blind SETSP solver I've
found, but pales in competition with the sighted cheats I've
since implemented.
-
intent
This in an implementation of the very nice inver-over
heuristic from the paper by Guo Tao and Zbigniew
Michalewicz, "Inver-over Operator for the TSP". Look at my
computational complexity web page for a link to it (I'm
writing this offline, not that unusual for a homeless
person). The goal inver-over satisfies is to overcome the
"one generation, at most one improvement" behavior of other
blind SETSP heuristics.
-
mechanism
Inver-over begins by picking an arbitrary starting codon and
direction of progress in the genome. It then "consults"
some other member of the population to determine what codon
that other member has next to this member's current codon in
that other member's genome, finds that codon in its own
genome, and inverts the substring from the currently
adjacent codon to the desired-to-be-adjacent codon,
to put that codon next in its genome. It then makes the
just-moved-adjacent codon the current codon, and continues
as before. It stops when it finds itself about to make a
codon adjacent that already is adjacent. As a substitute
"mutation" operator, once in a small fraction of tries,
rather than consulting another member of the population, it
picks an arbitrary other codon of its own to make adjacent
to the current codon. In making codons adjacent, it picks
the shorter possible direction to do the inversion, which
half the time changes the direction of progress as a result.
It is often the case that it will therefore walk back and
forth across the same codons many times before finding its
stopping condition. What makes this seemingly
scatterbrained mechanism work so well is that the member to
consult at each step is selected biased in favor of more fit
members, so that inver-over is imitating success in its
genome rearrangements.
-
special considerations
This is one heuristic that is completely disabled by any
appreciable level of externally caused mutation, but works
well to do lots of improvements at once alone or when
working with compatible heuristics.
Inver-over is very effortful. When I run a mix of
inver-over and my three permutation and one "optimize near a
point" heuristics, what I think of as my slow heuristics, the
generation times are half as long as when I run inver-over
alone. On the other hand, inver-over is also Traveller's
most powerful "blind" heuristic. Since it keeps rearranging
its edges to match the fitter companion genomes' choices
until a fairly arbitrary limit is hit (the next edge it
selects from another fitness-biased randomly-selected genome
to create, it already has in its own genome), there is no
obvious limit on the amount of improvement inver-over can do
to a genome in a single pass, beyond the hard limit of not
bettering the perfect solution to the current TSP city
layout.
Because it benefits from population diversity two way: more
possible edges from which to select ones to adopt as its
own, and a better selection of fitnesses from which to
select more fit consultants, inver-over works best with
comparitively large populations. To return the favor,
inver-over, by creating new genomes from a mix of edges
available across the population, enhances, rather than
degrading, population diversity, until a global optimum is
approached, when it speeds up its convergence and starts to
create cloned population members "the hard way", edge by
edge rather than segment by segment.
Inver-over is one of the heuristics that is a lot of fun to
watch in the visual debugging window, especially when it
hits on a long run of changes.
-
Asexual Reproducer Heuristics
Asexual reproducer heuristics make a child genome using only
the information in a single parent genome. They are easier
to design with regard to maintaining the permutation genome
invariant than are sexual reproducer heuristics, and so they
are comparatively numerous. That doesn't keep them from
becoming wonderously complex, though.
-
Dewrinkler?
-
type
Dewrinkler is an asexual reproducer heuristic and one of the
members of the adaptive permutation heuristics. This is a
"sighted cheat" because the best permutation selection
portion makes use of local city-to-city distances in the
interior of the genome in its operation.
-
capability
Dewrinkler is a helper heuristic, useful in a mix, but not
capable of solving SETSP instances on its own.
-
intent
One of the ways SETSP runs on regular layout cases get
"stuck" is that two errors far apart cause the regular
pattern in-between to be somehow "flipped" at every pattern
repetition; there is a pair of "wrinkles" that need to be
moved adjacent to one another where one of the local fixup
routines can repair them. Dewrinkler is designed to drag
wrinkles a long distance, but, like waves moving far but not
moving the water of which they are composed far at all,
moving the edge misorientations comprising them only a short
distance. It tries to do this by ignoring one feature of the
fitness that keeps the other solution attempts locked in a
local optimum, while Dewrinkler proceeds to do local
permutations on its way around the genome ring.
-
mechanism
Dewrinkler uses the local permutation of Permute A Sublist
and Smoother, permuting a small sequential set of genomes
along the tour, to find their best fitness arrangement, and
does this while walking around the genome ring for a
complete cycle. It differs from the other two heuristics by
ignoring the fitness contribution of exactly one edge
position in the permutation at each step. That allows
Dewrinkler to "capture" a codon for a step or two, dragging
it along until the fitness worstening of its still-connected
edge forces it to be dropped and a different codon to be
dragged along instead. Conceptually, a "wrinkle" is being
"dragged". There is extra logic to keep Dewrinkler going
until it stops producing improvements, so it can become
quite slow to complete work on a single genome.
-
special considerations
Dewrinkler is designed to work most usefully in regular
layout SETSP instances. Use in other layouts may prove
effective also. Dewrinkler can sometimes be very "lively"
and so fun to watch in its visual debugger window.
-
Disordered Slide?
-
type
Disordered Slide is an asexual reproducer heuristic and is
one of the members of the adaptive permutation heuristics,
and is also one of the heuristics implementing the
AsexualReproducer interface that also implements its derived
Mutator interface. This is a classical "blind" heuristic,
not dependent on local city to city distance information,
but only on global "whole genome" fitness to achieve
progress.
-
capability
This is an incredibly weak huristic, mostly
counterproductive after a problem run is well begun.
However, at the beginning, it helps introduce some quick
gross order into genomes. Combined with simulated annealing
and rolling crossover, it is eventually capable of
solving small sized problems, say 20 cities, but it is
hundreds of times slower than some of the stronger
heuristics. The problem is that almost always, this
heuristic increases entropy, rather than decreasing
it.
-
intent
This was mostly just a matter of playing around. After I
wrote Ordered Slide to try to imitate an ordered crossover
in an asexual reproducer, it seemed natural to add the
disordered case as well. It didn't have any target
situation it was designed to solve.
-
mechanism
The mechanism is to select some random scattered subset of
the genome, set those codons aside, slide the remaining
codons left and right to leave a hole in the middle, then
randomize the set-aside codons and insert them into the
hole. Only incredibly rarely does this improve the genome's
fitness, but once in a while it will create some local
improvement that rolling crossover can capture and propagate
through the population, which is the only reason it works at
all when used only in tandem with rolling crossover. More
than likely, any fitness improvements are due to pulling the
set aside codons out of places where they are really
misfits.
-
special considerations
After some changes to concentrate the codons selected to a
compact portion of the genome, this works somewhat better,
but still not really well. However, it proves useful, with
rolling crossover, at least at the initial gross
organization phase of a test case with a city count of 240
and a population of 120. Disordered slide proves much
more useful as a very powerful mutator.
-
Invert?
-
type
Invert is an asexual reproducer heuristic and is one of the
heuristics implementing the AsexualReproducer interface that
also implements its derived Mutator interface. This is a
classical "blind" heuristic, not dependent on local city to
city distance information, but only on global "whole genome"
fitness to achieve progress.
-
capability
While exceptionally simple, Invert is so powerful a tool,
that, when combined with the simulated annealing
meta-heuristic, it long reigned supreme as the commercial
tool-of-choice for finding efficient integrated circuit lead
and waypoint layouts. The two together, properly tuned, can
often find the global optimum of SETSP problems, and can
always find an excellent imperfect solution. When Invert
doesn't uncross two edges, and thus improve fitness, it
always does the opposite, decreasing genome fitness. Since
some situations need a series of invert and then a series of
uninvert steps to escape local optima, either simulated
annealing or other heuristics in the mix, or a tuned rate of
mutation, are generally needed for Invert to completely
solve an SETSP instance.
-
intent
Invert is intended to uncross tour edges that cross; when it
does that it always improves genome fitness.
-
mechanism
Invert selects two nodes in the graph and reverses the
substring of codons between and including them. Because the
permutation genome is functionally invariant under reversal,
it doesn't matter except for efficiency which of the two
possible substrings delimited by the two selected nodes gets
reversed.
-
special considerations
This is Scott's invert asexual reproducer, that reverses the
order of a substring of cities in the genome. It is a
powerful tool for uncrossing crossed paths. I removed the
ring-versus-array artifact from Scott's code, so that it
doesn't display end effect biases and will now invert a self
intersection due to a sublist of the genome that happens to
lap the end of the genome. As noted above, Invert needs
either Simulated Annealing, mutation, or to work in a mix
with other heuristics to solve SETSP instances in general.
Many of the other heuristics of Traveller are just
generalizations and enhancements of Invert, such as Snow
Plow, Permute Several Sublists of Cities, and the Permute
Cuts family of heuristics.
-
Move?
-
type
Move is an asexual reproducer heuristic and is one of the
heuristics implementing the AsexualReproducer interface that
also implements its derived Mutator interface. This is a
classical "blind" heuristic, not dependent on local city to
city distance information, but only on global "whole genome"
fitness to achieve progress.
-
capability
If you have infinite patience, and include
simulated annealing or mutation, Move by definition can
solve any SETSP instance optimally. There isn't much
guarantee that it can beat brute force at the task, however.
-
intent
Move was written for two reasons: to implement the simplest
possible heuristic, and to add an obvious missing piece to the
Traveller toolkit. However, the effect of Move can be had by
two Inverts, the first to carry the codon-to-be-moved to its
intended spot, and the second, inverting a substring one
codon shorter, to put everything else back the way it was.
-
mechanism
Move selects a codon to move and a slot to which to move it,
picks up and sets aside the codon to be moved, rolls the
intervening codons one position left or right to fill the
slot that creates, and then drops the set aside codon into
the hole that now exists at the other end of the rolled
substring.
-
special considerations
This is an even simpler asexual reproducer than Swap or
Invert, that just moves one city somewhere else in the
genome. Scott's swapper is faster though, it can move two
cities, via a temp variable, in three copy operations, while
my mover must roll the intervening cities out of the way,
one by slogging one, doing an average of around N/4 copy
operations if it always selects the shorter direction for
the move, where N is the length of the genome.
-
Optimize Nodes And Edges Near A Point?
-
Optimize Nodes And Edges Near Every City?
-
type
Optimize Nodes And Edges Near Every City is an asexual
reproducer heuristic and one of the members of the adaptive
permutation heuristics. This is a "sighted cheat" because
the best permutation and best slot distribution pass makes
use of local city-to-city distances in the interior of the
genome in its operation, and the node and edge selection
makes use of city location geometry information too.
-
capability
This heuristic is Optimize Nodes And Edges Near A Point, but
on a heavy dose of steroids. It is N times as expensive,
and more than N times as powerful, because it gets N times
as many chances to return a vastly improved genome in one
generation.
-
intent
Optimize Nodes And Edges Near Every City is intended to
reproduce the behavior of Optimize Nodes And Edges Near A
Point, but to distribute the selected initial point in a
pattern derived from the node layout pattern, and to do such
an optimization near each node in a single pass.
-
mechanism
In Optimize Nodes and Edges Near Every City, the works of
Optimize Nodes And Edges Near A Point are wrapped in a loop
to execute once for each city, and the initial point
selection step for each loop iteration, instead of selecting
a random point uniformly distributed on the entire
playfield, selects a point with a gaussian distribution
around the location of the city corresponding to the current
loop iteration. The standard deviation of the gaussian is
set inversely proportional to the square root of the number
of cities, and directly proportional to the length of the
playfield edge, so that it approximates the distance between
cities.
-
special considerations
The gaussian distribution of points near city locations to
approximate the average distance between cities has strange
side effects in the Poly Spiral layout, where city density
varies greatly from center to edge of the playfield. In
particular, only rarely is the city corresponding to the
current loop iteration one of the ones selected as closest
to the initial point. This heuristic is lots of fun to
watch in a visual debugger window; the nearest "facing"
edges may in fact be quite far away, and the walk in city
name order means that there is a regular progression top to
bottom and then left to right on the even grid layout, and
something appropriate on the other regular layouts. This
heuristic is most effective, surprisingly, in the random
layouts, where it tailors initial point location selections
to more closely match the local density of cities than does
the uniformly distributed selection mechanism of Optimize
Nodes And Edges Near A Point; that is, this heuristic tries
to focus efforts where they will do the most good.
-
Optimize Nodes Near A Point?
-
type
Optimize Nodes Near A Point is an asexual reproducer
heuristic and one of the members of the adaptive permutation
heuristics. This is a "sighted cheat" because the best
permutation and slot distribution pass makes use of local
city-to-city distances in the interior of the genome in its
operation, and the node selection makes use of city location
geometry information too.
-
capability
This is a powerful (and slow) heuristic; it is probably
capable of solving SETSP instances alone, but the proof is
beyond me. It has the disadvantage (like Invert) that the
problem sits waiting for the algorithm to hit at random on
the particular place where an improvement can be made, as
opposed to routines like Smoother, that can exhaustively
test all possible cases for improvement potential.
-
intent
This was the first local to the playfield position rather
than local to the genome slot local optimization routine
added to Traveller. It was added to bring geometric
considerations to bear on improving fitness.
-
mechanism
This "sighted cheat" optimize-nodes-near-a-point algorithm
takes the C closest cities to a randomly
chosen point on the canvas, and rearranges them in all
possible orders and in all possible subsets (including empty
ones) back into the S ≤
C genome slots from which they were
extracted.
-
special considerations
This heuristic is very powerful at moving cities between
physically nearby path portions that have long genome
segments between them when following the path, but the "best
result out of all permutations tried in all slot
arrangements" is the most computationally intensive of my
permutation inventions, requiring the C!
permutations to also be salted into possibly as many or
possibly fewer slots S, in
(C choose S) ways. I
found the needed numbers in Pascal's triangle for how hard
that latter half of the effort is. The effort required
grows huge rapidly in the worst case, when every codon
extracted creates a separate slot. For eight points making
eight slots, there are 259,459,200 fitness evaluations
required. We don't go up to eight points; right now we stop
at four, though five or six might be sustainable. There is
also the little issue of all those distances to compute to
start, to find which C cities are nearest
to the randomly chosen point, but that effort drowns and is
lost in the noise compared to subsequent calculation needs
for large C and S, where
"large" is still in single digits.
Lest you think I'm whining, this little table illustrates
the reasons for the limitations chosen in Traveller rather
well.
Number of candidates to be child genomes generated by
permuting C cities in all possible orders
and then partitioning each permutation into all possible
same-ordered subsets among S ≤
C slots.
Cities C
1 2 3 4 5 6 7 8 9
+ - - -- --- ----- ------ ------- --------- -----------
1 | 1 2 6 24 120 720 5040 40320 362880
S 2 | 6 24 120 720 5040 40320 362880 3628800
l 3 | 60 360 2520 18060 181440 1814400 19958400
o 4 | 840 6720 60480 604800 6652800 79833600
t 5 | 15120 151200 1663200 19958400 259459200
s 6 | 322640 3991680 51891840 726485760
7 | 8648640 121080960 1816214400
S 8 | 259459200 4141347200
9 | 8821612800
-
Optimize Nodes Near Every City?
-
type
Optimize Nodes Near Every City is an asexual reproducer
heuristic and one of the members of the adaptive permutation
heuristics. This is a "sighted cheat" because the best
permutation and slot distribution pass makes use of local
city-to-city distances in the interior of the genome in its
operation, and the node selection makes use of city location
geometry information too.
-
capability
This is a very powerful, and correspondingly expensive,
heuristic. I'm not totally convinced that it can solve
every SETSP instance by itself, permuting nodes is not as
powerful as permuting sublists, but it certainly can do a
lot of improving in a single application to a genome.
-
intent
Optimize Nodes Near Every City was Traveller's first "Near
Every City" geometrically-aware heuristic. It focuses the
power of the Optimize Nodes Near A Point heuristic on every
city, one by one, in a loop, distributing the selected
initial point near each city rather than at random across
the playfield.
-
mechanism
For each city location, Optimize Nodes Near Every
City picks a point not on but centered near that location,
and then does the equivalent of the "Optimize Nodes Near A
Point" heuristic, for each of those points, in a loop,
before returning.
-
special considerations
Since the point selected can be anywhere in a small "fuzz
box" around the city location, vaguely the radius of the
distance between cities, and distributed as a gaussian, not
within a fixed limit, then depending on the direction it
lies from the city, it can find a different set of nearby
tour nodes (presumably usually including the city itself) to
attempt to optimize, so running this heuristic on the same
genome twice can (and often does) improve that genome each
time. Run in a problem set on the nested grid with only the
rolling crossover as a companion, it started to produce
recognizable order from a random seeding for the (full
nested grid) 136 city case in just four generations at a
population size of 68. When I realized that a gaussian
rather than a tiny fixed circle was the correct way to
distribute the points, that same problem, the bane of my
existence with ten hour runs to completion, went to exact
solution the first try in four minutes and three seconds,
using a population size of 17, and a heuristic mix of this
heuristic, Snow Plow, and Inver-Over! Like other heuristics
run in a loop, this one is decorative and amusing when
viewed in its visual debugger window.
-
Ordered Slide?
-
type
Ordered Slide is an asexual reproducer heuristic and is one
of the members of the adaptive permutation heuristics, and
is also one of the heuristics implementing the
AsexualReproducer interface that also implements its derived
Mutator interface. This is a classical "blind" heuristic,
not dependent on local city to city distance information,
but only on global "whole genome" fitness to achieve
progress.
-
capability
Ordered Slide is somewhat less entropy-philic than
Disordered Slide, but still is a very, very weak heuristic,
mostly useful in the early, disorganized parts of randomly
seeded populations' runs. Mixed with simulated annealing,
it could probably solve SETSP instances, but deathly
slowly.
-
intent
The intent of Ordered Slide was to emulated Ordered
Crossover in an asexual reproducer, but it didn't work out
well at first, perhaps because the selected subset of codons
was not compact in the genome. Hmm.
-
mechanism
Ordered slide works like the previous, DisorderedSlide
heuristic, except that it doesn't randomize the set aside
subset of codons, it puts them back into the slot between
the left and right unselected codons in the same order in
which it encountered them in the original genome.
-
special considerations
Like its cousin the DisorderedSlide, the performance of this
heuristic improved noticibly with a change of selection
criteria that focused the selected codons in a compact
portion of the genome, adjusting the pressure to do such a
selection with the length of the run. This reduces the
entropy introduced by "bad guesses" of the random number
generator, at least qualitatively.
-
Permute A Sublist?
-
type
Permute Cities Within A Sublist is an asexual reproducer
heuristic and one of the members of the adaptive permutation
heuristics. This is a "sighted cheat" because the best
permutation selection portion makes use of local
city-to-city distances in the interior of the genome in its
operation.
-
capability
This is a very local algorithm, and frequently
a useful tool when a lot of global tour selection fitness is
already in the population and mostly local cleanups are
needed. Alone, it is not capable of overcoming local
optima traps, because many needed repairs during SETSP runs
require manipulating remote parts of the genome
simultaneously.
-
intent
The intent of Permute Cities Within A Sublist is to apply a
local reorganization to optimality at one point along the
genome, and it succeeds in that purpose.
-
mechanism
In this heuristic, the things being permuted are single
cities within a contiguous substring selected at random from
the genome. All possible permutations of the codons in the
substring are evaluated, and the one creating the most fit
genome is the one returned in a modified child genome.
-
special considerations
This was Traveller's first permutation heuristic, that
attempted to try all possible arrangements in some small
subset of the total genome, and return the genome with the
best fitness from among all those rearrangements.
-
Permute Cuts Near A Point?
-
type
Permute Cuts Near A Point is an asexual reproducer heuristic
and one of the members of the adaptive permutation
heuristics. This is a "sighted cheat" because the best
permutation selection portion makes use of local
city-to-city distances in the interior of the genome in its
operation, and the cut selection makes use of city location
geometry information too.
-
capability
This is an extremely powerful algorithm, capable of solving
most, and probably all, SETSP instances without assistance.
It overcomes Invert's disadvantage of doing only one
inversion at a time when coordinated inversions may be
needed to break out of a local optimum. It enhances the
Permute Several Sublists Of Cities heuristic's capabilities
by adding the focus around a geometric location.
-
intent
This heuristic was derived conceptually from Optimize Nodes
Near A Point with the intent of using the captured list of
cities to split the genome into substrings with ends close
to each other geometrically rather than close together in
genome codon-by-codon steps, and then trying to reconnect
those substrings in all possible ways. This was in response
to the observation that often portions of the tour near in
space, but distant in terms of walking the genome, needed to
have a multiple simultaneous invert done to them to break
out of some local optimum.
-
mechanism
A list of closest cities is grabbed as in Optimize Nodes
Near A Point. For each city, at random either before or
after the city, the genome is cut. For M grabbed cities,
there will be M or fewer cleavage points; some may merge for
cities chosen that happen to be adjacent in the genome and
for which cleavage points are chosen towards each other.
Once the genome is cleaved, the mechanism of Permute Several
Sublists Of Cities is used to try all possible permutations
of the sublists in all possible forward or reversed
direction combinations. The best possible genome from among
the candidates thus produced is propagated as the child
genome.
-
special considerations
Naturally all this power is expensive. There is a phase to
find the distances to all cities, and there is a phase to do
all possible permutations with all possible reversals, both
effortful. The payoff is a heuristic with all this power
focused at one part of the city layout where a strong local
optimization is accomplished. This heuristic works well in
a mix, and, being asexual, doesn't require a large
population level for its success.