[I've completely given up trying to present Traveller here as an applet. Sorry about that. Applets don't work using standard mechanisms in my non-standard MS-Windows 98/Cygwin development environment, and I have no computer to test my web site over whose operation I have control. Writing software you cannot test properly is just a fast way to a bad reputation and a nervous breakdown. Those interested in running Traveller in their own Java runtime environment are still invited to download the jar file and enjoy themselves, where it starts out looking like this. [Screenshots for this document for the epsilon release of Traveller are from a "nested grid" layout, run a couple of times to show different features. The problem size is 337 cities, the population size is 40 members. This is Traveller's most difficult current layout pattern, much harder than random layouts because its fitness landscape is full of very deep local optima potholes, and so far I only know that it cannot be completely solved in 117,000+ seconds at the current state of Traveller's heuristics, though I ran out of patience with only two more fixes needed, easily determined by eye in this regular layout pattern.]
One would think, as crucial as applet presentation is to the successful evangelizing of Java, that how to make Java applets run from browser windows would be as well documented and supported in Sun's documention as is the Java language API, but such is emphatically not the case. Grrrr.]
Traveller is a Java language application which sets up and runs a choice of many genetic algorithm heuristics to solve a particular variety of the famous Traveling Salesman Problem, while letting the user have a lot of control over the algorithms used and the problem characteristics.
It is best to lay this controversial issue to rest at once. Scott Robert Ladd (henceforth "SRL" or "Scott") wrote the original of this demo as an applet for the ("Blind" version of the) Symmetrical Euclidean Traveling Salesman Problem, and he used an alternative spelling "Traveller", for which he has apparently received endless grief. He came back, on his spiffy Coyote Gulch web site (from which Traveller comes), to defend the spelling as an accepted alternative. Well, yes and no. You will find it in many dictionaries. It is an acceptable spelling under British rules (which earlier followed the current American rules), to double the "l", but the American rules double the consonant when adding a suffix, only when
Which latter is not the case for the word "travel". Nevertheless, it is Scott's original work, and Scott's choice, so until he changes his mind, the spelling stays as Scott and the British do it, and it is we Yanks who will have to show the stiff upper lip in these trying times. Besides, the spelling is now so embedded in the existing code class names and such, it would be a chore to rip it out at this point, and then the British would just declare war in retaliation for the insult to their orthography, and... .
Scott's original code was published under the GNU Public License, or GPL. My additions and modifications are freeware, since I'm not into legalisms. Under the Berne copyright convention, both Scott and I hold copyright on the code we separately wrote, automatically. [What the status is of the stuff mishmashed together I haven't a clue, best take the conservative approach and assume it is all GPLed.] My license of my copyright to you is that you may do whatever you want with my code, whether for commercial or private use, as long as you don't try to hold me responsible for the consequences. Credit to me as the author would be polite, but is not required. Scott's license to you is contained in the GPL; the most recent version of the GPL is available online from the Free Software Foundation, at
GNU Public License or by surface mail request from:
Free Software Foundation, Inc.
59 Temple Place - Suite 330
Boston, MA 02111-1307, USA.
and also included in the files COPYING and COPYING.LIB (two considerably different licenses) in the Traveller.jar file.
For purposes of contacting me about my copyright (please don't), I am most recently living homeless at
Kent Paul Dolan
General Delivery
Clovis, California, United States of America 93612-9999
but I am more usually known and reliably contacted since 1988 as xanthian@well.com or known as "xanthian", or since 1986 as "Kent, the man from xanth", but henceforth will be known here as "KPD".
Traveller uses some unfamiliar words, and some familiar words in specialized ways. Here are the most important of them.
The SETSP (see next section) involves travels through a set of Euclidean (x,y) coordinate pairs laid out on a two dimensional coordinate system. From the point of view of the human "traveling salesman" model, these are "cities". From the point of view of the graph which the cities and paths connecting them create, these are "nodes" or "vertices". From the point of view of a genetic algorithm, these smallest individual components of the genome are "codons". In particular, for Traveller, those codons contain a number which is the city's name.
The aggregate of genetic information which a genetic algorithm manipulates in a single reproducing entity, and which its designer uses to describe the problem in term of its domain of discourse is precisely a "genome", or more colloquially a "chromosome". From the viewpoint of the human "traveling salesman" model, the Traveller genome encodes the salesman's "tour". This tour, and thus the genome, is independent of the description that starts with one particular one of its cities/nodes; it is like a ring, with no distinguished beginning or end, and this realization becomes important when implementing genetic algorithms.
The connection by a straight line from one coordinate pair of a tour to the next is an "edge" from the point of view of the graph which the nodes define, occasionally a "link" or "path" from the point of view of the traveler and city model. What is important to Traveller and to the solution of the SETSP is the length of this link, and even more the sum of the lengths of all the links in the current tour, which sum is the genome's fitness.
Briefly but carefully expressed, the SETSP is this. In the two dimensional Cartesian coordinate system plane, with the usual Euclidean distance metric, and distances between coordinate pairs the same in both directions, given a set of coordinate pair locations, conceptually "cities", and an agent, conceptually "a traveler"; then, find for that traveler the closed circuit path that goes through each of those cities once and only once, and which has the shortest total length.
This problem is interesting for several reasons.
I use 79 cities as my benchmark point, for sentimental reasons. I call it my "particle death of the universe"-sized problem (or my "the end of the matter"-sized problem).
My computer runs at a half-a-gigaHertz clock cycle, and, I think, does one floating-point operation per cycle. To evaluate the fitness of an SETSP tour ("genome") takes (at least) seven floating-point operations ("flops") per city.
My computer is made of matter, largely consisting of protons. All the protons in the universe are expected to have disintegrated in 10^100 years (That's a 1 with 100 zeros after it). The universe is currently about 15,000,000,000 years old, a submicroscopic beginning on such a time-span.
The brute force exhaustive search approach to solving the SETSP is to form every possible permutation of the cities (in no particular order of creating the permutations so that no fitness evaluation shortcuts are available), evaluate the path length ("fitness") for each permutation, and select the one with the shortest path length ("best fitness").
In attempting to solve a 79 city problem by the brute force exhaustive search approach, my computer, sadly, would fail, as it would see all its protons disintegrate first. Solving that problem at the speed of my computer would require just under 9*10^101 years, or just 90 times longer than it will take the last proton in the universe to disintegrate.
Maybe a bigger computer would help? [Excuse me if I drop a digit or two answering, the last time I read the needed numbers was a very long time ago.] Well, if all the matter in the entire universe, around 10^70 protons and neutrons, were turned into computers like my computer, it would be sufficient to make about 10^46 of them. If a 79 city SETSP problem instance were partitioned evenly among them and a massively parallel brute force solution attempted, granted not all the protons would have disintegrated after 10^56 years, but probably all interest in the problem solution would have disintegrated by then.
So, perhaps the brute force approach is not the best choice.
My best approach to solving the SETSP so far, finds the exact solution for a 79 city case, occasionally in as little as 40 minutes, though more often in a couple of hours, using a mix of genetic algorithms and good approximate solution initial seed genomes. [ UPDATE: as of 2002/03/25, Traveller solves 80 cities with a population of 80 members and "best practice" settings (on an even grid, so I know the right answer) in 61 seconds including setup time. Looks like I'm making progress.]
Paid professionals, using gangs of computers working in concert, and much more sophisticated algorithms, have exactly solved problems of around 15,000 cities in under a year, as their current "high water mark". Literally cities, in this case, they were working from a modern German map.
The problems of similar complexity to the SETSP, prevalent in integrated circuit design, feature millions of way-points, and are not exactly solvable in general in commercially useful times by any known algorithm. Simulated annealing is the heuristic algorithm of choice, last I read (and works well enough that we now have computers that sit on our laps and run half a gigaHertz).
The traveling salesman (or here, the algorithm which represents that salesman) is blind when the only information available to the genetic algorithm part of the code for use about the path lengths is the total length of the circuit, not any of the city to city distances that make up that circuit.
This version of the problem is particularly suitable for genetic algorithm approaches, which insist on a single fitness metric. Also, it removes the complexity of various algorithms that attempt to improve performance by taking advantage of more local information.
Unlike the purity of Scott's version, mine makes available many "sighted cheats", to be turned on at the user's option. Providing good starting seed genomes, as with the minimal spanning tree seeds, is obviously one such cheat. All of my permutation algorithms are technically cheats because they save work by not calculating the full circuit fitness, but they could in concept just as easily compute and work with the full fitness, so their approach isn't as big a cheat. The "optimize near a point" group of heuristics are completely sighted cheats, they take advantage of knowing city locations to pick the cities nearest a point in the play-field to try to optimize some characteristic of the tour near just those few cities (they should be as awesome as they seem, too; they is certainly complicated enough compared to the lean beauty of Scott's Cyclic Crossover, say). Similarly, when implemented, local optimization seeds, probably from 2-OPT or 3-OPT (whatever those are) approaches, when I can steal some working code, or a greedy algorithm, when I feel like bothering to code one, or that spiffy relaxation of an ellipse to a solution I've seen several places on the Net, [done, see "Elastic Net" below] would all be sighted cheats providing good seeds.
Because no one in all that research time has ever found a polynomial time direct solution, and it is quite possible that no such direct solution even exists.
Where no direct solution algorithm exists, but the problem needs solving anyway, computer programmers and other researchers and practitioners employ heuristics, methods that seem to work well most of the time but cannot be proved to work all of the time, and sometimes don't. Every computer weather forecast you've ever seen was done with a heuristic, for example. No direct computational solution to the weather forecasting problem in feasible time is known to exist. The less understood the problem domain is, the chancier use of heuristics becomes, thus hurricanes arrive unexpectedly at your picnics.
However, the programming concept of genetic algorithms is that they help us find solutions, quickly anyway, for which no fast, direct solution is known, and where we don't very well understand the problem domain. Genetic algorithms are often used for problems whose solution is not well understood, and they often perform very well, in commercially successful ways. However, genetic algorithms themselves are not very well understood. Understanding them better is one of the goals of Traveller.
What a genetic algorithm does is to imitate that highly successful technique seen in nature, evolution by natural selection. Because computer algorithms' generations can run so much faster than the generation times of nature, this very inefficient process that takes millions of years in nature to create a higher brow line, finds solutions much more quickly in a computer. Typically genetic algorithms employ these parts.
Scott wrote Traveller as code to be part of a textbook. It is meant to be of a tutorial nature, efficient in the small with fast generation times, but not blindingly fast to find a solution, not industrial strength, not meant for huge problem sizes, and not the perfect algorithm (no one has found one, and if one existed, it would probably be too complicated for use in a textbook). Keep this in mind when trying to evaluate Scott's version of Traveller.
[After my hack, slash, and burn attacks changing his code, Traveller still doesn't have any of those features, they were in conflict with my goals too, but my version no longer would serve Scott's purposes at all, it is much too big and complicated for one thing,
[As of this moment of writing during construction of the Traveller "epsilon" source code release, the Unix "wc()" command claims that Traveller contains 97 source code files, of 33,039 lines of code or 897,299 source code bytes, while this current document in HTML source code form just passed a quarter megabyte in size.]
and by design is never "finished" for another, so any comparison of the two implementations needs to take into account their quite different goals.]
I do research programming in computational complexity as a hobby, and the SETSP is the natural choice of the hobbyist programmer. I also needed to learn Java — Sun's platform independent programming language, to have any chance of ever being employed again. Stealing Scott's working code (for which I'd lusted for a couple of years, there were so many things I wanted to change) gave me a chance to start with a functioning framework into which I could introduce new code of the type I understood, while learning more from types new to me as I modified them.
I love to mentor. This code is meant to allow others, just by playing with the applet controls, to gain an intuition and a deeper understanding of how Genetic Algorithms, and this SETSP genetic algorithm in particular, perform both when things are going well, and when problems occur. I wrote this code mostly to teach myself to have that deeper understanding, a gut level understanding of genetic algorithms.
I have no faith in single trick approaches to genetic algorithms. We use a genetic algorithm when we don't understand the problem domain well enough to code an efficient direct solution of the problem, and there is no more reason to suspect that we understand the problem domain well enough to select a single heuristic that will work for all data sets.
I do have "faith" in kitchen sink approaches to genetic algorithms, and want to test that faith by showing that genetic algorithms perform better when a mix of heuristics, metaheuristics, and seeding methods are used simultaneously. Thus there are as many algorithms as I could reasonably steal or create embedded in Traveller, with a control panel to turn them on or off as desired. This kitchen sink approach is not original with me, but goes back to early numerical algorithm research literature I read back in the 1960s, long since lost to me. [Other hobbyists, lusting to see their favorite SETSP heuristic running in Traveller's mix, are invited to enter negotiations to send me source code; don't just send it without asking, my email box is small.]
[I have experience of another kind with genetic algorithms, too, which can stand as an object lesson to other programmers. The genetic algorithm concept and resulting mechanism is so immensely powerful, that adding heuristics which are full of bugs and don't do at all what is expected, still can improve the performance of the mix. The genetic algorithm in nature, after all, runs quite adequately off the input of mutation and blind chance, so handing the computer genetic algorithm mechanism a source of genomes that differ in some new way or according to some new distribution of changes, is equally important with handing it a heuristic that is well designed and lovingly crafted. Thus, the rest of the story is that you cannot trust that your algorithm is working as you planned, just because the genetic mechanism can use its output to solve the given (SETSP or other class of) problems, and that such a seductive "testing technique" proves exactly that your heuristic doesn't break the genome badly enough to make it unusable, and nothing much else.]
I was originally coding this application in MITs StarLogo for Java, but that language ran out of steam for me. I am intentionally creating something very complex, and that language has hard-coded limitations on application complexity that started to warp my coding into an unmaintainable mess. It is great for other purposes, and I still use it, but not for an endlessly growing single application. Needing to get away from StarLogo, and not ready to try to write a C++ GUI with the limited tool libraries at my disposal, I decided to turn to Java with its rich abstract windowing interface class collection, and found Scott's applet waiting to be abused.
Traveller has become visually a plethora of framed windows [some that persist, and some pop-up windows that exist only as long as the corresponding parts of the code are running], into which I divided Traveller's screen interface, as my additions overwhelmed Scott's original one window design. The Traveller application now consists of the following main parts.
While it might seem more natural to keep the configuration buttons with the configuration panel, there are now two configuration windows, instead of one configuration panel, so that wouldn't work any more. Since the various pushbutton enabled or disabled states comprise an implicit state machine for Traveller, it is easier for the user to understand this application state if all the buttons are in one window together so that the changes indicating "button enabled" or "button disabled" to each when one is pushed are more noticable. Here is what the buttons mean and do.
Stop the genetic algorithm at the end of the next generation, putting the applet into a mode where the algorithm may be continued on the existing problem setup with the Resume button. The Change Configuration button is enabled. The Reset button is enabled. The Create New Cities button is enabled. The Test Again button is enabled.
Because Traveller doesn't respond fully to the Stop button until the end of a whole generation, I added a Salesman's Progress persistent pop-up window that displays the genome-by-genome progress of Traveller through a single generation, and "generation complete" at the end of a generation, so that the user can gain some idea of when it is safe to go forward modifying stuff.
Changing, in particular, the valuator panel values before Traveller comes to a complete stop generates scads of error conditions and messages. If you are sufficiently unlucky, it will also crash Traveller (and possibly your operating system if you run one of the tender ones crashable by application software). If you are running a sufficiently slow setup of Traveller (large problem and population sizes) that waiting until the end of a Traveller generation is unacceptable, use the tools your operating systems provides to end an application program early (Unix's shells' kill() command, MS-Windows Task Manager, or similar tool) to stop Traveller cleanly, rather than trying to change Traveller's setup in mid-generation). Then, when the dust has settled, start Traveller over from scratch.
Pick up the running of the genetic algorithm from where it was paused with the Stop button, (say to recapture some CPU cycles on a personal computer for a while for some other compute-bound task), correcting times to omit from the status accounting the time during which Traveller was paused. When Resume has been pressed, most other buttons are disabled as described for the Start button.
Start the genetic algorithm running, from a fresh starting point, either when the program is first started, or after a Reset, Create New Cities, or Set Configuration has had time to take effect. The Start button disables almost all the controls (see the checkbox descriptions for exceptions) except the Stop and Test Again buttons.
This is a convenience button. It does the equivalent of pressing in order the Stop, Create New Cities, and Start buttons. I use this when running lots of short, identically set-up problems in a row for statistics gathering purposes. After it is pressed, pretty much everything else is disabled as described for the Start button.
Sets up the same problem with the same cities (but a different state in the random number generator, so you don't get the same run), ready to run again when the Start button is pressed. This is interesting for checking variations in execution times or solution success by selected genetic algorithm heuristics and metaheuristics on a single problem data set. After Reset does its work, the applet is still in the state as described for the Stop button, except that the Start instead of the Resume button is enabled.
Sets up a new problem with a new set of city locations, but otherwise with an unchanged configuration from the previous run. This is interesting for testing a configuration setup on a series of identically-sized problems. I use it for statistics gathering. After Create New Cities does its work, the applet is still in the state as described for the Stop button, except that the Start instead of the Resume button is enabled.
This button enables the two configuration windows' components for changes.
This button disables the components in the two configuration windows for changes, and stores the settings the user has created into the program's control variables. [There are a few exceptional components that stay enabled always, see the checkbox descriptions for details.]
Traveller has a default set of valuator and checkbox states at startup time, this restores them, as a shortcut alternative to putting them all back in place by hand. The configuration windows' components are still enabled after this button is pressed, for possible further user changes.
With so many heuristics, turning them on and off has become a chore. These three "Heuristics" buttons provide shortcuts. This first shortcut turns on all the heuristics checkboxes.
This second shortcut turns off all the heuristics checkboxes.
What fun! This third shortcut turns on a randomly chosen subset of all the heuristics checkboxes.
The canvas background is color "black". Inside the frame of the canvas, a ten pixel black border is left vacant around the area within which the conceptual traveler travels. Three things can be seen on this canvas.
The canvas background is color "black". Inside the frame of the canvas, a ten pixel black border is left vacant around the area within which the minimal spanning tree is built and then used. Three things can be seen on this canvas.
The canvas background is color "black". Inside the frame of the canvas, a ten pixel black border is left vacant around the area within which the elastic net is built and then used. Three things can be seen on this canvas.
With Traveller's current richness of heuristics, it is quite possible to enable more heuristics than you can possibly find room on your screen to accomodate the corresponding visual debugging windows. You can take two attitudes toward this. Be serious, and use these windows as real debugging tools, only turning on as many heuristics as you can separately panel visual debugging windows, usually four on my hardware. Be frivolous, and use these windows for fun and to amuse or terrify your friends. The windows are equipped with a "window to front" mechanism, so that the window for the currently running heuristic always pops to the front for viewing (right through any other application you might want to see instead). You can either leave the set of them stacked where they open, in the screen upper left corner, and just watch what the top one is doing while it is visible, or you can drag them and drop them messily all over one another across your screen, and have them popping forward like the moles pop up in a Whack-a-Mole game, leaving usually several visible at once.
In general, the visual debugging windows show a "setup" stage to show the genome or genome pair with which the heuristic begins, one or a series of ongoing effort "step" stages, and a "done" stage when the heuristic has developed its final answer. These behave slightly differently for asexual or consultive reproducers, which have one progenitor, and for sexual reproducers, which have two.
In the former, less complex asexual (or consultive, it works the same) reproducer case, the setup stage shows the parent genome in red. The step stages show the current candidate genome ("on top") in green, the next earlier generation candidate genome (rarely, it turns out, visible) ("in the middle") in pink, and the next earlier candidate genome differing from the middle genome, ("on the bottom") in red. Only when two successive internal steps of a multiple step heuristic produce different candidate genomes does the middle one show.
In the latter, more complex, sexual reproducer case, the setup stage shows one of the genome parents pair in pink, the other in red. The step stages show the genome parent pair in the same colors, and overlaying them (so that only unmatched edges of one or the other parent show through) the current trial child genome in green. The done stage shows the parent genome pair in the same colors, and overlaying them, the product child genome in yellow/gold. Unless the done stage shows all three colors, one or another parent got propagated forward to the next generation (and so is completely hidden by the identical child genome drawn after and so on top of it), no different child was produced.
In either case, the shift from a green foreground genome to a gold foreground genome indicates that the individual heuristic has reached its "done" stage. For the simpler Traveller heuristics, which have only one internal step, the step stage isn't too interesting, nor is it visible for too long. In the more complex heuristics, the slow running ones, watching the heuristic run is worth the price of admission to Traveller in and of itself. Things fly by just a little too fast to see the details of the individual steps, but for some heuristics, a beautiful general pattern of changes emerges.
It is well worth mentioning that Traveller runs much slower with visual debugging enabled, but you expected that, right? Besides the heavy computational demands of drawing all those genomes, each heuristic pauses a full second for you to admire it at its "done" stage before allowing another heuristic to kick off its processing, when visual debugging is enabled. [Update: Now Traveller just runs slower with visual debugging on, and the windows are much more lively, since I got smart enough not to redraw visual debugging window contents for heuristic loop steps that led to unchanged images and genomes.]
Besides their associated labels, the valuators consist of nine pieces, four decrement push-buttons, the current value text field display, and four increment push buttons. The push-buttons are labeled "-M", "-C", "-X", -I", "+I", "+X", +"C", "+M" for changes in the unit of interest, either units or hundredths, expressed in Roman Numerals (because they can be one character wide, so the push-buttons can be nicely of equal width).
Number of Cities is the "N" in the computational complexity expression of problem difficulty, and the SETSP problem complexity rises as N! or "N factorial", very fast growth indeed. Number of Cities has a minimum of three cities, for which any permutation is a solution. Fairly modest two digit numbers provide problems complex enough that the universe would burn out and die before an approach of simply forming every possible path and evaluating their fitnesses to choose the shortest could finish running on the world's fastest computers. Start this value small and ease up gradually. Remember, changing N from 19 to 20 isn't making the problem one unit harder, it is making the problem twenty times harder.
The population size chooses how many travelers (the usual term of art is "genomes", or in Scott's term, chromosomes) are going to be working and competing and mutating and breeding and cross-breeding to solve the problem. It has a minimum value of three, catering to the inver-over algorithm in particular, which needs self and at least two other genomes with potentially differing fitnesses to succeed.
Big populations find solutions in fewer generations, but not necessarily in less time, since big populations eat a lot of processing resources. Population sizes from half to four times the number of cities chosen seem to work well.
Big populations are pretty much a waste of time if neither crossover heuristics nor the one consultive heurisic are enabled; each genome is then being solved as an independent problem, so, if this is what you want to test, probably you should set the population size as small as allowed. [There is some argument for large populations, though: with lots of chances, maybe at least one of them won't get stuck in a local optimum for unbearable amounts of time, and so will produce a solution in reasonable time. Feel free to explore this and other issues, that exploration is exactly why Traveller exists!]
Remember too, that Java is a resource hog, and some combinations of number of cities and population size will drive the problem from real memory to virtual memory. At that point it is time to find something else to do while the problem runs: get married, raise a family, choose a burial plot. Virtual memory spillover creates that kind of waiting times. Alternately, those combinations will cause the Java interpreter to freeze up completely, taking tender operating systems along with it.
Use good sense. On my particular set-up, a city count of around a thousand, and a population size of around five hundred, or vice versa, is pretty much at the ragged edge of the possible. Such a problem runs for several days from a purely random genome seeding before anything resembling a solution emerges.
[UPDATE: Now that I've finally, as of release gamma, managed to virtualize the size order N squared city distances array in TravellerWorld.java, I can bring up much larger city counts, though the product of population times city count is still a limiting factor. Bringing up a 10,000 city problem with a population of 20(!) with a minimal spanning tree seeding took 724 seconds of setup, almost all of that MST overhead, the first generation with its extra processing requirements took 214 seconds, and subsequent generations with all available heuristics enabled took around 154 seconds each. Average starting tours were around 37,700 units long, over fifty times shorter than the randomly seeded case (below). The Salesman's Route window's canvas was about 50% city-cyan colored when it opened, and presumably lots of cities' integer pixel coordinates drawing locations (as opposed to their double precision floating point exact locations used to compute tour solutions) were identical, so that their dots overlaid, underemphasizing the crowding.
With random seeding, the generation times were about the same (279 and 283 [I wonder why the difference in the numbers compared to the MST seeded case; perhaps more pixel painting time for the longer paths?], average starting path lengths were in the 1,980,000 range, but the setup time was only 9 seconds, and the canvas was solid yellow instead of speckly cyan.
I'm not claiming that Traveller can do anything useful with such a monster (though by the fifth generation the randomly seeded case had found an improved best genome ever), nor, despite Moore's law, am I likely ever to see a laptop computer where that would be true, but, like teaching a dog to dance, getting Traveller to do this at all was still quite an accomplishment. Traveller can now at least contain SETSP problems of the order of magnitude of the largest irregular layout ones ever solved exactly.]
This controls how large a proportion of the population is mutated and put into the new population without regard to fitness, from 0.00% to 100.00%. A 2.0% mutation rate is a very large one, which is why the valuator provides steps of one one-hundredth of a percent. Mutation puts entropy back into the population that the rest of the heuristics are designed to remove. Sometimes, the algorithms get stuck on a local optimum and cannot escape it to find the global optimum, in which case adding some mutation for the next run can provide the needed nudge. It is a good idea to try doing some runs with mutation set to zero for a while before adding any, though, to see how well Traveller can do. Many heuristics are too tender to abide much mutation, and the average fitness using them hovers far away from the optimum solution if the mutation rate is set high. Mutation also interferes with how long adaptive permutation effort, new with Traveller's delta release, can afford to wait before it decides to up the effort level.
Layout controls affect the locations of the cities through which the traveler does his tour. Some layouts have known exact solutions, the solutions of others can only be estimated by eye.
The circular layout puts the cities on the circumference of a circle in random positions. It is interesting because it has a known and easily verified solution, and is thus useful for knowing when a run is complete, for taking statistics of the performance of various heuristics and configurations. I fixed a bug in Scott's implementation that both displayed and located the cities on rounded integer grid locations; this can end up with a solution length longer than the circle's circumference due to taxicab right angle turns to walk around the polygon, an unsatisfactory situation indeed. By keeping the actual city locations as double precision reals, and only the display locations as long integers, I fixed this artifact within the limits of detectability.
This and the next five checkboxes are a set of radio buttons; you cannot turn one off, you must turn a different one on, instead, so that there will always be exactly one valid selection.
The smallest integer square at least as large as the number of cities is computed, a grid with that many locations is provided, and then the user selected number of cities is laid out top to bottom left to right in a compact layout on that grid until all cities have locations, which may or may not mean the grid is filled. Even numbers of points can always be connected by a path that has only horizontal and vertical "unit" segments, odd numbers of points can always be connected by a path that has all unit length steps except for one diagonal step. Thus this is interesting as another layout with a known solution. Thanks to Onno Waalewijn in Norway for this suggestion.
I have the C code for this [see my computational complexity web page at the URL above for a link to the Fractal TSP web pages somewhere in the Portuguese-speaking world], it just needs porting, and it may end up as three choices, the original research paper provides three. The layout is built with an L-System parser, using the program input format for the FRACTINT freeware program, forming a fractal pattern with an easily computed and visually obvious minimum solution. The research paper described heuristic approaches that treated fractal layout cases with a million and a half cities. I don't expect Traveller will be doing that on any hardware I'll ever own. The eventual implementation will again be to form the smallest layout that will at least contain the number of cities selected by the user, and then fill that until we run out of cities.
This is similar to the even grid layout, and also sparked by Onno's suggestion. Here, each square of the original grid has a smaller square nested inside it. There is a visually obvious best solution, but I lack words to describe it. This is a very challenging layout for Traveller, much harder than randomly placed points. It nearly inevitably gets locked on a local optimum with a death grip. Until I finally got simulated annealing working as a metaheuristic wrapper around the various heuristics, there were many cases of this layout I'd never seen Traveller bring to an exact solution. The 52 point fully-populated-grid case, with a population of 35, was my first big breakthrough, in a mere 19235 seconds. [Traveller has made some rather stunning progress; as of release epsilon, Traveller, just tested, finds an exact solution for that case in 22 seconds.] In general, the solution should always consist of horizontal lines, vertical lines, or lines running at a 45 degree angle, but almost always, a diagonal at another angle sneaks in and won't go away. This is by far the most interesting layout to watch run, so far.
Here, during layout, a polygon is spun while shrinking, to provide inward spiraling trails of cities; at this writing there are five arms of spirals, it may differ when you see it. Choosing an odd number makes the problem more difficult, because for one arm the traveler cannot step to the next one over upon reaching the center and continue back out. This is one of the controls that makes me wish I had more screen room or better understood popups in Java. It would be nice for the user to be able to set the number of polygon sides and the spiral turning rate. As is, though, the traveler best path has some unexpected behaviors as numbers of cities grow large, which I'll leave it to your pleasure to discover.
Try this first with 50 cities and a population of maybe three times that size, to see what I was expecting to see happen. Next, try this with 1000 cities, a population size of 250, and the minimal spanning tree seed heuristic to see a pretty picture and the serendipitous result. Don't wait to see it run to completion; running right now as I type this (release gamma), it runs a generation in about 30 seconds, and has made only four improvements in the last 150 minutes.
This is the original layout for the problem. It doesn't have some easy way to check that the solution is optimum, except experience and the human eye. It is fun to watch because of all the cartoons a random number generator will draw for you on its way to a solution.
[This checkbox and capability has been removed; it was too messy to try to keep all heuristics capable of doing both open path and closed path calculations; now all Traveller problem instances are the classical closed path case.]
To the possible objection that a gridded layout is somehow "too easy" [as came up on the Net from several functional innumerates in response to Scott's circular layout], this checkbox comes to the rescue. Turn it on, and an invisible but sufficient fuzz is applied to the city's x and y coordinates, so that no two distances between cities should be the same, the differences are sufficient to change the evaluation of fitness, but the (now unique) solution is among the subset of expected solutions for the unperturbed case, and thus still easily recognized.
Sexual reproducer heuristics are those which create child genomes as a mix of the codon arrangements of two parent genomes. Because the SETSP involves a permutation genome, all reproducer heuristics must maintain the permutation invariant. This is especially hard to design for useful sexual reproducer heuristics, which is in part why there are fewer of them than of asexual ones.
| SRL Ordered Crossover, a Two Point Crossover variant | ||
|---|---|---|
| Show the three codon crossover region in the middle. | Rotate the view left first to put the crossover region at one end. | |
| Original genomes with crossover region marked. |
AB|CDE|FGHIJK GK|HBA|JIDECF |
CDE|FGHIJKAB HBA|JIDECFGK |
| Lift out matches for incoming codons, set them aside in the order in which they'll be used. |
HBA --|CDE|FG-IJK GK|HBA|JI---F CDE |
HBA CDE|FG-IJK-- HBA|JI---FGK CDE |
| Slide unaffected codons left-with-wrapping to make a hole into which to move the local swap codons. |
HBA --|CDE|FGIJK- --|HBA|JIF--- CDE |
HBA CDE|FGIJK--- HBA|JIFGK--- CDE |
| Slide local swap codons left-with-wrapping into the hole just created to make room for the remote swap codons. |
HBA DE|---|FGIJKC GK|---|JIFHBA CDE |
HBA ---|FGIJKCDE ---|JIFGKHBA CDE |
| Drop the remote swap codons into place in the space created. |
DE|HBA|FGIJKC GK|CDE|JIFHBA |
HBA|FGIJKCDE CDE|JIFGKHBA |
| KPD Ordered Crossover, a variant on a variant | ||
| Show only the view with the swap region rotated to the left end. | ||
| Original genomes with crossover region marked. |
CDE|FGHIJKAB HBA|JIDECFGK |
|
| Set everything aside, making room for a copy. In that copy assign a left area for incoming codons from the crossover region of the other genome, assign a right area for all the other codons from the current genome. |
---|-------- CDE|FGHIJKAB ---|-------- HBA|JIDECFGK |
|
| Walk each genome left to right. If the codon encountered is one of the incoming swap region codons, instead of copying it at all (we'll get its equivalent eventually), copy the next codon in order from the donor parent's swap region into the next available codon position in our own swap region. Otherwise, copy the codon to the next available codon position in our own non-swap region. Notice again in the diagram to the right that this means that when a swap codon is encountered in our own codons, the codon written to the child is not necessarily the codon we just encountered, as flagged by the asterisks in the cases where this is occurring. In the case where it is the same, that is just by coincidence. [To make the diagram easier to follow, codons on the right are lowercased as those positions are used.] |
---|C------- cDE|FGHIJKAB nonswap ---|H------- hBA|JIDECFGK nonswap ---|CD------ cdE|FGHIJKAB nonswap ---|HB------ hbA|JIDECFGK nonswap ---|CDE----- cde|FGHIJKAB nonswap ---|HBA----- hba|JIDECFGK nonswap ---|CDEF---- cde|fGHIJKAB nonswap ---|HBAF---- hba|jIDECFGK nonswap ---|CDEFG--- cde|fgHIJKAB nonswap ---|HBAJI--- hba|jiDECFGK nonswap H--|CDEFG--- cde|fghIJKAB swap C--|HBAJI--- hba|jidECFGK swap * H--|CDEFGI-- cde|fghiJKAB nonswap CD-|HBAJI--- hba|jideCFGK swap * H--|CDEFGIJ- cde|fghijKAB nonswap CDE|HBAJI--- hba|jidecFGK swap * H--|CDEFGIJK cde|fghijkAB nonswap CDE|HBAJIF-- hba|jidecfGK nonswap HB-|CDEFGIJK cde|fghijkaB swap * CDE|HBAJIFG- hba|jidecfgK nonswap HBA|CDEFGIJK cde|fghijkab swap * CDE|HBAJIFGK hba|jidecfgk nonswap |
|
There exists (at least in Traveller) only one of these. It makes modifications strictly within one genome, but it consults with the genomes of the remainder of the population to decide which modifications to make.
Asexual reproducer heuristics make a child genome using only the information in a single parent genome. They are easier to design with regard to maintaining the permutation genome invariant than are sexual reproducer heuristics, and so they are comparatively numerous. That doesn't keep them from becoming wonderously complex, though.
Number of child genomes generated by permuting C cities and then listing each permutation in all possible same-ordered subsets partitioned among S slots where 1 ≤ S ≤ 2*C.
Cities C
1 2 3 4 5 6 7 8
+ -- --- ---- ----- ------- -------- --------- -----------
1 | 1 2 6 24 120 720 5040 40320
S 2 | 2 6 24 120 720 5040 40320 362880
l 3 | 3 12 60 360 2520 18060 181440 1814400
o 4 | 4 20 120 840 6720 60480 604800 6652800
t 5 | 5 30 210 1680 15120 151200 1663200 19958400
s 6 | 6 42 336 3024 30240 322640 3991680 51891840
7 | 7 56 504 5040 55440 665280 8648640 121080960
S 8 | 8 72 720 7920 95040 1235520 17297280 259459200
9 | 9 90 990 11880 154440 2162160 32432400 518918400
10 | 10 110 1320 17160 240240 3603600 57657600 980179200
11 | 11 132 1716 24024 360360 5765760 98017920 1764322560
12 | 12 156 2184 32760 524160 8910720 160392960 3047466240
13 | 13 182 2730 43440 741360 13358880 160392960 5078707200
14 | 14 210 3360 56880 1025760 19513440 390499200 8202700800
15 | 15 240 4080 73200 1391760 27864000 585547200 12887078400
16 | 16 272 4896 92784 1855680 38998080 858533760 19755348480
At this writing the local permutation limit is held to five,
restricting the worst case to 240,240 fitness evaluations
(and hoping that case never occurs in practice).
Lest you think I'm whining, this little table illustrates the reasons for the limitations chosen in Traveller rather well.
Number of candidates to be child genomes generated by permuting C cities in all possible orders and then partitioning each permutation into all possible same-ordered subsets among S ≤ C slots.
Cities C
1 2 3 4 5 6 7 8 9
+ - - -- --- ----- ------ ------- --------- -----------
1 | 1 2 6 24 120 720 5040 40320 362880
S 2 | 6 24 120 720 5040 40320 362880 3628800
l 3 | 60 360 2520 18060 181440 1814400 19958400
o 4 | 840 6720 60480 604800 6652800 79833600
t 5 | 15120 151200 1663200 19958400 259459200
s 6 | 322640 3991680 51891840 726485760
7 | 8648640 121080960 1816214400
S 8 | 259459200 4141347200
9 | 8821612800
There are a variety of mechanisms which researchers have designed that can be wrapped around or integrated with any heuristic or mix of heuristics, which in some sense exercise a higher level of control or are of a higher order than heuristics.
This is based, for me, on an idea I encountered in reading one of Stephen Jay Gould's wonderful books of essays on evolution, but others have already implemented it and it is a well known part of the genetic algorithm toolkit. The idea, from Gould, is that population fitness grows faster if isolated breeding cohorts feed back occasional improved genomes to the main population. I've also seen this described as an example of hybrid vigor, where breeding in small groups weeds out the bad genes, and crossbreeding back into the larger group recombines from improved genes. A "deme" is a district, also a breeding cohort, and "allopatric" is "in another place", if I remember correctly. This again is a metaheuristic, since it will work with any heuristic. It is an absolute booger to install into existing code, though, so it will probably be the last checkbox marked operational.
Here, too, is a metaheuristic, though people don't usually think of simulated annealing as such. Most simulated annealing implementations use some single simple heuristic, often inversion or a multi-city swap, and wrap that in the annealing mechanism. They usually work with a single genome instead of a population of genomes, too. However, annealing works quite well for populations of genomes. Also, annealing works with any heuristic that is as capable of emitting a worse fitness genome as a better fitness genome (which lets out all my permutation heuristics, the little "check valves"!) This one is now implemented, and as mentioned above, sometimes provides more powerful capabilities at breaking free of local optima than do runs done without it.
A warning is due, though, that annealing as implemented here doesn't work well if the population is seeded with already excellent genomes. The high limit of permitted dis-improvement from which annealing starts effectively randomizes all but the (immunized) single elite genome that remains on the screen. Thus, the effort of creating good seed genomes is mostly discarded, and the performance of Traveller comes to resemble the case where only random genomes are used for seeding the population.
Simulated annealing attempts to simulate the physical annealing process, in which a metal or a glass is very slowly reduced in temperature from near the melting point of the substance, to room temperature. The idea is that gross disorder will be removed by the high energies at higher temperatures, at the expense of adding local disorder, but that as the temperature goes lower, the energy to cause gross changes will diminish, and local order will begin to arise due to the "fitness pressure", which in the case of physical materials, is the tendency to assume some minimal energy form like a crystal or a glass.
In the SETSP or GA (genetic algorithm) case, this converts to allowing some proportion of the candidates for the next generation to enter that generation with a worsened fitness, but over generations to reduce the amount by which that fitness may be worse, until it no longer affects the algorithm. The fitness pressures are the same ones already operating for the algorithm. Again this is simple to implement as a filter where the existing fitness filter sits, and when I implemented it in StarLogo, it sped up convergence perceptibly. Now it does the same for Traveller, sometimes minimally, sometimes dramatically, depending on what heuristics are in the mix and what problem size and problem layout pattern and population size choices are made.
[Update: With some tuning, and for larger problems, simulated annealing is a sometimes crucially useful helper for heuristics, especially in my current setup as of release epsilon for the nested grid layout, still Traveller's most difficult (because it is rife with deep local optima) layout pattern. Mostly, starting with full pattern 52 and larger, a run without annealing lasts so long lack of time forces the user to abandon the run, while a run with annealing frequently succeeds in some time fitting inside of an hour to a day or two.]
It can be eye opening to realize that a low chance of annealing can lead to a fairly high proportion of annealing. This is why the chance of annealing has to be kept low. In the run I have executing as I type this, under Traveller release epsilon heuristics, a very difficult 337 city fully populated nested grid layout, with a population of only 33, all heuristics enabled and no mutation enabled, which has been running for 19 hours and a fudge (and is within one repair of the optimum solution at this point), for the first 2871 generations, among 94766 attempted improvements of a genome, there were 5258 successful improvements due to heuristics, and 890 succcessful disimprovements due to annealing, an unexpectedly high proportion of annealing to improving, considering that the chance of a disimprovement being considered for annealing is tuned at 4%, low even before the culls by the ever decreasing annealing limit occur. The rest of the story is that most heuristic applications, on average, return no improvement, or for those that can, return a disimprovement instead. Genetic algorithms discard lots of culls.
This idea also comes from Gould, and is also already part of the GA toolkit [but still awaiting implementation for Traveller]. Population diversity stays greater when predators are around that specialize on particular phenotypes in the population (eat only one color of moth). As the prey population of the desired type diminishes in the population in response to predation, the predators must re-evolve to target the new more frequent prey. The end result is that diversity (and indeed speciation) emerge in response to focused predation. I've got most of this one worked out in my head, I just need my edge preserving cyclic crossover working first to act as a code base to loot, because I plan to match predator to prey using the same genome in each, and by counting matched edges set the probability of predation. They will reproduce in response to different fitnesses, though, and I haven't yet got that well defined for the predators. Again, in some sense, this is a metaheuristic.
I finally got around to putting in a way to turn off Scott's single elitism feature, and this is it. When it is on, the best genome seen so far is always in the population at each generation, and is always portrayed in the Salesman's Route window. The shortest length status line has identical values in the best, at last improvement, and currently slots, and the best genome is immune from being mutated or overwritten by a less fit genome. When it is off, all genome slots are treated the same, the Salesman's Route shows merely the best current genome's corresponding tour, the best genome found can be irretrievably lost from the population, and the shortest path "worst ever" value can become bigger than the value at the start. All of this is frequently desirable, because the "best genome ever" may be helping lock the Traveller into a local optimum by always providing reproduction operators a chance to put the local optimum characteristics into the rest of the population, which may otherwise evolve away from them and it.
I've implemented this one once, it was the thing that broke StarLogo finally, but have not yet implemented it here. A tabu search metaheuristic runs as a supervisor over any heuristic, and what it does is forbid any member of the population to turn into a genome that some member of the population (it or another) recently became. The concept is to attempt to preserve population genetic diversity, since crossover among identical genomes in a permutation genome setting is pretty useless, and since premature convergence is a problem that dogs all genetic algorithms. The goal also is to avoid wasted effort backtracking over recently explored portions of the fitness landscape. This is a metaheuristic because it needn't care what the underlying heuristic, or set of heuristics, may be. Obviously, for big forbidden lists, it can get pretty labor intensive to compare every proposed new population member to all those past history forbidden changes. Nevertheless, it is conceptually pretty simple to implement, it just plugs in as a filter the same place the fitness filter goes.
Both selection of genomes to participate in reproduction or to survive to the next generation unchanged, and selection of heuristics to be run, can be usefully biased in some fashion. Research has provided a small number of useful ways for Genetic Algorithms to make one choice in preference to another under control of a random number generator while still biased in a desired direction.
This is Scott's original method for selecting fitter genomes for reproduction and mating. It is the only one implemented right now, so clicking the boxes for this and the next two methods doesn't do anything except put up checks and demonstrate a radio button implementation. It is mainly here to remind me of work still to be done, and so that I could turn the next two methods into part of a trio of radio buttons (since one of the three must be selected), an improvement over their inferior functioning in my first full source code Traveller "alpha" release.
Scott implemented the classic tournament mechanism in Traveller, the roulette wheel with gate widths proportional to genome desirability. There are several problems with this, though. It is often too powerful, it takes two of them if one also needs to select for bad fitness for selecting genomes to discard, and it is calculated at generation boundaries (and probably much too expensive to maintain in the continuous case), so that shifting from a generational to a continuously updated population requires a different selection mechanism. Up comes the tournament. A tournament ignores proportions in rankings in the population, only paying attention to who is better than whom, but not by how much. It just picks some arbitrary subset of the population, typically half a dozen or less, and then picks the most fit member from that subset as the candidate to put forward. This works just as well when there are distinct generation boundaries, or when the population breeds and replaces within itself in a continuously updating mode.
When I get time, I will have two tournaments here. The one just described is the strong tournament: choose a sample, return the extreme individual in the desired direction of fitness.
A weak tournament, in contrast, tries to avoid the population diversity problems that a strong tournament can create. It works like sneaking up on a leaf when you are an herbivore; slow and careful does it, for GAs of small brains. Instead of returning the extremum in the desired fitness direction, the weak tourney discards the extremum in the undesired direction, and then returns an arbitrary member of the group that remain.
This and the previous tournament are mutually incompatible (even though not yet implemented) so the interface works, with the roulette wheel choice, as radio buttons, to let exactly one be enabled at time.
One of the options in running genetic algorithms is where to put selection pressure, and where to allow randomness. The more places fitness selection is enabled, the more quickly the population converges to something, but also the more quickly population genetic diversity is extinguished. Having children compete for survival, but having parents picked at random for reproduction, is one perfectly valid way to try to keep populations genetically diverse. The opposite approach makes sense, too, of putting the selection pressure on the parents, but putting whatever child results into the population without competition (as nature does). I haven't tried to implement this latter approach in Traveller yet, and I have no firm feeling whether it would be successful or not. I do suspect that it would require much larger populations to have much chance of success.
One of the fastest ways to solve a problem is to start with the final answer. Failing that, starting with a good approximation of the answer is often helpful. Research has provided a number of approximate SETSP solvers, and for Traveller's use, these become mechanisms to seed the population with genomes of exceptionally good fitness (in comparision to randomly chosen tours) at the beginning, in hopes that a job well begun is (more than) half done.
This is a placeholder for a planned algorithm which will create a Delaunay triangulation of the cities (removing the vertices and edges added initially to build the triangulation in some implementations), take its convex hull (or use other means) to locate its exterior vertices and edges, and remove edges from the exterior in an order that does not cut the graph, does not create a cut-point for the graph [the second assures the first, the second is done by assuring that the third triangle vertex for a triangle whose edge is to be removed is not an exterior vertex of the existing remnant triangulation], does create a cycle [effectively automatic after the list of eligible-to-be-removed external edges goes empty, and the remaining "scaffolding" of internal edges is removed, given the above], and creates the minimum additional tour length as each selected external edge is removed, the other two edges of the triangle become exterior edges in its place, and the other vertex becomes an additional exterior vertex. The remaining exterior edges will be the tour. My expectation is that this will create a seed genome of excellent fitness, since it will have no self-intersections and will be composed entirely of "short" edges, at least insofar as Delaunay triangulation edges are short compared to the average edges of a complete graph of the TSP nodes which are the cities. There exists, I have read, a proof, by counterexample, I suppose, that not every SETSP instance is solvable by a set of edges which are a subset of the edges of a Delaunay Triangulation, so it is fair to say that this seeder is at least an imperfect cheat.
This seeding technique runs an elastic net, an SETSP approximate solution heuristic in itself, just to provide seed genomes for the Traveller population. It counts as a "sighted cheat." A description is below, under the source code write-ups, for those not familiar with elastic net heuristics. Because it starts its net as a small circle centered on the Traveller play field, it is especially compatible with the radially symmetric layouts, the circle (which it should solve at once) and the poly spiral. While its seeding guesses for the poly spiral are not the final answers for large numbers of cities, they are often things of beauty in themselves.
While not an exact TSP solver, this heuristic is extremely powerful. Often, though not always, the best seed it provides for a problem of 200 cites at random locations is already free of self intersections at the first generation. Mixed with minimal spanning tree seeds which tend to be messier but better understood, much faster to produce, and giving as a side benefit the nice "MST floor" display value, it makes feasible solution of problems otherwise beyond the Traveller users' patience.
This seeding tool is too slow (see details below in the source code section) for bigger problems (at least ones done for fun, it is perfectly feasible if there is money involved), and takes the fun out of smaller problems. On one random layout problem sized at 20 cities, for example, it just returned the final solution as an initial seed genome, rather taking the wind out of the rest of Traveller's sails. Thus, the elastic net is recommended as a genome seeding tool for problem sizes the order of magnitude of a hundred cities. It goes extremely CPU bound in the setup phase of Traveller, so the Traveller application or applet window doesn't respond to much of anything until it Elastic Net has finished it chores, and other applications have trouble initializing. Java doesn't "play well with others", or perhaps I just don't yet know how to teach it to be polite.
Since this seeder can take a significant amount of time to execute, I've provided a floor show as of release delta to keep the user entertained while it runs. When enabled, this seeder pops open a window showing the evolution of the elastic net to approximate the city locations. This is slick enough to stand as an elastic net demo all by itself, and doesn't add much to the (already huge) processing time required to complete an elastic net run. I've also made the seeding algorithm work harder to do a better job. The trade-off of course is that seeding takes even longer.
If running multiple seeders with floor shows, it is probably worth mentioning that because seeding is interleaved, all seeders run at the speed of the slowest one. Also, it is up to the user to unstack the various windows to see all the shows. The floor show windows close themselves when the seeding runs are done.
When I find a copy to steal, this will be one of the 2-OPT or 3-OPT approximate SETSP solver's output, used as a seed genome to Traveller. Right now it is just a wish.
This is the minimal spanning tree seed genome creation mechanism, a sighted cheat. It attempts to put excellent seed genomes into the population to cut short the nonsense of weeding down from random genome seeds to gain any beginning fitness at all. It works by generating the minimal spanning tree (MST) for the complete graph of the city nodes (see any competent computational geometry book; I use the one by Preparata and Shamos), doubling every edge of the MST, then starting at some arbitrary point, walking the edges in "outside" order, picking for the genome every codon, when first encountered, that hasn't already been added to the genome. Since the minimal spanning tree is by definition of shorter length than the solution to the SETSP, and since the cycle developed by the above described process, at every point where it ignores a node already harvested, shortens the doubled MST by replacing two edges of a triangle by the triangle's other edge, and in the end does produce a qualifying permutation genome, the seed genome produced is known to be at worst less than double in length the length of the ideal SETSP solution. For a thousand city random layout TSP problem, this is usually about 15 times shorter than the average of the randomly seeded starting population, a good start at reducing the length of the Traveller's tour.
There is (naturally) it turns out, long prior art and a more elegant way to do my doubling and then unwrapping of the MST; see: Nicos Christofides, Worst-case analysis of a new heuristic for the travelling salesman problem, Report 388, Graduate School of Industrial Administration, CMU, 1976. That even predates by about 8 years the first time I invented my version. Much like my math degree, where I never studied any math invented in the 20th century despite taking most of the coursework for an MS, there is a lot of background to learn before becoming "state of the art" in SETSP solving. The Christofides version adds only needed edges, and thus the unwrapping is cheaper too.
As of Traveller release epsilon, the MST seeder produces one unique seed before all the randomly subsampled ones. It does a greedy thinning of the doubled MST, removing edges in the order that provides the greatest edge by edge shortening of the cycle, until all nodes have been reduced to single occurrances and a tour which is a Hamiltonian cycle and thus a "genome" has been created. This is now the first MST seed returned, and then the ones where surplus nodes are removed at random are generated and returned for the rest of the MST seeds. Even this greedily derived version is not the best that can be done, for with much more effort, one can greedily remove edges that are the best choices that don't leave one with a self-intersecting seed genome. I've done that one in the past (18 years ago), I wish I remembered exactly how! [I do remember that it involved trig functions, which doesn't thrill me as a performance factor, but the previous instantiation was running on a screaming 0.5 megaHertz Apple ][+, so I'm sure it would be survivable somehow.] Also, the random surplus node removal seeds need work; right now the surplus nodes are not removed with equal probability, something that quickly becomes evident in the uneven density of edges in the MST seed depictions for large problems using large populations.
Since this seeder can take a significant amount of time to execute, I've provided a floor show as of release delta to keep the user entertained while it runs. When enabled, this seeder pops open a window showing first, the construction of the minimal spanning tree from the nodes/city locations, and then the (cumulative) added edges as the (edge doubled) minimal spanning tree is thinned to make seed genomes. Also as of release delta, the minimal spanning tree seeds have much better variety, due to an improved sampling technique. The trade-off is that seeding takes longer.
If running multiple seeders with floor shows, it is probably worth mentioning that because seeding is interleaved, all seeders run at the speed of the slowest one. Also, it is up to the user to unstack the various windows to see all the shows. The floor show windows close themselves when the seeding runs are done.
Traveller also offers the user a choice of ways to seed genomes into the starting population. This one is Scott's original seeding mechanism, rewritten by me to show better randomness. Cities are sprinkled in the genomes in random order, with no attempt at good fitness. This is Traveller's default seeding mechanism, and the seeding choice checkboxes are set up so that turning them all off turns this one back on again. Also, unimplemented seeding mechanisms, when selected, choose this one again. Since each seeding mechanism selected gets an approximately equal share of the genomes to seed, enabling unimplemented ones ups the portion of randomly seeded genomes.
To assist the user/programmer, Traveller provides ways to see into the problem beyond just the current solution. Some of them are parts of the Salesman's Trip Report status panel, others are enabled here.
This is a nice debugging and intuition-enhancing tool, and is one of the checkboxes that is enabled for use even when the Traveller is in started mode. Until you turn it off, it puts up at each generation, behind the first best genome depiction, the circuits of all the other genomes, to help you see how much diversity remains in the population. This is a really fun display, showing lots about how the genetic algorithm searches for a way to make progress, but it makes the optimum path in front of it hard to see when there is still a lot of diversity in the population. Its drawing overhead also increases Traveller generation running times by around 16 percent in my setup.
This is a nice debugging tool, and is one of the checkboxes that is enabled for use even when the Traveller is in started mode. It puts up once (look quickly) the circuits of all the genomes, to help you see how much diversity remains in the population.
This one does what it says. It just turns on debugging printouts scattered all over the code. Be prepared for a deluge. Last run I did this it put out 60,000 lines between my clicking start, seeing a generation happen, and clicking stop. I'm fairly sure nothing useful will happen if this is turned on in applet (as opposed to application) mode with no file system available to accept the output, but if "it makes demons fly out of your nose" (Ada programmers' slang for the possible behavior of erroneous code), you were warned. This checkbox is also always enabled.
This checkbox does what it says. It turns on debugging windows, one for each enabled heuristic. This checkbox is always enabled, but due to various race conditions I don't understand and that may be OS specific and therefore outside my control, it can be tricky to make this work while a generation is being processed, and sometime it seems to cause a corruption of pointers to off-screen image memory (on my MS-Windows 98 system, no big shock), which can destroy your screen image or even corrupt your (OK, my) OS enough to crash your system. So, this is a bit risky, but I still use it all the time.
This checkbox can produce more windows than a typical display will contain, if all heuristics are enabled, so each visual debugger window also does a "pop to front" when its heuristic is the one running. The user can just leave all couple dozen visual debugger windows stacked where they initially open, or drag them away to scatter them in heaps all over the screen. This makes great "visual chewing gum" for a cocktail party. More serious use for actual debugging would probably limit the enabled heuristics to one to four so all the active visual debugging windows could have non-overlapping screen locations. This is especially important for the heuristics with very brief execution times, which otherwise disappear under other visual debugger windows before thorough review by the user is possible.
Visual debugging slows down Traveller runs profoundly, so don't enable it when trying to time a run or get an answer fast. In addition to all the drawing overhead, there is a one second pause built into the last-drawn depiction of each heuristic so that the user has time to glance at the result before the visual debugger window gets overlaid. There is more description of the visual debugging windows and their contents far above this portion of the document.
On this line is the name of the heuristic which produced the exact genome you are presently seeing in the Salesman's Route window. This allows the user to judge which heuristics are being most successful in each generation at moving the population fitness, and best genome fitness, forward. When you figure out which is the best heuristic for a problem setup, turn it off, and find the second best. You might find, by repeating this a few times, a genome mix that performs better than a full kit of genomes for the particular problem difficulty you have chosen. There are some apparent anomolies due to what it means to be the "best genome" of a generation when clones of the original best genome start to accumulate in the population (the population "converges on an optimum"). The best genome is that one of all the clones of the best genome that has the lowest genome ID. This becomes important because while one heuristic might initially do the improvement that leads to a new best fitness, other heuristics might either discover an independent path to that best fitness's first example genome order, or might copy its improvements to another child genome, and thus become the new indicated parent of a "best genome", although no fitness improvement has occurred for the best of the population, merely a fitness improvement for some child genome that makes it the fitness equivalent of the current best genome. Mixing this with the "jitter" that allows equivalently fit, but differently ordered, genomes to replace any genome, and you might see both the responsible party and the drawn best genome change, with no betterment of shortest path fitness. Whew.
On this line is the ID, the array index name of the genome whose path is pictured in the canvas. Watch this to see if it is bouncing back and forth among a few population slots (at least in the initial implementation, Traveller genomes compete to create a child to replace themselves in the population, so a new better member in the same slot is a child of the old occupant). Look also for the ID to descend toward zero (the first of identical best solutions is the one displayed, counting up from zero), and the clones count next to it to start rising. This indicates that the population is converging on a solution, whether a local optimum or the globally optimal solution. This is one way to identify the need for some mutation to be added in a new next run, if a premature convergence occurs.
The clones count tells how many population members have genomes identical to the one whose path is being displayed. Pay attention if it rises to the population size. Also pay attention if it never exceeds one, an indication that the algorithm is not converging on a solution, a possible indication of too much mutation occurring. It can also stay at one when the layout is a regular one, with lots of alternative globally optimal solutions, though.
The sameFitness count tells how many other population members have (within an allowance for roundoff error) the same fitness as the current first best genome. This is mostly of interest for regular layouts, where there are potentially lots of equally fit optimum solutions, and lots of different but equally fit local optima into which the Traveller can fall. Looking at the sameFitness count gives the user some clue as to how much of the population, although not having identical tours to the current first best genome, which the clones count detects, is at the same level of progress anyway.
I finally added the brute force exhaustive search approach time estimate from my StarLogo version of BTSP, and it sits on this line. Any time you think Traveller is a bit sluggish, look at this line and consider the alternative.
This line and the next give more "through the screen" visibility into the operation of Traveller. This line just provides titles for the concise representation on the next line, which in two counts clusters shows the number of candidates improved, advanced by annealing despite disimprovement, mutated without consideration of improvement or disimprovement, and the number of candidates considered, with separate display clusters for the totals for all generations, and for the counts for the just completed generation. If single elitism is enabled (no longer the default), then the total number of candidates considered in each generation will be one less than the population size. In the (usually) very rare case that a candidate is cloned directly into the new generation without any improvement attempt, that one will also not be counted among the candidates considered. As Traveller gets more focused on a solution, improvements become harder, and so that number drops. As the annealing decrementor lowers the annealing limit, it becomes harder and harder for a disimproved candidate to sneak in by annealing (as intended), and so that number drops on average over time. Because mutation is probabilistic, the unchanged mutation rate will result in varying numbers of candidates mutated. Since mutation is attempted on all candidates whether improved or not, for some settings the sum of the first three numbers could exceed the fourth number in the individual generation counts cluster.
This line constains the data for which the previous line provides the titles, with overall problem counts and most recent generation counts shown.
This title line just shows the order of entries on the next three lines. For each of them it indicates the best (most fit) value ever of the corresponding type, the value as of the last time the best-genome-ever's fitness improved with a new genome discovered by Traveller, the value of the corresponding type as of the end of the most recently current generation, the value in the starting population, and the worst (least fit) value ever seen of the corresponding type.
To conserve lines of the status display, the "lengths" lines that follow are appended with small informative values that would fit in the remaining space. This line caters to that reality by noting that after the lengths values comes "other stuff".
This status line shows the values as described above for the approach to the goal of the algorithm, the shortest possible path length.
After the shortest lengths value may come, depending on other settings, both a "solved at" value when an exact solution length is known, and an "MST floor" value displaying the MST seeder's output of a value known to be less than or equal to the exact solution value.
This line shows the values as described above for the population average fitnesses.
Next to it, to give the user a feel for the units in which fitness is being expressed, is the size of the entire Salesman's Route window's graphics canvas, including the unused black border padding, but not including the pink frame around it.
To give the user a feel for the range of values of fitness in the population, which are often severely skewed rather than some nice bell-curve-normal distribution, this line shows the values as described above for the longest path (least fit genome) in the population.
Next to it may appear, if a circular layout is selected, the circumference of the circle.
This line is an indicator for the adaptive permutation effort status, consisting of the current high permutation quantity (effort level) allowed, the highest level allowed ever, followed by a fraction showing the number of consecutive failures of permutation heuristics to improve any genome, as of the end of the just completed generation, over the number of consecutive failures that are required to raise the effort level by one to the next level. This fraction's denominator increases with each effort level, and is tuned upwards by number of cities, and downward by increased levels of mutation rate, once at the beginning of each problem, and retuned once each time the permutation effort level is adjusted upward. Like the clones count, a rising adaptive permutation level is an indication that the Traveller population has focused on an optimum, whether local or global, making improvements difficult (or impossible) to locate or increasing the effort level needed to locate them. Increasing the effort applied to find more improvements when such an increase is needed is exactly the reason for the existence of adaptive permutation effort in Traveller.
This line shows the generation at the (last) improvement, and the "now" (most recently completed) generation. It also shows the number of "improvements" (new, better, "best genome ever" discoveries) since the run began.
This line shows times: the elapsed seconds for the genetic algorithm portion of the run at (last) improvement of the best genome ever value, and "now" (at the end of the most recently completed generation); following those two values is the time in seconds it took to do the seeding and city and distance array setup efforts.
When annealing is enabled, this line shows the annealing decrementer (the value by which the annealing limit is multiplied each time an annealing disimprovement succeeds), and the current value of the annealing limit, or how large a disimprovement is currently acceptable, in units of the length of the side of a screen pixel.
My three original permutation algorithms share some features. Each selects a count of somethings to permute at random. Next it selects those somethings themselves to permute at random. Then it works its way through all possible permutations of those somethings, computing the fitness of each. Last it returns the genome as modified by the one permutation which gives the genome the superior fitness. Quite often this is the identity permutation, and thus the original genome, if things whose permutation doesn't help genome fitness are selected. This is thus a version of the brute force approach, but restricted to permutation sets small enough that we can afford to wait for it to run to completion.
[This brings up the issue that, ideally, a mixed heuristics genetic algorithm package might work better if various heuristics could be slid in and out of the enabled set as the solution effort worked through various phases, which in turn adds the issue of how such phases should be defined at design time and identified at run time. As folks in the academic community say in research reports when laying groundwork for the next grant application: "the need for more study is indicated."]
Since those original three algorithms, I've added many other algorithms to Traveller that share the concept of trying all possible combinations for some subset of the genome, and all benefit from having that subset be small early in the run and only grow as genome fitness approaches optimal.
[I haven't solved the problem of sliding heuristics into and out of the mix, so I approached the problem from another direction. As of the "delta" release of Traveller, the several algorithms dependent in some sense on a permutation limit are adaptive. That is, they work with small permutation sizes early in the run of a problem, when useful permutations are easily found. When long strings of failures to improve anything with small permutation sizes are experienced, the permutation size is bumped until a global upper limit is eventually reached. This has lots of nice consequences.
The various permutation heuristics communicate through the PermutationController class to share information about how many consecutive unfruitful permutations have occurred, and thus judge when it is time to increase the permutation effort level. It is worth noticing that Mutation Rate and permutation adaptability have a fundamental conflict: if all the permutations are doing is undoing the mischief of mutations, then the mechanism for judging when to raise the permutation effort level is inappropriately satisfied to leave it low. It is best to set Mutation Rate to zero when using permutation heuristics, and the permutation adaptation mechanism, to counter the effects of non-zero mutation rates, will by design set the failure rate at which the permutation effort level is bumped at a low enough level that it is shorter than the rate of introduction of new mutations. This makes permutation effort levels rise faster and earlier than is the case in the absence of mutation, since the alternative was not to have them rise at all.]
Several algorithms that don't really permute anything now piggy-back on the run stage information that adaptive permutation attempts to capture, such as Disordered Slide and Ordered Slide, which use the current adaptive permutation limit merely to decide how many codons to slide together. Since the size of the adaptive permutation limit in some sense contributes to the strength of these algorithms, they too provide feedback of success or failure to help adapt the permutation level.
Population diversity can probaby be defined many ways; number of unique genomes compared as a proportion to population size is a very usual choice. From the point of view of Traveller, which has many mechanisms for combining and manipulating genomes, genomes merely being different isn't sufficient. Traveller is attempting to collect together a set of graph edges into a single genome which make that genome the global optimum solution. Most important to Traveller then, is that at any time a high proportion of all those needed edges from the global optimum solution exist somewhere in the population, and that the missing ones are being created sufficiently often to be available when needed.
Traveller is full of very powerful heuristics, but Traveller's test cases, particularly the nested grid layout, are full of pitfalls, too: local optima that are easily entered and that require many precise simultaneous genome changes to exit. This makes it important not to let the powerful heuristics remove all population diversity and leave Traveller with no tools useful in escaping a local optimum. This is, of course, a balancing act: it would also be nice to see Traveller still able to converge when a global optimum is finally encountered, the tricky part being that while the user can often distinguish local optima from the global optimum or optima, genetic algorithms provide no such capability to Traveller.
Traveller is assisted in avoiding this trap by several features.
+ 1, and only
selects second mates for sexual reproduction, and
consultants for consultive reproduction, based on fitness.
This prevents fitness pressure from too rapidly decreasing
population diversity. Most of Traveller's fitness pressure
arises in deciding whether the parent or its child should
occupy the same slot in the next generation.
Many times during software development, choices are made about ways to do things that to a casual observer could have been done either way, or in any of a variety of ways. When the programmer feels strongly enough about such a choice to defend it, that choice becomes a policy, and needs documenting. Traveller has many policies, and I have tried to mark the code with comments containing the string "POLICY" where I recognize myself implementing one.
All of these that are under my control, I will fix over time, while my enthusiasm endures. Mostly I lack knowledge and time; the former I will fix, the latter in some sense cures itself, or else you die first.
In my first Traveller full-code release "alpha", inver-over was broken, and I didn't even know it. I'd tried to test a polymorphic array to be of the type of one of its content options, and that doesn't work. I was also copying consulted chromosomes, which I only needed to read, in a hot loop, which killed the garbage collector once I got by the first problem. It is fixed now. The more general programming lesson learned is that the messages in exception catches shouldn't be conditionally enabled, they should always be enabled, so that when you write stupid code, you immediately see the complaints shouted in response. The second programming lesson learned is that programmers silly enough to depend on source-code-converter-vendor-supplied garbage collection instead of carefully crafted new() and dispose() calls deserve what happens to them when they fail to pay sufficient attention to garbage collection considerations, which are every bit as tricky to get correct as doing ones own heap maintenance, and a lot harder to isolate when done wrong, because they are implicit rather than explicit in the source code. Sigh.
In my first Traveller full-code release, the routine now called Optimize Nodes Near A Point was severely broken, and I didn't even notice it. I'd committed an "off by one" error in manipulating array indices, and was discarding cities from the genome. If I'd been better at my testing, I would have notice that this routine run by itself crashed Traveller cold, dereferencing null pointers in a few generations. Lesson learned: always test new facilities exhaustively alone, before mixing in other functionality that might hide or compensate for the newly introduced bugs. Sigh.
Does it seem strange to you that Java, which doesn't even support pointers, frequently commits null pointer dereferencing errors? Apparently, though Sun trusted Java enough to write it in Java, they didn't trust pointerless code enough not to give themselves some backdoor way to abuse pointers anyway. This is too bad, and probably makes Java of lower quality than it could have been. I know I'm constantly surprised to be encountering obvious bugs in a compiler this mature. Programming lessons learned: there is no such thing as a mature compiler, and you can definitely trust the build and execution environments to add their flaws to those you create in your code yourself. Sigh.
Probably things work a little better (in the Population class) since I fixed Scott's inputs to his roulette wheel for genomes, which were assigning the widest roulette wheel slots to the least fit genomes. This was compensated for, and his code worked at all, because he was also assigning the full fitnesses, rather than some value modeling the fitness differences, to the roulette wheel slots, making the slots all nearly equally wide anyway, and the overall power of the genetic algorithm paradigm overcame the incorrectly directed bias and incorrectly proportioned biases his roulette wheel was providing. Sigh.
After bragging endlessly about "fixing" Scott's code, naturally I turn out to have just replaced his bad code with my bad code in some instances. Until Traveller release "epsilon", my ever-so-elegant Ordered Crossover variant was carefully copying the crossover codons in the order they had in the recipient genome, rather than the in order they had in the donor genome, as originally intended, rendering the heuristic "interesting", but not particularly "useful". I didn't discover the blunder until I was creating the table in this document to describe the differences between my version of ordered crossover and Scott's version, and had to go back to look at my code to remember how it worked, which turned out to be "incorrectly". First programming lessons learned: "elegent" doesn't replace "correct" as a figure of merit. Second programming lesson learned: do the documentation as you go along: having to explain what you did is a great debugging aid (the "to suddenly find your error, all you really needed to do was describe your code to a potted plant" trick so familiar to long time programmers and the long-suffering patient fellow programmers who get to hear the code described without ever getting to put in a word of advice before the problem is located), and you don't end up with a huge slug of deferred unpleasant work to face all at once. Sigh.
In an earlier Traveller full-source code release, I said about Elastic Net: "It looked much more impressive in his applet than it does in my code, where I notice the fitness of the tour bouncing up and down as the relaxation progresses, but I don't have access to the tuning parameter descriptions in the research paper from which he developed his code, so perhaps I have mine set poorly. Still, though the "solutions" don't look much like solutions for city counts in the hundreds ... ", which was a slander, and for which I apologize. Sun's buggy quicksort demo code, foolishly adopted by me without thorough testing, which I was using to convey Elastic Net's intermediate solution approximations to Traveller as seed genomes, was the cause of problems I incorrectly attributed to Alexander's nice code. I finally ripped out Sun's code completely and wrote my own quicksort from scratch. Alexander's code does a fine job as far up in city counts as I've yet dared to use it. I then correctly concluded: "... (and Alexander says in his code that I'm outside its design capabilities there), the seed genomes captured from the relaxation are an excellent assist to Traveller." Yes, programmers make social blunders as well as, and sometimes combined with, coding blunders. Sigh.
As files, the Traveller code is divided up into top level routines, routines down Scott's package path, and routines down my package path. Logically, it is further subdivided (by Scott, with me imitating him) into genetic, tools, ui (user interface), and (my addition) structures, where I stuck quicksort and the Sortable interface so that structures implementing Sortable could rest beside them. Besides my code and Scott's code, also incorporated from original sources, all at least somewhat modified by me, are Sean Luke's port of the Mersenne Twister to Java, Alexander Budnik's Elastic Net code, the nice little MultiLineLabel class from the VoroGlide distribution (attributed there to Christian Icking, Rolf Klein, Peter Köllner, and Lihong Ma), and (already included in Traveller's sourc code, and compiled, and linked in Traveller's build kit, though not yet implemented into Traveller or used there) a big chunk of the Qhull distribution which will become my Delaunay Triangulation seeder someday after Herculean redesign effort and rewrites still to be done (it is a long integer location coordinates algorithm, I need a double precision float algorithm, for starters, and my triangulation will need many more structures, annotations, and markup points than Qhull's makes available). Published algorithms not available as code are credited where the authors are known. I have lost track of the origin of the Fortran 66 permutation generator code I copied from the Net and ported, but it works well and fast, however mysteriously; thanks to the original author, wherever you may be.
Besides code, the Traveller.jar file includes documentation files, build tool scripts, and license documents.
Here is just a comment about each file, usually brief, to help you find stuff. As in Star Wars, if you really want to understand Traveller, Use the Source, Luke!
This is the jar tool's notes to itself about what it put in the jar file; entries in the manifest help make the jar file an executable file.
This is the infamous GNU Public License, under which Scott published his original versions of Traveller.
This is the Free Software Foundation's alternate GNU Public License, useful for libraries used without modification in other software. Included for information only, I don't think it is used to decorate Scott's code.
A Unix-style makefile, which I execute with GNU make. It builds Traveller and also some little test routine I keep around. It also builds a jar file which I use to distribute Traveller. The Java compiler is supposed to be its own build utility, but I've seen it fail to rebuild classes for modified Java source files, and I know that "make" works, so I created a none-too-sophisticated makefile, and it is included for those who know how to use it and have the proper tools to execute it. The makefile does not (by design or laziness, I'm not sure) capture all the module to module dependencies, so when complicated changes with impacts "sideways" are done, a clean compile is needed; the makefile is mostly useful to avoid excess recompilation effort when making lots of changes in one module and recompiling and testing frequently along the way, and also because it captures the complexity of the jar files contents list. The makefile supports non-file targets "all", for the test and traveller main executables, and the jar file; "test" for the Test executable alone, "traveller" for the Traveller executable alone, "jar" for the jar file alone, and "clean" to remove all the *.class files and the jar file when a fresh start is needed.
The jar file built is executable, and I have successfully executed Traveller from it at the command line, using
java -cp Traveller.jar Travellerbut I haven't figured out how to use it to run Traveller as an applet from a web page. Sigh.
This is a little Unix-style (since I'm working in the Cygwin Unix work-alike environment for MS-Windows) shell script helper routine for the makefile's "make jar" option, that deselects source files which are not being compiled to build Traveller. These exist because Scott's Coyote Gulch distribution contains lots more code than just the stuff for his Traveller applet, and I don't want to weed that out of my working set; it proves useful to read once in a while. Unfortunately, my Traveller beta release included these files in the Traveller.jar file by mistake. Oops. That's why I wrote and added this script.
A copy of this document you are reading, included with the source code.
This is a whole separate document just to describe the Edge Preserving Cyclic Crossover heuristic, written because: 1) yes, it is that complicated, and 2) other people might want to reimplement it from the documentation, or use the documentation to understand the source code in Traveller, and this stand-alone document is much less intimidating than the (huge) document you are reading!
This is the main routine. I ripped out almost all of the original contents into separate classes; it used to be about ten times this big (no exaggeration). Now it is just responsible for putting up a one button splash panel, and for initializing the application and some of its classes that need a setup() call.
This abstract class defines a Chromosome, from which TravellerChromosome is derived. It is very thin gruel, but this is the entity all reproducers emit.
This class defines the Mutator interface, which has changed to be an extension of the reproducer interface, now that I've finally kept my promise to make mutators out of reproducers.
This creates a container from which a variety of Mutators may be hung, and instantiates a roulette wheel to select among them.
This code-rife class is the container for an array of genomes, and the code that maintains them; this is where the reproducers and mutators get run and fitness filtering happens. This is where the roulette wheel for reproducer selection biasing gets built and used. Many statistics used for status reporting are developed here. The first best genome is found and returned from here. Single Elitism, when enabled, is implemented here. [The policy that sexual reproducers' offspring compete for survival in the nest is distributed among the sexual reproducers. It might be nicer if it were instanced once here, instead, but merging mutators and reproducers made it essential that reproducers only return a single propagated child genome, not an array of them.] The call to simulated annealing happens here. When it is implemented, the call to tabu search filtering should happen here.
This defines the interface for a reproducer. My version is even thinner than Scott's. Since I derive three different sub-interfaces from this one, I had to remove the code that wasn't shared.
Like the MutatorVector, this is a place to hang a bunch of instances of Reproducer classes. With my changes, it now contains polymorphic Reproducers, sorted out for appropriate use in the reproduction loop in Population.java.
This is Scott's Roulette Wheel implementation. There is nothing wrong with it (after a picky bug fix) except that it is more constraining than tournaments will be. It is used in Population.java both to choose which genomes get to reproduce, and to choose which reproducer heuristics get to run.
This meat filled class is the home of the genome definition, and of the tools that access it and allow it to be modified. As each chromosome also keeps a handle to the "world", this class allows chromosome manipulators to access the public member methods and public member data provided by class TravellerWorld, frequently a lifesaver. An improvement as of Traveller release epsilon provides a checkValidation() method which is now called at the bottom of each reproducer (and the several reproducers which are also used as mutators) before genomes are returned to the population (only reporting is done, but checking in the genome creator/modifier simplifies reporting which one caused a problem).
This is the class that maintains the canvas on which the best genome's path is drawn. It is also vastly complex, balancing with the Traveller top level source file for sheer indigestibility. The population is instanced here, much other stuff happens here. This was worse before I took out the checkbox code and gave it its own home, took out the layout code and gave it its own home, took out the playfield code and gave it its own home, took out the seeder code and gave it its own home, took out the valuator code and gave it its own home, took out the pushbutton code and gave it its own home, took out the reproducer and mutator vector building code and gave it its own home; you get the idea.
I've attempted to break apart all the stuff Scott had running in the constructor, so the various functionalities are now methods, but they still run at constructor instancing time. This causes me several problems. My timings don't include constructor instancing times, since the constructor runs at the time the "Set Configuration" button is pressed, and my timer doesn't start until the "Start" button is pressed, to avoid incorporating the human lag time between button presses. [Fixed, setup is now timed and reported separately.] Also, things like minimal spanning tree creation, and elastic net seed creation, would be interesting to watch run in and of themselves (and would help fill the idle time while they do run), but because they run during construction, the thing constructed (the "world" which is a Java Canvas) is not yet available for use to do the displaying. [Fixed, they now have their own canvases, courtesy of the TravellerCanvas class.] My goal in pulling the functionality out into its own routines is to move them all under the control of the "Start" button, but I'm not there yet. So far, though, things are at least a lot more readable than in my previous release. [The canvas is now refactored out into TravellerCanvas.java, a step in the correct direction.]
This is a little utility class Scott wrote to handle vectors of floating point real numbers nicely.
I've never looked into this code, trust the name.
This is where Scott's attempt at a random number generator lives. Anything that eats as many random numbers as a genetic algorithm does needs something much more robust, which is why this tool is mostly just used to seed the Mersenne Twister random number generator, now. His "next()" should have been "nextInt()", too, it turns out; next() is a private method in the original core Java development kit code.
This code by Scott handles scheduling tasks, and making threads alive or asleep.
I've never looked into this code, trust the name.
I've never looked into this code, trust the name.
I've never looked into this code, trust the name.
I've never looked into this code, trust the name.
This is a derived interface from the Reproducer interface, for the case of a Reproducer that takes a single genome (Chromosome) as input and propagates one child genome as output.
Scott's genome was originally just an array of integers. I factored it out to leave the possibility for future expansion, and because I needed immediately for the array contents to be objects so that they could be possibly null, and to be objects so that I could use them to implement the Sortable interface. Here is the class that implements the new "codons" (a correct technical term of art) that go to make up a Chromosome (slang for genome).
This is an interface derived from Reproducer, for a style of reproducer of which inver-over is the only example I know, that takes a genome to mutate, and a population of genomes from which to draw guidance for that mutation, and propagates one child genome as output.
The code to build the vector of reproducers, and the separate vector of reproducers that are being used as mutators, got refactored to here from Traveller World as part of the neverending struggle to cut that monster down to a maintainable size.
In the process of hooking the various classes implementing extensions of Reproducer to the reproduction vector, this class implements a policy as to how frequently each will run in proportion to the others, which Scott implemented via one of his roulette wheels. Right now they are all set equal [Update: Traveller's asexual heuristics started to outnumber sexual ones so dramatically, this is no longer true; now crossovers are set as five times as likely to run as asexual heuristics, and RollingCrossover is set at double that; this should bring some balance back in the full heuristics runs.] (except that I dramatically reduced the run chances for the default "clone" reproducer), but a dandy project for someone who understands Java pop-ups would be to implement controllers allowing user manipulation of these proportions. I know there are times I would like to double or triple usage of a particular heuristic in the mix, but right now there isn't enough screen real estate to support controls to make such changes.
This is my third interface derived from Reproducer, this time one taking two parent genomes as input, one selected in sequential order from the population, one selected by fitness bias from the population, and propagating one child genome as output.
This class implements the simulated annealing metaheuristic (discussed under its checkbox entry), which is used by Population to choose in part whether parent or child genome survives to the next generations.
Dewrinkler is a bastardization of Smoother to ignore the fitness contribution of one edge position in the segment being permuted, which has the effect of capturing worst fit codons and dragging them for a step or two as the heuristic walks around the genome ring.
[Originally com/coyotegulch/genetic/TravellerClone.java] This class acts as the default Reproducer, just copying the parent to the child. It got moved to the KPD source code tree as part of a regularization of the reproducers into asexual, consultive, and sexual subsets to match the interfaces they implement, and renamed as part of a stripping of "Traveller" from heuristic names so that they would fit in the title bars of the (originally tiny) visual debugging windows.
This asexual reproducer is a fairly unsuccessful invention of mine, described under its checkbox entry. Since I finally finished my long bragged-about merging of mutators and reproducers, it is more useful as a mutator than as a reproducer, and it gets picked up automatically for that use whenever mutation is enabled, whether enabled as a reproducer or not.
This is my modification of Scott's sublist inverting asexual Reproducer. It got moved to the KPD source code tree as part of a regularization of the reproducers into asexual, consultive, and sexual subsets to match the interfaces they implement, and renamed as part of a stripping of "Traveller" from heuristic names so that they would fit in the title bars of the (originally tiny) visual debugging windows, and to make the visual debugger title bar name more closely match the CheckBoxControls name label. It is also used as a mutator.
This is my implementation of a simple one codon move asexual reproducer. It is also used as a mutator.
This is an enhancement of Optimize Nodes Near A Point, to also create slots from edges close to and facing that point, in the sense that a perpendicular dropped from the point falls on the line of which the edge is a segment, in the part of the line between the two city locations that define the edge. The goal is to transfer cities to nearby edges whose ends are not necessarily nearby.
This is an intensification of Optimize Nodes And Edges Near A Point, to repeat that heuristic once for every city for a point selected in a distribution centered at that city.
This implements the "optimization near a point" heuristic, a sighted cheat, as a Reproducer, as described elsewhere in this document. There is a very complete description of the algorithm at the end of the source code file which implements it, see it below the source code section.
As described in its checkbox entry, this implements the "optimize near every city" heuristic, trading effort for power. It was the second time I got caught putting a loop around another heuristic and forgetting to account for the situation that there is a new "previous genome" for each iteration of the loop, and so the input parent genome does not suffice for reuse. "Too soon old, too late smart" got me again.
This asexual reproducer is a fairly unsuccessful invention of mine, described under its checkbox entry. It is also used as a mutator, where it has better success.
This is the "local codons" version among my three original asexual reproducer permutation tools, as discussed earlier. It is the intellectual parent of many of Traveller's most successful heuristics.
This is a specialization of Permute Several Sublists Of Cities, using the city selection mechanism of Optimize Nodes Near A Point, to cluster the ends of the genome segments being swapped around a single randomly chosen location on the playfield.
This is an intensification of Permute Cuts Near A Point, to repeat that heuristic once for every city for a point selected in a distribution centered at that city.
This is the "widely scattered codons" version of single city permutation in an asexual reproducer, as discussed earlier.
This is the conceptually most powerful of my three original permuting asexual reproducers, which permutes and inverts pieces of the circuit cut into sub-paths, as discussed earlier.
This is a fairly successful attempt (after lots of failed tries) to adapt the form of the quicksort algorithm to the task of improving SETSP genomes. It uses the usual quicksort algorithm, except that at each point where quicksort would do a swap based on the results of an order comparision, QuasiQuickSort does a swap if a check, that the swap would improve the genome fitness, succeeds.
This is a very successful attempt to adopt the Shell sort [specifically, the Shell—Metzner sort, adapted from Pascal Programs For Scientists and Engineers, by Alen R. Miller, ISBN 089588-058-X, who in turn credits his version as "Adapted from Programming in Pascal, P. Grogono, Addison — Wesley, 1980"] to the task of bettering genome fitness. The basis of the Shell sort is to pick a "span", an integer at least half the length of the array to be sorted, and sweep two indices separated by that span across the array, swapping where elements are found out of order a span apart. After each such pass, the span is decreased by some factor to leave it at least one half its previous size, and another pass is done, until the span has been reduced to comparing adjacent elements. The difference here is that substring inversions are used instead of swaps, and the inversions happen, not because of an out-of-order situation, but when a test shows that doing the inversion would improve genome fitness. The level of success was a little surprising, but the basic concept of establishing large scale order before attempting small scale ordering is sound, even for the SETSP application.
[This routine is a fraternal twin to the above routine, thus the similarity in descriptions.] This is a very successful attempt to adopt the Shell sort [specifically, the Shell—Metzner sort, adapted from Pascal Programs For Scientists and Engineers, by Alen R. Miller, ISBN 089588-058-X, who in turn credits his version as "Adapted from Programming in Pascal, P. Grogono, Addison — Wesley, 1980"] to the task of bettering genome fitness. The basis of the Shell sort is to pick a "span", an integer at least half the length of the array to be sorted, and sweep two indices separated by that span across the array, swapping where elements are found out of order a span apart. After each such pass, the span is decreased by some factor to leave it at least one half its previous size, and another pass is done, until the span has been reduced to comparing adjacent elements. The difference here is that the swaps happen, not because of an out-of-order situation, but when a test shows that doing the swap would improve genome fitness. The level of success was a little surprising, but the basic concept of establishing large scale order before attempting small scale ordering is sound, even for the SETSP application.
This is a buffed up version of PermuteCutsNearEveryCity, changed to do the city loop by visiting cites at random, to use a MarkArray to detect when every city has been visited, and to keep doing such loops until an entire loop passes with no improvement. It spawns a SmallDisplay to show its progress, since it can run for tens of seconds and longer in large problem instances.
This is a buffed up version of OptimizeNodesNearEveryCity, changed to do the city loop by visiting cites at random, to use a MarkArray to detect when every city has been visited, and to keep doing such loops until an entire loop passes with no improvement. It spawns a SmallDisplay to show its progress, since it can run for tens of seconds and longer in large problem instances.
This is a buffed up version of OptimizeNodesAndEdgesNearEveryCity, changed to do the city loop by visiting cites at random, to use a MarkArray to detect when every city has been visited, and to keep doing such loops until an entire loop passes with no improvement. It spawns a SmallDisplay to show its progress, since it can run for tens of seconds and longer in large problem instances.
This is my Traveller "smoother"; think of it as sandpaper, for taking out the small roughnesses. It does this by circling the genome, one codon at a time, doing the best possible permutation of a small sequential sublist located at each step. At this writing, with the full mix of heuristics, Traveller solved a 200 city random layout with a population of only 5 in about 20 minutes.
This is my Traveller "snow plow", described above under its checkbox entry. It is hard to believe that this is a new idea, the temptation to do it hung over me the whole time I was still trying to respect the "blindness" of Scott's Blind SETSP approach, but it is wonderfully effective, being the most powerful improver of genome fitness among my existing implemented heuristics. Elastic Net, not suitable for use as a repeatedly applied heuristic, is probably even more effective, but this snow plow is the best in the reusable kit.
This is SnowPlow driven to extremes. Besides SnowPlow's loop to keep dragging a PermuteSublistsOfCities around the genome ring at constant cleavage point offsets first cast across the genome's length, until a complete circuit with no improvement occurs, SnowPlowSqueezebox tosses in the fillip of making the algorithm O(N*N) by default, shrinking the span across which the cleavage offsets are initially thrown at random in each circuit by one until there is just enough room for them all not to be consecutive integers, at each failure to improve with a wider span.
This asexual reproducer is a simple conversion to Reproducer form of Scott's original one and only mutator. It is also used as a mutator.
This is my Java implementation of the inver-over ConsultativeReproducer; see a link on my computational complexity web page (URL above) to the research paper describing this algorithm.
Here is my somewhat modified version of Scott's cyclic crossover implementation of a sexual Reproducer. This is astonishingly simple code.
This implements the edge preserving cyclic crossover sexual reproducer, discussed under its checkbox entry.
This is Scott's ordered crossover sexual Reproducer, which he would no longer recognize, probably. It needed the heaviest modifications to remove the artifacts of being array minded when it should have been ring minded, and my version doesn't work quite like his does given the same genomes and crossover points. See the very pretty table in the check box controls window description above for implementation details. This source file got moved to the KPD source code tree as part of a regularization of the reproducers into asexual, consultive, and sexual subsets to match the interfaces they implement, and renamed as part of a stripping of "Traveller" from heuristic names so that they would fit in the title bars of the (originally tiny) visual debugging windows.
This is Scott's implementation of the Partial Match Crossover sexual Reproducer, also re-coded by me nearly into unrecognizability, but it still does what his did, just in more general genome orientations. It got moved to the KPD source code tree as part of a regularization of the reproducers into asexual, consultive, and sexual subsets to match the interfaces they implement, and renamed as part of a stripping of "Traveller" from heuristic names so that they would fit in the title bars of the (originally tiny) visual debugging windows.
This is my rolling crossover algorithm, possibly a new invention. It is described above under its checkbox entry.
New with a source code refactoring out of TravellerWorld.java, for Traveller release epsilon, this file implements the Circular Layout checkbox option.
New with a source code refactoring out of TravellerWorld.java, for Traveller release epsilon, this file implements the Even Grid Layout checkbox option.
New with a source code refactoring out of TravellerWorld.java, for Traveller release epsilon, this class is a placeholder for an eventual factal layout matching the Fractal Layout checkbox option, which still in reality defaults to the Random Layout option at this writing.
New with a source code refactoring out of TravellerWorld.java, for Traveller release epsilon, this file handles calling the selected layout option as indicated on the checkbox radio buttons for the available layouts.
New with a source code refactoring out of TravellerWorld.java, for Traveller release epsilon, this file implements the Nested Grid Layout checkbox option.
New with a source code refactoring out of TravellerWorld.java, for Traveller release epsilon, this file implements the Perturb Cities checkbox option.
New with a source code refactoring out of TravellerWorld.java, for Traveller release epsilon, this file implements the Poly Spiral Layout checkbox option.
New with a source code refactoring out of TravellerWorld.java, for Traveller release epsilon, this file implements the Random Layout checkbox option.
Unintegrated code from Qhull that will eventually implement seed generation from a Delaunay Triangulation of the city locations.
This houses the elastic net seed generator algorithm. The best part of this code for me is that I successfully virtualized a size order N*N lookup table too big for use in this implementation, giving me hope that I can do the same for Scott's city distance list table in TravellerWorld.java [DONE!]. Since I wanted to sample several genome seeds from a single run of the elastic net algorithm, I rearranged the original code so that the iterate loop becomes a service which runs forward from its last state for a requested number of iterations, then returns an array suitable for use in seeding a genome based on the then-current intermediate state of the elastic net's relaxation progress.
More recently, I've added a pop-up window "floor show" that looks a lot like the display part of the original author's applet, and in fact works well as a stand-alone display for showing an elastic net at work.
The original of this code is due to Alexander Budnik, budnik@nf.jinr.ru, and was found posted at URL http://nuweb.jinr.dubna.su/~filipova/tsp.html . My version bears only faint resemblance to the original, so don't blame the original author for the current code.
Elastic Net uses an exponentially damped relaxation technique, in some extremely general sense a least squares fit algorithm, to blend an initial circle of way-points, typically three times as many as cities, into an approximate tour for the Traveling Salesman Problem. It works by setting up an attraction between way-points and nearby cities, to create a tour, and also between way-points and their next neighboring way-points on the circle, to shorten the tour, and then varying the strength of each attraction during the run, giving special consideration to the attraction of a waypoint to its closest city as the run progresses.
The one severe problem is that the Elastic Net algorithm is in itself very computer intensive. I've never dared to try using it beyond a 240 city problem (now I have), and just the setup time to produce the seed genomes runs for several minutes. In that 240 city case it took about 12 minutes. It has a plenty acceptably short run time (seconds) for small numbers of cities, on the order of 50, though (250 cities, at last try, took 14 minutes of setup in Elastic Net; I just added a setup duration timer).
The slow execution is because there are lots of calculation loops that do "all the way-points" nested for "all the cities", so this algorithm has a nasty N*N complexity, with a large constant multiplier. At a half gigaflops processing rate, my laptop conceptually can do Sagans of somethings in bearable durations, but a 1000 city layout with elastic net seeding, probably needing a few hours to run the seeding loop, is so far beyond my courage. [Well, I finally tried it, and it took "forever", as expected, over 13,000 seconds, or almost four hours.]
This is my minimal spanning tree class. It provides three services. It builds an MST from a distance array. It provides back a floor value known to be less than the length of the perfect solution to the TSP instance. It provides back seed genomes derived from the MST as described elsewhere in this document.
An as yet unintegrated class from Qhull used by the Delaunay Triangulation class.
An as yet unintegrated class from Qhull used by the Delaunay Triangulation class.
An as yet unintegrated class from Qhull used by the Delaunay Triangulation class.
An as yet unintegrated class from Qhull used by the Delaunay Triangulation class.
An as yet unintegrated class from Qhull used by the Delaunay Triangulation class.
An as yet unintegrated class from Qhull used by the Delaunay Triangulation class.
An as yet unintegrated class from Qhull used by the Delaunay Triangulation class.
An as yet unintegrated class from Qhull used by the Delaunay Triangulation class.
This interface extends Sortable to provide for a content equality check, as opposed to merely a same sort order check, for two objects which implement it. It was required for the extension of the Heap class to the HeapWithDelete class, to provide unique identification of target heap elements which were to be deleted. This in turn is used in creating the unique greedily thinned seed genome returned by the Minimal Spanning Tree seeder class.
This class implements Sortable (of course) and is used to sort a list of edges by distance from a selected point in the two "Optimize Nodes And Edges..." heuristics.
This is code used in my "edge preserving cyclic crossover" sexual reproducer and my two "optimize nodes and edges" asexual reproducers. It now lives in the "structures" part of the package tree, since it implements the Sortable and ContentsComparable interfaces. It encapsulates the concept of a graph edge, with a canonical form that puts the lower sort-ordered node first, a sort comparision that sorts on first node, then within identical first nodes on second node, and with a single boolean value usable for marking it in various graph algorithms.
This too is code used in my "edge preserving cyclic crossover" reproducer. It abstracts a list of edges and the methods needed to work on them.
This class implements a classic heap structure, and a heapsort built on it.
This class extends the heap to support finding (by sort order and contents match, see interface ContentComparable) and removing an element from a heap while maintaining the heap invariant. If you guess it was a challenge to get correct, you are correct. It did not turn out to have wonderful algorithmic complexity, but by the time I got it working at all I was happy with that much accomplishment. It is used in creating the unique greedily thinned seed genome returned by the Minimal Spanning Tree seeder class, which needed a heap, but needed also to be able to remove superseded elements from that heap.
Like the next entry, this provides a class that implements Sortable, used in CircleLayout in finding an exact solution to the circular layout problem setup.
The name tells it all; I needed something that implemented the Sortable interface, so that I could sort cities by the closest associated way-point, in creating genome seeds using Elastic Net. I suspect that there is a smarter way to do this, but I haven't found the appropriate level of abstraction needed yet.
This is the usual quicksort algorithm, gussied up with a modern Sortable interface and so also necessarily strengthened to deal well with having the things to be sorted be accessed by references rather than directly. I tried to use Sun's demo code for a while, but it was easier finally to start over and write my own working code than to make theirs work.
This provides the classical interface, that a sorter just needs to be equipped with a comparison function and a swap function for the entities to be sorted, to sort anything which can be made accessible to it via an array. Stuff implementing Sortable can use QuickSortAlgorithm to sort it, which was the whole point. About the gamma release of Traveler, I changed this from a boolean "lessThan()" interface to a more state of the art three-way "compareTo()" interface, imitatating GNU qsort()'s equivalent "comp" interface.
In earlier Traveller full-source code releases, I had structured Sortable and its implementers in terms of a lessThan() method, but since I learned while reading Other People's Code (a great learning technique) (Steve Fortune's plane sweep method Voronoi diagramming code, to be exact) that GNU qsort uses a three way comparison method similar in return value concept to the C library's strcmp() (which means such a comparator is probably the interface experienced programmers expect), I changed Sortable to require a three way comparison method compareTo() as well, lessThan() went away, and now Sortable provides three named constants from which that comparison should choose its return value. I then mended the rest of Traveller to comply. This is one of those improvements you enjoy having done, out of professional pride.
This class encapsulates and provides access to a static debugging flag, so that it can be set from the checkbox GUI, and seen everywhere. It controls whether Traveller does or does not emit a detailed debugging trace. It also provides, for debugging use, several polymorphic versions of "dump()", which accept one or two dimensional arrays of ints or doubles, and return a String representation in brackets and separated by commas, for single dimensional arrays, or with the internal arrays in parentheses and with entries sorted by commas, for the two dimensional case. Somewhere in the midst of doing all this I discovered the Java convention that every object should implement a toString() method, which I've tried to do wherever appropriate. New with the Traveller epsilon release, this class also implements a visual debugging flag, used to enable or disable use of the new in the same release VisualDebugger class for heuristics (see below).
This is a nice little utility class that maintains an array of marks, keeps track of how many of them are set, and prvides methods to set and clear, set and clear all, set with notice if all are set, clear with notice if all are clear. I use it in several heuristics, when I am visiting cities at random and want to know when I have visited them all, as part of loop termination decisions.
This is Sean Luke's mind-boggling port of the Mersenne Twister C code to Java. [Don't necessarily blame Sean. It was probably mind-boggling in the original too. Bit twiddling and similar tricks done for run time efficiency need extensive source code comments to be comprehensible.] I added a needed "double within a range" method that Scott's code used frequently (but previously with his randomizer), and cleaned up the text of an output prompt. I also added two getTwister() calls and a static member m_soliton to enable optional use as a (Design Patterns) Solitary (which is how Traveller uses it), and reformatted it top to bottom away from Sun's Java Coding Standard (see the end of this document for my excuses) and toward something more maintainable, but otherwise it is clean.
I mooched this useful little class unchanged from the VoroGlide distribution. It is used by the Traveller main routine to lay out the text on the Traveller splash panel displayed at startup.
This class stand between the PermutationGenerator class and a consumer of that class's product, to hide the details of deciding how big a permutation to do, in one easily maintained spot. It contains the policy constant BASE_PERMUTE_LIMIT, which tells how big a permutation in general will be allowed (or, alternately, how slowly Traveller will run with permuting heuristics enabled). It also implements, with its client classes, adaptive permutation effort, in which the allowable permutation limit begins low and is only increased when permutation begins to fail to be effective at lower effort levels. This turned out not only to provide a tremendous speed-up for the beginning parts of Traveller runs, where for example a problem of 200 cities with a population of 20 genomes didn't require a permutation effort level above permuting three items for almost 1600 generations before that level of permutation ran out of steam, it also pointed out that three of my four gamma release permuters had buggy fitness evaluation code: one case of an index started at one where zero was appropriate, one case of using a name already retrieved from the array as in index into an array of names: digging too deep, and one case of forgetfulness, where I just left out a whole portion of the fitness computation. This only became apparent when it was crucial for the new adaptive permutation effort functionality that the permuters indeed never return worsened genomes, only the original or improved ones, and the bugs were making that not be the case. It is indeed true that software is one of the few things that is improved by messing around with a working version. In this case, the genetic algorithm concept is so powerful that it was able to move toward solutions with methods that only were doing useful work a small fraction of the times they were run, much as with Scott's original implementation, where two of the release 1.2.1's four reproducers were buggy, and all four had design weaknesses from treating genomes as arrays instead of rings, yet still his applet still solved the TSP quite regularly.
I'm not too fond of the Java requirement to waste a whole file, which in some modern file system implementations is 8Kbytes or so, on something this trivial, but here is a whole class whose only purpose is to create a new named exception. I don't remember this being nearly so expensive in C++.
This dandy little piece of code I've ported from Fortran 66 to StarLogo, then from StarLogo to Java, and I still haven't a good idea how it works, though about four hours of cross-eyed study would probably do the trick. Given an array of integers precisely from 1 to N, it permutes them in all possible orders and returns those permutations one at a time. I've provided a mechanism to feed this back as if it were permuting 0 to N-1, much more useful in zero based array index systems like C, C++, and Java, but in fact the algorithm cannot do that, so I have to copy the array with a decrement before returning it.
This class is used as a debugging tool and just plain fun display by each Traveller heuristic. It puts up a Salesman's Route sized TravellerCanvas, and within it shows, either very quickly for algorithms whose internals are not interesting, or in step by step detail for those more viewable, the heuristic in operation. It vastly slows down Traveller, but it _is_ pretty and implementing for all the heuristics helped me find problems in even ones I thought I'd gotten correct. To make the currently executing heuristic's visual debugging window visible, a Frame.toFront() call is done in the initial stages of each heuristic via a call to a method in this class. This can be incredibly annoying when you are trying to get some real work done with Traveller running behind your current work windows, so use it only when that isn't a problem. The checkbox for Enable Visual Debugging is one of the ones always enabled, and when you turn off visual debugging, the associated windows get closed the next time each associated heuristic is executed, usually pretty close to instantly.
Here is where I refactored the three namespaces used by Traveller and Traveller world when talking about the checkboxes on the control panel into a single namespace, and got to throw away lots of code with no loss in functionality. This isn't pretty, but it is vastly improved, and by localizing the code maintaining checkboxes, cuts down the time to add a new checkbox correctly from hours to minutes. As of the "gamma" release of Traveller, the remaining code to construct a checkbox control panel was moved into this file. As of the "delta" release of Traveller, the code to maintain the window in which the panel resides was also moved into this class, and its interface to the toplevel routine thereby immensely simplified. As of the "epsilon" release of Traveller, I really got my act together and output the checkboxes column-wise instead of row-wise, which make it much easier to scan down a column to find alternative choices.
This class supports the former toolbar and configuration buttons, in the Traveller's Doorbells window. It works as an implicit state machine, as discussed above under the entries for that window.
This little class puts up a small TravellerFrame window into which a one line status message can be written. It is used at this writing for two counters, the count of seeds sown, and the count for each generation of which element in the population is currently being considered for reproduction. The latter use may well be slowing Traveller down a fraction, so someday (but not today) there will probably be a button to enable or disble progress counters [but I really find them handy to convince me that Traveller is still alive when it gets very slow, and to advise me when a generation has completed after I push the Stop button so that I can safely change the configuration without crashing things].
This class factors out the framed window with a drawing canvas which was formerly part of TravellerWorld for displaying the current best tour. It is now also used by the seeders for their pop-up window "floor show" displays to entertain the user while long running seeding mechanisms execute, and by the visual debugger to display heuristic internal operations in a user friendly way.
This static class, new in Traveller release delta, puts color values in one place accessible to all of Traveller's window, panel, and frame user interface maintainering classes.
This is a pretty ordinary and usual class, it takes care of setting up the window and frame in which the Traveller applet GUI is drawn. As of Traveller release delta, it was moved down from the toplevel to this new ui package, where it could be seen by the rest of Traveller's classes, because now it also maintains a series of public values used by the rest of Traveller to size window frames in a consistent style. [Which latter should have been done by extending the initially clean class Scott wrote, another example of me being "too soon old and too late smart".]
This class factors out the status bar display formerly a part of Traveller.java, and now much enhanced, thus justifying a separate window.
New with the delta release of Traveller, this class refactors the valuator code for setting and accessing number of cities, population size, and mutation rate, and its graphical interface panel and window, out of the Traveller top level routine and into this separate class, a huge win for good coding practice and software maintainability. It probably belongs in the .../ui package rather than the .../genetic package, which will happen later when more refactoring allows all the GUI stuff to be simplified and relocated there. Here, finally, a bandaid for the problem of keeping the valuator text field sizes consistent has been discovered and applied, though I hope that Java provides a better way to do this than the obscure and unreliable one I found. Comments in the source code dwell further on this issue.
The contributions made by KPD to Traveller do not follow the Sun coding conventions as found at http://java.sun.com/docs/codeconv/ . More than that, when I maintain other people's code that uses them, I rip them out. This is not out of perversity, or of ignorance of those conventions. I read them and rejected them, for good and sufficient reasons. They constitute premature standardization on an inferior 1972-era poorly considered style of coding chosen for C. They do not have the quality, for example, of my own intuitive, unformalized coding conventions, developed over 41 years. Those Sun conventions seem designed by minds too warped by existing C usage to take a fresh look at what works as opposed to what is common practice. They also ignore realities of cheaper disk storage, better, multi-windowing text editors, and better understanding of human interface issues (like what makes code readable) that have arisen since the C coding conventions were developed in 1972. I'm wasting some time and space here to justify myself before the swat teams arrive.
Explicitly:
[A useful trick, which I've employed for the HTML source text for this document [Take a look! This is such a nice solution to the uglification issues in trying to write hierarchically indented HTML source code while leaving room somewhere for the body text.], though not yet tried for free format programming languages, that helps deal with the shortage of horizontal white-space, is to indent the control structures, but put all the simple statements left-flush. The control hierarchy is still extremely obvious, but the problem of running out of horizontal line for the simple statements and having to split them across multiple lines is reduced dramatically. Feel free to steal this idea and use it in your own code.]
if (((condition1 && condition2)||(methodYieldingBooleanValue1(parameter1, parameter2, methodYieldingBooleanValue2(parameter3, parameter4)) ^^ methodeYildingBooleanValue3(parameter5) && condition3) || condition4)
{
doWonderfulThings();
}
isn't much improved as:
if (((condition1 && condition2) ||
(methodYieldingBooleanValue1(parameter1, parameter2,
methodYieldingBooleanValue2(parameter3, parameter4))
^^ methodYieldingBooleanValue3(parameter5) && condition3)
|| condition4)
{
doWonderfulThings();
}
while its meaning is pretty transparent as to how the
boolean sub-expressions are nested, or at least it is vastly
more maintainable, when formatted thusly
if
(
(
(
condition1
&&
condition2
)
||
(
methodYieldingBooleanValue1
(
parameter1,
parameter2,
methodYieldingBooleanValue2
(
parameter3,
parameter4
)
)
^^
methodYieldingBooleanValue3
(
parameter5
)
&&
condition3
)
)
||
condition4
)
{
doWonderfulThings();
}
which also demonstrates that _any_ delimiters or operators
can be stacked for clarity, and should be, as needed,
usually under the leftmost end of the fully qualified parent
keyword or variable name, to make obvious who owns it,
indenting only when justified so as not to introduce false
hierarchy with spurious indenting, and to keep the visual
flow of hierarchy obvious. For example:
StringBuffer bufferWithModeratelyLongName = new StringBuffer();
bufferWithModeratelyLongName
.append( /* complicatedStringProducingExpression */ )
.append( moreGoodStuff() )
.append( evenMoreGoodStuff() )
.append( /* anotherComplicatedExpression */ )
;
which (splitting at the period) the Sun Java Coding Standard
expressly forbids, but which is obviously instead a Very
Good Thing. Notice too the standard trick of putting the
semicolon on a line alone to ease maintenance when adding or
reordering the appends. For another example:
public static GlaberigenousOozeContainingEmittedClass
methodWithObnoxiouslyLongNameAndNoRoomForParameters
(
int [][] parameter1,
Feldercarbonaceous [] parameter2,
Twinkly parameter3
)
which also illustrates that parallel concepts should be
stacked vertically whenever possible.
/*
* Comment
* at length
* about everything.
*/
looks out of balance with the opening and closing
bracketing digraphs out of plumb and the left border ragged,
and is moderately easily confused with arithmetic. I use
this much nicer, more visually substantial form instead,
/*
** Comment
** at length
** about everything.
*/
with the left border neat and the opening and closing
bracketing digraphs vertically aligned, and always keep it
flush to the left margin. This simpler format also makes
it easier to construct editor macros to maintain the
comment block, which means the author is more likely to
write the comments in the first place.
if (someCondition)
oneLineConsequent();
else
oneLineAlternative();
as
if (someCondition) {
oneLineConsequent();
} else {
oneLineAlternative();
}
but as
if (someCondition)
{
oneLineConsequent();
}
else
{
oneLineAlternative();
}
obviously more readable, and similarly
do
{
// something that needs to execute at least once
}
while (conditionalExpression);
for all the reason listed above: visual match of bracketing
characters, ease of finding missing brackets with bracket
matching tools, and keeping logically parallel constructs
(like "if" and "else", like "{" and "}", like "do" and
"while") vertically stacked and aligned.
Notice that, once I can trust myself to put multi-line blocks
enclosed in brackets stacked vertically, this following is
now perfectly safe, since the absence of an opening bracket
on a second line means that the structure must be contained
in a single line.
if (someCondition) { oneLineConsequent(); } else { oneLineAlternative(); }