<< GVANT Program Documentation >>

These notes are meant as supplemental documentation to the individual
comments incorporated into the actual GVANT.C code.  The program was
originally written to search for OGR-20, but applies to OGR-21 as well.


< Key Definitions >


OGR-20   What we are searching for: an OGR (see next) with 20 marks,
         including each end of the ruler.

OGR      Before discussing an Optimum Golomb Ruler (OGR), it is important
         to understand the basics of Golomb Rulers.  A Golomb Ruler is
         defined as a ruler which includes a predetermined number of marks
         (say, "N") placed at integer multiples of some fixed unit and
         measuring the maximum number of distinct differences possible
         (equal to N*(N-1)/2).  As such, one condition of a Golomb Ruler
         is that no single distance can be measured twice (i.e., using two
         different sets of marks) because to do so would mean that the
         total number of distances that could be measured would be less
         than N*(N-1)/2.

         An Optimum Golomb Ruler (OGR) is the smallest Golomb Ruler
         possible for a given number of marks.  To illustrate by example,
         0-1-3-7 is a Golomb Ruler of 4 marks, measuring each of the
         distances [1 2 3 4 6 7] once and only once.  However, it is not
         optimal, because a smaller ruler is possible: 0-1-4-6. This
         second ruler is an OGR because no shorter Golomb Rulers are
         possible for four marks.
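
The Golomb property can be checked mechanically.  The following is a
minimal sketch, not code taken from GVANT.C: the function name
is_golomb, the cap MAX_LEN and the 0-based array of mark positions are
all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LEN 256   /* illustrative cap on ruler length (assumption) */

/* Returns true if the n marks in p[] (ascending, starting at 0) measure
   every pairwise distance at most once. */
bool is_golomb(const int *p, int n)
{
    bool seen[MAX_LEN + 1];
    memset(seen, 0, sizeof seen);
    assert(n >= 1);

    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            int d = p[j] - p[i];
            /* reject marks out of order, too long, or a repeated distance */
            if (d <= 0 || d > MAX_LEN || seen[d])
                return false;
            seen[d] = true;
        }
    }
    return true;
}
```

Both example rulers above (0-1-3-7 and 0-1-4-6) pass this check, while
0-1-2-4 fails because the distance 1 is measured twice.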

differences

         The distance between a pair of marks (not necessarily successive)
         is referred to as a "difference".  The length of the interval
         between two successive marks is referred to as a "first
         difference".  A difference value is "available" until two marks
         that distance apart have been placed on the ruler, at which
         point it is "taken" (used).  At the beginning of the search
         process, all difference values are available.

node     The search approach may be regarded as a tree search (with some
         clever pruning).  Each time a new mark is placed on a ruler under
         construction, the resulting starting segment of the ruler may be
         regarded as corresponding to a node of the search-tree.  The
         pruning techniques are such that leaf-nodes are rarely visited;
         backtracking usually occurs well before that depth of the tree
         is reached.

segment  An interval of ruler from one mark to another.  There may be more
         than two marks in a segment.

< Basic approach >

GVANT constructs a Golomb Ruler by working from the left side of the
ruler and adding marks one at a time.  To add a new mark, GVANT calls a
routine named "place", which confirms that the new mark does not
violate the Golomb Ruler property (i.e., that no distance is measured
twice).  Various techniques are included to speed up the process; they
are not required to prove optimality, only to make the search faster.
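
The bare left-to-right approach, without any of GVANT's speed-ups, can
be sketched as below.  This is an illustration under our own
assumptions, not the actual "place" routine: the names place_sketch,
shortest_golomb, marks[], used[] and best are hypothetical, and the
only pruning is the current best length.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_MARKS 16    /* illustrative limits (assumptions) */
#define MAX_LEN   128

static int  marks[MAX_MARKS];    /* positions placed so far; marks[0] = 0 */
static bool used[MAX_LEN + 1];   /* used[d]: distance d already measured  */
static int  best;                /* shortest complete ruler found so far  */

/* Try every admissible position for mark number k (0-based) of n. */
static void place_sketch(int k, int n)
{
    if (k == n) {                        /* all marks placed: record length */
        if (marks[n - 1] < best)
            best = marks[n - 1];
        return;
    }
    for (int pos = marks[k - 1] + 1; pos < best; pos++) {
        int i;
        for (i = 0; i < k; i++)          /* any distance measured twice? */
            if (used[pos - marks[i]])
                break;
        if (i < k)
            continue;
        for (i = 0; i < k; i++)          /* take the new differences */
            used[pos - marks[i]] = true;
        marks[k] = pos;
        place_sketch(k + 1, n);
        for (i = 0; i < k; i++)          /* release them on backtrack */
            used[pos - marks[i]] = false;
    }
}

/* Returns the length of the shortest Golomb Ruler with n marks, or
   limit if no ruler shorter than limit exists. */
int shortest_golomb(int n, int limit)
{
    assert(n >= 2 && n <= MAX_MARKS && limit <= MAX_LEN);
    memset(used, 0, sizeof used);
    marks[0] = 0;
    best = limit;
    place_sketch(1, n);
    return best;
}
```

For four marks, shortest_golomb(4, 128) returns 6, the length of the
ruler 0-1-4-6.  A search this plain is only practical for small mark
counts.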


< Maintaining the list of available differences >

The list of available differences is tracked in an array called td[].
Along with this list, a second matching list called bp[] is used.

td[]     The td[] array is a list of available differences.  Initially,
         all differences are available, and the corresponding entry in
         the array is a number greater than zero.  Once a difference is
         used, its position in the array is given a value of less than
         zero.  To check if a difference is available, GVANT looks for a
         positive value in the td[] array.

         The td[] array serves a second purpose as well.  It acts as a
         pointer to the "next" available difference.  For example, when
         initialized, td[2]=3 (meaning that the next difference available
         after 2 is 3), etc.  Thus GVANT has a quick, efficient way to
         locate all differences still available.

         To maintain this array, GVANT begins by initializing each element
         with the next higher number.  Initially, the first available
         difference is 1 (td[0]=1).  Then, as each difference is used,
         GVANT sets its td[] entry to a negative number, and also steps
         back to the previous available difference on the list and
         updates that entry to point to the new "next" available
         difference.  To facilitate this step back, a second array of
         pointers is used:

bp[]     The bp[] array is analogous to the td[] array, but in reverse
         format.  Each entry points to the previous available difference
         on the td[] list.  It is initialized as follows: bp[1]=0,
         bp[2]=1, bp[3]=2, etc.  Values never become negative.
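
Taken together, the two arrays behave like a doubly linked list of
available differences.  The sketch below shows one way this could
work.  It is an assumption-laden illustration rather than the GVANT.C
code: MAXD, init_lists, take and give_back are hypothetical names, and
storing the negated "next" pointer in a used entry (so the step can be
undone) is our own choice.

```c
#include <assert.h>

#define MAXD 64          /* largest difference tracked (assumption) */

int td[MAXD + 2];        /* td[d] > 0: next available difference after d */
int bp[MAXD + 2];        /* bp[d]: previous available difference before d */

/* All differences available: td[0]=1, td[1]=2, ...; bp[1]=0, bp[2]=1, ... */
void init_lists(void)
{
    for (int d = 0; d <= MAXD; d++) {
        td[d] = d + 1;
        bp[d + 1] = d;
    }
}

/* Mark difference d as used and unlink it from the chain. */
void take(int d)
{
    assert(d >= 1 && d <= MAXD && td[d] > 0);
    int next = td[d], prev = bp[d];
    td[prev] = next;     /* predecessor now skips over d */
    bp[next] = prev;
    td[d] = -next;       /* negative means "used"; magnitude kept for undo */
}

/* Undo take(d).  Calls must be undone in reverse order of the takes. */
void give_back(int d)
{
    assert(d >= 1 && d <= MAXD && td[d] < 0);
    int next = -td[d], prev = bp[d];
    td[d] = next;
    td[prev] = d;
    bp[next] = d;
}
```

After take(1), take(2) and take(3), td[0] is 4, so the first available
difference is found in one step; undoing the three calls in reverse
order restores the initial lists.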

In addition to the arrays that keep track of available difference
locations, two scalar variables (fhd and lf, described below) keep
track of the smallest available differences and their sum.  These
smallest available differences are referred to as the "head".  The
head includes the same number of (smallest) differences as the number
of marks that must still be placed on the ruler.  For example, when
considering the first mark of a 20 mark ruler, the head contains the
19 smallest remaining differences.

fhd      fhd (for "final head difference") is the largest number in the
         head.  For instance, for a 20 mark ruler, fhd is 19 when
         evaluating the first segment.  Thus the minimum length for the
         remaining 19 segments is 1+2+3+...+19 or (19*20)/2 = 190.
         However, if we try a value less than 20 for the first
         segment, fhd automatically increases to the next available
         difference (in this case 20), since one of the 19 smallest
         available differences has now been used.

lf       lf (for "least finish") is the sum of the head differences (i.e.,
         the sum of the m _smallest_ available differences, where m is the
         number of additional marks still needed to reach the far end of
         the ruler).  It is initialized to n*(n-1)/2 for the analysis of
         the very first segment.  Although it could be recomputed by
         summing the head each time, it is more efficient to keep track
         of how the sum changes for each deletion (or addition) of a
         difference.  For example, suppose the head has 3 available
         differences, say 2, 4 and 5, and the next smallest available
         difference is 8.  If we use up the "4", then lf becomes
         2+5+8=15.  Equivalently, lf[before] = 2+4+5 = 11 and
         lf[after] = lf[before]+8-4 = 15, where "8" has been added to
         the head and "4" has been deleted.  In this way, each update
         of the head takes only an addition and a subtraction, rather
         than a sum over all the individual head elements.
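
The incremental update in the example above can be sketched as
follows.  This is only an illustration: it covers the single case
where a head difference is used up and replaced by the next available
one, it uses a plain boolean array avail[] instead of td[], and the
names avail, MAXD and use_difference are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

#define MAXD 64            /* largest difference tracked (assumption) */

bool avail[MAXD + 2];      /* avail[d]: difference d not yet used */
int  fhd;                  /* largest difference in the head */
int  lf;                   /* sum of the differences in the head */

/* Use up difference d, keeping the number of head entries unchanged. */
void use_difference(int d)
{
    assert(d >= 1 && d <= MAXD && avail[d]);
    avail[d] = false;
    if (d <= fhd) {                        /* d was part of the head */
        int next = fhd + 1;
        while (next <= MAXD && !avail[next])
            next++;                        /* next available beyond the head */
        lf += next - d;                    /* add the newcomer, drop d */
        fhd = next;
    }
    /* a difference larger than fhd leaves lf and fhd untouched */
}
```

With the head {2, 4, 5} (lf = 11, fhd = 5) and 8 as the next available
difference, use_difference(4) leaves lf = 15 and fhd = 8, matching the
arithmetic above.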


< Other arrays used >

q[]      This is the array of first differences for the current ruler.  As
         each node is visited (i.e., as we build a ruler), we store the
         difference between the last two marks placed on the ruler in the
         q[] array.

bq[]     This array is used to save the best set of q[] found so far.

p[]      This array, constructed upon completion, contains the mark
         locations of the OGR.
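
The relationship between first differences and mark locations is a
running sum.  A minimal sketch follows; it uses 0-based arrays for
brevity (the diagrams in these notes number marks from p[1]), and the
function name is hypothetical.

```c
#include <assert.h>

/* Fill p[0..n-1] with mark positions, given the n-1 first differences
   in q[0..n-2].  The first mark is placed at 0. */
void marks_from_first_differences(const int *q, int n, int *p)
{
    assert(n >= 1);
    p[0] = 0;
    for (int i = 1; i < n; i++)
        p[i] = p[i - 1] + q[i - 1];   /* each mark = previous mark + gap */
}
```

For example, the first differences 1, 3, 2 give the marks 0-1-4-6.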


< Recursive Routine PLACE >

To build an OGR, the main routine calls function "place" which in turn
recursively calls itself.  At each stage of the recursion, place examines
the ruler in three distinct segments.  For clarity (and to differentiate
these larger segments from individual lengths from one mark to its
neighboring marks) these will be labeled as SEG-1, SEG-2 and SEG-3.

               SEG-1        SEG-2        SEG-3
Ruler:     | ----1----> | ----2----> | ----3----> |
         p[1]                                    p[n]

  1      SEG-1 is the portion of the ruler already constructed before
         "place" is called.  It stretches from the far left side of the
         ruler (p[1]) all the way to the beginning of the current segment
         (SEG-2).

  2      The current interval (SEG-2) is what is calculated each time
         "place" is called.  SEG-2 corresponds to a first difference; i.e.,
         there are no marks in its interior.  SEG-2 is just the interval
         between the last placed mark and the next to be placed.

  3      SEG-3 is the segment from the end of the current interval stretching
         all the way to the far right end of the ruler (p[n]).  Although
         GVANT does not know a priori what the OGR length is, it uses known
         Golomb Rulers as an upper bound.  And by calculating a minimum
         length for SEG-3, GVANT can determine a maximum allowable length
         for the current interval, SEG-2.  Anything greater than this cannot
         result in an OGR.


< Reducing the Search Space >

To eliminate unnecessary searching (e.g., looking for mirror images of
existing rulers) several techniques are employed.

  1      As discussed above, an array of available differences called td[] is
         maintained to reduce the number of computations required; function
         "place" draws new marks to consider from this list.

  2      To eliminate mirror images, GVANT.C only examines rulers for which
         the middle mark (or an approximation thereof) is on the left side
         of the midpoint of the ruler.  (A middle mark (n odd) or the
         average of the two marks closest to the middle of the mark
         sequence (n even) can never be in the exact center of the ruler
         since it would result in the same distance measured twice --from
         the middle mark(s) to each of the ruler's ends.)

         A key advantage to this technique is that small differences are
         used up more quickly, allowing GVANT to more easily process the
         deeper recursive calls.

         GVANT checks each addition of a mark in the first half of the
         ruler (through half the total number of marks) to ensure that
         there is enough room to meet this constraint.  Actual
         implementation is similar to the technique discussed below in 3a.

         NOTE: We constructed two slightly different versions of GVANT.
         The first (version 33b) uses the average of the two middle marks
         (n even) or the single middle mark (n odd) to eliminate mirror
         images.  For larger rulers however (e.g., 15, or 17 and higher) a
         revised version is more efficient; it uses the average of
         the marks 'once-removed' from the middle two marks.  (E.g., for
         OGR-20 version 33b uses marks 10 & 11; the revised version uses 
         marks numbered 9 & 12).  This is the version we are distributing.

         NOTE also: In the final distributed version we do not necessarily
         eliminate _all_ mirror images, since doing so requires more
         computational effort than a less rigorous test.  However, mirror
         images will be tested only if a new OGR is found with a total
         length less than any currently known.
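
The simpler (version 33b) criterion can be illustrated on a complete
ruler as below.  This is a sketch under our own assumptions, not the
test GVANT performs: GVANT applies its constraint incrementally while
marks are being placed, and the distributed version uses the
'once-removed' marks instead.  The function name and the 0-based p[]
starting at 0 are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if the middle mark (n odd) or the average of the two
   middle marks (n even) lies left of the ruler's midpoint.
   p[0..n-1] is ascending with p[0] == 0. */
bool middle_is_left(const int *p, int n)
{
    assert(n >= 2 && p[0] == 0);
    int a = p[(n - 1) / 2];     /* lower middle mark */
    int b = p[n / 2];           /* upper middle mark (same as a when n is odd) */
    /* (a + b) / 2 < length / 2, written without fractions */
    return a + b < p[n - 1];
}
```

Of the ruler 0-1-4-6 and its mirror image 0-2-5-6, only the first
satisfies the test, so only one of the pair would be searched.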

  3      Another technique for reducing computations is to consider the
         maximum length the current interval (SEG-2) can occupy.  For
         instance, at each call to "place" GVANT knows SEG-1, as well as the
         maximum ruler length to be considered.  By subtracting the minimum
         possible SEG-3, "place" can determine the maximum allowable
         current interval, SEG-2.

           3a     The third segment, SEG-3, must be large enough to
                  accommodate at least the size of the OGR for the number of
                  marks remaining to be placed.

           3b     More importantly, SEG-3 must be large enough to fit the sum
                  of the smallest remaining available differences.  For
                  instance, if 3 marks must still be placed, and the 3
                  smallest available differences are 4, 5 and 8, then the
                  third segment must be at least 4+5+8 = 17.

                  To facilitate this process, GVANT maintains a running record
                  of the available differences (td[]) as well as of their sum
                  (lf).

                  This technique is very effective when combined with
                  technique #2 outlined above, since technique #2 tends to
                  ensure that many small differences are attempted early, thus
                  resulting in larger "head"s.
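
The bound described in technique 3 amounts to a subtraction.  A
minimal sketch, with hypothetical names and with both lower bounds on
SEG-3 passed in by the caller (how GVANT actually obtains them is not
shown here):

```c
#include <assert.h>

/* Largest admissible current interval (SEG-2).
   max_len  : maximum total ruler length to be considered
   seg1     : length already built (SEG-1)
   ogr_min  : lower bound on SEG-3 from known OGR lengths (3a)
   head_sum : lower bound on SEG-3 from the head sum, lf (3b)
   A result below 1 means no interval fits and the search backtracks. */
int max_current_interval(int max_len, int seg1, int ogr_min, int head_sum)
{
    assert(seg1 >= 0 && ogr_min >= 0 && head_sum >= 0);
    int min_seg3 = ogr_min > head_sum ? ogr_min : head_sum;
    return max_len - seg1 - min_seg3;
}
```

With the head from 3b (4+5+8 = 17), an illustrative maximum length of
100 and SEG-1 = 40, the current interval can be at most
100 - 40 - 17 = 43.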

  4      The algorithm does not consider mark-placement choices that can
         be shown to be sub-optimal.  I.e., the algorithm does not consider 
         any choices that could only lead to solutions that are longer than 
         the best known ruler "today" of the same number of marks.  
         
         There is a parameter "ban" (_B_est _A_vailable length for a ruler
         with _N_ marks) initialized to the length of the best ruler
         currently known for "n" marks.  To reduce computations, distances
         from the end of a given segment to "ban" are selectively examined.
         If these distances are not available, then either we can proceed
         by reducing "ban" (which is a 'win' because it causes earlier
         pruning) or we must choose a different current interval, SEG-2.
         
         These differences are only selectively examined because an exhaustive
         routine would be very CPU intensive, and would actually increase the
         total CPU time.  Note that this step is not required, since all
         lengths are ultimately examined; however, with early selective
         examination, invalid rulers are found more quickly.

  5      Another technique for reducing computations begins by asking the
         question "can we use a difference greater than those in the head
         (>fhd)?" (see 3b above).  If we cannot, then we know that if this
         node is part of a valid ruler, all remaining differences in
         the ruler (q[]) must come from the head.  Since we already know
         the sum of the head (lf), we also know the total ruler's length.
         Combined with step 4 above, this results in fast elimination of
         many potential rulers at nodes deep in the tree.

  6      Since the forward looking td[] array and the backward looking bp[]
         array require the most CPU for large rulers, the pointer
         adjustments are made only if the pointers are within the range of
         new possible first differences.  This is easily calculated: it 
         represents the current maximum value in the head (fhd) plus any 
         slack we have not used in selecting the current interval (nf-lf).

         For example: Suppose we know that SEG-3 cannot be bigger than 22.
         Also suppose we just added a mark, and "took" the following
         differences:  4, 7, 15, 23, 33, 37, 41, 45, 66, 78, and 108.
         There is no need to make the changes in the bp[] array for the
         differences 23, 33, 37, ..., because we will never have to
         backtrack from these points (the largest difference we can take
         from this point forward cannot exceed the SEG-3 limit of 22).


< Additional search reduction techniques not used due to inefficiencies >

1.       Limiting the lower bound of the current interval (SEG-2) such that
         the length from here to the left side of the ruler is at least as
         large as the equivalent OGR.  We quickly find this out anyway by
         checking differences, and most tests would simply conclude that
         the length is greater than the corresponding OGR.

2.       Extension of the test to see if values outside of the head can be
         used in SEG-3 (if not, we can further limit potential ruler
         lengths.)  However, these extensions reduce execution time by only
         1 percent or so, and are difficult to justify given the extra
         code required.

3.       Checking if the minimum length of the third section, SEG-3, is an
         available difference itself (if not, then SEG-3 must be longer
         than otherwise calculated.)  However, this analysis is somewhat
         redundant with the more efficient "selective checking" of
         distances to the right end of the ruler as described above (#4).


< Restart / Redundancy Checking >

To facilitate running GVANT on multiple platforms, it is designed to
start at user-specified ruler segments.  These starting segments, 
called ruler "stubs", define where GVANT should place the first 
several marks.

stub     A stub is an initial ruler segment described by a set of
         differences that GVANT uses in placing the leftmost marks.  GVANT
         searches for a ruler that includes the input starting stub.

As an example, to search exhaustively all potential OGRs with 20 marks,
the initial stub is a set of zeros [0 0 0 0 0].  GVANT will start
searching at [0-1-2-4-5-...] and will continue until the user breaks
the run.  (Note that in the first ruler tested GVANT skips the "3"
since the first two differences [1-2] already use this difference by
combination.)

Starting Stubs allow GVANT to be run simultaneously on several
different machines.  For instance, used with two machines, GVANT could
search for a 20 mark OGR using stubs of [0 0 0 0 0] on the first
machine and [17 5 0 0 0] on the second.  When the first machine
reaches [17 5 0 0 0] it can be stopped.  For standardization, stubs
will be set at uniform lengths when first distributed to multiple
users.  However, each user can break his/her assigned stub range into
individual pieces as needed.

To provide extra versatility, GVANT periodically stores current ruler
information in a save-file.  This serves two functions: (1) it allows
the user to interrupt the process and restart it later without
significant loss of CPU effort and (2) it allows a central control
facility to check and compare results, ensuring that runs made on
different platforms or by different users are identical.


< Additional Documentation >

Additional documentation is given in the actual program code, and in 
the installation guide.

---
Copyright, 1996, 1997 by David Vanderschel and Mark Garry. All rights reserved.

