-*- Mode:Text -*-

A Proposal for Expanding the LISP Regression Test Suite
- - -
[Proposal #5]
Sept. 23, 1988
David M. J. Saslav
- - -

Introduction:

A significant portion of Common LISP and ZetaLISP functionality is not tested by the existing regression test suite (found in SYS:VALID;).  This suite is run whenever a new system is compiled.

Proposal:

Part I:   Spend a day or so surveying the test suite to determine which functionality lacks test/validation sequences, attaching programmer-time estimates to each major category of missing tests.  (Completed.)

Part II:  Specify the functional modules for which (additional) tests are required.

Part III: Write test/validation regression tests for that portion of overall Common LISP and ZetaLISP functionality not currently covered by the existing LISP test suite.

Rationale:

In our case, the phrase "regression" has an important meaning: we must ensure that the Falcon software is at least as good as, if not better than, the existing Lambda system.  The existing test suite is the starting point for a robust LISP regression suite, one which will remain usable over the entire lifetime of all future projects.

The existing test suite is healthy.  When first brought up on the Lambda, it located over twenty bugs, as well as numerous examples of anomalous behaviour requiring documentation.

The test suite needs immediate expansion, however.  If the first comprehensive test of the complete Falcon system occurs only after most (or all) of the modules have been ported, newly introduced problems will take the maximum possible toll on developer time.  Full testing of the core software gives an early indication of the scope and nature of newly introduced problems, thus minimizing the time required to solve them.

While this kind of testing is supplemental to the testing that every developer is responsible for within his or her own development domain, this regression suite should serve as a collecting place for those tests, so that they can be run every time the system changes substantially (e.g., when the system is recompiled).  That is the meaning of the phrase "regression" -- these tests are designed primarily to ensure that the state of the software does not "regress", i.e., that the existing software base never suffers damage from any major system development.

                                                -keith/dmjs

==============================
Commentary [smh 30sep88]

Two issues to be considered:

[1] I question whether the focus on Common Lisp is appropriate.  Although we might like to build (and advertise) a Common Lisp machine, in fact our product will and must support both CL and Lambdoid Zetalisp.  Anything that can be said about the time savings due to having a regression test facility for CL applies just as much to Zetalisp.  Given that we know we won't have the resources to do a 100% job of creating a test suite, we owe it to ourselves to factor Zetalisp functions and constructs into the prioritization of our testing efforts.  I am thinking here of language features such as (but by no means limited to) locatives, dynamic closures, array leaders, multiprocessing, and (ideally) flavors.  All of these require special support in both the compiler and the low-level runtime system.  Having regression tests for this support is just as useful as having them for basic CL functionality.
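For concreteness, a low-level locative test might look something like the following sketch.  This is only an illustration: the real SYS:VALID; files presumably have their own test-definition conventions, and the simple FERROR-style checks here merely stand in for them.  It also assumes the usual Zetalisp LOCF/CONTENTS accessors, with CONTENTS usable as a SETF place.

    ;; Hypothetical low-level locative test -- illustrative sketch only,
    ;; not actual SYS:VALID; code.
    (defun test-locative-primitives ()
      (let* ((cell (list 'initial))
             (loc  (locf (car cell))))        ; locative to the CAR cell
        ;; Reading through the locative should see the cell's current value.
        (unless (eq (contents loc) 'initial)
          (ferror nil "LOCF/CONTENTS read did not return the cell contents"))
        ;; Writing through the locative should be visible to ordinary CAR.
        (setf (contents loc) 'updated)
        (unless (eq (car cell) 'updated)
          (ferror nil "store through locative not visible via CAR"))
        t))

A handful of similarly small functions could cover dynamic closures, array leaders, and the rest of the list above.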
Actually, I'd be willing to punt testing flavors (except for the basic message sending and instance variable reference mechanisms), because the system itself will test the high-level components adequately; on the other hand, testing locatives is desirable because so many hidden places in the system try to use them.

[2] Obviously, barring some serious manpower additions, we will not be able to complete a full validation/regression test suite for our project.  Given this, I feel we need to develop some rough estimates of what portion of the job we might complete with plausible manpower assignments.

We all agree that programmers *should* add tests as they implement or debug some component of the system, but we all also recognise that it is difficult to get programmers to do so.  This is especially true when there are serious time pressures -- "externally invisible" tasks like testing are usually the first to be compromised.  Therefore, for project scheduling purposes, we need to decide on a plausible surcharge ratio to factor into each implementor's time -- perhaps 10-20%?  We should schedule that time for writing tests, and then make completion of the tests part of each milestone item.  If we just leave test coding to the good graces of each implementor, it won't happen.

In any case, we can't leave all testing as part of the task of the code writer.  Many areas of Lisp functionality are implemented inside large files which will be cross-compiled and downloaded, and our plans call for this to proceed *very* quickly once it gets started.  Coding any significant amount of testing as part of that task would seriously alter our porting time estimates!

====================

[Keith: I agree with the substance of Steve's points.  Two comments:

1) There is a test file for ZetaLISP; it just doesn't have much in it yet.  I agree that it should be extended, and that can be considered part of this project.

2) Yes, the system is the best exercise of Flavors.  But if we follow the existing cold-load architecture, Flavors does not get exercised until it is needed for the second layer, the "inner system".  So we must have confidence in the Flavors software at a point before it has been used, namely, in the static cold load.  I therefore think that putting together a short test suite for Flavors, one which can be run before the cold load gets built, could save us much time and aggravation.]

====================

[smh 3oct88]

There are really two separable parts to flavors.  One is the low-level primitive mechanisms in the compiler, ``microcode'', and funcall hacks that make message sending and instance variable reference work.  These mostly still need to be written, and will then need to be debugged, so we can capture the debugging test cases in the compiler/runtime regression test suite.  I feel we should do this, because it would be nice to have tests for these low-level things, just as it is nice to have tests for other low-level language features.

The other part is the high-level flavor.lisp code, which is primarily responsible for maintaining the flavor data base.  Since this is almost all high-level Lisp code, there is no reason to suspect it will fail in processor-dependent ways.  Also, there is an ungodly, only semi-documented, combinatorially large amount of it, with all those different :method-combination options, daemons, wrappers, and whoppers.  Therefore I think we might as well bypass any serious testing of high-level flavor features; our limited time is better spent elsewhere.
====================

[keith 10oct88]

As Steve and I discussed the other day, we've just been batting around different ways of saying the same thing vis-a-vis "low-level" testing: the tests should take the form of calls to the minimum number of high-level functions (DEFFLAVOR, DEFMETHOD, SEND) required to test the low-level mechanisms (FUNCALL, dispatching, etc.).  I believe at this point he and I are in violent agreement on every important issue.

Also, David and I agree that the scope of this proposal should include testing ZetaLISP fundamentals.  The proposal text (above) has been updated to reflect this change.  Additional material is forthcoming which will map out, in priority order, the particular modules that need to be tested.

====================
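As an illustration of the kind of minimal test described above, the sketch below defines a tiny flavor and sends it a couple of messages, exercising instance creation, method dispatch, and instance-variable reference and update.  The flavor and method names are invented for this example, and the FERROR-style checks merely stand in for whatever conventions the real test files use.

    ;; Hypothetical minimal flavors test -- illustrative sketch only.
    (defflavor test-counter
            ((count 0))                      ; one instance variable
            ()                               ; no component flavors
      :initable-instance-variables
      :gettable-instance-variables)

    (defmethod (test-counter :bump) (&optional (n 1))
      (setq count (+ count n)))              ; instance-variable reference/update

    (defun test-flavor-primitives ()
      (let ((c (make-instance 'test-counter :count 5)))
        ;; :COUNT exercises the automatically generated accessor method.
        (unless (= (send c :count) 5)
          (ferror nil "instance creation or :COUNT accessor failed"))
        ;; :BUMP exercises message dispatch and instance-variable update.
        (send c :bump 3)
        (unless (= (send c :count) 8)
          (ferror nil "message dispatch or instance-variable update failed"))
        t))

Something on this scale is the sort of test Keith suggests running before the cold load gets built, well before the inner system exercises the high-level flavor machinery.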