Proposal #2 Stack Groups on the Falcon. The hardware stack structure on the Falcon is different than that on the lambda. However, what the programmer wants to see (from the debugger, for example) remains the same, a "stack machine". This proposal details how we reformat the various falcon stacks into stacks which look very much like the LAMBDA. The basic idea is to do the reformatting always and at the very lowest level, i.e. as the data is loaded or stored from the falcon hardware. Doing it this way has several advantages and disadvantages. advantages: (1)economy of mechanism-- no need for separate "scrolling" routines for example. no matter what you are trying to do, you save out, do-it, and reload. Thus, the only routines deal directly with the hardware are load and store. (2)minimum of code that runs in "embarrasing" circumstances Somewhat a consequence of 1. When the Falcon call hardware is "disturbed" the code must VERY careful. (no ordinary subroutines, may not be able to reference local variables in the usual way, etc .) Ultimate reliability will be served if this sort of code is minimized. (3)avoid proliferation of hardware frame numbers into system datastructures. the HEAP numbers used by the falcon hardware may cause difficulties if they are allowed to "persist" over a long period. (When you reload they might not be the same, what if you have to scroll, what if there are more than 256 frames, etc. etc.) We avoid these difficulties by discarding them immediately (remapping them, if you will, into positions in a conventional style stack.). (4)closest to lambda. This will allow our current error handler to operate with minimum modification. (5)only two stacks need be allocate (or worried about) in most contexts, as on LAMBDA. (these are the MAIN and SPECIAL stacks). Only one EXTRANEOUS stack per entire system is allocated, and it is considered to be in the same logical classification as the hardware stacks. disadvantages: some loss of efficiency doesn't make "pooled" use of hardware (i.e. allowing several processes frames to be in hardware at once.) This is probably not a good idea anyway for various reasons. However, a slight inefficency will occur. Process switch time will still be quite acceptable (by lisp machine standards) anyway. (Due to the necessity to SWAP the special pdl, worst case latency is in principle unbounded anyway.) mechanism of call hardware load-unload itself a small overhead may occur due to this mapping, etc, as compared to simply storing out the hardware in a fixed place, etc. This is insignificantly small and may not ever exist since the code required is comparable. catch and throw. Storing things out and reloading to implement "stack molesting" operations could introduce some overhead. The case which we might eventually want to fix by a "fast bypass" is short THROWs. If you are THROWing to a place only a few frames up the stack, it will be quite a significant overhead to store out the stack group, figure out the THROW, and reload. This can be sped back up by an "fast case" test which looks for the catch frame within the FALCON call hardware. If found, it would do the right thing, etc. If the catch-frame was not found, then you store out etc. Note that in any case it is necessary for the system to be able to handle the case where you are throwing over more frames than will fit in the call hardware (no matter how big that might be unless it was capable of being truly arbitrarily large). Copying data in the EXTRANEOUS stack. Since the EXTRANEOUS stack is considered to be on a level with the hardware stack(s), it is emptied and/or reloaded in parallel with the actual hardware. This data might not need to be copied otherwise. However, the overhead should not be significant since if you are using the extraneous stack, those frames are not running ultra fast anyway. (note that it is not necessary to reload the "entire" EXTRANEOUS stack (if that were defined somehow) but only that portion that corresponds to the portion loaded in the PHYSICAL call hardware. shortcomings: (These dont really have to do with the scheme being proposed here, but are due to inherit design deficiencies in FLEABIT. However, we might as well discuss them here.) Due to the lack of formatting information in fleabit format stacks, it may not be possible to put information in the FLEABIT BIND and EXTRANEOUS stacks in their proper frames on "stack group" stack. Until this is fixed, it will not be possible to force a return with the error handler from an arbitrary frame. The FLEABIT system does record all the PDL levels CATCH frames. The STACK dumper routine must detect CATCH frames (easily done with a single compare) make the corresponding MAIN stack frame contain the EXTRANEOUS pdl info up to the level in the CATCH frame. -- background -- The falcon hardware implements the following stacks, the hardware frame "heap" (and presumably software extension thereof), the EXTRANEOUS PDL (used for overflow from 16 slots and functions of >= K args (k= 16 now, this will have to be reduced to 14 or 15), the special stack (a usual lisp binding stack). There is also a hardware return stack (which also contains return-destination and open-and-active information) which corresponds logically corresponds with the frame heap. Catch frames in the fleabit, ultra-simplified A catch frame is opened like any other. However, it can be recognized by the fact the presence of a special marker (currently li:unwind-marker) in register 0 of that frame. The rest of the frame is available for containing other info necessary to unwind to that level in specific registers. For example, register 1 has the special-pdl pointer, register 2 the extraneous stack pointer, etc. Altho the details may change, we propose to retain this basic scheme. ==================== Comments ==================== [smh 23sep88] "This proposal details how we reformat the various falcon stacks into stacks which look very much like the LAMBDA." It doesn't, really. This proposal is mostly about justifications for doing things this way rather than some other way. It would be nice if the proposal listed the _what_ in addition to the _why_, that is: - What modules/functions would have to be written? By whom? (Presumably RG.) - Sketch (at least) the data structure definition for a STACK-GROUP. - What code would have to be modified to call the new stuff? What are the implications for, say, the trap handler? - When should this project be done? When is the earliest it could start because of dependencies on other tasks? When is the latest it could start because of dependencies upon it? - How long will this project take to write, debug, and integrate? While in process, might it poterntially cause additional flakines that would affect other development? I don't see why it is ever necessary to copy the extraneous pdl. The extraneous pdl is accessed via a general register. When a stack group is resumed the extraneous pdl pointer could simply be loaded with the ``current'' pointer saved in the stack group, with the actual vector of data staying ``in-place'' in the stack group. The two problems this leaves to resolve are: (1) detecting and handling overflows so the extraneous pdl can be grown; and (2) guaranteeing the gc doesn't move the beast, or if it does, that doing so doesn't break anything. mechanism of call hardware load-unload itself a small overhead may occur due to this mapping, etc, as compared to simply storing out the hardware in a fixed place, etc. This is insignificantly small and may not ever exist since the code required is comparable. I don't understand this. It is really unclear what is being compared to what with regard to efficiency. catch and throw. Storing things out and reloading to implement "stack molesting" operations could introduce some overhead. "Could" should here read "will". It would be reasonable to claim that the overhead will be acceptible, and that the more-efficient alternatives would be very messy to implement, but some evidence to both points should be provided. In summary, I don't disagree with the conclusion of this proposal, but for planning purposes it would be better to spell out more of the implementation details and implications before commencing the , or even accepting the proposal. ====================