CHAPTER 27
POLYNOMIAL UPPER BOUND
FOR PROCESSING THE SEMANTICS OF LANGUAGES
THAT GENERATE PHRASES IN OTHER LANGUAGES
This chapter concentrates on languages whose semantics generate
phrases in other languages. Our syntax grammar from the previous
chapter is an example. Its semantics generate phrases in the
~datatype ~language.
This semantic process can consume an exponential amount of time if
we're not careful.
We describe a transformation that can be applied to any semantic
structure. It will render the semantic structure immune from an
exponential blowup of compute time. We will process potentially
exponentially many meanings in only polynomial time.
The next chapter shows the transformation to be applied to the ~last
language in a sequence, where the semantics do not generate phrases
in another language. There, ambiguity might need to be resolved via
common sense, as the language processing comes to a close.
This chapter assumes that the (syntax) grammar has no cyclic coercions,
such as the pair:
A -> B
and
B -> A
The next chapter does deal with such unrestricted coercions.
27.1
Shared Semantics - Efficient Evaluation
We saw a parser in Chapters 12 and 14 that can parse a string of
length N at a cost that is only a polynomial function of N. This is
true even if the string has an exponential number of meanings, due to
ambiguities. The semantic structure built up during the parsing
consumes only a polynomial amount of memory, but it can represent
exponentially many meanings, due to the automatic sharing of common
sub-meanings.
If we're not careful, we may consume an exponential amount of time
processing all exponentially many meanings. Even with a structure of
only polynomial size, we might consume exponential cost by evaluating
a given block more than once.
Figure 25.8 shows an example of ambiguous semantics. When ambiguities
exist, another phenomenon also arises. Notice in figure 25.8 how the
blocks for A, B, and C are ~shared. Each has more than one pointer
pointing to it.
Consider the invocation of the OR block. It first invokes its left
choice, (A+B)#C, and then invokes its other choice, A+(B#C). Each
interpretation is given a chance to generate phrases in the datatype
language. Notice that in the invocation of each of these choices, A
will be invoked ~twice, as will B and C. A will be asked to generate
its phrase twice, once within (A+B)#C and once within A+(B#C).
Although you may think of A, B, and C as variables (leaves in the
semantic tree) one can replace each of A, B, and C by expressions,
such as in:
A + B # C
X*Y + U*V # X*Y
The invocation of A may be nontrivial. In general, its expense can be
arbitrarily great. Therefore, we should be concerned about executing
A twice.
27.1.1
Surpressing Exponential Blowup
Things can get worse. The semantics represented by A might itself
contain ambiguities. Figure 15.10 shows semantic ambiguities nested
within one another, three levels deep. There are a total of eight
possible paths, when you consider that each of the three OR-blocks
offers two choices. Both choices are always taken at each OR-block.
For example, the execution of the top OR-block invokes twice the OR-
block second from the top. Each of those invocations invokes twice
the bottom OR-block. The bottom OR-block thus is invoked a total of
four times. Each of those four invocations invokes twice things at the
very bottom of the structure. Thus, things at the bottom of that
structure will be executed a total of eight times.
In general, ambiguities nested N levels deep gives rise to 2^N meanings
and 2^N executions of things at the bottom of the semantic structure.
We expected there to be 2^N meanings (Section 15.4) but we
never thought about 2^N executions. Must we lose the polynomial upper
bound when we invoke the semantics?
No. We can manage to execute all 2^N possible meanings in only
polynomial time by making sure that no semantic block gets invoked
twice. That is, upon the first invocation of any shared block, we can
carry out the invocation and then ~store the results with the block.
Upon a subsequent invocation, we deliver our saved results, and avoid
recomputing them.
Given that any semantic block executes only once, the total time taken
to invoke all the semantics is bounded above by the total number of
semantic blocks. Because there is only a polynomial number of
semantic blocks, we can process exponentially many represented meanings
in only polynomial time and memory consumption, by supressing duplicate
executions of the same block.
This trick of supressing duplicate executions might not always be
possible. It is, however, in our applications.
BOX: What can cause exponential execution times even in
BOX: structures that are only polynomial in size?
BOX:
BOX: Supposing amgiguities are nested six deep, how long
BOX: may it take to invoke those semantics?
BOX:
BOX: How do we supress exponential blowup?
27.1.2
Keep Only Full-Spanning Types
The invocation of the semantics of EXPRs generates phrases in the
datatype language. Of all the generated phrases, we are really
interested in only full-spanning types. Notice how we've used those
generated phrases. We always collect up only the full-spanning
types, via the "[>...<]" notation.
Let's therefore store with each semantic block only the ~full-spanning
types that are generated by the first invocation. Upon subsequent
invocations, we can look thru the saved, full-spanning unit-length
phrases (types) and GIVE each one again, ~as ~though ~we ~were
~generating ~those ~phrases ~for ~the ~first ~time. However, we do it
this time very quickly, without recomputing the entire phrase
generation again.
Figure 27.1(a) shows an original semantic block. As usual, the first
bin in the block points off to a program, and values for the program's
variables are held in the remaining bins. Conceptually, let's
introduce a spare bin in the semantic block, as shown in figure
27.1(b). Initially, nothing resides in that bin.
Upon first invocation, the datatype phrases are generated as usual,
and only the full-spanning results are kept, as shown in figure
27.1(c). Upon subsequent invocations, we notice that our new last bin
already holds some data. We therefore don't call our program.
Instead, we GIVE each of the full-spanning phrases.
27.1.2.1
How We Might GIVE The Full-Spanning Types
We now show how we might GIVE each of our full-spanning phrases.
Let P be of type PHRASE, so as to be able to hang on to each of our
full-spanning unit-length PHRASE blocks:
FOR P $E the_last_bin ; "(a CHOICE_OF_PHRASES)"
DO
P.LEFT:= LEFT;
GIVE(P);
END
For each full-spanning unit-length phrase P, we GIVE it, except that
we ignore P's LEFT field and instead replace it with the value in the
global variable LEFT.
Recall, we are expected to produce phrases that span from LEFT to C.
We've just incorporated LEFT. Our call to GIVE implicitly puts our
phrase block P onto C. Thus, each phrase block P is GIVEn so as to
span from LEFT to C.
27.2
What We Do At Shared Semantic Blocks
Suppose S is a shared semantic block. As a shared block, at least
two different points of view can invoke it. Our job is to affect S
so that upon non-first invocations, its execution is supressed,
although we must ~act as though we were generating phrases again.
Let's first explore how we'd like to see the shared block behave.
We will later consider how we map this behavior to any block (S) that
is shared.
Here is a BASIC_PROCESS that should replace S:
VAR SEED = CHOICE_OF_PHRASES ;
DO SEED:= NIL;
GIVE //{ S; SEED; }
BEGIN VAR P=PHRASE;
IF -DEFINED( SEED ) THEN
SEED := [> <*S*>; <] ;
FI
FOR P $E SEED;
DO
P.LEFT:= LEFT;
GIVE(P);
END
END \\
We declare the variable SEED and set it to NIL. SEED will hold
ultimately the full-spanning unit-length phrases generated by S.
SEED is the extra bin. This replacement BASIC_PROCESS for S always
checks to see if SEED is defined (i.e., filled in). If it's not, we
fill it in first thing, by executing:
SEED := [> <*S*>; <] ;
This invokes S, causing phrases to be generated (as we expect of S).
The full-spanning unit-length phrases are collected up and assigned
into SEED.
Now, whether it be the first or subsequent invocation, we GIVE each
PHRASE block in SEED, as in:
FOR P $E SEED;
DO
P.LEFT:= LEFT;
GIVE(P);
END
We act as though we were generating phrases all over again. As
discussed earlier, this GIVEs each phrase block P, so that it spans
from (the given) LEFT to C.
Again, the first invocation sets SEED, and all invocations augment
C with the phrases in SEED. Thus, only the first invocation actually
invokes S. We have achieved the supression of duplicate invocations
of S.
There is a little problem associated with this algorithm, which we will
identify and remedy shortly. First, let's arrange to put our new
BASIC_PROCESS right over the block S. This will make S itself
adopt our desired behavior.
BOX: What do we do at shared blocks (e.g., S)?
BOX:
BOX: What is SEED?
BOX:
BOX: How many times do we compute SEED?
BOX:
BOX: What do we do regardless of whether this is a first
BOX: or non-first invocation?
BOX:
BOX: Why do we modify P.LEFT?
27.2.1
Overwriting The Shared Block With The New BASIC_PROCESS
We want our new BASIC_PROCESS to reside in S's place, so that whenever
somebody arrives at the block S, they encounter instead our new
BASIC_PROCESS.
We need to @-assign (modify objectively) the block S. Basically, we
want to say:
@(S):= //{S;SEED;} ... \\ ;
where the "//...\\" is what was just shown. This makes our new
block reside right where the S block is.
Unfortunately, this straightforward @-assignment will suprise us
(like in figure 15.4). We proceed safely as follows:
VAR SEED = CHOICE_OF_PHRASES ;
COPY_S = BASIC_PROCESS ;
SEED:= NIL;
COPY_S:= COPY(S) ;
@(S):= //{ COPY_S; SEED; }
... (as before, but using COPY_S in place
of S)
\\ ;
We've introduced COPY_S, which we must use in place of S within our
//...\\.
Figure 27.2(b) shows the original S and COPY_S. Part (c) shows our new
process as well, the "//...\\". It retains access to COPY_S. Part (d)
shows what the "@" does. It overwrites the original S block so as to
now be our new BASIC_PROCESS. Thus, upon arriving at the original S
block, one sees our new BASIC_PROCESS.
Our new BASIC_PROCESS retains access to S's old data, now called
COPY_S. The new BASIC_PROCESS, although invoked potentially many
times, invokes COPY_S only once. (Another example of this use of the
"@" operator appears in Section 15.2). Here is the complete new
scenario:
VAR SEED = CHOICE_OF_PHRASES ;
COPY_S = BASIC_PROCESS ;
SEED:= NIL;
COPY_S:= COPY(S);
@(S):= //{ COPY_S; SEED; }
BEGIN VAR P=PHRASE;
IF -DEFINED( SEED ) THEN
SEED := [> <*COPY_S*>; <] ;
FI
FOR P $E SEED;
DO
P.LEFT:= LEFT;
GIVE(P);
END
END \\ ;
BOX: BOX: How many iterations does "FOR P $E SEED;" cause
BOX: on the first and on subsequent invocations?
27.2.2
Resetting Shared Blocks - Introducing The Variable PRESENT_TIME
Let's generalize slightly our new BASIC_PROCESS.
The shared block's action of generating phrases can conceivably depend
on a set of ~global ~variables. Upon first invocation, those
variables may be read to help direct the phrase generation. However,
upon subsequent invocations, those variables won't be read, as we
use the result from the first invocation without re-invoking COPY_S.
What if those global variables change in value between invocations?
As things stand now, the global variables will be ignored on all
non-first invocations. We need a way to ask the shared block to
regenerate its phrases from scratch, ignoring the results (SEED)
acquired from a previous invocation.
Let's invent a new global variable named PRESENT_TIME:
VAR PRESENT_TIME = INT ;
Let's agree to ~increment this variable whenever we want to announce
that other global variable(s) have been changed. That is, let's have
shared blocks remember SEED only as long as PRESENT_TIME remains
unchanged.
When PRESENT_TIME changes, we will ignore SEED and invoke
S all over again. This assures that new values in global variables
will actually be used as we generate now possibly a different set of
phrases, depending on the global variables' values.
We respond to PRESENT_TIME by keeping track of what PRESENT_TIME's
value was upon our last invocation, which we will call LAST_TIME:
VAR COPY_S = BASIC_PROCESS;
SEED = CHOICE_OF_PHRASES;
LAST_TIME = INT;
SEED:= NIL;
LAST_TIME:= 0;
COPY_S:= COPY(S);
@(S):= //{ SEED; LAST_TIME; COPY_S; }
BEGIN VAR P = PHRASE;
IF PRESENT_TIME > LAST_TIME THEN
SEED:= [> <*COPY_S*>; <] ;
LAST_TIME:= PRESENT_TIME;
FI
FOR P $E SEED;
DO
P.LEFT:= LEFT;
P.SEM:= COPY(P.SEM); "(new)"
GIVE(P);
END
END \\ ;
One of the two differences between this and our previous rendition,
is the IF-statement. Now, instead of checking if SEED is undefined,
we compare LAST_TIME to PRESENT_TIME. When they don't match, we
re-invoke COPY_S and assign the new results into SEED.
This way, of course, anybody can cause us to re-invoke COPY_S, just by
incrementing PRESENT_TIME. We finally set LAST_TIME to PRESENT_TIME
so that we will ~not re-invoke COPY_S, until someone increments
PRESENT_TIME again.
We assume that PRESENT_TIME is always one or greater, so that our
initial zero value for LAST_TIME is sufficient to guarantee that we
will invoke COPY_S upon the first invocation.
27.2.3
One Other Subtle But Necessary Change
We made one other, unrelated change. After assigning LEFT into P.LEFT,
we perform:
P.SEM := COPY( P.SEM ) ;
This doesn't change P.SEM, except to reference now a separate copy of
P.SEM. This is done for a subtle reason. Recall that GIVE might
modify objectively the semantic block P.SEM of any given PHRASE block
P. GIVE objectively modifies the semantic structure P.SEM in its
effort to supress duplicate phrase blocks.
Such an objective modification would become apparent from the point of
view of SEED, which contains P. By passing into GIVE a copy of our
semantic block (P.SEM), that copy might be modified objectively by
GIVE, but the original P.SEM is not modified. (GIVE never gets a hold
of the original). This assures that our data in SEED will not be
modified, as a side-effect of GIVE.
BOX: How do you cause shared blocks to reevaluate from
BOX: scratch?
BOX:
BOX: Why might you want to perform such reevaluations?
BOX: (An important example appears in the next chapter
BOX: in Section 28.4.5).
BOX:
BOX: Even after such a reset, does the next evaluation
BOX: incur polynomial or exponential cost?
BOX:
BOX: Why do we copy P.SEM?
27.3
A Closer Look At The Shared Activity
At shared blocks, we've acted out phrase generation via:
FOR P $E SEED;
DO
...
GIVE(P);
END
We set SEED via:
SEED:= [> <*COPY_S*>; <] ;
Now, suppose the invocation of COPY_S generates a full-spanning INT.
With the datatype grammar active, which may include the coercion from
INT to REAL:
~INT -> ~REAL
that full-spanning INT will be accompanied by a full-spanning REAL.
SEED therefore contains both an INT and a REAL phrase.
Look what happens when we GIVE each of these. When we GIVE the INT,
that INT and ~another REAL will go onto C. That other REAL arises
because the grammar is active, acting upon the INT. When we then GIVE
SEED's REAL phrase, we wind up with ~two REALs, as figure 27.3
illustrates. The two REALs have identical semantics. Each is the
result of coercing the ~same INT to a REAL. We have just introduced a
meaningless ambiguity. The two parts of the ambiguous meaning under
REAL are identical.
Our activity at shared blocks appears to introduce meaningless
ambiguities. To supress them, let's modify the assignment into SEED:
SEED := [> <*COPY_S*> <] ;
We wish during this execution that GIVE wouldn't apply the grammar
when the full-spanning INT is generated within "<*COPY_S*>". This
would result in SEED containing only the full-spanning INT and no
REAL, (at least no REAL derived from the INT).
This way, when we go thru SEED and GIVE each of the phrase blocks, an
active grammar will then and only then apply the coercion from INT to
REAL, and thus produce a total of only one REAL.
27.3.1
Arresting Full-Spanning Types
We wish to ~arrest all calls to GIVE that introduce full-spanning
unit-length phrases. It is those calls that we want to replay later,
when we are asked to generate our phrases. We want those full-
spanning phrases, not yet seen by the grammar, to form SEED.
We will need to modify our statement that assigns into SEED. It will
have to tell GIVE to arrest full-spanning unit-length phrases, and
eventually collect those full-spanning phrases into SEED.
Let's communicate this desire to GIVE via the following variables:
VAR FULL_SPANNING_LEFT = CHOICE_OF_PHRASES ;
" FULL_SPANNING_LEFT defines the left endpoint
of what GIVE will consider 'full-spanning'. "
FULL_SPANNERS = CHOICE_OF_PHRASES ;
" FULL_SPANNERS will hold all arrested full-
spanning unit-length phrases. "
27.3.2
Modifying GIVE
FULL_SPANNING_LEFT tells GIVE what we consider to be the left endpoint
of full-spanning phrases. GIVE will note such phrases by doing
nothing, except putting them onto FULL_SPANNERS.
No grammar will be called, and no looking for duplicate PHRASE blocks
will occur.
We redefine GIVE now:
DEFINE GIVE( P: PHRASE ):
IF P.LEFT =:= FULL_SPANNING_LEFT
THEN FULL_SPANNERS::= $> P ;
ELSE
exactly as before
FI
ENDDEFN
The "=:=" in the IF-clause compares the addresses of the blocks P.LEFT
and FULL_SPANNING_LEFT. When they are identical addresses (meaning the
~same CHOICE_OF_PHRASES), this phrase block P is said to be
full-spanning. It is arrested and put onto FULL_SPANNERS.
We've supressed some of GIVE's activity. Full-spanning unit-length
phrases are not seen by the grammar. However, if we were to call the
old GIVE with the phrase blocks on FULL_SPANNERS, we would restore that
lost activity.
The order in which GIVE receives its phrase blocks has no
effect on the result after all the GIVing is completed.
Delaying the full-spanners and then GIVing them all at once is
equivalent to having not arrested the full-spanning unit-length
phrases in the first place.
27.3.3
Disabling GIVE's New Feature
For flexibility, let's invent a way to disable this new feature of
GIVE. The action described in the following will make GIVE behave
the old way, arresting no phrases.
Let's define a function that delivers a new PHRASE block, one that
nobody else references:
DEFINE NEW = CHOICE_OF_PHRASES ;
{ [POS: 0] }
ENDDEFN
(This dummy CHOICE_OF_PHRASES has one phrase block, containing zero
as its part-of-speech. Nobody will ever look at it this deeply
though).
Let's declare a variable NEVER to hold a dummy CHOICE_OF_PHRASES
that no one else will ever reference:
VAR NEVER = CHOICE_OF_PHRASES ;
NEVER:= NEW ;
We know that nobody references this dummy because all our programs up
to now never read the variable NEVER.
Our one exception will be to put NEVER's data into FULL_SPANNING_LEFT.
Even with this exception, no PHRASE block's LEFT field will ever match
NEVER.
Let's put NEVER into FULL_SPANNING_LEFT:
FULL_SPANNING_LEFT:= NEVER ;
This causes GIVE's new IF-clause to always deliver FALSE, and hence
to act like it did before our latest modification. (P.LEFT will
never match FULL_SPANNING_LEFT which contains the NEVER data). Thus,
this assignment statement is a sure-fire way to disable GIVE's new
feature of arresting full-spanning PHRASE blocks.
27.3.4
Final Modification To The Shared Activity
We update our activity at shared blocks, so as to arrest full-spanning
PHRASE blocks passed to GIVE. Replace:
SEED := [> <*COPY_S*> <] ;
by:
HOLDING LEFT:= NEW; "Left endpoint for the phrases
generated by COPY_S"
FULL_SPANNING_LEFT:= LEFT; "Tell that to GIVE"
FULL_SPANNERS:= NIL; "GIVE puts full-spanning PHRASE
blocks here"
C:= NIL ;
DO <*COPY_S*>; "Generate all the phrases"
SEED:= FULL_SPANNERS; "All the full-spanning PHRASE
blocks arrested from GIVE"
ENDHOLD
This new program sets things up for GIVE, and finally acquires the
arrested full-spanning phrases. It sets LEFT to a brand new
CHOICE_OF_PHRASES that nobody else references yet. All phrases
generated by "<*COPY_S*>;" will naturally adopt LEFT as the left
endpoint for all full-spanning phrases. We tell GIVE that LEFT is the
left endpoint for full-spanning phrases by assigning LEFT into
FULL_SPANNING_LEFT.
We set FULL_SPANNERS to NIL, so that it will hold ultimately
only the arrested full-spanning PHRASE blocks.
We invoke COPY_S so as to generate all phrases. Upon completion,
GIVE has put all full-spanning PHRASE blocks onto FULL_SPANNERS. We
take those as our SEED.
BOX: What potential problem would arise with our original
BOX: activity at shared blocks?
BOX:
BOX: How does our new method, having GIVE arrest full-
BOX: spanning unit-length phrases, and putting them into
BOX: SEED, solve the original problem?
BOX:
BOX: How does GIVE know what is to be considered to be
BOX: full-spanning?
BOX:
BOX: Where does GIVE put all arrested, full-spanning
BOX: phrases?
BOX:
BOX: What simple assignment will disable GIVE's arresting of
BOX: full-spanning phrases?
27.3.5
Final Modification To The Dash-Notation's Meaning
Presently, GIVE's test for full-spanning phrase blocks is not quite
airtight. Consider the following two phrase generations:
<X>
and
<Y> - <Z>
Both <X> and <Y> have the same left neighbor, the value in LEFT.
GIVE would presently think of both as full-spanning. In fact, <Y>
is not full-spanning. It has a right neightbor (<Z>).
We really want GIVE to consider a phrase block P full-spanning only
when P.LEFT matches FULL_SPANNING_LEFT ~and C holds the far right
endpoint.
The variable C holds the far right endpoint only when there is no dash
to the right of the present phrase generation.
Therefore, let's turn off GIVE's recognition of full-spanning phrases
on all but the rightmost statement of any sequence of statements
connected by dashes. Only that rightmost statement has C representing
the right endpoint of full-spanning phrases. Let's translate:
STATEMENT(1) - STATEMENT(2) - ... - STATEMENT(N)
into the following, which has our new contributions in italics:
HOLDING LEFT; "(Preserve LEFT so that it looks like we never
modified it)".
DO
RIGHTHAND:= C; "Remember present C, for our final
STATEMENT"
~SAVE:= ~FULL_SPANNING_LEFT ; "Remember
FULL_SPANNING_LEFT"
~FULL_SPANNING_LEFT:= NEVER; "disable GIVE's new
feature"
C:= NIL; ~STATEMENT(1)
LEFT:= C; C:= NIL; ~STATEMENT(2)
LEFT:= C; C:= NIL; ~STATEMENT(3)
...
LEFT:= C; C:= NIL; ~STATEMENT(N-1)
LEFT:= C;
C:= RIGHTHAND;
~FULL_SPANNING_LEFT:= ~SAVE; "Re-enable GIVE's
new feature"
~STATEMENT(N) "Rightmost STATEMENT"
ENDHOLD
We disable GIVE's recognition of full-spanning unit-length
phrases on all but the rightmost statement. Disabling occurs via
tha assignment:
FULL_SPANNING_LEFT:= NEVER ;
and the re-enabling occurs by setting FULL_SPANNING_LEFT to its
original value (SAVE). Notice how C and FULL_SPANNING_LEFT are
restored to their original values at the same time, when we enter the
rightmost statement.
---------------- Parentheses in previous paragraph around "1", "2"
---------------- "3", and "N" and "N-1" mean subscripting! -------------
The meaning of "full-spanning" is a local concept. Even though
we shut off the notion of "full-spanning" in some statements, a
statement can establish a ~new notion of full-spanning, use it,
and then abandon it. These different meanings of full-spanning are
protected from one another by the fact that we restore
FULL_SPANNING_LEFT when we are done using it.
BOX: Why have we modified the dash notation's meaning?
BOX:
BOX: How does the modification assure that GIVE will
BOX: arrest ~only full-spanning unit-length phrases?
27.4
Finding All Shared Blocks
Given a semantic block, a BASIC_PROCESS, how do we manage to travel
its internal structure so as to discover which blocks are shared?
If we can do this, we can objectively modify each shared node, S,
as shown earlier, so that S will behave as we wish a shared block to
behave.
Figure 27.4 shows a semantic block. It points to three sub-blocks.
We can visit all sub-blocks in a semantic block by the following
procedure MARK_BLOCK.
We present MARK_BLOCK in English and not in ICL because in ICL, a
BASIC_PROCESS (semantic block) can only be ~invoked. We can't see the
individual bins in ICL. In our implementations so far, MARK_BLOCK is
written in assembly language:
DEFINE MARK_BLOCK( S: BASIC_PROCESS ):
for each non-first bin in S, which we will refer to
as B
DO
MARK_BLOCK(B);
END
ENDDEFN
Recall that the first bin in a BASIC_PROCESS points not to another
BASIC_PROCESS, but to a ~program ~block of machine language code.
Only the non-first bins point to other BASIC_PROCESSes.
MARK_BLOCK, given a BASIC_PROCESS S calls itself recursively with each
of its sub-blocks. Each of those recursive calls naturally follows all
sub-sub-blocks of each of those sub-blocks, and so on. MARK_BLOCK will
visit all blocks accessible from any given semantic block. Figure 27.5
(a) and (b) show examples of visits. The blocks are numbered to
indicate the order in which they are visited.
Let's modify MARK_BLOCK so as to ~mark each block it encounters. We
reserve a single bit in each semantic block, and assume that its value
is initially zero. To mark a block means turning on that single bit
in the block:
DEFINE MARK_BLOCK( S: BASIC_PROCESS ):
Mark the block S
for each non-first bin in S, which we will refer to
as B
DO
MARK_BLOCK(B);
END
ENDDEFN
This procedure winds up ~marking all the semantic blocks accessible
from the one semantic block given to MARK_BLOCK initially. (This
marking is an objective modification).
We are ready to detect shared blocks. Recall that a shared block has
two or more pointers pointing to it. Since MARK_BLOCK follows ~all
pointers, MARK_BLOCK will encounter a shared block ~more ~than ~once.
(figure 27.5(c)). It will arrive at the shared block once from each
pointer.
Our final modification to MARK_BLOCK detects whether or not
the given block has been encountered before:
DEFINE MARK_BLOCK( S: BASIC_PROCESS ):
IF S is marked "i.e., this is a non-first arrival"
THEN NOTE_SHARING(S);
ELSE
mark S
for each non-first bin B in S
DO
MARK_BLOCK(B);
END
FI
ENDDEFN
In case S is marked, we know we've been here before. That means
this is a second arrival, or in general, a non-first arrival. Two
arrivals implies this block S is shared. Thus, if S is marked, we
call NOTE_SHARING.
NOTE_SHARING refers to the procedure shown in the previous section,
where we assigned objectively over S, as in:
@(S):= //{COPY_S;SEED;} ... \\ ;
27.4.1
The Cost Of Marking
Since we don't call MARK_BLOCK recursively if the block is already
marked, each pointer to a block is followed exactly once. Since the
number of pointers in the entire semantic structure is bounded above by
the number of blocks in the structure (times a constant factor), our
MARK_BLOCK procedure consumes execution time proportional to the size
of the semantic structure.
The largest possible size of a semantic structure is at worst
a polynomial function of the number of characters originally parsed.
Even though the semantic structure can represent exponentially many
meanings, our processing time is fortunately only polynomial.
One more refinement would be to avoid calling NOTE_SHARING if
NOTE_SHARING has already been called upon the block S. That
is, a block with ~three or more pointers to it will presently
call NOTE_SHARING upon that same block ~two or more times. We
really need only to NOTE_SHARING only once.
(After calling NOTE_SHARING the first time, we could mark S
with another bit besides the bit used for marking. This bit
would indicate that we already know that S is shared. We would
test this bit prior to calling NOTE_SHARING next time around,
and hence surpress further calls to NOTE_SHARING on S).
BOX: BOX: What does MARK_BLOCK do?
BOX:
BOX: How does it know when it reaches a shared block?
BOX:
BOX: What does NOTE_SHARING do to a given shared block?
BOX:
BOX: Can the marking process inadvertantly consume
BOX: exponential time? Why?
27.5
Summary For The Semantics Of The Syntax Grammar
For the semantics of the syntax grammar, or any grammar whose
semantics generate phrases in another language, here is what we
want to do at OR-blocks and shared blocks:
1) OR-block:
OR:= //[A;B;] <*A*>; "A generates its phrases"
<*B*>; "B generates its phrases"
"The resulting phrases together might be
ambiguous"
\\ ;
2) Shared block, named S:
SEED:= NIL;
LAST_TIME:= 0;
ORIGINAL:= COPY(S);
@(S):= //{SEED;LAST_TIME;ORIGINAL;} BEGIN VAR P=PHRASE;
IF PRESENT_TIME > LAST_TIME THEN
"Set SEED... "
HOLDING LEFT:= NEW; "Left endpoint for the phrases
generated by COPY_S"
FULL_SPANNING_LEFT:= LEFT; "Tell that to GIVE"
FULL_SPANNERS:= NIL; "GIVE puts full-spanning PHRASE
blocks here"
C:= NIL ;
DO <*COPY_S*>; "Generate all the phrases"
SEED:= FULL_SPANNERS; "All the full-spanning PHRASE
blocks arrested from GIVE"
ENDHOLD
LAST_TIME:= PRESENT_TIME;
FI
"GIVE from SEED..."
FOR P $E SEED;
DO
P.LEFT:= LEFT; "(Incorporate the global variable LEFT)"
P.SEM:= COPY(P.SEM);
GIVE(P);
END
END
\\ ;
BOX: BOX: What does the OR-block do for semantics that generate
BOX: phrases in another language?
BOX:
BOX: If PRESENT_TIME remains unchanged, what IF-clause
BOX: supresses non-first invocations of the original
BOX: semantics?