Syntactic sugar for inference rules
From Dyna
delgetl orbocpas noeltalalet Future versions of Dyna may add a couple of special notations for use in inference rules. These notations are merely syntactic sugar intended to improve readability.
Contents |
Anonymous accumulations
Sometimes it is handy to write a summation, or other accumulation, without giving it a name. Dyna might introduce notation as in these two examples:
foo1(U) max= edge(U,V) * (+= edge(V,W)).
foo2(U) += edge(U,V) * (max= edge(V,W)).
which would be syntactic sugar, respectively, for
foo1(U) max= edge(U,V) * temp1(V). temp1(V) += edge(V,W).
foo2(U) += edge(U,V) * temp2(V). temp2(V) max= edge(V,W).
Thus, (+= ...) is a summation operator. It sums
over all variables that are mentioned only inside (+= ...);
in this case, it sums over W. Similarly,
(max= ...) is a maximization operator.
Binarizing inference rules using anonymous accumulations
Notice that the CKY parsing rule
constit(X,I,K) += rewrite(X,Y,Z)*constit(Y,I,J)*constit(Z,J,K)..
can be asymptotically sped up -- in fact Dyna currently does this speedup automatically -- by taking advantage of the distributive law and introducing an intermediate summation such as
constit(X,I,K) += (+= rewrite(X,Y,Z)*constit(Y,I,J)) * constit(Z,J,K).
which sums over Y, yielding the "binarized" program
constit(X,I,K) += temp(X,Z,I,J)*constit(Z,J,K). temp(X,Z,I,J) += rewrite(X,Y,Z)*constit(Y,I,J). % slashed constituent
Notice that temp(X,Z,I,J) is really a "slashed" X/Z
constituent, i.e., an X that is missing a Z.
Another possible binarization is
constit(X,I,K) += rewrite(X,Y,Z) * (+= constit(Y,I,J)*constit(Z,J,K)).
which sums over J and is equivalent to
constit(X,I,K) += rewrite(X,Y,Z)*temp(Y,Z,I,K). temp(Y,Z,I,K) += constit(Y,I,J)*constit(Z,J,K). % double constituent
The if ever syntax
See quantification in semi-expressions.
Note also that an if ever construction such as
foo(X) += bar(X) if ever edge(X,Y) != 0.
can be regarded as syntactic sugar for an anonymous accumulation that performs the existential quantification
foo(X) += bar(X) if (|= edge(X,Y) != 0).
This in turn is syntactic sugar for
:- item(temp, bool, false). foo(X) += bar(X) if temp(X). temp(X) |= edge(X,Y) != 0.
Evaluated subterms
This part of the language is still being worked out.
In future, Dyna will probably let you write expressions something like these:
foo(=2+3) to mean foo(5)
foo(=bar(nil)) to mean foo(5), if bar(nil) has value 5
foo(=complex("3+4i")) to mean foo(3 + 4i), where
3 + 4i is a constructed primitive term
word(W,I,=I+1) in an inference rule
baz(P,=baz(P,Q)) in an inference rule
A subterm preceded by = is evaluated in place.
Evaluated subterms as syntactic sugar
In fact, the above examples are just syntactic sugar for the semi-expressions
foo(X) whenever X is 2+3.
foo(X) whenever X is complex("3+4i").
foo(X) whenever X is bar(baz).
word(W,I,X) whenever X is I+1.
baz(P,X) whenever X is baz(P,Q). % note universal quantification over Q
Because these semi-expressions contain variables, they are really patterns,
not expressions, and can only be used in queries and inference rules.
In particular, foo(=bar(nil)) is not a term and cannot be constructed in C++. (The argument of a foo term would have to be an integer, in our example, not a variable like X or an integer-valued expression like bar(nil).)
Given the inference rules
goal += foo(=bar(baz)). goal += word(W,I,=I+1). goal += baz(P,=baz(P,Q)).
the antecedents of goal would be matches to the above
patterns:
foo(5) whenever 5 is bar(baz). word(np,5,6) whenever 6 is 5+1. baz(a,c) whenever c is baz(a,b).
These semi-expressions are slightly clumsy, but they provide the necessary information, especially if analyzed in C++ with pattern-match accessors (see pattern).
The is operator
The is operator in the semi-expressions is adapted from Prolog. α is β is true if the term α (usually a primitive) is the value of the expression β.
(By contrast, α=β is true if
α and β are equal terms, and
α==β is true if α and β are expressions with equal values. All three have boolean values, but α=β is properly regarded as a foreign axiom because it does not evaluate its subterms α and β, α==β as a computed expression because it does evaluate them, and α is β as a piece of special Dyna syntax.)
Evaluation of subterms versus non-subterms
Note that the antecedent expression in an inference rule is always evaluated.
Thus, if the value of a is the term b*c, and the value of b*c is 6, then the following are different:
goal = a. % goal is the actual term b*c goal = =a. % goal is 6 goal = ==a. % goal is 6 (since the value of 6 is itself)
goal = foo(a). % goal is the value of the item foo(a) goal = foo(=a). % goal is the value of the item foo(b*c) goal = foo(==a). % goal is the value of the item foo(6)
The last example expands to
goal = foo(=X) whenever X is a.
which expands further to
goal = foo(Y) whenever Y is X whenever X is a.
or more concisely,
goal = foo(X) whenever (Y is X) & (X is a).
Automatic evaluation of subterms
It might be convenient to allow the = to be omitted in some
or all circumstances, allowing syntax like word(W,I,I+1).
However, this must be done carefully since it is often useful to write
terms such as constit(s/np,3,5) where s/np is
simply a structured subterm (akin to slash(s,np)) rather than
a request to divide the value of s by the value of np.
Various conventions could be adopted. Why should I+1 but not s/np be evaluated?
- Perhaps because the user has declared that
+subterms (orint+intsubterms) should be automatically evaluated. (Maybe it's even enough to explicitly declare them to be computed expressions; note that common operators like+and/will be implicitly declared.) - Perhaps because the user has declared that
/subterms (ornonterm/nontermsubterms) should not be automatically evaluated. (Maybe it's even enough to explicitly declare them to be structures; note that such declarations would otherwise be inferred). - Perhaps because
I+1has a value whereass/npdoesn't (either becausesandnpdon't, or because/isn't defined to return a value when it takes arguments of their types). In other words, all computed expressions should be evaluated, even as subterms (unless somehow protected, e.g., by a non-traditional quote operator, or by having also been declared as structures, or because their context says not to evaluate them). - An even stronger version of the previous option would be to evaluate all expressions by default, not just computed expressions but also items. But this seems to interfere with convenient constructions like
priority(constit(...))andgradient(constit(...)). - Perhaps because the context of this instance of
I+1allows anintbut not a+. (This is a bit dangerous, though, and would also make type inference difficult.)
Some of the issues in designing this include
- whether evaluation of non-void terms is opt-in or opt-out
- whether the decision of whether to evaluate is up to the term or the context in which it appears
- concrete syntax for declarations and for the quote operator that protects against evaluation
- ensuring that
foo(bar(3))means the same thing in Dyna as in C++
Possible approach
At present, I think the leading proposal -- if we do auto-evaluation at all -- is to do it aggressively as follows.
- Ordinary evaluation:
- Evaluate all terms that have values (as in C).
- Never evaluate void terms (as in Prolog; this allows nesting). Or conceivably, just get this effect by adopting a convention where void terms evaluate to themselves.
- Pass by reference: One can declare contexts that suppress evaluation.
foo(int X, literal term Y)or perhapsfoo(int X, quoted term Y)will suppress evaluation offoo's second argument: it is supposed to be a term, not something that might evaluate to a term. Not sure whether type inference should try to infer this kind of declaration.- Example:
int s(literal peano Predecessor). Nows(s(s(s(z))))can evaluate to 4 wherever it's used (thanks to rules likes(X) = X+1), but eachsprotects the Peano number inside it from evaluation. - Example:
double priority(literal term)attaches priorities to terms, not their values. - Example:
X==foo(Y)evaluates its arguments (value equality), whileX~~foo(Y)does not (term equality) and is equivalent toquote(X)==quote(foo(Y)). (One can mix them, e.g.,X==quote(foo(Y)).)
- Example:
- Quote: We can quote terms explicitly to prevent their evaluation, e.g.,
foo(quote(s(s(s(s(z)))))). Formally, we'd define the polymorphic functionforeign T quote(quoted T)to be the identity. Notice that quoting a void term has no effect (so it is always safe to quote if you're not sure whether something is void).- Quoting only goes down one level, I think, so
quote(foo(2+2))would be equivalent toquote(foo(4)). This is different from Lisp and prevents the need for the backquote operator, I think. If you want to quote at multiple levels, do it explicitly:quote(foo(quote(2+2)) - Variable substitution is not affected by quoting, so
quote(foo(Y))ranges over the termsfoo(1),foo(2), etc. - An aggregation operator can be declared to suppress evaluation or not. They typically suppress evaluation of the consequent, but generally allow the antecedent to evaluate if it's non-void. Unevaluated antecedents could be useful for aggregation operators on terms, notably
cons=or whatever we call adding an element to a set.
- Quoting only goes down one level, I think, so
- Eval: We can eval terms explicitly to force their evaluation, e.g.,
=favoriteitemto get the value of the term stored byfavoriteitem. Not clear how this should work where evaluation is suppressed: shouldpriority(=favoriteitem)set the priority offavoriteitem, or -- more consistently but maybe less usefully -- the priority of a new termeval(favoriteitem)? - Question: What do the instantiated antecedent expressions of
goal += foo(X*Y) if bar(X,Y)look like? - Disavantage: When reading a program, it's not immediately obvious which terms are evaluated and which aren't. You have to know who is void and who suppresses evaluation. A smart syntax-coloring editor might help by colorizing the functors (and parens, if not infix) of unevaluated terms.
- Disadvantage: Presumably this convention introduces an unfortunate discrepancy between the concrete syntax on the Dyna side and the C++ side.
Cool idea: Possibly we should steal C++ punctuation. We can think of terms as pointers to values; they just happen to have lots more structure than a 32-bit sequence.
-
&s(s(z))for quoting (rather thanquote(s(s(z))), or's(s(z)), which we can't parse because'is already used to quote functors) -
*favoriteitemforeval(favoriteitem)(rather than=favoriteitem) -
priority(term &)to suppress evaluation of the argument (rather thanpriority(quoted term)); after all, this is just pass-by-reference -
double *derivative(double *Expression)to define a function ondouble-valued terms (rather thandouble =derivative(double =Expression)) - Warning: C++ notation in Dyna could be really confusing if the analogy proves faulty!
Maybe this C++-like notation is the secret to getting the C++ concrete syntax to work as in Dyna?
- Note that
operator&andoperator*can be overloaded. - Given the declaration
int foo(string), the C++ expression&foo("arg")would have to return some kind ofterm-- maybe aterm<int>, analogous to anint *. (Presumably this would be a subtype ofterm<term>, which is the highest-level class). -
foo("arg")itself would have to returnint(or more plausibly, something that implicitly casts toint; it would also have to supportoperator&and -- iffoois an axiom type -- be usable as an lvalue). - We'd probably want to drop the use of a special
chartobject and just use terms directly as both lvalues (in the case of axioms) and rvalues. But thec[foo(bar))]notation could still be used as syntactic sugar, equivalent tochart(c,foo(bar))wherefoo(bar)is not evaluated. The question is howbaris evaluated: arguablyc[foo(bar)]should usec[bar]as the argument tofoo, since the whole expression is to be evaluated in the context of the chart; I'm not sure whether we can pull that off.
