Query (future version)
From Dyna
One thing that makes Dyna fast is that when an inference rule is triggered, it can quickly look up the items that it needs in the chart. A compiled Dyna program maintains special indices for this purpose.
This lookup facility is also made public for your use. Your C++ driver program can issue arbitrary queries against a chart -- as long as your Dyna program has declared the form of these queries in advance.
Contents |
Declaring queries
- Warning: The syntax described on this page is new. Version 0.3.1 still uses the old syntax. It lets you declare
:- query(foo)only iffoois a union type. It provides a functionality equivalent to pattern queries with arguments, via declarations such as:- query(barfly, bar(a(X),_Y,Z,c(X,0,_Y)))which gives you abarfly::iteratorover this pattern; the iterator's constructor takes the chart as an argument along withXandZ(the variables that don't start with_).
Declaring a query in your Dyna program has no effect on the semantics
of the Dyna program. However, it makes dynac
generate a C++ iterator class that you can use to issue queries.
Declaring a query may also make dynac generate extra code
to maintain indices on the chart. That makes the query fast; on the
other hand, be warned that maintaining these indices costs time and space.
- Eventually, DynaMO will be able to notice which queries aren't used enough to merit good indexing. This is also important for Dyna's internal use of queries to run inference rules.
Queries without arguments
Suppose your Dyna program includes the declarations
:- structure(constit(string nonterm, int start, int end)). :- query(constit).
Remember that constit compiles into a C++ class. Given
the above declaration, it will include a nested class
constit::iterator. An instance of the nested class lets you
iterate over all constit items. For example, in a CKY
program (give link), you can assert a bunch of words into the
chart and then iterate over all the constits.
If you want the nested class to be called constit::gimme
instead, you can include an optional argument specifying a C++ name
(as for many declarations):
:- query(constit, gimme).
Queries over foreign axioms
Queries over foreign axioms work just like queries over structures. The only difference is that you have to implement them yourself in C++!
- Or else give pragmas saying how to obtain the results from other queries that are implemented.
Pattern queries
You can also declare a query over a pattern. For example, given the pattern declarations
:- pattern(mypattern, constit("np",I,I)).
:- pattern(lengthtwo, edge(U,V)+edge(V,W)).
:- pattern(deep, foo(X,Y,yada(yada(yada(X))))).
you could declare
:- query(mypattern). :- query(lengthtwo). :- query(deep).
Now, for example, you have a C++ class
mypattern::iterator that you can use for iterating
over all items in the chart that match mypattern.
Furthermore, for each mypattern object returned by the iterator,
you can use pattern-match accessors to
recover the variable bindings whereby the pattern matched.
Patterns allow you to write queries that
- contain constants (as in
mypattern) - contain subterms that match a particular pattern (as in
deep) - carry out database joins (as in
lengthtwo, a query over expressions) - have internal unification (as in all three examples)
Queries with arguments
The queries declared so far have iterated over all terms of a given type (perhaps a pattern type). But you may want to declare a more flexible query that allows you to request a subset of those terms, specifying which subset at runtime.
For example, the following will obtain all rewrite axioms
with a specified left-hand side:
:- structure(rewrite(string lhs, string rhs)). % unary rule: lhs -> rhs :- structure(rewrite(string lhs, string rhs1, string rhs2)). % binary rule: lhs -> rhs :- query(rewrite(lhs)). % specifies accessor name from above declarations
Now rewrite::iterator is a class with an extra argument
in its constructor. rewrite::iterator(mychart,"np") will
return all rewrite items t in
mychart such that t.lhs()=="np".
You can do the same with patterns:
:- pattern(lengthtwo, edge(U,V)+edge(V,W)). :- query(lengthtwo(U,W)).
Now lengthtwo::iterator(mychart,3,10) will return all
length-two paths that are built from edges in mychart and
go from vertex 3 to vertex 10. Each such path is returned by the
iterator as a lengthtwo object t such that
t.U()==3 and t.W()==10. To extract the
intermediate vertex V, you can just call t.V(). To find
the total weight of the two edges, use mychart[t] to
get the value of the expression edge(3,V)+edge(V,10).
Once you start using queries with arguments, you may find that you want multiple queries over the same type:
% WRONG :- query(lengthtwo). :- query(lengthtwo(U)). :- query(lengthtwo(W)). :- query(lengthtwo(U,W)).
This won't work because it tries to declare four classes with the same
default name, lengthtwo::iterator. You have to specify
explicit alternative names for at least three of the classes, using
the optional argument discussed earlier.
% RIGHT :- query(lengthtwo, unbound_iterator). % lengthtwo::unbound_iterator :- query(lengthtwo(U)). % lengthtwo::iterator (default name) :- query(lengthtwo(W), reverse_iterator). % lengthtwo::reverse_iterator :- query(lengthtwo(U,W), bound_iterator). % lengthtwo::bound_iterator
Query semantics
So what objects are returned by a query iterator, exactly?
It is only possible to define queries over items or other expressions (i.e., terms with values).
While there may be infinitely many expression terms that match a query pattern, such as
constit("np",I,I), the query iterator is only required to return those
that have non-default values in the chart. (It may also return some that have default values.)
- Possibly Dyna should guarantee that if an axiom has been asserted, it will be returned by a query, even if it has been asserted with default value. Possibly it should also make some kinds of guarantees about items that have been derived with default values, but that may affect efficiency, so there is no guarantee at present.
- In some circumstances it is possible for the chart to contain infinitely many matches that have non-default value. Possibly the compiler should forbid the declaration of queries for which this is possible, since the iterator would have to return non-ground terms (patterns), which are not currently first-class objects in Dyna the way they are in Prolog. The issue does not arise for the subset of Dyna that is currently implemented, which only allows semiring programs.
Like other methods for examining a chart, queries consider everything that can be derived from the axioms in the chart. So you can place words into a parse chart and then immediately query to find out what phrases have sprung into existence. In practice, this means that calling the query iterator constructor may make some parsing happen. If you would like to block that effect, and ask what is in mychart right now, you should query mychart.peek instead of mychart.
Using queries
Suppose you have declared a query with 0 or more arguments:
:- query(foo(args)).
How do you use the foo::iterator class on the C++ side?
Here's a typical loop, that prints out all expressions in the chart that match the query and have non-default value:
for (foo::iterator iter(mychart,args); iter; ++iter) cout << *iter << endl; // *iter is an instance of class foo
That's all there is to it!
Iterator objects
As the loop above illustrates, a Dyna iterator is a bit simpler than a C++ STL iterator. It is only intended to support forward iteration through a list.
An iterator instance iter either points to some element of the list, in which case
(bool)iter is true and *iter returns the element,
or else it points past the end of the list, in which case (bool)iter is false and
*iter is an error (typically a segmentation fault). Notice that the (bool) cast is implicit in the context of a loop test, as shown above.
When constructed, iter points to the first element in the list (if any).
Each call to ++iter advances it to point to the next element (if any). Eventually, it will point past the end of the list (if finite). Once that happens, continuing to call ++iter has undefined behavior.
An iterator object has one additional method: iter.restart() makes iter point to the first element again. This is especially useful during training, when one wants to iterate over the axioms during each training pass.
Iterators are fail-fast. That is, if the chart changes during the lifetime of the iterator,
then the next call to *iter or ++iter will throw an exception.
- The fail-fast feature hasn't been implemented yet, and perhaps should be reconsidered or made optional, since it adds a bit of overhead. Also, it is not clear whether the iterator should fail if the chart has changed, or only if the particular list has changed.
Constructing iterators
Notice that you obtain a foo::iterator by calling
the class's constructor in any of the usual ways. The first argument
of the constructor is the chart that you want to iterate over.
constit::iterator iter(mychart, args); // named object on stack constit::iterator(mychart,args); // temporary object on stack constit::iterator *piter = new constit:iterator(mychart, args); // object on heap
(You might have expected to get your iterator by calling a method on the chart,
rather than by passing the chart to a constructor. But then you'd have to
write constit::iterator iter = mychart.constit_iterator(), which
is both longer and less efficient than constit::iterator iter(mychart).)
