Query (future version)

From Dyna

Jump to: navigation, search

One thing that makes Dyna fast is that when an inference rule is triggered, it can quickly look up the items that it needs in the chart. A compiled Dyna program maintains special indices for this purpose.

This lookup facility is also made public for your use. Your C++ driver program can issue arbitrary queries against a chart -- as long as your Dyna program has declared the form of these queries in advance.

Contents

Declaring queries

Warning: The syntax described on this page is new. Version 0.3.1 still uses the old syntax. It lets you declare :- query(foo) only if foo is a union type. It provides a functionality equivalent to pattern queries with arguments, via declarations such as :- query(barfly, bar(a(X),_Y,Z,c(X,0,_Y))) which gives you a barfly::iterator over this pattern; the iterator's constructor takes the chart as an argument along with X and Z (the variables that don't start with _).

Declaring a query in your Dyna program has no effect on the semantics of the Dyna program. However, it makes dynac generate a C++ iterator class that you can use to issue queries.

Declaring a query may also make dynac generate extra code to maintain indices on the chart. That makes the query fast; on the other hand, be warned that maintaining these indices costs time and space.

Eventually, DynaMO will be able to notice which queries aren't used enough to merit good indexing. This is also important for Dyna's internal use of queries to run inference rules.

Queries without arguments

Suppose your Dyna program includes the declarations

:- structure(constit(string nonterm, int start, int end)).
:- query(constit).

Remember that constit compiles into a C++ class. Given the above declaration, it will include a nested class constit::iterator. An instance of the nested class lets you iterate over all constit items. For example, in a CKY program (give link), you can assert a bunch of words into the chart and then iterate over all the constits.

If you want the nested class to be called constit::gimme instead, you can include an optional argument specifying a C++ name (as for many declarations):

:- query(constit, gimme).

Queries over foreign axioms

Queries over foreign axioms work just like queries over structures. The only difference is that you have to implement them yourself in C++!

Or else give pragmas saying how to obtain the results from other queries that are implemented.

Pattern queries

You can also declare a query over a pattern. For example, given the pattern declarations

:- pattern(mypattern, constit("np",I,I)).
:- pattern(lengthtwo, edge(U,V)+edge(V,W)).
:- pattern(deep, foo(X,Y,yada(yada(yada(X))))).

you could declare

:- query(mypattern).    
:- query(lengthtwo).    
:- query(deep).    

Now, for example, you have a C++ class mypattern::iterator that you can use for iterating over all items in the chart that match mypattern. Furthermore, for each mypattern object returned by the iterator, you can use pattern-match accessors to recover the variable bindings whereby the pattern matched.

Patterns allow you to write queries that

  • contain constants (as in mypattern)
  • contain subterms that match a particular pattern (as in deep)
  • carry out database joins (as in lengthtwo, a query over expressions)
  • have internal unification (as in all three examples)

Queries with arguments

The queries declared so far have iterated over all terms of a given type (perhaps a pattern type). But you may want to declare a more flexible query that allows you to request a subset of those terms, specifying which subset at runtime.

For example, the following will obtain all rewrite axioms with a specified left-hand side:

:- structure(rewrite(string lhs, string rhs)).               % unary rule: lhs -> rhs
:- structure(rewrite(string lhs, string rhs1, string rhs2)). % binary rule: lhs -> rhs
:- query(rewrite(lhs)).   % specifies accessor name from above declarations

Now rewrite::iterator is a class with an extra argument in its constructor. rewrite::iterator(mychart,"np") will return all rewrite items t in mychart such that t.lhs()=="np".

You can do the same with patterns:

:- pattern(lengthtwo, edge(U,V)+edge(V,W)).
:- query(lengthtwo(U,W)). 

Now lengthtwo::iterator(mychart,3,10) will return all length-two paths that are built from edges in mychart and go from vertex 3 to vertex 10. Each such path is returned by the iterator as a lengthtwo object t such that t.U()==3 and t.W()==10. To extract the intermediate vertex V, you can just call t.V(). To find the total weight of the two edges, use mychart[t] to get the value of the expression edge(3,V)+edge(V,10).

Once you start using queries with arguments, you may find that you want multiple queries over the same type:

% WRONG
:- query(lengthtwo).
:- query(lengthtwo(U)).
:- query(lengthtwo(W)).
:- query(lengthtwo(U,W)).

This won't work because it tries to declare four classes with the same default name, lengthtwo::iterator. You have to specify explicit alternative names for at least three of the classes, using the optional argument discussed earlier.

% RIGHT
:- query(lengthtwo, unbound_iterator).     % lengthtwo::unbound_iterator
:- query(lengthtwo(U)).                    % lengthtwo::iterator (default name)
:- query(lengthtwo(W), reverse_iterator).  % lengthtwo::reverse_iterator
:- query(lengthtwo(U,W), bound_iterator).  % lengthtwo::bound_iterator

Query semantics

So what objects are returned by a query iterator, exactly?

It is only possible to define queries over items or other expressions (i.e., terms with values). While there may be infinitely many expression terms that match a query pattern, such as constit("np",I,I), the query iterator is only required to return those that have non-default values in the chart. (It may also return some that have default values.)

Possibly Dyna should guarantee that if an axiom has been asserted, it will be returned by a query, even if it has been asserted with default value. Possibly it should also make some kinds of guarantees about items that have been derived with default values, but that may affect efficiency, so there is no guarantee at present.
In some circumstances it is possible for the chart to contain infinitely many matches that have non-default value. Possibly the compiler should forbid the declaration of queries for which this is possible, since the iterator would have to return non-ground terms (patterns), which are not currently first-class objects in Dyna the way they are in Prolog. The issue does not arise for the subset of Dyna that is currently implemented, which only allows semiring programs.

Like other methods for examining a chart, queries consider everything that can be derived from the axioms in the chart. So you can place words into a parse chart and then immediately query to find out what phrases have sprung into existence. In practice, this means that calling the query iterator constructor may make some parsing happen. If you would like to block that effect, and ask what is in mychart right now, you should query mychart.peek instead of mychart.

Using queries

Suppose you have declared a query with 0 or more arguments:

:- query(foo(args)).

How do you use the foo::iterator class on the C++ side? Here's a typical loop, that prints out all expressions in the chart that match the query and have non-default value:

for (foo::iterator iter(mychart,args); iter; ++iter) 
  cout << *iter << endl;   // *iter is an instance of class foo

That's all there is to it!

Iterator objects

As the loop above illustrates, a Dyna iterator is a bit simpler than a C++ STL iterator. It is only intended to support forward iteration through a list.

An iterator instance iter either points to some element of the list, in which case (bool)iter is true and *iter returns the element, or else it points past the end of the list, in which case (bool)iter is false and *iter is an error (typically a segmentation fault). Notice that the (bool) cast is implicit in the context of a loop test, as shown above.

When constructed, iter points to the first element in the list (if any). Each call to ++iter advances it to point to the next element (if any). Eventually, it will point past the end of the list (if finite). Once that happens, continuing to call ++iter has undefined behavior.

An iterator object has one additional method: iter.restart() makes iter point to the first element again. This is especially useful during training, when one wants to iterate over the axioms during each training pass.

Iterators are fail-fast. That is, if the chart changes during the lifetime of the iterator, then the next call to *iter or ++iter will throw an exception.

The fail-fast feature hasn't been implemented yet, and perhaps should be reconsidered or made optional, since it adds a bit of overhead. Also, it is not clear whether the iterator should fail if the chart has changed, or only if the particular list has changed.

Constructing iterators

Notice that you obtain a foo::iterator by calling the class's constructor in any of the usual ways. The first argument of the constructor is the chart that you want to iterate over.

constit::iterator iter(mychart, args);                           // named object on stack
constit::iterator(mychart,args);                                 // temporary object on stack
constit::iterator *piter = new constit:iterator(mychart, args);  // object on heap

(You might have expected to get your iterator by calling a method on the chart, rather than by passing the chart to a constructor. But then you'd have to write constit::iterator iter = mychart.constit_iterator(), which is both longer and less efficient than constit::iterator iter(mychart).)

Personal tools