Inference Networks

Inference networks use a slightly different probabilistic model, based on reasoning under uncertainty [30]. An inference network is an inverted tree structure in which multiple parent nodes converge on a single child. The network has four levels, each consisting of a different type of node:

Document nodes, which represent a particular document;
Representation nodes, which represent a particular concept. Typically, this is a term contained within one or more documents;
Query nodes, which represent the concepts used to represent the information need of the user;
I node, which represents the information need of the user.

A sample network is shown in Figure 1.

Figure 1:  Sample inference network for query ``(Richard and Nixon) and not Watergate''

Here, the nodes labeled tex2html_wrap_inline2347 and tex2html_wrap_inline2349 correspond to three of the n Document nodes. The nodes labeled ``Richard,'' ``Nixon,'' and ``Watergate'' are Representation nodes, the ``and'' and ``not'' nodes are the Query nodes, and the ``information need'' node is the I node. The horizontal line separates the Document and Representation nodes, which are created at index time rather than at query time, from the Query and I nodes, which are created dynamically for each query.
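The four-level structure and the split between index time and query time can be sketched in Python. This is a minimal illustration, not the representation used by any particular system: the `Node` class, the function names, the document names `doc1` to `doc3`, and their term lists are all hypothetical, and the nesting of the two ``and'' nodes is one assumed reading of the query.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    level: str  # "document", "representation", "query" or "information_need"
    parents: list = field(default_factory=list)


def build_index_levels(docs):
    """Index time: create Document nodes and the Representation nodes below them."""
    doc_nodes = {name: Node(name, "document") for name in docs}
    rep_nodes = {}
    for name, terms in docs.items():
        for term in terms:
            rep = rep_nodes.setdefault(term, Node(term, "representation"))
            rep.parents.append(doc_nodes[name])
    return doc_nodes, rep_nodes


def build_query_levels(rep_nodes):
    """Query time: create Query nodes and the I node for
    (Richard and Nixon) and not Watergate (assumed nesting)."""
    inner_and = Node("and", "query", [rep_nodes["richard"], rep_nodes["nixon"]])
    negation = Node("not", "query", [rep_nodes["watergate"]])
    outer_and = Node("and", "query", [inner_and, negation])
    return Node("information need", "information_need", [outer_and])


def levels_above(node):
    """Collect the level names of every ancestor of a node."""
    seen = set()
    stack = list(node.parents)
    while stack:
        current = stack.pop()
        seen.add(current.level)
        stack.extend(current.parents)
    return seen


# Hypothetical collection: document name -> terms it contains.
DOCS = {
    "doc1": ["richard", "nixon"],
    "doc2": ["richard", "nixon", "watergate"],
    "doc3": ["nixon"],
}

doc_nodes, rep_nodes = build_index_levels(DOCS)   # done once, at index time
need = build_query_levels(rep_nodes)              # rebuilt for every query
```

Only `build_query_levels` has to run per query; the Document and Representation nodes are reused, which mirrors the horizontal line in the figure.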

Probabilities are calculated starting from the Document nodes and propagated down to the I node. Each child node contains a link matrix [51] that specifies how to combine the probabilities of the child's parents into the probability for the child. Once propagation is complete, the I node holds, for each document, the probability that the document satisfies the information need. It is then a simple matter to sort the documents by this probability and display the most relevant ones first, in accordance with Robertson's Probability Ranking Principle [35].
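As a rough illustration of this propagation, the sketch below evaluates the query ``(Richard and Nixon) and not Watergate'' for three documents. Several things here are assumptions rather than details from the text: the parents of a node are treated as independent, only the "child is true" row of each link matrix is stored, the I node simply takes the belief of its single Query parent, and the document names and term probabilities are invented for the example.

```python
from itertools import product


def combine(link_row_true, parent_probs):
    """Return P(child = true) given the beliefs of its parents.

    link_row_true[i] is P(child = true | parent configuration i), where
    configuration i is read as a binary number over the parents and the
    first parent is the most significant bit. Parents are assumed
    independent (an assumption of this sketch).
    """
    total = 0.0
    for i, states in enumerate(product([0, 1], repeat=len(parent_probs))):
        weight = 1.0
        for state, p in zip(states, parent_probs):
            weight *= p if state else (1.0 - p)
        total += link_row_true[i] * weight
    return total


# Link matrix rows for the two Boolean operators in the sample query.
AND_2 = [0.0, 0.0, 0.0, 1.0]  # true only when both parents are true
NOT_1 = [1.0, 0.0]            # true only when the parent is false

# Hypothetical beliefs at the Representation nodes, per document.
BELIEFS = {
    "doc1": {"richard": 0.9, "nixon": 0.8, "watergate": 0.1},
    "doc2": {"richard": 0.9, "nixon": 0.8, "watergate": 0.9},
    "doc3": {"richard": 0.2, "nixon": 0.1, "watergate": 0.1},
}


def score(doc):
    """Propagate one document's beliefs down to the I node."""
    b = BELIEFS[doc]
    richard_and_nixon = combine(AND_2, [b["richard"], b["nixon"]])
    not_watergate = combine(NOT_1, [b["watergate"]])
    return combine(AND_2, [richard_and_nixon, not_watergate])


# Rank documents by decreasing probability of satisfying the need.
ranking = sorted(BELIEFS, key=score, reverse=True)
```

With these invented numbers, `doc1` scores roughly 0.648, `doc2` roughly 0.072, and `doc3` roughly 0.018, so the ranking is `doc1`, `doc2`, `doc3`. For the Boolean matrices used here, `combine` reduces to a product for ``and'' and to one minus the parent's belief for ``not''; a full link matrix is only needed for operators that cannot be written that compactly.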



Erik Selberg
Wed Aug 6 12:24:17 PDT 1997