Assume an IR system uses a document ranking function r(Q, d). The
Swets model examines the distribution of the values of r; in
particular, it looks at two distributions: the distribution of values
when d is in the set of relevant documents, and the distribution
when d is in the set of irrelevant documents. Intuitively, the
ranking values for relevant documents should generally be higher than
those for irrelevant documents. The assumption is that the more separated the
distributions are, the better the ranking function. Assume for a
moment that these two distributions exist, each with its own mean, and
consider the distance between the two means.
Figure 2: The Swets model of ranking distribution

Figure 2 shows a graphical representation of the Swets model.
Pictured are two distributions of values of a ranking function r,
the leftmost distribution corresponding to values of r(Q, d) when
d is not relevant, and the rightmost distribution corresponding to
values of r(Q, d) when d is relevant. The mean of each
distribution is highlighted, as is the distance between the means.
Swets uses the distance between the means as an approximation for the quality of the distributions, with higher values indicating better distributions. Intuitively, larger values of this distance imply that the ranking function r is better able to differentiate relevant documents from irrelevant documents. While there are some issues in using this model to compare practical systems [5, 41], the Swets model nevertheless offers an attractive theory on which to base formulas for document ranking.
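
To make the quantity concrete, the following minimal sketch computes the two means and their separation for a single query. It is an illustration under stated assumptions, not a definitive implementation: the function name `mean_separation`, the list-based inputs, and the sample scores are all hypothetical, and the sketch uses only the raw difference of the means described above, taken as relevant minus irrelevant so that a negative result flags a ranking function that scores irrelevant documents higher.

```python
from statistics import mean


def mean_separation(scores, relevant):
    """Return (mean_relevant, mean_irrelevant, separation) for one query.

    scores   -- values of r(Q, d), one per document (hypothetical input format)
    relevant -- booleans, True where the corresponding document is relevant
    The separation is the signed difference mean_relevant - mean_irrelevant.
    """
    rel = [s for s, is_rel in zip(scores, relevant) if is_rel]
    irr = [s for s, is_rel in zip(scores, relevant) if not is_rel]
    if not rel or not irr:
        # Both distributions must be non-empty for their means to exist.
        raise ValueError("need at least one relevant and one irrelevant document")
    mean_rel = mean(rel)
    mean_irr = mean(irr)
    return mean_rel, mean_irr, mean_rel - mean_irr


# Made-up scores for five documents; the first two are marked relevant.
scores = [4.0, 3.0, 2.0, 1.0, 0.0]
relevant = [True, True, False, False, False]
print(mean_separation(scores, relevant))  # (3.5, 1.0, 2.5)
```

Under this reading, a ranking function that produced a larger separation on the same documents would be judged better by the model.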