
Library Guide Series

Relevancy Searches and Displays

More and more search systems, especially web search engines, incorporate relevancy ranking of results. In a traditional boolean search, several large sets can be combined to produce one small result: only the documents that fit the specified criteria are kept, and they are usually sorted by date. A full relevancy search can produce very large result sets, because all the matching documents are kept and then sorted by a formula related to the query you entered.

What the relevancy ranks really mean is that most of the top-ranked documents will be of interest to you, while only a few of the lowest-ranked documents will be. Note that ranking does not bring all the good articles to the top; it only puts a higher percentage of them there.

Relevancy ranking of bibliographic and textual material is usually based on word frequency: the more often the search terms appear, the higher the ranking. This is fine-tuned in any number of ways, such as including factors based on the number of times a word appears in the entire database and in the document, and on the proximity of the search words to each other.
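As a rough illustration of frequency-based ranking, here is a minimal Python sketch. It assumes a simple weighting in which each search term contributes its count in the document multiplied by a rarity factor taken from the whole database. The function name `score`, the sample records, and the exact formula are assumptions made for this example; real systems use their own formulas, and proximity is not modelled here.

```python
import math
from collections import Counter

def score(query_terms, doc_tokens, doc_freq, n_docs):
    """Sum, over the query terms, of (count in document) x (rarity in database)."""
    counts = Counter(doc_tokens)
    total = 0.0
    for term in query_terms:
        tf = counts[term]  # how often the term appears in this document
        # Words found in few documents weigh more; the +1 avoids division by zero.
        idf = math.log((n_docs + 1) / (doc_freq.get(term, 0) + 1))
        total += tf * idf
    return total

# Hypothetical three-record database, already split into words.
docs = {
    "a": "zebra mussels in lake erie and zebra mussel control".split(),
    "b": "water quality in lake erie".split(),
    "c": "zebra habitat in africa".split(),
}
# Number of documents in which each word appears at least once.
doc_freq = Counter(word for tokens in docs.values() for word in set(tokens))

query = ["zebra", "erie"]
ranked = sorted(
    docs,
    key=lambda d: score(query, docs[d], doc_freq, len(docs)),
    reverse=True,
)
print(ranked[0])  # record "a" contains both terms, one of them twice
```

Record "a" comes first because it matches both search terms; the other two records each match only one term and score lower.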

What a Relevancy Search Looks Like

While there are some hybrid systems, generally you no longer have the boolean AND or OR. Instead you define which words must be present, which are important, and which must be excluded. You can often truncate. For example:

Feature               TRELLIS              Alta Vista or Google   EI Compendex
                      Keyword anywhere     Basic searches         Simplified - basic
Word Must be Present  +                    +
Word is Important     *                    n.a.                   n.a.
Phrase                "word word"          "word word"            "word word"
Not                   !                    -                      n.a.
Truncation            ?                    *                      *
Parentheses           n.a.                 n.a.                   Use separate query boxes
Example               "lake erie" +zebra   "lake erie" +zebra     "lake erie" zebra
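To make the operators concrete, the following Python sketch splits a query written in the Alta Vista / Google style (+ for a required word, - for an excluded word, double quotes for a phrase) into its parts. The function name `parse_query` and the three category labels are invented for this illustration; the sketch does not handle the TRELLIS symbols (! and *) or truncation.

```python
import re

def parse_query(query):
    """Split a query into required, excluded, and optional terms."""
    parsed = {"required": [], "excluded": [], "optional": []}
    # Each token is an optional + or - prefix followed by either
    # a quoted phrase or a run of non-space characters.
    for prefix, phrase, word in re.findall(r'([+-]?)(?:"([^"]*)"|(\S+))', query):
        term = (phrase or word).lower()
        if prefix == "+":
            parsed["required"].append(term)
        elif prefix == "-":
            parsed["excluded"].append(term)
        else:
            parsed["optional"].append(term)
    return parsed

print(parse_query('"lake erie" +zebra'))
# {'required': ['zebra'], 'excluded': [], 'optional': ['lake erie']}
```

The phrase is kept together as one term, and the word marked with + is the only one the search would insist on.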

What the Results Look Like

Each record often comes with a percent figure, which indicates how well the record fits the query.

If you were to work your way through the entire list you would find the first few might be the equivalent of boolean ANDing all the search keys together; further along more terms are dropped until, at the end, you have documents each containing only one of the search keys, as though all the terms had been ORed together.
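The pattern described above can be sketched in Python by ranking records on nothing more than the number of search keys they contain. This is a deliberate simplification (an assumption for illustration, not how any particular system scores), and the record identifiers and texts are made up.

```python
def rank_by_matches(query_terms, docs):
    """Return (doc_id, matched_count) pairs, best first, dropping non-matches."""
    results = []
    for doc_id, text in docs.items():
        words = set(text.lower().split())
        matched = sum(1 for term in query_terms if term in words)
        if matched:
            results.append((doc_id, matched))
    # sorted() is stable, so records with equal counts keep their input order.
    return sorted(results, key=lambda pair: pair[1], reverse=True)

# Hypothetical records.
docs = {
    "d1": "wine sulfur residue analysis",
    "d2": "sulfur residue from coal",
    "d3": "wine tasting notes",
    "d4": "sulfur in wine and residue in barrels",
    "d5": "beer brewing",
}
ranking = rank_by_matches(["wine", "sulfur", "residue"], docs)
print(ranking)
# [('d1', 3), ('d4', 3), ('d2', 2), ('d3', 1)]
```

The records at the top contain every search key, as an AND search would require; the record at the bottom contains only one, as an OR search would allow; the record matching none of the keys is not returned at all.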

Preparing for a Relevancy Search

Preparation is pretty much the same as for a boolean search, except that I would leave out long lists of synonyms, because an article that happens to contain many of the synonyms will be brought to the top. This often means you will have to repeat the search several times to pick up all synonyms and variant spellings. So rather than
wine and (sulfur or sulphur)
You may need to search:
wine sulfur
Then
wine sulphur

The above is a simple example, but it illustrates how a complicated search might have to be broken up. A longer search, such as (fuel or fuels or coal or oil or "natural gas" or hydrogen or electric*) and ("greenhouse gas" or "carbon dioxide" or methane), could become unmanageable.
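To see how quickly the number of separate searches grows, here is a short Python sketch that takes one term from each group of synonyms and builds every combination. The helper name `expand_queries` is made up for this example.

```python
from itertools import product

def expand_queries(groups):
    """Build one simple query per combination, taking one term from each group."""
    return [" ".join(combo) for combo in product(*groups)]

queries = expand_queries([["wine"], ["sulfur", "sulphur"]])
print(queries)
# ['wine sulfur', 'wine sulphur']

# The longer search: 7 fuel terms x 3 gas terms.
long_queries = expand_queries([
    ["fuel", "fuels", "coal", "oil", '"natural gas"', "hydrogen", "electric*"],
    ['"greenhouse gas"', '"carbon dioxide"', "methane"],
])
print(len(long_queries))
# 21
```

Two searches are easy to run by hand; twenty-one are not, which is why the longer search becomes unmanageable.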

Relevancy searches work best with larger documents as opposed to brief citations. Read the instructions and help pages carefully: each of these systems works a bit differently.

Remember: one way or the other, the results of the search are still based on the occurrence of words in the records and the meanings you attach to them. If you don't ask the right question, the computer won't find the answer. Relevancy ranking is just a way of sorting the results.

June 20, 2005



Librarian, Information Services and Resources
Last Updated: October 7, 2004