Better Living Through…

Clearly, Yahoo knows the keys to better living.

What I find interesting about these lists from an IR perspective is what information they leak about the ranking algorithms. Before I get into that, let me explain the following tables. For each search engine, the first column contains what the type-ahead lists on the website (as shown in the screenshots). The second column contains what the type-ahead drop-down in Safari’s toolbar lists. For Yahoo, the third column lists what is given if you type a space after the “through”. For Google and Bing, the space did nothing.


Google (website, Safari toolbar):
Dentistry Dentistry
Design Design
Chemistry Sewing
Sewing Chemistry
Chemistry Lyrics
Chemistry Movie
Catastrophe Lyrics


Bing (website, Safari toolbar):
Chemistry Chemistry
Design Design
Chemistry Lyrics Chemistry Lyrics
Killing Killing
Beowulf Beowulf
Coffee Coffee
Mathematics Mathematics


Yahoo (website, Safari toolbar, after typing a space):
Chemistry Chemistry Chemistry
Design Design Design
Circuitry Circuitry Circuitry
Killing Killing Killing
Dentistry Dentistry Dentistry
Chemistry Lyrics Chemicals
Hypnosis Technology
Chocolate Software
Better Information Sarcasm
Recreational Sims

The first thing I noticed was that Google ranks its suggestions differently in the toolbar than on the web page. Also, Google is really emphasizing local search: “Better Living Through Dentistry” is a dentist in San Francisco. Putting it first is really strange, since the most famous (and frequently parodied) phrase is “Better Living Through Chemistry.”

Surprisingly, Bing and Yahoo aren’t returning the same results, even though Bing powers both searches. I know Yahoo Search still exists, but apparently they’re still doing custom ranking. Also, Yahoo is using their own ranking to drive the type-ahead results. Yahoo isn’t tokenizing their type-ahead queries, while Bing and Google are tokenizing theirs on a word basis. Otherwise, typing a space wouldn’t generate all-new results for Yahoo.
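
To make the tokenization point concrete, here is a minimal sketch of the two ways a type-ahead service might turn the typed text into a lookup key. The function names are hypothetical and this is only a guess at the mechanism, not how any of these engines is actually implemented.

```python
def raw_key(typed):
    """Use the typed text as-is (lowercased), so a trailing space matters."""
    return typed.lower()


def token_key(typed):
    """Split into words first, so a trailing space disappears."""
    return tuple(typed.lower().split())


without_space = "Better Living Through"
with_space = "Better Living Through "

# Untokenized: the two inputs are different keys, so the suggestions can differ.
print(raw_key(without_space) == raw_key(with_space))
# Tokenized by word: both inputs collapse to the same key.
print(token_key(without_space) == token_key(with_space))
```

Under that assumption, an engine keyed on the raw string would be free to return a completely different list once the space is typed, which matches what Yahoo did.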

Since I had a set of differing lists, I decided to combine them into a single ranked list. To do this, I ordered the terms by the average rank at which they appeared. When a term did not appear in a list, I used the rank (MaxRankForEngine + 1) + (MaxRankForEngine / NumTermsUnseen). I’m not sure that is the best way to combine federated search results, but since this is just a blog post, I’m not too worried about it.
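
The averaging scheme above could be sketched roughly as follows. This is an illustration under assumptions, not the script used for this post: I am taking MaxRankForEngine to be the length of a given list and NumTermsUnseen to be the number of terms from the combined vocabulary that the list is missing. The function name and the toy input are made up.

```python
from collections import defaultdict


def combine_rankings(lists):
    """Merge ranked lists into one list ordered by average rank.

    `lists` maps a list name to its terms in rank order (best first).
    """
    all_terms = {term for terms in lists.values() for term in terms}
    totals = defaultdict(float)

    for terms in lists.values():
        max_rank = len(terms)  # assumed meaning of MaxRankForEngine
        seen = {term: rank for rank, term in enumerate(terms, start=1)}
        num_unseen = len(all_terms) - len(seen)  # assumed NumTermsUnseen
        for term in all_terms:
            if term in seen:
                totals[term] += seen[term]
            else:
                # Penalty rank for a term this list never suggested.
                totals[term] += (max_rank + 1) + (max_rank / num_unseen)

    averages = {term: total / len(lists) for term, total in totals.items()}
    # Break ties alphabetically so the output is deterministic.
    return sorted(averages, key=lambda term: (averages[term], term))


# Toy input, not the real suggestion lists.
toy = {
    "engine_a": ["Chemistry", "Design", "Killing"],
    "engine_b": ["Design", "Chemistry", "Coffee"],
}
print(combine_rankings(toy))
```

Each column in the tables would be passed in as its own list.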

Bold entries are unique.

  1. Chemistry
  2. Design
  3. Killing
  4. Circuitry
  5. Chemistry Lyrics
  6. Dentistry
  7. Sewing
  8. Beowulf
  9. TV
  10. Chemicals
  11. Coffee
  12. Chemistry Movie
  13. Technology
  14. Hypnosis
  15. Catastrophe Lyrics
  16. Software
  17. Chocolate
  18. Mathematics
  19. Chems
  20. Sarcasm
  21. Better Information
  22. Economics
  23. Sims
  24. Recreational