Chinese Proper Noun Lexicon: Baidu Has It, Google Doesn’t

Posted by Ryan on 2011-03-03

Baidu has long claimed that it does a better job than Google at processing Chinese search queries. I will reserve comment on that. But there is one thing I know for sure that Baidu has and Google doesn’t: a Chinese proper noun lexicon.

Today I am going to show you a very simple example, since most of you may not understand Chinese. If you do, the article can be even more fun, because you can try your own searches based on what I did.

The sample Chinese search phrase is “吸血蝙蝠侠”

The phrase consists of three Chinese words, “吸血(blood-sucking)”, “蝙蝠(bat)” and “侠(man)”, which a search engine could group into query terms in three possible ways:

  1. blood-sucking bat / man
  2. blood-sucking / batman
  3. blood-sucking / bat / man
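
To make the three groupings above concrete, here is a minimal Python sketch that enumerates every way of merging adjacent words and keeps the ones that still split the phrase. The function name `groupings` is my own for illustration; it only shows the ambiguity and says nothing about how either search engine actually works.

```python
from itertools import product

def groupings(words):
    """Yield every way to merge adjacent words into contiguous groups."""
    # One True/False choice per gap between words: True means "cut here".
    for cuts in product([False, True], repeat=len(words) - 1):
        groups, current = [], words[0]
        for word, cut in zip(words[1:], cuts):
            if cut:
                groups.append(current)
                current = word
            else:
                current += word
        groups.append(current)
        yield groups

words = ["吸血", "蝙蝠", "侠"]  # blood-sucking, bat, man

# Drop the single case where everything is merged into one term.
candidates = [g for g in groupings(words) if len(g) > 1]

for g in candidates:
    print(" / ".join(g))
```

The three lines printed correspond, in order, to the three readings in the list above.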

After running the search on Google and Baidu respectively, you can see that the two engines treat the phrase very differently.

In Google’s SERPs, it is obvious that Google treats the three Chinese words as separate terms. Baidu, on the other hand, combines “蝙蝠(bat)” and “侠(man)” into the proper noun “蝙蝠侠(batman)”.

This means Baidu has a Chinese proper noun lexicon within its system, which lets it identify such proper nouns and treat them specially.
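
As a rough illustration of how a proper noun lexicon can change the result, here is a small Python sketch using greedy forward maximum matching, a textbook dictionary-based segmentation technique. This is only an assumption for demonstration: I do not know what algorithm Baidu or Google really uses, and the tiny word lists (`common_words`, `proper_nouns`) and the `segment` function are made up for this example.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching.

    At each position, take the longest substring (up to max_len characters)
    found in the lexicon; fall back to a single character if nothing matches.
    """
    result, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                result.append(piece)
                i += size
                break
    return result

# Hypothetical toy lexicons, for illustration only.
common_words = {"吸血", "蝙蝠", "侠"}   # blood-sucking, bat, man
proper_nouns = {"蝙蝠侠"}               # Batman

query = "吸血蝙蝠侠"
without_lexicon = segment(query, common_words)
with_lexicon = segment(query, common_words | proper_nouns)

print(without_lexicon)  # three separate words
print(with_lexicon)     # "蝙蝠侠" kept together as one term
```

With only the common words, the phrase falls apart into three terms, which is what the Google results look like. Once “蝙蝠侠” is in the lexicon, the longest match wins and the proper noun stays together, which is what the Baidu results look like.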

As I said, if you understand Chinese, you can run some searches and compare the results on your own, which can be a lot of fun. 🙂

