My name is Mike. I work at a custom ladies’ shoe company, and these are some things I find inspiring or interesting in the moment. You can also find me on Twitter as @mikeee and at 22michaels.com.

Mon, May 12th

Powerset launches beta

Powerset has launched a beta, searching the Wikipedia corpus.

I was keen to try the service out, mainly because of the huge amount of buzz this company has been getting of late. I tried the first three random queries that came into my head.

The first query I tried was “how many senators in the us senate”. The results were fine, although the snippet for the most relevant result didn’t match my query.

That said, what’s really interesting is the snippet almost matches Google’s snippet.

Powerset says: 

No senator has been expelled since, although many senators have chosen to resign when faced with expulsion proceedings (for example, Bob Packwood in 1995). … US Senate, Congressional Research Service. 

whilst Google’s snippet for the same query is:

No senator has been expelled since, although many senators have chosen to resign ….. 1801-1850, November 16, 1818: Youngest Senator. United States Senate. …

Clearly both engines have some room for optimization. A better snippet would be:

In the Senate, each state is represented by two members. Membership is therefore based on the equal representation of each state, regardless of population, for a total membership of 100. 
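One guess at why both engines land on the same unhelpful sentence (this is my speculation, not anything either company has documented) is that plain term overlap favors it. Here is a minimal sketch, assuming a snippet picker that just counts distinct query terms appearing verbatim in a sentence. The stopword list and the function names are made up for illustration.

```python
import re

# Hypothetical, tiny stopword list; real engines use far larger ones.
STOPWORDS = {"how", "in", "the", "of", "a", "is", "by", "for"}

def terms(text):
    """Lowercase word tokens with stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def overlap_score(query, sentence):
    """Count distinct query terms that appear verbatim in the sentence."""
    return len(terms(query) & terms(sentence))

query = "how many senators in the us senate"

# The sentence both engines showed (shortened from the snippets above).
expelled = ("No senator has been expelled since, although many senators "
            "have chosen to resign when faced with expulsion proceedings")

# The start of the snippet I would rather have seen.
better = "In the Senate, each state is represented by two members."

print(overlap_score(query, expelled))  # matches "many" and "senators"
print(overlap_score(query, better))    # matches only "senate"
```

Under this naive scoring the expulsion sentence beats the one that actually answers the question, because “many” and “senators” both appear in it word for word. Picking the better snippet takes some understanding of what the question is asking, not just which words it contains.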

The next query was “gestation of elephants”. Again, the snippet wasn’t relevant:

An elephant carrying Thidambu during Thrissur Pooram festival in Kerala, south India. … Lions are the only known natural predators of elephants.

… but when I clicked through, it had highlighted the best sentence that answered my question. Neat.

Google, on the other hand, has the answer in the snippet:

Elephants are mammals, and the largest land animals alive today. [1] The elephant’s gestation period is 22 months, the longest of any land animal. …

Finally, a query for the number of employees at facebook produced a good hit in the snippet on Powerset. (Facebook had 500 employees as of March, apparently.) Google, on the other hand, didn’t get this right in the snippet.

It’s clever of Powerset to launch whilst only searching Wikipedia. It allows them to show off their search technology without actually searching the web. I assume they’re betting this might attract a potential acquisition partner to step up to the plate.

The real test of Powerset will be once they turn on web crawling. The hardest part of developing a search engine is, undoubtedly, establishing the authority of each document. Google achieves this, in part, through PageRank. This is what makes Google so incredible when it comes to weeding out spam and other terrible results. Keyword search, even “natural language” search where you are stemming keywords, isn’t really that spectacular a feat when you are searching an index that is already authoritative, like Wikipedia.
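To make the authority idea concrete, here is a minimal sketch of the textbook PageRank power iteration. It is not Google’s production ranking, and I have no idea how Powerset plans to score documents. The four-page link graph, the damping factor of 0.85 and the iteration count are all assumptions picked for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> rank, with ranks summing to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Hypothetical link graph: nobody links to "blog" or "spam".
links = {
    "wiki": ["news"],
    "news": ["wiki"],
    "blog": ["wiki", "news"],
    "spam": ["wiki"],
}

ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.4f}")
```

The pages nobody links to end up with only the baseline score, however many keywords they stuff into their text. That link-based signal is what a Wikipedia-only index never has to worry about.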

I guess we’ll have to wait to see how Powerset behaves when it indexes the entire web….

(Disclaimer: yes, I work for a competing search engine. But I don’t work on the search engine itself, and I only know as much as you do about how it works. This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.)