Hashemian Blog

Web Tools, Financial Markets, Technology

Saturday, July 19, 2008

Yahoo Slurp Versus Googlebot 

As part of my early morning routine at work, I scan a daily notification on our web site's visitor patterns, looking specifically for abusive or abnormal page requests that point to rogue hosts. I wrote the program years ago, and it has become one of my most effective tools for identifying abuse.
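To give a rough idea of what such a scan involves, here is a minimal sketch, not the actual program. It assumes each log line starts with the requesting host as its first whitespace-separated field; the function name `flag_heavy_hosts` and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter

def flag_heavy_hosts(log_lines, threshold):
    """Return (host, count) pairs whose request count exceeds threshold.

    Assumption: the host is the first whitespace-separated field of each
    log line. Blank lines and lines starting with '#' are skipped.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if not fields or line.startswith("#"):
            continue
        counts[fields[0]] += 1
    # most_common() orders hosts from the highest request count down.
    return [(host, n) for host, n in counts.most_common() if n > threshold]
```

A real report would likely also look at what was requested and how quickly, but a per-host count is usually the first thing to check.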

No surprise that search engine robots (spiders or crawlers) always top the list. Spiders are necessary evils. You want them crawling and indexing all your pages frequently, but that leads to spiders with voracious appetites that can rob the infrastructure of performance.

Speaking of voracious appetites, for years Googlebot topped our crawl volume list, far ahead of the second position, which was usually (but not always) occupied by Yahoo Slurp (Yahoo's spider). As of a few weeks ago I have noticed that Yahoo Slurp's crawl volume is surpassing that of Googlebot. For the first few days I considered it an anomaly, but the levels have remained consistent, knocking Googlebot out of first position. Googlebot still registers more or less the same numbers as before, but Yahoo Slurp has been consistently beating them, sometimes by a factor of two. That's a considerable spike in Yahoo's crawl activity.

I find it interesting that Yahoo's crawl volume started to rise in the midst of its takeover battle with Microsoft. Search engines have always competed on who has the most indexed pages. It's a bragging right more than a practical or useful measure of search accuracy. But accuracy in search results is subjective, while page volumes are concrete numbers, and that's why they are sometimes used as key differentiators between competing search engines. Still, the timing of the jump in Yahoo's crawling activity is intriguing. Could the motive be to enhance its index size (considered a key asset), thereby adding to the company's value? One wonders.

Meanwhile, sites need to consider the repercussions of the battle between the search engine titans, as the increased crawl rates could put additional pressure on their infrastructures. One way to mitigate the effects is to set up mirror servers and deflect the spider/robot traffic to them using application-aware appliances, but there's some effort involved and there's room for error, as the servers need to be synchronized in real time, especially for sites such as news sites that continuously publish new pages and fresh content. The other consideration is bandwidth management to accommodate increased traffic. At the least, faster servers and fatter pipes are in order. As a reference, here are the current signatures left behind by the Google and Yahoo crawlers:

Googlebot's user-agent:
Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)

Yahoo Slurp's user-agent:
Mozilla/5.0+(compatible;+Yahoo!+Slurp;+http://help.yahoo.com/help/us/ysearch/slurp)
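For anyone who wants to tally crawl volume from their own logs, the two signatures above can be matched with a simple substring check. The following is a minimal sketch under a few assumptions: the marker substrings are taken from the two strings quoted above, the names `identify_crawler` and `crawl_volume` are hypothetical, and spaces are normalized to "+" so the check works whether or not the log writes the user-agent in the "+"-separated form shown here. Keep in mind that a user-agent string can be faked, so a match is only a first-pass indication.

```python
from collections import Counter

# Marker substrings taken from the two user-agent strings quoted above.
CRAWLER_MARKERS = {
    "Googlebot": "googlebot",
    "Yahoo Slurp": "yahoo!+slurp",
}

def identify_crawler(user_agent):
    """Return the crawler name for a user-agent string, or None."""
    # Normalize spaces to '+' and ignore case before matching.
    ua = user_agent.replace(" ", "+").lower()
    for name, marker in CRAWLER_MARKERS.items():
        if marker in ua:
            return name
    return None

def crawl_volume(user_agents):
    """Count requests per known crawler across user-agent strings."""
    counts = Counter()
    for ua in user_agents:
        name = identify_crawler(ua)
        if name is not None:
            counts[name] += 1
    return counts
```

The same check could feed a routing decision, such as sending recognized crawler traffic to a mirror server, though how that is wired up depends entirely on the appliance or server in use.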


Monday, July 07, 2008

The ANWR Oil Debate 

Lately I've seen a number of Web sites and emails exhorting Americans to press their government to explore for oil in a parcel of land in ANWR, the Arctic National Wildlife Refuge in the northeast corner of Alaska.

The ANWR debate is nothing new; it was a hot topic back during the oil crisis of the 1970s. Proponents argue that this location may hold vast deposits of oil which could bring relief to the current oil shortages and thus tame the high prices in the US and the rest of the world. Opponents include the natives and environmentalists, who fear that such exploration and subsequent drilling could endanger this natural setting, robbing it and its inhabitants (humans and animals) of their ecological diversity and wealth.

I am not sure which side of the debate holds the better argument. Personally I don't like to see natural settings overrun by industrial concerns, i.e. the oil companies. What I do know is that Exxon-Mobil's earnings topped $40 billion in 2007, and surely they will easily exceed that figure this year. As people learn of these outlandish profits by big oil while their own savings are being squeezed, there is bound to be some backlash. That has manifested itself in the form of calls for special taxes on oil companies.

Couple that with the inevitable emergence of fuel economics and alternative energy, and it's not hard to guess that oil companies are worried about their prospects. To me, these pro-oil-exploration campaigns are not about alleviating oil shortages, but more about distracting the public from the abuses of the oil companies. Many go even further, shifting the spotlight away from big oil and casting it on liberals, Democrats, or Arabs in a shameless effort to create public sympathy and support for the oil companies. One wonders who the real authors are.

Oil companies already have millions of acres of land they can start exploring, but that's not enough. ANWR may hold large reserves of oil, but this campaign smells more like a land-grab and less like a sincere effort to help calm the oil crunch. Blaring their propaganda machine in times of panic and despair is always a good way to assure power and profits.

Review history and see how dictators and tyrants have come to power. Their reigns have almost always been preceded by periods of unrest and panic, when people are at their most vulnerable and can be easily deceived by empty promises and blustering rants. Once they tap into the herd mentality, they are assured of their golden positions. I want a way out of this oil mess too, but not enough to sell out this country to oil thugs. They're beyond rich enough already.


    © 2001-2009 Robert Vahid Hashemian