Swati Lathia

Learning ways

Session IDs in URLs: Crawler Confusion

Session IDs are most common on e-commerce sites. They are embedded in the URL so that the website can track a user from page to page, for example to keep track of the items in a consumer’s shopping cart.

But these IDs cause problems for search engine crawlers: because each visit gets a fresh ID, the same page shows up under a large number of different links for the spider to crawl. The search engine can end up indexing essentially the same page over and over. Search engines like Google refer to this as a ‘spider trap’, which we will discuss later on.

The example below shows how session IDs can give the appearance of an endless number of pages within a single site. A crawler coming to your website may find a page with the following URL:

http://www.yoursite/shop.cgi?id=dkom2354kle03i

This page gets indexed, but when the spider returns later to look for new content, it finds the following:

http://www.yoursite/shop.cgi?id=hj545jkf93jf4k

This is actually the same page as before, just with a different session ID, but the spider sees it as a new URL. Because of this confusion, search engine spiders are programmed to avoid pages containing these session IDs.
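
To make the duplication concrete, here is a minimal Python sketch of how the two URLs above collapse into one once the session parameter is ignored. It is only an illustration, not how any particular search engine works: the function name `strip_session_id` and the list of parameter names in `SESSION_PARAMS` are assumptions made for this example, and treating `id` as a session parameter only fits the sample URLs (on many real sites `id` identifies a product).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed (hypothetical) names of query parameters that carry a session ID.
# "id" is included only because the sample URLs in this post use it.
SESSION_PARAMS = {"id", "sid", "sessionid", "phpsessid", "jsessionid"}


def strip_session_id(url: str) -> str:
    """Return the URL with any session-ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


first_visit = "http://www.yoursite/shop.cgi?id=dkom2354kle03i"
later_visit = "http://www.yoursite/shop.cgi?id=hj545jkf93jf4k"

# Compared as raw strings, the crawler sees two different pages.
print(first_visit == later_visit)  # False

# With the session ID stripped, both visits point at the same page.
print(strip_session_id(first_visit) == strip_session_id(later_visit))  # True
```

A crawler that compares raw URLs counts every new session ID as a new page, which is exactly the endless-pages effect described above.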
