Scraping vs Indexing of IDX data

by THE WAV GROUP on May 19, 2009

MIBOR, the Metropolitan Indianapolis Board of REALTORS, has recently responded to a complaint regarding a violation of its IDX rules and regulations, specifically the “anti-scraping” regulations.

MIBOR responded to the complaint correctly, ruling that an IDX property search page which allows data to be copied and re-purposed is in violation of its rules.  I begin this post with this acknowledgement to make clear that MIBOR is in no way behaving badly.  Indeed, it is enforcing its published rules and regulations as written.

This issue has created a very difficult situation for MIBOR, NAR, and the unfortunate member of MIBOR who is in violation of the rules.  Clearly the member had no intention of violating the rules; it is a case of confusion between the terms “scraping” and “indexing” as they apply to an agent’s real estate listing website.

Here are some common definitions of the terms:

Scraping is a technique in which a computer program extracts data from the display output of another program.

An index is any data structure which improves the performance of look-up.  Indexing is the task of creating that data structure using spiders.

Spiders are programs that automatically fetch data to feed search engine results.
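To make the distinction concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two listing pages, the agent.example URLs, the `price` markup, and the function names are all hypothetical, and real search engines are far more elaborate. The indexing function builds a look-up structure that maps words to the pages containing them, while the scraping function pulls one specific field out of the page's display output.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical listing pages keyed by made-up URLs (illustration only).
PAGES = {
    "https://agent.example/listing/1": (
        "<html><body><h1>123 Main St</h1>"
        "<span class='price'>$250,000</span>"
        "<p>Three bedroom ranch near downtown</p></body></html>"
    ),
    "https://agent.example/listing/2": (
        "<html><body><h1>45 Oak Ave</h1>"
        "<span class='price'>$310,000</span>"
        "<p>Two bedroom condo near downtown</p></body></html>"
    ),
}


class TextExtractor(HTMLParser):
    """Collects the visible text of a page, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def page_text(html):
    """Return the text content of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def build_index(pages):
    """Indexing: map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, html in pages.items():
        for word in re.findall(r"[a-z0-9]+", page_text(html).lower()):
            index[word].add(url)
    return index


def scrape_price(html):
    """Scraping: extract one field (the price) from the display output."""
    match = re.search(r"<span class='price'>([^<]+)</span>", html)
    return match.group(1) if match else None


index = build_index(PAGES)
# Look-up: which pages mention "condo"? The answer is a pointer back to the site.
condo_pages = index["condo"]
# Extraction: the listing data itself is copied out of the page.
first_price = scrape_price(PAGES["https://agent.example/listing/1"])
```

Note that both functions start from the same fetched page. The index answers a look-up with a link back to the agent's site, while the scraper copies the listing data out so it can be shown somewhere else.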

In this case, Google has indexed listing information on a real estate agent’s website in a way that easily allows a consumer to access that information with a click-through.  At issue here is the notion that Google has scraped components of the listing information and displayed that in the search result which links to the agent’s IDX compliant listing detail page.  As such, Google has both “indexed” and “scraped” the data.

I applaud NAR for stepping up to tackle this issue in a way that serves the best interests of the member, broker, Association of REALTORS, and the MLS.  Modifying this rule to allow for indexing while limiting scraping will take some highly technical lawyering.  Indeed, the difference between scraping and indexing is a matter of use case and intent rather than technique.  The legal bill for sorting this out is likely to be a big one :-)

What I believe is more at issue here is the notion that third-party websites like Trulia and Zillow do not need to adhere to these rules, giving them a huge advantage in winning the eyes of consumers through search engine optimization techniques that agents and brokers are forbidden from pursuing.
