Scrapers, Robots and Spiders: The Battle Over Internet Data Mining
When American Airlines sued Farechase, Inc. in federal district court in Texas earlier this year, claiming that Farechase’s “screen-scraping” of AA’s flight information from AA.com was illegal, it was only the most recent in a series of cases challenging unauthorized data collection from Internet web sites. What practices are encompassed by “screen-scraping”? Is “scraping” really illegal? What does this line of cases mean for your business?
What is “Screen-Scraping”? Despite its pejorative name, screen-scraping software simply gathers and aggregates data from other Internet web sites for use by the gathering party. Usually, the purpose is to reformat the data and display it for the benefit of the gathering party’s customers. Examples of data aggregation range from sites that collect prices on retail sites to companies that aggregate personal financial data on mutual fund and banking web sites, allowing registered users to access information about multiple accounts on a single web site.
The software that performs this function, often referred to as a robot (or “bot”), “spider” or “crawler”, automatically searches Internet web sites for specific information. The Farechase case provides one example of how this technology works. Farechase’s customers are travel agencies. When a customer uses Farechase to research a particular airline, hotel or car rental fare, Farechase’s software will search different airline sites and collect the “webfares” offered. A popular site such as AA.com might be searched thousands of times a day in response to queries initiated by Farechase customers. Farechase’s real time search technology is an advance on more traditional data mining, in which companies search sites on a regular basis and maintain a separate database that may be queried by users. In the case of sites selling books or music, a real time search may not be essential, as long as the database is updated frequently. Farechase took this concept one step further by allowing its users to search for fares offered at the very moment the search is conducted, thus assuring that the results would be current.
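The difference between the two approaches can be illustrated with a minimal Python sketch. It is not Farechase's actual software: the class names, the `FakeSite` stand-in for a vendor web site, and the sample route and fares are all hypothetical, chosen only to show why a periodically refreshed database can return stale results while a real time search cannot, and why the real time approach sends one request to the target site for every user query.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class FakeSite:
    """Hypothetical stand-in for a vendor web site; counts the requests it receives."""

    def __init__(self, fares: Dict[str, float]) -> None:
        self.fares = dict(fares)
        self.requests = 0

    def lookup(self, route: str) -> float:
        self.requests += 1
        return self.fares[route]


@dataclass
class CachedAggregator:
    """Traditional data mining: crawl on a schedule, answer from a local database."""

    source: Callable[[str], float]
    routes: List[str]
    database: Dict[str, float] = field(default_factory=dict)

    def refresh(self) -> None:
        # One request per route per refresh, no matter how many users query later.
        for route in self.routes:
            self.database[route] = self.source(route)

    def query(self, route: str) -> float:
        # May be out of date if the vendor changed the fare after the last refresh.
        return self.database[route]


@dataclass
class RealTimeAggregator:
    """Real time search: every user query triggers a fresh request to the site."""

    source: Callable[[str], float]

    def query(self, route: str) -> float:
        return self.source(route)


if __name__ == "__main__":
    site = FakeSite({"DFW-BOS": 300.0})
    cached = CachedAggregator(source=site.lookup, routes=["DFW-BOS"])
    cached.refresh()
    live = RealTimeAggregator(source=site.lookup)

    site.fares["DFW-BOS"] = 250.0  # the vendor lowers the fare after the crawl

    print("cached:", cached.query("DFW-BOS"))  # still the fare seen at refresh time
    print("live:  ", live.query("DFW-BOS"))    # the fare at the moment of the query
    print("requests sent to site:", site.requests)
```

The trade-off is visible in the request counter: the cached design loads the target site only when it refreshes, while the real time design loads it in proportion to user demand.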
Needless to say, one’s views on this type of data mining depend largely on whether one is the scraper or scrapee. The targets of this practice, such as American Airlines, complain that the constant traffic resulting from scraping puts an added load on their Internet servers, slowing down their response times for legitimate users. In the Farechase case American Airlines claimed that if left unchecked, Farechase would be performing over 200,000 daily searches by the end of 2003. Moreover, it argued that by allowing customers to access web fares by going directly to the American booking pages at AA.com, American is unable to establish the relationship with its customers that would occur if customers were required to navigate through AA.com’s preliminary pages, thereby costing American customer good will. On the other hand, companies like Farechase argue that their service encourages comparison shopping, and that companies that oppose it are afraid of the competition (and the lower prices) that result.
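To put the 200,000 figure in perspective, the short calculation below converts it to an average request rate. It assumes, purely for illustration, that the searches are spread evenly over a 24-hour day; real traffic would be concentrated in peak hours, so the instantaneous load could be considerably higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate_per_second(daily_requests: int) -> float:
    """Average requests per second, assuming an even spread over the day."""
    return daily_requests / SECONDS_PER_DAY


rate = average_rate_per_second(200_000)
print(f"{rate:.2f} searches per second on average")  # prints 2.31
```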
Technological Defenses. Before discussing the legalities of screen-scraping, it is worth pointing out that companies who are targeted by this practice and who object to it often undertake a measure of “self-help” before authorizing their lawyers to file suit. Such self-help sometimes leads to a technological battle worthy of a William Gibson novel. The defenders attempt to identify and block the Internet Protocol (IP) addresses of the attackers. The attackers respond by hiding or disguising their scrapers’ identities by using fake IP addresses, thereby evading the blocking firewalls. The attackers, not easily discouraged, seem to have a limitless supply of disguises, perpetuating this high tech cat-and-mouse game. In several cases the attackers have prevailed. As a result, several of these disputes have ended up in the courts.
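The cat-and-mouse dynamic can be sketched in a few lines of Python. This is an illustrative model only, not a description of any party's actual defenses: the `IPBlocker` class, the per-address request threshold, and the sample addresses (drawn from ranges reserved for documentation) are all assumptions. The article does not say how defenders identified scrapers, so the sketch simply blocks any address that exceeds a request limit, then shows how a scraper that presents many different addresses stays under that limit.

```python
from collections import Counter
from itertools import cycle
from typing import Iterable


class IPBlocker:
    """Hypothetical defender: blocks an address once it exceeds a request limit."""

    def __init__(self, max_requests: int) -> None:
        self.max_requests = max_requests
        self.counts: Counter = Counter()
        self.blocked: set = set()

    def allow(self, ip: str) -> bool:
        if ip in self.blocked:
            return False
        self.counts[ip] += 1
        if self.counts[ip] > self.max_requests:
            self.blocked.add(ip)
            return False
        return True


def requests_served(blocker: IPBlocker, ips: Iterable[str], total: int) -> int:
    """Send `total` requests, cycling through `ips`; return how many got through."""
    source = cycle(ips)
    return sum(blocker.allow(next(source)) for _ in range(total))


if __name__ == "__main__":
    # A scraper using one address is cut off after the limit is reached.
    single = requests_served(IPBlocker(max_requests=10), ["192.0.2.1"], 100)

    # A scraper presenting 20 different addresses never trips the per-address limit.
    pool = [f"203.0.113.{n}" for n in range(1, 21)]
    rotating = requests_served(IPBlocker(max_requests=10), pool, 100)

    print("single address: ", single, "of 100 served")
    print("rotating pool:  ", rotating, "of 100 served")
```

Under these assumptions the single-address scraper gets 10 requests through and the rotating scraper gets all 100, which is why address blocking alone tends to escalate rather than end the contest.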
Legal Defenses. When technical defenses fail, screen-scraper targets such as American Airlines have two primary legal weapons to deploy in their defense. The first is to claim breach of a click-wrap or browse-wrap online license. The second is to allege a “tort” (or legal wrong), most commonly “trespass to chattel.”
In its case against Farechase, American Airlines attempted to fire both barrels at its opponent, but its opening salvo was weak. First, American claimed that Farechase violated American’s “browsewrap” agreement. By its use of the term “browsewrap” American was referring to an online agreement which appears on the site (usually under the terms and conditions link), but does not require the user to click on or express consent to the agreement before proceeding to use the site. By contrast, the better known (and far more effective) “click-wrap” agreement requires the first-time user to click on a word or symbol to express acceptance of a site’s licensing terms before gaining access to the site. While the user of a properly implemented click-wrap agreement can expect enforcement, no court has yet enforced a browsewrap agreement, and the only two courts that have considered the issue at all have expressed doubts as to the enforceability of such an agreement. However, the ability to protect a web site with nothing more than an explicit statement on the site restricting access received a potential boost in a recent decision by the First Circuit Court of Appeals in Boston. That court suggested that screen-scraping may violate the Computer Fraud and Abuse Act (the “CFAA”), and that a restrictive warning of the sort used in browsewrap agreements may be enough to invoke the CFAA.
The second barrel of American’s gun was loaded with more powerful ammunition, in the form of its claim that Farechase had violated the law of “trespass to chattels” (i.e. goods). While the English law of trespass as applied to chattels can be traced back hundreds of years, it has shown a surprising ability to adapt itself to the law of the Internet. Most courts that have considered the applicability of trespass law to data scrapers have ruled in favor of the complaining party. The best known of these cases, eBay, Inc. v. Bidder’s Edge, Inc., resulted in an injunction ordering Bidder’s Edge to stop data mining from the eBay web site. Moreover, in several of these cases the courts have not required proof that the scrapers caused any measurable harm, or caused any specific injury, to the sites they were data mining.
Not surprisingly, based on the above record, American Airlines was successful in obtaining an injunction against Farechase. While Farechase is still in business, its searches no longer include American web fares.
TLB Comment: Based on this state of the law, can data miners expect to build a business based on unauthorized screen-scraping? Somewhat surprisingly, the outlook may be better than it appears. First, many companies do utilize this form of data mining without protest from the owners of the sites they are crawling. The reasons are economic, not legal. In some industries screen-scraping has become an accepted way of doing business. Further, the vast majority of companies are willing to provide access to their sites when they are approached cooperatively. The fact that some percentage of their capacity is being used by a scraper is not a deterrent, as long as the scraper’s customers ultimately are referred to the vendor’s site to make the purchase.
Second, while the law thus far has favored original content providers, the law on electronic trespass to chattels is far from settled. Just before this article went to press the California Supreme Court issued a decision in Intel v. Hamidi, rejecting Intel’s attempt to prevent a former employee from sending mass e-mails to Intel employees. In that case the court held that electronic trespass to chattels is not actionable under California law unless it involves “actual or threatened injury to the personal property or the possessor’s legally protected interest in the personal property.” Since Hamidi’s e-mails (numbering in the hundreds of thousands) to Intel employees caused no such harm, the court refused to order Hamidi to cease communications. Although this case was not a screen-scraping case, the issues involved are essentially the same (Intel relied heavily on the scraper cases), and therefore Hamidi may be an important defensive tool for scrapers to use in the future.