Parasites Inclusion Overview
Search engines match keyword searches to web pages, and bring visitors to a web site. This is generally considered a 'good thing'.
Parasites, however, feed on keywords from communication traffic, and use competitive advertising to take visitors away from a web site. This is typically considered a 'bad thing'.
Sometimes people find that a Parasite has copied and stolen web page content that, for privacy, security, or copyright reasons, should not be read by anyone other than its intended recipient.
In recognition of these problems, and to prevent Parasite infestations spreading across the Internet, it is vital that Parasites offer facilities for Web site administrators and content providers to limit what a Parasite does.
This can be achieved through two mechanisms:
The Parasites Inclusion Protocol
A Web site administrator can indicate which parts of the site can be freely abused by providing a specially formatted file on their site, http://.../parasites.txt.
The Parasites META tag
A Web author can indicate how a page may or may not be abused, through the use of a special HTML META tag.
The remainder of this page provides full details on these facilities.
This method is similar to the popular robots.txt standard. However, in view of the harmful effects associated with parasitic spyware, parasites.txt is an inclusive, not an exclusive, protocol: anything a site does not explicitly permit is forbidden.
To encourage take-up it is recommended that Parasitic advertising systems block access to web sites that do not voluntarily comply with this standard, or fail to permit sufficient exploitation of the web site's content.
Note that these methods rely on cooperation from the Parasite, and are by no means guaranteed to work for every Parasite. If you need stronger protection from Parasites and other spyware, you should use alternative methods such as SSL encryption, IP address range blocks, copyright royalty claims, or prisons. As with Parasitic organisms, ineffective treatments may defeat the individual Parasite but not end the lifecycle.
The Parasites Inclusion Protocol is a method that allows Web site administrators to indicate to spying Parasites which pages from their site can be freely copied and abused for behavioural profiling keywords, or modified to display advertising.
In a nutshell, when a Parasite first starts spying on site traffic, say http://www.foobar.com/, it checks for http://www.foobar.com/parasites.txt. If it can find this document, it will analyse its contents for records like:
Parasite-Agent: *
Permissions: profile, scrape, pop-up, forge-cookies, redirect
Allow: /
to see if it is allowed to feed off the document. In this example all Parasites may profile users, scrape content, insert pop-ups, forge cookies, and redirect users to a third party web site.
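As a rough illustration of the check described above, here is a minimal Python sketch of how a cooperating Parasite might read parasites.txt. It assumes one field per line (as in robots.txt), that a record starts at each Parasite-Agent line, and that Allow values are path prefixes. None of this is specified by the protocol text itself, and the function names `parse_parasites_txt` and `is_permitted` are hypothetical.

```python
def parse_parasites_txt(text):
    """Parse parasites.txt into a list of records.

    Assumption: one "Field: value" pair per line; a new record begins at
    each Parasite-Agent line. The field names come from the example record.
    """
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        field, value = line.split(":", 1)
        field = field.strip().lower()
        value = value.strip()
        if field == "parasite-agent":
            current = {"agent": value, "permissions": set(), "allow": []}
            records.append(current)
        elif current is None:
            continue  # field appears before any Parasite-Agent line
        elif field == "permissions":
            current["permissions"].update(
                p.strip().lower() for p in value.split(",") if p.strip()
            )
        elif field == "allow" and value:
            current["allow"].append(value)
    return records


def is_permitted(records, agent, path, action):
    """Return True only if some record explicitly grants the action.

    The protocol is inclusive, so the default answer is "no".
    Assumption: Allow values are matched as simple path prefixes.
    """
    for rec in records:
        if rec["agent"] not in ("*", agent):
            continue
        if action in rec["permissions"] and any(
            path.startswith(prefix) for prefix in rec["allow"]
        ):
            return True
    return False
```

Because the default is to deny, a missing or empty parasites.txt yields no records, and `is_permitted` returns False for every action.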
The precise details on how these rules can be specified, and what they mean, can be found in:
The Parasites META tag allows HTML authors to indicate to spying Parasites if a document may be abused for user profiling, or modified to display advertising. No server administrator action is required.
Note that currently no Parasites implement this. Or any kind of inclusion protocol for that matter. So all your documents are abused anyway.
In this simple example:
<META NAME="PARASITES" PERMISSIONS="profile, scrape, pop-up, forge-cookies, redirect">
a Parasite may profile users, scrape content, insert pop-ups, forge cookies, and redirect the user to a third party web site.
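For illustration only, the following Python sketch shows how a cooperating Parasite could extract the permissions from such a tag using the standard library's `html.parser`. The class name `ParasitesMetaParser` and the helper `page_permissions` are hypothetical, and treating a page without the tag as granting nothing is an assumption based on the inclusive nature of the protocol.

```python
from html.parser import HTMLParser


class ParasitesMetaParser(HTMLParser):
    """Collect permissions from <META NAME="PARASITES" PERMISSIONS="...">."""

    def __init__(self):
        super().__init__()
        self.permissions = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").upper() != "PARASITES":
            return
        for item in (attr.get("permissions") or "").split(","):
            if item.strip():
                self.permissions.add(item.strip().lower())


def page_permissions(html):
    """Return the set of permissions a page grants (empty if no tag)."""
    parser = ParasitesMetaParser()
    parser.feed(html)
    parser.close()
    return parser.permissions
```

A Parasite would then test, for example, `"pop-up" in page_permissions(html)` before modifying the page.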
Full details on how this tag works are provided: