Saturday, October 22, 2005

The unfortunate incentives for splogging

Google's search ranking was based loosely on an academic model of citations. In academia, the more frequently a work is cited, the "better" or "more valuable" it is deemed to be. On the internet, this translates as follows: the more other sites link to a given site, the "better" or "more valuable" that site is deemed to be, and the higher it will rank in a Google search. When a search returns thousands of hits, the searcher is likely to look at only the first few, so a high ranking is critical to a site being viewed. Splogging is an unfortunate response to the way Google ranks things.
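To make the incentive concrete, here is a minimal sketch of citation-style ranking, assuming the crudest possible rule: count the distinct sites that link to each target. This is not Google's actual algorithm, and every site name below is a hypothetical placeholder. It only illustrates why a cluster of spam blogs pointing at one site can push that site past legitimately cited ones.

```python
from collections import Counter

def rank_by_inlinks(links):
    """Rank sites by how many distinct other sites link to them.

    `links` is an iterable of (source_site, target_site) pairs.
    This is a toy stand-in for citation counting, not a real search ranker.
    """
    inlinks = Counter()
    for source, target in set(links):   # ignore repeated identical links
        if source != target:            # ignore self-links
            inlinks[target] += 1
    # Highest count first; ties broken alphabetically for a stable order.
    return sorted(inlinks.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical link graph: two blogs cite a primary source and an analysis.
honest_links = [
    ("blog-a.example", "statute.example"),
    ("blog-b.example", "statute.example"),
    ("blog-a.example", "analysis.example"),
]

# Hypothetical splogs: five throwaway blogs that all link to one ad-laden site.
splog_links = [(f"splog-{i}.example", "ads.example") for i in range(5)]

before = rank_by_inlinks(honest_links)
after = rank_by_inlinks(honest_links + splog_links)

print("without splogs:", before)
print("with splogs:   ", after)
```

Under this toy rule the ad site takes first place as soon as the five splogs are added, even though nothing about its content changed.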

To take a specific example of a search, consider
+"patent reform" +2795

At about 6am on Oct. 22, this returns 686 hits.

On the first page, we have
thomas.loc.gov (2)
ieeeusa.gov
patentlaw.typepad.com (blog, from June 2005)
publicknowledge.org (2)(brief undated, unsigned article)
govtrack.us
law.com (an article on the bill by Peter Geier written 8-19-05)
ipo.org
smalltimes.com (an article by Greg Mayer written 10-3-05)

On the second page, we have
vcexperts.com (synopsis of a seminar by John T. Johnson, undated)
house.gov
townsend.com (2)(from a law firm)
promotetheprogress.com (2) (blog, from June 2005)
jonesday.com (law firm, press update, from June 2005)
wikipedia.org
usnewswire.com (release from Professional Inventors Alliance, 8-17-05)
bakerbotts.com (law firm, dated Oct. 05)

On the third page, we have
govtrack.us
blackenterprise.com (republication of article in Idaho Business Review)
patentlaw.typepad.com (blog, from July 05)
acenet.edu
cio.com (Aug. 18 reference to my article in July 18 issue of NJLJ)
boston.bizjournals.com
fr.com (law firm, July 26 article)
patentbaristas.com (blog, Aug. 29 guest blog post by Lee Thomason)


I could go on, but what one sees looks like a random mix of entries: random in date, random in origin, random in content, but somehow "ranked."

Of the cio.com entry, it is interesting to note that an entry talking about an article appears before the article itself.

On page 4 of the search, we find the same smalltimes.com article that appears on page 1 of the search. Exactly the same content leads to a ranking on page 1 and on page 4. Huh?

At about 6:40pm on Oct 22, the search returned 685 hits:

The changes in the first page were the omission of one of the two publicknowledge hits, and the insertion between ipo and smalltimes of a reference to IPBiz.blogspot (the text about Scott Cleere; new to page 1).

On the second page, the second publicknowledge hit appeared in front of vcexperts. There were two promotetheprogress hits, followed by
house.gov
patentbaristas.com
townsend (2)
jonesday
wikipedia

On the third page
releases.usnewswire.com (press release of Professional Inventors Alliance)
moneycentral.groups.msn (New to page 3)
bakerbotts
IPBiz.blogspot (about continuations, new to page 3)
4ipt.com (2; New to page 3)
ipfrontline.com
govtrack.us
blackenterprise.com
patentlaw.typepad.com (July 2005)

The hit to cio moved to page 4. The second hit to smalltimes remained on page 4.
On page 15 we have a hit from the Aug 05 EEJD blog and a hit to the IPT article "Imagine: No more indecision in intellectual property cases."

Through page 25 (the last displayed), my article in the New Jersey Law Journal, which was explicitly about patent reform in HR 2795, was not included in the hits.

At about 9:20am on Oct. 23, the same search returned 685 hits.

First page:
thomas.loc.gov (2)
ieee.org
patentlaw.typepad.com (blog entry from June 2005)
govtrack.us
publicknowledge.org
law.com (article by Peter Geier of Aug. 19, 2005)
ipo.org
smalltimes.com
vcexperts.com

Second page:
publicknowledge.org
house.gov
townsend.com (2) (law firm entry)
promotetheprogress.com (2)(blog)
jonesday.com (law firm entry)
wikipedia.org
usnewswire.com (press release)
bakerbotts.com (law firm entry)

Third page:
govtrack.us
blackenterprise.com
patentlaw.typepad.com (blog, entry from July 2005)
cio.com (referring to July 2005 NJLJ)
boston.bizjournals.com
acenet.edu
fr.com (law firm entry)
patentbaristas.com (blog, entry Aug. 29, 2005)
arentfox.com (law firm entry)

Fourth page:
autm.net
smalltimes.com
steptoe.com (law firm entry)
fr.com (law firm entry)
bio.org
bayoubuzz.com
ipfrontline.com
foley.com
acenet.edu
straffordpub.com

Fifth page:
straffordpub.com
judiciary.senate.gov
foley.com (law firm entry)
aau.edu (2)
bna.com
newsfactor.com
sciencemag.org
law.washington.edu
mainelaw.maine.edu

Sixth page:
moneycentral.groups.msn.com
ip-watch.org
ucf.edu (republication of article by Molly Laas in FDC reports)
piug.derwent.co.uk
4ipt.com (2)
piausa.org
bakernet.com (law firm entry)
corante.com
managingup.com

(The IPT article "Imagine" appears on page 15; the entry for IPBiz (Cleere) had moved to page 22; the entry for IPBiz (continuations) had moved to page 23).

The results of a search clearly depend on when the search is made. Within the result sequence, a hit can move more than twenty pages in less than one day.
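The movement can be tabulated directly from the page numbers recorded above. The following is a small sketch, with a helper name (page_moves) of my own choosing, that compares two snapshots and reports how many pages each entry shifted. The page numbers are the ones noted for about 6:40pm on Oct. 22 and about 9:20am on Oct. 23; only three entries are included, for illustration.

```python
def page_moves(earlier, later):
    """Return {entry: later_page - earlier_page} for entries in both snapshots.

    A positive number means the entry dropped to a later (worse) page.
    """
    return {entry: later[entry] - earlier[entry]
            for entry in earlier if entry in later}

# Result page on which each entry appeared, as recorded in this post.
oct22_evening = {
    "IPBiz (Cleere)": 1,
    "IPBiz (continuations)": 3,
    "cio.com": 4,
}
oct23_morning = {
    "IPBiz (Cleere)": 22,
    "IPBiz (continuations)": 23,
    "cio.com": 3,
}

moves = page_moves(oct22_evening, oct23_morning)

# List the largest shifts first.
for entry, delta in sorted(moves.items(), key=lambda kv: -abs(kv[1])):
    print(f"{entry}: {delta:+d} pages")
```

On these numbers the two IPBiz entries fall 21 and 20 pages, while cio.com moves up one page, all in under fifteen hours.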

Search results at 12 noon on Oct. 25:

First page:
thomas.loc.gov (2)
ieee.usa
patentlaw.typepad.com (blog entry June 2005)
publicknowledge.org
govtrack.us
law.com
ipo.org
ipbiz.blogspot.com (blog entry, Cleere, Oct. 2005)
smalltimes.com

Second page:
publicknowledge.org
vcexperts.com
promotetheprogress.com (2)(blog entry June 2005)
house.gov
patentbaristas.com (blog entry)
townsend.com (2)(law firm entry)
jonesday.com (law firm entry)
wikipedia.org

Third page:
releases.usnewswire.com
moneycentral.groups.msn.com
bakerbotts.com (law firm entry)
ipbiz.blogspot.com (blog, continuations)
4ipt.com (2)
ipfrontline.com
govtrack.us
blackenterprise.com
patentlaw.typepad.com (blog entry July 05)

Fourth page:
cio.com
boston.bizjournals.com
acenet.edu
nyls.blogs.com
fr.com (law firm entry)
gbpatent2.com (law firm entry)
arentfox.com (law firm entry)
autm.net
smalltimes.com
piausa.org

Fifth page:
steptoe.com (law firm entry)
fr.com (law firm entry)
bio.org
ipfrontline.com
straffordpub.com (2)
bayoubuzz.com
foley.com (law firm entry)
acenet.edu
piausa.org

****
from hankyoreh-->

Kim Cheol-su (not his real name), the operator of a popular blog, recently had a disconcerting experience. A writing of his that he had not even posted appeared on a portal site as a popular piece. When he investigated it, he found that someone was operating a “phantom blog,” presenting Kim’s blog writings intact as if they were that person’s own work. He visited the blog and left a message protesting this, but there was no response. This was an example of what is being called a “splog,” or a “spam blog,” which raises advertising revenues by presenting content from famous blogs.
