Who is page-store.com?
… and why do they want my content?
The other day I looked through the statistics for this domain and discovered that, instead of the usual two or three hundred megabytes of bandwidth, July's total was close to two thousand. Two bleeping gigabytes. The culprit was easy to find, as it sat at the very top of the visiting hosts list.
It resolves to the following:
domu-12-31-37-00-02-72.usma3.compute.amazonaws.com
The top domain is owned by Amazon, Inc., and the host’s IP address is 72.44.62.140, which is assigned by ARIN to someone called “Amazon Development Centre South Africa”.
In the Apache logs, its visits look like this:
72.44.62.140 - - [20/Jul/2007:02:25:21 +0200] "GET /blog/?p=30 HTTP/1.0" 200 16525 "http://break-left.org/blog/" "Mozilla/5.0 (compatible; heritrix/1.12.1 +http://www.page-store.com)"
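For anyone who wants to find their own bandwidth hog, here is a minimal Python sketch that parses lines like the one above and adds up the bytes served per host. It assumes the log uses Apache's "combined" format, as the sample line appears to; the names `LOG_PATTERN` and `bytes_per_host` are made up for this illustration, and the regular expression is a simplification that does not handle escaped quotes inside fields.

```python
import re
from collections import Counter

# Apache "combined" format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def bytes_per_host(lines):
    """Sum the response sizes per client host, skipping unparseable lines."""
    totals = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue
        size = match.group("bytes")
        # Apache logs "-" when no body was sent
        totals[match.group("host")] += 0 if size == "-" else int(size)
    return totals


# The log line quoted in this post
SAMPLE = (
    '72.44.62.140 - - [20/Jul/2007:02:25:21 +0200] "GET /blog/?p=30 HTTP/1.0" '
    '200 16525 "http://break-left.org/blog/" '
    '"Mozilla/5.0 (compatible; heritrix/1.12.1 +http://www.page-store.com)"'
)

totals = bytes_per_host([SAMPLE])
print(totals.most_common(1))  # [('72.44.62.140', 16525)]
```

Fed a whole month of log lines instead of the single sample, `most_common()` would list the hosts in order of bandwidth used.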
If you visit the URL in the User-Agent string, you’ll end up at a rather sparse web page which doesn’t say anything about anything, really. Apparently it’s some sort of search engine reseller or some such, and their definition of “deep web search” must be “don’t honour robots.txt”.
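To show what honouring robots.txt would mean in practice, here is a small sketch using Python's standard `urllib.robotparser`. The rules in `ROBOTS_TXT` and the `example.org` URLs are hypothetical and are not this site's actual robots.txt. Note that the parser matches on the crawler's product token, so the check uses `heritrix` rather than the full User-Agent string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: shut out heritrix entirely, keep everyone else
# out of the picture gallery only.
ROBOTS_TXT = """\
User-agent: heritrix
Disallow: /

User-agent: *
Disallow: /gallery/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler asks before each request
heritrix_allowed = parser.can_fetch("heritrix", "http://example.org/blog/?p=30")
other_blog_allowed = parser.can_fetch("SomeOtherBot", "http://example.org/blog/")
other_gallery_allowed = parser.can_fetch("SomeOtherBot", "http://example.org/gallery/pic.jpg")

print(heritrix_allowed, other_blog_allowed, other_gallery_allowed)  # False True False
```

A crawler that skips this check will fetch everything it can find links to, which is how spamtraps and high-resolution images end up in its haul.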
Page-store.com is registered to a fellow named Paul Pedersen of POV Search in Palo Alto, California, USA. To add insult to injury, the domain has been registered using a throwaway Yahoo account. It was registered on 3 April 2007, so that, together with the fact that POV Search is using an open source web crawler, must mean they are in quite a hurry to generate “content” to sell. My content, that is.
I’ll just say this: when someone sucks down my web pages in their entirety over the course of a few days, spamtraps and webpoison pages included, and expresses an intent to sell them, they can from now on talk to my firewall.
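The actual firewall rules depend on the firewall in use, so they are not shown here. As a rough illustration of the underlying check, this Python sketch uses the standard `ipaddress` module to test whether a visitor falls inside a blocked network. The list holds the crawler's address from the log line above, plus the 67.202.0.0/18 range taken from the WHOIS output quoted in the comments below; blocking that whole range is an assumption made for the example, and `BLOCKED_NETWORKS` and `is_blocked` are hypothetical names.

```python
import ipaddress

# Networks to refuse. The /32 is the single crawler host from the logs;
# the /18 comes from the WHOIS output in the comments (an assumption:
# blocking it would shut out every host in that Amazon range).
BLOCKED_NETWORKS = [
    ipaddress.ip_network("72.44.62.140/32"),
    ipaddress.ip_network("67.202.0.0/18"),
]


def is_blocked(address):
    """Return True if the address lies inside any blocked network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in BLOCKED_NETWORKS)


print(is_blocked("72.44.62.140"))   # True
print(is_blocked("67.202.28.172"))  # True
print(is_blocked("72.44.62.141"))   # False
```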
I am not the only one who is being visited, it seems …
3 comments on “Who is page-store.com?”
Alan Williamson
Said the following on 7 August 2007 at 13:05: I too have noticed this, and also blogged about it. There is a wider issue at play here: the trust of Amazon.
Steve
Said the following on 24 November 2007 at 20:27: Hi,
I'm having a similar problem, but the IP is different:
67.202.28.172
OrgName: Amazon.com, Inc.
OrgID: AMAZO-4
Address: Amazon Development Centre South Africa
Address: 1200 12th Avenue South
City: Seattle
StateProv: WA
PostalCode: 98144
Country: US
NetRange: 67.202.0.0 - 67.202.63.255
CIDR: 67.202.0.0/18
NetName: AMAZON-EC2-3
NetHandle: NET-67-202-0-0-1
Parent: NET-67-0-0-0-0
NetType: Direct Assignment
NameServer: PDNS1.ULTRADNS.NET
NameServer: PDNS2.ULTRADNS.NET
NameServer: PDNS3.ULTRADNS.ORG
Comment: www.amazonaws.com
RegDate: 2007-08-02
Updated: 2007-08-02
OrgTechHandle: ANO24-ARIN
OrgTechName: AES Network Operations
OrgTechPhone: +27 21 794 3661
OrgTechEmail: aes-noc@amazon.com
The I noticed is the amazonaws.com
Sugerør » Blog Archive » Powerset.com or page-store.com? Who’s to know
Said the following on 17 February 2008 at 23:50: […] A few months ago I posted about page-store.com and their annoying way of crawling my domain. The short version is that, over the course of about a day, they visited every page in this domain, including the gallery with around a thousand pictures. Yep, the high resolution versions too. The result? Page-store.com spent 1.5 GB of my bandwidth, whereas the runner-up used 11 MB! Did I mention they ignored robots.txt? […]