SiteTruth updates

June 10th, 2010

We’ve made some minor updates, in preparation for major ones. Big changes are in the works.

In this round, the main SiteTruth.com page was redesigned, we have a new color scheme, and some server upgrades were made. We had several brief outages during upgrading, but are now back up.

Meanwhile, as a digression, we offer SiteTruth Classic, our search engine in a steampunk format.

AdRater 1.1 released

May 8th, 2010

AdRater 1.1 has been released and approved as an official Firefox extension. This update to our AdRater plug-in, which rates Google ads as they appear on web pages, will now work on versions of Firefox from 2.0 to 3.6.

Phishing exploitation of major sites

April 13th, 2010

SiteTruth distributes a list of major domains being exploited by active phishing scams. [sitetruth.com] This is generated by processing PhishTank data, which we do automatically every 3 hours. The SiteTruth system is looking for the identity of the business behind the web site, and forged business identification is a problem.  So we use phishing reports to find forgeries, and take a hard line – one phishing report down-rates the entire domain.  At any given time, there are about 30 to 80 domains on the list.  Rather than being secretive about this, we publish the list, and try to help legitimate site operators to get off it. We do this because we want to reduce the collateral damage from our tough blacklist system.
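As a rough illustration of the "one report down-rates the entire domain" rule, here is a minimal Python sketch. It is not SiteTruth's actual code. The record layout (URL, report date, still-active flag), the function names, and the example domains are all hypothetical, and it does not reproduce PhishTank's real data format. The domain extraction simply keeps the last two labels of the hostname, which is a simplifying assumption; a real system would need a public-suffix list to handle domains such as "example.co.uk". Keeping the date of the oldest active report for each domain is likewise an illustrative choice.

```python
from datetime import date
from urllib.parse import urlsplit


def registered_domain(url):
    """Return the last two labels of the URL's hostname.

    Simplifying assumption: this is wrong for suffixes like "co.uk";
    real code would consult a public-suffix list.
    """
    host = urlsplit(url).hostname or ""
    labels = host.lower().split(".")
    return ".".join(labels[-2:])


def phishing_blacklist(reports, major_domains):
    """Build the list of major domains with at least one active report.

    reports: iterable of (url, reported_on, active) tuples (hypothetical layout).
    major_domains: set of domain names considered "major".
    Returns (domain, oldest_active_report_date) pairs, oldest first.
    """
    oldest = {}
    for url, reported_on, active in reports:
        if not active:
            continue  # reports that are no longer active do not count
        domain = registered_domain(url)
        if domain not in major_domains:
            continue
        # A single active report is enough to list the whole domain.
        if domain not in oldest or reported_on < oldest[domain]:
            oldest[domain] = reported_on
    return sorted(oldest.items(), key=lambda item: item[1])


if __name__ == "__main__":
    sample_reports = [
        ("http://pages.example.com/login.html", date(2010, 4, 1), True),
        ("http://example.com/verify", date(2010, 3, 1), True),
        ("http://example.org/old", date(2010, 2, 1), False),
    ]
    for domain, since in phishing_blacklist(sample_reports, {"example.com", "example.org"}):
        print(domain, since.isoformat())
```

Note that both reports against "example.com" collapse into one entry: the rule works on domains, not individual pages.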

Some sites get themselves off the list quickly. By now, most of the better free hosting services and short-URL services are automatically checking PhishTank and the APWG blacklist to see when they’ve been hit. Today, if you run a service where anybody can put up a page that could be used for phishing (i.e. it’s not full of your own headers and banners), you need automation to deal with attacks. As an example, “t35.com” has been hit by a flood of phishing attacks, with several hundred new reports in PhishTank per day. The attacks were coming in faster than the abuse staff could clean them out. They’re now gaining on the problem, but haven’t squashed it yet. Take-away lesson: automate your response to such attacks.

The domains near the top of the list have been there for a while. Note the dates, which are the date that the oldest phishing report still online and active appeared in PhishTank. Some just need help. Typically, these are small businesses, churches, and nonprofits that have had a break-in and were partially taken over by a phishing site. Often, they lack an information technology staff, let alone abuse and security departments. We send them the Anti-Phishing Working Group’s “What To Do if your Site Has Been Hacked”. [antiphishing.org] Sometimes we give them a phone call. They deserve sympathy and help.

Then there are the hard cases. These are sites with no visible contact address, or a clueless abuse department. At the moment, Google Sites and Google Spreadsheets are being used for phishing. Google is new to the free hosting business, and the phishers have discovered some tricks that Google can’t yet handle. While Google puts a “report abuse” link on their site pages, it’s possible to set up a file for downloading on Google Sites, and an HTML page can be served that way [phishtank.com], without Google’s abuse checking. There’s also an exploit of Google Spreadsheets [phishtank.com]. That one is an example of Habbo Hotel phishing. [bbc.co.uk] We’ve reported these to Google several times, but they haven’t been fixed yet.

We’ve been seeing a new type of attack recently – a phishing operation breaks into a shared hosting server and plants phishing pages on multiple domains on a single server. One of these hit one of the mysterious “*.websitewelcome.com” servers, which has “cloaked domain registration” and no useful default web page. These seem to be associated with “ThePlanet.com”, but whether ThePlanet operates them, is providing wholesale hosting, is providing colocation, or is just the upstream connectivity provider is not clear.

Hiding the contact information of a hosting provider is legally unwise. The hosting provider may lose the “safe harbor” protection of the DMCA. [cornell.edu] The “safe harbor” provision for “Information Residing on Systems or Networks At Direction of Users” only applies if “the service provider has designated an agent to receive notifications of claimed infringement… by making available through its service, including on its website in a location accessible to the public, and by providing to the Copyright Office, substantially the following information: the name, address, phone number, and electronic mail address of the agent.” So when the RIAA or the MPAA come calling, a likely event for a hosting service, they get to go after the hosting provider.

So that’s vulnerability reporting in phishing land.  Our experience is that occasional nagging will keep that list down in the 25 to 50 domain range. If we stop nagging, it creeps up to around 100. When we first started, there were about 175 domains on the list. Reporting vulnerabilities does measurably help.

SiteTruth technology now patented

April 6th, 2010

The technology behind SiteTruth is covered by U.S. Patent #7,693,833, issued today, and another pending patent.

SiteTruth outage

March 15th, 2010

SiteTruth.com was down for two hours this morning due to a power outage at Codero’s Phoenix, AZ data center. According to Codero:

At approximately 8:00AM Central Time, utility power to our Phoenix data center and the surrounding area was cut. The City of Phoenix has not yet informed us of what caused this issue. Upon failure of utility power, backup generator fired-up as designed but the switching gear failed to transfer load to the backup power. Codero staff immediately contacted all of our power vendors as well as building maintenance since the transfer gear and generator are not owned by Codero and are out of our physical control. Unfortunately, the electrical group which could manually transfer the load did not arrive before our UPS and battery loads diminished, and electrical service shutdown.

Upon arriving, faulty breakers had to be replaced before they were able to transfer the load to the backup power source. Shortly after this transfer, utility power was restored, and the PHX data center is currently back on utility power.

Upsurge in phishing attacks on major sites detected

February 26th, 2010

We track major domains being exploited by active phishing scams, as part of our site legitimacy testing process. Until three days ago, that list had from 25 to 50 domains on it. In the last three days, the number of domains being exploited has doubled. As of today, we’re at 96 major domains, each of which is hosting at least one phishing page.

The new phishing pages cover a wide range of financial institutions around the world. We’re seeing Canada Trust, the Australian tax authorities, banks in Greece, Italy, South Africa, and India, along with the usual targets – Bank of America, HSBC, and PayPal. This has been reported to US-CERT.

Domains containing phishing pages receive SiteTruth’s lowest rating for the entire domain. This encourages sites to be proactive in securing their site.

Performance improvement

December 6th, 2009

SiteTruth’s main search page is now much faster. It’s now fast enough to use as your main search page. Let us know how it works for you.

Google advertiser quality update

December 4th, 2009

Our latest statistics on the quality of Google’s advertisers may indicate a slight downward trend. Of 20247 Google AdWords advertiser domains seen in the last 60 days, we see the following ratings.

Sites   Percent   Rating
1964     9.7%     Site ownership and business identity verified. No significant issues found.
7729    38.2%     Site ownership identified but not verified.
3715    18.3%     No information available.
6839    33.8%     Site ownership unknown or questionable, or significant negative information about the business was found.

Most notably, the percentage of sites in the highest category, those where the identity of the business behind the site was verified by a trusted third party, has decreased.
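For readers who want to check the arithmetic, the percentages in the table follow directly from the site counts. The short Python sketch below recomputes them. The category keys are our own shorthand for the four ratings, not names used by the SiteTruth system.

```python
# Site counts from the table above; the keys are shorthand labels (assumed names).
RATING_COUNTS = {
    "verified": 1964,
    "identified_not_verified": 7729,
    "no_information": 3715,
    "unknown_or_negative": 6839,
}


def rating_percentages(counts):
    """Return each category's share of the total, in percent, to one decimal place."""
    total = sum(counts.values())
    return {label: round(100.0 * n / total, 1) for label, n in counts.items()}


if __name__ == "__main__":
    print("Total advertiser domains:", sum(RATING_COUNTS.values()))
    for label, pct in rating_percentages(RATING_COUNTS).items():
        print(f"{label}: {pct}%")
```

The four counts sum to the 20247 advertiser domains quoted above, and the rounded shares match the table.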

Our sample size has increased substantially, which has some effect on the data. This data comes from users of our AdRater plug-in, and the number of AdRater users has increased substantially in the last month. When we rate an ad for a user, we accumulate data about advertiser behavior. We don’t collect data about what users are doing; just advertisers.

This data includes only ads served by Google’s US-based ad servers.

We will be reporting this data periodically.

Performance temporarily degraded due to disk problem

November 19th, 2009

Due to a disk problem, we cleared SiteTruth’s rating cache. Responses will be slower than usual while the cache rebuilds itself.

We upgraded a server today, which should help with the increased load.

“The Myth Of Great Search Engine Results”

October 26th, 2009

Danny Sullivan at SearchEngineLand has written “The Myth of Great Search Engine Results”. He is of the opinion that search engine results are getting “worse”, but he can’t quite say why.  We can.

Google search results are getting worse for hard questions. Google is trying to correct for more user errors. Google used to insist that all the words of the query appear in the result. They’ve backed off on that; you now get some pages that Google considers important even if some words are missing. You can insist that a search word be present by quoting it. It’s easier to get answers to simple questions now, but the user has to do more work on hard ones.

Google also has become much more aggressive about spelling correction. This is a problem when your query has a word that is “close” to a common word. Again, quoting a single word forces an exact match. Google also considers synonyms now.

Most search queries are very dumb. Look at Google Trends to confirm this. That’s where the market is, and that’s what Google is targeting. It’s a reasonable business decision from their perspective.

Search in more adversarial areas, where “search engine optimization” is practiced, has a different set of problems. When a search engine operates perfectly, it makes no money. If Google takes a buyer directly to the seller’s page, Google makes nothing. If Google organic search directs the buyer to a site with Google AdWords, or produces search results sufficiently irrelevant that clicking on a search result ad looks promising, then Google makes money. It’s thus not in Google’s interest that organic search be spam-free.

Which, of course, is what SiteTruth is for – search with less evil.