Ghostbusters: Removing ghost referral spam from Google Analytics

December 27, 2016 by BizTraffic Team

Referral spam can be frustrating for webmasters

Note: Looking for even more spam removal tips? You can read Part 2 of our Ghostbusting series here!

Since its launch in 2005 after Google’s acquisition of Urchin, Google Analytics (or GA) has cemented itself as a fantastic and (most importantly) free tool that can be used to retrieve and report upon website traffic data. As the predominant web analytics tool, GA provides webmasters with some highly useful instruments to track online campaigns and conversion rates, perform website optimization, and even generate e-commerce reporting. Unfortunately, the accuracy of all of that robust data analysis that GA provides can be easily damaged by referral spam.

These are not the bots you’re looking for.

Referral spam (or “ghost referrals”) is spam in the form of fake “referral” visits to your website, which can artificially inflate your hard-earned traffic statistics and skew your conversion and engagement rates.

Ghost referrals generate visitor data without anyone ever visiting your website. Spam bots accomplish this by scraping your Google Analytics tracking code (the ID beginning with UA-), which the spammers then use to send information directly to Analytics (bypassing your website). Most spam is pretty easy to spot when you look at your visitor data; it probably looks something like this:

Remove that pesky spam!

At this point, you’re probably thinking the same thing that I was when I first saw this spam: why are they bothering to spam my Analytics account? Why not just spam via email, pop-ups, etc.? Well, if you’re anything like me, you might have seen some of these referral links and been tempted to click on them to see where the heck your referral traffic was coming from (especially when it looks like you're getting referral traffic from websites like reddit.com or huffingtonpost.com).

Don't click those links!

The GA spammers profit off of curious webmasters who click their spammy ad links. Even if only 1 in 1,000 webmasters clicks a ghost referral, that can add up quickly.

Who you gonna call?

According to several sources, Google is apparently working on a solution to referral spam. In the meantime, however, there are a few viable solutions for dealing with GA referral spam.

The easiest and most pragmatic of these is to use Analytics' built-in filters to filter out those annoying spammers. One caveat: this solution does not remove ghost referrals from your incoming traffic, per se, but it does allow you to view more accurate traffic metrics, which is why most of us use Analytics in the first place! Also noteworthy: this is a view-level filter solution, which means that it will not retroactively remove spam (more on a workaround for that coming later!), but implementing these filters will help to clean up your data over time.

First, head over to the Admin tab at the top of your Analytics page. From there, you'll want to go into Filters:

Google Analytics -> Admin -> Filters

Once you're in there, you can create a new filter. There are a few options here, so I'll lay out a couple plans of action.

Plan #1: Destroy anything with 15+ symbols

This plan is relatively straightforward. We'll be using a regular expression (also known as a "regex" or "regexp") to filter out any hits whose "language" field contains fifteen or more characters, or symbols that don't belong in a language code (periods, commas, exclamation points, and slashes). This solution comes from analytics-toolkit.com's blog (definitely check them out!). While this won't work for all ghost referral spam, it's a good (and simple) start. You'll need to use this expression for your new filter:

.{15,}|\s[^\s]*\s|\.|,|\!|\/

And here's what it should look like:

Language settings view-filter in Google Analytics
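If you'd like to sanity-check the expression before saving the filter, here's a minimal Python sketch that runs it against a few language values. The sample values are made up for illustration, and the sketch assumes Analytics matches the pattern anywhere in the field (which is what `re.search` does); it is not a reproduction of how Analytics evaluates filters internally.

```python
import re

# The filter expression from this post, copied verbatim.
LANGUAGE_SPAM = re.compile(r".{15,}|\s[^\s]*\s|\.|,|\!|\/")


def looks_like_language_spam(language: str) -> bool:
    """Return True if the expression matches this language value."""
    # search() allows a match anywhere in the value (assumed to mirror
    # the way the Analytics filter applies the pattern).
    return LANGUAGE_SPAM.search(language) is not None


# Hypothetical sample values, for illustration only.
samples = [
    "en-us",
    "pt-br",
    "(not set)",
    "Visit spam-example.com for free traffic!",
]

for value in samples:
    verdict = "exclude" if looks_like_language_spam(value) else "keep"
    print(f"{value!r}: {verdict}")
```

Short, ordinary language codes should come back as "keep", while anything long or containing a period, comma, exclamation point, or slash should come back as "exclude".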

Plan #2: Annihilate specific domains

If the first filter isn't getting enough work done for you, you can always create customized filters to get rid of specific domains. This can be somewhat painstaking, but if you only have a small number of domains that you need to clean up, it can definitely be worthwhile. One major drawback of this approach is that you'll need to actually identify the spam domains to begin with. Be careful not to filter real traffic!
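Since writing these patterns by hand is the painstaking part, here's a rough Python sketch that assembles one from a list of domains you've already identified as spam. The function name `build_exclude_pattern` and the `max_length` default are my own assumptions for illustration (check the actual length limit of the filter pattern field in your Analytics account), and the resulting pattern is a plain list of escaped domains joined with '|'.

```python
import re


def build_exclude_pattern(domains, max_length=255):
    """Join spam domains into a single '|'-separated filter pattern.

    max_length is an assumed limit for the filter pattern field;
    adjust it to whatever your Analytics account actually allows.
    """
    cleaned = [d.strip().lower() for d in domains if d.strip()]
    if not cleaned:
        # An empty pattern would match everything, so refuse to build one.
        raise ValueError("no domains given")

    # re.escape() makes dots and hyphens match literally.
    pattern = "|".join(re.escape(d) for d in cleaned)
    if len(pattern) > max_length:
        raise ValueError(
            f"pattern is {len(pattern)} characters; split it across several filters"
        )
    return pattern


# Domains taken from the example filter in this post.
spam_domains = ["darodar.com", "ilovevitaly.ru", "guardlink.org"]
pattern = build_exclude_pattern(spam_domains)
print(pattern)
```

If the list grows too long for one filter, splitting it across several filters keeps each pattern readable.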

Here's an example filter pattern that gets rid of several domains (note that each individual domain is separated by a '|'): 

.*((darodar|priceg|buttons\-for(\-your)?\-website|makemoneyonline|blackhatworth|huffingtonpost|o\-o\-6\-o\-o|(social|(simple|free|floating)\-share)\-buttons)\.com|econom\.co|ilovevitaly(\.co(m)?)|(ilovevitaly(\.ru))|(humanorightswatch|guardlink)\.org).*

And, for good measure, here's what that would look like in the actual filter on Analytics:

Campaign Source view-filter in Google Analytics
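Before trusting a pattern like this with your data, it's worth running it against a handful of sources to see exactly what it catches. The Python sketch below does that with the pattern from this post; the list of sample sources is hypothetical, and the case-insensitive flag is an assumption about how the filter is configured. Note that the pattern as written also matches huffingtonpost.com, so any genuine referrals from that site would be filtered along with the fakes.

```python
import re

# The domain filter pattern from this post, copied verbatim.
SPAM_SOURCES = re.compile(
    r".*((darodar|priceg|buttons\-for(\-your)?\-website|makemoneyonline"
    r"|blackhatworth|huffingtonpost|o\-o\-6\-o\-o"
    r"|(social|(simple|free|floating)\-share)\-buttons)\.com"
    r"|econom\.co|ilovevitaly(\.co(m)?)|(ilovevitaly(\.ru))"
    r"|(humanorightswatch|guardlink)\.org).*",
    re.IGNORECASE,  # assumption: the filter is not set to be case sensitive
)


def is_filtered(source: str) -> bool:
    """Return True if the pattern matches this campaign source."""
    return SPAM_SOURCES.search(source) is not None


# Hypothetical sources to check, for illustration only.
sources = [
    "darodar.com",
    "free-share-buttons.com",
    "buttons-for-your-website.com",
    "ilovevitaly.co",
    "huffingtonpost.com",
    "google.com",
]

for source in sources:
    verdict = "filtered" if is_filtered(source) else "kept"
    print(f"{source}: {verdict}")
```

Running a quick check like this against your own top referrers is an easy way to confirm that you aren't throwing out real traffic.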

Conclusion (for now)

The solutions above will get you well on your way to removing pesky Google Analytics referral spam from your future metrics. With any luck, Google will roll out their own measures against these spam techniques so that there's less dirty work for webmasters to handle.

In the meantime, keep an eye on your metrics! The spammers can and will adapt as solutions are developed to circumvent their techniques. In our next post, we'll cover how to filter out spam retroactively using advanced segments in Google Analytics.


Filed Under: Google, referral, marketing metrics