How to Identify and Remove Bot Traffic in Google Analytics

Let’s talk about bot traffic in Google Analytics.

Most of the time, bot traffic in our analytics data gets a bad name. We think of it as spam, and we don’t want this data anywhere near our reports.

Sure, there are spam bots sending hits to our analytics data. But there are also good bots. Or at least bots that we want to visit our websites, for testing, diagnostics, and even monitoring SEO results.

Whether or not you welcome this bot traffic, 99% of the time you don’t want to see bot traffic in your analytics reports.

Why? Because bots are not real users, and they don’t behave like humans. Heavy bot traffic (5% of sessions or more) can skew our data and pollute our analytics.

How do you keep the bots out of your Google Analytics reports?

Google doesn’t always block the bots

I love Google Analytics… But sometimes their “one size fits all” tool misses the mark.

Now, I’ve been pretty outspoken about how Google handles spam traffic. And for a long time, my take was that Google wasn’t doing much to keep the spam traffic out of our analytics reports.

Is Google Analytics Newest Data Quality Issue the Most Challenging

Google was like Fredo in The Godfather Part II when it came to defending our data against spam – drastically underachieving!

You broke my heart google

And the community noticed. Like many Google Analytics users, I started checking out other analytics products. Maybe the grass was greener somewhere else?

Is Google Analytics the best analytics software?

When users started threatening to move away from Google Analytics, Google took notice. And since then things have gotten better. There’s been a noticeable reduction of spam traffic in our analytics reports.

But spam is just one type of bot traffic that can pollute our analytics data. Many varieties of bots hit our websites, and sometimes we are the ones sending bots to our sites.

How do we keep the bots out?

Defending your data against bot traffic is a bit like playing whack-a-mole.

You have to identify the unwanted website hits and respond.

Let’s look at how we can identify bot traffic. And let’s walk through some strategies for blocking this traffic from our analytics reports.

Analytics Course Student Question

One of our Analytics Course students recently noticed a big issue with bot traffic in his reports. And he wants to know how to respond to the problem.

Keith Asks:

We recently sent an email blast out using an email list that we rented. The company is well known with a good reputation as far as we can tell. We received clicks to the site but no action once on the website and looking through GA I see that 82% of the clicks can be located to the United States but is (not set) for state, city & metro. I have not seen such a high number of (not set); in fact, on another site we have, we had 300 (not set) in the last 250,000 sessions.

The email was targeted at Seattle and the Network/Service Provider dimension shows Microsoft Corporation. Looking at a number of other sites we manage, we don’t have anywhere near the % of location (not set), even for Network/Service Provider: Microsoft Corporation.

So what do you think? Is this a case of a narrow target to Seattle for a number of individuals at Microsoft that happen to block geo location or perhaps something fishy with bot clicks from the email list provider to show results? (no accusations, just haven’t seen numbers like this)

Here’s Keith’s problem: His company sent out a targeted email campaign, and it resulted in a bunch of unwanted hits in their Google Analytics reports. This happens from time to time.

But then something interesting happened. The geo-location for the majority of this traffic was “(not set).” And all the (not set) traffic was coming from one ISP organization – Microsoft Corporation.

Keith wants to know: is this bot traffic?

It is most likely bot traffic

Here’s why: Keith is getting a disproportionate amount of location (not set) data in his reports. Also, the traffic doesn’t sound like it exhibits human behavior. It’s doubtful that Microsoft employs a bunch of people to sit around and click email links, and then immediately bounce off his site once they click through.
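If you want to put a number on “disproportionate,” here is a minimal Python sketch. It assumes you have exported a custom report with service provider, city, and sessions. The rows below are made up for illustration, and the function name is my own, not part of any Google tool.

```python
from collections import defaultdict

# Hypothetical rows from an exported report: (service provider, city, sessions).
# These numbers are invented for illustration only.
rows = [
    ("microsoft corporation", "(not set)", 410),
    ("microsoft corporation", "Seattle", 90),
    ("comcast cable communications", "Seattle", 600),
    ("comcast cable communications", "(not set)", 6),
]


def not_set_share(rows):
    """Return {provider: fraction of sessions whose city is "(not set)"}."""
    totals = defaultdict(int)
    not_set = defaultdict(int)
    for provider, city, sessions in rows:
        totals[provider] += sessions
        if city == "(not set)":
            not_set[provider] += sessions
    return {provider: not_set[provider] / totals[provider] for provider in totals}


shares = not_set_share(rows)

# Print providers with the highest (not set) share first.
for provider, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{provider}: {share:.0%} of sessions have city (not set)")
```

A provider whose (not set) share sits far above every other provider is worth a closer look. It is a hint, not proof.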

So how do we keep this traffic out of our reports?

The Google black-box solution

The easiest way to keep bot traffic out of your Analytics reports is to use Google’s automatic filter. To set up this filter, go to your view settings and check the box that says “Exclude all hits from known bots and spiders.”

Google's bot traffic filter

I’ve used Google’s auto filter on almost every account I’ve analyzed.

And I’d say 60% of the time, it works every time.

The bot traffic filter in Google Analytics works 60% of the time, every time.

When this filter doesn’t work, one of two things is happening: a false positive or a false negative.

Google’s black box sometimes excludes traffic you want to keep (a false positive). And it sometimes lets through bot traffic you want to exclude (a false negative).

So, you have to run tests. Be proactive about testing your filters, rather than trusting Google blindly.

Why? Because you can’t remove bot traffic from your analytics data after the fact. Filters only apply to hits that arrive after you set them up, so you want them in place before this traffic becomes pervasive in your reports.

Here’s how to set up bot filters in Google Analytics

Step #1 – Create a new Google Analytics View

When you create a view to test bot traffic, give your view a very specific name. That way other users in your account will know the view is only for testing bot filters. Google organizes views alphabetically. If you start the name of your view with “XX,” it should show up at the bottom of the view list, and most users won’t see it in your account.

bot traffic filter

Step #2 – Uncheck your bot setting

In your new view, uncheck your bot filter. You want to let the bot traffic into this view.

Step #3 – Understand your bot traffic

In the example below I’ve tried to replicate Keith’s problem.

If you look at the traffic coming from “Microsoft corp,” you can see the average session duration is 2 seconds. The other behavior metrics are also different from the rest of the traffic.

identifying bot traffic

The lousy traffic doesn’t have all the bot qualities we usually see. Bot traffic typically has a 100% bounce rate and 1 page per session.

But it still looks like junk to me. So, I am calling it a bot!
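To make that judgment call a little more repeatable, here is a rough Python sketch of the same reasoning. Every threshold in it (the 50-session minimum, the 99% bounce rate, the “10% of site average” duration cutoff) is an assumption I picked for illustration, and the rows are invented. Tune them against your own data.

```python
from dataclasses import dataclass


@dataclass
class NetworkStats:
    provider: str
    sessions: int
    bounce_rate: float          # 0.0 to 1.0
    pages_per_session: float
    avg_session_seconds: float


def looks_like_bot(row, site_avg_seconds, min_sessions=50):
    """Heuristic only: flag networks with enough volume that show either the
    classic bot signature or sessions far shorter than the site average."""
    if row.sessions < min_sessions:
        return False  # too little data to judge
    classic_signature = row.bounce_rate >= 0.99 and row.pages_per_session <= 1.01
    very_short = row.avg_session_seconds < 0.1 * site_avg_seconds
    return classic_signature or very_short


# Hypothetical report rows, invented for illustration.
SITE_AVG_SECONDS = 95
rows = [
    NetworkStats("microsoft corp", 500, 0.85, 1.2, 2),
    NetworkStats("comcast cable communications", 600, 0.45, 3.1, 120),
    NetworkStats("tiny isp", 10, 1.0, 1.0, 0),
]

flagged = [r.provider for r in rows if looks_like_bot(r, SITE_AVG_SECONDS)]
print(flagged)
```

Notice that the first row gets flagged on session duration alone, even though its bounce rate is below 100%. That mirrors the call I made above.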

Step #4 – Develop a filter pattern

In this case, we’ll create a filter excluding traffic by ISP Organization.

excluding bot traffic
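Filter patterns in Google Analytics are regular expressions, so it helps to try your pattern against a few sample values before you save it. Here is a small Python sketch. The pattern and the sample network names are my assumptions, and Python’s re module is only standing in for Google’s matching engine, so treat the result as a sanity check.

```python
import re

# Assumed filter pattern for the ISP Organization field.
# The ^ anchor limits matches to names that start with "microsoft corp".
FILTER_PATTERN = re.compile(r"^microsoft corp", re.IGNORECASE)


def would_exclude(isp_organization):
    """Return True if the exclude filter pattern matches this network name."""
    return FILTER_PATTERN.search(isp_organization) is not None


# Hypothetical sample values to test the pattern against.
samples = [
    "microsoft corporation",
    "Microsoft Corp",
    "comcast cable communications",
    "not microsoft corp",
]

for name in samples:
    verdict = "exclude" if would_exclude(name) else "keep"
    print(f"{name}: {verdict}")
```

Anchoring the pattern keeps it from catching unrelated networks that happen to contain the same words somewhere in the name.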

Step #5 – Verify your filter

In your filter settings, use the “filter verification” option to run a test against your recent traffic.

Our verification test indicates that our filter should be useful.

bot traffic filter verification

Step #6 – See if it worked

You’ll have to check your reports to see if your filter worked. It may take a little while to find out if you blocked the bot traffic. Be patient! You should know after a couple of days if your filter has removed the junk traffic from your reports.

be patient when checking your bot traffic filters

If your filter was successful, you can add it to your main view. If it didn’t work, try adjusting your exclusions again.

The waiting game is part of the life of an analyst. You’re often waiting for the data to come in so that you can review results.

It’s part of the cycle of reviewing your data quality.

the data quality review cycle

The cycle works like this:

  • You analyze the traffic in your reports.
  • Then you identify anomalies in your traffic.
  • You determine the cause of those anomalies.
  • Next, you implement a fix for these problems.
  • Then, you document the flaws, using annotations or other records.
  • And you analyze your traffic again – a day, a week, a month later, to make sure your solutions are working.
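The “identify anomalies” step of the cycle above can be as simple as comparing each day to a typical day. Here is a minimal Python sketch of that idea. The daily session counts are invented, and the “three times the median” rule is an assumption of mine, not a Google Analytics feature.

```python
from statistics import median


def find_spikes(daily_sessions, multiple=3.0):
    """Return the days whose sessions exceed `multiple` times the median day."""
    baseline = median(daily_sessions.values())
    return [day for day, sessions in daily_sessions.items()
            if sessions > multiple * baseline]


# Hypothetical daily session counts, invented for illustration.
daily_sessions = {
    "day 1": 1000,
    "day 2": 1100,
    "day 3": 950,
    "day 4": 12400,
    "day 5": 1050,
}

spikes = find_spikes(daily_sessions)
print(spikes)
```

The median is used as the baseline because a single huge spike barely moves it, whereas it would drag an average upward.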

Hopefully, you can apply this method to your future data quality analysis.

Filtering bot traffic: Questions or comments

We did our best to be thorough in explaining how to filter bot traffic from Google Analytics, but every situation is different. So let us know how we can help.

Do you have questions about identifying or excluding bot traffic? Leave a comment below, and I’ll answer any questions you have.


This post and video were episode 67 in our 90 Day Challenge digital marketing series.


  • Bhavesh says:

    Hey,

    I need your help. Our website had an unnatural 12x spike in traffic last week. All the traffic was direct, had 100% bounce rate, happened at a particular time on a particular day, and all the traffic was from one country only and landed on one particular page. What could have caused it, and how can we filter it in Google Analytics and prevent it from happening again?

    How can I exclude bot traffic for a particular day, time, and country? Is it possible to do this in Google Analytics?

  • Isabelle Fleming says:

    Excellent article and I have done all of this. However, I still see a few problems in my analytics.
    Using Google Analytics User Explorer report, I have noticed that some users keep viewing the exact same page, sometimes day after day.
    As an example a unique client may have 9 sessions in a day with one page view, always the same one. Each time the session bounces. Why would someone look at the same page over and over again in the course of the day?
    Could it be bots? Could the user have a page of my website constantly open on his device – each time he refreshes his device, it refreshes the page and is counted as one session by GA?

    How to address this issue which distorts my analytics?

  • What if there are literally a thousand networks that have zero average session duration? Is there a way to filter them out in a single filter, or I just have to suffer and encode them manually?

    Thanks Jeff for the quality content!

    • Jeff Sauer says:

      You can write a regular expression to block multiple networks in one filter. But it will be too long of a string for 1,000 networks. So you would probably need to build multiple filters to accomplish this. But not 1,000.
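A rough Python sketch of the multi-filter approach from the reply above: escape each network name, join the names with “|”, and start a new pattern whenever the next name would push the string past a length limit. The 255-character limit is an assumption about the filter pattern field, so check it in your own account. The function names and network names are hypothetical.

```python
import re

MAX_PATTERN_LENGTH = 255  # assumed limit for one filter pattern field


def escape_for_filter(name):
    """Backslash-escape regex metacharacters so the name matches literally."""
    return re.sub(r"([.^$*+?()\[\]{}|\\])", r"\\\1", name)


def build_filter_patterns(networks, max_length=MAX_PATTERN_LENGTH):
    """Pack network names into as few "a|b|c" patterns as fit the limit."""
    patterns = []
    current = []
    for name in networks:
        escaped = escape_for_filter(name)
        if len(escaped) > max_length:
            raise ValueError(f"network name too long for one filter: {name!r}")
        candidate = "|".join(current + [escaped])
        if current and len(candidate) > max_length:
            # The next name does not fit, so close this pattern and start another.
            patterns.append("|".join(current))
            current = [escaped]
        else:
            current.append(escaped)
    if current:
        patterns.append("|".join(current))
    return patterns


# Hypothetical network names; a tiny limit is used here to force a split.
example = build_filter_patterns(["alpha net", "beta.net", "gamma isp"], max_length=20)
print(example)
```

Each string in the result would go into its own exclude filter. With short network names, a thousand networks should pack into far fewer than a thousand filters.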