Is there a way to prevent faked Google Analytics statistics by using PhantomJS and/or a ruby crawler like Anemone?

Our monitoring tool (which is based on both of them) crawls the sites from our clients and updates the link status of each link in a specific domain.

The problem is that this simulates huge traffic.

Is there a way to say something like "I'm a robot, don't track me" with a cookie, header or something?

(Adding the crawler IPs to Google Analytics as a filter may not be the best solution.)

Thanks in advance


Solution

I found a quick solution for this specific problem. The easiest way to exclude a crawler that executes JavaScript (like PhantomJS) from all Google Analytics statistics is to simply block the Google Analytics domains through /etc/hosts:

127.0.0.1    www.google-analytics.com
127.0.0.1    google-analytics.com

This is the simplest way to prevent fake data, and you don't have to add a filter to every client's Google Analytics account.
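As an alternative to editing /etc/hosts, the same effect can be achieved inside the PhantomJS script itself by aborting analytics requests before they are sent. A minimal sketch, assuming PhantomJS 1.9+ (where `networkRequest.abort()` is available); the URL check here is a plain substring match:

```javascript
// Returns true for requests going to Google Analytics.
function isAnalyticsRequest(url) {
  return url.indexOf('google-analytics.com') !== -1;
}

// Inside the PhantomJS script (only runs under PhantomJS itself):
// var page = require('webpage').create();
// page.onResourceRequested = function (requestData, networkRequest) {
//   if (isAnalyticsRequest(requestData.url)) {
//     networkRequest.abort(); // the GA hit never leaves the crawler
//   }
// };
```

This scopes the block to the crawler process instead of the whole machine, which helps if the same host also runs real browsers.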

( thanks for other answers )

Other tips

Joe, try setting up an advanced exclude filter: use the field "Browser" and in "Filter Pattern" enter the user agent name of your PhantomJS crawler (or any other user agent; look up the desired name in your Technology -> Browser and OS report).


IP filtering might not be sufficient, but filtering by user agent string (which can be set arbitrarily with PhantomJS) might be. That would be the "Browser" field in the filters.
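For the user-agent approach, PhantomJS lets you set the string via `page.settings.userAgent`. A sketch with a hypothetical bot name (pick any distinctive token you can then filter on in GA's Browser field):

```javascript
// Hypothetical user agent for the monitoring crawler (illustrative name).
var BOT_USER_AGENT = 'MyMonitoringBot/1.0';

// In the PhantomJS script:
// var page = require('webpage').create();
// page.settings.userAgent = BOT_USER_AGENT;
```

If the Ruby/Anemone side of the tool also fetches pages directly, Anemone's crawl options include a `user_agent` setting that can be given the same string.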

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow