Google Analytics sampling limitations and how to overcome them

Post on 20-Jan-2015


description

Google Analytics works well for small and medium-sized websites, but as data volume grows, some reports are sampled. This presentation covers the most common sampling issues and how to overcome them.

Transcript of Google Analytics sampling limitations and how to overcome them

Google Analytics sampling limitations

and how to overcome them

George Papadongonas, Web Analyst, Amazee Metrics

16/7/2013

How Google Analytics stores data

• All unfiltered data of a web property (up to 10 million hits per month) are stored in the Google Analytics database

• Each standard report has an associated data table in the Google Analytics database with unsampled data

• Reports for accounts with more than 200,000 visits per day are processed once a day

• Reports for accounts with fewer than 200,000 visits per day are processed more often


Google Analytics Report Sampling

• Sampling starts when the requested date range contains more than 500,000 visits

• The sample size can be adjusted with the Google Analytics sampling slider. The default setting is 250,000 visits; the maximum is 500,000

• Visits are counted for the specific date range at the web property level, not at the profile level

• Standard reports without filters, advanced segments or secondary dimensions always use unsampled data

• Sampling applies to custom reports


Organic Search traffic report is unsampled


Adding a secondary dimension triggers sampling


Adjust the sampling size


Prefer the higher precision setting


Google Analytics Report Sampling


How sampling is calculated

• Web property has 24,580,303 visits

• Profile has 492,786 visits

• Default sampling is: 250,000 × 492,786 / 24,580,303 = 5,012 visits (1% of the profile)

• Maximum sampling is: 500,000 × 492,786 / 24,580,303 = 10,024 visits (2% of the profile)
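The arithmetic above can be reproduced with a few lines of JavaScript. This is only a sketch of the slide's calculation: the function name `estimateProfileSample` is made up for illustration, and it assumes the sample is drawn at the web property level and that the profile receives a proportional share of it.

```javascript
// Estimate how many of a profile's visits end up in the sample.
// Assumption: the sample is taken at property level and each profile
// gets a share proportional to its visits, as in the example above.
function estimateProfileSample(sampleSetting, profileVisits, propertyVisits) {
  const sampledVisits = Math.round(sampleSetting * profileVisits / propertyVisits);
  const shareOfProfile = sampledVisits / profileVisits;
  return { sampledVisits, shareOfProfile };
}

const propertyVisits = 24580303;
const profileVisits = 492786;

const defaultSample = estimateProfileSample(250000, profileVisits, propertyVisits);
const maximumSample = estimateProfileSample(500000, profileVisits, propertyVisits);

// Prints the sampled visits and the share of the profile they represent.
console.log(defaultSample.sampledVisits, (defaultSample.shareOfProfile * 100).toFixed(1) + '%');
console.log(maximumSample.sampledVisits, (maximumSample.shareOfProfile * 100).toFixed(1) + '%');
```

With these numbers, a report on the profile is extrapolated from roughly one or two visits in every hundred, which is why small segments become unreliable.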


Avoid the faster processing setting


Prefer the higher precision setting


Pageviews or events reports can be unreliable


Google Analytics Report Sampling

• Visits and Visitors reports are usually reliable, even with a small sample

• E-commerce transactions, individual pageviews, AdWords data, revenue and goal conversions are less reliable


Solutions


1. Buy Google Analytics Premium

• “Only” $150,000 / year

• 1 billion hits processed per month

• Unsampled reports

• Data processing every 4 hours


2. Create custom profiles

• Instead of creating reports with specific advanced segments, create custom profiles using filters

• The default reports of every profile are always unsampled, even when the date range has more than 500,000 visits


3. Enable Data Sampling

• Sample your data by adding a line to the Google Analytics tracking code:

_gaq.push(['_setSampleRate', '80']);

This sets the sampling rate to 80%.

• Not a perfect solution, as the data are still sampled, but you control the sampling rate and can avoid tracking interruptions on properties that exceed 10 million hits per month
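One way to pick the value passed to `_setSampleRate` is to work back from the 10 million hits per month limit mentioned above. The helper below is a rough sketch, not part of Google Analytics: `suggestSampleRate` is a hypothetical name, and it assumes hits fall in proportion to the sample rate, which is only approximately true because sampling is applied per visitor rather than per hit.

```javascript
// Suggest a whole-number sample rate (in percent) that keeps the expected
// monthly hits under a limit. The 10 million default comes from the slides;
// proportional reduction of hits is an assumption.
function suggestSampleRate(expectedMonthlyHits, monthlyLimit = 10000000) {
  const rate = Math.floor(monthlyLimit * 100 / expectedMonthlyHits);
  return Math.min(100, rate);
}

// Build the command array that would be pushed onto _gaq in the tracking code.
function sampleRateCommand(expectedMonthlyHits) {
  return ['_setSampleRate', String(suggestSampleRate(expectedMonthlyHits))];
}

console.log(sampleRateCommand(12500000));
```

For a site expecting 12.5 million hits per month this yields the 80% rate used in the example above; a site below the limit gets 100, meaning no sampling.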


4. Use smaller date range

• Break your report into smaller date ranges, each with fewer than 500,000 visits

• This ensures that the data are unsampled

• Export the reports using the Google Analytics API

• Aggregate the data in Excel and create the master report
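The split-and-aggregate steps above can be sketched as follows. The code deliberately leaves out the Google Analytics API call itself: `splitDateRange` and `mergeReports` are illustrative names, and each chunk's report is assumed to arrive as an array of `[dimension, visits]` rows. Choosing a chunk length that keeps every range under 500,000 visits is left to the reader.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Split an inclusive YYYY-MM-DD date range into consecutive chunks of at
// most chunkDays days. Dates are handled in UTC to avoid DST surprises.
function splitDateRange(startDate, endDate, chunkDays) {
  const ranges = [];
  const end = Date.parse(endDate + 'T00:00:00Z');
  let cursor = Date.parse(startDate + 'T00:00:00Z');
  while (cursor <= end) {
    const chunkEnd = Math.min(cursor + (chunkDays - 1) * DAY_MS, end);
    ranges.push({
      start: new Date(cursor).toISOString().slice(0, 10),
      end: new Date(chunkEnd).toISOString().slice(0, 10),
    });
    cursor = chunkEnd + DAY_MS;
  }
  return ranges;
}

// Sum an additive metric (for example visits) per dimension value across
// the reports exported for each chunk.
function mergeReports(reports) {
  const totals = new Map();
  for (const rows of reports) {
    for (const [dimension, visits] of rows) {
      totals.set(dimension, (totals.get(dimension) || 0) + visits);
    }
  }
  return totals;
}

const ranges = splitDateRange('2013-01-01', '2013-01-31', 7);
console.log(ranges);
```

Only additive metrics such as visits or pageviews can be summed this way; unique visitors counted per chunk would be double-counted in the total.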


5. Use analyticscanvas.com

• Analytics Canvas offers query partitioning, using the Google Analytics API.

• Reports are exported in smaller date ranges so that they remain unsampled, and Analytics Canvas then merges them automatically.


6. Download Google Analytics data locally

• It is possible to keep a local copy of Google Analytics data

• Add a line to the Google Analytics tracking code:

_gaq.push(['_setLocalRemoteServerMode']);

• Add __utm.gif to your web server root


6. Download Google Analytics data locally

86.138.209.96 www.mysite.com - [01/Oct/2007:03:34:02 +0100] "GET /__utm.gif?utmwv=1&utmt=var&utmn=2108116629 HTTP/1.1" 200 35 "http://www.mysite.com/pageX.htm" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)" "__utma=1.117971038.1175394730.1175394730.1175394730.1; __utmb=1; __utmc=1; __utmz=1.1175394730.1.1.utmcid=23|utmgclid=CP-Bssq-oIsCFQMrlAodeUThgA|utmccn=(not+set)|utmcmd=(not+set)|utmctr=looking+for+site; __utmv=1.Section One"
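Each tracked hit appears in the log as a request for `/__utm.gif` with the tracking parameters in the query string. As a minimal sketch of how such a request could be read back, the snippet below parses the request path from the sample entry above; `parseUtmRequest` is a hypothetical helper, and extracting the path from the full log line is not shown.

```javascript
// Request path copied from the sample log entry above.
const requestPath = '/__utm.gif?utmwv=1&utmt=var&utmn=2108116629';

// Return the tracking parameters of a __utm.gif request as a plain object,
// or null when the request is for some other resource.
function parseUtmRequest(path) {
  // The log stores only path and query, so a dummy base URL is supplied.
  const url = new URL(path, 'http://localhost');
  if (url.pathname !== '/__utm.gif') {
    return null;
  }
  return Object.fromEntries(url.searchParams.entries());
}

const hit = parseUtmRequest(requestPath);
console.log(hit);
```

Requests for ordinary pages return `null`, so the same function can be used to filter tracking hits out of a mixed access log.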


• Data are recorded in the server log files

• Use http://analytics.angelfishstats.com/ to analyze them, as Urchin has been discontinued

Thanks!
