Real life data in Google Analytics (crushing the monolith)

Google Analytics often gets bad press for providing shallow insights and reporting on vanity metrics.

After all, when you sign up for Google Analytics, you face dozens of standardised reports that may overwhelm you and fail to provide any truly actionable insights.

Not all is lost! Google Analytics has a neat little feature called custom definitions that can help you see some “real life” data – unique to your business – inside your Analytics and in the context of all other reports that Analytics offers.

The essence of custom definitions is to help you enrich data and get deeper insights. With custom definitions, you have complete control over the data that gets sent to your Analytics account. Examples? Common custom definitions include product weight category for an e-commerce business, pricing plan for a SaaS company, or ruling political party for a media company.

There are two types of custom definitions: custom dimensions and custom metrics.

As a reminder, dimensions (both regular and custom) are labels that describe data (e.g. city or source) and metrics are measurements (e.g. new users, avg. session duration). Reports are combinations of dimensions and metrics.

Once you properly set up custom dimensions and metrics, you can use them just like regular dimensions and metrics. This means you can apply them as a secondary dimension in reports, build advanced segments from them, create alerts, export data from Analytics to BigQuery, etc.

In the free version of Google Analytics, you can create up to 20 custom dimensions and 20 custom metrics per property.

This article is a tutorial based on a case study. Say you’re running a SaaS business, and that your lead signup form comes with the following fields: user email, company name, company size, and pricing plan. By using custom dimensions and metrics, we will turn form responses into new data available in your Analytics reports. Each form field other than email (we can’t pass personally identifiable information, or PII, to Google Analytics) will have a corresponding custom dimension, and the chosen pricing plan will also feed a custom metric. In the end we’ll have three new dimensions (company name, company size and pricing plan) and one new custom metric (monthly subscription revenue from a given pricing plan), ready to be used in Analytics.

Setting it up properly will allow you, for example, to see your regular Google Analytics acquisition report with pricing plan as a secondary dimension. We call such data “real life data” to distinguish it from the predefined metrics and dimensions available in Google Analytics.

In this article, I’ll show you step-by-step how to set up custom dimensions and metrics for this particular case study using Google Analytics and Google Tag Manager (using data layer).

Three steps to enrich Google Analytics with “real life data”

We need to do three things to make it work. 

First, we need to create our new dimensions and metrics in Google Analytics (actually we need to create new “buckets” or “labels” to hold new data). 

Second, we need to generate data for our new variables. We’ll do it by installing a script to write form response into so-called data layer variables, for later use by Google Tag Manager.

Third, we need to set up Tag Manager to read data from data layer variables and send it to Google Analytics.

If you don’t understand the specifics of each step at this point or don’t quite know what some details like data layer mean, don’t worry. I’ll explain everything in the article.

Anyway, that’s our game plan. Let’s do it!

Creating new data buckets in Google Analytics

Open the Admin panel of your Google Analytics property. Click Custom Definitions and then Custom Dimensions.

Click New custom dimension. We’ll set up the first bucket for a new, custom dimension.

Name your custom dimension and choose its scope. Please note that the name you put here is the name that will appear in your Google Analytics or Data Studio reports. Therefore, it’s best to use human-readable text such as Company Size or Company Name.

In case you don’t know what scope is, we’ve covered this topic at length in our article on building a data pipeline between Analytics and BigQuery. Check it out!

In our example, we will definitely go with the user scope. This is because our custom dimensions (company name, company size and pricing plan) are not session-level or hit-level attributes. They belong to individual users.

Click Create.

Once you create the custom dimension label, Google will hand you implementation codes, but you can ignore them. We will set it all up using Tag Manager in the next step.

Repeat these steps for all other custom dimensions you want to add. Always remember to choose the appropriate scope. In our example, we’ll do the same for Company Name and Pricing Plan.

The end result for our dimensions is as follows.

Note that each custom dimension has its own index. Note them – you’ll use these indexes later in Tag Manager (any ideas why? Give it a thought!).

How about custom metrics? We’ll use one to track monthly revenue from a given pricing plan. With custom metrics, you can choose one of two scopes: hit or product. A hit-scoped custom metric represents a measurement linked to one particular hit. You can attach it to a pageview or an event. The product scope is reserved for tracking more details about products in an e-commerce transaction. Clearly, we need to go with the hit scope for our custom metric.

We’re now set with preparing our Google Analytics labels to receive new data. 

To check that everything works, open a custom Analytics report or any other report that lets you apply secondary dimensions, such as the Source/Medium report in the Acquisition section. Click Secondary Dimension. Look for the Custom Dimensions category. You should see your newly created dimension labels there. Take a look.

Try clicking on any of these dimensions, for example Company Size. You will see a blank report. No data. 

Why is that? 

That’s because all you did was create labels in your Google Analytics. There’s no actual data yet. For these dimensions to display any data, you first need to generate the data and pass it to Google Analytics.

Let’s do it now!

Generating real life data

Before we dive into the data layer and extracting data from user activity, here is some background.

As a data-driven marketer, you’re always on the lookout to track yet another user interaction to get a fuller picture of your marketing engine. A bit of code here, a bit of code and marketing pixels there, and suddenly your developer gives up on carrying this burden of external tools firing everywhere. You don’t want this!

Tools like Google Tag Manager help you control this (potential) mess. The idea of Google Tag Manager is super simple. You install the container script only once and then set up any tracking rules through a central, web-based Tag Manager interface.

Another benefit of Google Tag Manager is that once you add the container script, it also creates something called a data layer. The data layer is essentially an object (simply a piece of code that keeps variables and their values) that passes information from your website to Google Tag Manager. The data layer carries some predefined variables, which is useful, but its real strength is that it can also include other variables that you define yourself.

With data layer, you can pass all sorts of additional data to Tag Manager and later to other tools. There are three main sources of such data: user activity, internal database, and external data.

User activity includes form activity, browsing behaviour or other content-based interactions on the website. An internal database can be used to pass data related to products, lifetime value, relationships with your customers, account representatives etc. Finally, external data can be of value in understanding the context of your users in your future, enriched reports. You can even use it to store things such as the weather on a given day (as a custom dimension), whether or not a user follows you on Twitter, or how many followers someone has on Instagram (as a custom metric). But that’s probably quite extreme…

Let’s come back to our example. There are two main ways to provide information to Google Tag Manager. The first approach is to use a data layer declaration. In our example, it would look like this.

dataLayer = [{
'pPlan': '3',
'cSize': '25',
'cName': 'First Party',
'mSales': '500'
}];

Normally you would put this piece of code before your Tag Manager container script. You have four properties, each corresponding to one custom variable we’re installing. Each property has its own value. Data layer property names should be valid JavaScript names and should not conflict with Tag Manager’s reserved names like event. Other than that, you can name these variables however you want!

Your goal is to fill these properties with data. To do that, you need to pull the data from an appropriate place on your website. In our case, the source of data will simply be the signup form. The specifics of the implementation will depend on your website. You would normally ask your developer to write a script that pulls data from the form or another place (e.g. a database) and passes the relevant data to the data layer variables.
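As a rough illustration of what such a script might do, here is a minimal sketch that turns raw form values into the four data layer properties. The function name buildLeadData, the field names on the input object and the plan-to-revenue table are all assumptions made for this example; your own form and prices will differ. The email address is deliberately left out, and an empty company name is left undefined so that a default value can be applied later in Tag Manager.

```javascript
// Hypothetical price list: monthly revenue per pricing plan ID.
// These numbers are placeholders, not real prices.
const MONTHLY_REVENUE_BY_PLAN = { '1': '100', '2': '250', '3': '500', '4': '900' };

// Turn raw signup form values into data layer properties.
// The email field is ignored on purpose: PII must not reach Google Analytics.
function buildLeadData(formValues) {
  const plan = String(formValues.pricingPlan);
  const data = {
    pPlan: plan,
    cSize: String(formValues.companySize),
    mSales: MONTHLY_REVENUE_BY_PLAN[plan] || '0'
  };
  // Only set cName when the optional field was actually filled in.
  const name = String(formValues.companyName || '').trim();
  if (name) {
    data.cName = name;
  }
  return data;
}

// Example input, as it might be read from the form fields.
const leadData = buildLeadData({
  email: 'jane@example.com',
  companyName: ' First Party ',
  companySize: 25,
  pricingPlan: 3
});
console.log(leadData);
// { pPlan: '3', cSize: '25', mSales: '500', cName: 'First Party' }
```

The returned object has the same shape as the data layer declaration shown earlier, so it can be used with either of the two installation approaches.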

There’s also another way to install this. Instead of putting the code before Tag Manager container script, you can fire the following code upon the form submission or button click.

dataLayer.push({'pPlan': '3','cSize': '25','cName': 'First Party', 'mSales': '500'});

Which approach you use is up to you, but the second is now the recommended one.
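One detail worth handling with the push approach: the dataLayer array may not exist yet if your code runs before the container script has loaded. A common defensive pattern is to create the array when it is missing. The sketch below wraps that pattern in a hypothetical helper called pushLeadData, which takes the window object as a parameter so that it can also be tried outside a browser.

```javascript
// Push lead data to the data layer, creating the array first if needed.
// `win` is the browser's window object (passed in so the helper is testable).
function pushLeadData(win, leadData) {
  win.dataLayer = win.dataLayer || [];
  win.dataLayer.push(leadData);
  return win.dataLayer;
}

// Stand-in for `window` so the sketch also runs outside a browser.
const fakeWindow = {};
pushLeadData(fakeWindow, { pPlan: '3', cSize: '25', cName: 'First Party', mSales: '500' });
console.log(fakeWindow.dataLayer.length); // 1
```

In the browser you would call pushLeadData(window, ...) from the form’s submit handler or the button’s click handler.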

So what happens here? When someone fills in the form, the script saves the responses to data layer variables, which Tag Manager later reads and passes to Google Analytics. Easy!

Closing the loop with Tag Manager

Now that we have the code installed and grabbing the data we want, it’s time to tell Tag Manager to send the data to Google Analytics.

GTM 101. Google Tag Manager consists of tags, triggers and variables. Tags are just marketing scripts that fire when a given set of triggers is met. Variables help you make your work reproducible.

In the previous step, we’ve walked through a short piece of code that saved form data to variables in the data layer.

Now in Tag Manager, our goal will be to read values from the variables and pass them onto Google Analytics.

First, let’s create new variables in Tag Manager. Each Tag Manager variable needs to point to exactly the same property name that we used in the data layer script installed in the website code. Only then will we be able to do anything with the data.

For that, click Variables in your Google Tag Manager console. You should see two subsections titled Built-in variables and User-defined variables.

We’ll be defining our own variables which are not available by default to other Tag Manager users. Click New in the User-defined variables section.

Give your new variables short titles, e.g. Company Size, Company Name, etc.

Click on the Variable Configuration section. You should see a lot of variable types to choose from.

Remember, we’re saving lead form responses to the properties of the data layer object provided to us by Google Tag Manager. Therefore, choose Data Layer Variable as the variable type.

In the Data Layer Variable Name field, put down the name of the property from the data layer that corresponds to the custom dimension or metric you’re setting up.

This is the variable we’ll read from. It has to be exactly the same as the variable name you’ve used earlier in the data layer script.

In our example, the data layer variable corresponding to Company Size is cSize, to Company Name is cName, to Pricing Plan is pPlan, and to monthly subscription value is mSales.

Set up a new Tag Manager variable for each data layer variable.

In the Data Layer Version field, choose Version 2. Version 1 should be treated as deprecated by now.
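One practical difference between the two versions is how dots in a variable name are read: Version 1 treats the whole name as a single literal key, while Version 2 reads dots as a path into nested objects. The sketch below is a simplified model of that behaviour, not Tag Manager’s actual code; the function names and the nested company object are made up for illustration. Flat names such as cSize or pPlan resolve the same way in both versions.

```javascript
// Version 1 style: the whole name is treated as one literal key.
function readVersion1(model, name) {
  return model[name];
}

// Version 2 style: dots in the name are read as a path into nested objects.
function readVersion2(model, name) {
  return name.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]),
    model
  );
}

// Hypothetical data layer contents with a nested `company` object.
const model = { pPlan: '3', company: { size: '25' } };

console.log(readVersion2(model, 'pPlan'));        // '3'
console.log(readVersion2(model, 'company.size')); // '25'
console.log(readVersion1(model, 'company.size')); // undefined
```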

The default value option is useful if a corresponding field in your lead form is optional. Some people will naturally leave it blank. Put down some arbitrary default value for such cases. The resulting reports in Google Analytics will be more complete.

In our example, let’s assume that the company name field is optional in our signup form. Let’s set the default value to be undefined. This value will be passed to Google Analytics reports if someone does not fill in the company name field.

Finally, if you expect the form responses to differ significantly among the users, you can try using some predefined formatting rules. This can be especially useful for string inputs. You’ll see various formatting options in Change Case dropdown list.
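To picture what the default value and the Change Case option do together, here is a small sketch that mimics them in plain JavaScript. It is only a stand-in for Tag Manager’s own logic: the function readVariable and its options object are hypothetical, and lowercase is just one of the formatting choices you might pick.

```javascript
// Simplified stand-in for a Data Layer Variable with a default value
// and an optional "change case" formatting step.
function readVariable(dataLayerValues, key, options = {}) {
  let value = dataLayerValues[key];
  // Fall back to the default when the property was never set.
  if (value === undefined && 'defaultValue' in options) {
    value = options.defaultValue;
  }
  // Normalise text so that "First Party" and "first party" end up identical.
  if (typeof value === 'string' && options.changeCase === 'lowercase') {
    value = value.toLowerCase();
  }
  return value;
}

const settings = { defaultValue: 'undefined', changeCase: 'lowercase' };
console.log(readVariable({ cName: 'First Party' }, 'cName', settings)); // 'first party'
console.log(readVariable({}, 'cName', settings));                        // 'undefined'
```

Normalising case this way keeps one company from showing up as several separate rows in your reports just because people typed its name differently.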

The final variable set-up for the Company Name looks as follows.

Repeat setting up such Tag Manager variables for the remaining variables used in the data layer.

Click Tags.

Before we go further, let’s pause and do a quick wrap-up. We now have new labels or buckets in Google Analytics. We have a way to generate and save new data to data layer variables. We created new variables in Tag Manager.

Now we need to use our new variables to pass data to Google Analytics. As a reminder, custom dimension and metric data gets extracted from individual user hits. Therefore, in order to send these additional bits of data to Google Analytics, we need to “attach” them to one of the Analytics tags. It can be a pageview or an event tag. In this tutorial, I’ll show you how to do it using a pageview tag. Remember that user-scoped or session-scoped custom dimensions should only be sent once.
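One possible way to respect the “send once” rule on the website side is to remember that the values have already been pushed and skip any later pushes. The sketch below is an assumption about how you might do that, not something Tag Manager does for you: the helper pushOnce and the storage key leadDataSent are hypothetical, and the in-memory storage object only imitates window.localStorage so the example runs outside a browser.

```javascript
// Push user-scoped values only if they have not been pushed already.
// `storage` is anything with getItem/setItem, e.g. window.localStorage.
function pushOnce(dataLayer, storage, leadData) {
  const FLAG = 'leadDataSent'; // hypothetical storage key
  if (storage.getItem(FLAG) === '1') {
    return false; // already sent, skip
  }
  dataLayer.push(leadData);
  storage.setItem(FLAG, '1');
  return true;
}

// Minimal in-memory stand-in for localStorage.
const memory = new Map();
const storage = {
  getItem: (key) => (memory.has(key) ? memory.get(key) : null),
  setItem: (key, value) => memory.set(key, String(value))
};

const dataLayer = [];
console.log(pushOnce(dataLayer, storage, { pPlan: '3' })); // true
console.log(pushOnce(dataLayer, storage, { pPlan: '3' })); // false
console.log(dataLayer.length);                             // 1
```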

Open your GTM web panel again.

Click Tags and open Universal Analytics Pageview Tag. If you don’t have one, set it up.

Click Enable overriding settings in this Tag. Then click More Settings and finally Custom Dimensions.

This is where you put down your custom dimension variables and their corresponding Google Analytics indexes. Add your variable by clicking the + icon and choosing it from the list.

Once you add custom dimensions and metrics as well as their indexes, your edited Pageview tag should look like this.

That’s it. Now let’s see if it actually works. For that, save your edited Pageview Tag and go back to the main Tag Manager console.

Click “Preview” on the top bar. This will launch a preview mode in your Tag Manager account and will let you see if your tags work.

Open your website. You should see a wide rectangle at the bottom of your page with tags.

Visit the page with your lead form.

Fill in the form and send it.

Now click into the details of your Pageview Tag in the preview mode:

If everything is correct, you should see your data layer variables with values from your lead form.

These values will be read by Tag Manager, appended to Pageview Tag and sent to Google Analytics when the trigger is fired.

How does Google Analytics know how to label this new data? The answer is the custom dimension or metric index, which you saw earlier in Google Analytics and then also used in Tag Manager.
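To make the role of the index concrete: in a Universal Analytics hit, a custom dimension travels as a cd parameter followed by its index, and a custom metric as a cm parameter followed by its index. The sketch below builds that part of a hit from our data layer values. The index assignments and the function name toHitParams are assumptions for this example; use the indexes shown in your own Google Analytics admin panel.

```javascript
// Hypothetical index assignments; check your own property for the real ones.
const DIMENSION_INDEX = { cSize: 1, cName: 2, pPlan: 3 };
const METRIC_INDEX = { mSales: 1 };

// Build the custom-definition part of a Universal Analytics hit.
// Custom dimensions are sent as cd<index>, custom metrics as cm<index>.
function toHitParams(values) {
  const params = {};
  for (const [key, index] of Object.entries(DIMENSION_INDEX)) {
    if (values[key] !== undefined) {
      params['cd' + index] = String(values[key]);
    }
  }
  for (const [key, index] of Object.entries(METRIC_INDEX)) {
    if (values[key] !== undefined) {
      params['cm' + index] = String(values[key]);
    }
  }
  return params;
}

const hitParams = toHitParams({ pPlan: '3', cSize: '25', cName: 'First Party', mSales: '500' });
console.log(hitParams);
// { cd1: '25', cd2: 'First Party', cd3: '3', cm1: '500' }
```

The hit itself never contains the name “Company Size”. Google Analytics looks up the label from the index, which is why a wrong index in Tag Manager sends data into the wrong bucket.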

If everything works correctly, go back to Tag Manager and publish your changes: a bunch of new variables and updated Pageview Tag.

In our example, company size and pricing plan are categorical variables, even though they look like numbers.
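A quick way to see the difference between a label and a measurement is to sort the same values both ways. This sketch uses made-up company sizes and only illustrates the general idea that dimension values behave like text, while metric values behave like numbers.

```javascript
// Company sizes as they would arrive in a custom dimension: as text.
const companySizes = ['25', '100', '5'];

// Sorted as text (how labels behave): '100' comes before '25'.
const sortedAsText = [...companySizes].sort();

// Sorted as numbers, which only makes sense for real measurements.
const sortedAsNumbers = [...companySizes].map(Number).sort((a, b) => a - b);

console.log(sortedAsText);    // [ '100', '25', '5' ]
console.log(sortedAsNumbers); // [ 5, 25, 100 ]
```

This is also why monthly subscription value is set up as a custom metric rather than a dimension: it is something you want to sum and compare, not just group by.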

Playing with new data: custom dimensions as secondary dimensions

Our implementation is now finished. From now on, if someone fills in the lead form on your website, information about their company size, name, pricing plan, and monthly subscription value will be passed to Google Analytics. And that means you can finally start using this new data to get deeper insights!

The first thing we can do with our new custom dimensions is to apply them as secondary dimensions.

In our example, it will look like this:

If we want to see marketing sources broken down by Pricing Plan, for example, all we need to do is apply the “Pricing Plan” custom dimension. This will result in the following.

This report tells us that most users from organic and direct traffic chose pricing plan identified as 4.

Building more advanced segments

Let’s say you want to analyse the demographics and locations of users who chose the third pricing plan and have a company size of 25 employees. Advanced segments built out of custom dimensions can help here.

To build your advanced segment, open the Admin console in Google Analytics and click Segments under the View column.

Click New Segment. You should now see a segment builder.

Name your segment and click Conditions under Advanced.

Choose your Pricing Plan custom dimension and make it exactly match “3”. Now let’s narrow down further. Choose your Company Size dimension and set it equal to 25. 
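The two conditions combine with a logical AND: a user must match both to land in the segment. As a sketch of that logic, here is the same rule applied to a few made-up user rows. The rows, the field names and the function inSegment are hypothetical and exist only to illustrate how the segment narrows the audience.

```javascript
// Hypothetical user rows, each carrying the two custom dimension values.
const users = [
  { id: 'u1', pricingPlan: '3', companySize: '25' },
  { id: 'u2', pricingPlan: '3', companySize: '100' },
  { id: 'u3', pricingPlan: '4', companySize: '25' }
];

// Both conditions must hold (logical AND), mirroring the two segment rules.
function inSegment(user) {
  return user.pricingPlan === '3' && user.companySize === '25';
}

const segment = users.filter(inSegment);
console.log(segment.map((user) => user.id)); // [ 'u1' ]
```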

You should now have your advanced segment ready to be applied. It should look like this.

We can now apply this segment to any behavioral report in our Analytics view. Let’s look into the geography of this segment.

For that, click the Audience report, then click Geo and Location. You should see this. 

To apply your new Segment, click Add Segment.

Find your new segment in the list and apply it. The new segment should now appear at the top, marked in orange.

As you can see, only about 4% of users qualify for your new segment. By running the reports with All Users segment turned on, we can see the relative demographics of these two groups. 

Let’s scroll down to the actual report.

As you can see by comparing these two segments, users who qualify for our unique segment account for a minority of the users, but they are more engaged in terms of time on site, pages per session and bounce rate.

Creating better user alerts

Let’s now set up an automated alert in Google Analytics that fires if we have fewer than 15 users with Pricing Plan equal to 3 and Company Size equal to 25 over the period of a month.

To set it up, go to your Admin panel and choose Custom Alerts in the View column.

Name your custom alert and mark when it should fire. Also, give the email address for alert notifications. The end result should look as follows.

Diving deeper into Analytics and other tools

In this post, my goal was to help you reduce the annoying disconnect between real-life and Google Analytics data.

I think the best strategy these days is to optimize for making sure that the insights you get from data are as actionable as possible.

Enriching data, or – what we’ll do in future posts – integrating and unifying data, is certainly a good step in that direction.