In previous posts I argued that an entity is a thing that’s of special interest to the business. For ecommerce, an obvious one is product (but it is certainly not the only one). An entity at its core is an object, but a special kind of object by virtue of its importance to the business.
We’ve also looked at how interactions are universally made up of the same basic components — subject, verb, object, context. We’ve seen that each of these components has its own dictionary. It’s all very conceptual so far.
The first step to putting this into practice using Google Tag Manager is working out whether a page is a business entity or an object.
Well, it’s neither.
You’d think that a page is an object at least, but that would be wrong too. And that simple statement has dramatic consequences for how we design the page dictionary.
Take the most common analytics interaction of all, the pageview. If we were to map it to our event grammar, this is what we’d come up with:
user sees page
Except that’s wrong. The user doesn’t see the page, they see the objects and entities that are on the page. You see, the page is nothing more than a container of stuff that is shown to the user. Therefore we design its dictionary accordingly.
The clue is in the page naming conventions
Think about it. Setting up decent naming conventions for pages is a core task in any Google Analytics setup. And how is that done?
| Pages | Assigned Page Grouping | Additional Page Groupings |
| --- | --- | --- |
| product pages | Products | By product type, brand, price, category, etc |
| product category pages | Category Pages | By category level, name, type, etc |
Notice that what we’ve actually been doing is working out what business entity the page is about and then assigning a page classification that reflects that entity. Not just that, but additional classifications are made according to attributes belonging to these entities. Nothing in this classification is inherently about the page itself.
Let me say it again. A page isn’t an object, it’s a container of objects (some important, some not so much) that the user sees. A mere vehicle for all of the entities and objects the user interacts with. So its dictionary would look something like this:
    page:
      main_entity:
        type: product
        id: 354
      assets:
        product_collection_recently_viewed: <product collection dict>
        ...
        product_354: <product dict>
        product_987: <product dict>
        ...
        cta_newsletter: <call to action dict>
        promo_spring2014: <product collection dict>
Therefore our humble pageview interaction becomes:
user is primarily shown product 354 and they are also shown product collection “recently viewed”, cta about “newsletter” and the “spring2014” promo
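As a rough sketch of how that reads in code, the snippet below derives the “shown” interaction from a page dictionary. JavaScript is used because that is what Google Tag Manager runs. The `describePageview` function and the `subject`, `verb`, `primary` and `secondary` field names are assumptions made for illustration, not part of the framework described so far.

```javascript
// Hypothetical page dictionary, mirroring the structure of the one above.
// The nested dictionaries are stubbed out as empty objects.
const page = {
  main_entity: { type: "product", id: 354 },
  assets: {
    product_collection_recently_viewed: {}, // product collection dict
    product_354: {},                        // product dict
    product_987: {},                        // product dict
    cta_newsletter: {},                     // call to action dict
    promo_spring2014: {},                   // product collection dict
  },
};

// Build a "user is shown ..." interaction from the page dictionary.
function describePageview(pageDict) {
  const { type, id } = pageDict.main_entity;
  const primary = `${type}_${id}`;
  return {
    subject: "user",
    verb: "shown",
    primary,
    // Everything else on the page was shown too, just not as the main entity.
    secondary: Object.keys(pageDict.assets).filter((key) => key !== primary),
  };
}

const pageview = describePageview(page);
console.log(pageview.primary, pageview.secondary);
```

The page itself never appears as the object of the interaction. It only supplies the list of things that were shown.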
Notice that I used the word “shown”. This was a deliberate choice. “Sees” implies an active action on the part of the user. In reality, the pageview is simply the consequence of an active click on a link that occurred on the previous page. Which ties in nicely with my next point.
Why does this matter?
Increased need to track impressions
Enhanced Ecommerce has built-in support for product impressions but other entities need impression measurement, too. In order to measure the end-to-end success of an entity (be it a product collection or something else), we also need to track when it was shown to the user. That’s essential context in analysis. After all, if the user never saw it, how can we possibly interpret interactions (or the lack of them) with it?
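To make the point concrete, here is a minimal sketch of why impressions matter as a denominator. The event shape (`verb`, `object`) and the `interactionRate` helper are hypothetical, and the numbers are made up purely for illustration.

```javascript
// Hypothetical event log: each entry says what happened to which asset.
const events = [
  { verb: "shown", object: "cta_newsletter" },
  { verb: "shown", object: "cta_newsletter" },
  { verb: "shown", object: "cta_newsletter" },
  { verb: "shown", object: "cta_newsletter" },
  { verb: "click", object: "cta_newsletter" },
  { verb: "click", object: "promo_spring2014" }, // no impression recorded
];

// Share of impressions that led to the given interaction.
// Returns null when the asset was never recorded as shown,
// because the rate cannot be interpreted without impressions.
function interactionRate(log, object, verb) {
  const shown = log.filter((e) => e.object === object && e.verb === "shown").length;
  if (shown === 0) return null;
  const acted = log.filter((e) => e.object === object && e.verb === verb).length;
  return acted / shown;
}

console.log(interactionRate(events, "cta_newsletter", "click"));   // 0.25
console.log(interactionRate(events, "promo_spring2014", "click")); // null
```

One click on the newsletter call to action means something once we know it was shown four times. The promo click, with no recorded impressions, cannot be put in context at all.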
Using the page for dictionary lookups
When a website page loads, so do all of its assets. It’s how the web works. We can replicate that behaviour with respect to dictionaries, too. On page load we also load dictionaries for all entities and objects on the page. This creates a dictionary of dictionaries that we simply tap into whenever an interaction occurs on the page. We grab from it what we need to complete our interaction dictionary before we pass it to Google Tag Manager. Neat!
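A minimal sketch of that idea, assuming the page dictionary is available as a plain JavaScript object when the interaction fires. The `completeInteraction` helper, the `object_details` key, the sample attributes, and the local `dataLayer` array (standing in for GTM's global `window.dataLayer`) are all assumptions for illustration.

```javascript
// Stand-in for GTM's global dataLayer array so the sketch runs outside a browser.
const dataLayer = [];

// Hypothetical dictionary of dictionaries, loaded once with the page.
const pageDictionary = {
  assets: {
    product_354: { name: "Example product", price: 34 },
    cta_newsletter: { label: "Sign up" },
  },
};

// Merge what the page already knows about the asset into the interaction
// dictionary, then hand the completed dictionary to the dataLayer.
function completeInteraction(interaction, pageDict, layer) {
  const details = pageDict.assets[interaction.object] || {};
  const completed = { ...interaction, object_details: details };
  layer.push(completed);
  return completed;
}

completeInteraction(
  { event: "interaction", subject: "user", verb: "add_to_wishlist", object: "product_354" },
  pageDictionary,
  dataLayer
);
console.log(dataLayer[0].object_details.name);
```

The click handler only needs to know which asset was touched. Everything else about that asset is already sitting in the page dictionary.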
Automatic page classifications using lookups in the dictionary of dictionaries
Let’s picture the following scenario. A user clicks on a link and is taken to a product page. The page loads all assets (required for functionality) and all asset dictionaries. A myriad of user-driven interactions can occur on this page. Some may be directly related to the product (e.g. add to wishlist), others might be directly related to another business entity (e.g. click on link related to
In a previous post we established that every interaction “recipe” will include a reference to the business entity it primarily relates to. We do this by including the following in the interaction dictionary:

    ...
    main_entity:
      type: product
      id: 354
    ...
Once we’ve got that, the sequence of steps is more or less as follows (simplified):
Interaction has occurred
An interaction dictionary was passed to Google Tag Manager’s dataLayer. See “How it works in practice” for details on this step.
Interaction is linked to a specific entity instance
GTM specifically looks for an entity id somewhere in this dictionary (I’ll discuss where this should go in a future post). This is how Google Tag Manager knows what the interaction was really about.
Look for matching dictionary
GTM then looks in the page.assets area of the page’s own dictionary to see whether a dedicated dictionary matching our entity type and id exists. Since we’ve designed assets as a keyed collection of objects (rather than an array), we can access the dictionary directly by its key, for example product_354. This means that we’re not flicking through assets trying to find a match. We “open” the dictionary at the precise location where we know the dictionary should exist. That makes for very fast lookups.
Unpack the found dictionary
Having found a dictionary for a product with id 354, GTM passes it as a whole to the dataLayer, where it’s unpacked into individual attributes like name, price, cost, etc. These then make their way through to whatever analytics tags require them.
Automatic page classification based on business entity. Magic!
When GTM finds the dictionary in page.assets, it also specifically looks for a classifications cluster of attributes within. Because our page dictionary told us that the page is primarily about a product, we know that the product’s classification is the most appropriate way to classify the page itself.
GTM therefore transfers the classifications branch to the dataLayer and unpacks it into different levels of classifications, which are then assigned to the page.
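The steps above can be sketched roughly as follows. This is not how GTM is configured in practice (that would involve variables and tags); it only illustrates the lookup logic in plain JavaScript. The `resolveInteraction` function, the `page_classifications` output key, the layout of `classifications` (`level1`, `level2`), and the product attributes are hypothetical.

```javascript
// Hypothetical page dictionary with assets keyed by "<type>_<id>".
const productPage = {
  main_entity: { type: "product", id: 354 },
  assets: {
    product_354: {
      name: "Example product",
      price: 34,
      classifications: { level1: "Products", level2: "Example brand" },
    },
    product_987: { name: "Another product", price: 87 },
  },
};

function resolveInteraction(interaction, pageDict) {
  // Read the entity reference from the interaction dictionary.
  const { type, id } = interaction.main_entity;

  // Direct key lookup: no iteration over the assets.
  const entity = pageDict.assets[`${type}_${id}`];
  if (!entity) return null;

  // Unpack the entity's own attributes; classifications are handled below.
  const { classifications, ...attributes } = entity;

  // Classify the page using the entity the page is primarily about.
  const main = pageDict.main_entity;
  const mainEntity = pageDict.assets[`${main.type}_${main.id}`] || {};

  return { ...attributes, page_classifications: mainEntity.classifications || {} };
}

const resolved = resolveInteraction({ main_entity: { type: "product", id: 987 } }, productPage);
console.log(resolved);
```

In this sketch an interaction with product 987 still yields the page classification of product 354, because the page classification always follows the page's main entity rather than the entity that was interacted with.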
But does the page have any attributes of its own?
I believe it does but these are most likely related to page functionality and how entities and objects are presented to the user.
Here are some relevant fragments from the page dictionary:

    page:
      active_layout: grid
      active_filters:
        size: 12
        price:
          low: 34
          high: 87
      pagination:
        page_number: 2
        total_pages: 5
        items_per_page: 20
This makes perfect sense. You don’t focus on the page itself when you’re trying to understand behaviour connected to business entities (promo, etc.), just as you don’t look to business entities to answer questions related to website functionality!
Thoughts? Please let me know in the comments.