Written by Jose Vicente
Cohort analysis is a technique widely used in medicine: starting from a sample of healthy patients, researchers study how a disease develops depending on exposure to certain risk factors over a period of time.
Applied to web analytics, it gives us a way to know whether a group of users who show a certain behavior in a specific period of time contributes value to our business in the long term.
This type of analysis and its importance in web analytics were discussed at the Google Analytics User Conference Spain 2013, which we attended at the end of May. The official Google Analytics blog has also announced, among other changes, that the features needed to perform this type of analysis will soon be available in the advanced segmentation of Google Analytics.
What is a cohort?
A cohort is a group of people who share a common characteristic or experience within a given time period. From this definition, it should be emphasized that this common characteristic must occur in a specific period of time, which does not necessarily coincide with the time period of the analysis.
Examples are easy to find in medicine: a cohort could be a group of people who started smoking during the year 1990, within a study of the occurrence of lung cancer in that segment of patients carried out between 2010 and 2020.
But how do we apply cohorts to, for example, an e-commerce site? One example is the analysis of the behavior of customers who made their first purchase during the sales period. The cohort could be defined as users who made their first purchase between July 1 and July 31, and we would study their behavior over the following six months. During those six months we could observe, for example, whether this cohort:
- Visits our online store again.
- Makes purchases outside the sales period.
- Spends the same amount as other users.
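As a rough illustration of the idea, outside of any analytics tool, the sketch below builds such a cohort from a list of orders and then looks for members who bought again during the observation window. The order shape (`userId`, `date` as an ISO `YYYY-MM-DD` string, `amount`) and the function names are assumptions made for this example, and the sample data is invented.

```javascript
// Hypothetical order records: { userId, date: 'YYYY-MM-DD', amount }.
// ISO date strings compare correctly with plain string comparison.

// Map each user to the date of their earliest order.
function firstPurchaseDates(orders) {
  var first = {};
  orders.forEach(function (o) {
    if (!(o.userId in first) || o.date < first[o.userId]) {
      first[o.userId] = o.date;
    }
  });
  return first;
}

// Cohort: users whose FIRST purchase falls inside [start, end].
function buildCohort(orders, start, end) {
  var first = firstPurchaseDates(orders);
  return Object.keys(first).filter(function (id) {
    return first[id] >= start && first[id] <= end;
  });
}

// Cohort members with at least one order inside the observation window.
function repeatBuyers(orders, cohort, windowStart, windowEnd) {
  var members = {};
  cohort.forEach(function (id) { members[id] = true; });
  var seen = {};
  orders.forEach(function (o) {
    if (members[o.userId] && o.date >= windowStart && o.date <= windowEnd) {
      seen[o.userId] = true;
    }
  });
  return Object.keys(seen);
}

// Invented sample data, for illustration only.
var sampleOrders = [
  { userId: 'u1', date: '2013-07-03', amount: 40 },
  { userId: 'u1', date: '2013-09-10', amount: 25 },
  { userId: 'u2', date: '2013-07-20', amount: 60 },
  { userId: 'u3', date: '2013-06-15', amount: 30 },
  { userId: 'u3', date: '2013-07-08', amount: 30 }
];
var julyCohort = buildCohort(sampleOrders, '2013-07-01', '2013-07-31');
var returned = repeatBuyers(sampleOrders, julyCohort, '2013-08-01', '2014-01-31');
```

Note that the cohort period (July) and the observation window (the following six months) are passed separately, which mirrors the point that the two periods do not have to coincide. User `u3` bought in July but is excluded because their first purchase was in June.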
How to do cohort analysis with Google Analytics
Currently, doing cohort analysis in Google Analytics is complicated for two reasons:
- We cannot define time periods in the segments.
- Google Analytics segmentation is applied to visits, not users.
We can overcome these obstacles with custom variables. If, at the moment the user makes their first purchase, we set a custom variable containing the date of that purchase, we will be able to define advanced traffic segments based on this value.
_gaq.push(['_setCustomVar', 1, 'FechaPrimeraCompra', 'DDMMYYYY', 1]);
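A minimal sketch of how the date value might be generated and queued, assuming the classic asynchronous ga.js snippet is already on the page. The helper names `formatFirstPurchaseDate` and `recordFirstPurchase`, and the event category and action, are hypothetical and chosen only for this example.

```javascript
// Standard ga.js command queue; on a real page the GA snippet defines it.
var _gaq = _gaq || [];

// Format a Date as DDMMYYYY (day and month zero-padded).
function formatFirstPurchaseDate(date) {
  var dd = ('0' + date.getDate()).slice(-2);
  var mm = ('0' + (date.getMonth() + 1)).slice(-2); // getMonth() is zero-based
  return dd + mm + date.getFullYear();
}

// Hypothetical helper: call once, when the user's first purchase is confirmed.
function recordFirstPurchase(date) {
  // Slot 1, visitor-level scope (last argument 1).
  _gaq.push(['_setCustomVar', 1, 'FechaPrimeraCompra', formatFirstPurchaseDate(date), 1]);
  // A custom variable is only sent together with a later hit, so an event
  // (category and action are assumed names) is queued after setting it.
  _gaq.push(['_trackEvent', 'Cohort', 'FirstPurchase']);
}
```

For instance, `recordFirstPurchase(new Date(2013, 6, 5))` would queue the value `'05072013'`. Deciding whether a purchase really is the user's first one is left to the store's own logic.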
Since the custom variable is set with visitor-level scope (the final argument, 1), the second problem is solved as well. This method is not foolproof, however, and has problems of its own:
- We lose the user’s custom variable information when the user deletes cookies.
- If a user is not logged in to our store, we will not be able to establish the date of their first purchase.
Once we have the necessary data, we can create the desired cohort with an advanced segment in which we select our custom variable ‘FechaPrimeraCompra’ and restrict it to dates in July. Because the value is stored in the format ‘DDMMYYYY’, we only need to require that it ends with ‘072013’ to get the users who made their first purchase in July 2013.
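The "ends with" condition can be checked outside the tool as well. The sketch below, with hypothetical function names, builds the month suffix for a cohort and tests stored DDMMYYYY values against it. The same suffix followed by `$` could serve as a regular expression condition in the segment, assuming the segment builder is set to match by regular expression.

```javascript
// Build the MMYYYY suffix for a monthly cohort (month is 1-12).
function cohortSuffix(month, year) {
  return ('0' + month).slice(-2) + String(year);
}

// True when a stored DDMMYYYY value belongs to the given month and year.
function isInMonthlyCohort(value, month, year) {
  var suffix = cohortSuffix(month, year);
  return value.length === 8 && value.slice(-suffix.length) === suffix;
}

// Equivalent regular expression source, anchored at the end of the value.
function cohortRegexSource(month, year) {
  return cohortSuffix(month, year) + '$';
}
```

Because the day comes first in the stored value, matching on the end of the string is what isolates the month and year. A value such as `'07201301'` contains `072013` but does not end with it, so it is correctly rejected.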
Once we have defined this advanced traffic segment, we can analyze it to obtain data such as:
- Do they spend more or less money compared to other traffic segments?
- Are they still buying the same type of products?
- From which traffic sources did they first come to the site and make this first purchase?
The easiest way to answer these questions is to make comparisons with other traffic segments and thus know if this analyzed cohort provides long-term value to our company.
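Such a comparison can be sketched as follows, assuming we have exported summary figures for each segment. The field names (`visitors`, `revenue`, `transactions`) and the function names are assumptions for this example, not a Google Analytics export format.

```javascript
// Derive per-segment metrics from hypothetical summary figures.
function summarize(segment) {
  return {
    name: segment.name,
    revenuePerVisitor: segment.visitors ? segment.revenue / segment.visitors : 0,
    averageOrderValue: segment.transactions ? segment.revenue / segment.transactions : 0
  };
}

// Express the cohort's metrics as ratios of a baseline segment's metrics.
// A ratio above 1 means the cohort outperforms the baseline on that metric;
// null means the baseline metric is zero and no ratio can be computed.
function compareToBaseline(cohort, baseline) {
  var c = summarize(cohort);
  var b = summarize(baseline);
  return {
    revenuePerVisitorRatio: b.revenuePerVisitor ? c.revenuePerVisitor / b.revenuePerVisitor : null,
    averageOrderValueRatio: b.averageOrderValue ? c.averageOrderValue / b.averageOrderValue : null
  };
}
```

Looking at ratios rather than raw totals keeps the comparison meaningful when the cohort is much smaller than the segment it is compared with.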
How will we do cohort analysis in Google Analytics in the future?
Google has announced new features in the advanced segmentation of Google Analytics, and among them we highlight the ones that solve our problems when performing cohort analysis:
- User segmentation: as we have already mentioned, advanced segments are based on visits. The new user segmentation option will allow us to select all visits from users that fit certain criteria such as demographics or certain behaviors. This new functionality can be combined with the existing ones at the visit level.
- Definition of cohorts: includes the possibility of adding date ranges to advanced segments. This allows us to define cohorts of users with a specific behavior in a defined period of time and avoids having to make additional implementations in the Google Analytics tracking code to obtain this data.
The changes will be available in Google Analytics in the coming months and, as usual, will be rolled out on an account-by-account basis.
The ability to perform user-based analyses such as cohort analysis is one of the major changes in the tool’s concept, and we will be testing it as soon as we have access to these new functionalities.