If you've had the opportunity to explore Google Analytics 4 (GA4), you may have noticed that while the platform provides powerful analytics capabilities and insights, there are occasional nuances or constraints which may prevent you from fully harnessing your data’s potential.
You might observe certain gaps in your reports or encounter discrepancies in data accuracy. In some instances, values may appear as “not set,” making the extraction of meaningful insights challenging, though not insurmountable. So long as your tracking has been implemented correctly (Ask us how we can help you set up your tracking!), you can assume these variances are coming from GA4 itself.
In most cases where you see a “not set” value in your data, your best bet is that it has to do with one of the following issues: data sampling, data thresholding, or high cardinality.
These three obstacles prevent you from having full visibility into and access to your data. Fortunately, there’s a relatively simple fix. Cue BigQuery to the rescue!
BigQuery is a fully managed, serverless data warehouse and analytics platform provided by Google Cloud. It allows users to run SQL queries against large datasets and get results quickly, making it suitable for analyzing and processing massive amounts of data. Because you query the raw, event-level data exported from GA4, it also sidesteps the sampling, thresholding, and cardinality issues we would otherwise run into in the GA4 interface.
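To make the idea of querying GA4 data in BigQuery a little more concrete, here is a minimal sketch in Python that builds a SQL query counting events and users per event name. It assumes the standard GA4 export layout (daily `events_YYYYMMDD` tables with `event_name` and `user_pseudo_id` columns); the project ID `my-project`, the dataset ID `analytics_123456789`, and the helper name `build_event_count_query` are placeholders invented for illustration.

```python
import re


def build_event_count_query(project_id: str, dataset_id: str,
                            start_date: str, end_date: str) -> str:
    """Build a query counting events and users per event name.

    Dates are YYYYMMDD strings matching the suffix of the daily export tables.
    """
    # Validate inputs because they are interpolated into the SQL text.
    for value in (start_date, end_date):
        if not re.fullmatch(r"\d{8}", value):
            raise ValueError(f"expected YYYYMMDD, got {value!r}")
    for name in (project_id, dataset_id):
        if not re.fullmatch(r"[A-Za-z0-9_\-]+", name):
            raise ValueError(f"unexpected characters in {name!r}")

    return f"""
SELECT
  event_name,
  COUNT(*) AS event_count,
  COUNT(DISTINCT user_pseudo_id) AS users
FROM `{project_id}.{dataset_id}.events_*`
WHERE _TABLE_SUFFIX BETWEEN '{start_date}' AND '{end_date}'
GROUP BY event_name
ORDER BY event_count DESC
""".strip()


# Placeholder project and dataset names, for illustration only.
sql = build_event_count_query("my-project", "analytics_123456789",
                              "20240101", "20240131")
print(sql)
```

The sketch only produces the query text. To run it, you could paste the output into the BigQuery console or pass it to a client library such as `google-cloud-bigquery`, which requires credentials and is not shown here.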
Even better, Google provides free, direct data exports to BigQuery from GA4, and BigQuery’s free tier includes 10 GB of storage and 1 TB of query processing per month. BigQuery does charge for storage and querying beyond those allowances, but because the service is serverless and dynamically scalable, these costs are relatively low. Many users get by with only paying a few bucks a month!
To better understand the value of integrating BigQuery into your analytics arsenal, let’s take a deeper look at the three potential sources of issues mentioned above.
Data sampling occurs when only a subset of your data is analyzed for patterns and trends, and the results are then extrapolated to the entire dataset from which you gather your insights. In other words, sampling provides a snapshot of what’s going on rather than an exact count. Google implemented this in Google Analytics to reduce query load times and keep costs down by limiting server usage.
GA4 opens up your business to deeper and more detailed data collection by replacing the Event Category, Action, and Label model of Universal Analytics (UA) with custom parameters. Google relies on data sampling less when you’re paying for GA360, but since there is no cap on how much data you can gather, a large enough volume will still result in sampling at some point.
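As a rough illustration of why sampled numbers can drift from the real ones, here is a minimal Python sketch that estimates a total from a random subset and scales it up. It is not how GA4 samples internally; the function name `estimate_total`, the 5% sample rate, and the revenue figures are all made up for the example.

```python
import random


def estimate_total(values, sample_rate, seed=0):
    """Estimate sum(values) from a random sample scaled up to the full set."""
    if not 0 < sample_rate <= 1:
        raise ValueError("sample_rate must be in (0, 1]")
    if not values:
        return 0.0
    sample_size = max(1, round(len(values) * sample_rate))
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(values, sample_size)
    # Scale the sampled sum by the inverse of the fraction actually sampled.
    return sum(sample) * len(values) / sample_size


# Hypothetical per-session revenue: mostly small orders plus a few large ones.
revenue = [20.0] * 9_900 + [2_000.0] * 100
exact = sum(revenue)
estimate = estimate_total(revenue, sample_rate=0.05)
print(f"exact total:      {exact:,.0f}")
print(f"sampled estimate: {estimate:,.0f}")
```

When a handful of sessions carry most of the value, whether they happen to land in the sample can move the estimate noticeably, which is the kind of discrepancy you avoid by querying the full dataset.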
For a little more on data sampling, here's a video clip from our Director of Analytics, Jon Phillips:
On the other hand, if your site doesn’t see as much traffic, or you’re just not collecting much data for a certain dimension, you may
There are varying minimum thresholds for different types of data, and you can try to boost traffic to collect more data, but this will always be an issue in GA4 to some degree.
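The effect of a minimum threshold can be sketched in a few lines of Python. This is only an illustration: the helper `apply_threshold`, the city names, and the cutoff of 10 users are all hypothetical and are not GA4’s actual values.

```python
def apply_threshold(rows, min_users):
    """Split report rows into those shown and those withheld.

    rows maps a dimension value to its user count.
    """
    shown = {k: v for k, v in rows.items() if v >= min_users}
    withheld = {k: v for k, v in rows.items() if v < min_users}
    return shown, withheld


# Hypothetical user counts per city.
users_by_city = {"Denver": 420, "Boulder": 75, "Golden": 8, "Aspen": 3}
shown, withheld = apply_threshold(users_by_city, min_users=10)
print("shown:", shown)
print("withheld:", withheld)
```

The withheld rows still exist in the underlying data. They simply never make it into the report, which is why low-traffic dimensions can look emptier than they really are.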
Cardinality refers to the number of unique values assigned to a dimension. Some dimensions have a fixed number of unique
In GA4, only use high-cardinality dimensions when the information collected is necessary for the business, as they can more quickly cause reports to reach the row limit.
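To see how a high-cardinality dimension runs into a row limit, consider this small Python sketch. It counts the unique values of a dimension and then folds everything beyond the limit into a single “(other)” row. The helper names, the page paths, and the tiny row limit of 2 are assumptions chosen to keep the example readable, not GA4’s real limits.

```python
from collections import Counter


def cardinality(values):
    """Number of unique values a dimension takes."""
    return len(set(values))


def apply_row_limit(values, row_limit):
    """Keep the row_limit most frequent values; fold the rest into '(other)'."""
    counts = Counter(values)
    report = dict(counts.most_common(row_limit))
    remainder = sum(counts.values()) - sum(report.values())
    if remainder:
        report["(other)"] = remainder
    return report


# Hypothetical page paths: two popular pages plus 1,000 one-off search URLs.
page_views = (["/home"] * 500 + ["/pricing"] * 300
              + [f"/search?q={i}" for i in range(1000)])
print("unique page paths:", cardinality(page_views))
report = apply_row_limit(page_views, row_limit=2)
print(report)
```

Here the one-off search URLs account for more views than any single page, yet all of that detail disappears into one catch-all row once the limit is hit.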
Luckily, there are many resources out there on the wild, wild web to get you started on setting up BigQuery, if you’re in the DIY spirit. That said, we know how quickly these types of setups can get complicated and confusing. When you’re working with data that important, it’s critical to your business that you know these implementations are in good, experienced hands.
If you’re looking for a trusted, fully Google-certified agency to get you into the world of better data, you can always reach out to us. We’re happy to have a conversation. From billing setup to querying and managing your data, we can walk you through it!