On the data management team at Amplitude, we spend a lot of time talking about data functionality, which is the ability of your product analytics taxonomy to answer the questions that you have. There are three factors that are critical to the success of data functionality: accuracy, comprehensiveness, and usability. Together, these factors determine exactly how well your team can use your taxonomy to answer questions.
Key #1: Accuracy
Accuracy is a measure of how closely the data in your product analytics system matches the data in the outside world (or potentially your existing system of record). If your data isn’t accurate, then the answers that you produce won’t be correct, and your team will quickly lose trust in that system.
The big mistake I see here is assuming that data is correct until proven wrong, which inevitably results in someone looking like a fool. That’s why it’s important to come up with a framework for explicitly measuring and verifying the data that you load into your product analytics system. If you have an existing Redshift warehouse that your data analysts use, you need to make sure that your product analytics system matches that system of record. Whenever that comparison shows discrepancies, you need to be able to explain or correct them. If you can’t ensure that your data is accurate, then you can’t use the data.
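One way to make that verification explicit is a simple reconciliation check. The sketch below is only an illustration: it assumes you have already exported daily event counts from both the warehouse and the product analytics system into plain dictionaries, and the function name, the 1% tolerance, and the sample numbers are all hypothetical.

```python
from typing import Dict, List, Tuple


def find_discrepancies(
    warehouse_counts: Dict[str, int],
    analytics_counts: Dict[str, int],
    tolerance: float = 0.01,
) -> List[Tuple[str, int, int]]:
    """Return (day, warehouse_count, analytics_count) for days that disagree.

    A day is flagged when the absolute difference exceeds `tolerance`
    (a fraction) of the warehouse count, which is treated as the system
    of record. A day missing from either side counts as zero there.
    """
    discrepancies = []
    for day in sorted(set(warehouse_counts) | set(analytics_counts)):
        expected = warehouse_counts.get(day, 0)
        actual = analytics_counts.get(day, 0)
        if abs(expected - actual) > expected * tolerance:
            discrepancies.append((day, expected, actual))
    return discrepancies


# Hypothetical sample data: daily counts of one event in each system.
warehouse = {"2024-01-01": 1000, "2024-01-02": 1200, "2024-01-03": 900}
analytics = {"2024-01-01": 1004, "2024-01-02": 1100, "2024-01-04": 5}

for day, expected, actual in find_discrepancies(warehouse, analytics):
    print(f"{day}: warehouse={expected}, analytics={actual}")
```

Every day this prints is one that you need to either explain (for example, a known time zone difference) or correct.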
Key #2: Comprehensiveness
Comprehensiveness is a measure of whether you have all of the data in your analytics system that are required to answer your questions.
Measuring comprehensiveness involves looking at the breadth and depth of queries that your system is able to answer. A comprehensive product analytics system may ingest data from a variety of different systems, such as your marketing automation system, your split testing framework, or receipts from your app store. And to answer the most detailed questions, you will often need to segment on pieces of contextual data, such as the location in the product where a user performed an action or specific details about the user or groups performing the action. Most sophisticated product analytics solutions provide the ability to use this data, so long as it is captured correctly.
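A rough way to measure comprehensiveness is to write down the questions you want to answer, list the events and properties each one needs, and compare that list against what you actually track. The sketch below assumes a hand-maintained mapping; the question text, event names, and property names are hypothetical examples, not a real taxonomy.

```python
from typing import Dict, List, Set


def find_gaps(
    requirements: Dict[str, Dict[str, Set[str]]],
    tracked: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Map each unanswerable question to the events or properties it lacks.

    `requirements` maps question -> event name -> required properties.
    `tracked` maps event name -> properties currently captured.
    """
    gaps = {}
    for question, needed in requirements.items():
        missing = []
        for event, props in needed.items():
            if event not in tracked:
                # The whole event is absent from the taxonomy.
                missing.append(event)
                continue
            for prop in sorted(props - tracked[event]):
                missing.append(f"{event}.{prop}")
        if missing:
            gaps[question] = missing
    return gaps


# Hypothetical taxonomy and questions.
tracked = {
    "Publish Article": {"location", "plan"},
    "Purchase Subscription": {"price"},
}
requirements = {
    "Where do users publish from?": {"Publish Article": {"location"}},
    "Which campaigns drive purchases?": {
        "Purchase Subscription": {"campaign"},
        "Open Email": set(),
    },
}

for question, missing in find_gaps(requirements, tracked).items():
    print(f"{question} -> missing {missing}")
```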
Key #3: Usability
Finally, we come to usability. Usability is effectively a measure of how easy it is to answer your questions within your product analytics system. A usable system will make it possible for many different people in your org to quickly answer their questions. This is critical, because if it takes eight hours to get an answer, teams will probably skip using the system for all but the most important questions, and may instead rely on the gut judgment of the highest paid person in the room. If you can get an answer in 30 seconds, you can get to the point where people across your product team are constantly answering their own questions in real time.
So, what does usability mean in specific terms? There are already whole articles written on this topic, but here are some quick guidelines:
Usability Guideline 1
Most importantly, events need to be at the proper level of abstraction. If your event name is “click” and you need to delve into multiple properties to figure out what was clicked or what business action that represents, it is going to be hard for most users to figure things out. On the other hand, if your event type is “header_click_publish,” your event types may be too granular (the “header” context could probably be stored in an event property).
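To make the three levels of abstraction concrete, here is a small sketch that builds the same user action three ways. The `build_event` helper and the dictionary shape are illustrative assumptions, not the API of any particular SDK.

```python
from typing import Any, Dict


def build_event(event_type: str, **properties: Any) -> Dict[str, Any]:
    """Bundle an event name with its contextual properties."""
    return {"event_type": event_type, "event_properties": dict(properties)}


# Too generic: the business action is buried in the properties.
too_generic = build_event("click", element="publish_button", location="header")

# Too granular: the context is baked into the name, so every new
# location would need its own event type.
too_granular = build_event("header_click_publish")

# Balanced: the name states the business action, and the context
# lives in a property that can be segmented on.
balanced = build_event("Publish Article", location="header")

print(balanced)
```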
Usability Guideline 2
In addition, events should be named clearly and consistently, so that it is easy to understand what they represent. Our solutions architect team is a big fan of the verb + noun framework, e.g. “Publish Article.” Names should also follow a single convention; you don’t want an event named “Publish Article” and then another event named “delete_article.” If you are able to achieve a high level of usability, then any new user should be able to quickly figure out the taxonomy with little guidance.
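A naming convention like this can be checked mechanically before events ever ship. The sketch below assumes the convention is two or more capitalized words separated by single spaces; the pattern and function name are hypothetical, and a regular expression cannot confirm that the first word is actually a verb, so treat it as a first-pass lint rather than a full review.

```python
import re
from typing import List

# Two or more capitalized words separated by single spaces,
# e.g. "Publish Article". This is an assumed house style.
NAME_PATTERN = re.compile(r"[A-Z][a-z]+( [A-Z][a-z]+)+")


def inconsistent_names(event_names: List[str]) -> List[str]:
    """Return the event names that do not follow the assumed convention."""
    return [name for name in event_names if not NAME_PATTERN.fullmatch(name)]


names = ["Publish Article", "delete_article", "Share Article", "click"]
print(inconsistent_names(names))
```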
So which of these three factors is most important? It’s important to start with a focus on accuracy, even at the expense of comprehensiveness. If you only have a few events, but you can trust the data, over time you will be able to expand that trust as you implement more events. On the other hand, if you start out by loading hundreds or thousands of low-quality events into your product analytics system, you may never recover. One of the saddest and most frustrating things that our team sees is when companies have loaded in large amounts of data that no one on their team can trust. In many cases, the solution to this problem is throwing everything out and trying again.
Once you have verified that the initial data is accurate, you probably want to focus on usability. This will allow the largest number of people in your organization to use the data, highlighting the value of self-serve analytics. In some cases, you may want to focus on comprehensiveness in the short term, but I would caution you against creating a system that makes it easy for just your product analytics team to quickly answer questions. While this will provide benefits, it still leaves your company far from data democracy. It does require work to get to the point where everyone on your product team can use data in making decisions, but this also leads to the best possible product outcomes.
One final thing to consider is that data functionality evolves over time. You may get to the point where your data is able to answer your questions, meaning that you have achieved usability, accuracy, and comprehensiveness. However, at some point in the future, you may find you have new questions that you can’t answer. Maybe you want to dive deeper into your data, or you want to use a more advanced analysis type. Or perhaps you want to onboard many colleagues onto the product analytics system, and realize that they won’t be able to answer their questions without extensive support. Having the opportunity to improve your data functionality is a good thing, although it will require you to think about these three factors and which of them need to improve.
Data functionality is a journey, and there is no fixed end point. All that I can say is “good luck,” and that once you start on this path, you will never be able to turn back. But there are many guides to help you along the way, and as you succeed, we (and your colleagues) will be cheering you on.