Sellers of Customer Data Platforms (CDPs) promise their software will gather data from various applications and assemble it into a single-source-of-truth “golden record” for each customer.
It’s a lovely vision, but rarely achieved. And that’s perfectly okay. Most companies won’t achieve the goal of one record for each customer, but will find ways to cope with the limitations that prevent the creation of golden records.
Let’s use this common CDP use case to illustrate the complexity: identifying customers among the hordes of anonymous visitors to your website.
It’s a challenge. Anonymity was central to the internet’s design. And while there are lots of ways to identify anonymous website visitors, they all have their limitations.
Imagine Robert Williams, our leading man and swing dance aficionado, interacts with Ella, publisher of (the fictitious, I believe) Ella’s Swing Dance Magazine.
Robert meets Ella on his commute to work. She tells him he ought to read her magazine. On his lunch break, Robert searches for the magazine website on the desktop he uses at the office. When Robert’s web browser makes a request to Ella’s Swing Dance Magazine website, Ella’s CDP puts a cookie on that device and creates a user profile. The profile includes the following information:
IP address: 126.96.36.199
User-Agent: Mozilla/5.0 (Linux NT 10.0)
The record might also include which pages were visited and what type of content the visitor seems to prefer. The visitor is still anonymous to Ella’s CDP. The profile is just one of millions belonging to unknown visitors.
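The cookie-to-profile step might look roughly like the sketch below. It assumes a hypothetical in-memory store keyed by cookie ID. The names `ProfileStore` and `handle_request` and the page paths are illustrative only, not any vendor's API.

```python
import uuid


class ProfileStore:
    """Hypothetical in-memory stand-in for a CDP's profile table."""

    def __init__(self):
        self.by_cookie = {}

    def handle_request(self, cookie_id, ip, user_agent, page):
        # No recognized cookie: this browser is new to us, so issue a
        # cookie and start a fresh anonymous profile.
        if cookie_id is None or cookie_id not in self.by_cookie:
            cookie_id = uuid.uuid4().hex
            self.by_cookie[cookie_id] = {
                "ip": ip,
                "user_agent": user_agent,
                "pages": [],
            }
        # Known or new, record the page view on the profile.
        self.by_cookie[cookie_id]["pages"].append(page)
        return cookie_id


store = ProfileStore()
ua = "Mozilla/5.0 (Linux NT 10.0)"

# First visit from the office desktop: no cookie, so a profile is created.
office = store.handle_request(None, "126.96.36.199", ua, "/home")
# A return visit with the same cookie reuses that profile.
store.handle_request(office, "126.96.36.199", ua, "/lessons")
# A browser without the cookie always gets a brand-new profile.
other = store.handle_request(None, "126.96.36.199", ua, "/home")

print(len(store.by_cookie))  # 2
```

The point of the sketch is the last call: the store has no way to know that a cookieless browser belongs to someone it has already seen.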
When Robert gets home that evening, he types the URL of Ella’s website into his iPad. Her CDP dutifully puts a cookie on that device and creates a new profile. On this visit, Robert decides to sign up for Ella’s free e-newsletter with one of his junk email addresses. The CDP captures the email address from the form submission and adds it to this second profile, which now has more information than the first.
IP address: 188.8.131.52
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6)
Referrer: [blank]
Email: [email protected]
Name: Bob Williams
However, nothing in this second record enables Ella’s CDP to conclude that the two records belong to the same individual. The records were created on different devices at different times, and share no information identifying Robert.
Two weeks later, Robert and Ella are jitterbugging at Mobtown Ballroom. Ella has a few copies of her magazine, and Robert takes one home. He signs up for a print subscription using one of the blow-in cards. Ella’s fulfillment service dutifully records this new subscriber data, which is then imported into the CDP, creating Robert’s third profile with still more information:
Name: Robert Williams
Address: 123 Main Street
Phone: (301) 555-1212
Email: [email protected]
This profile has valuable information, including a new email address. But this profile has no data from online activity, so it doesn’t help with online ad targeting or customer journey data.
Robert now has three profiles in Ella’s CDP. There’s no way to merge any of them. We know they’re all Robert. The CDP doesn’t.
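To see why the CDP is stuck, here is a minimal sketch of an exact-match check across the three profiles. The email addresses and cookie values are placeholders (the real ones are not shown in this article), and the choice of which fields count as identifiers is an assumption.

```python
from itertools import combinations

# Placeholder values: hypothetical cookies and emails for illustration.
profiles = {
    1: {"cookie": "office-desktop-cookie"},
    2: {"cookie": "ipad-cookie", "email": "junk@example.com",
        "name": "Bob Williams"},
    3: {"email": "robert@example.com", "name": "Robert Williams",
        "phone": "(301) 555-1212"},
}

# Fields treated as exact-match identifiers (an assumption for this sketch).
IDENTIFIER_FIELDS = ("cookie", "email", "phone")


def shared_identifiers(a, b):
    """Return the identifier fields on which both profiles hold the same value."""
    return [f for f in IDENTIFIER_FIELDS if f in a and a[f] == b.get(f)]


# Check every pair of profiles for at least one shared identifier.
mergeable = [
    (i, j)
    for i, j in combinations(profiles, 2)
    if shared_identifiers(profiles[i], profiles[j])
]
print(mergeable)  # [] because no pair shares a cookie, email, or phone
```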
Fortunately, Ella’s magazine has the good sense to include some special online content for print subscribers as a way to link offline and online behavior. A QR code printed in the magazine allows Robert to view a video on the website about the Travelling Charleston. Robert scans the QR code with his iPad. That takes him to the website, where the CDP recognizes the cookie it put on that device earlier.
Bingo! Now Ella’s CDP can merge the iPad profile (#2) with the subscription information (#3). Several good things happen as a result:
- Robert’s three profiles have been consolidated into two
- Robert has become a known user in Ella’s CDP
- Ella’s CDP knows that Robert uses two different email addresses
- Robert’s subscription information (offline behavior) and the profile created when he accessed Ella’s site from his iPad (online behavior) are now linked.
The record created from Robert’s desktop remains anonymous.
Note that, in this scenario, Ella’s CDP has been configured to accept multiple emails in a customer’s profile. Some companies designate the email address as a unique field – allowing only one per profile. In that case, the records would not merge, and Robert’s subscription information would remain in its own profile, not connected to any online activity.
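The merge and the unique-email setting can be sketched as follows. This is a rough illustration, not how any particular CDP works. It assumes the QR code URL carries a subscriber-specific ID (the scenario does not say how the scan is tied to the subscription), and `merge_on_scan`, `subscriber_id`, and the email addresses are all hypothetical.

```python
def merge_on_scan(profiles, cookie, subscriber_id, allow_multiple_emails=True):
    """Merge the profile owning `cookie` with the one owning `subscriber_id`.

    Returns a new list of profiles; the input list is left untouched.
    """
    online = next(p for p in profiles if p.get("cookie") == cookie)
    offline = next(p for p in profiles if p.get("subscriber_id") == subscriber_id)

    emails = sorted(set(online["emails"]) | set(offline["emails"]))
    if len(emails) > 1 and not allow_multiple_emails:
        # Email is configured as a unique field: two addresses block the merge.
        return list(profiles)

    # Combine fields, preferring the name from the subscription record.
    merged = {**offline, **online, "emails": emails, "name": offline["name"]}
    rest = [p for p in profiles if p is not online and p is not offline]
    return rest + [merged]


# Placeholder cookies, IDs, and emails for illustration.
profiles = [
    {"cookie": "office-desktop-cookie", "emails": []},
    {"cookie": "ipad-cookie", "emails": ["junk@example.com"],
     "name": "Bob Williams"},
    {"subscriber_id": "sub-001", "emails": ["robert@example.com"],
     "name": "Robert Williams", "phone": "(301) 555-1212"},
]

merged = merge_on_scan(profiles, "ipad-cookie", "sub-001")
print(len(merged))           # 2
print(merged[-1]["emails"])  # ['junk@example.com', 'robert@example.com']

strict = merge_on_scan(profiles, "ipad-cookie", "sub-001",
                       allow_multiple_emails=False)
print(len(strict))           # 3
```

With `allow_multiple_emails=False`, all three profiles survive, which is the outcome described above for companies that treat email as a unique field.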
Will Ella’s CDP ever be able to attach Robert’s work computer to his online profile? Maybe. For example, if Robert opens one of Ella’s e-newsletters on his work computer, the CDP might (depending on how strict it is about such things) recognize that as Robert and merge the profiles.
Identifying individuals from their online and offline behaviors and creating single records may seem complicated, but it’s quite a bit less confusing than what happens in real life. Consider the complexity when Robert’s smartphone and home desktop enter the equation.
Merging records: deterministic vs. probabilistic methods. Which is right for you?
The “golden record” that the CDP salesman is waving in front of you assumes that all these different sources of information can be merged. But merging requires a field that the records have in common. What’s that going to be?
Most companies opt for an email address as the best piece of personally identifiable information on which to merge records. But as we’ve seen, and as we all know, people have multiple email addresses. They also change over time.
If you stick with a strictly deterministic matching method, you’ll need to match a unique field (like an email address or a social media account) across multiple profiles to create your “golden record,” and you’ll inevitably leave some information behind.
There are other options. Some CDPs use probabilistic methods to merge profiles. That method enables you to match records that might otherwise remain distinct. But you risk incorrectly merging profiles and creating a customer experience headache.
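Here is a toy comparison of the two approaches. It is only a sketch: real probabilistic matching weighs many signals, while this one uses nothing but fuzzy name similarity from Python's standard `difflib` module. The emails, the 0.75 threshold, and the extra person "Roberta Williams" are all made up for illustration.

```python
from difflib import SequenceMatcher

# Placeholder profiles: emails and the "stranger" are hypothetical.
ipad = {"name": "Bob Williams", "emails": {"junk@example.com"}}
subscription = {"name": "Robert Williams", "emails": {"robert@example.com"}}
stranger = {"name": "Roberta Williams", "emails": {"roberta@example.com"}}


def deterministic_match(a, b):
    """Match only when the profiles share an exact email address."""
    return bool(a["emails"] & b["emails"])


def name_similarity(a, b):
    """Fuzzy similarity between two names, from 0 (different) to 1 (identical)."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def probabilistic_match(a, b, threshold=0.75):
    """Match when the names are similar enough; the threshold is arbitrary."""
    return name_similarity(a, b) >= threshold


print(deterministic_match(ipad, subscription))      # False: no shared email
print(probabilistic_match(ipad, subscription))      # True: Bob is linked to Robert
print(probabilistic_match(subscription, stranger))  # True: but this is a different person
```

The last line is the headache: the same looseness that connects Bob to Robert also connects Robert to Roberta.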
You can’t create a single record for each customer that covers all the chaos and weird realities of how people behave. What you can do, and what you must do, is decide where that matters.
There are use cases where improperly merged profiles yield very bad customer experience outcomes. Stick with deterministic matching in those cases, even though you’re going to lose some of the data on interactions with that customer. You’ll have multiple profiles for some individuals, many of which will remain “unknown.”
Other use cases are far more forgiving. If you want to create a segment of people who share a particular interest, you don’t need to get down to the individual. In these cases, probabilistic methods are sufficient.
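A forgiving use case like this can be sketched in a few lines. The profile IDs and interest labels below are hypothetical. The idea is simply that a segment can be built from whatever profiles exist, known or anonymous, without resolving who is who.

```python
# Hypothetical profiles: IDs and interests are invented for illustration.
profiles = [
    {"id": 1, "known": False, "interests": {"charleston"}},
    {"id": 2, "known": True, "interests": {"charleston", "lindy hop"}},
    {"id": 4, "known": False, "interests": {"balboa"}},
]


def build_segment(profiles, interest):
    """Return the IDs of every profile, known or anonymous, showing the interest."""
    return [p["id"] for p in profiles if interest in p["interests"]]


segment = build_segment(profiles, "charleston")
print(segment)  # [1, 2]
```

If profiles 1 and 2 both turn out to be the same person, the segment is still usable. The worst case is that one person counts twice.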
In any event, recognize that “golden records” are a nice idea, but you’ll never actually get there.
Opinions expressed in this article are those of the guest author and not necessarily MarTech.