What Is CRM Data Duplication? Why It's a Trillion-Dollar Problem
What is "duplicate data," or more specifically "duplicate CRM data"? Simply put, it is any record that has been created more than once or that appears identical to another. For example, three contacts named "John Smith" that all refer to the same person (not three different John Smiths) would be considered duplicates.

CRM software has only been around for a few decades, but duplicate data has been an issue since the printing press, or earlier. It could be argued that even the ancient Sumerians ran into this problem when carving cuneiform script into their clay tablets.

Duplicate data really became an issue with the advent of computers. Once data could easily be copied without much human input, algorithms had to be created to combat the problem. Imagine the pains Dropbox ran into when synchronizing folders and files across multiple machines that might already hold the same data.

We don't give duplicate data much thought when it's taken care of by software and virtually invisible. When we do encounter the problem, however, we usually have to take extra steps to prevent it. Oddly enough, computer technology has become highly advanced, yet we still have trouble finding tools to organize and cleanse the digital mess we leave behind.

Perhaps the place we notice it most is in the personal contact lists on our mobile phones. Earlier feature phones didn't have this issue because contacts were entered manually, which was rarely quick or easy. Once smartphones and cloud syncing came on the scene, duplicate contacts became an instant problem. You might have dozens of copies of the same people spread across multiple social networks, email accounts, and previously imported or migrated contact lists. You might also manually enter a contact that already exists in your list, either because you forgot it was there or because you didn't have time to search for the name first.

With all this automation and ease of use, it's easy to see (and it's something we experience daily) how duplicate contacts quickly take over our phones and email clients.

And that's just the phone you keep for personal use.

In any company that has customers, this becomes an instant problem as well, and it is ten times easier to get wrong, especially in our modern age where everything is kept digitally. Marketing and sales companies are especially prone to it. Most, if not all, tech companies have experienced this issue across their user or customer base, email lists, lead lists and, to get to the point, their CRM.

CRM is commonly referred to as an industry, but the term can also cover anything we use to maintain a list of business contacts, along with the data that helps us stay on top of sales, marketing, or any other business-related use.

The CRM industry has this issue mostly because of the sheer bulk of records coming in. Beyond the customers themselves, the usual culprits are:
- Email subscribers synchronized from the likes of Mailchimp (which sometimes also contains duplicate emails).
- Leads collected from marketing campaigns, later imported into the CRM.
- Followers and fans from social networks like Facebook, Twitter, and LinkedIn, which are often synced with the CRM automatically.
- Cold leads collected from the phone book, purchased email lists, phone lists, etc.
- Web2CRM-style leads fed in through forms and other opt-in methods.
- Contacts synced through integrations that don't check for duplicates.
- Lack of duplicate checks on imports.
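
To make the last two points concrete, here is a minimal sketch of what a duplicate check on import might look like. It assumes contacts are simple records with hypothetical `name` and `email` fields and that a normalized email address is enough to identify a person. Real CRMs typically match on several fields and use fuzzier rules, so treat this as an illustration rather than a recommended implementation.

```python
def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so trivially different spellings compare equal."""
    return email.strip().lower()


def import_contacts(existing, incoming):
    """Add incoming contacts to existing ones, skipping emails already present.

    Both arguments are lists of dicts with hypothetical "name" and "email" keys.
    Returns a (merged, skipped) pair of lists.
    """
    # Emails already in the CRM, in normalized form.
    seen = {normalize_email(c["email"]) for c in existing}
    merged = list(existing)
    skipped = []
    for contact in incoming:
        key = normalize_email(contact["email"])
        if key in seen:
            skipped.append(contact)  # duplicate: hold back for review or merging
        else:
            seen.add(key)
            merged.append(contact)
    return merged, skipped


crm = [{"name": "John Smith", "email": "john.smith@example.com"}]
batch = [
    {"name": "John Smith", "email": " John.Smith@Example.com "},
    {"name": "Jane Doe", "email": "jane.doe@example.com"},
    {"name": "J. Smith", "email": "john.smith@example.com"},
]
merged, skipped = import_contacts(crm, batch)
print(len(merged), "contacts kept,", len(skipped), "duplicates skipped")
```

Matching on email alone will miss the same person entered under two different addresses, which is one reason the three "John Smith" records from earlier are harder to catch than this sketch suggests.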
"The reason bad data costs so much is that decision makers, managers, knowledge workers, data scientists, and others must accommodate it in their everyday work. And doing so is both time-consuming and expensive. The data they need has plenty of errors, and in the face of a critical deadline, many individuals simply make corrections themselves to complete the task at hand. They don’t think to reach out to the data creator, explain their requirements, and help eliminate root causes."
-- Harvard Business Review, "Bad Data Costs the U.S. $3 Trillion Per Year"