Blog

Coming 2018: Find & Merge Duplicates in HubSpot CRM

on January 23, 2018

HubSpot

[Update: The HubSpot Integration is now launched!]

Finding and merging duplicates in HubSpot, according to the community, is not easy. Many people have asked us how to deduplicate contacts and companies in HubSpot.

Let’s go over the duplicate issue in HubSpot and how we aim to combat it in 2018.

Duplicate Contacts and Companies in HubSpot

HubSpot, like many CRMs, has the problem of duplicate data sneaking in. At first, a few duplicate contacts are normal and easy to control (at under 2%). As your HubSpot database grows, duplicates begin to be a problem. They are more than a slight annoyance: they hurt your bottom line and credibility.

Duplicates are also a problem for your team. Wading through many records to find a note about a sales call with a customer is never fun. It causes everyone to waste time and make damaging mistakes. In short, it hurts your bottom line.

HubSpot has a way to manually merge duplicates. So why bother with a third-party integration? Because merging duplicates by hand is time-consuming, and it becomes impossible to keep on top of. Once your team is adding new info every day, keeping things orderly seems like a pointless task.

How Dedupely Will Help You Merge Duplicates in HubSpot

We designed Dedupely to automate finding duplicates and merging them in bulk. This alone saves your team hundreds of accumulated hours. It saves your company the money it was wasting on poor productivity.

Dedupely allows you to fine-tune your hunt for duplicates by field or a combination of fields. Once it finds duplicates, you can merge one by one or in bulk.

You can automate it to find and merge new duplicates in HubSpot every day.
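To make "by field or a combination of fields" concrete, here is a minimal sketch of grouping records that match on a chosen set of fields. It is only an illustration, not Dedupely’s actual matching logic: the function names, the normalization rule and the sample contacts are all assumptions made up for this example.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences don't hide a match."""
    return " ".join(str(value or "").lower().split())


def find_duplicates(records, fields):
    """Group records whose normalized values match on every field in `fields`."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(normalize(record.get(field)) for field in fields)
        # Skip records with a blank match field; a blank is not evidence of a match.
        if all(key):
            groups[key].append(record)
    # Only groups with more than one record are duplicates.
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical sample data, not a real CRM export.
contacts = [
    {"id": 1, "first_name": "Mario", "last_name": "Vela", "email": "mario@example.com"},
    {"id": 2, "first_name": "mario ", "last_name": "VELA", "email": "m.vela@example.com"},
    {"id": 3, "first_name": "Ana", "last_name": "Ruiz", "email": "ana@example.com"},
]

duplicate_groups = find_duplicates(contacts, ["first_name", "last_name"])
duplicate_ids = [[c["id"] for c in group] for group in duplicate_groups]
print(duplicate_ids)  # [[1, 2]]
```

Matching on first and last name groups contacts 1 and 2, while matching on email alone finds nothing here. That is why the choice of fields matters.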

What Will The HubSpot Integration Deduplicate?

Our integrations include deduplication of Contacts, Companies and Deals. This doesn’t change for HubSpot. Notes, tasks and other linked items will end up in the merged contact.

Fields and custom fields also merge. When two fields conflict, Dedupely keeps the best-looking value. You can tell Dedupely which fields to keep when only one is allowed.

Merging duplicates in HubSpot

To learn more, see the HubSpot integration page and sign up for launch alerts.

When Can I Start Using Dedupely?

We have started work on the HubSpot Integration. In reality, we’ve been working on it for over a year off and on. Because of the growth of other integrations we had to put it on the back burner for a while. We didn’t want to rush the process. HubSpot reached out to us 2 years ago. At the time, we were ill-equipped to take on a new integration.

2018 is a big year for us. We’re growing faster than we did in previous years. Our team is growing and we’re getting better at launching. Our tech stack has improved since last year and we’ve managed to make a few breakthroughs.

We’re building what you asked for: an integration to find and merge duplicate contacts, companies and deals in HubSpot.

We estimate the launch will happen in Q1 of 2018, most likely between late February and March.

We’ll See You There

I’m excited to be working on this problem. Duplicate data is frustrating for everyone. More so for businesses that want to focus on their customer rather than manage data. Duplicates in HubSpot shouldn’t be a nightmare. The reality is that it’s hard to get under control manually.

P.S. We’ll be selecting a few early birds to be part of our beta program. If this interests you, sign up at the bottom of this page to get a say in what we include on launch day.

How to Merge Duplicate Contacts in Pipedrive

on January 16, 2018

Merging duplicate contacts in Pipedrive

Duplicates spreading across Pipedrive are never fun. They arrive in a number of ways, from bad imports to integrations that don’t check for existing records. However, merging duplicate contacts is easier in Pipedrive than in most other CRMs. Pipedrive has gone the extra mile to give the user some extra control over the merging process.

To merge a duplicate person contact, click on the person you identify as a duplicate:

Pipedrive Duplicate Contacts

Then navigate to the merge button at the top right:

Merge Duplicate Pipedrive Contact

Here I’m checking whether M. Vela has a duplicate. Since I already know there’s a duplicate named Mario Vela, I’m just going to search for that name. However, if I suspect there are more duplicates, I can also search for similar last names, emails, etc.:

Pipedrive Merge Contacts Screen

Continue, and you’ll see where Pipedrive’s merging capabilities shine. This screen gives you more control over what gets merged into the winning contact (or final contact) and what gets removed:

Conflicting values in merge screen

What’s up with the “In case of conflict”? Basically, if a field only accepts one value, one value can stay and the rest have to be left behind. If one field is blank, the corresponding field of the other duplicate will be used by default.

This is true of fields like first and last name, organization, owner, “visible to” settings, etc.

Fields that accept multiple values get merged together, so there’s no worry about data loss there: all the values end up on the same stack. Phone and email are examples of this because they allow more than one value.
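The rules above can be sketched in a few lines of code. This is only an illustration of the behavior described here, not Pipedrive’s implementation: the field lists, the `merge_contacts` function and the sample records are assumptions invented for the example.

```python
# Hypothetical field lists; a real CRM defines these per field type.
SINGLE_VALUE_FIELDS = ["name", "organization", "owner"]
MULTI_VALUE_FIELDS = ["phone", "email"]


def merge_contacts(winner, loser):
    """Merge two duplicate contacts, preferring `winner` in case of conflict."""
    merged = {}
    for field in SINGLE_VALUE_FIELDS:
        # Only one value can stay: the winner's value is kept,
        # and a blank winner value falls back to the other duplicate.
        merged[field] = winner.get(field) or loser.get(field)
    for field in MULTI_VALUE_FIELDS:
        # Multi-value fields keep everything, minus exact repeats.
        combined = []
        for value in winner.get(field, []) + loser.get(field, []):
            if value not in combined:
                combined.append(value)
        merged[field] = combined
    return merged


winner = {
    "name": "Mario Vela",
    "organization": "",
    "owner": "Dana",
    "phone": ["555-0100"],
    "email": ["mario@example.com"],
}
loser = {
    "name": "M. Vela",
    "organization": "Acme",
    "owner": "Dana",
    "phone": ["555-0100", "555-0199"],
    "email": ["m.vela@example.com"],
}

merged = merge_contacts(winner, loser)
print(merged)
```

In this sketch the name conflict is resolved in favor of the winner, the blank organization is filled in from the other duplicate, and both phone numbers and both emails survive.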

So once you’re satisfied with the match, hit preview and you’ll see how the fields actually merge:

Pipedrive merge duplicate preview screen

As you can see here, the selected “in case of conflict” fields get correctly chosen. Also, the blank organization field has been populated. If it looks good, hit merge; if not, go back.

The same feature works for Organizations and Deals.

How Pipedrive Merges Contact Object Links

Pipedrive also merges the contact’s linked objects such as:

  • Notes
  • Deals
  • Organizations
  • Tasks
  • Activities

So the history of both duplicate contacts ends up in the same merged contact. Carrying over linked objects helps prevent the myriad of issues you encounter with duplicates spreading across Pipedrive.

How to Prevent Further Duplicate Contacts in Pipedrive

If most of your duplicates are coming from manual input and imports, Pipedrive has two features that prevent duplicates from being entered in the first place.

The first one is when you enter a new contact. As you type the name of a contact that already exists, you’ll see the following:

Potential duplicate contact

This helps you catch any obvious duplicate contacts in Pipedrive.

The second feature works for imports. When importing, by default, Pipedrive will attempt to merge the imported data into existing records:

Pipedrive import deduplication

Conclusion

This is a very basic introduction to merging your duplicate Pipedrive contacts. If you want to learn more about finding and merging duplicate contacts, go to Pipedrive’s Support Center. Also, take a look at our integration which identifies duplicate contacts, organizations and deals in bulk, saving you hours (or days) in hunting down duplicates.

What is CRM Data Duplication? Why It’s a Trillion Dollar Problem

on January 2, 2018

What’s “duplicate data” or more specifically “duplicate CRM data”?

Simply put, any piece of data that’s been created more than once or appears identical. For example, three contacts named “John Smith” who refer to the same person (not actually three John Smiths, but the same one), would be considered duplicates.

CRM software has only been around for a few decades, but duplicate data has been an issue since the printing press, or earlier. It could be argued that even the ancient Sumerians ran into this problem when carving cuneiform script into their clay tablets.

Duplicate data really became an issue with the advent of computers. When data could easily be copied without much human input, algorithms were created to combat this issue. Imagine the pains Dropbox ran into dealing with synchronizing folders and files across multiple machines that might already have the same data.

We don’t give duplicate data much thought when it’s easily taken care of by software and virtually invisible. However, when we do encounter this problem we usually have to take extra steps to prevent it. Oddly enough, computer technology has become highly advanced, yet we still have trouble finding tools to organize and cleanse the digital mess we leave behind.

Maybe the area where we notice it the most is the personal contact lists on our mobile phones. Earlier feature phones didn’t have this issue because entering a contact was done manually and usually wasn’t the quickest and easiest thing to accomplish. Once smartphones and cloud-syncing came on the scene, duplicate contacts became an instant problem. You might have dozens of the same people across multiple social networks, email accounts, and previously imported or migrated contact lists. You might also manually enter a contact that already exists in your list, but you simply forgot it was already there. Or, you didn’t have the time to quickly search for the name before entering it.

With all this automation and ease of use, it’s easy to see (and something we experience on a daily basis) how duplicate contacts quickly take over our phones and email clients.

And that’s just what happens to the phone you use for personal use.

In any company that has customers, this becomes an instant problem as well, but one that is ten times easier to get wrong, especially in our modern age where everything is kept digitally. Marketing and sales companies are especially prone to this problem. Most, if not all, tech companies have experienced this issue across their user or customer base, email lists, lead lists and, well, let’s just get to the point: their CRM.

CRM is commonly referred to as an industry, but CRM can also pertain to anything we use to maintain a list of business contacts, with data to help keep on top of the sales and marketing or any other business related use.

The CRM industry has this issue mostly because of the sheer bulk of not just customers, but also:

  • Email subscribers synchronized from the likes of MailChimp (which sometimes also contains duplicate emails).
  • Leads collected from marketing campaigns, later imported into the CRM.
  • Followers and fans from social networks, like Facebook, Twitter, LinkedIn, etc., that are often automatically synced with the CRM.
  • Cold leads collected from the phone book, purchased email lists, phone lists, etc.
  • Web2CRM-style leads fed in through forms and other opt-in methods.
  • Contacts synced through integrations that don’t check for duplicates.
  • Imports that lack duplicate checks.

Just to mention a few.

With all of the technology moving back and forth, and people’s lack of ability to combat this themselves, it’s no wonder nearly all businesses have this issue.

So why is this really an issue? Who’s actually getting hurt and why do we need to even think about this?

You don’t have to look too far to find the downsides of duplicate data. There are major drawbacks that affect not only your company from the inside, but also your customers, your credibility, and your overall brand.

According to the 2016 Data Science Report from CrowdFlower, “60% [of data scientists] said they spent most of their time cleaning and organizing data.” That’s a huge amount of time that could be better spent on actually doing their jobs.

In the sales sector, duplicate data confuses salespeople who make thousands of calls per month. Ever get cold calls from the same company multiple times, even after you’ve asked to be removed from the list (also, multiple times)? This is because they can’t always simply remove you from the list (assume there is no “the” list). You’ve probably been entered more than once. One of your entries has a note to stop calling, but other duplicates haven’t been updated. Which salesperson on a quota would consider updating or deleting multiple entries, even if they bother to “strike you off the list” at all?

This frustrates anyone interacting with your company. Why would a credible company make such a gross mistake? From within the company, it’s easy to see why. From outside, there’s little room to save face, especially when you combine personnel who don’t care with a mix of data issues.

Even real pros who interact with hundreds of clients a year can’t possibly remember every name. How are they supposed to remember that they called this person three months ago when there was no note or record, or the record of the last interaction was attached to another duplicate?

Another example is when you receive the same newsletter multiple times from the same company. You probably signed up for the same newsletter more than once, or opted in somewhere else and landed on the same list.

Bad data messes with everyone. Duplicates are everywhere and they work like a cancer, eventually getting too big to handle.

Okay, okay! Duplicate contacts aren’t as bad as government debt, actual cancer, world hunger or the hundreds of far worse issues society faces. But they are a problem, one that’s worse than your website not working in Internet Explorer 8, the odd broken link or the extra 40 cents you could save on another brand of disposable coffee cups. Duplicate data does have a dark side that affects your business’s bottom line. IBM estimates that bad data costs US businesses, give or take, 3.1 trillion dollars a year. That is, not handling your duplicate data (and not cleansing general “bad data”) is leaking real money from your company.

Really, we can’t blame the team for not being better at data cleansing:

The reason bad data costs so much is that decision makers, managers, knowledge workers, data scientists, and others must accommodate it in their everyday work. And doing so is both time-consuming and expensive. The data they need has plenty of errors, and in the face of a critical deadline, many individuals simply make corrections themselves to complete the task at hand. They don’t think to reach out to the data creator, explain their requirements, and help eliminate root causes.

Harvard Business Review: Bad Data Costs the U.S. $3 Trillion Per Year

Conclusion

Duplicate data is a common problem across all businesses. And since technology touches nearly every industry, it has become much easier for duplicates to multiply without any effort, yet that much more challenging to combat them, with real penalties for those who try.

Financial reports derived from marketing, sales and general customer data get skewed. Credibility and morale suffer. Teams are much less productive. Proper decisions can’t be made because data is inaccurate. Extra time has to be spent hunting for the right contact, note or piece of data.

Can we really trust our data to do the job we trust it with? It really depends on how much time you and your team are spending on “fixing” the data. The extent to which your data is corrupt makes itself known. The reassuring part is that it’s never silent for long. So yes, you can probably trust your Google Analytics, your bank statements, your call history or anything that isn’t synced, touched and updated on a continual basis by dozens of people and services.

Your CRM is a different story. It is your center of data collection: the data you collect and organize, your customer data, your recorded interactions. Data that constantly moves, morphs and populates is the data you need to pay attention to, and it is probably your largest source of pain, lost productivity and lost opportunity.

Dedupely Vs. Five Alternatives For Deduping Your CRM

on December 31, 2017

We don’t have many competitors. That is, there aren’t too many other companies out there specializing in data deduplication.

However, we still do compete. I once heard a smart entrepreneur say, “Your competition is the process your market uses outside of software.” We don’t compete with other companies or focus much on what they do. We do, however, have to compete against the rather old, outdated processes our customers use before switching to Dedupely.

Let’s cover a few data deduplication alternatives and why I’m convinced Dedupely is the best:

1. Google Sheets & CSVs (Export-Clean-Import)

You can remove duplicates with Google Sheets. This takes a bit of doing, but you can eventually end up with a cleaner CSV to import back into your CRM.

Pros

  • Totally free.
  • 100% power over match calculations.

Cons

  • Merging multiple duplicates into one row may or may not be easy; often the other duplicates are simply removed.
  • You lose references to Notes, Tasks, Deals, Companies, etc. when importing back in.
  • Exporting and importing again takes a lot of extra time and work and often messes with your data.
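The first con is easy to see in a small script version of the export-clean-import process. This is a minimal sketch with made-up sample data and a hypothetical `drop_duplicate_rows` helper. It mimics a plain "remove duplicates" pass, which keeps the first matching row and throws the others away instead of merging them.

```python
import csv
import io

# Hypothetical export: the first John Smith row has no phone number.
exported = """name,email,phone
John Smith,john@example.com,
John Smith,john@example.com,555-0100
Ana Ruiz,ana@example.com,555-0111
"""


def drop_duplicate_rows(csv_text, key_fields):
    """Keep only the first row seen for each combination of `key_fields`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    kept = []
    for row in reader:
        key = tuple(row[field].strip().lower() for field in key_fields)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


cleaned = drop_duplicate_rows(exported, ["name", "email"])
for row in cleaned:
    print(row)
```

The cleaned file has one John Smith, but his phone number is gone because it only existed on the row that was dropped. Anything linked to the removed row in the CRM would be lost in the same way.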

2. Coders

Getting a coder to write deduplication scripts is one alternative I’ve seen a few companies use.

Access to powerful programming languages is abundant in this day and age. Hiring coders from Fiverr or Upwork can be fairly easy and inexpensive. You can even do this yourself if you are a tech-savvy person.

Pros

  • Get more control over your unique situation
  • Tweak and change things over time
  • Direct access to the API to avoid CSV imports/exports

Cons

  • Anything close to big data (more than a few thousand contacts) begins to require more computational power. You might need to invest in a better computer or server stack.
  • You need to get the coder to update the script every time you want to tweak how the deduplication scan works.
  • Building a deduplication script is not always simple, and paying a programmer can cost you more than $20 an hour (especially programmers who know how to write powerful algorithms).
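To give a feel for why such scripts get heavy as data grows, here is a minimal sketch of a naive fuzzy-matching pass using Python’s standard `difflib` module. The threshold, function names and sample names are assumptions for illustration, and this is not how Dedupely works internally. The naive approach compares every record with every other record, so the work grows with the square of the number of contacts.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_pairs(names, threshold=0.85):
    """Compare every name with every other name and keep the close matches."""
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if similarity(a, b) >= threshold
    ]


def comparison_count(n):
    """Number of pairwise comparisons a naive scan makes for n records."""
    return n * (n - 1) // 2


# Hypothetical sample names.
names = ["Jon Smith", "John Smith", "Ana Ruiz"]
print(fuzzy_pairs(names))      # [('Jon Smith', 'John Smith')]
print(comparison_count(1000))  # 499500
print(comparison_count(5000))  # 12497500
```

Going from 1,000 to 5,000 contacts multiplies the number of comparisons by about 25, which is where the need for more computing power, or a smarter algorithm, comes from.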

3. VAs (Outsourcing Your Data Deduping)

Using a VA (Virtual Assistant) to manually dedupe is probably the most common alternative. Our first customer (back in 2014) did this before he switched to Dedupely.

Pros

  • Relatively cheap depending on your duplicates situation.
  • You don’t have to do it yourself.

Cons

  • Morale killer for the VA (I can safely say, no one likes spending hours deduplicating contacts).
  • Can be costly depending on how many duplicates you have and how often they enter your CRM.
  • Can be very, very slow. It takes time for humans to hunt down duplicates.

4. Other Data Cleansing Companies & Consultants

Yes, there are other companies out there that specialize in data cleansing and deduplication. They are few and far between, but you will find a few if you look hard enough.

I’ve noticed that most CRM deduplication companies only focus on the enterprise. Smaller cloud CRMs are usually left in the dark. As such, pricing is much higher than smaller businesses, or businesses in general, can justify paying.

I’ve also noticed we are the only ones that cache your CRM data for faster results. Other software out there tends to only dedupe and scan via the API or CSVs, which can take months vs. taking a few hours.

One more point. I’ve seen a few companies that integrate deduplication features into their software. But it’s usually an afterthought and these companies don’t specialize, so it’s usually not very effective (at least from what I’ve heard from customers who have used these products).

So far, Dedupely is the only company solving this issue with a modern web model and giving access to the customer at a price that reflects the value.

5. DIY Manual Merges

The most mundane and hair-pulling method of deduplication is doing it yourself. This works great for small databases. But later, once you are integrating with forms and other services, building a team and importing from other sources, this will drive you and your team absolutely nuts!

Searching for duplicates is the worst part. It takes hours to detect duplicates, then later find the matching duplicates. Some CRMs will only allow you to match two duplicates at a time. So this can take way more time and effort than you really care to put into it. Not to mention, you can’t say much about your productivity after you’ve spent 12 hours deduplicating your CRM instead of talking to customers or doing pretty much anything else in your business.

Now imagine you have hundreds of duplicates. That’s a very quick task in Dedupely, but one that can take a few hours by hand. Not only that, what if your third-party integrations are causing duplicates on a daily basis? Do you really want to spend hours a week merging all that? Thousands of duplicates take this eternal suffering to a whole new level. Now, you might as well just forget about it.

Doing it yourself is a possible alternative to Dedupely if you don’t have much data. We know that you probably don’t need Dedupely in that case. Once you grow, and your data starts taking on a life of its own, we’re here to help you handle that.

Why We Believe We’re Number One

Before you start, think about the total investment. Most of these methods will cost you anywhere from a few hundred to a few thousand dollars, plus 10-20 hours or more of your time. Dedupely is made to save you time, and starts at only $29/mo (going up to a few hundred depending on the amount of data in your CRM). We estimate that you can save from 5x to 10x in resources invested.

This shameless plug is to show you why we are convinced about what we do. Why we think our product is a 10x improvement over everything else out there. We’re sold, and we want your business.

Our team stays focused on the customer. This is why we added hundreds of improvements and around a dozen small new features in 2017. We’re obsessed with serving our customers so our product is always a no-brainer.

If you do find a better alternative than Dedupely, tell us!