Most B2B marketers and salespeople have heard the terms clean/dirty data, data scrub, and data hygiene. But you may still be scratching your head about what exactly these terms mean. And it’s no wonder: data hygiene is one of those terms that’s been tossed around so much that it seems to have lost its meaning.
But data hygiene is so important for keeping your CRM in tip-top shape that we’re clearing up the most common doubts about it and sharing our expert recommendations and best practices for creating and deploying your own data hygiene policy.
Let’s get started.
Data hygiene, aka data cleansing or a data scrub, is the process of removing “dirty data” from your CRM. Dirty data is data that’s outdated, incomplete, inaccurate, or otherwise faulty, and it can drag down metrics like your marketing performance. It can also inflate your software costs, because you end up paying to store contacts you don’t need. And most importantly, dirty data can prevent you from making sound business decisions because you won’t have an accurate picture of your audience at any given time.
A data hygiene policy is an ongoing effort to keep your data clean, meaning organized, accurate, and up to date. It can save sales and marketing teams precious hours every month by reducing the time it takes to manage their activities. It also improves segmentation, which allows you to launch more effective marketing and sales campaigns.
An ongoing data hygiene policy also provides you with more realistic performance reporting and cuts costs on your tech stack.
The first step in a data hygiene policy is auditing your existing data. You can’t start with a clean slate when your CRM is already up and running, because there are countless contacts in there. However, you can begin by looking for inconsistencies, duplicates, incomplete records, and other dirty data.
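If your CRM lets you export contacts, a first-pass audit can be scripted. The snippet below is a minimal sketch, not a definitive implementation: it assumes each contact is a simple record with `email`, `first_name`, and `company` fields (hypothetical names, so adjust them to your own export), and it only flags duplicate emails and records with missing required fields.

```python
from collections import Counter

# Hypothetical required fields; swap in whatever your team actually needs.
REQUIRED_FIELDS = ("email", "first_name", "company")

def audit_contacts(contacts):
    """Return duplicated emails and the positions of incomplete records."""
    # Normalize emails so "Ana@Example.com " and "ana@example.com" match.
    emails = [(c.get("email") or "").strip().lower() for c in contacts]
    counts = Counter(e for e in emails if e)
    duplicates = sorted(e for e, n in counts.items() if n > 1)
    # A record is incomplete if any required field is missing or blank.
    incomplete = [
        i for i, c in enumerate(contacts)
        if any(not str(c.get(f) or "").strip() for f in REQUIRED_FIELDS)
    ]
    return {"duplicates": duplicates, "incomplete": incomplete}

sample = [
    {"email": "ana@example.com", "first_name": "Ana", "company": "Acme"},
    {"email": "Ana@Example.com ", "first_name": "Ana", "company": "Acme"},
    {"email": "li@example.org", "first_name": "", "company": "Globex"},
]
report = audit_contacts(sample)
```

Here `report["duplicates"]` lists the email that appears twice, and `report["incomplete"]` points at the third record, which has no first name.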
Next, you want to implement standards for collecting data moving forward, so that your data remains squeaky clean for years to come.
Clean data is data that includes all the information your team needs to do their job. It’s also up to date and organized according to the standards you set when you implement a data hygiene policy.
To keep your data clean, you need to standardize your data collection process. That way, data arrives in your CRM already clean, which prevents things from getting out of hand.
How do you do this? By making sure your submission forms and any other ways of entering data contain the fields you actually need to operate. Common fields include:
Other fields will vary depending on your unique business. For example, if you run a pet grooming business, your fields may include species and breed. What truly matters is that any data you collect is a) necessary and b) as accurate as it can be.
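One way to enforce a collection standard is to tidy and check every submission before it reaches your CRM. This is a rough sketch under stated assumptions: the field names in `REQUIRED_FORM_FIELDS` are hypothetical, and the function is an illustration rather than a feature of any particular form tool.

```python
# Hypothetical field list, assumed for illustration only.
REQUIRED_FORM_FIELDS = ("first_name", "last_name", "email")

def clean_submission(form):
    """Trim whitespace, lowercase the email, and list missing required fields."""
    # Keep only text values and strip stray spaces around them.
    cleaned = {k: v.strip() for k, v in form.items() if isinstance(v, str)}
    if "email" in cleaned:
        cleaned["email"] = cleaned["email"].lower()
    # A field counts as missing if it is absent or blank after trimming.
    missing = [f for f in REQUIRED_FORM_FIELDS if not cleaned.get(f)]
    return cleaned, missing

cleaned, missing = clean_submission(
    {"first_name": " Ana ", "last_name": "", "email": "Ana@Example.com"}
)
```

In this example `missing` contains `last_name`, so the form could ask for it again instead of saving an incomplete record.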
No two databases are the same, but there are common problems that can help steer your data hygiene policy:
Duplicate contacts are a common problem for users of the HubSpot Salesforce integration, where a syncing error can create duplicates during the implementation phase. Another common cause is subscribers signing up multiple times for multiple resources (for example, downloading a guide, booking a consultation, and requesting a demo).
Missing data can be more or less serious depending on what’s missing. For example, if you’re missing the subscriber’s name but have their email, you may be able to infer the name. However, doing so is tedious and can hurt your personalization. Similarly, if you’re missing data about their company or interests, segmentation will suffer.
Outdated data, such as old company emails, addresses, or phone numbers, can result in countless work hours wasted trying to contact a lead. It can also push up your bounce rate and drag down your open and engagement rates.
Invalid data includes typos, mistyped email addresses (for example, .co instead of .com), and phone numbers without area codes. Unlike the other types, invalid data was never useful to begin with.
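Some invalid data can be caught automatically. The sketch below is illustrative only and makes several assumptions: the email pattern is deliberately loose (it is not a full address validator), the typo-domain list is a hypothetical starting point, and the phone check assumes 10-digit numbers, which only fits some countries.

```python
import re

# Deliberately simple pattern: something@domain.tld. Not a full validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$")

# Hypothetical watch list of domains that are usually typos.
LIKELY_TYPO_DOMAINS = {"gmail.co": "gmail.com", "gmail.con": "gmail.com"}

def email_issue(email):
    """Return a short description of the problem, or None if none is found."""
    email = email.strip().lower()
    if not EMAIL_PATTERN.match(email):
        return "malformed"
    domain = email.rsplit("@", 1)[1]
    if domain in LIKELY_TYPO_DOMAINS:
        return f"possible typo, did you mean {LIKELY_TYPO_DOMAINS[domain]}?"
    return None

def phone_missing_area_code(phone, expected_digits=10):
    """Assumes 10-digit numbers; change expected_digits for your market."""
    digits = re.sub(r"\D", "", phone)  # drop spaces, dashes, parentheses
    return len(digits) < expected_digits
```

A check like this works best as a flag for human review, since a legitimate address can occasionally look like a typo.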
How often should you clean up your data? It depends on how big your database is, how quickly new data comes in, and the type of data you store.
As a general rule of thumb, if you have more than 100,000 contacts, we recommend at least quarterly audits to ensure that data remains clean. On the flip side, if your list has fewer than 50,000 contacts, you could probably get away with yearly audits.
How long should you keep a contact? Again, it depends. If your customers are making a one-time purchase (like a home), you’ll likely have fewer interactions with them than if they have a monthly subscription. However, it’s always a good idea to run re-engagement campaigns to check that clients interact with your brand at least once a year. If they don’t, it’s time to cut them loose.
Another scenario where it’s definitely time to let them go? When emails bounce. A soft bounce is usually temporary (it may be caused by a full inbox or, in some tools, an Out of Office message), but a hard bounce means that the email address is either inactive or wrong, so there’s no use keeping it in your CRM.
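The two removal rules above (no interaction in a year, or a hard bounce) are simple enough to sketch in code. Treat this as an illustration under assumptions: `hard_bounced` and `last_interaction` are hypothetical field names, and contacts with no recorded interaction are left for manual review rather than removed.

```python
from datetime import date, timedelta

def should_remove(contact, today, max_inactive_days=365):
    """Flag a contact for removal based on bounces and inactivity."""
    # A hard bounce means the address is inactive or wrong.
    if contact.get("hard_bounced"):
        return True
    last_seen = contact.get("last_interaction")  # a datetime.date or None
    if last_seen is None:
        # Assumption: no recorded interaction is left for manual review.
        return False
    return (today - last_seen) > timedelta(days=max_inactive_days)

today = date(2024, 6, 1)
stale = should_remove(
    {"hard_bounced": False, "last_interaction": date(2023, 1, 15)}, today
)
```

Soft bounces are intentionally ignored here, since they may clear up on their own.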
Beyond engagement and bounces, companies in certain fields (like healthcare) need to check the regulations that govern their data management (like HIPAA). The GDPR stipulates that you keep personal data no longer than is necessary for the purpose it was collected for, which can feel a bit vague. So we recommend consulting with your legal team for more specific information regarding compliance.