Checking for and managing duplicates

Why do duplicates happen?

Duplicates are an inevitable result of working in a database, and it's important to manage them by performing a periodic sweep.

Unfortunately, there's no silver bullet for eliminating duplicates, particularly if you want to ensure that no new duplicates are created and that no records are merged that shouldn’t be. The only thorough and careful way to handle your duplicates is to go through each set of them, one by one.

Avoid duplicate creation during import

To avoid creating duplicates when you’re importing data, we recommend reviewing our knowledge base article Control duplicates on import. This article provides guidance on reviewing your constituent match settings, reviewing “new” constituents, and using the Set constituent button with the magnifying glass icon (look for this icon next to the constituent name during your import or submission record review) to select a specific constituent for an import record. (These same tools can apply to reviewing records coming in through an LGL integration with another product such as LGL Forms, Wufoo, or PayPal.)

Using the Set constituent button lets you search your existing data for that constituent name, or names like it, so you can see whether the constituent is already in your database before you save a new record. If you find a match, you can save the submission into that existing constituent record, which avoids creating a duplicate.

A few other strategies can also help minimize duplicates. For example, if you have imported a file and it produced a lot of duplicate names, we recommend either of the methods listed below to minimize cleanup. (If you have already accepted the import, first undo it [go to your import and click the Undo button], make the changes in your spreadsheet, and then re-import the spreadsheet.)

Go through your import spreadsheet, creating a unique ID based on each constituent name. This ID will function as an external constituent ID. Importing the external constituent ID into LGL will let you later control which constituents match to each other for subsequent imports.
Select name-only matching, which would work well if your data doesn’t have name overlap (i.e., there are no cases in which the same names are being used by multiple constituents in your database).
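
As a rough illustration of the first method, the sketch below adds a name-based ID column to a spreadsheet exported as CSV. It is only one possible approach, not an LGL feature: the column names ("First Name", "Last Name", "External ID"), the "EXT-" prefix, and the helper functions are all hypothetical, so adjust them to match your own file and import mapping.

```python
import csv
import hashlib
import io


def external_id(first_name: str, last_name: str) -> str:
    """Build a stable ID from a normalized name (hypothetical scheme)."""
    # Normalize case and surrounding whitespace so "Nick" and " nick " agree.
    normalized = f"{first_name.strip().lower()}|{last_name.strip().lower()}"
    digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
    return f"EXT-{digest[:10]}"


def add_external_ids(rows):
    """Return copies of the rows with an 'External ID' column added."""
    return [
        {**row, "External ID": external_id(row["First Name"], row["Last Name"])}
        for row in rows
    ]


# Sample data standing in for an import spreadsheet saved as CSV.
sample_csv = """First Name,Last Name,Email
Nick,Bicknell,nick@example.org
 nick ,BICKNELL,nb@example.org
Ana,Lopez,ana@example.org
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
rows_with_ids = add_external_ids(rows)

for row in rows_with_ids:
    print(row["External ID"], row["First Name"].strip(), row["Last Name"].strip())
```

Because the ID is derived from the name alone, the first two sample rows receive the same ID. Two different people who share a name would also receive the same ID, so review those rows and adjust their IDs by hand before importing.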

NOTE: If you were to take all the records that have a full name match and merge them automatically, that would probably cut the list down by a lot. But if you have any name overlap in your data, false matches would result.

Remove an incorrect match from an import

If the Flex Importer incorrectly matches a record you are importing to an existing constituent record in your database, you can remove the match by clicking the "Clear match" button.

Avoid duplicate creation during constituent entry

To help you avoid creating duplicates as you manually enter constituents, LGL displays a notification below the name field after you enter the constituent's first and last name whenever existing records look like possible duplicates.

If you see this notification, you can click the Show results button to open the listing in the constituent entry form, or click the View in new tab button to see the results in a separate browser tab.

Another option for avoiding duplicate creation would be to search for a constituent before you add any new records to LGL. Searching by name fragments is also a good idea (*nic bick* instead of nick bicknell) because that is a more foolproof way to catch possible variations in spelling.
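
To see why fragments catch more variations than a full name, here is a minimal sketch of fragment matching. It is an illustration only and does not describe how LGL's search works internally; the function name and the sample names are hypothetical.

```python
def matches_fragments(query: str, full_name: str) -> bool:
    """True if every space-separated fragment appears somewhere in the name."""
    name = full_name.lower()
    return all(fragment in name for fragment in query.lower().split())


# Hypothetical names already in a database.
names = ["Nick Bicknell", "Nicholas Bickford", "Nik Bicknell", "Ana Lopez"]

fragment_hits = [n for n in names if matches_fragments("nic bick", n)]
full_name_hits = [n for n in names if matches_fragments("nick bicknell", n)]

print(fragment_hits)   # ['Nick Bicknell', 'Nicholas Bickford']
print(full_name_hits)  # ['Nick Bicknell']
```

The shorter fragments surface a second candidate that the full name misses. A spelling such as "Nik" still slips through, so trying more than one fragment is worthwhile.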

Methods for cleaning up duplicates

You can merge duplicate constituents on a one-off basis (see this article for details), but you can also have LGL look for duplicates in bulk from the Constituents > Duplicates page.

NOTE: Duplicates are searched for only among constituent records that have the same constituent type. By default, the search is also limited to records that share the same contact type. For example, an “Individual” record that has a "Primary" contact type will be matched only with other “Individual" records that have the "Primary" contact type.

You have several different deduplication options to choose from:

Name only: Dedupe constituents by name only. This will check a) First name + Last name together for constituent type = Individual, as well as b) Organization name for constituent type = Organization. This is not recommended if you have a large database or a lot of constituents who are different people but share the same first/last name.
Last name only: Dedupe constituents by last name only. This is not recommended if you have a large database or a lot of constituents who are different people but share the same last name.
Email only: Dedupe constituents by email only. This is a good way to flag constituents who share email addresses as possible dupes. All active email addresses for each constituent will be checked.
Email and first name: Dedupe constituents by email and first name. This is a fairly conservative approach. All active email addresses for each constituent will be checked for possible matches.
Address only: Dedupe constituents by address only. This can be helpful to identify spouse records that you want to merge. All active addresses for each constituent will be checked for possible matches.
Address and last name: Dedupe constituents by address and last name. This is a fairly conservative approach. All active addresses for each constituent will be checked for possible matches.
Address and full name: Dedupe constituents by address, last, and first name. This is a very conservative approach. All active addresses for each constituent will be checked for possible matches.
Phone number only: Dedupe by phone number only. This is a good way to flag constituents who share phone numbers as possible dupes. All active phone numbers for each constituent will be checked.
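
As a rough illustration of what an option such as Email and first name means in practice, the sketch below groups records that share a first name and at least one email address, comparing only records with the same constituent type and contact type. This is not LGL's actual implementation; the record layout and field names are hypothetical, and the "emails" list is assumed to hold active addresses only.

```python
from collections import defaultdict


def email_first_name_groups(constituents):
    """Group constituent IDs that share a first name and an email address."""
    buckets = defaultdict(set)
    for c in constituents:
        for email in c["emails"]:  # assumed: active email addresses only
            key = (
                c["constituent_type"],
                c["contact_type"],
                c["first_name"].strip().lower(),
                email.strip().lower(),
            )
            buckets[key].add(c["id"])
    # Keep buckets holding more than one record; a set removes repeated groups.
    groups = {tuple(sorted(ids)) for ids in buckets.values() if len(ids) > 1}
    return sorted(groups)


# Hypothetical records for illustration.
constituents = [
    {"id": 1, "constituent_type": "Individual", "contact_type": "Primary",
     "first_name": "Nick", "emails": ["nick@example.org", "nb@example.org"]},
    {"id": 2, "constituent_type": "Individual", "contact_type": "Primary",
     "first_name": "nick", "emails": ["NB@example.org"]},
    {"id": 3, "constituent_type": "Individual", "contact_type": "Primary",
     "first_name": "Nancy", "emails": ["nb@example.org"]},
    {"id": 4, "constituent_type": "Organization", "contact_type": "Primary",
     "first_name": "Nick", "emails": ["nick@example.org"]},
]

groups = email_first_name_groups(constituents)
print(groups)  # [(1, 2)]
```

Record 3 shares an email but not a first name, and record 4 has a different constituent type, so neither is flagged. An Email only check would drop the first name from the key and flag record 3 as well, which is why the combined option is the more conservative one.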

Once you have chosen the desired options, click the Run dupe checker button. This will clear out your existing duplicate queue and kick off a deduplication process that runs in the background. This process will take several minutes, and could take even longer depending on the size of your database and the number of duplicates found. If you want to wait, the page will refresh when the process is done. Otherwise, you can move on to other tasks in LGL and wait for the confirmation email, which lets you know that the process is complete and how many duplicates were found.

After the process is complete, you can work your way through the duplicate queue, merging records as appropriate or flagging them as “not a dupe” if they should remain separate (and be removed from consideration in the dupe queue, including in future dupe searches).

Finding dupes in a specific set of constituents

You can limit which constituents are being reviewed by choosing a specific list or import file. The constituents in that list or file will then be compared to your entire database to look for duplicates.
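
The idea of checking a subset against the whole database can be sketched as follows. This is a simplified illustration that matches on full name only, not LGL's implementation; the record fields and function names are hypothetical.

```python
from collections import defaultdict


def name_key(record):
    """Normalized full name used as the matching key in this sketch."""
    return (record["first_name"].strip().lower(),
            record["last_name"].strip().lower())


def dupes_for_subset(subset_ids, database):
    """Map each subset record ID to other database IDs with the same name."""
    by_name = defaultdict(list)
    for record in database:  # index the entire database, not just the subset
        by_name[name_key(record)].append(record["id"])

    result = {}
    for record in database:
        if record["id"] in subset_ids:
            others = [i for i in by_name[name_key(record)] if i != record["id"]]
            if others:
                result[record["id"]] = others
    return result


# Hypothetical database; IDs 2 and 5 stand in for a list or import file.
database = [
    {"id": 1, "first_name": "Nick", "last_name": "Bicknell"},
    {"id": 2, "first_name": "nick", "last_name": "bicknell"},
    {"id": 3, "first_name": "Ana", "last_name": "Lopez"},
    {"id": 4, "first_name": "Ana", "last_name": "Lopez"},
    {"id": 5, "first_name": "Sam", "last_name": "Lee"},
]

subset_dupes = dupes_for_subset({2, 5}, database)
print(subset_dupes)  # {2: [1]}
```

Record 2 is matched against record 1 even though record 1 is outside the subset, while the two Ana Lopez records are not reported because neither belongs to the subset being reviewed.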

NOTE: If you need instructions for unmerging a merged record, read Unmerging a merged constituent record.

If you're looking for information on how to minimize duplicates during an import or for data coming from LGL Forms, please see Control duplicates on import.