Remove Duplicates from a Python List

A quick and easy function to remove duplicates from a list in Python.
Andrew Wood  •   07 June 2022
Andrew Wood  •   Last Updated: 07 June 2022

Removing duplicates from a Python list is a common task that can be performed quickly and easily by making use of the properties of a dictionary.

A dictionary is a data collection type that stores information in key:value pairs. You refer to an entry in the dictionary by its key, which allows you to extract the corresponding value. Dictionary keys have to be unique; this is the property we make use of to remove duplicate values from a list.
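As a small illustration of key uniqueness, the sketch below uses a made-up dictionary (the name pet_ages and its values are assumptions for demonstration only). Assigning to a key that already exists overwrites the stored value rather than adding a second entry.

```python
# Hypothetical example data, used only to illustrate unique keys.
pet_ages = {"dog": 3, "cat": 5}

# 'dog' is already a key, so this replaces its value
# instead of creating a second 'dog' entry.
pet_ages["dog"] = 4

# Still two keys, in their original insertion order.
keys = list(pet_ages)
print(keys)
print(pet_ages["dog"])
```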

To remove duplicates we call the dict.fromkeys() class method, which creates a new dictionary whose keys are taken from the list. Since a dictionary cannot contain duplicate keys, repeated values collapse into a single key. Dictionaries also preserve insertion order (guaranteed from Python 3.7 onwards), so the first occurrence of each value keeps its position. We then turn the newly created dictionary back into a list, which will now contain no duplicate values.

>>> somelist = ['dog','cat','hamster','dog','fish']
>>> somelist = list(dict.fromkeys(somelist))
>>> print(somelist)
# duplicates removed from list
['dog', 'cat', 'hamster', 'fish']

A Case Sensitive Duplicate Removal Function

If you have a list of strings containing both upper and lower case values, you may want to modify the filter to be either case sensitive or case insensitive depending on the application. Since a dictionary only requires that the keys be unique, the same string in a different case will be viewed as unique and will not be filtered out by the dict.fromkeys() method. 

>>> mylist = ['dog','cat','Dog']
>>> mylist = list(dict.fromkeys(mylist))
>>> print(mylist)
# 'dog' and 'Dog' are viewed as unique keys and so are not filtered out.
['dog', 'cat', 'Dog']

If you need to control whether the filtering takes case into account, the best way to proceed is to write a function that performs the filtering as required.

def remove_duplicates(alist, case_sensitive=True):
    """Remove duplicates from a list of strings.

    Defaults to a filter that IS sensitive to case,
    i.e. 'dog' and 'Dog' are unique strings.
    """
    if not case_sensitive:
        # Lowercase every string so values that differ only in case collide.
        alist = [i.lower() for i in alist]
    # Dictionary keys are unique and keep their insertion order.
    return list(dict.fromkeys(alist))

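The function returns a different result depending on whether you filter with or without case sensitivity. With case_sensitive=False every string is lowercased before filtering, so the returned list contains only lowercase strings.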

>>> animal_list = ['lion','zebra','Zebra','rhino','hippo','Lion']

>>> default_filtered_list = remove_duplicates(animal_list)
>>> print(default_filtered_list)
# resulting list is case sensitive
['lion', 'zebra', 'Zebra', 'rhino', 'hippo', 'Lion']

>>> remove_case_list = remove_duplicates(animal_list,case_sensitive=False)
>>> print(remove_case_list)
# resulting list is case insensitive
['lion', 'zebra', 'rhino', 'hippo']
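If the casing of the first occurrence needs to survive a case-insensitive filter, one possible variant is sketched below. The function name remove_duplicates_keep_case and the sample list are assumptions made for illustration; the idea is to use the lowercased string as the dictionary key and the original string as the value.

```python
def remove_duplicates_keep_case(alist):
    """Case-insensitive duplicate removal that keeps the casing
    of the first occurrence of each string (illustrative sketch)."""
    seen = {}
    for item in alist:
        # setdefault only stores the item if its lowercase key is new,
        # so later duplicates in a different case are ignored.
        seen.setdefault(item.lower(), item)
    return list(seen.values())


# Hypothetical sample data.
animals = ['Lion', 'zebra', 'Zebra', 'rhino', 'hippo', 'lion']
kept = remove_duplicates_keep_case(animals)
print(kept)
# ['Lion', 'zebra', 'rhino', 'hippo']
```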

With case_sensitive=False, the remove_duplicates() function will raise an AttributeError if the list contains items that are not strings, such as integers. This is because lower() is a string method and is not available on those other data types. If your list contains multiple data types you would need to modify the function to only apply the lower() method to strings.
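A minimal sketch of that modification is shown below. The name remove_duplicates_mixed and the sample list are assumptions for illustration. The only change is an isinstance() check so that lower() is applied to strings and every other item is passed through untouched. Items must still be hashable, since they are used as dictionary keys.

```python
def remove_duplicates_mixed(alist, case_sensitive=True):
    """Remove duplicates from a list that may mix strings with
    other hashable types (illustrative sketch)."""
    if not case_sensitive:
        # Only strings have their case normalised; other items are left as-is.
        alist = [i.lower() if isinstance(i, str) else i for i in alist]
    return list(dict.fromkeys(alist))


# Hypothetical sample data mixing strings and integers.
mixed = ['dog', 'Dog', 1, 2, 1, 'cat']
filtered = remove_duplicates_mixed(mixed, case_sensitive=False)
print(filtered)
# ['dog', 1, 2, 'cat']
```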

Canard Analytics Founder. Python development, data nerd, aerospace engineering and general aviation.