# 19.5: Sets

- Page ID
- 40847

In Section 13.6 I use dictionaries to find the words that appear in a document but not in a word list. The function I wrote takes `d1`

, which contains the words from the document as keys, and `d2`

, which contains the list of words. It returns a dictionary that contains the keys from `d1`

that are not in `d2`

.

def subtract(d1, d2): res = dict() for key in d1: if key not in d2: res[key] = None return res

In all of these dictionaries, the values are `None`

because we never use them. As a result, we waste some storage space.

Python provides another built-in type, called a `set`

, that behaves like a collection of dictionary keys with no values. Adding elements to a set is fast; so is checking membership. And sets provide methods and operators to compute common set operations.

For example, set subtraction is available as a method called `difference`

or as an operator, `-`

. So we can rewrite `subtract`

like this:

def subtract(d1, d2): return set(d1) - set(d2)

The result is a set instead of a dictionary, but for operations like iteration, the behavior is the same.

Some of the exercises in this book can be done concisely and efficiently with sets. For example, here is a solution to `has_duplicates`

, from Exercise 10.15.7, that uses a dictionary:

def has_duplicates(t): d = {} for x in t: if x in d: return True d[x] = True return False

When an element appears for the first time, it is added to the dictionary. If the same element appears again, the function returns `True`

.

Using sets, we can write the same function like this:

def has_duplicates(t): return len(set(t)) < len(t)

An element can only appear in a set once, so if an element in `t`

appears more than once, the set will be smaller than `t`

. If there are no duplicates, the set will be the same size as `t`

.

We can also use sets to do some of the exercises in Chapter 9. For example, here’s a version of `uses_only`

with a loop:

def uses_only(word, available): for letter in word: if letter not in available: return False return True

`uses_only`

checks whether all letters in `word`

are in `available`

. We can rewrite it like this:

def uses_only(word, available): return set(word) <= set(available)

The `<=`

operator checks whether one set is a subset of another, including the possibility that they are equal, which is true if all the letters in `word`

appear in `available`

.

As an exercise, rewrite `avoids`

using sets.