Dictionary
Dictionaries are arguably the most important data structure in Python for
placements. Their O(1) average time complexity for lookups, insertions,
and deletions is the key to optimizing a vast number of algorithmic
problems. Interviewers often treat a candidate's ability to recognize when and
how to use a dictionary as a strong signal of problem-solving skill.
Part 1: The Fundamentals of Dictionaries
Core Explanation
A Python dictionary (dict) is an unordered (in Python < 3.7) or insertion-
ordered (Python 3.7+) collection of key-value pairs. It's Python's
implementation of a hash map or hash table.
Key-Value Pair: Each entry in a dictionary has two parts: a key and
a value. You use the unique key to look up its associated value.
Keys:
o Must be unique. You cannot have duplicate keys.
o Must be hashable, which in practice means immutable. You can use
strings, numbers, or tuples of hashable items as keys, but not lists
or other dictionaries.
Values: Can be anything: numbers, strings, lists, even other
dictionaries.
1. Creating a Dictionary
# 1. Using curly braces {} (most common)
student = {"name": "Alice", "age": 25, "courses": ["Math", "Physics"]}
# 2. Using the dict() constructor
person = dict(name="Bob", age=30)
# 3. Creating an empty dictionary
empty_dict = {}
2. Accessing Values
This is where dictionaries shine.
Bracket Notation []:
o print(student["name"]) -> "Alice"
o Pros: Direct and simple.
o Cons: If the key does not exist, it will raise a KeyError, which
can crash your program.
The .get() Method (Safer):
o print(student.get("age")) -> 25
o print(student.get("id")) -> None (returns None instead of
crashing)
o You can also provide a default value to return if the key is not
found:
o print(student.get("id", "Not Found")) -> "Not Found"
3. Adding and Modifying Pairs
The syntax is the same for both operations.
# Add a new key-value pair
student["gpa"] = 3.8
# Modify an existing value
student["age"] = 26
print(student)
# Output: {'name': 'Alice', 'age': 26, 'courses': ['Math', 'Physics'], 'gpa': 3.8}
Placement Point of View (Part 1)
KeyError vs. .get(): A good programmer anticipates potential errors. In an interview,
using .get() to safely access dictionary keys shows you write robust code and handle
edge cases gracefully.
Immutability of Keys: An interviewer might ask, "Why can't a list be a dictionary key?"
The correct answer—"Because lists are mutable, and a key's hash value must remain
constant for the hash map to work"—demonstrates a deep understanding of the
underlying data structure.
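The key rule is easy to check directly. The following is a minimal sketch (the names grid, list_key_error, and nested_hashable are illustrative, not from the text above) showing that a tuple works as a key while a list, or a tuple containing a list, does not:

```python
# Tuples of immutable values are hashable, so they work as keys.
grid = {(0, 0): "origin", (1, 2): "point A"}
print(grid[(1, 2)])

# A list is mutable and therefore unhashable; using it as a key raises TypeError.
list_key_error = None
try:
    bad = {[1, 2]: "point A"}
except TypeError as exc:
    list_key_error = str(exc)
    print("TypeError:", exc)

# A tuple is only hashable if everything inside it is hashable too.
try:
    hash((1, [2, 3]))
    nested_hashable = True
except TypeError:
    nested_hashable = False
print(nested_hashable)
```

If a list-like key is needed, converting it with tuple(my_list) first is the usual workaround.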
Part 2: Iterating and Advanced Dictionary Methods
1. Iterating Through Dictionaries
There are three main ways to loop through a dictionary.
student = {"name": "Alice", "age": 25, "gpa": 3.8}
# 1. Iterating over keys (the default behavior)
print("--- Keys ---")
for key in student:
    print(key, "->", student[key])  # Look up the value for each key
# 2. Iterating over values using .values()
print("\n--- Values ---")
for value in student.values():
    print(value)
# 3. Iterating over key-value pairs using .items() (most common and useful)
print("\n--- Items (key-value pairs) ---")
for key, value in student.items():  # Unpacks each (key, value) tuple
    print(f"{key}: {value}")
2. Other Essential Methods
'key' in my_dict: The preferred way to check for the existence of a key. This is
an O(1) operation on average. It checks keys only, not values.
len(my_dict): Returns the number of key-value pairs.
my_dict.clear(): Removes all items from the dictionary.
my_dict.update(other_dict): Merges one dictionary with another. If there are
overlapping keys, the values from other_dict will overwrite the existing ones.
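A short sketch of these four operations together may help. The inventory dictionary and the variable names are made up for illustration:

```python
inventory = {"apples": 3, "pears": 5}

has_apples = "apples" in inventory   # True: membership tests look at keys
has_three = 3 in inventory           # False: 3 is a value, not a key
count = len(inventory)               # 2 key-value pairs

# update() overwrites overlapping keys ("pears") and adds new ones ("plums").
inventory.update({"pears": 7, "plums": 1})
merged = dict(inventory)             # shallow copy taken before clearing
print(merged)  # {'apples': 3, 'pears': 7, 'plums': 1}

inventory.clear()
print(inventory)  # {}
```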
The .setdefault(key, default) Method
This is a powerful method for a specific pattern: adding a key only if it doesn't already exist.
How it works:
1. It checks if key is in the dictionary.
2. If the key exists, it returns its value.
3. If the key does not exist, it inserts the key with the specified default value and
then returns that default value.
# Before (clumsy way)
stats = {}
word = "hello"
if word not in stats:
    stats[word] = 0
stats[word] += 1
# After (using setdefault)
stats = {}
word = "hello"
stats.setdefault(word, 0) # Sets to 0 if not present, then returns the value (0)
stats[word] += 1
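Because setdefault returns the stored value, it is also handy when the default is a mutable container such as a list. The sketch below groups words by first letter; the words list and the by_letter name are hypothetical examples:

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]
by_letter = {}
for word in words:
    # Returns the existing list for this letter, or inserts [] and returns it,
    # so the append always lands in the list stored in the dictionary.
    by_letter.setdefault(word[0], []).append(word)

print(by_letter)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```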
# The most common pattern: Frequency Counting
sentence = "this is a sentence this is a test"
word_counts = {}
for word in sentence.split():
    # Read the current count (0 if the word is new) and store it incremented.
    word_counts[word] = word_counts.get(word, 0) + 1
print(word_counts)
# Output: {'this': 2, 'is': 2, 'a': 2, 'sentence': 1, 'test': 1}
Part 3: Dictionaries for Problem Solving (Placement Focus)
This is the most critical part. The primary use of dictionaries in interviews
is to trade space for time. By using a dictionary (which takes up O(n)
space in the worst case), you can often reduce a problem's time
complexity from O(n²) or O(n log n) down to O(n).
Pattern 1: The Cache / Memoization
Use Case: Store the results of expensive computations to avoid
recalculating them. This is the heart of dynamic programming.
Example: Fibonacci Sequence
o A naive recursive solution is O(2ⁿ) because it re-computes the
same values over and over.
o Using a dictionary as a cache reduces it to O(n).
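One possible way to write the cached version is sketched below. It assumes the convention fib(0) = 0 and fib(1) = 1, and the cache parameter is an illustrative design choice rather than the only option:

```python
def fib(n, cache=None):
    """Return the n-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    if cache is None:
        cache = {}            # fresh cache per top-level call
    if n < 2:
        return n
    if n not in cache:
        # Each n is computed once; later requests are O(1) dictionary lookups.
        cache[n] = fib(n - 1, cache) + fib(n - 2, cache)
    return cache[n]

print(fib(10))  # 55
print(fib(50))  # 12586269025
```

The standard library's functools.lru_cache decorator provides the same kind of caching without a hand-written dictionary.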
Pattern 2: The Frequency Counter
Use Case: Count the occurrences of items in a sequence. This is the
most common dictionary pattern.
Example: Find if a string is an anagram of another. Two strings are
anagrams if they contain the same characters with the same
frequencies.
Problem: s1 = "listen", s2 = "silent" -> True. s1 = "rat", s2 = "car" -> False.
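A minimal sketch of the frequency-counter approach follows. The helper name char_counts is an assumption made for this example, and the comparison is case-sensitive:

```python
def char_counts(s):
    """Build a {character: count} dictionary for the string s."""
    counts = {}
    for ch in s:
        counts[ch] = counts.get(ch, 0) + 1
    return counts

def is_anagram(s1, s2):
    """Return True if s1 and s2 contain the same characters with the same frequencies."""
    if len(s1) != len(s2):    # cheap early exit
        return False
    return char_counts(s1) == char_counts(s2)

print(is_anagram("listen", "silent"))  # True
print(is_anagram("rat", "car"))        # False
```

This runs in O(n) time, compared with O(n log n) for the alternative of sorting both strings and comparing them.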
Practice Question 1: Two Sum (The Classic Revisited)
Scenario: You are given a list of numbers and a target value.
Task: Find the indices of two numbers in the list that add up to the
target. Assume there is exactly one solution.
Why it's a dictionary problem: A brute-force O(n²) solution uses
two nested loops. To optimize to O(n), you need a way to check
"Have I seen the number I need?" in O(1) time. That's a perfect job
for a dictionary.
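One common single-pass solution is sketched here. The dictionary name seen is illustrative, and returning None when no pair exists is an assumption, since the task guarantees a solution:

```python
def two_sum(nums, target):
    """Return the indices of two numbers in nums that add up to target."""
    seen = {}  # maps each number seen so far to its index
    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:          # O(1) average lookup
            return [seen[complement], i]
        seen[num] = i                   # store after the check so an element is not paired with itself
    return None

print(two_sum([2, 7, 11, 15], 9))  # [0, 1]
print(two_sum([3, 2, 4], 6))       # [1, 2]
```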
Practice Question 2: Grouping Anagrams
Scenario: You have a list of words and you need to group them
together if they are anagrams of each other.
Task: Given a list of strings, group the anagrams together.
Example: ["eat", "tea", "tan", "ate", "nat", "bat"] -> [["eat", "tea",
"ate"], ["tan", "nat"], ["bat"]]
Why it's a dictionary problem: We need a way to "categorize"
words. What do all anagrams have in common? A sorted version of
their letters. We can use the sorted string as a key in a dictionary,
and the value will be a list of all words that match that key.
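The idea can be sketched as follows. Joining the sorted letters into a string is one choice of key, and a tuple of the sorted letters would work equally well:

```python
def group_anagrams(words):
    """Group words that are anagrams of each other."""
    groups = {}
    for word in words:
        key = "".join(sorted(word))          # "eat", "tea", "ate" all map to "aet"
        groups.setdefault(key, []).append(word)
    return list(groups.values())

print(group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"]))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
```

The group order shown relies on dictionaries preserving insertion order (Python 3.7+).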
Topic: Sets in Python
1. Core Explanation
A set is an unordered collection of unique, immutable elements. Think
of it as a mathematical set.
Let's break down those keywords:
Unordered: Sets do not maintain any order. You cannot ask for the
element at index 0 because there is no concept of an index. The
elements might be stored in a way that seems random.
Unique: A set cannot contain duplicate elements. If you add an
element that already exists, the set simply remains unchanged.
Immutable Elements: The items within a set must be hashable, which in
practice means an immutable type (e.g., string, int, float, or a tuple of
hashable items). You cannot put a list or a dict inside a set.
How do they work?
Like dictionaries, sets are built on top of a hash table. This is what gives
them their incredible performance for checking membership. When you
add an item, its hash is calculated to determine where to store it. This is
why the elements must be immutable (so their hash value never
changes).
# 1. Using curly braces {}
my_set = {1, 2, 3, "hello", 3, 2} # Duplicates are automatically removed
print(my_set) # Output: {1, 2, 3, 'hello'} (order is not guaranteed)
# 2. Using the set() constructor on an iterable
list_with_duplicates = [1, 2, 2, 3, 4, 4, 4]
unique_numbers = set(list_with_duplicates)
print(unique_numbers) # Output: {1, 2, 3, 4}
# 3. Creating an empty set - THE TRAP!
empty_set = set() # This is the ONLY way
# empty_dict = {} # This creates an empty DICTIONARY, not a set.
2. Set Mathematical Operations
Sets support powerful mathematical operations that are highly efficient.
Union ( | or .union() ): All elements from both sets.
Intersection ( & or .intersection() ): Elements that are in both sets.
Difference ( - or .difference() ): Elements in the first set but not in the second.
Symmetric Difference ( ^ or .symmetric_difference() ): Elements in either set,
but not in both.
devs = {"Alice", "Bob", "Charlie", "David"}
qa = {"Charlie", "David", "Eve", "Frank"}
# Union: Everyone involved in the project
everyone = devs.union(qa)
# or everyone = devs | qa
print(f"Union: {everyone}")
# Intersection: People who do both development and QA
cross_functional = devs.intersection(qa)
# or cross_functional = devs & qa
print(f"Intersection: {cross_functional}")
# Difference: People who are ONLY developers
only_devs = devs.difference(qa)
# or only_devs = devs - qa
print(f"Difference (Only Devs): {only_devs}")
# Symmetric Difference: People who are in one team but not both
specialists = devs.symmetric_difference(qa)
# or specialists = devs ^ qa
print(f"Symmetric Difference: {specialists}")
3. Placement Point of View & Common Use Cases
When should a candidate reach for a set in an interview?
1. Deduplication: When you need to get the unique elements from a
list. The list(set(my_list)) pattern is the shortest way to do this, but
it does not preserve the original order. If order matters,
list(dict.fromkeys(my_list)) keeps the first occurrence of each item
(Python 3.7+).
2. Fast Membership Checking: This is the most important one. If
you find yourself repeatedly checking if item in my_list: inside a
loop, your algorithm is likely O(n²). By converting the list to a set
first (my_set = set(my_list)), each check becomes O(1) on average,
and your overall algorithm becomes O(n). This is a critical
optimization that interviewers look for.
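Both use cases can be sketched in a few lines. The function name common_items and the sample data are hypothetical:

```python
my_list = [3, 1, 3, 2, 1]

# Deduplication: a set drops duplicates but does not keep the original order.
unique_any_order = list(set(my_list))
# dict.fromkeys keeps the first occurrence of each item in order (Python 3.7+).
unique_in_order = list(dict.fromkeys(my_list))
print(unique_in_order)  # [3, 1, 2]

def common_items(a, b):
    """Return the items of a that also appear in b, in a's order."""
    b_set = set(b)                        # built once, O(len(b))
    return [x for x in a if x in b_set]   # each check is O(1) on average

print(common_items([1, 2, 3, 4], [4, 2, 9]))  # [2, 4]
```

Writing x in b (the list) inside the comprehension would give the same result, but each check would scan the whole list.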
Sets Practice Question 1: Finding Duplicate Numbers
Scenario: You are given a list of integers. Your task is to find out if
the list contains any duplicate values.
Task: Write a function contains_duplicate(nums) that returns True if
any value appears at least twice in the list, and False if every
element is distinct.
Why it's a set problem: The core property of a set is that it only
holds unique elements. We can leverage this to solve the problem
very efficiently.
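A possible solution is sketched below. It stops at the first repeated value instead of building the full set:

```python
def contains_duplicate(nums):
    """Return True if any value appears at least twice in nums."""
    seen = set()
    for num in nums:
        if num in seen:       # O(1) average membership check
            return True
        seen.add(num)
    return False

print(contains_duplicate([1, 2, 3, 1]))  # True
print(contains_duplicate([1, 2, 3, 4]))  # False
```

A shorter alternative is len(set(nums)) != len(nums), which is also O(n) but always processes the whole list.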
Sets Practice Question 2: Finding Missing Numbers
Scenario: You are given two lists of student IDs. The first list
(all_students) contains all students enrolled in a course. The second
list (submitted_hw) contains all students who submitted their
homework.
Task: Write a function find_missing_students(all_students,
submitted_hw) that returns a list of student IDs who did not submit
their homework.
Why it's a set problem: This is a perfect use case for the set
difference operation. We want to find items that are in the first
collection but not in the second.
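A minimal sketch using set difference follows. Sorting the result is an assumption made here so the output is deterministic. It requires the IDs to be mutually comparable (for example, all integers or all strings):

```python
def find_missing_students(all_students, submitted_hw):
    """Return the IDs in all_students that are not in submitted_hw."""
    missing = set(all_students) - set(submitted_hw)
    return sorted(missing)    # sets are unordered, so sort for a stable result

all_students = [101, 102, 103, 104, 105]
submitted_hw = [102, 105, 101]
print(find_missing_students(all_students, submitted_hw))  # [103, 104]
```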