python | February 13, 2026 | 5 min read

Python Data Types & Structures

Lists, tuples, sets, and dictionaries — when to use each, how comprehensions work, mutability traps, and the time/space complexity that actually matters.

Start Here: The Mental Model

Think of Python's data structures like containers at a hardware store.

  • A list is like a shopping cart — ordered, you can add or remove items, and things can repeat.
  • A tuple is like a locked display case — ordered, but nobody's changing what's inside.
  • A set is like a bag where duplicates automatically fall out — unordered, unique items only.
  • A dictionary is like a filing cabinet — every item has a label (key) so you can find it instantly.

That mental model alone will carry you pretty far. Now let's go deeper.


Lists — Your Everyday Workhorse

A list is an ordered, mutable sequence. "Mutable" means you can change it after creation.

python
fruits = ["apple", "banana", "cherry"]
fruits.append("mango")   # adds to end
fruits[0] = "avocado"    # changes first item

When to use a list:

  • You care about order.
  • You need to add, remove, or modify items.
  • Duplicates are fine or even expected.

Common pitfall: Lists are slow for membership checks. If you're writing if item in my_list in a tight loop with thousands of items, switch to a set. Checking membership in a list is O(n): in the worst case it scans every element. A set does it in O(1) on average, because it hashes the item instead of scanning.
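To make the gap concrete, here is a small timing sketch. The names haystack_list, haystack_set, and needle are made up for illustration, and absolute timings depend on your machine, so treat the printed numbers as a rough indication only.

```python
import timeit

haystack_list = list(range(100_000))
haystack_set = set(haystack_list)  # one-time O(n) conversion
needle = 99_999                    # worst case for the list: the last element

# Both containers give the same answer; only the cost differs.
found_in_list = needle in haystack_list
found_in_set = needle in haystack_set

list_time = timeit.timeit(lambda: needle in haystack_list, number=200)
set_time = timeit.timeit(lambda: needle in haystack_set, number=200)
print(f"list: {list_time:.4f}s   set: {set_time:.6f}s")
```

The conversion to a set costs O(n) once, so it pays off when you run many checks against the same collection, not for a single lookup.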


Tuples — Lightweight and Immutable

A tuple looks like a list but uses parentheses and cannot be changed after creation.

python
coordinates = (40.7128, -74.0060)  # New York City lat/long
rgb = (255, 128, 0)

When to use a tuple:

  • The data shouldn't change — like coordinates, RGB values, or database records.
  • You need to use it as a dictionary key (lists can't be keys because they're mutable).
  • You want a small performance edge — tuples use slightly less memory than lists.

Common confusion: Some devs avoid tuples entirely and just use lists everywhere. That works, but it misses a signal. When you use a tuple, you're saying "this data is fixed." It communicates intent, and Python also enforces it.
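Two of the points above are easy to try for yourself: tuples work as dictionary keys, and Python rejects attempts to modify them. This is a minimal sketch; city_by_coords and the London entry are illustrative additions, not part of the earlier examples.

```python
# Tuples are hashable (as long as their contents are), so they can be dict keys.
city_by_coords = {
    (40.7128, -74.0060): "New York City",
    (51.5074, -0.1278): "London",
}
print(city_by_coords[(40.7128, -74.0060)])  # New York City

# Python enforces immutability: item assignment raises TypeError.
rgb = (255, 128, 0)
error_message = ""
try:
    rgb[0] = 0
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# A list cannot be a key, because lists are unhashable.
list_key_failed = False
try:
    bad = {[40.7128, -74.0060]: "New York City"}
except TypeError:
    list_key_failed = True
```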


Sets — When Uniqueness Is the Point

A set stores only unique values and doesn't preserve insertion order.

python
tags = {"python", "backend", "api"}
tags.add("python")  # duplicate, silently ignored
print(tags)         # {"python", "backend", "api"} (element order may vary)

When to use a set:

  • You need to eliminate duplicates.
  • You want fast membership testing (in checks).
  • You're doing set operations: union, intersection, difference.
python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

print(a & b)  # intersection: {3, 4}
print(a | b)  # union: {1, 2, 3, 4, 5, 6}
print(a - b)  # difference: {1, 2}

Common pitfall: You can't index a set (my_set[0] raises a TypeError). If you need both uniqueness and order, consider a list that you deduplicate, or look into dict.fromkeys(), which preserves insertion order.
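Here is a short sketch of the dict.fromkeys() approach. The visited list is made-up sample data, and the approach assumes the items are hashable and that you are on Python 3.7 or later, where dicts keep insertion order.

```python
visited = ["home", "about", "home", "pricing", "about"]

# dict keys are unique and keep insertion order, so this deduplicates
# while preserving the first occurrence of each item.
unique_in_order = list(dict.fromkeys(visited))
print(unique_in_order)  # ['home', 'about', 'pricing']

# A set also removes duplicates, but makes no promise about order.
unique_any_order = set(visited)
```

Because the result is a list again, you can index it normally, for example unique_in_order[0].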


Dictionaries — Key-Value Lookup, Fast

A dictionary maps keys to values. Lookups by key are O(1) on average, so they stay fast no matter how large the dictionary grows.

python
user = {
    "name": "Alice",
    "age": 30,
    "active": True,
}

print(user["name"])                  # prints Alice
user["email"] = "alice@example.com"  # add a new key

When to use a dictionary:

  • You need to look something up by name or ID.
  • You're counting things (word_count["hello"] += 1).
  • You're grouping data by a category.
  • You want structured data without defining a class.
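Counting and grouping both have a small catch: word_count["hello"] += 1 raises a KeyError if "hello" isn't in the dictionary yet. The sketch below shows three common ways around that, using the standard library's Counter and defaultdict. The words list and the by_letter grouping are illustrative.

```python
from collections import Counter, defaultdict

words = ["hello", "world", "hello", "python"]

# Plain dict: .get() supplies 0 the first time a word is seen.
word_count = {}
for word in words:
    word_count[word] = word_count.get(word, 0) + 1
print(word_count)  # {'hello': 2, 'world': 1, 'python': 1}

# Counter does the same bookkeeping for you.
counts = Counter(words)
print(counts["hello"])  # 2

# Grouping by category: defaultdict(list) creates the empty list on first use.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)
print(dict(by_letter))  # {'h': ['hello', 'hello'], 'w': ['world'], 'p': ['python']}
```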

Common pitfall: Accessing a key that doesn't exist raises a KeyError. Use .get() instead for safe access:

python
# This raises a KeyError if "email" doesn't exist
user["email"]

# This returns the default you pass ("no email" here) if "email" doesn't exist.
# With no default, .get() returns None.
user.get("email", "no email")

List Comprehensions and Dict Comprehensions

These are one-liners for building collections. They're usually a little faster than the equivalent loop that calls append and, once you're used to them, more readable.

List comprehension:

python
# The long way
squares = []
for n in range(10):
    squares.append(n ** 2)

# The comprehension way
squares = [n ** 2 for n in range(10)]

# With a filter
even_squares = [n ** 2 for n in range(10) if n % 2 == 0]

Dict comprehension:

python
words = ["apple", "banana", "cherry"]
word_lengths = {word: len(word) for word in words}
# {"apple": 5, "banana": 6, "cherry": 6}

When to use them: When you're transforming or filtering a collection in one step. If the logic gets complex (nested ifs, nested loops), a regular for loop is clearer. Comprehensions aren't always better — just often more concise.
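To show where that line sits, here is a sketch with one comprehension that reads well and one that is arguably past the point of comfort. The orders data and the pair-building logic are invented for illustration.

```python
orders = [
    {"id": 1, "total": 40.0, "paid": True},
    {"id": 2, "total": 15.5, "paid": False},
    {"id": 3, "total": 99.9, "paid": True},
]

# One transform plus one filter: a comprehension reads well.
paid_totals = [order["total"] for order in orders if order["paid"]]
print(paid_totals)  # [40.0, 99.9]

# Two loops and two conditions in one line: legal, but hard to scan.
pairs = [(x, y) for x in range(3) for y in range(3) if x != y if x + y < 3]

# The equivalent explicit loop is longer but easier to follow and debug.
pairs_loop = []
for x in range(3):
    for y in range(3):
        if x != y and x + y < 3:
            pairs_loop.append((x, y))

print(pairs == pairs_loop)  # True
```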


Mutable vs Immutable — Why It Matters

Mutable objects can be changed after creation: lists, dicts, sets.

Immutable objects cannot: integers, floats, strings, tuples.

This distinction has practical consequences.

python
# Strings are immutable: this creates a NEW string and leaves the original unchanged
name = "alice"
name.upper()         # returns "ALICE", but name is still "alice"
name = name.upper()  # now name is "ALICE"

The bigger trap is with mutable default arguments in functions:

python
# DON'T do this
def add_item(item, my_list=[]):
    my_list.append(item)
    return my_list

add_item("a")  # ["a"]
add_item("b")  # ["a", "b"]: the list persists between calls!

# DO this instead
def add_item(item, my_list=None):
    if my_list is None:
        my_list = []
    my_list.append(item)
    return my_list

This trips up almost every Python developer at least once. The default list is created once when the function is defined, not each time it's called.
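You can watch this happen by inspecting the function's __defaults__ attribute, where CPython stores default values. This is a small sketch; add_item_shared is just the buggy version from above under a different name, so it doesn't clash with the fixed one.

```python
def add_item_shared(item, my_list=[]):  # same bug as above, renamed for illustration
    my_list.append(item)
    return my_list

print(add_item_shared.__defaults__)  # ([],)

first = add_item_shared("a")
second = add_item_shared("b")

# The single default list has been mutated in place...
print(add_item_shared.__defaults__)  # (['a', 'b'],)

# ...and both calls returned that very same object.
print(first is second)  # True
```

With the None pattern, each call that omits my_list builds a fresh list inside the function body, so nothing is shared between calls.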


Time and Space Complexity — The Practical Summary

You don't need to memorize every operation, but these are the ones that matter most in practice.

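Operation              List    Set     Dict
Access by index        O(1)    -       -
Access by key          -       -       O(1)
Membership test (in)   O(n)    O(1)    O(1)
Append / Add           O(1)    O(1)    O(1)
Insert at position     O(n)    -       -
Delete by value        O(n)    O(1)    O(1) (by key)

A dash means the operation doesn't apply. Set and dict costs are average-case, and list append is amortized O(1).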

The one to internalize first: membership testing in a list is slow. If you're checking if x in collection frequently, use a set or dict instead.

Space-wise, sets and dicts consume more memory than lists because they maintain hash tables internally. For small datasets this doesn't matter. For millions of records, it might.
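If you want a rough feel for the difference, sys.getsizeof reports the size of the container itself. This is only a sketch: it doesn't count the memory of the stored elements, and the exact numbers vary by Python version and platform, so no specific output is shown. The names as_list, as_set, and as_dict are illustrative.

```python
import sys

n = 1_000
as_list = list(range(n))
as_set = set(as_list)
as_dict = dict.fromkeys(as_list)  # keys 0..n-1, every value None

# getsizeof measures only the container's own storage, not the integers inside it.
for name, container in [("list", as_list), ("set", as_set), ("dict", as_dict)]:
    print(f"{name:>4}: {sys.getsizeof(container):,} bytes")
```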


Takeaway

  • List when order matters and data changes.
  • Tuple when data is fixed and order matters.
  • Set when you need uniqueness and fast lookups.
  • Dict when you need to find things by a key.

Use comprehensions to build collections cleanly, but keep them readable. Understand mutability so you don't get surprised by shared state. And know that in checks on a list are linear — if you're doing many of them, reach for a set.

These aren't just syntax rules. Each choice communicates intent to the next person reading your code — including future you.
