named tuples for the win!
Hey there 👋 How is your Python going? In this Mathspp Insider 🐍🚀 email we’ll talk about named tuples and how they can make your life easier.
This email at a glance
A couple of personal notes and an extra comment regarding my previous email about type hints;
we’ll see how a named tuple (collections.namedtuple) can enhance your developer experience when working with tuples.
Shoutout to my friend Tushar!
In my previous email about type hints I shamefully forgot to give a shoutout to my friend Tushar from https://tushar.lol.
Tushar is very knowledgeable in many aspects of Python and typing is one of them.
He helped me a lot with that email, so thank you!
Now, let us talk about tuples.
Parsing a CSV
Let me define a function that parses some CSV data and returns 4 pieces of information:
the CSV header;
all of the data rows;
the number of columns; and
the number of rows.
Here’s how we could define such a function:
import csv

def parse_csv(csv_data):
    reader = csv.reader(csv_data)
    header, *data = reader
    return (
        header,
        data,
        len(header),
        len(data),
    )
The function doesn’t do anything spectacular and returns a 4-element tuple.
What’s up with it, then?
Indices, indices everywhere!
The “problem” with this type of code, which is common, is that a tuple is a great way to aggregate data but provides a terrible way to access it.
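To make the problem concrete, here is a minimal sketch of what consuming the plain tuple looks like. The sample CSV data is made up for illustration; only parse_csv comes from the code above.

```python
import csv
import io

def parse_csv(csv_data):
    reader = csv.reader(csv_data)
    header, *data = reader
    return (header, data, len(header), len(data))

# Hypothetical sample data, only for illustration.
csv_name_data = io.StringIO("first,last\nAda,Lovelace\nAlan,Turing\n")
results = parse_csv(csv_name_data)

# The caller has to remember what lives at each position.
print(results[0])  # the header
print(results[3])  # the number of rows... or was that index 2?
```

Nothing in results[3] tells the reader what the value means, so every caller ends up memorising (or looking up) the order of the tuple.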
Enter: named tuples.
A named tuple is a tuple where the positions of the tuple can be accessed by name, as if they were regular attributes.
To create a named tuple, you can use collections.namedtuple like so:
from collections import namedtuple

ParsedCSVResults = namedtuple(
    "ParsedCSVResults",
    [
        "header",
        "data",
        "columns",
        "rows",
    ]
)
The function namedtuple takes the name of the named tuple you want to create and then the names of the positions, in order.
Then, when you create a tuple, you use the object ParsedCSVResults instead:
def parse_csv(csv_data):
    reader = csv.reader(csv_data)
    header, *data = reader
    return ParsedCSVResults(
        header,
        data,
        len(header),
        len(data),
    )
When you call this function, the value you get out of it will look a bit different but it is still a tuple.
Suppose I have some CSV data with first and last names from different people.
The result of calling the function would look like this:
>>> results = parse_csv(csv_name_data)
>>> results
ParsedCSVResults(
    header=['first', 'name'],
    data=[...],
    columns=2,
    rows=3
)
The result can still be used as a tuple but you can also access data with named attributes:
>>> results.header
['first', 'name']
>>> results.rows
3
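Here is a small sketch of what "still a tuple" means in practice. The values passed to ParsedCSVResults are hypothetical and stand in for whatever parse_csv would return.

```python
from collections import namedtuple

ParsedCSVResults = namedtuple(
    "ParsedCSVResults",
    ["header", "data", "columns", "rows"],
)

# Hypothetical values, only for illustration.
results = ParsedCSVResults(["first", "last"], [["Ada", "Lovelace"]], 2, 1)

# A named tuple is an instance of a subclass of tuple...
print(isinstance(results, tuple))

# ...so indexing, len, and unpacking keep working.
print(results[0] is results.header)
print(len(results))
header, data, columns, rows = results
print(columns, rows)
```

This means code that already indexes or unpacks the plain tuple should keep working after you switch to the named tuple.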
Advantages of named tuples
One advantage of using this thin wrapper over a plain tuple is that attribute names are much easier to remember than indices.
Another advantage of using named tuples is that you can instantiate the named tuples with keyword arguments:
def parse_csv(csv_data):
    reader = csv.reader(csv_data)
    header, *data = reader
    return ParsedCSVResults(
        header=header,
        data=data,
        columns=len(header),
        rows=len(data),
    )
If you have two or more functions working with these tuples, using named tuples reduces the chance of creating tuples with the same data but in different orders.
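As a rough illustration of that point, the sketch below builds the same named tuple twice with the keyword arguments in different orders. The field values are made up.

```python
from collections import namedtuple

ParsedCSVResults = namedtuple(
    "ParsedCSVResults",
    ["header", "data", "columns", "rows"],
)

# Hypothetical values, only for illustration.
a = ParsedCSVResults(header=["first"], data=[["Ada"]], columns=1, rows=1)
b = ParsedCSVResults(rows=1, columns=1, data=[["Ada"]], header=["first"])

# The keyword order at the call site does not matter:
# each value still lands in the position defined by the named tuple.
print(a == b)
print(b[0])
```

With plain tuples, swapping two elements by accident produces a valid tuple with the wrong layout; with keyword arguments the layout is fixed by the definition of ParsedCSVResults.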
Named tuples for typed code bases
If you use type hints, the namedtuple from collections won’t be good for you, because it gives you no way to annotate the types of the fields.
You’ll want to use typing.NamedTuple in that case:
from typing import NamedTuple

class ParsedCSVResults(NamedTuple):
    header: list[str]
    data: list[list[str]]
    columns: int
    rows: int
I prefer the named tuple from typing whenever possible because it looks much more like a regular class than the version from collections.
However, other than the way it looks and the fact that typing.NamedTuple lets you annotate the fields while collections.namedtuple doesn’t, the two behave the same.
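A quick way to convince yourself of this is to poke at the class-based version and check that it still acts like a tuple with named positions. This is only a sketch with made-up values, and it assumes Python 3.9+ for the list[str] syntax.

```python
from typing import NamedTuple

class ParsedCSVResults(NamedTuple):
    header: list[str]
    data: list[list[str]]
    columns: int
    rows: int

# Hypothetical values, only for illustration.
results = ParsedCSVResults(["first", "last"], [["Ada", "Lovelace"]], 2, 1)

print(isinstance(results, tuple))  # still a tuple underneath
print(ParsedCSVResults._fields)    # same field bookkeeping as collections.namedtuple
print(results.rows, results[3])    # attribute and index access agree
```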
Have you used named tuples before?
Have you ever used either collections.namedtuple or typing.NamedTuple?
Where?
(And which one do you prefer?)
Reply to this email and let me know!