Let's talk about creating CSV files in Python.
Python's csv module has a writer callable, which accepts a file object or a file-like object, and which returns a writer object (similar to csv.reader):
>>> import csv
>>> csv_file = open("pets.csv", mode="wt", newline="")
>>> writer = csv.writer(csv_file)
We can write a new row to our CSV file by calling our writer object's writerow method, and providing an iterable representing each column in our row:
>>> import csv
>>> csv_file = open("pets.csv", mode="wt", newline="")
>>> writer = csv.writer(csv_file)
>>> writer.writerow(["Joffrey", "Cat", "Color Point", 3])
27
>>> csv_file.close()
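The 27 echoed above is the return value of writerow: it passes along whatever the underlying file's write method returned, which for a text file is the number of characters written. Here is a minimal sketch of that, assuming an in-memory io.StringIO buffer (not part of the example above) stands in for the file:

```python
import csv
import io

# io.StringIO is a stand-in for a real file here (an assumption made
# so that the example leaves nothing on disk).
buffer = io.StringIO()
writer = csv.writer(buffer)

# writerow returns whatever the file object's write method returned,
# which for text streams is the number of characters written.
chars_written = writer.writerow(["Joffrey", "Cat", "Color Point", 3])

print(chars_written)            # 27
print(repr(buffer.getvalue()))  # 'Joffrey,Cat,Color Point,3\r\n'
```

The count includes the two line-ending characters (\r\n) that the writer adds to each row.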
The writerow method relies on our file's write method, and file writes are buffered, which means our file contents may not actually be written until our file has been closed.
So, just as when manually writing to a file in Python, it's a good idea to use your file as a context manager (that is, in a with block) when writing CSV files:
>>> import csv
>>> with open("pets.csv", mode="wt", newline="") as csv_file:
...     writer = csv.writer(csv_file)
...     writer.writerow(["Joffrey", "Cat", "Color Point", 3])
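To see why closing the file matters, we can compare the file's size on disk before and after the close. This is only a sketch: it assumes a throwaway temporary directory, and the exact buffering behavior is an implementation detail rather than a guarantee.

```python
import csv
import os
import tempfile

# A throwaway directory so the example doesn't touch real files.
with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "pets.csv")

    csv_file = open(path, mode="wt", newline="")
    writer = csv.writer(csv_file)
    writer.writerow(["Joffrey", "Cat", "Color Point", 3])

    # The row is typically still sitting in Python's write buffer here,
    # so the file on disk is usually still empty at this point.
    size_before_close = os.path.getsize(path)

    csv_file.close()  # closing flushes the buffer to disk

    size_after_close = os.path.getsize(path)

print(size_before_close)
print(size_after_close)  # 27
```

A with block does that close for us automatically, even if an exception is raised partway through.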
If you have an iterable of rows (that is, an iterable of iterables) that you'd like to write to a CSV file:
>>> cities = [
...     ["New York City", 8175133, 81, .314],
...     ["Los Angeles", 3971883, 66, .121],
...     ["Chicago", 2695598, 54, .186],
...     ["Houston", 2296224, 78, .081],
...     ["Phoenix", 1563025, 79, .045],
... ]
You could use a for loop along with the writerow method:
>>> import csv
>>> with open("cities.csv", mode="wt", newline="") as csv_file:
...     writer = csv.writer(csv_file)
...     for row in cities:
...         writer.writerow(row)
...
32
30
26
26
26
But writer objects actually include a writerows method that we could use instead:
>>> import csv
>>> with open("cities.csv", mode="wt", newline="") as csv_file:
...     writer = csv.writer(csv_file)
...     writer.writerows(cities)
...
The writerows method will loop over a given iterable of rows and call the writerow method on each row for us.
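As a quick check of that equivalence, here is a small sketch that writes the same rows both ways and compares the results. It assumes in-memory io.StringIO buffers in place of real files and a shortened list of cities:

```python
import csv
import io

cities = [
    ["New York City", 8175133, 81, .314],
    ["Los Angeles", 3971883, 66, .121],
]

# In-memory buffers stand in for files (an assumption for illustration).
looped = io.StringIO()
writer = csv.writer(looped)
for row in cities:
    writer.writerow(row)

batched = io.StringIO()
csv.writer(batched).writerows(cities)

# Both approaches produce exactly the same CSV text.
print(looped.getvalue() == batched.getvalue())  # True
```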
Occasionally, I find myself reverting to using writerow in a for loop though, because sometimes I need to change my data a little bit as I loop:
>>> import csv
>>> with open("cities.csv", mode="wt", newline="") as csv_file:
...     writer = csv.writer(csv_file)
...     for city, population, air_index, public_transit in cities:
...         population = f"{population:,}"
...         public_transit = f"{public_transit:.1%}"
...         writer.writerow([city, population, air_index, public_transit])
In such cases, you can often change loops like this into a generator expression, and then pass that generator expression into the writerows method:
>>> import csv
>>> with open("cities.csv", mode="wt", newline="") as csv_file:
...     writer = csv.writer(csv_file)
...     writer.writerows(
...         [city, f"{population:,}", air_index, f"{public_transit:.1%}"]
...         for city, population, air_index, public_transit in cities
...     )
...
So we still do the processing we need, but we're able to use writerows instead of a for loop and writerow.
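One side effect worth knowing about: the formatted population strings now contain commas, which is also our delimiter. The writer handles this by quoting those fields. Here is a small sketch of what ends up in the output, assuming an in-memory io.StringIO buffer in place of cities.csv and a shortened list of cities:

```python
import csv
import io

cities = [
    ["New York City", 8175133, 81, .314],
    ["Chicago", 2695598, 54, .186],
]

output = io.StringIO()  # stand-in for the real file (an assumption)
writer = csv.writer(output)
writer.writerows(
    [city, f"{population:,}", air_index, f"{public_transit:.1%}"]
    for city, population, air_index, public_transit in cities
)

# Fields that contain the delimiter are wrapped in double quotes.
lines = output.getvalue().splitlines()
print(lines[0])  # New York City,"8,175,133",81,31.4%
print(lines[1])  # Chicago,"2,695,598",54,18.6%
```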
What if we wanted to write a header to our CSV file?
A writer object doesn't know anything about headers.
But CSV headers are really just another row.
So we could call the writerow method, and then write out the rest of our rows after that:
import csv
cities = [
    ["New York City", 8175133, 81, .314],
    ["Los Angeles", 3971883, 66, .121],
    ["Chicago", 2695598, 54, .186],
    ["Houston", 2296224, 78, .081],
    ["Phoenix", 1563025, 79, .045],
]
with open("cities.csv", mode="wt", newline="") as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow([
        "City",
        "Population",
        "Air Quality Index",
        "Public Transit Users",
    ])
    writer.writerows(
        [city, f"{population:,}", air_index, f"{public_transit:.1%}"]
        for city, population, air_index, public_transit in cities
    )
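To confirm that the header really is just the first row, we can read the data back with csv.reader. This sketch assumes an in-memory io.StringIO buffer and a trimmed-down, two-column version of the data:

```python
import csv
import io

header = ["City", "Population"]
rows = [["New York City", 8175133], ["Los Angeles", 3971883]]

output = io.StringIO()  # stand-in for the real file (an assumption)
writer = csv.writer(output)
writer.writerow(header)  # the header is written like any other row
writer.writerows(rows)

# Rewind and read everything back to see the header come out first.
output.seek(0)
read_back = list(csv.reader(output))
print(read_back[0])  # ['City', 'Population']
print(read_back[1])  # ['New York City', '8175133']
```

Note that everything comes back as strings: the CSV format itself has no notion of numbers.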
Just as with csv.reader, Python's csv.writer accepts an optional delimiter argument (which defaults to a comma).
We can set the delimiter to any single character.
For example, we could write out a tab-delimited file by specifying a delimiter of \t:
>>> with open("cities.csv", mode="wt", newline="") as csv_file:
...     writer = csv.writer(csv_file, delimiter="\t")
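Here is a minimal sketch of what a tab-delimited row looks like once written, again assuming an in-memory io.StringIO buffer in place of a real file:

```python
import csv
import io

output = io.StringIO()  # stand-in for the real file (an assumption)
writer = csv.writer(output, delimiter="\t")
writer.writerow(["Chicago", 2695598, 54, .186])

# Each column is separated by a tab character instead of a comma.
print(repr(output.getvalue()))  # 'Chicago\t2695598\t54\t0.186\r\n'
```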
If you'd prefer to think in terms of dictionaries for each of the rows in your file, instead of in terms of iterables or lists, you could use Python's DictWriter class:
>>> import csv
>>> csv.DictWriter
<class 'csv.DictWriter'>
We'll need an iterable of dictionaries to use DictWriter:
>>> city_rows = [
...     {
...         "city": city,
...         "population": f"{population:,}",
...         "Air Quality Index": air_index,
...         "Public Transit Users": f"{public_transit:.1%}",
...     }
...     for city, population, air_index, public_transit in cities
... ]
The DictWriter class accepts an iterable of fieldnames, which specify the order of the columns in our CSV file:
>>> with open("cities.csv", mode="wt", newline="") as csv_file:
...     writer = csv.DictWriter(csv_file, fieldnames=city_rows[0].keys())
Unlike writer objects, DictWriter objects actually have a writeheader method, because they know about our headers and can write them out to our file for us:
>>> with open("cities.csv", mode="wt", newline="") as csv_file:
...     writer = csv.DictWriter(csv_file, fieldnames=city_rows[0].keys())
...     writer.writeheader()
Besides that fieldnames argument and the writeheader method, DictWriter objects pretty much work the same way as writer objects.
But their writerows and writerow methods accept dictionaries instead of lists:
>>> with open("cities.csv", mode="wt", newline="") as csv_file:
...     writer = csv.DictWriter(csv_file, fieldnames=city_rows[0].keys())
...     writer.writeheader()
...     writer.writerows(city_rows)
...
56
The keys in each of these dictionaries represent the headers in our CSV file:
>>> city_rows[0]
{'city': 'New York City', 'population': '8,175,133', 'Air Quality Index': 81, 'Public Transit Users': '31.4%'}
They're actually the same thing as those fieldnames that we originally specified to DictWriter.
writer = csv.DictWriter(csv_file, fieldnames=city_rows[0].keys())
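Since fieldnames controls the column order, it can also be spelled out explicitly rather than taken from the first dictionary's keys. Here is a small sketch of that, assuming an in-memory io.StringIO buffer and a shortened, two-column version of city_rows:

```python
import csv
import io

city_rows = [
    {"city": "Chicago", "population": "2,695,598"},
    {"population": "2,296,224", "city": "Houston"},  # keys in another order
]

output = io.StringIO()  # stand-in for the real file (an assumption)
# fieldnames, not the order of keys in each dictionary, decides the columns.
writer = csv.DictWriter(output, fieldnames=["population", "city"])
writer.writeheader()
writer.writerows(city_rows)

lines = output.getvalue().splitlines()
print(lines[0])  # population,city
print(lines[1])  # "2,695,598",Chicago
print(lines[2])  # "2,296,224",Houston
```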
Now, you might have noticed that when opening each of our files for writing, we've been specifying a newline="" argument:
>>> with open("cities.csv", mode="wt", newline="") as csv_file:
...     writer = csv.DictWriter(csv_file, fieldnames=city_rows[0].keys())
When using Python's csv module, it's important to specify newline="" if your code might ever be run on Windows.
Python's csv writer ends each of its lines with a carriage return followed by a line feed (i.e. \r\n) because that's a very common convention when writing CSV files.
This is an issue on Windows, though.
When you write to a text file on Windows, Python helpfully converts each line feed (\n) into \r\n, because it tries to use the line ending that's most common for the platform it's running on, and a carriage return followed by a line feed (\r\n) is the usual line ending on Windows.
So when writing a CSV file on Windows, the \r\n that Python's csv module writes out ends up turning into \r\r\n, because Python converts the \n in that \r\n into another \r\n.
We can fix this by using the newline="" argument when opening our file for writing.
That newline="" argument disables any operating system-specific line ending conversions.
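The doubled carriage return can be reproduced on any platform. In this sketch, opening the file with newline="\r\n" imitates the translation Windows applies by default; write_row is a hypothetical helper made up for this illustration, and the temporary directory is just a convenient place to write:

```python
import csv
import os
import tempfile

def write_row(path, newline):
    """Write one CSV row to path, then return the file's raw bytes."""
    with open(path, mode="wt", newline=newline) as csv_file:
        csv.writer(csv_file).writerow(["a", "b"])
    with open(path, mode="rb") as raw_file:  # binary mode: no translation
        return raw_file.read()

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "demo.csv")
    # newline="\r\n" translates every \n written into \r\n, which is
    # what the default text mode does on Windows.
    translated = write_row(path, newline="\r\n")
    # newline="" writes the csv module's line endings through untouched.
    untranslated = write_row(path, newline="")

print(translated)    # b'a,b\r\r\n'
print(untranslated)  # b'a,b\r\n'
```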
If none of that made any sense to you, that's okay.
There's really one thing you need to remember when writing CSV files in Python: always open the CSV file you're about to write to by specifying a newline="" keyword argument.
To write to a CSV file in Python, you can use writer or DictWriter from Python's csv module.