17. File IO
We know how to read input from a user
We know how to store data in variables and lists
We know how to manipulate data
The trouble is that, with large amounts of data, typing everything in with `input` is not workable
Fortunately, an easy way to address this is reading data from a file
17.1. Text Files
Text files are a great way to store textual data
They typically have the file extension “.txt”, but the actual extension doesn’t really matter
Most of what we are about to see will work on many different file types too (not just text files)
17.1.1. Reading from a Text File
There are a few ways to open and read from a file, but the easiest is as follows
my_file = open("someFileName.txt", "r")
The above example opens a file named `"someFileName.txt"` in read-only mode (`"r"`)
This assumes that the file being opened is in the current working directory
A reference to the file is stored in the variable `my_file`
Activity
Create a text file somewhere on your computer (perhaps Desktop for ease).
Upload the file to Colab.
Open your file like in the above example, but with your proper file name.
Try using the methods `.readline()` and `.read()`.
See if you can figure out how to re-read from the file after you have already read the full contents.
Note that there are many more methods available beyond `.readline()` and `.read()`, but these will likely be the ones you use the most
`read` reads the entire contents of the file
`readline` reads a single line from the file
It is also important to `.close()` the file once you are done using it in Python
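The following is a minimal sketch of opening, reading, and closing a file. The file name `example_notes.txt`, the temporary folder, and the three lines of text are assumptions made up for this illustration; the sketch creates the file first so that there is something to read (writing is covered in the next section).

```python
import os
import tempfile

# Hypothetical file created only so this example has something to read
folder = tempfile.mkdtemp()
file_name = os.path.join(folder, "example_notes.txt")
setup_file = open(file_name, "w")
setup_file.write("first line\nsecond line\nthird line\n")
setup_file.close()

# Open the file in read-only mode
my_file = open(file_name, "r")

# readline() reads up to and including the next line break
first = my_file.readline()

# read() reads everything that has not been read yet
rest = my_file.read()

# Close the file once we are done with it
my_file.close()

print(repr(first))
print(repr(rest))
```

Printing with `repr` makes the line breaks (`\n`) visible, which shows that `.readline()` keeps the line break at the end of the line it returns.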
17.1.2. Writing to a Text File
Writing to a text file is similarly simple
my_other_file = open("anotherFileName.txt", "w")
Unlike reading, however, the file does not need to exist
Python will create a new file with the name `"anotherFileName.txt"`
If a file with that name already exists, opening it in `"w"` mode erases its previous contents
The methods you will likely use most when writing to a file are `.write(text)` and `.writelines(listOfText)`
`write` will write the provided text to the file
`writelines` will write every string in a list of strings to the file, one after another; it does not add line breaks, so each string must end with `"\n"` if it should be on its own line
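As a rough illustration of the two methods, the sketch below writes three lines to a file and reads them back. The file name `example_output.txt`, the temporary folder, and the text are assumptions chosen only for this example.

```python
import os
import tempfile

# Hypothetical output location used only for this example
folder = tempfile.mkdtemp()
file_name = os.path.join(folder, "example_output.txt")

my_other_file = open(file_name, "w")

# write() writes exactly the text it is given, so we add "\n" ourselves
my_other_file.write("a single line of text\n")

# writelines() writes each string in the list; it does not add "\n" for us
my_other_file.writelines(["line two\n", "line three\n"])

my_other_file.close()

# Read the file back to confirm what was written
check_file = open(file_name, "r")
contents = check_file.read()
check_file.close()

print(contents)
```

If the `"\n"` characters were left out, all of the text would end up on one line of the file.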
Activity
Open some file in write-only mode (`"w"`) in Python with a name of your choice.
Use the `.write()` method to write contents to the file.
Once you are done writing to the file, use the `.close()` method to close the file.
Open the file you just created in some text editor and confirm that it matches what you wrote.
Warning
It is very important to `.close()` your files when you are done with them, especially when writing to a file.
Based on how Python writes to files, the contents you write are not sent to the file right away. Instead, they go to something called a buffer, which is periodically written to the file. If you fail to `.close()` your file, there is a chance that the buffer never finishes writing to the file before the program terminates. When you `.close()` the file, it flushes the buffer, meaning that anything left in the buffer will be written to the file.
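One common way to avoid forgetting `.close()` is Python's `with` statement, which closes the file automatically when the indented block ends. The sketch below is only an illustration of that alternative, not something the examples in these notes rely on; the file name `example_log.txt` and the temporary folder are assumptions.

```python
import os
import tempfile

# Hypothetical output location used only for this example
folder = tempfile.mkdtemp()
file_name = os.path.join(folder, "example_log.txt")

# The file is closed (and its buffer flushed) when the with block ends
with open(file_name, "w") as log_file:
    log_file.write("saved before the program ends\n")

# The file object reports that it has been closed
print(log_file.closed)

# The text has reached the file, so it can be read back
with open(file_name, "r") as check_file:
    saved_text = check_file.read()

print(repr(saved_text))
```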
17.2. Comma Separated Values (CSV)
CSV files are a popular file format for tabular data
Data that can be stored in a table
Think of rows and columns of data, like in a spreadsheet
CSV files are stored in plain text, but values are separated with commas
You may come across CSV files that use tabs or spaces to separate data
They can be read in a simple text editor, or even in a spreadsheet program where it will format the data nicely
In fact, you can typically save data from a spreadsheet into a CSV file
An example of data in a CSV is as follows
name, height, weight, IQ
Subject 1, 170, 68, 100
Subject 2, 182, 80, 110
Subject 3, 155, 54, 105
The above example can be represented in a table as follows
| name | height | weight | IQ |
|---|---|---|---|
| Subject 1 | 170 | 68 | 100 |
| Subject 2 | 182 | 80 | 110 |
| Subject 3 | 155 | 54 | 105 |
The first line in the example CSV is a header, which explains the values in each column
A header is optional: some CSV files have one and some don’t
17.2.1. Reading a CSV File
Python has a built-in library to help make reading CSV files simple
In fact, you have already seen this in the Starbucks Density assignment
def load_starbucks_data(file_name: str) -> list:

    import csv

    # Open the Starbucks file specified by file_name
    starbucks_file = open(file_name, "r")
    starbucks_file_reader = csv.reader(starbucks_file)

    # Create an empty list that the Starbucks location tuples will be added to
    starbucks_locations = []

    # For each row in the file, create a tuple of the lat/lon pair and add it to the list
    for row in starbucks_file_reader:
        location_tuple = (float(row[0]), float(row[1]))
        starbucks_locations.append(location_tuple)

    starbucks_file.close()
    return starbucks_locations
The line with the `for` loop is the trick to reading data from the csv reader
When using the `for` loop, we read one row at a time from the file
The file is like a collection of rows, so the loop reads as "for each row in the collection of rows"
Here, the variable `row` will store a reference to the row’s data in the form of a list, where each element in the list is a string from a different column
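To make the row-by-row idea concrete, here is a small sketch that reads a CSV file with a header. The file name `subjects.csv`, the temporary folder, and the use of `next()` to take the header row off the reader before the loop are assumptions for this illustration; the data is the example table from above, written without spaces after the commas.

```python
import csv
import os
import tempfile

# Hypothetical CSV file created only so this example has something to read
folder = tempfile.mkdtemp()
file_name = os.path.join(folder, "subjects.csv")
setup_file = open(file_name, "w")
setup_file.write("name,height,weight,IQ\n")
setup_file.write("Subject 1,170,68,100\n")
setup_file.write("Subject 2,182,80,110\n")
setup_file.close()

subjects_file = open(file_name, "r")
subjects_reader = csv.reader(subjects_file)

# next() takes one row from the reader; here that row is the header
header = next(subjects_reader)

# Each remaining row is a list of strings, so numbers must be converted
heights = []
for row in subjects_reader:
    heights.append(float(row[1]))

subjects_file.close()

print(header)
print(heights)
```

Without the `next()` call, the loop would also receive the header row, and `float("height")` would raise an error.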
Activity
Download this csv file to your computer and then upload it to Colab.
Write a function called `load_airports()` that loads this CSV file into a list and returns the list.
Use `load_starbucks_data` as a reference
Play around with the data a little to get a feel for how the information is stored in the list.
Activity
Write a function `get_name_from_code(airport_code, airport_list)` that will return a string containing the full name of the airport with the corresponding `airport_code`. The parameter `airport_list` should be the list you loaded using `load_airports()`.
If your function made use of a linear search, can you think of a way to alter `get_name_from_code` and `load_airports` such that you do not need a linear search?
17.2.2. Writing to a CSV File
If we have large amounts of tabular data in our program that we want to save to a file, we can write to a CSV file
import csv

# Create a file to write to
# (newline="" is what the csv module recommends so no extra blank rows appear)
out_file = open("nameOfOutputFile.csv", "w", newline="")
csv_out_file = csv.writer(out_file)

# Write a row to the file
csv_out_file.writerow(['First cell', 'Second cell', 'Third cell'])

# Be sure to close the file when done!!!
out_file.close()
In the above example, notice that all the data for the row is contained within a list
This is similar to how the data is read in as a list
With a csv writer, there are two important methods for us to know
`writerow`, which was discussed above
`writerows`, which takes a list of lists to write a large block of data
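The sketch below shows both methods together, writing a header with `writerow` and the remaining rows with `writerows`, then reading the file back. The file name `subjects_out.csv`, the temporary folder, and the data are assumptions reused from the earlier example table.

```python
import csv
import os
import tempfile

# Hypothetical output location used only for this example
folder = tempfile.mkdtemp()
file_name = os.path.join(folder, "subjects_out.csv")

# A list of lists: each inner list is one row of the table
rows = [
    ["name", "height", "weight", "IQ"],
    ["Subject 1", 170, 68, 100],
    ["Subject 2", 182, 80, 110],
]

out_file = open(file_name, "w", newline="")
csv_out_file = csv.writer(out_file)

# writerow() writes one row; writerows() writes every row in a list of lists
csv_out_file.writerow(rows[0])
csv_out_file.writerows(rows[1:])

out_file.close()

# Read the file back to see what was stored
in_file = open(file_name, "r", newline="")
rows_read = list(csv.reader(in_file))
in_file.close()

print(rows_read)
```

Note that the numbers come back as strings when the file is read, since a CSV file is plain text.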