17. File IO
We know how to read input from a user
We know how to store data in variables and lists
We know how to manipulate data
The trouble is, if we have large amounts of data, entering it with input is not workable
Fortunately, an easy way to address this is reading data from a file
17.1. Text Files
Text files are a great way to store textual data
They typically have the file extension “.txt”, but the actual extension doesn’t really matter
Most of what we are about to see will work on many different file types too (not just text files)
17.1.1. Reading from a Text File
There are a few ways to open and read from a file, but the easiest is as follows
my_file = open("someFileName.txt", "r")
The above example opens a file named "someFileName.txt" in read-only mode ("r")
This assumes that the file being opened is in the current working directory
A reference to the file is stored in the variable my_file
Activity
Create a text file somewhere on your computer (perhaps Desktop for ease).
Upload the file to Colab.
Open your file like in the above example, but with your proper file name.
Try using the methods .readline() and .read().
See if you can figure out how to re-read from the file after you already read the full contents.
Note that there are many more methods available beyond .readline() and .read(), but these will likely be the ones you use the most
read reads the entire contents of the file
readline reads a single line from the file
It is also important to .close() the file once you are done using it in Python
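As a rough illustration of the difference between the two methods, here is a minimal sketch. The file name "demo_read.txt" and its contents are made up for this example, and the file is created with a write first (covered in the next section) only so the sketch can run on its own.

```python
# Create a small file first so the example is self-contained.
# "demo_read.txt" is a hypothetical name used only for this sketch.
setup_file = open("demo_read.txt", "w")
setup_file.write("first line\nsecond line\nthird line\n")
setup_file.close()

my_file = open("demo_read.txt", "r")
first_line = my_file.readline()  # reads up to and including the first "\n"
rest_of_file = my_file.read()    # reads whatever has not been read yet
my_file.close()

print(repr(first_line))
print(repr(rest_of_file))
```

Note that .read() picks up where .readline() stopped, so the two calls together cover the whole file exactly once.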
17.1.2. Writing to a Text File
Writing to a text file is similarly simple
my_other_file = open("anotherFileName.txt", "w")
Unlike reading, however, the file does not need to exist
Python will create a new file with the name "anotherFileName.txt"
If a file with that name already exists, opening it in "w" mode erases its existing contents
The main methods for writing to a file are .write(text) and .writelines(listOfText)
write will write the provided text to the file
writelines will write multiple strings to a file based on a list of strings; note that it does not add newlines between them, so include "\n" in each string if needed
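The newline behaviour of writelines is easy to miss, so here is a small sketch of it. The file name "demo_write.txt" and the strings are placeholders chosen for illustration.

```python
# "demo_write.txt" is a hypothetical file name used only for this sketch.
my_other_file = open("demo_write.txt", "w")

my_other_file.write("a single line\n")

words = ["alpha", "beta", "gamma"]
my_other_file.writelines(words)  # no newlines added: written as "alphabetagamma"

# Add "\n" to each string ourselves so every word lands on its own line
words_with_newlines = []
for word in words:
    words_with_newlines.append(word + "\n")
my_other_file.writelines(words_with_newlines)

my_other_file.close()

# Read the file back to see exactly what was written
check_file = open("demo_write.txt", "r")
contents = check_file.read()
check_file.close()
print(repr(contents))
```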
Activity
Open some file in write-only mode ("w") in Python with a name of your choice.
Use the .write() method to write contents to the file.
Once you are done writing to the file, use the .close() method to close the file.
Open the file you just created in some text editor and confirm that it matches what you wrote.
Warning
Always .close() your files when done, especially when writing. Python may not immediately write all contents to
disk — calling .close() ensures everything is flushed and saved properly.
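One common way to avoid forgetting .close() is Python's with statement, which closes the file automatically when the indented block ends. This is an alternative to the explicit .close() calls used in these notes, not a replacement for understanding them. A minimal sketch, with "demo_with.txt" as a made-up file name:

```python
# "demo_with.txt" is a hypothetical file name used only for this sketch.
with open("demo_with.txt", "w") as notes_file:
    notes_file.write("saved without an explicit close\n")

# The with block has ended, so the file has already been closed for us
was_closed = notes_file.closed
print(was_closed)

# Read the file back to confirm the text reached the disk
with open("demo_with.txt", "r") as notes_file:
    saved_text = notes_file.read()
print(repr(saved_text))
```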
17.2. Comma Separated Values (CSV)
CSV files are a popular file format for tabular data
Data that can be stored in a table
Think of rows and columns of data, like in a spreadsheet
CSV files are stored in plain text, but values are separated with commas
You may come across CSV files that use tabs or spaces to separate data
They can be read in a simple text editor, or even in a spreadsheet program where it will format the data nicely
In fact, you can typically save data from a spreadsheet into a CSV file
An example of data in a CSV is as follows
name, height, weight, IQ
Subject 1, 170, 68, 100
Subject 2, 182, 80, 110
Subject 3, 155, 54, 105
The above example can be represented in a table as follows
| name | height | weight | IQ |
|---|---|---|---|
| Subject 1 | 170 | 68 | 100 |
| Subject 2 | 182 | 80 | 110 |
| Subject 3 | 155 | 54 | 105 |
The first line in the example CSV is a header, which explains the values in each column
Headers are optional: some CSV files have them and some don’t
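To make the "plain text separated by commas" idea concrete, the example CSV above can be pulled apart by hand with ordinary string methods. This is only a sketch for building intuition: the helper split_line is a made-up name, and splitting on commas by hand breaks as soon as a value itself contains a comma.

```python
csv_text = """name, height, weight, IQ
Subject 1, 170, 68, 100
Subject 2, 182, 80, 110
Subject 3, 155, 54, 105"""


def split_line(line: str) -> list:
    # Hypothetical helper: split one line on commas and trim the extra spaces
    values = []
    for value in line.split(","):
        values.append(value.strip())
    return values


lines = csv_text.split("\n")
header = split_line(lines[0])  # the first line holds the column names
rows = []
for line in lines[1:]:         # every remaining line is one row of data
    rows.append(split_line(line))

print(header)
print(rows[0])
```

Every value comes out as a string, including the numbers, so something like the height would still need a conversion such as int() or float() before doing arithmetic.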
17.2.1. Reading a CSV File
Python has a built-in library to help make reading CSV files simple
In fact, you have already seen this in the Starbucks Density assignment
def load_starbucks_data(file_name: str) -> list:

    import csv

    # Open the Starbucks file specified by file_name
    starbucks_file = open(file_name, "r")
    starbucks_file_reader = csv.reader(starbucks_file)

    # Create an empty list that the Starbucks location tuples will be added to
    starbucks_locations = []

    # For each row in the file, create a tuple of the lat/lon pair and add it to the list
    for row in starbucks_file_reader:
        location_tuple = (float(row[0]), float(row[1]))
        starbucks_locations.append(location_tuple)

    starbucks_file.close()
    return starbucks_locations
The line with the for loop is the trick to reading data from the csv reader
When using the for loop, we read one row at a time from the file
The file is like a collection of rows, so the loop reads as "for each row in the collection of rows"
Here, the variable row will store a reference to the row’s data in the form of a list, where each element in the list is from a different column
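The same pattern can be sketched on the small subjects table from earlier, this time with a header row to skip. The file name "demo_subjects.csv" is made up, and the file is written first only so the sketch runs on its own. Two details go beyond load_starbucks_data: next() is used to pull the header row off the reader before the loop, and skipinitialspace=True tells csv.reader to ignore the space after each comma.

```python
import csv

# Write the example data to a file so the sketch is self-contained.
# "demo_subjects.csv" is a hypothetical name used only for illustration.
sample_file = open("demo_subjects.csv", "w")
sample_file.write("name, height, weight, IQ\n")
sample_file.write("Subject 1, 170, 68, 100\n")
sample_file.write("Subject 2, 182, 80, 110\n")
sample_file.write("Subject 3, 155, 54, 105\n")
sample_file.close()

subjects_file = open("demo_subjects.csv", "r")
subjects_reader = csv.reader(subjects_file, skipinitialspace=True)

header = next(subjects_reader)  # take the header row before looping

heights = []
for row in subjects_reader:
    # row is a list of strings, e.g. ["Subject 1", "170", "68", "100"]
    heights.append(float(row[1]))

subjects_file.close()

print(header)
print(heights)
```

As in load_starbucks_data, the values arrive as strings, which is why float() is applied before storing the heights.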
Activity
Download this csv file to your computer and then upload it to Colab.
Write a function called load_airports() that loads this CSV file into a list and returns the list.
Use load_starbucks_data as a reference
Play around with the data a little to get a feel for how the information is stored in the list.
Activity
Write a function get_name_from_code(airport_code, airport_list) that will return a string containing the full
name of the airport with the corresponding airport_code. The parameter airport_list should be the list you
loaded using load_airports().
If your function made use of a linear search, can you think of a way to alter get_name_from_code and
load_airports such that you do not need a linear search?
17.2.2. Writing to a CSV File
If our program has large amounts of tabular data that we want to save to a file, we can write it to a CSV file
import csv

# Create a file to write to
out_file = open("nameOfOutputFile.csv", "w")
csv_out_file = csv.writer(out_file)

# Write a row to the file
csv_out_file.writerow(['First cell', 'Second cell', 'Third cell'])

# Close the file when done
out_file.close()
Notice that the row data is a list — symmetric with how it’s read in
With a csv writer, there are two important methods for us to know
writerow, which was discussed above
writerows, which takes a list of lists to write a large block of data
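A short sketch of writerows, followed by reading the file back to check the result. The file name "demo_table.csv" and the table contents are placeholders. One addition compared with the example above: the file is opened with newline="", which the csv module's documentation recommends so that no extra blank lines appear between rows on some systems.

```python
import csv

# A small table as a list of lists: the first inner list acts as the header
table = [
    ["name", "height", "weight"],
    ["Subject 1", 170, 68],
    ["Subject 2", 182, 80],
]

# "demo_table.csv" is a hypothetical file name used only for this sketch
out_file = open("demo_table.csv", "w", newline="")
csv_out_file = csv.writer(out_file)
csv_out_file.writerows(table)  # writes every inner list as one row
out_file.close()

# Read the file back to confirm what was saved
in_file = open("demo_table.csv", "r", newline="")
rows_read_back = []
for row in csv.reader(in_file):
    rows_read_back.append(row)
in_file.close()

print(rows_read_back)
```

The numbers were written as integers but come back as strings, since a CSV file only stores text.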