*******
File IO
*******
* We know how to read input from a user
* We know how to store data in variables and lists
* We know how to manipulate data
* The trouble is that, if we have large amounts of data, typing it all in with ``input`` is not practical
* Fortunately, an easy way to address this is to read the data from a file
Text Files
==========
* Text files are a great way to store textual data
* They typically have the file extension ".txt", but the actual extension doesn't really matter
* Most of what we are about to see will work on many different file types too (not just text files)
Reading from a Text File
------------------------
* There are a few ways to open and read from a file, but the easiest is as follows

.. code-block:: python
    :linenos:

    my_file = open("someFileName.txt", "r")

* The above example opens a file named ``"someFileName.txt"`` in read-only mode (``"r"``)
* This assumes that the file being opened is in the current working directory
* A reference to the file is stored in the variable ``my_file``

.. admonition:: Activity
    :class: activity

    #. Create a text file somewhere on your computer (perhaps Desktop for ease).
    #. Upload the file to Colab.
    #. Open your file like in the above example, but with your proper file name.
    #. Try using the methods ``.readline()`` and ``.read()``.
    #. See if you can figure out how to re-read from the file after you already read the full contents.

* Note that there are many more methods available beyond ``.readline()`` and ``.read()``, but these will likely be the ones you use the most
* ``read`` reads the entire contents of the file
* ``readline`` reads a single line from the file
* It is also important to ``.close()`` the file once you are done using it in Python
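
The methods above can be sketched as follows. The file name ``demo_read.txt`` and its contents are made up for this example, and the first few lines create the file (using write mode, covered in the next section) only so that the sketch runs on its own.

.. code-block:: python

    # Setup only: create a small file so the example is self-contained.
    # The name "demo_read.txt" is hypothetical.
    setup_file = open("demo_read.txt", "w")
    setup_file.write("first line\nsecond line\nthird line\n")
    setup_file.close()

    my_file = open("demo_read.txt", "r")
    first_line = my_file.readline()   # Reads up to and including the first newline
    rest_of_file = my_file.read()     # Reads everything that has not been read yet
    my_file.close()

    print(repr(first_line))     # 'first line\n'
    print(repr(rest_of_file))   # 'second line\nthird line\n'

Notice that ``.read()`` picks up where ``.readline()`` left off rather than starting again from the top of the file.
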
Writing to a Text File
----------------------
* Writing to a text file is similarly simple

.. code-block:: python
    :linenos:

    my_other_file = open("anotherFileName.txt", "w")

* Unlike reading however, the file does not need to exist
* If it does not exist, Python will create a new file with the name ``"anotherFileName.txt"``
* If it does exist, opening it in ``"w"`` mode erases its current contents
* The methods you will likely use most when writing to a file are ``.write(text)`` and ``.writelines(listOfText)``
* ``write`` will write the provided text to the file
* ``writelines`` will write every string in a list of strings to the file, one after another; it does not add newline characters, so each string must end with ``"\n"`` if it should be on its own line
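
A minimal sketch of both methods, assuming a hypothetical file name ``demo_write.txt``. The file is read back at the end only to show what was actually written.

.. code-block:: python

    my_other_file = open("demo_write.txt", "w")
    my_other_file.write("A single line\n")

    # Each string already ends with "\n"; writelines does not add newlines itself
    more_lines = ["alpha\n", "beta\n", "gamma\n"]
    my_other_file.writelines(more_lines)
    my_other_file.close()

    # Read the file back to check its contents
    check_file = open("demo_write.txt", "r")
    contents = check_file.read()
    check_file.close()

    print(repr(contents))   # 'A single line\nalpha\nbeta\ngamma\n'
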

.. admonition:: Activity
    :class: activity

    #. Open some file in write only mode (``"w"``) in Python with a name of your choice.
    #. Use the ``.write()`` method to write contents to the file.
    #. Once you are done writing to the file, use the ``.close()`` method to close the file.
    #. Open the file you just created in some text editor and confirm that it matches what you wrote.


.. warning::

    It is very important to ``.close()`` your files when you are done with them, especially when writing to a file.
    Because of how Python writes to files, the contents you write are not sent to the file right away. Instead, they go to
    something called a *buffer* that is periodically written to the file. If you fail to ``.close()`` your file, there is a
    chance that the buffer never finishes writing to the file before the program terminates. When you ``.close()`` the
    file, it *flushes the buffer*, meaning that anything left in the buffer is written to the file.

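
One way to avoid forgetting ``.close()`` is Python's ``with`` statement, which closes the file automatically when the indented block ends. The sketch below is an alternative to calling ``.close()`` yourself, not a replacement for understanding it; the file name ``demo_with.txt`` is hypothetical.

.. code-block:: python

    with open("demo_with.txt", "w") as notes_file:
        notes_file.write("This text is flushed when the block ends\n")

    # The file is already closed here, even though .close() was never called explicitly
    print(notes_file.closed)   # True
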
Comma Separated Values (CSV)
============================
* CSV files are a popular file format for tabular data
* Data that can be stored in a table
* Think of rows and columns of data, like in a spreadsheet
* CSV files are stored in plain text, but values are separated with commas
* You may come across CSV files that use tabs or spaces to separate data
* They can be read in a simple text editor, or in a spreadsheet program, which will format the data nicely
* In fact, you can typically save data from a spreadsheet into a CSV file
* An example of data in a CSV is as follows

.. code-block:: text
    :linenos:

    name, height, weight, IQ
    Subject 1, 170, 68, 100
    Subject 2, 182, 80, 110
    Subject 3, 155, 54, 105

* The above example can be represented in a table as follows

.. list-table:: CSV Viewed as a Table
    :widths: 50 25 25 25
    :header-rows: 1

    * - name
      - height
      - weight
      - IQ
    * - Subject 1
      - 170
      - 68
      - 100
    * - Subject 2
      - 182
      - 80
      - 110
    * - Subject 3
      - 155
      - 54
      - 105

* The first line in the example CSV is a *header*, which explains the values in each column
* A header is optional; some CSV files have one and some don't
Reading a CSV File
------------------
* Python has a built-in library to help make reading CSV files simple
* In fact, you have already seen this in the Starbucks Density assignment

.. code-block:: python
    :linenos:
    :emphasize-lines: 12

    def load_starbucks_data(file_name: str) -> list:
        import csv

        # Open the Starbucks file specified by file_name
        starbucks_file = open(file_name, "r")
        starbucks_file_reader = csv.reader(starbucks_file)

        # Create an empty list that the Starbucks location tuples will be added to
        starbucks_locations = []

        # For each row in the file, create a tuple of the lat/lon pair and add it to the list
        for row in starbucks_file_reader:
            location_tuple = (float(row[0]), float(row[1]))
            starbucks_locations.append(location_tuple)

        starbucks_file.close()
        return starbucks_locations

* The emphasized line with the ``for`` loop is the trick to reading data from the csv reader
* When using the ``for`` loop, we read one row at a time from the file
* The file is like a collection of rows
* So, for each *row* in the *collection of rows*
* Here, the variable ``row`` will store a reference to the row's data in the form of a list, where each element in the list is a string from a different column
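
As a further sketch, the example CSV data from earlier can be read the same way. The file name ``subjects.csv`` is hypothetical, and the first few lines write the file only so that the example runs on its own. Two details here are additions that go beyond the Starbucks function: ``next()`` is used to read the header row separately, and ``skipinitialspace=True`` tells the reader to ignore the space after each comma in this particular data.

.. code-block:: python

    import csv

    # Setup only: write the example data to a hypothetical file
    sample_file = open("subjects.csv", "w")
    sample_file.write("name, height, weight, IQ\n")
    sample_file.write("Subject 1, 170, 68, 100\n")
    sample_file.write("Subject 2, 182, 80, 110\n")
    sample_file.write("Subject 3, 155, 54, 105\n")
    sample_file.close()

    subjects_file = open("subjects.csv", "r")
    # skipinitialspace=True ignores the space that follows each comma in this data
    subjects_reader = csv.reader(subjects_file, skipinitialspace=True)

    header = next(subjects_reader)   # Read the header row before looping over the data rows
    heights = []
    for row in subjects_reader:
        # Every value in row is a string, so convert it before doing arithmetic
        heights.append(float(row[1]))
    subjects_file.close()

    print(header)    # ['name', 'height', 'weight', 'IQ']
    print(heights)   # [170.0, 182.0, 155.0]
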

.. admonition:: Activity
    :class: activity

    #. Download :download:`this csv file ` to your computer and then upload it to Colab.
    #. Write a function called ``load_airports()`` that loads this CSV file into a list and returns the list.

       * Use ``load_starbucks_data`` as a reference

    #. Play around with the data a little to get a feel for how the information is stored in the list.


.. admonition:: Activity
    :class: activity

    Write a function ``get_name_from_code(airport_code, airport_list)`` that will return a string containing the full
    name of the airport with the corresponding ``airport_code``. The parameter ``airport_list`` should be the list you
    loaded using ``load_airports()``.

    If your function made use of a linear search, can you think of a way to alter ``get_name_from_code`` and
    ``load_airports`` such that you do not need a linear search?

Writing to a CSV File
---------------------
* If our program has large amounts of tabular data that we want to save, we can write it to a CSV file

.. code-block:: python
    :linenos:

    import csv

    # Create a file to write to (newline="" is recommended by the csv module documentation)
    out_file = open("nameOfOutputFile.csv", "w", newline="")
    csv_out_file = csv.writer(out_file)

    # Write a row to the file
    csv_out_file.writerow(['First cell', 'Second cell', 'Third cell'])

    # Be sure to close the file when done!!!
    out_file.close()

* In the above example, notice that all the data for the row is contained within a list
* This is similar to how the data is read in as a list
* With a csv writer, there are two important methods for us to know
* ``writerow``, which was discussed above
* ``writerows``, which takes a list of lists to write a large block of data
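
A minimal sketch of ``writerows``, assuming a hypothetical file name ``demo_table.csv`` and made-up data. The file is read back at the end only to show the result; notice that the numbers come back as strings.

.. code-block:: python

    import csv

    # A list of lists: each inner list becomes one row in the file
    table = [
        ["name", "height", "weight"],
        ["Subject 1", 170, 68],
        ["Subject 2", 182, 80],
    ]

    out_file = open("demo_table.csv", "w", newline="")
    csv_out_file = csv.writer(out_file)
    csv_out_file.writerows(table)
    out_file.close()

    # Read the file back to check what was written
    in_file = open("demo_table.csv", "r", newline="")
    rows_read = list(csv.reader(in_file))
    in_file.close()

    print(rows_read)
    # [['name', 'height', 'weight'], ['Subject 1', '170', '68'], ['Subject 2', '182', '80']]
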
For Next Class
==============
* Read `Chapter 19 of the text `_