23. Objects IV — Data Structures
We have spent much of the course learning how to write algorithms
But we also saw a few data structures
Lists
Tuples
Sets
Dictionaries
Although these data structures had clear functionality associated with them, we use them as a mechanism for storing and managing data
They may also provide some other benefits
Ability to index
Efficient lookups
Easy to search
Ordering the data
Managing duplicates
…
Consider the
List
s we have been usingThey can be indexed
The lists have a beginning and end
The elements exist in some order (though, not necessarily ordered)
They are easy to do a linear search on
You have all been taking advantage of these features, but the lists are provided to us at a certain level of abstraction
If you dig deeper and look under the hood, all the functionality of the list needs to be programmed
Data structures are an important part of a programmers toolbox
Actually understanding how they do what they do is critical
To start learning about data structures, we will be constructing our own simple collection — a
Course
that will manageStudent
objects
23.1. Student Class
The
Student
objects are the objects we care aboutBut the problem is, we will have many of them and we want a nice way to manage all instances
This is where the
Course
class comes in, which is covered in the next section
For now, we will focus on the
Student
class — a simple class to keep track of important student informationWe will also define an
__eq__
and__repr__
for the class as they are often handy for testing purposes
1class Student:
2
3 def __init__(self, first_name: str, last_name: str, student_number: int):
4 self.first_name = first_name
5 self.last_name = last_name
6 self.student_number = student_number
7
8 def __eq__(self, other) -> bool:
9 """
10 Two students are considered equal if their student numbers are the same. This assumes student numbers are unique
11 among students.
12
13 :param other: A student to compare the self Student to
14 :type other: Student
15 :return: True if the students have the same student number, false otherwise
16 :rtype: boolean
17 """
18 if isinstance(other, Student):
19 return self.student_number == other.student_number
20 return False
21
22 def __repr__(self) -> str:
23 return f"{self.last_name}, {self.first_name}\t{self.student_number}"
Above is the entire
Student
classThe only functionality is defined in the magic methods
The main purpose of the
Student
class is to store dataNotice the
__eq__
defines equality for theStudent
class strictly on thestudent_number
attributeThe
first_name
andlast_name
values may change for aStudent
However, the
student_number
of eachStudent
should be unique and not change
Also notice how the
__repr__
is not following theClassName(attributes)
formatFor the purpose of this program, a more specialized format is more appropriate
23.2. Course Class
The
Course
class will be a collection ofStudent
objectsUltimately, the
Student
objects will be stored in aList
, but we want to make use of a class to add specific functionality we care aboutWe will have the
Course
class control access to the list ofStudents
through methodsA method to add a
Student
to the listA method to remove a
Student
from the listA method to get the size of the course — the number of
Student
objects in the listA method to check if a given
Student
exists in theCourse
__repr__
method
1class Course:
2 """
3 A collection of students enrolled in a course. This class manages the individual Students and provides simple
4 enrollment details.
5 """
6
7 def __init__(self, course_name: str):
8 self.course_name = course_name
9 self._students = []
Above is the constructor for the class
The only two attributes the class has are
course_name
and_students
Notice how the
_students
attribute starts with an underscore_
We don’t really want to access the list of
Student
objects directlyInstead, to give us more control over how the list is used, we want to add and remove
Students
through the methods we write in the classNothing will actually stop us from accessing the list directly
some_course._students
But as a convention, to let yourself and other programmers know, all attributes and methods that are not to be accessed directly start with an underscore
1class Course:
2
3 # init and/or other methods not shown for brevity
4
5 def size(self) -> int:
6 return len(self._students)
7
8 def add(self, student: Student):
9 self._students.append(student)
For
size
, we just hijack thelen
of the_students
list and return that valueSimilarly for
add
, we can make use of the list’sappend
methodThe methods to
remove
aStudent
and check if theCourse
contains
a specificStudent
both need a linear searchFind the
Student
to be removed, if it existsCheck if the
Student
is actually in theCourse
To eliminate any duplicate code, we can write a helper method —
_find
Like the attribute
_students
, the method_find
is not really indented to be used outside the class, thus, by convention, it starts with an underscore
_find
will be a method that performs the linear search on the listIf the specified
Student
is found within the_students
list, return the index of where it isIf the
Student
is not found, return a sentinel value —-1
The
-1
sentinel value is a special value with meaning within the context of the algorithm — not found
1class Course:
2
3 # init and/or other methods not shown for brevity
4
5 def _find(self, student: Student) -> int:
6 """
7 Returns the index of the first occurrence of a given Student if it exists. If it does not exist, return a
8 sentinel value of -1.
9
10 :param student: The student to search for
11 :type student: Student
12 :return: Index of the student, or -1 if it is not found
13 :rtype: int
14 """
15 for i, s in enumerate(self._students):
16 if s == student:
17 return i
18 return -1
In the above
_find
method, we actually make use of theStudent
class’__eq__
methods == student
calls the__eq__
magic method from theStudent
class
Overall, the
_find
method is not particularly remarkable — it’s a linear searchWith
_find
written, theremove
andcontains
methods are simple to write
1class Course:
2
3 # init and/or other methods not shown for brevity
4
5 def contains(self, student: Student) -> bool:
6 return -1 != self._find(student)
7
8 def remove(self, student: Student):
9 if not self.contains(student):
10 raise ValueError("No such student to remove")
11 else:
12 self._students.pop(self._find(student))
contains
just calls_find
and checks if the sentinel value was returnedremove
checks if theStudent
exists, and if it does, it removes itNote that if the
Student
is not found, an exception is raised by the method
The last method we will add to the
Course
class is the__repr__
magic methodSince we wrote the
__repr__
method for theStudent
class, we know how to format the string for an individualStudent
objectHowever, the
Course
class is a collection ofStudent
objects
1class Course:
2
3 # init and/or other methods not shown for brevity
4
5 def __repr__(self) -> str:
6 s = ""
7 for student in self._students:
8 s += f"{student}\n"
9 return s
In the above example, we start with an empty string
s
Then, we loop over all
Student
objects within the_students
list, and for eachStudent
Convert them to a string (which the f-string will do automatically) with a newline character
And then appends the whole new string to tne end of the string
s
What’s interesting here is how the
Course
class’__repr__
makes use of theStudent
class’__repr__
Below is an example of a
Course
object and it’s string representation
1my_course = Course("CS101")
2my_course.add(Student("Bob", "Smith", 123456789))
3my_course.add(Student("Jane", "Doe", 987654321))
4my_course.add(Student("Niles", "MacDonald", 192837465))
5my_course.add(Student("Jane", "Doe", 987654321))
6print(my_course)
7
8# Results in
9# Smith, Bob 123456789
10# Doe, Jane 987654321
11# MacDonald, Niles 192837465
12# Doe, Jane 987654321
The string printed out is one single string that spans multiple lines