23. Objects IV — Data Structures

  • We have spent much of the course learning how to write algorithms

  • But we also saw a few data structures

    • Lists

    • Tuples

    • Sets

    • Dictionaries

  • Although these data structures had clear functionality associated with them, we use them as a mechanism for storing and managing data

  • They may also provide some other benefits

    • Ability to index

    • Efficient lookups

    • Easy to search

    • Ordering the data

    • Managing duplicates

  • Consider the Lists we have been using

    • They can be indexed

    • The lists have a beginning and end

    • The elements exist in some order (though, not necessarily ordered)

    • They are easy to do a linear search on

  • You have all been taking advantage of these features, but the lists are provided to us at a certain level of abstraction

  • If you dig deeper and look under the hood, all the functionality of the list needs to be programmed

  • Data structures are an important part of a programmers toolbox

  • Actually understanding how they do what they do is critical

  • To start learning about data structures, we will be constructing our own simple collection — a Course that will manage Student objects

23.1. Student Class

  • The Student objects are the objects we care about

  • But the problem is, we will have many of them and we want a nice way to manage all instances

    • This is where the Course class comes in, which is covered in the next section

  • For now, we will focus on the Student class — a simple class to keep track of important student information

  • We will also define an __eq__ and __repr__ for the class as they are often handy for testing purposes

 1class Student:
 2
 3    def __init__(self, first_name: str, last_name: str, student_number: int):
 4        self.first_name = first_name
 5        self.last_name = last_name
 6        self.student_number = student_number
 7
 8    def __eq__(self, other) -> bool:
 9        """
10        Two students are considered equal if their student numbers are the same. This assumes student numbers are unique
11        among students.
12
13        :param other: A student to compare the self Student to
14        :type other: Student
15        :return: True if the students have the same student number, false otherwise
16        :rtype: boolean
17        """
18        if isinstance(other, Student):
19            return self.student_number == other.student_number
20        return False
21
22    def __repr__(self) -> str:
23        return f"{self.last_name}, {self.first_name}\t{self.student_number}"
  • Above is the entire Student class

  • The only functionality is defined in the magic methods

  • The main purpose of the Student class is to store data

  • Notice the __eq__ defines equality for the Student class strictly on the student_number attribute

    • The first_name and last_name values may change for a Student

    • However, the student_number of each Student should be unique and not change

  • Also notice how the __repr__ is not following the ClassName(attributes) format

  • For the purpose of this program, a more specialized format is more appropriate

23.2. Course Class

  • The Course class will be a collection of Student objects

  • Ultimately, the Student objects will be stored in a List, but we want to make use of a class to add specific functionality we care about

  • We will have the Course class control access to the list of Students through methods

    • A method to add a Student to the list

    • A method to remove a Student from the list

    • A method to get the size of the course — the number of Student objects in the list

    • A method to check if a given Student exists in the Course

    • __repr__ method

1class Course:
2    """
3    A collection of students enrolled in a course. This class manages the individual Students and provides simple
4    enrollment details.
5    """
6
7    def __init__(self, course_name: str):
8        self.course_name = course_name
9        self._students = []
  • Above is the constructor for the class

  • The only two attributes the class has are course_name and _students

  • Notice how the _students attribute starts with an underscore _

  • We don’t really want to access the list of Student objects directly

  • Instead, to give us more control over how the list is used, we want to add and remove Students through the methods we write in the class

  • Nothing will actually stop us from accessing the list directly

    • some_course._students

  • But as a convention, to let yourself and other programmers know, all attributes and methods that are not to be accessed directly start with an underscore

1class Course:
2
3    # init and/or other methods not shown for brevity
4
5    def size(self) -> int:
6        return len(self._students)
7
8    def add(self, student: Student):
9        self._students.append(student)
  • For size, we just hijack the len of the _students list and return that value

  • Similarly for add, we can make use of the list’s append method

  • The methods to remove a Student and check if the Course contains a specific Student both need a linear search

    • Find the Student to be removed, if it exists

    • Check if the Student is actually in the Course

  • To eliminate any duplicate code, we can write a helper method — _find

    • Like the attribute _students, the method _find is not really indented to be used outside the class, thus, by convention, it starts with an underscore

  • _find will be a method that performs the linear search on the list

    • If the specified Student is found within the _students list, return the index of where it is

    • If the Student is not found, return a sentinel value — -1

    • The -1 sentinel value is a special value with meaning within the context of the algorithm — not found

 1class Course:
 2
 3    # init and/or other methods not shown for brevity
 4
 5    def _find(self, student: Student) -> int:
 6        """
 7        Returns the index of the first occurrence of a given Student if it exists. If it does not exist, return a
 8        sentinel value of -1.
 9
10        :param student: The student to search for
11        :type student: Student
12        :return: Index of the student, or -1 if it is not found
13        :rtype: int
14        """
15        for i, s in enumerate(self._students):
16            if s == student:
17                return i
18        return -1
  • In the above _find method, we actually make use of the Student class’ __eq__ method

    • s == student calls the __eq__ magic method from the Student class

  • Overall, the _find method is not particularly remarkable — it’s a linear search

  • With _find written, the remove and contains methods are simple to write

 1class Course:
 2
 3    # init and/or other methods not shown for brevity
 4
 5    def contains(self, student: Student) -> bool:
 6        return -1 != self._find(student)
 7
 8    def remove(self, student: Student):
 9        if not self.contains(student):
10            raise ValueError("No such student to remove")
11        else:
12            self._students.pop(self._find(student))
  • contains just calls _find and checks if the sentinel value was returned

  • remove checks if the Student exists, and if it does, it removes it

    • Note that if the Student is not found, an exception is raised by the method

  • The last method we will add to the Course class is the __repr__ magic method

  • Since we wrote the __repr__ method for the Student class, we know how to format the string for an individual Student object

  • However, the Course class is a collection of Student objects

1class Course:
2
3    # init and/or other methods not shown for brevity
4
5    def __repr__(self) -> str:
6        s = ""
7        for student in self._students:
8            s += f"{student}\n"
9        return s
  • In the above example, we start with an empty string s

  • Then, we loop over all Student objects within the _students list, and for each Student

    • Convert them to a string (which the f-string will do automatically) with a newline character

    • And then appends the whole new string to tne end of the string s

  • What’s interesting here is how the Course class’ __repr__ makes use of the Student class’ __repr__

  • Below is an example of a Course object and it’s string representation

 1my_course = Course("CS101")
 2my_course.add(Student("Bob", "Smith", 123456789))
 3my_course.add(Student("Jane", "Doe", 987654321))
 4my_course.add(Student("Niles", "MacDonald", 192837465))
 5my_course.add(Student("Jane", "Doe", 987654321))
 6print(my_course)
 7
 8# Results in
 9# Smith, Bob    123456789
10# Doe, Jane 987654321
11# MacDonald, Niles  192837465
12# Doe, Jane 987654321
  • The string printed out is one single string that spans multiple lines

23.3. For Next Class