Introduction to classes and objects

As programs become more complex, it becomes important to find better ways of organizing the code—dividing the code up into easily managed pieces, each of which is easy to debug and understand. Functions are one way to organize your code. A further way to organize your code is by defining classes: programmer-defined types of data and functions that work on those types of data.

Python offers several built-in data types: booleans, integers, floats, strings, and lists. Python also lets you define a new data type that you design specially for your program, by writing a description of how that data type works, called a class. You can then use that class to create an object of the data type.

Broadly speaking, there are two benefits to using classes and objects.

  1. They provide abstraction. In other words, they give us a way to hide how data is stored in objects of a class. That way, the users of the class need not concern themselves with how data is stored.

  2. They provide a set of “parts” that you can use at will. Suppose that you’re designing a car. It contains four tires. Do you need to design four different tires? No, since all four tires are essentially the same. You can design just one and reuse the design in each place that you need a tire.

Think of how you implemented your pong program. You had to keep track of a ball. You probably had some global variables to describe the position of the ball and also its velocity: how the ball’s position changes at each timestep in the program. Let’s say that you called these variables x, y, v_x, and v_y. It would be great if we had a data type that represented a ball. We could create a ball object that would internally store the location and velocity of the ball.

Each object of the ball class would have its own internal variables to store the location and velocity of that ball. We call these variables instance variables, and they give us a way to specialize each object from a class. Each instance of a ball object would have its own x, y, v_x, and v_y variables.

Let’s say that we want a class, Ball, for describing balls, and let’s assume that I have defined the Ball class already. We might use it like this example:

Let’s look at the first line after the import.

myball = Ball(10.0, 15.0, 0.0, -5.0, .1, 10.0)

The special function Ball is called the constructor for objects of the Ball class. The constructor does three things:

  1. The constructor allocates memory for a Ball object.
  2. The constructor initializes the Ball object with starting values.
  3. The constructor returns the address of the Ball object in memory.

Therefore, this line of code constructs a brand new object of the Ball class, using the values 10.0, 15.0, 0.0, -5.0, .1, and 10.0. The address of that Ball object is copied into the variable myball.

Like lists, objects are created from the heap, and we store the address of the object in a variable. We also call the address of an object a reference to the object. So, in the above code, myball holds a reference to a Ball object.
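Here is a minimal sketch of what it means for a variable to hold a reference. The Ball class below is a hypothetical two-variable stand-in defined inline so that the snippet runs on its own; it is not the course's ball.py, and the class-definition syntax it uses is explained later in this section.

```python
# Hypothetical stand-in for the course's Ball class, reduced to a position.
class Ball:
    def __init__(self, x, y):
        self.x = x
        self.y = y


myball = Ball(10.0, 15.0)    # myball holds a reference to a new Ball object
same_ball = myball           # copies the reference, not the object

same_ball.x = 99.0           # change the object through one reference...
print(myball.x)              # ...and the change shows through the other
print(same_ball is myball)   # both names refer to the same object
```

Because assignment copies only the address, there is still just one Ball object here, reachable through two names.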

We don’t know much yet about the internals of Ball objects, except that they seem to have something called x and something called y. If we imagine that the Ball object we constructed resides at address 3000 in memory, here’s our picture:

Instance variables and the dot . operator

We call each object in memory an instance of the class.

Each object has its own copy of each variable defined by the Ball class. We say that the variables associated with a particular instance of an object are the instance variables for that object.

To refer to an instance variable of an object, we need two things:

  1. The address of the object.
  2. The name of the instance variable.

That’s why in the above code, we wrote myball.x and myball.y. The reference to a Ball object is myball, and x and y are instance variables of every object from the Ball class. When we write myball.x, we mean the x instance variable of the specific Ball object whose address is in myball. As seen in the above code, we can use the value of an instance variable, and we can assign to an instance variable.

The above code has only one instance of a Ball object, but we can have as many as we like:

ball1 = Ball(10.0, 15.0, 0.0, -5.0, .1)
ball2 = Ball(12.0, 23.0, 2.0, 3.0, .1)

Here, we have created two distinct instances of Ball objects:

Here, ball1 holds the address of a Ball object at address 3200. This Ball object has instance variables that we can access as ball1.x and ball1.y, with values 10.0 and 15.0, respectively. Also, ball2 holds the address of a Ball object at address 4500, with instance variables that we can access as ball2.x and ball2.y, with values 12.0 and 23.0, respectively.

In fact, each Ball object also has some additional instance variables. v_x and v_y hold the velocity of the ball in the x and y directions. These variables are also assigned values by the constructor, so ball2.v_x has the value 2.0, and ball2.v_y has the value 3.0. Finally, there is the radius of the ball, ball2.radius, and the scaling factor for drawing measured in pixels per meter, ball2.ppm.
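To see that each instance really has its own copy of the instance variables, here is a small sketch. The Ball class is again a hypothetical stand-in (it omits ppm and anything to do with drawing), so treat it as an illustration rather than the course's actual class.

```python
# Hypothetical stand-in for the course's Ball class (not the real ball.py).
class Ball:
    def __init__(self, x, y, v_x, v_y, radius):
        self.x = x
        self.y = y
        self.v_x = v_x
        self.v_y = v_y
        self.radius = radius


ball1 = Ball(10.0, 15.0, 0.0, -5.0, .1)
ball2 = Ball(12.0, 23.0, 2.0, 3.0, .1)

ball1.x = ball1.x + 1.0      # assign to ball1's x only
print(ball1.x, ball2.x)      # ball2.x is unchanged
print(ball2.v_x, ball2.v_y)  # velocities set by the constructor
```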

Exercise: animated ball

Objective: Create an object and use the instance variables of that object to make a simple animation.

Create a Ball object in the global scope with appropriate parameters:

Instance variable   Value   Meaning
x                   10.0    x position
y                   15.0    y position
v_x                 2.0     x velocity
v_y                 -2.0    y velocity
radius              1.0     radius

Then, within main(), draw the ball using the provided function, and update the position of the ball each frame, by multiplying the velocities by the timestep and adding to the position instance variables.

Here is an answer for this exercise. No peeking until you’ve solved it yourself!

Methods

We might also have functions specially designed to work on Ball objects. For example, functions to move the ball, to draw the ball, or to change the direction of the ball’s motion. Functions designed to work on an instance of an object are called methods.

Methods provide yet more abstraction. We don’t have to know anything about how methods do what they do. All we care about is that they do what we want them to do. In the case of our Ball objects, when we call Ball methods, we don’t have to be concerned with the internal details of Ball objects. We just care that the Ball methods work correctly.

For example, we would like to have a method that moves the ball (changes the x and y instance variables of a Ball object) using the current velocity of the ball and a duration. We’ll call this function update_position. We’ll see how to write the update_position function soon, but here’s an example of how to use it:

Notice that to call a method on an object, the syntax is

reference_to_object.method_name(param1, param2, ...)

In our example, myball is a variable that stores the address of a Ball object, the method name is update_position, and the only parameter to the method is the duration 0.1 seconds.
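A minimal sketch of such a call follows. The Ball class here is a hypothetical stand-in defined inline, and the body of update_position is included only so that the snippet runs; how to write it is covered later in this section.

```python
# Hypothetical stand-in for the course's Ball class.
class Ball:
    def __init__(self, x, y, v_x, v_y):
        self.x = x
        self.y = y
        self.v_x = v_x
        self.v_y = v_y

    def update_position(self, timestep):
        # Move the ball according to its velocity over the given duration.
        self.x = self.x + timestep * self.v_x
        self.y = self.y + timestep * self.v_y


myball = Ball(10.0, 15.0, 0.0, -5.0)
myball.update_position(0.1)   # reference_to_object.method_name(param1)
print(myball.x, myball.y)     # y has dropped by 0.1 * 5.0
```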

Exercise: gravity ball

Objective: Write code that makes use of methods provided for an object.

Here are three methods that the Ball class provides:

Write a simple animation of a falling ball by making calls to these methods.

Here is an answer for this exercise. No peeking until you’ve solved it yourself!

Defining your own classes

To define a class, you write several functions that operate on objects of that class. Now you know that we call these functions methods.

We have seen how the objects of the Ball class might be created and used. We will soon look at how to define the Ball class itself, by writing the methods of the Ball class. But first, let’s look briefly at how import statements let us separate our new class into its own file. The previous example program started with the line

from ball import Ball  # import the Ball class

Although we could have defined the Ball class right in the same file with our main code, it is very convenient to write the class definition in some other file, ball.py. Separating each class definition into its own file yields two advantages:

  1. Each file is relatively short and easy to debug.
  2. We can use objects of the class in many different files and programs written in the future, not just for the program we are writing today.

Good programming practice: No Python file should have more than a single class definition. By convention, the name of the file should be the same as the name of the class, but all lowercase.

Like any “good practice” rule, there are exceptions. But for now, you should put each new class in its own file, observing the naming convention.

The class name and naming conventions for classes

The first three lines should be clear: an import statement and the definitions of some constants. Next we have the line

class Ball:

This line gives the name of the class. Don’t forget the colon! By convention, class names are capitalized, and rather than using underscores to represent spaces (like you would for a function), use a new capital letter for each word in the class name. For example,

class ThreeToedSloth:

This style of using capital letters for each new word is sometimes called camel case, because each new capital looks like the hump of a camel’s back.

The constructor and the __init__ method

A class definition is a list of method definitions for interacting with the class. In order to work with an object, you have to construct it. We’ve seen examples of constructing an object:

myball = Ball(10.0, 15.0, 0.0, -5.0)

The constructor is a special function defined automatically by Python; it has the same name as the name of the class. We said that the constructor Ball does three things:

  1. The constructor allocates memory for a Ball object.
  2. The constructor initializes the Ball object with starting values.
  3. The constructor returns the address of the Ball object in memory.

Steps 1 and 3 are handled for us by Python, but we will need to write the method that says how to initialize the object with starting values. We call this method the init method, and it has the special name __init__, written with two underscores both before and after init.

class Ball:
    def __init__(self, start_x, start_y, start_v_x, start_v_y, r = 0.5, g = 0.5, b = 0.5):
        # Location and velocities of the ball.
        self.x = start_x
        self.y = start_y
        self.v_x = start_v_x
        self.v_y = start_v_y

        # Color of the ball, for drawing purposes.
        self.r = r
        self.g = g
        self.b = b

The init method for a class:

  1. Always has the special name __init__.
  2. Always has the first parameter self.
  3. Sets the values of instance variables of the object based on the parameters to the function.

There’s a lot going on here. Let’s take the issues one by one.

First, you notice that the constructor Ball(10.0, 15.0, 0.0, -5.0) took only four actual parameters, but the definition of the __init__ method has eight formal parameters. What’s going on? A couple of things, actually.

The constructor allocates memory for the new Ball object, and it passes the address of that object in the variable self to the __init__ method. Having the address of the object in self allows the __init__ method to have access to the instance variables (and methods) of the newly created Ball object.

In order to access any instance variables of a Ball object, we need the address of the Ball object. If we don’t have an address of a Ball object, we don’t have access to any instance variables of any Ball object. Period. self holds that address.

So the first line of the __init__ method

self.x = start_x

sets the instance variable x of the object at the address self to have the value of the parameter start_x. If the constructor was called with Ball(10.0, 15.0, 0.0, -5.0), then start_x has the value 10.0, and __init__ copies the value 10.0 into the instance variable x of the object at the address self.

For clarity of this first explanation, I used the parameter names start_x, start_y, etc. But there’s no reason I couldn’t have called those parameters simply x and y. Python would not have been confused by the following code:

def __init__(self, x, y, v_x, v_y):
    self.x = x
    self.y = y
    self.v_x = v_x
    self.v_y = v_y

x by itself is the parameter to __init__. self.x is the instance variable x of the object at address self. I will typically have parameters to __init__ that have the same names as the instance variables.

Now, what about the last three formal parameters? How come I didn’t have to supply corresponding actual parameters? That’s because Python allows for optional parameters with default values. If you leave out the corresponding actual parameters, then Python uses the default values. Otherwise, it uses the values of the actual parameters. In this case, the formal parameters r, g, and b all have the same default value of 0.5. Therefore, the call to the Ball constructor is the same as if it had been Ball(10.0, 15.0, 0.0, -5.0, 0.5, 0.5, 0.5).

We’ve used functions with optional parameters before. If you recall, when we first called start_graphics, we supplied just one parameter, the name of a function. Then, as we did fancier stuff with graphics, we supplied a second parameter (a string with the window name), a third and fourth parameter (the window width and height), and a fifth parameter (a boolean indicating whether to flip the coordinates vertically).

Whenever you define optional parameters, they have to come after all the required (i.e., non-optional) parameters. And whenever you call a function (or method) and omit optional parameters, the parameters you omit must be the last ones. So, if you call the constructor Ball(10.0, 15.0, 0.0, -5.0, 0.7), then the formal parameter r gets the value 0.7 but the formal parameters g and b each get their default values of 0.5. (We’ll see a little later that there is a way to selectively omit optional parameters.)
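The following sketch shows the default values in action. It uses a stripped-down Ball class with the same parameter list as the __init__ method discussed above; the variable names plain and reddish are just illustrative.

```python
class Ball:
    def __init__(self, start_x, start_y, start_v_x, start_v_y,
                 r=0.5, g=0.5, b=0.5):
        self.x = start_x
        self.y = start_y
        self.v_x = start_v_x
        self.v_y = start_v_y
        self.r = r
        self.g = g
        self.b = b


plain = Ball(10.0, 15.0, 0.0, -5.0)         # r, g, b all take the default
reddish = Ball(10.0, 15.0, 0.0, -5.0, 0.7)  # r is supplied; g, b default

print(plain.r, plain.g, plain.b)
print(reddish.r, reddish.g, reddish.b)
```

Only trailing parameters can be left out this way: the fifth actual parameter, if present, always goes to r.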

Now, what about this business of setting the instance variables of the object being constructed? Think of it this way. Some combinations of instance variables are nonsensical, or just plain illegal in the context of the problem being solved. For the ball in the window, you might want to disallow the ball being outside the window. For a nuclear reactor, you might want to disallow certain configurations of the fuel rods.

If an object can exist with illegal, or even just unknown, values of its instance variables, that is bad. Bad, bad, bad! So, if you ever find yourself defining a constructor that does not assign to every instance variable, you should follow my simple three-step plan:

Defining your own methods

To recap: A method is a special function defined in a class; it acts on an object of the class in which the method is defined. Every method always takes at least one parameter, self, that refers to (contains the address of) an object of the class in which the method is defined.

For the Ball class, an obvious method would be the function that updates the position of the ball, based on the current velocity and some duration of time, the timestep. With this method, here’s what the Ball class would look like:

class Ball:
    def __init__(self, start_x, start_y, start_v_x, start_v_y, r = .5, g = .5, b = .5):
        # body of __init__ omitted for brevity; it assigns the instance variables as before
        pass

    def update_position(self, timestep):
        self.x = self.x + timestep * self.v_x
        self.y = self.y + timestep * self.v_y

Let’s think about what information the update_position method needs to do its job. It needs the ball’s position (i.e., the ball’s x and y coordinates), the ball’s velocity (again, in x and y), and the timestep being simulated. The ball’s position is in the instance variables x and y, and so we can refer to them by self.x and self.y. The ball’s velocity is in the instance variables v_x and v_y, and so we can refer to them by self.v_x and self.v_y. What about the timestep? It’s not kept in an instance variable. Why not? Because the timestep is not a property of a ball; it’s an amount supplied by the code that is simulating how the ball moves. So we supply the timestep as the parameter timestep.

Notice, then, that the update_position method gets the information it needs from two sources: the instance variables of a Ball object and a parameter of the method. It could even have gotten information from yet another source—global variables—though this particular method didn’t need to.

Exercise: tree

Objective: Design a class, then create and use an object of that class.

The code below draws the sky and some grass. Write a class, Tree, that can be used to draw a tree. Each tree object should have two instance variables that store the coordinates of the bottom left pixel on the trunk of the tree. The Tree class should have a method draw that draws the tree, using a brown rectangle for the trunk, and a green circle for the top of the tree. For example, you could have a trunk that is 20 pixels high and 5 pixels wide, and a circle with radius 12 for the top part of the tree.

Using the Tree class, create a single Tree object in the global scope placed on the grass, and draw that tree using a single method call within the scene method.

Here is an answer for this exercise. No peeking until you’ve solved it yourself!

What happens when you call a method

Let’s repeat what we saw before, when we created a Ball object and then updated its position:

Several things happen when a method is called:

  1. The address before the dot is copied into the variable self of the method.
  2. Any values inside the parentheses are copied into remaining formal parameters of the method.
  3. Any optional parameters that do not have corresponding actual parameter values receive their default values.
  4. The current value of the program counter is saved, and then the program counter is set to the first line of the method body. The method executes.
  5. When the method completes, either because the program counter reaches the end of the method or a return statement, the program counter goes back to do the next thing after the method call.

Notice that steps 2–5 work the same way for function calls and method calls. Method calls differ from function calls only in that method calls include step 1. Every method call must have a reference to an object before the dot. Every method header (in the definition of the method) must have a first parameter, self.

There are two common programming errors to avoid here:

  1. It’s easy to forget to type the word self as the first parameter of the method header.
  2. It’s easy to try to call a method without putting the reference to the object before the dot.

In either case, Python will probably complain that you’ve passed the wrong number of parameters to the method.

You can think of the object reference before the dot as just a special parameter to the method.
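The sketch below illustrates that idea, using a hypothetical stand-in for the Ball class. It calls the same method twice, once in the usual way and once with the object reference passed explicitly through the class, and then shows the error that results from forgetting self in a method header. BrokenBall and error_seen are names invented for this illustration.

```python
# Hypothetical stand-in for the course's Ball class.
class Ball:
    def __init__(self, x, y, v_x, v_y):
        self.x = x
        self.y = y
        self.v_x = v_x
        self.v_y = v_y

    def update_position(self, timestep):
        self.x = self.x + timestep * self.v_x
        self.y = self.y + timestep * self.v_y


myball = Ball(10.0, 15.0, 0.0, -5.0)
myball.update_position(0.1)         # reference before the dot becomes self
Ball.update_position(myball, 0.1)   # same call, with self passed explicitly
print(myball.y)


class BrokenBall:
    def update_position(timestep):  # BUG: self is missing from the header
        pass


error_seen = False
try:
    BrokenBall().update_position(0.1)   # Python still passes the reference
except TypeError as err:
    error_seen = True
    print("TypeError:", err)
```

In the broken version, Python supplies two values (the object reference and 0.1) to a method that accepts only one, which is why the complaint is about the number of parameters.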

Methods can call other methods

You might notice that there is a method in the Ball class, animate_step, which we haven’t discussed yet. It looks like this:

    def animate_step(self, timestep):
        self.update_position(timestep)
        self.update_velocity(timestep)

Notice that animate_step itself calls other methods of the Ball class to actually do the real work. Since it has a reference to a Ball object (in the variable self), it can call other methods on that object; for example, self.update_position(timestep).
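Here is a self-contained sketch of one method calling others. The body of update_velocity is not shown in these notes, so the version below, which applies a constant downward acceleration named GRAVITY, is purely an assumption made for illustration.

```python
GRAVITY = -9.8   # assumed constant acceleration in y; not from the notes


class Ball:
    def __init__(self, x, y, v_x, v_y):
        self.x = x
        self.y = y
        self.v_x = v_x
        self.v_y = v_y

    def update_position(self, timestep):
        self.x = self.x + timestep * self.v_x
        self.y = self.y + timestep * self.v_y

    def update_velocity(self, timestep):
        # Assumed behavior: gravity changes only the y velocity.
        self.v_y = self.v_y + timestep * GRAVITY

    def animate_step(self, timestep):
        # self refers to the same Ball, so it can call its other methods.
        self.update_position(timestep)
        self.update_velocity(timestep)


ball = Ball(0.0, 100.0, 0.0, 0.0)
ball.animate_step(0.1)
ball.animate_step(0.1)
print(ball.y, ball.v_y)
```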

Special methods: __init__ and __str__(self)

The __init__ method that we’ve already seen is special because we never call it directly. When we call the constructor (the function created automatically by Python that is used to create objects of the class), Python calls the __init__ method for us. The double underscores at the beginning and end of __init__ mark it as a special method that the programmer does not call directly; instead, it is used by the constructor function.

There are a few other special methods that a designer of a class might write but that a programmer does not call directly. One of the most useful of these is the method __str__(self). This method takes one parameter, the reference to the object, and should return a string.

If the programmer doesn’t call __str__, when is it used? It turns out that when you use the function str on the object, the str function calls the __str__ method for the class, if you have defined one. In fact, print automatically calls __str__ to get a string representation of the object to print, even if you don’t use str explicitly. You can add a __str__ method to the Ball class like this:

def __str__(self):
    return str(self.x) + ", " + str(self.y)

Now we can easily print out information about a Ball object in two different ways:
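A minimal sketch of both ways, using a hypothetical two-variable Ball class with the __str__ method from above:

```python
# Hypothetical stand-in for the course's Ball class.
class Ball:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        return str(self.x) + ", " + str(self.y)


myball = Ball(10.0, 15.0)

print(myball)                # way 1: print calls __str__ for us
description = str(myball)    # way 2: str calls __str__; we keep the string
print("The ball is at " + description)
```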

Lists of objects

A Python list can contain references to (i.e., addresses of) objects. Just create each object with the constructor, and add the object to the list, either using indexing or with the append method of lists. Here’s an example:
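As a sketch (with a hypothetical stand-in for the Ball class), the code below adds references with append, replaces one by indexing, and then loops over the list.

```python
# Hypothetical stand-in for the course's Ball class.
class Ball:
    def __init__(self, x, y):
        self.x = x
        self.y = y


balls = []                        # start with an empty list
balls.append(Ball(10.0, 15.0))    # add a reference with append
balls.append(Ball(12.0, 23.0))

balls[0] = Ball(1.0, 2.0)         # indexing replaces the reference in a slot

for ball in balls:                # each item is a reference to a Ball
    print(ball.x, ball.y)
```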

Exercise: cascade

Objective: Create objects, add references to those objects to a list, and call methods on each object in the list.

The code provided below draws a ball and animates its motion. Rewrite the code so that every frame that the mouse button is down, a new ball is created at that mouse location and added to a list of balls. Each frame, your code should draw all of the balls and update their positions and velocities, so that all of the balls fall downwards. You can use is_mouse_pressed(), mouse_x(), and mouse_y() within main() to check the state of the mouse.

Here is an answer for this exercise. No peeking until you’ve solved it yourself!

Deleting items from a list

The cascade exercise is inefficient because balls are still simulated even after they have fallen off the screen. Here’s a nice function to check whether any ball has fallen off the screen, and if so, remove it from the list.

def remove_offscreen(blist):
    i = len(blist) - 1
    while i >= 0:
        if blist[i].y < 2:
            del blist[i]
        i -= 1

Here, I used the Python del operation to remove Ball objects from the list once they fell off the bottom of the graphics window. Notice that if you delete the item at index 3, the item previously at index 4 moves into the vacated spot at index 3 and, in fact, all items after index 3 move one spot closer to the beginning of the list. A for-loop that runs up to the original length of the list would therefore run into trouble if the list were shortened during the body of the loop.

In fact, looping forward through a list while deleting items is also tricky for another reason. Suppose you are considering the item at index 4. If you delete it, then you should not advance the index, since the item previously at index 5 has moved into the vacated spot at index 4.

A convenient workaround is to loop backwards. Even if you delete the item at the current index, you can still reduce the index by 1 to consider a new item.
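To check that the backwards loop handles two adjacent deletions correctly, here is a small usage sketch. It repeats remove_offscreen from above and uses a hypothetical stand-in for the Ball class; the starting positions are made up for the test.

```python
# Hypothetical stand-in for the course's Ball class.
class Ball:
    def __init__(self, x, y):
        self.x = x
        self.y = y


def remove_offscreen(blist):
    # Walk from the last index down to 0 so deletions never skip an item.
    i = len(blist) - 1
    while i >= 0:
        if blist[i].y < 2:
            del blist[i]
        i -= 1


# The two middle balls are below y = 2 and sit next to each other.
balls = [Ball(0.0, 50.0), Ball(1.0, 1.0), Ball(2.0, 0.5), Ball(3.0, 30.0)]
remove_offscreen(balls)

remaining = [ball.x for ball in balls]
print(remaining)   # [0.0, 3.0]
```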

Sorting a list of objects with a sort key

We previously discussed how to sort a list of numbers or strings. What if we wanted to sort a list of objects? The main question is how to compare two objects. For example, if we had a list of objects of some class Student, we might sort by height, or we might sort by grades on a CS 1 exam.

The .sort() method of the list class sorts a list in place. However, to sort a list of objects, we need to have some way of choosing the value to sort by, sometimes called the sort key. We can do this by writing a function that takes an object of the desired type, and returns the value. The sort method takes a reference to a function as a named parameter, key. Here’s an example:
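A minimal sketch follows. The Student class, its name and height instance variables, and the function get_height are all hypothetical, chosen to match the sorting-by-height idea mentioned above.

```python
# Hypothetical class for illustration only.
class Student:
    def __init__(self, name, height):
        self.name = name
        self.height = height


def get_height(student):
    # The sort key: the value to sort each Student by.
    return student.height


students = [Student("Ana", 170), Student("Ben", 160), Student("Cy", 180)]
students.sort(key=get_height)   # pass the function itself; no parentheses

names = [student.name for student in students]
print(names)   # ['Ben', 'Ana', 'Cy']
```

Note that key=get_height passes a reference to the function; writing get_height() there would try to call it immediately, with no parameter.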

Exercise: the forest

Objective: Design a class and manipulate objects of that class to model an environment.

Write the function create_forest which creates and returns a list of Tree objects placed randomly on the grass drawn by scene. Add a few lines of code to scene to draw those trees.

You will notice that the forest looks a little strange. Trees that are closer to the viewer should be drawn last, so that they appear on top of trees that are further from the viewer. But the randomly placed trees might have trees in the back appearing too late in the list, causing weird drawing behavior. Modify create_forest to return a list of trees that is sorted by the y coordinates of the trees in ascending order.

Here is an answer for this exercise. No peeking until you’ve solved it yourself!

Exercise: cloudy day

Objective: Design a class and manipulate objects of that class to model an environment.

Add clouds to the scene by creating a Cloud class, and creating a list of cloud objects; add clouds to the list at random times with coordinates slightly to the left of the screen. The clouds should drift slowly from left to right in the wind; delete the cloud objects when they leave the screen. (For an added challenge, darken the sky based on how many clouds are in the sky.)

Here is a student-contributed answer for this exercise. No peeking until you’ve solved it yourself!