Lists and for-loops

So far, we’ve needed a new variable name for each new piece of information we wanted to store. A list is a Python data type that can store multiple pieces of information, in order, and with a single variable name. The list name, together with a non-negative integer can then be used to refer to the individual items of data. Let’s start with an example:

Notice that we can use moons[0] and moons[1] as if they were variables in their own right.

There are two main uses of lists:

  1. Storing large amounts of similar data using just one name for the list.
  2. Applying similar operations to many pieces of data.

Creating lists

You can type a list in using brackets and commas.

When the Python interpreter reaches the brackets, Python allocates space in memory for the items in the list. Of course, usually there’s not much point in just printing out a list created this way; we’d like to be able to refer to the list with a variable name. The assignment operator = assigns the list to the variable on the left-hand side:

characters = ["Horton", "Lorax", "Mayzie"]

Strictly speaking, it’s incorrect to say “characters is a list.” Instead, the correct thing to say is “characters holds the address in memory of a list.” Although the difference between “is a list” and “holds the address in memory of a list” might seem subtle right now, we’ll see later that there’s definitely a difference.

You can also create an empty list.

worst_things_about_cs1 = []

For now, we don’t have much use for empty lists, but we’ll soon see that we can start with an empty list and then add items to it.

Exercise: star power

Objective: Create a list with data and make use of that list by passing it to a function.

The vertices of a star centered the middle of a small drawing area are (161, 86), (119, 80), (100, 43) (81, 80), (39, 86), (70, 116), (63, 158), (100, 138), (137, 158), (130, 116).

In the code below, the polygon function takes two parameters. The first parameter is a reference to a list containing the x coordinates of a polygon. The second parameter is a reference to a list containing the y coordinates of a polygon. Draw a star on the screen.

Here is a solution.

Accessing items of a list

Once a list has been created and its address assigned to a variable, you will want to do things with individual items of the list. We use square brackets to index into a list. The first list element is at index 0 (because that’s how computer scientists like to count: starting from 0), so to get the 0th item in the list, follow the name of the list variable with square brackets containing the number 0.

What goes within the brackets can be any expression that evaluates to an int that is in the range of the indices:

Notice that list items can really be treated as variables. You can use the assignment operator to change an item in a list (characters[1] = "Thing One"). Here is another example:

some_numbers[2] has the value 8 (remember, we start counting, or indexing, the list, from 0). some_number[3] has the value 12. After adding, we get 20, and assign that value to the 0th item of the list. Run the program to see the result.

Common programming error: List index out of range

If you forget that the list is indexed from 0, or for some other reason try to access an item of the list at an index beyond the end of the list, Python will abort your program and print an error message. The following code fails, because there are only items at indices 0, 1, and 2:

This is a very common type of programming error. If you get an error like

IndexError: list index out of range

from Python, step through your code by hand near the line where Python reports an error, and figure out how you managed to ask for a list index that isn’t in the list. There are two cases:

  1. You asked for an item at the index you intended, but somehow didn’t originally put the item into the list.
  2. You put the correct items into the list, but made a mistake in computing or choosing the index.

The length of a list

You can find out how many items are in a list using the Python built-in function len. Here’s an example:

Iterating over a list with a while-loop

A list is a way of organizing, or structuring, data that your program will use. We say that a list is a data structure, and we will see many other examples of data structures later in the course. One of the key principles of a data structure is that the data structure allows you to apply operations to some or all of the data in a unified way.

Let’s take an example. You decide to start a new website, www.warmhanoverweather.com. Initially, daily temperatures are recorded in Fahrenheit.

temperatures = [9, 14, 18, 12, 4, 16, -11]

But then next month, the United States decides to convert to the Celsius scale, leaving the Bahamas, Belize, the Cayman Islands, and Jamaica as the only countries using the Fahrenheit scale. You decide to convert all of your temperatures to Celsius. You could do it this way:

(Notice the floating-point inaccuracies. I told you so.)

You are doing the same type of operation over and over again. Convert temperatures[0]. Convert temperatures[1]. Convert temperatures[2]. And so on. You could use a loop, and replace six statements like so,

This while-loop would work for even a very long list of temperatures.

Exercise: falling star

Objective: Modify the data in a list using a while loop.

The code below draws a star in the center of the screen. Re-write the code so that the star slowly falls from its starting position downwards.

Here is a solution.

An algorithm design and implementation example: reversing a list

Let’s look at an example in which we want to reverse a list. Recall from the first chapter that a fully specified step-by-step strategy for solving a problem in computer science is called an algorithm. The first step to solving a problem is to think about how we might solve it as a human, as precisely as possible. Once each step has been fully specified, we have designed an algorithm. We then translate that algorithm into runnable code, thereby implementing the algorithm.

There are two choices for our reverse_list algorithm:

  1. Create a new, reversed copy of the list, leaving the original list intact.

  2. Reverse the list “in place.” If we do it this way, there is only one list, and at the end of the operation, the items of the list are in reversed order. We say that this operation is in place because the items never leave the list they started in.

We’ll use the second approach and reverse the list in place, since I haven’t yet told you how to make a new list of the correct size.

Now let’s think about how to reverse the list in place. What’s our target?

We want the first item to be in the last item’s spot, and vice versa. We want the second item to be in the current second-to-last item’s spot, and vice versa. So here is a first attempt at our algorithm specification, in English (or pseudo-code):

for every item in the list:
   swap the item with the corresponding item from the end of the list

Will it work? Let’s walk through it in our heads for a list of length 6. Start by swapping the first item with the sixth. Good. Swap the second item with the fifth. Good. Swap the third item with the fourth. Good. The next item is the fourth item. Houston, we have a problem. After we swapped the third item with the fourth, we actually were done—the list was reversed. If we then swap the fourth with the third, as our first attempt at an algorithm seems to specify, we put them back to where they were when we started. We have begun to undo our reversal of the list items. We need to modify our algorithm so that it stops after swapping only half of the items in the list.

for the first half of the items in the list:
   swap the item with the corresponding item from the second half of the list

There’s one more thing we should be nervous about. A list of length 6 has a first half and a second half, each of size 3. What if we had a list of length 7? What’s the first half? What’s the second? How many swaps should we do? Well, the item that is precisely in the middle of an odd-length list doesn’t get changed by reversing the list. So for a list of length 7, we need to do only 3 swaps. If the list length is odd, the number of swaps is the length of the list, divided by 2, and rounded down. If the list length is even, the number of swaps is the length of the list divided by 2. So our final algorithm design is:

for the first half (rounded down) of the items in the list:
   swap the item with the corresponding item from the second half of the list

Now let’s implement the reverse_list algorithm in Python code. We need to loop through the first half (rounded down) of the list. We can use a while-loop for that.

There are a few tricky things to notice. First, we can accomplish the rounding down of the index of the middle item of the list by just using integer division (while index < len(l) // 2:). In Python 2, we could have just used the / operator, but the // integer division operator makes it more specific and works for Python 3, too.

Also, because indices in a list start at 0, we have to be careful about computing the index into the second half of the list. It’s good to check an example. Suppose the list l has length 8. Then the index of the last item should be 7. So we should use len(l) - 1, and then subtract the current index to get the index into the second half of the list. Work it out: if the index is 0, then len(l) - 1 - 0 gives 7. If the index is 1, then len(l) - 1 - 1 gives 6. If the index is 2, then len(l) - 1 - 2 gives 5. And so on. So it works out.

The final thing to notice is how we swap items. We need a temporary variable to hold one of the values. Suppose you tried something like this:

l[index] = l[right_index]
l[right_index] = l[index]

The both items of the list would end up having the same value after the first line, and the original value of l[index] has been irrevocably lost.

If needing a temporary variable confuses you, think of it this way. Suppose that Nicole and Tom want to exchange the positions of their two cars in their two-car garage. Tom’s Chevy starts on the left side, and Nicole’s Honda starts on the right. Tom can’t just drive his Chevy out onto the right side. No, he has to put his Chevy in the driveway (the temporary storage), then put Nicole’s Honda in the left side, and finally move his Chevy to the right side.

Appending to an existing list

If you have an existing list and would like to add an item to the end, you can do that with the append function:

The syntax for append is probably not what you expected. You might have expected something like append(dwarfs, "Grumpy"), since append clearly needs two things: the variable referring to the list, dwarfs, and the data to append, "Grumpy".

What’s the period doing in dwarfs.append("Grumpy")? This syntax shows how we call a special kind of function called a method. Methods change things called objects. We’ll see objects in detail in an upcoming chapter, and this syntax for calling append indicates that lists are a kind of object. In this case, we are calling the method append on the list dwarfs, passing to the method the parameter "Grumpy". If this discussion makes no sense to you now, fear not: it will. In the meantime, just think that the way to append an item, say new_item, to a list, say my_list, is to call my_list.append(new_item).

Exercise: circle art

Objective: Append items to a list in response to user input.

Modify the following program so that it lets the user draw circles on the screen while dragging the mouse with the button pressed. Do not delete or in any other way prevent the clear() function from being called each time circle_art is called.

Here is a solution.

Inserting into and deleting from lists

There are a couple more list operations that you should know about.

You can insert an item into a list and delete an item from the list using the insert method and the del operation, as demonstrated in

To delete an item, you just write del followed by the name of the list and an index to the item you want to delete, so that the line del new_england[1] deletes the item at index 1. You can also delete a contiguous portion of a list, using colon-notation. In our example, del new_england[1:4] deletes the items starting at index 1 and going up to—but not including—index 4. In general, if you write del some_list[i:j], you delete the items from index i through index j − 1.

To insert into a list, you call the insert method. This method takes two parameters: the index of the item you want to insert before, and the item to insert. So when we call new_england.insert(3, "Newyorkachusetts"), we’re inserting before the item at index 3 ("Vermont"). If you give insert an index beyond the end of the list, as in new_england.insert(17, "New Brunswick"), then it’s just like an append operation.

for-loops

We often use a loop to work with items of a list when programming. This section introduces a new type of loop, called a for-loop, that is particularly well suited to dealing with items of a list. Anything you can accomplish with a for-loop you can also accomplish with a while-loop, so we are not getting any more power out of for-loops. But we’ll see that they are easier to read and to type. Alternate, briefer ways of expressing the same thing in code are called syntactic sugar, because they’re just a sweet extra.

Before we get to for-loops, let’s look at another example with a while-loop. A particularly common case is when you would like to use each item in a list, but do not want to actually change the list. Here is an example of simply printing out all of the items of a list:

You might notice that in the above loop, the variable index is never used on its own; it’s a variable that is introduced solely for the purpose of getting items. You never print out index, you don’t compute anything with it, and you really don’t care what its value is. You just want to get each item of the moons list and print it. You also don’t really care about len(moons). Yes, you need to use it to know when to terminate the while loop, but who cares what the actual value is? (I also introduced a variable moon to store the list item, and then worked with that temporary variable. There was no real reason not to just print out moons[index] directly, but this temporary variable will help clarify the explanation of for-loops.)

If you were describing the method for printing out the names of moons at a high level, you might say (in English) something like, “For every item of the moons list, print out that item.” You might not tell me about initializing index, about the mechanical detail of incrementing index each time through the loop, or about comparing index to len(moons). Humans don’t think like that. We can hide these details with a for-loop.

A for-loop iterates over items in a list, making each item available in sequence. Here is an example, in moons_for.py, that acts identically to the example with the while-loop.

With a for-loop, you don’t need to create an index variable, increment the index variable, or compare the index variable to the length of the list. Python does that stuff for you internally, behind the scenes. There are a couple of key things to notice.

  1. If you actually want the value of the index, you can’t determine it during the for-loop. You’ll need to use a while-loop.

  2. You cannot change the moons list during a for-loop. Each item is copied into the temporary variable moon. You can change moon if you like, but that won’t change the list. If you want to change the list, you’ll need to use a while-loop.

Because for-loops are simpler and easier to read, you should use a for-loop in your code wherever the goal is to refer to the items of a list, but without changing the list. If you want to change which items the list contains, you should use a while-loop.

Using range to create a list of int values

Some functions return lists. For example, the built-in Python function range gives ints in a specified range. (By built-in, I mean that you don’t even have to import it from anywhere.) For example, range(3, 10) gives the list

[3, 4, 5, 6, 7, 8, 9]

Notice that the last number in the list is 9, not 10.

range can save you some typing if you want to count over integers. Here’s an example.

If you only give the range function just one parameter, then the list will start at 0:

By calling len and range, you can use a for-loop to index into a list. Then you can change the values in the list, since you have an index. For example, here are two ways to double each number in a list:

and

Exercise: bubbles

Objective: Access and modify lists in a for loop using an index variable.

The following program adds x and y coordinates to lists to represent bubbles in water. Write a for-loop that draws all of the bubbles, and moves every bubble upwards one pixel after it is drawn. Notice that since the bubbles are originally placed below the bottom of the screen, you will not see anything at first; your program might need to run for a while. For debugging, you could increase the frequency of bubbles and place them higher up on the screen.

Here is a solution.

You can treat a string like a list (sort of)

You can get a character from a string almost as though the string were a list:

You can even loop over characters in a string,

You might think you could change a character in a string this way too. It won’t work. mystring[3] = 'r' will give an error. Python strings are immutable; their values cannot be changed.

If you wanted to do something like changing the third letter to r, you would have to build an entirely new string, using the old string and the new letter. We’ll see how to do that later in the course.

How it works: The list in memory

Consider when we first assign to a variable, such as x = 5. Python knows that x is an integer, and Python knows that four bytes of memory are typically enough to store that integer. Python allocates those four bytes, and it copies a binary code corresponding to the value 5 into those bytes. We say that the value 5 is stored in the variable.

Something different happens when you create a list. Space in memory is allocated for the list somewhere other than at the variable’s address. The variable just stores the address of the list. Let’s look at an example:

mice = ["Minnie", "Mickey", "Mighty"]

Here is what happens. First, Python sees a list literal on the right-hand side of the equals sign. Python finds memory space for the list data ["Minnie", "Mickey", "Mighty"] in an area of memory called the heap. This new list now has an address. Let’s say the address of that list data is 1000.

Now Python looks at the left-hand side of the assignment operator. Python has to create a new variable, mice. Python copies the value 1000—the address of the list—into the variable mice. Here’s how you should think of it:

I drew the arrow to indicate that the variable mice is really just telling us that the list itself is somewhere else, in this case at address 1000.

Important note. It would be incorrect to say that the variable mice contains the list. In our earlier example, x = 5, the variable x really did contain the value 5 (or at least its binary representation), not an address or anything else. For a list, however, the variable naming the list—mice in this example—contains only the address of the list data, not the list data itself.

We cannot store a list in a variable, because the list could have any size at all. But just like integer values, all addresses have the same size, so we know how much space in a variable to use for an address.

Aliasing and the assignment operator

Because the address of a list is stored in a variable, rather than actually storing the list data itself in the variable, an important property follows: any variable that stores that address can be used to change the list. Look at this example:

This program prints

['Dartmouth', 'Columbia', 'Yale', 'Stanford', 'Penn', 'Brown', 'Princeton', 'Harvard']

Notice that ivies[3] is no longer 'Cornell'; it has become 'Stanford', even though we made the change to expensive_schools[3], not to ivies[3].

What happened here?

The assignment operator always copies the value of a variable. But in this case, the value of the variable ivies is some address (perhaps 2000), not the list itself. The list is not copied by the assignment operator in expensive_schools = ivies. Only the address is copied. There are not two different lists in the heap; there’s still just one list, but both ivies and expensive_schools hold its address:

Now, when the item at index 3 in the list whose address is held in expensive_schools is changed to "Stanford", the one and only list at address 2000 changes:

I wrote the item at index 3 in italics just to highlight it. When we print the list using the ivies variable (which contains the address of the same list), we see that the list has changed.

You can think of it like this. Let’s say you have a savings account number. You can think of that number as the address used to refer to a pile of money in some bank vault somewhere. Now if the bank assigns me the same address (savings account number), then I can use that number to withdraw half of the money. When you go to look at the account, half of the money is gone.

When two names refer to the same thing, those names are called aliases of each other. Dr. Dre and Andre Ramelle Young both refer to the same person. Aliasing can be a source of tricky-to-find bugs. (You might view it as a bug if I withdrew half of your money from your savings account, and you didn’t even know that we had the same account number.)

For integers, floats, or booleans, the assignment operator makes a new copy of the data, not a copy of the address, so for these primitive types, changing the value of one variable never has an effect on the value of another variable. For strings, the assignment operator copies the address, but because strings are immutable, you’ll never notice that aliasing is occurring.

Why the difference between how primitive types and lists are treated? Lists can be long, and copying a long list can take a long time, relatively speaking. Copying the address of the list might be enough. We’ll see later that we can still tell Python that we really do want to copy the list itself, and not just the address, using a special function list.

Passing lists to functions (and aliasing)

If you pass the address of a list to a function, then the function can change the data at that address. The function does not change the value of the parameter which is passed to it, which is just some address of a list. But the data contained within the list itself is changed.

Exercise: function for reversing a list

Objective: Take an implementation of a list algorithm, and wrap it in a function that takes the list as a parameter.

Reversing a list seems like a useful thing to do; we already wrote the code to do it. Aliasing lets us move that code into a function. Write a function reverse_list that takes one parameter, the address of a list to reverse, and reverses that list in place. Test your function by calling reverse_list on a test list of numbers.

Here is a solution.

Copying a list with list

Sometimes, you really do want a copy of a list. You can use the function list to make a copy of a list. The list function copies all the data in the list into a new area in the heap, and returns the address of the new data.

Sorting a list with sorted

Python contains a useful built-in function, sorted, which makes a sorted copy of the original list, and returns that list. The sorted function does not change the original list. Look at this example:

Run it, and look at the output.

How does the sorted function work? Over the years, computer scientists have developed dozens of different algorithms for sorting. We’ll see a little later a simple sorting algorithm called selection sort.

Good programming practice: Lists should contain only data of similar types

It is possible to store different types of items in a single list. However, you should avoid doing so most of the time.

weird_list = ["test", "2", "5", 1, 3]

The problem is that if you want to apply some operation to all items of the list, you can’t be sure that it makes sense to do so. Imagine you tried to sort weird_list using the sorted function. What would the result be? It turns out Python will do it, but I just can’t remember or predict the behavior. It’s terrible programming practice to rely on surprising behavior that you cannot predict!