Variables and expressions

This chapter discusses how to store data, and how to compute values from data. The topics are variables, memory, data types, expressions, and operators.

Variables

It’s useful to be able to store values in the computer’s memory for later use. A variable is a name that refers to a location in the computer’s memory where a piece of information can be stored. A simple example:

The statement meaning_of_life = 42 is called a variable assignment and it causes Python to do three things:

  1. Python reserves a space in memory to hold the integer value 42.
  2. Python gives the name meaning_of_life to that location.
  3. Python copies a pattern of 0s and 1s corresponding to the value 42 into that location in memory.

Later, when the value of the variable is needed (in the print( meaning_of_life ) statement) the central processing unit of the computer (sometimes called the CPU) determines the location in memory with which the name meaning_of_life is associated, goes to that location, and fetches the value it finds there, making that value available to Python. Because this value is used in a print statement, Python prints the value 42.
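The two statements discussed above can be put together as a short program. This is a sketch reconstructed from the statements quoted in the text, not necessarily the exact original listing:

```python
# Assign the integer value 42 to the name meaning_of_life.
meaning_of_life = 42

# Look up the value stored under that name and print it.
print( meaning_of_life )
```

Running it should print 42.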

In Python, each variable must be assigned a value before that variable can be used.

Variable names vs. strings

Strings have quotation marks around them so that it is clear that the string is to be considered a string value, and not a variable name. What does the following code print? Is there an error in the code?
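Here is a small sketch of the difference. The variable name greeting is made up for illustration; this is not the code from the original question:

```python
greeting = "hello"

# With quotes: a string value. This prints the word greeting.
print( "greeting" )

# Without quotes: a variable name. This prints the stored value, hello.
print( greeting )
```

If you removed the first line, the last line would fail, because Python would look for a variable named greeting and not find one.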

Using variables to change code behavior

We’ll see soon how to use variables to compute and store values. But one important use of variables is to change the behavior of code. First, we write the code in terms of some variables. Changing the values of those variables then changes the behavior of the code. The example below shows how this works. First, run the code. Then change the value of x to 60 and y to 120 and run the code again. How does the behavior of the program change?

Exercise: big smile

Objective: Use variables to allow behavior of code to be changed easily.

The code below will draw a smiley face on the screen, and you can use the variables x and y to change where the smiley face is drawn. But the smiley face is always the same size. Add a new variable, scale, that allows you to change the size of the smiley face to make the face either larger or smaller. For example, if scale had the value 2, then the code would draw the smiley face twice as large (but still centered on x and y).

Test your code by changing variable values a few times to draw the smiley face at different locations at both small and large scales.

Choosing good variable names

Variable names are sequences of letters, digits, and underscores. The first character cannot be a digit, and variable names cannot contain spaces. Python is case-sensitive: uppercase and lowercase letters are considered to be different characters in variable names, so that the names Earth and earth refer to different variables. You should choose variable names that are descriptive of the value that will be stored.

By convention, letters appearing in variable names in Python are all lowercase. If you want to make up a variable name from multiple words, use the underscore character _ to replace spaces, as in the variable name meaning_of_life. (Some programmers favor “camel case”: meaningOfLife. This is not the time to strike a bold blow for independence – you should follow Python conventions in your Python code, which includes using underscores to separate words in variable names.)

Variable assignment with ‘=’

The equals sign used to assign a value to a variable is not the same as the equals sign you see in mathematics.

In mathematics, the equals sign is used to write down facts: the expression or variable on the left-hand side of the equation is now, and always will be, equal to the expression or variable on the right-hand side.

In mathematics, x = 5 is a fine equation. So is x = 6. But if I gave you both equations, you’d say I screwed up, because then x would equal both 5 and 6, and it just can’t. But in Python, the following works just fine:

The = operator in Python does not mean “mathematically equals.” In Python, the assignment operator, written with an equals sign, does several things:

  1. If the variable on the left of the equals sign does not exist, space in memory is reserved for the variable.
  2. The expression on the right-hand side of the equals sign is evaluated.
  3. The computed value on the right-hand side is assigned to (copied into) the space in memory reserved for the variable.

So the first line of code copies the value 5 into the variable x. The second line of code copies the value 6 into the variable x. When the program is finished, x has the value 6.
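The two lines of code being described are, in all likelihood, the pair of assignments below. The print statement at the end is my addition so that you can check the final value:

```python
x = 5    # copy the value 5 into x
x = 6    # copy the value 6 into x, replacing the 5

print( x )
```

This should print 6, the most recently assigned value.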

Here are a few statements that wouldn’t work in mathematics, but do work in Python:

In mathematics, x = x + 1 would mean something like “x is the number that is one greater than itself”. That sounds like something that Captain Kirk used to make a computer explode in Star Trek. There is no such number.

In Python, there’s no problem. Evaluate the expression on the right-hand side. Fine—because x has the value 5, we know that x + 1 evaluates to 6. Assign that value to the variable on the left-hand side. Fine—now the variable x has the value 6. The computer does not explode.
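As a sketch of the steps just described (the starting value 5 comes from the text; the print statement is added so you can see the result):

```python
x = 5

# Evaluate the right-hand side first (5 + 1 gives 6),
# then assign the result to the variable on the left.
x = x + 1

print( x )
```

This should print 6.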

When you see or write an equals sign in Python, do not think “mathematically equals.” Say in your mind, “assignment operator.” Compute the value on the right-hand side. Put the computed value into the variable on the left-hand side.

Will the code 5 = x work in Python? No. The left-hand side operand of the assignment operator must be a variable name. 5 is not a variable name.

How memory works

Memory in a computer is basically a long sequence of 0s and 1s, or binary digits: bits. You can think of a very long row of on/off switches. Since each switch value doesn’t give much information (only a 0 or a 1), it’s useful to refer to groups of bits. A group of eight sequential bits is called a byte.

On a Mac or PC, each byte has its own address, a number associated with it that we can use to refer to that byte. The byte at address 0 is the first byte in memory; the byte at address 1 is the second, and so forth. (Computer scientists often start counting at 0. Believe it or not, we often find it easier to start counting at 0 than at 1.)

Wait – how do we refer to a location in memory? Do we use variable names or addresses? The answer is both. You have a Hinman box number. I can refer to your mailbox directly by number (address), or I can use your name to refer to the box, assuming that I have a directory that tells me what box number is associated with your name. Python acts as the directory that keeps track of the correspondence between variable names and memory addresses where the contents of the variables can be found.

Integer types

I just said that memory is a string of 0s and 1s. How in the world are we going to represent the number 42 by a string of 0s and 1s?

Python uses a special code, relying on how to represent numbers in base 2, or binary. Let’s not worry too much about how to represent numbers in binary, but I’ll just tell you that 42 in base 2 is 101010. Here are various integers in binary:

Integer   Binary
6         110
18        10010
42        101010
90        1011010
999       1111100111
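If you would like to check entries in this table yourself, Python's built-in format and int functions can convert in both directions. This is just a sketch; the variable name binary_of_42 is made up for illustration:

```python
# Convert an integer to a string of binary digits.
binary_of_42 = format(42, "b")
print( binary_of_42 )         # prints 101010
print( format(999, "b") )     # prints 1111100111

# Convert a string of binary digits back to an integer (base 2).
print( int("101010", 2) )     # prints 42
```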

The first thing to notice about these binary representations is that their lengths differ. The integer 6 needs only three bits, but the integer 999 needs ten bits. To be safe, Python allocates a fixed number of bytes of space in memory for each variable of a normal integer type, which is known as int in Python. Typically, an integer occupies four bytes, or 32 bits. Integers whose binary representations require fewer than 32 bits are padded to the left with 0s.

Let’s say you had only one byte of memory. How many different patterns of 0s and 1s can represent integers in eight bits? Let’s count them:

00000000
00000001
00000010
00000011
00000100
00000101
...
11111011
11111100
11111101
11111110
11111111

It looks like there are 2^8 = 256 different patterns. I could use my one byte to represent 256 unique integer numbers, because each integer would need its own bit pattern. If I had two bytes, I could represent 2^16 = 65,536 different integer numbers.

With four bytes (the usual amount of memory allocated to each int variable), we could store 2^32 different integer numbers. If the leftmost bit is a 1, the number is construed as negative. If the leftmost bit is a 0, then the number is construed as either 0 (if all the bits are 0) or positive (if the leftmost bit is 0 but there’s at least one 1 somewhere). So we expect half of the int values to be negative, one of them to be 0, and the rest to be positive. Therefore, we expect the largest positive int to be 2^31 − 1, or 2,147,483,647, and the most negative integer to be −2^31, or -2,147,483,648.
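You can let Python do this arithmetic. In the sketch below, ** is Python's exponentiation operator, and the variable names are made up for illustration:

```python
patterns_in_one_byte = 2 ** 8       # 256
patterns_in_two_bytes = 2 ** 16     # 65536
patterns_in_four_bytes = 2 ** 32    # 4294967296

# Half of the four-byte patterns are negative, one is 0, the rest are positive.
largest_int = 2 ** 31 - 1           # 2147483647
most_negative_int = -(2 ** 31)      # -2147483648

print( patterns_in_one_byte )
print( patterns_in_two_bytes )
print( largest_int )
print( most_negative_int )
```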

Note: When you are typing in large integers, whether it’s as part of your program, as console input, or as input anywhere else, do not include commas. Do not type 100,000; instead, type 100000. I included the commas above so that you could easily see that two bytes gives us a little more than 65,000 different numbers and four bytes gives us integer values with magnitudes of 2 billion and change.

The long int type

If you want to store a number larger than 2,147,483,647, Python can do it, using the long int data type. Rather than allocating a fixed four bytes of memory for a long int variable, Python decides how many bytes to assign based on the actual size of the number. Larger integers will require more memory, since the shortest representations (with the fewest bits) are assigned first to numbers closer to 0. In addition to the memory cost, computations with long ints are much slower than computations with ints.
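A quick sketch shows that integers past the four-byte limit still work. Python 2.7 switches to a long int automatically when it needs to; Python 3 has a single int type that grows as needed, so the code behaves the same way there. The variable names are made up for illustration:

```python
# One more than the largest four-byte int.
big = 2147483647 + 1
print( big )

# A much larger value, computed exactly.
huge = 2 ** 100
print( huge )
```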

In Python, every datum has a type. But we don’t have to say what the type is; Python can figure it out for itself.

The floating-point type

The floating-point type is like scientific notation, e.g., 6.02 × 10^23. Since you can’t type a superscript in plain text for Python, if an exponent on the 10 is needed, you would write it like this:

Notice that we omit the 10, but it’s understood; 6.02e23 is not 6.02^23 (6.02 raised to the power 23), but instead is 6.02 × 10^23.
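A few examples of this notation follow, as a sketch with made-up variable names. The number after the e is the power of 10, and it can be negative:

```python
avogadro = 6.02e23            # 6.02 times 10 to the power 23
fifteen_hundred = 1.5e3       # 1.5 times 10 to the power 3
tiny = 2.5e-3                 # 2.5 times 10 to the power -3

print( avogadro )
print( fifteen_hundred )      # prints 1500.0
print( tiny )
```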

Floating-point numbers are stored with three parts: a sign, a mantissa (the significant digits), and an exponent.

The floating-point type (or “float type” for short) is used for numbers that have fractional parts or are too large to store in a long int that takes up a reasonable amount of memory. Typically, eight bytes are used for the Python floating type. Notice that this means that there are only 2^64 different floating-point numbers that can be represented. This might seem like a lot of numbers, but remember that the real number line has infinitely many numbers. Floating-point numbers, therefore, allow only limited precision.

Floating-point numbers get less precise the further you get from 0

The difference between 0.01 and 0.02 is, relatively, a lot (100%), but the relative difference between (6.02e23 + 0.01) and (6.02e23 + 0.02) is not a lot, at least compared with the size of the numbers involved.

For this reason, the 2^64 floating-point numbers are not evenly distributed on the real number line. More of them are allocated near 0 than near numbers with larger magnitudes. When you type in a number with a decimal point, or create one through some mathematical operation, the computer finds a floating-point number close to the correct number. Small fractions are likely to be lost in this rounding process for real numbers that are very large, since the nearest floating-point number that the computer can represent may be relatively quite far away.
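You can watch a small fraction get lost. In this sketch (variable names made up for illustration), adding 0.01 to a very large float gives back exactly the same float, while the same addition near 0 makes a visible difference:

```python
big = 6.02e23
small = 0.01

# The nearest representable float to big + 0.01 is big itself.
lost_in_rounding = ( big + 0.01 == big )
print( lost_in_rounding )      # prints True

# Near 0, the same 0.01 is not lost.
kept_near_zero = ( small + 0.01 == small )
print( kept_near_zero )        # prints False
```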

Expressions and operators

You can use Python to compute. For example, a program can print the value of the expression 18 + 24.

The plus-sign is called an operator. An operator takes one or more operands, computes a result, and makes that result available to Python for further use.

In this example, the operand on the left is 18, and the operand on the right is 24. An expression is an operator and its operands; we say that the expression can be evaluated to give a single value. In our example, the expression is 18 + 24; when evaluated, this expression’s value is 42.
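The program under discussion is probably a single line like the first one in this sketch. The second half, which stores the value in a made-up variable named answer, is my addition:

```python
# Evaluate the expression 18 + 24 and print its value.
print( 18 + 24 )       # prints 42

# The value of an expression can also be stored in a variable.
answer = 18 + 24
print( answer )        # prints 42
```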

An expression can be an operand to an operator:

Here, the expression (3 * 6) is the left operand to the operator +.

The character * denotes multiplication in Python, to avoid confusion with the letter x. If Python needs the value of an expression, Python computes that value. In this example, the value of the expression (3 * 6) is needed before the addition can be done, and so the value 18 is computed first, by the multiplication operator *. That value, 18, can then be used as an operand to the addition operator.
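Here is a sketch of an expression used as an operand. The right operand 24 and the variable name total are assumptions chosen for illustration; only (3 * 6) comes from the text:

```python
# (3 * 6) is evaluated first, giving 18; then 18 + 24 is evaluated.
total = (3 * 6) + 24
print( total )         # prints 42
```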

Arithmetic operators such as +, -, *, and / (division) follow the same order of operations as you are used to from mathematics. * and / have higher precedence than + and -, meaning that they are evaluated first. Operators with the same precedence are evaluated left to right. Parentheses make the order of operations explicit. When in doubt, use parentheses to make your code as easy to read as possible.

For example:
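The expressions below are illustrations of my own choosing, not the original listing. They show precedence, parentheses, and left-to-right evaluation in turn:

```python
# * has higher precedence than +, so 3 * 4 is evaluated first.
times_first = 2 + 3 * 4
print( times_first )       # prints 14

# Parentheses force the addition to happen first.
plus_first = (2 + 3) * 4
print( plus_first )        # prints 20

# Same precedence: evaluated left to right, as (10 - 4) - 3.
left_to_right = 10 - 4 - 3
print( left_to_right )     # prints 3
```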

Integer division

There are two types of division, and each is sometimes useful. Integer division takes two integers and evaluates to an integer. Floating-point division takes two floating-point numbers (numbers with a decimal point) and evaluates to a floating-point number.

Remember word problems like this? At the Lake Morey Skate-athon, I skate around an oval ice track that is four miles long. I carry a card, and for every full lap I skate, I get my card stamped. If I start at 12:00 noon and skate at 10 miles per hour, how many stamps will I have at 3:00 pm? At 3:00 pm, I will have skated 3 × 10 miles, or 30 miles. Dividing 4 into 30 gives 7 laps, and hence 7 stamps, with half a lap (2/4 of a lap) left over.

In Python version 2, the program:

print( (3 * 10) / 4 )

gives the result ‘7’: integer division. In Python 3, the operator / indicates floating-point division, and the same code would print the value 7.5. If you want integer division in Python 3, you must use the operator ‘//’.

Using integer division when you want floating-point division is a very common mistake in Python 2. The best solution is to start your code with the statement:

from __future__ import division
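If you would rather not depend on which version of Python is running, you can ask for each kind of division explicitly. This sketch uses the skating numbers from above; the variable names are made up for illustration. The // operator gives integer division in both Python 2.7 and Python 3, and making one operand a float gives floating-point division in both:

```python
miles_skated = 3 * 10

# Integer division: how many full laps of 4 miles?
full_laps = miles_skated // 4
print( full_laps )         # prints 7

# Floating-point division: one operand is a float.
laps_exact = miles_skated / 4.0
print( laps_exact )        # prints 7.5
```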

Operators defined on integer and floating-point types: +, -, *, /, %

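The operators defined on integer and floating-point types are + (addition), - (subtraction), * (multiplication), / (division), and % (mod, the remainder after division).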

The mod operator in Python is unusual in that it can take floating-point operands. (Most other programming languages that support a mod operator insist on only integer operands.) For example, 8.0 % 2.5 evaluates to 0.5, because 2.5 goes into 8 three times, with 0.5 left over.
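A short sketch of the mod operator with both kinds of operands. The integer example and the variable names are my own; the floating-point example is the one from the text:

```python
# 5 goes into 17 three times, with 2 left over.
int_remainder = 17 % 5
print( int_remainder )       # prints 2

# 2.5 goes into 8.0 three times, with 0.5 left over.
float_remainder = 8.0 % 2.5
print( float_remainder )     # prints 0.5
```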

The string type

There are many different types of data to store. Integers are one type; letters of the alphabet are another. The “string” type of data represents one or more letters of the alphabet, symbols (such as @ or ~), or digits.

When you type a string value into Python, it must be surrounded by quotes, so that the string does not look like the name of a variable.

We call "Z" a string literal, since it should be interpreted by Python literally as the character “Z” and not as some variable name or anything else. (In x = 42, the number 42 is an integer literal.)

The quotes say that “hello” is a string, and not a variable name. More examples:

The quotes just identify the data as a string; the quotes aren’t part of the string.

Python is unusual in that you can use either single quotes or double quotes around the string, as long as you use the same kind of quotes before and after a given string. So the following lines do the same thing:

But print( "hello' ) would be an error.
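A sketch of the two equivalent forms follows. The comparison at the end is my addition, to show that Python treats the two literals as the same string:

```python
# Double quotes and single quotes both mark a string literal.
print( "hello" )
print( 'hello' )

# The quotes are not part of the string, so the two values are equal.
same_string = ( "hello" == 'hello' )
print( same_string )       # prints True
```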

String concatenation

The plus sign (“+”) behaves differently depending on the types of the data that are on either side of it. If there are ints on either side, the plus sign is the integer-addition operator, and it adds the two ints to get another int. If there are floating-point numbers on each side, they are added to get another floating-point number.

What if there are strings on each side? Then the plus sign is the string-concatenation operator. Concatenation means to “combine two strings together”.
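Here is a sketch of the plus sign doing both jobs. The strings and variable names are made up for illustration:

```python
first = "note"
second = "book"

# Strings on each side: + concatenates.
combined = first + second
print( combined )          # prints notebook

# Ints on each side: + adds.
print( 18 + 24 )           # prints 42

# These are strings, not ints, so + concatenates the digits.
digits = "18" + "24"
print( digits )            # prints 1824
```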

Internal representation of strings

Given that the memory of a computer stores only sequences of 0s and 1s, then how does Python store a string in memory? Python uses a code to convert a string into a binary representation of the string when the string is stored. When the string is retrieved, Python uses the code in reverse to convert back from binary into the string.

Which code does Python use? Python version 2.7 (the version we use for this class) uses a popular code called the American Standard Code for Information Interchange, or ASCII. Each character (string of length 1) uses eight bits, or one byte. For example, the ASCII code for the character A is 01000001, and the ASCII code for a is 01100001.
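You can look up these codes from within Python. In this sketch, the built-in function ord gives a character's code as an integer, chr goes the other way, and format shows the eight bits. The variable names are made up for illustration:

```python
code_for_A = ord("A")
print( code_for_A )                    # prints 65

# Show the code as eight binary digits, padded on the left with 0s.
bits_for_A = format(code_for_A, "08b")
print( bits_for_A )                    # prints 01000001

bits_for_a = format(ord("a"), "08b")
print( bits_for_a )                    # prints 01100001

# Convert a code back into its character.
print( chr(65) )                       # prints A
```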

If you’d like to see a full table of ASCII character codes, many are available online. You rarely need to know a character’s ASCII code.

Converting between types

There are a few special functions that convert between types of data: int, float, str are useful for the types of data we’ve seen so far.

Sometimes conversions are performed automatically by Python behind the scenes, as happens when one of the operands for division is a floating-point number and the other is an int. For example, if you try to add a float and an int, the int is converted to a float before addition:
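The sketch below shows both kinds of conversion: the explicit functions int, float, and str, and the automatic conversion of an int to a float. The particular values and variable names are made up for illustration:

```python
# Explicit conversions.
from_string = int("42") + 1        # string to int, then add
print( from_string )               # prints 43

as_float = float("3.5")            # string to float
print( as_float )                  # prints 3.5

as_string = str(42) + "!"          # int to string, then concatenate
print( as_string )                 # prints 42!

# Automatic conversion: the int 3 becomes 3.0 before the addition.
mixed = 3 + 0.5
print( mixed )                     # prints 3.5
```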

The boolean type

There’s one other type of data, called a boolean. There are only two possible values: True and False. Notice that these are capitalized.

Coding style

We’ve already discussed comments, which are one tool to make reading and understanding programs easier. Another way to improve understandability is to use meaningful names for variables. For instance, consider the following code:

What does it do? We could tell by running it, or we could add comments. Or we could use more descriptive names than x, y, and z.

Without using comments, we’ve improved the understandability of the code considerably. This is not to say that comments should be jettisoned completely in favor of meaningful names! Rather, the two strategies work together.

Next, consider this code, with meaningful names:

It’s easy to see what it does because of the names we’ve chosen. But it could be better, especially the constant floating-point values we’ve stuck in there. Presumably, we’re confident in our ability to calculate 4.0 / 3.0 * pi (notice that I have to make sure that at least one of 4 and 3 is a float so that I don’t get integer division), but if we make a mistake, it’s going to be very difficult to track down.

Here, we’ve replaced the number 3.14 with a variable pi and written out 4.0 / 3.0 instead of precomputing its value.
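A sketch of what such code might look like follows. It assumes the calculation is the volume of a sphere, which the constants 4.0 / 3.0 and pi suggest; the radius value and the variable names are made up for illustration:

```python
pi = 3.14
radius = 2.0

# Volume of a sphere: four-thirds times pi times the radius cubed.
# Writing 4.0 / 3.0 (with floats) avoids integer division in Python 2.
volume = 4.0 / 3.0 * pi * radius * radius * radius

print( volume )
```

Because each constant is written out, a mistake in any one of them is easy to spot.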

Some constants are so widely used that Python defines them for us. For instance:

An added benefit of using Python’s pi is that the Python designers have gone to the trouble to calculate a much more precise approximation of π than we did.
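Python's value of pi lives in the math module. Here is a sketch of how to use it; the sphere calculation, radius value, and variable names are assumptions carried over for illustration:

```python
from math import pi

# Python's pi has many more digits than 3.14.
print( pi )

radius = 2.0
volume = 4.0 / 3.0 * pi * radius * radius * radius
print( volume )
```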

Bonus coverage: The function type

A function is itself a data type in Python. You can think of the name of the function as a variable that contains the address of the function’s lines of code.

What this means is that other variables can also store a reference to the function. Here is an example:

We have already seen another example. We wrote the draw_house function, and then gave that function to the start_graphics function. The start_graphics function was then able to use draw_house when it needed it. Slick! The type of the data contained in draw_house is a “reference to a function.”
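Here is a small sketch of both ideas: storing a reference to a function in another variable, and giving a function to another function. The functions say_hello and call_twice are made up for illustration; they are not part of the class library:

```python
def say_hello():
    return "hello"

# No parentheses: this copies the reference, it does not call the function.
greet = say_hello

# Now either name can be used to call the same function.
print( greet() )                   # prints hello

# A function can receive another function as a parameter and call it.
def call_twice(function):
    return function() + " " + function()

print( call_twice(say_hello) )     # prints hello hello
```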