We often observe a bug for the first time when we run the program and notice unexpected behavior. Catching and fixing bugs requires you to act like a detective: something that should have been a normal situation has gone wrong. Why? Debugging is as much an art as a science, but here is a procedure that can help.
Ask yourself what you expected. What actually happened? What’s the difference between what you expected and what you got? Be as precise as possible. Not, “the circle is in the wrong place,” but “I expected a circle to be drawn at the location 200, 100, but it seems to have been drawn at 100, 200.” By asking the question precisely, you can sometimes solve the problem immediately. “Aha! Perhaps I swapped the x- and y-coordinates somewhere.”
Find the earliest point in the code where something unexpected happens. Bugs often have a cascading effect: once one bug happens, strange behavior can follow. So you want to find the first place where something went haywire. Based on what you expected to happen and what actually happened, come up with a few possible places where the earliest problem could have occurred.
Step through the code mentally. Start at the first executable line and work your way through, writing down the values of variables as you go. Be absolutely certain about the type of the value that you intend to be stored in each variable, and make sure that this is the type actually stored.
Insert print-statements into your code to tell you the order in which functions and statements execute and the values of variables. As you scan the console output, you’ll see what your program is actually doing. You might then insert additional print-statements to get a better idea of how the code behaves, and you might remove some of the print-statements that you previously inserted in order to reduce clutter in the console.
Use the debugging tool in PyCharm or some other code editor to step through the code. Watch variable values closely, and watch where the program counter moves to. The minute something happens that you didn’t expect, figure out precisely what occurred.
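As a minimal sketch of the print-statement step above, here is one way to print both the value and the type of a variable. The helper name describe and the variables radius and x are made up for this illustration; they are not part of any library or of the programs discussed here.

```python
def describe(name, value):
    """Return a one-line debugging message with a variable's value and type."""
    return name + " = " + repr(value) + " (" + type(value).__name__ + ")"


# Hypothetical values: suppose we intended radius to be an int.
radius = "20"    # suspicious: this is actually a string
x = 400 // 2     # integer division, so x stays an int

# Temporary debugging output; remove these lines once the bug is found.
print(describe("radius", radius))   # radius = '20' (str)
print(describe("x", x))             # x = 200 (int)
```

Using repr rather than str makes a string such as '20' visibly different from the integer 20 in the console output, which is exactly the kind of type mix-up that is easy to miss.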
Here are a few brief notes on how to use the debugging tool in PyCharm.
You have a couple of choices for how to run the PyCharm debugging tool. From the Run menu, you can choose Debug… and then choose the program you want to debug from the popup menu. Alternatively, if you right-click on the name of your program in the Project pane, the popup menu gives you the choice of Debug your_program, where your_program is the name of your program.
When you start debugging, the Console pane gets another tab, labeled Debugger. In that pane, you’ll see information about the program. There’s a sub-pane labeled Frames and one labeled Variables. The Frames sub-pane lists the functions currently running, with the most recently called function at the top. The Variables sub-pane shows the variables defined in a function. If you click on the name of a function in the Frames sub-pane, you’ll see the variables defined in that function.
Once you’ve run the debugger on your program, you can avoid going through the menus to run the debugger on it again. Instead, you can click on the little bug icon on the left of your Console or Debugger pane.
Python code runs too fast for you to follow it. The debugging tool lets you stop your code at a particular place, and then run one line of code at a time, while you watch the variable values and the program counter. In the source code window, you need to specify where to initially stop, using a breakpoint: a place in your program where the debugger will stop once the program counter gets there, giving you control at that time. If you single-click in the main source code window just to the left of a line of code, you set a breakpoint at that line. You’ll see a light pink dot on each line with a breakpoint. PyCharm lets you set a breakpoint only on a line where there is actual code, not a blank line or a comment. You need to set at least one breakpoint before you start a debugging run; otherwise, the program will run just as if you were not debugging.
When the program counter gets to a line with a breakpoint, your program will pause. The line of code with the breakpoint is highlighted, in blue on my Mac (the color might differ on your computer). This highlighted line is where the program counter currently is; it has not been executed yet. The Variables sub-pane will show you the names and values of available variables.
If you right-click on a variable name in the Variables sub-pane, one of the options you get is to set the value of the variable. That way, if the variable has the wrong value and you want to see how the program would behave if it had the correct value, you don’t need to start over. Of course, you’ll want to edit your program at some point so that the variable gets the correct value without you intervening in the debugger.
The buttons at the top of the Debugger pane allow you finer control over the program. From left to right:
Show Execution Point: Clicking this button will take you directly to the currently executing line, regardless of which function you’ve clicked in the Frames sub-pane, and it will also show you in the Variables sub-pane the variables in the function where the currently executing line is located.
Step Over: Clicking this button causes execution to run until the next line in the current function or file, skipping any function called at the current program counter. If the current line is the last one in a function, execution moves to the line executed right after this function returns.
Step Into: Clicking this button causes the debugger to step into the function called at the current program counter.
Force Step Into: Clicking this button causes the debugger to step into the function called at the current program counter even if this function is to be skipped. (Confession: This description comes from the PyCharm website. I don’t really know what it means.)
Step Out: Clicking this button causes the debugger to finish executing the current function, stopping at the line executed right after this function returns.
Run to Cursor: Clicking this button resumes program execution and pauses when the program counter reaches the line at the current cursor location in the editor. No breakpoint is required. Actually, a temporary breakpoint is set at the line containing the caret, and it is removed once your program pauses. Thus, if the caret is on a line that has already been executed, the program simply resumes, because there is no way to roll back to previous breakpoints. This action is especially useful when you have stepped deep into function calls and need to step out of several functions at once.
If breakpoints are set on lines that execute before the specified line, the debugger will pause at the first breakpoint it encounters.
The PyCharm website recommends that you use this button when you need a kind of temporary breakpoint at a specific line, where the program execution should not be interrupted. You probably won’t use this button much at all.
Evaluate Expression: Clicking this button opens a window in which you can type an expression, including the currently active variables, and see the result.
If you cannot remember which button is which, let the cursor hover over the buttons until you see the tool tips.
Using the debugging tool is really good practice to see exactly how your program behaves as it is executed, and you should get into the habit of running the debugger on your code frequently.
Debugging a graphical program is a little trickier. There are things happening behind the scenes to draw the window. Try to set up your PyCharm window so that you can see the graphics window as the program is executing. (You can use the resume button, which looks like a “play” button, to continue running the program until it reaches another breakpoint.) Also notice that the window might not look right because it is not being redrawn as you step through the program. You can insert request_redraw calls to get the window to update.
You always have the alternative of inserting print statements in your code to tell you what the code is doing and what the variable values are.
Here’s some buggy code, from buggy.py:
# Buggy program to print out the prime numbers between 2 and 100
# CS 1 debugging example by Devin Balkcom, April 4 2015

possible_prime = 2

print("Prime numbers between 2 and 100: ")

while possible_prime < 100:
    possible_factor = 1 # bug 2

    # start by assuming the number is prime.
    prime = True

    # then use a loop to test whether that is correct by checking possible integer factors
    while possible_factor <= possible_prime: # bug 4
        # If the number is evenly divisible by the possible factor, then it's not a prime
        if possible_prime / possible_factor == 0: # bug 3
            prime = False
            # Don't need to check any more factors.
            # Terminate the loop early:
            break
        possible_factor += 1

    if prime:
        print(str(possible_prime) + " ")

possible_prime += 1 # bug #1
The idea is to loop over integers from 2 to 100. For each integer, we loop over possible factors. If we find a factor other than 1 and the number itself, the number is not prime, and we can move on to the next number.
On the other hand, if we check for possible factors and don’t find any, then the number is prime. Print it.
When we first run the code, we find that it prints 2 over and over in an infinite loop, which is not what we expected. We step through with the debugger, watching variable values, and we find that possible_prime is not getting updated, even though we expected it to be. In fact, we notice that the debugger never reaches that line. A close look reveals that the line possible_prime += 1 is actually outside the while loop. It’s never executed. Dead code! So, indent it.
Then we run the code again, with hope in our hearts! And it prints all of the numbers 2 to 99, and we are sad. 2 prime, ok. 3 prime, ok. But we expected 4 not to be prime, and it was reported as prime. So we run the debugger, stepping through until we get to the situation where possible_prime = 4 and possible_factor = 1. And then we step carefully. Wait! Of course 4 is evenly divisible by 1! We should start possible_factor at 2.
With this clue in hand, we look very carefully at the condition of the if statement inside the loop: possible_prime / possible_factor == 0. Since 4 / 2 is 2, the condition is False. But we expected this check to test whether possible_factor is an integer factor, and to report True if so. Our test reported False. Oh. Evenly divisible means no remainder, and it’s the modulus operator that computes the remainder. We wanted possible_prime % possible_factor == 0. Fix it.
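To see the difference between the two operators concretely, here is a small sketch using the values from the debugging session. It assumes Python 3, where / produces a float and // produces the integer quotient; the names quotient and remainder are introduced only for this illustration.

```python
possible_prime = 4
possible_factor = 2

# Division answers "how many times does the factor fit?"
quotient = possible_prime // possible_factor    # 2
# Modulus answers "what is left over?"
remainder = possible_prime % possible_factor    # 0

print(possible_prime / possible_factor)   # 2.0 in Python 3
print(quotient == 0)                      # False: the buggy test
print(remainder == 0)                     # True: the test we wanted
```

A remainder of 0 is what “evenly divisible” means, so only the modulus version flags 2 as a factor of 4.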
Onwards and upwards! We run the code, and… it reports that there are no prime numbers between 2 and 100. Stand up. Breathe deeply. It’s ok.
What did we expect? 2 should be prime. We run the debugger, and notice that the body of the if statement is getting executed even when possible_prime has the value 2. Odd. The possible_factor is 1 when this first happens. Oh. 2 % 1 is indeed 0, so we think that 1 is an integer factor. In fact, it is. But we forgot that even primes have 1 as an integer factor. So we should start checking with possible_factor = 2. We fix it.
We run the code, and… it reports that there are no prime numbers between 2 and 100. Run the debugger. Go to the case where possible_prime has the value 2, because 2 really, really, is prime. We find that the body of the if statement is getting executed when possible_factor has the value 2. Of course! Primes are evenly divisible by themselves! Let’s make the loop stop when possible_factor is one less than possible_prime.
We make the last fix, and it works!
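Putting the four fixes together, the corrected program might look like the sketch below. It is written for Python 3. The primes_found list is an addition made here so that the result can be checked; it is not part of the original buggy.py.

```python
# Corrected sketch of the prime-printing program (all four bugs fixed).
possible_prime = 2
primes_found = []   # added for illustration only, to make the result checkable

print("Prime numbers between 2 and 100: ")

while possible_prime < 100:
    possible_factor = 2                              # fix for bug 2: skip the factor 1

    # Start by assuming the number is prime.
    prime = True

    # Stop before possible_factor reaches possible_prime itself.
    while possible_factor < possible_prime:          # fix for bug 4
        if possible_prime % possible_factor == 0:    # fix for bug 3: remainder test
            prime = False
            break
        possible_factor += 1

    if prime:
        print(str(possible_prime) + " ")
        primes_found.append(possible_prime)

    possible_prime += 1                              # fix for bug 1: inside the outer loop
```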
This was a contrived example. Frequently, you’ll only have one immediate bug to track down at a time. You can reduce the number of bugs you have to catch by doing two things:
Factoring can also help. We could write a small function to test if a number has any factors other than 1 or itself, and test (and if necessary, debug) that in isolation. Then it would be easy to write the loop to print out primes.
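Here is one way that factoring might look, as a sketch rather than the definitive version. The function name has_other_factor is made up for this example. Because the function is small, we can test it on a few known cases before using it in the printing loop.

```python
def has_other_factor(n):
    """Return True if n has an integer factor other than 1 and itself."""
    for possible_factor in range(2, n):
        if n % possible_factor == 0:
            return True
    return False


# Test the function in isolation on cases where we know the answer.
assert not has_other_factor(2)
assert not has_other_factor(3)
assert has_other_factor(4)
assert has_other_factor(91)    # 91 = 7 * 13

# With a tested function in hand, the main loop is short.
print("Prime numbers between 2 and 100: ")
for possible_prime in range(2, 100):
    if not has_other_factor(possible_prime):
        print(possible_prime)
```

If one of the assertions fails, we know the bug is inside the small function, not somewhere in the loop that uses it.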