Lesson 5 Conditionals and Controls
Welcome to lesson 5. In this lesson, you will be introduced to comparison operators, logical operators, conditional statements, iteration, and looping.
Follow along with this tutorial by creating a new notebook in JupyterHub named lesson05.ipynb and entering the code snippets presented.
Outline
- Comparison operators
- Logical operators
- Conditional statements
- Control structures
5.1 Comparison operators
Early on in lesson 1, we learned about different types of mathematical operators such as +, -, *, and /. Aside from these, there are operators used to perform comparative and logical operations. We also learned about the Boolean data type. Boolean values are either True or False. Similarly, comparison and logical operators return Boolean values of True or False.
Let’s start with comparison operations as presented in the table below:
Operation | Operator | Example Input | Answer |
---|---|---|---|
Less than | < | 4 < 10 | True |
Less than or equal to | <= | 4 <= 4 | True |
Greater than | > | 11 > 12 | False |
Greater than or equal to | >= | 4 >= 4 | True |
Equal to | == | 3 == 2 | False |
Not equal to | != | 3 != 2 | True |
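The rows of the table can be checked in a single notebook cell. The sketch below evaluates each example input and stores the answer; the dictionary name results is only an illustrative choice and is not part of the lesson's data.

```python
# Evaluate each example from the table and store the Boolean answer
results = {
    "4 < 10": 4 < 10,
    "4 <= 4": 4 <= 4,
    "11 > 12": 11 > 12,
    "4 >= 4": 4 >= 4,
    "3 == 2": 3 == 2,
    "3 != 2": 3 != 2,
}

for expression, answer in results.items():
    # Every comparison returns a value of type bool
    print(expression, "->", answer, type(answer).__name__)
```

Each printed line should match the Answer column of the table.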
Less than
For numeric comparisons, the less than operator < evaluates to see if the first value is less than the second value. If it is, the value of True is returned. If it is not, the value of False is returned.
What will be returned from the following statement?
33 < 90
If you answered True, you are correct. If you didn’t, run the code above and you’ll see that the correct answer is returned.
Less than or equal to
For numeric comparisons, the less than or equal to operator <= evaluates to see if the first value is less than or equal to the second value. If it is, the value of True is returned. If it is not, the value of False is returned. When using <=, do not put a space between the < and the = (e.g. < =). This will produce a SyntaxError: invalid syntax.
33 <= 33
True
Greater than
For numeric comparisons, the greater than operator > evaluates to see if the first value is greater than the second value. If it is, the value of True is returned. If it is not, the value of False is returned.
33 > 90
False
Greater than or equal to
For numeric comparisons, the greater than or equal to operator >= evaluates to see if the first value is greater than or equal to the second value. If it is, the value of True is returned. If it is not, the value of False is returned. When using >=, do not put a space between the > and the = (e.g. > =). This will produce a SyntaxError: invalid syntax.
33 >= 33
True
Equal to
For numeric comparisons, the equality operator == evaluates to see if the first value is equal to the second value. If it is, the value of True is returned. If it is not, the value of False is returned. When using ==, do not put a space between the first = and the second = (e.g. = =). This will produce a SyntaxError: invalid syntax.
33 == 33
True
Note that the equal sign and the double equal sign have VERY different meanings in Python. The = denotes assignment, as in x = 4, which means that the value 4 is assigned to the variable x. The == denotes equality and compares two values to see if they are equal.
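The difference between the two signs can be seen side by side. This is a minimal sketch; the variable names is_four and is_five are illustrative and do not appear elsewhere in the lesson.

```python
# A single equal sign assigns: x now refers to the value 4
x = 4

# A double equal sign compares and returns a Boolean
is_four = (x == 4)
is_five = (x == 5)

print(x)        # 4
print(is_four)  # True
print(is_five)  # False
```

The comparison does not change x. Only the assignment on the first line does.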
Not equal to
For numeric comparisons, the inequality operator != evaluates to see if the first value is not equal to the second value. If the two values differ, the value of True is returned. If they are equal, the value of False is returned. When using !=, do not put a space between the ! and the = (e.g. ! =). This will produce a SyntaxError: invalid syntax.
33 != 33
False
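The warnings about spaces inside two-character operators can be checked without crashing the notebook. The sketch below passes each badly spaced expression to Python's built-in compile() function as text and records whether a SyntaxError is raised. It relies on try/except and a for loop, which this lesson has not covered yet, and the names spaced_expressions and outcomes are illustrative.

```python
# Operators written with a space between their two characters
spaced_expressions = ["33 < = 33", "33 > = 33", "33 = = 33", "33 ! = 33"]

outcomes = {}
for source in spaced_expressions:
    try:
        # compile() parses the text without running it
        compile(source, "<example>", "eval")
        outcomes[source] = "ok"
    except SyntaxError:
        outcomes[source] = "SyntaxError"

print(outcomes)
```

Removing the inner space from any of the four strings lets it compile normally.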
5.2 Logical operators
In addition to the six comparison operators (>, >=, <, <=, ==, and !=), there are three logical operators. These are explained in the table below.
Operation | Operator | Example Input | Answer |
---|---|---|---|
Or | or | (3==3) or (4==7) | True |
And | and | (3==3) and (4==7) | False |
Not | not | not(3==3) | False |
Or
The or operator returns True if at least one of the statements is true. For example, we can evaluate the result of the two statements in parentheses as shown below. If at least one of them is true, then the value of True is returned.
(5**2 == 25) or (5!=5)
True
And
In contrast, if we altered the statements above and replaced the or with and, the result would be False. This is because both statements need to be true when the and operator is used.
(5**2 == 25) and (5!=5)
False
Not
The not operator reverses the result of the statement. It returns False if the result is true and True if the result is false.
not(5==5)
False
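A short truth table can make the behaviour of the three operators easier to remember. The sketch below prints every combination of two Boolean values and then re-evaluates the lesson's own examples. The variable names or_result, and_result, and not_result are illustrative.

```python
# All combinations of two Boolean values
for a in (True, False):
    for b in (True, False):
        print(a, b, "| or:", a or b, "| and:", a and b)

# not reverses a single Boolean value
print("not True  ->", not True)
print("not False ->", not False)

# The examples from this section, stored so they can be inspected
or_result = (5**2 == 25) or (5 != 5)
and_result = (5**2 == 25) and (5 != 5)
not_result = not (5 == 5)
print(or_result, and_result, not_result)
```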
5.3 Conditional statements
Conditionals let us make decisions by asking whether something is True or not. Often, though, one condition is not enough: we may want to take the opposite of a result, or make a decision based on more than one parameter. Then, based on the result, we may want one action to happen rather than another.
This can be done with comparison and logical operators.
if statements
Here’s an example using a conditional if statement that asks whether 1 < 2 and 4 > 2:
if 1 < 2 and 4 > 2:
    print("The first condition met")
The first condition met
The structure of an if statement is very particular. The first line of the if statement is left-aligned and communicates the condition. The condition must be one that evaluates to a True or False value, and it must be followed by a colon. If the condition is True, the indented lines below the if statement are executed.
See the prototype below:
if CONDITION:
    print("The first condition met")
Include only one statement on each line. Each statement in the body must be indented; otherwise, an IndentationError will be raised.
For example, the code below will return this error:
IndentationError: expected an indented block
if 1 < 2 and 4 > 2:
print("The first condition met")
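The error can be observed safely by handing the code to the built-in compile() function as text, so that the notebook cell itself does not fail. This is a sketch only: the helper name check is hypothetical, and it uses def and try/except, which this lesson does not cover.

```python
# The same if statement, without and with an indented body
unindented = 'if 1 < 2 and 4 > 2:\nprint("The first condition met")\n'
indented = 'if 1 < 2 and 4 > 2:\n    print("The first condition met")\n'

def check(source):
    """Return the name of the indentation error raised while parsing, or 'ok'."""
    try:
        compile(source, "<example>", "exec")
        return "ok"
    except IndentationError as err:
        return type(err).__name__

print(check(unindented))  # IndentationError
print(check(indented))    # ok
```

The only difference between the two strings is the four spaces in front of print.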
Multiple if statements
You can ask multiple questions in a single code chunk. Each if statement is evaluated separately, independently of the others. This means that whether or not the first condition is met, the second condition is still evaluated, and so on.
if 1 < 2 and 4 > 2:
    print("The first condition met")
The first condition met
if 1 > 2 and 4 < 10:
    print("condition not met")
if 4 < 10 or 1 < 2:
    print("condition met")
condition met
if / elif statements
If you want to predicate the second condition on the first, and the third on the second, use the if and elif statements instead of just if. Only the block under the first condition that is True is executed; the remaining elif and else branches are skipped.
Examine the code below. Take note of how the colon is used at the end of the if statement and each elif statement.
Also, note the indentation of the statement under each elif. The indentation rules are important to follow.
if 1 < 2 and 4 > 2:
    print("The first condition met") #printed if condition is met
elif 1 > 2 and 4 < 10:
    print("The second condition met") #printed if condition is met
elif 4 < 10 or 1 < 2:
    print("The third condition met") #printed if condition is met
else:
    print("No conditions were met.")
The first condition met
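To compare the two styles directly, the sketch below runs the same three conditions once as independent if statements and once as a single if / elif / else chain, recording which branches ran. The list names independent and chained are illustrative.

```python
# Three independent if statements: every condition is checked
independent = []
if 1 < 2 and 4 > 2:
    independent.append("first")
if 1 > 2 and 4 < 10:
    independent.append("second")
if 4 < 10 or 1 < 2:
    independent.append("third")

# One if / elif / else chain: checking stops at the first True condition
chained = []
if 1 < 2 and 4 > 2:
    chained.append("first")
elif 1 > 2 and 4 < 10:
    chained.append("second")
elif 4 < 10 or 1 < 2:
    chained.append("third")
else:
    chained.append("none")

print(independent)  # ['first', 'third']
print(chained)      # ['first']
```

The third condition is true in both versions, but in the chain it is never reached because the first branch has already run.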
5.4 Conditional selections of DataFrames
Conditional statements are useful, as demonstrated in the examples above. They are also very useful when working with data sets in the form of DataFrames.
We’ve gone over how to select columns and rows, but what if we want to make a conditional selection?
For example, what if we want to filter our mba DataFrame to show only the schools ranked in the top 10, that is, with a rank less than or equal to 10?
To do that, we take a column from the DataFrame and apply a Boolean condition to it. Here’s an example of a Boolean condition:
#import our data
import pandas as pd
mydata = pd.read_csv("mba.csv", header=0)
mydata
Rank School ... Total Tuition ($) Duration (Months)
0 1.0 Chicago (Booth) ... 106800 21
1 2.0 Dartmouth (Tuck) ... 106980 21
2 3.0 Virginia (Darden) ... 107800 21
3 4.0 Harvard ... 107000 18
4 5.0 Columbia ... 111736 20
5 6.0 California At Berkeley (Haas) ... 106792 21
6 7.0 MIT (Sloan) ... 116400 22
7 8.0 Stanford ... 114600 21
8 9.0 IESE ... 95610 19
9 10.0 IMD ... 67416 11
10 11.0 New York (Stern) ... 96640 20
11 12.0 London ... 92144 15
12 13.0 Pennsylvania (Wharton) ... 107852 21
13 14.0 HEC Paris ... 66802 16
14 15.0 Cornell (Johnson) ... 107592 21
15 16.0 York (Schulich) ... 61800 8
16 17.0 Carnegie Mellon (Tepper) ... 108272 21
17 18.0 ESADE ... 81693 12
18 19.0 INSEAD ... 80719 10
19 20.0 Northwestern (Kellogg) ... 113100 22
20 21.0 Emory (Goizueta) ... 87200 22
21 22.0 IE ... 82389 13
22 23.0 UCLA (Anderson) ... 105160 21
23 24.0 Michigan (Ross) ... 105500 20
24 25.0 Bath ... 36057 12
25 NaN Average ... 94962 18
[26 rows x 11 columns]
#Boolean condition
condition01 = mydata['Rank'] <= 10
condition01
0 True
1 True
2 True
3 True
4 True
5 True
6 True
7 True
8 True
9 True
10 False
11 False
12 False
13 False
14 False
15 False
16 False
17 False
18 False
19 False
20 False
21 False
22 False
23 False
24 False
25 False
Name: Rank, dtype: bool
A Series of Boolean values is returned. Those with the value of True meet the condition. You can see that a Series is returned by using the type() function.
type(condition01)
<class 'pandas.core.series.Series'>
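For readers without the mba.csv file at hand, the same behaviour can be reproduced on a tiny DataFrame. The sketch below copies four Rank and School values from the table above, including the unranked Average row, and assumes pandas is installed. The names sample and in_top_ten are illustrative.

```python
import pandas as pd

# A few rows copied from the table above, including the NaN-ranked "Average" row
sample = pd.DataFrame({
    "Rank": [1.0, 10.0, 11.0, float("nan")],
    "School": ["Chicago (Booth)", "IMD", "New York (Stern)", "Average"],
})

# Comparing a column with a number compares every row at once
in_top_ten = sample["Rank"] <= 10

print(in_top_ten.tolist())        # [True, True, False, False]
print(type(in_top_ten).__name__)  # Series
print(int(in_top_ten.sum()))      # number of rows that meet the condition
```

A missing rank (NaN) compares as False, which matches row 25 in the output above.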
Let’s write another condition to see which schools have an average starting salary of greater than or equal to $125,000.
#Boolean condition
condition02 = mydata['AvgSalary'] >=125000
condition02
0 False
1 False
2 False
3 False
4 False
5 False
6 False
7 True
8 True
9 True
10 False
11 False
12 False
13 False
14 False
15 False
16 False
17 False
18 False
19 False
20 False
21 False
22 False
23 False
24 True
25 False
Name: AvgSalary, dtype: bool
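These data are useful, but what if you wanted to return the data for just those schools that met your conditions?
This is where we would place the Boolean condition inside square brackets after the DataFrame name to return the result set.
Let’s try doing this with condition02. In this first attempt, the whole expression is also wrapped in an extra pair of square brackets.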
#Boolean condition that returns the result set
condition02 = [mydata[mydata['AvgSalary'] >=125000]]
condition02
[ Rank School ... Total Tuition ($) Duration (Months)
7 8.0 Stanford ... 114600 21
8 9.0 IESE ... 95610 19
9 10.0 IMD ... 67416 11
24 25.0 Bath ... 36057 12
[4 rows x 11 columns]]
This returns a list object, because the outer square brackets wrap the DataFrame in a list. You can see this by using the type() function.
type(condition02)
<class 'list'>
To return a DataFrame object, remove the outer square brackets.
#Boolean condition that returns the result set
condition02 = mydata[mydata['AvgSalary'] >=125000]
condition02
Rank School ... Total Tuition ($) Duration (Months)
7 8.0 Stanford ... 114600 21
8 9.0 IESE ... 95610 19
9 10.0 IMD ... 67416 11
24 25.0 Bath ... 36057 12
[4 rows x 11 columns]
type(condition02)
<class 'pandas.core.frame.DataFrame'>
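It can help to split the filter into two steps: build the Boolean Series first, then use it to select rows. The sketch below does this on four tuition values copied from the output above and assumes pandas is installed. The names sample, mask, filtered, and wrapped are illustrative.

```python
import pandas as pd

# Rows copied from the output above (school and tuition columns only)
sample = pd.DataFrame({
    "School": ["Stanford", "IESE", "IMD", "Bath"],
    "Total Tuition ($)": [114600, 95610, 67416, 36057],
})

# Step 1: build the Boolean Series
mask = sample["Total Tuition ($)"] <= 100000

# Step 2: use it inside square brackets to keep only the True rows
filtered = sample[mask]

# Wrapping the same expression in extra brackets only puts the DataFrame in a list
wrapped = [sample[mask]]

print(filtered)
print(type(filtered).__name__, type(wrapped).__name__)
```

Writing sample[sample["Total Tuition ($)"] <= 100000] on one line is the same operation with the intermediate variable left out.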
We can ask more complex questions of our data using the logical operators | for “or” and & for “and”.
Let’s filter the DataFrame to show only those schools where the average salary is greater than or equal to $125,000 and the tuition is less than or equal to $100,000.
#Boolean condition that returns the result set
condition03 = mydata[(mydata['AvgSalary'] >=125000) & (mydata['Total Tuition ($)'] <= 100000)]
condition03 #type DataFrame
Rank School ... Total Tuition ($) Duration (Months)
8 9.0 IESE ... 95610 19
9 10.0 IMD ... 67416 11
24 25.0 Bath ... 36057 12
[3 rows x 11 columns]
Take note of the use of square brackets and parentheses. We need to group each comparison in parentheses so Python knows how to evaluate the conditional. Without them, & would be applied before the comparisons, because it has higher precedence than >= and <=.
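The lesson's example uses & only, so here is a sketch that shows both operators on a small DataFrame. The tuition and duration values are copied from the output above, and pandas is assumed to be installed. The names sample, cheap, short, both, either, and and_failed are illustrative.

```python
import pandas as pd

sample = pd.DataFrame({
    "School": ["Stanford", "IESE", "IMD", "Bath"],
    "Total Tuition ($)": [114600, 95610, 67416, 36057],
    "Duration (Months)": [21, 19, 11, 12],
})

cheap = sample["Total Tuition ($)"] <= 100000
short = sample["Duration (Months)"] <= 12

# & keeps rows where both conditions are True
both = sample[cheap & short]

# | keeps rows where at least one condition is True
either = sample[(sample["Total Tuition ($)"] > 100000) | (sample["Duration (Months)"] <= 12)]

# The plain keyword "and" cannot combine two Series and raises a ValueError
try:
    cheap and short
    and_failed = False
except ValueError:
    and_failed = True

print(both["School"].tolist())    # ['IMD', 'Bath']
print(either["School"].tolist())  # ['Stanford', 'IMD', 'Bath']
print(and_failed)                 # True
```

Storing each condition in its own variable, as with cheap and short, is one way to avoid forgetting the parentheses.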
The pandas isin() method
The pandas isin() method is used to filter DataFrames. Using the isin() method, we return a subset of the data based on whether a value exists in a list of values. In the example below, we check whether the Country field contains US or France.
#Boolean condition that returns the result set
condition04 = mydata[mydata['Country'].isin(['US', 'France'])]
condition04 #type DataFrame
Rank School ... Total Tuition ($) Duration (Months)
0 1.0 Chicago (Booth) ... 106800 21
1 2.0 Dartmouth (Tuck) ... 106980 21
2 3.0 Virginia (Darden) ... 107800 21
3 4.0 Harvard ... 107000 18
4 5.0 Columbia ... 111736 20
5 6.0 California At Berkeley (Haas) ... 106792 21
6 7.0 MIT (Sloan) ... 116400 22
7 8.0 Stanford ... 114600 21
10 11.0 New York (Stern) ... 96640 20
12 13.0 Pennsylvania (Wharton) ... 107852 21
13 14.0 HEC Paris ... 66802 16
14 15.0 Cornell (Johnson) ... 107592 21
16 17.0 Carnegie Mellon (Tepper) ... 108272 21
19 20.0 Northwestern (Kellogg) ... 113100 22
20 21.0 Emory (Goizueta) ... 87200 22
22 23.0 UCLA (Anderson) ... 105160 21
23 24.0 Michigan (Ross) ... 105500 20
[17 rows x 11 columns]
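The Country column is hidden in the truncated output above, so the sketch below builds a small stand-in DataFrame. The Country values are filled in for illustration only and are not taken from mba.csv; pandas is assumed to be installed. It also shows that isin() gives the same result as joining two == comparisons with |.

```python
import pandas as pd

sample = pd.DataFrame({
    "School": ["Stanford", "HEC Paris", "IMD", "London"],
    "Country": ["US", "France", "Switzerland", "UK"],  # illustrative values
})

# isin() returns a Boolean Series: True where the value is in the list
in_us_or_france = sample["Country"].isin(["US", "France"])

# The same condition written with == and |
same_mask = (sample["Country"] == "US") | (sample["Country"] == "France")

print(sample[in_us_or_france])
print(in_us_or_france.equals(same_mask))
```

With longer lists of accepted values, isin() stays short while the == version grows by one comparison per value.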
5.5 Control structures
The while loop
The while statement allows you to repeatedly execute a block of statements as long as a test expression (condition) is true. A while statement can have an optional else clause.
We generally use this loop when we don’t know beforehand the number of times to iterate.
# Example to illustrate
# the use of else statement
# with the while loop
counter = 0
while counter < 3:
    print("Inside loop")
    counter = counter + 1
else:
    print("Inside else")
Inside loop
Inside loop
Inside loop
Inside else
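The counter example always runs three times. A case where the number of iterations is not known in advance might look like the sketch below, which counts how many years a balance needs to double. The starting amount and the 7% growth rate are made-up values chosen only for illustration.

```python
# Hypothetical question: how many years until 1000 doubles at 7% growth per year?
balance = 1000.0
years = 0

while balance < 2000.0:
    balance = balance * 1.07  # grow by 7% each year
    years = years + 1
else:
    # Runs once, after the condition becomes False
    print("Doubled after", years, "years")
```

Changing the growth rate changes the number of iterations, which is why a while loop suits this better than a fixed count.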
The for loop
The for...in statement is another looping statement. It iterates over a sequence (list, tuple, string) or other iterable object, going through each item in turn. Iterating over a sequence is called traversal. We will see more about sequences in detail in later chapters. What you need to know right now is that a sequence is just an ordered collection of items.
Let’s look at two examples:
Example 1
for i in range(1, 5):
    print(i)
else:
    print('The for loop is over')
1
2
3
4
The for loop is over
Example 2
# Program to find the sum of all numbers stored in a list
# List of numbers
numbers = [6, 5, 3, 8, 4, 2, 5, 4, 11]
# variable to store the running total (named total so it does not shadow the built-in sum)
total = 0
# iterate over the list
for val in numbers:
    total = total + val
# Output: The sum is 48
print("The sum is", total)
The sum is 48
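The examples above loop over a range and a list. The sketch below shows the same traversal on the other two sequence types mentioned earlier, a string and a tuple. The word and the school names are arbitrary inputs, and the variable names vowels and longest are illustrative.

```python
# A string is a sequence of characters
vowels = 0
for letter in "conditional":
    if letter in "aeiou":
        vowels = vowels + 1

# A tuple is traversed the same way as a list
longest = ""
for school in ("Harvard", "Stanford", "Columbia"):
    # Keep the first name that is strictly longer than the current one
    if len(school) > len(longest):
        longest = school

print(vowels)   # 5
print(longest)  # Stanford
```

Both loops combine a for statement with an if statement, which is a common pattern for counting or searching.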
Summary
- Conditional statements may use comparison and logical operators to evaluate the truth of an expression.
- We can ask more complex questions of our data using the logical operators | for “or” and & for “and”.
- The pandas isin() method is used to filter DataFrames.
- for and while are looping statements that iterate over a sequence of objects or a block of code, respectively.
5.6 Quiz
1. Evaluate the following statement:
8 == 8
- TRUE
- True
- true
- FALSE
- Not False
2. Evaluate the following statement:
8 =! 8
- False
- FALSE
- false
- SyntaxError: invalid syntax
- True
3. Evaluate the following statement: (32*2==64) and (8**2==64)
- False
- True
- TRUE
- FALSE
- SyntaxError: invalid syntax
4. Which statement will produce a syntax error, given that x=4?
- x>=4 and x<5
- (x>=4) and (x<5)
- ((x>=4) and (x<5))
- ((x>=4) and (x<!5))
Feedback: Choice d produces a syntax error because <! is not a valid operator.
5. What type of object will be returned from this statement:
condition = [mydata[mydata['AvgSalary'] >=125000]]
- List
- Series
- DataFrame
- Tuple
- Array
6. What type of object will be returned from this statement:
condition = mydata[mydata['AvgSalary'] >=125000]
- List
- Series
- DataFrame
- Tuple
- Array
7. What type of object will be returned from this statement:
condition = mydata['AvgSalary'] >=125000
- List
- Series
- DataFrame
- Tuple
- Array
8. The for statement is an example of a control structure
- TRUE
- FALSE
9. Which is not a comparison operator?
- ==
- !=
- =
- >
- <
10. What’s wrong with the following code?
counter = 0
while counter < 3:
    print("Inside loop")
    counter = counter
else:
    print("Inside else")
- Nothing
- the counter does not decrement causing an infinite loop
- the counter does not increment causing an infinite loop
- The code produces a syntax error
- the else should be indented
5.7 Exercises
5.7.1 Exercise 5.1
Filter the mba DataFrame to show only those schools where the average salary is less than $100,000 and the tuition is greater than or equal to $100,000.