Lesson 5 Conditionals and Controls

Welcome to lesson 5. In this lesson, you will be introduced to comparison operators, logical operators, conditional statements, and looping (iteration).

Follow along with this tutorial by creating a new notebook in JupyterHub named lesson05.ipynb and entering the code snippets presented.

Outline

  • Comparison operators
  • Logical operators
  • Conditional statements
  • Control structures

5.1 Comparison operators

Early on in lesson 1, we learned about different types of mathematical operators such as +, -, *, and /. Aside from these operators, there are others used to perform comparative and logical operations. We also learned about the Boolean data type. Boolean values are either True or False. Comparison and logical operators return Boolean values of True or False.

Let’s start with comparison operations as presented in the table below:

Operation                 Operator  Example Input  Answer
Less than                 <         4 < 10         True
Less than or equal to     <=        4 <= 4         True
Greater than              >         11 > 12        False
Greater than or equal to  >=        4 >= 4         True
Equal to                  ==        3 == 2         False
Not equal to              !=        3 != 2         True

Less than

For numeric comparisons, the less than operator < evaluates to see if the first value is less than the second value. If it is, the value of True is returned. If it is not, the value of False is returned.

What will be returned from the following statement?

33 < 90

If you answered True, you are correct. If you didn’t, run the code above and you’ll see that True is returned.

Less than or equal to

For numeric comparisons, the less than or equal to operator <= evaluates to see if the first value is less than or equal to the second value. If it is, the value of True is returned. If it is not, the value of False is returned. When using <= do not put a space between the < and the = (e.g. < =). This will produce a SyntaxError: invalid syntax.

33 <= 33
True

Greater than

For numeric comparisons, the greater than operator > evaluates to see if the first value is greater than the second value. If it is, the value of True is returned. If it is not, the value of False is returned.

33 > 90
False

Greater than or equal to

For numeric comparisons, the greater than or equal to operator >= evaluates to see if the first value is greater than or equal to the second value. If it is, the value of True is returned. If it is not, the value of False is returned. When using >= do not put a space between the > and the = (e.g. > =). This will produce a SyntaxError: invalid syntax.

33 >= 33
True

Equal to

For numeric comparisons, the equality operator == evaluates to see if the first value is equal to the second value. If it is, the value of True is returned. If it is not, the value of False is returned. When using == do not put a space between the first = and the second = (e.g. = =). This will produce a SyntaxError: invalid syntax.

33 == 33
True

Note that the equal sign and the double equal sign have VERY different meanings in Python. The = denotes assignment, as in x = 4, which means the value 4 is assigned to the variable x. The == denotes equality and compares two values to see if they are equal.
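To make the difference between = and == concrete, here is a minimal sketch. The variable names is_four and is_five are made up for this illustration.

```python
# Assignment: the single equal sign stores the value 4 in the variable x.
x = 4

# Comparison: the double equal sign asks a question and returns a Boolean.
is_four = (x == 4)
is_five = (x == 5)

print(is_four)   # True
print(is_five)   # False
print(x)         # 4, because comparing x does not change it
```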

Not equal to

For numeric comparisons, the inequality operator != evaluates to see if the first value is not equal to the second value. If the two values differ, the value of True is returned. If they are equal, the value of False is returned. When using != do not put a space between the ! and the = (e.g. ! =). This will produce a SyntaxError: invalid syntax.

33 != 33
False
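Comparisons are not limited to literal numbers. As a short sketch, the result of a comparison between variables can itself be stored in a variable. The names and values below (tuition, budget) are hypothetical and chosen only for illustration.

```python
# Hypothetical values chosen only for illustration.
tuition = 95610
budget = 100000

within_budget = tuition <= budget   # 95610 is less than 100000
over_budget = tuition > budget

print(within_budget)         # True
print(over_budget)           # False
print(type(within_budget))   # <class 'bool'>
```

The last line confirms that a comparison returns a value of the Boolean data type.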

5.2 Logical operators

In addition to the six comparison operators (>, >=, <, <=, ==, and !=) there are three logical operators. These are explained in the table below.

Operation  Operator  Example Input      Answer
Or         or        (3==3) or (4==7)   True
And        and       (3==3) and (4==7)  False
Not        not       not(3==3)          False

Or

The or operator returns True if at least one of the statements is true. For example, we can evaluate the two statements in parentheses as shown below. If at least one of them is true, then the value of True is returned.

(5**2 == 25) or (5!=5)  
True

And

In contrast, if we alter the statements above and replace the or with and, the result is False. This is because both statements need to be true for the and operator to return True.

(5**2 == 25) and (5!=5)  
False

Not

The not operator reverses the result of a statement. It returns False if the result is true and True if the result is false.

not(5==5) 
False
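The three logical operators can also combine comparisons that were stored in variables. The following is a small sketch; the names rank and salary and their values are assumptions made for this example.

```python
# Hypothetical values chosen only for illustration.
rank = 9
salary = 130000

top_ten = rank <= 10             # True
high_salary = salary >= 125000   # True

print(top_ten and high_salary)      # True: both conditions hold
print(top_ten or salary > 200000)   # True: at least one condition holds
print(not top_ten)                  # False: not reverses True
```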

5.3 Conditional statements

Comparison and logical operators let us ask whether something is True or False. Often, though, one condition is not enough. We may want the opposite of a result, or we may want to base a decision on more than one condition. Then, depending on the result, we may want one action to happen rather than another.

This can be done by using comparison and logical operators inside conditional statements.

if statements

Here’s an example using a conditional if statement that asks whether 1 < 2 and 4 > 2:


if 1 < 2 and 4 > 2:
    print("The first condition met")
The first condition met

The structure of an if statement is very particular. The first line of the if statement is left aligned and states the condition. The condition must be one that returns a True or False value. The condition must be followed by a colon. If the condition is True, the indented lines below the if statement are executed.

See the prototype below:

if CONDITION :
    print("The first condition met")

Include only one statement on each line. Each statement that belongs to the if block must be indented by the same amount. Otherwise, an IndentationError will be raised.

For example, the code below will return this error:

IndentationError: expected an indented block

if 1 < 2 and 4 > 2:
print("The first condition met")
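If you would like to see the error without stopping your notebook cell, one option is to keep the code in a string and hand it to Python's built-in compile() function. This is only a sketch: the names bad_source, good_source, and error_name are made up for this example, and the exact wording of the message can differ between Python versions.

```python
# The source code is kept in a string so the error can be caught
# instead of stopping the notebook cell.
bad_source = 'if 1 < 2 and 4 > 2:\nprint("The first condition met")\n'

try:
    compile(bad_source, "<example>", "exec")
    error_name = None
except IndentationError as err:
    error_name = type(err).__name__
    print(error_name + ":", err.msg)

# Indenting the print line by four spaces fixes the problem.
good_source = 'if 1 < 2 and 4 > 2:\n    print("The first condition met")\n'
exec(compile(good_source, "<example>", "exec"))
```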

Multiple if statements

You can ask multiple questions in a single code chunk. Each if statement is evaluated separately, independently of the others. This means that whether or not the first condition is met, the second condition is still evaluated, and so on.


if 1 < 2 and 4 > 2:
    print("The first condition met")
The first condition met
if 1 > 2 and 4 < 10:
    print("condition not met")

if 4 < 10 or 1 < 2:
    print("condition met")
condition met

if / elif statements

If you want the second condition to be checked only when the first is not met, and the third only when the second is not met, use if and elif statements instead of separate if statements. As soon as one condition is True, its block runs and the remaining elif and else branches are skipped.

Examine the code below. Take note of how the colon is used at the end of the if statement and each elif statement.

Also, note the indentation of the statement under the if, each elif, and the else. The indentation rules are important to follow.


if 1 < 2 and 4 > 2:
    print("The first condition met") #printed if condition is met
elif 1 > 2 and 4 < 10:
    print("The second condition met") #printed if condition is met
elif 4 < 10 or 1 < 2:
    print("The third condition met") #printed if condition is met
else: 
    print("No conditions were met.")
    
The first condition met
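The difference between separate if statements and an if / elif chain can be sketched with a single value that satisfies more than one condition. The variable names and labels below are hypothetical.

```python
rank = 5  # hypothetical value that satisfies both conditions below

# Separate if statements: every condition is checked.
labels = []
if rank <= 10:
    labels.append("top 10")
if rank <= 20:
    labels.append("top 20")
print(labels)   # ['top 10', 'top 20']

# if / elif / else: checking stops at the first condition that is True.
if rank <= 10:
    label = "top 10"
elif rank <= 20:
    label = "top 20"
else:
    label = "outside the top 20"
print(label)    # top 10
```

With separate if statements both blocks run, while the if / elif chain runs only the first matching block.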

5.4 Conditional selections of DataFrames

Conditional statements are useful as demonstrated in the examples above. Also, they are very useful when working with data sets in the form of DataFrames.

We’ve gone over how to select columns and rows, but what if we want to make a conditional selection?

For example, what if we want to filter our mba DataFrame to show only schools ranked in the top 10, that is, with a Rank less than or equal to 10?

To do that, we take a column from the DataFrame and apply a Boolean condition to it. Here’s an example of a Boolean condition:

#import our data
import pandas as pd
mydata = pd.read_csv("mba.csv", header=0) 
mydata
    Rank                         School  ... Total Tuition ($)  Duration (Months)
0    1.0                Chicago (Booth)  ...            106800                 21
1    2.0               Dartmouth (Tuck)  ...            106980                 21
2    3.0              Virginia (Darden)  ...            107800                 21
3    4.0                        Harvard  ...            107000                 18
4    5.0                       Columbia  ...            111736                 20
5    6.0  California At Berkeley (Haas)  ...            106792                 21
6    7.0                    MIT (Sloan)  ...            116400                 22
7    8.0                       Stanford  ...            114600                 21
8    9.0                           IESE  ...             95610                 19
9   10.0                            IMD  ...             67416                 11
10  11.0               New York (Stern)  ...             96640                 20
11  12.0                         London  ...             92144                 15
12  13.0         Pennsylvania (Wharton)  ...            107852                 21
13  14.0                      HEC Paris  ...             66802                 16
14  15.0              Cornell (Johnson)  ...            107592                 21
15  16.0                York (Schulich)  ...             61800                  8
16  17.0       Carnegie Mellon (Tepper)  ...            108272                 21
17  18.0                          ESADE  ...             81693                 12
18  19.0                         INSEAD  ...             80719                 10
19  20.0         Northwestern (Kellogg)  ...            113100                 22
20  21.0               Emory (Goizueta)  ...             87200                 22
21  22.0                             IE  ...             82389                 13
22  23.0                UCLA (Anderson)  ...            105160                 21
23  24.0                Michigan (Ross)  ...            105500                 20
24  25.0                           Bath  ...             36057                 12
25   NaN                        Average  ...             94962                 18

[26 rows x 11 columns]
#Boolean condition
condition01 = mydata['Rank'] <= 10
condition01
0      True
1      True
2      True
3      True
4      True
5      True
6      True
7      True
8      True
9      True
10    False
11    False
12    False
13    False
14    False
15    False
16    False
17    False
18    False
19    False
20    False
21    False
22    False
23    False
24    False
25    False
Name: Rank, dtype: bool

A Series of Boolean values is returned. Rows with the value True meet the condition. You can see that a Series is returned by using the type() function.

type(condition01)
<class 'pandas.core.series.Series'>

Let’s write another condition to see which schools have an average starting salary of greater than or equal to $125,000.

#Boolean condition
condition02 = mydata['AvgSalary'] >=125000
condition02
0     False
1     False
2     False
3     False
4     False
5     False
6     False
7      True
8      True
9      True
10    False
11    False
12    False
13    False
14    False
15    False
16    False
17    False
18    False
19    False
20    False
21    False
22    False
23    False
24     True
25    False
Name: AvgSalary, dtype: bool

These data are useful, but what if you wanted to return the data for just those schools that met your conditions?

This is where we place the Boolean condition inside the DataFrame's square brackets, so that one set of square brackets sits inside another, to return the result set.

Let’s try doing this with condition02.

#Boolean condition that returns the result set
condition02 = [mydata[mydata['AvgSalary'] >=125000]]
condition02
[    Rank    School  ... Total Tuition ($)  Duration (Months)
7    8.0  Stanford  ...            114600                 21
8    9.0      IESE  ...             95610                 19
9   10.0       IMD  ...             67416                 11
24  25.0      Bath  ...             36057                 12

[4 rows x 11 columns]]

This returns a list object that contains the DataFrame, because the outer square brackets create a Python list. You can see this by using the type() function.

type(condition02)
<class 'list'>

To return a DataFrame object, remove the outer square brackets.

#Boolean condition that returns the result set
condition02 = mydata[mydata['AvgSalary'] >=125000]
condition02
    Rank    School  ... Total Tuition ($)  Duration (Months)
7    8.0  Stanford  ...            114600                 21
8    9.0      IESE  ...             95610                 19
9   10.0       IMD  ...             67416                 11
24  25.0      Bath  ...             36057                 12

[4 rows x 11 columns]
type(condition02)
<class 'pandas.core.frame.DataFrame'>

We can ask more complex questions of our data using the logical operators | for “or” and & for “and”.

Let’s filter the DataFrame to show only those schools where the average salary is greater than or equal to $125,000 and the tuition is less than or equal to $100,000.

#Boolean condition that returns the result set
condition03 = mydata[(mydata['AvgSalary'] >=125000) & (mydata['Total Tuition ($)'] <= 100000)]
condition03 #type DataFrame
    Rank School  ... Total Tuition ($)  Duration (Months)
8    9.0   IESE  ...             95610                 19
9   10.0    IMD  ...             67416                 11
24  25.0   Bath  ...             36057                 12

[3 rows x 11 columns]

Take note of the use of square brackets and parentheses. Each comparison must be wrapped in parentheses because & and | take precedence over comparison operators such as >= and <=. Without the parentheses, Python would not evaluate the condition as intended.
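The same pattern can be tried on a tiny table where the answer is easy to check by eye. The sketch below uses a made-up DataFrame (these are not rows from mba.csv) with hypothetical column names, and it also shows the ~ operator, which is the pandas counterpart of not.

```python
import pandas as pd

# A small made-up table; these are not rows from mba.csv.
schools = pd.DataFrame({
    "School": ["A", "B", "C", "D"],
    "AvgSalary": [130000, 120000, 126000, 90000],
    "Tuition": [110000, 95000, 80000, 40000],
})

# Each comparison returns a Boolean Series.
high_salary = schools["AvgSalary"] >= 125000
low_tuition = schools["Tuition"] <= 100000

both = schools[high_salary & low_tuition]         # "and"
either = schools[high_salary | low_tuition]       # "or"
neither = schools[~(high_salary | low_tuition)]   # "not"

print(both["School"].tolist())     # ['C']
print(either["School"].tolist())   # ['A', 'B', 'C', 'D']
print(len(neither))                # 0
```

Storing each condition in its own variable first is one way to keep the parentheses manageable.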

The pandas isin() method

The pandas isin() method is used to filter DataFrames. Using the isin() method, we return a subset of the data based on whether a value exists in a list of values. In the example below, we are evaluating whether the Country field contains US or France.

#Boolean condition that returns the result set
condition04 = mydata[mydata['Country'].isin(['US', 'France'])]
condition04 #type DataFrame
    Rank                         School  ... Total Tuition ($)  Duration (Months)
0    1.0                Chicago (Booth)  ...            106800                 21
1    2.0               Dartmouth (Tuck)  ...            106980                 21
2    3.0              Virginia (Darden)  ...            107800                 21
3    4.0                        Harvard  ...            107000                 18
4    5.0                       Columbia  ...            111736                 20
5    6.0  California At Berkeley (Haas)  ...            106792                 21
6    7.0                    MIT (Sloan)  ...            116400                 22
7    8.0                       Stanford  ...            114600                 21
10  11.0               New York (Stern)  ...             96640                 20
12  13.0         Pennsylvania (Wharton)  ...            107852                 21
13  14.0                      HEC Paris  ...             66802                 16
14  15.0              Cornell (Johnson)  ...            107592                 21
16  17.0       Carnegie Mellon (Tepper)  ...            108272                 21
19  20.0         Northwestern (Kellogg)  ...            113100                 22
20  21.0               Emory (Goizueta)  ...             87200                 22
22  23.0                UCLA (Anderson)  ...            105160                 21
23  24.0                Michigan (Ross)  ...            105500                 20

[17 rows x 11 columns]
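One way to think about isin() is as shorthand for several == comparisons joined with |. The sketch below checks this on a small made-up DataFrame (not rows from mba.csv); the variable names with_isin and with_or are assumptions for this example.

```python
import pandas as pd

# A small made-up table; these are not rows from mba.csv.
schools = pd.DataFrame({
    "School": ["A", "B", "C", "D"],
    "Country": ["US", "Spain", "France", "UK"],
})

# Filter with isin() ...
with_isin = schools[schools["Country"].isin(["US", "France"])]

# ... and with two equality checks joined by | ("or").
with_or = schools[(schools["Country"] == "US") | (schools["Country"] == "France")]

print(with_isin["School"].tolist())   # ['A', 'C']
print(with_isin.equals(with_or))      # True
```

As the list of values grows, isin() stays short while the chain of == comparisons gets longer.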

5.5 Control structures

The while loop

The while statement allows you to repeatedly execute a block of statements as long as a test expression (condition) is true. A while statement can have an optional else clause, which runs once when the condition becomes false.

We generally use this loop when we don’t know beforehand the number of times to iterate.

# Example to illustrate
# the use of else statement
# with the while loop

counter = 0

while counter < 3:
    print("Inside loop")
    counter = counter + 1
else:
    print("Inside else")
Inside loop
Inside loop
Inside loop
Inside else
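Here is a short sketch of two further situations. The first is a loop whose number of repetitions is not obvious in advance. The second uses the break statement, which has not been covered in this lesson, to show that the else clause is skipped when a loop is left early. All variable names are made up for this example.

```python
# Case 1: keep halving a value until it drops below 1.
value = 100
steps = 0
while value >= 1:
    value = value / 2
    steps = steps + 1
print(steps)   # 7

# Case 2: break leaves the loop early, so the else clause is skipped.
counter = 0
else_ran = False
while counter < 3:
    if counter == 1:
        break
    counter = counter + 1
else:
    else_ran = True
print(counter, else_ran)   # 1 False
```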

The for loop

The for..in statement is another looping statement. It iterates over a sequence of objects, that is, it goes through each item in the sequence. The for loop in Python is used to iterate over a sequence (list, tuple, string) or other iterable objects. Iterating over a sequence is called traversal. We will see more about sequences in detail in later chapters. What you need to know right now is that a sequence is just an ordered collection of items.

Let’s look at two examples:

Example 1


for i in range(1, 5):
    print(i)
else:
    print('The for loop is over')
1
2
3
4
The for loop is over

Example 2

# Program to find the sum of all numbers stored in a list

# List of numbers
numbers = [6, 5, 3, 8, 4, 2, 5, 4, 11]

# variable to store the running total
# (named total so it does not shadow Python's built-in sum() function)
total = 0

# iterate over the list, adding each value to the total
for val in numbers:
    total = total + val

# Output: The sum is 48
print("The sum is", total)
The sum is 48
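Loops and conditional statements are often used together. The sketch below traverses a string and a list, and uses an if statement inside each loop to decide what to count. The variable names and the word used are arbitrary choices for this illustration.

```python
# Traverse a string one character at a time and count the vowels.
word = "conditional"
vowel_count = 0
for letter in word:
    if letter in "aeiou":
        vowel_count = vowel_count + 1
print(vowel_count)   # 5

# Traverse a list and add up only the even numbers.
numbers = [6, 5, 3, 8, 4, 2, 5, 4, 11]
even_total = 0
for val in numbers:
    if val % 2 == 0:   # the remainder after dividing by 2 is 0 for even numbers
        even_total = even_total + val
print(even_total)    # 24
```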

Summary

  • Conditional statements may use comparison and logical operators to evaluate the truth of an expression.
  • We can ask more complex questions of our data using the logical operators | for “or” and & for “and”.
  • The pandas isin() method is used to filter DataFrames.
  • for and while are looping statements that iterate over a sequence of objects or a block of code, respectively.

5.6 Quiz

1. Evaluate the following statement:

8 == 8

  1. TRUE
  2. True
  3. true
  4. FALSE
  5. Not False

2. Evaluate the following statement:

8 =! 8

  1. False
  2. FALSE
  3. false
  4. SyntaxError: invalid syntax
  5. True

3. Evaluate the following statement: (32*2==64) and (8**2==64)

  1. False
  2. True
  3. TRUE
  4. FALSE
  5. SyntaxError: invalid syntax

4. Which statement will produce a syntax error, given that x=4?

  1. x>=4 and x<5
  2. (x>=4) and (x<5)
  3. ((x>=4) and (x<5))
  4. ((x>=4) and (x<!5))

Feedback: Choice 4 produces a syntax error. <! is not a valid operator.

5. What type of object will be returned from this statement:

condition = [mydata[mydata['AvgSalary'] >=125000]]

  1. List
  2. Series
  3. DataFrame
  4. Tuple
  5. Array

6. What type of object will be returned from this statement:

condition = mydata[mydata['AvgSalary'] >=125000]

  1. List
  2. Series
  3. DataFrame
  4. Tuple
  5. Array

7. What type of object will be returned from this statement:

condition = mydata['AvgSalary'] >=125000

  1. List
  2. Series
  3. DataFrame
  4. Tuple
  5. Array

8. The for statement is an example of a control structure

  1. TRUE
  2. FALSE

9. Which is not a comparison operator?

  1. ==
  2. !=
  3. =
  4. >
  5. <

10. What’s wrong with the following code?

counter = 0
while counter < 3:
    print("Inside loop")
    counter = counter
else:
    print("Inside else")

  1. Nothing
  2. the counter does not decrement causing an infinite loop
  3. the counter does not increment causing an infinite loop
  4. The code produces a syntax error
  5. the else should be indented

5.7 Exercises

5.7.1 Exercise 5.1

  1. Filter the mba DataFrame to show only those schools where the average salary is less than $100,000 and the tuition is greater than or equal to $100,000.

5.7.2 Exercise 5.2

5.8 Assignment 5

5.9 Exam questions