Course Module • Jillur Quddus

2. Control and Evaluations Part 1

Introduction to Python

Jillur Quddus • Founder & Chief Data Scientist • 1st September 2020

Introduction

In this module we will cover the fundamental building blocks of the Python programming language, namely:

  • Basic Concepts - logical/physical lines, comments, indentation, identifiers and keywords
  • Literals - string, boolean, numeric and special literals
  • Operators - unary, binary, arithmetic, assignment, comparison, logical, identity, membership and bitwise operators

The source code for this module may be found in the public GitHub repository for this course. Where code snippets are provided in this module, you are strongly encouraged to type and execute these Python statements in your own Jupyter notebook instance.

1. Basic Concepts

As covered in the previous module, Python is an interpreted language. This means that your Python source code is read, verified, translated into something called bytecode (a low-level, platform-independent set of instructions, as opposed to native machine code) and executed - if an error is encountered, the program will halt at that point and an error message is returned. In Python, the program that does this is called the Python Interpreter. At a lower level, your Python source code is first broken down into tokens by a lexical analyzer, and these tokens are then fed into a parser.

Python lexical analyzer

Let's take a deeper look to help us understand how this lexical analyzer breaks down our Python source code into a stream of tokens.
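
As a rough illustration, the standard library tokenize module exposes a tokenizer that closely mirrors what the lexical analyzer does. The sketch below uses a made-up three-line program (an assumption for illustration only) and prints the category and text of each token. It is a way to peek at the token stream, not a description of how the interpreter is invoked internally.

```python
import io
import tokenize

# A small, made-up program used purely for illustration
source = "x = 10\nif x > 5:\n    print(x)\n"

# generate_tokens() expects a readline-style callable that returns text
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for token in tokens:
    # tok_name maps a numeric token type to a readable name such as NAME or OP
    print(tokenize.tok_name[token.type], repr(token.string))
```

The printed names include NAME, OP, NUMBER, NEWLINE, INDENT and DEDENT, all of which are discussed in the sections that follow.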

1.1. Logical and Physical Lines

A logical line corresponds to a single Python statement. A physical line is a line terminated by an end of line character (for example when you press the ENTER key on your keyboard). In most cases, and to improve the readability of your Python source code, it is recommended that a logical line spans a single physical line as follows:

# Logical line spanning a single physical line
my_string = 'My entire string on a single physical line.'
print(my_string)

1.2. Explicit Line Joining

However sometimes it may be the case that a logical line spans multiple physical lines. One way to achieve this is to use explicit line joining. This is where you use the backslash character \ to continue a Python statement onto new physical lines as follows:

# String literal spanning two physical lines using explicit line joining
my_string = 'The first part of the string. \
The second part of the string'
print(my_string)

# Control flow spanning multiple physical lines using explicit line joining
year = 2019
month = 9
day = 15
if 1900 < year < 2100 and 1 <= month <= 12 \
    and 1 <= day <= 31:    # Valid date
        print("You have entered a valid date")

1.3. Implicit Line Joining

There are times in Python where logical lines span multiple physical lines without the use of the backslash \ character. This is called implicit line joining and applies to expressions defined in parentheses (round brackets), square brackets or curly braces as follows:

# Implicit line joining
days = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday',  # Weekdays
        'Saturday', 'Sunday')  # Weekend
print(days)

1.4. Comments

As you may have noticed in the previous examples, you can write comments in your Python source code by using the # character. Comments are a useful way to describe what your code is doing to other developers, or should you need to come back to it in the future. As long as the # character is not inside a string literal, Python will consider it as a comment and ignore it for the purposes of executing your code. In relation to a logical line spanning multiple physical lines:

  • Explicit Line Joining - a physical line ending in a backslash character cannot carry a comment, since nothing may follow the backslash on that line. Also the backslash character cannot be used to make a comment span multiple physical lines.
  • Implicit Line Joining - physical lines continued inside parentheses, square brackets or curly braces can carry comments.
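
One way to check how comments interact with line joining is to ask Python to compile two small snippets. This is a minimal sketch: the helper name compiles and both snippets are hypothetical, chosen only to show that a comment after a backslash is rejected whereas a comment inside parentheses is accepted.

```python
def compiles(source):
    """Return True if the source text compiles without a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

# Explicit line joining: a comment after the backslash is rejected
explicit_with_comment = "total = 1 + \\  # first part\n    2\n"

# Implicit line joining: a comment inside the parentheses is accepted
implicit_with_comment = "total = (1 +  # first part\n         2)\n"

print(compiles(explicit_with_comment))
print(compiles(implicit_with_comment))
```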

With regard to best practice, your comments should be short, concise and relevant. As a beginner programmer, you may be tempted to write a large number of comments to help you remember what is going on. Whilst this is fine when you are just starting out, as your programming skills advance, try to reduce the number of comments to just those required to understand the more complex areas of your code (for example, you do not need to explain what an if statement is doing!). If you find that you are writing comments purely to explain what variables are and how they are being processed, try renaming your variables to something more descriptive within the context of your program.

1.5. Multi-Logical Physical Lines

It is possible to define multiple logical statements on the same physical line by delimiting statements with the semi-colon ; character. However this is generally discouraged as it reduces the readability of your source code:

# Multiple logical lines spanning a single physical line
from datetime import date; today = date.today(); print(today)

1.6. Blank Lines

Logical lines containing nothing but whitespace characters (for example spaces and tabs) are ignored. They do however serve to make your Python source code more readable, either when writing Python modules, or to improve the readability of complex cells in Jupyter Notebook.

The end of a logical line corresponds to a NEWLINE token as processed by the lexical analyzer. Blank lines do not generate NEWLINE tokens.
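
The tokenize module from the standard library can be used to observe this. In the sketch below (the source text is a made-up example), the two statements each end with a NEWLINE token, while the two blank lines between them are reported as NL tokens. Note that NL is a token type specific to the tokenize module, used for line breaks that do not end a logical line.

```python
import io
import tokenize

# Two statements separated by two blank lines
source = "x = 1\n\n\ny = 2\n"

names = [
    tokenize.tok_name[token.type]
    for token in tokenize.generate_tokens(io.StringIO(source).readline)
]

print(names.count('NEWLINE'))  # one per logical line
print(names.count('NL'))       # one per blank line
```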

1.7. Indentation

Indentation - the leading whitespace characters at the beginning of a logical line - is important in Python as it is used to define and group together related statements. It is NOT recommended to mix spaces and tabs for indentation as this can lead to inconsistent indentation levels. When an indentation level is computed, tabs are replaced by space characters, so the result depends on how wide a tab is taken to be, and editors on different platforms (for example UNIX versus non-UNIX platforms such as Windows) may treat tabs differently. Python 3 raises a TabError if a file mixes tabs and spaces in a way that makes the indentation ambiguous.

Most IDEs can be configured to explicitly define the number of spaces to replace tabs with - the default is commonly 4 space characters for one tab. However, most IDEs and web-based notebooks, including Jupyter Notebook, will also automatically indent and move the cursor to the correct position after starting a new physical line (by pressing the ENTER key) should an indentation be required.

We will use indentation implicitly as we introduce further concepts over this course, but for now let us take a look at a couple of examples of indentation in action:

# Indentation is required for control flow in Python
x = 101
if x >= 100:
    print("Your number is bigger than or equal to 100")
else:
    print("Your number is less than 100")

# Incorrect indentations lead to IndentationError
x = 100
    y = 200    # Incorrect indentation
if x >= y:
print("X is bigger than or equal to Y")    # Incorrect indentation 
else:
    print("Y is bigger than X")

An increase in indentation level generates an INDENT token and a decrease in indentation level generates a DEDENT token.
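
Because an incorrectly indented cell fails before any of it runs, it can be easier to experiment by compiling the offending text as a string. The following is a minimal sketch using a made-up two-line snippet. It also shows that IndentationError is a subclass of SyntaxError.

```python
# The second line is indented without a preceding block-opening statement
bad_source = "x = 100\n    y = 200\n"

raised = None
try:
    compile(bad_source, "<example>", "exec")
except IndentationError as error:
    raised = error
    print(type(error).__name__, "on line", error.lineno)

print(issubclass(IndentationError, SyntaxError))
```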

1.8. Identifiers

Identifiers are names given to identify variables, functions, classes, modules and other objects in Python. Identifiers must adhere to the following rules:

  • Keywords cannot be used as identifiers (see below)
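  • Only lowercase letters, uppercase letters, digits and underscore characters are allowed (Python 3 also accepts many non-ASCII Unicode letters, although ASCII names remain the common convention)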
  • They cannot start with a digit
  • They can be of any length, but they should be short, concise and relevant

There are also conventions that you should adhere to when naming identifiers in Python that we will introduce over this course.

Should you fail to adhere to these rules, a SyntaxError will be raised, as follows:

# Valid identifiers
my_first_number = 100
my_first_string = 'Hello World'

# Invalid identifiers
1number = 10
string-test = 'Invalid Identifier'

Identifiers are another category of tokens generated by the lexical analyzer.
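
Rather than triggering a SyntaxError, you can also test a candidate name programmatically. The sketch below combines the built-in str.isidentifier() method with keyword.iskeyword() from the standard library. The helper name is_usable_identifier and the list of candidate names are illustrative assumptions.

```python
import keyword

def is_usable_identifier(name):
    """True if name follows the identifier rules and is not a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ['my_first_number', '_private', '1number', 'string-test', 'class']

for name in candidates:
    print(name, is_usable_identifier(name))
```

Note that 'class' passes str.isidentifier() on its own, which is why the keyword check is needed as well.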

1.9. Case Sensitivity

Note that Python is a case-sensitive language. For example, when you name an identifier, you must ensure that you use the exact same case when subsequently referencing it.
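
A short sketch of what this means in practice, using made-up variable names: total and Total are two different variables, and referencing a third spelling that was never assigned raises a NameError.

```python
total = 10
Total = 20

# Two distinct variables that differ only by case
print(total, Total)

lookup_failed = False
try:
    print(TOTAL)  # never assigned, so this raises NameError
except NameError:
    lookup_failed = True

print(lookup_failed)
```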

1.10. Keywords

Keywords are reserved words in the Python language and should not and cannot be used for any other purpose. For example, keywords cannot be used as variable names or other identifiers in Python. We will cover most of these keywords during this course, but for your reference they are listed here:

False     await     else      import    pass
None      break     except    in        raise
True      class     finally   is        return
and       continue  for       lambda    try
as        def       from      nonlocal  while
assert    del       global    not       with
async     elif      if        or        yield

Keywords are another category of tokens generated by the lexical analyzer. In the remaining sections of this module, we shall study two other fundamental categories of tokens - literals and operators.
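
The list of keywords depends on the version of Python that you are running, so it can be useful to ask the interpreter directly. A minimal sketch using the standard library keyword module:

```python
import keyword

# The full list of keywords for the running Python version
print(keyword.kwlist)

# Check individual words
print(keyword.iskeyword('lambda'))  # a keyword
print(keyword.iskeyword('print'))   # a built-in function, not a keyword
```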

2. Literals

Literals are another fundamental category of tokens in Python, but important enough to justify their own section. Officially, literals are notations for constant values of Python built-in types. In plain English, they are raw data values in Python. In this section, we will cover the following types of literals:

  • String
  • Boolean
  • Numeric
  • Special

2.1. String Literals

A sequence of characters or text enclosed within either single ' or double quotes " is used to form a string literal, as follows:

my_first_string = 'Hello World'
my_second_string = 'Line 1.\nLine 2.'
my_third_string = r'Line 1.\nLine 2.'
my_fourth_string = u'r\u00e9sum\u00e9'

print(my_first_string)
print(my_second_string)
print(my_third_string)
print(my_fourth_string)

You may notice a couple of odd looking characters and character sequences in the previous examples, which are explained as follows:

  • String literals prefixed with the r string prefix are called raw strings where backslashes are treated as literal characters. In this example, where \n is a new line character, instead of printing a new line, it will treat \n as a literal and print '\n' instead.
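  • String literals prefixed with the u string prefix are Unicode string literals. In Python 3 every string literal is already Unicode, so the u prefix is optional and is retained mainly for compatibility with Python 2 code. In this example, the word 'résumé' contains accented é characters, each written using the Unicode escape sequence \u00e9 (the code point U+00E9).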
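
The difference between these prefixes can be checked by comparing string lengths and values. This is a small sketch that reuses the strings from the example above under new, illustrative variable names.

```python
normal_string = 'Line 1.\nLine 2.'
raw_string = r'Line 1.\nLine 2.'

# \n is a single newline character in a normal string...
print(len(normal_string))  # 15
# ...but two characters (a backslash and the letter n) in a raw string
print(len(raw_string))     # 16

# A raw string is equivalent to escaping the backslash yourself
print(raw_string == 'Line 1.\\nLine 2.')  # True

# In Python 3 the u prefix makes no difference to the resulting string
print(u'r\u00e9sum\u00e9' == 'r\u00e9sum\u00e9')  # True

# \u00e9 is the character with code point 233 (0xe9 in hexadecimal)
print(ord('\u00e9'), hex(ord('\u00e9')))  # 233 0xe9
```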

2.2. Boolean Literals

A Boolean literal can only have two values - True (representing the value of 1) or False (representing the value of 0), as follows:

my_first_boolean = True
my_second_boolean = False

print(my_first_boolean)
print(my_second_boolean)
print(1==True)
print(1==False)
print(0==True)
print(0==False)
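
The reason that True compares equal to 1 and False compares equal to 0 is that, in Python, the bool type is a subclass of int. A brief sketch of what follows from this:

```python
# bool is a subclass of int
print(issubclass(bool, int))   # True
print(isinstance(True, int))   # True

# Booleans therefore behave like 1 and 0 in arithmetic
print(True + True)             # 2
print(int(True), int(False))   # 1 0

# They are nevertheless their own type
print(type(True).__name__)     # bool
```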

2.3. Numeric Literals

Python supports three types of numeric literals - integers, floating point numbers and imaginary numbers (a component of complex numbers).

2.3.1. Integer Literals

Integer literals can be formed using the standard base-10 system (i.e. each digit can have an integer value from 0 to 9) as you would normally define integers. Alternatively they can also be formed using binary, octal and hexadecimal systems as well, as follows:

# Integer Literals using different number systems
decimal_integer = 100
binary_integer = 0b1100100
octal_integer = 0o144
hexadecimal_integer = 0x64
decimal_groupings_integer = 100_000_000

print(decimal_integer)
print(binary_integer)
print(octal_integer)
print(hexadecimal_integer)
print(decimal_groupings_integer)

Note that in the last example, the underscore _ character can optionally be used for numeric groupings (as of Python 3.6). This applies to all numeric literal types i.e. integer, floating point and imaginary literals.
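
All four literals in the example above denote the same integer, 100. The sketch below converts in both directions using the built-in bin(), oct(), hex() and int() functions.

```python
value = 100

# From an integer to its textual representation in each number system
print(bin(value), oct(value), hex(value))  # 0b1100100 0o144 0x64

# The different literal forms all produce the same integer
print(0b1100100 == 0o144 == 0x64 == 100)   # True

# From a string of digits in a given base back to an integer
print(int('1100100', 2), int('144', 8), int('64', 16))  # 100 100 100

# Underscores are purely visual and do not change the value
print(100_000_000 == 100000000)  # True
```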

2.3.2. Floating Point Literals

Floating point literals can be formed using base-10 (radix 10) digits, with an optional fractional part and an optional exponent part, as follows:

# Floating point literals
my_first_number = 3.14
my_second_number = 10e2
my_third_number = 100e-5
my_fourth_number = 10.
my_fifth_number = .12345
my_sixth_number = 3.14_15_93

print(my_first_number)
print(my_second_number)
print(my_third_number)
print(my_fourth_number)
print(my_fifth_number)
print(my_sixth_number)

2.3.3. Imaginary Literals

Finally, imaginary literals can be formed using the j suffix. An imaginary literal defined by itself results in a complex number with a zero real part. To define a complex number with a non-zero real part, add a real number (an integer or a floating point number) to it, as follows:

# Imaginary literals
my_first_complex_number = 10j
my_second_complex_number = .10j
my_third_complex_number = 2+3j

print(my_first_complex_number)
print(my_second_complex_number, 
      my_second_complex_number.real, 
      my_second_complex_number.imag)
print(my_third_complex_number, 
      my_third_complex_number.real, 
      my_third_complex_number.imag)
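
To make the relationship between imaginary literals and complex numbers explicit, the sketch below checks the resulting type and compares the literal form against the built-in complex() constructor. The variable name number is illustrative.

```python
number = 2 + 3j

# Adding a real number to an imaginary literal yields a complex number
print(type(number).__name__)     # complex
print(number == complex(2, 3))   # True

# The real and imaginary parts are stored as floating point numbers
print(number.real, number.imag)  # 2.0 3.0

# An imaginary literal on its own has a zero real part
print(type(10j).__name__, (10j).real)  # complex 0.0
```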

2.4. Special Literals

There is one special literal in Python - None. This is used to specify nothing or a null value. Note that None is NOT the same as an empty string, 0 or False, as follows:

x = None
print(x)
if x:
    print('x is True')
else:
    print('x is not True')
print(bool(None))

You may be wondering why 'x is not True' is printed in the last example when None is not the same as False. An if statement does not require x to be a boolean. Instead it tests the truth value of x, which is equivalent to evaluating bool(x) (for objects that define it, this calls the __bool__ method, which was named __nonzero__ in Python 2). Python defines None as a false value, so bool(None) returns False, which is why 'x is not True' is printed. Try it out yourself with other values to see what you get.
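
The distinction between being equal to False and merely being treated as false can be sketched as follows. The tuple of values is an illustrative choice, and testing with "is None" is the commonly recommended way to check for None.

```python
# Each of these values is treated as false in a condition...
for value in (None, '', 0, False):
    print(repr(value), bool(value))

# ...but None is not equal to any of the others
print(None == '', None == 0, None == False)  # False False False

# The usual way to test for None is with the identity operator
x = None
print(x is None)  # True
```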

3. Operators

The final category of tokens that we shall study in this module is operators. Operators are used to represent operations to be performed on operands (values or quantities), and may be divided into the following types:

  • Arithmetic operators
  • Assignment operators
  • Comparison operators
  • Logical operators
  • Identity operators
  • Membership operators
  • Bitwise operators

3.1. Arithmetic Operators

Arithmetic operators perform common arithmetic operations on numerical values, as follows:

# Addition
print(13 + 7)

# Subtraction
print(10 - 7)

# Multiplication
print(64 * 8)

# Division
print(225 / 15)

# Modulus (remainder after division)
print(69 % 8)

# Exponentiation (raising by a power)
print(2 ** 5)

# Floor division (quotient)
print(100 // 7)
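
A few details of these operators are easy to miss: division with / always returns a floating point number, floor division and modulus together reconstruct the original number, and floor division rounds towards negative infinity. A short sketch, using the built-in divmod() function:

```python
# Division always produces a float, even when the result is whole
print(225 / 15)                 # 15.0
print(type(225 / 15).__name__)  # float

# divmod() returns the quotient and remainder in one call
quotient, remainder = divmod(100, 7)
print(quotient, remainder)                # 14 2
print(quotient * 7 + remainder == 100)    # True

# Floor division rounds down, which matters for negative numbers
print(-7 // 2)  # -4
print(-7 % 2)   # 1
```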

3.2. Assignment Operators

Assignment operators assign values to variables, as follows:

# Assignment
x = 10
print(x)

# Add and assign (x = x + 5)
x += 5
print(x)

# Subtract and assign (x = x - 2)
x -= 2
print(x)

# Multiply and assign (x = x * 4)
x *= 4
print(x)

# Divide and assign (x = x / 2)
x /= 2
print(x)

# Modulus and assign (x = x % 4)
x %= 4
print(x)

# Exponentiation and assign (x = x ** 8)
x **= 8
print(x)

# Floor division and assign (x = x // 15)
x //= 15
print(x)

# Bitwise AND and assign (x = x & 18)
x = int(x)
x &= 18
print(x)

# Bitwise OR and assign (x = x | 10)
x |= 10
print(x)

# Bitwise XOR and assign (x = x ^ 2)
x ^= 2
print(x)

# Bitwise signed right shift and assign (x = x >> 1)
x >>= 1
print(x)

# Bitwise zero fill left shift and assign (x = x << 2)
x <<= 2
print(x)

Now that we have an understanding of assignment operators, we are able to define what a variable is in Python. A variable may be defined as an identifier which has been assigned some type of value, whether that value is a literal, a literal collection or some other object or data structure. In Python, a variable is created the moment you first assign a value to it and does not need to be declared in advance. Languages such as Python where variables and variable types do not need to be declared in advance are called dynamically typed languages.
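
Dynamic typing can be observed by assigning values of different types to the same identifier and inspecting the type each time. This is a minimal sketch in which the names value and observed_types are illustrative.

```python
observed_types = []

# No declaration is needed: the variable is created on first assignment
value = 10
observed_types.append(type(value).__name__)

# The same identifier can later refer to a value of a different type
value = 'ten'
observed_types.append(type(value).__name__)

value = 10 / 4
observed_types.append(type(value).__name__)

print(observed_types)  # ['int', 'str', 'float']
```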

3.3. Comparison Operators

Comparison operators compare two values, as follows:

# Equal
print(10 == 100)

# Not equal
print(2 != 5)

# Greater than
print(13 > 12)

# Less than
print(100 < 1_000_000)

# Greater than or equal to
print(15 >= 15)

# Less than or equal to
print(20 <= 19)

3.4. Logical Operators

Logical operators and, or and not are used in control flow to combine or negate conditional statements, as follows:

# AND
print(1 < 100 and 10 < 100)

# OR
print(13 > 14 or 1 > 0)

# NOT
print(not(1 < 100 and 10 < 100))

3.5. Identity Operators

Identity operators is and is not are used to compare variables at the object memory level, i.e. they test whether two variables refer to the same object residing in the same location in memory, as follows:

x = 10
y = 100.0

# IS
print(x is y)
print(x is x)

# IS NOT
print(x is not y)

x = y
print(x is y)
print(x is not y)
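
Identity should not be confused with equality. In the sketch below (the list values and variable names are illustrative), two separately created lists are equal in value but are different objects, whereas assigning one variable to another makes both names refer to the same object.

```python
first = [1, 2, 3]
second = [1, 2, 3]
alias = first

# Equal values, but two separate objects
print(first == second)  # True
print(first is second)  # False

# alias refers to the very same object as first
print(alias is first)          # True
print(id(alias) == id(first))  # True
```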

3.6. Membership Operators

Membership operators in and not in are used to test if a given value is present in a given object, as follows:

weekdays = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday')

# IN
print('Thursday' in weekdays)
print('Saturday' in weekdays)

# NOT IN
print('Sunday' not in weekdays)

3.7. Bitwise Operators

The previous operators that we have introduced are, generally speaking, intuitive and relatively straightforward to understand. For beginner programmers though, bitwise operators may seem completely alien and frankly quite bizarre! So what are bitwise operators? Bitwise operators operate on numbers in their binary form, that is when numbers are represented by a series of bits where each bit is either 1 or 0.

Bitwise operations are NOT unique to Python. They are fundamental operations that work at the individual bit level, thereby allowing for fast calculations and comparisons as they are carried out directly by your computer processor. Though beginner programmers are unlikely to use bitwise operations in their code, it is important to understand how they work as they are used extensively in more complex use cases including cryptography, encryption, compression and network communications.

To understand how bitwise operators work, we must first understand how numbers can be represented in binary form. Binary is simply a series of bits. A bit represents the smallest unit of data that a computer can store, where each bit can only take one of two possible values at any one time: 1 or 0. Computers store data and execute instructions in bit multiples called bytes, where 8 bits make 1 byte.

To represent numbers in binary form, imagine a simple table where each column represents a power of 2 as follows:

Binary form

Using this table, any integer value can be represented as a series of bits, where the only possible value for each cell is either 1 or 0. Once the bits have been set, simply add those columns where the bit value is 1 to get your number, as follows:

Binary form examples

Using these examples, representing numbers in binary form using 8 bits, 0 is 00000000, 1 is 00000001, 3 is 00000011, 85 is 01010101 and 252 is 11111100. Now that we have a basic understanding of how to represent numbers in binary form, let us return to the bitwise operators.
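
These representations can be reproduced in Python. The sketch below uses the built-in format() function with the '08b' format specification (binary, zero-padded to 8 digits) and int() with base 2 for the reverse direction.

```python
# Decimal to 8-bit binary
for number in (0, 1, 3, 85, 252):
    print(number, format(number, '08b'))

# Binary back to decimal
print(int('01010101', 2))  # 85

# Adding the columns where the bit is 1: 64 + 16 + 4 + 1
print(64 + 16 + 4 + 1)     # 85
```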

3.7.1. Bitwise AND

The bitwise AND & operator compares each bit of two operands. If both bits are set to 1, then the output of bitwise AND is also 1. However if either bit is 0, then the output of bitwise AND is 0. For example 17 & 18 = 16 as follows:

Bitwise AND

3.7.2. Bitwise OR

The bitwise OR | operator also compares each bit of two operands. If either bit is set to 1, then the output of bitwise OR is also 1, otherwise 0. For example 16 | 10 = 26 as follows:

Bitwise OR

3.7.3. Bitwise XOR

The bitwise XOR ^ operator (exclusive OR) also compares each bit of two operands. If only one of the two bits is set to 1, then the output of bitwise XOR is also 1, otherwise 0. In other words, the two bits should have opposite values if the outcome of bitwise XOR is to be 1. For example 26 ^ 2 = 24 as follows:

Bitwise XOR
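
The three examples from sections 3.7.1 to 3.7.3 can be laid out bit by bit in Python. This sketch prints both operands and the result as 8-bit binary strings so that the columns can be compared by eye. The layout is an illustrative choice.

```python
examples = [
    ('&', 17, 18, 17 & 18),
    ('|', 16, 10, 16 | 10),
    ('^', 26, 2, 26 ^ 2),
]

for symbol, left, right, result in examples:
    print(f"{left:08b}   {left}")
    print(f"{right:08b}   {right}   ({symbol})")
    print(f"{result:08b}   {result}")
    print()
```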

3.7.4. Bitwise NOT

The bitwise NOT ~ operator works on only one operand (i.e. it is a unary operator - see below for further details on unary operators). The bitwise complement switches bit values, that is it changes 1 to 0, and 0 to 1. However it is not quite that simple unfortunately!

3.7.5. Two's Complement

Two's Complement is an operation on binary numbers that is used to represent numbers with a sign i.e. positive or negative. To sign a number, the first bit is used to represent the sign and is called the sign bit. If the sign bit is 0, this represents a positive number. If the sign bit is 1, this represents a negative number. Two's Complement is computed by inverting the bits in a given number (i.e. the complement, that is change 1 to 0, and 0 to 1), including the sign bit, and adding 1 to the result. Therefore to represent a positive number, you simply write the binary form as shown above. However to represent a negative number, include the sign bit and compute the Two's Complement of that number. For example to represent the number -200 using 9 bits (8 bits plus the sign bit):

  • Represent 200 in binary form = 011001000
  • Compute the complement = 100110111
  • Add 1 to the result = 100111000

Therefore -200 represented in binary form is 100111000. That is 1 for the sign bit followed by the 8 bits 00111000. And here is the eureka moment: 2⁸ = 256 (2 to the power of the number of bits without the sign bit) and 00111000 = 56. If we subtract the former from the latter (56 - 256), we get -200!

Two's Complement
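
The three steps for -200 can be followed in Python. Python integers are not limited to a fixed number of bits, so this sketch assumes a 9-bit width and applies a mask to emulate it. The names BITS and MASK are illustrative.

```python
BITS = 9
MASK = (1 << BITS) - 1  # 0b111111111, keeps only the lowest 9 bits

positive = 200
inverted = positive ^ MASK               # step 2: flip all 9 bits
twos_complement = (inverted + 1) & MASK  # step 3: add 1

print(format(positive, '09b'))         # 011001000
print(format(inverted, '09b'))         # 100110111
print(format(twos_complement, '09b'))  # 100111000

# Decode: value of the 8 bits after the sign bit, minus 2 ** 8
low_bits = twos_complement & 0b11111111
print(low_bits, low_bits - 2 ** 8)     # 56 -200

# Masking -200 directly yields the same 9-bit pattern
print(format(-200 & MASK, '09b'))      # 100111000
```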

Coming back to the bitwise NOT operator, in Python this will:

  • Invert the bits (bitwise complement)
  • Interpret the result in Two's Complement form i.e. return the decimal value that this Two's Complement binary form represents

Simply put, bitwise NOT applied to N will always result in -(N+1). For example ~4 = -5 as follows:

Bitwise NOT

You may now be wondering how you are able to look at 111111011 and know that it represents -5 i.e. go from Two's Complement to decimal. To perform this reverse computation, you can compute the Two's Complement again on the 8 bits that follow the sign bit, 11111011, which gives you 00000101 = 5, and with the sign bit gives you -5. Equivalently, you can compute the decimal value of 11111011 = 251 and subtract 2⁸ = 256 (2 to the power of the number of bits without the sign bit) from this to give you -5.
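
Both the -(N+1) rule and the reverse computation can be checked in Python. As Python integers have no fixed width, the sketch below assumes 9 bits and masks the result to obtain the bit pattern. The sample values are arbitrary.

```python
# Bitwise NOT applied to N gives -(N + 1)
for n in (0, 4, 17, 200):
    print(n, ~n, ~n == -(n + 1))

# The 9-bit Two's Complement pattern of ~4
pattern = format(~4 & 0b111111111, '09b')
print(pattern)  # 111111011

# Reverse computation: value of the 8 bits after the sign bit, minus 2 ** 8
print(int(pattern[1:], 2))           # 251
print(int(pattern[1:], 2) - 2 ** 8)  # -5
```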

3.7.6. Bitwise Signed Right Shift

The bitwise signed right shift >> operator works with two operands. As the name suggests, it shifts all the bits of the first operand to the right by the number of bits specified in the second operand, with all the resultant empty spaces towards the left filled with 0 for positive numbers and 1 for negative numbers. For example 24 >> 2 = 6 as follows:

Bitwise Signed Shift Right

3.7.7. Bitwise Zero Fill Left Shift

The bitwise zero fill left shift << operator also works with two operands. As the name suggests, it shifts all the bits of the first operand to the left by the number of bits specified in the second operand, with all the resultant empty spaces towards the right filled with 0. For example 12 << 2 = 48 as follows:

Bitwise Zero Fill Left Shift
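
A useful way to think about the shift operators is in terms of powers of 2: shifting right by n bits is equivalent to floor division by 2 ** n, and shifting left by n bits is equivalent to multiplication by 2 ** n. A short sketch:

```python
# Right shift by 2 is floor division by 4
print(24 >> 2, 24 // 2 ** 2)  # 6 6
print(format(24, '08b'), format(24 >> 2, '08b'))  # 00011000 00000110

# Left shift by 2 is multiplication by 4
print(12 << 2, 12 * 2 ** 2)   # 48 48
print(format(12, '08b'), format(12 << 2, '08b'))  # 00001100 00110000

# The sign is preserved when shifting negative numbers to the right
print(-24 >> 2)  # -6
```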

3.7.8. Bitwise Operators in Python

Bitwise operators may be implemented in Python as follows:

# Bitwise AND
print(17 & 18)

# Bitwise OR
print(16 | 10)

# Bitwise XOR
print(26 ^ 2)

# Bitwise NOT
print(~4)

# Bitwise signed right shift
print(24 >> 2)

# Bitwise zero fill left shift
print(12 << 2)

3.8. Unary Operators

Unary operators are those operators with only one operand. Binary operators are those operators that have two operands like most of the previous operators. Unary operators in Python are as follows:

x = 5
y = -10
z = False

# Negative
print(-x)

# Unchanged
print(+y)

# Not
print(not z)

# Bitwise NOT
print(~x)

3.9. Operator Precedence

Finally, Python defines an order of precedence for operators - that is, the order in which operators are evaluated when a single Python statement contains several of them. The following table describes that order of precedence, from highest precedence to lowest precedence:

Operator                                          Description
(), [], {}                                        Parenthesized expressions and bindings
fn(args), x[index], x[index:index], x.attribute   Function call, slicing, subscription and attribute reference
await x                                           Await expression
**                                                Exponentiation
+x, -x, ~x                                        Unary positive, unary negative and bitwise NOT
*, @, /, //, %                                    Multiplication, matrix multiplication, division, floor division and modulus
+, -                                              Addition and subtraction
<<, >>                                            Bitwise shifts
&                                                 Bitwise AND
^                                                 Bitwise XOR
|                                                 Bitwise OR
in, not in, is, is not, <, <=, >, >=, !=, ==      Comparisons, membership and identity tests
not x                                             Boolean NOT
and                                               Boolean AND
or                                                Boolean OR
if...else                                         Conditional expressions
lambda                                            Lambda expressions

Note that whilst most Python operators group from left to right, the exponentiation operator ** groups from right to left. Therefore 2 ** 3 ** 2 evaluates to 2 ** (3 ** 2) = 2 ** 9 = 512.
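
A few of these precedence rules can be spot-checked with short expressions. The expressions below are illustrative choices, and parentheses can always be added to make the intended order explicit.

```python
# ** groups from right to left
print(2 ** 3 ** 2)    # 512
print((2 ** 3) ** 2)  # 64

# ** binds more tightly than unary minus on its left
print(-2 ** 2)        # -4

# Multiplication before addition
print(1 + 2 * 3)      # 7

# Addition before bitwise shifts
print(2 + 3 << 1)     # 10

# Bitwise AND before bitwise OR
print(1 | 2 & 3)      # 3

# Comparisons before Boolean NOT
print(not 1 == 2)     # True
```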

Summary

In this module we have covered the fundamental building blocks of the Python programming language. We have an understanding of how our Python source code is broken down into tokens by the lexical analyzer based on logical lines, indentations, identifiers and keywords. We also have an understanding of the various types of literals and operators available in Python from which we can start building more complex programs.

Homework

Using only pen and paper (i.e. do NOT use Python nor Jupyter Notebook!), compute the output of the following Python statements. Once completed, use Python to verify your answers.

# Question 1
print(47 & 55)

# Question 2
print(59 | 44)

# Question 3
print(16 ^ 12)

# Question 4
print(131 // 8)

# Question 5
print(0b1110101)

# Question 6
print((2 + 2) ** (2 + 2))

# Question 7
print(10 * 10 + 2 * (100 / 10))

# Question 8
print(11 & 13 * 2 ** 3)

# Question 9
print((1000 % 30 ^ 17 % 3 ** (1 + 1)) ** 8)

# Question 10
print((~bool(None)) ** 2 << 4)

What's Next?

In the next module, we will continue exploring the fundamental building blocks of the Python programming language including string formatting, conditional statements, basic data structures and control flow.