In this module we will cover the fundamental building blocks of the Python programming language, namely:
- Basic Concepts - logical/physical lines, comments, indentation, identifiers and keywords
- Literals - string, boolean, numeric and special literals
- Operators - unary, binary, arithmetic, assignment, comparison, logical, identity, membership and bitwise operators
The source code for this module may be found in the public GitHub repository for this course. Where code snippets are provided in this module, you are strongly encouraged to type and execute these Python statements in your own Jupyter notebook instance.
1. Basic Concepts
As covered in the previous module, Python is an interpreted language. This means that your Python source code is read, verified, translated into an intermediate form called bytecode (a low-level, platform-independent set of instructions, not native machine code) and then executed by the Python virtual machine. If an error is encountered while the program runs, the program halts at that point and an error message is reported. The program that does all of this is called the Python interpreter. At a lower level, your Python source code is first broken down into tokens by a lexical analyzer, and these tokens are then fed into a parser.
Let's take a deeper look to help us understand how this lexical analyzer breaks down our Python source code into a stream of tokens.
1.1. Logical and Physical Lines
A logical line corresponds to a single Python statement. A physical line is a line terminated by an end of line character (for example when you press the ENTER key on your keyboard). In most cases, and to improve the readability of your Python source code, it is recommended that a logical line spans a single physical line as follows:
# Logical line spanning a single physical line
my_string = 'My entire string on a single physical line.'
print(my_string)
1.2. Explicit Line Joining
However, sometimes a logical line needs to span multiple physical lines. One way to achieve this is to use explicit line joining. This is where you use the backslash character \ to continue a Python statement onto new physical lines as follows:
# String literal spanning two physical lines using explicit line joining
my_string = 'The first part of the string. \
The second part of the string'
print(my_string)

# Control flow spanning multiple physical lines using explicit line joining
year = 2019
month = 9
day = 15
if 1900 < year < 2100 and 1 <= month <= 12 \
        and 1 <= day <= 31:  # Valid date
    print("You have entered a valid date")
1.3. Implicit Line Joining
There are times in Python where logical lines span multiple physical lines without the use of the backslash \ character. This is called implicit line joining and applies to expressions enclosed in parentheses (round brackets), square brackets or curly braces as follows:
# Implicit line joining
days = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday',  # Weekdays
        'Saturday', 'Sunday')  # Weekend
print(days)
1.4. Comments
As you may have noticed in the previous examples, you can write comments in your Python source code by using the # character. Comments are a useful way to describe what your code is doing to other developers, or to yourself should you need to come back to it in the future. As long as the # character is not inside a string literal, Python will treat it and the rest of that physical line as a comment and ignore it for the purposes of executing your code. In relation to a logical line spanning multiple physical lines:
- Explicit Line Joining - a physical line ending in a backslash character cannot be followed by a comment. Also the backslash character cannot be used to make a comment span multiple physical lines.
- Implicit Line Joining - a physical line that is continued implicitly (inside parentheses, square brackets or curly braces) can end with a comment.
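The two rules above can be checked safely by handing source text to the built-in compile function, which raises a SyntaxError for code that cannot be parsed. This is only a minimal sketch: the helper name compiles_cleanly and the two sample snippets are made up for illustration.

```python
def compiles_cleanly(source):
    """Return True if the source text parses without a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Implicit line joining: a comment may follow the first physical line.
implicit_join = "total = (1 +  # first number\n         2)\n"

# Explicit line joining: nothing may follow the backslash, not even a comment.
explicit_join = "total = 1 + \\  # first number\n    2\n"

implicit_ok = compiles_cleanly(implicit_join)
explicit_ok = compiles_cleanly(explicit_join)
print(implicit_ok, explicit_ok)
```

Only the implicitly joined snippet is accepted; the comment after the backslash makes the second snippet invalid.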
As a matter of best practice, your comments should be short, concise and relevant. As a beginner programmer, you may be tempted to write a large number of comments to help you remember what is going on. Whilst this is fine when you are just starting out, as your programming skills advance, try to reduce your comments to just those required to understand the more complex areas of your code (for example, you do not need to explain what an if statement is doing!). If you find that you are writing comments purely to explain what variables are and how they are being processed, try renaming your variables to something more descriptive within the context of your program.
1.5. Multi-Logical Physical Lines
It is possible to define multiple logical statements on the same physical line by delimiting statements with the semicolon ; character. However this is generally discouraged as it reduces the readability of your source code:
# Multiple logical lines on a single physical line
from datetime import date; today = date.today(); print(today)
1.6. Blank Lines
Logical lines containing nothing but whitespace characters (for example spaces and tabs) are ignored. They do however serve to make your Python source code more readable, either when writing Python modules, or to improve the readability of complex cells in Jupyter Notebook.
The end of a logical line corresponds to a NEWLINE token as processed by the lexical analyzer. Blank lines do not generate NEWLINE tokens.
1.7. Indentation
Indentation - the leading whitespace characters at the beginning of a logical line - is important in Python as it is used to define and group together related statements. It is NOT recommended to mix spaces and tabs for indentation as this can lead to inconsistent indentation levels. When working out an indentation level, Python replaces each tab with enough spaces to reach the next multiple of eight columns, which rarely matches how your editor displays tabs, and Python 3 raises a TabError if mixing the two makes the indentation ambiguous.
Most IDEs can be configured to explicitly define the number of spaces to replace tabs with - the default is commonly 4 space characters for one tab. In addition, most IDEs and web-based notebooks, including Jupyter Notebook, will automatically indent and move the cursor to the correct position after starting a new physical line (by pressing the ENTER key) should an indentation be required.
We will use indentation implicitly as we introduce further concepts over this course, but for now let us take a look at a couple of examples of indentation in action:
# Indentation is required for control flow in Python
x = 101
if x >= 100:
    print("Your number is bigger than or equal to 100")
else:
    print("Your number is less than 100")

# Incorrect indentations lead to IndentationError
x = 100
y = 200
# Incorrect indentation
if x >= y:
print("X is bigger than or equal to Y")
# Incorrect indentation
else:
print("Y is bigger than X")
Increases and decreases in the indentation level generate INDENT and DEDENT tokens respectively.
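To see the NEWLINE, INDENT and DEDENT tokens for yourself, the standard library's tokenize module can be pointed at a small piece of source text. The sketch below is illustrative only and the sample source string is made up. Note that the blank line appears as an NL token, which tokenize uses for line breaks that do not end a logical line.

```python
import io
import tokenize

# Two statements separated by a blank line, followed by an indented block.
source = "x = 1\n\nif x:\n    y = 2\n"

# generate_tokens expects a readline-style callable that returns text lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
token_names = [tokenize.tok_name[tok.type] for tok in tokens]

for tok in tokens:
    print(tokenize.tok_name[tok.type], repr(tok.string))
```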
1.8. Identifiers
Identifiers are names given to identify variables, functions, classes, modules and other objects in Python. Identifiers must adhere to the following rules:
- Keywords cannot be used as identifiers (see below)
- Only lowercase letters, uppercase letters, digits and underscore characters are allowed (Python 3 also accepts many non-ASCII Unicode letters, but plain ASCII names are the norm)
- They cannot start with a digit
- They can be of any length, but they should be short, concise and relevant
There are also conventions that you should adhere to when naming identifiers in Python that we will introduce over this course.
Should you fail to adhere to these rules, a SyntaxError will be raised, as follows:
# Valid identifiers
my_first_number = 100
my_first_string = 'Hello World'

# Invalid identifiers
1number = 10
string-test = 'Invalid Identifier'
Identifiers are another category of tokens generated by the lexical analyzer.
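Because an invalid identifier stops your code from running at all, it can be handy to test candidate names as plain strings instead. The sketch below combines the built-in str.isidentifier method with keyword.iskeyword from the standard library; the helper name is_valid_identifier and the candidate names are made up for illustration.

```python
import keyword

candidates = ["my_first_number", "_private", "1number", "string-test", "class"]


def is_valid_identifier(name):
    """A usable identifier must be well formed and must not be a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


results = {name: is_valid_identifier(name) for name in candidates}
for name, ok in results.items():
    print(f"{name!r}: {ok}")
```

Note that 'class' is well formed as a name but is rejected because it is a keyword.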
1.9. Case Sensitivity
Note that Python is a case-sensitive language. For example, when you name an identifier, you must ensure that you use the exact same case when subsequently referencing it.
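A short illustration of case sensitivity, using made-up variable names: total and Total are two separate identifiers, and referencing a spelling that was never assigned raises a NameError.

```python
total = 10
Total = 20  # a different identifier from total

print(total, Total)

try:
    print(TOTAL)  # never assigned, so this lookup fails
except NameError as error:
    message = str(error)
    print(message)
```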
1.10. Keywords
Keywords are reserved words in the Python language and cannot be used for any other purpose. For example, keywords cannot be used as variable names or other identifiers in Python. We will cover most of these keywords during this course, and for your reference the full list for your version of Python is available as keyword.kwlist in the standard library's keyword module.
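The exact set of keywords varies slightly between Python versions, so the most reliable reference is the interpreter you are actually running. A minimal sketch using the standard library's keyword module:

```python
import keyword

# kwlist holds every reserved word for the running Python version.
keyword_count = len(keyword.kwlist)
print(keyword_count)
print(keyword.kwlist)
```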
Keywords are another category of tokens generated by the lexical analyzer. In the remaining sections of this module, we shall study two other fundamental categories of tokens - literals and operators.
2. Literals
Literals are another fundamental category of tokens in Python, but important enough to justify their own section. Officially, literals are notations for constant values of Python built-in types. In plain English, they are raw data values in Python. In this section, we will cover string, Boolean, numeric and special literals.
2.1. String Literals
A sequence of characters or text enclosed within either single ' or double " quotes is used to form a string literal, as follows:
my_first_string = 'Hello World'
my_second_string = 'Line 1.\nLine 2.'
my_third_string = r'Line 1.\nLine 2.'
my_fourth_string = u'r\u00e9sum\u00e9'
print(my_first_string)
print(my_second_string)
print(my_third_string)
print(my_fourth_string)
You may notice a couple of odd looking characters and character sequences in the previous examples, which are explained as follows:
- String literals prefixed with the r string prefix are called raw strings, where backslashes are treated as literal characters. In this example, where \n would normally be a new line character, the raw string treats \n as two literal characters and prints '\n' instead of starting a new line.
- String literals prefixed with the u string prefix are Unicode literals. In Python 3 every string is already a Unicode string, so the prefix is optional and is kept for compatibility with Python 2 code. In this example, the word 'résumé' contains accented é characters, each of which is written using the Unicode escape sequence \u00e9 (the code point of é in hexadecimal).
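The difference between the prefixes is easy to confirm by comparing string lengths and contents. This is a small sketch that reuses the strings from the example above; the variable names are made up for illustration.

```python
escaped = 'Line 1.\nLine 2.'
raw = r'Line 1.\nLine 2.'

# The raw string keeps the backslash and the n as two separate characters.
print(len(escaped), len(raw))
print(raw == 'Line 1.\\nLine 2.')

# In Python 3 the u prefix makes no difference to the resulting string.
with_prefix = u'r\u00e9sum\u00e9'
without_prefix = 'r\u00e9sum\u00e9'
print(with_prefix == without_prefix, len(with_prefix))

# \u00e9 names a single character by its hexadecimal code point.
print(ord('\u00e9') == 0xE9)
```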
2.2. Boolean Literals
A Boolean literal can only have one of two values - True (representing the value of 1) or False (representing the value of 0), as follows:
my_first_boolean = True
my_second_boolean = False
print(my_first_boolean)
print(my_second_boolean)
print(1==True)
print(1==False)
print(0==True)
print(0==False)
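The reason True compares equal to 1 is that Python's bool type is a subclass of int. The following sketch, with made-up variable names, shows a couple of consequences of that design:

```python
# bool is a subclass of int, so Booleans behave like 1 and 0 in arithmetic.
is_int = isinstance(True, int)
two = True + True
flags = [True, False, True, True]
count_of_true = sum(flags)

print(is_int, two, count_of_true)
```

Summing a list of Booleans is a common way to count how many conditions hold.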
2.3. Numeric Literals
Python supports three types of numeric literals - integers, floating point numbers and imaginary numbers (a component of complex numbers).
2.3.1. Integer Literals
Integer literals can be formed using the standard base-10 system (i.e. each digit can have an integer value from 0 to 9) as you would normally define integers. Alternatively they can also be formed using binary, octal and hexadecimal systems as well, as follows:
# Integer Literals using different number systems
decimal_integer = 100
binary_integer = 0b1100100
octal_integer = 0o144
hexadecimal_integer = 0x64
decimal_groupings_integer = 100_000_000
print(decimal_integer)
print(binary_integer)
print(octal_integer)
print(hexadecimal_integer)
print(decimal_groupings_integer)
Note that in the last example, the underscore _ character can optionally be used for numeric groupings (as of Python 3.6). This applies to all numeric literal types, i.e. integer, floating point and imaginary literals.
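Whichever number system is used to write an integer literal, the resulting value is the same int object type. As a small sketch, the built-in bin, oct and hex functions convert in the other direction, and int with a base argument parses digits written in that base:

```python
values = [100, 0b1100100, 0o144, 0x64, 1_0_0]
all_equal = all(value == 100 for value in values)
print(all_equal)

# From an integer back to its textual form in each number system.
print(bin(100), oct(100), hex(100))

# From text in a given base to an integer.
print(int('1100100', 2), int('144', 8), int('64', 16))
```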
2.3.2. Floating Point Literals
Floating point literals can be formed using radix (base) 10 integer, fraction and exponent parts, as follows:
# Floating point literals
my_first_number = 3.14
my_second_number = 10e2
my_third_number = 100e-5
my_fourth_number = 10.
my_fifth_number = .12345
my_sixth_number = 3.14_15_93
print(my_first_number)
print(my_second_number)
print(my_third_number)
print(my_fourth_number)
print(my_fifth_number)
print(my_sixth_number)
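The exponent notation can look unfamiliar at first: the number after e is the power of 10 to multiply by. A brief sketch confirming what the literals above evaluate to (the variable names are made up):

```python
thousand = 10e2        # 10 multiplied by 10 to the power of 2
thousandth = 100e-5    # 100 multiplied by 10 to the power of -5
trailing_dot = 10.     # the fraction part may be empty

print(thousand, thousandth, trailing_dot)
print(type(thousand).__name__, type(trailing_dot).__name__)
print(3.14_15_93 == 3.141593)
```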
2.3.3. Imaginary Literals
Finally, imaginary literals can be formed using the j suffix. By default, imaginary literals defined by themselves result in complex numbers with a zero real part. To define complex numbers with a non-zero real part, add a real number (an integer or floating point number) to the imaginary literal, as follows:
# Imaginary literals
my_first_complex_number = 10j
my_second_complex_number = .10j
my_third_complex_number = 2+3j
print(my_first_complex_number)
print(my_second_complex_number, my_second_complex_number.real, my_second_complex_number.imag)
print(my_third_complex_number, my_third_complex_number.real, my_third_complex_number.imag)
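Strictly speaking, 2+3j is not a single literal but an addition of the integer 2 and the imaginary literal 3j. The sketch below, using a made-up variable name z, shows that the result is the same as calling the built-in complex constructor, and that both parts are stored as floating point numbers:

```python
z = 2 + 3j

print(type(3j).__name__, (3j).real)
print(z.real, z.imag, type(z.real).__name__)
print(z == complex(2, 3))
```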
2.4. Special Literals
There exists in Python one special literal - None. This is used to specify nothing or a null value. Note that None is NOT equivalent to an empty string, 0 or False, as follows:
x = None
print(x)
if x:
    print('x is True')
else:
    print('x is not True')
print(bool(None))
You may be wondering why 'x is not True' is printed in the last example when None is not the same as False. This is because the if statement needs to treat x as a boolean, and in this case it effectively executes bool(x) to perform this test. bool(None) returns False (None is one of the values that Python treats as false in a boolean context), which is why 'x is not True' is printed. Try it out yourself and other combinations to see what you get.
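The distinction between "equal to False" and "treated as false" can be checked directly. In this sketch the comparison results are stored in made-up variable names; the last line shows the identity test x is None, which is the usual way to check for None.

```python
x = None

equals_false = (x == False)
equals_zero = (x == 0)
equals_empty = (x == '')
truth_value = bool(x)

print(equals_false, equals_zero, equals_empty)  # None equals none of these
print(truth_value)                              # but it is treated as false
print(x is None)
```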
3. Operators
The final category of tokens that we shall study in this module are operators. Operators are used to represent operations to be performed on operands (values or quantities), and may be divided into the following types:
- Arithmetic operators
- Assignment operators
- Comparison operators
- Logical operators
- Identity operators
- Membership operators
- Bitwise operators
3.1. Arithmetic Operators
Arithmetic operators perform common arithmetic operations on numerical values, as follows:
# Addition
print(13 + 7)

# Subtraction
print(10 - 7)

# Multiplication
print(64 * 8)

# Division
print(225 / 15)

# Modulus (remainder after division)
print(69 % 8)

# Exponentiation (raising by a power)
print(2 ** 5)

# Floor division (quotient)
print(100 // 7)
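Floor division and modulus are two halves of the same calculation, and the built-in divmod function returns both at once. The sketch below also shows two details that often surprise beginners: the / operator always returns a float, and floor division rounds down towards negative infinity rather than towards zero.

```python
quotient, remainder = divmod(69, 8)
print(quotient, remainder)

# The quotient and remainder always reconstruct the original number.
print(69 == 8 * (69 // 8) + 69 % 8)

# True division gives a float even when the result is a whole number.
print(225 / 15, type(225 / 15).__name__, type(100 // 7).__name__)

# Floor division rounds down, so negative results move away from zero.
print(-7 // 2, -7 % 2)
```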
3.2. Assignment Operators
Assignment operators assign values to variables, as follows:
# Assignment
x = 10
print(x)

# Add and assign (x = x + 5)
x += 5
print(x)

# Subtract and assign (x = x - 2)
x -= 2
print(x)

# Multiply and assign (x = x * 4)
x *= 4
print(x)

# Divide and assign (x = x / 2)
x /= 2
print(x)

# Modulus and assign (x = x % 4)
x %= 4
print(x)

# Exponentiation and assign (x = x ** 8)
x **= 8
print(x)

# Floor division and assign (x = x // 15)
x //= 15
print(x)

# Bitwise AND and assign (x = x & 18)
# Bitwise operators require integers, so convert the float back to an int first
x = int(x)
x &= 18
print(x)

# Bitwise OR and assign (x = x | 10)
x |= 10
print(x)

# Bitwise XOR and assign (x = x ^ 2)
x ^= 2
print(x)

# Bitwise signed right shift and assign (x = x >> 1)
x >>= 1
print(x)

# Bitwise zero fill left shift and assign (x = x << 2)
x <<= 2
print(x)
Now that we have an understanding of assignment operators, we are able to define what a variable is in Python. A variable may be defined as an identifier which has been assigned some type of value, whether that value is a literal, a literal collection or some other object or data structure. In Python, a variable is created the moment you first assign a value to it and does not need to be declared in advance. Languages such as Python where variables and variable types do not need to be declared in advance are called dynamically typed languages.
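Dynamic typing can be seen by assigning values of different types to the same identifier and inspecting it with the built-in type function. This is a minimal sketch with made-up variable names:

```python
value = 10                      # created on first assignment, no declaration
first_type = type(value).__name__

value = 'ten'                   # the same name can later refer to a string
second_type = type(value).__name__

print(first_type, second_type)
```

The type belongs to the value currently assigned, not to the variable name itself.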
3.3. Comparison Operators
Comparison operators compare two values, as follows:
# Equal
print(10 == 100)

# Not equal
print(2 != 5)

# Greater than
print(13 > 12)

# Less than
print(100 < 1_000_000)

# Greater than or equal to
print(15 >= 15)

# Less than or equal to
print(20 <= 19)
3.4. Logical Operators
The logical operators and, or and not are used in control flow to combine conditional statements, as follows:
# AND
print(1 < 100 and 10 < 100)

# OR
print(13 > 14 or 1 > 0)

# NOT
print(not(1 < 100 and 10 < 100))
3.5. Identity Operators
The identity operators is and is not are used to compare variables at the object memory level, i.e. are two variables the same object residing in the same location in memory, as follows:
x = 10
y = 100.0

# IS
print(x is y)
print(x is x)

# IS NOT
print(x is not y)

x = y
print(x is y)
print(x is not y)
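Identity is easy to confuse with equality. In the following sketch, which uses made-up list variables, a and b hold equal values but are two separate objects, whereas c is simply another name for the object that a refers to:

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate list object
c = a           # a second name for the same object as a

print(a == b, a is b)
print(a == c, a is c)
print(id(a) == id(c))
```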
3.6. Membership Operators
The membership operators in and not in are used to test if a given value is present in a given object, as follows:
weekdays = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday')

# IN
print('Thursday' in weekdays)
print('Saturday' in weekdays)

# NOT IN
print('Sunday' not in weekdays)
3.7. Bitwise Operators
The previous operators that we have introduced are, generally speaking, intuitive and relatively straightforward to understand. For beginner programmers though, bitwise operators may seem completely alien and frankly quite bizarre! So what are bitwise operators? Bitwise operators operate on numbers in their binary form, that is when numbers are represented by a series of bits where each bit is either 1 or 0.
Bitwise operations are NOT unique to Python. They are fundamental operations that work at the individual bit level, thereby allowing for fast calculations and comparisons as they are carried out directly by your computer processor. Though beginner programmers are unlikely to use bitwise operations in their code, it is important to understand how they work as they are used extensively in more complex use cases including cryptography, encryption, compression and network communications.
To understand how bitwise operators work, we must first understand how numbers can be represented in binary form. Binary is simply a series of bits. A bit represents the smallest unit of data that a computer can store, where each bit can only take one of two possible values at any one time: 1 or 0. Computers store data and execute instructions in bit multiples called bytes, where 8 bits make 1 byte.
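To represent numbers in binary form, imagine a simple table where each column represents a power of 2. For 8 bits, the columns from left to right are 128, 64, 32, 16, 8, 4, 2 and 1. Using this table, any integer value from 0 to 255 can be represented as a series of bits, where the only possible value for each cell is either 1 or 0. Once the bits have been set, simply add those columns where the bit value is 1 to get your number, as follows:
|Number|128|64|32|16|8|4|2|1|
|0|0|0|0|0|0|0|0|0|
|1|0|0|0|0|0|0|0|1|
|3|0|0|0|0|0|0|1|1|
|85|0|1|0|1|0|1|0|1|
|252|1|1|1|1|1|1|0|0|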
Using these examples, representing numbers in binary form using 8 bits, 0 is 00000000, 1 is 00000001, 3 is 00000011, 85 is 01010101 and 252 is 11111100. Now that we have a basic understanding of how to represent numbers in binary form, let us return to the bitwise operators.
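The column-adding procedure described above can be sketched in a few lines of Python. The helper name bits_to_int is made up for illustration; the built-in format function with the '08b' format specification goes the other way and produces the 8-bit binary form of a number.

```python
def bits_to_int(bits):
    """Add up the powers of 2 for every column whose bit is set to 1."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == '1':
            total += 2 ** position
    return total


for number in (0, 1, 3, 85, 252):
    bits = format(number, '08b')      # zero-padded 8-bit binary string
    print(number, bits, bits_to_int(bits))
```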
3.7.1. Bitwise AND
The bitwise AND & operator compares each bit of two operands. If both bits are set to 1, then the output of bitwise AND is also 1. However if either bit is 0, then the output of bitwise AND is 0. For example 17 & 18 = 16, because 00010001 & 00010010 = 00010000.
3.7.2. Bitwise OR
The bitwise OR | operator also compares each bit of two operands. If either bit is set to 1, then the output of bitwise OR is also 1, otherwise 0. For example 16 | 10 = 26, because 00010000 | 00001010 = 00011010.
3.7.3. Bitwise XOR
The bitwise XOR ^ operator (exclusive OR) also compares each bit of two operands. If only one of the two bits is set to 1, then the output of bitwise XOR is also 1, otherwise 0. In other words, the two bits should have opposite values if the outcome of bitwise XOR is to be 1. For example 26 ^ 2 = 24, because 00011010 ^ 00000010 = 00011000.
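To follow these three operators bit by bit, it helps to print the operands and the result as aligned 8-bit patterns. The helper name show below is made up for illustration; it relies only on the '08b' format specification.

```python
def show(symbol, left, right, result):
    """Print both operands and the result as aligned 8-bit patterns."""
    print(f"{left:3d} = {left:08b}")
    print(f"{right:3d} = {right:08b}")
    print(f"{left} {symbol} {right} = {result:08b} = {result}")
    print()


show('&', 17, 18, 17 & 18)
show('|', 16, 10, 16 | 10)
show('^', 26, 2, 26 ^ 2)
```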
3.7.4. Bitwise NOT
The bitwise NOT ~ operator works on only one operand (i.e. it is a unary operator - see below for further details on unary operators). The bitwise complement switches bit values, that is it changes 1 to 0, and 0 to 1. However it is not quite that simple unfortunately!
3.7.5. Two's Complement
Two's Complement is an operation on binary numbers that is used to represent numbers with a sign i.e. positive or negative. To sign a number, the first bit is used to represent the sign and is called the sign bit. If the sign bit is 0, this represents a positive number. If the sign bit is 1, this represents a negative number. Two's Complement is computed by inverting the bits in a given number (i.e. the complement, that is change 1 to 0, and 0 to 1), including the sign bit, and adding 1 to the result. Therefore to represent a positive number, you simply write the binary form as shown above. However to represent a negative number, include the sign bit and compute the Two's Complement of that number. For example to represent the number -200 using 9 bits (8 bits plus the sign bit):
- Represent 200 in binary form = 011001000
- Compute the complement = 100110111
- Add 1 to the result = 100111000
Therefore -200 represented in binary form is 100111000. That is 1 for the sign bit followed by the 8 bits 00111000. And here is the eureka moment: 2^8 = 256 (2 to the power of the number of bits without the sign bit) and 00111000 = 56. If we subtract 256 from 56, we get 56 - 256 = -200!
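The three steps above can be reproduced with ordinary string and integer operations. This is only a sketch of the 9-bit worked example, with made-up variable names; the final line shows a shortcut that masks the negative number down to 9 bits with the bitwise AND operator.

```python
BITS = 9  # 8 bits plus the sign bit, as in the worked example

# Step 1: represent 200 in binary form.
positive = format(200, f'0{BITS}b')

# Step 2: compute the complement by flipping every bit.
inverted = ''.join('1' if bit == '0' else '0' for bit in positive)

# Step 3: add 1 to the result.
negative = format(int(inverted, 2) + 1, f'0{BITS}b')

print(positive, inverted, negative)

# Shortcut: keep only the lowest 9 bits of -200.
shortcut = format(-200 & (2 ** BITS - 1), f'0{BITS}b')
print(shortcut)
```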
Coming back to the bitwise NOT operator, in Python this will:
- Invert the bits (bitwise complement)
- Interpret the result in Two's Complement form i.e. return the decimal value that this Two's Complement binary form represents
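Simply put, bitwise NOT applied to N will always result in -(N+1). For example ~4 = -5: using 9 bits, 4 is 000000100, inverting the bits gives 111111011, and interpreting 111111011 in Two's Complement form gives -5.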
You may now be wondering how you are able to look at 111111011 and know that it represents -5, i.e. how to go from Two's Complement to decimal. One way to perform this reverse computation is to compute the Two's Complement again on 11111011, which gives you 00000101 = 5, and with the sign bit gives you -5. Equivalently, you can compute the decimal value of 11111011 = 251 and subtract 2^8 = 256 (2 to the power of the number of bits without the sign bit) from this to give you -5.
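Both claims can be verified in Python. The sketch below first checks the -(N+1) rule over a range of integers, then decodes a Two's Complement bit string whose first bit is the sign bit. The helper name from_twos_complement is made up for illustration; it subtracts 2 to the power of the total number of bits, which gives the same answer as the subtraction described above.

```python
# Bitwise NOT applied to n always gives -(n + 1).
rule_holds = all(~n == -(n + 1) for n in range(-100, 101))
print(rule_holds)


def from_twos_complement(bits):
    """Decode a bit string whose first bit is the sign bit."""
    value = int(bits, 2)
    if bits[0] == '1':
        value -= 2 ** len(bits)
    return value


print(from_twos_complement('111111011'))
print(from_twos_complement('100111000'))
print(from_twos_complement('000000100'))
```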
3.7.6. Bitwise Signed Right Shift
The bitwise signed right shift >> operator works with two operands. As the name suggests, it shifts all the bits of the first operand to the right by the number of bits specified in the second operand, with all the resultant empty spaces towards the left filled with 0 for positive numbers and 1 for negative numbers. For example 24 >> 2 = 6, because 00011000 shifted right by 2 bits is 00000110.
3.7.7. Bitwise Zero Fill Left Shift
The bitwise zero fill left shift << operator also works with two operands. As the name suggests, it shifts all the bits of the first operand to the left by the number of bits specified in the second operand, with all the resultant empty spaces towards the right filled with 0. For example 12 << 2 = 48, because 00001100 shifted left by 2 bits is 00110000.
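A useful way to think about the shift operators is in terms of powers of 2: shifting right by n bits is the same as floor division by 2 to the power of n, and shifting left by n bits is the same as multiplication by 2 to the power of n. A short sketch with made-up variable names:

```python
right = 24 >> 2
left = 12 << 2

print(right, right == 24 // 2 ** 2)
print(left, left == 12 * 2 ** 2)

# The sign is preserved when shifting a negative number to the right.
negative_right = -24 >> 2
print(negative_right)
```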
3.7.8. Bitwise Operators in Python
Bitwise operators may be implemented in Python as follows:
# Bitwise AND
print(17 & 18)

# Bitwise OR
print(16 | 10)

# Bitwise XOR
print(26 ^ 2)

# Bitwise NOT
print(~4)

# Bitwise signed right shift
print(24 >> 2)

# Bitwise zero fill left shift
print(12 << 2)
3.8. Unary Operators
Unary operators are those operators with only one operand. Binary operators are those operators that have two operands like most of the previous operators. Unary operators in Python are as follows:
x = 5
y = -10
z = False

# Negative
print(-x)

# Unchanged
print(+y)

# Not
print(not z)

# Bitwise NOT
print(~x)
3.9. Operator Precedence
Finally, Python defines an order of precedence for operators - that is, given a single Python expression containing multiple operators, the order in which they are evaluated. The following table summarises that order for the operators introduced in this module (plus a few related expression forms), from highest precedence to lowest precedence:
|Parenthesized expressions and bindings|
|Function call, slicing, subscription and attribute reference|
|Exponentiation|
|Unary positive, unary negative and bitwise NOT|
|Multiplication, matrix multiplication, division, floor division and modulus|
|Addition and subtraction|
|Bitwise shifts|
|Bitwise AND|
|Bitwise XOR|
|Bitwise OR|
|Comparisons, membership and identity tests|
|Logical NOT|
|Logical AND|
|Logical OR|
Note that whilst most Python operators group from left to right, the exponentiation operator ** groups from right to left. Therefore 2 ** 3 ** 2 evaluates to 2 ** (3 ** 2) = 2 ** 9 = 512.
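A few short expressions illustrate how precedence and grouping change a result. The sketch below pairs each expression with an explicitly parenthesized equivalent; the expressions themselves are made up for illustration.

```python
# Exponentiation groups from right to left.
print(2 ** 3 ** 2 == 2 ** (3 ** 2), 2 ** 3 ** 2)

# Exponentiation binds more tightly than a unary minus on its left.
print(-2 ** 2 == -(2 ** 2), -2 ** 2)

# Multiplication before addition, addition before shifts.
print(1 + 2 * 3 == 1 + (2 * 3), 1 + 2 * 3)
print(1 + 2 << 3 == (1 + 2) << 3, 1 + 2 << 3)

# Bitwise AND before bitwise OR.
print(4 | 1 & 2 == 4 | (1 & 2), 4 | 1 & 2)

# Comparisons before logical NOT.
print(not 1 == 2)
```

When in doubt, adding parentheses makes the intended order explicit and costs nothing at run time.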
In this module we have covered the fundamental building blocks of the Python programming language. We have an understanding of how our Python source code is broken down into tokens by the lexical analyzer based on logical lines, indentations, identifiers and keywords. We also have an understanding of the various types of literals and operators available in Python from which we can start building more complex programs.
Using only pen and paper (i.e. do NOT use Python nor Jupyter Notebook!), compute the output of the following Python statements. Once completed, use Python to verify your answers.
# Question 1
print(47 & 55)

# Question 2
print(59 | 44)

# Question 3
print(16 ^ 12)

# Question 4
print(131 // 8)

# Question 5
print(0b1110101)

# Question 6
print((2 + 2) ** (2 + 2))

# Question 7
print(10 * 10 + 2 * (100 / 10))

# Question 8
print(11 & 13 * 2 ** 3)

# Question 9
print((1000 % 30 ^ 17 % 3 ** (1 + 1)) ** 8)

# Question 10
print((~bool(None)) ** 2 << 4)
In the next module, we will continue exploring the fundamental building blocks of the Python programming language including string formatting, conditional statements, basic data structures and control flow.