Course Module • Jillur Quddus

7. Functions and Modules Part 2

Introduction to Python

Jillur Quddus • Founder & Chief Data Scientist • 1st September 2020

Introduction

In this module we will formally introduce Python modules and packages, including how to write and use Python modules, how to construct and distribute Python packages, how to hide Python module entities, how to document Python modules, and Python hashbangs. Specifically we will cover:

  • Python Modules - importing modules, qualifying entities, initialising modules, writing and using modules, the __name__ variable, Python hashbangs, and module documentation
  • Python Packages - creating packages, packages vs directories, the __init__.py file, and hiding module entities

The source code for this module may be found in the public GitHub repository for this course. Where interactive code snippets are provided in this module, you are strongly encouraged to type and execute these Python statements in your own Jupyter notebook instance.

1. Modules

If we were to write the user-defined functions that we created in the Functions and Modules Part 1 module of this course directly in the Python interpreter instead of Jupyter Notebook (or another desktop or web IDE), then as soon as we quit the interpreter, we would lose those function definitions and would need to rewrite them from scratch.

To persist a Python application beyond a single session, we write Python code in a script file using a text editor or an integrated development environment (IDE). For larger Python programs that may contain hundreds, thousands or even tens of thousands of lines of code, we would look to split our program not only across multiple re-usable Python functions, but also across multiple script files. And if we are writing several separate Python programs, we would like a means to re-use these functions and script files across those programs instead of copying and pasting the same code over and over again.

Python modules are files containing Python definitions and statements, which can then be imported into other modules. The file name is the module name with the .py file extension. Python modules can contain both executable statements and function definitions. Statements that sit outside of function definitions are intended to initialize the module: they are executed only the first time the module name is encountered in an import statement (they are also executed if the module is run as a script and standalone application).

The Python standard library provides a wide range of ready-made modules, some of which we have already encountered in this course thus far, such as copy, datetime, functools, math, pprint, random and unicodedata. These standard library modules can be imported using the same syntax as user-defined modules and exhibit the same behaviour, as described in the following sections of this module. To learn more about the Python Standard Library, including built-in modules, please visit The Python Standard Library.

Provided below is an example user-defined Python module, following the Python style conventions described in the Google Python Style Guide and the PEP 8 Style Guide for Python Code. The file name is numbertools.py and hence the module name is numbertools. This module contains one module-level variable called mobius_phi, and five function definitions: is_int(), is_even(), is_prime(), is_fibonacci() and is_perfect_square(), which test whether a given number is an integer, even, a prime number, a Fibonacci number, and a perfect square respectively.

#!/usr/bin/env python3
"""Collection of tools for number testing

This module demonstrates the creation and usage of modules in Python.
The documentation standard for modules is to provide a docstring at the top
of the module script file. This docstring consists of a one-line summary
followed by a more detailed description of the module. Sections may also be
included in module docstrings, and are created with a section header and a 
colon followed by a block of indented text. Refer to
https://www.python.org/dev/peps/pep-0008/ for the PEP 8 style guide for
Python code for further information.

Attributes:
    mobius_phi (float): Module level variables are documented in
    either the ``Attributes`` section of the module docstring, or in an
    inline docstring immediately following the variable. Either form is
    acceptable, however HyperLearning AI prefer module level variables be
    documented in the module docstring. In this case, mobius_phi is a
    constant value used as part of the Mobius test to determine whether
    a given number is a Fibonacci number or not.

"""

import math

mobius_phi = 0.5 + 0.5 * math.sqrt(5.0)


def is_int(num):
    """Test whether a given number is an integer or not.

    Tests whether a given number is an integer or not using the in-built
    isinstance() Python function, which returns True if the given object
    is of the specified type, otherwise False.

    Args:
        num (int): The number to test whether it is an integer

    Returns:
        bool: True if num is an integer, otherwise False.

    """

    return isinstance(num, int)


def is_even(num):
    """Test whether a given number is even or not.

    Tests whether a given number is even or not using the modulo operator.

    Args:
        num (int): The number to test whether it is even

    Returns:
        bool: True if num is even, otherwise False

    """

    return num % 2 == 0


def is_prime(num):
    """Test whether a given number is a prime number or not.

    Tests whether a given number is a prime number or not, by first testing
    whether it is 0, 1, negative or not a whole number. If neither of these
    conditions are met, then the function proceeds to test whether the given
    number can be divided by the numbers from 2 to the floor division of the
    given number by 2 without a remainder. If not, then the given number is
    indeed a prime number.

    Args:
        num (int): The number to test whether it is a prime number

    Returns:
        bool: True if num is a prime number, otherwise False

    """

    if num <= 1 or num % 1 > 0:
        return False
    # The upper bound must include num // 2, otherwise 4 is reported as prime
    for i in range(2, num // 2 + 1):
        if num % i == 0:
            return False
    return True


def is_fibonacci(num):
    """Test whether a given number is a Fibonacci number or not.

    Tests whether a given number is a Fibonacci number or not using
    the Mobius Test.

    Args:
        num (int): The number to test whether it is a Fibonacci number

    Returns:
        bool: True if num is a Fibonacci number, otherwise False

    """

    a = mobius_phi * num
    return num == 0 or abs(round(a) - a) < 1.0 / num


def is_perfect_square(num):
    """Test whether a given number is a perfect square.

    Tests whether a given number is a perfect square or not based
    on the Babylonian method for computing square roots.

    Args:
        num (int): The number to test whether it is a perfect square

    Returns:
        bool: True if num is a perfect square, otherwise False

    """

    if num < 0:
        return False
    if num == 0 or num == 1:
        return True

    x = num // 2
    y = {x}
    while x * x != num:
        x = (x + (num // x)) // 2
        if x in y:
            return False
        y.add(x)
    return True

1.1. Importing Modules

To import our numbertools module, we use the import keyword followed by the name of the module, as follows:

# Import our numbertools module
import numbertools

We can now call a function from the module using the <module_name>.<function_name> syntax, as follows:

# Call functions from the module
print(numbertools.is_int(0.5))
print(numbertools.is_even(1_000_002))
print(numbertools.is_prime(277))
print(numbertools.is_fibonacci(12))
print(numbertools.is_perfect_square(1444))

We can also access any variables declared in the module (outside of module function bodies), using the <module_name>.<variable_name> syntax, as follows:

# Access variables from the module
print(numbertools.mobius_phi)

Note that each module is only imported once per Python interpreter session to improve efficiency when running Python programs. If changes are made to a module after it has been imported, then the Python interpreter must be restarted for those changes to take effect. Alternatively, we can import the importlib standard library module and call importlib.reload(), passing it the module object (for example importlib.reload(numbertools)), to reload a specific module interactively.
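
The caching behaviour described above can be observed with a small experiment. The following is a minimal sketch only: the module name reload_demo and its VERSION variable are hypothetical and are created in a temporary directory purely for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway module (hypothetical name) in a temporary directory
module_dir = Path(tempfile.mkdtemp())
module_file = module_dir / "reload_demo.py"
module_file.write_text("VERSION = 1\n")

# Make the temporary directory importable and import the module
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
import reload_demo
first_value = reload_demo.VERSION

# Change the module on disk, then import it again: the second import
# is served from the cache of already imported modules
module_file.write_text("VERSION = 2  # edited on disk\n")
import reload_demo
cached_value = reload_demo.VERSION

# Explicitly reload the module to pick up the change
reload_demo = importlib.reload(reload_demo)
reloaded_value = reload_demo.VERSION

print(first_value, cached_value, reloaded_value)
```

Only the call to importlib.reload() causes the edited source to be executed again; the repeated import statement on its own leaves the old value in place.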

In some cases, you may wish to create an alias when you import a module or module entity (e.g. a function or variable defined in the module) to improve the readability of your subsequent code. We can create an alias for an imported module or entity using the as keyword followed by your chosen alias name. Consequently, in our subsequent code we must reference the module or entity by using its alias instead of the original entity name, as follows:

# Create an alias for a module
import numbertools as nt

# Call module entities qualified with the module alias
print(nt.is_perfect_square(9801))
print(nt.mobius_phi)

We can use the dir() Python function, applied to modules, to list all the entity names provided by a given module (i.e. function names and variable names), as follows:

# List all the function and variable names in a given module
print(dir(nt))
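
The list returned by dir() also contains special names such as __doc__ and __name__. As a small sketch, assuming we only care about the public entities, we can filter out every name that begins with an underscore. The standard library math module is used here so that the snippet runs without numbertools being present.

```python
import math

# Keep only the names that do not start with an underscore
public_names = [name for name in dir(math) if not name.startswith("_")]
print(public_names)
```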

Finally we may choose to only import specific entities from a given module. For example, we may only be interested in importing the sin(), cos() and tan() trigonometric functions along with the radians() angular conversion function and pi constant value from the Python math standard library module rather than the entire module. We can import specific entities from a module using the from keyword, as follows:

# Import specific entities
from math import cos, pi, radians, sin, tan

# Use the specifically imported entities
print(round(cos(radians(90)), 0))
print(round(sin(radians(90)), 0))
print(round(tan(radians(45)), 0))
print(round(pi, 10))

Note that if we use the from keyword to import specific entities from a module, then we do not use the module name to qualify that entity in our code. Rather we just use the entity name, such as in the example above where we use pi without qualifying it with the module name math. Also note that it is possible to use the * operator with from to import all entities from a module (except those with a name beginning with an underscore), such as from math import *. However this is not recommended as it reduces the readability of your code and may introduce entities that are unknown to the developer and have not been explicitly imported, potentially overwriting names that are already defined and causing bugs which may be hard or time-consuming to debug.
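
To illustrate how a wildcard import can silently replace a name, the sketch below performs two wildcard imports one after the other. As an assumption made purely for this illustration, the imports are run with exec() inside a scratch dictionary called namespace, so that our own global names are left untouched. Both math and cmath are standard library modules that define a function called sqrt.

```python
import cmath
import math

# Scratch namespace (a plain dict) that receives the imported names
namespace = {}

# After the first wildcard import, sqrt refers to math.sqrt
exec("from math import *", namespace)
sqrt_after_math = namespace["sqrt"]

# The second wildcard import overwrites sqrt without any warning
exec("from cmath import *", namespace)
sqrt_after_cmath = namespace["sqrt"]

print(sqrt_after_math is math.sqrt)
print(sqrt_after_cmath is cmath.sqrt)
print(sqrt_after_cmath(-1))
```

Code written against math.sqrt(), which raises an error for negative numbers, would now quietly return complex numbers instead, which is the kind of hard-to-trace behaviour that explicit imports avoid.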

1.2. Module Search Path

When you import a module, or specific module entities, into your Python program, Python will look for that module in the following default and priority-ordered locations:

  1. Built-in Module - Python will first check whether the module is a built-in module, that is a module compiled into the interpreter itself and listed in sys.builtin_module_names, for example sys. Standard library modules that are not built in, such as datetime, are found through the installation-dependent directories described below.
  2. Current Directory - Python will then search the current directory of the running Python interpreter (note that this is how our numbertools module was found in the examples above).
  3. PYTHONPATH - Python will then check an environment variable, if set, called PYTHONPATH which may contain a list of directories.
  4. Installation-dependent default directory - Python will then check the default directories as defined by your specific Python installation. In the case of Anaconda, this will include the lib/python[n]/site-packages directory of your Anaconda installation.

This list of locations may be modified by updating sys.path - a list object which holds the directories in which Python will look for modules, as follows:

# Examine and modify sys.path
import sys

# Examine sys.path
print(sys.path)

# Append a location to sys.path
sys.path.append('/foo/bar/code')
print(sys.path)
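
As a minimal sketch of why we might modify sys.path, the example below writes a small module to a temporary directory, appends that directory to sys.path and then imports the module. The module name pathdemo_tools and its double() function are hypothetical and exist only for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a small module (hypothetical name) to a directory that is not
# on the module search path by default
extra_dir = Path(tempfile.mkdtemp())
(extra_dir / "pathdemo_tools.py").write_text(
    "def double(x):\n"
    "    return 2 * x\n"
)

# Append the directory to the search path so the import below succeeds
sys.path.append(str(extra_dir))
importlib.invalidate_caches()

import pathdemo_tools

print(pathdemo_tools.double(21))
print(pathdemo_tools.__file__)
```

The __file__ attribute confirms which location on the search path the module was actually loaded from.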

1.3. Executing Modules as Scripts

In addition to importing modules, they may be executed as scripts in their own right. In other programming languages, such as Java, the main() method or function acts as the entry point for an application, and it is through this entry point that all other computation required for your program to run will be invoked. For example in Java, the main method looks as follows:

public static void main(String[] args) {
    System.out.println("I am the 1st statement that will get executed");
    // more commands ...
}

In Python however, there is no such explicit main() function. Instead there is a special variable called __name__ (the word 'name' with two leading underscore and two trailing underscore characters). If a module is executed as a standalone program, then the Python interpreter will assign to __name__ the string literal "__main__" (the word 'main' with two leading underscore and two trailing underscore characters). However if the module is only being imported, then the Python interpreter will assign to __name__ the module name as a string.
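
The two possible values of __name__ can be inspected directly. In this short sketch, run_context() is a hypothetical helper written only for illustration; it reports how the file containing it is being used.

```python
import math


def run_context():
    """Report how the current file is being used, based on __name__."""
    if __name__ == "__main__":
        return "executed as a standalone program"
    return f"imported as module '{__name__}'"


# An imported module reports its own module name
print(math.__name__)

# The current file reports "__main__" when run directly
print(run_context())
```

Note that code typed into a Jupyter notebook or the interactive interpreter also runs with __name__ set to "__main__".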

We can therefore include a block of code at the end of a module that first uses an if conditional statement to test whether __name__ == "__main__". If this evaluates to True, then the following indented block of code is executed as if it were the main() function of our standalone program. Let us update our numbertools module to include a block of code that should be executed should numbertools be executed as a script and standalone program:

#!/usr/bin/env python3
"""Collection of tools for number testing

This module demonstrates the creation and usage of modules in Python.
The documentation standard for modules is to provide a docstring at the top
of the module script file. This docstring consists of a one-line summary
followed by a more detailed description of the module. Sections may also be
included in module docstrings, and are created with a section header and a 
colon followed by a block of indented text. Refer to
https://www.python.org/dev/peps/pep-0008/ for the PEP 8 style guide for
Python code for further information.

Attributes:
    mobius_phi (float): Module level variables are documented in
    either the ``Attributes`` section of the module docstring, or in an
    inline docstring immediately following the variable. Either form is
    acceptable, however HyperLearning AI prefer module level variables be
    documented in the module docstring. In this case, mobius_phi is a
    constant value used as part of the Mobius test to determine whether
    a given number is a Fibonacci number or not.

"""

import math

mobius_phi = 0.5 + 0.5 * math.sqrt(5.0)


def is_int(num):
    """Test whether a given number is an integer or not.

    Tests whether a given number is an integer or not using the in-built
    isinstance() Python function, which returns True if the given object
    is of the specified type, otherwise False.

    Args:
        num (int): The number to test whether it is an integer

    Returns:
        bool: True if num is an integer, otherwise False.

    """

    return isinstance(num, int)


def is_even(num):
    """Test whether a given number is even or not.

    Tests whether a given number is even or not using the modulo operator.

    Args:
        num (int): The number to test whether it is even

    Returns:
        bool: True if num is even, otherwise False

    """

    return num % 2 == 0


def is_prime(num):
    """Test whether a given number is a prime number or not.

    Tests whether a given number is a prime number or not, by first testing
    whether it is 0, 1, negative or not a whole number. If neither of these
    conditions are met, then the function proceeds to test whether the given
    number can be divided by the numbers from 2 to the floor division of the
    given number by 2 without a remainder. If not, then the given number is
    indeed a prime number.

    Args:
        num (int): The number to test whether it is a prime number

    Returns:
        bool: True if num is a prime number, otherwise False

    """

    if num <= 1 or num % 1 > 0:
        return False
    # The upper bound must include num // 2, otherwise 4 is reported as prime
    for i in range(2, num // 2 + 1):
        if num % i == 0:
            return False
    return True


def is_fibonacci(num):
    """Test whether a given number is a Fibonacci number or not.

    Tests whether a given number is a Fibonacci number or not using
    the Mobius Test.

    Args:
        num (int): The number to test whether it is a Fibonacci number

    Returns:
        bool: True if num is a Fibonacci number, otherwise False

    """

    a = mobius_phi * num
    return num == 0 or abs(round(a) - a) < 1.0 / num


def is_perfect_square(num):
    """Test whether a given number is a perfect square.

    Tests whether a given number is a perfect square or not based
    on the Babylonian method for computing square roots.

    Args:
        num (int): The number to test whether it is a perfect square

    Returns:
        bool: True if num is a perfect square, otherwise False

    """

    if num < 0:
        return False
    if num == 0 or num == 1:
        return True

    x = num // 2
    y = {x}
    while x * x != num:
        x = (x + (num // x)) // 2
        if x in y:
            return False
        y.add(x)
    return True


if __name__ == "__main__":

    print('----- Number Tools -----')
    num = int(input('Please enter any integer: '))
    print(f'Testing if {num} is an integer: {is_int(num)}')
    print(f'Testing if {num} is an even number: {is_even(num)}')
    print(f'Testing if {num} is a prime number: {is_prime(num)}')
    print(f'Testing if {num} is a Fibonacci number: {is_fibonacci(num)}')
    print(f'Testing if {num} is a perfect square: {is_perfect_square(num)}')

We can now run our numbertools module as a script and standalone program via the command line as follows:

python numbertools.py

>> ----- Number Tools -----
>> Please enter any integer: 13
>> Testing if 13 is an integer: True
>> Testing if 13 is an even number: False
>> Testing if 13 is a prime number: True
>> Testing if 13 is a Fibonacci number: True
>> Testing if 13 is a perfect square: False

One recommended strategy to achieve the same outcome is to define an explicit main() function in your Python module, and place all code that you want the Python interpreter to run when the module is executed as a standalone program inside this function. Thereafter, if __name__ == "__main__" will simply call this main() function. The advantage of this approach is that it is aligned with other programming languages such as Java and C++ that do have main() functions, whilst also improving the readability of your Python program.

1.4. Command Line Arguments

In the example above, our standalone numbertools Python application prompts the user for an integer. However we can also pass arguments to a standalone application at the same time we execute its module as a script via the command line. Fortunately, Python provides the argparse module that makes it straightforward to create user-friendly command-line interfaces. We can use argparse to instruct the Python interpreter to expect both required and optional arguments to be passed to a standalone application.

Let us update our numbertools module to expect a mandatory integer to be passed to it via the command line when it is executed as a script. First we create an ArgumentParser object and assign it to the variable parser. Next we call the add_argument() parser method to expect a mandatory integer to be passed to the standalone application. Next we call the parse_args() parser method that parses and converts argument strings to objects. Finally we assign to the variable number the parsed integer object, which is then passed as an argument to our main() function.

#!/usr/bin/env python3
"""Collection of tools for number testing

This module demonstrates the creation and usage of modules in Python.
The documentation standard for modules is to provide a docstring at the top
of the module script file. This docstring consists of a one-line summary
followed by a more detailed description of the module. Sections may also be
included in module docstrings, and are created with a section header and a 
colon followed by a block of indented text. Refer to
https://www.python.org/dev/peps/pep-0008/ for the PEP 8 style guide for
Python code for further information.

Attributes:
    mobius_phi (float): Module level variables are documented in
    either the ``Attributes`` section of the module docstring, or in an
    inline docstring immediately following the variable. Either form is
    acceptable, however HyperLearning AI prefer module level variables be
    documented in the module docstring. In this case, mobius_phi is a
    constant value used as part of the Mobius test to determine whether
    a given number is a Fibonacci number or not.

"""

import argparse
import math

mobius_phi = 0.5 + 0.5 * math.sqrt(5.0)


def is_int(num):
    """Test whether a given number is an integer or not.

    Tests whether a given number is an integer or not using the in-built
    isinstance() Python function, which returns True if the given object
    is of the specified type, otherwise False.

    Args:
        num (int): The number to test whether it is an integer

    Returns:
        bool: True if num is an integer, otherwise False.

    """

    return isinstance(num, int)


def is_even(num):
    """Test whether a given number is even or not.

    Tests whether a given number is even or not using the modulo operator.

    Args:
        num (int): The number to test whether it is even

    Returns:
        bool: True if num is even, otherwise False

    """

    return num % 2 == 0


def is_prime(num):
    """Test whether a given number is a prime number or not.

    Tests whether a given number is a prime number or not, by first testing
    whether it is 0, 1, negative or not a whole number. If neither of these
    conditions are met, then the function proceeds to test whether the given
    number can be divided by the numbers from 2 to the floor division of the
    given number by 2 without a remainder. If not, then the given number is
    indeed a prime number.

    Args:
        num (int): The number to test whether it is a prime number

    Returns:
        bool: True if num is a prime number, otherwise False

    """

    if num <= 1 or num % 1 > 0:
        return False
    # The upper bound must include num // 2, otherwise 4 is reported as prime
    for i in range(2, num // 2 + 1):
        if num % i == 0:
            return False
    return True


def is_fibonacci(num):
    """Test whether a given number is a Fibonacci number or not.

    Tests whether a given number is a Fibonacci number or not using
    the Mobius Test.

    Args:
        num (int): The number to test whether it is a Fibonacci number

    Returns:
        bool: True if num is a Fibonacci number, otherwise False

    """

    a = mobius_phi * num
    return num == 0 or abs(round(a) - a) < 1.0 / num


def is_perfect_square(num):
    """Test whether a given number is a perfect square.

    Tests whether a given number is a perfect square or not based
    on the Babylonian method for computing square roots.

    Args:
        num (int): The number to test whether it is a perfect square

    Returns:
        bool: True if num is a perfect square, otherwise False

    """

    if num < 0:
        return False
    if num == 0 or num == 1:
        return True

    x = num // 2
    y = {x}
    while x * x != num:
        x = (x + (num // x)) // 2
        if x in y:
            return False
        y.add(x)
    return True


def main(num):
    """Entry point for the Number Tools application.

    Given a number, this function will test whether that given number
    is an integer, an even number, a prime number, a Fibonacci number
    and a perfect square.

    Args:
        num (int): The number to test

    Returns:
        None

    """

    print('----- Number Tools -----')
    print(f'Testing if {num} is an integer: {is_int(num)}')
    print(f'Testing if {num} is an even number: {is_even(num)}')
    print(f'Testing if {num} is a prime number: {is_prime(num)}')
    print(f'Testing if {num} is a Fibonacci number: {is_fibonacci(num)}')
    print(f'Testing if {num} is a perfect square: {is_perfect_square(num)}')


if __name__ == "__main__":

    # Define the required command line arguments
    parser = argparse.ArgumentParser(description='Number Tools Application')
    parser.add_argument("-n", "--num", type=int, required=True,
                        help="Any integer value")

    # Parse the command line arguments
    args = parser.parse_args()
    number = args.num

    # Call the main function
    main(number)

We can now run our numbertools module as a script and standalone program via the command line and pass an argument to it as follows:

python numbertools.py -n 13

>> ----- Number Tools -----
>> Testing if 13 is an integer: True
>> Testing if 13 is an even number: False
>> Testing if 13 is a prime number: True
>> Testing if 13 is a Fibonacci number: True
>> Testing if 13 is a perfect square: False

python numbertools.py --num 144

>> ----- Number Tools -----
>> Testing if 144 is an integer: True
>> Testing if 144 is an even number: True
>> Testing if 144 is a prime number: False
>> Testing if 144 is a Fibonacci number: True
>> Testing if 144 is a perfect square: True

If we fail to pass the mandatory command line arguments or they are passed in an incorrect format, then argparse will automatically construct and display an informative message to the user, as follows:

python numbertools.py

>> usage: numbertools.py [-h] -n NUM
>> numbertools.py: error: the following arguments are required: -n/--num

To learn more about the argparse module, please refer to the official Python documentation for the module. There is also an introductory argparse tutorial that covers the core concepts of the module, including positional and optional arguments.
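
As a brief sketch of the difference between positional and optional arguments, the parser below accepts the number as a positional argument and adds an optional flag. This is a hypothetical variant and not the interface of our numbertools module: the build_parser() helper and the --verbose flag are assumptions made for illustration. Passing an explicit list of strings to parse_args() lets us try the parser without using the command line.

```python
import argparse


def build_parser():
    """Build a parser with one positional and one optional argument."""
    parser = argparse.ArgumentParser(description="Number Tools Application")
    # Positional argument: identified by its position, always required
    parser.add_argument("num", type=int, help="Any integer value")
    # Optional argument: identified by its flag, False unless supplied
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Print additional detail")
    return parser


# Parse an explicit list of strings instead of the real command line
args = build_parser().parse_args(["144", "--verbose"])
print(args.num, args.verbose)

# The optional flag may be left out, the positional argument may not
quiet_args = build_parser().parse_args(["13"])
print(quiet_args.num, quiet_args.verbose)
```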

1.5. Module Documentation

As demonstrated in our numbertools example module, we use docstrings (string literals enclosed in triple """ quotations) to document functions, methods, and modules (and classes, which we will study in the Classes and Objects Part 1 module of this course) in Python. Module docstrings specifically are written at the start of the module file, and should (by convention) begin with a one-line summary of the module followed by a more detailed description. Guided by the Google Python Style Guide standard, module docstrings may then optionally contain a brief description of exported classes and functions and/or usage examples. Module-level variables may also be described in the module docstring, normally in a section entitled Attributes followed by an indented list of the module-level variables along with their data type (optional) and a brief description of that variable.

To learn more about styling and docstring conventions in Python, please refer to PEP 8 Style Guide for Python Code and PEP 257 Docstring Conventions respectively.

Standard comments in our Python code (that is lines that start with a #) are ignored by the Python interpreter but are used to explain to other developers certain functional aspects of the code with a view to developing software libraries, applications and services that are easier to maintain and update. Docstrings however are not ignored by the Python interpreter, and are bound to their respective object (i.e. function, method, module or class) via that object's __doc__ attribute (that is the word 'doc' with two leading and two trailing underscore characters).

For example, to access and display (in a human friendly manner) the docstring for our numbertools module, we can call the print() function given numbertools.__doc__. Similarly to access the docstring for any function, method or class, simply access the __doc__ attribute for that object, as follows:

# Access and display a module's docstring
print(numbertools.__doc__)

# Access and display a built-in module's docstring
import math
print(math.__doc__)

# Access and display a function's docstring
print(numbertools.is_fibonacci.__doc__)

# Access and display a built-in function's docstring
print(sin.__doc__)
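
The difference between a comment and a docstring can be seen with a small function of our own. In this sketch, triple() is a hypothetical function defined only for illustration: its docstring is stored on the function object, whereas its comment is not stored anywhere.

```python
def triple(x):
    """Return x multiplied by three."""
    # A regular comment: ignored by the interpreter and not stored
    return 3 * x


# Only the docstring is available through the __doc__ attribute
print(triple.__doc__)
```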

Finally, to display a comprehensive description of a Python module, function, method or class, we can call the Python help() function on the object in question, as follows:

# Display help on the numbertools module
help(numbertools)

# Display help on a specific function
help(numbertools.is_perfect_square)

# Display help on a specific built-in function
help(len)

# Display help on the built-in math module
help(math)

1.6. Python Hashbang

In computing, a hashbang (commonly referred to as a shebang) is the character sequence #! entered at the very start of the first line of a script file. The # character is used because it denotes a comment in many scripting languages, so the line is ignored by the interpreter that eventually runs the script, and the ! character is referred to as the 'bang'. Script files in general are not compiled and hence are not byte code or executable files - they are simply text files. The purpose of the hashbang is to inform the command-line interpreter, or shell, on Unix-like operating systems which interpreter to use in order to run the script.

The general format of a hashbang is #!interpreter [optional arguments]. In bash scripts for Unix systems, the hashbang may be #!/bin/bash that informs the shell to use the bash interpreter. In general shell scripts for Unix systems, the hashbang may be #!/bin/sh that informs the shell to use the default system shell. In these two examples, the absolute path to the required interpreter is provided. Alternatively we may use the env utility provided with the name of the interpreter as an argument. The previous hashbangs then become #!/usr/bin/env bash and #!/usr/bin/env sh respectively, where the env utility will search for the interpreter executable in the directories listed in the current user's $PATH environment variable (if more than one path to bash, for example, is found then the first one is used), making this approach more flexible and portable across different environments.

We use the #!/usr/bin/env python3 hashbang at the top of a Python script (as you can see in our numbertools Python module script file) to inform the shell on Unix-like operating systems that it should use the env tool to look for a python3 interpreter in the user's $PATH environment variable. The first python3 interpreter that is found is then used to run the script, so the script is effectively run as /usr/bin/env python3 <name of module>.py. PEP 394 The Python Command on Unix-Like Systems recommends explicitly using python3 as the hashbang interpreter for Python 3 scripts, as simply using python may refer to either a python2 or a python3 interpreter (it is still common for some Unix-like operating systems to come pre-installed with Python 2, and some systems may have both Python 2 and Python 3 installed). Furthermore, it is not recommended to use the absolute path to the Python 3 interpreter on your system in the hashbang, as this absolute path may not be the same on different machines, meaning that the script may fail to run in different environments. For example, #!/usr/local/bin/python may not exist on some machines and environments.

Note that the python3 interpreter in Anaconda is found in the bin directory of the Anaconda installation, that is at bin/python3 beneath the installation's absolute path. During installation of Anaconda on Unix-like operating systems, this bin directory is normally added to the start of the user's $PATH environment variable. As such, this will normally be the first python3 interpreter that the env utility will find and hence use to run Python 3 scripts. Note also that when you run the python command via the command-line, the shell will also search the $PATH variable for the first location of an interpreter executable called python, which is why this command will then run the Python interactive interpreter.
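
The lookup that env performs can be approximated from within Python. The sketch below is an approximation rather than a description of how env is implemented: it uses shutil.which() from the standard library, which searches the directories listed in the PATH environment variable in order and returns the first matching executable, or None if there is no match.

```python
import os
import shutil

# The directories that are searched, in priority order
path_dirs = os.environ.get("PATH", "").split(os.pathsep)
print(path_dirs)

# The first python3 executable found in those directories, if any
interpreter_path = shutil.which("python3")
if interpreter_path is None:
    print("No python3 executable was found on the PATH")
else:
    print(f"First python3 found on the PATH: {interpreter_path}")
```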

As a consequence of including the Python hashbang at the start of our Python modules, we can now execute them directly in Unix-like operating systems via the command line (we first need to make the file executable), as follows:

# Make the .py file executable
chmod +x numbertools.py

# Alternatively use octal mode
chmod 764 numbertools.py

# Execute the Python module directly
./numbertools.py -n 144

>> ----- Number Tools -----
>> Testing if 144 is an integer: True
>> Testing if 144 is an even number: True
>> Testing if 144 is a prime number: False
>> Testing if 144 is a Fibonacci number: True
>> Testing if 144 is a perfect square: True
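For readers who prefer to see what the permission changes above mean, the following sketch reproduces them with the os and stat modules from the standard library. The helper names add_execute_permission and describe_mode are hypothetical. Note that, as a simplification, this sketch adds the execute bit for user, group and others regardless of the umask, so it only approximates chmod +x.

```python
import os
import stat


def add_execute_permission(path):
    """Add the execute bit for user, group and others to an existing file."""
    current_mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, current_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def describe_mode(octal_mode):
    """Render an octal permission mode in the style used by ls -l."""
    return stat.filemode(stat.S_IFREG | octal_mode)


print(describe_mode(0o764))  # -rwxrw-r--
```

The rendered string shows that octal mode 764 grants read, write and execute to the owner, read and write to the group, and read-only access to everyone else, so only the owner can execute the file directly.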

1.7. PYC Files

As discussed in the Getting Started in Python module of this course, Python is an interpreted language. This means that each line of your Python source code is read, verified, translated into byte code and executed.

In section 1.6. Python Hashbang of this module, we saw that if we place the hashbang #!/usr/bin/env python3 (or equivalent) as the first line in our Python module script file and make the file executable, then the identified Python interpreter takes responsibility for running the module as a script. However, when a Python module is imported for the first time by another module, a .pyc file containing the compiled byte code of that module will be created (in Python 3, inside a __pycache__ sub-directory alongside the .py module script file). When you next run your Python application, the Python interpreter will first check for the existence of this .pyc file and use it if it is still up to date with the .py module. If the .py module has been changed since the .pyc file was created, then the Python interpreter will re-translate it into byte code and store that compiled byte code in the .pyc file.
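To see where the interpreter expects to find the cached byte code for a given source file, we can ask the standard library directly. The sketch below uses importlib.util.cache_from_source(), which maps a .py path to its corresponding .pyc path without creating any files. The wrapper name bytecode_path and the path /foo/bar/abc.py are illustrative assumptions.

```python
import importlib.util
import sys


def bytecode_path(source_path):
    """Return the path at which the byte code for source_path would be cached."""
    return importlib.util.cache_from_source(source_path)


# The cache tag identifies the interpreter, e.g. cpython-38 for CPython 3.8
print(sys.implementation.cache_tag)
print(bytecode_path("/foo/bar/abc.py"))
```

Because the cache tag is part of the file name, byte code compiled by different Python versions can sit side by side without overwriting each other.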

In some cases, you may wish to explicitly create and share the .pyc compiled byte code of your modules. This may be the case when installing modules for shared use across different users and/or different environments, where the user running the shared Python application does not have the necessary file-system permissions for the automatic creation of .pyc byte code files in the relevant directories when modules are first imported. To overcome this issue, the Python standard library provides the py_compile module to generate a byte code file from a source file. The compile() function in the py_compile module, given the relative or absolute location of a Python .py file, will compile the source file to a byte code file. If the absolute location of the .py file is /foo/bar/abc.py, for example, then the byte code file will be persisted to /foo/bar/__pycache__/abc.cpython-38.pyc by default (where 38 refers to the Python interpreter version, for example Python 3.8). We can provide a custom location for the byte code file to be persisted to by providing an optional cfile argument, as follows:

# Generate a byte code file of our numbertools module
import py_compile
py_compile.compile('numbertools.py')

# Generate a byte code file of our numbertools module in a custom location
py_compile.compile('numbertools.py', cfile='/tmp/numbertools.pyc')

We can now run our numbertools application using the .pyc file instead via the Python interpreter and command line, as follows:

python /tmp/numbertools.pyc -n 144

>> ----- Number Tools -----
>> Testing if 144 is an integer: True
>> Testing if 144 is an even number: True
>> Testing if 144 is a prime number: False
>> Testing if 144 is a Fibonacci number: True
>> Testing if 144 is a perfect square: True
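The compile-then-run workflow above can also be exercised end to end without touching the numbertools module. The following is a minimal sketch under stated assumptions: the function name compile_and_run and the file name hello.py are hypothetical, the source file is written to a temporary directory, and the byte code is executed by the same interpreter that compiled it (byte code is generally not portable between Python versions).

```python
import os
import py_compile
import subprocess
import sys
import tempfile


def compile_and_run(source_code):
    """Compile source_code to a .pyc file and run only the byte code."""
    with tempfile.TemporaryDirectory() as tmp:
        source_file = os.path.join(tmp, "hello.py")
        bytecode_file = os.path.join(tmp, "hello.pyc")
        with open(source_file, "w", encoding="utf-8") as fh:
            fh.write(source_code)
        # doraise=True raises an exception if compilation fails
        py_compile.compile(source_file, cfile=bytecode_file, doraise=True)
        # Remove the source to show that the .pyc file alone is sufficient
        os.remove(source_file)
        result = subprocess.run(
            [sys.executable, bytecode_file],
            capture_output=True, text=True, check=True)
        return result.stdout


print(compile_and_run("print('Hello from byte code')\n"))
```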

2. Packages

Most of us organise our photographs, videos, music, games, personal documents and work documents into a hierarchy of different directories on our computers, devices and cloud storage accounts. This allows us to better manage and locate our files, especially as the number of files that we manage is large and may grow over time. Similarly, for any non-trivial Python application that we develop, we should seek to group similar and related modules together, where a single group is stored in its own directory. In Python, packages are used to create a hierarchical structure of modules. Just as normal directories on our computers can contain multiple files and multiple sub-directories, packages can contain multiple modules and multiple sub-packages.

On our computers, the path to any file is defined by the ordered folder names separated by a slash character (forward slash on Unix-like operating systems, and backward slash on Windows operating systems), for example /foo/bar/myfile.txt. In order to import a module from a Python package, the analogous 'path' is the ordered package and sub-package names separated by the . dot operator, for example foo.bar.mymodule. Officially, packages in Python are a way of structuring Python's module namespace using dot notation. This enables us to avoid module name collisions (i.e. importing two different modules that share the same name) as they can be qualified via their package.

Whilst directories and packages are logically analogous, in order for Python to treat a directory as a package, it must contain a file named __init__.py (that is the word 'init' with two leading and two trailing underscore characters). This initialisation file can either be empty, or hold Python code that is executed when that package is initialised. But in either case, the file must exist for that directory to be treated as a package. We will explore package initialisation files in greater detail in section 2.3. Init File of this module.
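The relationship between directories, __init__.py files and dotted import paths can be demonstrated with a throwaway package. The sketch below builds a small package tree in a temporary directory and imports a module from it by its dotted name. The names demo_pkg, sub, mymodule and greet are all hypothetical and exist only for this illustration.

```python
import importlib
import os
import sys
import tempfile


def build_and_import():
    """Create demo_pkg/sub/mymodule.py on disk and import it by dotted name."""
    with tempfile.TemporaryDirectory() as tmp:
        package_dir = os.path.join(tmp, "demo_pkg")
        sub_package_dir = os.path.join(package_dir, "sub")
        os.makedirs(sub_package_dir)
        # An empty __init__.py marks each directory as a package
        for directory in (package_dir, sub_package_dir):
            open(os.path.join(directory, "__init__.py"), "w").close()
        with open(os.path.join(sub_package_dir, "mymodule.py"), "w") as fh:
            fh.write("def greet():\n    return 'hello from mymodule'\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("demo_pkg.sub.mymodule")
            return module.__name__, module.greet()
        finally:
            # Tidy up so the temporary package does not linger
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.split(".")[0] == "demo_pkg"]:
                del sys.modules[name]


print(build_and_import())
```

The directory path demo_pkg/sub/mymodule.py on disk corresponds directly to the dotted name demo_pkg.sub.mymodule in Python's module namespace.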

The following diagram illustrates an example Python application. In this example, we are designing an end-to-end Python application capable of controlling a driverless-car. Such an application would be extremely complex, hence we would seek to organise our application over different packages and different modules to make it easier to manage and easier for other developers to conceptualise and maintain, as follows:

Figure: Driverless car application

2.1. Importing Packages

As described above, in order to import a module from a Python package, the 'path' is the ordered package and sub-package names separated by a . dot character. In the following example, we import a module from the Pandas data analysis library (pre-bundled with Anaconda) called types found in the nested package pandas.api, as follows:

# Import a module from a nested package in the Pandas library
import pandas.api.types as types
print(types.is_integer_dtype(str))

# Alternatively use from to import a specific module
from pandas.api import types
print(types.is_integer_dtype(str))

# Alternatively use from to import a specific function from a specific module
from pandas.api.types import is_integer_dtype
print(is_integer_dtype(str))
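If Pandas is not available in your environment, the same three import styles can be tried with a package from the Python standard library. The following sketch uses the parse module in the urllib package as a stand-in. The alias parse_module is an arbitrary name chosen for illustration.

```python
# Import a module from a package and give it an alias
import urllib.parse as parse_module

# Alternatively use from to import a specific module
from urllib import parse

# Alternatively use from to import a specific function from a specific module
from urllib.parse import quote

# All three names lead to the same function object
print(parse_module.quote("a b"))
print(parse.quote("a b"))
print(quote("a b"))
```

Each of the three calls prints a%20b, since all three import styles resolve to the very same quote() function.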

2.2. Creating Distribution Packages

In this sub-section we will create our first distribution package in Python. A distribution package is a versioned archive file containing relevant Python packages, modules and other resource files that are required by our release (i.e. a versioned release of our distribution package so that others may install it as a library in their own Python environments). Note that we will not be using Jupyter notebooks to write the Python modules in our packages as we require our Python modules to be .py files (not .ipynb). As such, and in order to follow this sub-section, you can refer to the GitHub repository for this course, specifically the contents of examples/my-first-project.

As discussed in the Getting Started in Python module of this course, to write Python modules you may use any text-editor or a dedicated integrated development environment (IDE) that supports Python development, including PyCharm, PyDev and Visual Studio Code. You can use Jupyter Notebook to edit text files, including .py files, but it lacks the features of a proper IDE such as automatic error checking and code completion.

2.2.1. The Goal

In the examples/my-first-project directory found in the GitHub repository for this course, we have created a Python project called my-first-project with the following structure:

my-first-project/
├── LICENSE
├── README.md
├── myutils/
│   ├── __init__.py
│   ├── collections/
│   │   ├── __init__.py
│   │   ├── dictutils.py
│   │   ├── listutils.py
│   │   └── tupleutils.py
│   ├── numbers/
│   │   ├── __init__.py
│   │   └── numberutils.py
│   └── strings/
│       ├── __init__.py
│       └── stringutils.py
├── setup.py
└── tests/

Using this Python project, we will create a bespoke Python distribution package called myutils containing the following packages and modules (an exhaustive list):

  • myutils.collections.dictutils
  • myutils.collections.listutils
  • myutils.collections.tupleutils
  • myutils.numbers.numberutils
  • myutils.strings.stringutils

Once we have created our Python distribution package, we will then distribute it for others to include in their own Python applications.

2.2.2. Package Modules

First we need to populate our Python modules, namely dictutils, listutils, tupleutils, numberutils and stringutils respectively. For the purposes of this tutorial, we will populate these modules with trivial user-defined Python functions, as follows:

dictutils.py

#!/usr/bin/env python3
"""Collection of useful tools for working with dictionary objects.

This module demonstrates the creation and usage of modules in Python.
The documentation standard for modules is to provide a docstring at the top
of the module script file. This docstring consists of a one-line summary
followed by a more detailed description of the module. Sections may also be
included in module docstrings, and are created with a section header and a
colon followed by a block of indented text. Refer to
https://www.python.org/dev/peps/pep-0008/ for the PEP 8 style guide for
Python code for further information.

"""

import pandas as pd


def convert_to_dataframe(my_dict, column_list):
    """Convert a given dictionary to a Pandas DataFrame.

    Args:
        my_dict (dictionary): The dictionary to convert to a DataFrame
        column_list (list): A list containing ordered column names

    Returns:
        DataFrame: Pandas DataFrame loaded with the given dictionary

    """

    return pd.DataFrame.from_dict(
        my_dict, orient='index', columns=column_list)

listutils.py

#!/usr/bin/env python3
"""Collection of useful tools for working with list objects.

This module demonstrates the creation and usage of modules in Python.
The documentation standard for modules is to provide a docstring at the top
of the module script file. This docstring consists of a one-line summary
followed by a more detailed description of the module. Sections may also be
included in module docstrings, and are created with a section header and a
colon followed by a block of indented text. Refer to
https://www.python.org/dev/peps/pep-0008/ for the PEP 8 style guide for
Python code for further information.

"""


def convert_to_dict(my_keys, my_values):
    """Merge a given list of keys and a list of values into a dictionary.

    Args:
        my_keys (list): A list of keys
        my_values (list): A list of corresponding values

    Returns:
        Dict: Dictionary of the list of keys mapped to the list of values

    """

    return dict(zip(my_keys, my_values))

tupleutils.py

#!/usr/bin/env python3
"""Collection of useful tools for working with tuple objects.

This module demonstrates the creation and usage of modules in Python.
The documentation standard for modules is to provide a docstring at the top
of the module script file. This docstring consists of a one-line summary
followed by a more detailed description of the module. Sections may also be
included in module docstrings, and are created with a section header and a
colon followed by a block of indented text. Refer to
https://www.python.org/dev/peps/pep-0008/ for the PEP 8 style guide for
Python code for further information.

"""


def convert_to_dict(my_tuple):
    """Convert a given tuple of (value, key) tuples to a dictionary.

    Args:
        my_tuple (tuple): A tuple of (value, key) tuples

    Returns:
        Dict: A dictionary mapping each tuple key to its value.

    """

    return dict((y, x) for x, y in my_tuple)

numberutils.py

#!/usr/bin/env python3
"""Collection of useful tools for working with numbers.

This module demonstrates the creation and usage of modules in Python.
The documentation standard for modules is to provide a docstring at the top
of the module script file. This docstring consists of a one-line summary
followed by a more detailed description of the module. Sections may also be
included in module docstrings, and are created with a section header and a 
colon followed by a block of indented text. Refer to
https://www.python.org/dev/peps/pep-0008/ for the PEP 8 style guide for
Python code for further information.

Attributes:
    mobius_phi (float): Module level variables are documented in
    either the ``Attributes`` section of the module docstring, or in an
    inline docstring immediately following the variable. Either form is
    acceptable, however HyperLearning AI prefer module level variables be
    documented in the module docstring. In this case, mobius_phi is a
    constant value used as part of the Mobius test to determine whether
    a given number is a Fibonacci number or not.

"""

import math

mobius_phi = 0.5 + 0.5 * math.sqrt(5.0)


def is_int(num):
    """Test whether a given number is an integer or not.

    Tests whether a given number is an integer or not using the in-built
    isinstance() Python function, which returns True if the given object
    is of the specified type, otherwise False.

    Args:
        num (int): The number to test whether it is an integer

    Returns:
        bool: True if num is an integer, otherwise False.

    """

    return isinstance(num, int)


def is_even(num):
    """Test whether a given number is even or not.

    Tests whether a given number is even or not using the modulo operator.

    Args:
        num (int): The number to test whether it is even

    Returns:
        bool: True if num is even, otherwise False

    """

    return num % 2 == 0


def is_prime(num):
    """Test whether a given number is a prime number or not.

    Tests whether a given number is a prime number or not, by first testing
    whether it is 0, 1, negative or not a whole number. If none of these
    conditions is met, then the function proceeds to test whether the given
    number can be divided by any of the numbers from 2 to the floor division
    of the given number by 2 (inclusive) without a remainder. If not, then
    the given number is indeed a prime number.

    Args:
        num (int): The number to test whether it is a prime number

    Returns:
        bool: True if num is a prime number, otherwise False

    """

    if num <= 1 or num % 1 > 0:
        return False
    # The upper bound must include num // 2, otherwise 4 is reported as prime
    for i in range(2, num // 2 + 1):
        if num % i == 0:
            return False
    return True


def is_fibonacci(num):
    """Test whether a given number is a Fibonacci number or not.

    Tests whether a given number is a Fibonacci number or not using
    the Mobius Test.

    Args:
        num (int): The number to test whether it is a Fibonacci number

    Returns:
        bool: True if num is a Fibonacci number, otherwise False

    """

    a = mobius_phi * num
    return num == 0 or abs(round(a) - a) < 1.0 / num


def is_perfect_square(num):
    """Test whether a given number is a perfect square.

    Tests whether a given number is a perfect square or not based
    on the Babylonian method for computing square roots.

    Args:
        num (int): The number to test whether it is a perfect square

    Returns:
        bool: True if num is a perfect square, otherwise False

    """

    if num < 0:
        return False
    if num == 0 or num == 1:
        return True

    x = num // 2
    y = {x}
    while x * x != num:
        x = (x + (num // x)) // 2
        if x in y:
            return False
        y.add(x)
    return True
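When writing numeric helper functions such as these, it can be useful to cross-check one test against an independent method. The following sketch is not part of the numberutils module. It swaps in a different, well-known technique for the Fibonacci test, namely that a non-negative integer n is a Fibonacci number if and only if 5n² + 4 or 5n² - 4 is a perfect square. The function names are hypothetical, and math.isqrt() requires Python 3.8 or later.

```python
import math


def _is_square(value):
    """Return True if value is a perfect square, using exact integer maths."""
    if value < 0:
        return False
    root = math.isqrt(value)
    return root * root == value


def is_fibonacci_by_squares(num):
    """Cross-check: num is Fibonacci iff 5*num^2 + 4 or 5*num^2 - 4 is square."""
    if num < 0:
        return False
    return _is_square(5 * num * num + 4) or _is_square(5 * num * num - 4)


print([n for n in range(150) if is_fibonacci_by_squares(n)])
# [0, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
```

Because this approach uses only integer arithmetic, it avoids the floating-point rounding that a test based on the golden ratio has to contend with for very large numbers.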

stringutils.py

#!/usr/bin/env python3
"""Collection of useful tools for working with strings.

This module demonstrates the creation and usage of modules in Python.
The documentation standard for modules is to provide a docstring at the top
of the module script file. This docstring consists of a one-line summary
followed by a more detailed description of the module. Sections may also be
included in module docstrings, and are created with a section header and a
colon followed by a block of indented text. Refer to
https://www.python.org/dev/peps/pep-0008/ for the PEP 8 style guide for
Python code for further information.

"""

import re


def calc_word_frequency(my_string, my_word):
    """Calculate the number of occurrences of a given word in a given string.

    Args:
        my_string (str): String to search
        my_word (str): The word to search for

    Returns:
        int: The number of occurrences of the given word in the given string.

    """

    # Remove all non alphanumeric characters from the string
    filtered_string = re.sub(r'[^A-Za-z0-9 ]+', '', my_string)

    # Return the number of occurrences of my_word in the filtered string
    return filtered_string.split().count(my_word)
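A few sample calls help to clarify how calc_word_frequency() behaves at its edges. In the sketch below the function body is copied from stringutils.py above so that the snippet is self-contained. The sample sentences are made up for illustration.

```python
import re


def calc_word_frequency(my_string, my_word):
    """Copy of the stringutils function, repeated here for a runnable demo."""
    filtered_string = re.sub(r'[^A-Za-z0-9 ]+', '', my_string)
    return filtered_string.split().count(my_word)


text = "The cat sat. The end!"

# Punctuation is stripped before counting, so 'end!' is counted as 'end'
print(calc_word_frequency(text, "The"))   # 2
print(calc_word_frequency(text, "end"))   # 1

# The comparison is case-sensitive
print(calc_word_frequency(text, "the"))   # 0

# Hyphens are removed rather than replaced, so 'well-known' becomes 'wellknown'
print(calc_word_frequency("a well-known fact", "well"))  # 0
```

If case-insensitive matching were required, one option would be to lower-case both the string and the word before counting.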

2.2.3. Readme

It is standard practice to include a README.md markdown file in the base of your Python project to describe to other developers what your project does. In our case our README.md markdown file can be found in examples/my-first-project/README.md and contains the following basic description:

# MyUtils
Bespoke Python package containing useful tools for working with dictionaries, lists, numbers, strings and tuples in Python 3.

To learn more about what should be included in a good README.md markdown file, please visit Documenting your projects on GitHub and Make a README (amongst a plethora of other available and relevant resources).

2.2.4. Setup Build Script

setuptools is a library designed to aid in the packaging of Python projects, and comes pre-bundled with Anaconda in its base environment. setuptools requires a build script to define various metadata about the distribution packages that we wish to create, including the distribution package name, version, author, description and the minimum Python version required to use our packages. In our examples/my-first-project Python project, we have a Python file called setup.py which acts as the build script for setuptools, and which we populate as follows:

import setuptools

with open("README.md", "r") as fh:
    long_description = fh.read()

setuptools.setup(
    name="myutils-hyperlearningai",
    version="0.0.1",
    author="Jillur Quddus",
    author_email="[email protected]",
    description="Collection of useful tools for Python",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/hyperlearningai/introduction-to-python",
    packages=setuptools.find_packages(),
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    python_requires='>=3.6',
)

Though most of the metadata is self-explanatory, it is worth highlighting the following arguments:

  • name - this is the distribution name of your package and can only contain letters, numbers, _ and -. The name you choose must not already be taken by another package on pypi.org, the Python Package Index (PyPI) that manages Python software libraries developed and shared by the global Python community.
  • long_description - a detailed description of the package. In our case, we read the contents of our README.md markdown file and assign it to the long description attribute. This is a common pattern implemented by many distribution packages.
  • long_description_content_type - the type of markup that is used for the long description. In our case, it is markdown.
  • packages - a list of all Python import packages that should be included in the final distributed package. Handily, we can use the find_packages() function to automatically find all packages and subpackages instead of listing each package one-by-one.
  • classifiers - additional metadata about the package. As a minimum, we should include the versions of Python that our package is compatible with, the software licence that applies to the package (e.g. MIT), and which operating systems our package will work on. Note that the minimum Python version itself is enforced by the separate python_requires argument.
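To build some intuition for what find_packages() discovers, the following sketch walks a directory tree and collects every directory that contains an __init__.py file. This is a simplified approximation written for illustration and is not the setuptools implementation. The function names find_packages_sketch and demo are hypothetical, and the temporary directory layout simply mimics our my-first-project structure.

```python
import os
import tempfile


def find_packages_sketch(root):
    """Return dotted names of directories under root containing __init__.py."""
    found = []
    for current_dir, dir_names, file_names in os.walk(root):
        if current_dir == root:
            continue
        if "__init__.py" not in file_names:
            # Not a package, so do not descend any further
            dir_names[:] = []
            continue
        relative = os.path.relpath(current_dir, root)
        found.append(relative.replace(os.sep, "."))
    return sorted(found)


def demo():
    """Recreate the myutils layout in a temporary directory and scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        for package in ("myutils", "myutils/collections",
                        "myutils/numbers", "myutils/strings"):
            package_dir = os.path.join(tmp, *package.split("/"))
            os.makedirs(package_dir, exist_ok=True)
            open(os.path.join(package_dir, "__init__.py"), "w").close()
        # tests/ has no __init__.py so it is not treated as a package
        os.makedirs(os.path.join(tmp, "tests"))
        return find_packages_sketch(tmp)


print(demo())
```

In this sketch the tests directory is skipped because it holds no __init__.py file, whereas myutils and each of its three sub-packages are reported by their dotted names.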

2.2.5. License

It is standard practice to include a LICENSE file in the base of your Python project to describe to other developers the details of the software license that you are making your distribution package available under. In our case, we will release our distribution package to be used by other developers around the world under the terms of the MIT license, as follows:

MIT License

Copyright (c) 2020 HyperLearning AI

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

To learn more about the different types of common open source licenses available, and subsequent conditions of use, please visit Choose a License.

2.2.6. Unit Tests

Finally, we should include a directory in the base of Python projects called tests that is designed to store unit test files. Unit testing, and test driven development, is beyond the scope of this course but, at a high-level, it enables developers to take a test-first approach to developing code that promotes automation and improved test coverage. For the purposes of this module, we shall leave this directory empty.

2.2.7. Distribution Package

We are now ready to generate our first distribution package. The easiest way to generate a distribution package is using the command line in which we call the Python interpreter.

If you are running Anaconda under Windows, then you can execute the following commands using Anaconda Prompt, a command-line interface that automatically loads the Anaconda base environment, and which would have been installed during the installation of Anaconda. Click on the Windows Start Menu and start typing "Anaconda", and Anaconda Prompt should be one of the applications returned. In Unix-like operating systems, you can simply use a terminal shell (assuming Conda has been initialised).

First we make sure that we have the latest versions of the setuptools and wheel Python libraries installed in our Python environment, as follows:

python3 -m pip install --user --upgrade setuptools wheel

To generate the distribution package, we first navigate to the directory containing setup.py, in our case examples/my-first-project. We can now execute the following command to generate the distribution package:

cd examples/my-first-project
python3 setup.py sdist bdist_wheel

>> running sdist
>> running egg_info
>> creating myutils_hyperlearningai.egg-info
>> writing myutils_hyperlearningai.egg-info/PKG-INFO
>> writing dependency_links to myutils_hyperlearningai.egg-info/dependency_links.txt
>> writing top-level names to myutils_hyperlearningai.egg-info/top_level.txt
>> writing manifest file 'myutils_hyperlearningai.egg-info/SOURCES.txt'
>> reading manifest file 'myutils_hyperlearningai.egg-info/SOURCES.txt'
>> writing manifest file 'myutils_hyperlearningai.egg-info/SOURCES.txt'
>> running check
>> creating myutils-hyperlearningai-0.0.1
>> creating myutils-hyperlearningai-0.0.1/myutils
>> creating myutils-hyperlearningai-0.0.1/myutils/collections
>> creating myutils-hyperlearningai-0.0.1/myutils/numbers
>> creating myutils-hyperlearningai-0.0.1/myutils/strings
>> creating myutils-hyperlearningai-0.0.1/myutils_hyperlearningai.egg-info
>> copying files to myutils-hyperlearningai-0.0.1...
>> copying README.md -> myutils-hyperlearningai-0.0.1
>> copying setup.py -> myutils-hyperlearningai-0.0.1
>> copying myutils/__init__.py -> myutils-hyperlearningai-0.0.1/myutils
>> copying myutils/collections/__init__.py -> myutils-hyperlearningai-0.0.1/myutils/collections
>> copying myutils/collections/dictutils.py -> myutils-hyperlearningai-0.0.1/myutils/collections
>> copying myutils/collections/listutils.py -> myutils-hyperlearningai-0.0.1/myutils/collections
>> copying myutils/collections/tupleutils.py -> myutils-hyperlearningai-0.0.1/myutils/collections
>> copying myutils/numbers/__init__.py -> myutils-hyperlearningai-0.0.1/myutils/numbers
>> copying myutils/numbers/numberutils.py -> myutils-hyperlearningai-0.0.1/myutils/numbers
>> copying myutils/strings/__init__.py -> myutils-hyperlearningai-0.0.1/myutils/strings
>> copying myutils/strings/stringutils.py -> myutils-hyperlearningai-0.0.1/myutils/strings
>> copying myutils_hyperlearningai.egg-info/PKG-INFO -> myutils-hyperlearningai-0.0.1/myutils_hyperlearningai.egg-info
>> copying myutils_hyperlearningai.egg-info/SOURCES.txt -> myutils-hyperlearningai-0.0.1/myutils_hyperlearningai.egg-info
>> copying myutils_hyperlearningai.egg-info/dependency_links.txt -> myutils-hyperlearningai-0.0.1/myutils_hyperlearningai.egg-info
>> copying myutils_hyperlearningai.egg-info/top_level.txt -> myutils-hyperlearningai-0.0.1/myutils_hyperlearningai.egg-info
>> Writing myutils-hyperlearningai-0.0.1/setup.cfg
>> creating dist
>> Creating tar archive
>> removing 'myutils-hyperlearningai-0.0.1' (and everything under it)
>> running bdist_wheel
>> running build
>> running build_py
>> creating build
>> creating build/lib
>> creating build/lib/myutils
>> copying myutils/__init__.py -> build/lib/myutils
>> creating build/lib/myutils/collections
>> copying myutils/collections/listutils.py -> build/lib/myutils/collections
>> copying myutils/collections/tupleutils.py -> build/lib/myutils/collections
>> copying myutils/collections/__init__.py -> build/lib/myutils/collections
>> copying myutils/collections/dictutils.py -> build/lib/myutils/collections
>> creating build/lib/myutils/strings
>> copying myutils/strings/__init__.py -> build/lib/myutils/strings
>> copying myutils/strings/stringutils.py -> build/lib/myutils/strings
>> creating build/lib/myutils/numbers
>> copying myutils/numbers/__init__.py -> build/lib/myutils/numbers
>> copying myutils/numbers/numberutils.py -> build/lib/myutils/numbers
>> installing to build/bdist.linux-x86_64/wheel
>> running install
>> running install_lib
>> creating build/bdist.linux-x86_64
>> creating build/bdist.linux-x86_64/wheel
>> creating build/bdist.linux-x86_64/wheel/myutils
>> creating build/bdist.linux-x86_64/wheel/myutils/collections
>> copying build/lib/myutils/collections/listutils.py -> build/bdist.linux-x86_64/wheel/myutils/collections
>> copying build/lib/myutils/collections/tupleutils.py -> build/bdist.linux-x86_64/wheel/myutils/collections
>> copying build/lib/myutils/collections/__init__.py -> build/bdist.linux-x86_64/wheel/myutils/collections
>> copying build/lib/myutils/collections/dictutils.py -> build/bdist.linux-x86_64/wheel/myutils/collections
>> copying build/lib/myutils/__init__.py -> build/bdist.linux-x86_64/wheel/myutils
>> creating build/bdist.linux-x86_64/wheel/myutils/strings
>> copying build/lib/myutils/strings/__init__.py -> build/bdist.linux-x86_64/wheel/myutils/strings
>> copying build/lib/myutils/strings/stringutils.py -> build/bdist.linux-x86_64/wheel/myutils/strings
>> creating build/bdist.linux-x86_64/wheel/myutils/numbers
>> copying build/lib/myutils/numbers/__init__.py -> build/bdist.linux-x86_64/wheel/myutils/numbers
>> copying build/lib/myutils/numbers/numberutils.py -> build/bdist.linux-x86_64/wheel/myutils/numbers
>> running install_egg_info
>> Copying myutils_hyperlearningai.egg-info to build/bdist.linux-x86_64/wheel/myutils_hyperlearningai-0.0.1-py3.8.egg-info
>> running install_scripts
>> adding license file "LICENSE" (matched pattern "LICEN[CS]E*")
>> creating build/bdist.linux-x86_64/wheel/myutils_hyperlearningai-0.0.1.dist-info/WHEEL
>> creating 'dist/myutils_hyperlearningai-0.0.1-py3-none-any.whl' and adding 'build/bdist.linux-x86_64/wheel' to it
>> adding 'myutils/__init__.py'
>> adding 'myutils/collections/__init__.py'
>> adding 'myutils/collections/dictutils.py'
>> adding 'myutils/collections/listutils.py'
>> adding 'myutils/collections/tupleutils.py'
>> adding 'myutils/numbers/__init__.py'
>> adding 'myutils/numbers/numberutils.py'
>> adding 'myutils/strings/__init__.py'
>> adding 'myutils/strings/stringutils.py'
>> adding 'myutils_hyperlearningai-0.0.1.dist-info/LICENSE'
>> adding 'myutils_hyperlearningai-0.0.1.dist-info/METADATA'
>> adding 'myutils_hyperlearningai-0.0.1.dist-info/WHEEL'
>> adding 'myutils_hyperlearningai-0.0.1.dist-info/top_level.txt'
>> adding 'myutils_hyperlearningai-0.0.1.dist-info/RECORD'
>> removing build/bdist.linux-x86_64/wheel

Assuming that this command executes successfully, a dist folder is created in the current working folder i.e. examples/my-first-project/dist that contains the following files:

  • myutils-hyperlearningai-0.0.1.tar.gz - the .tar.gz file is a Source Archive for your distribution package, containing the raw source code for the release.
  • myutils_hyperlearningai-0.0.1-py3-none-any.whl - the .whl wheel file is a Built Distribution, meaning that we only need to copy this file to target systems in order to install the package. The Wheel format enables such installation by simple copying of a file.

The project now has the following structure (the intermediate build and .egg-info directories are omitted):

my-first-project/
├── LICENSE
├── README.md
├── dist/
│   ├── myutils-hyperlearningai-0.0.1.tar.gz
│   └── myutils_hyperlearningai-0.0.1-py3-none-any.whl
├── myutils/
│   ├── __init__.py
│   ├── collections/
│   │   ├── __init__.py
│   │   ├── dictutils.py
│   │   ├── listutils.py
│   │   └── tupleutils.py
│   ├── numbers/
│   │   ├── __init__.py
│   │   └── numberutils.py
│   └── strings/
│       ├── __init__.py
│       └── stringutils.py
├── setup.py
└── tests/
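The name of the .whl file is not arbitrary. It follows the pattern name-version-python tag-abi tag-platform tag, with an optional build tag after the version. The sketch below splits a wheel file name into these components. The function name parse_wheel_filename is hypothetical, and the parsing is deliberately simplistic in that it does not validate the individual fields.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its hyphen-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("Not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("Unexpected wheel file name: " + filename)
    return {
        "name": name,
        "version": version,
        "build_tag": build_tag,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


print(parse_wheel_filename("myutils_hyperlearningai-0.0.1-py3-none-any.whl"))
```

For our wheel, the tags py3, none and any indicate a pure Python 3 package with no compiled extensions that can be installed on any platform. Note also that the hyphen in our distribution name has been replaced with an underscore, because hyphens are reserved as field separators in the file name.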

2.2.8. Install from Local Wheel

One option to install our distribution package as a new library in our Python environments is to install directly from the .whl wheel file (which can now be copied and shared across different machines and/or environments that we own) using pip as follows:

pip install dist/myutils_hyperlearningai-0.0.1-py3-none-any.whl

>> Processing ./dist/myutils_hyperlearningai-0.0.1-py3-none-any.whl
>> Installing collected packages: myutils-hyperlearningai
>> Successfully installed myutils-hyperlearningai-0.0.1
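After installation, we may wish to confirm from within Python which version of the distribution package is present in the current environment. The following sketch uses importlib.metadata from the standard library (available from Python 3.8). The helper name installed_version is an assumption made for illustration.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if the wheel was installed, otherwise None
print(installed_version("myutils-hyperlearningai"))
```

Note that the lookup uses the distribution name defined in setup.py (myutils-hyperlearningai), which is distinct from the import name of the package (myutils).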

Our package is now available for us to import into any of our Python modules and applications, as follows:

# Import our generated distribution package myutils
import myutils

# Alternatively import a specific module from a specific package
import myutils.collections.dictutils as dictutils

# Call one of our myutils bespoke functions
my_dict = {
    1: ['python', 3.8], 
    2: ['java', 11], 
    3: ['scala', 2.13]
}

# Convert a dictionary to a Pandas DataFrame using our user-defined dictutils.convert_to_dataframe() function
df = dictutils.convert_to_dataframe(my_dict, ['Language', 'Version'])
df.head()

2.2.9. Install from PyPI

Should you wish to make your new Python distribution package available as a library for any Python developer in the world to install into their own Python environments and use in their Python modules and applications, you can upload your distribution package to the Python Package Index (PyPI). PyPI is an online repository of Python libraries developed and shared by the global Python community that is used by pip to find, install and update Python libraries. Instructions on how to upload your new distribution package to the PyPI test repository can be found here. Once you are ready to upload your distribution package to the real PyPI, please follow the instructions found here.

Once you have successfully uploaded your fully tested distribution package to the real PyPI, any Python developer in the world can install it using pip, just like any other Python library, as follows:

pip install <name of package on PyPI>

2.3. Init File

As discussed earlier in section 2. Packages, in order for Python to treat a directory as a regular package, it must contain a file named __init__.py (that is the word 'init' with two leading and two trailing underscore characters). This initialisation file can either be empty, or hold Python code that is executed when that package is first imported. In either case, the file should exist. Although Python 3.3 and later can import a directory without an __init__.py file as a namespace package, that is a separate mechanism intended for splitting one package across several locations, and it does not run any initialisation code.
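To illustrate that code in __init__.py runs when a package is imported, the following sketch builds a throwaway package in a temporary directory and then imports it. The package name demo_initpkg and the variable DEFAULT_ENCODING are hypothetical names chosen purely for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (demo_initpkg is a hypothetical name)
tmp_dir = tempfile.mkdtemp()
pkg_dir = Path(tmp_dir) / "demo_initpkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text(
    "print('Initialising demo_initpkg')\n"
    "DEFAULT_ENCODING = 'utf-8'\n"
)

# Make the temporary directory importable, then import the package
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()
import demo_initpkg  # runs __init__.py, which prints its message

# Names defined in __init__.py become attributes of the package
print(demo_initpkg.DEFAULT_ENCODING)  # utf-8
```

Importing the package a second time in the same session would not print the message again, because Python caches modules that have already been initialised.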

One common usage of __init__.py is to hide specific module entities. As discussed in section 1.1. Importing Modules, it is possible to use the * operator with from to import all the names that a module defines (except those with a name beginning with an underscore), such as from math import *. The same syntax can be applied to a package, as in from package import *. This is not recommended: it reduces the readability of your code, and it may also import names or sub-modules that the developer is not aware of, and hence may introduce unforeseen issues and bugs into your application that may be hard or time-consuming to debug. Generally speaking, we should explicitly import the sub-modules that we specifically require for our application to work, and not rely on the * operator.
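As a small illustration of the kind of bug that a wildcard import can introduce, the sketch below defines a variable named e and then runs from math import *, which silently replaces it with the mathematical constant of the same name. The exec() calls and the namespace dictionary are used here only so that the wildcard import runs in an isolated namespace that we can inspect; in a real script the two statements would simply appear one after the other.

```python
# An isolated namespace standing in for the global scope of a script
namespace = {}

# The developer defines a variable named e ...
exec("e = 'my error message'", namespace)

# ... and later a wildcard import silently overwrites it with math.e
exec("from math import *", namespace)

print(namespace["e"])  # 2.718281828459045
```

No warning or error is raised when the name is overwritten, which is why explicit imports such as import math or from math import sqrt are generally safer.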

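To mitigate this danger, authors of Python distribution packages/libraries can define an explicit index of the package via a list named __all__ (that is the word 'all' with two leading and two trailing underscore characters) in a package's __init__.py file that contains, as items in the list, the names of those modules that should be imported when from package import * is encountered. If __all__ is not defined, from package import * does not search the package directory for sub-modules; it only imports the names defined in __init__.py, along with any sub-modules that have already been explicitly loaded.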

In the case of our myutils/collections/__init__.py file, for example, we could choose to define an __all__ list such that when from myutils.collections import * is encountered, only the two submodules dictutils and listutils are imported, in effect hiding the submodule tupleutils, as follows:

__all__ = ["dictutils", "listutils"]

This means that should we use from myutils.collections import *, we would not have access to the tupleutils module nor its functions. We would then need to explicitly import this module using from myutils.collections import tupleutils or an equivalent import statement in order to use it within our Python application.
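The behaviour described above can be checked with a small self-contained sketch. It builds a temporary package whose __init__.py defines the same __all__ list, and then inspects which names a wildcard import actually brings in. The package name demo_collections and the NAME variable inside each sub-module are hypothetical stand-ins for myutils.collections, and exec() is used only so that the wildcard import runs in a namespace we can inspect.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package with three sub-modules (hypothetical names)
tmp_dir = tempfile.mkdtemp()
pkg_dir = Path(tmp_dir) / "demo_collections"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text('__all__ = ["dictutils", "listutils"]\n')
for name in ("dictutils", "listutils", "tupleutils"):
    (pkg_dir / f"{name}.py").write_text(f"NAME = '{name}'\n")

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# Run the wildcard import in an isolated namespace
namespace = {}
exec("from demo_collections import *", namespace)

# Only the sub-modules listed in __all__ were imported
imported = sorted(n for n in namespace if not n.startswith("__"))
print(imported)  # ['dictutils', 'listutils']

# The hidden sub-module is still available via an explicit import
exec("from demo_collections import tupleutils", namespace)
print(namespace["tupleutils"].NAME)  # tupleutils
```

Note that __all__ only controls what the wildcard import exposes; it does not prevent a developer from importing tupleutils explicitly.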

Summary

In this module we have covered Python modules and packages. We now know how to create, document and import Python modules into our applications, as well as how to explicitly compile them and execute them as scripts using Python hashbangs. We also have an understanding of how to better structure our applications using Python packages, and we have constructed and distributed our first distribution package.

Homework

Please write Python programs for the following exercises. There may be many ways to solve the challenges below - first focus on writing a working Python program, and thereafter look for ways to make it more efficient using the techniques discussed both in this module and over this course thus far.

  1. Python Modules - GCSE Mathematics
    Write, and properly document, custom Python modules designed to solve common GCSE-level (or equivalent) Mathematics problems. Your Python modules should include the following function definitions (create one Python module to answer the geometry questions, and one Python module to answer the algebra questions):
    • Geometry - Area of a Triangle - given the base and height, return the area of a triangle.
    • Geometry - Area of a Circle - given the radius, return the area of a circle.
    • Geometry - Pythagoras' Theorem - given the lengths of both shorter sides, return the length of the hypotenuse of a right-angled triangle.
    • Algebra - Linear Equations - given a string of the exact format ax + b = y that represents a valid equation, where the numerical values of a, b and y are given (for example 2x + 5 = 21), return the numerical value of x.
    • Algebra - Quadratic Equations - given a string of the exact format x**2 + bx + c = 0 that represents a valid equation, where the numerical values of b and c are given (for example x**2 + 7x + 12 = 0), return all possible values of x.

  2. Python Modules - Verbose GCSE Mathematics
    Update the custom Python modules that you created in question 1 with a series of print() statements to help learners understand the step-by-step processes required to calculate the final answers. For example, you may display the following messages to the screen when solving a given quadratic equation:

print( gcsemaths.algebra.solve_quadratic_equation('x**2 + 7x + 12 = 0') )

>> 1. You have provided the following quadratic equation: x**2 + 7x + 12 = 0
>> 2. To solve this quadratic equation, we need to factorise it in the form: (x + a)(x + b) = 0
>> 3. We need to find two numbers a and b such that: a x b = 12, and a + b = 7
>> 4. If a = 3 and b = 4, then: 3 x 4 = 12, and 3 + 4 = 7
>> 5. The factorised form of your equation is therefore: (x + 3)(x + 4) = 0
>> 6. Therefore either (x + 3) = 0, or (x + 4) = 0
>> 7. If x + 3 = 0, then x = -3
>> 8. If x + 4 = 0, then x = -4
>> 9. Therefore x = -3 or x = -4
>> (-3, -4)

  3. Python Distribution Package - GCSE Mathematics Revision Helper
    Once you have fully-tested your Python modules, organise them into suitable packages (for example gcsemaths.algebra, and gcsemaths.geometry) and generate a distribution package for your library. Then share the wheel file representation of your new GCSE Mathematics revision helper library with any friends and family members with young children to help them revise for GCSE Mathematics (or equivalent).

What's Next

In the next module, we will introduce the object oriented programming (OOP) paradigm for the first time - a means to model and program the world as objects that interact with each other. And we will use Python to create our first user-defined classes and objects.