1.7 Introduction to Property-based Testing

Hypothesis is a Python testing library that we’ll use occasionally in this course for exercises and assignments. It’s already available on the Teaching Lab machines, and you should have installed it on your own computer when you went through the steps of the Software Guide on Quercus.

When writing tests, we often try to identify key properties on the inputs to the function being tested. We then pick representative inputs that meet these properties, and use these inputs to write tests. We can extend this idea to trying to identify key properties of the function itself: central relationships between their inputs and outputs that must hold for all possible inputs. This type of testing is called property-based testing, and the most famous implementation of this type of testing in Python is the hypothesis library.

An example

Let’s see a concrete example of what these property tests might look like. Consider the following function:

def insert_after(lst: List[int], n1: int, n2: int) -> None:
    """After each occurrence of <n1> in <lst>, insert <n2>.

    >>> lst = [5, 1, 2, 1, 6]
    >>> insert_after(lst, 1, 99)
    >>> lst
    [5, 1, 99, 2, 1, 99, 6]
    """

We’ll test two properties of this function, which should hold for any valid input:

1.  `insert_after` always returns `None`.
2.  `insert_after` increases the length of `lst` by the number of times that `n1` occurs in that list.

Our first test is the following:

from typing import List

from hypothesis import given
from hypothesis.strategies import integers, lists

from insert import insert_after


@given(lists(integers()), integers(), integers())
def test_returns_none(lst: List[int], n1: int, n2: int) -> None:
    """Test that insert_after always returns None.
    """
    assert insert_after(lst, n1, n2) is None

The test case (test_returns_none) is preceded by the line @given(lists(integers()), integers(), integers()); what this line does is tell hypothesis to generate “random” inputs of the given types: a list of integers, and then two other integers. These values are then passed to the test function, which then simply calls insert_after on them, and checks that the output is None.

The most interesting part is that the “given” line doesn’t just generate one set of random inputs; instead, it generates dozens of them (or even hundreds, depending on how hypothesis is configured), and runs this test function on each one! We call the input specifiers like integers() or lists() a strategy; we’ll see more examples of strategies throughout the term.

A more complex property

Even though the previous test looked pretty straight-forward, don’t be fooled! Since a property test is just a Python function, we can write pretty complex tests using all of our Python knowledge.

For example, to test the second property we mentioned, we’ll need to store both the original length of lst, and the number of times that n1 appeared in it:

@given(lists(integers()), integers(), integers())
def test_new_item_count(lst: List[int], n1: int, n2: int) -> None:
    """Test that the correct number of items is added.
    """
    num_n1_occurrences = lst.count(n1)
    original_length = len(lst)
    insert_after(lst, n1, n2)

    final_length = len(lst)

    assert final_length - original_length == num_n1_occurrences

An example

A more complex property

Further reading