2.2 Representation Invariants

We now know how to define a class that bundles together related pieces of data and includes methods that operate on that data. These methods provide services to client code, and if we write them sensibly, client code can be sure that every instance it creates will always be in a sensible state. For instance, we can make sure that no data is missing by writing an initializer that creates and initializes every instance attribute. And if, say, one instance attribute must always be greater than another (because that is a rule in the domain of our program), we can ensure that the initializer and all of the methods will never violate that rule.

Let’s return to our Twitter example to consider what writing the methods “sensibly” entails.

Documenting rules with representation invariants

Twitter imposes a 280-character limit on tweets. If we want our code to be consistent with this rule, we must both document it and make sure that every method of the class enforces the rule. First, let’s formalize the notion of “rule”. A representation invariant is a property of the instance attributes that every instance of a class must satisfy. For example, we can say that a representation invariant for our Tweet class is that the content attribute is always at most 280 characters long.

We document representation invariants in the docstring of a class, underneath its attributes. While we could write these representation invariants in English, we often prefer concrete Python code expressions that evaluate to True or False, as such expressions are unambiguous and can be checked directly in our program.

from datetime import date


class Tweet:
    """A tweet, like in Twitter.

    === Attributes ===
    userid: the id of the user who wrote the tweet.
    created_at: the date the tweet was written.
    content: the contents of the tweet.
    likes: the number of likes this tweet has received.

    === Representation Invariants ===
    - len(self.content) <= 280
    """
    # Attribute types
    userid: str
    created_at: date
    content: str
    likes: int

Even though this is a new definition, we have seen representation invariants already: every instance attribute type annotation is a representation invariant! For example, the annotation content: str means that the content of a tweet must always be a string.
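Because these invariants are written as Python expressions, they can be evaluated directly on an instance. The following is a minimal sketch, not part of the Tweet class itself: the function invariants_hold is a hypothetical helper, and the bare-bones initializer is an assumption made only so that the sketch runs on its own.

```python
from datetime import date


class Tweet:
    """A pared-down Tweet with the same four attributes."""
    userid: str
    created_at: date
    content: str
    likes: int

    def __init__(self, who: str, when: date, what: str) -> None:
        # Assumed initializer: stores its arguments without any checking.
        self.userid = who
        self.created_at = when
        self.content = what
        self.likes = 0


def invariants_hold(tweet: Tweet) -> bool:
    """Return whether <tweet> satisfies its documented invariants.

    Hypothetical helper for illustration only.
    """
    # The type annotation content: str and the length limit are both
    # representation invariants, so both are checked here.
    return isinstance(tweet.content, str) and len(tweet.content) <= 280


ok = Tweet('Rukhsana', date(2017, 9, 16), 'Hey!')
too_long = Tweet('Rukhsana', date(2017, 9, 16), 'a' * 281)
print(invariants_hold(ok))        # True
print(invariants_hold(too_long))  # False
```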

Enforcing representation invariants

Even though documenting representation invariants is essential, documentation alone is not enough. As the author of a class, you have the responsibility of ensuring that each method is consistent with the representation invariants, in the following two ways:

  1. At the beginning of the method body (i.e., right when the method is called), you can always assume that all of the representation invariants are satisfied.
  2. At the end of the method (i.e., right before the method returns), it is your responsibility to ensure that all of the representation invariants are satisfied.

That is, each representation invariant is both a precondition and postcondition of every method in a class. You are free to temporarily violate the representation invariants during the body of the method (and will often do so while mutating the object), as long as by the end of the method, all of the invariants are restored.

The initializer method is an exception: it does not have any preconditions on the attributes (since they haven’t even been created yet), but it must initialize the attributes so that they satisfy every representation invariant.
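To make these two responsibilities concrete, here is a rough sketch of a method that temporarily violates an invariant and then restores it. Both _check_invariants and append are hypothetical names introduced for illustration, and the sketch assumes that callers of the initializer pass a message of at most 280 characters.

```python
from datetime import date


class Tweet:
    """A tweet whose content is at most 280 characters long."""
    userid: str
    created_at: date
    content: str
    likes: int

    def __init__(self, who: str, when: date, what: str) -> None:
        # No precondition on the attributes: they do not exist yet.
        # Assumption for this sketch: len(what) <= 280.
        self.userid = who
        self.created_at = when
        self.content = what
        self.likes = 0
        self._check_invariants()  # must hold once the initializer finishes

    def _check_invariants(self) -> None:
        """Hypothetical helper: raise AssertionError if an invariant fails."""
        assert len(self.content) <= 280

    def append(self, extra: str) -> None:
        """Hypothetical method: add <extra> to the end of this tweet,
        keeping only the first 280 characters of the result.
        """
        self._check_invariants()            # 1. assumed on entry
        self.content += extra               # may temporarily exceed 280
        self.content = self.content[:280]   # invariant restored
        self._check_invariants()            # 2. guaranteed on exit


t = Tweet('Rukhsana', date(2017, 9, 16), 'x' * 270)
t.append('y' * 20)
print(len(t.content))  # 280
```

Between the two middle lines of append, the content can be up to 290 characters long in this example. That is acceptable, because no other code can observe the object until the method returns.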

In our Twitter code, what method(s) may require modification in order to ensure that our representation invariant (len(self.content) <= 280) is enforced? Currently, the initializer allows the user to create a Tweet object with any message they want, including one that exceeds the limit. There are a variety of strategies that we can take for enforcing our representation invariant.

One approach is to process the initializer arguments so that the instance attributes are initialized to allowed values. For example, we might truncate a tweet message that’s too long:

    def __init__(self, who: str, when: date, what: str) -> None:
        """Initialize a new Tweet.

        If <what> is longer than 280 characters, only the first 280 characters are stored.

        >>> t = Tweet('Rukhsana', date(2017, 9, 16), 'Hey!')
        >>> t.userid
        'Rukhsana'
        >>> t.created_at
        datetime.date(2017, 9, 16)
        >>> t.content
        'Hey!'
        >>> t.likes
        0
        """
        self.userid = who
        self.created_at = when
        self.content = what[:280]
        self.likes = 0
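As a quick check of the truncating behaviour, the sketch below repeats this initializer in a pared-down class (docstring omitted) and passes it a message that is too long. The variable names are illustrative only.

```python
from datetime import date


class Tweet:
    userid: str
    created_at: date
    content: str
    likes: int

    def __init__(self, who: str, when: date, what: str) -> None:
        self.userid = who
        self.created_at = when
        # Slicing never raises an error, even when <what> is short.
        self.content = what[:280]
        self.likes = 0


long_message = 'a' * 300
t = Tweet('Rukhsana', date(2017, 9, 16), long_message)
print(len(t.content))             # 280
print(t.content == long_message)  # False: the last 20 characters were dropped
```

One consequence worth noting is that the truncation is silent: the client code is not told that its message was shortened, which is why the docstring must describe this behaviour.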

Another approach is to not change the code at all, but instead specify a precondition on the initializer:

    def __init__(self, who: str, when: date, what: str) -> None:
        """Initialize a new Tweet.

        Precondition: len(what) <= 280.

        >>> t = Tweet('Rukhsana', date(2017, 9, 16), 'Hey!')
        >>> t.userid
        'Rukhsana'
        >>> t.created_at
        datetime.date(2017, 9, 16)
        >>> t.content
        'Hey!'
        >>> t.likes
        0
        """
        self.userid = who
        self.created_at = when
        self.content = what
        self.likes = 0

As we discussed in 1.3 The Function Design Recipe, a precondition is something that we assume to be true about the function’s input. In the context of this section, we’re saying, “The representation invariant will be enforced by our initializer assuming that the client code satisfies our preconditions.” On the other hand, if this precondition is not satisfied, we aren’t making any promise about what the method will do (and in particular, whether it will enforce the representation invariants).
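A precondition on its own is only documentation. One optional way to catch violations during development, offered here as an assumption rather than as something these notes prescribe, is to check the precondition with an assert statement. A minimal sketch:

```python
from datetime import date


class Tweet:
    userid: str
    created_at: date
    content: str
    likes: int

    def __init__(self, who: str, when: date, what: str) -> None:
        """Initialize a new Tweet.

        Precondition: len(what) <= 280.
        """
        # Assumption for this sketch: fail loudly when the precondition
        # is violated instead of silently creating an invalid Tweet.
        assert len(what) <= 280, 'Precondition violated: len(what) > 280'
        self.userid = who
        self.created_at = when
        self.content = what
        self.likes = 0


message = ''
try:
    Tweet('Rukhsana', date(2017, 9, 16), 'a' * 281)
except AssertionError as error:
    message = str(error)
print(message)  # Precondition violated: len(what) > 280
```

Note that Python skips assert statements when it is run with the -O option, so a check like this is a debugging aid and not a guarantee.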

Another example: non-negativity constraints

Look again at the attributes of Tweet. Another natural representation invariant is that likes must be at least 0; the type annotation likes: int alone does not guarantee this, since it allows negative integers. Do any methods need to change so that we can ensure this is always true? We need to check the initializer and any other method that mutates self.likes.

First, the initializer sets likes to 0, which satisfies this invariant. The method Tweet.like adds to the likes attribute, which would seem safe, but what if the client code passes a negative number?

Again, we are faced with a choice of how to handle this. We could impose a precondition that Tweet.like be called with n >= 0. Or, we could allow negative numbers as input, but set self.likes = 0 if its value would fall below 0. Or, we could refuse to add a negative number and just return (i.e., do nothing) in that case.

All of these options change the method’s contract, either its behaviour or the assumptions it makes about its input, and so whatever we choose, we would need to update the method’s documentation!
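The three options can be sketched side by side. In the sketch below each option is written as a separate method with a hypothetical name so that they can be compared; in practice we would pick one and call it like. The class is pared down to the likes attribute, and the docstrings are illustrative wording only.

```python
class Tweet:
    """A pared-down Tweet that tracks only its likes.

    === Representation Invariants ===
    - self.likes >= 0
    """
    likes: int

    def __init__(self) -> None:
        self.likes = 0

    def like_with_precondition(self, n: int) -> None:
        """Record the fact that this tweet received <n> likes.

        Precondition: n >= 0.
        """
        # Option 1: the code is unchanged; only the docstring is new.
        self.likes += n

    def like_clamped(self, n: int) -> None:
        """Add <n> to this tweet's likes, but never go below 0."""
        # Option 2: accept negative input and clamp the result at 0.
        self.likes = max(0, self.likes + n)

    def like_ignoring_negative(self, n: int) -> None:
        """Record <n> likes. Do nothing if <n> is negative."""
        # Option 3: refuse negative input and return without mutating.
        if n < 0:
            return
        self.likes += n


t = Tweet()
t.like_clamped(-5)
print(t.likes)  # 0
t.like_clamped(3)
t.like_ignoring_negative(-2)
print(t.likes)  # 3
t.like_with_precondition(2)
print(t.likes)  # 5
```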

Client code can also violate representation invariants

We’ve now learned how to write a class that declares and enforces appropriate representation invariants. We guarantee that whenever client code creates new instances of our class, and calls methods on them (obeying any preconditions we specify), our representation invariants will always be satisfied.

Sadly, even being vigilant in implementing our methods doesn’t fully prevent client code from violating representation invariants—we’ll see why in the next section.