Python's Counter-intuitive, Non-commutative Ternary Logic

06 Apr 2020 | Categories: code

Boolean logic is one of the foundational abstractions in computer science, from electrical circuits to programming languages. In Boolean logic, all variables take one of two values: 1/0, high/low, or TRUE/FALSE. However, many practical situations require the inclusion of a third value to indicate a variable is unknown, missing, or both false and true at the same time: usually something like None, NULL, NA, or Unknown.

This leads to a complication: for two-valued logic there’s a single agreed-upon way of defining how variables can be combined using the basic AND, OR, and NOT operations: Boolean logic. On the other hand, there are many possible — and at least two frequently used — three-valued formal logics. But in practice, programming languages often fall short of strictly implementing a consistent formal logic.

As it turns out, Python is no exception: boolean operations involving None have some counter-intuitive properties.

You might expect that operations on None would simply always return None, much as NA propagates through numerical operations in R. But nope:

>>> not None
True

This does have the felicitous consequence that, unlike in SQL (where NULL OR NOT NULL evaluates to NULL), the tautology x or (not x) is still true in the case of None:

>>> None or (not None)
True

Even weirder is that operations involving None are non-commutative: x and y is not always the same as y and x.

>>> print(False and None)
False

>>> print(None and False)
None
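To see how far this goes, here is a minimal sketch that tries every pair of True, None and False and lists the ones that don't commute. The helper name non_commutative_pairs is my own, not anything from the standard library.

```python
from itertools import product

VALUES = (True, None, False)

def non_commutative_pairs(op):
    """Return the (x, y) pairs for which op(x, y) differs from op(y, x)."""
    # Compare with `is` so that None, True and False are told apart exactly.
    return [(x, y) for x, y in product(VALUES, repeat=2)
            if op(x, y) is not op(y, x)]

and_pairs = non_commutative_pairs(lambda x, y: x and y)
or_pairs = non_commutative_pairs(lambda x, y: x or y)

print(and_pairs)  # [(None, False), (False, None)]
print(or_pairs)   # [(None, False), (False, None)]
```

So within these three values, the only troublemakers are the pairs that mix None with False, and both and and or are affected.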

I’ve searched for formal logical systems which don’t have commutativity for these operations, and I have yet to find this as an intentional choice anywhere else! It seems like it might be possible to define a substructural logic which did this, but I’m not sure why you would.
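For contrast, here is a rough sketch of a conventional commutative three-valued logic: Kleene's strong logic, which is the one SQL's AND, OR and NOT follow. I'm using None to stand for "unknown", and the names k_not, k_and and k_or are made up for this illustration. The functions assume their arguments are exactly True, False or None.

```python
def k_not(x):
    """Kleene NOT: unknown (None) stays unknown."""
    return None if x is None else not x

def k_and(x, y):
    """Kleene AND: False dominates, then unknown, otherwise True."""
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True

def k_or(x, y):
    """Kleene OR: True dominates, then unknown, otherwise False."""
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False

print(k_and(False, None), k_and(None, False))  # False False
print(k_or(None, k_not(None)))                 # None
```

Here the order of the operands makes no difference, but the price is that x or (not x) is no longer always true.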

Why is None weird?

There are two design decisions which explain this strange behaviour: falsiness and short-circuit evaluation.

In Python, every object is either “truthy” or “falsy”: it can be coerced to the boolean value True or False with the bool() function:

>>> bool(True)
True
>>> bool(0)
False
>>> bool("a")
True
>>> bool("")
False
>>> bool(None)
False

From the Python docs:

the following values are interpreted as false: False, None, numeric zero of all types, and empty strings and containers (including strings, tuples, lists, dictionaries, sets and frozensets). All other values are interpreted as true.

This property is useful, for instance, when using an if-condition to check the result of an operation which might return an empty string or list. But because bool(None) evaluates to False, it also has the consequence that not None evaluates to True.
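Because None and empty values are all falsy, a plain truthiness check can't tell "missing" apart from "present but empty". A small sketch of how you might separate the two, where describe is a hypothetical helper and the labels are arbitrary:

```python
def describe(result):
    """Classify a result, separating 'missing' from 'present but empty'."""
    # Identity check first: only None itself counts as missing.
    if result is None:
        return "missing"
    # Any other falsy value ("" or an empty container) counts as empty here.
    if not result:
        return "empty"
    return "non-empty"

print(describe(None))  # missing
print(describe([]))    # empty
print(describe(""))    # empty
print(describe([0]))   # non-empty
```

The explicit is None test has to come first, since not result would otherwise swallow the None case too.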

The second choice, as described in the docs, is a bit puzzling to me:

neither and nor or restrict the value and type they return to False and True, but rather return the last evaluated argument

In practice, this means that:

>>> "a" or True
'a'
>>> "" or False
False

I think the reason for this is to make short-circuit evaluation more useful: because the operand itself is returned rather than a bare True or False, you could write this:

a = long_running_function_which_might_fail() or other_expensive_operation()

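The operands are evaluated from left to right, so if the first function returns a truthy value the second doesn’t need to be called at all, and the variable takes the return value of the first function. (Note that “fail” here has to mean returning a falsy value such as None: an exception raised by the first function is not caught by or.) So what’s happening with both “False and None” and “None and False” is that the first operand is evaluated, found to be falsy, and returned without the second operand ever being looked at.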
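A small sketch makes the evaluation order visible. The functions first, failing and fallback are stand-ins I've invented; each one records its name in a list when it is called.

```python
calls = []

def first():
    calls.append("first")
    return "result from first"

def failing():
    calls.append("failing")
    return None  # "fails" by returning a falsy value, not by raising

def fallback():
    calls.append("fallback")
    return "result from fallback"

# The left operand is truthy, so fallback() is never called.
a = first() or fallback()
print(a, calls)  # result from first ['first']

calls.clear()

# The left operand is falsy, so the right operand is evaluated and returned.
b = failing() or fallback()
print(b, calls)  # result from fallback ['failing', 'fallback']
```

One thing to watch with this idiom: a legitimate but falsy result from the first function, such as 0 or an empty string, also triggers the fallback.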

The Truth Tables for Python’s Ternary Logic

If we treat Python’s implementation of these boolean operations with True, False and None as a ternary logic, the resulting truth tables look like this (in the and and or tables, each row is the left operand and each column is the right operand):

not
True	False
None	True
False	True

and	True	None	False
True	True	None	False
None	None	None	None
False	False	False	False

or	True	None	False
True	True	True	True
None	True	None	False
False	True	None	False

Again, those matrices are not symmetrical because the operations are non-commutative.
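If you'd rather not take my word for the tables, they can be regenerated directly from the interpreter. This is a minimal sketch; truth_table and format_table are names I've made up, and the output is tab-separated with left operands as rows and right operands as columns.

```python
from itertools import product

VALUES = (True, None, False)

def truth_table(op):
    """Map each (left, right) pair to op(left, right)."""
    return {(x, y): op(x, y) for x, y in product(VALUES, repeat=2)}

def format_table(name, table):
    """Render a table with left operands as rows and right operands as columns."""
    lines = ["\t".join([name] + [str(v) for v in VALUES])]
    for x in VALUES:
        row = [str(x)] + [str(table[(x, y)]) for y in VALUES]
        lines.append("\t".join(row))
    return "\n".join(lines)

and_table = truth_table(lambda x, y: x and y)
or_table = truth_table(lambda x, y: x or y)

print(format_table("and", and_table))
print()
print(format_table("or", or_table))
```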

I should say that in practice the ergonomic benefits to programmers probably outweigh the costs of the behaviour being strange in a formal sense, and I don’t think I’ve written any bugs as a result (but then again, do you ever know that you haven’t written a bug?). Still, it’s a potential pitfall. And that’s without even considering float("nan"), numpy’s nan or pandas’ NaT. Hey, at least it’s not as much of a mess as JavaScript.