Python's Counter-intuitive, Non-commutative Ternary Logic
Boolean logic is one of the foundational abstractions in computer science, from electrical circuits to programming languages. In Boolean logic, all variables take one of two values: 1/0, high/low, or TRUE/FALSE. However, many practical situations require the inclusion of a third value to indicate a variable is unknown, missing, or both false and true at the same time: usually something like None, NULL, NA, or Unknown.
This leads to a complication: for two-valued logic there’s a single agreed-upon way of defining how variables can be combined using the basic AND, OR, and NOT operations: Boolean logic. On the other hand, there are many possible — and at least two frequently used — three-valued formal logics. But in practice, programming languages often fall short of strictly implementing a consistent formal logic.
As it turns out, Python is no exception: boolean operations involving None have some counter-intuitive properties.
You might expect that operations on None would simply always return None, in the same way that NA propagates through numerical operations in R. But nope:
>>> not None
True
Although this does have the felicitous consequence that the tautology x or (not x) is still true in the case of None, unlike in SQL:
>>> None or (not None)
True
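For comparison, the SQL side of that claim can be checked from Python with the standard-library sqlite3 module. This is only a minimal sketch: it assumes SQLite's NULL handling is representative of SQL's three-valued logic, and other database engines may differ in the details.

```python
import sqlite3

# In-memory database; no tables are needed to evaluate a constant expression.
conn = sqlite3.connect(":memory:")

# In SQL, NOT NULL is NULL and NULL OR NULL is NULL,
# so x OR (NOT x) does not come out true when x is NULL.
sql_result = conn.execute("SELECT NULL OR (NOT NULL)").fetchone()[0]
conn.close()

# The same expression in Python, with None in place of NULL.
python_result = None or (not None)

# sqlite3 maps SQL NULL to Python's None.
print(sql_result)     # None
print(python_result)  # True
```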
But even weirder than this is that operations involving None are non-commutative: x and y is not the same as y and x.
>>> print(False and None)
False
>>> print(None and False)
None
I’ve searched for formal logical systems which don’t have commutativity for these operations, and I have yet to find this as an intentional choice anywhere else! It seems like it might be possible to define a substructural logic which did this, but I’m not sure why you would.
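The non-commutativity can be checked exhaustively over the three values. For comparison, the sketch below also includes a hand-written version of the conjunction from Kleene's strong three-valued logic, one of the commonly used formal systems, with None standing in for "unknown". The names python_and, kleene_and and is_commutative are made up for this illustration and are not part of any library.

```python
from itertools import product

VALUES = [True, None, False]

def python_and(x, y):
    # Python's built-in operator: returns one of its operands unchanged.
    return x and y

def kleene_and(x, y):
    # Kleene's strong conjunction, reading None as "unknown":
    # False if either side is False, otherwise unknown if either side is unknown.
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True

def is_commutative(op):
    # Compare with "is" so that False and None are never conflated.
    return all(op(x, y) is op(y, x) for x, y in product(VALUES, repeat=2))

print(is_commutative(python_and))  # False
print(is_commutative(kleene_and))  # True
```

The formal version gets its symmetry by inspecting both operands before deciding, which is exactly what Python's operator does not do.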
Why is None weird?
There are two design decisions which explain this strange behaviour: falsiness and short-circuit evaluation.
In Python, every object is either “truthy” or “falsy”: it can be coerced to the boolean value True or False with the bool() function:
>>> bool(True)
True
>>> bool(0)
False
>>> bool("a")
True
>>> bool("")
False
>>> bool(None)
False
From the Python docs:
the following values are interpreted as false: False, None, numeric zero of all types, and empty strings and containers (including strings, tuples, lists, dictionaries, sets and frozensets). All other values are interpreted as true.
This property is useful, for instance, when using an if-condition to check the result of an operation which might return an empty string or list. But it has the consequence that, because bool(None) evaluates to False, not None evaluates to True.
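One practical consequence is that a plain truthiness check cannot tell None apart from other falsy values such as an empty list. A minimal sketch, using a hypothetical find_matches function invented purely for this example:

```python
def find_matches(query, data):
    # Hypothetical helper: returns None when there is no data at all,
    # and a (possibly empty) list of matches otherwise.
    if data is None:
        return None
    return [item for item in data if query in item]

no_data = find_matches("a", None)        # None
no_hits = find_matches("z", ["a", "b"])  # []

# Both results are falsy, so "not" treats them identically...
print(not no_data, not no_hits)          # True True

# ...and an explicit identity check is needed to separate them.
print(no_data is None, no_hits is None)  # True False
```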
This choice in the docs is a bit puzzling to me:
neither and nor or restrict the value and type they return to False and True, but rather return the last evaluated argument
In practice, this means that:
>>> "a" or True
'a'
>>> "" or False
False
I think the reason for this is to allow short-circuit evaluation, which means that you could write this:
a = long_running_function_which_might_fail() or other_expensive_operation()
These functions will be evaluated sequentially from left to right, so that if the first one succeeds (that is, returns a truthy value) the second doesn’t need to be called at all, and the variable will take the return value of the first function. So what’s happening in both cases (False and None, and None and False) is that the first operand is evaluated, found to be falsy, and returned without the second operand ever being evaluated.
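To make the short-circuiting visible, here is a small sketch in which each function records that it was called. The function names are placeholders invented for the example.

```python
calls = []

def falsy_result():
    calls.append("falsy_result")
    return None

def fallback():
    calls.append("fallback")
    return "fallback value"

def never_needed():
    calls.append("never_needed")
    return "unused"

# "or" keeps going until it finds a truthy operand, so both functions run.
a = falsy_result() or fallback()

# "and" stops at the first falsy operand: never_needed() is not called,
# and the falsy value itself (None) is what comes back.
b = falsy_result() and never_needed()

print(a)      # fallback value
print(b)      # None
print(calls)  # ['falsy_result', 'fallback', 'falsy_result']
```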
The Truth Tables for Python’s Ternary Logic
If we treat Python’s implementation of these boolean operations with True, False and None as a ternary logic, the resulting truth tables look like this:
x | not x |
---|---|
True | False |
None | True |
False | True |

x and y | y = True | y = None | y = False |
---|---|---|---|
x = True | True | None | False |
x = None | None | None | None |
x = False | False | False | False |

x or y | y = True | y = None | y = False |
---|---|---|---|
x = True | True | True | True |
x = None | True | None | False |
x = False | True | None | False |
Again, those matrices are not symmetrical because the operations are non-commutative.
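The tables can be regenerated straight from the interpreter, which also gives a mechanical check of the asymmetry. In this sketch the helper names truth_table and is_symmetric are my own, and rows are taken to be the left operand and columns the right operand.

```python
VALUES = [True, None, False]

def truth_table(op):
    # Rows are the left operand, columns the right operand.
    return [[op(x, y) for y in VALUES] for x in VALUES]

def is_symmetric(table):
    # A commutative operation gives a table equal to its own transpose.
    # "is" is used so that False and None are never treated as equal.
    n = len(table)
    return all(table[i][j] is table[j][i] for i in range(n) for j in range(n))

not_table = [not x for x in VALUES]
and_table = truth_table(lambda x, y: x and y)
or_table = truth_table(lambda x, y: x or y)

print(not_table)  # [False, True, True]
for row in and_table:
    print(row)
for row in or_table:
    print(row)

print(is_symmetric(and_table))  # False
print(is_symmetric(or_table))   # False
```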
I should say that in practice the ergonomic benefits to programmers probably outweigh the costs of the behaviour being strange in a formal sense, and I don’t think I’ve written any bugs as a result (but then again, do you ever know that you haven’t written a bug?). Still, it’s a potential pitfall. And that’s without even considering float("nan"), numpy’s nan or pandas’ NaT. Hey, at least it’s not as much of a mess as JavaScript.
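As a brief taste of the float("nan") case, here is a sketch using only the standard library. The numpy and pandas values are left out because their behaviour depends on those libraries.

```python
import math

nan = float("nan")

# NaN is a non-zero float, so unlike None it is truthy.
nan_is_truthy = bool(nan)

# "or" therefore returns the NaN itself rather than falling through to 0.
nan_or_zero = nan or 0

print(nan_is_truthy)            # True
print(not nan)                  # False
print(math.isnan(nan_or_zero))  # True
print(nan == nan)               # False
```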