n = []
print(all(n))
This returns True. 0, empty, False, None are False aren't they?
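For context, all returns True on an empty iterable because no element fails the test (vacuous truth). The empty list is itself falsy, but all never looks at the truthiness of the container, only at its elements. A rough pure-Python equivalent, in the spirit of the one shown in the Python docs (the name my_all is made up for illustration):

```python
def my_all(iterable):
    """Illustrative stand-in for the built-in all()."""
    for element in iterable:
        if not element:
            return False  # the first falsy element decides the result
    return True  # nothing was falsy, which includes the empty case


n = []
print(my_all(n))  # True: the loop body never runs
print(bool(n))    # False: the empty list itself is falsy
```

If an empty list should count as a failure, one option is to test both, for example bool(n) and all(n).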
If I have a list, which contains boolean values like:
list = [True, False, True, True]
I need to check that it must contain only one 'True', and it can't contain all 'False'. How would I do that?
True is equal to 1.
>>> sum([True, True, False, False, False, True])
3
list has a count method:
>>> [True,True,False].count(True)
2
This is actually more efficient than sum, as well as being more explicit about the intent, so there's no reason to use sum:
In [1]: import random
In [2]: x = [random.choice([True, False]) for i in range(100)]
In [3]: %timeit x.count(True)
970 ns ± 41.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [4]: %timeit sum(x)
1.72 µs ± 161 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
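Applied to the original question (exactly one True, which also rules out an all-False list), count gives a one-line check. This is a minimal sketch that assumes the list holds only real bool values; the function name exactly_one_true is hypothetical:

```python
def exactly_one_true(flags):
    """Return True when the list contains exactly one True."""
    return flags.count(True) == 1


print(exactly_one_true([True, False, False]))   # True
print(exactly_one_true([True, False, True]))    # False
print(exactly_one_true([False, False, False]))  # False
```

Note that count compares with ==, so an element such as 1 would also be counted as True.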
- You can just use x[n_trues:] rather than x[n_trues:len(x)].
- Your comments don't really say more than the code, so I'd recommend removing them.
- If you want to keep your code documented, use docstrings, which can be exported to your documentation via tools like Sphinx.
- As commented by Konrad Rudolph, you can remove the "and not any(should_be_false)" clause, as this will always fail if the all fails.
def check_true_then_false(x):
    """Check first n values are True and the rest are False."""
    return all(x[:sum(x)])
If you want your code to work with iterators, not just sequences then you can instead use:
def check_true_then_false(it):
    """Check first n values are True and the rest are False."""
    it = iter(it)
    # Takes advantage of the iterating side effect, where it consumes the iterator.
    # This allows `all` to simultaneously check that `it` starts with trues and advance `it`.
    return all(it) or not any(it)
For the following two inputs all will result in:
>>> all([True] * n)
True
>>> all([True] * n + [False, ...])
False
However, the iterator it will still hold the remaining [...] elements, because all and any are lazy: all stops consuming at the first false value. In other words, all slices the iterator for you, and we only need to check that the rest are false. This leaves any with:
>>> any([False] * n)
False
>>> any([False] * n + [True, ...])
True
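The consuming behaviour described above can be observed directly. This is only a sketch; the helper name split_with_all is made up for illustration:

```python
def split_with_all(values):
    """Run all() on an iterator and report what it left unconsumed."""
    it = iter(values)
    result = all(it)          # stops right after the first falsy element
    return result, list(it)   # whatever all() did not consume


print(split_with_all([True, True, False, False, True]))
# (False, [False, True])
print(split_with_all([True, True]))
# (True, [])
```

In the first call, all consumes the two leading True values and the first False, so any only ever sees the tail.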
Basically, you want your list of booleans to be sorted.
Specifically, since True > False, you want your list to be sorted in decreasing order:
def check_true_then_false(booleans):
    return booleans == sorted(booleans, reverse=True)
Done!
>>> test_cases = [[True],
...               [False],
...               [True, False],
...               [True, False, False],
...               [True, True, True, False],
...               [False, True],
...               [True, False, True]]
>>>
>>> print([check_true_then_false(test_case) for test_case in test_cases])
[True, True, True, True, True, False, False]
Using any:
>>> data = [False, False, False]
>>> not any(data)
True
any will return True if there's any truth value in the iterable.
Basically there are two functions that deal with an iterable and return True or False depending on which boolean values elements of the sequence evaluate to.
- all(iterable) returns True if all elements of the iterable are considered true values (like reduce(operator.and_, iterable)).
- any(iterable) returns True if at least one element of the iterable is a true value (again, using functional stuff, reduce(operator.or_, iterable)).
Using the all function, you can map operator.not_ over your list or just build a new sequence with negated values and check that all the elements of the new sequence are true:
>>> all(not element for element in data)
With the any function, you can check that at least one element is true and then negate the result since you need to return False if there's a true element:
>>> not any(data)
According to De Morgan's law, these two variants will return the same result, but I would prefer the last one (which uses any) because it is shorter, more readable (and can be intuitively understood as "there isn't a true value in data") and more efficient (since you don't build any extra sequences).
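As a quick sanity check of the De Morgan equivalence, the two variants can be compared exhaustively on short inputs. The function names none_true_all and none_true_any are hypothetical, chosen only for this sketch:

```python
from itertools import product


def none_true_all(data):
    """Variant built on all(): every element negates to True."""
    return all(not element for element in data)


def none_true_any(data):
    """Variant built on any(): there is no true element."""
    return not any(data)


# Compare both variants on every boolean tuple of length 0 to 4.
for length in range(5):
    for combo in product([False, True], repeat=length):
        assert none_true_all(combo) == none_true_any(combo)

print(none_true_any([False, False, False]))  # True
```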
Because:
>>> True == 1
True
>>> False == 0
True
Boolean is a subclass of int. It is safe* to say that True == 1 and False == 0. Thus, your code is identical to:
>>> [1, 2][1]
2
>>> [1, 2][0]
1
>>> [1, 2, 3][1]
2
That's why when you add more elements, the output will remain the same. It has nothing to do with the length of the list, because it is just basic indexing affecting only the first two values.
*: NB: True and False can actually be overwritten in Python <=2.7. Observe:
>>> True = 4
>>> False = 5
>>> print True
4
>>> print False
5
*: However, since Python 3, True and False are now keywords. Trying to reproduce the code above will return:
>>> True = 4
File "<stdin>", line 1
SyntaxError: assignment to keyword
What's happening here is a little confusing, since [1,2,3][True] has two sets of []s that are being interpreted in different ways.
What's going on is a little more clear if we split the code over a few lines.
The first set of []s construct a list object. Let's assign that object the name a:
>>> [1,2,3]
[1, 2, 3]
>>> a = [1,2,3]
>>>
The second set of [] specify an index inside that list. You'd usually see code like this:
>>> a[0]
1
>>> a[1]
2
>>>
But it's just as valid to use the list object directly, without ever giving it a name:
>>> [1,2,3][0]
1
>>> [1,2,3][1]
2
Lastly, the fact that True and False are useable as indexes is because they're treated as integers. From the data model docs:
There are three types of integers:
Plain integers....
Long integers.....
Booleans
These represent the truth values False and True. The two objects representing the values False and True are the only Boolean objects. The Boolean type is a subtype of plain integers, and Boolean values behave like the values 0 and 1, respectively, in almost all contexts, the exception being that when converted to a string, the strings "False" or "True" are returned, respectively.
Thus, [1,2,3][True] is equivalent to [1,2,3][1]
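The relationship can be checked in a few lines. This sketch runs on Python 3, where the same subclass relationship still holds; the helper name pick is made up for illustration:

```python
def pick(seq, flag):
    """Index a sequence with a bool: False selects item 0, True selects item 1."""
    return seq[flag]


print(issubclass(bool, int))    # True
print(isinstance(True, int))    # True
print(True + True)              # 2
print(pick([1, 2, 3], True))    # 2
print(pick([1, 2, 3], False))   # 1
print(str(True))                # True (the string form is the exception)
```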
I have a list that has floats, ints or boolean values. Example: [1.0, 27.8, 0, 23, 0.0, False, True]
How can I remove the boolean and leave the other numbers in the list? Where am I going wrong and what can I learn from this?
>>> ll = [1.0, 27.8, 0, 23, 0.0, False, True]
>>> ll.remove(False)
>>> ll
[1.0, 27.8, 23, 0.0, False, True]  # removes the first 0 it finds
This also didn't work:
>>> ll_without_booleans = [x for x in ll if isinstance(x, (float, int))]
>>> ll_without_booleans
[1.0, 27.8, 0, 23, 0.0, False, True]
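Both attempts fail for the same reason: bool is a subclass of int, so False == 0 (which is why remove takes out the first 0) and isinstance(False, int) is True (which is why the filter keeps the booleans). One possible fix, shown here as a sketch, is to exclude bool explicitly:

```python
ll = [1.0, 27.8, 0, 23, 0.0, False, True]

# Rule out bool explicitly, because every bool is also an int.
ll_without_booleans = [x for x in ll if not isinstance(x, bool)]

print(ll_without_booleans)  # [1.0, 27.8, 0, 23, 0.0]
```

The numeric zeros 0 and 0.0 survive because the test looks at the type, not at equality with False.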
You're looking for itertools.compress:
>>> from itertools import compress
>>> list_a = [1, 2, 4, 6]
>>> fil = [True, False, True, False]
>>> list(compress(list_a, fil))
[1, 4]
Timing comparisons (py3.x):
>>> list_a = [1, 2, 4, 6]
>>> fil = [True, False, True, False]
>>> %timeit list(compress(list_a, fil))
100000 loops, best of 3: 2.58 us per loop
>>> %timeit [i for (i, v) in zip(list_a, fil) if v] #winner
100000 loops, best of 3: 1.98 us per loop
>>> list_a = [1, 2, 4, 6]*100
>>> fil = [True, False, True, False]*100
>>> %timeit list(compress(list_a, fil)) #winner
10000 loops, best of 3: 24.3 us per loop
>>> %timeit [i for (i, v) in zip(list_a, fil) if v]
10000 loops, best of 3: 82 us per loop
>>> list_a = [1, 2, 4, 6]*10000
>>> fil = [True, False, True, False]*10000
>>> %timeit list(compress(list_a, fil)) #winner
1000 loops, best of 3: 1.66 ms per loop
>>> %timeit [i for (i, v) in zip(list_a, fil) if v]
100 loops, best of 3: 7.65 ms per loop
Don't use filter as a variable name, it is a built-in function.
Like so:
filtered_list = [i for (i, v) in zip(list_a, filter) if v]
Using zip is the pythonic way to iterate over multiple sequences in parallel, without needing any indexing. This assumes both sequences have the same length (zip stops after the shortest runs out). Using itertools for such a simple case is a bit overkill ...
One thing you do in your example you should really stop doing is comparing things to True, this is usually not necessary. Instead of if filter[idx]==True: ..., you can simply write if filter[idx]: ....
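To illustrate the point about zip stopping at the shortest input, here is a small sketch with a mask that is deliberately one element short. The variable name mask is an assumption, used instead of filter to avoid shadowing the built-in:

```python
from itertools import compress

list_a = [1, 2, 4, 6]
mask = [True, False, True]  # deliberately one element shorter than list_a

via_zip = [item for item, keep in zip(list_a, mask) if keep]
via_compress = list(compress(list_a, mask))

print(via_zip)       # [1, 4]
print(via_compress)  # [1, 4]
```

Both approaches silently drop the trailing 6, so if a length mismatch would be a bug in your data, it is worth checking len(list_a) == len(mask) first.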
values = list(map(compare, new_subjects.values()))  # list() so that len() also works on Python 3
len([x for x in values if x]) == len(values) - 1
Basically, you filter the list for true values and compare the length of that list to the original to see if it's one less.
If you mean values that are actually True, rather than values that merely evaluate to True, you can just count them:
>>> L1 = [True]*5
>>> L1
[True, True, True, True, True]
>>> L2 = [True]*5 + [False]*2
>>> L2
[True, True, True, True, True, False, False]
>>> L1.count(False)
0
>>> L2.count(False)
2
>>>
checking for only a single False:
>>> def there_can_be_only_one(L):
...     return L.count(False) == 1
...
>>> there_can_be_only_one(L1)
False
>>> there_can_be_only_one(L2)
False
>>> L3 = [ True, True, False ]
>>> there_can_be_only_one(L3)
True
>>>
edit: This actually answers your question better:
>>> def there_must_be_only_one(L):
...     return L.count(True) == len(L)-1
...
>>> there_must_be_only_one(L3)
True
>>> there_must_be_only_one(L2)
False
>>> there_must_be_only_one(L1)
False
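One caveat on the distinction between "is True" and "evaluates to True": list.count compares with ==, so 1 and 1.0 are counted as True even though other truthy values such as non-empty strings are not. If only the True singleton should count, an identity test is one option. The helper name count_exact_true is hypothetical:

```python
def count_exact_true(values):
    """Count elements that are the True singleton, ignoring 1, 1.0, etc."""
    return sum(1 for value in values if value is True)


mixed = [True, 1, 1.0, "yes", False]
print(mixed.count(True))        # 3: True, 1 and 1.0 all compare equal to True
print(count_exact_true(mixed))  # 1
```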
Just use:
bool(my_list)
Which evaluates it as Python "truthiness" and returns a real Boolean.
99.9% of the time, performance doesn't matter, so just use bool(my_list) as Keith suggests.
In the cases where performance does matter though, the nature of bool means it's actually quite slow, at least on the CPython reference interpreter. It has to go through generalized function call paths, to generalized constructor paths, to generalized argument parsing for 0-1 arguments (and in all but the most recent versions of Python, checking for keyword arguments), all to eventually just increment a reference count on a singleton and return it.
You can see how much this costs with ipython microbenchmarks (on my Windows x64 3.6.3 build):
In [1]: %%timeit -r5 l = []
...: bool(l)
...:
118 ns ± 0.808 ns per loop (mean ± std. dev. of 5 runs, 10000000 loops each)
In [11]: %%timeit -r5 l = [1]
...: bool(l)
...:
117 ns ± 0.306 ns per loop (mean ± std. dev. of 5 runs, 10000000 loops each)
It may not be obvious, but even on my relatively weak laptop, 117-118 nanoseconds just to determine truthiness is a bit much. Luckily, there are a couple other options. One is to abuse syntax to go through a dedicated path for truthiness evaluation (from here on out, I'll just test the empty list, the timings are basically identical either way):
In [3]: %%timeit -r5 l = []
...: not not l
...:
25 ns ± 0.289 ns per loop (mean ± std. dev. of 5 runs, 10000000 loops each)
That's a vast improvement; it takes roughly one fifth the time. On Python 3, using True if l else False also works with equal speed, but it's much slower than not not on Python 2, where True and False aren't protected literals, just built-in names that must be loaded dynamically each time.
Still, it's not perfect; sometimes you need a callable, e.g. to convert a lot of values to bool via a callback function (e.g. with map). Luckily, the operator module has you covered with operator.truth; while it's still a callable with all the overhead that entails, it's not a constructor, it takes exactly one argument (not 0-1), and it doesn't allow keyword arguments, all of which cost a surprising amount on the CPython reference interpreter. So when you can't use implicit truthiness testing or syntax based conversion with not not, and you still need the speed, operator.truth has you covered:
In [4]: from operator import truth
In [5]: %%timeit -r5 l = []
...: truth(l)
...:
52.1 ns ± 1.1 ns per loop (mean ± std. dev. of 5 runs, 10000000 loops each)
Twice as long as not not, but if you're using it with built-ins that call it repeatedly (like map) being able to push all the work to the C layer, avoiding byte code execution entirely, can still make it a win, and it's still well under half as costly as bool() itself.
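As a sketch of the callback use case mentioned above, operator.truth can be passed straight to map. The sample values are made up for illustration, and no timing claim is attached to this snippet:

```python
from operator import truth

values = [[], [1], "", "text", 0, 3.5, None]

# map calls truth on each element; list() collects the resulting bools.
flags = list(map(truth, values))

print(flags)  # [False, True, False, True, False, True, False]
```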
Reiterating my earlier point though: 99.9% of the time, performance doesn't matter, so just use bool(my_list) as Keith suggests. I only mention this because I once had a scenario where that boolean conversion really was the hottest point in my code (verified through profiling), and using implicit truthiness testing (not even converting, just returning the list with the caller doing if myfunc():) shaved 30% off the runtime, and returning not not of the list still got nearly a 20% savings.