Use a list comprehension:
[y for x,y in A if x>2]
Demo:
>>> A=[(1,'A'),(2,'H'),(3,'K'),(4,'J')]
>>> [y for x,y in A if x>2]
['K', 'J']
Answer from U13-Forward on Stack Overflow
Filtering list of objects on multiple conditions
Hi all, interested in your approaches and suggestions to this problem.
Let's say I have created a custom Player class that contains many attributes relating to a particular sportsperson's characteristics:
from dataclasses import dataclass

@dataclass
class Player:
    id: int
    first_name: str
    last_name: str
    age: int
    height: int
    weight: int
    team: str  # compared against "Rovers" in the filter further down
    games_played: int
    ...

If I create a small list (about 500 items) of these objects and then want to filter on multiple conditions, what's the best way for me to do that?
Of course, in a one off situation I could write something like:
filtered = [player for player in player_list if 20 <= player.age < 25 and player.height > 180 and player.team == "Rovers"]
This gives me the flexibility to filter exactly how I'd like but it's not the most glamorous solution in my opinion, especially if I want to try out lots of different filters.
I had a thought that this might be easier with something like a PlayerFilter class, which can be instantiated and have filters added to a dict attribute or similar, and can then be applied to a list of players to return the filtered list. This seems more Pythonic to me but also slightly overkill, and I'm looking for the most straightforward and intuitive way to filter these values. If a random Python user were to use my package with this filter implementation, I'm not sure it would make sense to them.
Is the complexity a side effect of using a list of objects to store my data as opposed to something like a dataframe? I've considered this too but I want to maintain the ability to call the methods that I've created for my custom player class on any filtered output.
Let me know what you think!
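For illustration, the PlayerFilter idea described in the post could be sketched roughly as follows. This is only one possible shape, not the poster's implementation: the add and apply method names, the trimmed-down Player fields, and the sample players are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Player:
    # Trimmed-down stand-in for the Player class in the post.
    first_name: str
    age: int
    height: int
    team: str


@dataclass
class PlayerFilter:
    # Hypothetical helper: maps a label to a predicate on a Player.
    filters: Dict[str, Callable[[Player], bool]] = field(default_factory=dict)

    def add(self, name: str, predicate: Callable[[Player], bool]) -> "PlayerFilter":
        self.filters[name] = predicate
        return self  # returning self allows chained add() calls

    def apply(self, players: List[Player]) -> List[Player]:
        # Keep a player only if every registered predicate accepts it.
        return [p for p in players if all(f(p) for f in self.filters.values())]


# Made-up sample data.
player_list = [
    Player("Ann", 22, 185, "Rovers"),
    Player("Bob", 27, 190, "Rovers"),
    Player("Cat", 23, 175, "Rovers"),
    Player("Dee", 24, 182, "United"),
]

player_filter = (
    PlayerFilter()
    .add("age", lambda p: 20 <= p.age < 25)
    .add("height", lambda p: p.height > 180)
    .add("team", lambda p: p.team == "Rovers")
)
filtered = player_filter.apply(player_list)
print([p.first_name for p in filtered])  # ['Ann']
```

Because apply returns the original Player objects, any methods defined on the class remain callable on the filtered output.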
I would simply make a function that returns the conditional:
def makeConditions(**p):
    fieldname = {"ref": 0, "type": 1, "date": 2}
    def filterfunc(elt):
        for k, v in p.items():
            if elt[fieldname[k]] != v:  # if one condition is not met: false
                return False
        return True
    return filterfunc
Then you can use it that way:
>>> list(filter(makeConditions(ref=1), ex))
[[1, 'CB', '2017-12-11'], [1, 'CB', '2017-11-08']]
>>> list(filter(makeConditions(type='CB'), ex))
[[1, 'CB', '2017-12-11'], [2, 'CB', '2017-12-01'], [1, 'CB', '2017-11-08']]
>>> list(filter(makeConditions(type='CB', ref=2), ex))
[[2, 'CB', '2017-12-01']]
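The same closure idea should carry over from lists to objects by swapping the index lookup for getattr. A minimal sketch, assuming records with ref, type and date attributes; make_attr_conditions and the SimpleNamespace sample records are hypothetical names used only for this example:

```python
from types import SimpleNamespace


def make_attr_conditions(**expected):
    # Hypothetical variant of makeConditions for objects instead of lists.
    def filterfunc(obj):
        # True only if every named attribute equals the expected value.
        return all(getattr(obj, name) == value for name, value in expected.items())
    return filterfunc


# Made-up sample records mirroring the ref/type/date layout.
records = [
    SimpleNamespace(ref=1, type="CB", date="2017-12-11"),
    SimpleNamespace(ref=2, type="CB", date="2017-12-01"),
    SimpleNamespace(ref=3, type="RET", date="2017-11-08"),
]

matches = list(filter(make_attr_conditions(type="CB", ref=2), records))
print([r.date for r in matches])  # ['2017-12-01']
```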
You were almost there. The idea is to build a list of functions, one per condition you need to check. Once you have them, call each function on the sublist being tested and use all() to check that every one of them evaluates to True. Note the use of partial, so that the function passed to filter takes only the data list. Check this:
from functools import partial
ex = [
    # ["ref", "type", "date"]
    [1, 'CB', '2017-12-11'],
    [2, 'CB', '2017-12-01'],
    [3, 'RET', '2017-11-08'],
    [1, 'CB', '2017-11-08'],
    [5, 'RET', '2017-10-10'],
]
conditions = {"ref": 3}
conditions2 = {"ref": 1, "type": "CB"}
def apply(data, *args):
    """
    Same as map, but takes some data and a variable list of functions instead.
    It evaluates each of those functions over that data.
    """
    return map(lambda f: f(data), args)

def makeConditions(p, myList):
    # Build one check per key:value pair in the dictionary
    def checkvalue(index, val, lst):
        return lst[index] == val
    conds = []
    for key, value in p.items():
        if key == "ref":
            conds.append(partial(checkvalue, 0, value))
        elif key == "type":
            conds.append(partial(checkvalue, 1, value))
        elif key == "date":
            conds.append(partial(checkvalue, 2, value))
    return all(apply(myList, *conds))  # do all the value checks evaluate to True?
#use partial to bind the conditions to the makeConditions function
print(list(filter(partial(makeConditions, conditions), ex)))
#[[3, 'RET', '2017-11-08']]
print(list(filter(partial(makeConditions, conditions2), ex)))
#[[1, 'CB', '2017-12-11'], [1, 'CB', '2017-11-08']]
Do I have to execute the filter() function for each sublist or is it possible to do it for the whole global list?
filter iterates over the whole list, applying the function to each element. If the function evaluates to True, the element remains in the result. So filter works over the whole global list; you do not need to call it once per sublist.
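A small sketch of that behaviour: a single filter call visits every sublist once, and in Python 3 it does so lazily, only when the result is consumed. The is_cb predicate and the seen list are illustrative names introduced only for this example.

```python
ex = [
    [1, 'CB', '2017-12-11'],
    [2, 'CB', '2017-12-01'],
    [3, 'RET', '2017-11-08'],
]

seen = []

def is_cb(row):
    seen.append(row[0])  # record which rows the predicate was called on
    return row[1] == 'CB'

lazy = filter(is_cb, ex)  # returns an iterator; the predicate has not run yet
print(seen)               # []

result = list(lazy)       # consuming the iterator calls is_cb once per sublist
print(seen)               # [1, 2, 3]
print(result)             # [[1, 'CB', '2017-12-11'], [2, 'CB', '2017-12-01']]
```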
You can define a "filter-making function" that preprocesses the target list. The advantages of this are:
- Does minimal work by caching information about target_list in a set: the total time is O(N_target_list) + O(N), since set lookups are O(1) on average.
- Does not use global variables. Easily testable.
- Does not use nested for loops.
def prefixes(target):
    """
    >>> prefixes("FOLD/AAA.RST.TXT")
    ('FOLD', 'AAA', 'RST')
    >>> prefixes("FOLD/AAA.RST.12345.TXT")
    ('FOLD', 'AAA', 'RST')
    """
    x, rest = target.split('/')
    y, z, *_ = rest.split('.')
    return x, y, z

def matcher(target_list):
    # Precompute the prefixes once so each lookup is a set membership test
    targets = set(prefixes(target) for target in target_list)
    def is_target(t):
        return prefixes(t) in targets
    return is_target
Then, you could do:
>>> list(filter(matcher(target_list), mylist))
['FOLD/AAA.RST.12345.TXT', 'FOLD/AAA.RST.87589.TXT']
Define a function to filter values:
target_list = ["FOLD/AAA.RST.TXT"]

def keep(path):
    template = get_template(path)
    return template in target_list

def get_template(path):
    front, numbers, ext = path.rsplit('.', 2)
    template = '.'.join([front, ext])
    return template
This uses str.rsplit, which searches the string from the right and splits it on the given separator ('.' in this case). The parameter 2 means it performs at most two splits. This gives us three parts: the front, the numbers, and the extension:
>>> 'FOLD/AAA.RST.12345.TXT'.rsplit('.', 2)
['FOLD/AAA.RST', '12345', 'TXT']
We assign these to front, numbers and ext.
We then build a string again using str.join
>>> '.'.join(['FOLD/AAA.RST', 'TXT'])
'FOLD/AAA.RST.TXT'
So this is what get_template returns:
>>> get_template('FOLD/AAA.RST.12345.TXT')
'FOLD/AAA.RST.TXT'
We can use it like so:
mylist = [
    "FOLD/AAA.RST.12345.TXT",
    "FOLD/BBB.RST.12345.TXT",
    "RUNS/AAA.FGT.12345.TXT",
    "FOLD/AAA.RST.87589.TXT",
    "RUNS/AAA.RST.11111.TXT"
]
from pprint import pprint
# In Python 3, filter returns an iterator, so wrap it in list() before printing
pprint(list(filter(keep, mylist)))
Output:
['FOLD/AAA.RST.12345.TXT', 'FOLD/AAA.RST.87589.TXT']