Answer from Cristian Lupascu on Stack Overflow:
You are missing an else before 'O'. This works:
y = lambda symbol: 'X' if symbol == True else 'O' if symbol == False else ' '
However, I think you should stick to Adam Smith's approach. I find that easier to read.
You can use an anonymous dict inside your anonymous function, using the default value of dict.get as your final "else":
y = lambda sym: {True: 'X', False: 'O'}.get(sym, ' ')
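The dict lookup behaves the same way as the chained conditional; note the mapping should pair True with 'X' and False with 'O' to match the first answer (sketch, same assumed inputs):

```python
y = lambda sym: {True: 'X', False: 'O'}.get(sym, ' ')

# Any key not in the dict (e.g. None) falls through to the default ' '.
print(y(True), y(False), repr(y(None)))
```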
The syntax you're looking for:
lambda x: True if x % 2 == 0 else False
But you can't use statements such as raise in a lambda (in Python 3, print is a function, so it technically works, though it's rarely a good idea).
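Since the conditional already evaluates to a boolean, the expression can be simplified; a minimal sketch:

```python
# Equivalent to: lambda x: True if x % 2 == 0 else False
is_even = lambda x: x % 2 == 0

print(is_even(4), is_even(7))
```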
why don't you just define a function?
def f(x):
    if x == 2:
        print(x)
    else:
        raise ValueError
there really is no justification to use lambda in this case.
Nest if .. elses:
lambda x: x*10 if x<2 else (x**2 if x<4 else x+10)
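For example (a sketch; the parentheses make the nesting explicit):

```python
f = lambda x: x * 10 if x < 2 else (x ** 2 if x < 4 else x + 10)

print([f(x) for x in [1, 2, 3, 4, 5]])  # [10, 4, 9, 14, 15]
```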
I do not recommend the use of apply here: it should be avoided if there are better alternatives.
For example, if you are performing the following operation on a Series:
if cond1:
    exp1
elif cond2:
    exp2
else:
    exp3
This is usually a good use case for np.where or np.select.
numpy.where
The if else chain above can be written using
np.where(cond1, exp1, np.where(cond2, exp2, ...))
np.where allows nesting. With one level of nesting, your problem can be solved with:
df['three'] = (
    np.where(
        df['one'] < 2,
        df['one'] * 10,
        np.where(df['one'] < 4, df['one'] ** 2, df['one'] + 10))
)
df
one two three
0 1 6 10
1 2 7 4
2 3 8 9
3 4 9 14
4 5 10 15
numpy.select
Allows for flexible syntax and is easily extensible. It follows the form,
np.select([cond1, cond2, ...], [exp1, exp2, ...])
Or, in this case,
np.select([cond1, cond2], [exp1, exp2], default=exp3)
df['three'] = (
    np.select(
        condlist=[df['one'] < 2, df['one'] < 4],
        choicelist=[df['one'] * 10, df['one'] ** 2],
        default=df['one'] + 10)
)
df
one two three
0 1 6 10
1 2 7 4
2 3 8 9
3 4 9 14
4 5 10 15
and/or (similar to the if/else)
This behaves like the chained if/else but still requires apply with a lambda:
df['three'] = df["one"].apply(
    lambda x: (x < 2 and x * 10) or (x < 4 and x ** 2) or x + 10)
df
one two three
0 1 6 10
1 2 7 4
2 3 8 9
3 4 9 14
4 5 10 15
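One caveat with the and/or trick (a sketch): it silently misbehaves whenever a selected expression is falsy, because or then falls through to the next branch.

```python
f = lambda x: (x < 2 and x * 10) or (x < 4 and x ** 2) or x + 10

print(f(1))  # 10, as expected
# For x = 0: (0 < 2 and 0 * 10) evaluates to 0, which is falsy,
# so "or" falls through and we get 10 instead of the intended 0.
print(f(0))  # 10
```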
List Comprehension
Loopy solution that is still faster than apply.
df['three'] = [x*10 if x<2 else (x**2 if x<4 else x+10) for x in df['one']]
# df['three'] = [
#     (x < 2 and x * 10) or (x < 4 and x ** 2) or x + 10 for x in df['one']
# ]
df
one two three
0 1 6 10
1 2 7 4
2 3 8 9
3 4 9 14
4 5 10 15
Apply across columns
Use pd.DataFrame.apply instead of pd.Series.apply and specify axis=1:
df['one'] = df.apply(
    lambda row: row['one'] * 100 if row['two'] > 8
    else (row['one'] * 1 if row['two'] < 8 else row['one'] ** 2),
    axis=1)
Unreadable? Yes, I agree. Let's try again but this time rewrite as a named function.
Using a function
Note lambda is just an anonymous function. We can define a function explicitly and use it with pd.DataFrame.apply:
def calc(row):
    if row['two'] > 8:
        return row['one'] * 100
    elif row['two'] < 8:
        return row['one']
    else:
        return row['one'] ** 2
df['one'] = df.apply(calc, axis=1)
Readable? Yes. But this isn't vectorised. We're looping through each row one at a time. We might as well have used a list. Pandas isn't just for clever table formatting; you can use it for vectorised calculations on arrays in contiguous memory blocks. So let's try one more time.
Vectorised calculations
Using numpy.where:
df['one'] = np.where(df['two'] > 8, df['one'] * 100,
                     np.where(df['two'] < 8, df['one'],
                              df['one'] ** 2))
There we go. Readable and efficient. We have effectively vectorised our if / else statements. Does this mean that we are doing more calculations than necessary? Yes! But this is more than offset by the way in which we are performing the calculations, i.e. with well-defined blocks of memory rather than pointers. You will find an order of magnitude performance improvement.
Another example
If the condition comes from a different column, for example a membership test with isin, we can just use numpy.where again.
df['one'] = np.where(df['name'].isin(['a', 'b']), 100, df['two'])
You can do
df.apply(lambda x: x["one"] + x["two"], axis=1)
but I don't think that a lambda as long as lambda x: x["one"]*100 if x["two"]>8 else (x["one"]*1 if x["two"]<8 else x["one"]**2) is very pythonic. apply takes any callback:
def my_callback(x):
    if x["two"] > 8:
        return x["one"] * 100
    elif x["two"] < 8:
        return x["one"]
    else:
        return x["one"] ** 2
df.apply(my_callback, axis=1)
Writing an if statement with a lambda function:
I'm trying to filter a map in spark using an if statement but get a syntax error. I have not been able to find the error. Can you guys tell me what I am doing wrong here?
full_count_with0val = full_rdd.map(lambda x: json.loads(x[1])).flatMap(lambda x: x['exposures']).map(lambda x: x['pdd_list'] if len(x['pdd_list'])==0).take(5)
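The error is in the final map: a conditional expression must have an else (x if cond else y); x if cond on its own is a SyntaxError. Since the intent is presumably to keep only records whose pdd_list is empty, a filter is the natural fix (with an RDD that would be .filter(lambda x: len(x['pdd_list']) == 0)). A minimal sketch using plain Python in place of Spark, with the exposures/pdd_list structure assumed from the question:

```python
# Stand-in for the records the Spark pipeline would produce.
exposures = [
    {'pdd_list': []},
    {'pdd_list': [1, 2]},
    {'pdd_list': []},
]

# Invalid:  lambda x: x['pdd_list'] if len(x['pdd_list']) == 0   (missing else)
# Filtering expresses the intent directly instead:
empty = [x['pdd_list'] for x in exposures if len(x['pdd_list']) == 0]
print(empty)  # [[], []]
```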