You can go with @greenAfrican's example if you are able to rewrite your function. If you don't want to rewrite it, you can wrap it in an anonymous function inside apply, like this:
>>> def fxy(x, y):
...     return x * y
...
>>> df['newcolumn'] = df.apply(lambda x: fxy(x['A'], x['B']), axis=1)
>>> df
    A   B  newcolumn
0  10  20        200
1  20  30        600
2  30  10        300
Answer from roman on Stack Overflow
Alternatively, you can use the underlying NumPy function:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": [10, 20, 30], "B": [20, 30, 10]})
>>> df['new_column'] = np.multiply(df['A'], df['B'])
>>> df
    A   B  new_column
0  10  20         200
1  20  30         600
2  30  10         300
Or, in the general case, vectorize an arbitrary function:
>>> def fx(x, y):
...     return x*y
...
>>> df['new_column'] = np.vectorize(fx)(df['A'], df['B'])
>>> df
    A   B  new_column
0  10  20         200
1  20  30         600
2  30  10         300
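As a quick check, the sketch below runs the approaches above side by side on the same frame and confirms they agree. The variable names are illustrative only. Note that np.vectorize is a convenience wrapper that still calls the Python function once per element, so it should not be expected to match the speed of np.multiply or the plain `*` operator.

```python
import numpy as np
import pandas as pd


def fxy(x, y):
    return x * y


df = pd.DataFrame({"A": [10, 20, 30], "B": [20, 30, 10]})

# Row-wise apply: fxy is called once per row with two scalars.
via_apply = df.apply(lambda row: fxy(row["A"], row["B"]), axis=1)

# NumPy ufunc: operates on the whole columns at once.
via_numpy = np.multiply(df["A"], df["B"])

# np.vectorize: wraps fxy so it accepts arrays, but loops internally.
via_vectorize = np.vectorize(fxy)(df["A"], df["B"])

# Plain operator on the columns, equivalent to np.multiply here.
via_operator = df["A"] * df["B"]

print(list(via_apply), list(via_numpy), list(via_vectorize), list(via_operator))
```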
import pandas as pd
myDF = pd.DataFrame({'student_names': ['Monserta ruff', 'Gonzalo Fryer', 'Kris Venmeter'], 'grades': [34, 58, 100]})
def assign_letter(row):
    if row >= 90:
        result = 'A**'
    elif row >= 50:
        result = 'C'
    else:
        result = 'F'
    return result
myDF['letter-grades'] = myDF['grades'].apply(assign_letter) #no argument required?
myDF
Why doesn't the function assign_letter(row) need an argument (or even parentheses) when it is passed to apply, yet it still gives the correct resulting DataFrame (below)?
| student_names | grades | letter-grades |
|---|---|---|
| Monserta ruff | 34 | F |
| Gonzalo Fryer | 58 | C |
| Kris Venmeter | 100 | A** |
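One way to see what is going on: writing assign_letter without parentheses passes the function object itself, and apply then calls it once for each value in the column. The sketch below mimics that with an explicit loop. The parameter is renamed to grade here purely for readability, and the loop illustrates the behaviour rather than showing how pandas implements it.

```python
import pandas as pd


def assign_letter(grade):
    if grade >= 90:
        return 'A**'
    elif grade >= 50:
        return 'C'
    return 'F'


grades = pd.Series([34, 58, 100])

# Passing the function object: apply supplies each element as the argument.
letters = grades.apply(assign_letter)

# Roughly what apply does on our behalf: call the function per element.
by_hand = [assign_letter(g) for g in grades]

print(list(letters))
print(by_hand)
```

Writing assign_letter() with parentheses would instead call the function immediately with no argument and raise a TypeError.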
It works just the way you would expect: apply accepts args and kwargs and passes them directly to some_func.
df.apply(some_func, var1='DOG', axis=1)
Or,
df.apply(some_func, args=('DOG', ), axis=1)
0 foo-x-DOG
1 bar-y-DOG
dtype: object
If for any reason that won't work for your use case, you can always fall back to using a lambda:
df.apply(lambda row: some_func(row, 'DOG'), axis=1)
0 foo-x-DOG
1 bar-y-DOG
dtype: object
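For a runnable version of the three calls above, here is a minimal sketch. The frame contents are an assumption inferred from the printed output (columns A and B holding foo/bar and x/y), and some_func is written to match that output.

```python
import pandas as pd


def some_func(row, var1):
    # row is one row of the frame (a Series); var1 is the extra argument.
    return '{0}-{1}-{2}'.format(row['A'], row['B'], var1)


# Assumed sample data, chosen to reproduce the output shown above.
df = pd.DataFrame({'A': ['foo', 'bar'], 'B': ['x', 'y']})

by_keyword = df.apply(some_func, var1='DOG', axis=1)    # extra keyword argument
by_args = df.apply(some_func, args=('DOG',), axis=1)    # extra positional argument
by_lambda = df.apply(lambda row: some_func(row, 'DOG'), axis=1)

print(by_keyword)
```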
You should use vectorized logic:
df['C'] = df['A'] + '-' + df['B'] + '-DOG'
If you really want to use df.apply, which is just a thinly veiled loop, you can simply feed your arguments as additional parameters:
def some_func(row, var1):
    return '{0}-{1}-{2}'.format(row['A'], row['B'], var1)
df['C'] = df.apply(some_func, var1='DOG', axis=1)
As per the docs, df.apply accepts both positional and keyword arguments.
Newer versions of pandas do allow you to pass extra arguments (see the new documentation). So now you can do:
my_series.apply(your_function, args=(2,3,4), extra_kw=1)
The positional arguments are added after the element of the series.
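A small sketch of that ordering, using a hypothetical function scale_and_shift (not from the original answer): the series element arrives first, the values in args follow, and any other keyword is forwarded by name.

```python
import pandas as pd


def scale_and_shift(value, factor, offset, extra_kw=0):
    # value comes from the series; factor and offset come from args.
    return value * factor + offset + extra_kw


my_series = pd.Series([1, 2, 3])

result = my_series.apply(scale_and_shift, args=(10, 5), extra_kw=1)
print(list(result))
```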
For older versions of pandas:
The documentation explains this clearly. The apply method accepts a Python function which should have a single parameter. If you want to pass more parameters, you should use functools.partial as suggested by Joel Cornett in his comment.
An example:
>>> import functools
>>> import operator
>>> add_3 = functools.partial(operator.add,3)
>>> add_3(2)
5
>>> add_3(7)
10
You can also pass keyword arguments using partial.
Another way would be to create a lambda:
my_series.apply((lambda x: your_func(a,b,c,d,...,x)))
But I think using partial is better.
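To make the comparison concrete, here is a sketch applying both styles to a series. The function your_func and its parameters are hypothetical stand-ins; partial binds the leading positional arguments and a keyword argument, leaving the series element as the last positional parameter.

```python
import functools

import pandas as pd


def your_func(a, b, x, scale=1):
    # a and b are fixed ahead of time; x is the series element.
    return (a + b + x) * scale


my_series = pd.Series([1, 2, 3])

# partial fixes a=10, b=20 and scale=2; apply supplies x.
bound = functools.partial(your_func, 10, 20, scale=2)
via_partial = my_series.apply(bound)

# The equivalent lambda.
via_lambda = my_series.apply(lambda x: your_func(10, 20, x, scale=2))

print(list(via_partial), list(via_lambda))
```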
Steps:
- Create a dataframe
- Create a function
- Use the named arguments of the function in the apply statement.
Example
import pandas as pd
x = pd.DataFrame([1, 2, 3, 4])
def add(i1, i2):
    return i1 + i2
x.apply(add, i2=9)
The outcome of this example is that each number in the dataframe will be added to the number 9.
    0
0  10
1  11
2  12
3  13
Explanation:
The "add" function has two parameters: i1 and i2. The first receives the data from the data frame (with DataFrame.apply and the default axis, each column is passed in as a Series, so the addition is applied to the whole column at once), and the second is whatever we pass to "apply". In this case, we pass 9 using the keyword argument "i2".
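To check what "add" actually receives, the sketch below records the type of i1 on each call. The seen_types list is added here only for illustration. With DataFrame.apply and the default axis, i1 is a whole column (a Series), and adding 9 to it works element by element.

```python
import pandas as pd

x = pd.DataFrame([1, 2, 3, 4])
seen_types = []  # illustration only: records what apply hands to add


def add(i1, i2):
    seen_types.append(type(i1).__name__)
    return i1 + i2


result = x.apply(add, i2=9)

print(set(seen_types))
print(result)
```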