Maybe there is a more elegant way, but I would do it this way:
df['C'] = df['B'].apply(lambda x: f(x)[0])
df['D'] = df['B'].apply(lambda x: f(x)[1])
This applies the function to column B and takes the first and second values of each output. It returns:
A B C D
0 1 O Ok On
1 2 P Pk Pn
2 3 Q Qk Qn
3 4 R Rk Rn
EDIT:
In a more concise way, thanks to this answer:
df[['C','D']] = df['B'].apply(lambda x: pd.Series([f(x)[0],f(x)[1]]))
Answer from Fabio Lamanna on Stack Overflow
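For a self-contained version of the approach above, here is a minimal sketch. The question's f is not shown on this page, so the f below is an assumption reconstructed from the output (it appends 'k' and 'n' to the value in B). It also calls f only once per row instead of twice.

```python
import pandas as pd

# Assumed stand-in for the question's f: returns two values per input.
def f(x):
    return x + "k", x + "n"

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": ["O", "P", "Q", "R"]})

# Wrap each returned pair in a Series so apply() builds a two-column
# DataFrame, then assign both new columns in one statement.
df[["C", "D"]] = df["B"].apply(lambda x: pd.Series(f(x)))

print(df)
```

If f is expensive, calling it once per row like this should roughly halve the work compared with indexing f(x)[0] and f(x)[1] separately.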
If you want to use your function as it is, here is a one-liner:
df.update(df.B.apply(lambda x: pd.Series(dict(zip(['C','D'],f(x))))), overwrite=False)
In [350]: df
Out[350]:
A B C D
2 1 O s h
4 2 P Pk Pn
7 3 Q Qk Qn
9 4 R h m
You can also do:
df1 = df.copy()
df[['C','D']] = df.apply(lambda x: pd.Series([x['B'] + 'k', x['B'] + 'n']), axis=1)
df1.update(df, overwrite=False)
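To make the effect of overwrite=False easier to see, here is a small sketch. The sample frame is an assumption rebuilt from the output shown above (rows 2 and 9 already have values in C and D, rows 4 and 7 are empty), and f is again a hypothetical stand-in.

```python
import numpy as np
import pandas as pd

# Assumed stand-in for the question's f.
def f(x):
    return x + "k", x + "n"

# Sample data assumed from the output above: C and D are partly filled.
df = pd.DataFrame(
    {
        "A": [1, 2, 3, 4],
        "B": ["O", "P", "Q", "R"],
        "C": ["s", np.nan, np.nan, "h"],
        "D": ["h", np.nan, np.nan, "m"],
    },
    index=[2, 4, 7, 9],
)

# Build the candidate values for C and D from column B.
new_cols = df["B"].apply(lambda x: pd.Series(dict(zip(["C", "D"], f(x)))))

# overwrite=False fills only the missing cells; existing values are kept.
df.update(new_cols, overwrite=False)

print(df)
```

Note that update() modifies df in place and only touches columns that already exist in it, so C and D have to be present beforehand.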
You want to replace
print(df.loc[df['Col1'].isnull(),['Col1','Col2', 'Col3']])
Col1 Col2 Col3
2 NaN NaN NaN
3 NaN NaN NaN
With:
replace_with_this = df.loc[df['Col1'].isnull(),['col1_v2','col2_v2', 'col3_v2']]
print(replace_with_this)
col1_v2 col2_v2 col3_v2
2 a b d
3 d e f
Seems reasonable. However, when you do the assignment, you need to account for index alignment, which includes columns.
So, this should work:
df.loc[df['Col1'].isnull(),['Col1','Col2', 'Col3']] = replace_with_this.values
print(df)
Col1 Col2 Col3 col1_v2 col2_v2 col3_v2
0 A B C NaN NaN NaN
1 D E F NaN NaN NaN
2 a b d a b d
3 d e f d e f
I accounted for columns by using .values at the end. This stripped the column information from the replace_with_this dataframe and just used the values in the appropriate positions.
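Here is the same idea as a runnable sketch. The sample data is an assumption rebuilt from the printed output above; the variable names mask, targets and sources are only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data assumed from the printed output above.
df = pd.DataFrame(
    {
        "Col1": ["A", "D", np.nan, np.nan],
        "Col2": ["B", "E", np.nan, np.nan],
        "Col3": ["C", "F", np.nan, np.nan],
        "col1_v2": [np.nan, np.nan, "a", "d"],
        "col2_v2": [np.nan, np.nan, "b", "e"],
        "col3_v2": [np.nan, np.nan, "d", "f"],
    }
)

mask = df["Col1"].isnull()
targets = ["Col1", "Col2", "Col3"]
sources = ["col1_v2", "col2_v2", "col3_v2"]

# .values drops the row and column labels, so the values are placed
# by position instead of being aligned on the (different) column names.
df.loc[mask, targets] = df.loc[mask, sources].values

print(df)
```

Because the assignment is positional, the order of sources has to match the order of targets.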
In the "take the hill" spirit, I offer the below solution which yields the requested result.
I realize this is not exactly what you are after, as I am not slicing the df (in the reasonable, but non-functional, way that you propose).
#Does not work when indexing on np.nan, so I fill with some arbitrary value.
df = df.fillna('AAA')
#mask to determine which rows to update
mask = df['Col1'] == 'AAA'
#dict with key value pairs for columns to be updated
mp = {'Col1':'col1_v2','Col2':'col2_v2','Col3':'col3_v2'}
#update
for k in mp:
    df.loc[mask,k] = df[mp.get(k)]
#swap back np.nans for the arbitrary values
df = df.replace('AAA',np.nan)
Output:
Col1 Col2 Col3 col1_v2 col2_v2 col3_v2
A B C NaN NaN NaN
D E F NaN NaN NaN
a b d a b d
d e f d e f
The error I get if I do not replace nans is below. I'm going to research exactly where that error stems from.
ValueError: array is not broadcastable to correct shape
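One possible way to avoid the placeholder value, offered as a sketch and not as a diagnosis of the error above: build the mask with isnull() and assign one column at a time, so that each assignment is a Series aligned on the row index. The sample data is assumed from the output shown.

```python
import numpy as np
import pandas as pd

# Sample data assumed from the output shown above.
df = pd.DataFrame(
    {
        "Col1": ["A", "D", np.nan, np.nan],
        "Col2": ["B", "E", np.nan, np.nan],
        "Col3": ["C", "F", np.nan, np.nan],
        "col1_v2": [np.nan, np.nan, "a", "d"],
        "col2_v2": [np.nan, np.nan, "b", "e"],
        "col3_v2": [np.nan, np.nan, "d", "f"],
    }
)

# Rows to update: those where Col1 is missing (no sentinel value needed).
mask = df["Col1"].isnull()

# Target column -> source column.
mp = {"Col1": "col1_v2", "Col2": "col2_v2", "Col3": "col3_v2"}

# Assign column by column; each right-hand side is a Series that
# shares its index with the rows selected on the left.
for target, source in mp.items():
    df.loc[mask, target] = df.loc[mask, source]

print(df)
```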
You can return pd.Series:
dp_data_df = pd.DataFrame({'A':[3,5,6,6],
'B':[6,7,8,9],
'P':[5,6,7,0],
'Y':[1,2,3,4]})
print (dp_data_df)
A B P Y
0 3 6 5 1
1 5 7 6 2
2 6 8 7 3
3 6 9 0 4
def update_row(row):
    listy = [1,2,3]
    return pd.Series(listy)
dp_data_df[['A', 'P','Y']] = dp_data_df.apply(update_row, axis=1)
print (dp_data_df)
A B P Y
0 1 6 2 3
1 1 7 2 3
2 1 8 2 3
3 1 9 2 3
You can also zip the output:
def update_row(row):
    listy = [1,2,3]
    return listy
dp_data_df['A'], dp_data_df['P'], dp_data_df['Y'] = zip(*dp_data_df.apply(update_row, axis=1))
The function will return two series in a tuple. If you assign it to a tuple of df['A'] and df['B'] this will work:
(df['A'], df['B']) = my_function(df['C'])
Output:
| | A | B | C |
|---|---|---|---|
| 0 | 200 | 300 | 100 |
| 1 | 400 | 600 | 200 |
| 2 | 600 | 900 | 300 |
It turns out it's easy: you just have to pack the results into a Series.
# Use apply to update columns 'A' and 'B' based on column 'C'
df[['A', 'B']] = df['C'].apply(my_function).apply(pd.Series)
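Both variants can be tried with the sketch below. my_function is not shown on this page, so the version here is an assumption chosen to reproduce the output above (A is twice C, B is three times C). The column names A2 and B2 are only there to keep the two variants apart.

```python
import pandas as pd

# Assumed stand-in for my_function: returns a pair (2 * c, 3 * c).
# It works on a whole Series as well as on a single value.
def my_function(c):
    return c * 2, c * 3

df = pd.DataFrame({"C": [100, 200, 300]})

# Variant 1: pass the whole column, get back a tuple of two Series.
df["A"], df["B"] = my_function(df["C"])

# Variant 2: call the function per value, then expand each returned
# tuple into its own columns with apply(pd.Series).
df[["A2", "B2"]] = df["C"].apply(my_function).apply(pd.Series)

print(df)
```

When the function can operate on a whole column, as in variant 1, it avoids the per-row Python call entirely.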
So, I've figured out how to use the pandas apply method to update the values of a column, row-wise, based on multiple comparisons like this:
# For each row, if the values of both 'columns to check' are 'SOME STRING', change to 'NEW STRING';
# otherwise leave it as is.
my_df['column_to_change'] = my_df.apply(lambda row: 'NEW STRING' if row['column_to_check_1'] == 'SOME STRING' and row['column_to_check_2'] == 'SOME STRING' else row['column_to_change'], axis=1)
Now, I can't figure out how to expand that beyond simple comparison operators. The specific example I'm trying to solve is:
" for each row, if the string value in COLUMN A contains 'foo', change the value in COLUMN B to 'bar', otherwise leave it as is"
I think this is all right, except the ##parts between the hashmarks##
my_df ['columb_b'] = df.apply(lambda row: 'bar' if ##column A contains 'foo'## else row['columb_b'], axis=1)
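One way the placeholder between the hashmarks could be filled in, as a sketch: in a row-wise lambda, a plain Python substring test ('foo' in row[...]) does the job, and Series.str.contains gives a vectorised equivalent. The column names column_a and column_b and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the column names are placeholders.
my_df = pd.DataFrame(
    {
        "column_a": ["foo fighters", "baz", "seafood"],
        "column_b": ["x", "y", "z"],
    }
)

# Row-wise version, in the style of the question.
row_wise = my_df.apply(
    lambda row: "bar" if "foo" in row["column_a"] else row["column_b"],
    axis=1,
)

# Vectorised version: boolean mask from str.contains, then .loc assignment.
# regex=False treats 'foo' as a literal; na=False treats missing values
# as "does not contain".
mask = my_df["column_a"].str.contains("foo", regex=False, na=False)
my_df.loc[mask, "column_b"] = "bar"

print(my_df)
```

The row-wise version would raise on missing values in column_a, which is one reason the mask version may be preferable.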
I usually do this using zip:
>>> df = pd.DataFrame([[i] for i in range(10)], columns=['num'])
>>> df
num
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
>>> def powers(x):
...     return x, x**2, x**3, x**4, x**5, x**6
...
>>> df['p1'], df['p2'], df['p3'], df['p4'], df['p5'], df['p6'] = \
...     zip(*df['num'].map(powers))
>>> df
num p1 p2 p3 p4 p5 p6
0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1
2 2 2 4 8 16 32 64
3 3 3 9 27 81 243 729
4 4 4 16 64 256 1024 4096
5 5 5 25 125 625 3125 15625
6 6 6 36 216 1296 7776 46656
7 7 7 49 343 2401 16807 117649
8 8 8 64 512 4096 32768 262144
9 9 9 81 729 6561 59049 531441
In 2020, I use apply() with the argument result_type='expand':
applied_df = df.apply(lambda row: fn(row.text), axis='columns', result_type='expand')
df = pd.concat([df, applied_df], axis='columns')
fn() should return a dict; its keys will be the new column names.
Alternatively you can do a one-liner by also specifying the column names:
df[["col1", "col2", ...]] = df.apply(lambda row: fn(row.text), axis='columns', result_type='expand')
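A small runnable sketch of the dict-returning variant follows. The function fn and the column names text, n_chars and n_words are hypothetical and only serve to illustrate the pattern.

```python
import pandas as pd

# Hypothetical fn: returns a dict whose keys become the new column names.
def fn(text):
    return {"n_chars": len(text), "n_words": len(text.split())}

df = pd.DataFrame({"text": ["hello world", "pandas"]})

# result_type='expand' turns each returned dict into a row of a new DataFrame.
applied_df = df.apply(
    lambda row: fn(row.text), axis="columns", result_type="expand"
)

# Attach the new columns next to the original ones.
df = pd.concat([df, applied_df], axis="columns")

print(df)
```

With the one-liner form, the names on the left-hand side are used for the new columns, so fn can just as well return a list or tuple there.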