UPDATE: a memory-saving method: first set a new index with a gap for the new row:
In [30]: df
Out[30]:
Col1 Col2 Col3
0 A B 1
1 B C 1
2 D E 1
3 E F 1
If we want to insert a new row between indexes 1 and 2, we split the index at position 2:
In [31]: idxs = np.split(df.index, 2)
Set a new index (with a gap at position 2):
In [32]: df.set_index(idxs[0].union(idxs[1] + 1), inplace=True)
In [33]: df
Out[33]:
Col1 Col2 Col3
0 A B 1
1 B C 1
3 D E 1
4 E F 1
Insert the new row with index 2:
In [34]: df.loc[2] = ['X','X',2]
In [35]: df
Out[35]:
Col1 Col2 Col3
0 A B 1
1 B C 1
3 D E 1
4 E F 1
2 X X 2
Sort the index:
In [38]: df.sort_index(inplace=True)
In [39]: df
Out[39]:
Col1 Col2 Col3
0 A B 1
1 B C 1
2 X X 2
3 D E 1
4 E F 1
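The steps above can be collected into one runnable sketch (the position variable `pos` is mine; index slicing is used instead of np.split, but the gap-and-union idea is the same):

```python
import pandas as pd

df = pd.DataFrame({'Col1': list('ABDE'),
                   'Col2': list('BCEF'),
                   'Col3': [1, 1, 1, 1]})

pos = 2                                  # insert between labels 1 and 2
head, tail = df.index[:pos], df.index[pos:]
df.index = head.union(tail + 1)          # index becomes [0, 1, 3, 4]

df.loc[2] = ['X', 'X', 2]                # label 2 is now free
df.sort_index(inplace=True)
print(df)
```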
PS: you can also insert a whole DataFrame instead of a single row using df.append(new_df):
In [42]: df
Out[42]:
Col1 Col2 Col3
0 A B 1
1 B C 1
2 D E 1
3 E F 1
In [43]: idxs = np.split(df.index, 2)
In [45]: new_df = pd.DataFrame([['X', 'X', 10], ['Y','Y',11]], columns=df.columns)
In [49]: new_df.index += idxs[1].min()
In [51]: new_df
Out[51]:
Col1 Col2 Col3
2 X X 10
3 Y Y 11
In [52]: df = df.set_index(idxs[0].union(idxs[1]+len(new_df)))
In [53]: df
Out[53]:
Col1 Col2 Col3
0 A B 1
1 B C 1
4 D E 1
5 E F 1
In [54]: df = df.append(new_df)
In [55]: df
Out[55]:
Col1 Col2 Col3
0 A B 1
1 B C 1
4 D E 1
5 E F 1
2 X X 10
3 Y Y 11
In [56]: df.sort_index(inplace=True)
In [57]: df
Out[57]:
Col1 Col2 Col3
0 A B 1
1 B C 1
2 X X 10
3 Y Y 11
4 D E 1
5 E F 1
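Note that DataFrame.append was removed in pandas 2.0, so on current pandas the same block insert is done with pd.concat; a sketch under that assumption:

```python
import pandas as pd

df = pd.DataFrame({'Col1': list('ABDE'), 'Col2': list('BCEF'), 'Col3': [1] * 4})
new_df = pd.DataFrame([['X', 'X', 10], ['Y', 'Y', 11]], columns=df.columns)

pos = 2
new_df.index += pos                          # new rows take labels 2 and 3
head, tail = df.index[:pos], df.index[pos:]
df.index = head.union(tail + len(new_df))    # old tail shifts to 4 and 5

df = pd.concat([df, new_df]).sort_index()    # pd.concat instead of df.append
print(df)
```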
OLD answer:
One way (among many) to achieve that would be to split your DF and concatenate it back together with the needed DF in the desired order:
Original DF:
In [12]: df
Out[12]:
Col1 Col2 Col3
0 A B 1
1 B C 1
2 D E 1
3 E F 1
Let's split it into two parts (rows [0:1] and [2:end]):
In [13]: dfs = np.split(df, [2])
In [14]: dfs
Out[14]:
[ Col1 Col2 Col3
0 A B 1
1 B C 1, Col1 Col2 Col3
2 D E 1
3 E F 1]
Now we can concatenate it together with a new DF in the desired order:
In [15]: pd.concat([dfs[0], pd.DataFrame([['C','D', 1]], columns=df.columns), dfs[1]], ignore_index=True)
Out[15]:
Col1 Col2 Col3
0 A B 1
1 B C 1
2 C D 1
3 D E 1
4 E F 1
Answer from MaxU - stand with Ukraine on Stack Overflow.

Use the timeits, Luke!

Conclusion
List comprehensions perform best on smaller amounts of data because they incur very little overhead, even though they are not vectorized. On larger data, however, loc and numpy.where perform better: vectorisation wins the day.
Keep in mind that the applicability of a method depends on your data, the number of conditions, and the data type of your columns. My suggestion is to test various methods on your data before settling on an option.
One sure takeaway from here, however, is that list comprehensions are pretty competitive: they're implemented in C and are highly optimised for performance.
Benchmarking code, for reference. Here are the functions being timed:
def numpy_where(df):
    return df.assign(is_rich=np.where(df['salary'] >= 50, 'yes', 'no'))

def list_comp(df):
    return df.assign(is_rich=['yes' if x >= 50 else 'no' for x in df['salary']])

def loc(df):
    df = df.assign(is_rich='no')
    df.loc[df['salary'] > 50, 'is_rich'] = 'yes'
    return df
Another method is to use the pandas mask (or, depending on the use case, where) method. First initialize a Series with a default value (chosen as "no") and replace some of the values depending on a condition (a little like a mix between loc[] and numpy.where()).
df['is_rich'] = pd.Series('no', index=df.index).mask(df['salary'] > 50, 'yes')
It is probably the fastest option. For example, for a frame with 10 million rows, the mask() option is about 40% faster than the loc option.1
I also updated the perfplot benchmark in cs95's answer to compare how the mask method performs compared to the other methods:

1: The benchmark result that compares mask with loc.
def mask(df):
    return df.assign(is_rich=pd.Series('no', index=df.index).mask(df['salary'] > 50, 'yes'))
df = pd.DataFrame({'salary': np.random.rand(10_000_000)*100})
%timeit mask(df)
# 391 ms ± 3.87 ms per loop (mean ± std. dev. of 10 runs, 100 loops each)
%timeit loc(df)
# 558 ms ± 75.6 ms per loop (mean ± std. dev. of 10 runs, 100 loops each)
There are two steps to create & populate a new column using only a row number (in this approach iloc is not used).
First, get the row index value by using the row number:
rowIndex = df.index[someRowNumber]
Then, use the row index with the loc function to reference the specific row and add the new column / value:
df.loc[rowIndex, 'New Column Title'] = "some value"
These two steps can be combined into one line as follows:
df.loc[df.index[someRowNumber], 'New Column Title'] = "some value"
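A minimal runnable demo of the combined one-liner (the frame and values here are my own toy data):

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30]}, index=['x', 'y', 'z'])

someRowNumber = 1                    # positional row number
rowIndex = df.index[someRowNumber]   # -> label 'y'
df.loc[rowIndex, 'New Column Title'] = 'some value'
print(df)
```

Only the targeted row gets the value; the other rows of the new column are filled with NaN.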
If you have a dataframe like
import numpy as np
import pandas as pd
df = pd.DataFrame(data={'X': [1.5, 6.777, 2.444, np.nan], 'Y': [1.111, np.nan, 8.77, np.nan], 'Z': [5.0, 2.333, 10, 6.6666]})
then instead of iloc, you can use .loc with a row index and column name, like df.loc[row_indexer, column_indexer] = value:
df.loc[[0, 3], 'Z'] = 3
Output:
       X      Y       Z
0  1.500  1.111   3.000
1  6.777    NaN   2.333
2  2.444  8.770  10.000
3    NaN    NaN   3.000
Hi there,
I've got a bit stuck with Pandas. I'm trying to add a new column to my dataframe, and then give each row a 1-10 value based on whether it's present in another df.
My main dataframe is divided into 10 sub-dataframes (df1, df2, df3...) and each row in the dataframe has a unique 'GroupID' value. I'd like to give every entry in the main dataframe a different value in a new column (called Segment); so if it's present in df1, give it the value 1, for example.
I hope I've explained this as clearly as possible, and many thanks in advance!
This is what I've written, but it's returning an error:
if Attempt[Attempt['Group ID'].isin(df1['Group ID'])]:
    Attempt["Segment"] = '1'
elif Attempt[Attempt['Group ID'].isin(df2['Group ID'])]:
    Attempt["Segment"] = '2'
elif Attempt[Attempt['Group ID'].isin(df3['Group ID'])]:
    Attempt["Segment"] = '3'
elif Attempt[Attempt['Group ID'].isin(df4['Group ID'])]:
    Attempt["Segment"] = '4'
elif Attempt[Attempt['Group ID'].isin(df5['Group ID'])]:
    Attempt["Segment"] = '5'
elif Attempt[Attempt['Group ID'].isin(df6['Group ID'])]:
    Attempt["Segment"] = '6'
elif Attempt[Attempt['Group ID'].isin(df7['Group ID'])]:
    Attempt["Segment"] = '7'
elif Attempt[Attempt['Group ID'].isin(df8['Group ID'])]:
    Attempt["Segment"] = '8'
elif Attempt[Attempt['Group ID'].isin(df9['Group ID'])]:
    Attempt["Segment"] = '9'
elif Attempt[Attempt['Group ID'].isin(df10['Group ID'])]:
    Attempt["Segment"] = '10'
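The if/elif chain above fails because each condition is a whole DataFrame, not a scalar (pandas raises "The truth value of a DataFrame is ambiguous"). One hedged sketch of a vectorised fix uses numpy.select; the toy frames here stand in for the poster's df1..df10:

```python
import numpy as np
import pandas as pd

# toy stand-ins for the poster's main frame and sub-frames
Attempt = pd.DataFrame({'Group ID': [1, 2, 3, 4]})
df1 = pd.DataFrame({'Group ID': [1]})
df2 = pd.DataFrame({'Group ID': [2, 3]})
sub_dfs = [df1, df2]                 # extend with df3..df10 as needed

# one boolean mask per sub-frame, one segment label per mask
conditions = [Attempt['Group ID'].isin(d['Group ID']) for d in sub_dfs]
choices = [str(i) for i in range(1, len(sub_dfs) + 1)]
Attempt['Segment'] = np.select(conditions, choices, default='0')
print(Attempt)
```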
I have a dataframe that, after grouping by User and Time, looks like below:
User | Time  | Event type
-----|-------|-----------
A    | xx:xx | 1
A    | yy:yy | 0
B    | xx:xx | 0
B    | yy:yy | 1
B    | zz:zz | 0
Now I want to check the current, the previous and the next event.
If the current event is 1 and there is no previous event, I want to add a 0 event manually (as a new row).
If the current event is 0 and there is no next event, I want to add a 1 event manually.
So basically I can have complete 0-1 sessions.
Is there any way to achieve this?
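One possible approach (my own sketch, not from the thread): group by User, prepend a 0 row when a user's first event is 1, and append a 1 row when the last event is 0, then glue everything back with pd.concat. The placeholder Time of None is an assumption.

```python
import pandas as pd

df = pd.DataFrame({'User': ['A', 'A', 'B', 'B', 'B'],
                   'Time': ['01:00', '02:00', '01:00', '02:00', '03:00'],
                   'Event type': [1, 0, 0, 1, 0]})

pieces = []
for user, g in df.groupby('User', sort=False):
    g = g.sort_values('Time')
    if g['Event type'].iloc[0] == 1:   # starts with 1: prepend a 0 event
        pieces.append(pd.DataFrame({'User': [user], 'Time': [None],
                                    'Event type': [0]}))
    pieces.append(g)
    if g['Event type'].iloc[-1] == 0:  # ends with 0: append a 1 event
        pieces.append(pd.DataFrame({'User': [user], 'Time': [None],
                                    'Event type': [1]}))

out = pd.concat(pieces, ignore_index=True)
print(out)
```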
I think you can use loc if you need to update two columns to the same value:
df1.loc[df1['stream'] == 2, ['feat','another_feat']] = 'aaaa'
print(df1)
stream feat another_feat
a 1 some_value some_value
b 2 aaaa aaaa
c 2 aaaa aaaa
d 3 some_value some_value
If you need to update them separately, one option is:
df1.loc[df1['stream'] == 2, 'feat'] = 10
print(df1)
stream feat another_feat
a 1 some_value some_value
b 2 10 some_value
c 2 10 some_value
d 3 some_value some_value
Another common option is to use numpy.where:
df1['feat'] = np.where(df1['stream'] == 2, 10, 20)
print(df1)
stream feat another_feat
a 1 20 some_value
b 2 10 some_value
c 2 10 some_value
d 3 20 some_value
EDIT: If you need to divide all columns except stream where the condition is True, use:
print(df1)
stream feat another_feat
a 1 4 5
b 2 4 5
c 2 2 9
d 3 1 7
# filter all columns except 'stream'
cols = [col for col in df1.columns if col != 'stream']
print(cols)
['feat', 'another_feat']
df1.loc[df1['stream'] == 2, cols] = df1 / 2
print(df1)
stream feat another_feat
a 1 4.0 5.0
b 2 2.0 2.5
c 2 1.0 4.5
d 3 1.0 7.0
If you are working with multiple conditions, it is possible to use multiple nested numpy.where calls or numpy.select:
df0 = pd.DataFrame({'Col': [5, 0, -6]})
df0['New Col1'] = np.where((df0['Col'] > 0), 'Increasing',
                  np.where((df0['Col'] < 0), 'Decreasing', 'No Change'))
df0['New Col2'] = np.select([df0['Col'] > 0, df0['Col'] < 0],
                            ['Increasing', 'Decreasing'],
                            default='No Change')
print(df0)
Col New Col1 New Col2
0 5 Increasing Increasing
1 0 No Change No Change
2 -6 Decreasing Decreasing
You can do the same with .ix, like this:
In [1]: df = pd.DataFrame(np.random.randn(5,4), columns=list('abcd'))
In [2]: df
Out[2]:
a b c d
0 -0.323772 0.839542 0.173414 -1.341793
1 -1.001287 0.676910 0.465536 0.229544
2 0.963484 -0.905302 -0.435821 1.934512
3 0.266113 -0.034305 -0.110272 -0.720599
4 -0.522134 -0.913792 1.862832 0.314315
In [3]: df.ix[df.a>0, ['b','c']] = 0
In [4]: df
Out[4]:
a b c d
0 -0.323772 0.839542 0.173414 -1.341793
1 -1.001287 0.676910 0.465536 0.229544
2 0.963484 0.000000 0.000000 1.934512
3 0.266113 0.000000 0.000000 -0.720599
4 -0.522134 -0.913792 1.862832 0.314315
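Note that .ix was deprecated in pandas 0.20 and removed in 1.0; on modern pandas the same boolean row selection works with .loc:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 4), columns=list('abcd'))
df.loc[df.a > 0, ['b', 'c']] = 0   # .loc replaces the removed .ix
print(df)
```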
EDIT
After the extra information, the following will return all columns (where some condition is met) with halved values:
>>> condition = df.a > 0
>>> df[condition][[i for i in df.columns.values if i not in ['a']]].apply(lambda x: x/2)
You could use groupby to check if 'C' and 'D' are in the 'col_name' column and add them if not. The code would end up looking something like this:
df = pd.DataFrame([{'owner': 'svc', 'name': 'dmn_dmn', 'col_name': 'A', 'test_col1': 1, 'test_col2': 'String1'},
                   {'owner': 'svc', 'name': 'dmn_dmn', 'col_name': 'B', 'test_col1': 2, 'test_col2': 'String12'},
                   {'owner': 'svc', 'name': 'dmn_dmn', 'col_name': 'C', 'test_col1': 'remain_constant_3', 'test_col2': 'String13'},
                   {'owner': 'svc', 'name': 'dmn_dmn', 'col_name': 'D', 'test_col1': 'remain_constant_3', 'test_col2': 'String14'},
                   {'owner': 'svc', 'name': 'time1', 'col_name': 'E', 'test_col1': 5, 'test_col2': 'String1123'}])
for g, g_hold in df.groupby('name'):
    if 'C' not in g_hold['col_name'].tolist():
        df = df.append({'owner': 'svc', 'name': g, 'col_name': 'C', 'test_col1': 'remain_constant_3', 'test_col2': 'String13'}, ignore_index=True)
    if 'D' not in g_hold['col_name'].tolist():
        df = df.append({'owner': 'svc', 'name': g, 'col_name': 'D', 'test_col1': 'remain_constant_3', 'test_col2': 'String14'}, ignore_index=True)
print(df.sort_values(['name', 'col_name']))
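On pandas 2.x, where DataFrame.append is gone, the same idea can collect the missing rows first and add them with a single pd.concat; a trimmed sketch (extra columns omitted for brevity):

```python
import pandas as pd

df = pd.DataFrame([{'owner': 'svc', 'name': 'dmn_dmn', 'col_name': 'A'},
                   {'owner': 'svc', 'name': 'dmn_dmn', 'col_name': 'C'},
                   {'owner': 'svc', 'name': 'time1',   'col_name': 'E'}])

required = ['C', 'D']                       # rows every name must have
missing = [{'owner': 'svc', 'name': name, 'col_name': col}
           for name, grp in df.groupby('name')
           for col in required
           if col not in grp['col_name'].tolist()]

df = pd.concat([df, pd.DataFrame(missing)], ignore_index=True)
print(df.sort_values(['name', 'col_name']))
```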
A better way is to use reindex with a MultiIndex built from the unique values of each key column:
import pandas as pd
df = pd.DataFrame({"owner": ["owner"] * 9,
                   "name": ["dmn_dmn", "dmn_dmn", "dmn_dmn", "dmn_dmn", "time1", "time1", "sap", "sap", "sap"],
                   "col_name": ["A", "B", "C", "D", "A", "B", "A", "B", "D"]})
index = pd.MultiIndex.from_product([df.owner.unique(), df.name.unique(), df.col_name.unique()])
result = df.set_index(['owner', 'name', 'col_name']).reindex(index).reset_index()
print(result)
The easiest way of doing this is probably to first convert the dataframe back to a list of rows, then use base python syntax to repeat each row n times, and then convert that back to a dataframe:
import pandas as pd
df = pd.DataFrame({
    "event": ["A", "B", "C", "D"],
    "budget": [123, 433, 1000, 1299],
    "duration_days": [6, 3, 4, 2]
})
pd.DataFrame([
    row                                       # select the full row
    for row in df.to_dict(orient="records")   # for each row in the dataframe
    for _ in range(row["duration_days"])      # repeat it duration_days times
])
Which gives the following dataframe:
| event | budget | duration_days |
|---|---|---|
| A | 123 | 6 |
| A | 123 | 6 |
| A | 123 | 6 |
| A | 123 | 6 |
| A | 123 | 6 |
| A | 123 | 6 |
| B | 433 | 3 |
| B | 433 | 3 |
| B | 433 | 3 |
| C | 1000 | 4 |
| C | 1000 | 4 |
| C | 1000 | 4 |
| C | 1000 | 4 |
| D | 1299 | 2 |
| D | 1299 | 2 |
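For reference, pandas also has a vectorised idiom for this (a common alternative, not from the answer above): repeat each index label duration_days times with Index.repeat and select the repeated labels with loc:

```python
import pandas as pd

df = pd.DataFrame({'event': ['A', 'B', 'C', 'D'],
                   'budget': [123, 433, 1000, 1299],
                   'duration_days': [6, 3, 4, 2]})

# each row is selected duration_days times; reset_index renumbers 0..n-1
out = df.loc[df.index.repeat(df['duration_days'])].reset_index(drop=True)
print(out)
```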
First of all, I think your dataset has a problem: if you wrap a list value in single quotes, like '['1','2','3','4']', Python will raise a syntax error.
So I will take the data as below:
df = {'event':['A','B','C','D'],'budget':['123','433','1000','1299'],'duration_days':['6','3','4','2']}
Then convert it to a data frame as required:
data = []
for i in range(len(df['event'])):
    for j in range(int(df['duration_days'][i])):
        temp = [df['event'][i], df['budget'][i], df['duration_days'][i]]
        data.append(temp)
data_df = pd.DataFrame(data, columns=['event', 'budget', 'duration_days'])
data_df
|   | event | budget | duration_days |
|---|---|---|---|
| 0 | A | 123 | 6 |
| 1 | A | 123 | 6 |
| 2 | A | 123 | 6 |
| 3 | A | 123 | 6 |
| 4 | A | 123 | 6 |
| 5 | A | 123 | 6 |
| 6 | B | 433 | 3 |
| 7 | B | 433 | 3 |
| 8 | B | 433 | 3 |