There are two steps to create and populate a new column using only a row number (this approach does not use iloc).
First, get the row index value by using the row number:
rowIndex = df.index[someRowNumber]
Then, use the row index with the loc function to reference the specific row and add the new column / value:
df.loc[rowIndex, 'New Column Title'] = "some value"
These two steps can be combined into one line as follows:
df.loc[df.index[someRowNumber], 'New Column Title'] = "some value"
Answer from RumbleFish on Stack Overflow
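As a minimal sketch of the two steps, the example below uses a hypothetical frame whose index labels differ from the row positions, so the lookup through df.index is visible. The column name X, the labels and the variable names are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical frame with a non-default index, so labels differ from positions.
df = pd.DataFrame({"X": [1, 2, 3]}, index=["a", "b", "c"])

some_row_number = 1                    # position of the row to update
row_label = df.index[some_row_number]  # translate position -> index label

# .loc creates the column if it does not exist; the other rows stay missing.
df.loc[row_label, "New Column Title"] = "some value"
print(df)
```

Rows that were not addressed hold a missing value in the new column, which is usually what you want when filling a column one row at a time.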
If you have a dataframe like
import pandas as pd
import numpy as np
df = pd.DataFrame(data={'X': [1.5, 6.777, 2.444, np.nan], 'Y': [1.111, np.nan, 8.77, np.nan], 'Z': [5.0, 2.333, 10, 6.6666]})
Instead of iloc, you can use .loc with a row index and column name, like df.loc[row_indexer, column_indexer] = value:
df.loc[[0,3],'Z'] = 3
Output:
       X      Y       Z
0  1.500  1.111   3.000
1  6.777    NaN   2.333
2  2.444  8.770  10.000
3    NaN    NaN   3.000
Use rename on the index with a Series:
df2['new'] = df2.rename(index=df.set_index('key')['bar']).index
print (df2)
     other_data  new
key
a            10    1
a            20    1
b            30    2
Or map:
df2['new'] = df2.index.to_series().map(df.set_index('key')['bar'])
print (df2)
     other_data  new
key
a            10    1
a            20    1
b            30    2
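The map approach can be tried end to end with the sketch below. The contents of df and df2 are assumptions reconstructed from the printed output (the original frames are not shown), and lookup is a hypothetical helper name.

```python
import pandas as pd

# Assumed inputs, reconstructed from the printed output above.
df = pd.DataFrame({"key": ["a", "b"], "bar": [1, 2]})
df2 = pd.DataFrame(
    {"other_data": [10, 20, 30]},
    index=pd.Index(["a", "a", "b"], name="key"),
)

# Series indexed by key, used as a lookup table.
lookup = df.set_index("key")["bar"]

# Map every index label of df2 to its value in the lookup table.
df2["new"] = df2.index.to_series().map(lookup)
print(df2)
```

Labels that are missing from the lookup table would come back as NaN rather than raising an error.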
If you want better performance, it is best to avoid duplicates in the index. Also, some functions, such as reindex, fail on an index with duplicates.
You can use join
df2.join(df.set_index('key'))
     other_data  bar
key
a            10    1
a            20    1
b            30    2
One way to rename the column in the process
df2.join(df.set_index('key').bar.rename('new'))
     other_data  new
key
a            10    1
a            20    1
b            30    2
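For reference, here is a self-contained sketch of the join variant. As before, the contents of df and df2 are assumptions inferred from the printed output, and joined is a hypothetical variable name.

```python
import pandas as pd

# Assumed inputs, reconstructed from the printed output above.
df = pd.DataFrame({"key": ["a", "b"], "bar": [1, 2]})
df2 = pd.DataFrame(
    {"other_data": [10, 20, 30]},
    index=pd.Index(["a", "a", "b"], name="key"),
)

# join aligns on the index of df2; renaming the Series sets the new column name.
joined = df2.join(df.set_index("key")["bar"].rename("new"))
print(joined)
```

Note that join returns a new DataFrame and leaves df2 unchanged, so the result has to be assigned to a name to be kept.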
I finally found the solution. I don't know why the other one didn't work, but this one is simpler:
tempDf = pd.DataFrame(columns=['PRODUCT','CAT_ID','MARKET_ID'])
tempDf['PRODUCT'] = df['PRODUCT']
tempDf['CAT_ID'] = catid
tempDf['MARKET_ID'] = 13
finalDf = pd.concat([finalDf,tempDf])
You actually do not need catids and marketids:
finalDf['CAT_ID'] = catid
finalDf['MARKET_ID'] = marketid
Will work.
For the rest of the script, I would probably have made things a bit simpler, like this:
finalDf = pd.DataFrame()
finalDf['PRODUCT'] = df['PRODUCT'].reset_index(drop=True)
This assumes that you are not interested in df's original index, as your code implied.
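The scalar assignments above rely on pandas broadcasting a single value to every row. A minimal sketch, using a hypothetical df and a made-up catid value, and building the frame with a double-bracket selection instead of an empty DataFrame:

```python
import pandas as pd

# Hypothetical source frame; the index values are arbitrary on purpose.
df = pd.DataFrame({"PRODUCT": ["apple", "pear"]}, index=[10, 11])
catid = 7  # assumed category id, for illustration only

# Keep only PRODUCT and discard the original index.
final_df = df[["PRODUCT"]].reset_index(drop=True)

# A scalar is broadcast to every existing row of the frame.
final_df["CAT_ID"] = catid
final_df["MARKET_ID"] = 13
print(final_df)
```

The order matters: a scalar assigned to a frame that has no rows yet produces an empty column, so the PRODUCT column goes in first.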
You can use isin on the index, just as you normally would with a Series. It creates a boolean array, which you can pass to np.where together with the values for the true and false cases, and then assign the result as a column.
df['Is it valid?'] = np.where(df.index.isin(indexlist), 'Yes', 'No')
OUTPUT:
      Name Country Is it valid?
0     John      BR          Yes
1    Peter      BR          Yes
2     Paul      BR           No
3    James      CZ           No
4  Jonatan      CZ          Yes
5    Maria      DK           No
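The output above can be reproduced with the sketch below. The frame contents and the value of indexlist are assumptions inferred from the printed result, and mask is a hypothetical intermediate name.

```python
import numpy as np
import pandas as pd

# Assumed inputs, inferred from the printed output.
df = pd.DataFrame({
    "Name": ["John", "Peter", "Paul", "James", "Jonatan", "Maria"],
    "Country": ["BR", "BR", "BR", "CZ", "CZ", "DK"],
})
indexlist = [0, 1, 4]

# Boolean array: True where the row's index label appears in indexlist.
mask = df.index.isin(indexlist)

# Pick 'Yes' where the mask is True and 'No' elsewhere.
df["Is it valid?"] = np.where(mask, "Yes", "No")
print(df)
```

Because isin compares index labels, this also works when the index is not the default 0..n-1 range.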
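You can also use iloc, assuming indexlist holds row positions. Selecting the rows and the column in a single call avoids chained indexing:
df['Is it valid?'] = 'No'
df.iloc[indexlist, df.columns.get_loc('Is it valid?')] = 'Yes'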
As you noticed, the line
df.loc[df['foo'] == 1, 'foo2'] = b
overwrites the previous value, so you end up with only b. Is there a particular reason you need to solve your problem this way? You can write a custom function and use apply to append values, but this is not how pandas is supposed to be used.
I would recommend creating another column 'foo3'. You can always join the letters afterwards.
This should achieve what you are looking for. However, just as maow already pointed out, you should think about adding new columns and concatenating them when you need to.
import pandas as pd
df = pd.DataFrame({'foo': [1,2,3,3], 'bar': [1,2,3,3], 'foo2':[None, None, None, None]})
print(df)
a = 'a'
b = 'b'
# insert a
df.loc[df['foo'] == 1, 'foo2'] = a
print(df)
# concat b to a
df.loc[df['foo'] == 1, 'foo2'] += ',' + b
print(df)
RukTech's answer, df.set_value('C', 'x', 10), is far and away faster than the options I've suggested below. However, it has been slated for deprecation (set_value was deprecated in pandas 0.21 and removed in 1.0).
Going forward, the recommended method is .iat/.at.
Why df.xs('C')['x']=10 does not work:
df.xs('C'), by default, returns a new object with a copy of the data, so
df.xs('C')['x']=10
modifies this new dataframe only.
df['x'] returns a view of the df dataframe, so
df['x']['C'] = 10
modifies df itself.
Warning: It is sometimes difficult to predict if an operation returns a copy or a view. For this reason the docs recommend avoiding assignments with "chained indexing".
So the recommended alternative is
df.at['C', 'x'] = 10
which does modify df.
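A short sketch of the .at form on a hypothetical frame (the labels A, B, C and the columns x, y are assumptions chosen to match the example call):

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]}, index=["A", "B", "C"])

# .at addresses exactly one cell by row label and column label,
# in a single indexing call, so no intermediate copy is involved.
df.at["C", "x"] = 10

# .loc accepts the same pair of labels and also handles slices and lists.
df.loc["C", "y"] = 20
print(df)
```

Both calls write to df itself because the row and the column are given together, which is the point of avoiding chained indexing.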
In [18]: %timeit df.set_value('C', 'x', 10)
100000 loops, best of 3: 2.9 µs per loop
In [20]: %timeit df['x']['C'] = 10
100000 loops, best of 3: 6.31 µs per loop
In [81]: %timeit df.at['C', 'x'] = 10
100000 loops, best of 3: 9.2 µs per loop
Update: The .set_value method has since been deprecated (pandas 0.21) and removed (pandas 1.0). .iat/.at are good replacements, although pandas provides little documentation for them.
The fastest way to do this was using set_value. This method is ~100 times faster than the .ix method. For example:
df.set_value('C', 'x', 10)
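Since set_value is no longer available in current pandas, a rough equivalent with the accessors mentioned above might look like this. The frame and its labels are hypothetical; .at works with labels and .iat with integer positions.

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]}, index=["A", "B", "C"])

# Label-based scalar access: row label 'C', column label 'x'.
df.at["C", "x"] = 10

# Position-based scalar access: translate the labels to positions first.
row_pos = df.index.get_loc("C")    # position of row 'C'
col_pos = df.columns.get_loc("y")  # position of column 'y'
df.iat[row_pos, col_pos] = 20
print(df)
```

Both accessors only read or write a single cell, which is what keeps them lighter than .loc and .iloc for scalar updates.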