There are two steps to create and populate a new column using only a row number (this approach does not use iloc).
First, get the row index value from the row number:
rowIndex = df.index[someRowNumber]
Then use that row index with loc to reference the specific row and add the new column and value:
df.loc[rowIndex, 'New Column Title'] = "some value"
These two steps can be combined into one line as follows:
df.loc[df.index[someRowNumber], 'New Column Title'] = "some value"
Answer from RumbleFish on Stack Overflow
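The two steps above can be sketched end to end as follows. The frame, its string labels, and the name some_row_number are made up for illustration; string labels are used only so that the row number and the row label visibly differ.

```python
import pandas as pd

# Hypothetical frame whose index labels are not 0..n-1.
df = pd.DataFrame({"X": [1.5, 6.777, 2.444]}, index=["a", "b", "c"])

some_row_number = 1                     # 0-based position of the row
row_label = df.index[some_row_number]   # label stored at that position ("b")

# Assigning to a column that does not exist yet creates it;
# rows that receive no value are filled with NaN.
df.loc[row_label, "New Column Title"] = "some value"

print(df)
```

One caveat worth keeping in mind: if the index contains duplicate labels, loc selects every row carrying that label, so this sketch assumes a unique index.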
If you have a dataframe like
import numpy as np
import pandas as pd
df = pd.DataFrame(data={'X': [1.5, 6.777, 2.444, np.nan], 'Y': [1.111, np.nan, 8.77, np.nan], 'Z': [5.0, 2.333, 10, 6.6666]})
Instead of iloc, you can use .loc with a row index and a column name, as in df.loc[row_indexer, column_indexer] = value:
df.loc[[0,3],'Z'] = 3
Output:
       X      Y       Z
0  1.500  1.111   3.000
1  6.777    NaN   2.333
2  2.444  8.770  10.000
3    NaN    NaN   3.000
You can use df.loc[i], where the row with index i will be what you specify it to be in the dataframe.
>>> import pandas as pd
>>> from numpy.random import randint
>>> df = pd.DataFrame(columns=['lib', 'qty1', 'qty2'])
>>> for i in range(5):
...     df.loc[i] = ['name' + str(i)] + list(randint(10, size=2))
...
>>> df
     lib  qty1  qty2
0  name0     3     3
1  name1     2     4
2  name2     2     8
3  name3     2     1
4  name4     9     6
In case you can get all data for the data frame upfront, there is a much faster approach than appending to a data frame:
- Create a list of dictionaries in which each dictionary corresponds to an input data row.
- Create a data frame from this list.
I had a similar task for which appending to a data frame row by row took 30 min, and creating a data frame from a list of dictionaries completed within seconds.
rows_list = []
for row in input_rows:
    dict1 = {}
    # get input row in dictionary format
    # key = col_name
    dict1.update(blah..)
    rows_list.append(dict1)
df = pd.DataFrame(rows_list)
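A runnable version of the same pattern might look like the sketch below. The contents of input_rows and the column names are invented for illustration; the original leaves the dictionary-building step unspecified.

```python
import pandas as pd

# Hypothetical input: each record is a (lib, qty1, qty2) tuple.
input_rows = [("name0", 3, 3), ("name1", 2, 4), ("name2", 2, 8)]

rows_list = []
for lib, qty1, qty2 in input_rows:
    # One dictionary per row; keys become the column names.
    rows_list.append({"lib": lib, "qty1": qty1, "qty2": qty2})

# Build the frame once, after all rows have been collected.
df = pd.DataFrame(rows_list)
print(df)
```

The frame is constructed a single time here, rather than being enlarged on every loop iteration.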
RukTech's answer, df.set_value('C', 'x', 10), is far and away faster than the options I've suggested below. However, it has been slated for deprecation.
Going forward, the recommended method is .iat/.at.
Why df.xs('C')['x']=10 does not work:
df.xs('C'), by default, returns a new object (a Series holding a copy of the row's data), so
df.xs('C')['x']=10
modifies this new object only.
df['x'] returns a view of the df dataframe, so
df['x']['C'] = 10
modifies df itself.
Warning: It is sometimes difficult to predict if an operation returns a copy or a view. For this reason the docs recommend avoiding assignments with "chained indexing".
So the recommended alternative is
df.at['C', 'x'] = 10
which does modify df.
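The difference can be seen in a small sketch. The frame below is hypothetical, and its columns deliberately have mixed dtypes so that the cross-section has to be built as a new object; pandas may emit a SettingWithCopyWarning on the first assignment, depending on the version.

```python
import pandas as pd

# Hypothetical frame; mixed dtypes (int and float) are an assumption
# made so that the cross-section cannot share memory with df.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]},
                  index=["A", "B", "C"])

row = df.xs("C")        # new Series built from row "C"
row["x"] = 10           # changes only the new Series
value_after_xs = df.at["C", "x"]   # df is untouched at this point

df.at["C", "x"] = 10    # single label-based assignment on df itself
value_after_at = df.at["C", "x"]

print(value_after_xs, value_after_at)
```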
In [18]: %timeit df.set_value('C', 'x', 10)
100000 loops, best of 3: 2.9 µs per loop
In [20]: %timeit df['x']['C'] = 10
100000 loops, best of 3: 6.31 µs per loop
In [81]: %timeit df.at['C', 'x'] = 10
100000 loops, best of 3: 9.2 µs per loop
Update: The .set_value method is going to be deprecated. .iat/.at are good replacements; unfortunately, pandas provides little documentation for them.
The fastest way to do this is to use set_value. This method is ~100 times faster than the .ix method. For example:
df.set_value('C', 'x', 10)
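Because set_value was removed in pandas 1.0, the call above fails on current versions. A minimal sketch of the same scalar assignment with .at (labels) and .iat (integer positions) follows; the frame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with row labels "A".."C" and columns "x", "y".
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]}, index=["A", "B", "C"])

# Label-based scalar assignment: row label "C", column label "x".
df.at["C", "x"] = 10

# Position-based scalar assignment: first row, second column.
df.iat[0, 1] = 40

# Mixing the two: a row number combined with a column label,
# translated to a position with Index.get_loc.
df.iat[1, df.columns.get_loc("y")] = 50

print(df)
```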