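The documentation for pd.DataFrame.append says:
Append rows of other to the end of this frame, returning a new object. Columns not in this frame are added as new columns.
(emphasis mine, on "returning a new object").
append does not modify df_res in place, so the result has to be assigned back. Try (on pandas < 2.0, where append still exists):
df_res = df_res.append(res)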
Incidentally, note that pandas isn't that efficient for creating a DataFrame by successive concatenations. You might try this, instead:
all_res = []
for df in df_all:
    for i in substr:
        res = df[df['url'].str.contains(i)]
        all_res.append(res)
df_res = pd.concat(all_res)
This first creates a list of all the parts, then creates a DataFrame from all of them once at the end.
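A self-contained sketch of the same pattern follows. The contents of df_all and substr are made up for illustration, and regex=False and ignore_index=True are optional choices that the original snippet does not use.

```python
import pandas as pd

# Hypothetical stand-ins for df_all and substr from the snippet above.
df_all = [
    pd.DataFrame({"url": ["a.com/news", "b.com/blog"]}),
    pd.DataFrame({"url": ["c.com/news/today", "d.com/blog/x"]}),
]
substr = ["news", "blog"]

all_res = []
for df in df_all:
    for i in substr:
        # regex=False treats i as a literal substring rather than a pattern.
        all_res.append(df[df["url"].str.contains(i, regex=False)])

# One concat at the end; ignore_index=True renumbers the rows from 0.
df_res = pd.concat(all_res, ignore_index=True)
print(df_res)
```

Note that a row matching more than one substring would appear once per match, exactly as in the original loop.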
Answer from Ami Tavory on Stack Overflow: python - How do I append one pandas DataFrame to another?
Why am I getting "AttributeError: 'DataFrame' object has no attribute 'append'"?
pandas >= 2.0: append has been removed, use pd.concat instead [1]
Starting from pandas 2.0, append has been removed from the API. It was previously deprecated in version 1.4. See the docs on Deprecations as well as the GitHub issue that originally proposed its deprecation.
The rationale for its removal was to discourage iteratively growing DataFrames in a loop (which is what people typically use append for). append makes a new copy at each stage, so the total amount of data copied grows quadratically with the number of appends.
[1] This assumes you're appending one DataFrame to another. If you're appending a row to a DataFrame, the solution is slightly different - see below.
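To make the quadratic cost concrete, here is a rough accounting sketch rather than a benchmark. It counts how many rows end up in each newly built frame when growing step by step with pd.concat, compared with a single concat at the end. The chunk data and the counter names are made up for illustration.

```python
import pandas as pd

# 100 hypothetical one-row chunks.
chunks = [pd.DataFrame({"x": [k]}) for k in range(100)]

# Growing step by step: every pd.concat builds a brand-new frame that
# holds all rows so far plus the new chunk.
grown = chunks[0]
rows_copied_loop = 0
for chunk in chunks[1:]:
    grown = pd.concat([grown, chunk], ignore_index=True)
    rows_copied_loop += len(grown)

# Collect-then-concat: each row is written into the result once.
once = pd.concat(chunks, ignore_index=True)
rows_copied_once = len(once)

print(rows_copied_loop, rows_copied_once)  # 5049 100
```

Both approaches produce the same frame; only the amount of intermediate copying differs.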
The idiomatic way to append DataFrames is to collect all your smaller DataFrames into a list, and then make one single call to pd.concat. Here's a(n oversimplified) example:
df_list = []
for df in some_function_that_yields_dfs():
    df_list.append(df)
final_df = pd.concat(df_list)
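A runnable version of this pattern might look like the sketch below. The generator body is a hypothetical stand-in, and ignore_index=True is an optional addition: without it, pd.concat keeps each piece's original index, so labels repeat.

```python
import pandas as pd

def some_function_that_yields_dfs():
    # Hypothetical generator standing in for the one named above.
    for start in (0, 2, 4):
        yield pd.DataFrame({"n": [start, start + 1]})

df_list = []
for df in some_function_that_yields_dfs():
    df_list.append(df)

# Default: each piece keeps its own 0..1 index, so labels repeat.
with_repeats = pd.concat(df_list)

# ignore_index=True gives a fresh 0..n-1 index instead.
final_df = pd.concat(df_list, ignore_index=True)

print(with_repeats.index.tolist())  # [0, 1, 0, 1, 0, 1]
print(final_df.index.tolist())      # [0, 1, 2, 3, 4, 5]
```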
Note that if you are trying to append one row at a time rather than one DataFrame at a time, the solution is even simpler.
data = []
for a, b, c in some_function_that_yields_data():
    data.append([a, b, c])
df = pd.DataFrame(data, columns=['a', 'b', 'c'])
More information in Creating an empty Pandas DataFrame, and then filling it?
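As a minimal runnable sketch of the row-at-a-time case, assuming a made-up generator in place of some_function_that_yields_data:

```python
import pandas as pd

def some_function_that_yields_data():
    # Hypothetical source of (a, b, c) tuples.
    for k in range(3):
        yield k, k * 10, str(k)

data = []
for a, b, c in some_function_that_yields_data():
    # Appending to a plain list is cheap; no DataFrame is touched yet.
    data.append([a, b, c])

# Build the DataFrame once, after all rows are collected.
df = pd.DataFrame(data, columns=["a", "b", "c"])
print(df)
```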
https://www.wrighters.io/dont-append-rows-to-a-pandas-dataframe/
You can assign to df.loc[i] to set the row with index label i; if that label does not exist yet, the row is added to the dataframe.
>>> import pandas as pd
>>> from numpy.random import randint
>>> df = pd.DataFrame(columns=['lib', 'qty1', 'qty2'])
>>> for i in range(5):
...     df.loc[i] = ['name' + str(i)] + list(randint(10, size=2))
...
>>> df
     lib qty1 qty2
0  name0    3    3
1  name1    2    4
2  name2    2    8
3  name3    2    1
4  name4    9    6
In case you can get all data for the data frame upfront, there is a much faster approach than appending to a data frame:
- Create a list of dictionaries in which each dictionary corresponds to an input data row.
- Create a data frame from this list.
I had a similar task for which appending to a data frame row by row took 30 min, and creating a data frame from a list of dictionaries completed within seconds.
rows_list = []
for row in input_rows:
    dict1 = {}
    # get input row in dictionary format
    # key = col_name
    dict1.update(blah..)
    rows_list.append(dict1)
df = pd.DataFrame(rows_list)
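The snippet above leaves the dictionary-building step elided. One way it could look is sketched here, assuming the input rows are comma-separated strings. The input format and the column names are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical raw input: one comma-separated string per row.
input_rows = ["name0,3,3", "name1,2,4", "name2,2,8"]

rows_list = []
for row in input_rows:
    lib, qty1, qty2 = row.split(",")
    # One dict per row; the keys become the column names.
    rows_list.append({"lib": lib, "qty1": int(qty1), "qty2": int(qty2)})

# A single constructor call builds the whole frame.
df = pd.DataFrame(rows_list)
print(df)
```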
I also like the append method, but you can do the same thing in one line with a list of dicts:
df = pd.concat([df, pd.DataFrame.from_records([{ 'a': 1, 'b': 2 }])])
or using loc and tuples for values on DataFrames with incremental ascending indexes
df.loc[len(df), ['a','b']] = 1, 2
or maybe
df.loc[len(df), df.columns] = 3, 4
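The sketch below illustrates why the loc variants depend on the index running from 0 to len(df) - 1. It uses the simpler df.loc[len(df)] = [...] form rather than the exact column-list form above, and the sample data is made up. When the index has gaps, len(df) can coincide with an existing label, and the assignment then overwrites that row instead of adding one.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# One-row append via pd.concat; ignore_index=True keeps the index 0..n-1.
df = pd.concat([df, pd.DataFrame.from_records([{"a": 7, "b": 8}])],
               ignore_index=True)

# Index is 0..3, so label 4 is new and a row is added.
df.loc[len(df)] = [9, 10]

# Pitfall: after dropping row 0 the labels are 1..4 and len() is 4,
# so label 4 already exists and gets overwritten.
trimmed = df.drop(index=0)
rows_before = len(trimmed)
trimmed.loc[len(trimmed)] = [0, 0]

print(len(df), rows_before, len(trimmed))  # 5 4 4
```

Calling reset_index(drop=True) first is one way to restore a 0..n-1 index before relying on len(df) as the next label.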
Create a list with your dictionaries, if they are needed, and then create a new dataframe with df = pd.DataFrame.from_records(your_list). A list's append method is very efficient and will never be deprecated. DataFrames, on the other hand, often have to be recreated with all data copied over on each append because of their design, which is why the method was deprecated.