When you use df.apply(), each row of your DataFrame will be passed to your lambda function as a pandas Series. The frame's columns will then be the index of the series and you can access values using series[label].
So this should work:
df['D'] = (df.apply(lambda x: myfunc(x[colNames[0]], x[colNames[1]]), axis=1))
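For illustration, here is a minimal runnable sketch of that call. The function body of myfunc, the sample data, and the contents of colNames are assumptions made up for this example; only the apply pattern comes from the answer.

```python
import pandas as pd

# Hypothetical stand-in for the asker's function.
def myfunc(a, b):
    return a + b

# Hypothetical sample data and column list.
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})
colNames = ["A", "B"]

# With axis=1, x is one row of df as a Series indexed by column name,
# so x[colNames[0]] looks up the value in column "A" for that row.
df["D"] = df.apply(lambda x: myfunc(x[colNames[0]], x[colNames[1]]), axis=1)
print(df["D"].tolist())
```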
Answer from foglerit on Stack Overflow
In general, this error occurs if you try to access an attribute that doesn't exist on an object. For a pandas Series (or DataFrame), it usually occurs because you tried to index it using attribute access (.).
In the OP's case, they used x.colNames[0] to access the value at colNames[0] in row x, but x (a Series) has no attribute colNames, so the error occurred. [1]
Another case where this error may occur is when an index label contains white space that you didn't know about. For example, the following reproduces the error.
s = pd.Series([1, 2], index=[' a', 'b'])
s.a
In this case, make sure to remove the white space:
s.index = [x.strip() for x in s.index]
# or
s.index = [x.replace(' ', '') for x in s.index]
Finally, it's always safe to use [] to index a Series (or a DataFrame).
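As a rough sketch of the white-space case above, the snippet below reproduces the error, strips the labels, and then uses [] to read the value. The variable names raised and value are only for illustration.

```python
import pandas as pd

s = pd.Series([1, 2], index=[" a", "b"])

# Attribute access fails because the label is " a", not "a".
try:
    s.a
    raised = False
except AttributeError:
    raised = True

# Strip surrounding white space from every label, then index with [].
s.index = s.index.str.strip()
value = s["a"]
print(raised, value)
```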
1: A Series has attributes such as axes, dtypes, empty, index, ndim, size, shape, T and values. A DataFrame has all of these plus columns. When you use df.apply(..., axis=1), it iterates over the rows, where each row is a Series whose index labels are the column names of df.
In my case, the problem was the file name.
I had saved my script as pandas.py, so the import picked up my own file instead of the real pandas library, and that is why I got the error.
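One way to check for this kind of name clash is to look at where the imported module actually lives. This is only a sketch; the names module_path and has_series are made up for the example.

```python
import pandas as pd

# Path of the module Python actually imported. If this points at your own
# pandas.py rather than the installed package, rename your script.
module_path = pd.__file__
print(module_path)

# The real library exposes Series; a shadowing script usually does not.
has_series = hasattr(pd, "Series")
print(has_series)
```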
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html
data : array-like, dict, or scalar value Contains data stored in Series
I don't think a set counts as array-like, so just convert it to a list before calling pd.Series.
import pandas as pd
numbers = {1,2,3,4,5}
ser = pd.Series(list(numbers))
print(ser)
Output:
0    1
1    2
2    3
3    4
4    5
dtype: int64
You need to remove .values:
phone_numbers = merged_df.loc[(merged_df['Facility Code'] ==facility_number) & (merged_df['group'] == group) & (merged_df['Optedout'] == optout)]['phone']
@Serge Ballesta's comment is the most likely cause.
There are typos in the code that you have shared. Check whether you called value instead of values.
The following code works as expected:
import pandas as pd
data = {'phone': [25470000000, 25470000000, 25470000010, 25470000020, 25470000000], 'group': ['MAMA', 'MAMA', 'MAMA', 'MAMA', 'MAMA'], 'County': ['Orange', 'Orange', 'Orange', 'Orange', 'Orange'], 'PNC/ANC': ['PNC', 'PNC', 'PNC', 'PNC', 'PNC'], 'Facility Name': ['Main Centre', 'Main Centre', 'Centre', 'Centre', 'Main Centre'], 'Optedout': ['FALSE', 'FALSE', 'FALSE', 'FALSE', 'FALSE'], 'Facility Code': [112, 112, 108, 108, 112]}
merged_df = pd.DataFrame.from_dict(data)
facility_number = 108
group = 'MAMA'
optout = 'FALSE'
phone_numbers = merged_df.loc[(merged_df['Facility Code'] ==facility_number) & (merged_df['group'] == group) & (merged_df['Optedout'] == optout)]['phone'].values
print(phone_numbers)
Output:
[25470000010 25470000020]
Without .values, the result is a Series rather than a NumPy array:
2 25470000010
3 25470000020
Name: phone, dtype: int64
A DataFrame column is a Series, and on a Series you need the .dt accessor to get the days (if you are using a newer pandas version). See the pandas documentation for the .dt accessor.
So, you need to change:
df['days'] = float(df['delta'].days)
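To (using .astype(float) rather than float(), because float() cannot convert a Series with more than one element):
df['days'] = df['delta'].dt.days.astype(float)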
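A small self-contained sketch of the .dt accessor on a timedelta column follows. The start and end columns and their dates are invented for the example; only the delta and days column names come from the answer.

```python
import pandas as pd

# Hypothetical data: two date columns whose difference is a timedelta column.
df = pd.DataFrame({
    "start": pd.to_datetime(["2001-01-01", "2004-06-05"]),
    "end": pd.to_datetime(["2001-01-11", "2004-06-06"]),
})
df["delta"] = df["end"] - df["start"]

# .dt.days returns the whole-day component of each timedelta;
# .astype(float) converts the whole column at once.
df["days"] = df["delta"].dt.days.astype(float)
print(df["days"].tolist())
```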
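When subtracting dates, an older approach is the following:
df = pd.DataFrame([ pd.Timestamp('20010101'), pd.Timestamp('20040605') ])
(df.loc[0]-df.loc[1]).astype('timedelta64[D]')
That is, call .astype('timedelta64[D]') on the subtracted column. Note that this relies on older pandas behaviour; recent releases (2.0 and later, as far as I know) accept only the resolutions 's', 'ms', 'us' and 'ns' here, so .dt.days is the safer choice.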
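As an alternative sketch that does not depend on casting to a day-resolution dtype, a timedelta Series can be divided by a one-day pd.Timedelta to get days as floats. The variable names diff and days are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([pd.Timestamp("20010101"), pd.Timestamp("20040605")])

# Subtracting two rows gives a Series holding one timedelta.
diff = df.loc[1] - df.loc[0]

# Dividing by a one-day Timedelta yields the difference in days as a float.
days = diff / pd.Timedelta(days=1)
print(days.iloc[0])
```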
I have a pandas DataFrame with two columns: the first is 'Document', which contains normal one-line sentences, and the second is 'Label', which is the topic of that sentence.
I want to change each sentence under the Document column such that all stopwords are removed.
Code:
def remove_sw(x):
    x = x.split(' ')
    return ' '.join(z for z in x if z not in stop_words)
corpus_df['Document'] = corpus_df.apply(remove_sw, axis=1)
I am getting this error: AttributeError: ("'Series' object has no attribute 'split'", 'occurred at index 0')
So, the function is running on the entire series instead of each item in that series one at a time. Is there a parameter in the apply function to fix this or is there another function in Pandas for this?
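One possible fix, sketched under assumptions: calling apply on the DataFrame with axis=1 passes each row as a Series, whereas calling apply on the single 'Document' column passes each string. The stop_words set and the sample sentences below are hypothetical.

```python
import pandas as pd

# Hypothetical stop-word set; the question does not show how it is built.
stop_words = {"the", "is", "a"}

def remove_sw(text):
    # text is a single string here, so .split works.
    return " ".join(w for w in text.split(" ") if w not in stop_words)

# Hypothetical sample data with the two columns described in the question.
corpus_df = pd.DataFrame({
    "Document": ["the cat is here", "a dog barks"],
    "Label": ["cats", "dogs"],
})

# Series.apply calls remove_sw once per element of the column.
corpus_df["Document"] = corpus_df["Document"].apply(remove_sw)
print(corpus_df["Document"].tolist())
```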