Use the str accessor with square brackets:
df['col'] = df['col'].str[:9]
Or str.slice:
df['col'] = df['col'].str.slice(0, 9)
Answer from user2285236 on Stack Overflow
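To see that the two forms above behave the same way, here is a minimal sketch. The sample values are made up for illustration; only the column name 'col' comes from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name 'col' is from the answer.
df = pd.DataFrame({"col": ["abcdefghijkl", "short", np.nan]})

# Both forms keep at most the first nine characters of each string.
by_brackets = df["col"].str[:9]
by_slice = df["col"].str.slice(0, 9)

print(by_brackets.tolist())
print(by_slice.equals(by_brackets))
```

Strings shorter than nine characters are returned unchanged, and missing values stay missing.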
Use a boolean mask to filter your df and then call str and slice the string:
In [77]:
df.loc[(df['Name'] == 'Richard') & (df['Points']==35),'String'].str[3:5]
Out[77]:
1 67
3 38
Name: String, dtype: object
Pandas string methods
You can build a mask from your criteria and then use the pandas string methods:
mask_richard = df.Name == 'Richard'
mask_points = df.Points == 35
df[mask_richard & mask_points].String.str[3:5]
1 67
3 38
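The question's DataFrame is not shown, so here is a self-contained sketch of the mask-then-slice approach from the two answers above. The rows are invented for illustration and chosen so that the matching rows sit at index 1 and 3.

```python
import pandas as pd

# Hypothetical data: the original question's DataFrame is not shown.
df = pd.DataFrame({
    "Name": ["Tom", "Richard", "Harry", "Richard", "Richard"],
    "Points": [35, 35, 20, 35, 10],
    "String": ["abc12x", "abc67x", "abc99x", "abc38x", "abc55x"],
})

# Combine both conditions; the parentheses are needed because & binds
# more tightly than ==.
mask = (df["Name"] == "Richard") & (df["Points"] == 35)

# Select the 'String' column for matching rows, then take the characters
# at positions 3 and 4 of each string.
result = df.loc[mask, "String"].str[3:5]
print(result)
```

Using df.loc[mask, 'String'] selects rows and column in one step, which avoids building an intermediate filtered DataFrame.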
This is perhaps more suited for StackOverflow. I would also use a more descriptive title for the question itself, so that others who face a similar problem are able to find it.
You are seeing that error because of the nan values, which are of type float. While most of the rows in df['location'] contain strings, every nan in the column is a float, and str.index() is not available for floats.
Your check of if df.location[i] == np.nan: is pointless, because np.nan == np.nan is always False due to the very definition of nan. Refer to this question on the topic. Because your check fails, the loop enters the else block and encounters a float object attempting to invoke a string method.
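A short sketch of why the equality check never fires, and of the usual way to test for missing values instead. The sample Series is hypothetical; pd.isna and Series.isna are standard pandas functions.

```python
import numpy as np
import pandas as pd

# nan never compares equal to anything, including itself.
print(np.nan == np.nan)

# pd.isna is the reliable test for a single value.
print(pd.isna(np.nan))

# Hypothetical column with one missing entry.
locations = pd.Series(["(37.78, -122.40)", np.nan])

# Series.isna gives a boolean mask for the whole column at once.
missing = locations.isna()
print(missing.tolist())
```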
In my opinion you are using a very complicated approach to get what you want.
Replace your code with this. It should give you what you are looking for. The pandas .str methods pass nan values through as nan, so no explicit check is needed.
df['location']=df['location'].str.replace(" ","").str.strip('(').str.strip(')')
df['latitude']=df['location'].str.split(',').str[0]
df['longitude']=df['location'].str.split(',').str[1]
I tested this using the following code segment:
import numpy as np
import pandas as pd
df=pd.DataFrame()
df['location']=(
"(37.785719256680785, -122.40852313194863)",
"(37.78733980600732, -122.41063199757738)",
"(37.7946573324287, -122.42232562979227)",
"(37.79595867909168, -122.41557405519474)",
"(37.78315261897309, -122.40950883997789)",
np.nan,
"(37.78615261897309, -122.405550883997789)")
df['location']=df['location'].str.replace(" ", "").str.strip('(').str.strip(')')
df['latitude']=df['location'].str.split(',').str[0]
df['longitude']=df['location'].str.split(',').str[1]
print(df[['latitude','longitude']])
This produces the output:
latitude longitude
0 37.785719256680785 -122.40852313194863
1 37.78733980600732 -122.41063199757738
2 37.7946573324287 -122.42232562979227
3 37.79595867909168 -122.41557405519474
4 37.78315261897309 -122.40950883997789
5 NaN NaN
6 37.78615261897309 -122.405550883997789
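Note that the latitude and longitude columns produced above hold strings, not numbers. If you need floats, one possible variation (not part of the original answer) is to split into two columns with expand=True and convert with pd.to_numeric. This is a sketch using a shortened, hypothetical copy of the test data.

```python
import numpy as np
import pandas as pd

# Shortened copy of the test data, with one missing value.
df = pd.DataFrame({
    "location": ["(37.785719256680785, -122.40852313194863)", np.nan],
})

# Remove spaces, strip the surrounding parentheses, then split on the comma.
# expand=True returns a DataFrame with one column per piece.
parts = (
    df["location"]
    .str.replace(" ", "", regex=False)
    .str.strip("()")
    .str.split(",", expand=True)
)

# to_numeric converts the strings to floats; missing values stay NaN.
df["latitude"] = pd.to_numeric(parts[0])
df["longitude"] = pd.to_numeric(parts[1])
print(df[["latitude", "longitude"]].dtypes)
```

Splitting once and reusing the result also avoids running str.split twice on the same column.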
If I understand this correctly, df['location'][i] is a float, and a float has no index method. You can confirm this with isinstance(df['location'][i], float); the check is not required, but it shows that the value really is a float. The error message says as much: the code expects a string and receives a float.
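Here is a small sketch of that type check on a hypothetical two-row column. The dtype=object argument is an assumption made so that the values are stored as plain Python objects, as in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical column: one string and one missing value, stored as objects.
df = pd.DataFrame({
    "location": pd.Series(["(37.78, -122.40)", np.nan], dtype=object),
})

value = df["location"][1]
print(isinstance(value, float))

# List the Python type of every entry in the column.
types = df["location"].map(type)
print(types.tolist())

# Calling a string method on the float fails with an AttributeError.
try:
    value.index(",")
except AttributeError as exc:
    print(exc)
```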