Since string data has variable length, pandas stores it as object dtype by default. If you want to store it as a fixed-width string type instead, you can do something like this:
df['column'] = df['column'].astype('|S80')  # maximum length fixed at 80 bytes
or alternatively
df['column'] = df['column'].astype('|S')  # length defaults to the longest value it encounters
Note that 'S' is NumPy's byte-string type code, so under Python 3 the values become bytes (b'...') rather than str.
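As an editorial side note to the snippet above, here is a minimal sketch of what the `S` type code does at the NumPy level. The sample values are made up, and how pandas wraps the result may differ between versions, so this only illustrates the underlying NumPy behaviour.

```python
import numpy as np

# Made-up sample values; any short ASCII strings would do.
values = np.array(["abc", "de"])

inferred = values.astype("S")    # width taken from the longest element
truncated = values.astype("S2")  # explicit width; longer values are cut off

print(inferred.dtype, inferred.tolist())
print(truncated.dtype, truncated.tolist())
```

The elements come back as bytes, and a width that is too small silently truncates, which is usually not what an API expecting str wants.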
Answer from Siraj S. on Stack Overflow

Unable to convert a pandas object to a string in my DataFrame
Trying to use the YouTube API to pull through some videos for data analysis and am currently using just two videos in a dataframe to play around with the functionality as I'm new to all of this.
I'm using another API to get the transcripts for each video but I need to input the video_id into that API to get transcripts for each video.
The only problem is that everything is stored as an object, and whenever I try .astype(str) or something similar, the dtype is still reported as object. That means I can't do anything with the data when the other API requires a string argument.
This is what I get when calling .info() on my dataframe:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 10 columns):
 #   Column                Non-Null Count  Dtype
---  ------                --------------  -----
 0   video_id              2 non-null      object
 1   publishedAt           2 non-null      object
 2   channelId             2 non-null      object
 3   title                 2 non-null      object
 4   description           2 non-null      object
 5   channelTitle          2 non-null      object
 6   tags                  2 non-null      object
 7   categoryId            2 non-null      object
 8   liveBroadcastContent  2 non-null      object
 9   defaultAudioLanguage  2 non-null      object
dtypes: object(10)
memory usage: 288.0+ bytes
Any help, or an explanation of how these issues are usually handled, would be really appreciated.
Did you try assigning it back to the column?
df['column'] = df['column'].astype('str')
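A minimal sketch of why the object label need not block the second API call: the dtype describes the column as a whole, while each element can still be an ordinary Python str. The `fetch_transcript` function and the IDs below are hypothetical placeholders, not the real transcript API.

```python
import pandas as pd


def fetch_transcript(video_id: str) -> str:
    """Hypothetical stand-in for the transcript API; it only accepts str."""
    if not isinstance(video_id, str):
        raise TypeError("video_id must be a str")
    return f"transcript for {video_id}"


# Made-up IDs, not real videos.
df = pd.DataFrame({"video_id": ["abc123", "xyz789"]})
df["video_id"] = df["video_id"].astype(str)

# Pass one value at a time rather than the whole column.
transcripts = [fetch_transcript(v) for v in df["video_id"]]
print(transcripts)
```

If the real API raises a type error, a likely cause is passing the Series itself instead of its individual values.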
Referring to this question: a pandas DataFrame stores pointers to the strings rather than the strings themselves, which is why the dtype is reported as 'object'. As per the docs, you could try:
df['column_new'] = df['column'].str.split(',')
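To illustrate the point, here is a small sketch showing that the `.str` accessor already works on an object column, and that pandas 1.0 and later also offer an opt-in dedicated "string" dtype. The tag values are made up, and the column is created with an explicit object dtype to mirror the question.

```python
import pandas as pd

# Made-up comma-separated tags, stored as object dtype like in the question.
tags = pd.Series(["python,pandas", "api"], dtype=object)

# String methods work even though the dtype is object.
split_tags = tags.str.split(",")
print(split_tags.tolist())

# Opt in to the dedicated string extension dtype (pandas >= 1.0).
as_string = tags.astype("string")
print(tags.dtype, as_string.dtype)
```

The dedicated dtype mainly makes the column's intent explicit; the values themselves are the same text either way.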