b'...' means it's a byte-string and the default dtype for arrays of strings depends on the kind of strings. Unicodes (python 3 strings are unicode) are U and Python 2 str or Python 3 bytes have the dtype S. You can find the explanation of dtypes in the NumPy documentation here
Array-protocol type strings
The first character specifies the kind of data and the remaining characters specify the number of bytes per item, except for Unicode, where it is interpreted as the number of characters. The item size must correspond to an existing type, or an error will be raised. The supported kinds are:
- '?' boolean
- 'b' (signed) byte
- 'B' unsigned byte
- 'i' (signed) integer
- 'u' unsigned integer
- 'f' floating-point
- 'c' complex-floating point
- 'm' timedelta
- 'M' datetime
- 'O' (Python) objects
- 'S', 'a' zero-terminated bytes (not recommended)
- 'U' Unicode string
- 'V' raw data (void)
However in your first case you actually forced NumPy to convert it to bytes because you specified dtype='S'.
b'...' means it's a byte-string and the default dtype for arrays of strings depends on the kind of strings. Unicodes (python 3 strings are unicode) are U and Python 2 str or Python 3 bytes have the dtype S. You can find the explanation of dtypes in the NumPy documentation here
Array-protocol type strings
The first character specifies the kind of data and the remaining characters specify the number of bytes per item, except for Unicode, where it is interpreted as the number of characters. The item size must correspond to an existing type, or an error will be raised. The supported kinds are:
- '?' boolean
- 'b' (signed) byte
- 'B' unsigned byte
- 'i' (signed) integer
- 'u' unsigned integer
- 'f' floating-point
- 'c' complex-floating point
- 'm' timedelta
- 'M' datetime
- 'O' (Python) objects
- 'S', 'a' zero-terminated bytes (not recommended)
- 'U' Unicode string
- 'V' raw data (void)
However in your first case you actually forced NumPy to convert it to bytes because you specified dtype='S'.
Since NumPy version 2.0 a new numpy.dtypes.StringDType is available.
Often, real-world string data does not have a predictable length. In these cases it is awkward to use fixed-width strings, since storing all the data without truncation requires knowing the length of the longest string one would like to store in the array before the array is created.
To support situations like this, NumPy provides numpy.dtypes.StringDType, which stores variable-width string data in a UTF-8 encoding in a NumPy array:
from numpy.dtypes import StringDType
data = ["this is a longer string", "short string"]
arr = np.array(data, dtype=StringDType())
arr
array(['this is a longer string', 'short string'], dtype=StringDType())
Note that unlike fixed-width strings, StringDType is not parameterized by the maximum length of an array element, arbitrarily long or short strings can live in the same array without needing to reserve storage for padding bytes in the short strings.
I am following this guide on working with text data types. From there, I cobbled the following:
import pandas as pd
# "Int64" dtype for both series and element therein
#--------------------------------------------------
s1 = pd.Series([1, 2, np.nan], dtype="Int64")
s1
0 1
1 2
2 <NA>
dtype: Int64
type(s1[0])
numpy.int64
# "string" dtype for series vs. "str" dtype for element therein
#--------------------------------------------------------------
s2 = s1.astype("string")
s2
Out[13]:
0 1
1 2
2 <NA>
dtype: string
type(s2[0])
str
For Int64 series s1, the series type matches the type of the element therein (other than inconsistent case).
For string series s2, the elements therein of a completely different type str. From web browsing, I know that str is the native Python string type while string is the pandas string type. My web browsings further indicate that the pandas string type is the native Python string type (as opposed to the fixed-length mutable string type of NumPy).
In that case, why is there a different name (string vs. str) and why do the names differ in the last two lines of output above? My (possibly wrong) understanding is that the dtype shown for a series reflects the type of the elements therein.
The dtype object comes from NumPy, it describes the type of element in a ndarray. Every element in an ndarray must have the same size in bytes. For int64 and float64, they are 8 bytes. But for strings, the length of the string is not fixed. So instead of saving the bytes of strings in the ndarray directly, Pandas uses an object ndarray, which saves pointers to objects; because of this the dtype of this kind ndarray is object.
Here is an example:
- the int64 array contains 4 int64 value.
- the object array contains 4 pointers to 3 string objects.

@HYRY's answer is great. I just want to provide a little more context..
Arrays store data as contiguous, fixed-size memory blocks. The combination of these properties together is what makes arrays lightning fast for data access. For example, consider how your computer might store an array of 32-bit integers, [3,0,1].

If you ask your computer to fetch the 3rd element in the array, it'll start at the beginning and then jump across 64 bits to get to the 3rd element. Knowing exactly how many bits to jump across is what makes arrays fast.
Now consider the sequence of strings ['hello', 'i', 'am', 'a', 'banana']. Strings are objects that vary in size, so if you tried to store them in contiguous memory blocks, it'd end up looking like this.

Now your computer doesn't have a fast way to access a randomly requested element. The key to overcoming this is to use pointers. Basically, store each string in some random memory location, and fill the array with the memory address of each string. (Memory addresses are just integers.) So now, things look like this

Now, if you ask your computer to fetch the 3rd element, just as before, it can jump across 64 bits (assuming the memory addresses are 32-bit integers) and then make one extra step to go fetch the string.
The challenge for NumPy is that there's no guarantee the pointers are actually pointing to strings. That's why it reports the dtype as 'object'.
Shamelessly gonna plug my own course on NumPy where I originally discussed this.