NumPy arrays are stored as contiguous blocks of memory. They usually have a single datatype (e.g. integers, floats or fixed-length strings) and then the bits in memory are interpreted as values with that datatype.

Creating an array with dtype=object is different. The memory taken by the array is now filled with pointers to Python objects stored elsewhere in memory (much like a Python list is really just a list of pointers to objects, not the objects themselves).
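The difference between the two layouts can be seen from the per-element size of each array. The following is a minimal sketch; the names `names`, `fixed` and `boxed` are illustrative and not from the original post, and the pointer size assumes whatever build of Python is running it.

```python
import struct
import numpy as np

names = ['avinash', 'jay']

fixed = np.array(names)                 # fixed-width unicode dtype
boxed = np.array(names, dtype=object)   # array of pointers to Python str objects

# Fixed-width: every element occupies the width of the longest string,
# at 4 bytes per character (7 characters -> 28 bytes).
print(fixed.dtype, fixed.itemsize)

# Object: every element is one pointer, regardless of string length.
pointer_size = struct.calcsize('P')
print(boxed.dtype, boxed.itemsize, pointer_size)

# The object array holds references to the very same Python strings.
print(boxed[0] is names[0])
```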

Arithmetic operators such as * don't work on arrays such as ar1, which have a fixed-width string dtype (str_ or bytes_; these were called unicode_ and string_ before NumPy 2.0). There are special functions instead (see below). NumPy is just treating the bits in memory as characters, and the * operator doesn't make sense here. However, the line

np.array(['avinash','jay'], dtype=object) * 2

works because now the array is an array of (pointers to) Python strings. The * operator is well defined for these Python string objects. New Python strings are created in memory and a new object array with references to the new strings is returned.
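As a rough illustration of the two behaviours, the sketch below applies * to both kinds of array. The variable names are illustrative, and the exact error message differs between NumPy versions; the sketch only assumes that the error is a TypeError (or a subclass of it).

```python
import numpy as np

fixed = np.array(['avinash', 'jay'])                 # fixed-width string dtype
boxed = np.array(['avinash', 'jay'], dtype=object)   # Python str objects

# Plain * on a fixed-width string array is rejected by NumPy.
try:
    fixed * 2
    error = None
except TypeError as exc:   # the ufunc error is a TypeError subclass
    error = exc

# For object arrays NumPy calls each element's own * operator.
doubled = boxed * 2

print(type(error).__name__)
print(doubled)
```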


If you have an array with a fixed-width string dtype (bytes_ or str_, formerly string_ and unicode_) and want to repeat each string, you can use np.char.multiply:

In [52]: np.char.multiply(ar1, 2)
Out[52]: array(['avinashavinash', 'jayjay'], 
      dtype='<U14')

NumPy has many other vectorised string methods too.
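A few of those methods, as a small sketch on an illustrative array (the array `names` is made up for this example):

```python
import numpy as np

names = np.array(['avinash', 'jay'])

upper = np.char.upper(names)               # upper-case each element
lengths = np.char.str_len(names)           # length of each element
starts = np.char.startswith(names, 'a')    # boolean mask
suffixed = np.char.add(names, '!')         # element-wise concatenation

print(upper, lengths, starts, suffixed)
```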

Answer from Alex Riley on Stack Overflow
2 of 2
4

There are 3 main dtypes to store strings in numpy:

  • object: Stores pointers to Python objects
  • str: Stores fixed-width strings
  • numpy.dtypes.StringDType(): New in numpy 2.0; stores variable-width strings
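One practical consequence of fixed width versus variable width shows up on assignment. This is a minimal sketch with made-up data; the StringDType part is guarded because that dtype is only available in numpy 2.0 and later.

```python
import numpy as np

fixed = np.array(['abc', 'de'])                 # dtype <U3: room for 3 characters
boxed = np.array(['abc', 'de'], dtype=object)

fixed[1] = 'a much longer string'   # silently truncated to 3 characters
boxed[1] = 'a much longer string'   # stored as-is

print(fixed)   # ['abc' 'a m']
print(boxed)

# StringDType only exists in numpy >= 2.0, so guard the lookup.
string_dtype = getattr(getattr(np, 'dtypes', None), 'StringDType', None)
if string_dtype is not None:
    variable = np.array(['abc', 'de'], dtype=string_dtype())
    variable[1] = 'a much longer string'   # stored in full
    print(variable)
```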

str consumes more memory than object; StringDType is better

The ratio depends on the length of the longest string and on the size of the array, but as long as the longest string in the array is longer than 2 characters, str consumes more memory (the two are equal when the longest string is 2 characters long). This is because a str array reserves 4 bytes per character for every element, while an object array stores one pointer per element (8 bytes on a 64-bit build). In the example below, str consumes almost 8 times more memory.

On the other hand, the new (in numpy>=2.0) numpy.dtypes.StringDType stores variable-width strings, so it consumes much less memory than str (about twice as much as object in the example below).

import numpy as np
from pympler.asizeof import asizeof  # third-party package (pympler)

ar1 = np.array(['this is a string', 'string']*1000, dtype=object)                   # pointers to Python strings
ar2 = np.array(['this is a string', 'string']*1000, dtype=str)                      # fixed-width strings
ar3 = np.array(['this is a string', 'string']*1000, dtype=np.dtypes.StringDType())  # variable-width, numpy>=2.0

asizeof(ar2) / asizeof(ar1)  # 7.944444444444445
asizeof(ar3) / asizeof(ar1)  # 1.992063492063492
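The factor of roughly 8 can also be worked out by hand from the array buffers, without pympler. The sketch below uses ndarray.nbytes, which counts the array buffer only; the Python string objects that an object array points to live outside that buffer and are not included, so this is a comparison of the arrays themselves rather than of total memory. The names `words`, `boxed` and `fixed` are illustrative.

```python
import struct
import numpy as np

words = ['this is a string', 'string'] * 1000   # 2000 elements, longest has 16 characters

boxed = np.array(words, dtype=object)
fixed = np.array(words, dtype=str)

pointer_size = struct.calcsize('P')   # 8 on a 64-bit build

# One pointer per element.
print(boxed.nbytes, len(words) * pointer_size)

# 16 characters * 4 bytes per character, for every element.
print(fixed.nbytes, len(words) * 16 * 4)

# Buffer-only ratio; 8.0 on a 64-bit build.
print(fixed.nbytes / boxed.nbytes)
```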

For numpy 1.x, str is slower than object

For numpy>=2.0.0, str is faster than object

Numpy 2.0 introduced a new numpy.strings API with much more performant ufuncs for string operations. A simple test (on numpy 2.2.0) below shows that vectorized string operations on an array of str or StringDType dtype are faster than the same operations on an object dtype array: modestly so for multiply, and several times faster for count.
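Before comparing speed, it may help to confirm that the approaches being compared return the same values. This is a minimal sketch on a two-element array; the fallback to np.char is an assumption added here so that the snippet also runs on numpy 1.x, and the variable names are illustrative.

```python
import numpy as np

# np.strings needs numpy >= 2.0; fall back to the older np.char namespace otherwise.
strings = np.strings if hasattr(np, 'strings') else np.char

words = ['this is a string', 'string']
boxed = np.array(words, dtype=object)
fixed = np.array(words, dtype=str)

# Repeating each string: operator on the object array vs. vectorized function.
doubled_boxed = boxed * 2
doubled_fixed = strings.multiply(fixed, 2)

# Counting a character: Python loop vs. vectorized function.
counts_boxed = np.array([s.count('i') for s in boxed])
counts_fixed = strings.count(fixed, 'i')

print(doubled_fixed, counts_fixed)
```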

import timeit

# ar1 (object), ar2 (str) and ar3 (StringDType) are the arrays defined above.

# Repeat each string twice.
t1 = min(timeit.repeat(lambda: ar1*2, number=1000))
t2a = min(timeit.repeat(lambda: np.strings.multiply(ar2, 2), number=1000))
t2b = min(timeit.repeat(lambda: np.strings.multiply(ar3, 2), number=1000))
print(t2a / t1)   # 0.8786601958427778
print(t2b / t1)   # 0.7311586933668037

# Count occurrences of 'i' in each string.
t3 = min(timeit.repeat(lambda: np.array([s.count('i') for s in ar1]), number=1000))
t4a = min(timeit.repeat(lambda: np.strings.count(ar2, 'i'), number=1000))
t4b = min(timeit.repeat(lambda: np.strings.count(ar3, 'i'), number=1000))

print(t4a / t3)   # 0.13328748153237377
print(t4b / t3)   # 0.3365874412749679

For numpy<2.0.0 (tested on numpy 1.26.0)

Numpy 1.x's vectorized string methods are not optimized, so operating on the object array is often faster. For instance, in the OP's example, where each string is repeated, a simple * (i.e. multiply()) is not only more concise but also over 10 times faster than char.multiply().

import timeit
setup = "import numpy as np; from __main__ import ar1, ar2"
t1 = min(timeit.repeat("ar1*2", setup, number=1000))
t2 = min(timeit.repeat("np.char.multiply(ar2, 2)", setup, number=1000))
t2 / t1   # 10.650433758517027

Even for functions that cannot readily be applied to the array as a whole, it is faster to loop over the object array and work on the Python strings than to use the vectorized char method on the str array.

For example, iterating over the object array and calling str.count() on each Python string is over 3 times faster than the vectorized char.count() on the str array.

f1 = lambda: np.array([s.count('i') for s in ar1])
f2 = lambda: np.char.count(ar2, 'i')

# f3 is defined further down, so it cannot be imported here yet
setup = "import numpy as np; from __main__ import ar1, ar2, f1, f2"
t3 = min(timeit.repeat("f1()", setup, number=1000))
t4 = min(timeit.repeat("f2()", setup, number=1000))

t4 / t3   # 3.251369161574832

On a side note, when an explicit loop is needed, iterating over a list is faster than iterating over a numpy array. So in the previous example, a further performance gain can be made by converting the array to a list first:

f3 = lambda: np.array([s.count('i') for s in ar1.tolist()])
#                                               ^^^^^^^^^  <--- convert to list here
setup = "import numpy as np; from __main__ import ar1, f3"
t5 = min(timeit.repeat("f3()", setup, number=1000))
t3 / t5   # 1.2623498005294627
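Since the ranking flips between numpy 1.x and 2.x, it is probably worth measuring on your own installation. The following is a self-contained sketch rather than a rigorous benchmark: the `candidates` dictionary, the repeat counts and the np.char fallback are choices made for this example, not part of the measurements above.

```python
import timeit
import numpy as np

words = ['this is a string', 'string'] * 1000
boxed = np.array(words, dtype=object)
fixed = np.array(words, dtype=str)

# np.strings needs numpy >= 2.0; np.char is the older namespace.
strings = np.strings if hasattr(np, 'strings') else np.char

candidates = {
    'loop over object array': lambda: np.array([s.count('i') for s in boxed]),
    'loop over list': lambda: np.array([s.count('i') for s in boxed.tolist()]),
    'vectorized on str array': lambda: strings.count(fixed, 'i'),
}

# Best of several repeats for each candidate.
timings = {
    name: min(timeit.repeat(func, number=20, repeat=3))
    for name, func in candidates.items()
}

print('numpy', np.__version__)
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f'{name}: {seconds:.4f} s')
```

The absolute numbers depend on the machine; only the ordering printed at the end is meaningful.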