dtype python numpy - Brave Search

What does dtype=object mean while creating a numpy array?

stackoverflow.com › questions › 29877508 › what-does-dtype-object-mean-while-creating-a-numpy-array

NumPy arrays are stored as contiguous blocks of memory. They usually have a single datatype (e.g. integers, floats or fixed-length strings) and then the bits in memory are interpreted as values with that datatype.

Creating an array with dtype=object is different. The memory taken by the array now is filled with pointers to Python objects which are being stored elsewhere in memory (much like a Python list is really just a list of pointers to objects, not the objects themselves).

Arithmetic operators such as * don't work with arrays such as ar1 which have a string_ datatype (there are special functions instead - see below). NumPy is just treating the bits in memory as characters and the * operator doesn't make sense here. However, the line

np.array(['avinash','jay'], dtype=object) * 2

works because now the array is an array of (pointers to) Python strings. The * operator is well defined for these Python string objects. New Python strings are created in memory and a new object array with references to the new strings is returned.
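A minimal sketch of the behaviour described above, assuming only that NumPy is installed (the variable names are illustrative, not from the answer). It suggests that an object array hands back ordinary Python str objects, which is why * falls back to Python's own string repetition.

```python
import numpy as np

# An object array holds references to ordinary Python objects.
names = np.array(['avinash', 'jay'], dtype=object)

# Each element is a plain Python str, not a fixed-width NumPy string.
element_type = type(names[0])

# '*' is applied element by element using Python's str repetition.
doubled = names * 2

print(element_type)      # <class 'str'>
print(doubled.tolist())  # ['avinashavinash', 'jayjay']
print(doubled.dtype)     # object
```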

If you have an array with string_ or unicode_ dtype (aliases of bytes_ and str_ that were removed in NumPy 2.0) and want to repeat each string, you can use np.char.multiply:

In [52]: np.char.multiply(ar1, 2)
Out[52]: array(['avinashavinash', 'jayjay'], 
      dtype='<U14')

NumPy has many other vectorised string methods too.
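As a rough illustration of those other vectorised string methods, here is a small sketch using three functions from the np.char namespace. The array and variable names are made up for the example, and in NumPy 2.x the newer np.strings namespace may be the preferred home for these operations.

```python
import numpy as np

# A fixed-width unicode array (each element has room for 7 characters).
fixed = np.array(['avinash', 'jay'])

upper = np.char.upper(fixed)             # element-wise str.upper
lengths = np.char.str_len(fixed)         # element-wise len
starts_a = np.char.startswith(fixed, 'a')  # element-wise str.startswith

print(upper.tolist())     # ['AVINASH', 'JAY']
print(lengths.tolist())   # [7, 3]
print(starts_a.tolist())  # [True, False]
```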

Answer from Alex Riley on Stack Overflow

numpy.org › doc › stable › reference › arrays.dtypes.html

Data type objects (dtype) — NumPy v2.4 Manual

>>> dt = np.dtype(('i4', [('r','u1'),('g','u1'),('b','u1'),('a','u1')])) ... When checking for a specific data type, use == comparison. ... As opposed to Python types, a comparison using is should not be used. First, NumPy treats data type specifications (everything that can be passed to the dtype constructor) as equivalent to the data type object itself.
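The advice to compare dtypes with == rather than is can be sketched as follows. This is an illustrative example with made-up variable names, not code from the NumPy manual.

```python
import numpy as np

arr = np.zeros(3)  # the default floating-point dtype is float64

# '==' accepts anything that can be passed to the dtype constructor.
same_as_type = arr.dtype == np.float64
same_as_name = arr.dtype == 'float64'
same_as_code = arr.dtype == 'f8'

# 'is' compares object identity: a dtype instance is not the scalar type.
identity_check = arr.dtype is np.float64

print(same_as_type, same_as_name, same_as_code)  # True True True
print(identity_check)                            # False
```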

numpy.org › doc › 2.1 › reference › generated › numpy.dtype.html

numpy.dtype — NumPy v2.1 Manual

A numpy array is homogeneous, and contains elements described by a dtype object.

w3schools.com › python › numpy › numpy_data_types.asp

NumPy Data Types

import numpy as np
arr = np.array([1, 2, 3, 4], dtype='i4')
print(arr)
print(arr.dtype)

If a type is given to which the elements cannot be cast, NumPy will raise a ValueError. In Python, a ValueError is raised when an argument passed to a function has the right type but an inappropriate value.
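A short sketch of the failure case mentioned above, assuming a list of strings as input (the input values and variable names are chosen for illustration). Strings that look like integers convert cleanly, while a non-numeric string triggers the ValueError.

```python
import numpy as np

# Numeric strings can be converted to 32-bit integers.
ok = np.array(['1', '2', '3'], dtype='i4')

# A non-numeric string cannot, so NumPy raises ValueError.
try:
    np.array(['a', '2', '3'], dtype='i4')
    raised = False
except ValueError as exc:
    raised = True
    message = str(exc)

print(ok)      # [1 2 3]
print(raised)  # True
```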

geeksforgeeks.org › python › data-type-object-dtype-numpy-python

Data type Object (dtype) in NumPy Python - GeeksforGeeks

January 19, 2026 - In NumPy, dtype defines the type of data stored in an array and how much memory each value uses. It controls how raw memory bytes are interpreted, making NumPy operations fast and efficient.
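To make the link between dtype and memory use concrete, the sketch below stores the same four values with two integer dtypes and inspects itemsize and nbytes. The variable names are illustrative.

```python
import numpy as np

small = np.arange(4, dtype=np.int8)   # 1 byte per value
large = np.arange(4, dtype=np.int64)  # 8 bytes per value

print(small.itemsize, small.nbytes)  # 1 4
print(large.itemsize, large.nbytes)  # 8 32
```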

python-course.eu › numerical-programming › numpy-data-objects-dtype.php

3. Numpy Data Objects, dtype | Numerical Programming

To demonstrate how structured NumPy arrays with the same shape and fields can be compared, we will create a new array containing the population data from 1995:

import numpy as np

# Define the structured dtype
dt = np.dtype([
    ('country', 'U20'),
    ('density', 'i4'),
    ('area', 'i4'),
    ('population', 'i4')
])

# 1995 population data
population_table_1995 = np.array([
    ('Netherlands', 462, 33720, 15_565_032),
    ('Belgium', 332, 30510, 10_137_265),
    ('United Kingdom', 239, 243610, 58_154_634),
    ('Germany', 235, 348560, 82_019_890),
    ...
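The excerpt above is cut off, so here is a self-contained sketch that uses only the four rows visible in it. The name table and the filtering step are assumptions added for illustration; they are not taken from the python-course.eu page.

```python
import numpy as np

# Structured dtype with the same fields as in the excerpt.
dt = np.dtype([
    ('country', 'U20'),
    ('density', 'i4'),
    ('area', 'i4'),
    ('population', 'i4'),
])

# Only the four rows visible in the excerpt are used here.
table = np.array([
    ('Netherlands', 462, 33720, 15_565_032),
    ('Belgium', 332, 30510, 10_137_265),
    ('United Kingdom', 239, 243610, 58_154_634),
    ('Germany', 235, 348560, 82_019_890),
], dtype=dt)

# A field name selects one column as an ordinary array.
populations = table['population']

# Boolean masks built from one field can filter whole records.
dense = table[table['density'] > 300]['country']

# Arrays with the same structured dtype compare record by record.
same = table == table.copy()

print(populations.dtype)  # int32
print(dense.tolist())     # ['Netherlands', 'Belgium']
print(same.all())         # True
```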

stackoverflow.com › questions › 29877508 › what-does-dtype-object-mean-while-creating-a-numpy-array

python - What does dtype=object mean while creating a numpy array? - Stack Overflow

There are 3 main dtypes to store strings in numpy:

object: Stores pointers to Python objects
str: Stores fixed-width strings
numpy.dtypes.StringDType(): New in numpy 2.0 and stores variable-width strings

`str` consumes more memory than `object`; `StringDType` is better

The ratio depends on the length of the longest string and the size of the array, because a fixed-width str dtype reserves 4 bytes per character for every element while an object array stores one pointer (8 bytes on a 64-bit build) per element. As long as the longest string in the array is longer than 2 characters, str consumes more memory (they are equal when the longest string in the array is 2 characters long). In the example below, str consumes almost 8 times more memory.

On the other hand, the new (in numpy>=2.0) numpy.dtypes.StringDType stores variable-width strings, so it consumes much less memory than str.

import numpy as np
from pympler.asizeof import asizeof

ar1 = np.array(['this is a string', 'string']*1000, dtype=object)
ar2 = np.array(['this is a string', 'string']*1000, dtype=str)
ar3 = np.array(['this is a string', 'string']*1000, dtype=np.dtypes.StringDType())

asizeof(ar2) / asizeof(ar1)  # 7.944444444444445
asizeof(ar3) / asizeof(ar1)  # 1.992063492063492
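The roughly 8x figure can be cross-checked without pympler by looking at the array buffers directly. This is a sketch under the assumption that only the array's own buffer is of interest: nbytes does not count the Python string objects that an object array points to. The names words, as_object and as_str are illustrative.

```python
import numpy as np

words = ['this is a string', 'string'] * 1000
as_object = np.array(words, dtype=object)
as_str = np.array(words, dtype=str)

# Fixed-width unicode: 4 bytes per character, sized for the longest string (16).
str_item = as_str.dtype.itemsize

# Object dtype: one pointer per element (8 bytes on a 64-bit build).
obj_item = as_object.dtype.itemsize

# Ratio of the two buffers; 8.0 on a 64-bit build.
ratio = as_str.nbytes / as_object.nbytes

print(str_item, obj_item, ratio)
```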

For numpy 1.x, `str` is slower than `object`

For numpy>=2.0.0, `str` is faster than `object`

Numpy 2.0 has introduced a new numpy.strings API that has much more performant ufuncs for string operations. A simple test (on numpy 2.2.0) below shows that vectorized string operations on an array of str or StringDType dtype are faster than the same operations on an object dtype array (modestly for multiply, several times faster for count).

import timeit

t1 = min(timeit.repeat(lambda: ar1*2, number=1000))
t2a = min(timeit.repeat(lambda: np.strings.multiply(ar2, 2), number=1000))
t2b = min(timeit.repeat(lambda: np.strings.multiply(ar3, 2), number=1000))
print(t2a / t1)   # 0.8786601958427778
print(t2b / t1)   # 0.7311586933668037

t3 = min(timeit.repeat(lambda: np.array([s.count('i') for s in ar1]), number=1000))
t4a = min(timeit.repeat(lambda: np.strings.count(ar2, 'i'), number=1000))
t4b = min(timeit.repeat(lambda: np.strings.count(ar3, 'i'), number=1000))

print(t4a / t3)   # 0.13328748153237377
print(t4b / t3)   # 0.3365874412749679
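For readers who only want to see what the two benchmarked operations return, here is a small sketch. It assumes nothing about timing, and it falls back to the older np.char namespace when np.strings is unavailable (NumPy 1.x); the alias strings_api is a name invented for this example.

```python
import numpy as np

# np.strings exists from NumPy 2.0 onward; np.char offers the same two
# functions on older versions, so fall back to it when needed.
strings_api = getattr(np, 'strings', np.char)

arr = np.array(['this is a string', 'string'])

repeated = strings_api.multiply(arr, 2)  # repeat every string twice
i_counts = strings_api.count(arr, 'i')   # occurrences of 'i' per string

print(repeated.tolist())
print(i_counts.tolist())  # [3, 1]
```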

For numpy<2.0.0 (tested on numpy 1.26.0)

Numpy 1.x's vectorized string methods are not optimized, so operating on the object array is often faster. For instance, in the OP's example where each string is repeated, a simple * (aka multiply()) is not only more concise but also over 10 times faster than char.multiply().

import timeit
setup = "import numpy as np; from __main__ import ar1, ar2"
t1 = min(timeit.repeat("ar1*2", setup, number=1000))
t2 = min(timeit.repeat("np.char.multiply(ar2, 2)", setup, number=1000))
t2 / t1   # 10.650433758517027

Even for functions that cannot readily be applied to the array, it is faster to loop over the object array and work on the Python strings than to use the vectorized char methods on str arrays.

For example, iterating over the object array and calling str.count() on each Python string is over 3 times faster than the vectorized char.count() on the str array.

f1 = lambda: np.array([s.count('i') for s in ar1])
f2 = lambda: np.char.count(ar2, 'i')

setup = "import numpy as np; from __main__ import ar1, ar2, f1, f2"
t3 = min(timeit.repeat("f1()", setup, number=1000))
t4 = min(timeit.repeat("f2()", setup, number=1000))

t4 / t3   # 3.251369161574832

On a side note, when an explicit loop is needed, iterating over a list is faster than iterating over a numpy array. So in the previous example, a further performance gain can be made by converting the array to a list before looping:

f3 = lambda: np.array([s.count('i') for s in ar1.tolist()])
#                                               ^^^^^^^^^  <--- convert to list here
setup = "import numpy as np; from __main__ import ar1, f3"
t5 = min(timeit.repeat("f3()", setup, number=1000))
t3 / t5   # 1.2623498005294627

numpy.org › doc › 2.2 › reference › generated › numpy.ndarray.dtype.html

numpy.ndarray.dtype — NumPy v2.2 Manual

Setting will replace the dtype without modifying the memory (see also ndarray.view and ndarray.astype).
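The difference between reinterpreting memory and converting values can be sketched with ndarray.view and ndarray.astype, the two methods the manual points to. The example uses an explicitly little-endian dtype so that the byte layout is the same on every machine; the variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2], dtype='<i2')  # two little-endian 16-bit integers

# view: same 4 bytes, reinterpreted as unsigned 8-bit integers.
viewed = a.view(np.uint8)

# astype: a new array whose values are converted to float32.
converted = a.astype(np.float32)

print(viewed.tolist())                 # [1, 0, 2, 0]
print(converted.tolist())              # [1.0, 2.0]
print(np.shares_memory(a, viewed))     # True
print(np.shares_memory(a, converted))  # False
```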

datacamp.com › doc › numpy › data-types

NumPy Data Types

The `dtype` in NumPy is used to specify the desired data type for the elements of an array. This can be crucial for optimizing performance and for ensuring compatibility and interoperability with other data processing operations, systems, and libraries.

import numpy as np
array ...

numpy.org › doc › stable › › reference › generated › numpy.dtype.html

numpy.dtype — NumPy v2.4 Manual

A numpy array is homogeneous, and contains elements described by a dtype object.

numpy.org › doc › stable › user › basics.types.html

Data types — NumPy v2.4 Manual

In addition to numerical types, NumPy also supports storing unicode strings, via the numpy.str_ dtype (U character code), null-terminated byte sequences via numpy.bytes_ (S character code), and arbitrary byte sequences, via numpy.void (V character code).
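A brief sketch of the three character codes named above, using a length of 5 chosen arbitrarily for the example. It inspects which scalar type and item size each code produces.

```python
import numpy as np

u = np.dtype('U5')  # unicode string, 5 characters
s = np.dtype('S5')  # byte string, 5 bytes
v = np.dtype('V5')  # raw (void) data, 5 bytes

print(u.type is np.str_, u.itemsize)    # True 20  (4 bytes per character)
print(s.type is np.bytes_, s.itemsize)  # True 5
print(v.type is np.void, v.itemsize)    # True 5
```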

numpy.org › doc › 2.1 › reference › arrays.dtypes.html

Data type objects (dtype) — NumPy v2.1 Manual

NumPy allows a modification on the format in that any string that can uniquely identify the type can be used to specify the data-type in a field. The generated data-type fields are named 'f0', 'f1', …, 'f<N-1>' where N (>1) is the number of comma-separated basic formats in the string.
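The automatic field naming can be seen with a comma-separated format string. The particular formats below are picked only for illustration.

```python
import numpy as np

# Three comma-separated basic formats produce fields f0, f1 and f2.
dt = np.dtype('i4, f8, S3')

print(dt.names)     # ('f0', 'f1', 'f2')
print(dt['f1'])     # float64
print(dt.itemsize)  # 15  (4 + 8 + 3 bytes, packed without padding)
```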

note.nkmk.me › home › python › numpy

NumPy: astype() to change dtype of an array | note.nkmk.me

February 4, 2024 - NumPy arrays (ndarray) hold a data type (dtype). You can set this through various operations, such as when creating an ndarray with np.array(), or change it later with astype(). Data type objects (dty ...

numpy.org › devdocs › reference › generated › numpy.dtype.html

numpy.dtype — NumPy v2.5.dev0 Manual

Returns dtype for the base element of the subarrays, regardless of their dimension or shape.
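The sentence above appears to describe the base attribute of a subarray dtype. A minimal sketch, assuming a 2x3 subarray of 32-bit integers chosen for illustration:

```python
import numpy as np

# A subarray dtype: each item is itself a 2x3 block of int32 values.
sub = np.dtype((np.int32, (2, 3)))

print(sub.base)      # int32
print(sub.shape)     # (2, 3)
print(sub.itemsize)  # 24  (6 elements of 4 bytes each)
```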

w3resource.com › numpy › data-type-routines › dtype.php

NumPy Data type: dtype() function - w3resource

A dtype object can be constructed from different combinations of fundamental numeric types. ...

>>> import numpy as np
>>> np.dtype(np.int16)
dtype('int16')
>>> np.dtype([('f1', np.int16)])
dtype([('f1', '<i2')])
>>> np.dtype([('f1', [('f1', np.int16)])])
dtype([('f1', [('f1', '<i2')])])
>>> np.dtype([('f1', np.uint), ('f2', np.int32)])
dtype([('f1', '<u8'), ('f2', '<i4')])
>>> np.dtype([('a','f8'),('b','S10')])
dtype([('a', '<f8'), ('b', 'S10')])
>>> np.dtype("i4, (2,3)f8")
dtype([('f0', '<i4'), ('f1', '<f8', (2, 3))])
>>> np.dtype([('hello',(int,3)),('world',np.void,10)])
dtype([('hello', '<

numpy.org › devdocs › reference › arrays.dtypes.html

Data type objects (dtype) — NumPy v2.5.dev0 Manual

A data type object (an instance of numpy.dtype class) describes how the bytes in the fixed-size block of memory corresponding to an array item should be interpreted. It describes the following aspects of the data: Type of the data (integer, float, Python object, etc.)

tutorialspoint.com › numpy › numpy_data_types.htm

NumPy - Data Types

Original array: [1 2 3 4 5]
Original dtype: int64
Converted array: [1. 2. 3. 4. 5.]
Converted dtype: float32

NumPy also provides functions for casting arrays to specific types.
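The output shown above could have been produced by code along these lines. This is a reconstruction that assumes astype was used and that the original array was created as int64; the tutorial's actual code is not part of the excerpt.

```python
import numpy as np

original = np.array([1, 2, 3, 4, 5], dtype=np.int64)

# astype returns a converted copy; the original array is left unchanged.
converted = original.astype(np.float32)

print("Original array:", original)
print("Original dtype:", original.dtype)
print("Converted array:", converted)
print("Converted dtype:", converted.dtype)
```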

numpy.org › devdocs › user › basics.types.html

Data types — NumPy v2.5.dev0 Manual

In addition to numerical types, NumPy also supports storing unicode strings, via the numpy.str_ dtype (U character code), null-terminated byte sequences via numpy.bytes_ (S character code), and arbitrary byte sequences, via numpy.void (V character code).

numpy.org › doc › 2.2 › reference › generated › numpy.dtype.html

numpy.dtype — NumPy v2.2 Manual

A numpy array is homogeneous, and contains elements described by a dtype object.

Python for Data Science

python4data.science › en › latest › workspace › numpy › dtype.html

dtype - Python for Data Science

ndarray is a container for homogeneous data, i.e. all elements must be of the same type. Each array has a dtype, an object that describes the data type of the array.

stackoverflow.com › questions › 32265718 › dtype-parameter-in-numpy-array

python - dtype parameter in numpy.array() - Stack Overflow

A python list can contain disparately typed objects, e.g. X = ['apples', 'oranges',10]. If you do type([10]) you'll see the Python type for the container is technically called a list, not an array.

In contrast, in a numpy array all objects are of the same type, the dtype.

The docs are telling you that on creation of a numpy array, the dtype is set to the type that will hold all of the existing objects.

See, look:

the type will be determined as the minimum type required to hold the objects in the sequence

The writers perhaps should have added "and not their values"

We can make a uint8 ten easily enough:

ten = np.uint8(10)

If that is put into a Python list, it retains its type because Python lists preserve types. If that list is sent to numpy.array() to make a numpy array, then the numpy array will use dtype np.uint8 because it is big enough to hold all (1) of the pre-existing Python list objects.

In [49]: np.array([ten]).dtype
Out[49]: dtype('uint8')

But if we use a literal 10, Python will create an int object for it instead of an np.uint8, because np.uint8 is specific to numpy and a literal 10 simply asks Python to create that number.

If we make a Python list containing a literal 10, we duplicate your result (with machine-architecture ints):

In [50]: np.array([10]).dtype
Out[50]: dtype('int64')

And if we put the two types together in a python list, and send that list to np.array for creation of a numpy array, then the dtype must be big enough to hold both objects, in this case int64.

In [51]: np.array([ten, 10]).dtype
Out[51]: dtype('int64')
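If a small integer type is wanted for plain Python literals, the dtype can be requested explicitly instead of relying on inference. The sketch below contrasts the inferred and explicit cases; np.dtype(int) stands in for the platform's default integer, which is int64 in the outputs above.

```python
import numpy as np

ten = np.uint8(10)

from_scalar = np.array([ten]).dtype              # keeps the NumPy scalar's type
from_literal = np.array([10]).dtype              # platform default integer
explicit = np.array([10], dtype=np.uint8).dtype  # requested explicitly

print(from_scalar)   # uint8
print(from_literal)
print(explicit)      # uint8
```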