First off, the code you're learning from is flawed. It almost certainly doesn't do what the original author thought it did based on the comments in the code.

What the author probably meant was this:

def to_1d(array):
    """prepares an array into a 1d real vector"""
    return array.astype(np.float64).ravel()

However, if array is always going to be an array of complex numbers, then the original code makes some sense.

The only cases where viewing the array (a.dtype = 'float64' is equivalent to doing a = a.view('float64')) would double its size are a complex array (numpy.complex128) or a 128-bit floating point array. For any other dtype, it doesn't make much sense.

For the specific case of a complex array, the original code would convert something like np.array([0.5+1j, 9.0+1.33j]) into np.array([0.5, 1.0, 9.0, 1.33]).

A cleaner way to write that would be:

def complex_to_interleaved_real(array):
    """prepares a complex array into an "interleaved" 1d real vector"""
    return array.copy().view('float64').ravel()

(I'm ignoring the part about returning the original dtype and shape, for the moment.)
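
For what it's worth, here's a rough sketch of how the round trip might look if you do want the original dtype and shape back. The inverse function interleaved_real_to_complex is a hypothetical name of my own, and the whole thing assumes the input is complex128:

```python
import numpy as np


def complex_to_interleaved_real(array):
    """Return a 1d float64 copy laid out as real0, imag0, real1, imag1, ..."""
    # copy() gives a fresh C-contiguous buffer, so reinterpreting it is safe
    return array.copy().view(np.float64).ravel()


def interleaved_real_to_complex(vector, shape):
    """Hypothetical inverse: read float pairs back as complex128, restore shape."""
    return vector.view(np.complex128).reshape(shape)


z = np.array([[0.5 + 1j, 9.0 + 1.33j], [2.0 - 3j, 0.0 + 0j]])
flat = complex_to_interleaved_real(z)
restored = interleaved_real_to_complex(flat, z.shape)

print(flat.shape)                   # (8,)
print(np.array_equal(restored, z))  # True
```

The shape has to be carried along separately, because the flat vector no longer knows it came from a 2x2 array.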


Background on numpy arrays

To explain what's going on here, you need to understand a bit about what numpy arrays are.

A numpy array consists of a "raw" memory buffer that is interpreted as an array through "views". You can think of all numpy arrays as views.

Views, in the numpy sense, are just a different way of slicing and dicing the same memory buffer without making a copy.

A view has a shape, a data type (dtype), an offset, and strides. Where possible, indexing/reshaping operations on a numpy array will just return a view of the original memory buffer.

This means that things like y = x.T or y = x[::2] don't use any extra memory, and don't make copies of x.
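
A quick way to convince yourself of this is to check whether two arrays share memory. This is just a small sketch; the dtype is pinned to int64 only so that the strides are the same on every platform:

```python
import numpy as np

x = np.arange(10, dtype=np.int64)
y = x[::2]                     # every other element, as a view

print(np.shares_memory(x, y))  # True
print(x.strides, y.strides)    # (8,) (16,)  -> y just steps twice as far
print(y.base is x)             # True: y borrows x's buffer

y[0] = 99                      # writing through the view...
print(x[0])                    # 99  ...changes the original
```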

So, if we have an array similar to this:

import numpy as np
x = np.array([1,2,3,4,5,6,7,8,9,10])

We could reshape it by doing either:

x = x.reshape((2, 5))

or

x.shape = (2, 5)

For readability, the first option is better. They're (almost) exactly equivalent, though. Neither one will make a copy that will use up more memory (the first will result in a new Python object, but that's beside the point at the moment).
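
As a small sketch of the "new python object, same memory" point (the variable names r and c are mine), this also shows a case where reshape can't return a view and has to copy:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
r = x.reshape((2, 5))          # new array object over the same buffer
c = r.T.reshape(10)            # transposed layout can't be flattened in place

print(r is x)                  # False
print(np.shares_memory(r, x))  # True
print(np.shares_memory(c, x))  # False: this reshape had to copy

r[0, 0] = -1
print(x[0], c[0])              # -1 1
```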


Dtypes and views

The same thing applies to the dtype. We can view an array as a different dtype by either setting x.dtype or by calling x.view(...).

So we can do things like this:

import numpy as np
x = np.array([1,2,3], dtype=np.int64)

print('The original array')
print(x)

print('\n...Viewed as unsigned 8-bit integers (notice the length change!)')
y = x.view(np.uint8)
print(y)

print('\n...Doing the same thing by setting the dtype')
x.dtype = np.uint8
print(x)

print('\n...And we can set the dtype again and go back to the original.')
x.dtype = np.int64
print(x)

Which yields:

The original array
[1 2 3]

...Viewed as unsigned 8-bit integers (notice the length change!)
[1 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0]

...Doing the same thing by setting the dtype
[1 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0]

...And we can set the dtype again and go back to the original.
[1 2 3]

Keep in mind, though, that this is giving you low-level control over the way that the memory buffer is interpreted.

For example:

import numpy as np
x = np.arange(10, dtype=np.int64)

print('An integer array:', x)
print('But if we view it as a float:', x.view(np.float64))
print("...It's probably not what we expected...")

This yields:

An integer array: [0 1 2 3 4 5 6 7 8 9]
But if we view it as a float: [  0.00000000e+000   4.94065646e-324   
   9.88131292e-324   1.48219694e-323   1.97626258e-323   
   2.47032823e-323   2.96439388e-323   3.45845952e-323
   3.95252517e-323   4.44659081e-323]
...It's probably not what we expected...

So, we're interpreting the underlying bits of the original memory buffer as floats, in this case.

If we wanted to make a new copy with the ints recast as floats, we'd use x.astype(np.float64).
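
To put the two side by side, here's a minimal sketch contrasting view (same bytes, new interpretation) with astype (new bytes, converted values). The names reinterpreted and converted are just for illustration:

```python
import numpy as np

x = np.arange(10, dtype=np.int64)

reinterpreted = x.view(np.float64)  # same buffer, bits read as doubles
converted = x.astype(np.float64)    # new buffer, values converted

print(np.shares_memory(x, reinterpreted))  # True
print(np.shares_memory(x, converted))      # False
print(converted[:3])                       # [0. 1. 2.]

# The integer 1 has the same bit pattern as the smallest positive double
print(reinterpreted[1] == np.nextafter(0.0, 1.0))  # True
```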


Complex Numbers

Complex numbers are stored (in C, Python, and numpy alike) as two floats. The first is the real part and the second is the imaginary part.

So, if we do:

import numpy as np
x = np.array([0.5+1j, 1.0+2j, 3.0+0j])

We can see the real (x.real) and imaginary (x.imag) parts. If we convert this to a float, we'll get a warning about discarding the imaginary part, and we'll get an array with just the real part.

print(x.real)
print(x.astype(float))

astype makes a copy and converts the values to the new type.

However, if we view this array as a float, we'll get a sequence of item1.real, item1.imag, item2.real, item2.imag, ....

print(x)
print(x.view(float))

yields:

[ 0.5+1.j  1.0+2.j  3.0+0.j]
[ 0.5  1.   1.   2.   3.   0. ]

Each complex number is essentially two floats, so if we change how numpy interprets the underlying memory buffer, we get an array of twice the length.
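
If it helps, here's a short sketch that reshapes the float view into one (real, imag) row per item and checks it against x.real and x.imag. The name pairs is my own:

```python
import numpy as np

x = np.array([0.5 + 1j, 1.0 + 2j, 3.0 + 0j])  # complex128: 16 bytes per item

floats = x.view(np.float64)                   # 8 bytes per item, twice as many
pairs = floats.reshape(-1, 2)                 # one (real, imag) row per item

print(x.itemsize, floats.itemsize)            # 16 8
print(len(x), len(floats))                    # 3 6
print(np.array_equal(pairs[:, 0], x.real))    # True
print(np.array_equal(pairs[:, 1], x.imag))    # True
```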

Hopefully that helps clear things up a bit...

Answer from Joe Kington on Stack Overflow

2 of 4
6

By changing the dtype in this way, you are changing the way a fixed block of memory is being interpreted.

Example:

>>> import numpy as np
>>> a=np.array([1,0,0,0,0,0,0,0],dtype='int8')
>>> a
array([1, 0, 0, 0, 0, 0, 0, 0], dtype=int8)
>>> a.dtype='int64'
>>> a
array([1])

Note how the change from int8 to int64 changed an 8-element, 8-bit integer array into a 1-element, 64-bit array. It is the same 8-byte block, however. On my i7 machine with native (little-endian) endianness, the byte pattern is the same as 1 in int64 format.

Change the position of the 1:

>>> a=np.array([0,0,0,1,0,0,0,0],dtype='int8')
>>> a.dtype='int64'
>>> a
array([16777216])

Another example:

>>> a=np.array([0,0,0,0,0,0,1,0],dtype='int32')
>>> a.dtype='int64'
>>> a
array([0, 0, 0, 1])

Change the position of the 1 in the 32-byte, 32-bit array:

>>> a=np.array([0,0,0,1,0,0,0,0],dtype='int32')
>>> a.dtype='int64'
>>> a
array([         0, 4294967296,          0,          0]) 

It is the same block of bits reinterpreted.
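
One way to double-check the arithmetic is to decode the raw bytes with plain Python. This is only a sketch: the explicit '<i8' and '>i8' dtype strings are my choice, used so the result doesn't depend on the machine's byte order:

```python
import numpy as np

a = np.array([0, 0, 0, 1, 0, 0, 0, 0], dtype='int8')
raw = a.tobytes()

little = a.view('<i8')   # least significant byte first
big = a.view('>i8')      # most significant byte first

print(raw)                              # b'\x00\x00\x00\x01\x00\x00\x00\x00'
print(little)                           # [16777216]
print(int.from_bytes(raw, 'little'))    # 16777216, i.e. 1 << 24
print(int.from_bytes(raw, 'big'))       # 4294967296, i.e. 1 << 32
```

The same 8 bytes give a different integer depending on which end is treated as the least significant.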

Top answer
1 of 2
64

NumPy arrays are stored as contiguous blocks of memory. They usually have a single datatype (e.g. integers, floats or fixed-length strings) and then the bits in memory are interpreted as values with that datatype.

Creating an array with dtype=object is different. The memory taken by the array now is filled with pointers to Python objects which are being stored elsewhere in memory (much like a Python list is really just a list of pointers to objects, not the objects themselves).

Arithmetic operators such as * don't work with arrays such as ar1, which have a string_ datatype (there are special functions instead, see below; note that NumPy 2.0 removed the string_ and unicode_ aliases in favour of bytes_ and str_). NumPy is just treating the bits in memory as characters and the * operator doesn't make sense here. However, the line

np.array(['avinash','jay'], dtype=object) * 2

works because now the array is an array of (pointers to) Python strings. The * operator is well defined for these Python string objects. New Python strings are created in memory and a new object array with references to the new strings is returned.
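
Here's a small sketch of the difference in layout. The arrays fixed and boxed are stand-ins of my own, since ar1 from the question isn't shown here:

```python
import struct
import numpy as np

names = ['avinash', 'jay']
fixed = np.array(names)                # fixed-width unicode, 7 characters wide
boxed = np.array(names, dtype=object)  # pointers to Python str objects

print(fixed.dtype.kind, fixed.itemsize)        # U 28  (7 characters x 4 bytes)
print(boxed.itemsize == struct.calcsize('P'))  # True: one pointer per element

print(boxed * 2)                   # ['avinashavinash' 'jayjay']
print(np.char.multiply(fixed, 2))  # ['avinashavinash' 'jayjay']
```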


If you have an array with string_ or unicode_ dtype and want to repeat each string, you can use np.char.multiply:

In [52]: np.char.multiply(ar1, 2)
Out[52]: array(['avinashavinash', 'jayjay'], 
      dtype='<U14')

NumPy has many other vectorised string methods too.

2 of 2
4

There are 3 main dtypes to store strings in numpy:

  • object: Stores pointers to Python objects
  • str: Stores fixed-width strings
  • numpy.dtypes.StringDType(): New in numpy 2.0 and stores variable-width strings

str consumes more memory than object; StringDType is better

The ratio depends on the length of the fixed-width strings and the size of the array, but as long as the longest string in the array is longer than 2 characters, str consumes more memory (they are equal when the longest string in the array is 2 characters long). In the following example, str consumes almost 8 times more memory.

On the other hand, the new (in numpy>=2.0) numpy.dtypes.StringDType stores variable width strings, so consumes much less memory.

import numpy as np
from pympler.asizeof import asizeof

ar1 = np.array(['this is a string', 'string']*1000, dtype=object)
ar2 = np.array(['this is a string', 'string']*1000, dtype=str)
ar3 = np.array(['this is a string', 'string']*1000, dtype=np.dtypes.StringDType())

asizeof(ar2) / asizeof(ar1)  # 7.944444444444445
asizeof(ar3) / asizeof(ar1)  # 1.992063492063492
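
The roughly 8x figure can also be sanity-checked without pympler, using only itemsize and nbytes. This is a sketch with my own variable names, and nbytes counts only the array's own buffer, not the Python string objects an object array points to:

```python
import numpy as np

words = ['this is a string', 'string'] * 1000
as_object = np.array(words, dtype=object)
as_str = np.array(words, dtype=str)  # every slot is as wide as the longest string

print(as_str.itemsize)   # 64 = 16 characters x 4 bytes each
print(as_str.nbytes)     # 128000 = 2000 elements x 64 bytes
print(as_object.nbytes == len(words) * as_object.itemsize)  # True: pointers only
```

With 8-byte pointers, that works out to 64 / 8 = 8 for the buffers alone, which lines up with the ratio measured above.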

For numpy 1.x, str is slower than object

For numpy>=2.0.0, str is faster than object

Numpy 2.0 introduced a new numpy.strings API that has much more performant ufuncs for string operations. A simple test (on numpy 2.2.0) below shows that vectorized string operations on an array of str or StringDType dtype are faster than the same operations on an object dtype array.

import timeit

t1 = min(timeit.repeat(lambda: ar1*2, number=1000))
t2a = min(timeit.repeat(lambda: np.strings.multiply(ar2, 2), number=1000))
t2b = min(timeit.repeat(lambda: np.strings.multiply(ar3, 2), number=1000))
print(t2a / t1)   # 0.8786601958427778
print(t2b / t1)   # 0.7311586933668037

t3 = min(timeit.repeat(lambda: np.array([s.count('i') for s in ar1]), number=1000))
t4a = min(timeit.repeat(lambda: np.strings.count(ar2, 'i'), number=1000))
t4b = min(timeit.repeat(lambda: np.strings.count(ar3, 'i'), number=1000))

print(t4a / t3)   # 0.13328748153237377
print(t4b / t3)   # 0.3365874412749679

For numpy<2.0.0 (tested on numpy 1.26.0)

Numpy 1.x's vectorized string methods are not optimized, so operating on the object array is often faster. For example, in the example in the OP where each string is repeated, a simple * (aka multiply()) is not only more concise but also over 10 times faster than char.multiply().

import timeit
setup = "import numpy as np; from __main__ import ar1, ar2"
t1 = min(timeit.repeat("ar1*2", setup, number=1000))
t2 = min(timeit.repeat("np.char.multiply(ar2, 2)", setup, number=1000))
t2 / t1   # 10.650433758517027

Even for functions that cannot readily be applied to the array, it is faster to loop over the object array and work on the Python strings than to use the vectorized char method on the str array.

For example, iterating over the object array and calling str.count() on each Python string is over 3 times faster than the vectorized char.count() on the str array.

f1 = lambda: np.array([s.count('i') for s in ar1])
f2 = lambda: np.char.count(ar2, 'i')

setup = "import numpy as np; from __main__ import ar1, ar2, f1, f2"
t3 = min(timeit.repeat("f1()", setup, number=1000))
t4 = min(timeit.repeat("f2()", setup, number=1000))

t4 / t3   # 3.251369161574832

On a side note, when it comes to an explicit loop, iterating over a list is faster than iterating over a numpy array. So in the previous example, a further performance gain can be made by iterating over the list:

f3 = lambda: np.array([s.count('i') for s in ar1.tolist()])
#                                               ^^^^^^^^^  <--- convert to list here
t5 = min(timeit.repeat(f3, number=1000))   # timeit also accepts a callable directly
t3 / t5   # 1.2623498005294627