GPU operations additionally have to move data to and from GPU memory

The problem is that your GPU operation always has to copy the input into GPU memory and then retrieve the results from there, which is quite a costly operation.

NumPy, on the other hand, directly processes the data from the CPU/main memory, so there is almost no delay here. Additionally, your matrices are extremely small, so even in the best-case scenario, there should only be a minute difference.
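
To check the transfer overhead on your own machine, a timing sketch along these lines may help. It assumes a square float32 matrix multiplication as the workload and counts the copies to and from the GPU as part of the GPU time; the helper names `time_numpy` and `time_torch_cuda` are made up for this illustration, and the GPU path is skipped when PyTorch or CUDA is not available.

```python
import time

import numpy as np

try:
    import torch  # optional: the GPU path is skipped if PyTorch is missing
except ImportError:
    torch = None


def time_numpy(n, repeats=10):
    """Average seconds for an n x n float32 matmul on the CPU."""
    a = np.random.rand(n, n).astype(np.float32)
    b = np.random.rand(n, n).astype(np.float32)
    start = time.perf_counter()
    for _ in range(repeats):
        a @ b
    return (time.perf_counter() - start) / repeats


def time_torch_cuda(n, repeats=10):
    """Average seconds for the same matmul on the GPU, including the
    host-to-device and device-to-host copies. Returns None without CUDA."""
    if torch is None or not torch.cuda.is_available():
        return None
    a = torch.rand(n, n)  # created in CPU memory, float32 by default
    b = torch.rand(n, n)
    (a.cuda() @ b.cuda()).cpu()  # warm-up so CUDA start-up cost is not timed
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        (a.cuda() @ b.cuda()).cpu()  # copy in, compute, copy out
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    for n in (8, 64, 512, 1024):
        print(n, time_numpy(n), time_torch_cuda(n))
```

The absolute numbers depend heavily on the hardware and on the BLAS library NumPy is linked against, so treat the output as a rough comparison only.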

This is also part of the reason why you use mini-batches when training neural networks on a GPU: instead of many extremely small operations, you have one big bulk of numbers that can be processed in parallel.
Also note that GPU clock speeds are generally much lower than CPU clock speeds, so the GPU only really shines because it has far more cores. If your matrix does not fully utilize them, you are likely to see a faster result on your CPU.
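
The "one big bulk" idea can be sketched with plain NumPy, assuming 1000 independent 4 x 4 products as a stand-in for many tiny operations. The array names are illustrative only. The same pattern carries over to `torch.matmul`, which also broadcasts over leading batch dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
# 1000 independent small problems: each is a (4 x 4) @ (4 x 4) product.
lefts = rng.standard_normal((1000, 4, 4))
rights = rng.standard_normal((1000, 4, 4))

# Many tiny operations: one library call per pair of matrices.
looped = np.stack([l @ r for l, r in zip(lefts, rights)])

# One bulk operation: matmul broadcasts over the leading batch axis.
batched = lefts @ rights

print(batched.shape)                 # (1000, 4, 4)
print(np.allclose(looped, batched))  # True
```

Both versions compute the same values; the batched form simply hands all the work to the library in a single call, which is what gives a device with many cores enough to do at once.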

TL;DR: If your matrix is big enough, you will eventually see a speed-up with CUDA over NumPy, even with the additional cost of the GPU transfer.

Answer from dennlinger on Stack Overflow
Kaggle
kaggle.com › code › amirmotefaker › pytorch-vs-numpy
PyTorch vs NumPy
July 14, 2024 - Note; Pytorch vs Numpy; NumPy; PyTorch; PyTorch: Tensors and autograd; PyTorch: Defining new autograd functions; PyTorch: nn module; PyTorch: optimal; PyTorch: Custom nn Modules; PyTorch: Control Flow + Weight Sharing
Medium
medium.com › @aktooall › numpy-vs-pytorch-tensors-same-dna-different-superpowers-f59c1211274d
“NumPy vs PyTorch Tensors: Same DNA, Different Superpowers” | by Arunkumar Ravichandran | Medium
September 8, 2025 - Some functions like where exist in NumPy but have slightly different implementations in PyTorch (torch.where). Random number handling in PyTorch often links with seeds for reproducibility, while NumPy’s default_rng() gives a modern, flexible random generator.
Discussions

Is pytorch faster than numpy on a single CPU?
If you really enforce a single thread, then it's all up to the BLAS library which does the heavy lifting. Both numpy and pytorch are compatible with the most common libs, so it's a matter of which lib they are linked against respectively. Numpy with MKL will most likely beat pytorch with OpenBLAS for most workloads.
reddit.com, r/Python, July 12, 2024
GeeksforGeeks
geeksforgeeks.org › deep learning › pytorch-tensor-vs-numpy-array
PyTorch Tensor vs NumPy Array - GeeksforGeeks
July 23, 2025 - You might think that both the PyTorch Tensors and NumPy Arrays are almost the same in terms of functionality. So let us move on to the differences between them. In the below code snippet, we are importing the torch module to create the Tensors and then performing the element-wise operations ...
Medium
kenichinakanishi.medium.com › pytorch-vs-numpy-exploring-some-syntactical-and-behavioural-differences-8d8ef7b3a130
PyTorch vs Numpy — exploring some syntactical and behavioural differences | by Kenichi Nakanishi | Medium
June 17, 2020 - First and perhaps most importantly is the PyTorch function that converts a numpy array into the tensor datatype for further manipulation in PyTorch. It is important to note that both the array and tensor will share the same memory storage. As a consequence, any changes in values from either side will affect the other. Here we see that an array containing only integers is converted to a torch tensor with the datatype int32.
Medium
medium.com › @yunjiangster › comparing-speed-of-torch-and-numpy-f22b11aabfcf
Comparing Speed of Torch and Numpy | by Yunjiang Jiang | Medium
June 19, 2023 - Huggingface converts the input to the compute_metrics function from torch tensors to numpy arrays already, since it wasn’t expecting people to do heavy computation within the compute_metrics function.
PyTorch Forums
discuss.pytorch.org › t › pytorch-tensor-constructor-speed-vs-numpy › 191487
Pytorch tensor constructor speed vs numpy - PyTorch Forums
November 8, 2023 - So I was comparing the performance of the tensor constructor to the numpy array constructor For pytorch torch.inference_mode() total_time = 0.0 iterations = 10000 for _ in range(iterations): data = np.random.normal(0, 1, (1000, 10)).tolist() # Use numpy for RNG # but convert back to python since we're benchmarking constructor t1 = time.time() thing = torch.tensor(data, dtype=torch.float64) total_time += time.time() - t1 print(total_time / iterations) I get 0.0003917569875...
Medium
medium.com › @lostandfound2654 › the-similarities-and-differences-between-numpy-and-pytorch-b076ea7d419b
The similarities and differences between numpy and pytorch | by Lost and Found | Medium
December 23, 2022 - If we already use Numpy, can we use PyTorch? Sure! Just use some function to convert one to the other. array_10 = np.arange(9) print("Before convert:", array_10) array_10 = torch.tensor(array_10) print("After convert:", array_10) array_10 = array_10.numpy() print("Convert again:", array_10)
Quora
quora.com › What-are-the-benefits-of-PyTorch-over-NumPy
What are the benefits of PyTorch over NumPy? - Quora
Answer: PyTorch is a deep learning focused library while NumPy is for scientific computing. Both of these libraries are made with different goals in mind: * If you want to just do basic matrix operations/transformation and array operations then ...
APXML
apxml.com › courses › getting-started-with-pytorch › chapter-1-pytorch-fundamentals-setup › relationship-numpy
PyTorch vs NumPy Tensors
If you have experience with scientific ... structure that has become a standard for numerical operations in Python. PyTorch recognizes this prevalence and provides excellent interoperability with NumPy arrays....
Towards Data Science
towardsdatascience.com › home › latest › hello pytorch – installation & numpy comparison
Hello PyTorch - Installation & Numpy Comparison | Towards Data Science
January 24, 2025 - In the NumPy library, the concept of arrays is used regardless of whether the array is one-dimensional or multidimensional. Torch has the same concept, but uses the name Tensor to generalize the idea of n-dimensional arrays.
PyTorch Forums
discuss.pytorch.org › t › numerical-differences-between-numpy-and-pytorch › 89607
Numerical differences between numpy and pytorch? - PyTorch Forums
July 17, 2020 - Hello all, When computing mean and std on numpy or pytorch tensors, they yield different results. How come ? rng = np.random.RandomState(1) dataset1 = rng.uniform(low=-0.01, high=0.01, size=(1000, 20)) dataset2 = torch.from_numpy(data_numpy) print(dataset1.mean(), dataset1.std()) print(dataset2.mean(), dataset2.std()) returns 2.5537095782416174e-05 0.005769608507668796 tensor(-3.14198780749532207562380037302319e-06, dtype=torch.float64) tensor(0.00576444647196127767790896356814, dtype=t...
PyTorch Forums
discuss.pytorch.org › t › torch-is-slow-compared-to-numpy › 117502
Torch is slow compared to numpy - PyTorch Forums
April 15, 2021 - In this benchmark I implemented the same algorithm in numpy/cupy, pytorch and native cpp/cuda. The benchmark is attached below. In all tests numpy was significantly faster than pytorch.
Medium
medium.com › @chethanvakiti › pytorch-vs-numpy-a-beginners-perspective-56623bb4d751
PyTorch vs NumPy — A Beginner’s Perspective | by Chethan Vakiti | Medium
August 29, 2025 - When I first started learning deep learning, I often came across two names again and again: NumPy and PyTorch. At first, they looked quite similar because both of them deal with numbers and arrays, but as I dug deeper, I understood the differences.
Medium
medium.com › axinc-ai › conversion-between-torch-and-numpy-operators-ce189b3882b1
Conversion between torch and numpy operators | by David Cochard | ailia Tech BLOG (EN) | Medium
April 29, 2021 - Below is the code to compute the maximum value of a tensor and the index of this element, which is done in a single call in torch, but requires several calls in numpy.
F0nzie
f0nzie.github.io › rtorch-minimal-book › pytorch-and-numpy.html
Chapter 2 PyTorch and NumPy | A Minimal rTorch Book
November 19, 2020 - There is some interdependence between both. Anytime that we need to do some transformation that is not available in PyTorch, we will use numpy. Just keep in mind that numpy does not have support for GPUs; you will have to convert the numpy array to a torch tensor afterwards.
Medium
medium.com › @ashish.iitr2015 › comparison-between-pytorch-tensor-and-numpy-array-de41e389c213
Comparison between Pytorch Tensor and Numpy Array | by Ashish Singh | Medium
August 11, 2020 - Numpy arrays are mainly used in typical machine learning algorithms (such as k-means or Decision Tree in scikit-learn) whereas pytorch tensors are mainly used in deep learning which requires heavy matrix computation.
Medium
medium.com › @heyamit10 › converting-pytorch-tensors-to-numpy-arrays-fa804b1fae1c
Converting PyTorch Tensors to NumPy Arrays | by Hey Amit | Medium
February 26, 2025 - PyTorch defaults to float32, while NumPy often defaults to float64. This difference can cause problems when the precision is crucial, such as in scientific computations or certain ML models.