Seeing so many comments dismissing efforts to improve Python is a bit disheartening. Of course you should strive to write proper code, but that doesn't mean a good, fast implementation of the language isn't a large improvement. I'm thankful to the people working on this; just getting glue code, scripts, or code that isn't easily JIT-able to run faster is very cool! Answer from Deleted User on reddit.com
Reddit
reddit.com › r/python › why python is slow and how to make it faster
r/Python on Reddit: Why Python is slow and how to make it faster
January 8, 2024

As there was a recent discussion on Python's speed, here is a collection of good articles and talks on Python's performance and on why the language is hard to make fast at the level of CPU instructions and executed code.

  • Pycon talk: Anthony Shaw - Why Python is slow

  • Pycon talk: Mark Shannon - How we are making CPython faster

  • Python 3.13 will ship with --enable-jit, --disable-gil

  • Python performance: it’s not just the interpreter

  • Cinder: Instagram's performance-oriented Python fork

Also remember that raw CPU speed rarely matters: many workloads are I/O-bound or network-bound, or the performance question is simply irrelevant. Put another way, Python trades some software development cost for increased hardware cost. In these cases, Python extensions and specialised libraries can do the heavy lifting outside the interpreter (PyArrow, Polars, Pandas, Numba, etc.).

Python
wiki.python.org › moin › PythonSpeed › PerformanceTips
PythonSpeed/PerformanceTips
The first step to speeding up your program is learning where the bottlenecks lie. It hardly makes sense to optimize code that is never executed or that already runs fast. I use two modules to help locate the hotspots in my code, profile and trace. In later examples I also use the timeit module, ...
Discussions

performance - How to speed up Python execution? - Stack Overflow
I have a project written in Python which works with a large amount of data. I would like to speed up the execution time. In simple words, let's say I have this sample fully optimized code: def foo(x):... More on stackoverflow.com
🌐 stackoverflow.com
performance - What tools or approaches are available to speed up code written in Python? - Computational Science Stack Exchange
However, I know that unless I make heavy use of functions from modules derived from compiled libraries (i.e., I only use raw Python, and not many built-in functions), then it could be quite slow. Question: What tools or approaches are available to help me speed up code I write in Python for ... More on scicomp.stackexchange.com
🌐 scicomp.stackexchange.com
News faster CPython, JIT and 3.14
I studied the effect of the new CPython JIT and, more generally, the performance of simple CPU-bound pure Python code. I focused on a very simple benchmark involving only very simple pure Python: def short_calcul(n): result = 0 for i in range(1, n + 1): result += i return result ... More on discuss.python.org
🌐 discuss.python.org
17
1
March 20, 2025
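A minimal sketch of how the loop benchmark quoted above could be timed with the standard-library timeit module. The body of short_calcul is taken from the snippet; the builtin_sum comparison, the value of n, and the repeat counts are illustrative assumptions, not part of the original post.

```python
import timeit


def short_calcul(n):
    # Pure-Python loop from the quoted benchmark: sum of 1..n.
    result = 0
    for i in range(1, n + 1):
        result += i
    return result


def builtin_sum(n):
    # Illustrative alternative (an assumption, not from the post):
    # let the C-implemented built-in do the looping.
    return sum(range(1, n + 1))


if __name__ == "__main__":
    n = 100_000
    calls = 20
    for func in (short_calcul, builtin_sum):
        # Take the best of 5 runs to reduce noise from other processes.
        best = min(timeit.repeat(lambda: func(n), number=calls, repeat=5))
        print(f"{func.__name__}: {best / calls * 1e6:.1f} microseconds per call")
```

The absolute numbers depend on the interpreter version and build, so they are only meaningful when compared on the same machine.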
Performance hacks for faster Python code
Hack 1: Don't Use The Obviously Wrong Data Structure For Your Problem · Hack 2: Don't Have The Computer Do Useless Stuff More on news.ycombinator.com
🌐 news.ycombinator.com
64
119
November 24, 2025
Python⇒Speed
pythonspeed.com › articles › different-ways-speed
330× faster: Four different ways to speed up your code
July 3, 2025 - There’s nothing wrong with starting with this approach, of using built-in Python functions implemented in C. But if the result isn’t fast enough, you will need to apply either or both of the Practice of Compilation or Practice of Efficiency. Efficiency won’t always give you faster results in Python on its own, as in this case, so you might need to first switch to a compiled language and then look for opportunities to speed things up with Efficiency.
JetBrains
blog.jetbrains.com › pycharm › 2025 › 11 › 10-smart-performance-hacks-for-faster-python-code
10 Smart Performance Hacks For Faster Python Code | The PyCharm Blog
November 17, 2025 - Learn practical optimization hacks, from data structures to built-in modules, that boost speed, reduce overhead, and keep your Python code clean.
Plain English
python.plainenglish.io › how-i-speed-up-my-python-scripts-by-300-31030219a6d5
How I Speed Up My Python Scripts by 300% | by Kiran Maan | Python in Plain English
January 25, 2025 - I still remember the time when I wrote a Python script to process a large dataset. It was a small project, but to process that much data I had to wait…wait..wait for a long time. A task that could have been done in a few minutes dragged into hours. ... Then I realized something was wrong. My code was not optimized, so after a lot of attempts I learned to optimize it. That sped up my script by 300%.
Towards Data Science
towardsdatascience.com › home › latest › run python up to 150× faster with c
Run Python Up to 150× Faster with C | Towards Data Science
November 12, 2025 - Subprocess. Compile the C code into an executable (e.g., with gcc or Visual Studio Build Tools) and run it from Python using the subprocess module. This is easy to set up and already shows a huge speedup compared to pure Python.
Medium
klaviyo.tech › how-to-improve-python-performance-6a43210f7359
How to Improve Python Performance | Klaviyo Engineering
December 19, 2022 - Tips, tricks, and techniques to optimize Python including how a single decorator can lead to a 99.9999% speedup.
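The snippet above does not say which decorator it means. As an assumption for illustration only, the sketch below uses functools.lru_cache from the standard library, which is one decorator that can give very large speedups when a function is called repeatedly with the same arguments. The Fibonacci functions are hypothetical examples, not code from the article.

```python
from functools import lru_cache
import timeit


def fib_plain(n):
    # Naive recursion: the number of calls grows exponentially with n.
    return n if n < 2 else fib_plain(n - 1) + fib_plain(n - 2)


@lru_cache(maxsize=None)
def fib_cached(n):
    # Same recursion, but each distinct n is computed only once
    # and later calls are answered from the cache.
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)


if __name__ == "__main__":
    print("plain :", timeit.timeit(lambda: fib_plain(22), number=1))
    print("cached:", timeit.timeit(lambda: fib_cached(22), number=1))
```

Caching only helps when the function is pure and its arguments are hashable; it trades memory for time.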
Medium
medium.com › data-science › make-python-run-as-fast-as-c-9fdccdb501d4
Make Python Run As Fast As C. Faster Python Code With Numba | by Lukas Frei | TDS Archive | Medium
August 11, 2021 - In doing so, Numba enables you to boost your Python code’s speed by “one to two orders of magnitude.”[2] The actual speed gains from using Numba, however, highly depend on each specific use case as the project focuses on scientific computing applications. One of Numba’s great advantages concerns its ease of use. As opposed to more or less complicated installation procedures required for alternative ways to speed up Python, Numba can be completely installed using pip or conda .
Top answer
1 of 5
10

To answer your last question first, if you have a problem with performance, then it's worth it. That's the only criterion, really.

As for how:

If your algorithm is slow because it's computationally expensive, consider rewriting it as a C extension, or use Cython, which will let you write fast extensions in a Python-esque language. Also, PyPy is getting faster and faster and may just be able to run your code without modification.

If the code is not computationally expensive but just loops a huge number of times, it may be possible to break it down with multiprocessing so that it gets done in parallel.

Lastly, if this is some kind of basic data splatting task, consider using a fast data store. All the major relational databases are optimised up the wazoo, and you may find that your task can be sped up simply by getting the database to do it for you. You may even be able to shape it to fit a Redis store, which can aggregate big data sets brilliantly.
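As a rough illustration of letting the database do the work, the sketch below aggregates rows with a SQL GROUP BY instead of a Python loop. It uses the standard-library sqlite3 module with an in-memory database; the orders table, its columns, and the sample rows are hypothetical.

```python
import sqlite3


def totals_by_customer(rows):
    """Aggregate (customer, amount) rows inside SQLite rather than in Python."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        # The grouping and summing run inside the database engine.
        cursor = conn.execute(
            "SELECT customer, SUM(amount) FROM orders "
            "GROUP BY customer ORDER BY customer"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [("ann", 10.0), ("bob", 5.0), ("ann", 2.5)]
    print(totals_by_customer(sample))
```

Whether this beats a plain Python loop depends on data size and on whether the data already lives in the database; for data that is already stored there, it avoids pulling every row into the interpreter.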

2 of 5
8

The only real way to know would be to profile and measure. Your code could be doing anything. "doSomething" might be a time.sleep(10) in which case, forking off 10000000 processes would make the whole program run in approximately 10 seconds (ignoring the forking overhead and resulting slowdowns).

Use http://docs.python.org/library/profile.html to check where the bottlenecks are, and see if you can optimise the "fully optimised" program with better code. If it's already fast enough, stop.

Then, depending on whether it's CPU or I/O bound and the hardware you have, you might want to try multiprocessing or threading. You can also try distributing to multiple machines and doing some map/reduce kind of thing if your problem can be broken down.
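For the I/O-bound case, a minimal sketch with threads, assuming the work looks like the time.sleep example mentioned above. do_something here is a hypothetical stand-in that sleeps for 0.1 seconds; the worker count is an arbitrary choice.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def do_something(item):
    # Stand-in for an I/O-bound call (network, disk).
    # Sleeping releases the GIL, so other threads can run meanwhile.
    time.sleep(0.1)
    return item * 2


def run_sequential(items):
    return [do_something(x) for x in items]


def run_threaded(items, workers=10):
    # pool.map() returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(do_something, items))


if __name__ == "__main__":
    items = list(range(10))
    for runner in (run_sequential, run_threaded):
        start = time.perf_counter()
        runner(items)
        print(f"{runner.__name__}: {time.perf_counter() - start:.2f} s")
```

Threads help here because the waiting overlaps. For CPU-bound work in standard CPython the GIL prevents that overlap, which is why the answer points to multiprocessing for that case.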

Top answer
1 of 4
40

I'm going to break up my answer into three parts: profiling, speeding up the Python code via C, and speeding up Python via Python. It is my view that Python has some of the best tools for looking at your code's performance and then drilling down to the actual bottlenecks. Speeding up code without profiling is about like trying to kill a deer with an uzi.

If you are really only interested in mat-vec products, I would recommend scipy.sparse.

Python tools for profiling

profile and cProfile modules: These modules will give you your standard run time analysis and function call stack. It is pretty nice to save their statistics and using the pstats module you can look at the data in a number of ways.

kernprof: this tool puts together many routines for doing things like line by line code timing

memory_profiler: this tool produces a line-by-line memory footprint of your code.

IPython timers: The timeit function is quite nice for seeing the differences in functions in a quick interactive way.
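A short sketch of the cProfile and pstats workflow described in the list above. The workload and slow_square_sum functions are hypothetical placeholders for real code; profile_report is an illustrative helper, not part of either module.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    # Deliberately wasteful: builds a full list only to sum it.
    return sum([i * i for i in range(n)])


def workload():
    return [slow_square_sum(20_000) for _ in range(50)]


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so callers of expensive functions show up first.
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report(workload))
```

The same Stats object can be re-sorted by other keys such as "tottime" or saved with dump_stats() for later inspection.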

Speeding up Python

Cython: Cython is the quickest way to take a few functions in Python and get faster code. You can decorate the function with the Cython variant of Python and it generates C code. This is very maintainable and can also link to other hand-written code in C/C++/Fortran quite easily. It is by far the preferred tool today.

ctypes: ctypes will allow you to write your functions in C and then wrap them quickly with its simple decoration of the code. It handles all the pain of casting from PyObjects and managing the GIL to call the C function.

Other approaches exist for writing your code in C but they are all somewhat more for taking a C/C++ library and wrapping it in Python.
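To make the ctypes approach concrete, here is a minimal sketch that calls strlen from the system C library rather than a hand-written function. It assumes a POSIX-like system where the C library can be loaded this way; c_strlen is a hypothetical wrapper name.

```python
import ctypes
import ctypes.util

# Locate the C standard library. On POSIX, CDLL(None) also exposes the
# libc symbols, so a None result from find_library still works there.
# This loading step is an assumption and will differ on Windows.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Return the UTF-8 byte length of text as computed by C's strlen."""
    return libc.strlen(text.encode("utf-8"))


if __name__ == "__main__":
    print(c_strlen("hello"))
```

Declaring argtypes and restype is what lets ctypes convert between Python objects and C types; for your own code you would load a shared library you compiled yourself in the same way.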

Python-only approaches

If you want to stay mostly inside Python, my advice is to figure out what data you are using and pick the correct data types for implementing your algorithms. It has been my experience that you will usually get much farther by optimizing your data structures than with any low-level C hack. For example:

numpy: a contiguous array, very fast for strided operations on arrays

numexpr: a numpy array expression optimizer. It allows for multithreading numpy array expressions and also gets rid of the numerous temporaries numpy makes because of restrictions of the Python interpreter.

blist: a b-tree implementation of a list, very fast for inserting, indexing, and moving the internal nodes of a list

pandas: data frames (or tables), with very fast analytics on the arrays.

pytables: fast structured hierarchical tables (like hdf5), especially good for out of core calculations and queries to large data.

2 of 4
6

First of all, if there is a C or Fortran implementation available (MATLAB MEX function?), why don't you write a Python wrapper?

If you want your own implementation and not only a wrapper, I would strongly suggest using the numpy module for linear algebra. Make sure it is linked to an optimized BLAS (like ATLAS, GotoBLAS, uBLAS, Intel MKL, ...). And use Cython or weave. Read this Performance Python article for a good introduction and benchmark. The different implementations in this article are available for download here courtesy of Travis Oliphant (Numpy-guru).

Good luck.

Medium
medium.com › @nirmalya.ghosh › 13-ways-to-speedup-python-loops-e3ee56cd6b73
13 Ways To Speedup Python Loops With Minimal Effort | by Nirmalya Ghosh | Medium
January 3, 2024 - 13 Ways To Speedup Python Loops With Minimal Effort In this article, I cover a few simple ways to achieve 1.3x to 970x speedup of Python for loops with minimal effort. I have included code snippets …
NVIDIA Developer
developer.nvidia.com › blog › 7-drop-in-replacements-to-instantly-speed-up-your-python-data-science-workflows
7 Drop-In Replacements to Instantly Speed Up Your Python Data Science Workflows | NVIDIA Technical Blog
September 22, 2025 - Many of Python’s most popular data science libraries—including pandas, Polars, scikit-learn, and XGBoost—can now run much faster on GPUs with little to no code changes. Using libraries like NVIDIA cuDF, cuML, and cuGraph, you can keep your existing code and scale it to handle much larger workloads with ease. This post shares how to speed up seven drop-in replacements for popular Python libraries—complete with starter code to try yourself.
InfoWorld
infoworld.com › home › software development › programming languages › python
10 tips for speeding up Python programs | InfoWorld
May 14, 2025 - On average, large objects like that in NumPy take up around one-fourth of the memory required if they were expressed in conventional Python. Note that it helps to begin with the right data structure for a job—which is an optimization in itself. Rewriting Python algorithms to use NumPy takes some work since array objects need to be declared using NumPy’s syntax. Plus, the biggest speedups come by way of using NumPy-specific “broadcasting” techniques, where a function or behavior is applied across an array.
Python⇒Speed
pythonspeed.com
Write faster Python code, and ship your code faster
Helping you deploy with confidence, ship higher quality code, and speed up your application.
SOFTFORMANCE
softformance.com › home › blog › 25 tips for optimizing python performance
Optimizing Python Code for Performance: Tips & Tricks | SoftFormance
January 10, 2024 - Python offers list, tuple, set, and dictionary as its built-in data structures, and most developers rely on a list in all cases. However, if you want truly good performance, check how different data structures fit different cases and choose the one that matches your needs. This approach optimizes and speeds up code execution.
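One way to see the effect of choosing a data structure, sketched with the standard-library timeit module: the same membership test against a list and against a set. count_hits and the sizes used are illustrative assumptions.

```python
import timeit


def count_hits(needles, haystack):
    # The cost of "x in haystack" depends on the container type.
    return sum(1 for x in needles if x in haystack)


if __name__ == "__main__":
    values = list(range(50_000))
    needles = range(0, 100_000, 500)
    as_list = values        # linear scan per lookup
    as_set = set(values)    # hash lookup, constant time on average
    for name, container in (("list", as_list), ("set", as_set)):
        elapsed = timeit.timeit(lambda: count_hits(needles, container), number=5)
        print(f"{name}: {elapsed:.4f} s")
```

The results are identical; only the lookup cost changes, and the gap widens as the container grows.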
Real Python
realpython.com › python-concurrency
Speed Up Your Python Program With Concurrency – Real Python
November 25, 2024 - For I/O-bound problems, there’s a general rule of thumb in the Python community: “Use asyncio when you can, threading or concurrent.futures when you must.” asyncio can provide the best speed-up for this type of program, but sometimes you’ll require critical libraries that haven’t been ported to take advantage of asyncio.
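A minimal sketch of the asyncio side of that rule of thumb, assuming the I/O can be expressed as awaitable calls. fetch is a hypothetical stand-in that just sleeps; a real program would await a network or file operation from an asyncio-aware library.

```python
import asyncio
import time


async def fetch(item, delay=0.1):
    # Stand-in for an awaitable I/O call.
    await asyncio.sleep(delay)
    return item * 2


async def fetch_all(items):
    # gather() runs the coroutines concurrently and returns results
    # in the same order as the inputs.
    return await asyncio.gather(*(fetch(x) for x in items))


if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(fetch_all(range(10)))
    print(results, f"{time.perf_counter() - start:.2f} s")
```

All ten waits overlap on a single thread, so the total time is close to one delay rather than ten.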
TheServerSide
theserverside.com › tip › Tips-to-improve-Python-performance
9 tips to improve Python performance | TheServerSide
Python performance gets a bad rap compared with languages such as Java. Use these tips to identify and fix problems in your Python code to tweak its performance. Speed up Python and NumPy by avoiding the conversion tax
GitConnected
levelup.gitconnected.com › 10-python-tools-for-fast-development-every-developer-should-know-93cb368e9591
10 Python Tools for Fast Development Every Developer Should Know | by Sandun Lakshan | Nov, 2025 | Level Up Coding
November 21, 2025 - 10 Python Tools for Fast Development Every Developer Should Know Most Python projects slow down not because of logic, but because of setup, debugging, and maintenance. The right tools handle those …