I’m starting a series on Python performance optimizations. Looking for real-world use cases!
Try Pandas, it handles tabular data much faster than iterating through lists.
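As a rough sketch of the point above, here is a small comparison between a row-by-row Python loop and a vectorized pandas column operation. The column names and data are made up for illustration.

```python
import pandas as pd

# Sample tabular data: prices and quantities for a handful of orders.
df = pd.DataFrame({"price": [9.99, 4.50, 12.00, 3.25],
                   "qty":   [3, 10, 1, 7]})

# Slow, row-by-row approach: iterate in pure Python.
totals_loop = [row["price"] * row["qty"] for _, row in df.iterrows()]

# Vectorized approach: one C-level multiplication over whole columns.
df["total"] = df["price"] * df["qty"]

assert totals_loop == df["total"].tolist()  # same numbers, far fewer dispatches
```

On real data sets (thousands to millions of rows) the vectorized version is typically orders of magnitude faster, because the loop runs inside pandas/NumPy's C code instead of the interpreter.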
Hey everyone,
I’m planning to start a series (not sure yet if it’ll be a blog, video, podcast, or something else) focused on Python performance. The idea is to explore concrete ways to:
- Make Python code run faster
- Optimize memory usage
- Reduce infrastructure costs (e.g., cloud bills)
I’d love to base this on real-world use cases instead of just micro-benchmarks or contrived examples.
If you’ve ever run into performance issues in Python, whether slow scripts, web backends that cost too much to run, or anything else, I’d really appreciate it if you could share your story.
These will serve as case studies for me to propose optimizations, compare approaches, and hopefully make the series valuable for the community.
Thanks in advance for any examples you can provide!
Regarding "Secondly: When writing a program from scratch in python, what are some good ways to greatly improve performance?"
Remember the Jackson rules of optimization:
- Rule 1: Don't do it.
- Rule 2 (for experts only): Don't do it yet.
And the Knuth rule:
- "Premature optimization is the root of all evil."
The more useful rules are in the General Rules for Optimization.
Don't optimize as you go. First get it right. Then get it fast. Optimizing a wrong program is still wrong.
Remember the 80/20 rule.
Always run "before" and "after" benchmarks. Otherwise, you won't know if you've found the 80%.
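A minimal "before"/"after" benchmark can be done with the standard-library timeit module. The two functions below are illustrative stand-ins: a quadratic string-concatenation loop versus a single join.

```python
import timeit

def before(n=1000):
    # "Before": repeated string concatenation, quadratic behavior.
    s = ""
    for i in range(n):
        s += str(i)
    return s

def after(n=1000):
    # "After": build the string once with join.
    return "".join(str(i) for i in range(n))

# Same workload, measured the same way, before and after the change.
t_before = timeit.timeit(before, number=200)
t_after = timeit.timeit(after, number=200)
print(f"before: {t_before:.4f}s  after: {t_after:.4f}s")
```

The key discipline is measuring both versions under identical conditions; without that, you can't tell whether you've actually hit the 80%.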
Use the right algorithms and data structures. This rule should be first. Nothing matters as much as algorithms and data structures.
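A classic illustration of the data-structure point: membership tests against a list scan every element, while a set hashes straight to the answer.

```python
# The same membership question, answered by two data structures.
words = ["alpha", "beta", "gamma", "delta"] * 10_000

word_list = list(words)   # list membership scans elements: O(n) per lookup
word_set = set(words)     # set membership hashes: O(1) on average

# Both give the same answer; on large collections only the set stays fast.
assert ("gamma" in word_list) == ("gamma" in word_set)
assert ("omega" in word_list) == ("omega" in word_set)
```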
Bottom Line
You can't prevent or avoid the "optimize this program" effort. It's part of the job. You have to plan for it and do it carefully, just like the design, code and test activities.
Rather than just punting to C, I'd suggest:
Make your code count. Do more with fewer executions of lines:
- Change the algorithm to a faster one. In many cases it doesn't need to be fancy to be faster.
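A small, deliberately un-fancy example of an algorithm change: detecting duplicates by comparing every pair versus remembering what you've already seen.

```python
def has_duplicate_quadratic(items):
    # Naive: compare every pair of elements, O(n^2) comparisons.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def has_duplicate_linear(items):
    # Same question, O(n): remember what we have already seen.
    seen = set()
    for x in items:
        if x in seen:
            return True
        seen.add(x)
    return False

data = [3, 1, 4, 1, 5]
assert has_duplicate_quadratic(data) == has_duplicate_linear(data)
```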
- Use Python primitives that happen to be written in C. Some things will force an interpreter dispatch while others won't; the latter is preferable.
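For instance, the built-in sum() runs its loop in C, while a hand-written loop pays an interpreter dispatch on every iteration:

```python
# Interpreted loop: one bytecode dispatch per iteration.
total = 0
for x in range(1_000_000):
    total += x

# Built-in sum() iterates in C with a single call from the interpreter.
total_builtin = sum(range(1_000_000))

assert total == total_builtin  # 499999500000 either way
```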
- Beware of code that first constructs a big data structure and then consumes it. Think of the difference between range and xrange. In general, it is often worth thinking about the memory usage of the program. Using generators can sometimes bring O(n) memory use down to O(1).
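The list-versus-generator difference is easy to see with sys.getsizeof:

```python
import sys

n = 100_000

# A list comprehension materializes all n results: O(n) memory up front.
squares_list = [i * i for i in range(n)]

# A generator expression yields one value at a time: O(1) memory.
squares_gen = (i * i for i in range(n))

# The generator object stays tiny regardless of n.
assert sys.getsizeof(squares_gen) < sys.getsizeof(squares_list)

# Consuming either produces the same total.
assert sum(squares_gen) == sum(squares_list)
```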
- Python is generally non-optimizing. Hoist invariant code out of loops, eliminate common subexpressions where possible in tight loops.
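A sketch of hoisting a loop-invariant expression; CPython will not do this for you:

```python
import math

values = [0.1 * i for i in range(10_000)]

# Not hoisted: math.sqrt(2.0) is re-evaluated on every iteration.
slow = [v * math.sqrt(2.0) for v in values]

# Hoisted: the loop-invariant subexpression is computed once, outside the loop.
root2 = math.sqrt(2.0)
fast = [v * root2 for v in values]

assert slow == fast  # identical results, far fewer function calls
```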
- If something is expensive, then precompute or memoize it. Regular expressions can be compiled for instance.
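Both ideas in one sketch: a compiled regular expression reused across calls, and functools.lru_cache memoizing a function (the function here is a trivial stand-in for genuinely expensive work):

```python
import re
from functools import lru_cache

# Compile the regular expression once, not on every call.
WORD_RE = re.compile(r"\w+")

@lru_cache(maxsize=None)
def slow_square(n):
    # Stand-in for an expensive computation; lru_cache memoizes the result.
    return n * n

assert WORD_RE.findall("make it right, then make it fast") == \
    ["make", "it", "right", "then", "make", "it", "fast"]
assert slow_square(12) == 144
assert slow_square(12) == 144               # second call served from the cache
assert slow_square.cache_info().hits == 1
```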
- Need to crunch numbers? You might want to check numpy out.
- Many Python programs are slow because they are bound by disk I/O or database access. Make sure you have something worthwhile to do while you wait for the data to arrive rather than just blocking. A weapon could be something like the Twisted framework.
- Note that many crucial data-processing libraries have C versions, be it XML, JSON or whatnot. They are often considerably faster than the Python interpreter.
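The answer names Twisted; as a rough modern sketch of the same idea, here is the standard-library asyncio overlapping several simulated I/O waits (asyncio.sleep stands in for network or database calls) instead of blocking on each in turn:

```python
import asyncio
import time

async def fetch(name, delay=0.1):
    # Simulated I/O wait, standing in for a network or database call.
    await asyncio.sleep(delay)
    return name

async def main():
    start = time.perf_counter()
    # Await all three "requests" concurrently instead of one after another.
    results = await asyncio.gather(fetch("a"), fetch("b"), fetch("c"))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
print(results, f"took {elapsed:.2f}s")  # roughly 0.1s total, not 0.3s
```

Three sequential blocking waits would take about 0.3s; overlapped, the total is close to the longest single wait.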
If all of the above fails for profiled and measured code, then begin thinking about the C-rewrite path.