Regarding "Secondly: When writing a program from scratch in python, what are some good ways to greatly improve performance?"
Remember the Jackson rules of optimization:
- Rule 1: Don't do it.
- Rule 2 (for experts only): Don't do it yet.
And the Knuth rule:
- "Premature optimization is the root of all evil."
The more useful rules are in the General Rules for Optimization.
Don't optimize as you go. First get it right. Then get it fast. Optimizing a wrong program is still wrong.
Remember the 80/20 rule.
Always run "before" and "after" benchmarks. Otherwise, you won't know if you've found the 80%.
Use the right algorithms and data structures. This rule should be first. Nothing matters as much as algorithm and data structure.
Bottom Line
You can't prevent or avoid the "optimize this program" effort. It's part of the job. You have to plan for it and do it carefully, just like the design, code and test activities.
Answer from S.Lott on Stack OverflowHow to optimize python code?
performance - Optimizing Python Code - Stack Overflow
Why Python is slow and how to make it faster
[deleted by user]
Videos
Regarding "Secondly: When writing a program from scratch in python, what are some good ways to greatly improve performance?"
Remember the Jackson rules of optimization:
- Rule 1: Don't do it.
- Rule 2 (for experts only): Don't do it yet.
And the Knuth rule:
- "Premature optimization is the root of all evil."
The more useful rules are in the General Rules for Optimization.
Don't optimize as you go. First get it right. Then get it fast. Optimizing a wrong program is still wrong.
Remember the 80/20 rule.
Always run "before" and "after" benchmarks. Otherwise, you won't know if you've found the 80%.
Use the right algorithms and data structures. This rule should be first. Nothing matters as much as algorithm and data structure.
Bottom Line
You can't prevent or avoid the "optimize this program" effort. It's part of the job. You have to plan for it and do it carefully, just like the design, code and test activities.
Rather than just punting to C, I'd suggest:
Make your code count. Do more with fewer executions of lines:
- Change the algorithm to a faster one. It doesn't need to be fancy to be faster in many cases.
- Use python primitives that happens to be written in C. Some things will force an interpreter dispatch where some wont. The latter is preferable
- Beware of code that first constructs a big data structure followed by its consumation. Think the difference between range and xrange. In general it is often worth thinking about memory usage of the program. Using generators can sometimes bring O(n) memory use down to O(1).
- Python is generally non-optimizing. Hoist invariant code out of loops, eliminate common subexpressions where possible in tight loops.
- If something is expensive, then precompute or memoize it. Regular expressions can be compiled for instance.
- Need to crunch numbers? You might want to check
numpyout. - Many python programs are slow because they are bound by disk I/O or database access. Make sure you have something worthwhile to do while you wait on the data to arrive rather than just blocking. A weapon could be something like the
Twistedframework. - Note that many crucial data-processing libraries have C-versions, be it XML, JSON or whatnot. They are often considerably faster than the Python interpreter.
If all of the above fails for profiled and measured code, then begin thinking about the C-rewrite path.
Hi Pythonistas,
I'm interested in learning what optimization techniques you know for python code. I know its a general statement, but I'm interested in really pushing execution to the maximum.
I use the following -
-
I declare
__slots__in custom classes -
I use typing blocks for typing imports
-
I use builtins when possible
-
I try to reduce function calls
-
I use set lookups wherever possible
-
I prefer iteration to recursion
Edit: I am using a profiler, and benchmarks. I'm working on a library - an ASGI Api framework. The code is async. Its not darascience. Its neither compatible with pypy, nor with numba..
What else?
If your question is about optimising python code generally (which I think it should be ;) then there are all sorts of intesting things you can do, but first:
You probably shouldn't be obsessively optimising python code! If you're using the fastest algorithm for the problem you're trying to solve and python doesn't do it fast enough you should probably be using a different language.
That said, there are several approaches you can take (because sometimes, you really do want to make python code faster):
Profile (do this first!)
There are lots of ways of profiling python code, but there are two that I'll mention: cProfile (or profile) module, and PyCallGraph.
cProfile
This is what you should actually use, though interpreting the results can be a bit daunting. It works by recording when each function is entered or exited, and what the calling function was (and tracking exceptions).
You can run a function in cProfile like this:
import cProfile
cProfile.run('myFunction()', 'myFunction.profile')
Then to view the results:
import pstats
stats = pstats.Stats('myFunction.profile')
stats.strip_dirs().sort_stats('time').print_stats()
This will show you in which functions most of the time is spent.
PyCallGraph
PyCallGraph provides a prettiest and maybe the easiest way of profiling python programs -- and it's a good introduction to understanding where the time in your program is spent, however it adds significant execution overhead
To run pycallgraph:
pycallgraph graphviz ./myprogram.py
Simple! You get a png graph image as output (perhaps after a while...)
Use Libraries
If you're trying to do something in python that a module already exists for (maybe even in the standard library), then use that module instead!
Most of the standard library modules are written in C, and they will execute hundreds of times faster than equivilent python implementations of, say, bisection search.
Make the Interpreter do as Much of Your Work as You Can
The interpreter will do some things for you, like looping. Really? Yes! You can use the map, reduce, and filter keywords to significantly speed up tight loops:
consider:
for x in xrange(0, 100):
doSomethingWithX(x)
vs:
map(doSomethingWithX, xrange(0,100))
Well obviously this could be faster because the interpreter only has to deal with a single statement, rather than two, but that's a bit vague... in fact, this is faster for two reasons:
- all flow control (have we finished looping yet...) is done in the interpreter
- the doSomethingWithX function name is only resolved once
In the for loop, each time around the loop python has to check exactly where the doSomethingWithX function is! even with cacheing this is a bit of an overhead.
Remember that Python is an Interpreted Language
(Note that this section really is about tiny tiny optimisations that you shouldn't let affect your normal, readable coding style!) If you come from a background of a programming in a compiled language, like c or Fortran, then some things about the performance of different python statements might be surprising:
try:ing is cheap, ifing is expensive
If you have code like this:
if somethingcrazy_happened:
uhOhBetterDoSomething()
else:
doWhatWeNormallyDo()
And doWhatWeNormallyDo() would throw an exception if something crazy had happened, then it would be faster to arrange your code like this:
try:
doWhatWeNormallyDo()
except SomethingCrazy:
uhOhBetterDoSomething()
Why? well the interpreter can dive straight in and start doing what you normally do; in the first case the interpreter has to do a symbol look up each time the if statement is executed, because the name could refer to something different since the last time the statement was executed! (And a name lookup, especially if somethingcrazy_happened is global can be nontrivial).
You mean Who??
Because of cost of name lookups it can also be better to cache global values within functions, and bake-in simple boolean tests into functions like this:
Unoptimised function:
def foo():
if condition_that_rarely_changes:
doSomething()
else:
doSomethingElse()
Optimised approach, instead of using a variable, exploit the fact that the interpreter is doing a name lookup on the function anyway!
When the condition becomes true:
foo = doSomething # now foo() calls doSomething()
When the condition becomes false:
foo = doSomethingElse # now foo() calls doSomethingElse()
PyPy
PyPy is a python implementation written in python. Surely that means it will run code infinitely slower? Well, no. PyPy actually uses a Just-In-Time compiler (JIT) to run python programs.
If you don't use any external libraries (or the ones you do use are compatible with PyPy), then this is an extremely easy way to (almost certainly) speed up repetitive tasks in your program.
Basically the JIT can generate code that will do what the python interpreter would, but much faster, since it is generated for a single case, rather than having to deal with every possible legal python expression.
Where to look Next
Of course, the first place you should have looked was to improve your algorithms and data structures, and to consider things like caching, or even whether you need to be doing so much in the first place, but anyway:
This page of the python.org wiki provides lots of information about how to speed up python code, though some of it is a bit out of date.
Here's the BDFL himself on the subject of optimising loops.
There are quite a few things, even from my own limited experience that I've missed out, but this answer was long enough already!
This is all based on my own recent experiences with some python code that just wasn't fast enough, and I'd like to stress again that I don't really think any of what I've suggested is actually a good idea, sometimes though, you have to....
First off, profile your code so you know where the problems lie. There are many examples of how to do this, here's one: https://codereview.stackexchange.com/questions/3393/im-trying-to-understand-how-to-make-my-application-more-efficient
You do a lot of indexed access as in:
for pair in range(i-1, j):
if coordinates[pair][0] >= 0 and coordinates[pair][1] >= 0:
Which could be written more plainly as:
for coord in coordinates[i-1:j]:
if coord[0] >= 0 and cood[1] >= 0:
List comprehensions are cool and "pythonic", but this code would probably run faster if you didn't create 4 lists:
N = int(raw_input())
coordinates = []
coordinates = [raw_input() for i in xrange(N)]
coordinates = [pair.split(" ") for pair in coordinates]
coordinates = [[int(pair[0]), int(pair[1])] for pair in coordinates]
I would instead roll all those together into one simple loop or if you're really dead set on list comprehensions, encapsulate the multiple transformations into a function which operates on the raw_input().