Is Ruby really slow?
Many people say Ruby is slow. Are there any plans for an asynchronous Ruby, or a faster Ruby?
Nothing deep, I am pretty sure -- it's strictly a matter of implementation choices and maturity. Python was quite a bit slower in many aspects not so long ago, after all! Consider for example:
$ py24 -mtimeit '[i+i for i in xrange(55)]'
100000 loops, best of 3: 10.8 usec per loop
$ py25 -mtimeit '[i+i for i in xrange(55)]'
100000 loops, best of 3: 9.83 usec per loop
$ py26 -mtimeit '[i+i for i in xrange(55)]'
100000 loops, best of 3: 8.12 usec per loop
$ py27 -mtimeit '[i+i for i in xrange(55)]'
100000 loops, best of 3: 6.35 usec per loop
Yep, all on the same machine (MacBook Pro, 2.4 GHz Intel Core 2 Duo, OS X 10.5), all "official" Mac releases from python.org (the latest one of each x in the 2.x series). I have no 2.3 around to check, but I'd expect it to be a wee bit slower than 2.4.
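To repeat this kind of measurement from inside a script rather than from the shell, a minimal sketch with the standard-library timeit module might look like this. It assumes Python 3, where range() replaces xrange(); the helper name best_usec_per_loop and the loop and repeat counts are illustrative choices, not part of the runs above.

```python
import timeit

# Same statement as in the shell runs above, with range() standing in
# for xrange() (assumption: Python 3).
STATEMENT = "[i + i for i in range(55)]"


def best_usec_per_loop(stmt, loops=100_000, repeats=3):
    """Return the best per-loop time in microseconds over several runs."""
    # timeit.repeat returns one total time (in seconds) per repeat.
    totals = timeit.repeat(stmt, number=loops, repeat=repeats)
    return min(totals) / loops * 1e6


if __name__ == "__main__":
    print("best of 3: %.2f usec per loop" % best_usec_per_loop(STATEMENT))
```

Taking the minimum over the repeats mirrors the "best of 3" figure that the -mtimeit command line reports.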
This is just the kind of speed-up that a lot of loving, painstaking work can achieve among successive releases of pretty much the same underlying architecture. Not as flashy as adding feechurz, but often vastly more useful in the real world!-)
I'm pretty sure, therefore, that Ruby can also stabilize on a sound, performance-robust underlying architecture, then start getting a steady stream of under-the-hood performance tweaks over the years, to get (e.g.) the 40% or so further improvement that, as we observe here, has been happening in (at least some parts of) Python in the last few years.
One reason is that Python is compiled into bytecode, which is then executed by a highly optimized VM. AFAIK Ruby doesn't work this way in 1.8 and earlier, but instead interprets the syntax trees on the fly.
Think of it this way:
Python:
- Parse code into ASTs
- Convert ASTs into bytecode
- Run bytecode on a VM
Ruby (prior to 1.9):
- Parse code into ASTs
- Interpret the ASTs directly by recursive traversal
Without going into too much detail, step 2 in the old Ruby involves a lot of repeated work, because the interpreter has to "understand" the ASTs each time it sees them (which, in an inner loop, is a lot). Python "understands" the ASTs only once, and then the VM runs the bytecode as fast as it can (which isn't different in principle from the way the Java and .NET VMs work).
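The three Python steps listed above can be made visible with the standard library. This is only a sketch, assuming Python 3: the SOURCE string and the "<example>" filename are made up for illustration, and normally the interpreter performs these steps for you behind the scenes.

```python
import ast

# Illustrative source text containing a loop (hypothetical example).
SOURCE = "total = 0\nfor i in range(1000):\n    total += i * 2\n"

# Step 1: parse the source text into an AST.
tree = ast.parse(SOURCE, filename="<example>", mode="exec")

# Step 2: convert the AST into a bytecode (code) object. This happens once.
code = compile(tree, filename="<example>", mode="exec")

# Step 3: run the bytecode on the VM. The loop body is not re-parsed on
# each iteration; the VM executes the already-compiled instructions.
namespace = {}
exec(code, namespace)

print(type(tree).__name__)  # Module
print(namespace["total"])   # 999000
```

The point of the sketch is that parsing and compiling each happen a single time, however many iterations the loop runs.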
Ruby 1.9 moved to YARV, which is also a VM-based approach. Ruby 1.9 is faster than 1.8. Here's a quote from the creator of YARV, Koichi Sasada:
At first, YARV is simple stack machine which run pseudo sequential instructions. Old interpreter (matzruby) traverses abstract syntax tree (AST) naively. Obviously it's slow. YARV compile that AST to YARV bytecode and run it.
An interesting point to note is that the Python VM is also stack based, just like YARV.
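To make the contrast between tree walking and a stack machine concrete, here is a toy sketch in Python. It is not how YARV, matzruby, or CPython are actually implemented: the opcode names (LOAD_CONST, LOAD_NAME, BINARY) and the functions walk, compile_expr, and run are invented for illustration, and only +, - and * on names and constants are supported.

```python
import ast
import operator

# Supported binary operators for this toy (assumption: arithmetic only).
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def walk(node, env):
    """Tree-walking evaluation: inspects node types again on every call."""
    if isinstance(node, ast.BinOp):
        return OPS[type(node.op)](walk(node.left, env), walk(node.right, env))
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.Constant):
        return node.value
    raise ValueError("unsupported node: %r" % (node,))


def compile_expr(node, out):
    """Flatten the tree once into a list of (opcode, argument) pairs."""
    if isinstance(node, ast.BinOp):
        compile_expr(node.left, out)
        compile_expr(node.right, out)
        out.append(("BINARY", OPS[type(node.op)]))
    elif isinstance(node, ast.Name):
        out.append(("LOAD_NAME", node.id))
    elif isinstance(node, ast.Constant):
        out.append(("LOAD_CONST", node.value))
    else:
        raise ValueError("unsupported node: %r" % (node,))
    return out


def run(program, env):
    """Execute the flat instruction list on a simple operand stack."""
    stack = []
    for opcode, arg in program:
        if opcode == "LOAD_CONST":
            stack.append(arg)
        elif opcode == "LOAD_NAME":
            stack.append(env[arg])
        else:  # "BINARY": pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(arg(left, right))
    return stack.pop()


tree = ast.parse("i + i * 3", mode="eval").body
program = compile_expr(tree, [])  # compiled once, reused for every i

results_walk = [walk(tree, {"i": i}) for i in range(5)]
results_vm = [run(program, {"i": i}) for i in range(5)]
print(results_walk)  # [0, 4, 8, 12, 16]
print(results_vm)    # [0, 4, 8, 12, 16]
```

Both evaluators give the same answers; the difference is that run() works from a flat instruction list produced once, while walk() re-dispatches on the tree structure each time. Since both are written in Python here, the sketch shows the structure only and makes no claim about relative speed.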
For Python I recommend heapy:
from guppy import hpy
h = hpy()
# heap() summarizes the reachable objects, grouped by kind
print(h.heap())
or Dowser or PySizer.
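If none of those third-party tools is installed, the standard-library tracemalloc module (Python 3.4 and later) gives a roughly comparable view of where memory is being allocated. This swaps in a different tool from the ones recommended above, and the list comprehension below is only a hypothetical workload standing in for the code under investigation.

```python
import tracemalloc

# Start recording allocations made from this point on.
tracemalloc.start()

# Hypothetical workload standing in for the code being investigated.
data = [str(i) * 10 for i in range(10_000)]

# Take a snapshot and group the recorded allocations by source line.
snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics("lineno")[:3]
for stat in top:
    print(stat)

# Current and peak traced memory, in bytes.
current, peak = tracemalloc.get_traced_memory()
print("current=%d bytes, peak=%d bytes" % (current, peak))

tracemalloc.stop()
```

Comparing two snapshots taken at different times with snapshot.compare_to() is the usual next step when hunting a leak.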
For Ruby you can use the BleakHouse plugin, or just read this answer on memory leak debugging (Ruby).
If you really need to write fast code in a language like this (and not in a language far better suited to CPU-intensive operations and close control over memory usage, such as C++), then I'd recommend pushing the bulk of the work out to Cython.
Cython is a language that makes writing C extensions for the Python language as easy as Python itself. Cython is based on the well-known Pyrex, but supports more cutting edge functionality and optimizations.
The Cython language is very close to the Python language, but Cython additionally supports calling C functions and declaring C types on variables and class attributes. This allows the compiler to generate very efficient C code from Cython code.
That way you can get most of the efficiency of C with most of the ease of use of Python.