In Python 3, range is an object that can generate a sequence of numbers; it does not store the numbers themselves. You may need to brush up on some basic Python reading, paying attention to things like lists and generators, and their differences.
In [359]: x = range(3)
In [360]: x
Out[360]: range(0, 3)
We have to use something like list or a list comprehension to actually create those numbers:
In [361]: list(x)
Out[361]: [0, 1, 2]
In [362]: [i for i in x]
Out[362]: [0, 1, 2]
A range is often used in a for i in range(3): print(i) kind of loop.
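To make the "object, not the actual sequence" point concrete, here is a minimal sketch using only built-ins. The variable names are just for illustration. It shows that a range answers length, indexing and membership questions without building a list, and that, unlike a generator, it can be iterated more than once:

```python
r = range(0, 10, 2)

# A range knows its length and supports indexing and slicing
# without ever building the full list of numbers.
length = len(r)
third = r[2]
tail = r[1:]            # slicing a range gives another range

# Membership tests work directly on the range object.
has_six = 6 in r
has_five = 5 in r

# A range can be iterated repeatedly.
first_pass = list(r)
second_pass = list(r)

# A generator, by contrast, is exhausted after one pass.
gen = (i for i in r)
gen_first = list(gen)
gen_second = list(gen)  # nothing left to yield

print(length, third, tail, has_six, has_five)
print(first_pass, second_pass)
print(gen_first, gen_second)
```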
arange is a numpy function that produces a numpy array:
In [363]: arr = np.arange(3)
In [364]: arr
Out[364]: array([0, 1, 2])
We can iterate on such an array, but it is slower than [362]:
In [365]: [i for i in arr]
Out[365]: [0, 1, 2]
But for doing math, the array is much better:
In [366]: arr * 10
Out[366]: array([ 0, 10, 20])
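As a small sketch of why the array is the better tool for math (assuming numpy is installed and imported as np, as in the session above): with a range or list you have to write the loop yourself, and `*` on a list means repetition rather than multiplication.

```python
import numpy as np

arr = np.arange(3)

# Element-wise multiplication, carried out inside numpy.
scaled_array = arr * 10

# The equivalent with a range needs an explicit Python-level loop.
scaled_list = [i * 10 for i in range(3)]

# On a plain list, `*` repeats the list instead of scaling it.
repeated = list(range(3)) * 2

print(scaled_array)
print(scaled_list)
print(repeated)
```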
The array can also be created from the list [361] (and for compatibility with earlier Py2 usage from the range itself):
In [376]: np.array(list(x)) # np.array(x)
Out[376]: array([0, 1, 2])
But this is slower than using arange directly (that's an implementation detail).
Despite the similarity in names, these shouldn't be seen as simple alternatives. Use range in basic Python constructs such as for loops and comprehensions. Use arange when you need an array.
An important innovation in Python (compared to earlier languages) is that we could iterate directly on a list. We didn't have to step through indices. And if we needed indices along with values we could use enumerate:
In [378]: alist = ['a','b','c']
In [379]: for i in range(3): print(alist[i]) # index iteration
a
b
c
In [380]: for v in alist: print(v) # iterate on list directly
a
b
c
In [381]: for i,v in enumerate(alist): print(i,v) # index and values
0 a
1 b
2 c
Thus you might not see range used that much in basic Python code.
The range type constructor creates range objects, which represent sequences of integers with a start, stop, and step in a space-efficient manner, calculating the values on the fly.
The np.arange function returns a numpy.ndarray object, which is essentially a wrapper around a primitive array. This is a fast and relatively compact representation compared to a Python list such as list(range(N)). But range objects are more space efficient still: they take constant space, so for all practical purposes range(a) is the same size as range(b) for any integers a, b.
As an aside, take care when interpreting the results of sys.getsizeof; you must understand what it is doing. So do not naively compare the sizes of Python lists and numpy.ndarray objects, for example.
Perhaps whatever you read was referring to Python 2, where range returned a list. List objects do require more space than numpy.ndarray objects, generally.
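The following is a rough sketch of the size comparison described above, assuming CPython and an installed numpy. The variable names are illustrative and the exact byte counts vary by platform and version, so none are shown. The point is what each measurement does and does not include:

```python
import sys
import numpy as np

n = 1_000_000

# A range stores only start, stop and step, so its reported size
# does not depend on how many numbers it represents.
small_range = sys.getsizeof(range(10))
large_range = sys.getsizeof(range(n))

# For a list, sys.getsizeof reports only the list object and its
# array of pointers, not the int objects those pointers refer to.
values = list(range(n))
list_container = sys.getsizeof(values)
# Adding the per-element sizes gives a rough total (it slightly
# overcounts, since CPython shares cached small ints).
list_with_items = list_container + sum(sys.getsizeof(v) for v in values)

# An ndarray keeps its numbers in one contiguous buffer;
# nbytes is the size of that buffer alone.
arr = np.arange(n)
array_data = arr.nbytes

print("range(10):        ", small_range)
print("range(n):         ", large_range)
print("list container:   ", list_container)
print("list with items:  ", list_with_items)
print("ndarray data:     ", array_data)
```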
I'm working with pandas and numpy and I can't seem to figure out the difference between range and arange. Is there any difference? Or is np.arange just a different implementation of the range method?
numpy.arange
numpy.arange([start, ]stop, [step, ]dtype=None)
Return evenly spaced values within a given interval.
Values are generated within the half-open interval [start, stop) (in other words, the interval including start but excluding stop). For integer arguments the function is equivalent to the Python built-in range function, but returns an ndarray rather than a list.
(Source)
So they are the same when using integers (except for the return type), but numpy's version can use other variable types.
If you're using Python 3, though, note that range has changed and now returns a lazy range object instead of a list. (Explained here.)
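A short sketch of the "same for integers, but numpy accepts other types" point, assuming numpy is imported as np. The specific start, stop and step values are arbitrary choices for illustration:

```python
import numpy as np

# With integer arguments, arange produces the same values as range.
int_array = np.arange(2, 10, 3)
same_values = int_array.tolist() == list(range(2, 10, 3))

# arange also accepts floating-point arguments.
float_array = np.arange(0.0, 1.0, 0.25)

# The built-in range only accepts integers.
try:
    range(0.0, 1.0, 0.25)
    range_accepts_floats = True
except TypeError:
    range_accepts_floats = False

print(int_array, same_values)
print(float_array)
print(range_accepts_floats)
```

With a non-integer step, the number of elements can be affected by floating-point rounding; the numpy documentation recommends numpy.linspace when the element count matters.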
range gives you a regular list (Python 2) or a specialized "range object" (lazy, like a generator; Python 3); np.arange gives you a numpy array. If you care about speed enough to use numpy, use numpy arrays.
For large arrays, a vectorised numpy operation is the fastest. If you must loop, prefer xrange/range and avoid using np.arange.
In numpy you should use combinations of vectorized calculations, ufuncs and indexing to solve your problems as it runs at C speed.
Looping over numpy arrays is inefficient compared to this.
(Something like the worst thing you could do would be to iterate over the array with an index created with range or np.arange as the first sentence in your question suggests, but I'm not sure if you really mean that.)
import numpy as np
import sys
sys.version
# out: '2.7.3rc2 (default, Mar 22 2012, 04:35:15) \n[GCC 4.6.3]'
np.version.version
# out: '1.6.2'
size = int(1E6)
%timeit for x in range(size): x ** 2
# out: 10 loops, best of 3: 136 ms per loop
%timeit for x in xrange(size): x ** 2
# out: 10 loops, best of 3: 88.9 ms per loop
# avoid this
%timeit for x in np.arange(size): x ** 2
#out: 1 loops, best of 3: 1.16 s per loop
# use this
%timeit np.arange(size) ** 2
#out: 100 loops, best of 3: 19.5 ms per loop
So in this case numpy is roughly 4.5 times faster than xrange (19.5 ms vs. 88.9 ms) if you do it right. Depending on your problem, the speedup from numpy can be much greater than 4 or 5 times.
The answers to this question explain some more advantages of using numpy arrays instead of python lists for large data sets.
First of all, as written by @bmu, you should use combinations of vectorized calculations, ufuncs and indexing. There are indeed some cases where explicit looping is required, but those are really rare.
If an explicit loop is needed with Python 2.6 or 2.7, you should use xrange (see below). From what you say, in Python 3 range behaves like xrange (it returns a lazy range object rather than a list), so range may be just as good for you.
Now, you should try it yourself (using timeit, here via the IPython "magic function"):
%timeit for i in range(1000000): pass
[out] 10 loops, best of 3: 63.6 ms per loop
%timeit for i in np.arange(1000000): pass
[out] 10 loops, best of 3: 158 ms per loop
%timeit for i in xrange(1000000): pass
[out] 10 loops, best of 3: 23.4 ms per loop
Again, as mentioned above, most of the time it is possible to use numpy vector/array formulas (or ufuncs etc.), which run at C speed: much faster. This is what we could call "vector programming". It makes programs easier to implement than in C (and more readable) but almost as fast in the end.
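The timings above come from Python 2 and IPython's %timeit. For anyone who wants to repeat the comparison on Python 3 outside IPython, here is a hedged sketch using the standard timeit module. The function names and the size are my own choices, and no timings are shown because they depend entirely on the machine and the versions installed:

```python
import timeit
import numpy as np

size = 10_000  # kept small so the squares fit in any default integer dtype

def loop_over_range():
    # Plain Python loop over a lazy range object.
    return [x ** 2 for x in range(size)]

def loop_over_arange():
    # Python-level loop over an array: each element is boxed into
    # a numpy scalar first, which is the pattern to avoid.
    return [x ** 2 for x in np.arange(size)]

def vectorized():
    # The whole computation happens inside numpy.
    return np.arange(size) ** 2

repeats = 5
for func in (loop_over_range, loop_over_arange, vectorized):
    seconds = timeit.timeit(func, number=repeats)
    print(f"{func.__name__}: {seconds / repeats * 1e3:.3f} ms per call")
```

All three functions compute the same squares, so any difference in the printed numbers reflects only how the iteration is carried out.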