You are close, you want to use np.tile, but like this:
a = np.array([0,1,2])
np.tile(a,(3,1))
Result:
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])
If you call np.tile(a, 3) you will get the concatenation behavior you were seeing:
array([0, 1, 2, 0, 1, 2, 0, 1, 2])
http://docs.scipy.org/doc/numpy/reference/generated/numpy.tile.html
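For what it's worth, reps can tile along several axes at once; a quick sketch with the same a:
np.tile(a, (2, 3))
Result:
array([[0, 1, 2, 0, 1, 2, 0, 1, 2],
       [0, 1, 2, 0, 1, 2, 0, 1, 2]])
i.e. a is tiled twice along the new first axis and three times along the second.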
You could use vstack:
numpy.vstack([X]*N)
or array (credit to bluenote10 below):
numpy.array([X]*N)
e.g.
>>> import numpy as np
>>> X = np.array([1,2,3,4])
>>> N = 7
>>> np.vstack([X]*N)
array([[1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4]])
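To convince yourself the two spellings agree, a quick check with the same X and N:
>>> np.array_equal(np.vstack([X]*N), np.array([X]*N))
True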
One can use the np.repeat method together with np.newaxis:
import numpy as np
test = np.random.randn(40,40,3)
result = np.repeat(test[np.newaxis,...], 10, axis=0)
print(result.shape)
# (10, 40, 40, 3)
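Note that np.newaxis is just an alias for None, so the following sketch is equivalent:
result = np.repeat(test[None, ...], 10, axis=0)
print(result.shape)
# (10, 40, 40, 3)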
Assuming you're looking to copy the values 10 times, you can just stack 10 of the array:
def repeat(arr, count):
    return np.stack([arr for _ in range(count)], axis=0)
axis=0 is actually the default, so it's not really necessary here, but I think it makes it clearer that you're adding the new axis on the front.
In fact, this is pretty much identical to what the examples for stack are doing:
>>> arrays = [np.random.randn(3, 4) for _ in range(10)]
>>> np.stack(arrays, axis=0).shape
(10, 3, 4)
At first glance you might think repeat or tile would be a better fit.
But repeat is about repeating over an existing axis (or flattening the array), so you'd need to reshape either before or after. (Which is just as efficient, but I think not as simple.)
And tile (assuming you use an array-like reps; with a scalar reps it's basically repeat) is about filling out a multidimensional spec in all directions, which is much more complex than what you want for this simple case.
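To make the tile comparison concrete, here's a rough sketch of the multidimensional spec you'd have to spell out for a (3, 4) arr like the one in the stack example above:
>>> np.tile(arr, (10, 1, 1)).shape
(10, 3, 4)
tile prepends length-1 axes until arr matches the length of reps, so this works, but you have to write out a 1 for every existing dimension.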
All of these options will be equally efficient. They all copy the data 10 times over, which is the expensive part; the cost of any internal processing, building tiny intermediate objects, etc. is irrelevant. The only way to make it faster is to avoid copying. Which you probably don't want to do.
But if you do... To share row storage across the 10 copies, you probably want broadcast_to:
def repeat(arr, count):
    return np.broadcast_to(arr, (count,) + arr.shape)
Notice that broadcast_to doesn't actually guarantee that it avoids copying, just that it returns some kind of read-only view where "more than one element of a broadcasted array may refer to a single memory location". In practice, it's going to avoid copying. If you actually need that to be guaranteed for some reason (or if you want a writable view, which is usually going to be a terrible idea, but maybe you have a good reason...), you have to drop down to as_strided:
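You can see the sharing in the strides. A quick sketch (the stride values assume a 1D int64 array on a typical 64-bit build):
>>> a = np.array([1, 2, 3])
>>> b = np.broadcast_to(a, (10,) + a.shape)
>>> b.strides
(0, 8)
>>> b.flags.writeable
False
The 0 stride on the first axis means all 10 "rows" alias the same three elements.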
def repeat(arr, count):
    shape = (count,) + arr.shape
    strides = (0,) + arr.strides
    return np.lib.stride_tricks.as_strided(
        arr, shape=shape, strides=strides, writeable=False)
Notice that half the docs for as_strided are warning that you probably shouldn't use it, and the other half are warning that you definitely shouldn't use it for writable views, so... make sure this is what you want before doing it.
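As a quick demonstration that the as_strided version really does alias the original storage (a sketch, not something to build on):
>>> arr = np.array([1, 2, 3])
>>> view = repeat(arr, 4)
>>> arr[0] = 99
>>> view[:, 0]
array([99, 99, 99, 99])
Every row reflects the change, because all of them point at arr's memory.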
Use np.repeat with the parameter axis=0, as follows:
a = np.array([[2, 3],[5, 6],[7, 9]])
print(a)
[[2 3]
 [5 6]
 [7 9]]
r_a = np.repeat(a, repeats=3, axis=0)
print(r_a)
[[2 3]
 [2 3]
 [2 3]
 [5 6]
 [5 6]
 [5 6]
 [7 9]
 [7 9]
 [7 9]]
If your input is a vector, use atleast_2d first.
a = np.atleast_2d([2, 3]).repeat(repeats=3, axis=0)
print(a)
# [[2 3]
#  [2 3]
#  [2 3]]
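For completeness, the same call with axis=1 repeats along columns instead:
r_a = np.repeat(a, repeats=3, axis=1)
print(r_a)
# [[2 2 2 3 3 3]
#  [5 5 5 6 6 6]
#  [7 7 7 9 9 9]]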
Use the timeit module in Python to test the timings.
from copy import copy

a = range(1000)

def cop():
    b = copy(a)

def func1():
    b = list(a)

def slice_copy():
    b = a[:]

def slice_len():
    b = a[0:len(a)]

if __name__ == "__main__":
    import timeit
    print "copy(a)", timeit.timeit("cop()", setup="from __main__ import cop")
    print "list(a)", timeit.timeit("func1()", setup="from __main__ import func1")
    print "a[:]", timeit.timeit("slice_copy()", setup="from __main__ import slice_copy")
    print "a[0:len(a)]", timeit.timeit("slice_len()", setup="from __main__ import slice_len")
Results:
copy(a) 3.98940896988
list(a) 2.54542589188
a[:] 1.96630120277 #winner
a[0:len(a)] 10.5431251526
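Note that the snippet above is Python 2 (print statements, and range returns a list). A rough Python 3 equivalent, passing callables to timeit instead of setup strings, might look like this (timings will differ from the ones above):
import timeit
from copy import copy

a = list(range(1000))

print("copy(a)    ", timeit.timeit(lambda: copy(a)))
print("list(a)    ", timeit.timeit(lambda: list(a)))
print("a[:]       ", timeit.timeit(lambda: a[:]))
print("a[0:len(a)]", timeit.timeit(lambda: a[0:len(a)]))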
The extra steps involved in a[0:len(a)] are surely the reason for its slowness.
Here's the byte code comparison of the two:
In [19]: dis.dis(func1)
2 0 LOAD_GLOBAL 0 (range)
3 LOAD_CONST 1 (100000)
6 CALL_FUNCTION 1
9 STORE_FAST 0 (a)
3 12 LOAD_FAST 0 (a)
15 SLICE+0
16 STORE_FAST 1 (b)
19 LOAD_CONST 0 (None)
22 RETURN_VALUE
In [20]: dis.dis(func2)
2 0 LOAD_GLOBAL 0 (range)
3 LOAD_CONST 1 (100000)
6 CALL_FUNCTION 1
9 STORE_FAST 0 (a)
3 12 LOAD_FAST 0 (a) #same up to here
15 LOAD_CONST 2 (0) #loads 0
18 LOAD_GLOBAL 1 (len) # loads the builtin len(),
# so it might take some lookup time
21 LOAD_FAST 0 (a)
24 CALL_FUNCTION 1
27 SLICE+3
28 STORE_FAST 1 (b)
31 LOAD_CONST 0 (None)
34 RETURN_VALUE
I can't comment on the ruby timing vs. the python timing. But I can comment on list vs. slice. Here's a quick inspection of the bytecode:
>>> import dis
>>> a = range(10)
>>> def func(a):
... return a[:]
...
>>> def func2(a):
... return list(a)
...
>>> dis.dis(func)
2 0 LOAD_FAST 0 (a)
3 SLICE+0
4 RETURN_VALUE
>>> dis.dis(func2)
2 0 LOAD_GLOBAL 0 (list)
3 LOAD_FAST 0 (a)
6 CALL_FUNCTION 1
9 RETURN_VALUE
Notice that list requires a LOAD_GLOBAL to find the function list. Looking up globals (and calling functions) in Python is relatively slow. This would explain why a[0:len(a)] is also slower. Also remember that list needs to be able to handle arbitrary iterators, whereas slicing doesn't. This means that list needs to allocate a new list, pack elements into that list as it iterates, and resize when necessary. There are a few things in here which are expensive -- resizing if necessary, and iterating (effectively in Python, not C). With the slicing method, you can calculate the size of the memory you'll need and so can probably avoid resizing, and the iteration can be done completely in C (probably with a memcpy or something).
Disclaimer: I'm not a Python dev, so I don't know how the internals of list() are implemented for sure. I'm just speculating based on what I know of the specification.
EDIT -- So I've looked at the source (with a little guidance from Martijn). The relevant code is in listobject.c. list calls list_init which then calls listextend at line 799. That function has some checks to see if it can use a fast branch if the object is a list or a tuple (line 812). Finally, the heavy lifting is done starting at line 834:
src = PySequence_Fast_ITEMS(b);
dest = self->ob_item + m;
for (i = 0; i < n; i++) {
    PyObject *o = src[i];
    Py_INCREF(o);
    dest[i] = o;
}
Compare that to the slice version which I think is defined in list_subscript (line 2544). That calls list_slice (line 2570) where the heavy lifting is done by the following loop (line 486):
src = a->ob_item + ilow;
dest = np->ob_item;
for (i = 0; i < len; i++) {
    PyObject *v = src[i];
    Py_INCREF(v);
    dest[i] = v;
}
They're pretty much the same code, so it's not surprising that the performance is almost the same for large lists (where the overhead of the small stuff like unpacking slices, looking up global variables, etc. becomes less important).
Here's how I would run the python tests (and the results for my Ubuntu system):
$ python -m timeit -s 'a=range(30)' 'list(a)'
1000000 loops, best of 3: 0.39 usec per loop
$ python -m timeit -s 'a=range(30)' 'a[:]'
10000000 loops, best of 3: 0.183 usec per loop
$ python -m timeit -s 'a=range(30)' 'a[0:len(a)]'
1000000 loops, best of 3: 0.254 usec per loop
As of numpy version 1.9.0, np.unique has an argument return_counts which greatly simplifies your task:
u, c = np.unique(a, return_counts=True)
dup = u[c > 1]
This is similar to using Counter, except you get a pair of arrays instead of a mapping. I'd be curious to see how they perform relative to each other.
It's probably worth mentioning that even though np.unique is quite fast in practice due to its numpyness, it has worse algorithmic complexity than the Counter solution. np.unique is sort-based, so runs asymptotically in O(n log n) time. Counter is hash-based, so has O(n) complexity. This will not matter much for anything but the largest datasets.
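A quick worked example (using the same array as the Counter answer below):
>>> a = np.array([1, 2, 1, 3, 3, 3, 0])
>>> u, c = np.unique(a, return_counts=True)
>>> u[c > 1]
array([1, 3])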
I think this is most clearly done outside of numpy. You'll have to time it against your numpy solutions if you are concerned with speed.
>>> import numpy as np
>>> from collections import Counter
>>> a = np.array([1, 2, 1, 3, 3, 3, 0])
>>> [item for item, count in Counter(a).items() if count > 1]
[1, 3]
Note: this is similar to Burhan Khalid's answer, but the use of items without subscripting in the condition should be faster.
The ideal way is probably numpy.repeat:
In [16]:
import numpy as np
x1=[1,2,3,4]
In [17]:
np.repeat(x1,3)
Out[17]:
array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4])
In case you really want the result as a list, and a generator is not sufficient:
import itertools
lst = range(1,5)
list(itertools.chain.from_iterable(itertools.repeat(x, 3) for x in lst))
Out[8]: [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
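If you want to combine the two approaches: np.repeat also accepts a plain list, and .tolist() converts the result back, so (assuming np is imported as above) this sketch gives the same list:
np.repeat(lst, 3).tolist()
# [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]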
We can use np.broadcast_to -
np.broadcast_to(a,(10,)+a.shape).copy() # a is input array
If we are okay with a view instead, skip .copy() for a virtually free runtime and zero memory overhead.
We can also use np.repeat -
np.repeat(a[None],10,axis=0)
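Both spellings produce the same values; only the copy/view behavior differs. A quick consistency check on a small hypothetical input:
a = np.arange(6).reshape(2, 3)
np.array_equal(np.broadcast_to(a, (10,) + a.shape), np.repeat(a[None], 10, axis=0))
# True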
You can use np.resize, which will tile if the new size is larger than the old one:
array = np.ones((320, 320, 3))
new_array = np.resize(array, (10, *array.shape))
print(new_array.shape)
# (10, 320, 320, 3)
From the docs:
numpy.resize(a, new_shape): If the new array is larger than the original array, then the new array is filled with repeated copies of a.
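A minimal sketch of that filling behavior on a small array:
np.resize(np.array([1, 2, 3]), (2, 3))
# array([[1, 2, 3],
#        [1, 2, 3]])
One caveat worth knowing: np.resize repeats the flattened input in C order, so if the new size isn't a multiple of the old one the rows get misaligned, e.g. np.resize(np.array([1, 2, 3]), (2, 2)) gives array([[1, 2], [3, 1]]).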