According to the NumPy tutorial, the correct way to do it is:
a[tuple(b)]
Answer from JoshAdel on Stack Overflow
Suppose you want to access a subvector of a with n index pairs stored in b like so:
b = array([[0, 0],
           ...
           [1, 1]])
This can be done as follows:
a[b[:,0], b[:,1]]
For a single pair index vector this changes to a[b[0],b[1]], but I guess the tuple approach is easier to read and hence preferable.
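To make the two forms concrete, here is a minimal sketch. The array a and the index values are made up for illustration and are not taken from the original question.

```python
import numpy as np

# Hypothetical 3x4 test array; each value equals its flat position.
a = np.arange(12).reshape(3, 4)

# A single (row, col) pair stored as an array: convert it to a tuple.
b = np.array([1, 2])
single = a[tuple(b)]                  # same element as a[1, 2]

# n index pairs, one pair per row: index each axis with its own column.
pairs = np.array([[0, 0],
                  [1, 1],
                  [2, 3]])
many = a[pairs[:, 0], pairs[:, 1]]    # elements a[0, 0], a[1, 1], a[2, 3]
```

The tuple form reads well for one pair, while the column form is the one that scales to many pairs at once.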
Indexing one array with another array behaves differently than indexing with the same data left as a list of lists (i.e. without explicitly casting it to a numpy array first). I can't find the pages in the documentation that explain this kind of indexing.
Example:
#make a 5x5 matrix for testing, the numbers arent important
a = np.random.rand(5,5)
#another arbitrary 5x5 matrix
b = [[0, 0, 0, 0, 1],
     [0, 0, 0, 1, 1],
     [0, 0, 1, 1, 0],
     [0, 1, 1, 0, 0],
     [1, 1, 0, 0, 0]]
c = np.array(b)
a[b] #gives the error "too many indices for array: array is 2-dimensional, but 5 were indexed"
a[tuple(c)] #gives the same error as a[b]
a[c] #for some reason this works, and it returns a 5x5x5 matrix
So the behavior changes when I convert the list of lists to a numpy array. And I can't really tell what it's doing by looking at the output of a[c]. It seems to be switching the rows around somehow, but I'm confused about why it returns five copies of the original matrix. Is there any page in the documentation that describes this type of indexing?
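One way to see what a[c] is doing is to compare shapes and individual entries. The sketch below assumes c is an integer ndarray as in the example. It leaves out a[b] with the plain list, because how NumPy interprets a list of lists as an index appears to depend on the NumPy version.

```python
import numpy as np

a = np.random.rand(5, 5)
c = np.array([[0, 0, 0, 0, 1],
              [0, 0, 0, 1, 1],
              [0, 0, 1, 1, 0],
              [0, 1, 1, 0, 0],
              [1, 1, 0, 0, 0]])

# An integer array index selects along the first axis only, so every
# entry c[i, j] is replaced by the whole row a[c[i, j]].
picked = a[c]
result_shape = picked.shape                          # c.shape + a.shape[1:]
rows_match = np.array_equal(picked[0, 4], a[c[0, 4]])

# tuple(c) is a tuple of 5 row arrays, i.e. 5 separate indices for a
# 2-dimensional array, which is why it raises an error.
try:
    a[tuple(c)]
    tuple_fails = False
except IndexError:
    tuple_fails = True
```

Since c contains only 0 and 1, the result is built entirely from rows 0 and 1 of a, which would explain why it looks like rearranged copies of the same rows.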
It can be done with array indexing but it doesn't feel natural.
import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
d = np.array([2, 1, 3])
col_ix = [ 0, 0, 1, 1, 1, 2 ] # column ix for each item to change
row_ix = [ 2, 3, 1, 2, 3, 3 ] # row index for each item to change
a[ row_ix, col_ix ] = 0
a
# array([[1, 2, 3],
# [4, 0, 6],
# [0, 0, 9],
# [0, 0, 0]])
With a for loop
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
for ix_col, ix_row in enumerate( d ): # iterate across the columns
    a[ ix_row:, ix_col ] = 0
a
# array([[1, 2, 3],
# [4, 0, 6],
# [0, 0, 9],
# [0, 0, 0]])
A widely used approach for this kind of problem is to construct a boolean mask, comparing the index array with the appropriate arange:
In [619]: mask = np.arange(4)[:,None]>=d
In [620]: mask
Out[620]:
array([[False, False, False],
[False, True, False],
[ True, True, False],
[ True, True, True]])
In [621]: a[mask]
Out[621]: array([ 5, 7, 8, 10, 11, 12])
In [622]: a[mask] = 0
In [623]: a
Out[623]:
array([[1, 2, 3],
[4, 0, 6],
[0, 0, 9],
[0, 0, 0]])
That's not necessarily faster than a row (or in this case column) iteration. Since slicing is basic indexing, it may be faster, even if done several times.
In [624]: for i,v in enumerate(d):
     ...:     print(a[v:,i])
     ...:
[0 0]
[0 0 0]
[0]
Generally if a result involves multiple arrays or lists with different lengths, there isn't a "neat" multidimensional solution. Either iterate over those lists, or step back and "think outside the box".
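As a quick check that the boolean mask and the per-column slice loop agree, the sketch below applies both to copies of the same array. The helper names zero_with_mask and zero_with_loop are made up for this example.

```python
import numpy as np

def zero_with_mask(a, d):
    """Zero a[i, j] for every row i >= d[j], using one boolean mask."""
    out = a.copy()
    mask = np.arange(out.shape[0])[:, None] >= d   # shape (rows, cols)
    out[mask] = 0
    return out

def zero_with_loop(a, d):
    """Same result, using one basic slice assignment per column."""
    out = a.copy()
    for col, start in enumerate(d):
        out[start:, col] = 0
    return out

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
d = np.array([2, 1, 3])
masked = zero_with_mask(a, d)
looped = zero_with_loop(a, d)
```

Which one is faster likely depends on the array shape, so timing both on realistic data is the safer way to choose.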
I want to suggest a one-line solution:
indices = np.where(np.in1d(x, y))[0]
The result is an array of indices into x for the elements of x that also occur in y.
You can skip numpy.where if a boolean mask is enough.
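A minimal sketch of both variants, with made-up x and y. It uses np.isin, which newer NumPy releases recommend in place of np.in1d; for 1-D input the two should produce the same mask.

```python
import numpy as np

x = np.array([3, 5, 7, 1, 9, 8, 6, 6])
y = np.array([2, 1, 5, 10, 100, 6])

mask = np.isin(x, y)           # True where an element of x occurs in y
indices = np.where(mask)[0]    # positions in x, in x's order
values = x[mask]               # the mask can also index x directly
```

Note that the indices follow the order of x, not the order of y, and duplicates in x are all reported.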
As Joe Kington said, searchsorted() can find elements very quickly. To deal with elements of y that are not in x, compare the looked-up values with the original y and create a masked array:
import numpy as np
x = np.array([3,5,7,1,9,8,6,6])
y = np.array([2,1,5,10,100,6])
index = np.argsort(x)
sorted_x = x[index]
sorted_index = np.searchsorted(sorted_x, y)
yindex = np.take(index, sorted_index, mode="clip")
mask = x[yindex] != y
result = np.ma.array(yindex, mask=mask)
print(result)
The result is:
[-- 3 1 -- -- 6]
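If a masked array is more than you need, the same lookup can be wrapped in a small function that returns -1 for missing values. This is only a sketch: the function name find_indices and the -1 sentinel are choices made for this example, not part of the answer above.

```python
import numpy as np

def find_indices(x, y):
    """For each value in y, return an index into x, or -1 if absent."""
    order = np.argsort(x, kind="stable")        # stable keeps ties in x's order
    pos = np.searchsorted(x[order], y)          # insertion points in sorted x
    candidate = np.take(order, pos, mode="clip")
    found = x[candidate] == y                   # reject values that are not in x
    return np.where(found, candidate, -1)

x = np.array([3, 5, 7, 1, 9, 8, 6, 6])
y = np.array([2, 1, 5, 10, 100, 6])
idx = find_indices(x, y)
```

With a stable sort, a value that occurs several times in x resolves to its first occurrence.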
Just use a tuple:
>>> A[(3, 1)]
8
>>> A[tuple(ind)]
8
The A[] actually calls the special method __getitem__:
>>> A.__getitem__((3, 1))
8
and using a comma creates a tuple:
>>> 3, 1
(3, 1)
Putting these two basic Python principles together solves your problem.
You can store your index in a tuple in the first place, if you don't need NumPy array features for it.
That is because by passing an array you are actually asking for
A[[3,1]]
which returns rows 3 and 1 of the 2D array, not the single element at row 3, column 1 that you want.
You can use
A[ind[0],ind[1]]
You can also use (if you want several indexes at the same time):
A[indx,indy]
where indx and indy are numpy arrays of indexes for the first and second dimension respectively.
See here for all possible indexing methods for numpy arrays: http://docs.scipy.org/doc/numpy-1.10.1/user/basics.indexing.html
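The difference between the tuple, array, and paired-array forms can be seen side by side in the sketch below. The array A and the index values are hypothetical, so the numbers differ from the 8 shown in the session above.

```python
import numpy as np

A = np.arange(12).reshape(4, 3)   # hypothetical 4x3 array
ind = np.array([3, 1])

element = A[tuple(ind)]       # one element: row 3, column 1
same = A[ind[0], ind[1]]      # equivalent explicit form
rows = A[ind]                 # rows 3 and 1, shape (2, 3)

indx = np.array([0, 3])
indy = np.array([2, 1])
several = A[indx, indy]       # elements A[0, 2] and A[3, 1]
```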
Just use it like:
print(arr_2[arr_idx_num_3])
output:
>>> [40 50 70]
A simple for loop should do the trick.
import numpy as np
arr_1 = np.array([5, 1, 6, 3, 3, 10, 3, 6, 12])
arr_2 = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
idx_num = 3
arr_idx_num = []
for i in range(len(arr_1)):
    if arr_1[i] == idx_num:
        arr_idx_num.append(arr_2[i])
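The same selection can also be written without a loop by indexing arr_2 with a boolean mask built from arr_1. A minimal sketch, where the name selected is introduced only for this example:

```python
import numpy as np

arr_1 = np.array([5, 1, 6, 3, 3, 10, 3, 6, 12])
arr_2 = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
idx_num = 3

# arr_1 == idx_num is a boolean array; it keeps the matching positions of arr_2.
selected = arr_2[arr_1 == idx_num]
```

This returns an ndarray rather than a list, which is usually what later NumPy code expects.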
The numpy way to do this is by using np.choose or fancy indexing/take (see below):
m = np.array([[1, 2],
              [4, 5],
              [7, 8],
              [6, 2]])
select = np.array([0,1,0,0])
result = np.choose(select, m.T)
So there is no need for Python loops, and you keep all the speed advantages NumPy gives you. m.T is needed because choose is really a choice between the two arrays, np.choose(select, (m[:,0], m[:,1])), but it's straightforward to use it like this.
Using fancy indexing:
result = m[np.arange(len(select)), select]
And if speed is very important, there is np.take, which works on a 1D view (it's quite a bit faster for some reason, but maybe not for these tiny arrays):
result = m.take(select+np.arange(0, len(select) * m.shape[1], m.shape[1]))
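The three approaches can be checked against each other on the same data. The sketch below also includes np.take_along_axis as a fourth option; that function is not mentioned in the answer and is added here only for comparison.

```python
import numpy as np

m = np.array([[1, 2], [4, 5], [7, 8], [6, 2]])
select = np.array([0, 1, 0, 0])          # column to pick in each row

by_choose = np.choose(select, m.T)
by_fancy = m[np.arange(len(select)), select]

# Flat index of element (i, select[i]) is i * n_cols + select[i].
flat_idx = select + np.arange(0, m.size, m.shape[1])
by_take = m.take(flat_idx)

# take_along_axis needs an index array with the same number of dimensions as m.
by_along = np.take_along_axis(m, select[:, None], axis=1)[:, 0]
```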
I prefer to use NP.where for indexing tasks of this sort (rather than NP.ix_)
What is not mentioned in the OP is whether the result is selected by location (row/col in the source array) or by some condition (e.g., m >= 5). In any event, the code snippet below covers both scenarios.
Three steps:
create the condition array;
generate an index array by calling NP.where, passing in this condition array; and
apply this index array against the source array
>>> import numpy as NP
>>> cnd = (m==1) | (m==5) | (m==7) | (m==6)
>>> cnd
matrix([[ True, False],
[False, True],
[ True, False],
[ True, False]], dtype=bool)
>>> # generate the index array/matrix
>>> # by calling NP.where, passing in the condition (cnd)
>>> ndx = NP.where(cnd)
>>> ndx
(matrix([[0, 1, 2, 3]]), matrix([[0, 1, 0, 0]]))
>>> # now apply it against the source array
>>> m[ndx]
matrix([[1, 5, 7, 6]])
The argument passed to NP.where, cnd, is a boolean array, which in this case is the result of a single expression composed of several conditions joined with | (the first line above).
If constructing such a value filter doesn't apply to your particular use case, that's fine. You just need to generate the actual boolean matrix (the value of cnd) some other way, or create it directly.
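The three steps can be sketched with a plain ndarray instead of a matrix. This assumes m holds the same values as in the np.choose example; with an ndarray the result comes back 1-D rather than as a 1xN matrix.

```python
import numpy as np

m = np.array([[1, 2], [4, 5], [7, 8], [6, 2]])

# 1. create the condition array
cnd = (m == 1) | (m == 5) | (m == 7) | (m == 6)

# 2. generate the index arrays (one per axis) from the condition
ndx = np.where(cnd)

# 3. apply the index arrays to the source array
by_where = m[ndx]

# The boolean array can also index m directly, skipping step 2.
by_mask = m[cnd]
```

Going through np.where is mainly useful when you need the row and column positions themselves, not just the selected values.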