According to the NumPy tutorial, the correct way to do it is:
a[tuple(b)]
Answer from JoshAdel on Stack Overflow
Suppose you want to access a subvector of a with n index pairs stored in b like so:
b = array([[0, 0],
...
[1, 1]])
This can be done as follows:
a[b[:,0], b[:,1]]
For a single index pair this becomes a[b[0], b[1]], but I think the tuple approach is easier to read and hence preferable.
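The two cases above can be sketched as follows. The 3x3 array and the index values are made up for illustration and are not from the original answer.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)   # hypothetical example data

# Single index pair: converting it to a tuple addresses one element.
b_single = np.array([1, 2])
single = a[tuple(b_single)]      # equivalent to a[1, 2]

# n index pairs, one pair per row: index each axis with its own column.
b_pairs = np.array([[0, 0],
                    [1, 2],
                    [2, 1]])
picked = a[b_pairs[:, 0], b_pairs[:, 1]]

# Transposing first lets the tuple form handle the n-pair case as well.
picked_via_tuple = a[tuple(b_pairs.T)]

print(single, picked, picked_via_tuple)
```

Note that a[tuple(b)] only addresses single elements when b has one entry per axis, which is why the n-pair array is transposed before converting it.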
Indexing one array with another array behaves differently from indexing with the same data left as a list of lists (that is, without explicitly converting it to a NumPy array first). I can't find the pages in the documentation that explain this kind of indexing.
Example:
# make a 5x5 matrix for testing; the numbers aren't important
a = np.random.rand(5,5)
# another arbitrary 5x5 matrix
b = [[0, 0, 0, 0, 1],
     [0, 0, 0, 1, 1],
     [0, 0, 1, 1, 0],
     [0, 1, 1, 0, 0],
     [1, 1, 0, 0, 0]]
c = np.array(b)
a[b] # gives the error "too many indices for array: array is 2-dimensional, but 5 were indexed"
a[tuple(c)] # gives the same error as a[b]
a[c] # for some reason this works, and it returns a 5x5x5 matrix
So the behavior changes when I convert the list of lists to a NumPy array, and I can't really tell what it's doing by looking at the output of a[c]. It seems to be switching the rows around somehow, but I'm confused about why it returns five copies of the original matrix. Is there any page in the documentation that describes this type of indexing?
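As a rough sketch of what is going on: an integer array used as a single index selects along the first axis only, so a[c] looks up a whole row of a for every entry of c. The variable names out and tuple_failed are introduced here for illustration. The a[b] case is left out because, as far as I know, how a plain list of lists is interpreted has changed between NumPy versions.

```python
import numpy as np

a = np.random.rand(5, 5)
c = np.array([[0, 0, 0, 0, 1],
              [0, 0, 0, 1, 1],
              [0, 0, 1, 1, 0],
              [0, 1, 1, 0, 0],
              [1, 1, 0, 0, 0]])

# One integer array index applies to axis 0 only:
# out[i, j] is the full row a[c[i, j]], so the shape is c.shape + a.shape[1:].
out = a[c]
print(out.shape)

# tuple(c) is five separate index arrays, one per axis,
# which is too many for a two-dimensional array.
try:
    a[tuple(c)]
    tuple_failed = False
except IndexError:
    tuple_failed = True
print(tuple_failed)
```

Since c contains only 0 and 1, every slice of the output is built from rows 0 and 1 of a, which may explain why it looks like shuffled copies of the original.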
It can be done with array indexing but it doesn't feel natural.
import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
d = np.array([2, 1, 3])
col_ix = [ 0, 0, 1, 1, 1, 2 ] # column ix for each item to change
row_ix = [ 2, 3, 1, 2, 3, 3 ] # row index for each item to change
a[ row_ix, col_ix ] = 0
a
# array([[1, 2, 3],
# [4, 0, 6],
# [0, 0, 9],
# [0, 0, 0]])
With a for loop
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
for ix_col, ix_row in enumerate( d ): # iterate across the columns
    a[ ix_row:, ix_col ] = 0
a
# array([[1, 2, 3],
# [4, 0, 6],
# [0, 0, 9],
# [0, 0, 0]])
A widely used approach for this kind of problem is to construct a boolean mask, comparing the index array with the appropriate arange:
In [619]: mask = np.arange(4)[:,None]>=d
In [620]: mask
Out[620]:
array([[False, False, False],
[False, True, False],
[ True, True, False],
[ True, True, True]])
In [621]: a[mask]
Out[621]: array([ 5, 7, 8, 10, 11, 12])
In [622]: a[mask] = 0
In [623]: a
Out[623]:
array([[1, 2, 3],
[4, 0, 6],
[0, 0, 9],
[0, 0, 0]])
The mask is not necessarily faster than iterating over rows (or in this case columns). Since slicing is basic indexing, the loop may be faster, even though it slices several times.
In [624]: for i,v in enumerate(d):
     ...:     print(a[v:,i])
     ...:
[0 0]
[0 0 0]
[0]
Generally if a result involves multiple arrays or lists with different lengths, there isn't a "neat" multidimensional solution. Either iterate over those lists, or step back and "think outside the box".
Just use a tuple:
>>> A[(3, 1)]
8
>>> A[tuple(ind)]
8
The A[...] syntax actually calls the special method __getitem__:
>>> A.__getitem__((3, 1))
8
and using a comma creates a tuple:
>>> 3, 1
(3, 1)
Putting these two basic Python principles together solves your problem.
You can store your index in a tuple in the first place, if you don't need NumPy array features for it.
That is because by passing an array you are actually asking for
A[[3,1]]
which selects the rows at index 3 and index 1 of the 2-D array, rather than the single element at row 3, column 1 that you want.
You can use
A[ind[0],ind[1]]
You can also use the following if you want several indices at the same time:
A[indx,indy]
where indx and indy are NumPy arrays of indices for the first and second dimensions respectively.
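A small sketch of the three forms side by side. The 4x3 array A and the index values are hypothetical, chosen only so the results are easy to check by eye.

```python
import numpy as np

A = np.arange(12).reshape(4, 3)   # hypothetical 4x3 array
ind = np.array([3, 1])

rows = A[ind]                # rows 3 and 1, shape (2, 3)
elem = A[ind[0], ind[1]]     # single element at row 3, column 1

indx = np.array([0, 3])      # row indices
indy = np.array([2, 1])      # column indices
several = A[indx, indy]      # elements at (0, 2) and (3, 1)

print(rows)
print(elem, several)
```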
See here for all possible indexing methods for numpy arrays: http://docs.scipy.org/doc/numpy-1.10.1/user/basics.indexing.html
I want to suggest a one-line solution:
indices = np.where(np.in1d(x, y))[0]
The result is an array of indices into x for the elements of x that also occur in y.
You can drop numpy.where if a boolean mask is all you need.
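A minimal sketch of both variants, using np.isin, which the NumPy documentation recommends over np.in1d in new code. The x and y values are borrowed from the searchsorted example elsewhere on this page.

```python
import numpy as np

x = np.array([3, 5, 7, 1, 9, 8, 6, 6])
y = np.array([2, 1, 5, 10, 100, 6])

mask = np.isin(x, y)          # True where an element of x occurs in y
indices = np.where(mask)[0]   # positions in x of those elements

print(mask)
print(indices)
print(x[mask])                # the boolean mask can index x directly
```

Unlike the searchsorted approach, this reports every matching position in x, including duplicates, and the result is ordered by x rather than by y.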
As Joe Kington said, searchsorted() can search for elements very quickly. To deal with elements of y that are not in x, you can compare the search result against the original y and create a masked array:
import numpy as np
x = np.array([3,5,7,1,9,8,6,6])
y = np.array([2,1,5,10,100,6])
index = np.argsort(x)
sorted_x = x[index]
sorted_index = np.searchsorted(sorted_x, y)
yindex = np.take(index, sorted_index, mode="clip")
mask = x[yindex] != y
result = np.ma.array(yindex, mask=mask)
print(result)
The result is:
[-- 3 1 -- -- 6]
Just use it like:
print(arr_2[arr_idx_num_3])
output:
>>> [40 50 70]
A simple for loop should do the trick.
import numpy as np
arr_1 = np.array([5, 1, 6, 3, 3, 10, 3, 6, 12])
arr_2 = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
idx_num = 3
arr_idx_num = []
for i in range(len(arr_1)):
    if arr_1[i] == idx_num:
        arr_idx_num.append(arr_2[i])
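For comparison, the same selection can probably be written without a loop by using a boolean mask. This is a sketch of an alternative and not part of the original answer.

```python
import numpy as np

arr_1 = np.array([5, 1, 6, 3, 3, 10, 3, 6, 12])
arr_2 = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
idx_num = 3

# arr_1 == idx_num is a boolean array; indexing arr_2 with it keeps
# only the positions where arr_1 equals idx_num.
arr_idx_num = arr_2[arr_1 == idx_num]
print(arr_idx_num)
```

The result is a NumPy array, where the loop version builds a Python list.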