When you use b = a.reshape((5,4,5)) you just create a different view on the same data used by the array a (i.e. changes to the elements of a will appear in b). reshape() does not copy data in this case, so it is a very fast operation. Slicing b and slicing a access the same memory, so there shouldn't be any need for a different syntax for the b array (just use a[:10]). If you have created a copy of the data, perhaps with np.resize(), and discarded a, just reshape b back: b.reshape((20,5))[:10].
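A minimal sketch of the view behaviour described above. It assumes a was built as np.arange(100).reshape(20, 5), which matches the printed output further down but is not stated in the original; the names shares, seen_in_b and first_ten are only for illustration.

```python
import numpy as np

# Assumed construction of a: 20 rows of 5 consecutive integers.
a = np.arange(100).reshape(20, 5)
b = a.reshape((5, 4, 5))              # a view here, since a is contiguous

shares = np.shares_memory(a, b)       # True when b reuses a's buffer
a[0, 0] = -1                          # write through a ...
seen_in_b = b[0, 0, 0]                # ... and read the change back through b

# If only b is available, reshape it back and slice the first 10 rows.
first_ten = b.reshape((20, 5))[:10]
```

np.shares_memory is a convenient way to check whether a particular reshape gave you a view or a copy.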
By reshaping (20,5) to (5,4,5), there's no way you can pull out the 1st half of the values by slicing the first axis. You can't split those 5 blocks into 2 even groups:
In [9]: b[:2]
Out[9]:
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]]])
In [10]: b[:3]
Out[10]:
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
The last row of a[:10] is in the middle of the 3rd block, b[2,:,:].
Note that b[:2] is (2,4,5), 8 rows of a, grouped into 2 sets of 4.
Now if you'd done c=a.reshape(4,5,5), then c[:2] would have those same 10 rows - in 2 sets of 5. And c[:2].reshape(10,-1) will look just like a[:10].
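A quick check of the two reshapes, again assuming a = np.arange(100).reshape(20, 5) (not stated in the original). The names row9_location and same are illustrative only.

```python
import numpy as np

a = np.arange(100).reshape(20, 5)     # assumed construction of a
b = a.reshape(5, 4, 5)                # 5 blocks of 4 rows
c = a.reshape(4, 5, 5)                # 4 blocks of 5 rows

# Find the (block, row) index in b that holds the last row of a[:10].
row9_location = np.argwhere((b == a[9]).all(axis=-1))

# The first two blocks of c are exactly the first 10 rows of a.
same = np.array_equal(c[:2].reshape(10, -1), a[:10])
```

Here row9_location comes out as block index 2, row index 1, so the row sits inside a block rather than on a block boundary, while same is True.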
The criterion to satisfy when providing the new shape is that 'the new shape should be compatible with the original shape', i.e. it must hold the same total number of elements.
numpy allows us to give one of the new shape parameters as -1 (e.g. (2,-1) or (-1,3), but not (-1,-1)). It simply means that it is an unknown dimension and we want numpy to figure it out. numpy works it out from the length of the array and the remaining dimensions, making sure the result satisfies the criterion above.
Now see the example.
z = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12]])
z.shape
(3, 4)
Now try to reshape with (-1). The resulting shape is (12,), which is compatible with the original shape (3,4):
z.reshape(-1)
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
Now try to reshape with (-1, 1). We have given the number of columns as 1 and left the rows unknown, so the resulting shape is (12, 1), again compatible with the original shape (3,4):
z.reshape(-1,1)
array([[ 1],
[ 2],
[ 3],
[ 4],
[ 5],
[ 6],
[ 7],
[ 8],
[ 9],
[10],
[11],
[12]])
The above is consistent with the advice in the scikit-learn error message (it comes from scikit-learn, not numpy) to use reshape(-1,1) for a single feature, i.e. a single column:
Reshape your data using
array.reshape(-1, 1) if your data has a single feature
New shape (-1, 2): rows unknown, 2 columns. The resulting shape is (6, 2):
z.reshape(-1, 2)
array([[ 1, 2],
[ 3, 4],
[ 5, 6],
[ 7, 8],
[ 9, 10],
[11, 12]])
Now keep the columns unknown. New shape (1,-1), i.e. 1 row, columns unknown. The resulting shape is (1, 12):
z.reshape(1,-1)
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]])
The above is consistent with the advice in the scikit-learn error message (it comes from scikit-learn, not numpy) to use reshape(1,-1) for a single sample, i.e. a single row:
Reshape your data using
array.reshape(1, -1) if it contains a single sample
New shape (2, -1): 2 rows, columns unknown. The resulting shape is (2, 6):
z.reshape(2, -1)
array([[ 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12]])
New shape (3, -1): 3 rows, columns unknown. The resulting shape is (3, 4):
z.reshape(3, -1)
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
And finally, if we try to give both dimensions as unknown, i.e. new shape (-1,-1), it will throw an error:
z.reshape(-1, -1)
ValueError: can only specify one unknown dimension
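The compatibility rule also fails when the known dimension does not divide the element count. A small sketch, rebuilding the same z with np.arange for brevity; the names inferred and failed are illustrative only.

```python
import numpy as np

z = np.arange(1, 13).reshape(3, 4)    # same values as the z above

# 12 elements and 6 columns: numpy infers 2 rows.
inferred = z.reshape(-1, 6).shape

# 12 elements cannot be split into rows of 5, so no shape is compatible.
try:
    z.reshape(-1, 5)
    failed = False
except ValueError:
    failed = True
```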
Say we have a 3-dimensional array of dimensions 2 x 10 x 10:
import numpy
r = numpy.random.rand(2, 10, 10)
Now we want to reshape to 5 x 5 x 8:
numpy.reshape(r, (5, 5, 8))
will do the job. (The shape is passed positionally here; the shape= keyword is only accepted by recent NumPy releases.)
Note that once you fix the first dim = 5 and the second dim = 5, the third dimension is already determined. To assist your laziness, NumPy gives the option of using -1:
numpy.reshape(r, (5, 5, -1))
will give you an array of shape (5, 5, 8).
Likewise,
numpy.reshape(r, (50, -1))
will give you an array of shape (50, 4).
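The arithmetic behind -1 can be sketched by hand: the unknown dimension is the total number of elements divided by the product of the known dimensions. The names known, inferred and reshaped are illustrative, and this only mirrors the result, not NumPy's internal code.

```python
import numpy as np

r = np.random.rand(2, 10, 10)         # 200 elements in total
known = (5, 5)                        # the dimensions we fix ourselves

# 200 elements / (5 * 5) leaves 8 for the remaining dimension.
inferred = r.size // int(np.prod(known))

# Let NumPy do the same calculation via -1.
reshaped = np.reshape(r, (*known, -1))
```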
You can read more at http://anie.me/numpy-reshape-transpose-theano-dimshuffle/
I'm learning numpy and reading the documentation for np.reshape() the order parameter:
Read the elements of a using this index order, and place the elements into the reshaped array using this index order. 'C' means to read / write the elements using C-like index order, with the last axis index changing fastest, back to the first axis index changing slowest. 'F' means to read / write the elements using Fortran-like index order, with the first index changing fastest, and the last index changing slowest. Note that the 'C' and 'F' options take no account of the memory layout of the underlying array, and only refer to the order of indexing. 'A' means to read / write the elements in Fortran-like index order if a is Fortran contiguous in memory, C-like order otherwise.
Frankly this is like a foreign language to me, and I actually am coming from C as a programmer :?
I was hoping it would be like: if you have a 1d array C will put them in row-wise, and F will put them in column-wise. Obviously it can't be that simple as a general case, but is it that simple if I am reshaping a 1d array into a 2d array?
Anyone have a discussion that is simple at least for the simpler cases like that (I don't need a fully general discussion for nd to nd cases).
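For the simple 1d-to-2d case asked about here, a small sketch suggests the row-wise / column-wise intuition holds. The array v and the names by_rows and by_cols are made up for illustration.

```python
import numpy as np

v = np.arange(6)                          # [0 1 2 3 4 5]

# 'C': last index changes fastest, so values fill one row at a time.
by_rows = v.reshape((2, 3), order="C")    # [[0 1 2], [3 4 5]]

# 'F': first index changes fastest, so values fill one column at a time.
by_cols = v.reshape((2, 3), order="F")    # [[0 2 4], [1 3 5]]
```

With a 1d input there is only one read order, so order only affects how the values are placed into the 2d result.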