That is the wrong mental model for using NumPy efficiently. NumPy arrays are stored in contiguous blocks of memory. To append rows or columns to an existing array, the entire array needs to be copied to a new block of memory, creating gaps for the new elements to be stored. This is very inefficient if done repeatedly.
Instead of appending rows, allocate a suitably sized array, and then assign to it row-by-row:
>>> import numpy as np
>>> a = np.zeros(shape=(3, 2))
>>> a
array([[ 0., 0.],
       [ 0., 0.],
       [ 0., 0.]])
>>> a[0] = [1, 2]
>>> a[1] = [3, 4]
>>> a[2] = [5, 6]
>>> a
array([[ 1., 2.],
       [ 3., 4.],
       [ 5., 6.]])
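When the number of rows is known up front, the same idea can be written as a loop. This is a minimal sketch: the `rows` list is hypothetical sample data, and `np.empty` is used instead of `np.zeros` only because every row is overwritten before it is read.

```python
import numpy as np

# Hypothetical input: three rows of two values each.
rows = [(1, 2), (3, 4), (5, 6)]

# Allocate the full array once. np.empty leaves the contents
# uninitialized, which is fine here because every row is assigned below.
a = np.empty((len(rows), 2))

for i, row in enumerate(rows):
    a[i] = row  # writes into the existing buffer; no new array is created
```

If some rows might never be assigned, `np.zeros` is the safer choice, since `np.empty` leaves whatever values happened to be in memory.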
Answer from Stephen Simmons on Stack Overflow
A NumPy array is a very different data structure from a list and is designed to be used in different ways. Your use of hstack is potentially very inefficient: every time you call it, all the data in the existing array is copied into a new one. (The append function has the same issue.) If you want to build up your matrix one column at a time, you are probably best off keeping it in a list until it is finished, and only then converting it into an array.
e.g.
import numpy
mylist = []
for item in data:
    mylist.append(item)
mat = numpy.array(mylist)
Here item can be a list, an array or any iterable, as long as each item has the same number of elements.
In this particular case (data is some iterable holding the matrix columns) you can simply use
mat = numpy.array(data)
(Also note that using list as a variable name is probably not good practice since it masks the built-in type by that name, which can lead to bugs.)
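One thing to watch when the items are meant to be columns: numpy.array places each item along the first axis, so each one becomes a row. The sketch below uses hypothetical sample data to show the difference, with np.column_stack as one way to get the items as columns.

```python
import numpy as np

# Hypothetical data: each inner list is meant to be one matrix column.
columns = [[1, 2, 3], [4, 5, 6]]

as_rows = np.array(columns)      # shape (2, 3): each item became a row
mat = np.column_stack(columns)   # shape (3, 2): each item became a column
```

Transposing the first result (`as_rows.T`) gives the same values as `mat`.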
EDIT:
If for some reason you really do want to create an empty array, you can just use numpy.array([]), but this is rarely useful!
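For reference, numpy.array([]) is one-dimensional with length zero. If the goal is an empty array that rows can be stacked onto, an array with zero rows and a fixed number of columns may be closer to what is wanted. The sketch below assumes two columns purely for illustration; each np.vstack call still copies everything into a new array, so the efficiency concern above applies.

```python
import numpy as np

flat = np.array([])        # shape (0,): one-dimensional, no elements
empty = np.empty((0, 2))   # shape (0, 2): zero rows, two columns

# Stacking onto the zero-row array leaves no placeholder row behind,
# but every call allocates a new array and copies the old data into it.
grown = np.vstack([empty, [[1, 2]]])
grown = np.vstack([grown, [[3, 4]]])
```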
How can I create a truly empty numpy array which can be merged onto (by a recursive function)?
I'm kind of stuck conceptually on how to make this happen. I have a recursive method that builds a binary tree, and stores the tree as an instance variable. However, the function is not allowed to return anything, so each recursive call should (according to me) modify in-place the tree instance variable. However, I'm not sure how to set up my instance variable such that all said and done it holds a multidimensional array that represents the tree.
Say I initialize it as a 1x1 array with element zero as a placeholder. Then as I go about recursing through my tree I can merge onto it... but at the end I'm left with a spare [0] element that I don't need. In this case, I'd need some kind of final stop condition and function to remove that unnecessary placeholder stump. I don't think this is possible?
Otherwise, say I initialize the instance variable as None. Then on the first series of recursive calls, it would have to reassign the tree variable to change it from None to an ndarray object, but all future calls would have to merge onto the array. I don't think this is what the function should be asked to do?
Is there a way to make a truly empty array that I can merge onto? (e.g. np.empty doesn't really give an empty array, it gives an array with placeholder values, so I'm still left with a useless stump at the end.)
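One possible way around the placeholder problem, following the list-then-convert advice above, is to let the recursion append to a plain Python list held on the instance and build the array once at the end. Everything in this sketch is hypothetical (the class name, the method names, and the two-column (depth, value) row layout); it only illustrates the pattern, not the actual tree representation asked about.

```python
import numpy as np


class TreeBuilder:
    """Hypothetical builder; the (depth, value) row layout is illustrative only."""

    def __init__(self):
        self._rows = []    # plain list: cheap to append to during recursion
        self.tree = None   # assigned once, after the recursion finishes

    def fit(self, values):
        self._rows = []
        self._build(sorted(values), depth=0)
        # Convert once. The reshape keeps two columns even when no rows exist.
        self.tree = np.array(self._rows, dtype=float).reshape(-1, 2)

    def _build(self, values, depth):
        # Returns nothing: the only effect is appending to self._rows.
        if not values:
            return
        mid = len(values) // 2
        self._rows.append((depth, values[mid]))
        self._build(values[:mid], depth + 1)
        self._build(values[mid + 1:], depth + 1)


builder = TreeBuilder()
builder.fit([3, 1, 2])
```

The recursive method never returns anything and never needs a placeholder element, because the list starts out genuinely empty and the array is created only once all rows are known.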