In Octave you can't use if/else statements in an inline or anonymous function in the normal way. You can define your function in its own file or as a subfunction like this:
function a = testIf(x)
    if x>=2
        a = 1;
    else
        a = 0;
    end
end
and call arrayfun like this:
arrayfun(@testIf,a)
ans =
0 1
1 1
Or you can use this workaround with an anonymous function:
iif = @(varargin) varargin{2 * find([varargin{1:2:end}], 1, ...
    'first')}();
arrayfun(@(x) iif(x >= 2, 1, true, 0), a)
ans =
0 1
1 1
There's more information here.
Answer from Molly on Stack Overflow
In MATLAB you don't need an if statement for the problem that you describe. In fact it is really simple to use arrayfun:
arrayfun(@(x) x>=2, a)
My guess is that it works in Octave as well.
Note that you don't actually need arrayfun in this case at all:
a>=2
This should do the trick here, because the comparison already operates element by element.
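As a quick check of that claim, here is a minimal sketch comparing the arrayfun call with the plain comparison. The sample matrix a is an assumption, chosen only because it is consistent with the output shown earlier; the question's actual data is not given.

```matlab
% Assumed sample input (not taken from the original question).
a = [1 2; 3 4];

% Element-by-element test through arrayfun with an anonymous function.
viaArrayfun = arrayfun(@(x) x >= 2, a);

% The same test written as a single vectorized comparison.
viaCompare = a >= 2;

% Both results are logical arrays with the same size as a. Convert with
% double() if numeric 0/1 values are needed, as in the if/else version.
asNumbers = double(viaCompare);
```

The two results should be identical, so the vectorized form is usually the one to prefer when the test is a simple comparison.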
Many built-in operations like sum and prod are already able to operate across rows or columns, so you may be able to refactor the function you are applying to take advantage of this.
If that's not a viable option, one way to do it is to collect the rows or columns into cells using mat2cell or num2cell, then use cellfun to operate on the resulting cell array.
As an example, let's say you want to sum the columns of a matrix M. You can do this simply using sum:
M = magic(10); %# A 10-by-10 matrix
columnSums = sum(M, 1); %# A 1-by-10 vector of sums for each column
And here is how you would do this using the more complicated num2cell/cellfun option:
M = magic(10); %# A 10-by-10 matrix
C = num2cell(M, 1); %# Collect the columns into cells
columnSums = cellfun(@sum, C); %# A 1-by-10 vector of sums for each cell
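The num2cell/cellfun route becomes more useful when the function has no dimension argument of its own. A minimal sketch, assuming a hypothetical per-row function called spread (not part of the original answer) that returns the range of a row:

```matlab
M = magic(10);                     % a 10-by-10 matrix

% Collect the rows into cells: passing 2 keeps dimension 2 together,
% so each cell holds one 1-by-10 row and rowCells is 10-by-1.
rowCells = num2cell(M, 2);

% Hypothetical per-row function: largest minus smallest element.
spread = @(r) max(r) - min(r);

% One scalar per row, returned as a 10-by-1 vector.
rowSpread = cellfun(spread, rowCells);

% Cross-check against the directly vectorized form.
reference = max(M, [], 2) - min(M, [], 2);
```

Here the vectorized reference is still simpler, which is the point of the answer above: reach for cellfun only when no dimension-aware built-in is available.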
You may want the more obscure Matlab function bsxfun. From the Matlab documentation, bsxfun "applies the element-by-element binary operation specified by the function handle fun to arrays A and B, with singleton expansion enabled."
@gnovice stated above that sum and other basic functions already operate on the first non-singleton dimension (i.e., rows if there's more than one row, columns if there's only one row, or higher dimensions if the lower dimensions all have size==1). However, bsxfun works for any function, including (and especially) user-defined functions.
For example, let's say you have a matrix A and a row vector B. E.g., let's say:
A = [1 2 3;
4 5 6;
7 8 9]
B = [0 1 2]
You want a function power_by_col which returns in a matrix C all the elements of A raised to the power of the corresponding column of B.
From the above example, C is a 3x3 matrix:
C = [1^0 2^1 3^2;
4^0 5^1 6^2;
7^0 8^1 9^2]
i.e.,
C = [1 2 9;
1 5 36;
1 8 81]
You could do this the brute force way using repmat:
C = A.^repmat(B, size(A, 1), 1)
Or you could do this the classy way using bsxfun, which internally takes care of the repmat step:
C = bsxfun(@(x,y) x.^y, A, B)
So bsxfun saves you some steps (you don't need to explicitly calculate the dimensions of A). However, in some informal tests of mine, it turns out that repmat is roughly twice as fast if the function to be applied (like my power function, above) is simple. So you'll need to choose whether you want simplicity or speed.
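For comparison, here is a small sketch that runs both variants on the example data and adds a third form that is not in the original answer: MATLAB R2016b and later (and recent Octave versions) expand singleton dimensions implicitly for element-wise operators, so the explicit bsxfun call is often unnecessary there. Treat the version requirement as something to verify on your own installation.

```matlab
A = [1 2 3; 4 5 6; 7 8 9];
B = [0 1 2];

% Explicit replication of B to the size of A.
C_repmat = A .^ repmat(B, size(A, 1), 1);

% bsxfun with the built-in power function (equivalent to @(x,y) x.^y).
C_bsxfun = bsxfun(@power, A, B);

% Implicit expansion: assumes MATLAB R2016b+ or an Octave version
% with automatic broadcasting.
C_implicit = A .^ B;
```

All three should produce the same 3-by-3 matrix C shown above.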
The reason that you don't get any speed increase by calling matlabpool before calling arrayfun is that just the act of creating multiple workers doesn't make all code utilize these workers to perform calculations. If you want to exploit the pool of workers, you need to explicitly parallelize your code with parfor (related info here).
parfor k = 1:10
    result{k} = sum(sum(a*b));
end
In general, arrayfun does not do any parallelization or acceleration. In fact, it's often slower than simply writing out the for loop because the explicit for loop allows for better JIT acceleration.
for k = 1:10
    result(k) = sum(sum(a * b));
end
If you want to perform the operation you've shown on the GPU: when the input data to arrayfun is a gpuArray, it will execute on the GPU (using the GPU-enabled version of arrayfun). The issue, though, is that anything performed on the GPU using arrayfun has to consist of element-wise operations only, so that the operation on each element is independent of the operations on all other elements (which is what makes it parallelizable). Your case does not consist of element-wise operations, so the GPU version of arrayfun cannot be used.
As a side-note, you'll want to use parpool rather than matlabpool since the latter has been deprecated.
Core MATLAB does use threads and vector operations, but you have to vectorize the code yourself. For your example, for instance, you need to write
A = rand(1000, 1000, 100);
B = sum( sum( A, 1 ), 2 );
B is now a 1-by-1-by-100 array of the sums. I've used two sums to help you understand what's going on. If you actually wanted to sum every number in a matrix you'd use sum(A(:)), or, for this batch example, sum( reshape(A, [], 100) ), which returns the same sums as a 1-by-100 row vector.
For task parallelism rather than data parallelism use parfor, batch, parfeval or some other parallel instruction.
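The two vectorized forms mentioned above can be checked against an explicit loop. This is only a sketch: the array size is an assumption, kept small and integer-valued so that the sums compare exactly.

```matlab
% Assumed small test stack: 3 pages of 4-by-5 integers.
A = randi(9, 4, 5, 3);
nPages = size(A, 3);

% Nested sums over dimensions 1 and 2 give a 1-by-1-by-3 array.
B1 = sum(sum(A, 1), 2);

% Flatten each page into a column, then sum the columns: 1-by-3.
B2 = sum(reshape(A, [], nPages));

% Reference result from an explicit loop over the pages.
B3 = zeros(1, nPages);
for k = 1:nPages
    page = A(:, :, k);
    B3(k) = sum(page(:));
end
```

The results differ only in shape, so B1(:).' can be used when a row vector is more convenient.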
In this case, you don't need arrayfun to perform your calculation, you can simply do this:
imgStack = rand( 10, 10, 4 ); % 4 10x10 images
r = sum( sum( imgStack, 1 ), 2 ); % sum along both dimensions 1 and 2
In general, lots of MATLAB operations will operate on a whole array at once; that's the usual way of avoiding loops.
MATLAB's normal "arrayfun" is not parallel. However, for GPUArrays (with Parallel Computing Toolbox), there is a parallel version of arrayfun.
On your first question: you might try accumarray for this. One suggestion:
function ds = applyfun_onfirstdim(arr, h_fun)
dimvec = size(arr);
indexarr = repmat( (1:dimvec(1))', [1, dimvec(2:end)] );
ds = accumarray(indexarr(:), arr(:), [], h_fun);
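This creates an auxiliary index array with the same dimensions as the input "arr". Every element of a slice you want to apply h_fun to gets the same index number. In this example the slices run along the first dimension, so each element is labeled with its index along that dimension and h_fun is called once per index value.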
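To make the indexing step concrete, here is a minimal sketch that inlines the body of the function on a small assumed input (magic(4) is chosen for illustration only). It uses @max because accumarray does not guarantee the order of the values it passes to the function, and the function has to return a scalar.

```matlab
arr = magic(4);                 % assumed 4-by-4 example input
dimvec = size(arr);

% Same size as arr; every entry holds its row number.
indexarr = repmat((1:dimvec(1))', [1, dimvec(2:end)]);

% Group the values of arr by row number and reduce each group with max.
rowMax = accumarray(indexarr(:), arr(:), [], @max);

% Reference computed with the dimension argument of max.
reference = max(arr, [], 2);
```

The result is a column vector with one entry per index along the first dimension, which should match the reference.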
This is interesting. Your examples are performing two different operations, which happen to lead to the same result. It's kind of fun to explore.
TL;DR. You should generally use arrayfun when your input is an array, and cellfun when your input is a cell, although you can often force arrayfun to do the job, with varying levels of syntax hell.
Fundamentally, arrayfun is meant to operate on arrays and cellfun is meant to operate on cells. But, the Matlab-wise will note that a cell is nothing more than an array of "cells", so arrayfun works anyway.
As you point out, the following two lines perform the same operation:
cellfun(@(c) c, {'one' 'two' 'three'}, 'uniformoutput', 0) %returns {'one' 'two' 'three'}
arrayfun(@(c) c(1), {'one' 'two' 'three'}); %returns {'one' 'two' 'three'}
However, if we want to do something during our manipulations, it's a little different. For example, we may want to extract the first character of each string. Compare the results of cellfun and arrayfun here:
cellfun( @(c) c(1), {'one' 'two' 'three'}, 'uniformoutput', 0); %returns {'o' 't' 't'}
arrayfun(@(c) c(1), {'one' 'two' 'three'}); %Returns {'one' 'two' 'three'}
To get the same result with arrayfun, we need to dereference the cell within the anonymous function, then extract the character, and then put the results into a cell array rather than a character array. Like this:
arrayfun(@(c) c{1}(1), {'one' 'two' 'three'},'uniformoutput',false) %Returns {'o' 't' 't'}
So the difference is that cellfun takes care of the dereference operation which is required to do detailed operations on individual elements of a cell when looping (that is, the {}), whereas arrayfun just performs the standard indexing (that is, the ()). In addition, the 'uniformoutput',false notation determines if the output is written to a regular array or a cell array.
To show what this means in code, see the following functions which are equivalent to cellfun and arrayfun, both with and without the 'uniformoutput',false notation. These four functions are equivalent except for the use of the () vs. {} within the loop:
function out = cellFunEquivalent(fn, x)
    for ix = numel(x):-1:1
        out(ix) = fn(x{ix});
    end
    out = reshape(out,size(x));
end
function out = arrayFunEquivalent(fn, x)
    for ix = numel(x):-1:1
        out(ix) = fn(x(ix));
    end
    out = reshape(out,size(x));
end
function out = cellFunEquivalent_nonuniform(fn, x)
    for ix = numel(x):-1:1
        out{ix} = fn(x{ix});
    end
    out = reshape(out,size(x));
end
function out = arrayFunEquivalent_nonuniform(fn, x)
    for ix = numel(x):-1:1
        out{ix} = fn(x(ix));
    end
    out = reshape(out,size(x));
end
For the example you posted, the arrayfun function is actually operating on single element cells, and reconstructing a copy of those cells into another array of the same (cell) class (see arrayFunEquivalent). The cellfun operation is dereferencing each element of the input cell array and then reconstructing a copy of those strings into a cell array (see cellFunEquivalent_nonuniform). When the input x is a cell, these operations are equivalent.
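The first-character example can be checked directly. This sketch only reuses the calls discussed above; the variable names are illustrative.

```matlab
words = {'one', 'two', 'three'};

% cellfun hands the contents of each cell (a char row) to the function.
viaCellfun = cellfun(@(c) c(1), words, 'UniformOutput', false);

% arrayfun hands over a 1-by-1 cell, so it must be dereferenced with {}.
viaArrayfun = arrayfun(@(c) c{1}(1), words, 'UniformOutput', false);

% Without 'UniformOutput', false the scalar results are concatenated
% into a regular (here: char) array instead of a cell array.
asChars = cellfun(@(c) c(1), words);
```

Both non-uniform calls should give the same 1-by-3 cell array, while the last call returns the three characters as a plain char row.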
There are a few built-in functions that can be referenced by name in cellfun but cannot be used in the same way in arrayfun. From the help:
A = CELLFUN('fun', C), where 'fun' is one of the following strings,
returns a logical or double array A the elements of which are computed
from those of C as follows:
    'isreal'     -- true for cells containing a real array, false
                    otherwise
    'isempty'    -- true for cells containing an empty array, false
                    otherwise
    'islogical'  -- true for cells containing a logical array, false
                    otherwise
    'length'     -- the length of the contents of each cell
    'ndims'      -- the number of dimensions of the contents of each cell
    'prodofsize' -- the number of elements of the contents of each cell
So cellfun('isreal', {'one' 'two' 'three'}) is a valid expression, but any similar call with arrayfun will trigger the "First input must be a function handle" error.
Of course, you can just use @isreal or @isempty for arrayfun.
As for why cellfun still exists, I suspect it's historical (don't break backward compatibility).
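A short sketch of the legacy name syntax next to the function-handle forms; the test cell array is an assumption made up for illustration. Note that with arrayfun the handle receives a 1-by-1 cell, so the contents still have to be dereferenced before testing them.

```matlab
c = {[], 'abc', {}, 0};           % assumed example input

% Legacy syntax: the function is given by name as a string.
legacy = cellfun('isempty', c);

% Function-handle syntax, which works for any function.
viaHandle = cellfun(@isempty, c);

% arrayfun needs a handle, plus {} to reach the contents of each cell.
viaArrayfun = arrayfun(@(x) isempty(x{1}), c);
```

All three calls should return the same logical row vector.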