First off, make sure you are using an appropriate -march setting. Unless it was configured otherwise, GCC targeting 32-bit x86 avoids any instruction that the original i386 did not support; allowing it to use newer instruction sets can make a big difference at times. With -march=core2 -O2 I get:
min:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %edx
movl 12(%ebp), %ecx
movl 16(%ebp), %eax
cmpl %edx, %ecx
leave
cmovbe %ecx, %edx
cmpl %eax, %edx
cmovbe %edx, %eax
ret
The use of cmov here may help you avoid branch misprediction delays, and you get it without any inline asm just by passing in -march. When inlined into a larger function this is likely to be even more efficient, possibly just four instructions (two compares and two conditional moves). If you need something faster than this, see if you can get the SSE vector operations to work in the context of your overall algorithm.
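For anyone who wants to reproduce this kind of listing, here is a minimal sketch of a C source that a compiler can turn into two compares and two conditional moves. The names min2u and min3u are hypothetical, and unsigned arguments are an assumption on my part (suggested by the unsigned cmovbe in the listing above), not something taken from the original code.

```c
/* Hypothetical helper: minimum of two unsigned values.
   static inline makes it easy for the compiler to inline into callers. */
static inline unsigned min2u(unsigned a, unsigned b)
{
    return (b < a) ? b : a;
}

/* Hypothetical function: minimum of three unsigned values,
   written as two nested two-way minimums. */
unsigned min3u(unsigned a, unsigned b, unsigned c)
{
    return min2u(min2u(a, b), c);
}
```

Saving this as, say, min3.c and running gcc -O2 -march=core2 -S min3.c (adding -m32 for a 32-bit listing) writes the assembly to min3.s, where you can check whether cmov or branches were emitted. The exact output depends on the compiler version and target.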
Assuming your compiler isn't out to lunch, this should compile down to two compares and two conditional moves. It isn't possible to do much better than that.
If you post the assembly that your compiler is actually generating, we can see if there's anything unnecessary that's slowing it down.
The number one thing to check is that the routine is actually getting inlined. The compiler isn't obligated to do so, and if it's generating a function call, that will be hugely expensive for such a simple operation.
If the call really is getting inlined, then loop unrolling may be beneficial, as DigitalRoss said, or vectorization may be possible.
Edit: If you want to vectorize the code, and are using a recent x86 processor, you will want to use the SSE4.1 pminud instruction (intrinsic: _mm_min_epu32), which takes two vectors of four unsigned ints each, and produces a vector of four unsigned ints. Each element of the result is the minimum of the corresponding elements in the two inputs.
I also note that your compiler used branches instead of conditional moves; you should probably try a version that uses conditional moves first and see if that gets you any speedup before you go off to the races on a vector implementation.
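As a rough sketch of what the vectorized version could look like, the function below computes the element-wise minimum of three arrays of unsigned 32-bit values with _mm_min_epu32, four elements at a time, and finishes any leftover elements with scalar code. The function name, the array-based interface and the MIN3_SSE41 macro are my own assumptions for illustration; how well this fits depends on how your data is laid out.

```c
#include <stddef.h>
#include <stdint.h>
#include <smmintrin.h> /* SSE4.1 intrinsics, including _mm_min_epu32 */

/* Assumption: on GCC/Clang, enable SSE4.1 for this one function so the file
   also builds without -msse4.1. Other compilers need their own switch. */
#if defined(__GNUC__)
#define MIN3_SSE41 __attribute__((target("sse4.1")))
#else
#define MIN3_SSE41
#endif

/* Hypothetical helper: out[i] = min(a[i], b[i], c[i]) for i in [0, n). */
MIN3_SSE41
void min3_u32_sse41(const uint32_t *a, const uint32_t *b, const uint32_t *c,
                    uint32_t *out, size_t n)
{
    size_t i = 0;

    /* Vector part: four unsigned ints per iteration, unaligned loads/stores. */
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        __m128i vc = _mm_loadu_si128((const __m128i *)(c + i));
        __m128i vmin = _mm_min_epu32(_mm_min_epu32(va, vb), vc);
        _mm_storeu_si128((__m128i *)(out + i), vmin);
    }

    /* Scalar tail for the remaining 0 to 3 elements. */
    for (; i < n; ++i) {
        uint32_t m = (a[i] < b[i]) ? a[i] : b[i];
        out[i] = (m < c[i]) ? m : c[i];
    }
}
```

The code must run on a CPU that actually supports SSE4.1. Whether it beats the scalar cmov version is something to measure rather than assume.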
#include <stdio.h>
#include <math.h>
/* Returns the smallest of x, y and z, or NAN if any argument is NaN. */
double minimum(double x, double y, double z)
{
    double temp;
    if (isnan(x) || isnan(y) || isnan(z))
        return NAN;
    temp = (x < y) ? x : y;
    return (temp < z) ? temp : z;
}
int main(void)
{
    double x, y, z, minVal;
    printf("Please enter three numeric values: ");
    /* Stop if fewer than three numbers could be read. */
    if (scanf("%lf%lf%lf", &x, &y, &z) != 3) {
        fprintf(stderr, "Expected three numeric values.\n");
        return 1;
    }
    minVal = minimum(x, y, z);
    printf("minimum(%0.10f, %0.10f, %0.10f) = %0.10f\n", x, y, z, minVal);
    return 0;
}
Method for double:
#include <stdio.h>
int main(void)
{
    double a, b, c, temp, min;
    printf("Enter three nos. separated by spaces: ");
    /* Stop if fewer than three numbers could be read. */
    if (scanf("%lf%lf%lf", &a, &b, &c) != 3)
        return 1;
    temp = (a < b) ? a : b;
    min = (c < temp) ? c : temp;
    printf("The Minimum of the three is: %f\n", min);
    /* indicate success */
    return 0;
}
Method for int:
#include <stdio.h>
int main(void)
{
    int a, b, c, temp, min;
    printf("Enter three nos. separated by spaces: ");
    /* Stop if fewer than three numbers could be read. */
    if (scanf("%d%d%d", &a, &b, &c) != 3)
        return 1;
    temp = (a < b) ? a : b;
    min = (c < temp) ? c : temp;
    printf("The Minimum of the three is: %d\n", min);
    /* indicate success */
    return 0;
}
If possible, I recommend using C++11 or newer, which lets you compute the desired result with std::min (from <algorithm>) without implementing your own function. As already pointed out in one of the comments, you can do
T minimum(std::min({x, y, z}));
or
T minimum = std::min({x, y, z});
which stores the minimum of the variables x, y and z in the variable minimum of type T (note that x, y and z must all have the same type, because the element type of the initializer list is deduced from them, and that type must be T or implicitly convertible to it). Correspondingly, the same can be done to obtain a maximum: std::max({x, y, z}).
There are a number of improvements that can be made.
You could use standard functions to make it clearer:
// Notice I made the return type an int instead of a float,
// since you're passing in ints
int smallest(int x, int y, int z){
return std::min(std::min(x, y), z);
}
Or better still, as pointed out in the comments:
int smallest(int x, int y, int z){
return std::min({x, y, z});
}
If you want it to operate on any number of ints, you could do something like this:
// Requires <algorithm>, <limits> and <vector>.
int smallest(const std::vector<int>& intvec){
    // Largest possible int; this is also what an empty vector returns
    int smallest = std::numeric_limits<int>::max();
    // there are a number of ways to structure this loop, this is just one
    for (int value : intvec)
    {
        smallest = std::min(smallest, value);
    }
    return smallest;
}
You could also make it generic so that it'll operate on any arithmetic type, instead of just ints:
// Requires <algorithm>, <limits> and <vector>.
template <typename T>
T smallest(const std::vector<T>& vec){
    // Largest value of T; only meaningful for arithmetic types
    T smallest = std::numeric_limits<T>::max();
    // there are a number of ways to structure this loop, this is just one
    for (const T& value : vec)
    {
        smallest = std::min(smallest, value);
    }
    return smallest;
}
As long as your compiler is optimizing that's probably as good as you're going to get.
#include <algorithm>
int test(int i, int j, int k)
{
return std::min(i, std::min(j, k));
}
compiled with g++ -S -c -O3 test.cpp I get
cmpl %edi, %esi
movl %edx, %eax
cmovg %edi, %esi
cmpl %edx, %esi
cmovle %esi, %eax
ret
Perhaps you didn't pass any optimization flags when compiling?
A branch-free bit-manipulation approach may help you.
Quoting the code from the reference:
int x; // we want to find the minimum of x and y
int y;
int r; // the result goes here
r = y ^ ((x ^ y) & -(x < y)); // min(x, y)
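One more trick under the same heading is:
r = y + ((x - y) & ((x - y) >> (sizeof(int) * CHAR_BIT - 1))); // min(x, y)
Do not remove the - 1 from trick 2: shifting an int by its full width is undefined behavior. Trick 2 also relies on x - y not overflowing and on right-shifting a negative int being an arithmetic shift, which is implementation-defined, so it works only on some systems and is not portable.
You can extend this idea to find the minimum of 3 numbers.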
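/* Requires <limits.h> for CHAR_BIT. Assumes x - y and r - z do not overflow
   and that >> on a negative int is an arithmetic shift. */
static inline int min3(int x, int y, int z)
{
    int r, d;
    d = x - y;
    r = y + (d & (d >> (sizeof(int) * CHAR_BIT - 1))); /* r = min(x, y) */
    d = r - z;
    r = z + (d & (d >> (sizeof(int) * CHAR_BIT - 1))); /* r = min(r, z) */
    return r;
}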
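The subtraction-based form goes wrong when x - y overflows (for example with INT_MIN and INT_MAX). As a possible alternative, here is a sketch of the three-value version built on the XOR trick quoted earlier, which needs no subtraction and no shift. The names min2_xor and min3_xor are hypothetical, and the code assumes two's complement ints, so that -(x < y) is either 0 or all one bits.

```c
#include <limits.h>

/* Hypothetical helper: branch-free min of two ints using the XOR form.
   (x < y) is 1 or 0, so -(x < y) is an all-ones or all-zeros mask. */
static inline int min2_xor(int x, int y)
{
    return y ^ ((x ^ y) & -(x < y));
}

/* Hypothetical helper: min of three ints as two nested two-way minimums. */
static inline int min3_xor(int x, int y, int z)
{
    return min2_xor(min2_xor(x, y), z);
}
```

Whether this is actually faster than the plain ternary version is worth measuring, since an optimizing compiler may already turn the ternary into conditional moves.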
If the range of your numbers is very small, you can even use a lookup table, which may be more efficient but is a hack.