The pseudo-code "implementation" of strcmp would go something like:
    define strcmp (str1, str2):
        p1 = address of first character of str1
        p2 = address of first character of str2
        while p1 not at end of str1:
            if p2 at end of str2:
                return 1
            if contents of p2 greater than contents of p1:
                return -1
            if contents of p1 greater than contents of p2:
                return 1
            advance p1
            advance p2
        if p2 not at end of str2:
            return -1
        return 0
That's basically it. Each character is compared in turn and a decision is made as to whether the first or second string is greater (a), based on that character.
Only if the characters are identical do you move to the next character and, if all the characters were identical, zero is returned.
Note that you will not necessarily get 1 and -1. The standard only says that some positive or negative value is returned, so you should always check the return value with < 0, > 0 or == 0.
Turning that into real C would result in something like this:
    int myStrCmp (const char *str1, const char *str2) {
        const unsigned char *p1 = (const unsigned char *) str1;
        const unsigned char *p2 = (const unsigned char *) str2;
        while (*p1 != '\0') {
            if (*p2 == '\0') return 1;
            if (*p2 > *p1) return -1;
            if (*p1 > *p2) return 1;
            p1++;
            p2++;
        }
        if (*p2 != '\0') return -1;
        return 0;
    }
(a) Keep in mind that "greater" in the context of characters is not necessarily based on simple ASCII ordering for all string functions.
strcmp itself compares the characters as unsigned char values, but C also has a concept called 'locales' which specify (amongst other things) the collation (ordering of the underlying character set), and locale-aware functions such as strcoll honour it.
You may therefore find, for example, that the characters from the set {a, á, à, ä} are treated as equivalent by a locale-aware comparison.
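To make that distinction concrete, here is a minimal sketch that puts strcmp next to strcoll, the standard function that follows the current locale's collation. The helper names sign_of and compare_both are made up for illustration, and what strcoll reports for accented characters depends on the locales available on your system, so no particular output is claimed.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: collapse any comparison result to -1, 0 or 1. */
int sign_of(int result) {
    return (result > 0) - (result < 0);
}

/* Hypothetical helper: show how both functions order the same pair. */
void compare_both(const char *a, const char *b) {
    /* strcmp compares unsigned char values and ignores the locale. */
    printf("strcmp : %d\n", sign_of(strcmp(a, b)));
    /* strcoll uses the LC_COLLATE category of the current locale. */
    printf("strcoll: %d\n", sign_of(strcoll(a, b)));
}
```

A program would typically call setlocale(LC_COLLATE, "") once at start-up to pick up the user's locale before calling compare_both; without that call the program runs in the "C" locale.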
Here is the BSD implementation:
    int
    strcmp(s1, s2)
        register const char *s1, *s2;
    {
        while (*s1 == *s2++)
            if (*s1++ == 0)
                return (0);
        return (*(const unsigned char *)s1 - *(const unsigned char *)(s2 - 1));
    }
Once there is a mismatch between two characters, it just returns the difference between those two characters.
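For readers unfamiliar with the old K&R parameter syntax, the same loop can be sketched with a modern prototype. The name bsd_style_strcmp is made up here to avoid clashing with the library function; the logic is meant to mirror the BSD code above, not to replace it.

```c
/* Same loop as the BSD code, written with an ANSI prototype. */
int bsd_style_strcmp(const char *s1, const char *s2) {
    while (*s1 == *s2++) {
        if (*s1++ == '\0')
            return 0;   /* both strings ended at the same place: equal */
    }
    /* s2 was already advanced past the mismatching character, hence s2 - 1.
       The casts make the comparison use unsigned char values. */
    return *(const unsigned char *)s1 - *(const unsigned char *)(s2 - 1);
}
```

On an ASCII system, bsd_style_strcmp("a", "c") gives -2 rather than -1, which shows why only the sign of the result should be tested.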
From the cppreference.com documentation:
    int strcmp( const char *lhs, const char *rhs );
Return value
Negative value if lhs appears before rhs in lexicographical order.
Zero if lhs and rhs compare equal.
Positive value if lhs appears after rhs in lexicographical order.
As you can see it just says negative, zero or positive. You can't count on anything else.
The site you linked isn't incorrect. It tells you that the return value is < 0, == 0 or > 0, gives an example and shows its output. It doesn't say that the output must be 111.
To quote the man page:
The strcmp() and strncmp() functions return an integer less than, equal to, or greater than zero if s1 (or the first n bytes thereof) is found, respectively, to be less than, to match, or be greater than s2.
In other words, you should never rely on the exact return value of strcmp (other than 0, of course). The only guarantee is that the return value will be negative if the first string is "smaller", positive if the first string is "bigger" or 0 if they are equal. The same inputs may generate different results on different platforms with different implementations of strcmp.
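A portable way to act on the result, then, is to test only its sign. The following is a small sketch of that pattern; the function name describe_order is an invention for this example.

```c
#include <string.h>

/* Hypothetical helper: map the strcmp result to a label using only
   its sign, never its exact value. */
const char *describe_order(const char *a, const char *b) {
    int result = strcmp(a, b);
    if (result < 0)
        return "less";      /* a sorts before b */
    if (result > 0)
        return "greater";   /* a sorts after b */
    return "equal";
}
```

Written this way, the code behaves the same whether the library's strcmp returns -1, -2 or any other negative number for a given pair of strings.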