In general, any function that does not bounds-check its arguments is dangerous. A short list would be:
- gets()
- scanf()
- strcpy()
- strcat()
You should use size-limited versions like strncpy, strncat, fgets, etc. Then be careful when giving the size limit: take into account the '\0' that terminates the string.
Also, arrays are NOT bounds-checked in C or C++. The following example writes past the end of the array; see off-by-one error:
int foo[3];
foo[3] = WALKED_OFF_END_OF_ARRAY;
edit: Copied answers of @MrValdez, @Denton Gentry
Answer from hayalci on Stack Overflow

Related threads:
- Buffer Overflow Vulnerability C Code
- What C/C++ functions are most often used incorrectly and can lead to buffer overflows? - Stack Overflow
- How is printf() in C/C++ a Buffer overflow vulnerability? - Information Security Stack Exchange
- c - Why is this code vulnerable to buffer overflow attacks? - Stack Overflow
Hi folks,
I have this c code:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int ssp(char *str)
{
    char buffer[100];
    strcpy(buffer, str);
    return 1;
}

int main(int argc, char **argv)
{
    char str[400];
    FILE *afile;
    afile = fopen("afile", "r");
    fread(str, sizeof(char), 400, afile);
    ssp(str);
    printf("Returned Properly\n");
    return 1;
}

The program provided reads the contents of a file called `"afile"` into a character array called `str`, which can hold up to 400 characters. It then calls the `ssp` function and passes `str` as an argument.
The `ssp` function copies the contents of `str` into a local 100-byte character array called `buffer`. The `strcpy` function used to do the copy performs no bounds checking, so if the input string is longer than `buffer`, it overflows the stack frame of `ssp`. This is exactly the kind of vulnerability that becomes exploitable when the input data is untrusted.
Could anyone please assist with appending shellcode to the end of "afile" so that it lands on the stack and runs? Please...
Valgrind is your new best friend.
valgrind --tool=memcheck --leak-check=full ./a.out
It is possible to have issues with printf() by using a user-provided argument as the format string, i.e. printf(arg) instead of printf("%s", arg). I have seen it done way too often. Since the caller did not push extra arguments, a string with some spurious % specifiers can be used to read whatever is on the stack, and with %n some values can be written to memory (%n means: "the next argument is an int *; go write there the number of characters emitted so far").
However, I find it more plausible that the article you quote contains a simple typographical mistake, and really means sprintf(), not printf().
(I could also argue that apart from gets(), there is no inherently vulnerable C function; only functions which need to be used with care. The so-called "safe" replacements like snprintf() don't actually solve the problem; they hide it by replacing a buffer overflow with a silent truncation, which is less noisy but not necessarily better.)
In addition to the answer by @Tom, I would also like to point you to the OWASP code review guidelines, where some issues with using printf() are highlighted, and to this answer to a similar question on the cs.stackexchange website.
On most compilers the maximum value of an unsigned short is 65535.
Any value above that gets wrapped around, so 65536 becomes 0, and 65600 becomes 64.
This means that long strings of the right length (e.g. 65600, whose length wraps to 64) will pass the check and overflow the buffer.
Use size_t to store the result of strlen(), not unsigned short, and compare len to an expression that directly encodes the size of buffer. So for example:
char buffer[100];
size_t len = strlen(str);
if (len >= sizeof(buffer) / sizeof(buffer[0])) return -1;
memcpy(buffer, str, len + 1);
The problem is here:
strncpy(buffer,str,strlen(str));
^^^^^^^^^^^
If the string is longer than the target buffer, strncpy will still copy it over: you are using the length of the string as the number of bytes to copy instead of the size of the buffer. The correct way to do this is as follows:
strncpy(buffer, str, sizeof(buffer) - 1);
buffer[sizeof(buffer) - 1] = '\0';
What this does is limit the amount of data copied to the actual size of the buffer minus one for the null terminating character. Then we set the last byte in the buffer to the null character as an added safeguard. The reason for this is that strncpy will copy up to n bytes, including the terminating null, only if strlen(str) < n. If not, the null is not copied and you have a crash scenario, because your buffer now holds an unterminated string.
Hope this helps.
EDIT: Upon further examination and input from others, a possible coding for the function follows:
int func(char *str)
{
    char buffer[100];
    unsigned short size = sizeof(buffer);
    unsigned short len = strlen(str);

    if (len > size - 1) return -1;
    memcpy(buffer, str, len + 1);
    buffer[size - 1] = '\0';
    return 0;
}
Since we already know the length of the string, we can use memcpy to copy the string from the location that is referenced by str into the buffer. Note that per the manual page for strlen(3) (on a FreeBSD 9.3 system), the following is stated:
The strlen() function returns the number of characters that precede the terminating NUL character. The strnlen() function returns either the same result as strlen() or maxlen, whichever is smaller.
Which I interpret to be that the length of the string does not include the null. That is why I copy len + 1 bytes to include the null, and the test checks to make sure that the length < size of buffer - 2. Minus one because the buffer starts at position 0, and minus another one to make sure there's room for the null.
EDIT: Turns out the size of something starts at 1 while indexing starts at 0, so the -2 above was incorrect: it would return an error for anything longer than 98 bytes, when the cutoff should be anything longer than 99 bytes.
EDIT: Although the answer about an unsigned short is generally correct in that the maximum length it can represent is 65,535 characters, it doesn't really matter here, because if the string is longer than that the value wraps around. It's like taking 75,231 (which is 0x000125DF) and masking off the top 16 bits, giving you 9,695 (0x000025DF). The only problem I see with this is strings whose length falls in the first 100 characters past 65,535, as the length check will allow the copy; but even then it will only copy up to the first 100 characters of the string and null-terminate it. So even with the wraparound issue, the buffer still will not be overflowed.
This may or may not in itself pose a security risk depending on the content of the string and what you are using it for. If it's just straight text that is human readable, then there is generally no problem. You just get a truncated string. However, if it's something like a URL or even a SQL command sequence, you could have a problem.
They did fix the libraries.
Any modern C standard library contains safer variants of strcpy, strcat, sprintf, and so on.
On C99 systems - which is most Unixes - you will find these with names like strncat and snprintf, the "n" indicating that it takes an argument that's the size of a buffer or a maximum number of elements to copy.
These functions can be used to handle many operations more securely, but in retrospect their usability is not great. For example some snprintf implementations don't guarantee the buffer is null-terminated. strncat takes a number of elements to copy, but many people mistakenly pass the size of the dest buffer.
On Windows, one often finds strcat_s, sprintf_s, and friends, the "_s" suffix indicating "safe". These too have found their way into the C standard library in C11 (as the optional Annex K), and provide more control over what happens in the event of an overflow (truncation vs. assert, for example).
Many vendors provide even more non-standard alternatives like asprintf in the GNU libc, which will allocate a buffer of the appropriate size automatically.
The idea that you can "just fix C" is a misunderstanding. Fixing C is not the problem - and has already been done. The problem is fixing decades of C code written by ignorant, tired, or hurried programmers, or code that has been ported from contexts where security didn't matter to contexts where security does. No changes to the standard library can fix this code, although migration to newer compilers and standard libraries can often help identify the problems automatically.
It's not inaccurate to say that C is "error-prone" by design. Aside from some grievous mistakes like gets, the C language can't really be any other way without losing the primary feature that draws people to C in the first place.
C was designed as a systems language to act as a sort of "portable assembly." A major feature of the C language is that unlike higher-level languages, C code often maps very closely to the actual machine code. In other words, ++i is usually just an inc instruction, and you can often get a general idea of what the processor will be doing at run-time by looking at the C code.
But adding in implicit bounds checking adds a lot of extra overhead - overhead which the programmer didn't ask for and might not want. This overhead goes way beyond the extra storage required to store the length of each array, or the extra instructions to check array bounds on every array access. What about pointer arithmetic? Or what if you have a function that takes in a pointer? The runtime environment has no way of knowing if that pointer falls within the bounds of a legitimately allocated memory block. In order to keep track of this, you'd need some serious runtime architecture that can check each pointer against a table of currently allocated memory blocks, at which point we're already getting into Java/C#-style managed runtime territory.