Strings in C are sequences of characters terminated by a null character (\0). Unlike higher-level languages, C does not have a built-in string type. Instead, strings are implemented as null-terminated character arrays using the char data type.
Key Characteristics
Null Termination: Every string must end with a \0 character, which marks the end of the string. This is essential for functions like strlen(), strcpy(), and printf() to work correctly.
Array-Based: A string is declared as a char array. For example:
char greeting[] = "Hello";
Here, the compiler automatically adds the \0 at the end, making the array size 6 (5 characters + 1 null terminator).
String Literals: Double quotes (") are used to define string literals. The compiler appends \0 and typically stores them in read-only memory; attempting to modify a string literal is undefined behavior.
Declaring and Initializing Strings
Using string literals:
char str[] = "Hello";
Character-by-character initialization (must include \0):
char str[] = {'H', 'e', 'l', 'l', 'o', '\0'};
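The two initialization forms above should produce identical arrays. The following is a minimal sketch of how one might check that; the function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Size of an array initialized from a string literal: the compiler
   counts the implicit '\0'. */
size_t literal_init_size(void) {
    char str[] = "Hello";
    return sizeof str;
}

/* Size of an array initialized character by character: the '\0' is
   counted only because it is written out explicitly. */
size_t list_init_size(void) {
    char str[] = {'H', 'e', 'l', 'l', 'o', '\0'};
    return sizeof str;
}

/* Returns 1 if both forms hold exactly the same bytes, 0 otherwise. */
int forms_are_equal(void) {
    char a[] = "Hello";
    char b[] = {'H', 'e', 'l', 'l', 'o', '\0'};
    return sizeof a == sizeof b && memcmp(a, b, sizeof a) == 0;
}
```

If the '\0' were left out of the second form, the array would be 5 bytes long and would no longer be a valid argument for functions such as strlen().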
Important Notes
No Assignment: You cannot assign a string literal to an array using the assignment operator after declaration:
char str[10];
str = "Hello"; // ❌ Error
Use strcpy() instead:
strcpy(str, "Hello");
Memory Management: When using pointers or dynamic allocation, ensure sufficient space is allocated for the string and the \0 character.
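For the memory management note, here is a minimal sketch of copying a string into dynamically allocated storage. The helper name copy_string is an assumption made for this example, not a standard function.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of src, or NULL if
   allocation fails. The caller is responsible for calling free(). */
char *copy_string(const char *src) {
    size_t len = strlen(src);     /* characters, not counting the '\0' */
    char *dst = malloc(len + 1);  /* +1 leaves room for the '\0' */
    if (dst == NULL) {
        return NULL;
    }
    memcpy(dst, src, len + 1);    /* copies the characters and the '\0' */
    return dst;
}
```

Allocating only strlen(src) bytes is the classic off-by-one mistake here, since it leaves no room for the terminator.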
Common String Functions (from <string.h>)
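strlen(s) – Returns the length of string s (excludes \0).
strcpy(dst, src) – Copies src, including its \0, to dst; dst must be large enough.
strcat(dst, src) – Appends src to the end of dst; dst must have room for the combined string.
strcmp(s1, s2) – Compares two strings; returns 0 if they are equal.
strncpy, strncat, strstr, strchr, strtok – Additional useful functions.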
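As a rough sketch of how a few of these functions combine, the helper below joins two names with a space, checking the destination size first. build_full_name and its parameters are hypothetical names chosen for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes "<first> <last>" into out, which has room
   for out_size bytes. Returns 0 on success, or -1 if the result would
   not fit (out is left untouched in that case). */
int build_full_name(char *out, size_t out_size,
                    const char *first, const char *last) {
    /* characters of both names, plus the space, plus the '\0' */
    size_t needed = strlen(first) + 1 + strlen(last) + 1;
    if (needed > out_size) {
        return -1;
    }
    strcpy(out, first);  /* copies first, including its '\0' */
    strcat(out, " ");    /* appends starting at the current '\0' */
    strcat(out, last);
    return 0;
}
```

The size check happens before any copying because strcpy() and strcat() have no way of knowing how large the destination is.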
Example
#include <stdio.h>
#include <string.h>

int main(void) {
    char name[] = "Alice";
    printf("Name: %s\n", name);            // Output: Name: Alice
    printf("Length: %zu\n", strlen(name)); // Output: Length: 5
    return 0;
}
Summary
C strings are arrays of char ending with \0.
They are not a data type but a convention based on null-terminated arrays.
Always ensure proper null termination and memory allocation to avoid undefined behavior.
C does not and never has had a native string type. By convention, the language uses arrays of char terminated with a null char, i.e., with '\0'. Functions and macros in the language's standard libraries provide support for the null-terminated character arrays, e.g., strlen iterates over an array of char until it encounters a '\0' character and strcpy copies from the source string until it encounters a '\0'.
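To make the "iterate until '\0'" behavior concrete, here is a simplified sketch of what such functions might look like. These are illustrative re-implementations under the hypothetical names my_strlen and my_strcpy; real library versions are usually far more optimized.

```c
#include <stddef.h>

/* Counts characters until the terminator is found; the '\0' itself is
   not counted. */
size_t my_strlen(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Copies characters from src to dst, stopping after the '\0' has been
   copied. The caller must guarantee that dst is large enough. */
char *my_strcpy(char *dst, const char *src) {
    size_t i = 0;
    while ((dst[i] = src[i]) != '\0') {  /* copy, then test what was copied */
        i++;
    }
    return dst;
}
```

Neither function receives a length, so the terminator is the only thing that stops the loop.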
The use of null-terminated strings in C reflects the fact that C was intended to be only a little more high-level than assembly language. Zero-terminated strings were already directly supported at that time in assembly language for the PDP-10 and PDP-11.
It is worth noting that this property of C strings leads to quite a few nasty buffer overrun bugs, including serious security flaws. For example, if you forget to null-terminate a character string passed as the source argument to strcpy, the function will keep copying sequential bytes from whatever happens to be in memory past the end of the source string until it happens to encounter a 0, potentially overwriting whatever valuable information follows the destination string's location in memory.
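One way to guard against a missing terminator, when the size of the buffer is known, is to look for the '\0' within that size before treating the buffer as a string. This is only a sketch; is_terminated is a hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf contains a '\0' somewhere in its
   first size bytes, 0 otherwise. It never reads past buf[size - 1]. */
int is_terminated(const char *buf, size_t size) {
    return memchr(buf, '\0', size) != NULL;
}
```

Unlike strlen(), memchr() takes an explicit byte count, so the search cannot run off the end of the buffer.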
In your code example, the string literal "Hello, world!" will be compiled into a 14-byte long array of char. The first 13 bytes will hold the letters, comma, space, and exclamation mark and the final byte will hold the null-terminator character '\0', automatically added for you by the compiler. If you were to access the array's last element, you would find it equal to 0. E.g.:
const char foo[] = "Hello, world!";
assert(foo[12] == '!');
assert(foo[13] == '\0');
However, in your example, message is only 10 bytes long. strcpy is going to write all 14 bytes, including the null terminator, into memory starting at the address of message. The first 10 bytes will be written into the memory allocated on the stack for message, and the remaining four bytes will be written past the end of message, into whatever happens to sit next to it on the stack. This is undefined behavior: the consequence of writing those four extra bytes is hard to predict (in this simple example, it might not hurt a thing), but in real-world code it usually leads to corrupted data or memory access violation errors.
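A possible way to avoid that overflow is to pass the destination size along and let snprintf() do the bounded copy. The sketch below assumes a hypothetical helper named copy_bounded; it is one option among several, not the fix the original question used.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dst without writing more than
   dst_size bytes. Returns 1 if the whole string fit, 0 if it was
   truncated or dst_size is 0. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) {
        return 0;
    }
    /* snprintf writes at most dst_size bytes, always including a '\0',
       and returns the length the full string would have needed. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

With a 10-byte message buffer and "Hello, world!" as the source, this keeps the first 9 characters plus the terminator and reports truncation, instead of writing 4 bytes past the end.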
Does C have a string type? - Stack Overflow
Why use C strings in C++? - Stack Overflow
How do strings work in C
New to C. If C does not have strings, then what exactly is printf doing?
My friends are split down the middle on this. Half of us think that since C doesn't have built-in strings, only arrays of characters, it doesn't have strings at all, while the other half think that an array of characters should be considered a string.
To note it in the languages you mentioned:
Java:
String str = new String("Hello");
Python:
str = "Hello"
Both Java and Python have the concept of a "string"; C does not. C has character arrays, which can be either read-only or modifiable.
C:
char * str = "Hello"; // the string "Hello\0" is pointed to by the character pointer
// str. This "string" can not be modified (read only)
or
char str[] = "Hello"; // the characters: 'H''e''l''l''o''\0' have been copied to the
// array str. You can change them via: str[x] = 't'
A character array used as a string is a sequence of contiguous characters with a sentinel character at the end (the null terminator '\0', not to be confused with the NULL pointer). Note that the sentinel character is auto-magically appended for you in the cases above.
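The difference between the pointer form and the array form can be sketched as follows. The function names are hypothetical, and the pointer is declared const here as an assumption about good practice rather than something the snippets above require.

```c
#include <stddef.h>

/* The array form owns its own copy of the characters, so writing to it
   is allowed. */
char first_letter_after_edit(void) {
    char str[] = "hello";  /* the literal's bytes are copied into str */
    str[0] = 'H';          /* modifies the array's copy only */
    return str[0];
}

/* sizeof applied to the array sees all of its bytes, '\0' included. */
size_t array_bytes(void) {
    char str[] = "hello";
    return sizeof str;
}

/* sizeof applied to a pointer gives the size of the pointer itself, not
   of the characters it points to. */
size_t pointer_bytes(void) {
    const char *str = "hello";  /* const documents that the literal is read-only */
    return sizeof str;
}
```

Here array_bytes() yields 6, while pointer_bytes() yields whatever the platform's pointer size is, typically 4 or 8.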
The only reason I've had to use them is when interfacing with 3rd party libraries that use C-style strings. There might also be esoteric situations where you would use C-style strings for performance reasons, but more often than not, using methods on C++ strings is probably faster due to inlining, specialization, etc.
You can use the c_str() method in many cases when working with those sorts of APIs, but you should be aware that the pointer returned is a const char *, and you should not modify the string via that pointer. In those situations, you can still use a vector<char> instead, and at least get the benefit of easier memory management.
A couple more memory control notes:
C strings are POD types, so they can be allocated in your application's read-only data segment. If you declare and define std::string constants at namespace scope, the compiler will generate additional code that runs before main() that calls the std::string constructor for each constant. If your application has many constant strings (e.g. if you have generated C++ code that uses constant strings), C strings may be preferable in this situation.
Some implementations of std::string support a feature called SSO ("short string optimization" or "small string optimization") where the std::string class contains storage for strings up to a certain length. This increases the size of std::string but often significantly reduces the frequency of free-store allocations/deallocations, improving performance. If your implementation of std::string does not support SSO, then constructing even a short std::string on the stack may still perform a free-store allocation. If that is the case, using temporary stack-allocated C strings may be helpful for performance-critical code that uses strings. Of course, you have to be careful not to shoot yourself in the foot when you do this.
There are multiple ways to create a string in C:
char* string1 = "hi";
char string2[] = "world";
printf("%s %s", string1, string2);
I have a lot of problems with this:
According to my understanding of pointers, string1 is a pointer and we're passing it to printf, which expects actual values, not references.
If we accept the fact that printf expects a pointer, then how does it handle string2 (not a pointer) just fine?
I understand that char* is designed to point to the first character of a string, which means it effectively points to the entire string, but what if I actually wanted to point to a single character?
This doesn't work, because we are assigning a value to a pointer:
int* a; a = 8;
So why does this work:
char* str; str = "hi";
Brand new to C, and I am told that there is no string data type. So I am just curious: if that is the case, then how exactly is something like printf("Hello World") a thing?