To your first question: I would go with Paul R's comment and terminate with '\0'. The value 0 itself also works fine; it is a matter of taste. But don't use the macro NULL, which is meant for pointers.

To your second question: If your string is not terminated with '\0', it might still print the expected output because the byte following your string in memory happens to be a zero or non-printable character. This is a really nasty bug though, since it might blow up when you least expect it. Always terminate a string with '\0'.

Answer from Lucas on Stack Overflow
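A minimal sketch of the point above, not taken from the original answer: the helper names `make_abc` and `terminator_is_zero` are made up for illustration. It fills a buffer by hand and terminates it with '\0'; writing plain 0 would store the same byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes "abc" into dst and terminates it.
   dst must have room for at least 4 bytes. */
void make_abc(char *dst, size_t dst_size)
{
    assert(dst != NULL && dst_size >= 4);
    dst[0] = 'a';
    dst[1] = 'b';
    dst[2] = 'c';
    dst[3] = '\0';   /* same value as plain 0; NULL is for pointers */
}

/* Returns 1: the constant '\0' has the value zero. */
int terminator_is_zero(void)
{
    return '\0' == 0;
}
```

After the call, functions such as strlen and strcmp can be used on the buffer, because they stop at the terminator.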

Reddit
r/C_Programming on Reddit: Null character '\0' & null terminated strings
December 25, 2022

Hello everyone!
In C, strings (character arrays) are terminated by the null character '\0', a character with value zero.
In ASCII, the NUL control code has value 0 (0x00). Now, if we were working in a different character set (say the machine's character set weren't ASCII but a different one), should the strings be terminated by NUL in that character set, or by a character whose value is zero?

For example, if the machine's character set were UTF-16, then in C a byte would be 16 bits and strings would be terminated by the \0 character with value 0x0000, which is also NUL in UTF-16.
But what if the machine's character set were modified UTF-8 (or UTF-7, ...)? Then, according to Wikipedia, the null character is encoded as two bytes 0xC0, 0x80. How would strings be terminated in that case: by the byte with value 0, or by the null character?

I guess my question could be rephrased as: Are null-terminated strings terminated by the NUL character (which in that character set might be represented by a nonzero value) or by a character whose value is zero (which in that character set might not represent the NUL character)?

Thank you all very much, and I'm sorry for any mistakes and errors, as English is not my first language.

Thanks again.

Top answer
1 of 3
31
"should the strings be terminated by NUL in that character set, or by a character whose value is zero?"

The character '\0' is guaranteed to be a byte with all bits zero, and to have a numeric value equal to zero. A string in C always ends with this character.

"Then, according to Wikipedia, the null character is encoded as two bytes 0xC0, 0x80."

No, in standard UTF-8 the code point with value zero is encoded in a single zero byte. You may have been reading something about "modified UTF-8", which appears to be a rather Java-centric external encoding for strings. It deliberately uses an "overlong" encoding of Java '\u0000' so that the resulting byte sequence does not contain a zero byte. One reason for this is because the length of strings in Java is not defined by use of a terminating character (a Java string can contain arbitrary '\u0000' characters), and you might need some way to round-trip such strings between Java and a language like C that does use a zero byte as a terminator.
2 of 3
17
C11 states:

5.2 Environmental considerations
5.2.1 Character sets
2. In a character constant or string literal, members of the execution character set shall be represented by corresponding members of the source character set or by escape sequences consisting of the backslash \ followed by one or more characters. A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string.

Emphasis is mine. From that we can understand that the terminating null character is always completely 0.

Then, there's:

5.2.1.2 Multibyte characters
A byte with all bits zero shall be interpreted as a null character independent of shift state. Such a byte shall not occur as part of any other multibyte character.

7.1.1 Definitions of terms
A string is a contiguous sequence of characters terminated by and including the first null character. The term multibyte string is sometimes used instead to emphasize special processing given to multibyte characters contained in the string or to avoid confusion with a wide string. A pointer to a string is a pointer to its initial (lowest addressed) character. The length of a string is the number of bytes preceding the null character and the value of a string is the sequence of the values of the contained characters, in order.
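To make the quoted definitions concrete, here is a small sketch (the names `bytes_before_null` and `embedded` are invented for this example). It shows that a string ends at the first null character, even when the array that holds it is longer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length in the sense of 7.1.1: the number of bytes preceding
   the first null character. */
size_t bytes_before_null(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* A literal with an embedded null. The array holds 6 bytes
   ('a', 'b', '\0', 'c', 'd', '\0'), but the string stops at index 2. */
const char embedded[] = "ab\0cd";
```

Here `sizeof embedded` counts every byte of the array, while `bytes_before_null(embedded)` only counts up to the first zero byte.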
Wikipedia
Null-terminated string - Wikipedia
March 25, 2025 - This allows the string to contain NUL and made finding the length need only one memory access (O(1) (constant) time), but limited string length to 255 characters. C designer Dennis Ritchie chose to follow the convention of null-termination to avoid the limitation on the length of a string and because maintaining the count seemed, in his experience, less convenient than using a terminator.
LabEx
How to ensure string null termination | LabEx
gcc -Wall -Wextra -Werror -O2 -g -fsanitize=address ## Enables comprehensive error checking ... Mastering string null termination is a fundamental skill in C programming.
Cprogramming
null terminated strings
Text files on a disk, no matter what software created them, are line terminated with either a newline, '\n', in UNIX, Linux or modern Mac, or two characters, a carriage return & newline, "\r\n", in DOS/Windows. NOT "Null terminated"!!! When a text line is read into memory by fgets(), etc., the string in memory IS automatically Nul terminated by fgets().
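A rough sketch of that behaviour, assuming a temporary file is available via the standard tmpfile() call; the function name `first_line_length` is made up for this example. The file contents contain no null byte, yet the buffer filled by fgets() can be passed straight to strlen().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes `text` to a temporary file, reads the first line back with
   fgets(), and returns its length, or -1 on I/O failure. The file holds
   no null byte; fgets() appends the terminator in memory. */
long first_line_length(const char *text, char *line, int line_size)
{
    assert(text != NULL && line != NULL && line_size > 1);

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    rewind(fp);

    char *ok = fgets(line, line_size, fp);
    fclose(fp);
    if (ok == NULL)
        return -1;

    return (long)strlen(line);   /* safe: fgets() terminated `line` */
}
```

Note that fgets() keeps the newline it read, so the returned length includes it.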
TutorialsPoint
What is a null-terminated string in C/C++?
Null-terminated strings are basically sequences of characters whose last element is a null character (denoted by '\0'). When we write a string using double quotes ("..."), the compiler converts it into a null-terminated string. The string may be smaller than the array that holds it, but if there is a null character inside that array, it will be treated as the end of the string. Here, we assign the null character at the 13th index position (indexing starts from 0), so the string runs from the start of the array to the null character (\0).
TutorialsPoint
Strings in C
In the following example, you can input a string using the scanf() function. After reading a specific number of characters (5 in the following example), we assign null ('\0') to terminate the string.
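The tutorial's own example is not reproduced in this snippet, so the following is only a sketch of the idea. It uses sscanf() on an in-memory string instead of scanf() so that it runs without keyboard input, and the function name `read_five` is invented here. The "%5c" conversion stores characters without a terminator, which is why the '\0' is written by hand.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Reads up to 5 characters from `input` into `out` (at least 6 bytes)
   and terminates the result. Returns 1 if sscanf() reports a successful
   conversion, 0 otherwise. */
int read_five(const char *input, char out[6])
{
    assert(input != NULL && out != NULL);

    memset(out, 0, 6);                 /* start from a fully zeroed buffer */
    if (sscanf(input, "%5c", out) != 1)
        return 0;
    out[5] = '\0';                     /* "%5c" does not add a terminator */
    return 1;
}
```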
Hacker News
The decision to make C strings null terminated with implied length instead of le... | Hacker News
March 8, 2021 - But also, "strings" and "time" are actually very complex concepts, and these functions operate on often outdated assumptions about those underlying abstractions · C99 came so very very close with VLAs. You can declare a function like:
ScienceDirect
Null-Terminated String - an overview | ScienceDirect Topics
In C programs, the function strlen in the standard library takes a string as its argument and returns the string's length, expressed as an integer. In PL/I, the built-in function length performs the same function. The two string representations described previously lead to radically different costs for the length computation. ... Null Terminated String The length computation must start at the beginning of the string and examine each character, in order, until it reaches the null character.
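The linear scan described above can be sketched as follows. This is an illustrative stand-in for strlen, not the library's actual implementation, and the name `scan_length` is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Length of a null-terminated string by examining each character in
   order until the null character is reached: O(n) in the length. */
size_t scan_length(const char *s)
{
    assert(s != NULL);

    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```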
Top answer
1 of 12
19

An option missing from the question is fat pointers ─ the type &str in Rust is an example of this. The length is not stored on the heap as a prefix to the string data, instead it is stored alongside the pointer, so that a reference to a string takes two words (length and pointer) instead of just one for a pointer.

This means that if there are multiple references to the same string, then the length data is duplicated compared to a length-prefixed string, which would only store the length once, where the string data is. But the upside is that a fat pointer can reference a substring without duplicating the string data on the heap.

In the diagram above (from the official Rust book), s is a String so it has a fat pointer to the whole string allocation (plus a capacity field, since it's a growable string), while world is a shared reference (i.e. a fat pointer) to a substring. This sharing would not be possible with length-prefixing, and would be possible with null-termination for substrings at the end of the string but not otherwise.
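For comparison with the other examples on this page, here is a rough C analogue of a fat pointer. The `str_view` struct and its helpers are hypothetical names, not part of any standard library; Rust's &str is only loosely modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical "fat pointer": a pointer plus a length, so no
   terminator is needed to know where the text ends. */
struct str_view {
    const char *ptr;
    size_t len;
};

/* Builds a view over a whole null-terminated string. */
struct str_view view_of(const char *s)
{
    assert(s != NULL);
    struct str_view v = { s, strlen(s) };
    return v;
}

/* Returns a view of `len` bytes starting at `start`. The bytes are
   shared with the original view; nothing is copied. */
struct str_view view_slice(struct str_view v, size_t start, size_t len)
{
    assert(start <= v.len && len <= v.len - start);
    struct str_view out = { v.ptr + start, len };
    return out;
}
```

Slicing "hello" out of "hello world" this way works even though no zero byte follows the 'o', which is the case null-termination cannot express without copying.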

2 of 12
15

Length-prefixed strings have the advantage of being able to find their length in O(1) time rather than O(n) time. This means you can find the end of the string more easily with the length prefix. They are also less error prone to use since you don't have to deal with forgetting to null terminate a string.

One disadvantage to length prefixed strings is that they require more space. In addition, you are limited in what the max size of the string can be based on how many bytes are used to store the length.
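A small sketch of the trade-off, assuming a one-byte length field (the `pstring` type and its functions are invented for illustration). The length lookup is constant time and the data may contain zero bytes, but nothing longer than 255 bytes fits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-prefixed string with a one-byte length field,
   so it can hold at most 255 bytes of data. */
struct pstring {
    unsigned char len;
    char data[255];
};

/* Copies n bytes into p. Returns 0 on success, -1 if n exceeds 255,
   in which case p is left unchanged. */
int pstring_set(struct pstring *p, const char *src, size_t n)
{
    assert(p != NULL && src != NULL);
    if (n > 255)
        return -1;
    memcpy(p->data, src, n);
    p->len = (unsigned char)n;
    return 0;
}

/* O(1): reads the stored length instead of scanning for a terminator. */
size_t pstring_length(const struct pstring *p)
{
    assert(p != NULL);
    return p->len;
}
```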

Top answer
1 of 7
27

If it's not null-terminated, then it's not a C string, and you can't use functions like strlen - they will march off the end of the array, causing undefined behaviour. You'll need to keep track of the length some other way.

You can still print a non-terminated character array with printf, as long as you give the length:

printf("str is %.3s",s2);
printf("str is %.*s",s2_length,s2);

or, if you have access to the array itself, not a pointer:

printf("str is %.*s", (int)(sizeof s2), s2);

You've also tagged the question C++: in that language, you usually want to avoid all this error-prone malarkey and use std::string instead.
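The same precision trick works with snprintf, which makes the result easy to check. This is a sketch with a made-up function name, `format_prefix`; note that the argument consumed by `*` has to be an int.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats at most `len` bytes of a possibly unterminated array into
   `out`. Returns what snprintf() returns: the number of characters the
   full result needs, not counting the terminator. */
int format_prefix(char *out, size_t out_size, const char *data, int len)
{
    assert(out != NULL && data != NULL && len >= 0);
    return snprintf(out, out_size, "str is %.*s", len, data);
}
```

The precision must not exceed the number of valid bytes in the array, otherwise the read can still run past the end.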

2 of 7
10

A "C string" is, by definition, null-terminated. The name comes from the C convention of having null-terminated strings. If you want something else, it's not a C string.

So if you have a string that is not null-terminated, you cannot use the C string manipulation routines on it. You can't use strlen, strcpy or strcat. Basically, any function that takes a char* but no separate length is not usable.

Then what can you do? If you have a string that is not null-terminated, you will have the length separately. (If you don't, you're screwed. You need some way to find the length, either by a terminator or by storing it separately.) What you can do is allocate a buffer of the appropriate size, copy the string over, and append a null. Or you can write your own set of string manipulation functions that work with pointer and length. In C++ you can use std::string's constructor that takes a char* and a length; that one doesn't need the terminator.
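The "allocate, copy, append a null" option might look like this in C. It is a minimal sketch, and `to_c_string` is a hypothetical name rather than a library function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated, null-terminated copy of the first `len`
   bytes of `data`, or NULL if allocation fails. The caller frees it. */
char *to_c_string(const char *data, size_t len)
{
    assert(data != NULL);

    char *copy = malloc(len + 1);   /* one extra byte for the terminator */
    if (copy == NULL)
        return NULL;
    memcpy(copy, data, len);
    copy[len] = '\0';
    return copy;
}
```

The result can be handed to strlen, strcpy and the other C string routines.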

Quora
What is the use of null termination in the C programming language? - Quora
Answer (1 of 5): In C, character strings are stored in char arrays. The presence of the null character, '\0', signifies the end of the string in the array. For example, if I declared a char array 8 chars long, I can use the strcpy function to store the string "hello" in the char array.
Weber State University
8.2. C-Strings
C++ arrays, including those forming C-strings, are zero-indexed, so C-strings always begin at index location 0. The null terminator can appear anywhere in the array, partially filling it if the terminator is not the last array element. The C-string functions ignore all array elements following the null terminator. The name of an array, without any trailing brackets, is the array's address.
SEI CERT
STR32-C. Do not pass a non-null-terminated character ...
Microsoft Learn
How to find length of non null terminated sequence of character's length in C and C++? - Microsoft Q&A
This will allow the compiler to catch attempts to alter the constant string, rather than wait for an exception at run time. ... Array b1 does have room for a nul at the end after the two chars but as you have not specified a value for the third element in the array it may not contain a nul. If these lines of code are at global scope - outside of main() or any other function - then the 3rd element in b1 will be default-initialized to binary zero which is the nul-terminator character.
Reddit
r/C_Programming on Reddit: Should I end a dynamically allocated array with a null terminated character?
January 13, 2023

I am confused about this topic. My teacher said that strings in C are null terminated automatically. So if I manually allocate 5 bytes for the string "word", should I add "\0" at the end or not?

Thank you for answering my question!

Top answer
1 of 4
5
Your teacher was probably talking about string literals. That is, char s[] = "word"; printf("%zu\n", sizeof s); prints 5, and s can be passed to all string-related functions and work exactly as expected. However, this is because of the initialisation line. If you dynamically allocate the string instead, you'll need to remember to add the null terminator.
2 of 4
3
First off, it's good that you're thinking about this. I suspect that your teacher's statement might be some Obi-Wan "true, from a certain point of view" speak. As both u/balkenbrij and u/Ninesquared81 pointed out, if you have a string literal in your code, when compiled, a null byte will be added, and in that sense, the string is "null terminated automatically." If you are using the string APIs in the C standard library, they will expect their inputs to be null-terminated, and will generally null-terminate their outputs (strncpy is a notable exception), and in that sense as well, the string is "null terminated automatically."

"if I manually allocate 5 bytes for the string "word", should I add "\0" at the end or not?"

Short answer, yes. A string in C has two characteristics: it's a block of contiguous memory, and it has a zero byte at the end. Allocating the memory gets you the first, but if you're not using a string API you need to do the second. You can either manually write a \0 after the end of the character sequence, or just use calloc to get a zero-initialized memory block and then it's null-terminated by default*.

Long answer: if you are allocating a char[] manually, and it is for a string in the C sense, and you expect to be able to use that manually-allocated character array as a string in the string APIs, then you must null-terminate it yourself. The reason I add this caveat is that, in general, an arbitrary data structure does not have to be null-terminated: if I have a fixed-size structure, then there's no reason to add a terminator to indicate the end, I already know how big it is.

I'm not terribly well-versed on the history of the C language, but I suspect this is an accident of the early days: by using a null-terminator to mark the end of the string, you can avoid having to maintain a struct with both the char pointer and a length element, which could save potentially several bytes per string. Several bytes!
* when using calloc, your string is null-terminated by default unless you overflow the allocated region, then all bets are off. Any time you are manually manipulating memory, be alert to the potential for overflows.
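Both options from the answer, sketched for the 5-byte "word" case. The function names are invented for this example, and the caller is expected to free the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* malloc() leaves the bytes uninitialized, so the terminator is written
   explicitly. Returns NULL on allocation failure. */
char *word_with_malloc(void)
{
    char *s = malloc(5);          /* 4 letters + 1 terminator */
    if (s == NULL)
        return NULL;
    memcpy(s, "word", 4);
    s[4] = '\0';
    assert(strlen(s) == 4);       /* now a valid C string */
    return s;
}

/* calloc() zero-fills the block, so after copying 4 letters the fifth
   byte is already '\0'. Returns NULL on allocation failure. */
char *word_with_calloc(void)
{
    char *s = calloc(5, 1);
    if (s == NULL)
        return NULL;
    memcpy(s, "word", 4);         /* must not write past index 3 */
    assert(s[4] == '\0');
    return s;
}
```

The calloc version only stays terminated as long as nothing writes into the last byte, which is the overflow caveat in the footnote.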
Ilya Safro
Null-terminated String
A zero byte (also called a null byte). A null-terminated string is a sequence of ASCII characters, one to a byte, followed by a zero byte (a null byte).