NULL character in C
string - About Null character in C language - Stack Overflow
c - How does one represent the empty char? - Stack Overflow
Null character '\0' & null terminated strings
So, it's represented by '\0' or '0', right?
Why '0' too, when its value is 48 and not 0? Is it not the character '0' in ASCII, but the symbol '0' used when talking about the NULL character?
EDIT: So it's just the ASCII character '\0'.
ALSO: Why is a char array printable as a string even if it contains no NULL character?
    #include <stdio.h>
    int main(void)
    {
        char s[] = {'h', 'i', '.', '.'};
        printf("%s\n", s);
        printf("%i %i %i %i\n", s[0], s[1], s[2], s[3]);
    }

prints

    hi..
    104 105 46 46

but if I change the last line to

    printf("%i %i %i %i %i\n", s[0], s[1], s[2], s[3], s[4]);

it complains "array index 4 is past the end of the array", which means there is no NULL.
EDIT: I just used the code above and changed s[] to s[5], and it printed the additional 0. Why? Can't printf see the NULL if you declare the array without specifying its size in advance?
You can use c[i]= '\0' or simply c[i] = (char) 0.
The null/empty char is simply a value of zero, but can also be represented as a character with an escaped zero.
You can't store "no character" in a character - it doesn't make sense.
As an alternative you could store a character that has a special meaning to you - e.g. null char '\0' - and treat this specially.
Hello everyone!
In C, strings (character arrays) are terminated by the null character '\0', a character with value zero.
In ASCII, the NUL control code has value 0 (0x00). Now, if we were working in a different character set (say the machine's character set were not ASCII), should strings be terminated by NUL in that character set, or by a character whose value is zero?
For example, if the machine's character set were UTF-16, then in C a byte would be 16 bits and strings would be terminated by the \0 character with value 0x0000, which is also NUL in UTF-16.
But what if the machine's character set were Modified UTF-8 (or UTF-7, ...)? Then, according to Wikipedia, the null character is encoded as two bytes, 0xC0 0x80. How would strings be terminated in that case: by the byte with value 0, or by the null character?
I guess my question could be rephrased as: are null-terminated strings terminated by the NUL character (which in that character set might be represented by a nonzero value), or by a character whose value is zero (which in that character set might not represent the NUL character)?
Thank you all very much, and I'm sorry for any mistakes, as English is not my first language.
Thanks again.
It doesn't.
The string terminator is a byte containing all 0 bits.
The unsigned int is two or four bytes (depending on your environment) each containing all 0 bits.
The two items are stored at different addresses. Your compiled code performs operations suitable for strings on the former location, and operations suitable for unsigned binary numbers on the latter. (Unless you have either a bug in your code, or some dangerously clever code!)
But all of these bytes look the same to the CPU. Data in memory (in most currently-common instruction set architectures) doesn't have any type associated with it. That's an abstraction that exists only in the source code and means something only to the compiler.
Edit-added: As an example: It is perfectly possible, even common, to perform arithmetic on the bytes that make up a string. If you have a string of 8-bit ASCII characters, you can convert the letters in the string between upper and lower case by adding or subtracting 32 (decimal). Or if you are translating to another character code you can use their values as indices into an array whose elements provide the equivalent bit coding in the other code.
To the CPU the chars are really extra-short integers (eight bits each instead of 16, 32, or 64). To us humans their values happen to be associated with readable characters, but the CPU has no idea of that. It also doesn't know anything about the C convention of "null byte ends a string" (and as many have noted in other answers and comments, there are programming environments in which that convention isn't used at all).
To be sure, there are some instructions in x86/x64 that tend to be used a lot with strings - the REP prefix, for example - but you can just as well use them on an array of integers, if they achieve the desired result.
In short there is no difference (except that an int is 2 or 4 bytes wide and a char just 1).
The thing is that all modern libraries either use the null terminator technique or store the length of a string. In both cases the program knows it has reached the end of a string when it has either read a null character or read as many characters as the stored length tells it to.
Issues start when the null terminator is missing or the length is wrong, because then the program starts reading memory it isn't supposed to.