From C11 5.2.4.2.1
The values given below shall be replaced by constant expressions suitable for use in #if preprocessing directives. Moreover, except for CHAR_BIT and MB_LEN_MAX, the following shall be replaced by expressions that have the same type as would an expression that is an object of the corresponding type converted according to the integer promotions. Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.
(emphasis mine)
So the standard defines that at a minimum UCHAR_MAX needs to be 255 but it can be greater than that.
The guarantees that we have on size are:
sizeof(char) == 1 and sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)
And at a minimum the signed versions of the data types must be able to hold:
char [-127, 127]
short [-32767, 32767]
int [-32767, 32767]
long [-2147483647, 2147483647]
long long [-9223372036854775807, 9223372036854775807]
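These minimums can be checked at compile time. The following is only a sketch, assuming a C++11 or later compiler. The macros come from <climits>; the helper name bits_per_char and the assertion messages are my own, not from the standard.

```cpp
#include <climits>

// Each assertion restates one guarantee quoted above. They all pass on any
// conforming implementation; the real limits are allowed to be larger.
static_assert(CHAR_BIT >= 8, "a char has at least 8 bits");
static_assert(UCHAR_MAX >= 255, "unsigned char holds at least 0..255");
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(SCHAR_MIN <= -127 && SCHAR_MAX >= 127, "signed char range");
static_assert(SHRT_MIN <= -32767 && SHRT_MAX >= 32767, "short range");
static_assert(INT_MIN <= -32767 && INT_MAX >= 32767, "int range");
static_assert(LONG_MIN <= -2147483647L && LONG_MAX >= 2147483647L, "long range");
static_assert(LLONG_MIN <= -9223372036854775807LL &&
              LLONG_MAX >= 9223372036854775807LL, "long long range");

// Hypothetical helper: number of bits in a char on this platform.
constexpr int bits_per_char() { return CHAR_BIT; }
```

On most desktop platforms bits_per_char() returns 8, but the standard only promises "at least 8".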
I don’t understand the use of unsigned char. I know C++ uses numbers to store letters according to the ASCII table, but the table ranges from 0 to 127, so it wouldn’t seem to make sense to have more numbers, like unsigned char with its range of 0 to 255. I know the default char is signed, which is -128 to 127. Also, could anyone give me example code showing the usage of unsigned char?
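One place where the difference shows up, as a hedged sketch: raw bytes above 127, such as those in UTF-8 text or binary files. Where plain char happens to be signed, such a byte converts to a negative int, while going through unsigned char always gives a value from 0 to 255 (assuming 8-bit chars). The helper names byte_value and count_non_ascii are made up for this illustration.

```cpp
#include <string>

// Hypothetical helper: numeric value of a byte, never negative.
// Converting through unsigned char avoids negative results when char is signed.
int byte_value(char c) {
    return static_cast<unsigned char>(c);
}

// Hypothetical helper: counts bytes outside the 7-bit ASCII range.
int count_non_ascii(const std::string& s) {
    int n = 0;
    for (char c : s) {
        if (byte_value(c) > 127) {
            ++n;
        }
    }
    return n;
}
```

For example, the UTF-8 encoding of "é" is the two bytes 0xC3 0xA9, so count_non_ascii("caf\xC3\xA9") counts two non-ASCII bytes.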
For the actual size of the pointer:
size_t s = sizeof(unsigned char*);
If you want the length of the string:
const unsigned char* bla = (const unsigned char*)"blabla";  // string literals are const
size_t len = strlen((const char*)bla);  // strlen needs <cstring> and returns size_t
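The two cases can be put side by side in a small sketch. The function names pointer_size and ustrlen are hypothetical, and ustrlen assumes the buffer really is NUL-terminated, which strlen cannot verify.

```cpp
#include <cstddef>
#include <cstring>

// Size of the pointer object itself; it does not depend on the data pointed to.
std::size_t pointer_size() {
    return sizeof(unsigned char*);
}

// Length of a NUL-terminated string that arrived as unsigned char*.
// Assumption: `s` is NUL-terminated.
std::size_t ustrlen(const unsigned char* s) {
    return std::strlen(reinterpret_cast<const char*>(s));
}
```

With bla pointing at "blabla", ustrlen(bla) is 6, while pointer_size() is typically 4 or 8 depending on the platform.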
In an ideal world, you don't. You use char* for C-style strings (which are NUL-terminated and you can measure the length of), and unsigned char* only for byte data (which comes with its length in another parameter or whatever, and which you probably get into an STL container ASAP, such as vector<unsigned char> or basic_string<unsigned char>).
The root problem is that you can't make portable assumptions about whether the storage representations of char and unsigned char are the same. They usually are, but they're allowed not to be. So there are no string-like library functions which operate on unsigned char*, only on char*, and it is not in general safe to cast unsigned char* to signed char* and treat the result as a string. Since char might be signed, this means no casting unsigned char* to char*.
However, 0 is always the same value representation in unsigned char and char. So in a non-ideal world, if you've got a C-style string from somewhere but it has arrived as an unsigned char*, then you (a) cast it to char* and get on with it, but also (b) find out who did this to you, and ask them please to stop.
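A rough sketch of both situations, with hypothetical helper names: copy_bytes treats the input as byte data whose length arrives separately, and string_from_uchar handles the non-ideal case of a C-style string delivered as unsigned char*. The second helper assumes the input is NUL-terminated.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Byte data: the length travels with the pointer, so no terminator is needed
// and embedded zero bytes are preserved.
std::vector<unsigned char> copy_bytes(const unsigned char* data, std::size_t len) {
    return std::vector<unsigned char>(data, data + len);
}

// Non-ideal case: a C-style string that arrived as unsigned char*.
// Assumption: `s` is NUL-terminated.
std::string string_from_uchar(const unsigned char* s) {
    return std::string(reinterpret_cast<const char*>(s));
}
```

Once the bytes are in a container, size() gives the length directly, so nothing has to scan for a terminator.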
In Rust it's easy. There is just u8 and i8.
But C++ has char, signed char, unsigned char, uint8_t, int8_t and std::byte. Which of these am I supposed to use?
Not all platforms implement char as 8 bits. The standard dictates that a char must be at least 8 bits, but doesn't require it to be exactly 8.
int8_t was therefore added in order to avoid confusion. It's guaranteed to be exactly 8 bits, but a platform-specific compiler isn't required to implement it.
unsigned char and uint8_t, respectively, are the unsigned equivalent types.
std::byte follows the same rules as char, only it doesn't offer integer-specific arithmetic like +, - etc. This is in order to emphasize its intent as a binary data holder and not an alphanumeric value.
I generally use those types for the following things:
- char: ASCII character.
- unsigned char: Extended ASCII character. (rare)
- int8_t: Integer value between -128 and 127.
- uint8_t: Integer value between 0 and 255.
- std::byte: Use with a collection (e.g: std::vector<std::byte>) type to represent an opaque data blob.
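To make the last two items concrete, here is a minimal sketch assuming C++17 (std::byte is not available earlier) and a platform that provides uint8_t. The function names average and checksum_xor are invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// uint8_t as a small integer: arithmetic is done in int after promotion,
// and the result is narrowed back only because it is known to fit in 0..255.
std::uint8_t average(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((static_cast<int>(a) + b) / 2);
}

// std::byte as opaque data: there is no + or -, but bitwise operators exist,
// and std::to_integer converts back to an integer type when one is needed.
unsigned checksum_xor(const std::vector<std::byte>& blob) {
    std::byte acc{0};
    for (std::byte b : blob) {
        acc ^= b;
    }
    return std::to_integer<unsigned>(acc);
}
```

Writing acc + b here would not compile, which is the point of std::byte: the type system keeps blob data from being treated as numbers by accident.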
Edit: Formatting, clarity.