List comprehension.
[c + '\0' for c in S]
But it smells like you want UTF-16LE instead.
u'teststring'.encode('utf-16le')
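To see why the two approaches are related, here is a minimal sketch that is not part of the original answer. It assumes Python 3 and ASCII-only input, and the variable names are only illustrative.

```python
# Assumes Python 3 and ASCII-only input; `s` is an illustrative name.
s = "teststring"

# Put a NUL character after every character, as the list comprehension does.
padded = "".join(c + "\0" for c in s)

# Encode the original string straight to UTF-16LE bytes.
encoded = s.encode("utf-16le")

# For ASCII input the two byte sequences are identical.
same = padded.encode("ascii") == encoded
print(same)
print(len(encoded))  # two bytes per character
```

For text outside ASCII the manual padding no longer matches, which is one reason to prefer the real encoder.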
Answer from Ignacio Vazquez-Abrams on Stack Overflow
My output looked correct; however, check50 said the output was not valid ASCII text. After googling, I fixed it by adding the null character to the end of the output array.
I understand that the null character is how C tells where a string ends, but I still don't fully understand why it was needed for my program, since the output I see looks the same whether or not that character is there. My output is in a char array, so doesn't C already know the size of that array?
Is someone able to explain why adding this character was needed? Is it just something that check50 checks arbitrarily to qualify it as "valid ASCII text"?
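The "null character marks the end" convention can be observed from Python with the standard ctypes module. This is only a sketch of the C convention itself; it does not reproduce check50, and what check50 actually inspects is not confirmed here. The buffer names are illustrative.

```python
import ctypes

# create_string_buffer allocates a C char array and appends a terminating NUL.
buf = ctypes.create_string_buffer(b"HELLO")
print(ctypes.sizeof(buf))  # array size in bytes, terminator included
print(buf.raw)             # every byte in the array
print(buf.value)           # the bytes up to the first NUL, i.e. the C string

# An embedded NUL ends the C string early even though the array is longer.
cut = ctypes.create_string_buffer(b"HEL\x00LO")
print(cut.value)
```

The array has a size, but code that treats it as a C string only looks for the first NUL byte. If that byte is missing, such code keeps reading whatever happens to follow the array in memory.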
In Python 2, when a number starts with a leading zero, it means it's in octal (base 8). In Python 3, octal literals start with 0o instead. 00 specifically is 0.
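A quick way to check the literal rules, sketched for Python 3. The helper is_valid_literal is hypothetical and exists only for this illustration.

```python
# Assumes Python 3; is_valid_literal is a hypothetical helper for illustration.
def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<literal>", "eval")
        return True
    except SyntaxError:
        return False

print(0o17)                      # octal 17 is decimal 15
print(int("17", 8))              # same conversion, starting from a string
print(is_valid_literal("0o17"))  # the Python 3 octal spelling
print(is_valid_literal("017"))   # the Python 2 spelling, rejected by Python 3
print(eval("00"))                # a literal made only of zeros is still allowed
```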
The leading \ in \00 is a way of specifying a byte value, a number between 0 and 255. It's normally used to represent a character that isn't on your keyboard, or otherwise can't be easily represented in a string. Some special characters also have non-numeric "escape codes", like \n for a newline.
The zero byte is also known as the nul byte or null byte. It doesn't display anything when printed -- it's null.
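A small sketch of the escape in action, assuming Python 3 string literals. The variable name nul is illustrative.

```python
# "\00" is an octal escape: a single character whose code is 0.
nul = "\00"
print(len(nul))                         # one character, not three
print(ord(nul))                         # its numeric value
print(nul == "\x00" == "\0" == chr(0))  # different spellings, same character

# repr() makes the otherwise invisible character visible.
print(repr("a" + nul + "b"))

# Other octal escapes work the same way: octal 101 is decimal 65, "A".
print(ord("\101"), "\101")
```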
See http://www.ascii.cl/ for ASCII character codes.
Yes, replace will still work with it, it just has no meaning as a display character.
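For instance, a minimal sketch in Python 3, where the sample string is made up for illustration:

```python
# A string with an embedded NUL character between two words.
text = "left\0right"

print("\0" in text)            # the NUL is an ordinary member of the string
print(text.replace("\0", ""))  # remove it
print(text.replace("\0", " ")) # or swap it for something visible
print(text.split("\0"))        # or use it as a delimiter
```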
It's sometimes used for other purposes, see http://en.wikipedia.org/wiki/Null_character.
The backslash followed by a number is used to represent the character with that octal value. So your \00 represents ASCII NUL.
A representative example of the JSON I would like to create is:
[
  {
    "aaaa": {
      "bbbb": [
        {
          "cccc": "eeee",
          "dddd": "ffff\u0000gggg"
        }
      ]
    }
  }
]
What I would like to be able to do is separate ffff and gggg with the null character as a delimiter.
Is this valid JSON according to the spec?
Googling turned up little information. I did find:
https://jansson.readthedocs.io/en/1.2/conformance.html
which says:
JSON strings are mapped to C-style null-terminated character arrays, and UTF-8 encoding is used internally. Strings may not contain embedded null characters, not even escaped ones. For example, trying to decode the following JSON text leads to a parse error: ["this string contains the null character: \u0000"] All other Unicode codepoints U+0001 through U+10FFFF are allowed.
and this seems to indicate that ffff\u0000gggg is not legal.
However, based on my tests, ffff\u0000gggg seems to be parsed correctly by both Python and JavaScript parsers. I am not sure whether I am just getting lucky or what exactly the right answer is.
Can anyone clear this up?
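The Python side of that test can be reproduced with the standard json module. This is a sketch of one parser's behaviour only, not proof of what every library does. As far as the JSON grammar (RFC 8259) goes, control characters U+0000 through U+001F must be escaped inside strings, and \u0000 is such an escape; the Jansson note quoted above describes a restriction of that library version.

```python
import json

# The example document, with the NUL written as the \u0000 escape.
doc = '[{"aaaa": {"bbbb": [{"cccc": "eeee", "dddd": "ffff\\u0000gggg"}]}}]'

value = json.loads(doc)[0]["aaaa"]["bbbb"][0]["dddd"]
print(repr(value))          # the decoded string contains a real NUL character
print(value.split("\x00"))  # so it can serve as a delimiter
print(json.dumps(value))    # and it is escaped again on output

# A raw, unescaped NUL inside a JSON string is a different matter.
try:
    json.loads('"ffff\x00gggg"')
    raw_accepted = True
except json.JSONDecodeError:
    raw_accepted = False
print(raw_accepted)
```

If the data has to pass through a C library that stores strings as null-terminated arrays, the embedded NUL may still be rejected or truncated there, so a different delimiter could be the safer choice.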