python convert string to bytes without encoding using

How to cast a string to bytes without encoding

stackoverflow.com › questions › 42795042 › how-to-cast-a-string-to-bytes-without-encoding

Though I suspect something else is decoding your data for you (a char* in C is usually best represented as bytes, especially if it is binary data):

The latin1 codec can round trip every byte. You can verify this with the following short program:

>>> s = ''.join(chr(i) for i in range(0x100))
>>> s
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0¡¢£¤¥¦§¨©ª«¬\xad®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ'
>>> s2 = s.encode('latin1').decode('latin1')
>>> s2 == s
True
>>> sb = bytes(range(0x100))
>>> sb
b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
>>> sb == s.encode('latin1')
True

Answer from anthony sottile on Stack Overflow

2 of 6

Just now I ran into the same problem. This is what I came up with:

import struct

def rawbytes(s):
    """Convert a string to raw bytes without encoding"""
    outlist = []
    for cp in s:
        num = ord(cp)
        if num <= 0xFF:
            # Fits in one byte (0xFF itself included)
            outlist.append(struct.pack('B', num))
        elif num <= 0xFFFF:
            # Two bytes, big-endian (0xFFFF itself included)
            outlist.append(struct.pack('>H', num))
        else:
            # Three bytes, big-endian: high byte, then the low 16 bits
            b = (num & 0xFF0000) >> 16
            H = num & 0xFFFF
            outlist.append(struct.pack('>BH', b, H))
    return b''.join(outlist)

Some examples:

In [34]: rawbytes('this is a test')
Out[34]: b'this is a test'

In [35]: rawbytes('\udc80\udcdf\udcff\udcff\udcff\x7f')
Out[35]: b'\xdc\x80\xdc\xdf\xdc\xff\xdc\xff\xdc\xff\x7f'

Python.org

discuss.python.org › ideas

Allow `bytes(mystring)` without specifying the encoding - Ideas - Discussions on Python.org

September 20, 2022 - ", line 1, in "hello".encode() b'hello' For consistency, I would suggest that calling bytes on a str object without an encoding also assumes UTF-8 by default, as ...

Discussions

String to Bytes Python without change in encoding - Stack Overflow

I have this issue and I can't figure out how to solve it. I have this string: data = '\xc4\xb7\x86\x17\xcd' When I tried to encode it: data.encode() I get this result: b'\xc3\x84\xc2\xb7\xc2\x86... More on stackoverflow.com

stackoverflow.com

January 21, 2018

Convert bytes to a string in Python 3 - Stack Overflow

See Best way to convert string to bytes in Python 3? for the other way around. ... @CharlieParker Because str(text_bytes) can't specify the encoding. More on stackoverflow.com

stackoverflow.com

image - How to create an HTML img tag string with base64 encoding from bytes in Python? - Stack Overflow

I am using Python 3.6 and I have an image as bytes: img = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00' I need to convert the bytes into a string without encoding so it looks like: raw_img = '\xff\xd8\xff\x... More on stackoverflow.com

stackoverflow.com

Best way to convert string to bytes in Python 3? - TestMu AI Community

Best way to convert string to bytes in Python 3 More on community.testmuai.com

community.testmuai.com

June 6, 2024

reddit.com › r/learnpython › how to read string to bytes without encoding?

r/learnpython on Reddit: How to read string to bytes without encoding?

May 9, 2019 -

I'd been receiving some binary data on a python socket, and printing the data to the console for a few days. I've switched the output to files, but would like to reclaim the previous data. The data looks like this, but when I load the file, it goes in and encodes the data, escaping the single quotes and pre-existing backslashes, like this.

I'm wanting to read the file, one line at a time, or loading the entire file as an array or list, but need to bypass or backtrack the encoding to use the data properly. Any pointers on what modules/functions I should be looking at?

Top answer

1 of 2

https://repl.it/repls/DroopyInfatuatedLicense

2 of 2

The data that you currently have is not binary data; it is the text of Python bytes literals. You should not try to take these literals and do anything with them. Instead, you should follow the data back to find out where these literals came from and try to fix the problem at its source. Show us your code that reads from the socket and writes to the file.
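If fixing the source is not an option and the old console output has to be reclaimed, one possible approach is to parse each saved line back into a bytes object with ast.literal_eval. This is only a sketch: the helper name recover_bytes and the sample line are hypothetical, and it assumes every line of the file is exactly what print() produced for a bytes object.

```python
import ast

def recover_bytes(line):
    """Turn the text of a printed bytes literal back into a bytes object."""
    # literal_eval only evaluates Python literals, so it is safe on log text
    value = ast.literal_eval(line.strip())
    if not isinstance(value, bytes):
        raise ValueError("line is not a bytes literal: %r" % line)
    return value

# Hypothetical line as it might appear in a file of print(data) output
logged_line = r"b'\x00\x01abc\xff'" + "\n"
recovered = recover_bytes(logged_line)
print(recovered, len(recovered))
```

For new data, writing the socket output to a file opened in 'wb' mode avoids the problem entirely.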

Stack Overflow

stackoverflow.com › questions › 48367128 › string-to-bytes-python-without-change-in-encoding

String to Bytes Python without change in encoding - Stack Overflow

Top answer

1 of 2

You cannot convert a string into bytes, or bytes into a string, without taking an encoding into account. The whole point of the bytes type is that it is an encoding-independent sequence of bytes, while str is a sequence of Unicode code points which by design have no unique byte representation.

So when you want to convert one into the other, you must tell explicitly what encoding you want to use to perform this conversion. When converting into bytes, you have to say how to represent each character as a byte sequence; and when you convert from bytes, you have to say what method to use to map those bytes into characters.

If you don’t specify the encoding, then UTF-8 is the default, which is a sane default since UTF-8 is ubiquitous, but it's also just one of many valid encodings.

Take your original string, '\xc4\xb7\x86\x17\xcd', and look at which Unicode code points its characters represent. \xc4, for example, is LATIN CAPITAL LETTER A WITH DIAERESIS, i.e. Ä. That character happens to be encoded in UTF-8 as 0xC3 0x84, which explains why that’s what you get when you encode it into bytes. In UTF-16, for example, the same character is encoded as 0x00C4.
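As a quick illustration of that point, the sketch below encodes the single character '\xc4' with three codecs and collects the results. The variable names are only illustrative; 'utf-16-be' is used so that no byte order mark is prepended.

```python
import unicodedata

ch = '\xc4'
name = unicodedata.name(ch)

# Same code point, three different byte representations
encodings = {codec: ch.encode(codec) for codec in ('utf-8', 'latin-1', 'utf-16-be')}

print(name)
for codec, raw in encodings.items():
    print(codec, raw)
```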

As for how to solve this properly so you get the desired output, there is no clear correct answer. The solution that Kasramvd mentioned is also somewhat imperfect. If you read about the raw_unicode_escape codec in the documentation:

raw_unicode_escape

Latin-1 encoding with \uXXXX and \UXXXXXXXX for other code points. Existing backslashes are not escaped in any way. It is used in the Python pickle protocol.

So this is just a Latin-1 encoding which has a built-in fallback for characters outside of it. I would consider this fallback somewhat harmful for your purpose. For Unicode characters that cannot be represented as a \xXX sequence, this might be problematic:

>>> chr(256).encode('raw_unicode_escape')
b'\\u0100'

So the code point 256 is explicitly outside of Latin-1 which causes the raw_unicode_escape encoding to instead return the encoded bytes for the string '\\u0100', turning that one character into 6 bytes which have little to do with the original character (since it’s an escape sequence).

So if you want to use Latin-1 here, I would suggest you use it explicitly, without the escape sequence fallback from raw_unicode_escape. This will simply raise an exception when trying to convert code points outside the Latin-1 range:

>>> '\xc4\xb7\x86\x17\xcd'.encode('latin1')
b'\xc4\xb7\x86\x17\xcd'
>>> chr(256).encode('latin1')
Traceback (most recent call last):
  File "<pyshell#28>", line 1, in <module>
    chr(256).encode('latin1')
UnicodeEncodeError: 'latin-1' codec can't encode character '\u0100' in position 0: ordinal not in range(256)

Of course, whether or not code points outside the Latin-1 range can cause problems for you depends on where that string actually comes from. But if you can guarantee that the input will only contain valid Latin-1 characters, then chances are that you don't really need to be working with a string there in the first place. Since you are actually dealing with some kind of bytes, you should check whether you can simply retrieve those values as bytes to begin with. That way you won’t introduce two levels of encoding where data can be corrupted by misinterpreting the input.
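If the string really cannot be avoided, a small wrapper can make the failure mode explicit. The following is a minimal sketch, not part of the original answer; the function name latin1_bytes and the error message are made up for illustration.

```python
def latin1_bytes(text):
    """Map each code point 0..255 to the byte with the same value.

    Raises ValueError instead of silently escaping anything larger.
    """
    try:
        return text.encode('latin-1')
    except UnicodeEncodeError as exc:
        bad = text[exc.start]  # first character that does not fit
        raise ValueError(
            "code point U+%04X at index %d does not fit in one byte"
            % (ord(bad), exc.start)
        ) from exc

data = latin1_bytes('\xc4\xb7\x86\x17\xcd')
print(data)
```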

2 of 2

You can use 'raw_unicode_escape' as your encoding:

In [14]: bytes(data, 'raw_unicode_escape')
Out[14]: b'\xc4\xb7\x86\x17\xcd'

As mentioned in comments you can also pass the encoding directly to the encode method of your string.

In [15]: data.encode("raw_unicode_escape")
Out[15]: b'\xc4\xb7\x86\x17\xcd'

Devace Technologies

devacetech.com › home › insights › string to bytes conversion in python-2025 manual

How to convert a string to bytes in Python

July 28, 2025 - In the given example, bytes() receives text and an encoding scheme, giving the same result as .encode(). Beginners may sometimes face errors while converting a string to bytes in Python, including: ... python # This will raise a TypeError bytes ...

Stack Overflow

stackoverflow.com › questions › 606191 › convert-bytes-to-a-string-in-python-3

Convert bytes to a string in Python 3 - Stack Overflow

Top answer

1 of 16

Decode the bytes object to produce a string:

>>> b"abcde".decode("utf-8")
'abcde'

The above example assumes that the bytes object is in UTF-8, because it is a common encoding. However, you should use the encoding your data is actually in!
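To see why the actual encoding matters, here is a small sketch that decodes the same two bytes with the right codec and with a wrong one. The variable names are illustrative only.

```python
raw = '\xc4'.encode('utf-8')    # the UTF-8 bytes for 'Ä'

right = raw.decode('utf-8')     # the original single character
wrong = raw.decode('latin-1')   # no error, but two unrelated characters

print(repr(right), repr(wrong))
```

Decoding with the wrong codec does not necessarily raise an exception; it can quietly produce different text.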

2 of 16

Decode the byte string to turn it into a character (Unicode) string.

Python 3:

encoding = 'utf-8'
b'hello'.decode(encoding)

str(b'hello', encoding)

Python 2:

encoding = 'utf-8'
'hello'.decode(encoding)

unicode('hello', encoding)

Google Groups

groups.google.com › g › comp.lang.python › c › 3nPnNzgBoxQ

"convert" string to bytes without changing data (encoding)

March 28, 2012 - Steven D'Aprano <steve+comp....@pearwood.info> wrote: >The right way to convert bytes to strings, and vice versa, is via >encoding and decoding operations. If you want to dictate to the original poster the correct way to do things then you don't need to do anything more than that. You don't need to pretend like Chris Angelico that there isn't a direct mapping from his Python 3 implementation's internal representation of strings to bytes in order to label what he's asking for as being "silly".

Analytics Vidhya

analyticsvidhya.com › home › 7 ways to convert string to bytes in python

7 Ways to Convert String to Bytes in Python - Analytics Vidhya

February 7, 2024 - The bytes() function provides a simple way to convert strings to bytes. Like the encode() method, it returns an immutable bytes object. However, it is important to note that the bytes() function may raise ...

Stack Overflow

stackoverflow.com › questions › 49677790 › convert-python-bytes-to-string-without-encoding

image - How to create an HTML img tag string with base64 encoding from bytes in Python? - Stack Overflow

Top answer

1 of 6

img.decode("utf-8")

You can decode the variable with the above. Then convert it to base64.

"<img src='data:image/png;base64,{}'/>".format( base64.b64encode(img.decode("utf-8")) )

UPDATED:

What you really want is this:

raw_img = repr(img)
"<img src='data:image/png;base64,{}'/>".format( base64.b64encode(raw_img) )

2 of 6

I've solved it (2022, a bit late to the party...). If you try img_raw.decode() you get the error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte.

But if you leave img_raw as a bytes object, pass it into b64encode and then decode the result, there is no UnicodeDecodeError, and you can pass it in as a data string to your image tag.

base64.b64encode(img_raw).decode()
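Put together, the approach from the second answer might look like the sketch below. The sample bytes come from the question; the image/jpeg MIME type is an assumption based on the JFIF marker in those bytes (the first answer used image/png), and the variable names are illustrative.

```python
import base64

# Truncated sample bytes from the question (the start of a JPEG/JFIF header)
img = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00'

# b64encode takes bytes and returns ASCII-only bytes, so decoding is safe
b64_text = base64.b64encode(img).decode('ascii')

# Assumed MIME type: image/jpeg
tag = "<img src='data:image/jpeg;base64,{}'/>".format(b64_text)
print(tag)
```

No text decoding of the raw image bytes is needed at any point; only the base64 output is turned into a str.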

Python

mail.python.org › pipermail › python-list › 2012-March › 621947.html

"convert" string to bytes without changing data (encoding)

March 28, 2012 -

>> You don't need to pretend like Chris Angelico that there isn't a direct mapping from his Python 3 implementation's internal representation of strings to bytes in order to label what he's asking for as being "silly".
>
> It might be technically possible to recreate the internal implementation, or get the byte data. That does not mean it will make any sense or be understood in a meaningful manner. I think Ian summarized it very well:
>
>> You can't generally just "deal with the ascii portions" without knowing something about the encoding.

Flexiple

flexiple.com › python › python-string-to-bytes

How to convert Python string to bytes? | Flexiple Tutorials | Python - Flexiple

bytes() is a built-in function that can be used to convert objects to bytes objects. ... It takes an object (a string in our case) and the required encoding, and converts the object into a bytes object.

RTEE Tech

blog.rteetech.com › home › python convert string to bytes – methods, encoding & alternatives

Python Convert String to Bytes | Methods, Encoding, Alternatives

February 27, 2025 - In such cases, you can use the bytes() function. Note that for a str argument, bytes() still requires an encoding:

string = "Hello, World!"
byte_string = bytes(string, 'utf-8')
print(byte_string)

In most situations, it is essential to define the encoding to ensure the correct ...

GeeksforGeeks

geeksforgeeks.org › python › how-to-fix-typeerror-string-argument-without-an-encoding-in-python

How to Fix TypeError: String Argument Without an Encoding in Python - GeeksforGeeks

July 23, 2025 - encode() method turns a string into a sequence of bytes, using a format like 'utf-8' that tells Python how to represent the characters in the string. ... When we use the bytes() function, we need to specify the encoding format, like 'utf-8', ...
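The error in that title is easy to reproduce. The sketch below shows the failing call and two working alternatives; the variable names are illustrative.

```python
text = "hello"

# bytes() refuses a str when no encoding is given
try:
    bytes(text)
except TypeError as exc:
    message = str(exc)
    print("TypeError:", message)

# Either pass the encoding to bytes() ...
encoded_with_bytes = bytes(text, 'utf-8')
# ... or call encode(), which defaults to UTF-8
encoded_with_encode = text.encode()

print(encoded_with_bytes, encoded_with_encode)
```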

GeeksforGeeks

geeksforgeeks.org › python › python-convert-string-to-bytes

Convert String to bytes-Python - GeeksforGeeks

The goal here is to convert a string into bytes in Python. This is essential for working with binary data or when encoding strings for storage or transmission.

Published July 11, 2025

KDnuggets

kdnuggets.com › convert-bytes-to-string-in-python-a-tutorial-for-beginners

Convert Bytes to String in Python: A Tutorial for Beginners - KDnuggets

July 15, 2024 - Note: Strings do not have an associated ... bytes to string, you can use the decode() method on the bytes object. And to convert string to bytes, you can use the encode() method on the string....

reddit.com › r/codinghelp › converting between string to bytes without creating a double backslash.

r/CodingHelp on Reddit: CONVERTING between string to bytes without creating a double backslash.

January 24, 2022 -

I have a string of bytes that I have read from a file:

\x00\x01\x00\xc0\x01\x00\x00\x00\x04 This is a string not bytes.

I know I can convert it to bytes via

s_new = bytes(string, "raw_unicode_escape")

if I want to make it b"\x00\x01\x00\xc0\x01\x00\x00\x00\x04"

This only works if I pass the string in directly from the program and not read it in from a file. If I read it in from a file it becomes:

b'\\x00\\x01\\x00\\xc0\\x01\\x00\\x00\\x00\\x04'

The double backslash occurs and I don't know why :/ This doesn't occur when passing in the string that has not been read from a file.

Any help?

Top answer

1 of 3

2 of 3

How are you reading the file? Are you using open() with "rb" as the file mode? https://docs.python.org/3/library/io.html#binary-i-o
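The sketch below contrasts the two situations that question may be describing. It is an illustration under assumptions: the file names are made up, and it is a guess that the poster's file contains the escape sequences as plain text rather than the raw bytes themselves.

```python
import os
import tempfile

payload = b"\x00\x01\x00\xc0\x01\x00\x00\x00\x04"        # 9 real bytes
escaped_text = r"\x00\x01\x00\xc0\x01\x00\x00\x00\x04"   # 36 characters of text

with tempfile.TemporaryDirectory() as tmp:
    # Case 1: the file holds the raw bytes; binary mode returns them unchanged
    bin_path = os.path.join(tmp, "data.bin")
    with open(bin_path, "wb") as f:
        f.write(payload)
    with open(bin_path, "rb") as f:
        from_binary = f.read()

    # Case 2: the file holds backslash-x text; reading gives literal backslashes
    txt_path = os.path.join(tmp, "data.txt")
    with open(txt_path, "w", encoding="ascii") as f:
        f.write(escaped_text)
    with open(txt_path, "r", encoding="ascii") as f:
        from_text = f.read()

# The backslashes are real characters, so they survive as bytes and repr()
# displays each one doubled
doubled = bytes(from_text, "raw_unicode_escape")

# Interpreting the escape sequences turns the text into the 9 intended bytes
converted = from_text.encode("latin-1").decode("unicode_escape").encode("latin-1")

print(from_binary)
print(doubled)
print(converted)
```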

reddit.com › r/python › how to convert bytes to string in python?

r/Python on Reddit: How to convert bytes to string in python?

February 8, 2016 -

i am stuck in python programming. i read a line from the serial port

line=ser.readline()
print(line)

when the above code is executed i get like

b'-83,-156,-205\r\n'

but i want only values like a=-83,b=-156,c=-205. how to extract these values/ how to get these values?

Top answer

1 of 3

Try this:

line = ser.readline().strip()
values = line.decode('ascii').split(',')
a, b, c = [int(s) for s in values]

The call to .strip() removes the trailing newline. The call .decode('ascii') converts the raw bytes to a string. .split(',') splits the string on commas. Finally, [int(s) for s in values] is a list comprehension that produces a list of integers.

2 of 3

On Python 3:

line = ser.readline().decode()

You can also specify the character encoding:

line = ser.readline().decode("ascii")
line = ser.readline().decode("utf-8")
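The same steps can be tried without a serial port attached. In this sketch the function name parse_reading is hypothetical, and the sample line is the one shown in the question.

```python
def parse_reading(line):
    """Decode one line of comma-separated integers received as bytes."""
    text = line.decode('ascii').strip()   # drop the trailing \r\n
    return [int(field) for field in text.split(',')]

a, b, c = parse_reading(b'-83,-156,-205\r\n')
print(a, b, c)
```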

reddit.com › r/learnpython › how do i convert a string to bytes?

r/learnpython on Reddit: How do I convert a string to bytes?

August 8, 2023 -

Suppose I have something like

s = "GW\x25\001"

How do I convert that string to bytes, interpreting the backslashes as escapes? In other words, the resulting byte array should be of length 4.

UPDATE: Hmmm, I was taking the string from sys.argv[1], which seems to complicate things and not make it turn out as expected. So I'm still not sure what the answer is.

Top answer

1 of 4

Or use the bytes() builtin function. An example:

x = "abc☺xyz"
print(x)
y = bytes(x, encoding="utf-8")
print(y)

2 of 4

If you assign a string literal to a variable like this

>>> s = "GW\x25\001"

then the escape sequences (backslash codes) are interpreted then and there, and the string will be just 4 characters long, so you can convert it to bytes with encode() in the normal way.

>>> len(s)
4

Did you mean that you have a raw string containing the spelled-out escape sequences that you wish to interpret?

>>> s = r"GW\x25\001"
>>> len(s)
10

In that case, you can use the unicode_escape codec. Note that Python assumes that the encoding of bytestrings containing escape sequences is latin-1 rather than utf-8, so encoding the result with latin-1 keeps each escaped value as a single byte.

>>> b = s.encode('latin-1').decode('unicode_escape').encode('latin-1')
>>> b
b'GW%\x01'
>>> len(b)
4

Delft Stack

delftstack.com › home › howto › python › how to convert string to bytes in python

How to Convert String to Bytes in Python | Delft Stack

March 4, 2025 - The bytes constructor is a straightforward way to convert a string into bytes. This method takes a string and an optional encoding argument and returns the corresponding byte representation.