decode the bytes to produce a str:

b = b'1234'
print(b.decode('utf-8'))  # '1234'
Answer from hiro protagonist on Stack Overflow
2 of 9
27

The object you are printing is not a string but a bytes object, shown in its bytes-literal form.

Consider creating a bytes object with a bytes literal (that is, typing the value directly with a b prefix, e.g. b'') and converting it into a string by decoding it as UTF-8. (Note that converting here means decoding.)

byte_object = b"test"  # bytes object created from a bytes literal
print(byte_object)  # prints b'test'
print(byte_object.decode('utf8'))  # prints test, without quotes or the b prefix

We simply called the .decode('utf8') method.
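
The same approach applies to base64 output, which is where the b'' prefix most often shows up. The following is a minimal sketch; the variable names are illustrative and not taken from the answers above.

```python
import base64

# b64encode returns bytes, so printing the result shows the b'' prefix
encoded = base64.b64encode(b"test")

# Base64 output contains only ASCII characters, so decoding as ASCII is safe
text = encoded.decode("ascii")

print(encoded)  # b'dGVzdA=='
print(text)     # dGVzdA==
```

Decoding does not strip characters from the value; it produces a str object, which simply prints without the prefix.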


String literals are described by the following lexical definitions:

https://docs.python.org/3.3/reference/lexical_analysis.html#string-and-bytes-literals

stringliteral   ::=  [stringprefix](shortstring | longstring)
stringprefix    ::=  "r" | "u" | "R" | "U"
shortstring     ::=  "'" shortstringitem* "'" | '"' shortstringitem* '"'
longstring      ::=  "'''" longstringitem* "'''" | '"""' longstringitem* '"""'
shortstringitem ::=  shortstringchar | stringescapeseq
longstringitem  ::=  longstringchar | stringescapeseq
shortstringchar ::=  <any source character except "\" or newline or the quote>
longstringchar  ::=  <any source character except "\">
stringescapeseq ::=  "\" <any source character>

bytesliteral   ::=  bytesprefix(shortbytes | longbytes)
bytesprefix    ::=  "b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB"
shortbytes     ::=  "'" shortbytesitem* "'" | '"' shortbytesitem* '"'
longbytes      ::=  "'''" longbytesitem* "'''" | '"""' longbytesitem* '"""'
shortbytesitem ::=  shortbyteschar | bytesescapeseq
longbytesitem  ::=  longbyteschar | bytesescapeseq
shortbyteschar ::=  <any ASCII character except "\" or newline or the quote>
longbyteschar  ::=  <any ASCII character except "\">
bytesescapeseq ::=  "\" <any ASCII character>
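
A short sketch of what the bytes grammar above means in practice: the b prefix creates a bytes literal, adding r turns off escape processing, and non-ASCII characters are rejected. Using compile() here is only a convenience for this example, so that the SyntaxError can be caught at run time.

```python
# The b prefix makes a bytes literal; adding r disables escape processing
plain = b"a\tb"   # \t is a single tab byte
raw = rb"a\tb"    # the backslash and the 't' stay as two separate bytes

print(len(plain))  # 3
print(len(raw))    # 4

# Bytes literals accept ASCII characters only. The source text b'é' is built
# in a normal string and compiled, so the error can be caught instead of
# stopping the whole script.
try:
    compile("b'\u00e9'", "<example>", "eval")
    non_ascii_allowed = True
except SyntaxError:
    non_ascii_allowed = False

print(non_ascii_allowed)  # False
```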
python - Why do I need 'b' to encode a string with Base64? - Stack Overflow
I followed an example from the documentation of how to use Base64 encoding in Python:

>>> import base64
>>> encoded = base64.b64encode(b'data to be encoded')
Top answer
1 of 5
363

base64 encoding takes 8-bit binary byte data and encodes it using only the characters A-Z, a-z, 0-9, +, /* so it can be transmitted over channels that do not preserve all 8 bits of data, such as email.

Hence, it wants a string of 8-bit bytes. You create those in Python 3 with the b'' syntax.

If you remove the b, it becomes a string. A string is a sequence of Unicode characters. base64 has no idea what to do with Unicode data, it's not 8-bit. It's not really any bits, in fact. :-)

In your second example:

>>> encoded = base64.b64encode('data to be encoded')

All the characters fit neatly into the ASCII character set, and base64 encoding is therefore actually a bit pointless. You can convert it to ascii instead, with

>>> encoded = 'data to be encoded'.encode('ascii')

Or simpler:

>>> encoded = b'data to be encoded'

Which would be the same thing in this case.
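
To illustrate the point that base64 has no idea what to do with a str, here is a minimal sketch of what happens when the b is left off. The exact wording of the error message may differ between Python versions, so it is printed rather than assumed.

```python
import base64

error = None
try:
    base64.b64encode("data to be encoded")  # a str, not bytes
except TypeError as exc:
    error = str(exc)
    print(error)

# Encoding the str to bytes first gives b64encode the input it expects
encoded = base64.b64encode("data to be encoded".encode("ascii"))
print(encoded)  # b'ZGF0YSB0byBiZSBlbmNvZGVk'
```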


* Most base64 flavours may also include a = at the end as padding. In addition, some base64 variants may use characters other than + and /. See the Variants summary table at Wikipedia for an overview.

2 of 5
252

Short Answer

You need to pass a bytes-like object (bytes, bytearray, etc.) to the base64.b64encode() function. Here are two ways:

>>> import base64
>>> data = base64.b64encode(b'data to be encoded')
>>> print(data)
b'ZGF0YSB0byBiZSBlbmNvZGVk'

Or with a variable:

>>> import base64
>>> string = 'data to be encoded'
>>> data = base64.b64encode(string.encode())
>>> print(data)
b'ZGF0YSB0byBiZSBlbmNvZGVk'

Why?

In Python 3, str objects are not C-style character arrays (so they are not byte arrays); they are sequences of Unicode characters with no inherent byte encoding. You can encode such a string in a variety of ways. The most common (and the default for str.encode() in Python 3) is UTF-8, in part because it is backwards compatible with ASCII (as are most widely used encodings). That is what happens when you call the .encode() method on a string: Python encodes the string as UTF-8 (the default encoding) and gives you the corresponding sequence of bytes.
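
A small sketch of that idea: the same str yields different bytes depending on the encoding chosen. The sample text and variable names are illustrative.

```python
text = "caf\u00e9"  # the word "café"

utf8_bytes = text.encode()             # UTF-8 is the default for str.encode()
latin1_bytes = text.encode("latin-1")  # a different encoding, different bytes

print(utf8_bytes)    # b'caf\xc3\xa9'
print(latin1_bytes)  # b'caf\xe9'

# Decoding has to use the same encoding that produced the bytes
print(utf8_bytes.decode("utf-8") == text)  # True
```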

Base-64 Encoding in Python 3

Originally the question title asked about Base-64 encoding. Read on for Base-64 stuff.

base64 encoding takes 6-bit binary chunks and encodes them using the characters A-Z, a-z, 0-9, '+', '/', and '=' (some encodings use different characters in place of '+' and '/'). This encoding is based on the mathematical concept of a radix-64 (base-64) number system, but the two are very different. Base-64 in math is a number system like binary or decimal, and you do this change of radix on the entire number, or (if the radix you're converting from is a power of 2 less than 64) in chunks from right to left.

In base64 encoding, the translation is done from left to right; those first 64 characters are why it is called base64 encoding. The 65th symbol, '=', is used for padding, since the encoding pulls 6-bit chunks but the data it is usually meant to encode are 8-bit bytes, so sometimes there are only two or four bits in the last chunk.
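
As a quick illustration of the padding rule, the sketch below encodes inputs of one, two, and three bytes. The sample inputs are arbitrary.

```python
import base64

results = {}
for raw in (b"a", b"ab", b"abc"):
    results[raw] = base64.b64encode(raw)
    print(len(raw), results[raw])
# 1 b'YQ=='
# 2 b'YWI='
# 3 b'YWJj'
```

Three input bytes are exactly 24 bits, which split evenly into four 6-bit chunks, so only inputs whose length is not a multiple of three need padding.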

Example:

>>> data = b'test'
>>> for byte in data:
...     print(format(byte, '08b'), end=" ")
...
01110100 01100101 01110011 01110100
>>>

If you interpret that binary data as a single integer, then this is how you would convert it to base-10 and base-64 (table for base-64):

base-2:  01 110100 011001 010111 001101 110100 (base-64 grouping shown)
base-10:                            1952805748
base-64:  B      0      Z      X      N      0

base64 encoding, however, will re-group this data thusly:

base-2:  011101  000110  010101 110011 011101 00(0000) <- pad w/zeros to make a clean 6-bit chunk
base-10:     29       6      21     51     29      0
base-64:      d       G       V      z      d      A

So, 'B0ZXN0' is the base-64 version of our binary, mathematically speaking. However, base64 encoding has to do the encoding in the opposite direction (so the raw data is converted to 'dGVzdA') and also has a rule to tell other applications how much space is left off at the end. This is done by padding the end with '=' symbols. So, the base64 encoding of this data is 'dGVzdA==', with two '=' symbols to signify two pairs of bits will need to be removed from the end when this data gets decoded to make it match the original data.
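
The left-to-right regrouping described above can be sketched by hand. The function b64encode_by_hand is a hypothetical helper written for this illustration; it is not how the base64 module is implemented, and it covers only the standard alphabet.

```python
import base64
import string

# Standard base64 alphabet: A-Z, a-z, 0-9, '+', '/'
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def b64encode_by_hand(data: bytes) -> str:
    """Hypothetical helper: regroup the bits into 6-bit chunks, left to right."""
    bits = "".join(format(byte, "08b") for byte in data)
    # Pad with zero bits so the last chunk is a full 6 bits
    bits += "0" * (-len(bits) % 6)
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Pad with '=' so the output length is a multiple of 4 characters
    chars.append("=" * (-len(chars) % 4))
    return "".join(chars)

print(b64encode_by_hand(b"test"))  # dGVzdA==
print(b64encode_by_hand(b"test") == base64.b64encode(b"test").decode("ascii"))  # True
```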

Let's test this to see if I am being dishonest:

>>> encoded = base64.b64encode(data)
>>> print(encoded)
b'dGVzdA=='

Why use base64 encoding?

Let's say I have to send some data to someone via email, like this data:

>>> data = b'\x04\x6d\x73\x67\x08\x08\x08\x20\x20\x20'
>>> print(data.decode())
   
>>> print(data)
b'\x04msg\x08\x08\x08   '
>>>

There are two problems I planted:

  1. If I tried to send that email in Unix, the email would send as soon as the \x04 character was read, because that is ASCII for END-OF-TRANSMISSION (Ctrl-D), so the remaining data would be left out of the transmission.
  2. Also, while Python is smart enough to escape all of my evil control characters when I print the data directly, when the bytes are decoded and printed as text, you can see that the 'msg' is not there. That is because I used three BACKSPACE characters and three SPACE characters to erase the 'msg'. Thus, even if I didn't have the END-OF-TRANSMISSION character there, the end user wouldn't be able to translate from the text on screen to the real, raw data.

This is just a demo to show you how hard it can be to simply send raw data. Encoding the data into base64 format gives you the exact same data but in a format that ensures it is safe for sending over electronic media such as email.
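
A minimal sketch of that round trip, reusing the data value from the example above: the encoded form contains only printable ASCII characters, and decoding it returns the original bytes unchanged.

```python
import base64

data = b'\x04\x6d\x73\x67\x08\x08\x08\x20\x20\x20'

# The encoded form is safe to place in an email body or other text channel
encoded = base64.b64encode(data)
print(encoded)

# Every byte of the encoded form is a printable ASCII character
printable = all(32 < byte < 127 for byte in encoded)
print(printable)  # True

# Decoding restores the original bytes, control characters included
decoded = base64.b64decode(encoded)
print(decoded == data)  # True
```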
