Copy>>> import os
>>> "\x00"+os.urandom(4)+"\x00"
'\x00!\xc0zK\x00'
Answer from John La Rooy on Stack OverflowCopy>>> import os
>>> "\x00"+os.urandom(4)+"\x00"
'\x00!\xc0zK\x00'
An alternative way to obtaining a secure random sequence of bytes could be to use the standard library secrets module, available since Python 3.6.
Example, based on the given question:
Copyimport secrets
b"\x00" + secrets.token_bytes(4) + b"\x00"
More information can be found at: https://docs.python.org/3/library/secrets.html
Videos
The os module provides urandom, even on Windows:
bytearray(os.urandom(1000000))
This seems to perform as quickly as you need, in fact, I get better timings than your numpy (though our machines could be wildly different):
timeit.timeit(lambda:bytearray(os.urandom(1000000)), number=10)
0.0554857286941
There are several possibilities, some faster than os.urandom. Also consider whether the data has to be generated deterministically from a random seed. This is invaluable for unit tests where failures have to be reproducible.
short and pithy:
lambda n:bytearray(map(random.getrandbits,(8,)*n))
I've use the above for unit tests and it was fast enough but can it be done faster?
using itertools:
lambda n:bytearray(itertools.imap(random.getrandbits,itertools.repeat(8,n))))
itertools and struct producing 8 bytes per iteration
lambda n:(b''.join(map(struct.Struct("!Q").pack,itertools.imap(
random.getrandbits,itertools.repeat(64,(n+7)//8)))))[:n]
Anything based on b''.join will fill 3-7x the memory consumed by the final bytearray with temporary objects since it queues up all the sub-strings before joining them together and python objects have lots of storage overhead.
Producing large chunks with a specialized function gives better performance and avoids filling memory.
import random,itertools,struct,operator
def randbytes(n,_struct8k=struct.Struct("!1000Q").pack_into):
if n<8000:
longs=(n+7)//8
return struct.pack("!%iQ"%longs,*map(
random.getrandbits,itertools.repeat(64,longs)))[:n]
data=bytearray(n);
for offset in xrange(0,n-7999,8000):
_struct8k(data,offset,
*map(random.getrandbits,itertools.repeat(64,1000)))
offset+=8000
data[offset:]=randbytes(n-offset)
return data
Performance
- .84 MB/s :original solution with
randint: - 4.8 MB/s :
bytearray(getrandbits(8) for _ in xrange(n)): (solution by other poster) - 6.4MB/s :
bytearray(map(getrandbits,(8,)*n)) - 7.2 MB/s :
itertoolsandgetrandbits - 10 MB/s :
os.urandom - 23 MB/s :
itertoolsandstruct - 35 MB/s :optimised function (holds for len = 100MB ... 1KB)
Note:all tests used 10KB as the string size. Results were consistent up till intermediate results filled memory.
Note:os.urandom is meant to provide secure random seeds. Applications expand that seed with their own fast PRNG. Here's an example, using AES in counter mode as a PRNG:
import os
seed=os.urandom(32)
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend
backend = default_backend()
cipher = Cipher(algorithms.AES(seed), modes.CTR(b'\0'*16), backend=backend)
encryptor = cipher.encryptor()
nulls=b'\0'*(10**5) #100k
from timeit import timeit
t=timeit(lambda:encryptor.update(nulls),number=10**5) #1GB, (100K*10k)
print("%.1f MB/s"%(1000/t))
This produces pseudorandom data at 180 MB/s. (no hardware AES acceleration, single core) That's only ~5x the speed of the pure python code above.
Addendum
There's a pure python crypto library waiting to be written. Putting the above techniques together with hashlib and stream cipher techniques looks promising. Here's a teaser, a fast string xor (42MB/s).
def xor(a,b):
s="!%iQ%iB"%divmod(len(a),8)
return struct.pack(s,*itertools.imap(operator.xor,
struct.unpack(s,a),
struct.unpack(s,b)))