You can take an auto-increment number and derive a corresponding pseudorandom number from it with some algorithm. Here's an example:
class IdGenerater
{
    private static $RANDOMCHARS = # random
    [
        '2083417956', '4823019567', '8402135679', '4802316759',
        '2483051679', '8421350679', '1503248697', '1053872469',
        '0157824639', '1502784639', '5170248639', '0751248693',
    ];
    private $digits = [];
    public function generate(int $id) : int
    {
        $p = 0;
        while ($id >= 10)
        {
            $rem = $id % 10;
            $this->digits[$p++] = $rem;
            $id = (int)(($id - $rem) / 10);
        }
        $this->digits[$p++] = $id;
        for(; $p < 12; $p++)
            $this->digits[$p] = 0;
        $p = 0; $q = 0;
        for ($i = 0; $i < 12; $i++)
        {
            $p += $this->digits[$i];
            $q = $q * 10 + (int)self::$RANDOMCHARS[$i][$p % 10];
        }
        return $q;
    }
}
The auto-increment number can be generated by another service. The result has a one-to-one correspondence with the input, but it's not easy to revert.
$gen = new IdGenerater();
echo $gen->generate(0), PHP_EOL; # 248428110150
echo $gen->generate(666666), PHP_EOL; # 727320824488
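To illustrate the one-to-one correspondence, here is a minimal Python 3 sketch of the same table lookup together with an inverse. The names `TABLE`, `generate` and `recover` are illustrative and not part of the PHP class; the inverse assumes the caller holds the same table, and it relies on every row being a permutation of the digits 0-9.

```python
# Same rows as $RANDOMCHARS in the PHP class above.
TABLE = [
    '2083417956', '4823019567', '8402135679', '4802316759',
    '2483051679', '8421350679', '1503248697', '1053872469',
    '0157824639', '1502784639', '5170248639', '0751248693',
]

def generate(n: int) -> int:
    """Map an id to a 12-digit code (port of IdGenerater::generate)."""
    p = 0
    q = 0
    for i in range(12):
        p += (n // 10**i) % 10          # running sum of digits, least significant first
        q = q * 10 + int(TABLE[i][p % 10])
    return q

def recover(code: int) -> int:
    """Invert generate(); hypothetical helper, needs the same TABLE."""
    n = 0
    prev = 0
    for i, ch in enumerate(f"{code:012d}"):  # zero-pad: the first digit can be 0
        cur = TABLE[i].index(ch)             # running digit sum modulo 10
        n += ((cur - prev) % 10) * 10**i     # difference of sums gives the digit
        prev = cur
    return n

print(generate(0))                # 248428110150
print(generate(666666))           # 727320824488
print(recover(generate(666666)))  # 666666
```

Without the table the mapping is hard to guess, but anyone who has it can reverse a code this way.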
So far there has not been an answer that creates a 12 digit identifier that is random, unique, "light", and unrelated to the user id.
However, both the options in the question and the option shared by shingo are perfectly viable, and after running some tests my conclusion was that I was overthinking things.
At the end of the day, none of these methods has a drawback that makes it non-viable; choose a method depending on your requirements.
loop until I get a unique number
| finite loops | few loops | unrelated to id |
|---|---|---|
| no | yes | yes |
Let's say for this example we have 100000 users.
A new user registers and I have to create a new identifier for said user.
Only 100000 of the 1000000000 available identifiers have been used, which gives a 1/10000 chance of the while loop repeating even once. (With a full 12 digit space of 10^12 identifiers the chance drops to 1 in 10000000.)
Furthermore, the code would have to loop thousands of times before becoming an issue.
This can only become an issue if a large percentage of the identifiers are already in use.
You should not use this method to give a 6 digit identifier to a table with 500000 users.
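A rough sketch of this retry loop, assuming Python 3 and an in-memory set standing in for the users table (the `new_identifier` name and the `used` set are illustrative, not from the original code):

```python
import random

def new_identifier(used: set, digits: int = 12) -> str:
    """Draw random zero-padded identifiers until one is not in `used`."""
    while True:
        candidate = f"{random.randrange(10**digits):0{digits}d}"
        if candidate not in used:       # in a real system this would be a DB lookup
            used.add(candidate)
            return candidate

used = set()
print(new_identifier(used))  # a 12-character string of digits
```

In a real table a unique index on the column would still be needed, since two concurrent registrations could pass the lookup at the same time.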
adjust the random number accordingly
| finite loops | few loops | unrelated to id |
|---|---|---|
| yes | no | yes |
The issue I had with this code was that it "loops too much".
If my table had 100000 users it could easily end up looping 100000 times.
However, after testing some code in a sandbox I came to the conclusion that a for loop this small can easily run 100000 iterations in a matter of milliseconds, which removes any worry about the code becoming a performance killer.
While the idea of looping so much still bugs me, I would judge this method to be the "safest".
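The code for this option is in the question and not repeated here, so the following is only one possible reading of it, sketched in Python 3: draw a random rank among the unused identifiers, then walk the used identifiers in ascending order and bump the number past each one it meets. The name `pick_unused` and its parameters are assumptions made for illustration.

```python
import random

def pick_unused(sorted_used, space):
    """Return a uniformly random value in range(space) not in sorted_used.

    sorted_used must be in ascending order and contain no duplicates.
    """
    r = random.randrange(space - len(sorted_used))  # rank among the free values
    for u in sorted_used:       # at most one iteration per existing identifier
        if u <= r:
            r += 1              # skip over a taken identifier
        else:
            break
    return r

print(pick_unused([1, 3, 5, 7, 9], 10))  # one of 0, 2, 4, 6, 8
```

The loop always finishes after at most one pass over the used identifiers, which is where the "finite loops: yes, few loops: no" rating comes from.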
create a pseudorandom number
| finite loops | few loops | unrelated to id |
|---|---|---|
| yes | yes? | no |
This is the solution offered by shingo. The issue with this method is the one-to-one correspondence with the user id, which I wanted to avoid.
However if you don't have any issues with that, this is probably the best method.
I think this method also removes the need to store the 12 digit number in your database, and it allows you to get the user id directly from the 12 digit code.
What's wrong with a straightforward approach?
>>> import random
>>> random.randint(100000000000,999999999999)
544234865004L
And if you want it with leading zeros, you need a string.
>>> "%0.12d" % random.randint(0,999999999999)
'023432326286'
Edit:
My own solution to this problem would be something like this:
import random
def rand_x_digit_num(x, leading_zeroes=True):
    """Return an X digit number, leading_zeroes returns a string, otherwise int"""
    if not leading_zeroes:
        # wrap with str() for uniform results
        return random.randint(10**(x-1), 10**x-1)
    else:
        if x > 6000:
            return ''.join([str(random.randint(0, 9)) for i in xrange(x)])
        else:
            return '{0:0{x}d}'.format(random.randint(0, 10**x-1), x=x)
Testing Results:
>>> rand_x_digit_num(5)
'97225'
>>> rand_x_digit_num(5, False)
15470
>>> rand_x_digit_num(10)
'8273890244'
>>> rand_x_digit_num(10)
'0019234207'
>>> rand_x_digit_num(10, False)
9140630927L
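The function and output above are Python 2 (`xrange`, the `L` suffix on long integers). A possible Python 3 version is sketched below. One assumption to flag: Python 3.11 and later refuse by default to convert integers of more than 4300 digits to decimal strings, so this sketch switches to per-digit generation at 4000 digits, a cutoff chosen here for illustration and not taken from the timings.

```python
import random

def rand_x_digit_num(x, leading_zeroes=True):
    """Return an x-digit number: a zero-padded str, or an int if leading_zeroes is False."""
    if not leading_zeroes:
        return random.randint(10**(x - 1), 10**x - 1)
    if x > 4000:
        # Build the string digit by digit; avoids converting a huge int to text.
        return ''.join(random.choices('0123456789', k=x))
    return f"{random.randint(0, 10**x - 1):0{x}d}"

print(rand_x_digit_num(12))         # 12-character string, may start with 0
print(rand_x_digit_num(12, False))  # int between 10**11 and 10**12 - 1
```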
Timing methods for speed:
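import random
from datetime import datetime

def timer(x):
    # a: build the string one random digit at a time
    s1 = datetime.now()
    a = ''.join([str(random.randint(0, 9)) for i in xrange(x)])
    e1 = datetime.now()
    # b: draw one big random integer and zero-pad it
    s2 = datetime.now()
    b = str("%0." + str(x) + "d") % random.randint(0, 10**x-1)
    e2 = datetime.now()
    print "a took %s, b took %s" % (e1-s1, e2-s2)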
Speed test results:
>>> timer(1000)
a took 0:00:00.002000, b took 0:00:00
>>> timer(10000)
a took 0:00:00.021000, b took 0:00:00.064000
>>> timer(100000)
a took 0:00:00.409000, b took 0:00:04.643000
>>> timer(6000)
a took 0:00:00.013000, b took 0:00:00.012000
>>> timer(2000)
a took 0:00:00.004000, b took 0:00:00.001000
What it tells us:
For any number under around 6000 digits in length my method is faster, sometimes MUCH faster, but for larger numbers the method suggested by arshajii looks better.
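Single `datetime.now()` readings are coarse (note the `0:00:00` result above), so anyone wanting to repeat the comparison might prefer `timeit`. The following is a Python 3 sketch; the helper names `by_digits` and `by_format` are made up here, and the sizes stay below 4300 digits because newer Python 3 releases limit int-to-string conversion by default. No timings are claimed, as results will vary by machine and interpreter.

```python
import random
import timeit

def by_digits(x):
    """Method a: join x independently drawn digits."""
    return ''.join(str(random.randint(0, 9)) for _ in range(x))

def by_format(x):
    """Method b: draw one integer below 10**x and zero-pad it to x characters."""
    return "%0*d" % (x, random.randint(0, 10**x - 1))

for x in (12, 1000, 4000):
    a = timeit.timeit(lambda: by_digits(x), number=20)
    b = timeit.timeit(lambda: by_format(x), number=20)
    print(f"x={x}: a took {a:.4f}s, b took {b:.4f}s (20 runs each)")
```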
Do random.randrange(10**11, 10**12). It works like randint meets range.
From the documentation:
randrange(self, start, stop=None, step=1, int=<type 'int'>, default=None, maxwidth=9007199254740992L) method of random.Random instance
Choose a random item from range(start, stop[, step]).
This fixes the problem with randint() which includes the
endpoint; in Python this is usually not what you want.
Do not supply the 'int', 'default', and 'maxwidth' arguments.
This is effectively like doing random.choice(range(10**11, 10**12)) or random.randint(10**11, 10**12-1). Since it conforms to the same syntax as range(), it's a lot more intuitive and cleaner than these two alternatives.
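As a quick check of the bounds, here is a small Python 3 sketch; the function names are illustrative only. Both helpers should produce integers with exactly 12 digits, the first with an exclusive upper bound and the second with an inclusive one.

```python
import random

LOW, HIGH = 10**11, 10**12

def twelve_digits_randrange():
    """Upper bound excluded, like range()."""
    return random.randrange(LOW, HIGH)

def twelve_digits_randint():
    """Upper bound included, so subtract one."""
    return random.randint(LOW, HIGH - 1)

samples = [twelve_digits_randrange() for _ in range(1000)]
samples += [twelve_digits_randint() for _ in range(1000)]
print(all(len(str(s)) == 12 for s in samples))  # True
```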
If leading zeros are allowed:
"%012d" %random.randrange(10**12)