You can use UUID this way to always get the same UUID for a given input String:
String aString = "JUST_A_TEST_STRING";
String result = UUID.nameUUIDFromBytes(aString.getBytes(StandardCharsets.UTF_8)).toString();
Answer from uraimo on Stack Overflow
UUID.nameUUIDFromBytes() only generates version 3 (MD5) UUIDs. However, RFC 4122 prefers SHA-1 (version 5) over MD5 if backward compatibility is not an issue.
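As a quick check of that claim, here is a minimal sketch (the class and method names are made up for illustration) showing that UUID.nameUUIDFromBytes() is deterministic and reports version 3. It passes an explicit charset, because getBytes() without arguments depends on the platform default.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

final class NameUuidCheck {
    // Hypothetical helper: hashes the UTF-8 bytes of the name with the JDK's built-in MD5-based method.
    static UUID fromName(String name) {
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        UUID first = fromName("JUST_A_TEST_STRING");
        UUID second = fromName("JUST_A_TEST_STRING");
        System.out.println(first.equals(second)); // same input, same UUID
        System.out.println(first.version());      // 3, i.e. the MD5 name-based version
    }
}
```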
The utility class below generates MD5 UUIDs and SHA-1 UUIDs. Feel free to use and share.
package com.example;

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

/**
 * Generates UUIDv3 (MD5) and UUIDv5 (SHA-1) as described in RFC 4122.
 *
 * The overloads without a namespace hash only the name, like UUID.nameUUIDFromBytes().
 */
public class HashUuid {

    private static final int V3 = 3; // MD5
    private static final int V5 = 5; // SHA-1

    private static final String HASH_V3 = "MD5";
    private static final String HASH_V5 = "SHA-1";

    // Predefined namespaces from RFC 4122, Appendix C
    public static final UUID NAMESPACE_DNS = new UUID(0x6ba7b8109dad11d1L, 0x80b400c04fd430c8L);
    public static final UUID NAMESPACE_URL = new UUID(0x6ba7b8119dad11d1L, 0x80b400c04fd430c8L);
    public static final UUID NAMESPACE_OID = new UUID(0x6ba7b8129dad11d1L, 0x80b400c04fd430c8L);
    public static final UUID NAMESPACE_X500 = new UUID(0x6ba7b8149dad11d1L, 0x80b400c04fd430c8L);

    public static UUID v3(String name) {
        return generate(V3, HASH_V3, null, name);
    }

    public static UUID v5(String name) {
        return generate(V5, HASH_V5, null, name);
    }

    public static UUID v3(UUID namespace, String name) {
        return generate(V3, HASH_V3, namespace, name);
    }

    public static UUID v5(UUID namespace, String name) {
        return generate(V5, HASH_V5, namespace, name);
    }

    private static UUID generate(int version, String algorithm, UUID namespace, String name) {
        MessageDigest hasher = hasher(algorithm);

        // Hash the 16 namespace bytes first (big-endian), if a namespace was given
        if (namespace != null) {
            ByteBuffer ns = ByteBuffer.allocate(16);
            ns.putLong(namespace.getMostSignificantBits());
            ns.putLong(namespace.getLeastSignificantBits());
            hasher.update(ns.array());
        }
        hasher.update(name.getBytes(StandardCharsets.UTF_8));

        // Only the first 16 bytes of the digest are used; SHA-1 output is truncated
        ByteBuffer hash = ByteBuffer.wrap(hasher.digest());
        final long msb = (hash.getLong() & 0xffffffffffff0fffL) | ((version & 0x0fL) << 12); // set version
        final long lsb = (hash.getLong() & 0x3fffffffffffffffL) | 0x8000000000000000L; // set variant
        return new UUID(msb, lsb);
    }

    private static MessageDigest hasher(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            // Keep the original exception as the cause
            throw new RuntimeException(String.format("%s not supported.", algorithm), e);
        }
    }

    /**
     * For tests!
     */
    public static void main(String[] args) {
        UUID namespace = UUID.randomUUID();
        String name = "JUST_A_TEST_STRING";
        System.out.println(String.format("UUID.nameUUIDFromBytes(): '%s'", UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8))));
        System.out.println();
        System.out.println(String.format("HashUuid.v3(name): '%s'", HashUuid.v3(name)));
        System.out.println(String.format("HashUuid.v5(name): '%s'", HashUuid.v5(name)));
        System.out.println(String.format("HashUuid.v3(namespace, name): '%s'", HashUuid.v3(namespace, name)));
        System.out.println(String.format("HashUuid.v5(namespace, name): '%s'", HashUuid.v5(namespace, name)));
    }
}
This is the output (the last two values differ on every run because the namespace is random):
UUID.nameUUIDFromBytes(): '9e120341-627f-32be-8393-58b5d655b751'
HashUuid.v3(name): '9e120341-627f-32be-8393-58b5d655b751'
HashUuid.v5(name): 'e4586bed-032a-5ae6-9883-331cd94c4ffa'
HashUuid.v3(namespace, name): 'f0043437-723b-308f-a6c0-74ec36ddf9c2'
HashUuid.v5(namespace, name): '18a45fd8-8fab-5647-aad7-1d3264932180'
Alternatively, you can also use uuid-creator. See this example:
// Create a UUIDv5 (SHA1)
String name = "JUST_A_TEST_STRING";
UUID uuid = UuidCreator.getNameBasedSha1(name);
I've been learning about UUIDs and have a few questions about version 4 implementation.
I understand that the likelihood of generating the same UUID is incredibly small, but is that likelihood jeopardized at all by the random number generator and seed used? Maybe the likelihood of using the same seed is incredibly small too, so how does one go about choosing a good seed? I'm thinking about this from the perspective of someone implementing their own UUID generator. I assume that the libraries available have good implementations.
I guess the question I'm really asking is this, when implementing a version 4 UUID generator how does one avoid seed related issues creating duplicate UUIDs?
Thanks!
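One way to sidestep the seeding question is to not pick a seed at all and let the platform's cryptographic RNG seed itself. Below is a minimal sketch, assuming Java's SecureRandom (which seeds itself when constructed without arguments). The class and method names are made up for illustration, and this is only one possible way to lay out the version and variant bits by hand.

```java
import java.security.SecureRandom;
import java.util.UUID;

final class RandomV4Sketch {
    // No explicit seed: SecureRandom obtains its own entropy from the platform.
    private static final SecureRandom RNG = new SecureRandom();

    static UUID randomV4() {
        long msb = RNG.nextLong();
        long lsb = RNG.nextLong();
        msb = (msb & 0xffffffffffff0fffL) | 0x0000000000004000L; // version nibble = 4
        lsb = (lsb & 0x3fffffffffffffffL) | 0x8000000000000000L; // variant bits = 10
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        UUID u = randomV4();
        System.out.println(u + " version=" + u.version() + " variant=" + u.variant());
    }
}
```

In practice UUID.randomUUID() already does the equivalent, so hand-rolling this is mostly useful for understanding where the 122 random bits go.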
there is a finite number of ids which can fit into some structure
Correct. If you have n bits in your structure, after you have generated 2^n IDs, the next one must be a collision; this is the pigeonhole principle.
However, assuming you are dealing with an actual physical system rather than a pure thought experiment, there are "only" about 10^80 ≈ 2^266 electrons in the universe, so if your data structure is bigger than 266 bits or so then, given a perfect algorithm, you'll never be able to generate all the IDs.
You want version 1 UUIDs. They don't collide but they leak the generating computer's MAC address and require a monotonic clock.
From RFC 4122 we have this format:
4 bytes time-low
2 bytes time-mid
2 bytes time-high & version (high nybble is a 1)
1 byte clock-seq-high & reserved (high two bits are 1 and 0)
1 byte clock-seq-low
6 bytes MAC address
Clock-sequence is actually the subsecond portion of the clock now.
The thing about the two-byte fields is that they are not pairs of one-byte fields. If you mess the code up, you get different results on little-endian and big-endian machines, and you have a chance of collision again. The string form looks like this no matter what your endianness is:
????????-????-1???-[89AB]???-?[02468ACE]??????????
The thing about this is when it works, it works. If you don't have MAC address collisions (the general rule is if you get colliding MAC addresses, RMA both cards) and if you actually have a monotonic clock, this works 100% of the time.
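To see these fields in an actual value, here is a small sketch that pulls them out of a version 1 UUID with the accessors on java.util.UUID. The sample value is the RFC 4122 DNS namespace UUID, which happens to be a version 1 UUID; the class name and the hasMulticastNode helper are made up for illustration.

```java
import java.util.UUID;

final class V1FieldsSketch {
    // True if the least significant bit of the first node octet is set,
    // i.e. the string form would match [13579BDF] in that position.
    static boolean hasMulticastNode(UUID u) {
        return ((u.node() >>> 40) & 1L) == 1L;
    }

    public static void main(String[] args) {
        UUID u = UUID.fromString("6ba7b810-9dad-11d1-80b4-00c04fd430c8");
        System.out.println(u.version());                      // 1
        System.out.println(Long.toHexString(u.timestamp()));  // 60-bit timestamp: time-high, time-mid, time-low
        System.out.println(u.clockSequence());                // 14-bit clock sequence
        System.out.printf("%012x%n", u.node());               // 48-bit node (MAC address)
        System.out.println(hasMulticastNode(u));              // false for a real unicast MAC
    }
}
```

Note that timestamp(), clockSequence() and node() throw UnsupportedOperationException for UUIDs that are not version 1.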
There's a workaround for not having a monotonic clock: do it the old way. Serialize all UUID generation and use clock-seq as a counter that increments with each generation.
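A rough sketch of that counter approach follows, under some loud assumptions: the class name and constructor are invented, the timestamp comes from System.currentTimeMillis() (so it only has millisecond resolution), and the 14-bit counter wraps after 16384 generations, so uniqueness within a single timestamp value is only guaranteed up to that many UUIDs. Treat it as an illustration of the bit layout, not a production generator.

```java
import java.util.UUID;

final class SerializedV1Sketch {
    // 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
    private static final long EPOCH_OFFSET_100NS = 0x01B21DD213814000L;

    private final long node; // 48-bit node ID
    private int clockSeq;    // 14-bit counter, incremented on every generation

    SerializedV1Sketch(long node, int initialClockSeq) {
        this.node = node & 0xffffffffffffL;
        this.clockSeq = initialClockSeq & 0x3fff;
    }

    // synchronized: all generations are serialized, as the text suggests
    synchronized UUID next() {
        long ts = System.currentTimeMillis() * 10_000L + EPOCH_OFFSET_100NS;
        clockSeq = (clockSeq + 1) & 0x3fff;

        long msb = ((ts & 0xffffffffL) << 32)        // time-low
                 | (((ts >>> 32) & 0xffffL) << 16)   // time-mid
                 | 0x1000L                           // version 1
                 | ((ts >>> 48) & 0x0fffL);          // time-high
        long lsb = 0x8000000000000000L               // variant bits 10
                 | ((long) clockSeq << 48)           // clock-seq
                 | node;
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        SerializedV1Sketch gen = new SerializedV1Sketch(0x01c04fd430c8L, 0);
        System.out.println(gen.next());
        System.out.println(gen.next());
    }
}
```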
The thing about this is you will have bad generation and most likely collisions if your PC battery dies on any node that generates.
It's been pointed out in comments that VMs don't really get globally unique MAC addresses, and it has also been pointed out that the solution to this is to hand out the node IDs yourself. When you do this you are supposed to set the multicast bit (the one marked [02468ACE], so that it would match [13579BDF] instead).
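Setting that bit on a self-issued node ID is a one-liner. The sketch below is only an illustration with invented names; it assumes node IDs are handled as 48-bit values in a long, with the first octet in bits 40 to 47.

```java
import java.security.SecureRandom;

final class NodeIdSketch {
    // Least significant bit of the first octet of the 48-bit node field.
    private static final long MULTICAST_BIT = 0x010000000000L;
    private static final long NODE_MASK = 0xffffffffffffL;

    // Marks a node ID you assigned yourself so it cannot clash with a real (unicast) MAC address.
    static long assignedNodeId(long assigned) {
        return (assigned & NODE_MASK) | MULTICAST_BIT;
    }

    // Alternative when nobody hands out IDs: 47 random bits plus the multicast bit.
    static long randomNodeId(SecureRandom rng) {
        return assignedNodeId(rng.nextLong());
    }

    public static void main(String[] args) {
        System.out.printf("%012x%n", assignedNodeId(0x00c04fd430c8L)); // 01c04fd430c8
        System.out.printf("%012x%n", randomNodeId(new SecureRandom()));
    }
}
```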
If you can't tolerate leaking the MAC addresses and can't issue node IDs, you're stuck with lesser means. V4 UUIDs are really good for all terrestrial use under modern hardware assumptions (has thermal diode or other hardware RNG). V5 UUIDs can do the job under specific circumstances (they're a hash of something else; the main problem is they're subject to adversarial attack).
The currently accepted solution only works in a Node.js environment, per the GitHub page of uuid-1345:
Un-Features:
- Does not work in the browser due to the use of NodeJS's crypto module.
If you are looking for a solution that will work in the browser, you should use a more popular uuid library.
// uuid v7 and later; older releases used require('uuid/v5')
const { v5: uuidv5 } = require('uuid');
// ... using a custom namespace
//
// Note: Custom namespaces should be a UUID string specific to your application!
// E.g. the one here was generated using this module's `uuid` CLI.
const MY_NAMESPACE = '1b671a64-40d5-491e-99b0-da01ff1f3341';
uuidv5('Hello, World!', MY_NAMESPACE); // ⇨ '630eb68f-e0fa-5ecc-887a-7c7a62614681'
The UUID will stay consistent as long as you pass it the same name and the same namespace.
Hope that helps.
Easiest way:
const getUuid = require('uuid-by-string')
const uuidHash = getUuid('Hello world!')
// d3486ae9-136e-5856-bc42-212385ea7970
https://www.npmjs.com/package/uuid-by-string
In terms of the raw amount of random bits, yes.
Looking at the source for Java's random UUID generation, you can see they actually utilize the SecureRandom class to generate the raw bytes. So, the 'quality' of your randomness isn't going to change.
The fact that you have 6 more bits does, technically, make it more difficult to brute force a duplicate. Wikipedia's article on UUID has a good description of the math behind it. To quote it:
[A]fter generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%
If these links are going to expire in, say, 24 hours, this is a very small thing to worry about, and that's only with 122 bits.
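The arithmetic behind that quote can be approximated with the usual birthday bound, p ≈ 1 - exp(-n^2 / (2 * 2^bits)). The sketch below assumes perfectly uniform random bits; the class and method names are illustrative only.

```java
final class BirthdaySketch {
    // Approximate probability of at least one collision after n random draws from 2^bits values.
    static double collisionProbability(double n, int bits) {
        double space = Math.pow(2.0, bits);
        return -Math.expm1(-(n * n) / (2.0 * space));
    }

    // Approximate number of draws needed to reach collision probability p.
    static double drawsForProbability(double p, int bits) {
        double space = Math.pow(2.0, bits);
        return Math.sqrt(2.0 * space * -Math.log1p(-p));
    }

    public static void main(String[] args) {
        double perDay = 1e9 * 24 * 3600; // one billion IDs per second for 24 hours
        System.out.println(collisionProbability(perDay, 122));

        double secondsPerYear = 365.25 * 24 * 3600;
        double years = drawsForProbability(0.5, 122) / 1e9 / secondsPerYear;
        System.out.println(years); // years of 1e9 IDs per second to reach a 50% chance
    }
}
```

Under these assumptions the 50% point comes out at roughly 86 years at one billion UUIDs per second, the same order as the figure quoted above, while a single day at that rate gives a vanishingly small probability.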
Now, is more bits going to make it harder for an attacker to brute force a collision? Sure! Is it worth it? Maybe... but maybe not. Only you can really decide what's right for your use case.
When you get to the point where brute forcing a collision would, on average, take longer than the time for the estimated heat death of the universe, it might be a little overboard. That's getting to the territory of needing to worry more about hackers compromising the security of your storage more than brute forcing (and, honestly, even with UUIDs, you're going to have that problem). Ultimately, the difficulty of brute forcing doesn't matter if they can find another way in. If you're really worried, storage is cheap - store 256 bit strings that you generate with a SecureRandom and call it good, but make sure you have proper security around everything else, or it's useless.
Well, in principle, UUIDs and cryptographic RNGs promise two different things:
- UUIDs promise low probability of duplicates;
- Cryptographic RNGs promise unpredictability to a malicious adversary.
If the second is a concern, I'd make sure to use a cryptographic RNG, for example if you're concerned that a malicious attacker who can request a lot of UUIDs in a short timespan might learn enough information to predict those given out to other users' requests.
As it happens, Java's UUID.randomUUID() method does use a cryptographic RNG, so it should be safe in that regard. But the risk then is that developers who later look at your code may not understand that your intent includes secure cryptographic random choice, and not just uniqueness. For example, somebody might rewrite that particular module using some other language or UUID library that does not use a cryptographic RNG, and expose you to prediction-based attacks that you meant to exclude. So if cryptographic security really was an important factor I'd try to write the code in a way that makes that obvious, by using SecureRandom explicitly. (Security-critical code needs to be very clear!)
If all I cared for was uniqueness instead of security then I'd go ahead and use UUID.randomUUID().
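Making the security intent explicit could look something like the following sketch. The class and method names are made up; the 256-bit length and the URL-safe Base64 encoding are just one reasonable choice for an ID segment in a URL.

```java
import java.security.SecureRandom;
import java.util.Base64;

final class SecureTokenSketch {
    // Explicit SecureRandom signals that unpredictability, not just uniqueness, is required.
    private static final SecureRandom RNG = new SecureRandom();

    static String newToken() {
        byte[] bytes = new byte[32]; // 256 random bits
        RNG.nextBytes(bytes);
        // URL-safe alphabet (A-Z, a-z, 0-9, '-', '_') without '=' padding
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    public static void main(String[] args) {
        System.out.println(newToken());
    }
}
```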
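You can use UUID.nameUUIDFromBytes(byte[] bytes), where you get byte[] bytes from a Random or SecureRandom that you seeded. Note that a seeded java.util.Random is reproducible, whereas SecureRandom.setSeed() supplements rather than replaces the existing seed, so a seeded SecureRandom is not guaranteed to repeat its output.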
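A minimal sketch of that idea, using java.util.Random because its output for a given seed is specified and repeatable. The class name and the 16-byte block size are assumptions for illustration.

```java
import java.util.Random;
import java.util.UUID;

final class SeededUuidSketch {
    private final Random random;

    SeededUuidSketch(long seed) {
        this.random = new Random(seed); // same seed, same byte stream
    }

    UUID next() {
        byte[] bytes = new byte[16];
        random.nextBytes(bytes);
        return UUID.nameUUIDFromBytes(bytes); // version 3 UUID derived from the bytes
    }

    public static void main(String[] args) {
        SeededUuidSketch a = new SeededUuidSketch(42L);
        SeededUuidSketch b = new SeededUuidSketch(42L);
        System.out.println(a.next().equals(b.next())); // true
    }
}
```

These UUIDs are predictable by design, so this is suited to tests and reproducible fixtures rather than anything security-sensitive.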
I would create my own class which wraps the UUID class and that can accept some kind of flag to determine if it's in debug mode in which case it would return a constant value or "production" mode in which case it would work as expected.
An even cleaner solution would be to define an interface like IRandomUUIDGenerator and have two implementations for it: ConstantUUIDGenerator which you can use for your testing and DefaultRandomUUIDGenerator implementation for your production code. You can then specify in a config file which implementation to use depending on your environment.
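The interface approach might be sketched as follows. The three type names come from the description above; the next() method, the UuidGenerators factory and the "test" mode string are assumptions added for illustration, standing in for whatever your config mechanism provides.

```java
import java.util.UUID;

interface IRandomUUIDGenerator {
    UUID next();
}

// Production implementation: delegates to the JDK.
final class DefaultRandomUUIDGenerator implements IRandomUUIDGenerator {
    @Override
    public UUID next() {
        return UUID.randomUUID();
    }
}

// Test implementation: always returns the same, known value.
final class ConstantUUIDGenerator implements IRandomUUIDGenerator {
    private final UUID value;

    ConstantUUIDGenerator(UUID value) {
        this.value = value;
    }

    @Override
    public UUID next() {
        return value;
    }
}

// Hypothetical factory: picks an implementation from a mode string read from configuration.
final class UuidGenerators {
    static final UUID TEST_UUID = new UUID(0L, 0L);

    static IRandomUUIDGenerator forMode(String mode) {
        return "test".equals(mode)
                ? new ConstantUUIDGenerator(TEST_UUID)
                : new DefaultRandomUUIDGenerator();
    }
}
```

Code under test then depends only on IRandomUUIDGenerator, so a test can assert against UuidGenerators.TEST_UUID without touching the production path.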