We currently believe that 128 bits of security should be sufficient for current and future needs. Much of this is due to physical constraints: using the theoretical minimum energy to store 2^128 bits of data would require more energy than is required to boil the world's oceans.
If you're using a password manager, that means a 20-character truly random password drawn from the 94 non-whitespace printable ASCII characters, or 22 characters if you're using the Base64 alphabet. If you're not using a password manager, you probably should be.
If you want to protect against even the most unlikely scenarios, then you can use a password with 256 bits of entropy. I do this on sites where I can, and since I use a password manager, there's no need for me to remember it.
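As a rough check of those character counts, here is a minimal Python sketch. The function names are my own, and it assumes every character is drawn independently and uniformly from the alphabet, which is what "truly random" means here.

```python
import math
import secrets
import string

# The 94 non-whitespace printable ASCII characters: letters, digits, punctuation.
PRINTABLE_94 = string.ascii_letters + string.digits + string.punctuation


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of a uniformly random string: length * log2(alphabet_size)."""
    return length * math.log2(alphabet_size)


def length_for_bits(alphabet_size: int, target_bits: float) -> int:
    """Smallest length whose entropy reaches target_bits."""
    return math.ceil(target_bits / math.log2(alphabet_size))


def random_password(length: int, alphabet: str = PRINTABLE_94) -> str:
    """Draw each character independently from a CSPRNG."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(length_for_bits(94, 128))  # 20 characters of printable ASCII
print(length_for_bits(64, 128))  # 22 characters of Base64
print(length_for_bits(94, 256))  # length needed for the 256-bit case
print(random_password(length_for_bits(94, 128)))
```

The `secrets` module is used instead of `random` because the latter is not meant for security purposes.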
Note, however, that in many cases, there are likely easier attacks than brute force. For example, it may be easier to compromise the site or service in question than conduct a brute force attack on the password. That being said, it's easy to use strong passwords in most cases, so there's little reason not to.
Answer from bk2204 on Stack Exchange
One trillion passwords per second sounds like a good rule of thumb and jibes with the number I found when researching my table of pw cracking times in 2015 (I found an estimated 23 billion per second per node, which is plausible since the NSA could easily have a few dozen nodes).
When calculating this sort of thing, I prefer to assume Moore's Law applies directly to crack efficiency (it doubles every 18mo). There have been 2344 days since 2015-07-01 (the mid-point of 2015), which accounts for 4.28 periods of 18mo since mid-2015, so 1T pw/s in 2015 becomes 1T × 2^4.28 = 19.4T pw/s. Since pessimism fits here, let's round that to 20T pw/s (in scientific notation, 2e13).
When calculating crack time for entropy, assume an attacker will get the password in half of the number of tries needed, so an entropy of 80, which refers to 2^80 passwords, can be cracked in 2^80 pw / 2e13 pw/s / 2 / 86400 s/d / 365.25 d/y = 957.7 years with November 2021 upgrades to the same cluster size.
I've lost the math I used to determine how to account for upgrades every 18mo, but it's easier just to fast-forward ten years: 2e13 × 2^(10/1.5) = 2e15, an increase of two orders of magnitude. 2^80 / 2e15 / 2 / 86400 / 365.25 = 9.58 years, with an expected crack around 2041.
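The arithmetic above can be reproduced with a short Python sketch. The function names are mine, and the 18-month doubling period and the 1T pw/s starting rate are the assumptions stated in this answer, not measured values.

```python
SECONDS_PER_YEAR = 86400 * 365.25


def projected_rate(base_rate: float, years_elapsed: float,
                   doubling_years: float = 1.5) -> float:
    """Guess rate after years_elapsed, assuming it doubles every doubling_years."""
    return base_rate * 2 ** (years_elapsed / doubling_years)


def expected_crack_years(entropy_bits: float, guesses_per_second: float) -> float:
    """Expected time to hit the password: half the keyspace at the given rate."""
    return 2 ** entropy_bits / guesses_per_second / 2 / SECONDS_PER_YEAR


# 1T pw/s in mid-2015, projected 2344 days forward (roughly 19.4T pw/s).
rate_2021 = projected_rate(1e12, 2344 / 365.25)
print(f"{rate_2021:.3g} pw/s")

# 80 bits of entropy at the rounded 2e13 pw/s (roughly 958 years).
print(f"{expected_crack_years(80, 2e13):.1f} years")

# Fast-forward ten years of doubling, then recompute.
rate_2031 = projected_rate(2e13, 10)
print(f"{rate_2031:.3g} pw/s, {expected_crack_years(80, rate_2031):.2f} years")
```

Using the unrounded ten-year rate gives a slightly shorter time than the 9.58 years obtained with the rounded 2e15 figure.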
However, an entropy of 80 is very unwieldy when it comes to human memory. Diceware (7776 entries) would require six words (for entropy 77, add a random alphanumeric to make that 83). Using a standard spelling dictionary (100k entries), you'd need five words (for entropy 83). Using random printable ASCII characters, you'd need twelve (for entropy 78, thirteen yields 85).
At that point, is memorizing six random words or thirteen random characters really feasible for every account you use?
Use authenticators on accounts that allow it. For most systems, that's multi-factor or 2FA. Sometimes, there's a passwordless login that uses a 2FA technology on its own. For most services, cryptographic authentication apps provide sufficient security and we'll see a lot of transition away from passwords in the very near future. Higher-security concerns (like your email, bank, and password manager) will still need both a secure authenticator and a password.
Use a password manager with a passcode generator that uses 16+ random characters including a minimum of 1 lower, 1 upper, and 1 special, yielding an entropy of log₂(94¹³×26×26×32) = 99. To get to bk2204's suggested 128 bit entropy, you'd need 21 total characters.
For items you can't put into the password manager (like the password manager's entry code itself), I recommend an 8-ish character passcode put in a random place within a passphrase of 5 dictionary words or 6 Diceware words. Use a generator; humans are very bad at understanding that random ≠ arbitrary ≠ obscure. For example, I just popped open Bitwarden (Diceware) and generated reversion junkie unknotted opposite litter stamina as a passphrase and 7!&CPc9T as a passcode (requiring one upper, one lower, and one special), then I got 3 as a random number from 0-6 (echo $(($RANDOM%7)) in bash), so I can combine those to add the code after the 3rd word: reversion junkie unknotted 7!&CPc9T opposite litter stamina. Make a story to remember the words and write the code down on a piece of paper that lives in your wallet.
This has an entropy of log₂(7776⁶×94⁵×26×26×32) = 125.
(Update: Bitwarden uses Diceware, which is unnecessarily limiting. I've added a word to get a more robust entropy value until Bitwarden gains support for larger dictionaries.)
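For anyone who wants to script this rather than click through a password manager, here is a minimal Python sketch of the same recipe. The eight-word `WORDS` list is a placeholder I made up for illustration (a real Diceware list has 7776 entries), and the function names are mine. The entropy function simply reuses the accounting from the formula above.

```python
import math
import secrets
import string

# Placeholder list for illustration only; substitute a real 7776-word Diceware list.
WORDS = ["reversion", "junkie", "unknotted", "opposite",
         "litter", "stamina", "anchor", "velvet"]
ALPHABET = string.ascii_letters + string.digits + string.punctuation  # 94 chars


def random_code(length: int = 8) -> str:
    """Random passcode, redrawn until it has a lower, an upper and a special."""
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if (any(c.islower() for c in code)
                and any(c.isupper() for c in code)
                and any(c in string.punctuation for c in code)):
            return code


def passphrase_with_code(words, n_words: int = 6, code_len: int = 8) -> str:
    """Pick n_words random words and insert the passcode at a random position."""
    chosen = [secrets.choice(words) for _ in range(n_words)]
    position = secrets.randbelow(n_words + 1)  # 0..n_words, like $RANDOM % 7
    chosen.insert(position, random_code(code_len))
    return " ".join(chosen)


def estimated_bits(dict_size: int = 7776, n_words: int = 6, code_len: int = 8) -> float:
    """Same estimate as the formula above: words, free chars, 3 constrained chars."""
    return (n_words * math.log2(dict_size)
            + (code_len - 3) * math.log2(94)
            + math.log2(26 * 26 * 32))


print(passphrase_with_code(WORDS))
print(round(estimated_bits()))  # 125
```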
Don't fully trust your cloud-connected password manager for your most trusted items (PGP key, bank, email, etc)? Generate random 16-char codes and save them as the password, then generate another 5-char (or 2-word) code that you write down in your wallet with cryptic clues about what it is. Append that to your password after filling in the login form with your pw manager.
Just be careful not to lock yourself out with a scheme you don't recall, especially with sites like banks that lock you out after a small number of login attempts.
I'm setting up full disk encryption with LUKS2 for the first time and have to choose a passphrase. Before I do this I'd like to clear my confusion about brute forcing.
Using https://www.security.org/how-secure-is-my-password/ to check password strength, a 14 character password like $8wtS!^9C6voA2 takes 2 hundred million years to brute force whereas a passphrase such as Emphasis-Tubby3-Boat takes 42 quintillion years. I'm not sure how accurate this is though, but it sure is surprising to see such a ginormous difference.
At first I considered using a password like the former because it looks far more complex than the passphrase. Although I prefer a passphrase because it is easier to remember, before checking its strength I thought it to be a worse option because it primarily consists of dictionary words!
With some effort I can remember long and complex passwords like the former. I have done it before, but it took me several months.
Can anyone explain why a passphrase containing dictionary words and a mere 3 non-alphabetic characters is deemed much stronger than a 14-character randomly generated password?
Edit: 91 bit vs 150 bit entropy. So it makes sense that the passphrase would be deemed as stronger, however doesn't the fact that it contains primarily dictionary words negate its perceived security?
Assume you have no rainbow table (or other precomputed list of hashes), and would actually need to do a brute-force or dictionary attack.
This program, IGHASHGPU v0.90, claims to be able to compute about 1300 million SHA-1 hashes (i.e. more than 2^30) each second on a single ATI HD5870 GPU.
Assume a password with 40 bits of entropy: this needs 2^10 seconds, which is about 17 minutes.
A password of 44 bits of entropy (like the one in the famous XKCD comic) needs 2^14 seconds at the same rate, which is about 4.5 hours (worst case, average case is half of this).
Running on multiple GPUs in parallel speeds this up proportionally.
So, brute-forcing with fast hashes is a real danger, not a theoretical one. And many passwords have a much lower entropy, making brute-forcing even faster.
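To make the scaling concrete, here is a small Python sketch of the same estimate. It assumes the 2^30 hashes per second per GPU figure used above and perfectly linear scaling across GPUs. The function name is mine.

```python
def brute_force_seconds(entropy_bits: int, hashes_per_second: float,
                        gpus: int = 1) -> float:
    """Worst-case time to try every candidate; the average case is half of this."""
    return 2 ** entropy_bits / (hashes_per_second * gpus)


RATE = 2 ** 30  # roughly one HD5870 doing SHA-1, per the figure above

for bits in (40, 44):
    for gpus in (1, 4):
        worst = brute_force_seconds(bits, RATE, gpus)
        print(f"{bits} bits, {gpus} GPU(s): worst {worst / 60:.1f} min, "
              f"average {worst / 120:.1f} min")
```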
If I use a truly random salt for each user, on what order of magnitude will this affect the length of time to crack my password?
The salt itself is assumed to be known to the attacker, and it by itself doesn't much increase the cracking time for a single password (it might increase it a bit, because the hashed data becomes one block longer, but that at most doubles the work).
The real benefit of an (independent random) salt is that an attacker can't use the same work to attack the passwords of multiple users at the same time. When the attacker wants just any user's password (or "as many as possible"), and you have some millions of users, not having a salt would cut down the attack time proportionally, even if all users had strong passwords. And certainly not all will.
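Here is a toy Python illustration of that point. The user names, passwords and the four-entry candidate list are invented for the example, and plain SHA-1 is used only because it is the fast hash discussed above, not because it is suitable for storing passwords.

```python
import hashlib
import secrets


def unsalted(password: str) -> str:
    return hashlib.sha1(password.encode()).hexdigest()


def salted(password: str, salt: bytes) -> str:
    return hashlib.sha1(salt + password.encode()).hexdigest()


# Hypothetical attacker word list.
candidates = ["123456", "password", "letmein", "ncc1701"]

# Without a salt: one table, computed once, works against every user.
table = {unsalted(p): p for p in candidates}
users_unsalted = {"alice": unsalted("letmein"), "bob": unsalted("ncc1701")}
cracked = {user: table.get(h) for user, h in users_unsalted.items()}
print(cracked)

# With per-user salts: the shared table finds nothing.
users_salted = {}
for user, pw in (("alice", "letmein"), ("bob", "ncc1701")):
    salt = secrets.token_bytes(16)
    users_salted[user] = (salt, salted(pw, salt))
table_hits = [user for user, (_, h) in users_salted.items() if h in table]
print(table_hits)


def crack_salted(salt: bytes, target: str):
    """The attacker must redo the whole candidate list for each user's salt."""
    for p in candidates:
        if salted(p, salt) == target:
            return p
    return None


print(crack_salted(*users_salted["alice"]))
```

The salted passwords are still crackable one user at a time; the salt only removes the ability to share work across users.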
As a bonus, which hashing algorithms are safest to use?
The current standard is to use a slow hashing algorithm. PBKDF2, bcrypt and scrypt all take a password, a salt and a configurable work factor as input. Set this work factor as high as your users will just tolerate at login time on your server's hardware.
- PBKDF2 is simply an iterated fast hash (i.e. still efficiently parallelizable). (It is a scheme which can be used with different base algorithms. Use whatever algorithm you are using anyways in your system.)
- Bcrypt needs some (4KB) working memory, and thus is less efficiently implementable on a GPU with less than 4KB of per-processor cache.
- Scrypt uses a (configurable) large amount of memory in addition to processing time, which makes it extremely costly to parallelize on GPUs or custom hardware, while "normal" computers usually have enough RAM available.
All these functions have a salt input, and you should use it.
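As a minimal sketch of what this looks like in practice, here is PBKDF2 via Python's standard library (`hashlib.pbkdf2_hmac`). The function names, the 16-byte salt and the iteration count are my assumptions, not values from this answer. Pick the work factor by timing it on your own hardware.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # assumed work factor; tune it to your server and latency budget


def hash_password(password: str, iterations: int = ITERATIONS):
    """Return (salt, iterations, digest) to store alongside the user record."""
    salt = secrets.token_bytes(16)  # independent random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int,
                    expected: bytes) -> bool:
    """Recompute with the stored salt and work factor, compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    salt, iterations, digest = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, iterations, digest))
    print(verify_password("wrong guess", salt, iterations, digest))
```

Storing the iteration count with each hash lets you raise the work factor later without invalidating existing records.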
If you don't use salts, the hacker can simply type the hashes into Google and will likely find the passwords. Compute the MD5 hash of your password, then Google the result, and you'll see what I mean.
If you've chosen a sufficiently complex password, then it probably won't be on Google, but it will still be in pre-computed tables called "rainbow tables" (the full tables would take too much space, so the "rainbow" technique is used to compress them).
If no tables are being used, then it doesn't matter if the password is salted or not. Of course, tables are always used, so yes, you should salt the passwords.
A desktop machine with a gaming card (which can also be used to accelerate password cracking) can calculate a billion hashes/second. How fast this cracks your password depends. It can calculate all combinations for short passwords within a few minutes. But the problem is exponential, so it cannot calculate all possible combinations for long passwords, even if you used a billion computers over a billion years.
Instead, what hackers do is a "mutated dictionary" crack. They start with a list of well-known words. It's called a "dictionary", but such lists contain a lot of words that you wouldn't find in a real dictionary, like "ncc1701", the designation for the Star Trek Enterprise. For each word, the cracking tool applies a number of mutations, such as capitalizing some letters, changing some letters to numbers, like "p4ssw0rd", or appending characters, like "password1234". How fast this works depends upon the skill of the hacker in choosing just the right dictionary and just the right mutations. This can also change depending upon the hacker's knowledge of you. For example, if you are Hispanic, the hacker will add Spanish words and names to the dictionary.
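To show how cheaply such mutations multiply a word list, here is a toy Python sketch. The substitution table and suffix list are tiny examples I picked to reproduce the mutations mentioned above. Real cracking tools ship far larger rule sets.

```python
# Assumed, deliberately tiny letter-to-digit substitution table.
LEET = {"a": "4", "e": "3", "o": "0"}


def mutations(word: str, suffixes=("", "1", "123", "1234")):
    """Yield case variants and a digit-substituted variant, each with every suffix."""
    bases = {word, word.capitalize(), word.upper()}
    bases.add("".join(LEET.get(c, c) for c in word))
    for base in sorted(bases):
        for suffix in suffixes:
            yield base + suffix


candidates = list(mutations("password"))
print(len(candidates))
print(candidates)
```

Even this small rule set turns each dictionary word into many guesses, which is why a password built from a common word plus a predictable tweak carries far less entropy than its length suggests.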
You can expect that if your password database is stolen, the hacker will be able to crack about half of the salted passwords with a couple of days' worth of work.
As others have mentioned, for entropy to be well-defined, you need to have some underlying probability distribution $D$. If you are willing to look around some unsavory places, you could look for a moderately-large data breach to get an empirical distribution of passwords that you can compute the entropy of. Alternatively there might be some user studies that have computed things like this.
The main point of this answer is to mention that there are multiple inequivalent notions of entropy, and that the traditional (Shannon) entropy is not always the best one in cryptography. Shannon entropy is defined as
$$H(X) = \sum_{x\in \mathsf{supp}(X)} p(x)\log(1/p(x))$$
Another fairly-common notion of entropy is the min-entropy, defined as
$$H_\infty(X) = \min_{x\in \mathsf{supp}(X)} \log(1/p(x)) = -\log \max_{x\in \mathsf{supp}(X)} p(x)$$
This captures how likely the single most probable output of $X$ is. It is often much better to use for cryptographic purposes, which can be demonstrated via a simple example.
Let $X$ be a random variable that is
- $0$ with probability $1/2$, and
- uniform over $\{1,\dots,2^k\}$ with probability $1/2$.
The min-entropy of this is very small (it is $1$, since the most likely outcome has probability $1/2$). The Shannon entropy of it is much larger for large $k$ (it works out to $k/2 + 1$: the $0$ outcome contributes $1/2$ and the uniform part contributes $(k+1)/2$). So if you are measuring the quality of a distribution over passwords via
- min entropy, you will think that $X$ is quite bad, vs
- Shannon entropy, you will think $X$ is quite good.
Of course, half of all users sampling passwords from $X$ can be trivially attacked, so it should perhaps be considered "bad" (in accordance with what min-entropy would predict).
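The gap between the two notions is easy to check numerically. Below is a small Python sketch (the function names are mine) that builds the example distribution and evaluates both definitions with base-2 logarithms.

```python
import math


def shannon_entropy(probs) -> float:
    """H(X) = sum over outcomes of p * log2(1/p)."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)


def min_entropy(probs) -> float:
    """H_inf(X) = -log2 of the probability of the most likely outcome."""
    return -math.log2(max(probs))


def example_distribution(k: int):
    """0 with probability 1/2, otherwise uniform over 2**k other values."""
    n = 2 ** k
    return [0.5] + [0.5 / n] * n


for k in (4, 10, 16):
    probs = example_distribution(k)
    print(k, shannon_entropy(probs), min_entropy(probs))
```

The Shannon entropy grows with $k$ (computing it directly gives $k/2 + 1$ bits) while the min-entropy stays at 1 bit, matching the intuition that an attacker who guesses $0$ first succeeds half the time.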
The answer depends on many factors. In particular:
- If a human has generated it, the probability of some values will be higher than the probability of others.
- If a random generator was used, it depends on what is known about its properties. If it is known to generate some values more often than others, this can affect the probability of a correct guess.
- It depends on the correctness of the password generator's implementation. Implementation bugs can affect the probability of some values.
- It depends on how the guessing is organized. If a human is guessing, then even with the information above, the probability will be one value. If an application is guessing, it will be another.
The formula 1/2^entropy is correct. But, as Royce Williams said, in many cases it is hard to calculate the entropy. Entropy is not something absolute. It depends on the context and on the factors mentioned above. For instance, suppose a password is generated randomly using lowercase English letters. One person knows this exactly. Another person knows only that the password can contain English letters in both cases and digits. The probability of a correct guess will be higher for the first person.
Thus the actual question is: How to calculate entropy?