@tqbf
Last active November 11, 2024 19:20
(Updated) Cryptographic Right Answers

Encrypting data (Was: AES-CTR with HMAC): Use, in order of preference: (1) The NaCl/libsodium default, (2) ChaCha20-Poly1305, or (3) AES-GCM.

You care about this if: you're hiding information from users or the network.

All three options get you "AEAD", which is the only way you want to encrypt in 2015. Options (2) and (3) are morally the same thing: a stream cipher with a polynomial ("thermonuclear CRC") MAC. Option (2) gets there with a native stream cipher and a MAC optimized for general purpose CPUs; Poly1305 is also easier than GCM for library designers to implement safely. Option (3)'s AES-GCM is the industry standard; it's fast and usually hardware accelerated on modern processors, but has implementation safety pitfalls on platforms that aren't accelerated.

Avoid: AES-CBC, AES-CTR by itself, OFB mode, block ciphers with 64-bit blocks --- most especially Blowfish, which is inexplicably popular. Don't ever use RC4, which is comically broken.

Symmetric key length (Was: Use 256 bit keys): Go ahead and use 256 bit keys.

You care about this if: you're using cryptography.

But remember: your AES key is far less likely to be broken than your public key pair, so the latter key size should be larger if you're going to obsess about this.

Avoid: constructions with huge keys, cipher "cascades", key sizes under 128 bits.

Symmetric signatures (Was: use HMAC): Remains HMAC.

You care about this if: you're securing an API, encrypting session cookies, or are encrypting user data but, against medical advice, not using an AEAD construction.

If you're authenticating but not encrypting, as with API requests, don't do anything complicated. There is a class of crypto implementation bugs that arises from how you feed data to your MAC, so, if you're designing a new system from scratch, Google "crypto canonicalization bugs". Also, use a secure compare function.
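A minimal sketch of what that can look like for an API request, using only Python's standard library. The function names and the length-prefixed encoding are illustrative assumptions, not something this document prescribes; the point is that every field is framed unambiguously before it reaches the MAC, and that tags are compared with a constant-time function.

```python
import hashlib
import hmac
import struct


def canonical_bytes(fields):
    """Length-prefix each field so field boundaries are unambiguous.

    Plain concatenation would make ("/a", "b") and ("/ab", "") produce the
    same MAC input, which is one flavor of canonicalization bug.
    """
    out = bytearray()
    for field in fields:
        out += struct.pack(">I", len(field))  # 4-byte big-endian length
        out += field
    return bytes(out)


def sign_request(key, method, path, body):
    """Return an HMAC-SHA256 tag over the canonical encoding of the request.

    `method` and `path` are str, `body` and `key` are bytes (hypothetical API).
    """
    message = canonical_bytes([method.encode(), path.encode(), body])
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_request(key, method, path, body, tag):
    """Recompute the tag and compare it in constant time."""
    expected = sign_request(key, method, path, body)
    return hmac.compare_digest(expected, tag)
```

A caller would generate `key` once from the OS random source, sign outgoing requests with `sign_request`, and reject any incoming request for which `verify_request` returns False.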

Avoid: custom "keyed hash" constructions, HMAC-MD5, HMAC-SHA1, complex polynomial MACs, encrypted hashes, CRC.

Hashing/HMAC algorithm (Was: use SHA256/HMAC-SHA256): Remains SHA-2.

You care about this if: you always care about this.

If you can get away with it: use SHA-512/256, which truncates its output and sidesteps length extension attacks. Meanwhile: it's less likely that you'll upgrade from SHA-2 to SHA-3 than it is that you'll upgrade from SHA-2 to something faster than SHA-3, and SHA-2 looks great right now, so get comfortable and cuddly with SHA-2.
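As a rough illustration in Python: whether SHA-512/256 is exposed depends on the OpenSSL build behind `hashlib`, so this sketch checks for it at runtime. The fallback to SHA-384 (another truncated SHA-2 variant) is an assumption made for this example, and the function name is hypothetical.

```python
import hashlib


def digest_without_length_extension(data):
    """Hash `data` with a truncated SHA-2 variant.

    Prefers SHA-512/256 when the local hashlib build exposes it; otherwise
    falls back to SHA-384 (an assumption of this sketch). Both emit fewer
    bits than their internal state, so the digest alone is not enough to
    resume the hash and extend the message.
    """
    if "sha512_256" in hashlib.algorithms_available:
        return hashlib.new("sha512_256", data).digest()
    return hashlib.sha384(data).digest()
```

Note that SHA-512/256 is not the same as cutting a SHA-512 digest in half yourself: it uses different initial values, so the two outputs differ.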

Avoid: SHA-1, MD5, MD6.

Random IDs (Was: Use 256-bit random numbers): Remains: use 256-bit random numbers.

You care about this if: you always care about this.

From /dev/urandom.
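In Python, for instance, the `secrets` module reads from the operating system's random source (the same one behind `os.urandom`), so a 256-bit ID is a one-liner. The function name here is just for illustration.

```python
import secrets


def new_random_id():
    """Return a 256-bit random ID as 64 hex characters.

    secrets.token_hex(32) pulls 32 bytes from the OS CSPRNG; no userspace
    generator or manual seeding is involved.
    """
    return secrets.token_hex(32)
```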

Avoid: userspace random number generators, haveged, prngd, egd, /dev/random.

Password handling (Was: scrypt or PBKDF2): In order of preference, use scrypt, bcrypt, and then if nothing else is available PBKDF2.

You care about this if: you accept passwords from users or, anywhere in your system, have human-intelligible secret keys.

But don't obsess about which you use, or build elaborate password-hash-agility schemes to accommodate Password Hash Competition winners; they're all pretty good. The real weakness is in systems that don't use password hashes at all.
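A small sketch with scrypt from Python's standard library, assuming a `hashlib` built against an OpenSSL that provides it. The cost parameters (n=2**14, r=8, p=1), the 16-byte salt, and the function names are illustrative choices for this example, not recommendations taken from this document; tune them for your own hardware.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (assumed for this sketch): about 16 MiB of
# memory per hash with these values.
SCRYPT_N = 2 ** 14
SCRYPT_R = 8
SCRYPT_P = 1


def hash_password(password):
    """Return (salt, digest) for a new password, using a fresh random salt."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-derive the digest with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return hmac.compare_digest(candidate, expected_digest)
```

Store the salt and digest together (along with the parameters, if you ever expect to change them), and never store or log the password itself.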

Avoid: naked SHA-2, SHA-1, MD5.

Asymmetric encryption (Was: Use RSAES-OAEP with SHA256 and MGF1+SHA256 bzzrt pop ffssssssst exponent 65537): Use NaCl.

You care about this if: you need to encrypt the same kind of message to many different people, some of them strangers, and they need to be able to accept the message asynchronously, like it was store-and-forward email, and then decrypt it offline. It's a pretty narrow use case.

Of all the cryptographic "right answers", this is the one you're least likely to get right on your own. Don't freelance public key encryption, and don't use a low-level crypto library like OpenSSL or BouncyCastle.

Here are several reasons you should stop using RSA and switch to elliptic curve software:

  • Progress in attacking RSA --- really, all the classic multiplicative group primitives, including DH and DSA and presumably ElGamal --- is proceeding faster than progress against elliptic curve.
  • RSA (and DH) drag you towards "backwards compatibility" (ie: downgrade-attack compatibility) with insecure systems. Elliptic curve schemes generally don't need to be vigilant about accidentally accepting 768-bit parameters.
  • RSA begs implementors to encrypt directly with its public key primitive, which is usually not what you want to do: not only does accidentally designing with RSA encryption usually forfeit forward-secrecy, but it also exposes you to new classes of implementation bugs. Elliptic curve systems don't promote this particular foot-gun.
  • The weight of correctness/safety in elliptic curve systems falls primarily on cryptographers, who must provide a set of curve parameters optimized for security at a particular performance level; once that happens, there aren't many knobs for implementors to turn that can subvert security. The opposite is true in RSA. Even if you use RSA-OAEP, there are additional parameters to supply and things you have to know to get right.

If you have to use RSA, do use RSA-OAEP. But don't use RSA.

Avoid: RSA-PKCS1v15, RSA, ElGamal, I don't know, Merkle-Hellman knapsacks? Just avoid RSA.

Asymmetric signatures (Was: Use RSASSA-PSS with SHA256 then MGF1+SHA256 yabble babble): Use NaCl, Ed25519, or RFC6979.

You care about this if: you're designing a new cryptocurrency. Or, a system to sign Ruby Gems or Vagrant images, or a DRM scheme, where the authenticity of a series of files arriving at random times needs to be checked offline against the same secret key. Or, you're designing an encrypted message transport.

The allegations from the previous answer are incorporated herein as if stated in full.

In 10+ years of doing software security assessments I can count on none fingers the number of RSA-PSS users I was paid to look at. RSA-PSS is an academic recommendation.

The two dominating use cases within the last 10 years for asymmetric signatures are cryptocurrencies and forward-secret key agreement, as with ECDHE-TLS. The dominating algorithms for these use cases are all elliptic-curve based. Be wary of new systems that use RSA signatures.

In the last few years there has been a major shift away from conventional DSA signatures and towards misuse-resistant "deterministic" signature schemes, of which EdDSA and RFC6979 are the best examples. You can think of these schemes as "user-proofed" responses to the Playstation 3 ECDSA flaw, in which reuse of a random number leaked secret keys. Use deterministic signatures in preference to any other signature scheme.

Avoid: RSA-PKCS1v15, RSA, ECDSA, DSA; really, especially avoid conventional DSA and ECDSA.

Diffie-Hellman (Was: Operate over the 2048-bit Group #14 with a generator of 2): Probably still DH-2048, or NaCl.

You care about this if: you're designing an encrypted transport or messaging system that will be used someday by a stranger, and so static AES keys won't work.

This is the trickiest one. Here is roughly the set of considerations:

  • If you can just use NaCl, use NaCl. You don't even have to care what NaCl does.
  • If you can use a very trustworthy library, use Curve25519; it's the modern ECDH curve with the best software support and the most analysis. People really beat the crap out of Curve25519 when they tried to get it standardized for TLS. There are stronger curves, but none supported as well as Curve25519.
  • But don't implement Curve25519 yourself or port the C code for it.
  • If you can't use a very trustworthy library for ECDH but can for DH, use DH-2048 with a standard 2048 bit group, like Colin says, but only if you can hardcode the DH parameters.
  • But don't use conventional DH if you need to negotiate parameters or interoperate with other implementations.
  • If you have to do handshake negotiation or interoperate with older software, consider using NIST P-256, which has very widespread software support. Hardcoded-param DH-2048 is safer than NIST P-256, but NIST P-256 is safer than negotiated DH. But only if you have very trustworthy library support, because NIST P-256 has some pitfalls. P-256 is probably the safest of the NIST curves; don't go down to -224. Isn't crypto fun?
  • If your threat model is criminals, prefer DH-1024 to sketchy curve libraries. If your threat model is governments, prefer sketchy curve libraries to DH-1024. But come on, find a way to one of the previous recommendations.

It sucks that DH (really, "key agreement") is such an important crypto building block, but it is.

Avoid: conventional DH, SRP, J-PAKE, handshakes and negotiation, elaborate key negotiation schemes that only use block ciphers, srand(time()).

Website security (Was: Use OpenSSL.): Remains: OpenSSL, or BoringSSL if you can. Or just use AWS ELBs.

You care about this if: you have a website.

By "website security", Colin means "the library you use to make your web server speak HTTPS". Believe it or not, OpenSSL is still probably the right decision here, if you can't just delegate this to Amazon and use HTTPS elastic load balancers, which makes this their problem not yours.

Avoid: offbeat TLS libraries like PolarSSL, GnuTLS, and MatrixSSL.

Client-server application security (Was: ship RSA keys and do custom RSA protocol): Use TLS.

You care about this if: the previous recommendations about public-key crypto were relevant to you.

What happens when you design your own custom RSA protocol is that 1-18 months afterwards, hopefully sooner but often later, you discover that you made a mistake and your protocol had virtually no security. That happened to Colin, but a better example is Salt Stack. Salt managed to deploy e=1 RSA.

It seems a little crazy to recommend TLS given its recent history:

  • The Logjam DH negotiation attack
  • The FREAK export cipher attack
  • The POODLE CBC oracle attack
  • The RC4 fiasco
  • The CRIME compression attack
  • The Lucky13 CBC padding oracle timing attack
  • The BEAST CBC chained IV attack
  • Heartbleed
  • Renegotiation
  • Triple Handshakes
  • Compromised CAs

Here's why you should still use TLS for your custom transport problem:

  • Many of these attacks only work against browsers, because they rely on the victim accepting and executing attacker-controlled Javascript in order to generate repeated known/chosen plaintexts.
  • Most of these attacks can be mitigated by hardcoding TLS 1.2+, ECDHE and AES-GCM. That sounds tricky, and it is, but it's less tricky than designing your own transport protocol with ECDHE and AES-GCM!
  • In a custom transport scenario, you don't need to depend on CAs: you can self-sign a certificate and ship it with your code, just like Colin suggests you do with RSA keys.
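The last two points can be sketched with Python's standard `ssl` module. This is a hedged example, not a vetted configuration: the function name and the `pinned_cert_path` parameter are hypothetical, and the cipher string is one plausible way to express "ECDHE and AES-GCM" to OpenSSL.

```python
import ssl


def make_client_context(pinned_cert_path=None):
    """Build a client-side TLS context restricted to TLS 1.2+, ECDHE, AES-GCM.

    `pinned_cert_path` (hypothetical parameter) points at the self-signed
    certificate shipped with the application. It becomes the only trust
    anchor, so no public CA is involved. Without it the context trusts
    nothing and every handshake will fail verification.
    """
    # PROTOCOL_TLS_CLIENT turns on certificate and hostname verification.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Applies to TLS 1.2 suites; TLS 1.3 suites are configured separately
    # by OpenSSL and are all AEAD.
    ctx.set_ciphers("ECDHE+AESGCM")
    if pinned_cert_path is not None:
        ctx.load_verify_locations(cafile=pinned_cert_path)
    return ctx
```

The returned context would then be passed to `ctx.wrap_socket(sock, server_hostname=...)`; because hostname checking stays on, the shipped certificate needs a subject alternative name matching whatever hostname the client passes.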

Avoid: designing your own encrypted transport, which is a genuinely hard engineering problem; using TLS but in a default configuration, like, with "curl"; using "curl", IPSEC.

Online backups (Was: Use Tarsnap): Remains Tarsnap.

What can I say? This recommendation stood the test of time.

@WhyNotHugo

There is also one Tarsnap alternative worth considering: Least Authority

A lot more expensive unless you're storing huge amounts of data (over 100GiB). Which is really a lot, considering it's only for personal use.

@tqbf
Author

tqbf commented May 23, 2015

There is no such thing as "enriching an entropy pool".

@ex3ndr

ex3ndr commented May 23, 2015

NaCl is the native library. We need pure java-implementations for asymmetric encryption for working with it in all environments. What can you recommend?

Copy link

It would be great if this document cited sources for some of the claims or rationales. I'm still wondering why you've said to avoid AES-CBC.

@NemoPublius

@phene: AES-CBC is not an authenticated mode, and as the author says, authenticated modes are the "only way you want to encrypt in 2015".

I am a little disappointed OCB mode did not even get a mention. Rogaway is a good guy.

I agree it would be nice to have a few links to details. Some of these answers are of the "if you have to ask, you will never understand" sort, but not all.

@colinmahns

Out of curiosity, do you consider LibreSSL an "offbeat TLS library"? That team is doing all it can to make the OpenSSL code-base sane.

@leonklingele

Mind adding additional information to Asymmetric encryption > Elliptic curves?
Maybe let ECC get its own section?
A link to safecurves.cr.yp.to or curve recommendations would also be helpful.

EDIT:
Why is this a secret gist? Give it a name and make it public!

@Zate

Zate commented May 24, 2015

This is amazingly helpful for someone who has to field 3 or 4 queries from the business a week on what to do in certain crypto situations to secure large scale client/server infrastructures.

@maximegmd

I read http://cr.yp.to/streamciphers/why.html and it appears Salsa20 is stronger and faster than AES on a CPU that does not have the AES-NI extension. What do you think?

@nnathan

nnathan commented May 24, 2015

What about the option to choose between (entity) authentication and encryption or both?

I was looking through the braindead SNMP specs and, for the life of me, cannot understand why there are separate passwords for authentication and privacy. I realize the example provided is comically bad in its own right.

@CodesInChaos

A few nitpicks:

  1. NaCl is only authenticated encryption, not authenticated encryption with additional data.
  2. Since option 1 is XSalsa20+Poly1305, it's morally the same thing as options 2 and 3: a stream cipher with a polynomial ("thermonuclear CRC") MAC.

That happened to Colin

I don't recall him messing up transport security, only at-rest security via nonce reuse. Neither TLS nor NaCl could have prevented that.

@Yamashi
Salsa20 has similar properties to AES-CTR. In particular, it's unauthenticated. So unless you have a good reason to use a raw stream cipher, you're probably better off with authenticated encryption. NaCl's crypto_box and secret_box (Option 1) are based on Salsa20, and ChaCha (Option 2) is Salsa20's successor.

@arielb1
Designing a secure transport protocol on top of crypto_box isn't easy once you add in forward secrecy or protection against replay attacks.

@ex3ndr
I've ported the ref10 implementation of the Curve25519 key exchange to Java. But since I won't maintain it, I can't recommend it for direct use. It might be a good starting point for somebody else who wants to maintain a Java NaCl port, since ref10 is a great basis, having decent performance and being written in portable C.

Copy link

@NemoPublius - I reject that anything is "if you have to ask, you will never understand." Don't conflate ignorance of a subject or detail with unwillingness and incapability of learning it.

Copy link

I'd add that if you're creating a service on *nix that can run over OpenSSH and your users already use OpenSSH (i.e. they're developers or sysadmins), then run that service over the existing OpenSSH infrastructure rather than setting up a separate service that runs over TLS. Invoke the user's regular "ssh" binary; don't re-implement the SSH protocol yourself.

@samson

samson commented May 26, 2015

PolarSSL is now managed by ARM (the CPU manufacturers) and has been renamed to mbedTLS as part of their embedded systems software package. Calling this 'offbeat' is a little 'out there' if you ask me.

Steering people towards the libraries which have been proven to have the most problems is also a bit of a problem.

@stribika-rdonly

Hardcoded-param DH-2048 is safer than NIST P-256, but NIST P-256 is safer than negotiated DH.

Is this true for SSH's DH-GEX protocol?

I don't think it can be downgraded because after the key exchange, the server and the client use SHA256(g^xy mod p || g || p || min_dh_size || preferred_dh_size || max_dh_size || other_stuff) as the shared secret. If anyone were to replace either the DH parameters from the server or the requirements from the client, then the keys won't match.

I have an SSH guide in which I recommend Curve25519 ECDH and negotiated, >2048 bit DH, in that order. I would prefer not to spread bad advice.

@borski

borski commented May 27, 2015

We wrote a post over a year ago trying to simplify the differences between encryption and hashing for people that have little security experience. This is a much more extensive list, but our version basically boils down to: "Use NaCl"

@tqbf
Author

tqbf commented May 28, 2015

The fact that a major vendor now owns the rights to an SSL library does not mean that library's crypto constructions have received especially careful review. You should be concerned about that. "Libraries that aren't OpenSSL" somewhat routinely reincarnate old crypto bugs.

@mansourmoufid

Also avoid libgcrypt for generating keys. It uses an internal PRNG by default, based on the Gutmann design predating Yarrow, never audited. The only advantage is its "secure memory" pool, which you can have if your program is setuid (don't do that either). Use /dev/urandom.

@jamshid

jamshid commented May 31, 2015

What do the references to curl mean in:

Avoid: .. using TLS but in a default configuration, like, with "curl"; using "curl", IPSEC.

Do curl or libcurl handle TLS incorrectly?

@dfoxfranke

You should include some discussion about precautions against nonce reuse and the dire consequences of getting this wrong, especially given that you've dropped the CTR-then-HMAC recommendation in favor of polynomial-MAC-based schemes that use 64-bit nonces and allow arbitrary message forgery if you reuse one. With these schemes, random nonces have too high a risk of collision. Counting up from 0 works fine for ephemeral keys, but for long-lived ones it requires you to correctly handle resumption after a process crash or power failure, and some buggy disk controllers make this effectively impossible. The problem gets even harder if the key is shared across a cluster of machines.

@byronhe

byronhe commented Jun 10, 2015

Thank you for your excellent job.

I made a Chinese translation, here, in case someone is interested.

Copy link

Why not recommend SHA384 instead of SHA512/256? From what I understand, it prevents length extension attacks and is readily available on all operating systems and most programming languages. It seems the primary downside is slightly longer hashes, right?

Copy link

Options (2) and (3) are morally the same thing: a stream cipher with a polynomial ("thermonuclear CRC") MAC.

Isn't the NaCl default (xsalsa20 + poly1305) also a stream cipher with a polynomial MAC?

From /dev/urandom.

If you can get away with not supporting kernels before 3.17, especially if there is a chance of the code running at boot time, getrandom() with flags=0 is nicer.

@cacsar

cacsar commented Jul 9, 2015

I agree with @dfoxfranke: using AES-GCM with long-lived keys across different machines is a good way to reuse something that must not be reused and end up in deep, deep trouble. I'd have to look at the other options to check whether they share the problem. How you are encrypting data and what you're using it for is an important element in choosing your system. It'd be good to see a breakdown of why not to use some of the things mentioned. For example, is AES-CBC an avoid because it's "slow" and unauthenticated, or an avoid for another reason?

Copy link

Although it's oxymoronic, it's a good thing that this Gist is SECRET!
😜

@oconnor663

@CodesInChaos, about your nitpick #1: You could take SHA2(nonce + additional data), and use the first 192 bits of that hash as the NaCl nonce. Is that very different from what an "official" additional data API would do under the covers?

@atoponce

It's worth adding sha256crypt and sha512crypt as a last resort for password hashing. These are not the same as vanilla SHA-256 and SHA-512. They're similar in design to PBKDF2 (where the default round count is typically 1,000), but instead of generating keys of arbitrary length, they have a fixed-length output. They are the default algorithms for GNU/Linux systems, and you can fine-tune the rounds (the default is 5,000). So I would recommend:

In order of preference, use scrypt, bcrypt, and then if nothing else is available sha256/512crypt followed by PBKDF2.

@enkore

enkore commented Dec 9, 2016

For example is AES-CBC an avoid because it's "slow" and unauthenticated or an avoid for another reason?

Combine CBC with anything other than a full EtM of the ciphertext and you probably have a fatal security flaw in your protocol.

So besides being unnecessarily hard to apply correctly, it's also unnecessarily slow due to its namesake chaining operation (to encrypt block n, you already need to have block n-1 encrypted), i.e. on modern CPUs you'll leave somewhere around a factor of 7-8 on the table, compared to CTR mode. Practically everything else will be faster, including AES-GCM and -OCB, and ChaCha20-Poly1305. The gap is smaller on older CPUs without AES acceleration, but the difference due to pipelining is still significant. With CTR you can split decryption of long messages over CPU cores, increasing performance manifold again.

An older argument against CTR-like modes was that the input to generate the key stream will usually be completely predictable by any observer (although this depends on the protocol; in some protocols it is not possible to predict it as an observer). I don't believe any cryptographer shares this concern, or has shared it, for quite some time -- if the block cipher would generate distinguishable or predictable blocks under a predictable input, it would be broken anyway.

Another old argument for CBC is that it is not quite as malleable as keystream-based constructions. However, theory and practice show that this argument is unfounded; "malleability is different|not as easy" is absolutely not a valid argument for one mode over another, since you're authenticating anyway, aren't you?

So in summary, there are no arguments for CBC, but some against. Why would anyone still use it?


PBKDF2: Don't. At least b/scrypt or even better Argon2 (any of these are rather hard targets - Argon2 with the correct parameters becomes a ridiculously hard target).

Happy to see that this FAQ was updated here to advise against RSA in any form.

Good work!

@tqbf
Author

tqbf commented Jul 1, 2017

Was it updated about RSA? I feel like I've been beating that drum --- to the annoyance of people more expert than me --- since before I wrote this.

I can generate an argument for CBC, not from theory but from systems engineering: the failure mode for IV misuse in CBC is different than the failure mode for nonce misuse in CTR. If you can foresee randomness or state tracking failures, CBC might be more resilient.

It's not a compelling argument and I'd tend not to use CBC, but I will say that as someone who primarily hunts for crypto vulnerabilities and doesn't do much system design, my pentesting salivary glands tend to work harder when I see a system using CTR than when I see one using CBC.

I really don't think it much matters what KDF/password hash you use. I know that's an annoying thing to say, but I think it's ultimately true.

@JonTheNiceGuy

An updated response was released yesterday, found here: http://latacora.singles/2018/04/03/cryptographic-right-answers.html
