How Does HTTPS Work?

HTTPS keeps your data safe in transit. There are some vulnerabilities that are worth knowing about. But it's a battle-tested technology that generally just works, and that's about all that most of us really need to know.

But if, like me, you want to know more about how it all works, read on.

HTTPS

HTTPS (Hypertext Transfer Protocol Secure) is little more than the HTTP that we all know and love over SSL (Secure Sockets Layer).

Browsers themselves supply some extra conveniences, such as the padlock, shield or other security indicators in the address bar, or alerting you if insecure data is being served at the same time as secure data, but none of these are actually part of the protocol itself.

SSL/TLS

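So the real magic of HTTPS happens with SSL. Except that SSL itself has been supplanted by TLS (Transport Layer Security). TLS is a drop-in replacement for SSL, and TLS 1.2, the version described in this post, still identifies itself on the wire as version 3.3, so it can essentially be considered SSL 3.3. (TLS 1.3 has since been published and reworks parts of the handshake.)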

So how does TLS work?

Let's run through the steps of the diagram below, which depicts the TLS handshake. If you run into a term you don't understand, just bear with me. We'll get to them all sometime before the end of this post.

  1. Client hello: When a client makes a request over HTTPS, it first performs the normal TCP handshake to establish communication. It then sends over the versions of SSL/TLS and the cipher suites it supports.

  2. Server hello: The server responds with the highest version of TLS that both sides support and its choice of cipher suite, along with its security certificate.

  3. Verify server certificate: The client checks the server's certificate against its internal list of trusted certification authorities (CAs) and uses the issuing CA's public key to verify that the certificate is authentic.

  4. Client key exchange: Once the client is satisfied that the server is who it reports to be (or the user has ignored warnings to the contrary) the client will send the 'shared secret' portion of a symmetric key, encrypted using the server's public key.

  5. The rest: If this transaction requires extra secrecy, the server will confirm the client's certificate information. The client and server confirm the connection details and begin transferring data using a shared key.
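To make the first step a little more concrete, here is a minimal sketch using Python's standard `ssl` module. It starts a handshake against in-memory buffers instead of a real socket, so nothing is sent over the network, and then peeks at the "client hello" the library produced. The hostname `example.com` is just a placeholder, and the list of suites you see depends on your local OpenSSL build.

```python
import ssl

# A default client context: it verifies the server certificate against the
# system's trusted CAs and checks that the hostname matches.
context = ssl.create_default_context()

# Cipher suites this client is willing to offer in its "client hello".
offered_suites = [suite["name"] for suite in context.get_ciphers()]

# In-memory buffers stand in for the TCP connection, so nothing is sent.
from_server = ssl.MemoryBIO()
to_server = ssl.MemoryBIO()
tls = context.wrap_bio(from_server, to_server, server_hostname="example.com")

try:
    tls.do_handshake()
except ssl.SSLWantReadError:
    # Expected: the client has written its hello and now waits for the server.
    pass

client_hello = to_server.read()
record_type = client_hello[0]     # 22 (0x16) marks a TLS handshake record
handshake_type = client_hello[5]  # 1 marks a "client hello" message

print(len(offered_suites), "cipher suites offered")
print(len(client_hello), "bytes of client hello waiting to be sent")
```

With a real connection you would wrap a socket with `context.wrap_socket(...)` instead, and the library would carry the remaining steps through for you.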

(A)symmetric encryption

How do you allow another party, with whom you have no prior relationship, to send you data securely, even if a third party is snooping on the entire conversation?

Asymmetric and symmetric encryption solve this problem in different ways and TLS doubles down on its security by utilizing both, which makes it a hybrid cryptosystem.

Asymmetric encryption

Asymmetric encryption, also known as public-key cryptography, uses an algorithm capable of generating a pair of keys with the unique property that data encrypted with the first key can only be decrypted with the second key.
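If you'd like to see that property in action, here is a toy sketch of an RSA-style key pair using nothing but integer arithmetic. The primes are absurdly small and there is no padding, so treat it as an illustration of the idea rather than how any real library does it.

```python
# Toy RSA key pair built from two tiny primes. Real keys use primes that are
# hundreds of digits long, plus padding schemes that this sketch omits.
p, q = 61, 53
n = p * q                  # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                     # public (encryption) exponent
d = pow(e, -1, phi)        # private (decryption) exponent; needs Python 3.8+

message = 65               # a message encoded as a number smaller than n

ciphertext = pow(message, e, n)    # anyone can do this with the public key
recovered = pow(ciphertext, d, n)  # only the private key holder can undo it

print("ciphertext:", ciphertext)
print("recovered: ", recovered)
```

The pair `(n, e)` is what gets published; `d` is the part that has to stay secret.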

TLS generally uses the RSA algorithm for its asymmetric key pairs, and employs them in two clever ways:

Server public/private key pairs

The server makes the encryption key publicly available, keeping the decryption key a closely guarded secret.

Clients are able to encrypt data using the server's public key, and be fairly confident that any third party listening on the line will be unable to decrypt that data without the server's private key.

TLS uses this secure connection in the process of generating the symmetric key.

Certification authorities

Certification authorities (CAs) utilize the opposite pattern with RSA, which is what is usually called digital signing. The CA's decryption (verification) key is made publicly available, while the encryption (signing) key is kept (fiercely) secret.

When a server purchases a certificate from a CA they will provide identifying information, including the domain they are serving. For certain levels of certification the owner of the server may need to jump through some hoops to prove to the CA that they actually own the domain.

A key pair is generated for the server (in practice this is usually done by the server's owner, who sends only the public key to the CA). The CA bundles the public (encryption) key with the identifying information and encrypts it all using its own private key. The owner of the server then installs the certificate along with the private (decryption) key that matches the public key in the certificate.

When you, as a user, receive a certificate from a server, your browser will first assess whether the specified certificate purports to be from a trusted CA, as determined by a pre-installed list. (Note that this list is one of the largest potential vulnerabilities in the entire TLS process.)

The browser uses that CA's public key to decrypt the certificate. It can now check whether the domain in the decrypted certificate matches the domain the server is serving, and if not, throw up some kind of warning.

Because it was encrypted with the CA's private key, any tampering with the certificate causes the decryption process to produce nonsense.

The brilliant part of this is that anyone on the internet can copy (for instance) Google's certificate and present it as their own, but without Google's private key they cannot decrypt anything a client encrypts with the public key inside it. And if they try to swap that public key for one whose private key they do hold, the certificate will be corrupted, and the browser will throw up security flags.
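The tamper check described above can be sketched in a few lines. This is a simplified model, not the real certificate format: the "certificate" is just a byte string, the CA's key pair is built from two small known primes, and the CA applies its private key to a hash of the contents rather than to the contents themselves. All the names here (`sign`, `verify`, the certificate fields) are made up for illustration.

```python
import hashlib

# Toy CA key pair from two known Mersenne primes; far too small for real use.
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537                            # CA public exponent, known to browsers
d = pow(e, -1, (p - 1) * (q - 1))    # CA private exponent; needs Python 3.8+

def digest(data: bytes) -> int:
    """Hash the certificate contents down to a number smaller than n."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(data: bytes) -> int:
    """What the CA does, using its private key."""
    return pow(digest(data), d, n)

def verify(data: bytes, signature: int) -> bool:
    """What the browser does, using only the CA's public key."""
    return pow(signature, e, n) == digest(data)

certificate = b"domain=example.com;public_key=SERVER-PUBLIC-KEY"
signature = sign(certificate)

# An attacker copies the certificate but swaps in their own public key.
forged = b"domain=example.com;public_key=ATTACKER-PUBLIC-KEY"

print("genuine certificate verifies:", verify(certificate, signature))
print("forged certificate verifies: ", verify(forged, signature))
```

The attacker can reuse the genuine signature as often as they like, but it only matches the original contents, and producing a new one would require the CA's private key.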

Symmetric encryption

Symmetric encryption strategies require the use of a shared key. The most common technique for arriving at a shared key over a public connection is known as the Diffie-Hellman Key Exchange, named after its inventors. The mechanics of how the two parties arrive at a shared key are outside the scope of this article, but are well covered on Wikipedia.
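For a quick taste of the idea without the full mechanics, here is a toy sketch of the exchange. The prime and base are my own picks for illustration and are far too weak for real use; the point is only that both sides end up with the same number even though it never crosses the wire.

```python
import secrets

# Public parameters, visible to anyone snooping on the connection.
p = 2**127 - 1   # prime modulus (a Mersenne prime; toy-sized)
g = 3            # base

# Each side picks a private number and never sends it.
alice_private = secrets.randbelow(p - 3) + 2
bob_private = secrets.randbelow(p - 3) + 2

# Each side sends only this derived public value.
alice_public = pow(g, alice_private, p)
bob_public = pow(g, bob_private, p)

# Each side combines the other's public value with its own private number.
alice_shared = pow(bob_public, alice_private, p)
bob_shared = pow(alice_public, bob_private, p)

print("keys match:", alice_shared == bob_shared)
```

An eavesdropper sees `p`, `g` and both public values, but recovering either private number from them is the hard part.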