HTTP is a plaintext protocol: neither the interaction nor the data in transit is encrypted, and the communicating parties do not authenticate each other. This makes the communication highly susceptible to hijacking, eavesdropping, and tampering. In severe cases it can lead to malicious traffic hijacking and to serious security problems such as the leakage of private data (for example, bank card numbers and passwords).
HTTP communication can be compared to sending a letter. When A sends a letter to B, the letter goes through many hands of postal workers during the delivery process. They can open the letter and read its contents (since HTTP is transmitted in plaintext). Any content in A's letter (including various account numbers and passwords) can be easily stolen. In addition, postal workers can forge or modify the content of the letter, causing B to receive false information.
For example, in HTTP communication, a "middleman" could insert an advertising link into the HTTP message sent from the server to the user, causing many inappropriate links to appear on the user's interface. Alternatively, the middleman could modify the user's request header URL, leading the user's request to be hijacked to another website, and the user's request never reaches the real server. These issues can result in users not receiving the correct service and even suffering significant losses.
To address the issues caused by HTTP, encryption and identity verification mechanisms must be introduced.
Imagine a server sends a message to the client in ciphertext, which only the server and client can understand, ensuring data confidentiality. Simultaneously, verifying the other party's legal identity before exchanging data can ensure both parties' security. However, the question arises: how can the client understand the data after the server encrypts it? The server must provide the client with the encryption key (symmetric key, explained in detail later), allowing the client to decrypt the content using the symmetric key. But if the server sends this symmetric key to the client in plaintext, it can still be intercepted by a middleman. The middleman would then know the symmetric key, which still cannot ensure the confidentiality of the communication. But if the server sends the symmetric key to the client in ciphertext, how can the client decrypt the ciphertext and obtain the symmetric key?
At this point, we introduce the concept of asymmetric encryption and decryption. In asymmetric encryption and decryption algorithms, data encrypted with a public key can only be decrypted by a unique private key. Therefore, as long as the server sends the public key to the client, the client can use this public key to encrypt the symmetric key for data transmission. When the client sends the symmetric key to the server using the public key, even if a middleman intercepts the information, they cannot decrypt it because the private key is only deployed on the server, and no one else has the private key. Therefore, only the server can decrypt it.
After the server receives the client's message and decrypts it with the private key, it obtains the symmetric key used for data encryption and decryption, and uses this key for all subsequent communication. In addition, exchanging symmetric keys through asymmetric encryption makes them easy to manage and ensures that each round of data encryption uses a different symmetric key. This way, even if malware on the client retrieves cached communication data, it cannot use that data to steal the content of normal communication.
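The key exchange described above can be sketched with textbook RSA. This is only an illustration under loud assumptions: the primes 61 and 53 and the names `symmetric_key`, `wrapped_key`, and `recovered_key` are hypothetical, the modulus is far too small for real use, and real TLS adds padding and many other safeguards.

```python
import secrets

# Toy RSA key pair owned by the server (textbook-sized numbers, NOT secure).
P, Q = 61, 53
N = P * Q                           # public modulus (3233)
E = 17                              # public exponent, sent to the client
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, never leaves the server

# Client side: choose a random symmetric key and encrypt it with the public key.
symmetric_key = secrets.randbelow(N - 2) + 2   # must be smaller than N
wrapped_key = pow(symmetric_key, E, N)         # this is what travels on the wire

# Server side: only the holder of D can recover the symmetric key.
recovered_key = pow(wrapped_key, D, N)

print(recovered_key == symmetric_key)
```

A middleman who sees only `wrapped_key`, `N`, and `E` would have to factor `N` to recover the key, which is trivial here but infeasible for a properly sized modulus.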
However, this seems to be insufficient. If during the communication process, a middleman hijacks the client's request during the three-way handshake or when the client initiates an HTTP request, the middleman can impersonate a "fake client" and communicate with the server. The middleman can also impersonate a "fake server" and communicate with the client. Next, we will elaborate on the process of the middleman obtaining the symmetric key:
When the middleman receives the public key sent by the server to the client (here, the "correct public key"), they do not send it to the client. Instead, the middleman sends their public key (the middleman also has a pair of public and private keys, referred to here as the "forged public key") to the client. Afterward, the client encrypts the symmetric key with this "forged public key" and sends it through the middleman. The middleman can then use their private key to decrypt the data and obtain the symmetric key. At this point, the middleman re-encrypts the symmetric key with the "correct public key" and sends it back to the server. Now, the client, middleman, and server all have the same symmetric key, and the middleman can decrypt all subsequent encrypted data between the client and server using the symmetric key.
To solve this problem, we introduce the concept of digital certificates. The server first generates a public-private key pair and provides the public key to a certificate authority (CA). The CA puts the public key into a digital certificate and issues it to the server. From then on, the server no longer hands the bare public key to the client; it hands over the digital certificate instead. The certificate carries a digital signature that lets the client confirm that the certificate really was issued for the server. A forged certificate sent by the middleman fails this verification, so the client can tell that the communication has been hijacked.
In summary, combining the above three points ensures secure communication: using an asymmetric encryption algorithm (public key and private key) to exchange symmetric keys, utilizing digital certificates to verify identity (checking whether the public key is forged), and employing symmetric keys to encrypt and decrypt subsequent transmitted data. This combination of methods results in secure communication.
Why provide a simple introduction to the HTTPS protocol? Because HTTPS involves many components, especially the encryption and decryption algorithms, which are very complex. The author cannot fully explore these algorithms and only understands some of the basics. This section is just a brief introduction to some of the most fundamental principles of HTTPS, laying the theoretical foundation for later analysis of the HTTPS establishment process and optimization, among other topics.
Symmetric encryption refers to an algorithm that uses the same key for encryption and decryption. It requires the sender and receiver to agree on a symmetric key before secure communication. The security of symmetric algorithms relies entirely on the key, and the leakage of the key means that anyone can decrypt the messages they send or receive. Therefore, the confidentiality of the key is crucial to communication.
3.1.1 Symmetric encryption is divided into two modes: stream encryption and block encryption
Stream encryption treats the message as a byte stream and applies mathematical functions to each byte separately. When using stream encryption, each encryption will convert the same plaintext bit into different ciphertext bits. Stream encryption uses a key stream generator, which generates a byte stream that is XORed with the plaintext byte stream to generate ciphertext.
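A minimal sketch of the keystream idea follows. The keystream generator here (SHA-256 over the key and a counter) is an assumption chosen only because it needs nothing beyond the standard library; it is not one of the stream ciphers used in practice, and the function names are hypothetical.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Generate `length` pseudo-random bytes from the key (illustrative only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def stream_xor(key: bytes, data: bytes) -> bytes:
    """XOR each data byte with the matching keystream byte."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

key = b"shared-secret"
ciphertext = stream_xor(key, b"AAAA")      # identical plaintext bytes go in
plaintext = stream_xor(key, ciphertext)    # XOR with the same stream undoes it
print(ciphertext.hex(), plaintext)
```

Because XOR is its own inverse, the same function serves for both encryption and decryption as long as both sides generate the same keystream.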
Block encryption divides the message into several groups, which are then processed by mathematical functions, one group at a time. For example, a 64-bit block cipher is used, and the message length is 640 bits. It will be divided into ten 64-bit groups (if the last group is less than 64 bits, it will be padded with zeros to reach 64 bits). Each group is processed using a series of mathematical formulas, resulting in ten encrypted text groups. Then, this ciphertext message is sent to the other end. The other end must have the same block cipher and use the previous algorithm in reverse order to decrypt the ten ciphertext groups, ultimately obtaining the plaintext message. Some commonly used block encryption algorithms are DES, 3DES, and AES. Among them, DES is an older encryption algorithm, which has now been proven to be insecure. 3DES is a transitional encryption algorithm, which is equivalent to tripling the operation on the basis of DES to improve security, but its essence is still consistent with the DES algorithm. AES is a substitute algorithm for DES and is one of the most secure symmetric encryption algorithms currently available.
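The grouping step in the 640-bit example can be sketched as follows. The zero padding mirrors the description above; note as an aside that zero padding is ambiguous when the message itself ends in zero bytes, so real protocols generally use a padding scheme that records its own length. The function name is hypothetical.

```python
from typing import List

BLOCK_BYTES = 8  # a 64-bit block is 8 bytes

def split_into_blocks(message: bytes, block_bytes: int = BLOCK_BYTES) -> List[bytes]:
    """Cut the message into fixed-size blocks, zero-padding the last one."""
    blocks = []
    for start in range(0, len(message), block_bytes):
        block = message[start:start + block_bytes]
        blocks.append(block.ljust(block_bytes, b"\x00"))
    return blocks

message = bytes(80)                  # 80 bytes = 640 bits
print(len(split_into_blocks(message)))
print(split_into_blocks(b"abc"))     # a short message becomes one padded block
```

Each resulting block would then be passed through the block cipher (DES, 3DES, or AES in the list above) one at a time.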
3.1.2 Advantages and disadvantages of symmetric encryption algorithms:
Advantages: Symmetric encryption algorithms have low computational complexity, fast encryption speed, and high encryption efficiency.
Disadvantages:
(1) Both parties involved in the transaction share the same key, so security is lost as soon as either side leaks it;
(2) Each pair of communicating parties needs its own unique key that is unknown to everyone else. The number of keys that senders and receivers must hold therefore grows rapidly with the number of peers, making key management a burden.
Before the advent of asymmetric key exchange algorithms, the main drawback of symmetric encryption was not knowing how to transmit symmetric keys between the communicating parties without allowing middlemen to steal them. After the birth of asymmetric key exchange algorithms, they were specifically designed for encrypting and decrypting symmetric key transmissions, making the interaction and transmission of symmetric keys very secure.
Asymmetric key exchange algorithms themselves are very complex, and the key exchange process involves random number generation, modular exponentiation, blank padding, encryption, signing, and a series of extremely complex processes. The author has not fully researched these algorithms. Common key exchange algorithms include RSA, ECDHE, DH, and DHE. These involve relatively complex mathematical problems. Among them, the most classic and commonly used is the RSA algorithm.
RSA: Born in 1977, it has undergone a long period of cracking tests and has a high level of algorithm security. Most importantly, the algorithm implementation is very simple. The disadvantage is that it requires a relatively large key to ensure security strength (a 2048-bit modulus, the product of two large primes, is currently common), which consumes a lot of CPU computing resources. RSA is currently the only algorithm that can be used for both key exchange and certificate signing. RSA is the most classic and also the most commonly used asymmetric encryption and decryption algorithm.
3.2.1 Asymmetric encryption is more secure than symmetric encryption, but it also has two significant drawbacks:
(1) CPU computing resources are heavily consumed. In a complete TLS handshake, the asymmetric decryption computation during key exchange accounts for more than 90% of the entire handshake process. The computational complexity of symmetric encryption is only 0.1% of that of asymmetric encryption. If the subsequent application layer data transmission process also uses asymmetric encryption and decryption, the CPU performance overhead would be too enormous for the server to bear. Experimental data from Symantec shows that for encrypting and decrypting the same number of files, asymmetric algorithms consume over 1000 times more CPU resources than symmetric algorithms.
(2) Asymmetric encryption algorithms have a limit on the length of the encrypted content, which cannot exceed the public key length. For example, the currently commonly used public key length is 2048 bits, which means that the content to be encrypted cannot exceed 256 bytes (in practice slightly less, because padding takes up part of that space).
Therefore, asymmetric encryption and decryption, which consume a great deal of CPU, are currently used only for symmetric key exchange and CA signing, and are not suitable for encrypting and decrypting application layer content.
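The length limit in drawback (2) is simple arithmetic, sketched below. The helper name is hypothetical; the 11-byte overhead is the minimum required by PKCS#1 v1.5 encryption padding, used here as one concrete example of why the usable size is smaller than the key size.

```python
def max_rsa_plaintext_bytes(modulus_bits: int, padding_overhead_bytes: int = 0) -> int:
    """Upper bound on the number of bytes a single RSA operation can encrypt."""
    return modulus_bits // 8 - padding_overhead_bytes

print(max_rsa_plaintext_bytes(2048))        # raw limit for a 2048-bit key
print(max_rsa_plaintext_bytes(2048, 11))    # with PKCS#1 v1.5 padding overhead
```

A symmetric key of 16 or 32 bytes fits comfortably within this limit, which is one reason the scheme works well for key exchange but not for bulk data.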
The identity authentication part of the HTTPS protocol is handled by CA digital certificates, which contain the public key, the certificate subject, a digital signature, and other content. After the client initiates an SSL request, the server sends its digital certificate to the client, and the client verifies the certificate (checking whether the certificate, and therefore the public key, is forged). If the certificate is genuine, the client takes the public key from it and uses that key for the symmetric key exchange.
3.3.1 Digital certificates have three functions:
1. Identity authentication. Ensure that the website accessed by the browser is a trusted website verified by the CA.
2. Distributing public keys. Each digital certificate contains the registrant-generated public key (verified to ensure it is legal and not forged). During the SSL handshake, it is transmitted to the client through the certificate message.
3. Verifying certificate legitimacy. After receiving the digital certificate, the client verifies its legitimacy. Only certificates that pass the verification can proceed with subsequent communication processes.
3.3.2 The process of applying for a trusted CA digital certificate usually includes the following steps:
(1) The company (entity) server generates public and private keys, as well as a CA digital certificate request.
(2) The RA (Registration Authority, which registers and audits certificate requests) checks the legality of the entity (whether it is a registered and legitimate company in the registration system).
(3) The CA (Certificate Authority, which issues certificates) issues the certificate and sends it to the applicant entity.
(4) The certificate is updated to the repository (responsible for the storage and distribution of digital certificates and CRL content). The entity terminal subsequently updates the certificate from the repository and queries the certificate status, etc.
After the applicant obtains the CA certificate and deploys it on the website server, how can the browser confirm that the certificate is issued by the CA when initiating a handshake and receiving the certificate? How can third-party forgery of the certificate be avoided? The answer is the digital signature. Digital signatures are anti-counterfeiting labels for certificates, and the most widely used is SHA-RSA (SHA is used for the hash algorithm, and RSA is used for asymmetric encryption algorithms). The creation and verification process of digital signatures is as follows:
1. Issuance of digital signatures. First, a secure hash is performed on the content to be signed using a hash function, generating a message digest. Then, the CA's private key is used to encrypt the message digest.
2. Verification of digital signatures. Decrypt the signature using the CA's public key to recover the message digest, then hash the certificate content with the same hash function, and compare the two digests. If they are the same, the verification is considered successful.
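The two steps can be sketched with SHA-256 and textbook RSA. Everything here is an assumption made for illustration: the CA key is built from two small Mersenne primes, the digest is reduced modulo N only because this toy modulus is shorter than a real one, no signature padding is applied, and the certificate content is a made-up byte string.

```python
import hashlib

# Toy CA key pair (illustrative only; real CA keys are 2048 bits or more).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                            # CA public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # CA private exponent

def digest_as_int(content: bytes) -> int:
    """Hash the content and map the digest into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % N

def sign(content: bytes) -> int:
    """Issuance: hash the content, then 'encrypt' the digest with the private key."""
    return pow(digest_as_int(content), D, N)

def verify(content: bytes, signature: int) -> bool:
    """Verification: recover the digest with the public key and compare."""
    return pow(signature, E, N) == digest_as_int(content)

cert_body = b"subject=www.example.com; public-key=(server key bytes)"
signature = sign(cert_body)
print(verify(cert_body, signature))
print(verify(cert_body.replace(b"example", b"evil"), signature))
```

Only the CA's key pair appears in this sketch; the server's public key is merely part of the content being signed.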
It is important to note:
(1) The asymmetric keys used for digital signature issuance and verification are the CA's own public and private keys, which have nothing to do with the public key submitted by the certificate applicant (the company entity submitting the certificate application).
(2) The process of digital signature issuance is just the opposite of the public key encryption process, that is, encryption with a private key and decryption with a public key. (For a pair of public and private keys, the content encrypted by the public key can only be decrypted by the private key; conversely, the content encrypted by the private key can only be decrypted by the public key.)
(3) Nowadays, large CAs use certificate chains, which bring two benefits. The first is security: the root CA's private key can be kept offline. The second is easier deployment and revocation. Why revocation? Because if a problem occurs with a CA certificate (tampering or compromise), only the certificate at the affected level needs to be revoked, and the root certificate remains secure.
(4) Root CA certificates are self-signed, that is, the signature is created and verified with the root's own private and public keys. Every other certificate in the chain is signed and verified with the asymmetric keys of the certificate one level above it (its issuer).
(5) How to obtain the key pairs of the root CA and multi-level CA? Also, since they are self-signed and self-authenticated, are they safe and trustworthy? The answer here is: of course, they are trustworthy because these manufacturers have cooperated with browsers and operating systems, and their root public keys are installed by default in the browser or operating system environment.
The integrity of data transmission is ensured by a MAC algorithm. To prevent data in transit from being tampered with or corrupted, SSL uses MAC algorithms based on MD5 or SHA to ensure message integrity (since MD5 collisions are practical to produce, it is better not to rely on MD5 to verify content consistency). A MAC algorithm is a keyed digest algorithm: it maps a key and data of any length to a fixed-length value. The sender uses the key and the MAC algorithm to calculate the MAC value of the message, appends it to the message, and sends both to the receiver. The receiver uses the same key and MAC algorithm to calculate the MAC value of the received message and compares it with the received MAC value. If they are the same, the message has not changed; otherwise, the message has been modified or corrupted during transmission, and the receiver discards it.
Within the SHA family, SHA-0 and SHA-1 should not be used either. Professor Wang Xiaoyun of Shandong University (a very accomplished female professor; you can search for her story online if you are interested) announced in 2005 a collision attack on the full SHA-1 algorithm, and the result was recognized by industry experts. Microsoft and Google have both announced that they will no longer support SHA-1-signed certificates after 2016 and 2017.
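The sender and receiver sides of a MAC check can be sketched with Python's standard `hmac` module. HMAC-SHA256 is chosen here as an assumption for illustration; the key and message are made up, and the helper names are hypothetical.

```python
import hashlib
import hmac

def make_mac(key: bytes, message: bytes) -> bytes:
    """Sender side: compute a fixed-length tag from the key and the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def check_mac(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_mac(key, message), received_tag)

key = b"session-mac-key"
message = b"transfer 100 to account 42"
tag = make_mac(key, message)

print(len(tag))                                             # fixed length
print(check_mac(key, message, tag))                         # untouched message
print(check_mac(key, b"transfer 900 to account 42", tag))   # tampered message
```

Without the key, a middleman who changes the message cannot produce a matching tag, which is what distinguishes a MAC from a plain hash.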
This article has captured packets for Baidu search twice. The first packet capture was done after clearing all browser caches; the second packet capture was done within half a minute after the first packet capture.
Baidu completed the full-site HTTPS for Baidu search in 2015, which has significant meaning in the development of HTTPS in China (currently, among the three major BAT companies, only Baidu claims to have completed full-site HTTPS). Therefore, this article takes www.baidu.com as an example for analysis.
At the same time, the author uses the Chrome browser, which supports the SNI (Server Name Indication) feature, which is very useful for HTTPS performance optimization.
Note: SNI is an SSL/TLS extension designed to solve the problem of one server hosting multiple domain names and certificates. In a nutshell, it works like this: when establishing an SSL connection, the client first sends the domain name (hostname) it wants to access, so that the server can return the certificate that matches this domain name. Currently, most operating systems and browsers support the SNI extension very well. OpenSSL 0.9.8 has this feature built in, and new versions of Nginx and Apache also support the SNI extension.
The URL visited by this packet capture is: http://www.baidu.com/
(If it is https://www.baidu.com/, the results below will be different!)
Packet capture results:
As can be seen, Baidu adopts the following strategies:
(1) For newer browsers that support HTTPS with a protocol version above TLS 1.0, all HTTP requests are redirected to HTTPS.
(2) For HTTPS requests, they remain unchanged.
[Detailed analysis process]
As can be seen, my computer is accessing http://www.baidu.com/, and during the initial three-way handshake, the client tries to connect to port 8080 (since the network exit of my residential area has a layer of overall proxy, the client actually performs the three-way handshake with the proxy machine, and the proxy machine then helps the client to connect to the Baidu server).
Since the residential gateway has set up proxy access, the client needs to establish an "HTTPS CONNECT tunnel" with the proxy machine when accessing HTTPS. The tunnel can be understood as follows: although the traffic physically passes through the proxy machine, which relays it to the Baidu server, the tunnel lets us treat the public-private key exchange, the symmetric key exchange, and the data communication as taking place directly between the client and the Baidu server.
3.1 Random number
In the Client Hello, the first four bytes record the client's current time in Unix time format, that is, the number of seconds elapsed since 00:00:00 UTC on January 1, 1970. In this example, the value is 0x2516b84b. It is followed by 28 bytes of random data (random_C), which we will use in the subsequent process.
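The 32-byte layout (4 bytes of time plus 28 random bytes) can be split as sketched below. The function name is hypothetical, and the 28 zero bytes in the sample are placeholders, since the actual random bytes from the capture are not reproduced in the text.

```python
import struct
from typing import Tuple

def split_hello_random(random_field: bytes) -> Tuple[int, bytes]:
    """Split the 32-byte Random field into (unix_time, 28 random bytes)."""
    if len(random_field) != 32:
        raise ValueError("the Random field must be exactly 32 bytes")
    (unix_time,) = struct.unpack("!I", random_field[:4])  # big-endian uint32
    return unix_time, random_field[4:]

# Time value from the capture, followed by placeholder bytes for random_C.
sample = bytes.fromhex("2516b84b") + bytes(28)
unix_time, random_c = split_hello_random(sample)
print(unix_time, len(random_c))
```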
3.2 SID (Session ID)
If the conversation is interrupted for some reason, a handshake is required again. To avoid the inefficiency of access caused by re-handshaking, the concept of session ID is introduced. The idea of the session ID is simple: each conversation has a number (session ID). If the conversation is interrupted, the next time the connection is re-established, the client only needs to provide this number, and if the server has a record of this number, both parties can reuse the existing "symmetric key" without having to generate a new one.
Since this capture was the first visit to https://www.baidu.com within a few hours, there is no Session ID here. (Later, we will see that there is a Session ID in the second packet capture after half a minute)
Session ID is a method supported by all browsers currently, but its drawback is that the session ID is often only retained on one server. Therefore, if the client's request is sent to another server (which is very likely, for the same domain name, when the traffic is heavy, there are often dozens of RS machines providing service in the background), the conversation cannot be restored. The session ticket was born to solve this problem, and currently, only Firefox and Chrome browsers support it.
3.3 Cipher Suites
RFC2246 recommends many combinations, usually written as "key exchange algorithm-symmetric encryption algorithm-hash algorithm". For example, "TLS_RSA_WITH_AES_256_CBC_SHA":
(a) TLS is the protocol, and RSA is the key exchange algorithm;
(b) AES_256_CBC is the symmetric encryption algorithm (where 256 is the key length, and CBC is the block mode);
(c) SHA is the hash algorithm.
Browsers generally support many encryption algorithms, and the server will choose a more suitable encryption combination to send to the client based on its own business situation (such as considering security, speed, performance, and other factors).
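The naming convention can be made concrete with a small parser. This is a sketch that assumes the `TLS_<key exchange>_WITH_<cipher>_<hash>` pattern of the example above; the function name is hypothetical, and suite names that do not follow this pattern are not handled.

```python
from typing import Dict

def parse_cipher_suite(name: str) -> Dict[str, str]:
    """Split a cipher suite name into protocol, key exchange, cipher, and hash."""
    protocol, rest = name.split("_", 1)
    key_exchange, rest = rest.split("_WITH_", 1)
    cipher, hash_name = rest.rsplit("_", 1)
    return {
        "protocol": protocol,
        "key_exchange": key_exchange,
        "cipher": cipher,
        "hash": hash_name,
    }

print(parse_cipher_suite("TLS_RSA_WITH_AES_256_CBC_SHA"))
print(parse_cipher_suite("TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"))
```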
3.4 Server_name extension (generally, browsers also support SNI extension)
When we visit a website, we must first resolve the corresponding IP address of the site through DNS and access the site through the IP address. Since many times, a single IP address is shared by many sites, without the server_name field, the server would be unable to provide the appropriate digital certificate to the client. The Server_name extension allows the server to grant the corresponding certificate for the browser's request.
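Conceptually, the server-side choice looks like a lookup keyed by the requested hostname. The sketch below is a simplification with made-up hostnames and certificate file names; real servers such as Nginx or Apache do this through their own configuration, not through code like this.

```python
from typing import Optional

# Hypothetical mapping for several sites that share one IP address.
CERTS_BY_HOSTNAME = {
    "www.example.com": "example-cert.pem",
    "shop.example.org": "shop-cert.pem",
}
DEFAULT_CERT = "default-cert.pem"

def select_certificate(server_name: Optional[str]) -> str:
    """Pick the certificate for the hostname sent in the server_name extension."""
    if server_name is None:
        # Client sent no server_name: the server can only guess.
        return DEFAULT_CERT
    return CERTS_BY_HOSTNAME.get(server_name.lower(), DEFAULT_CERT)

print(select_certificate("shop.example.org"))
print(select_certificate(None))
```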
Server response
(Includes Server Hello, Certificate, Certificate Status)
After receiving the client hello, the server will reply with three packets. Let's take a look at each:
4.1 We get the server's UTC recorded in Unix time format and the 28-byte random number (random_S).
4.2 Session ID, the server generally has three choices for the session ID (later, we will see that there is a Session ID in the second packet capture after half a minute):
(1) Recovered session ID: As we mentioned earlier in the client hello, if the session ID in the client hello has a cache on the server, the server will try to recover this session;
(2) New session ID: There are two cases here. The first is that the session ID in the client hello is empty, in which case the server will give the client a new session ID. The second is that the server did not find a corresponding cache for the session ID in the client hello, in which case a new session ID will also be returned to the client;
(3) NULL: The server does not want this session to be recovered, so the session ID is empty.
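The three choices can be summarized as a small decision function. This is a sketch with hypothetical names; the cache is modeled as a plain dictionary from session ID to stored session state, and the 32-byte length of new IDs is an assumption.

```python
import secrets
from typing import Dict

def choose_session_id(client_session_id: bytes,
                      session_cache: Dict[bytes, bytes],
                      allow_resumption: bool = True) -> bytes:
    """Return the session ID the server would place in its Server Hello."""
    if not allow_resumption:
        return b""                     # (3) NULL: session must not be resumed
    if client_session_id and client_session_id in session_cache:
        return client_session_id       # (1) recovered session ID
    return secrets.token_bytes(32)     # (2) new ID: empty or unknown client ID

cache = {b"\x01" * 32: b"cached session state"}
print(choose_session_id(b"\x01" * 32, cache) == b"\x01" * 32)  # resumed
print(len(choose_session_id(b"", cache)))                      # fresh ID length
```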
4.3 We remember that in the client hello, the client provided multiple Cipher Suites. Among the encryption suites provided by the client, the server selected "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
(a) TLS is the protocol, and ECDHE_RSA is the key exchange algorithm (an ECDHE key exchange authenticated with RSA);
(b) AES_128_GCM is the symmetric encryption algorithm (where 128 is the key length, and GCM is the block mode);
(c) SHA256 is the hash algorithm.
This means that the server will use the ECDHE-RSA algorithm for key exchange, the AES_128_GCM symmetric encryption algorithm for encrypting data, and the SHA256 hash algorithm to ensure data integrity.
In the previous study of the HTTPS principle, we know that in order to securely send the public key to the client, the server will put the public key into the digital certificate and send it to the client (digital certificates can be self-issued, but generally, a dedicated CA organization is used to ensure security). So this message is a digital certificate, and the 4097 bytes is the length of the certificate.
Opening this certificate, we can see the specific information of the certificate. This specific information is not very intuitive through the packet capture method, but it can be viewed directly in the browser (click the green lock button in the upper left corner of the Chrome browser).
The packet we captured combines the Server Hello Done and server key exchange:
The client verifies the legality of the certificate. If the verification passes, subsequent communication will proceed; otherwise, prompts and actions will be made according to different error situations. Legality verification includes the following:
(1) Trustworthiness of the certificate chain (trusted certificate path), as described earlier;
(2) Certificate revocation: there are two mechanisms, offline CRL and online OCSP, and different clients behave differently;
(3) Expiry date, whether the certificate is within the valid time range;
(4) Domain: check whether the domain in the certificate matches the domain currently being accessed (the matching rules are analyzed later).
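Checks (3) and (4) are easy to illustrate. The sketch below is simplified under stated assumptions: the wildcard rule shown (a leading `*.` covers exactly one label) is a common convention and not the full matching logic of any particular browser, the dates are made up, and chain validation and revocation are omitted.

```python
from datetime import datetime

def within_validity(not_before: datetime, not_after: datetime, now: datetime) -> bool:
    """Check (3): the current time must fall inside the validity period."""
    return not_before <= now <= not_after

def hostname_matches(pattern: str, hostname: str) -> bool:
    """Check (4): exact match, or a leading '*.' covering exactly one label."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if pattern.startswith("*."):
        suffix = pattern[1:]                   # e.g. ".baidu.com"
        head = hostname[:-len(suffix)]         # part the wildcard must cover
        return hostname.endswith(suffix) and head != "" and "." not in head
    return pattern == hostname

print(hostname_matches("*.baidu.com", "www.baidu.com"))
print(hostname_matches("*.baidu.com", "baidu.com"))
print(within_validity(datetime(2015, 1, 1), datetime(2016, 1, 1), datetime(2015, 6, 1)))
```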
This process is very complex, here is a brief summary:
(1) First, the client uses the CA digital certificate for identity authentication and negotiates a symmetric key using asymmetric encryption.
(2) The client will transmit a "pubkey" random number to the server. After receiving it, the server generates another "pubkey" random number using a specific algorithm. The client uses these two "pubkey" random numbers to generate a pre-master random number.
(3) The client combines the random number random_C from its Client Hello, the random number random_S received in the Server Hello, and the pre-master random number, and feeds them into the symmetric key generation algorithm to produce the symmetric key enc_key: enc_key = Func(random_C, random_S, Pre-Master)
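As one concrete instance of such a key generation function, the sketch below follows the TLS 1.2 pseudo-random function with SHA-256 as defined in RFC 5246. It is an assumption that this captures what "Func" stands for here, and it covers only the first stage: TLS derives a 48-byte master secret this way and then expands it again into the actual encryption keys. The input values are placeholders.

```python
import hashlib
import hmac

def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash from RFC 5246: chain HMAC outputs until enough bytes exist."""
    out = b""
    a = seed                                              # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]

def derive_master_secret(pre_master: bytes, random_c: bytes, random_s: bytes) -> bytes:
    """master_secret = PRF(pre_master, "master secret", random_C + random_S)[:48]"""
    return p_sha256(pre_master, b"master secret" + random_c + random_s, 48)

# Placeholder inputs; real values come from the handshake messages.
pre_master = bytes(range(32))
random_c = b"\x11" * 32
random_s = b"\x22" * 32
master_secret = derive_master_secret(pre_master, random_c, random_s)
print(len(master_secret))
```

Because both random numbers enter the derivation, replaying an old handshake with a new random_S yields a different key even if the pre-master value were somehow repeated.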
Subsequent new HTTPS sessions can use session IDs or session Tickets, and the symmetric key can be reused, thus avoiding the process of HTTPS public-private key exchange, CA authentication, etc., and greatly shortening the HTTPS session connection time.