A non-techie explanation of how web pages are secured

I first published this overview back in 2004. I'm prompted to update and repost it here because the original website is no longer live (it was the website for Fuse PR back in the day), because I was talking about it with some Meanwhile colleagues this week, and because it's interesting. No seriously, it is.

When is a web page secure?

Content on a web page is secured when the URL starts "https://" rather than "http://". That said, this doesn't mean that all content on such a page is secured, but if you're on the website of a trusted source, like Amazon, you may be happy to assume that the parts that need to be secured are, by design.


If you'd like to find out what other information your browser gives you about the level of security for a particular web page, check out the relevant link here:

What do we mean by 'secured'?

We mean that information sent from your browser to the web server, and sensitive information sent from the web server to your browser, is encrypted so that, if the data were intercepted, it would be meaningless to the interceptor.

Occasionally, a web page can be secure without any of the visual indications or confirmations described above. This is when a secure web page is served within (a frame of) another webpage. This is poor website design, and if you are unsure if a web page is secure, act as if it's definitely not.

Why not make all web pages secure?

Cryptography places a considerable calculation burden on the processors of web servers and of your own computer. Whilst the latter is unlikely to be taxed for long, highly popular websites would really struggle to encrypt all pages, and those pages would take longer to load. Therefore, browsing the latest weather forecast or tonight's TV schedule isn't encrypted, but your online banking definitely is.

How does this encryption work exactly? Let's get the acronyms out of the way, and then we'll take a look at the basic mathematics. If you can read clocks then the mathematical basis is a breeze!

Acronyms

Hypertext Transfer Protocol (HTTP) is a set of rules for transferring data files over the World Wide Web. Transmission Control Protocol (TCP) is used with Internet Protocol (IP) to divide this data up into manageable little packets for efficient shipping across the Internet.

The Secure Sockets Layer (SSL) is inserted between the HTTP and TCP layers to undertake the encryption and decryption task for secure web pages. Originally developed by the company behind the first popular browser, Netscape (which effectively became Firefox upon the formation of the Mozilla Foundation), SSL reached near ubiquitous application across all makes of browser.

Like all technologies, SSL has evolved and since SSL 3.0 it has become something called Transport Layer Security (TLS). This improved protocol is included in all modern browsers.

I have a key to secure my home, a key to secure my car.  Where's the key here?

SSL and TLS use something known as public-and-private key encryption from a company called RSA. This is a different kind of key to the physical ones you use for your home or car. The most striking difference is that, whilst the same key is used to lock and unlock your front door, SSL uses one key to lock (encrypt) the information and another key to unlock it.

This feature is critical to its success. It means that there is no need to restrict access to or be secretive about the key used to lock the information as it is useless for unlocking the information. This key is therefore known as the public key. The unlocking key is known as the private key.

It is the openness surrounding the public key that means the general user is unaware of the process being undertaken. As there is no security risk associated with knowing the public key, your browser automatically requests the public key for locking information on secure web pages. It just goes ahead and locks it up.

Walk me through this locking and unlocking process

For anyone interested in mathematics, this whole cryptography revolution harks back to the work done on clock calculators by Gauss and on a theorem proved by Fermat – not his last one, but one known as Fermat's Little Theorem. In hindsight, it's amazingly simple.

If you ask anyone to add the numbers 9 and 4 you will get the answer 13. Similarly, if you ask them at 9 o'clock what the time will be in 4 hours, they will tell you 1 o'clock. Why do we get the answers 13 and 1 to very similar questions? In the instance of telling the time we know there are 12 hours on the clock, so we are actually adding 9 and 4 and, if it is greater than 12, subtracting 12. We keep moving round a clock with 12 numbers.

Another question could be "It's 9 o'clock now, what time will it be in 20 hours?" in which case the answer is 9 + 20 - 12 - 12 = 5 o'clock. We keep subtracting twelves until we get an answer that lies between 0 and 12. This is known as modular arithmetic: a form of arithmetic where numbers are considered equal if they leave the same remainder when divided by the same number (the modulus).

In modular arithmetic where the modulus is 12 (as for our clock example here):

9 = 21 = 33 = 45 because

  • 12's go into 21 once, leaving a remainder of 9
  • 12's go into 33 twice, leaving a remainder of 9
  • 12's go into 45 three times, leaving a remainder of 9.
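If you'd like to see the clock arithmetic above done by a computer, here is a minimal sketch in Python. The function name clock_time is just something made up for this illustration; the % operator is Python's built-in way of asking for a remainder.

```python
def clock_time(start, hours_ahead, hours_on_clock=12):
    """Return the time shown after moving hours_ahead around the clock."""
    remainder = (start + hours_ahead) % hours_on_clock
    # A clock face shows 12 rather than 0, so map a remainder of 0 back to 12.
    return remainder if remainder != 0 else hours_on_clock


print(clock_time(9, 4))    # 1
print(clock_time(9, 20))   # 5

# 9, 21, 33 and 45 all leave the same remainder when divided by 12.
print([n % 12 for n in (9, 21, 33, 45)])  # [9, 9, 9, 9]
```

The computer doesn't subtract twelves one at a time; it simply divides and keeps the remainder, which comes to the same thing.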

The slightly harder bit...

Gauss found an appealing characteristic, related to an earlier discovery by Fermat, when he undertook similar calculations using clocks with a prime number of hours on them instead of 12. A prime number is a whole number greater than 1 that cannot be divided exactly by any number except itself and 1. The numbers 2, 3, 5, 7, 11, 13, 17 and 19 are all prime numbers. All the other numbers between 2 and 20 are not prime as they can be divided exactly by other numbers. For example 15 can be divided exactly by itself and 1, but also by 3 and 5.

When using a prime number clock with P hours, if you take a number X and raise it to the power P then you get back to the same number you started with.

For example, using a 7-hour clock (P=7) and the initial number 3 (X=3), 3 to the power 7 = 3 x 3 x 3 x 3 x 3 x 3 x 3 = 2187.

Sevens go into this number 312 times, leaving a remainder of 3. Back where we started. Or to put it another way, we go forward 2187 hours on our 7-hour clock and see what time we come to: 3 o'clock.
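The 7-hour clock sum can be checked in a couple of lines of Python. This is only a sketch: the function name returns_to_start is invented for this example, and it leans on Python's built-in pow, which can work out a power and a remainder in one go.

```python
def returns_to_start(x, p):
    """Check whether x raised to the power p lands back on x on a p-hour clock."""
    return pow(x, p, p) == x % p


print(3 ** 7)            # 2187
print(divmod(2187, 7))   # (312, 3): sevens go in 312 times, remainder 3

# Every starting number on the 7-hour clock comes back to itself.
print(all(returns_to_start(x, 7) for x in range(7)))  # True

# On a 12-hour clock (12 is not prime) the pattern can fail.
print(returns_to_start(2, 12))  # False
```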

Although Fermat claimed to have proved this theorem, he died before telling anyone how! It was left to another distinguished mathematician, Leonhard Euler, to provide the proof in 1736 that this worked for all prime numbers and any number X.

Euler took things further by looking at semiprime numbers too. A semiprime can only be divided by itself, 1 and two prime numbers. In other words, a semiprime N = p x q where both p and q are prime numbers. For semiprime number clocks, Euler found that the pattern got back to the beginning after raising the original number to the power of (p-1) x (q-1) + 1.
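Euler's pattern for semiprime clocks can be tried out on a small scale too. The sketch below assumes a toy clock made from the primes 3 and 11 (chosen purely for illustration, as is the function name), giving 33 hours and a power of (3-1) x (11-1) + 1 = 21.

```python
def euler_returns_to_start(x, p, q):
    """Check Euler's pattern for x on a clock with p * q hours.

    p and q are assumed to be two different prime numbers.
    """
    n = p * q
    power = (p - 1) * (q - 1) + 1
    return pow(x, power, n) == x % n


p, q = 3, 11
print(p * q, (p - 1) * (q - 1) + 1)  # 33 21

# Every number on the 33-hour clock comes back to itself after 21 steps of multiplying.
print(all(euler_returns_to_start(x, p, q) for x in range(p * q)))  # True
```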

Let's go shopping

We are now nearly there. Let's look at how you give Amazon your credit card number, securely.

Amazon's computers select two very large prime numbers, p and q, each hundreds of digits long, and multiply them together to make a third number N. We are therefore using a clock with a massive number of hours. Massive. In fact, the number is far bigger than the number of atoms in the universe!

The number N is published as part of the public key, but p and q are kept secret. It is very very difficult, almost impossible without many years and an incredibly powerful computer, to work out what p and q are from N. In fact it is so secure that Amazon will continue to use the same number N for several months.
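To get a feel for why working back from N to p and q is so hard, here is a rough sketch of the most obvious approach: try every possible divisor in turn. The function name and the toy number 3233 are assumptions made up for this illustration. Real attackers use cleverer methods than this, but none that are known to cope with numbers of the size used in practice.

```python
def find_prime_factors(n):
    """Find p and q for a semiprime n by trying each possible divisor in turn."""
    candidate = 2
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate, n // candidate
        candidate += 1
    return None  # nothing divides n exactly, so n is itself prime


# A four-digit N gives up its secret almost instantly.
print(find_prime_factors(3233))  # (53, 61)
```

Each extra digit on p and q multiplies the number of candidates to try by roughly ten, so a loop that is instant for a four-digit N becomes hopeless long before N reaches hundreds of digits.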

The other part of the public key is called the encoding number, E. So now what happens to your credit card number C (or you might consider C to stand for the digital representation of any content to be secured)?

Your browser does a calculation on C based on the clock with the massive number of hours and the encoding number E. It raises C to the power E and works out what the number is on the clock in the same way we did for much smaller numbers above, and transmits this number to Amazon. In other words, your browser has used the public key to encrypt your credit card number.
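The locking step looks like this on a toy scale. Everything here is an assumption for the sake of illustration: a tiny public key of N = 3233 and E = 17, the number 65 standing in for the credit card number C, and the function name encrypt. A real browser does a good deal more than this, but the core sum is the same.

```python
def encrypt(c, e, n):
    """Raise c to the power e and read the result off a clock with n hours."""
    return pow(c, e, n)


N = 3233   # toy public clock size (53 x 61), far too small for real use
E = 17     # toy encoding number
C = 65     # stand-in for the content to be secured

locked = encrypt(C, E, N)
print(locked)  # 2790
```

The number 2790 is what would travel across the Internet, and on its own it gives no obvious hint that it started life as 65.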

If anyone intercepted the transmission they could not calculate your credit card number. They know Amazon's public key (N and E) but you cannot use these to reverse the calculation.

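However, Amazon can calculate the credit card number because they know p and q, which are what make up the private key. They know that if your credit card number is raised to the power of (p-1) x (q-1) + 1, the same number reappears, and that it keeps reappearing each time the power goes up by a further (p-1) x (q-1).

As your browser has raised the number to the power of E already, it simply remains for Amazon's computer to raise the result to a further power, the decoding number D, on the same clock with N hours. Raising to the power E and then to the power D is the same as raising to the power E x D, so Amazon uses p and q to pick D such that E x D is exactly 1 more than a multiple of (p-1) x (q-1). That lands back on your credit card number. Mission accomplished, and Stephen Hawking's God Invented the Integers will be dropping on your doormat soon. That's what you were shopping for, right?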
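For the curious, here is the unlocking step as a toy sketch in Python. The primes 61 and 53, the encoding number 17 and the stand-in card number 65 are all assumptions chosen to keep the sums small. The unlocking power, called D here, is picked so that E x D is 1 more than a multiple of (p-1) x (q-1), which is what brings the clock back round to the original number. Finding D this way needs Python 3.8 or later.

```python
p, q = 61, 53              # toy secret primes, known only to the shop
N = p * q                  # 3233, the public clock size
E = 17                     # the public encoding number
steps = (p - 1) * (q - 1)  # 3120

# Pick D so that E x D leaves a remainder of 1 when divided by (p-1) x (q-1).
D = pow(E, -1, steps)
print(D, (E * D) % steps)  # 2753 1

received = pow(65, E, N)   # what the browser sent after locking 65
print(received)            # 2790

unlocked = pow(received, D, N)
print(unlocked)            # 65
```

Only someone who knows p and q can work out the 3120, and without that there is no practical way to find D.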

Next time you're securely online, think of Fermat, Gauss and Euler, and the three mathematicians at RSA who brought this work from the 17th Century and applied it to the world of the Internet - Rivest, Shamir and Adleman.