Concepts/OpenPGP For Beginners/en: Difference between revisions

Latest revision as of 11:43, 26 April 2013

Other languages:

Introduction

The effective security of a cryptography solution depends more on knowing what you do and what certain technical facts mean (and what not!) than on key lengths and the software itself. Thus this article shall give an introduction to the OpenPGP core concepts.

This tutorial is for beginners so any too complicated stuff is left out. Furthermore you will not find explanations of how to get something done with a certain software. You have the software documentation for that. This will help you to better understand the actions explained there.

There is another article that prepares you for key generation and one explaining advanced concepts.

Asymmetric keys

OpenPGP uses key pairs. This means that there is always a secret key and a public key belonging together. In contrast to the very intuitive concept of symmetric encryption (i.e. the same password encrypts and decrypts the data) this is hard to understand. Don't think about that, just accept it; the math behind this is a nightmare so for most people going into details would be a stressful waste of time.

As the names suggest: The secret key is to be known to its owner only, and the public key should be known to everyone ideally. With symmetric encryption the problem was to safely share the password with the recipient of a message. With public keys the problem has changed: Now the hard part is to be sure that you are using the correct public key (and not a forged one an attacker tries to trick you into using).

Encryption

One of the two functions of OpenPGP is encryption. You encrypt data to one ore more public keys (symmetric encryption i.e. using a password instead is possible but rarely used). For decrypting the data the secret key of one of the recipient keys is needed.

Except for the already mentioned problem "Which is the right key to encrypt for?" encryption - decryption is a quite simple operation because there is no room for misunderstanding: You encrypt something and nobody except the owner(s) of the recipient key(s) can read it. And you can either decrypt data or you can't. The decrypted data itself may be hard to understand but the decryption operation is not. But: Part of the "Which is the right key to encrypt for?" question is not only "Who is the owner of this key?" but also "Is this key secure enough for the data to be encrypted?". This is not about key length or similar but about key handling. Thus at least before sending critical information: Ask the key owner about the security level of the key!

Digital signatures

The encryption of data can be kind of reversed: Instead of creating data which only one key can understand you can create data which everyone can understand but which only one key can have created. This impossibility to create the same data without access to the respective secret key makes this data a signature. Once again: Don't ask for how this is possible unless you really like math.

One of the great things about digital cryptography is: In contrast to a handwritten signature it is very easy for everyone (OK: for everyone's computer) to check that this signature was made by a certain key. If you can relate a certain key to a person then you can also relate a digital signature to this person – unless, of course, the key has been compromised. You may have noticed: At this point the task becomes an organizational and legal one.

Technology does not solve all of your problems. And it is extremely important that you are always aware where the border between the technical and organizational problems is.

And relating a key to a person is not the hard part! That is: "What does the signature mean?" Is your interpretation of a signature legally binding for its creator? The meaning can be as low as a timestamp (which is a perfectly valid an serious application for crypto signatures!), proving nothing more than that a certain document existed at a certain time (and was not created later).

If somebody signs all his emails (to prevent address forgery) then the pure fact that he sent a certain document within such a signed email does not mean anything except for that he probably wanted to offer you a look at it. If the email (the signed part, not the unsigned subject) says something like "I accept the attached agreement" then the meaning is clear and the remaining risk of the recipient is mainly a technical one (compromised low security key).

Thus it makes sense to have different keys at different security levels: One for reasonably securing everyday tasks and another one for signing agreements (where the key policy documents explain the limits and privileges of the respective key).

In contrast to encryption the signature of data does not (technically) have an addressee. Everyone with access to the public key can check the signature. In many cases that is not a problem (or in contrast: It may even be a requirement). Instead of selecting an addressee you select the secret key which shall be used to create the signature (if you have more than one).

The big "Which is the right public key?" problem occurs with signatures, too. Not with creating signatures but when interpreting a successful signature validation. The real life question is: "What does a signature mean?" Obviously a signature by "any" key does not mean anything. Everyone can have created it. The signature itself does not state more than: "Somebody with access to the secret key decided to create this signature." This is a technical fact without relevance in real life.

Relating keys to people

This is another hard and complicated part. And because only few people do this correctly the whole system is much less secure than most people believe. You have to tell apart four components of this check. The first is the easiest: the key itself. You have to be sure that you are using the right key material (just the huge random number itself).

As keys are to big to be compared manually a secure hash is used instead. Once again: Evil math stuff you fortunately need not understand. A hash function does this: You throw any kind and amount of data at it (from a single digit to a DVD image file) and it outputs a "number" of fixed length. If it is (considered) impossible to create two different inputs which create the same output then the hash function is secure.

OpenPGP currently uses the hash function SHA-1 for identifying keys. SHA-1 has security issues but they do not affect the usage in OpenPGP. A SHA-1 value looks like this:

   7D82 FB9F D25A 2CE4 5241  6C37 BF4B 8EEF 1A57 1DF5

This is called the fingerprint of the key. There are two ways of being safe about the identity of a key (the raw key material) without third parties involved: You get either the key itself from a secure source (USB stick handed over by the key owner) or you get the fingerprint from a secure source (which is obviously much easier as you can print is on small pieces of paper, even on your business card, and spread them).

Your OpenPGP application shows you the fingerprint of the key you got from an insecure source and you compare "what is" with "what it should be". If that is the same then you can be sure about the key itself. Thus: Always have small slips of paper with your fingerprint with you.

A public OpenPGP key (a "certificate") consists of two parts: the key material and the user IDs. A user ID is just a text string. The typical usage of this string is:

   Firstname Lastname (comment) <email address>

Many user IDs do not have a comment, some do not have an email address and there are keys without a (real) name, too (e.g. for anonymous usage). Even if you are sure about the fingerprint the name, email, and comment can be wrong.

Email is rather easy to check (send an encrypted message to the address and wait for a response which guarantees your message to be decrypted).

Checking the identity of unknown persons is not easy. At keysigning parties this is done by checking passports and the like. But would you recognize a well forged passport?

Fortunately for your own purposes the identity is usually not so important. "The one I met on that event who calls himself Peter" is usually enough. So this is more a problem for the web of trust (see below).

Comments can be critial, too: The comment "CEO of whatever inc." may make a real difference (if not to you then to somebody else). The question when to accept (and certify) a user ID is a really complicated one.

Most people don't understand this problem and thus reduce their own security and that of others. You can make this decision for others easier by having user IDs which consist of just your name or just your email address. This may be easier acceptable by someone checking your user IDs.

If you are sure about a user ID you should certify it. This means that you make a digital signature over the public key and this user ID. You can make this certification for yourself only (called a "local signature") or for the public (the "web of trust"). If a key has several user IDs then you can decide which ones you certify.

You can give a rough hint how well you have checked the user ID and key, too. It makes a big difference to OpenPGP applications (and so should it to you!) whether they recognize a key as "valid" or not.

The keys you have the secret key for are considered valid automatically. The others can become valid by signatures of your own keys. And by keys of others.

The web of trust (WoT)

In connection with OpenPGP you will often hear about the web of trust. This in an indirect method for relating people to keys, a mighty but complicated technology. Beginners should not use the web of trust but first become familiar with verifying and certifying keys directly. The WoT is explained in the article OpenPGP For Advanced Users.

Summary of key usage

you need	in order to
the public key of another person	encrypt data for him
the public key of another person	check those key's signatures (technical correctness, not key validity)
your secret subkeys	decrypt data that has been encrypted for you
your secret subkeys	create signatures for data
your secret main key	manage your key (add user IDs or subkeys, change settings like expiration date)
your secret main key	certify other keys (i.e. some or all of their user IDs)
the fingerprint of the key of another person	ensure that you have imported the right key (before certifying the key either locally or for the public)

@@ Line 19: / Line 19: @@
 One of the two functions of OpenPGP is encryption. You encrypt data to one ore more public keys (symmetric encryption i.e. using a password instead is possible but rarely used). For decrypting the data the secret key of one of the recipient keys is needed.
-Except for the already mentioned problem "Which is the right key to encrypt for?" encryption - decryption is a quite simple operation because there is no room for misunderstanding: You encrypt something and nobody except the owner(s) of the recipient key(s) can read it. And you can either decrypt data or you can't. The decrypted data itself may be hard to understand but the decryption operation is not.
+Except for the already mentioned problem "Which is the right key to encrypt for?" encryption - decryption is a quite simple operation because there is no room for misunderstanding: You encrypt something and nobody except the owner(s) of the recipient key(s) can read it. And you can either decrypt data or you can't. The decrypted data itself may be hard to understand but the decryption operation is not. But: Part of the "Which is the right key to encrypt for?" question is not only "Who is the owner of this key?" but also "Is this key secure enough for the data to be encrypted?". This is not about key length or similar but about key handling. Thus at least before sending critical information: Ask the key owner about the security level of the key!
 == Digital signatures ==
@@ Line 40: / Line 40: @@
-== Relating keys to people or organizations ==
+== Relating keys to people ==
-This is the first really hard part. And because only few people do this correctly the whole system is much less secure than most people believe. You have to tell apart four components of this check. The first is the easiest: the key itself. You have to be sure that you are using the right key material (just the huge random number itself).
+This is another hard and complicated part. And because only few people do this correctly the whole system is much less secure than most people believe. You have to tell apart four components of this check. The first is the easiest: the key itself. You have to be sure that you are using the right key material (just the huge random number itself).
 As keys are to big to be compared manually a secure hash is used instead. Once again: Evil math stuff you fortunately need not understand. A hash function does this: You throw any kind and amount of data at it (from a single digit to a DVD image file) and it outputs a "number" of fixed length. If it is (considered) impossible to create two different inputs which create the same output then the hash function is secure.
@@ Line 75: / Line 75: @@
 The keys you have the secret key for are considered valid automatically. The others can become valid by signatures of your own keys. And by keys of others.
 == The web of trust (WoT) ==