## Internet Security Lectures by Prabhaker Mateti


Abstract: Data integrity and privacy on the Internet rest primarily on using cryptography well. Unfortunately, cryptography is easily compromised by errors in (operating) system configuration. This lecture is a quick overview of cryptography as relevant in Internet security and passwords.

Data integrity and privacy on the Internet rest primarily on using cryptography well. The design and implementation of cryptography requires a deep understanding of discrete mathematics and number theory. Unfortunately, when cryptography is deployed carelessly, it is easily compromised by errors in (operating) system configuration. This lecture is a quick overview of cryptography as relevant in Internet security and passwords.

A cryptographic encryption algorithm, also known as a cipher, takes a "plain text" pt (e.g., human readable) as input and produces cipher text ct as output, so that it is possible to regenerate pt from ct through a companion decryption algorithm. Note that we said "for example, human readable" and not "that is, human readable" as an explanation for the phrase "plain text". Often, the so-called "plain text" is binary data that is unreadable to humans but ready to be used by a computer.

Ciphers use keys together with plain text as the input to produce cipher text. It is in the key that the security of a modern cipher lies, not in the details of the algorithm.

Roughly speaking, computationally infeasible means that the computation we are talking about takes far too long (hundreds of years) even on the fastest of (super)computers.

Suppose our key is a 128-bit number. There are

340,282,366,920,938,463,463,374,607,431,768,211,456

128-bit numbers, starting from zero (i.e., 128 bits of 0). To recover a particular key by brute force, one must, on average, search half the key space:

170,141,183,460,469,231,731,687,303,715,884,105,728.

If we use 1,000,000,000 machines that could each try 1,000,000,000 keys/sec, it would take all these machines longer than the universe as we know it has existed to find the key.
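The arithmetic behind this claim can be checked with a short sketch. The machine count and key rate come from the text; the figure used for the age of the universe (about 13.8 billion years) is an assumption added here for comparison.

```python
# Back-of-the-envelope estimate of a brute-force search over a 128-bit key space.
KEY_BITS = 128
MACHINES = 1_000_000_000           # number of machines, from the text
KEYS_PER_SEC = 1_000_000_000       # keys tried per second by each machine, from the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.8e9     # assumed figure, not taken from the text

key_space = 2 ** KEY_BITS                  # number of distinct 128-bit keys
average_tries = key_space // 2             # on average, half the key space is searched
seconds = average_tries / (MACHINES * KEYS_PER_SEC)
years = seconds / SECONDS_PER_YEAR

print(f"key space     : {key_space:,}")
print(f"average tries : {average_tries:,}")
print(f"years needed  : {years:.3e}")
print(f"ratio to the assumed age of the universe: {years / AGE_OF_UNIVERSE_YEARS:.0f}")
```

Under these assumptions the search takes on the order of 10^12 years, a few hundred times the assumed age of the universe.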

Computational infeasibility is not the same idea as Turing-incomputability. Nor does it mean that you cannot make a lucky guess, or heuristically arrive at a possible answer, and then systematically verify that the guessed answer is indeed the correct answer, all within a matter of seconds on a lowly PC. Here is an example: Microsoft Windows NT uses the DES encryption algorithm in storing passwords. Brute-forcing such a scrambled password to compute the plain text password can take, according to Microsoft, "about a billion years." But the L0pht team (http://www.l0pht.com) claims that L0phtCrack breaks Windows passwords in about one week, running in the background on an old Pentium PC.

In the context of cryptography, the factorization of an arbitrarily large number N into its constituent primes, that is, determining the powers n_2, n_3, n_5, n_7, etc. of the primes in the product below, is computationally infeasible -- as far as we know.

$$N = 2^{n_2} \cdot 3^{n_3} \cdot 5^{n_5} \cdot 7^{n_7} \cdots$$

Based on this, the decryption is computationally infeasible. Note that this is assuming that we are using known methods, including brute force.

Is it possible that someone, or some country, has actually discovered fast algorithms for these tasks that we believe to be computationally infeasible, but chose to keep them secret?
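To make the factorization problem concrete, here is a minimal sketch of the most naive known method, trial division. The function name `factorize` is hypothetical. It works instantly for small N, but the loop runs up to about the square root of N, so the work grows exponentially with the number of digits; that growth is what makes the approach hopeless for numbers hundreds of digits long.

```python
def factorize(n: int) -> dict[int, int]:
    """Return {prime: exponent} for n >= 2, found by trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        # Divide out every copy of p before moving on, so only primes are recorded.
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever remains has no divisor up to its square root, so it is prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


print(factorize(360))    # 360 = 2^3 * 3^2 * 5
print(factorize(3233))   # 3233 = 53 * 61, a product of two primes
```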

A hash function maps input sequences of bytes into a fixed-length sequence. The fixed length is considerably shorter than the typical length (thousands of bytes) of the input, and hence the function is a hash function.

The nature of all hash functions is that there must exist multiple input sequences that map to the same hash. The inverse is a mathematical relation, not a mathematical function. But good hash functions have the following properties: It is hard to find two strings, from the expected set of typically used strings, that would produce the same hash value. A slight change in an input string causes the hash value to change drastically.

A "one way" hash function is designed so that it is computationally infeasible to reverse the process, that is, to algorithmically discover a string that hashes to a given value.
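Both properties of the output, fixed length and drastic change under a small edit, can be observed with a short sketch. SHA-256 from Python's standard `hashlib` is used here purely as a convenient example of a good hash function; that choice, the helper names, and the sample strings are assumptions.

```python
import hashlib


def sha256_as_int(data: bytes) -> int:
    """Return the SHA-256 digest of data as one large integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which the digests of a and b differ."""
    return bin(sha256_as_int(a) ^ sha256_as_int(b)).count("1")


short_input = b"attack at dawn"
tweaked_input = b"attack at dawm"      # a single character changed
long_input = b"x" * 10_000             # thousands of bytes

# The digest length does not depend on the input length.
print(len(hashlib.sha256(short_input).digest()), "bytes")
print(len(hashlib.sha256(long_input).digest()), "bytes")

# A one-character change flips roughly half of the 256 output bits.
print(differing_bits(short_input, tweaked_input), "of 256 bits differ")
```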

One-way hash functions are also known as message digests (MD), fingerprints, or compression functions. The most popular one-way hash algorithms are MD4 and MD5 (both producing a 128-bit hash value), and SHA, also known as SHA1 (producing a 160-bit hash value).
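The digest sizes quoted above can be confirmed with Python's standard `hashlib`. This is a minimal sketch; MD4 is left out because many current builds no longer provide it, and some locked-down builds also disable MD5.

```python
import hashlib

# Print the digest size and the digest of the empty input for each algorithm.
for name in ("md5", "sha1"):
    h = hashlib.new(name, b"")
    print(f"{name}: {h.digest_size * 8} bits, digest of empty input = {h.hexdigest()}")
```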

As of 2006, both MD5 and SHA1 are considered separately broken with respect to collisions. That is, it is possible to construct two different inputs p and p' so that md5(p) == md5(p'); similarly, for SHA1, attacks that find such a pair faster than brute force are known. What is not known is whether we can construct p and p' so that md5(p) == md5(p') and sha1(p) == sha1(p').

Symmetric-key cryptography is an encryption system in which the sender and receiver of a message share a single, common key to encrypt and decrypt the message. Symmetric-key systems are simpler and faster, but their main drawback is that the two parties must somehow exchange the key in a secure way. Symmetric-key cryptography is sometimes also called secret-key cryptography.

If ct = encryption (pt, key), then pt = decryption (ct, key).
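The round-trip property can be illustrated with a toy cipher. This is a sketch only, not DES or any other standardized algorithm: it XORs the data with a keystream derived from the key by hashing, which is an assumption made here to keep the example within the standard library. The function names mirror the formula above; `keystream` is a hypothetical helper.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from `key` by hashing key || counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encryption(pt: bytes, key: bytes) -> bytes:
    """Toy symmetric encryption: XOR the plain text with the keystream."""
    return bytes(p ^ k for p, k in zip(pt, keystream(key, len(pt))))


def decryption(ct: bytes, key: bytes) -> bytes:
    """XOR is its own inverse, so decryption repeats the same operation."""
    return encryption(ct, key)


key = b"shared secret key"
pt = b"Meet me at the usual place."
ct = encryption(pt, key)
print(ct.hex())
print(decryption(ct, key))
```

The same key is needed at both ends, which is exactly the key-exchange drawback described above: anyone who learns `key` can run `decryption`.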

The most popular symmetric-key system is DES, short for Data Encryption Standard. DES was developed in 1975 and standardized by ANSI in 1981 as ANSI X3.92. DES encrypts data in 64-bit blocks using a 56-bit key. The algorithm transforms the input in a series of steps into a 64-bit output.

IDEA (International Data Encryption Algorithm) is a block cipher which uses a 128-bit key to encrypt successive 64-bit blocks of plain text. The procedure is quite complicated, using subkeys generated from the key to carry out a series of modular arithmetic and XOR operations on segments of the 64-bit plain text block. The encryption scheme uses a total of fifty-two 16-bit subkeys.

Blowfish is a symmetric block cipher that can be used as a drop-in replacement for DES or IDEA. It takes a variable-length key, from 32 bits to 448 bits, making it ideal for both domestic and exportable use. Blowfish is unpatented and license-free, and is available free for all uses.

Public key cryptography uses two keys -- a public key known to everyone, and a private or secret key that is safeguarded. Public key cryptography was invented in 1976 by Whitfield Diffie and Martin Hellman. For this reason, it is sometimes also called Diffie-Hellman encryption. It is also called asymmetric encryption because it uses two keys instead of one key. The two keys are mathematically related, yet it is computationally infeasible to deduce one from the other.

Unfortunately, public-key cryptography is about 1000 times slower than symmetric-key cryptography.

The most well-known of the public-key encryption algorithms is RSA, named after its designers Rivest, Shamir, and Adleman. The un-breakability of the algorithm is based on the belief that there is no efficient way to factor very large numbers into their primes; none is publicly known.

An example of the above numbers: rsa.txt. Look up the man page: openssl(1).

The e and d are symmetric in that using either ((n,e) or (n,d)) as the encryption key, the other can be used as the decryption key.

The only way known to find d is to know p and q. If the number n is small, p and q are easy to discover by prime factorization. Thus, p and q are chosen to be as large as possible, say, a few hundred digits long. Obviously, p and q should never be revealed, and are preferably destroyed.

Encryption is done as follows. Consider the entire message to be encrypted as a sequence of bits. Suppose the length of n in bits is b. Split the message into blocks of length b or b-1. A block viewed as a b-bit number should be less than n; if it is not, choose it to be b-1 bits long. Each block is separately encrypted, and the encryption of the entire message is the catenation of the encryption of the blocks. Let m stand for a block viewed as a number. Raise m to the power e (that is, multiply together e copies of m), and take the result modulo n as c, which is the encryption of m. That is, c = m^e mod n.

Decryption is the "inverse" operation: m = c^d mod n.
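A minimal sketch of these two formulas on a single block, using deliberately tiny numbers. The values p = 61, q = 53, e = 17 and the block m = 65 are illustrative assumptions, as is the textbook construction of d as the inverse of e modulo (p-1)(q-1); real keys use primes hundreds of digits long plus padding, which is omitted here. The helper name `rsa_apply` is hypothetical.

```python
from math import gcd

# Toy parameters (assumptions for illustration; far too small to be secure).
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, must be coprime to phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)            # private exponent (modular inverse; Python 3.8+)


def rsa_apply(block: int, key: tuple[int, int]) -> int:
    """Apply a key (modulus, exponent) to one block viewed as a number."""
    modulus, exponent = key
    if not 0 <= block < modulus:
        raise ValueError("block must be smaller than the modulus")
    return pow(block, exponent, modulus)   # fast modular exponentiation


m = 65
c = rsa_apply(m, (n, e))       # encryption: c = m^e mod n
recovered = rsa_apply(c, (n, d))   # decryption: m = c^d mod n
print(n, e, d, c, recovered)

# e and d are symmetric: encrypting with (n, d) is undone by (n, e).
assert rsa_apply(rsa_apply(m, (n, d)), (n, e)) == m
```

Python's three-argument `pow` never forms the full power m^e; it reduces modulo n at each step, which is what makes the computation practical for large exponents.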

The Digital Signature Algorithm (DSA) is a United States Federal Government standard for digital signatures.

An example of the above numbers: dsa.txt. Look up the man page: openssl(1).

Public-key systems, such as Pretty Good Privacy (PGP), are popular for transmitting information via the Internet. They are extremely secure and relatively simple to use. You need to retrieve the recipient's public key from one of several world-wide registries of public keys that now exist to encrypt a message.

When John wants to send a secure message to Jane, he uses Jane's public key to encrypt the message. Jane then uses her private key to decrypt it.

In real-world implementations, public keys are rarely used to encrypt actual messages because public-key cryptography is slow. Instead, public-key cryptography is used to distribute symmetric keys, which are then used to encrypt and decrypt actual messages, as follows:
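One common arrangement of this idea can be sketched with toy primitives. Everything concrete here is an assumption for illustration: the RSA key is built from two Mersenne primes and is far too small for real use, the symmetric cipher is a hash-based XOR stream rather than a standardized algorithm, and the names `send`, `receive`, and `xor_stream` are hypothetical.

```python
import hashlib
import secrets
from math import gcd

# Toy RSA key pair for the recipient (assumed values; both factors are Mersenne primes).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
PHI = (P - 1) * (Q - 1)
assert gcd(E, PHI) == 1
D = pow(E, -1, PHI)            # private exponent (Python 3.8+)


def xor_stream(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with SHA-256(key || counter) blocks."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def send(message: bytes, public_key: tuple[int, int]) -> tuple[int, bytes]:
    """Encrypt the message with a fresh symmetric key; wrap that key with the public key."""
    n, e = public_key
    session_key = secrets.token_bytes(16)                     # random 128-bit key
    wrapped_key = pow(int.from_bytes(session_key, "big"), e, n)
    return wrapped_key, xor_stream(message, session_key)


def receive(wrapped_key: int, ciphertext: bytes, private_key: tuple[int, int]) -> bytes:
    """Unwrap the symmetric key with the private key, then decrypt the message."""
    n, d = private_key
    session_key = pow(wrapped_key, d, n).to_bytes(16, "big")
    return xor_stream(ciphertext, session_key)


wrapped, ct = send(b"The slow public-key step only touches 16 bytes.", (N, E))
print(receive(wrapped, ct, (N, D)))
```

The expensive public-key operation is applied once, to the short session key, while the bulk of the message goes through the fast symmetric cipher.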

A digital signature is a way to authenticate to a recipient that a received object is indeed that of the sender.
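A minimal sketch of one way to build such a signature from RSA, relying on the symmetry of e and d noted earlier: the sender applies the private key to a hash of the object, and anyone can check the result with the public key. The toy key (two Mersenne primes), the use of SHA-256, the reduction of the hash modulo n, and the function names are all assumptions for illustration; real schemes add padding and use much larger keys.

```python
import hashlib

# Toy RSA key pair for the signer (assumed values, far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (Python 3.8+)


def digest_as_int(obj: bytes, n: int) -> int:
    """Hash the object and reduce the digest so it fits below the modulus."""
    return int.from_bytes(hashlib.sha256(obj).digest(), "big") % n


def sign(obj: bytes, private_key: tuple[int, int]) -> int:
    """Signature = (hash of obj)^d mod n; only the private-key holder can compute it."""
    n, d = private_key
    return pow(digest_as_int(obj, n), d, n)


def verify(obj: bytes, signature: int, public_key: tuple[int, int]) -> bool:
    """Check that signature^e mod n equals the hash of the received object."""
    n, e = public_key
    return pow(signature, e, n) == digest_as_int(obj, n)


document = b"I owe Bob 10 dollars. -- Alice"
sig = sign(document, (N, D))
print(verify(document, sig, (N, E)))                            # genuine object
print(verify(b"I owe Bob 1000 dollars. -- Alice", sig, (N, E)))  # altered object
```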

The public key-based communication between Alice and Bob described above is vulnerable to a man-in-the-middle attack.

Let us assume that Mallory, a cracker, not only can listen to the traffic between Alice and Bob, but also can modify, delete, and substitute Alice's and Bob's messages, as well as introduce new ones. Mallory can impersonate Alice when talking to Bob and impersonate Bob when talking to Alice. Here is how the attack works.

A man-in-the-middle attack works because Alice and Bob have no way to verify they are talking to each other. An independent third party that everyone trusts is needed to foil the attack. This third party could bundle the name "Bob" with Bob's public key and sign the package with its own private key. When Alice receives the signed public key from Bob, she can verify the third party's signature. This way she knows that the public key really belongs to Bob, and not Mallory.

A package containing a person's name (and possibly some other information such as an E-mail address and company name) and his public key, signed by a trusted third party, is called a digital certificate (or digital ID). An independent third party that everyone trusts, whose responsibility is to issue certificates, is called a Certification Authority (CA). A digital certificate serves two purposes. First, it provides a cryptographic key that allows another party to encrypt information for the certificate's owner. Second, it provides a measure of proof that the holder of the certificate is who they claim to be, because otherwise they will not be able to decrypt any information that was encrypted using the key in the certificate.

The recipient of an encrypted message uses the CA's public key to decode the digital certificate attached to the message, verifies it as issued by the CA, and then obtains the sender's public key and identification information held within the certificate. With this information, the recipient can send an encrypted reply.

The most widely used standard for digital certificates is X.509, which defines the following structure for public-key certificates:

You can obtain a personal certificate from companies like verisign.com or comodo.com.
