Question 1

Demonstrate your understanding of a classical cipher such as the Caesar cipher, including its process and its security value.

Question 2

Describe Kerckhoffs’s principle for cryptosystems.

Question 3

What are the challenges of key cryptosystems?

Note:

Answer the three questions above in about four pages, not including the required cover page and reference page.

• The paper should be in APA format with in-text citations. It should include:

1. an introduction

2. a body with fully developed content

3. a conclusion.

• Support your answers with readings from the course, your textbook (attached), and at least two scholarly journal articles that back up your positions, claims, and observations.

CRYPTOGRAPHY AND

NETWORK SECURITY

PRINCIPLES AND PRACTICE

SEVENTH EDITION

GLOBAL EDITION

William Stallings

Boston Columbus Indianapolis New York San Francisco Hoboken

Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montréal Toronto

Delhi Mexico City São Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo

For Tricia: never dull, never boring,

the smartest and bravest person

I know

ISBN 10: 1-292-15858-1

ISBN 13: 978-1-292-15858-7

10 9 8 7 6 5 4 3 2 1

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library

Vice President and Editorial Director, ECS:

Marcia J. Horton

Executive Editor: Tracy Johnson (Dunkelberger)

Editorial Assistant: Kristy Alaura

Acquisitions Editor, Global Editions: Abhijit Baroi

Program Manager: Carole Snyder

Project Manager: Robert Engelhardt

Project Editor, Global Editions: K.K. Neelakantan

Media Team Lead: Steve Wright

R&P Manager: Rachel Youdelman

R&P Senior Project Manager: William Opaluch

Senior Operations Specialist: Maura Zaldivar-Garcia

Inventory Manager: Meredith Maresca

Senior Manufacturing Controller, Global Editions:

Trudy Kimber

Media Production Manager, Global Editions:

Vikram Kumar

Product Marketing Manager: Bram Van Kempen

Marketing Assistant: Jon Bryant

Cover Designer: Lumina Datamatics

Cover Art: © goghy73 / Shutterstock

Full-Service Project Management:

Chandrakala Prakash, SPi Global

Composition: SPi Global

Credits and acknowledgments borrowed from other sources and reproduced, with permission, in this textbook

appear on page 753.

© Pearson Education Limited 2017

The right of William Stallings to be identified as the author of this work has been asserted by him in accordance

with the Copyright, Designs and Patents Act 1988.

Authorized adaptation from the United States edition, entitled Cryptography and Network Security: Principles and

Practice, 7th Edition, ISBN 978-0-13-444428-4, by William Stallings published by Pearson Education © 2017.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in

any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without either the prior

written permission of the publisher or a license permitting restricted copying in the United Kingdom issued by the

Copyright Licensing Agency Ltd, Saffron House, 6–10 Kirby Street, London EC1N 8TS.

All trademarks used herein are the property of their respective owners. The use of any trademark in this text does

not vest in the author or publisher any trademark ownership rights in such trademarks, nor does the use of such

trademarks imply any affiliation with or endorsement of this book by such owners.

Pearson Education Limited

Edinburgh Gate

Harlow

Essex CM20 2JE

England

and Associated Companies throughout the world

Visit us on the World Wide Web at:

www.pearsonglobaleditions.com

Typeset by SPi Global

Printed and bound in Malaysia.

CONTENTS

Notation 10

Preface 12

About the Author 18

PART ONE: BACKGROUND 19

Chapter 1 Computer and Network Security Concepts 19

1.1 Computer Security Concepts 21

1.2 The OSI Security Architecture 26

1.3 Security Attacks 27

1.4 Security Services 29

1.5 Security Mechanisms 32

1.6 Fundamental Security Design Principles 34

1.7 Attack Surfaces and Attack Trees 37

1.8 A Model for Network Security 41

1.9 Standards 43

1.10 Key Terms, Review Questions, and Problems 44

Chapter 2 Introduction to Number Theory 46

2.1 Divisibility and the Division Algorithm 47

2.2 The Euclidean Algorithm 49

2.3 Modular Arithmetic 53

2.4 Prime Numbers 61

2.5 Fermat’s and Euler’s Theorems 64

2.6 Testing for Primality 68

2.7 The Chinese Remainder Theorem 71

2.8 Discrete Logarithms 73

2.9 Key Terms, Review Questions, and Problems 78

Appendix 2A The Meaning of Mod 82

PART TWO: SYMMETRIC CIPHERS 85

Chapter 3 Classical Encryption Techniques 85

3.1 Symmetric Cipher Model 86

3.2 Substitution Techniques 92

3.3 Transposition Techniques 107

3.4 Rotor Machines 108

3.5 Steganography 110

3.6 Key Terms, Review Questions, and Problems 112

Chapter 4 Block Ciphers and the Data Encryption Standard 118

4.1 Traditional Block Cipher Structure 119

4.2 The Data Encryption Standard 129

4.3 A DES Example 131

4.4 The Strength of DES 134

4.5 Block Cipher Design Principles 135

4.6 Key Terms, Review Questions, and Problems 137

Chapter 5 Finite Fields 141

5.1 Groups 143

5.2 Rings 145

5.3 Fields 146

5.4 Finite Fields of the Form GF(p) 147

5.5 Polynomial Arithmetic 151

5.6 Finite Fields of the Form GF(2^n) 157

5.7 Key Terms, Review Questions, and Problems 169

Chapter 6 Advanced Encryption Standard 171

6.1 Finite Field Arithmetic 172

6.2 AES Structure 174

6.3 AES Transformation Functions 179

6.4 AES Key Expansion 190

6.5 An AES Example 193

6.6 AES Implementation 197

6.7 Key Terms, Review Questions, and Problems 202

Appendix 6A Polynomials with Coefficients in GF(2^8) 203

Chapter 7 Block Cipher Operation 207

7.1 Multiple Encryption and Triple DES 208

7.2 Electronic Codebook 213

7.3 Cipher Block Chaining Mode 216

7.4 Cipher Feedback Mode 218

7.5 Output Feedback Mode 220

7.6 Counter Mode 222

7.7 XTS-AES Mode for Block-Oriented Storage Devices 224

7.8 Format-Preserving Encryption 231

7.9 Key Terms, Review Questions, and Problems 245

Chapter 8 Random Bit Generation and Stream Ciphers 250

8.1 Principles of Pseudorandom Number Generation 252

8.2 Pseudorandom Number Generators 258

8.3 Pseudorandom Number Generation Using a Block Cipher 261

8.4 Stream Ciphers 267

8.5 RC4 269

8.6 True Random Number Generators 271

8.7 Key Terms, Review Questions, and Problems 280

PART THREE: ASYMMETRIC CIPHERS 283

Chapter 9 Public-Key Cryptography and RSA 283

9.1 Principles of Public-Key Cryptosystems 285

9.2 The RSA Algorithm 294

9.3 Key Terms, Review Questions, and Problems 308

Chapter 10 Other Public-Key Cryptosystems 313

10.1 Diffie-Hellman Key Exchange 314

10.2 Elgamal Cryptographic System 318

10.3 Elliptic Curve Arithmetic 321

10.4 Elliptic Curve Cryptography 330

10.5 Pseudorandom Number Generation Based on an Asymmetric Cipher 334

10.6 Key Terms, Review Questions, and Problems 336

PART FOUR: CRYPTOGRAPHIC DATA INTEGRITY ALGORITHMS 339

Chapter 11 Cryptographic Hash Functions 339

11.1 Applications of Cryptographic Hash Functions 341

11.2 Two Simple Hash Functions 346

11.3 Requirements and Security 348

11.4 Hash Functions Based on Cipher Block Chaining 354

11.5 Secure Hash Algorithm (SHA) 355

11.6 SHA-3 365

11.7 Key Terms, Review Questions, and Problems 377

Chapter 12 Message Authentication Codes 381

12.1 Message Authentication Requirements 382

12.2 Message Authentication Functions 383

12.3 Requirements for Message Authentication Codes 391

12.4 Security of MACs 393

12.5 MACs Based on Hash Functions: HMAC 394

12.6 MACs Based on Block Ciphers: DAA and CMAC 399

12.7 Authenticated Encryption: CCM and GCM 402

12.8 Key Wrapping 408

12.9 Pseudorandom Number Generation Using Hash Functions and MACs 413

12.10 Key Terms, Review Questions, and Problems 416

Chapter 13 Digital Signatures 419

13.1 Digital Signatures 421

13.2 Elgamal Digital Signature Scheme 424

13.3 Schnorr Digital Signature Scheme 425

13.4 NIST Digital Signature Algorithm 426

13.5 Elliptic Curve Digital Signature Algorithm 430

13.6 RSA-PSS Digital Signature Algorithm 433

13.7 Key Terms, Review Questions, and Problems 438

PART FIVE: MUTUAL TRUST 441

Chapter 14 Key Management and Distribution 441

14.1 Symmetric Key Distribution Using Symmetric Encryption 442

14.2 Symmetric Key Distribution Using Asymmetric Encryption 451

14.3 Distribution of Public Keys 454

14.4 X.509 Certificates 459

14.5 Public-Key Infrastructure 467

14.6 Key Terms, Review Questions, and Problems 469

Chapter 15 User Authentication 473

15.1 Remote User-Authentication Principles 474

15.2 Remote User-Authentication Using Symmetric Encryption 478

15.3 Kerberos 482

15.4 Remote User-Authentication Using Asymmetric Encryption 500

15.5 Federated Identity Management 502

15.6 Personal Identity Verification 508

15.7 Key Terms, Review Questions, and Problems 515

PART SIX: NETWORK AND INTERNET SECURITY 519

Chapter 16 Network Access Control and Cloud Security 519

16.1 Network Access Control 520

16.2 Extensible Authentication Protocol 523

16.3 IEEE 802.1X Port-Based Network Access Control 527

16.4 Cloud Computing 529

16.5 Cloud Security Risks and Countermeasures 535

16.6 Data Protection in the Cloud 537

16.7 Cloud Security as a Service 541

16.8 Addressing Cloud Computing Security Concerns 544

16.9 Key Terms, Review Questions, and Problems 545

Chapter 17 Transport-Level Security 546

17.1 Web Security Considerations 547

17.2 Transport Layer Security 549

17.3 HTTPS 566

17.4 Secure Shell (SSH) 567

17.5 Key Terms, Review Questions, and Problems 579

Chapter 18 Wireless Network Security 581

18.1 Wireless Security 582

18.2 Mobile Device Security 585

18.3 IEEE 802.11 Wireless LAN Overview 589

18.4 IEEE 802.11i Wireless LAN Security 595

18.5 Key Terms, Review Questions, and Problems 610

Chapter 19 Electronic Mail Security 612

19.1 Internet Mail Architecture 613

19.2 Email Formats 617

19.3 Email Threats and Comprehensive Email Security 625

19.4 S/MIME 627

19.5 Pretty Good Privacy 638

19.6 DNSSEC 639

19.7 DNS-Based Authentication of Named Entities 643

19.8 Sender Policy Framework 645

19.9 DomainKeys Identified Mail 648

19.10 Domain-Based Message Authentication, Reporting, and Conformance 654

19.11 Key Terms, Review Questions, and Problems 659

Chapter 20 IP Security 661

20.1 IP Security Overview 662

20.2 IP Security Policy 668

20.3 Encapsulating Security Payload 673

20.4 Combining Security Associations 681

20.5 Internet Key Exchange 684

20.6 Cryptographic Suites 692

20.7 Key Terms, Review Questions, and Problems 694

APPENDICES 696

Appendix A Projects for Teaching Cryptography and Network Security 696

A.1 Sage Computer Algebra Projects 697

A.2 Hacking Project 698

A.3 Block Cipher Projects 699

A.4 Laboratory Exercises 699

A.5 Research Projects 699

A.6 Programming Projects 700

A.7 Practical Security Assessments 700

A.8 Firewall Projects 701

A.9 Case Studies 701

A.10 Writing Assignments 701

A.11 Reading/Report Assignments 702

A.12 Discussion Topics 702

Appendix B Sage Examples 703

B.1 Linear Algebra and Matrix Functionality 704

B.2 Chapter 2: Number Theory 705

B.3 Chapter 3: Classical Encryption 710

B.4 Chapter 4: Block Ciphers and the Data Encryption Standard 713

B.5 Chapter 5: Basic Concepts in Number Theory and Finite Fields 717

B.6 Chapter 6: Advanced Encryption Standard 724

B.7 Chapter 8: Pseudorandom Number Generation and Stream Ciphers 729

B.8 Chapter 9: Public-Key Cryptography and RSA 731

B.9 Chapter 10: Other Public-Key Cryptosystems 734

B.10 Chapter 11: Cryptographic Hash Functions 739

B.11 Chapter 13: Digital Signatures 741

References 744

Credits 753

Index 754

ONLINE CHAPTERS AND APPENDICES¹

PART SEVEN: SYSTEM SECURITY

Chapter 21 Malicious Software

21.1 Types of Malicious Software (Malware)

21.2 Advanced Persistent Threat

21.3 Propagation—Infected Content—Viruses

21.4 Propagation—Vulnerability Exploit—Worms

21.5 Propagation—Social Engineering—Spam E-mail, Trojans

21.6 Payload—System Corruption

21.7 Payload—Attack Agent—Zombie, Bots

21.8 Payload—Information Theft—Keyloggers, Phishing, Spyware

21.9 Payload—Stealthing—Backdoors, Rootkits

21.10 Countermeasures

21.11 Distributed Denial of Service Attacks

21.12 References

21.13 Key Terms, Review Questions, and Problems

Chapter 22 Intruders

22.1 Intruders

22.2 Intrusion Detection

22.3 Password Management

22.4 References

22.5 Key Terms, Review Questions, and Problems

Chapter 23 Firewalls

23.1 The Need for Firewalls

23.2 Firewall Characteristics and Access Policy

23.3 Types of Firewalls

23.4 Firewall Basing

23.5 Firewall Location and Configurations

23.6 References

23.7 Key Terms, Review Questions, and Problems

PART EIGHT: LEGAL AND ETHICAL ISSUES

Chapter 24 Legal and Ethical Aspects

24.1 Cybercrime and Computer Crime

24.2 Intellectual Property

24.3 Privacy

24.4 Ethical Issues

24.5 Recommended Reading

24.6 References

24.7 Key Terms, Review Questions, and Problems

24.A Information Privacy

¹ Online chapters, appendices, and other documents are at the Companion Website, available via the

access card at the front of this book.

Appendix C Sage Exercises

Appendix D Standards and Standard-Setting Organizations

Appendix E Basic Concepts from Linear Algebra

Appendix F Measures of Secrecy and Security

Appendix G Simplified DES

Appendix H Evaluation Criteria for AES

Appendix I Simplified AES

Appendix J The Knapsack Algorithm

Appendix K Proof of the Digital Signature Algorithm

Appendix L TCP/IP and OSI

Appendix M Java Cryptographic APIs

Appendix N MD5 Hash Function

Appendix O Data Compression Using ZIP

Appendix P PGP

Appendix Q The International Reference Alphabet

Appendix R Proof of the RSA Algorithm

Appendix S Data Encryption Standard

Appendix T Kerberos Encryption Techniques

Appendix U Mathematical Basis of the Birthday Attack

Appendix V Evaluation Criteria for SHA-3

Appendix W The Complexity of Algorithms

Appendix X Radix-64 Conversion

Appendix Y The Base Rate Fallacy

Glossary

NOTATION

Symbol    Expression                Meaning

D, K      D(K, Y)                   Symmetric decryption of ciphertext Y using secret key K
D, PR_a   D(PR_a, Y)                Asymmetric decryption of ciphertext Y using A’s private key PR_a
D, PU_a   D(PU_a, Y)                Asymmetric decryption of ciphertext Y using A’s public key PU_a
E, K      E(K, X)                   Symmetric encryption of plaintext X using secret key K
E, PR_a   E(PR_a, X)                Asymmetric encryption of plaintext X using A’s private key PR_a
E, PU_a   E(PU_a, X)                Asymmetric encryption of plaintext X using A’s public key PU_a
K                                   Secret key
PR_a                                Private key of user A
PU_a                                Public key of user A
MAC, K    MAC(K, X)                 Message authentication code of message X using secret key K
GF(p)                               The finite field of order p, where p is prime. The field is defined as
                                    the set Z_p together with the arithmetic operations modulo p.
GF(2^n)                             The finite field of order 2^n
Z_n                                 Set of nonnegative integers less than n
gcd       gcd(i, j)                 Greatest common divisor; the largest positive integer that
                                    divides both i and j with no remainder on division.
mod       a mod m                   Remainder after division of a by m
mod, ≡    a ≡ b (mod m)             a mod m = b mod m
mod, ≢    a ≢ b (mod m)             a mod m ≠ b mod m
dlog      dlog_{a,p}(b)             Discrete logarithm of the number b for the base a (mod p)
φ         φ(n)                      The number of positive integers less than n and relatively
                                    prime to n. This is Euler’s totient function.
Σ         Σ_{i=1}^{n} a_i           a_1 + a_2 + … + a_n
Π         Π_{i=1}^{n} a_i           a_1 × a_2 × … × a_n
|         i | j                     i divides j, which means that there is no remainder when j is
                                    divided by i
|, |      |a|                       Absolute value of a
||        x || y                    x concatenated with y
≈         x ≈ y                     x is approximately equal to y
⊕         x ⊕ y                     Exclusive-OR of x and y for single-bit variables;
                                    bitwise exclusive-OR of x and y for multiple-bit variables
⌊, ⌋      ⌊x⌋                       The largest integer less than or equal to x
∈         x ∈ S                     The element x is contained in the set S.
↔         A ↔ (a_1, a_2, …, a_k)    The integer A corresponds to the sequence of integers
                                    (a_1, a_2, …, a_k)
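Several of the arithmetic symbols in the notation table map directly onto small computations. The following is a minimal sketch in Python (the book itself uses Sage, so the choice of plain Python is an assumption made for illustration). The helper names phi, congruent, and divides are hypothetical and do not come from the text. The phi function follows the table's wording literally, counting integers strictly less than n, so it returns 0 for n = 1.

```python
from math import floor, gcd, prod


def phi(n: int) -> int:
    """Euler's totient as worded in the notation table: the number of
    positive integers less than n that are relatively prime to n.
    (Brute force; fine for small n only.)"""
    return sum(1 for k in range(1, n) if gcd(k, n) == 1)


def congruent(a: int, b: int, m: int) -> bool:
    """a ≡ b (mod m) holds exactly when a mod m = b mod m."""
    return a % m == b % m


def divides(i: int, j: int) -> bool:
    """i | j: there is no remainder when j is divided by i."""
    return j % i == 0


if __name__ == "__main__":
    print(gcd(12, 18))           # 6
    print(-11 % 7)               # 3 (Python's % is nonnegative for m > 0)
    print(congruent(73, 4, 23))  # True
    print(phi(10))               # 4  (1, 3, 7, 9)
    print(divides(3, 12))        # True
    print(5 ^ 3)                 # 6  (bitwise exclusive-OR)
    print(floor(3.7))            # 3
    print(sum([1, 2, 3, 4]), prod([1, 2, 3, 4]))  # 10 24
```

Python's % operator returns a nonnegative remainder for a positive modulus, which matches the table's definition of a mod m even for negative a.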

PREFACE

WHAT’S NEW IN THE SEVENTH EDITION

In the four years since the sixth edition of this book was published, the field has seen

continued innovations and improvements. In this new edition, I try to capture these changes while

maintaining a broad and comprehensive coverage of the entire field. To begin this process of

revision, the sixth edition of this book was extensively reviewed by a number of professors

who teach the subject and by professionals working in the field. The result is that, in many

places, the narrative has been clarified and tightened, and illustrations have been improved.

Beyond these refinements to improve pedagogy and user-friendliness, there have been

substantive changes throughout the book. Roughly the same chapter organization has been

retained, but much of the material has been revised and new material has been added. The

most noteworthy changes are as follows:

■ Fundamental security design principles: Chapter 1 includes a new section discussing the

security design principles listed as fundamental by the National Centers of Academic

Excellence in Information Assurance/Cyber Defense, which is jointly sponsored by the

U.S. National Security Agency and the U.S. Department of Homeland Security.

■ Attack surfaces and attack trees: Chapter 1 includes a new section describing these two

concepts, which are useful in evaluating and classifying security threats.

■ Number theory coverage: The material on number theory has been consolidated

into a single chapter, Chapter 2. This makes for a convenient reference. The relevant

portions of Chapter 2 can be assigned as needed.

■ Finite fields: The chapter on finite fields has been revised and expanded with

additional text and new figures to enhance understanding.

■ Format-preserving encryption: This relatively new mode of encryption is enjoying

increasing commercial success. A new section in Chapter 7 covers this method.

■ Conditioning and health testing for true random number generators: Chapter 8 now

provides coverage of these important topics.

■ User authentication model: Chapter 15 includes a new description of a general model

for user authentication, which helps to unify the discussion of the various approaches

to user authentication.

■ Cloud security: The material on cloud security in Chapter 16 has been updated and

expanded to reflect its importance and recent developments.

■ Transport Layer Security (TLS): The treatment of TLS in Chapter 17 has been updated,

reorganized to improve clarity, and now includes a discussion of the new TLS version 1.3.

■ Email Security: Chapter 19 has been completely rewritten to provide a comprehensive

and up-to-date discussion of email security. It includes:

— New: discussion of email threats and a comprehensive approach to email security.

— New: discussion of STARTTLS, which provides confidentiality and authentication

for SMTP.

— Revised: treatment of S/MIME has been updated to reflect the latest version 3.2.

— New: discussion of DNSSEC and its role in supporting email security.

— New: discussion of DNS-based Authentication of Named Entities (DANE) and the

use of this approach to enhance security for certificate use in SMTP and S/MIME.

— New: discussion of Sender Policy Framework (SPF), which is the standardized way

for a sending domain to identify and assert the mail senders for a given domain.

— Revised: discussion of DomainKeys Identified Mail (DKIM) has been revised.

— New: discussion of Domain-based Message Authentication, Reporting, and

Conformance (DMARC), which allows email senders to specify policy on how their mail

should be handled, the types of reports that receivers can send back, and the frequency

with which those reports should be sent.

OBJECTIVES

It is the purpose of this book to provide a practical survey of both the principles and practice

of cryptography and network security. In the first part of the book, the basic issues to be

addressed by a network security capability are explored by providing a tutorial and survey

of cryptography and network security technology. The latter part of the book deals with the

practice of network security: practical applications that have been implemented and are in

use to provide network security.

The subject, and therefore this book, draws on a variety of disciplines. In particular,

it is impossible to appreciate the significance of some of the techniques discussed in this

book without a basic understanding of number theory and some results from probability

theory. Nevertheless, an attempt has been made to make the book self-contained. The book

not only presents the basic mathematical results that are needed but provides the reader

with an intuitive understanding of those results. Such background material is introduced

as needed. This approach helps to motivate the material that is introduced, and the author

considers this preferable to simply presenting all of the mathematical material in a lump at

the beginning of the book.

SUPPORT OF ACM/IEEE COMPUTER SCIENCE CURRICULA 2013

The book is intended for both academic and professional audiences. As a textbook, it is

intended for a one-semester undergraduate course in cryptography and network security for

computer science, computer engineering, and electrical engineering majors. The changes to

this edition are intended to provide support of the ACM/IEEE Computer Science Curricula

2013 (CS2013). CS2013 adds Information Assurance and Security (IAS) to the curriculum

recommendation as one of the Knowledge Areas in the Computer Science Body of Knowledge.

The document states that IAS is now part of the curriculum recommendation because of the

critical role of IAS in computer science education. CS2013 divides all course work into three

categories: Core-Tier 1 (all topics should be included in the curriculum), Core-Tier 2 (all or

almost all topics should be included), and elective (desirable to provide breadth and depth).

In the IAS area, CS2013 recommends topics in Fundamental Concepts and Network Security

in Tier 1 and Tier 2, and Cryptography topics as elective. This text covers virtually all of the

topics listed by CS2013 in these three categories.

The book also serves as a basic reference volume and is suitable for self-study.

PLAN OF THE TEXT

The book is divided into eight parts.

■ Background

■ Symmetric Ciphers

■ Asymmetric Ciphers

■ Cryptographic Data Integrity Algorithms

■ Mutual Trust

■ Network and Internet Security

■ System Security

■ Legal and Ethical Issues

The book includes a number of pedagogic features, including the use of the computer

algebra system Sage and numerous figures and tables to clarify the discussions. Each

chapter includes a list of key words, review questions, homework problems, and suggestions

for further reading. The book also includes an extensive glossary, a list of frequently used

acronyms, and a bibliography. In addition, a test bank is available to instructors.

INSTRUCTOR SUPPORT MATERIALS

The major goal of this text is to make it as effective a teaching tool for this exciting and

fast-moving subject as possible. This goal is reflected both in the structure of the book and in

the supporting material. The text is accompanied by the following supplementary material

that will aid the instructor:

■ Solutions manual: Solutions to all end-of-chapter Review Questions and Problems.

■ Projects manual: Suggested project assignments for all of the project categories listed

below.

■ PowerPoint slides: A set of slides covering all chapters, suitable for use in lecturing.

■ PDF files: Reproductions of all figures and tables from the book.

■ Test bank: A chapter-by-chapter set of questions with a separate file of answers.

■ Sample syllabuses: The text contains more material than can be conveniently covered

in one semester. Accordingly, instructors are provided with several sample syllabuses

that guide the use of the text within limited time.

All of these support materials are available at the Instructor Resource Center

(IRC) for this textbook, which can be reached through the publisher’s Web site

www.pearsonglobaleditions.com/stallings. To gain access to the IRC, please contact your

local Pearson sales representative.

PROJECTS AND OTHER STUDENT EXERCISES

For many instructors, an important component of a cryptography or network security course

is a project or set of projects by which the student gets hands-on experience to reinforce

concepts from the text. This book provides an unparalleled degree of support for including a

projects component in the course. The IRC not only includes guidance on how to assign and

structure the projects, but also includes a set of project assignments that covers a broad range

of topics from the text:

■ Sage projects: Described in the next section.

■ Hacking project: Exercise designed to illuminate the key issues in intrusion detection

and prevention.

■ Block cipher projects: A lab that explores the operation of the AES encryption

algorithm by tracing its execution, computing one round by hand, and then exploring the

various block cipher modes of use. The lab also covers DES. In both cases, an online

Java applet is used (or can be downloaded) to execute AES or DES.

■ Lab exercises: A series of projects that involve programming and experimenting with

concepts from the book.

■ Research projects: A series of research assignments that instruct the student to research

a particular topic on the Internet and write a report.

■ Programming projects: A series of programming projects that cover a broad range of

topics and that can be implemented in any suitable language on any platform.

■ Practical security assessments: A set of exercises to examine current infrastructure and

practices of an existing organization.

■ Firewall projects: A portable network firewall visualization simulator, together with

exercises for teaching the fundamentals of firewalls.

■ Case studies: A set of real-world case studies, including learning objectives, case

description, and a series of case discussion questions.

■ Writing assignments: A set of suggested writing assignments, organized by chapter.

■ Reading/report assignments: A list of papers in the literature—one for each chapter—

that can be assigned for the student to read and then write a short report.

This diverse set of projects and other student exercises enables the instructor to use

the book as one component in a rich and varied learning experience and to tailor a course

plan to meet the specific needs of the instructor and students. See Appendix A in this book

for details.

THE SAGE COMPUTER ALGEBRA SYSTEM

One of the most important features of this book is the use of Sage for cryptographic examples

and homework assignments. Sage is an open-source, multiplatform, freeware package that

implements a very powerful, flexible, and easily learned mathematics and computer algebra

system. Unlike competing systems (such as Mathematica, Maple, and MATLAB), there are

no licensing agreements or fees involved. Thus, Sage can be made available on computers

and networks at school, and students can individually download the software to their own

personal computers for use at home. Another advantage of using Sage is that students learn

a powerful, flexible tool that can be used for virtually any mathematical application, not

just cryptography.

The use of Sage can make a significant difference to the teaching of the mathematics

of cryptographic algorithms. This book provides a large number of examples of the use of

Sage covering many cryptographic concepts in Appendix B, which is included in this book.

Appendix C lists exercises in each of these topic areas to enable the student to gain

hands-on experience with cryptographic algorithms. This appendix is available to

instructors at the IRC for this book. Appendix C includes a section on how to download and get

started with Sage, a section on programming with Sage, and exercises that can be assigned to

students in the following categories:

■ Chapter 2—Number Theory and Finite Fields: Euclidean and extended Euclidean

algorithms, polynomial arithmetic, GF(2^4), Euler’s totient function, Miller–Rabin,

factoring, modular exponentiation, discrete logarithm, and Chinese remainder theorem.

■ Chapter 3—Classical Encryption: Affine ciphers and the Hill cipher.

■ Chapter 4—Block Ciphers and the Data Encryption Standard: Exercises based

on SDES.

■ Chapter 6—Advanced Encryption Standard: Exercises based on SAES.

■ Chapter 8—Pseudorandom Number Generation and Stream Ciphers: Blum Blum

Shub, linear congruential generator, and ANSI X9.17 PRNG.

■ Chapter 9—Public-Key Cryptography and RSA: RSA encrypt/decrypt and signing.

■ Chapter 10—Other Public-Key Cryptosystems: Diffie–Hellman, elliptic curve.

■ Chapter 11—Cryptographic Hash Functions: Number-theoretic hash function.

■ Chapter 13—Digital Signatures: DSA.

ONLINE DOCUMENTS FOR STUDENTS

For this new edition, a tremendous amount of original supporting material for students has

been made available online.

Purchasing this textbook new also grants the reader six months of access to the

Companion Website, which includes the following materials:

■ Online chapters: To limit the size and cost of the book, four chapters of the book are

provided in PDF format. This includes three chapters on computer security and one on

legal and ethical issues. The chapters are listed in this book’s table of contents.

■ Online appendices: There are numerous interesting topics that support material found

in the text but whose inclusion is not warranted in the printed text. A total of 20 online

appendices cover these topics for the interested student. The appendices are listed in

this book’s table of contents.

■ Homework problems and solutions: To aid the student in understanding the material,

a separate set of homework problems with solutions is available.

■ Key papers: A number of papers from the professional literature, many hard to find,

are provided for further reading.

■ Supporting documents: A variety of other useful documents are referenced in the text

and provided online.

■ Sage code: The Sage code from the examples in Appendix B is useful in case the student

wants to play around with the examples.

To access the Companion Website, follow the instructions for “digital resources for

students” found in the front of this book.

ACKNOWLEDGMENTS

This new edition has benefited from review by a number of people who gave generously

of their time and expertise. The following professors reviewed all or a large part of the

manuscript: Hossein Beyzavi (Marymount University), Donald F. Costello (University of

Nebraska–Lincoln), James Haralambides (Barry University), Anand Seetharam (California

State University at Monterey Bay), Marius C. Silaghi (Florida Institute of Technology),

Shambhu Upadhyaya (University at Buffalo), Zhengping Wu (California State University

at San Bernardino), Liangliang Xiao (Frostburg State University), Seong-Moo (Sam) Yoo

(The University of Alabama in Huntsville), and Hong Zhang (Armstrong State University).

Thanks also to the people who provided detailed technical reviews of one or more

chapters: Dino M. Amaral, Chris Andrew, Prof. (Dr). C. Annamalai, Andrew Bain, Riccardo

Bernardini, Olivier Blazy, Zervopoulou Christina, Maria Christofi, Dhananjoy Dey, Mario

Emmanuel, Mike Fikuart, Alexander Fries, Pierpaolo Giacomin, Pedro R. M. Inácio,

Daniela Tamy Iwassa, Krzysztof Janowski, Sergey Katsev, Adnan Kilic, Rob Knox, Mina

Pourdashty, Yuri Poeluev, Pritesh Prajapati, Venkatesh Ramamoorthy, Andrea Razzini,

Rami Rosen, Javier Scodelaro, Jamshid Shokrollahi, Oscar So, and David Tillemans.

In addition, I was fortunate to have reviews of individual topics by “subject-area

gurus,” including Jesse Walker of Intel (Intel’s Digital Random Number Generator), Russ

Housley of Vigil Security (key wrapping), Joan Daemen (AES), Edward F. Schaefer of

Santa Clara University (Simplified AES), Tim Mathews, formerly of RSA Laboratories

(S/MIME), Alfred Menezes of the University of Waterloo (elliptic curve cryptography),

William Sutton, Editor/Publisher of The Cryptogram (classical encryption), Avi Rubin of

Johns Hopkins University (number theory), Michael Markowitz of Information Security

Corporation (SHA and DSS), Don Davis of IBM Internet Security Systems (Kerberos),

Steve Kent of BBN Technologies (X.509), and Phil Zimmerman (PGP).

Nikhil Bhargava (IIT Delhi) developed the set of online homework problems and

solutions. Dan Shumow of Microsoft and the University of Washington developed all of

the Sage examples and assignments in Appendices B and C. Professor Sreekanth Malladi of

Dakota State University developed the hacking exercises. Lawrie Brown of the Australian

Defence Force Academy provided the AES/DES block cipher projects and the security

assessment assignments.


Sanjay Rao and Ruben Torres of Purdue University developed the laboratory exercises

that appear in the IRC. The following people contributed project assignments that appear in

the instructor’s supplement: Henning Schulzrinne (Columbia University); Cetin Kaya Koc

(Oregon State University); and David Balenson (Trusted Information Systems and George

Washington University). Kim McLaughlin developed the test bank.

Finally, I thank the many people responsible for the publication of this book, all of

whom did their usual excellent job. This includes the staff at Pearson, particularly my editor

Tracy Johnson, program manager Carole Snyder, and production manager Bob Engelhardt.

Thanks also to the marketing and sales staffs at Pearson, without whose efforts this book

would not be in front of you.

ACKNOWLEDGMENTS FOR THE GLOBAL EDITION

Pearson would like to thank and acknowledge Somitra Kumar Sanadhya (Indraprastha

Institute of Information Technology Delhi), and Somanath Tripathy (Indian Institute of

Technology Patna) for contributing to the Global Edition, and Anwitaman Datta (Nanyang

Technological University Singapore), Atul Kahate (Pune University), Goutam Paul (Indian

Statistical Institute Kolkata), and Khyat Sharma for reviewing the Global Edition.

ABOUT THE AUTHOR

Dr. William Stallings has authored 18 titles, and counting revised editions, over 40 books

on computer security, computer networking, and computer architecture. His writings have

appeared in numerous publications, including the Proceedings of the IEEE, ACM Computing

Reviews, and Cryptologia.

He has 13 times received the award for the best Computer Science textbook of the

year from the Text and Academic Authors Association.

In over 30 years in the field, he has been a technical contributor, technical manager,

and an executive with several high-technology firms. He has designed and implemented

both TCP/IP-based and OSI-based protocol suites on a variety of computers and operating

systems, ranging from microcomputers to mainframes. As a consultant, he has advised gov-

ernment agencies, computer and software vendors, and major users on the design, selection,

and use of networking software and products.

He created and maintains the Computer Science Student Resource Site at

ComputerScienceStudent.com. This site provides documents and links on a variety of

subjects of general interest to computer science students (and professionals). He is a member

of the editorial board of Cryptologia, a scholarly journal devoted to all aspects of cryptology.

Dr. Stallings holds a PhD from MIT in computer science and a BS from Notre Dame

in electrical engineering.


PART ONE: BACKGROUND

CHAPTER 1

Computer and Network Security Concepts

1.1 Computer Security Concepts

A Definition of Computer Security

Examples

The Challenges of Computer Security

1.2 The OSI Security Architecture

1.3 Security Attacks

Passive Attacks

Active Attacks

1.4 Security Services

Authentication

Access Control

Data Confidentiality

Data Integrity

Nonrepudiation

Availability Service

1.5 Security Mechanisms

1.6 Fundamental Security Design Principles

1.7 Attack Surfaces and Attack Trees

Attack Surfaces

Attack Trees

1.8 A Model for Network Security

1.9 Standards

1.10 Key Terms, Review Questions, and Problems


20 CHAPTER 1 / COMPUTER AND NETWORK SECURITY CONCEPTS

This book focuses on two broad areas: cryptographic algorithms and protocols, which

have a broad range of applications; and network and Internet security, which rely

heavily on cryptographic techniques.

Cryptographic algorithms and protocols can be grouped into four main areas:

■ Symmetric encryption: Used to conceal the contents of blocks or streams of

data of any size, including messages, files, encryption keys, and passwords.

■ Asymmetric encryption: Used to conceal small blocks of data, such as encryp-

tion keys and hash function values, which are used in digital signatures.

■ Data integrity algorithms: Used to protect blocks of data, such as messages,

from alteration.

■ Authentication protocols: These are schemes based on the use of crypto-

graphic algorithms designed to authenticate the identity of entities.

The field of network and Internet security consists of measures to deter, prevent,

detect, and correct security violations that involve the transmission of information.

That is a broad statement that covers a host of possibilities. To give you a feel for the

areas covered in this book, consider the following examples of security violations:

1. User A transmits a file to user B. The file contains sensitive information

(e.g., payroll records) that is to be protected from disclosure. User C, who is

not authorized to read the file, is able to monitor the transmission and capture

a copy of the file during its transmission.

2. A network manager, D, transmits a message to a computer, E, under its man-

agement. The message instructs computer E to update an authorization file to

include the identities of a number of new users who are to be given access to

that computer. User F intercepts the message, alters its contents to add or delete

entries, and then forwards the message to computer E, which accepts the mes-

sage as coming from manager D and updates its authorization file accordingly.

LEARNING OBJECTIVES

After studying this chapter, you should be able to:

◆ Describe the key security requirements of confidentiality, integrity, and

availability.

◆ Describe the X.800 security architecture for OSI.

◆ Discuss the types of security threats and attacks that must be dealt with

and give examples of the types of threats and attacks that apply to differ-

ent categories of computer and network assets.

◆ Explain the fundamental security design principles.

◆ Discuss the use of attack surfaces and attack trees.

◆ List and briefly describe key organizations involved in cryptography

standards.


3. Rather than intercept a message, user F constructs its own message with the

desired entries and transmits that message to computer E as if it had come

from manager D. Computer E accepts the message as coming from manager D

and updates its authorization file accordingly.

4. An employee is fired without warning. The personnel manager sends a mes-

sage to a server system to invalidate the employee’s account. When the invali-

dation is accomplished, the server is to post a notice to the employee’s file as

confirmation of the action. The employee is able to intercept the message and

delay it long enough to make a final access to the server to retrieve sensitive

information. The message is then forwarded, the action taken, and the confir-

mation posted. The employee’s action may go unnoticed for some consider-

able time.

5. A message is sent from a customer to a stockbroker with instructions for vari-

ous transactions. Subsequently, the investments lose value and the customer

denies sending the message.

Although this list by no means exhausts the possible types of network security viola-

tions, it illustrates the range of concerns of network security.

1.1 COMPUTER SECURITY CONCEPTS

A Definition of Computer Security

The NIST Computer Security Handbook [NIST95] defines the term computer secu-

rity as follows:

Computer Security: The protection afforded to an automated information system

in order to attain the applicable objectives of preserving the integrity, availability,

and confidentiality of information system resources (includes hardware, software,

firmware, information/data, and telecommunications).

This definition introduces three key objectives that are at the heart of com-

puter security:

■ Confidentiality: This term covers two related concepts:

Data¹ confidentiality: Assures that private or confidential information is

not made available or disclosed to unauthorized individuals.

Privacy: Assures that individuals control or influence what information re-

lated to them may be collected and stored and by whom and to whom that

information may be disclosed.

¹ RFC 4949 defines information as “facts and ideas, which can be represented (encoded) as various forms

of data,” and data as “information in a specific physical representation, usually a sequence of symbols

that have meaning; especially a representation of information that can be processed or produced by a

computer.” Security literature typically does not make much of a distinction, nor does this book.


■ Integrity: This term covers two related concepts:

Data integrity: Assures that information (both stored and in transmitted

packets) and programs are changed only in a specified and authorized

manner.

System integrity: Assures that a system performs its intended function in

an unimpaired manner, free from deliberate or inadvertent unauthorized

manipulation of the system.

■ Availability: Assures that systems work promptly and service is not denied to

authorized users.

These three concepts form what is often referred to as the CIA triad. The

three concepts embody the fundamental security objectives for both data and for

information and computing services. For example, the NIST standard FIPS 199

(Standards for Security Categorization of Federal Information and Information

Systems) lists confidentiality, integrity, and availability as the three security objec-

tives for information and for information systems. FIPS 199 provides a useful char-

acterization of these three objectives in terms of requirements and the definition of

a loss of security in each category:

■ Confidentiality: Preserving authorized restrictions on information access

and disclosure, including means for protecting personal privacy and propri-

etary information. A loss of confidentiality is the unauthorized disclosure of

information.

■ Integrity: Guarding against improper information modification or destruc-

tion, including ensuring information nonrepudiation and authenticity. A loss

of integrity is the unauthorized modification or destruction of information.

■ Availability: Ensuring timely and reliable access to and use of information.

A loss of availability is the disruption of access to or use of information or an

information system.

Although the use of the CIA triad to define security objectives is well estab-

lished, some in the security field feel that additional concepts are needed to present a

complete picture (Figure 1.1). Two of the most commonly mentioned are as follows:

Figure 1.1 Essential Network and Computer Security Requirements

[Figure labels: Data and services; Confidentiality; Integrity; Availability; Authenticity; Accountability]


■ Authenticity: The property of being genuine and being able to be verified and

trusted; confidence in the validity of a transmission, a message, or message

originator. This means verifying that users are who they say they are and that

each input arriving at the system came from a trusted source.

■ Accountability: The security goal that generates the requirement for actions

of an entity to be traced uniquely to that entity. This supports nonrepudia-

tion, deterrence, fault isolation, intrusion detection and prevention, and after-

action recovery and legal action. Because truly secure systems are not yet an

achievable goal, we must be able to trace a security breach to a responsible

party. Systems must keep records of their activities to permit later forensic

analysis to trace security breaches or to aid in transaction disputes.

Examples

We now provide some examples of applications that illustrate the requirements just

enumerated.² For these examples, we use three levels of impact on organizations or

individuals should there be a breach of security (i.e., a loss of confidentiality, integ-

rity, or availability). These levels are defined in FIPS PUB 199:

■ Low: The loss could be expected to have a limited adverse effect on organi-

zational operations, organizational assets, or individuals. A limited adverse

effect means that, for example, the loss of confidentiality, integrity, or avail-

ability might (i) cause a degradation in mission capability to an extent and

duration that the organization is able to perform its primary functions, but the

effectiveness of the functions is noticeably reduced; (ii) result in minor dam-

age to organizational assets; (iii) result in minor financial loss; or (iv) result in

minor harm to individuals.

■ Moderate: The loss could be expected to have a serious adverse effect on

organizational operations, organizational assets, or individuals. A serious

adverse effect means that, for example, the loss might (i) cause a signifi-

cant degradation in mission capability to an extent and duration that the

organization is able to perform its primary functions, but the effectiveness

of the functions is significantly reduced; (ii) result in significant damage to

organizational assets; (iii) result in significant financial loss; or (iv) result in

significant harm to individuals that does not involve loss of life or serious,

life-threatening injuries.

■ High: The loss could be expected to have a severe or catastrophic adverse

effect on organizational operations, organizational assets, or individuals.

A severe or catastrophic adverse effect means that, for example, the loss

might (i) cause a severe degradation in or loss of mission capability to an

extent and duration that the organization is not able to perform one or more

of its primary functions; (ii) result in major damage to organizational assets;

(iii) result in major financial loss; or (iv) result in severe or catastrophic harm

to individuals involving loss of life or serious, life-threatening injuries.

² These examples are taken from a security policy document published by the Information Technology

Security and Privacy Office at Purdue University.


CONFIDENTIALITY Student grade information is an asset whose confidentiality is

considered to be highly important by students. In the United States, the release of

such information is regulated by the Family Educational Rights and Privacy Act

(FERPA). Grade information should only be available to students, their parents,

and employees that require the information to do their job. Student enrollment

information may have a moderate confidentiality rating. While still covered by

FERPA, this information is seen by more people on a daily basis, is less likely to be

targeted than grade information, and results in less damage if disclosed. Directory

information, such as lists of students or faculty or departmental lists, may be as-

signed a low confidentiality rating or indeed no rating. This information is typically

freely available to the public and published on a school’s Web site.

INTEGRITY Several aspects of integrity are illustrated by the example of a hospital

patient’s allergy information stored in a database. The doctor should be able to

trust that the information is correct and current. Now suppose that an employee

(e.g., a nurse) who is authorized to view and update this information deliberately

falsifies the data to cause harm to the hospital. The database needs to be restored

to a trusted basis quickly, and it should be possible to trace the error back to the

person responsible. Patient allergy information is an example of an asset with a high

requirement for integrity. Inaccurate information could result in serious harm or

death to a patient and expose the hospital to massive liability.

An example of an asset that may be assigned a moderate level of integrity

requirement is a Web site that offers a forum to registered users to discuss some

specific topic. Either a registered user or a hacker could falsify some entries or

deface the Web site. If the forum exists only for the enjoyment of the users, brings

in little or no advertising revenue, and is not used for something important such

as research, then potential damage is not severe. The Web master may experience

some data, financial, and time loss.

An example of a low integrity requirement is an anonymous online poll. Many

Web sites, such as news organizations, offer these polls to their users with very few

safeguards. However, the inaccuracy and unscientific nature of such polls is well

understood.

AVAILABILITY The more critical a component or service, the higher is the level of

availability required. Consider a system that provides authentication services for

critical systems, applications, and devices. An interruption of service results in the

inability of customers to access computing resources and of staff to access the

resources they need to perform critical tasks. The loss of the service translates into a

large financial loss in lost employee productivity and potential customer loss.

An example of an asset that would typically be rated as having a moderate

availability requirement is a public Web site for a university; the Web site provides

information for current and prospective students and donors. Such a site is not a

critical component of the university’s information system, but its unavailability will

cause some embarrassment.

An online telephone directory lookup application would be classified as a low

availability requirement. Although the temporary loss of the application may be

an annoyance, there are other ways to access the information, such as a hardcopy

directory or the operator.
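
The examples above can be collected into a small lookup structure. The following is a minimal sketch, not part of FIPS 199 itself: the names Impact, Rating, RATINGS, and assets_at are hypothetical, and the asset labels paraphrase the examples in this section. The authentication service is recorded as high on the assumption that a "critical" service implies the highest availability level.

```python
from dataclasses import dataclass
from enum import IntEnum


class Impact(IntEnum):
    """The three FIPS 199 impact levels, ordered from least to most severe."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


@dataclass(frozen=True)
class Rating:
    asset: str
    objective: str  # "confidentiality", "integrity", or "availability"
    impact: Impact


# Ratings paraphrased from the examples in this section (labels are illustrative).
RATINGS = [
    Rating("student grades", "confidentiality", Impact.HIGH),
    Rating("student enrollment", "confidentiality", Impact.MODERATE),
    Rating("directory information", "confidentiality", Impact.LOW),
    Rating("patient allergy data", "integrity", Impact.HIGH),
    Rating("discussion forum", "integrity", Impact.MODERATE),
    Rating("anonymous online poll", "integrity", Impact.LOW),
    Rating("authentication service", "availability", Impact.HIGH),
    Rating("university public Web site", "availability", Impact.MODERATE),
    Rating("telephone directory lookup", "availability", Impact.LOW),
]


def assets_at(level, objective):
    """Return the assets rated at the given impact level for one objective."""
    return [r.asset for r in RATINGS
            if r.impact == level and r.objective == objective]


print(assets_at(Impact.HIGH, "integrity"))
```

Ordering the levels as integers makes comparisons such as "at least moderate" straightforward, which is one plausible reason to model them this way.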


The Challenges of Computer Security

Computer and network security is both fascinating and complex. Some of the

reasons follow:

1. Security is not as simple as it might first appear to the novice. The require-

ments seem to be straightforward; indeed, most of the major requirements for

security services can be given self-explanatory, one-word labels: confidential-

ity, authentication, nonrepudiation, or integrity. But the mechanisms used to

meet those requirements can be quite complex, and understanding them may

involve rather subtle reasoning.

2. In developing a particular security mechanism or algorithm, one must always

consider potential attacks on those security features. In many cases, successful

attacks are designed by looking at the problem in a completely different way,

therefore exploiting an unexpected weakness in the mechanism.

3. Because of point 2, the procedures used to provide particular services are

often counterintuitive. Typically, a security mechanism is complex, and it is not

obvious from the statement of a particular requirement that such elaborate

measures are needed. It is only when the various aspects of the threat are con-

sidered that elaborate security mechanisms make sense.

4. Having designed various security mechanisms, it is necessary to decide where

to use them. This is true both in terms of physical placement (e.g., at what points

in a network are certain security mechanisms needed) and in a logical sense

(e.g., at what layer or layers of an architecture such as TCP/IP [Transmission

Control Protocol/Internet Protocol] should mechanisms be placed).

5. Security mechanisms typically involve more than a particular algorithm or

protocol. They also require that participants be in possession of some secret in-

formation (e.g., an encryption key), which raises questions about the creation,

distribution, and protection of that secret information. There also may be a re-

liance on communications protocols whose behavior may complicate the task

of developing the security mechanism. For example, if the proper functioning

of the security mechanism requires setting time limits on the transit time of a

message from sender to receiver, then any protocol or network that introduces

variable, unpredictable delays may render such time limits meaningless.

6. Computer and network security is essentially a battle of wits between a per-

petrator who tries to find holes and the designer or administrator who tries to

close them. The great advantage that the attacker has is that he or she need

only find a single weakness, while the designer must find and eliminate all

weaknesses to achieve perfect security.

7. There is a natural tendency on the part of users and system managers to per-

ceive little benefit from security investment until a security failure occurs.

8. Security requires regular, even constant, monitoring, and this is difficult in

today’s short-term, overloaded environment.

9. Security is still too often an afterthought to be incorporated into a system

after the design is complete rather than being an integral part of the design

process.


10. Many users and even security administrators view strong security as an

impediment to efficient and user-friendly operation of an information system

or use of information.

The difficulties just enumerated will be encountered in numerous ways as we

examine the various security threats and mechanisms throughout this book.
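
Point 5 in the list above mentions mechanisms that depend on a time limit for message transit. A minimal sketch of why variable delay undermines such a limit follows. The function is_fresh, the one-second limit, and the sample transit times are all hypothetical, and the sketch assumes the sender and receiver clocks agree.

```python
def is_fresh(sent_at, received_at, max_transit):
    """Accept a message only if its transit time (seconds) is within the limit."""
    transit = received_at - sent_at
    return 0 <= transit <= max_transit


# Hypothetical transit times, in seconds, for legitimate messages
# on a network with variable, unpredictable delay.
transit_times = [0.2, 0.4, 3.5, 0.3, 7.1]

# With a 1-second limit, the two delayed but legitimate messages are rejected.
accepted = [is_fresh(0.0, t, max_transit=1.0) for t in transit_times]
print(accepted)
```

Raising the limit so that every legitimate message passes would also widen the window available to an opponent who delays a message, which is the difficulty the text describes.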

1.2 THE OSI SECURITY ARCHITECTURE

To assess effectively the security needs of an organization and to evaluate and

choose various security products and policies, the manager responsible for security

needs some systematic way of defining the requirements for security and character-

izing the approaches to satisfying those requirements. This is difficult enough in a

centralized data processing environment; with the use of local and wide area net-

works, the problems are compounded.

ITU-T³ Recommendation X.800, Security Architecture for OSI, defines such a

systematic approach.⁴ The OSI security architecture is useful to managers as a way

of organizing the task of providing security. Furthermore, because this architecture

was developed as an international standard, computer and communications vendors

have developed security features for their products and services that relate to this

structured definition of services and mechanisms.

For our purposes, the OSI security architecture provides a useful, if abstract,

overview of many of the concepts that this book deals with. The OSI security archi-

tecture focuses on security attacks, mechanisms, and services. These can be defined

briefly as

■ Security attack: Any action that compromises the security of information

owned by an organization.

■ Security mechanism: A process (or a device incorporating such a process)

that is designed to detect, prevent, or recover from a security attack.

■ Security service: A processing or communication service that enhances the

security of the data processing systems and the information transfers of an

organization. The services are intended to counter security attacks, and they

make use of one or more security mechanisms to provide the service.

In the literature, the terms threat and attack are commonly used to mean more

or less the same thing. Table 1.1 provides definitions taken from RFC 4949, Internet

Security Glossary.

³ The International Telecommunication Union (ITU) Telecommunication Standardization Sector (ITU-T)

is a United Nations-sponsored agency that develops standards, called Recommendations, relating to tele-

communications and to open systems interconnection (OSI).

⁴ The OSI security architecture was developed in the context of the OSI protocol architecture, which is

described in Appendix L. However, for our purposes in this chapter, an understanding of the OSI proto-

col architecture is not required.


1.3 SECURITY ATTACKS

A useful means of classifying security attacks, used both in X.800 and RFC 4949, is

in terms of passive attacks and active attacks (Figure 1.2). A passive attack attempts

to learn or make use of information from the system but does not affect system re-

sources. An active attack attempts to alter system resources or affect their operation.

Passive Attacks

Passive attacks (Figure 1.2a) are in the nature of eavesdropping on, or monitoring

of, transmissions. The goal of the opponent is to obtain information that is being

transmitted. Two types of passive attacks are the release of message contents and

traffic analysis.

The release of message contents is easily understood. A telephone conver-

sation, an electronic mail message, and a transferred file may contain sensitive or

confidential information. We would like to prevent an opponent from learning the

contents of these transmissions.

A second type of passive attack, traffic analysis, is subtler. Suppose that we

had a way of masking the contents of messages or other information traffic so that

opponents, even if they captured the message, could not extract the information

from the message. The common technique for masking contents is encryption. If we

had encryption protection in place, an opponent might still be able to observe the

pattern of these messages. The opponent could determine the location and identity

of communicating hosts and could observe the frequency and length of messages

being exchanged. This information might be useful in guessing the nature of the

communication that was taking place.
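
The traffic-analysis idea can be illustrated with a short sketch. It assumes an opponent who cannot read encrypted payloads but can still record who talks to whom and how much. The host names, message lengths, and the summarize function are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical observations of encrypted traffic: (source, destination, length in bytes).
# The payloads are unreadable, so this metadata is all the opponent sees.
observed = [
    ("hostA", "hostB", 512),
    ("hostA", "hostB", 498),
    ("hostC", "hostB", 64),
    ("hostA", "hostB", 530),
]


def summarize(observations):
    """Return {(src, dst): (message count, average length)} from metadata alone."""
    counts = Counter()
    total_length = defaultdict(int)
    for src, dst, length in observations:
        counts[(src, dst)] += 1
        total_length[(src, dst)] += length
    return {pair: (counts[pair], total_length[pair] / counts[pair])
            for pair in counts}


summary = summarize(observed)
for pair, (count, avg_len) in summary.items():
    print(pair, count, round(avg_len, 1))
```

Even this crude summary shows which pair of hosts exchanges the most traffic, which is the kind of pattern the paragraph above says encryption alone does not hide.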

Passive attacks are very difficult to detect, because they do not involve any

alteration of the data. Typically, the message traffic is sent and received in an appar-

ently normal fashion, and neither the sender nor receiver is aware that a third party

has read the messages or observed the traffic pattern. However, it is feasible to pre-

vent the success of these attacks, usually by means of encryption. Thus, the empha-

sis in dealing with passive attacks is on prevention rather than detection.

Active Attacks

Active attacks (Figure 1.2b) involve some modification of the data stream or the

creation of a false stream and can be subdivided into four categories: masquerade,

replay, modification of messages, and denial of service.

Table 1.1 Threats and Attacks (RFC 4949)

Threat: A potential for violation of security, which exists when there is a circumstance, capability, action, or event that could breach security and cause harm. That is, a threat is a possible danger that might exploit a vulnerability.

Attack: An assault on system security that derives from an intelligent threat; that is, an intelligent act that is a deliberate attempt (especially in the sense of a method or technique) to evade security services and violate the security policy of a system.


A masquerade takes place when one entity pretends to be a different entity

(path 2 of Figure 1.2b is active). A masquerade attack usually includes one of the

other forms of active attack. For example, authentication sequences can be captured

and replayed after a valid authentication sequence has taken place, thus enabling an

authorized entity with few privileges to obtain extra privileges by impersonating an

entity that has those privileges.

Replay involves the passive capture of a data unit and its subsequent retrans-

mission to produce an unauthorized effect (paths 1, 2, and 3 active).

Modification of messages simply means that some portion of a legitimate mes-

sage is altered, or that messages are delayed or reordered, to produce an unauthor-

ized effect (paths 1 and 2 active). For example, a message meaning “Allow John

Smith to read confidential file accounts” is modified to mean “Allow Fred Brown to

read confidential file accounts.”
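
As a rough illustration of how a receiver might detect the modification and replay attacks just described, the sketch below attaches a keyed hash (HMAC-SHA256 from the Python standard library) and a sequence number to each message. This technique is not introduced in this section and is used here only as an assumption. The key, the protect function, and the Receiver class are hypothetical.

```python
import hashlib
import hmac

KEY = b"shared-secret-for-illustration-only"  # hypothetical shared key


def protect(seq, message, key=KEY):
    """Return (seq, message, tag); the tag covers the sequence number and the message."""
    data = seq.to_bytes(8, "big") + message
    tag = hmac.new(key, data, hashlib.sha256).digest()
    return seq, message, tag


class Receiver:
    def __init__(self, key=KEY):
        self.key = key
        self.last_seq = -1  # highest sequence number accepted so far

    def accept(self, seq, message, tag):
        data = seq.to_bytes(8, "big") + message
        expected = hmac.new(self.key, data, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return "rejected: modified"
        if seq <= self.last_seq:
            return "rejected: replay"
        self.last_seq = seq
        return "accepted"


receiver = Receiver()
seq, msg, tag = protect(0, b"Allow John Smith to read confidential file accounts")
first = receiver.accept(seq, msg, tag)
altered = receiver.accept(seq, b"Allow Fred Brown to read confidential file accounts", tag)
replayed = receiver.accept(seq, msg, tag)
print(first, altered, replayed, sep="\n")
```

The strictly increasing sequence check in this sketch would also reject legitimate messages that merely arrive out of order, so a real design would need a more tolerant rule.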

Figure 1.2 Security Attacks

[Figure: (a) Passive attacks; (b) Active attacks. In each panel, Bob and Alice communicate over the Internet or other communications facility, and Darth is the opponent; panel (b) labels the message paths 1, 2, and 3.]


The denial of service prevents or inhibits the normal use or management of

communications facilities (path 3 active). This attack may have a specific target; for

example, an entity may suppress all messages directed to a particular destination

(e.g., the security audit service). Another form of service denial is the disruption of

an entire network, either by disabling the network or by overloading it with mes-

sages so as to degrade performance.

Active attacks present the opposite characteristics of passive attacks. Whereas

passive attacks are difficult to detect, measures are available to prevent their success.

On the other hand, it is quite difficult to prevent active attacks absolutely because

of the wide variety of potential physical, software, and network vulnerabilities.

Instead, the goal is to detect active attacks and to recover from any disruption or

delays caused by them. If the detection has a deterrent effect, it may also contribute

to prevention.

1.4 SECURITY SERVICES

X.800 defines a security service as a service that is provided by a protocol layer of

communicating open systems and that ensures adequate security of the systems or

of data transfers. Perhaps a clearer definition is found in RFC 4949, which provides

the following definition: a processing or communication service that is provided by

a system to give a specific kind of protection to system resources; security services

implement security policies and are implemented by security mechanisms.

X.800 divides these services into five categories and fourteen specific services

(Table 1.2). We look at each category in turn.⁵

Authentication

The authentication service is concerned with assuring that a communication is au-

thentic. In the case of a single message, such as a warning or alarm signal, the function

of the authentication service is to assure the recipient that the message is from the

source that it claims to be from. In the case of an ongoing interaction, such as the con-

nection of a terminal to a host, two aspects are involved. First, at the time of connec-

tion initiation, the service assures that the two entities are authentic, that is, that each

is the entity that it claims to be. Second, the service must assure that the connection is

not interfered with in such a way that a third party can masquerade as one of the two

legitimate parties for the purposes of unauthorized transmission or reception.

Two specific authentication services are defined in X.800:

■ Peer entity authentication: Provides for the corroboration of the identity of a

peer entity in an association. Two entities are considered peers if they implement

the same protocol in different systems; for example, two TCP modules

in two communicating systems. Peer entity authentication is provided for

5There is no universal agreement about many of the terms used in the security literature. For example, the

term integrity is sometimes used to refer to all aspects of information security. The term authentication is

sometimes used to refer both to verification of identity and to the various functions listed under integrity

in this chapter. Our usage here agrees with both X.800 and RFC 4949.

30 CHAPTER 1 / COMPUTER AND NETWORK SECURITY CONCEPTS

AUTHENTICATION

The assurance that the communicating entity is the

one that it claims to be.

Peer Entity Authentication

Used in association with a logical connection to

provide confidence in the identity of the entities

connected.

Data-Origin Authentication

In a connectionless transfer, provides assurance that

the source of received data is as claimed.

ACCESS CONTROL

The prevention of unauthorized use of a resource

(i.e., this service controls who can have access to a

resource, under what conditions access can occur,

and what those accessing the resource are allowed

to do).

DATA CONFIDENTIALITY

The protection of data from unauthorized

disclosure.

Connection Confidentiality

The protection of all user data on a connection.

Connectionless Confidentiality

The protection of all user data in a single data block.

Selective-Field Confidentiality

The confidentiality of selected fields within the user

data on a connection or in a single data block.

Traffic-Flow Confidentiality

The protection of the information that might be

derived from observation of traffic flows.

DATA INTEGRITY

The assurance that data received are exactly as

sent by an authorized entity (i.e., contain no modi-

fication, insertion, deletion, or replay).

Connection Integrity with Recovery

Provides for the integrity of all user data on a connec-

tion and detects any modification, insertion, deletion,

or replay of any data within an entire data sequence,

with recovery attempted.

Connection Integrity without Recovery

As above, but provides only detection without

recovery.

Selective-Field Connection Integrity

Provides for the integrity of selected fields within the

user data of a data block transferred over a connec-

tion and takes the form of determination of whether

the selected fields have been modified, inserted,

deleted, or replayed.

Connectionless Integrity

Provides for the integrity of a single connectionless

data block and may take the form of detection of

data modification. Additionally, a limited form of

replay detection may be provided.

Selective-Field Connectionless Integrity

Provides for the integrity of selected fields within a

single connectionless data block; takes the form of

determination of whether the selected fields have

been modified.

NONREPUDIATION

Provides protection against denial by one of the

entities involved in a communication of having par-

ticipated in all or part of the communication.

Nonrepudiation, Origin

Proof that the message was sent by the specified

party.

Nonrepudiation, Destination

Proof that the message was received by the specified

party.

Table 1.2 Security Services (X.800)

use at the establishment of, or at times during the data transfer phase of, a

connection. It attempts to provide confidence that an entity is not performing

either a masquerade or an unauthorized replay of a previous connection.

■ Data origin authentication: Provides for the corroboration of the source of a

data unit. It does not provide protection against the duplication or modifica-

tion of data units. This type of service supports applications like electronic mail,

where there are no prior interactions between the communicating entities.
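As a rough illustration of data origin authentication, the following sketch attaches a keyed authentication tag to a single data unit using Python's standard hmac module. The shared key, the sample message, and the helper names tag_message and verify_origin are assumptions made for this example and are not part of X.800. Consistent with the definition above, a duplicated data unit still verifies. The HMAC tag also happens to detect modification, which goes beyond the service as defined.

```python
import hashlib
import hmac

# Hypothetical shared secret; in practice it would be distributed securely.
SHARED_KEY = b"example-shared-key"


def tag_message(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag that binds the message to a key holder."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_origin(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check the tag in constant time; True means the sender knew the key."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


message = b"warning: link down"
tag = tag_message(SHARED_KEY, message)

genuine = verify_origin(SHARED_KEY, message, tag)        # tag made with the shared key
forged = verify_origin(b"attacker-guess", message, tag)  # verifier key does not match
duplicate = verify_origin(SHARED_KEY, message, tag)      # a resent copy still verifies
```

Detecting the duplicate would require additional state, such as a sequence number, which is the concern of the connection-oriented services.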

Access Control

In the context of network security, access control is the ability to limit and control

the access to host systems and applications via communications links. To achieve

this, each entity trying to gain access must first be identified, or authenticated,

so that access rights can be tailored to the individual.

Data Confidentiality

Confidentiality is the protection of transmitted data from passive attacks. With re-

spect to the content of a data transmission, several levels of protection can be iden-

tified. The broadest service protects all user data transmitted between two users

over a period of time. For example, when a TCP connection is set up between two

systems, this broad protection prevents the release of any user data transmitted over

the TCP connection. Narrower forms of this service can also be defined, including

the protection of a single message or even specific fields within a message. These

refinements are less useful than the broad approach and may even be more complex

and expensive to implement.

The other aspect of confidentiality is the protection of traffic flow from

analysis. This requires that an attacker not be able to observe the source and desti-

nation, frequency, length, or other characteristics of the traffic on a communications

facility.

Data Integrity

As with confidentiality, integrity can apply to a stream of messages, a single mes-

sage, or selected fields within a message. Again, the most useful and straightforward

approach is total stream protection.

A connection-oriented integrity service, one that deals with a stream of mes-

sages, assures that messages are received as sent with no duplication, insertion,

modification, reordering, or replays. The destruction of data is also covered under

this service. Thus, the connection-oriented integrity service addresses both mes-

sage stream modification and denial of service. On the other hand, a connection-

less integrity service, one that deals with individual messages without regard to any

larger context, generally provides protection against message modification only.
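The difference between the two services can be sketched in a few lines. In the hypothetical receiver below, each message carries a sequence number that is covered by a message authentication code, so a replayed or reordered message is rejected even though its code is valid. The key, the class name ConnectionReceiver, and the message format are assumptions made for illustration; real protocols differ in detail.

```python
import hashlib
import hmac
from typing import Tuple

KEY = b"example-integrity-key"  # hypothetical shared key


def protect(seq: int, payload: bytes) -> Tuple[int, bytes, bytes]:
    """Sender side: the MAC covers the sequence number and the payload."""
    mac = hmac.new(KEY, seq.to_bytes(8, "big") + payload, hashlib.sha256).digest()
    return seq, payload, mac


class ConnectionReceiver:
    """Receiver side: accepts messages only in order and only if the MAC is valid."""

    def __init__(self) -> None:
        self.expected_seq = 0

    def accept(self, seq: int, payload: bytes, mac: bytes) -> bool:
        good = hmac.new(KEY, seq.to_bytes(8, "big") + payload, hashlib.sha256).digest()
        if not hmac.compare_digest(good, mac):
            return False  # modification detected
        if seq != self.expected_seq:
            return False  # duplication, replay, reordering, or loss detected
        self.expected_seq += 1
        return True


receiver = ConnectionReceiver()
first = protect(0, b"debit 10")
second = protect(1, b"credit 10")

results = [
    receiver.accept(*first),                      # in order
    receiver.accept(*first),                      # replayed copy
    receiver.accept(1, b"credit 99", second[2]),  # modified payload
    receiver.accept(*second),                     # next in order
]
```

A connectionless check would perform only the MAC comparison, which is why it generally protects against modification alone.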

We can make a distinction between service with and without recovery. Because

the integrity service relates to active attacks, we are concerned with detection rather

than prevention. If a violation of integrity is detected, then the service may simply

report this violation, and some other portion of software or human intervention is

required to recover from the violation. Alternatively, there are mechanisms avail-

able to recover from the loss of integrity of data, as we will review subsequently. The

incorporation of automated recovery mechanisms is, in general, the more attractive

alternative.

Nonrepudiation

Nonrepudiation prevents either sender or receiver from denying a transmitted mes-

sage. Thus, when a message is sent, the receiver can prove that the alleged sender in

fact sent the message. Similarly, when a message is received, the sender can prove

that the alleged receiver in fact received the message.

Availability Service

Both X.800 and RFC 4949 define availability to be the property of a system or a

system resource being accessible and usable upon demand by an authorized system

entity, according to performance specifications for the system (i.e., a system is avail-

able if it provides services according to the system design whenever users request

them). A variety of attacks can result in the loss of or reduction in availability. Some

of these attacks are amenable to automated countermeasures, such as authentica-

tion and encryption, whereas others require some sort of physical action to prevent

or recover from loss of availability of elements of a distributed system.

X.800 treats availability as a property to be associated with various security

services. However, it makes sense to call out specifically an availability service. An

availability service is one that protects a system to ensure its availability. This ser-

vice addresses the security concerns raised by denial-of-service attacks. It depends

on proper management and control of system resources and thus depends on access

control service and other security services.

1.5 SECURITY MECHANISMS

Table 1.3 lists the security mechanisms defined in X.800. The mechanisms are

divided into those that are implemented in a specific protocol layer, such as TCP or

an application-layer protocol, and those that are not specific to any particular pro-

tocol layer or security service. These mechanisms will be covered in the appropriate

SPECIFIC SECURITY MECHANISMS

May be incorporated into the appropriate protocol

layer in order to provide some of the OSI security

services.

Encipherment

The use of mathematical algorithms to transform

data into a form that is not readily intelligible. The

transformation and subsequent recovery of the data

depend on an algorithm and zero or more encryption

keys.

Digital Signature

Data appended to, or a cryptographic transformation

of, a data unit that allows a recipient of the data unit

to prove the source and integrity of the data unit and

protect against forgery (e.g., by the recipient).

Access Control

A variety of mechanisms that enforce access rights to

resources.

Data Integrity

A variety of mechanisms used to assure the integrity

of a data unit or stream of data units.

PERVASIVE SECURITY MECHANISMS

Mechanisms that are not specific to any particular

OSI security service or protocol layer.

Trusted Functionality

That which is perceived to be correct with respect

to some criteria (e.g., as established by a security

policy).

Security Label

The marking bound to a resource (which may be a

data unit) that names or designates the security attri-

butes of that resource.

Event Detection

Detection of security-relevant events.

Security Audit Trail

Data collected and potentially used to facilitate a

security audit, which is an independent review and

examination of system records and activities.

Security Recovery

Deals with requests from mechanisms, such as event

handling and management functions, and takes

recovery actions.

Table 1.3 Security Mechanisms (X.800)

places in the book. So we do not elaborate now, except to comment on the defini-

tion of encipherment. X.800 distinguishes between reversible encipherment mech-

anisms and irreversible encipherment mechanisms. A reversible encipherment

mechanism is simply an encryption algorithm that allows data to be encrypted and

subsequently decrypted. Irreversible encipherment mechanisms include hash algo-

rithms and message authentication codes, which are used in digital signature and

message authentication applications.
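The distinction can be illustrated with a short sketch. The reversible transformation below is a toy repeating-key XOR, chosen only because it needs no third-party library; it is not a secure cipher and is not taken from X.800. The irreversible transformations use the standard hashlib and hmac modules. The key and plaintext are hypothetical.

```python
import hashlib
import hmac
from itertools import cycle


def toy_encipher(key: bytes, data: bytes) -> bytes:
    """Toy reversible transformation (repeating-key XOR); NOT a secure cipher."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


key = b"demo-key"            # hypothetical key
plaintext = b"meet at noon"

# Reversible: applying the same keyed transformation again recovers the data.
ciphertext = toy_encipher(key, plaintext)
recovered = toy_encipher(key, ciphertext)

# Irreversible: a hash and a MAC yield fixed-length values with no decryption step.
digest = hashlib.sha256(plaintext).hexdigest()
mac = hmac.new(key, plaintext, hashlib.sha256).hexdigest()
```

Note that the hash uses no key at all, which matches the phrase "zero or more encryption keys" in the definition of encipherment.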

Table 1.4, based on one in X.800, indicates the relationship between security

services and security mechanisms.

SPECIFIC SECURITY MECHANISMS (Table 1.3, continued)

Authentication Exchange

A mechanism intended to ensure the identity of an

entity by means of information exchange.

Traffic Padding

The insertion of bits into gaps in a data stream to

frustrate traffic analysis attempts.

Routing Control

Enables selection of particular physically secure

routes for certain data and allows routing changes,

especially when a breach of security is suspected.

Notarization

The use of a trusted third party to assure certain

properties of a data exchange.

SERVICE: MECHANISMS (entries marked Y)

Peer entity authentication: Encipherment, Digital signature, Authentication exchange

Data origin authentication: Encipherment, Digital signature

Access control: Access control

Confidentiality: Encipherment, Routing control

Traffic flow confidentiality: Encipherment, Traffic padding, Routing control

Data integrity: Encipherment, Digital signature, Data integrity

Nonrepudiation: Digital signature, Data integrity, Notarization

Availability: Data integrity, Authentication exchange

Table 1.4 Relationship Between Security Services and Mechanisms

1.6 FUNDAMENTAL SECURITY DESIGN PRINCIPLES

Despite years of research and development, it has not been possible to develop

security design and implementation techniques that systematically exclude security

flaws and prevent all unauthorized actions. In the absence of such foolproof tech-

niques, it is useful to have a set of widely agreed design principles that can guide

the development of protection mechanisms. The National Centers of Academic

Excellence in Information Assurance/Cyber Defense, which is jointly sponsored by

the U.S. National Security Agency and the U.S. Department of Homeland Security,

list the following as fundamental security design principles [NCAE13]:

■ Economy of mechanism

■ Fail-safe defaults

■ Complete mediation

■ Open design

■ Separation of privilege

■ Least privilege

■ Least common mechanism

■ Psychological acceptability

■ Isolation

■ Encapsulation

■ Modularity

■ Layering

■ Least astonishment

The first eight listed principles were first proposed in [SALT75] and have withstood

the test of time. In this section, we briefly discuss each principle.

Economy of mechanism means that the design of security measures embod-

ied in both hardware and software should be as simple and small as possible.

The motivation for this principle is that a relatively simple, small design is easier

to test and verify thoroughly. With a complex design, there are many more

opportunities for an adversary to discover subtle weaknesses to exploit that may

be difficult to spot ahead of time. The more complex the mechanism, the more

likely it is to possess exploitable flaws. Simple mechanisms tend to have fewer

exploitable flaws and require less maintenance. Further, because configuration

management issues are simplified, updating or replacing a simple mechanism

becomes a less intensive process. In practice, this is perhaps the most difficult

principle to honor. There is a constant demand for new features in both hard-

ware and software, complicating the security design task. The best that can be

done is to keep this principle in mind during system design to try to eliminate

unnecessary complexity.

Fail-safe defaults means that access decisions should be based on permission

rather than exclusion. That is, the default situation is lack of access, and the protec-

tion scheme identifies conditions under which access is permitted. This approach

exhibits a better failure mode than the alternative approach, where the default is

to permit access. A design or implementation mistake in a mechanism that gives

explicit permission tends to fail by refusing permission, a safe situation that can

be quickly detected. On the other hand, a design or implementation mistake in a

mechanism that explicitly excludes access tends to fail by allowing access, a failure

that may long go unnoticed in normal use. Most file access systems and virtually all

protected services on client/server systems use fail-safe defaults.
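A minimal sketch may make the two failure modes concrete. The permission table, the exclusion list, and the user names below are hypothetical. The mistake being modeled is an administrator forgetting to add an entry for a user named carol.

```python
# Hypothetical permission table: only listed (user, resource) pairs are allowed.
ALLOWED = {
    ("alice", "payroll.db"): {"read"},
    ("bob", "payroll.db"): {"read", "write"},
}


def is_permitted(user: str, resource: str, action: str) -> bool:
    """Default-deny check: anything not explicitly granted is refused."""
    return action in ALLOWED.get((user, resource), set())


# Hypothetical exclusion list for the contrasting default-permit design.
DENIED = {("mallory", "payroll.db")}


def is_permitted_default_permit(user: str, resource: str, action: str) -> bool:
    """Default-permit check: anything not explicitly excluded is allowed."""
    return (user, resource) not in DENIED


# The omitted entry for "carol" shows the two failure modes.
safe_failure = is_permitted("carol", "payroll.db", "read")                   # refused
unsafe_failure = is_permitted_default_permit("carol", "payroll.db", "read")  # allowed
```

In the first design the omission is noticed as soon as carol is refused access; in the second it silently grants access.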

Complete mediation means that every access must be checked against the

access control mechanism. Systems should not rely on access decisions retrieved

from a cache. In a system designed to operate continuously, this principle requires

that, if access decisions are remembered for future use, careful consideration be

given to how changes in authority are propagated into such local memories. File

access systems appear to provide an example of a system that complies with this

principle. However, typically, once a user has opened a file, no check is made to see

if permissions change. To fully implement complete mediation, every time a user

reads a field or record in a file, or a data item in a database, the system must exercise

access control. This resource-intensive approach is rarely used.
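The gap between checking at open time and checking on every access can be sketched as follows. The classes FileStore, CheckedAtOpenHandle, and FullyMediatedHandle are hypothetical and stand in for a real file system only loosely.

```python
class FileStore:
    """Toy store in which permissions can change while a file is open."""

    def __init__(self) -> None:
        self.contents = {"report.txt": ["line 1", "line 2"]}
        self.readers = {"report.txt": {"alice"}}

    def may_read(self, user: str, name: str) -> bool:
        return user in self.readers.get(name, set())


class CheckedAtOpenHandle:
    """Common practice: access is checked once, when the file is opened."""

    def __init__(self, store: FileStore, user: str, name: str) -> None:
        if not store.may_read(user, name):
            raise PermissionError(name)
        self.store, self.name = store, name

    def read_record(self, index: int) -> str:
        return self.store.contents[self.name][index]


class FullyMediatedHandle(CheckedAtOpenHandle):
    """Complete mediation: the check is repeated on every read."""

    def __init__(self, store: FileStore, user: str, name: str) -> None:
        super().__init__(store, user, name)
        self.user = user

    def read_record(self, index: int) -> str:
        if not self.store.may_read(self.user, self.name):
            raise PermissionError(self.name)
        return super().read_record(index)


store = FileStore()
loose = CheckedAtOpenHandle(store, "alice", "report.txt")
strict = FullyMediatedHandle(store, "alice", "report.txt")

store.readers["report.txt"].discard("alice")  # authority is revoked mid-session

loose_result = loose.read_record(0)           # still succeeds after revocation
try:
    strict.read_record(0)
    strict_blocked = False
except PermissionError:
    strict_blocked = True
```

The extra check on every read is the cost that makes full mediation rare in practice.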

Open design means that the design of a security mechanism should be open

rather than secret. For example, although encryption keys must be secret, encryption

algorithms should be open to public scrutiny. The algorithms can then be reviewed

by many experts, and users can therefore have high confidence in them. This is the

philosophy behind the National Institute of Standards and Technology (NIST)

program of standardizing encryption and hash algorithms, and has led to the wide-

spread adoption of NIST-approved algorithms.

Separation of privilege is defined in [SALT75] as a practice in which mul-

tiple privilege attributes are required to achieve access to a restricted resource.

A good example of this is multifactor user authentication, which requires the use of

multiple techniques, such as a password and a smart card, to authorize a user. The

term is also now applied to any technique in which a program is divided into parts

that are limited to the specific privileges they require in order to perform a specific

task. This is used to mitigate the potential damage of a computer security attack.

One example of this latter interpretation of the principle is removing high privilege

operations to another process and running that process with the higher privileges

required to perform its tasks. Day-to-day interfaces are executed in a lower privi-

leged process.

Least privilege means that every process and every user of the system should

operate using the least set of privileges necessary to perform the task. A good

example of the use of this principle is role-based access control. The system security

policy can identify and define the various roles of users or processes. Each role is

assigned only those permissions needed to perform its functions. Each permission

specifies a permitted access to a particular resource (such as read and write access

to a specified file or directory, connect access to a given host and port). Unless a

permission is granted explicitly, the user or process should not be able to access the

protected resource. More generally, any access control system should allow each

user only the privileges that are authorized for that user. There is also a temporal

aspect to the least privilege principle. For example, system programs or administra-

tors who have special privileges should have those privileges only when necessary;

when they are doing ordinary activities the privileges should be withdrawn. Leaving

them in place just opens the door to accidents.
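The role-based approach and the temporal aspect can both be sketched briefly. The roles, resources, and the Session class below are hypothetical; the elevated method is one possible way, assumed for illustration, to hold a privilege only for the duration of a task.

```python
from contextlib import contextmanager

# Hypothetical roles, each holding only the permissions its function needs.
ROLE_PERMISSIONS = {
    "auditor": {("read", "/var/log/app.log")},
    "operator": {("read", "/var/log/app.log"), ("connect", "db.example:5432")},
}


class Session:
    """A user session whose active roles determine what it may do."""

    def __init__(self, roles: set) -> None:
        self.roles = set(roles)

    def can(self, action: str, resource: str) -> bool:
        """Allow only permissions explicitly granted to an active role."""
        return any((action, resource) in ROLE_PERMISSIONS.get(role, set())
                   for role in self.roles)

    @contextmanager
    def elevated(self, role: str):
        """Temporal aspect: hold an extra role only for the duration of a task."""
        already_held = role in self.roles
        self.roles.add(role)
        try:
            yield self
        finally:
            if not already_held:
                self.roles.discard(role)  # withdraw the privilege afterward


session = Session({"auditor"})
before = session.can("connect", "db.example:5432")
with session.elevated("operator"):
    during = session.can("connect", "db.example:5432")
after = session.can("connect", "db.example:5432")
```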

Least common mechanism means that the design should minimize the func-

tions shared by different users, providing mutual security. This principle helps

reduce the number of unintended communication paths and reduces the amount of

hardware and software on which all users depend, thus making it easier to verify if

there are any undesirable security implications.

Psychological acceptability implies that the security mechanisms should not

interfere unduly with the work of users, while at the same time meeting the needs of

those who authorize access. If security mechanisms hinder the usability or accessibil-

ity of resources, then users may opt to turn off those mechanisms. Where possible,

security mechanisms should be transparent to the users of the system or at most

introduce minimal obstruction. In addition to not being intrusive or burdensome,

security procedures must reflect the user’s mental model of protection. If the protec-

tion procedures do not make sense to the user or if the user must translate his image

of protection into a substantially different protocol, the user is likely to make errors.

Isolation is a principle that applies in three contexts. First, public access sys-

tems should be isolated from critical resources (data, processes, etc.) to prevent dis-

closure or tampering. In cases where the sensitivity or criticality of the information

is high, organizations may want to limit the number of systems on which that data is

stored and isolate them, either physically or logically. Physical isolation may include

ensuring that no physical connection exists between an organization’s public access

information resources and an organization’s critical information. When implement-

ing logical isolation solutions, layers of security services and mechanisms should be

established between public systems and secure systems responsible for protecting

critical resources. Second, the processes and files of individual users should be iso-

lated from one another except where it is explicitly desired. All modern operating

systems provide facilities for such isolation, so that individual users have separate,

isolated process space, memory space, and file space, with protections for prevent-

ing unauthorized access. And finally, security mechanisms should be isolated in the

sense of preventing access to those mechanisms. For example, logical access control

may provide a means of isolating cryptographic software from other parts of the

host system and for protecting cryptographic software from tampering and the keys

from replacement or disclosure.

Encapsulation can be viewed as a specific form of isolation based on object-

oriented functionality. Protection is provided by encapsulating a collection of pro-

cedures and data objects in a domain of its own so that the internal structure of a

data object is accessible only to the procedures of the protected subsystem, and the

procedures may be called only at designated domain entry points.

Modularity in the context of security refers both to the development of security

functions as separate, protected modules and to the use of a modular architecture for

mechanism design and implementation. With respect to the use of separate security

modules, the design goal here is to provide common security functions and services,

such as cryptographic functions, as common modules. For example, numerous proto-

cols and applications make use of cryptographic functions. Rather than implement-

ing such functions in each protocol or application, a more secure design is provided

by developing a common cryptographic module that can be invoked by numerous

protocols and applications. The design and implementation effort can then focus on

the secure design and implementation of a single cryptographic module, including

mechanisms to protect the module from tampering. With respect to the use of a

modular architecture, each security mechanism should be able to support migration

to new technology or upgrade of new features without requiring an entire system

redesign. The security design should be modular so that individual parts of the secu-

rity design can be upgraded without the requirement to modify the entire system.

Layering refers to the use of multiple, overlapping protection approaches

addressing the people, technology, and operational aspects of information systems.

By using multiple, overlapping protection approaches, the failure or circumven-

tion of any individual protection approach will not leave the system unprotected.

We will see throughout this book that a layering approach is often used to provide

multiple barriers between an adversary and protected information or services. This

technique is often referred to as defense in depth.

Least astonishment means that a program or user interface should always

respond in the way that is least likely to astonish the user. For example, the mechanism

for authorization should be transparent enough to a user that the user has a good intui-

tive understanding of how the security goals map to the provided security mechanism.

1.7 ATTACK SURFACES AND ATTACK TREES

In Section 1.3, we provided an overview of the spectrum of security threats and

attacks facing computer and network systems. Section 22.1 goes into more detail

about the nature of attacks and the types of adversaries that present security threats.

In this section, we elaborate on two concepts that are useful in evaluating and clas-

sifying threats: attack surfaces and attack trees.

Attack Surfaces

An attack surface consists of the reachable and exploitable vulnerabilities in a sys-

tem [MANA11, HOWA03]. Examples of attack surfaces are the following:

■ Open ports on outward facing Web and other servers, and code listening on

those ports

■ Services available on the inside of a firewall

■ Code that processes incoming data, email, XML, office documents, and indus-

try-specific custom data exchange formats

■ Interfaces, SQL, and Web forms

■ An employee with access to sensitive information vulnerable to a social

engineering attack

Attack surfaces can be categorized as follows:

■ Network attack surface: This category refers to vulnerabilities over an enterprise

network, wide-area network, or the Internet. Included in this category are net-

work protocol vulnerabilities, such as those used for a denial-of-service attack,

disruption of communications links, and various forms of intruder attacks.

■ Software attack surface: This refers to vulnerabilities in application, utility,

or operating system code. A particular focus in this category is Web server

software.

■ Human attack surface: This category refers to vulnerabilities created by

personnel or outsiders, such as social engineering, human error, and trusted

insiders.

An attack surface analysis is a useful technique for assessing the scale and

severity of threats to a system. A systematic analysis of points of vulnerability

makes developers and security analysts aware of where security mechanisms are

required. Once an attack surface is defined, designers may be able to find ways to

make the surface smaller, thus making the task of the adversary more difficult. The

attack surface also provides guidance on setting priorities for testing, strengthening

security measures, and modifying the service or application.

As illustrated in Figure 1.3, the use of layering, or defense in depth, and attack

surface reduction complement each other in mitigating security risk.

Attack Trees

An attack tree is a branching, hierarchical data structure that represents a set of poten-

tial techniques for exploiting security vulnerabilities [MAUW05, MOOR01, SCHN99].

The security incident that is the goal of the attack is represented as the root node of

the tree, and the ways that an attacker could reach that goal are iteratively and incre-

mentally represented as branches and subnodes of the tree. Each subnode defines a

subgoal, and each subgoal may have its own set of further subgoals, and so on. The

final nodes on the paths outward from the root, that is, the leaf nodes, represent differ-

ent ways to initiate an attack. Each node other than a leaf is either an AND-node or an

OR-node. To achieve the goal represented by an AND-node, the subgoals represented

by all of that node’s subnodes must be achieved; and for an OR-node, at least one of

the subgoals must be achieved. Branches can be labeled with values representing dif-

ficulty, cost, or other attack attributes, so that alternative attacks can be compared.
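To show how such labels allow alternative attacks to be compared, the following sketch evaluates the cheapest way to reach the root goal. As a simplification, costs are attached to leaf nodes rather than to branches. The tree, its labels, and its cost values are invented for illustration and do not come from the cited references.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """An attack-tree node; leaves carry a cost, inner nodes are AND or OR."""
    label: str
    kind: str = "LEAF"            # "LEAF", "AND", or "OR"
    cost: float = 0.0             # used only for leaves (hypothetical units)
    children: List["Node"] = field(default_factory=list)


def cheapest_attack(node: Node) -> float:
    """Return the minimum total cost of achieving the goal at this node."""
    if node.kind == "LEAF":
        return node.cost
    child_costs = [cheapest_attack(child) for child in node.children]
    if node.kind == "AND":
        return sum(child_costs)   # every subgoal must be achieved
    if node.kind == "OR":
        return min(child_costs)   # any one subgoal suffices
    raise ValueError(f"unknown node kind: {node.kind}")


# Hypothetical tree and costs, invented for illustration only.
tree = Node("Read protected file", "OR", children=[
    Node("Steal credentials", "AND", children=[
        Node("Obtain username", cost=1.0),
        Node("Guess password", cost=8.0),
    ]),
    Node("Exploit unpatched service", cost=6.0),
])

best = cheapest_attack(tree)
```

Here the AND-node costs 9.0 in total, so the single leaf costing 6.0 is the cheaper path to the root. An analyst could use the same traversal with difficulty or probability values in place of cost.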

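Figure 1.3 Defense in Depth and Attack Surface

[Figure: 2 x 2 grid with layering (shallow to deep) on the vertical axis and attack surface (small to large) on the horizontal axis. Deep layering with a small attack surface: low security risk. Deep layering with a large attack surface, or shallow layering with a small attack surface: medium security risk. Shallow layering with a large attack surface: high security risk.]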

The motivation for the use of attack trees is to effectively exploit the infor-

mation available on attack patterns. Organizations such as CERT publish security

advisories that have enabled the development of a body of knowledge about both

general attack strategies and specific attack patterns. Security analysts can use the

attack tree to document security attacks in a structured form that reveals key vul-

nerabilities. The attack tree can guide both the design of systems and applications,

and the choice and strength of countermeasures.

Figure 1.4, based on a figure in [DIMI07], is an example of an attack tree

analysis for an Internet banking authentication application. The root of the tree is

the objective of the attacker, which is to compromise a user’s account. The shaded

boxes on the tree are the leaf nodes, which represent events that comprise the

attacks. Note that in this tree, all the nodes other than leaf nodes are OR-nodes.

The analysis to generate this tree considered the three components involved in

authentication:

Figure 1.4 An Attack Tree for Internet Banking Authentication

[Figure: attack tree. The node labels follow; the branch structure is not reproduced.]

Bank account compromise (root)

User credential compromise

User credential guessing

UT/U1a User surveillance

UT/U1b Theft of token and handwritten notes

Malicious software installation

Vulnerability exploit

UT/U2a Hidden code

UT/U2b Worms

UT/U2c Emails with malicious code

UT/U3a Smartcard analyzers

UT/U3b Smartcard reader manipulator

UT/U3c Brute force attacks with PIN calculators

CC2 Sniffing

UT/U4a Social engineering

UT/U4b Web page obfuscation

IBS3 Web site manipulation

CC1 Pharming

Redirection of communication toward fraudulent site

CC3 Active man-in-the-middle attacks

IBS1 Brute force attacks

User communication with attacker

Injection of commands

Use of known authenticated session by attacker

Normal user authentication with specified session ID

CC4 Pre-defined session IDs (session hijacking)

IBS2 Security policy violation

■ User terminal and user (UT/U): These attacks target the user equipment,

including the tokens that may be involved, such as smartcards or other pass-

word generators, as well as the actions of the user.

■ Communications channel (CC): This type of attack focuses on communica-

tion links.

■ Internet banking server (IBS): These types of attacks are offline attacks against

the servers that host the Internet banking application.

Five overall attack strategies can be identified, each of which exploits one or

more of the three components. The five strategies are as follows:

■ User credential compromise: This strategy can be used against many ele-

ments of the attack surface. There are procedural attacks, such as monitoring

a user’s action to observe a PIN or other credential, or theft of the user’s

token or handwritten notes. An adversary may also compromise token

information using a variety of token attack tools, such as hacking the smart-

card or using a brute force approach to guess the PIN. Another possible

strategy is to embed malicious software to compromise the user’s login and

password. An adversary may also attempt to obtain credential information

via the communication channel (sniffing). Finally, an adversary may use

various means to engage in communication with the target user, as shown

in Figure 1.4.

■ Injection of commands: In this type of attack, the attacker is able to intercept

communication between the UT and the IBS. Various schemes can be used

to be able to impersonate the valid user and so gain access to the banking

system.

■ User credential guessing: It is reported in [HILT06] that brute force attacks

against some banking authentication schemes are feasible by sending ran-

dom usernames and passwords. The attack mechanism is based on distributed

zombie personal computers, hosting automated programs for username- or

password-based calculation.

■ Security policy violation: For example, violating the bank’s security policy

in combination with weak access control and logging mechanisms, an em-

ployee may cause an internal security incident and expose a customer’s

account.

■ Use of known authenticated session: This type of attack persuades or forces

the user to connect to the IBS with a preset session ID. Once the user authen-

ticates to the server, the attacker may utilize the known session ID to send

packets to the IBS, spoofing the user’s identity.

Figure 1.4 provides a thorough view of the different types of attacks on an

Internet banking authentication application. Using this tree as a starting point, security

analysts can assess the risk of each attack and, using the design principles outlined

in the preceding section, design a comprehensive security facility. [DIMI07]

provides a good account of the results of this design effort.


1.8 A MODEL FOR NETWORK SECURITY

A model for much of what we will be discussing is captured, in very general terms, in

Figure 1.5. A message is to be transferred from one party to another across some sort

of Internet service. The two parties, who are the principals in this transaction, must

cooperate for the exchange to take place. A logical information channel is established

by defining a route through the Internet from source to destination and by the coop-

erative use of communication protocols (e.g., TCP/IP) by the two principals.

Security aspects come into play when it is necessary or desirable to protect the

information transmission from an opponent who may present a threat to confidentiality,

authenticity, and so on. All the techniques for providing security have two components:

■ A security-related transformation on the information to be sent. Examples

include the encryption of the message, which scrambles the message so that it

is unreadable by the opponent, and the addition of a code based on the con-

tents of the message, which can be used to verify the identity of the sender.

■ Some secret information shared by the two principals and, it is hoped,

unknown to the opponent. An example is an encryption key used in conjunc-

tion with the transformation to scramble the message before transmission

and unscramble it on reception.6

A trusted third party may be needed to achieve secure transmission. For
example, a third party may be responsible for distributing the secret information
to the two principals while keeping it from any opponent. Or a third party may be
needed to arbitrate disputes between the two principals concerning the authenticity
of a message transmission.

6 Part Three discusses a form of encryption, known as asymmetric encryption, in which only one of the two
principals needs to have the secret information.

Figure 1.5 Model for Network Security
[Figure: the sender and the recipient each apply a security-related transformation, using secret information, to turn a message into a secure message and back. The secure message crosses an information channel that an opponent can reach. A trusted third party (e.g., arbiter, distributor of secret information) may serve both principals.]

This general model shows that there are four basic tasks in designing a par-

ticular security service:

1. Design an algorithm for performing the security-related transformation. The

algorithm should be such that an opponent cannot defeat its purpose.

2. Generate the secret information to be used with the algorithm.

3. Develop methods for the distribution and sharing of the secret information.

4. Specify a protocol to be used by the two principals that makes use of the

security algorithm and the secret information to achieve a particular security

service.
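The four tasks can be made concrete with a small sketch. The example below is illustrative only: it assumes HMAC-SHA256 from the Python standard library as the security-related transformation (a choice made here, not one named in the text), and the function names `protect` and `verify` are hypothetical. It covers the "code based on the contents of the message" case, so it provides authenticity, not confidentiality.

```python
import hashlib
import hmac
import secrets

TAG_LENGTH = 32  # size in bytes of an HMAC-SHA256 code

# Task 2: generate the secret information (a random 256-bit key).
# Task 3 (distribution) is out of scope here; both principals are
# simply assumed to hold the same key already.
shared_key = secrets.token_bytes(32)


def protect(message: bytes, key: bytes) -> bytes:
    """Sender side: append a code computed from the message and the key."""
    tag = hmac.new(key, message, hashlib.sha256).digest()
    return message + tag


def verify(secure_message: bytes, key: bytes) -> bytes:
    """Recipient side: recompute the code and reject the message on mismatch."""
    message, tag = secure_message[:-TAG_LENGTH], secure_message[-TAG_LENGTH:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message failed authentication")
    return message
```

Task 1 corresponds to the choice of HMAC-SHA256, and task 4 to the agreed convention that the code is appended to the message and checked before the message is accepted.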

Parts One through Five of this book concentrate on the types of security

mechanisms and services that fit into the model shown in Figure 1.5. However,

there are other security-related situations of interest that do not neatly fit this

model but are considered in this book. A general model of these other situations

is illustrated in Figure 1.6, which reflects a concern for protecting an information

system from unwanted access. Most readers are familiar with the concerns caused

by the existence of hackers, who attempt to penetrate systems that can be accessed

over a network. The hacker can be someone who, with no malign intent, simply gets

satisfaction from breaking and entering a computer system. The intruder can be a

disgruntled employee who wishes to do damage or a criminal who seeks to exploit

computer assets for financial gain (e.g., obtaining credit card numbers or perform-

ing illegal money transfers).

Another type of unwanted access is the placement in a computer system of

logic that exploits vulnerabilities in the system and that can affect application pro-

grams as well as utility programs, such as editors and compilers. Programs can pres-

ent two kinds of threats:

■ Information access threats: Intercept or modify data on behalf of users who

should not have access to that data.

■ Service threats: Exploit service flaws in computers to inhibit use by legitimate

users.

Figure 1.6 Network Access Security Model
[Figure: an opponent, either human (e.g., hacker) or software (e.g., virus, worm), reaches the information system over an access channel and through a gatekeeper function. The information system comprises computing resources (processor, memory, I/O), data, processes, software, and internal security controls.]


Viruses and worms are two examples of software attacks. Such attacks can be

introduced into a system by means of a disk that contains the unwanted logic con-

cealed in otherwise useful software. They can also be inserted into a system across a

network; this latter mechanism is of more concern in network security.

The security mechanisms needed to cope with unwanted access fall into two

broad categories (see Figure 1.6). The first category might be termed a gatekeeper

function. It includes password-based login procedures that are designed to deny

access to all but authorized users and screening logic that is designed to detect and

reject worms, viruses, and other similar attacks. Once either an unwanted user

or unwanted software gains access, the second line of defense consists of a vari-

ety of internal controls that monitor activity and analyze stored information in an

attempt to detect the presence of unwanted intruders. These issues are explored

in Part Six.

1.9 STANDARDS

Many of the security techniques and applications described in this book have been

specified as standards. Additionally, standards have been developed to cover man-

agement practices and the overall architecture of security mechanisms and services.

Throughout this book, we describe the most important standards in use or that are

being developed for various aspects of cryptography and network security. Various

organizations have been involved in the development or promotion of these stan-

dards. The most important (in the current context) of these organizations are as

follows:

■ National Institute of Standards and Technology: NIST is a U.S. federal agency

that deals with measurement science, standards, and technology related to

U.S. government use and to the promotion of U.S. private-sector innovation.

Despite its national scope, NIST Federal Information Processing Standards

(FIPS) and Special Publications (SP) have a worldwide impact.

■ Internet Society: ISOC is a professional membership society with world-

wide organizational and individual membership. It provides leadership in

addressing issues that confront the future of the Internet and is the organiza-

tion home for the groups responsible for Internet infrastructure standards,

including the Internet Engineering Task Force (IETF) and the Internet

Architecture Board (IAB). These organizations develop Internet stan-

dards and related specifications, all of which are published as Requests for

Comments (RFCs).

■ ITU-T: The International Telecommunication Union (ITU) is an interna-

tional organization within the United Nations System in which governments

and the private sector coordinate global telecom networks and services. The

ITU Telecommunication Standardization Sector (ITU-T) is one of the three

sectors of the ITU. ITU-T’s mission is the development of technical standards

covering all fields of telecommunications. ITU-T standards are referred to as

Recommendations.


■ ISO: The International Organization for Standardization (ISO)7 is a world-

wide federation of national standards bodies from more than 140 countries,

one from each country. ISO is a nongovernmental organization that promotes

the development of standardization and related activities with a view to fa-

cilitating the international exchange of goods and services and to developing

cooperation in the spheres of intellectual, scientific, technological, and eco-

nomic activity. ISO’s work results in international agreements that are pub-

lished as International Standards.

A more detailed discussion of these organizations is contained in Appendix D.

7 ISO is not an acronym (in which case it would be IOS), but it is a word, derived from the Greek, meaning equal.

1.10 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

access control

active attack

authentication

authenticity

availability

data confidentiality

data integrity

denial of service

encryption

integrity

intruder

masquerade

nonrepudiation

OSI security architecture

passive attack

replay

security attacks

security mechanisms

security services

traffic analysis

Review Questions

1.1 What is the OSI security architecture?

1.2 List and briefly define the three key objectives of computer security.

1.3 List and briefly define categories of passive and active security attacks.

1.4 List and briefly define categories of security services.

1.5 List and briefly define categories of security mechanisms.

1.6 List and briefly define the fundamental security design principles.

1.7 Explain the difference between an attack surface and an attack tree.

Problems

1.1 Consider an automated cash deposit machine in which users provide a card or an ac-

count number to deposit cash. Give examples of confidentiality, integrity, and avail-

ability requirements associated with the system, and, in each case, indicate the degree

of importance of the requirement.

1.2 Repeat Problem 1.1 for a payment gateway system where a user pays for an item

using their account via the payment gateway.


1.3 Consider a financial report publishing system used to produce reports for various

organizations.

a. Give an example of a type of publication in which confidentiality of the stored

data is the most important requirement.

b. Give an example of a type of publication in which data integrity is the most im-

portant requirement.

c. Give an example in which system availability is the most important requirement.

1.4 For each of the following assets, assign a low, moderate, or high impact level for the

loss of confidentiality, availability, and integrity, respectively. Justify your answers.

a. A student maintaining a blog to post public information.

b. An examination section of a university that is managing sensitive information

about exam papers.

c. An information system in a pathological laboratory maintaining the patient’s data.

d. A student information system used for maintaining student data in a university

that contains both personal, academic information and routine administrative in-

formation (not privacy related). Assess the impact for the two data sets separately

and the information system as a whole.

e. A University library contains a library management system which controls the

distribution of books amongst the students of various departments. The library

management system contains both the student data and the book data. Assess the

impact for the two data sets separately and the information system as a whole.

1.5 Draw a matrix similar to Table 1.4 that shows the relationship between security ser-

vices and attacks.

1.6 Draw a matrix similar to Table 1.4 that shows the relationship between security

mechanisms and attacks.

1.7 Develop an attack tree for gaining access to the contents of a physical safe.

1.8 Consider a company whose operations are housed in two buildings on the same prop-

erty; one building is headquarters, the other building contains network and computer

services. The property is physically protected by a fence around the perimeter, and

the only entrance to the property is through this fenced perimeter. In addition to

the perimeter fence, physical security consists of a guarded front gate. The local net-

works are split between the Headquarters’ LAN and the Network Services’ LAN.

Internet users connect to the Web server through a firewall. Dial-up users get access

to a particular server on the Network Services’ LAN. Develop an attack tree in which

the root node represents disclosure of proprietary secrets. Include physical, social

engineering, and technical attacks. The tree may contain both AND and OR nodes.

Develop a tree that has at least 15 leaf nodes.

1.9 Read all of the classic papers cited in the Recommended Reading section for this

chapter, available at the Author Web site at WilliamStallings.com/Cryptography. The

papers are available at box.com/Crypto7e. Compose a 500–1000 word paper (or 8–12

slide PowerPoint presentation) that summarizes the key concepts that emerge from

these papers, emphasizing concepts that are common to most or all of the papers.

CHAPTER 2 INTRODUCTION TO NUMBER THEORY

2.1 Divisibility and The Division Algorithm
    Divisibility
    The Division Algorithm
2.2 The Euclidean Algorithm
    Greatest Common Divisor
    Finding the Greatest Common Divisor
2.3 Modular Arithmetic
    The Modulus
    Properties of Congruences
    Modular Arithmetic Operations
    Properties of Modular Arithmetic
    Euclidean Algorithm Revisited
    The Extended Euclidean Algorithm
2.4 Prime Numbers
2.5 Fermat’s and Euler’s Theorems
    Fermat’s Theorem
    Euler’s Totient Function
    Euler’s Theorem
2.6 Testing for Primality
    Miller–Rabin Algorithm
    A Deterministic Primality Algorithm
    Distribution of Primes
2.7 The Chinese Remainder Theorem
2.8 Discrete Logarithms
    The Powers of an Integer, Modulo n
    Logarithms for Modular Arithmetic
    Calculation of Discrete Logarithms
2.9 Key Terms, Review Questions, and Problems
Appendix 2A The Meaning of Mod


Number theory is pervasive in cryptographic algorithms. This chapter provides

sufficient breadth and depth of coverage of relevant number theory topics for under-

standing the wide range of applications in cryptography. The reader familiar with these

topics can safely skip this chapter.

The first three sections introduce basic concepts from number theory that are
needed for understanding finite fields; these include divisibility, the Euclidean
algorithm, and modular arithmetic. The reader may study these sections now or wait until
ready to tackle Chapter 5 on finite fields.

Sections 2.4 through 2.8 discuss aspects of number theory related to prime num-

bers and discrete logarithms. These topics are fundamental to the design of asymmetric

(public-key) cryptographic algorithms. The reader may study these sections now or

wait until ready to read Part Three.

The concepts and techniques of number theory are quite abstract, and it is often

difficult to grasp them intuitively without examples. Accordingly, this chapter includes

a number of examples, each of which is highlighted in a shaded box.

2.1 DIVISIBILITY AND THE DIVISION ALGORITHM

Divisibility

We say that a nonzero $b$ divides $a$ if $a = mb$ for some $m$, where $a$, $b$, and $m$ are
integers. That is, $b$ divides $a$ if there is no remainder on division. The notation $b \mid a$
is commonly used to mean $b$ divides $a$. Also, if $b \mid a$, we say that $b$ is a divisor of $a$.

LEARNING OBJECTIVES

After studying this chapter, you should be able to:

◆ Understand the concept of divisibility and the division algorithm.

◆ Understand how to use the Euclidean algorithm to find the greatest com-

mon divisor.

◆ Present an overview of the concepts of modular arithmetic.

◆ Explain the operation of the extended Euclidean algorithm.

◆ Discuss key concepts relating to prime numbers.

◆ Understand Fermat’s theorem.

◆ Understand Euler’s theorem.

◆ Define Euler’s totient function.

◆ Make a presentation on the topic of testing for primality.

◆ Explain the Chinese remainder theorem.

◆ Define discrete logarithms.


The positive divisors of 24 are 1, 2, 3, 4, 6, 8, 12, and 24.
$13 \mid 182$; $-5 \mid 30$; $17 \mid 289$; $-3 \mid 33$; $17 \mid 0$

Subsequently, we will need some simple properties of divisibility for integers,
which are as follows:

■ If $a \mid 1$, then $a = \pm 1$.

■ If $a \mid b$ and $b \mid a$, then $a = \pm b$.

■ Any $b \neq 0$ divides 0.

■ If $a \mid b$ and $b \mid c$, then $a \mid c$:

  $11 \mid 66$ and $66 \mid 198 \Rightarrow 11 \mid 198$

■ If $b \mid g$ and $b \mid h$, then $b \mid (mg + nh)$ for arbitrary integers $m$ and $n$.

To see this last point, note that

■ If $b \mid g$, then $g$ is of the form $g = b \times g_1$ for some integer $g_1$.

■ If $b \mid h$, then $h$ is of the form $h = b \times h_1$ for some integer $h_1$.

So

$mg + nh = mbg_1 + nbh_1 = b \times (mg_1 + nh_1)$

and therefore $b$ divides $mg + nh$.

$b = 7$; $g = 14$; $h = 63$; $m = 3$; $n = 2$
$7 \mid 14$ and $7 \mid 63$.
To show $7 \mid (3 \times 14 + 2 \times 63)$,
we have $(3 \times 14 + 2 \times 63) = 7(3 \times 2 + 2 \times 9)$,
and it is obvious that $7 \mid (7(3 \times 2 + 2 \times 9))$.
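The divisibility relation and the worked examples can be checked mechanically. The sketch below is illustrative; the helper names `divides` and `positive_divisors` are hypothetical and chosen only for this example.

```python
def divides(b: int, a: int) -> bool:
    """Return True if nonzero b divides a, i.e. a = m*b for some integer m."""
    if b == 0:
        raise ValueError("the divisor b must be nonzero")
    return a % b == 0


def positive_divisors(a: int) -> list[int]:
    """List the positive divisors of a nonzero integer a by trial division."""
    return [k for k in range(1, abs(a) + 1) if divides(k, a)]


print(positive_divisors(24))
print(divides(13, 182), divides(-5, 30), divides(17, 0))

# Last property: if b | g and b | h, then b | (m*g + n*h).
b, g, h, m, n = 7, 14, 63, 3, 2
print(divides(b, m * g + n * h))
```

Trial division is adequate for small illustrations like these; it is not meant as an efficient method for large integers.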

The Division Algorithm

Given any positive integer n and any nonnegative integer a, if we divide a by n,

we get an integer quotient q and an integer remainder r that obey the following

relationship:

$a = qn + r, \quad 0 \le r < n; \quad q = \lfloor a/n \rfloor$  (2.1)

where $\lfloor x \rfloor$ is the largest integer less than or equal to $x$. Equation (2.1) is referred to
as the division algorithm.1

1 Equation (2.1) expresses a theorem rather than an algorithm, but by tradition, this is referred to as the
division algorithm.


Figure 2.1a demonstrates that, given $a$ and positive $n$, it is always possible to
find $q$ and $r$ that satisfy the preceding relationship. Represent the integers on the
number line; $a$ will fall somewhere on that line (positive $a$ is shown; a similar
demonstration can be made for negative $a$). Starting at 0, proceed to $n$, $2n$, up to $qn$, such
that $qn \le a$ and $(q + 1)n > a$. The distance from $qn$ to $a$ is $r$, and we have found
the unique values of $q$ and $r$. The remainder $r$ is often referred to as a residue.

$a = 11$; $n = 7$; $11 = 1 \times 7 + 4$; $r = 4$, $q = 1$
$a = -11$; $n = 7$; $-11 = (-2) \times 7 + 3$; $r = 3$, $q = -2$
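As a quick check of Equation (2.1), Python's built-in `divmod` uses floor division, so for a positive modulus it returns exactly the quotient and remainder defined above, including for negative dividends. The wrapper name `division_algorithm` is a hypothetical label used only in this sketch.

```python
def division_algorithm(a: int, n: int) -> tuple[int, int]:
    """Return (q, r) with a = q*n + r and 0 <= r < n, for positive n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    q, r = divmod(a, n)  # q = floor(a / n); r takes the sign of n
    assert a == q * n + r and 0 <= r < n
    return q, r


print(division_algorithm(11, 7))
print(division_algorithm(-11, 7))
print(division_algorithm(70, 15))
```

Languages whose integer division truncates toward zero (C and Java, for example) give a remainder of -4 for -11 divided by 7, so the result there would need to be adjusted to match Equation (2.1).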

Figure 2.1b provides another example.

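Figure 2.1 The Relationship $a = qn + r$; $0 \le r < n$
[Figure: (a) General relationship: a number line marked 0, n, 2n, 3n, ..., qn, a, (q + 1)n, with r the distance from qn to a. (b) Example: 70 = (4 × 15) + 10, on a number line marked 0, 15, 30 = 2 × 15, 45 = 3 × 15, 60 = 4 × 15, 70, 75 = 5 × 15, with 10 the distance from 60 to 70.]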

2.2 THE EUCLIDEAN ALGORITHM

One of the basic techniques of number theory is the Euclidean algorithm, which

is a simple procedure for determining the greatest common divisor of two positive

integers. First, we need a simple definition: Two integers are relatively prime if and

only if their only common positive integer factor is 1.

Greatest Common Divisor

Recall that nonzero b is defined to be a divisor of a if a = mb for some m, where

a, b, and m are integers. We will use the notation gcd(a, b) to mean the greatest

common divisor of a and b. The greatest common divisor of a and b is the largest

integer that divides both a and b. We also define gcd(0, 0) = 0.


More formally, the positive integer c is said to be the greatest common divisor

of a and b if

1. c is a divisor of a and of b.

2. any divisor of a and b is a divisor of c.

An equivalent definition is the following:

$\gcd(a, b) = \max[k \text{, such that } k \mid a \text{ and } k \mid b]$

Because we require that the greatest common divisor be positive, $\gcd(a, b) =
\gcd(a, -b) = \gcd(-a, b) = \gcd(-a, -b)$. In general, $\gcd(a, b) = \gcd(|a|, |b|)$.

$\gcd(60, 24) = \gcd(60, -24) = 12$

8 and 15 are relatively prime because the positive divisors of 8 are 1, 2, 4, and 8, and
the positive divisors of 15 are 1, 3, 5, and 15. So 1 is the only integer on both lists.

Also, because all nonzero integers divide 0, we have $\gcd(a, 0) = |a|$.

We stated that two integers a and b are relatively prime if and only if their

only common positive integer factor is 1. This is equivalent to saying that a and b are

relatively prime if gcd(a, b) = 1.
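These conventions match the behavior of `math.gcd` in the Python standard library, which returns a nonnegative result and defines gcd(0, 0) as 0. The helper `relatively_prime` below is a hypothetical name used only for illustration.

```python
import math


def relatively_prime(a: int, b: int) -> bool:
    """Two integers are relatively prime when their gcd is 1."""
    return math.gcd(a, b) == 1


print(math.gcd(60, 24), math.gcd(60, -24))  # the sign of the inputs is ignored
print(math.gcd(-7, 0))                      # gcd(a, 0) = |a|
print(math.gcd(0, 0))                       # defined as 0
print(relatively_prime(8, 15))
```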

Finding the Greatest Common Divisor

We now describe an algorithm credited to Euclid for easily finding the greatest

common divisor of two integers (Figure 2.2). This algorithm has broad significance

in cryptography. The explanation of the algorithm can be broken down into the fol-

lowing points:

1. Suppose we wish to determine the greatest common divisor $d$ of the integers
$a$ and $b$; that is, determine $d = \gcd(a, b)$. Because $\gcd(|a|, |b|) = \gcd(a, b)$,
there is no harm in assuming $a \ge b > 0$.

2. Dividing $a$ by $b$ and applying the division algorithm, we can state:

$a = q_1 b + r_1, \quad 0 \le r_1 < b$  (2.2)

3. First consider the case in which $r_1 = 0$. Therefore $b$ divides $a$ and clearly no
larger number divides both $b$ and $a$, because that number would be larger
than $b$. So we have $d = \gcd(a, b) = b$.

4. The other possibility from Equation (2.2) is $r_1 \neq 0$. For this case, we can state
that $d \mid r_1$. This is due to the basic properties of divisibility: the relations $d \mid a$
and $d \mid b$ together imply that $d \mid (a - q_1 b)$, which is the same as $d \mid r_1$.

5. Before proceeding with the Euclidean algorithm, we need to answer the question:
What is the $\gcd(b, r_1)$? We know that $d \mid b$ and $d \mid r_1$. Now take any arbitrary
integer $c$ that divides both $b$ and $r_1$. Therefore, $c \mid (q_1 b + r_1) = a$. Because
$c$ divides both $a$ and $b$, we must have $c \le d$, where $d$ is the greatest common
divisor of $a$ and $b$. Therefore $d = \gcd(b, r_1)$.


Let us now return to Equation (2.2) and assume that $r_1 \neq 0$. Because $b > r_1$,
we can divide $b$ by $r_1$ and apply the division algorithm to obtain:

$b = q_2 r_1 + r_2, \quad 0 \le r_2 < r_1$

As before, if $r_2 = 0$, then $d = r_1$ and if $r_2 \neq 0$, then $d = \gcd(r_1, r_2)$. Note that the
remainders form a descending series of nonnegative values and so must terminate
when the remainder is zero. This happens, say, at the $(n + 1)$th stage where $r_{n-1}$ is
divided by $r_n$. The result is the following system of equations:

$$
\left.
\begin{array}{ll}
a = q_1 b + r_1 & 0 < r_1 < b \\
b = q_2 r_1 + r_2 & 0 < r_2 < r_1 \\
r_1 = q_3 r_2 + r_3 & 0 < r_3 < r_2 \\
\quad\vdots & \quad\vdots \\
r_{n-2} = q_n r_{n-1} + r_n & 0 < r_n < r_{n-1} \\
r_{n-1} = q_{n+1} r_n + 0 & \\
d = \gcd(a, b) = r_n &
\end{array}
\right\} \quad (2.3)
$$

At each iteration, we have $d = \gcd(r_i, r_{i+1})$ until finally $d = \gcd(r_n, 0) = r_n$.
Thus, we can find the greatest common divisor of two integers by repetitive
application of the division algorithm. This scheme is known as the Euclidean algorithm.
Figure 2.3 illustrates a simple example.
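The repeated application of the division algorithm can be written as a short loop. This is a minimal sketch; the function name `euclid_gcd` is hypothetical, and taking absolute values first relies on the identity gcd(a, b) = gcd(|a|, |b|) stated earlier.

```python
def euclid_gcd(a: int, b: int) -> int:
    """Greatest common divisor by the Euclidean algorithm."""
    a, b = abs(a), abs(b)
    while b != 0:
        # Replace the pair (a, b) with (b, a mod b); the gcd is unchanged.
        a, b = b, a % b
    return a  # the last nonzero remainder, or 0 when both inputs are 0


print(euclid_gcd(710, 310))
```

No explicit swap is needed when a < b: the first iteration then leaves a mod b equal to a, which exchanges the two values.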

We have essentially argued from the top down that the final result is the
$\gcd(a, b)$. We can also argue from the bottom up. The first step is to show that $r_n$
divides $a$ and $b$. It follows from the last division in Equation (2.3) that $r_n$ divides
$r_{n-1}$. The next to last division shows that $r_n$ divides $r_{n-2}$ because it divides both
terms on the right. Successively, one sees that $r_n$ divides all $r_i$'s and finally $a$ and $b$.
It remains to show that $r_n$ is the largest divisor that divides $a$ and $b$. If we take any
arbitrary integer $c$ that divides $a$ and $b$, it must also divide $r_1$, as explained previously.
We can follow the sequence of equations in Equation (2.3) down and show that $c$
must divide all $r_i$'s. Therefore $c$ must divide $r_n$, so that $r_n = \gcd(a, b)$.

Figure 2.2 Euclidean Algorithm
[Figure: flowchart. START; if a > b is false, swap a and b; divide a by b, calling the remainder r; while r > 0, replace a with b, replace b with r, and divide again; when r is 0, the GCD is the final value of b; END.]

Figure 2.3 Euclidean Algorithm Example: gcd(710, 310)
710 = 2 × 310 + 90
310 = 3 × 90 + 40
90 = 2 × 40 + 10
40 = 4 × 10
[Figure annotations mark each successive pair of numbers as having the same GCD.]

Let us now look at an example with relatively large numbers to see the power

of this algorithm:

To find $d = \gcd(a, b) = \gcd(1160718174, 316258250)$

$a = q_1 b + r_1$           1160718174 = 3 × 316258250 + 211943424    d = gcd(316258250, 211943424)
$b = q_2 r_1 + r_2$         316258250 = 1 × 211943424 + 104314826     d = gcd(211943424, 104314826)
$r_1 = q_3 r_2 + r_3$       211943424 = 2 × 104314826 + 3313772       d = gcd(104314826, 3313772)
$r_2 = q_4 r_3 + r_4$       104314826 = 31 × 3313772 + 1587894        d = gcd(3313772, 1587894)
$r_3 = q_5 r_4 + r_5$       3313772 = 2 × 1587894 + 137984            d = gcd(1587894, 137984)
$r_4 = q_6 r_5 + r_6$       1587894 = 11 × 137984 + 70070             d = gcd(137984, 70070)
$r_5 = q_7 r_6 + r_7$       137984 = 1 × 70070 + 67914                d = gcd(70070, 67914)
$r_6 = q_8 r_7 + r_8$       70070 = 1 × 67914 + 2156                  d = gcd(67914, 2156)
$r_7 = q_9 r_8 + r_9$       67914 = 31 × 2156 + 1078                  d = gcd(2156, 1078)
$r_8 = q_{10} r_9 + r_{10}$  2156 = 2 × 1078 + 0                       d = gcd(1078, 0) = 1078

Therefore, $d = \gcd(1160718174, 316258250) = 1078$

In this example, we begin by dividing 1160718174 by 316258250, which gives 3

with a remainder of 211943424. Next we take 316258250 and divide it by 211943424.

The process continues until we get a remainder of 0, yielding a result of 1078.

It will be helpful in what follows to recast the above computation in tabular
form. For every step of the iteration, we have $r_{i-2} = q_i r_{i-1} + r_i$, where $r_{i-2}$ is the
dividend, $r_{i-1}$ is the divisor, $q_i$ is the quotient, and $r_i$ is the remainder. Table 2.1
summarizes the results.

Table 2.1 Euclidean Algorithm Example

Dividend             Divisor              Quotient    Remainder
a  = 1160718174      b  = 316258250       q1  = 3     r1  = 211943424
b  = 316258250       r1 = 211943424       q2  = 1     r2  = 104314826
r1 = 211943424       r2 = 104314826       q3  = 2     r3  = 3313772
r2 = 104314826       r3 = 3313772         q4  = 31    r4  = 1587894
r3 = 3313772         r4 = 1587894         q5  = 2     r5  = 137984
r4 = 1587894         r5 = 137984          q6  = 11    r6  = 70070
r5 = 137984          r6 = 70070           q7  = 1     r7  = 67914
r6 = 70070           r7 = 67914           q8  = 1     r8  = 2156
r7 = 67914           r8 = 2156            q9  = 31    r9  = 1078
r8 = 2156            r9 = 1078            q10 = 2     r10 = 0
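The rows of Table 2.1 can be regenerated by recording each division step. The sketch below is illustrative, and the function name `euclid_trace` is hypothetical; it assumes a ≥ b > 0, as in the discussion above.

```python
def euclid_trace(a: int, b: int) -> list[tuple[int, int, int, int]]:
    """Return one (dividend, divisor, quotient, remainder) row per step."""
    rows = []
    while b != 0:
        q, r = divmod(a, b)
        rows.append((a, b, q, r))
        a, b = b, r  # the divisor becomes the next dividend
    return rows


for dividend, divisor, quotient, remainder in euclid_trace(1160718174, 316258250):
    print(f"{dividend:>12} {divisor:>12} {quotient:>4} {remainder:>12}")
```

The gcd is the divisor in the final row, the row whose remainder is 0.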


2.3 MODULAR ARITHMETIC

The Modulus

If a is an integer and n is a positive integer, we define a mod n to be the remainder

when a is divided by n. The integer n is called the modulus. Thus, for any integer a,

we can rewrite Equation (2.1) as follows:

$a = qn + r, \quad 0 \le r < n; \quad q = \lfloor a/n \rfloor$
$a = \lfloor a/n \rfloor \times n + (a \bmod n)$

$11 \bmod 7 = 4$; $-11 \bmod 7 = 3$

Two integers $a$ and $b$ are said to be congruent modulo $n$, if $(a \bmod n) =
(b \bmod n)$. This is written as $a \equiv b \pmod{n}$.2

$73 \equiv 4 \pmod{23}$; $21 \equiv -9 \pmod{10}$

2 We have just used the operator mod in two different ways: first as a binary operator that produces a
remainder, as in the expression a mod b; second as a congruence relation that shows the equivalence of two
integers, as in the expression $a \equiv b \pmod{n}$. See Appendix 2A for a discussion.

Note that if $a \equiv 0 \pmod{n}$, then $n \mid a$.
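The two uses of mod map onto two small pieces of code: the binary operator is Python's `%`, which for a positive modulus returns a value in the range 0 to n − 1, and the congruence relation is a comparison of remainders. The function name `congruent` is a hypothetical label for this sketch.

```python
def congruent(a: int, b: int, n: int) -> bool:
    """Return True if a and b are congruent modulo n, for positive n."""
    if n <= 0:
        raise ValueError("the modulus n must be a positive integer")
    return a % n == b % n


print(11 % 7, -11 % 7)       # mod as a binary operator
print(congruent(73, 4, 23))  # mod as a congruence relation
print(congruent(21, -9, 10))
```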

Properties of Congruences

Congruences have the following properties:

1. $a \equiv b \pmod{n}$ if $n \mid (a - b)$.

2. $a \equiv b \pmod{n}$ implies $b \equiv a \pmod{n}$.

3. $a \equiv b \pmod{n}$ and $b \equiv c \pmod{n}$ imply $a \equiv c \pmod{n}$.

To demonstrate the first point, if $n \mid (a - b)$, then $(a - b) = kn$ for some $k$.
So we can write $a = b + kn$. Therefore, $(a \bmod n)$ = (remainder when $b + kn$
is divided by $n$) = (remainder when $b$ is divided by $n$) = $(b \bmod n)$.

$23 \equiv 8 \pmod{5}$ because $23 - 8 = 15 = 5 \times 3$
$-11 \equiv 5 \pmod{8}$ because $-11 - 5 = -16 = 8 \times (-2)$
$81 \equiv 0 \pmod{27}$ because $81 - 0 = 81 = 27 \times 3$

The remaining points are as easily proved.


Modular Arithmetic Operations

Note that, by definition (Figure 2.1), the (mod n) operator maps all integers into
the set of integers $\{0, 1, \ldots, (n - 1)\}$. This suggests the question: Can we perform
arithmetic operations within the confines of this set? It turns out that we can; this
technique is known as modular arithmetic.

Modular arithmetic exhibits the following properties:

1. [(a mod n) + (b mod n)] mod n = (a + b) mod n

2. [(a mod n) – (b mod n)] mod n = (a – b) mod n

3. [(a mod n) * (b mod n)] mod n = (a * b) mod n

We demonstrate the first property. Define $(a \bmod n) = r_a$ and $(b \bmod n) = r_b$.
Then we can write $a = r_a + jn$ for some integer $j$ and $b = r_b + kn$ for some integer $k$.
Then

$(a + b) \bmod n = (r_a + jn + r_b + kn) \bmod n$
$\qquad = (r_a + r_b + (k + j)n) \bmod n$
$\qquad = (r_a + r_b) \bmod n$
$\qquad = [(a \bmod n) + (b \bmod n)] \bmod n$

The remaining properties are proven as easily. Here are examples of the three

properties:

11 mod 8 = 3; 15 mod 8 = 7

[(11 mod 8) + (15 mod 8)] mod 8 = 10 mod 8 = 2

(11 + 15) mod 8 = 26 mod 8 = 2

[(11 mod 8) – (15 mod 8)] mod 8 = – 4 mod 8 = 4

(11 – 15) mod 8 = – 4 mod 8 = 4

[(11 mod 8) * (15 mod 8)] mod 8 = 21 mod 8 = 5

(11 * 15) mod 8 = 165 mod 8 = 5

Exponentiation is performed by repeated multiplication, as in ordinary
arithmetic.

To find $11^7 \bmod 13$, we can proceed as follows:

$11^2 = 121 \equiv 4 \pmod{13}$
$11^4 = (11^2)^2 \equiv 4^2 \equiv 3 \pmod{13}$
$11^7 = 11 \times 11^2 \times 11^4$
$11^7 \equiv 11 \times 4 \times 3 \equiv 132 \equiv 2 \pmod{13}$

Thus, the rules for ordinary arithmetic involving addition, subtraction, and

multiplication carry over into modular arithmetic.
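The exponentiation example reduces after every multiplication and builds the result from repeated squarings. A common way to organize that idea is square-and-multiply, sketched below; the name `mod_pow` is hypothetical, and Python's built-in three-argument `pow` already provides the same result.

```python
def mod_pow(base: int, exponent: int, n: int) -> int:
    """Compute (base ** exponent) mod n by square-and-multiply."""
    if exponent < 0 or n <= 0:
        raise ValueError("requires exponent >= 0 and n > 0")
    result = 1 % n
    base %= n
    while exponent > 0:
        if exponent & 1:                 # this binary digit of the exponent is 1
            result = (result * base) % n
        base = (base * base) % n         # next repeated square
        exponent >>= 1
    return result


print(mod_pow(11, 7, 13), pow(11, 7, 13))

# The three properties of modular arithmetic, checked for a = 11, b = 15, n = 8.
a, b, n = 11, 15, 8
print(((a % n) + (b % n)) % n == (a + b) % n)
print(((a % n) - (b % n)) % n == (a - b) % n)
print(((a % n) * (b % n)) % n == (a * b) % n)
```

Reducing at each step keeps every intermediate value below n squared, which is what makes exponentiation with very large exponents practical.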


Table 2.2 provides an illustration of modular addition and multiplication

modulo 8. Looking at addition, the results are straightforward, and there is a reg-

ular pattern to the matrix. Both matrices are symmetric about the main diagonal

in conformance to the commutative property of addition and multiplication. As in

ordinary addition, there is an additive inverse, or negative, to each integer in modu-

lar arithmetic. In this case, the negative of an integer x is the integer y such that

(x + y) mod 8 = 0. To find the additive inverse of an integer in the left-hand col-

umn, scan across the corresponding row of the matrix to find the value 0; the integer

at the top of that column is the additive inverse; thus, (2 + 6) mod 8 = 0. Similarly,

the entries in the multiplication table are straightforward. In modular arithmetic mod

8, the multiplicative inverse of x is the integer y such that (x * y) mod 8 = 1 mod 8.

Now, to find the multiplicative inverse of an integer from the multiplication table,

scan across the matrix in the row for that integer to find the value 1; the integer at

the top of that column is the multiplicative inverse; thus, (3 * 3) mod 8 = 1. Note

that not all integers mod 8 have a multiplicative inverse; more about that later.
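The table lookups described above amount to a search over the set {0, 1, ..., n − 1}. The brute-force sketch below is illustrative only; the function names are hypothetical, and the linear search is chosen to mirror the scan across a table row, not for efficiency.

```python
from typing import Optional


def additive_inverse(x: int, n: int) -> int:
    """Return y in {0, ..., n-1} with (x + y) mod n == 0."""
    return (-x) % n


def multiplicative_inverse(x: int, n: int) -> Optional[int]:
    """Return y in {0, ..., n-1} with (x * y) mod n == 1, or None if none exists."""
    for y in range(n):
        if (x * y) % n == 1 % n:
            return y
    return None


for w in range(8):
    print(w, additive_inverse(w, 8), multiplicative_inverse(w, 8))
```

For modulus 8 only the odd values 1, 3, 5, and 7 have a multiplicative inverse, and each of them is its own inverse.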

Table 2.2 Arithmetic Modulo 8

(a) Addition modulo 8

+ | 0 1 2 3 4 5 6 7
--+----------------
0 | 0 1 2 3 4 5 6 7
1 | 1 2 3 4 5 6 7 0
2 | 2 3 4 5 6 7 0 1
3 | 3 4 5 6 7 0 1 2
4 | 4 5 6 7 0 1 2 3
5 | 5 6 7 0 1 2 3 4
6 | 6 7 0 1 2 3 4 5
7 | 7 0 1 2 3 4 5 6

(b) Multiplication modulo 8

× | 0 1 2 3 4 5 6 7
--+----------------
0 | 0 0 0 0 0 0 0 0
1 | 0 1 2 3 4 5 6 7
2 | 0 2 4 6 0 2 4 6
3 | 0 3 6 1 4 7 2 5
4 | 0 4 0 4 0 4 0 4
5 | 0 5 2 7 4 1 6 3
6 | 0 6 4 2 0 6 4 2
7 | 0 7 6 5 4 3 2 1

(c) Additive and multiplicative inverse modulo 8

w | −w | w^(−1)
--+----+-------
0 | 0  | none
1 | 7  | 1
2 | 6  | none
3 | 5  | 3
4 | 4  | none
5 | 3  | 5
6 | 2  | none
7 | 1  | 7

Properties of Modular Arithmetic

Define the set $Z_n$ as the set of nonnegative integers less than $n$:

$Z_n = \{0, 1, \ldots, (n - 1)\}$


This is referred to as the set of residues, or residue classes (mod n). To be more
precise, each integer in $Z_n$ represents a residue class. We can label the residue classes
(mod n) as $[0], [1], [2], \ldots, [n - 1]$, where

$[r] = \{a : a \text{ is an integer}, a \equiv r \pmod{n}\}$

The residue classes (mod 4) are

$[0] = \{\ldots, -16, -12, -8, -4, 0, 4, 8, 12, 16, \ldots\}$
$[1] = \{\ldots, -15, -11, -7, -3, 1, 5, 9, 13, 17, \ldots\}$
$[2] = \{\ldots, -14, -10, -6, -2, 2, 6, 10, 14, 18, \ldots\}$
$[3] = \{\ldots, -13, -9, -5, -1, 3, 7, 11, 15, 19, \ldots\}$
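A residue class is infinite, but any finite window of it can be listed directly from the definition. The sketch below is illustrative; the function name `residue_class` and its `low` and `high` window parameters are hypothetical.

```python
def residue_class(r: int, n: int, low: int, high: int) -> list[int]:
    """List the members a of [r] (mod n) with low <= a <= high."""
    if n <= 0:
        raise ValueError("the modulus n must be a positive integer")
    return [a for a in range(low, high + 1) if a % n == r % n]


for r in range(4):
    print(f"[{r}] contains", residue_class(r, 4, -16 + r, 16 + r))
```

Every integer lands in exactly one of the n classes, since its remainder on division by n is unique.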

Table 2.3 Properties of Modular Arithmetic for Integers in $Z_n$

Property                   Expression
Commutative Laws           $(w + x) \bmod n = (x + w) \bmod n$
                           $(w \times x) \bmod n = (x \times w) \bmod n$
Associative Laws           $[(w + x) + y] \bmod n = [w + (x + y)] \bmod n$
                           $[(w \times x) \times y] \bmod n = [w \times (x \times y)] \bmod n$
Distributive Law           $[w \times (x + y)] \bmod n = [(w \times x) + (w \times y)] \bmod n$
Identities                 $(0 + w) \bmod n = w \bmod n$
                           $(1 \times w) \bmod n = w \bmod n$
Additive Inverse ($-w$)    For each $w \in Z_n$, there exists a $z$ such that $w + z \equiv 0 \bmod n$

Of all the integers in a residue class, the smallest nonnegative integer is the

one used to represent the residue class. Finding the smallest nonnegative integer to

which k is congruent modulo n is called reducing k modulo n.

If we perform modular arithmetic within Zn, the properties shown in Table 2.3

hold for integers in Zn. We show in the next section that this implies that Zn is a

commutative ring with a multiplicative identity element.

There is one peculiarity of modular arithmetic that sets it apart from ordinary

arithmetic. First, observe that (as in ordinary arithmetic) we can write the following:

if (a + b) ≡ (a + c) (mod n) then b ≡ c (mod n)   (2.4)

(5 + 23) ≡ (5 + 7) (mod 8); 23 ≡ 7 (mod 8)

Equation (2.4) is consistent with the existence of an additive inverse. Adding the additive inverse of a to both sides of Equation (2.4), we have

((-a) + a + b) ≡ ((-a) + a + c) (mod n)

b ≡ c (mod n)

2.3 / MODULAR ARITHMETIC 57

However, the following statement is true only with the attached condition:

if (a × b) ≡ (a × c) (mod n) then b ≡ c (mod n) if a is relatively prime to n   (2.5)

Recall that two integers are relatively prime if their only common positive integer

factor is 1. Similar to the case of Equation (2.4), we can say that Equation (2.5) is

consistent with the existence of a multiplicative inverse. Applying the multiplicative

inverse of a to both sides of Equation (2.5), we have

((a^{-1})ab) ≡ ((a^{-1})ac) (mod n)

b ≡ c (mod n)

To see this, consider an example in which the condition of Equation (2.5) does not

hold. The integers 6 and 8 are not relatively prime, since they have the common

factor 2. We have the following:

6 × 3 = 18 ≡ 2 (mod 8)

6 × 7 = 42 ≡ 2 (mod 8)

Yet 3 ≢ 7 (mod 8).

The reason for this strange result is that for any general modulus n, a multi-

plier a that is applied in turn to the integers 0 through (n – 1) will fail to produce a

complete set of residues if a and n have any factors in common.

With a = 6 and n = 8,

Z8 0 1 2 3 4 5 6 7

Multiply by 6 0 6 12 18 24 30 36 42

Residues 0 6 4 2 0 6 4 2

Because we do not have a complete set of residues when multiplying by

6, more than one integer in Z8 maps into the same residue. Specifically,

6 * 0 mod 8 = 6 * 4 mod 8; 6 * 1 mod 8 = 6 * 5 mod 8; and so on. Because

this is a many-to-one mapping, there is not a unique inverse to the multiply

operation.

However, if we take a = 5 and n = 8, whose only common factor is 1,

Z8 0 1 2 3 4 5 6 7

Multiply by 5 0 5 10 15 20 25 30 35

Residues 0 5 2 7 4 1 6 3

The line of residues contains all the integers in Z8, in a different order.

In general, an integer has a multiplicative inverse in Zn if and only if that inte-

ger is relatively prime to n. Table 2.2c shows that the integers 1, 3, 5, and 7 have a

multiplicative inverse in Z8; but 2, 4, and 6 do not.
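
The two Z8 tables above and the inverse rule can be reproduced with a short Python sketch. The function names are illustrative only, and the brute-force search is chosen for clarity on a small modulus rather than for efficiency.

```python
from math import gcd


def residues_after_multiplying(a, n):
    """Return [(a * x) mod n] for x = 0, 1, ..., n - 1."""
    return [(a * x) % n for x in range(n)]


def multiplicative_inverse(a, n):
    """Return w with (a * w) mod n == 1, or None when gcd(a, n) != 1."""
    if gcd(a, n) != 1:
        return None
    # Exhaustive search is adequate for a small modulus such as 8.
    return next(w for w in range(1, n) if (a * w) % n == 1)


if __name__ == "__main__":
    print(residues_after_multiplying(6, 8))  # residues repeat: many-to-one
    print(residues_after_multiplying(5, 8))  # a reordering of Z8
    print({a: multiplicative_inverse(a, 8) for a in range(8)})
```

With a = 6 the residue list repeats, so no inverse can exist; with a = 5 every element of Z8 appears exactly once, and the inverse is the position that maps to 1.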

Euclidean Algorithm Revisited

The Euclidean algorithm can be based on the following theorem: For any integers

a, b, with a ≥ b ≥ 0,

gcd(a, b) = gcd(b, a mod b) (2.6)

gcd(55, 22) = gcd(22, 55 mod 22) = gcd(22, 11) = 11

gcd(18, 12) = gcd(12, 6) = gcd(6, 0) = 6

gcd(11, 10) = gcd(10, 1) = gcd(1, 0) = 1

To see that Equation (2.6) works, let d = gcd(a, b). Then, by the definition of gcd, d|a and d|b. For any positive integer b, we can express a as

a = kb + r ≡ r (mod b)

a mod b = r

with k, r integers. Therefore, (a mod b) = a - kb for some integer k. But because d|b, it also divides kb. We also have d|a. Therefore, d|(a mod b). This shows that d is a common divisor of b and (a mod b). Conversely, if d is a common divisor of b and (a mod b), then d|kb and thus d|[kb + (a mod b)], which is equivalent to d|a. Thus, the set of common divisors of a and b is equal to the set of common divisors of b and (a mod b). Therefore, the gcd of one pair is the same as the gcd of the other pair, proving the theorem.

Equation (2.6) can be used repetitively to determine the greatest common divisor.

This is the same scheme shown in Equation (2.3), which can be rewritten in

the following way.

Euclidean Algorithm

Calculate                          Which satisfies
r_1 = a mod b                      a = q_1 b + r_1
r_2 = b mod r_1                    b = q_2 r_1 + r_2
r_3 = r_1 mod r_2                  r_1 = q_3 r_2 + r_3
⋮                                  ⋮
r_n = r_{n-2} mod r_{n-1}          r_{n-2} = q_n r_{n-1} + r_n
r_{n+1} = r_{n-1} mod r_n = 0      r_{n-1} = q_{n+1} r_n + 0
                                   d = gcd(a, b) = r_n

We can define the Euclidean algorithm concisely as the following recursive

function.

Euclid(a,b)

if (b=0) then return a;

else return Euclid(b, a mod b);
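
For readers who want to run the algorithm, here is a minimal Python sketch of the same recurrence. It is written as a loop rather than a recursion, which is a choice made here to avoid recursion-depth limits and is not part of the definition above; the name euclid is chosen for this example.

```python
def euclid(a, b):
    """Greatest common divisor using gcd(a, b) = gcd(b, a mod b)."""
    while b != 0:
        a, b = b, a % b  # one step of Equation (2.6)
    return a


if __name__ == "__main__":
    for pair in [(55, 22), (18, 12), (11, 10)]:
        print(pair, euclid(*pair))
```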

The Extended Euclidean Algorithm

We now proceed to look at an extension to the Euclidean algorithm that will be

important for later computations in the area of finite fields and in encryption algo-

rithms, such as RSA. For given integers a and b, the extended Euclidean algorithm

not only calculates the greatest common divisor d but also two additional integers x

and y that satisfy the following equation.

ax + by = d = gcd(a, b) (2.7)

It should be clear that x and y will have opposite signs. Before examining the

algorithm, let us look at some of the values of x and y when a = 42 and b = 30.

Note that gcd(42, 30) = 6. Here is a partial table of values³ for 42x + 30y.

 y \ x     -3     -2     -1      0      1      2      3
  -3     -216   -174   -132    -90    -48     -6     36
  -2     -186   -144   -102    -60    -18     24     66
  -1     -156   -114    -72    -30     12     54     96
   0     -126    -84    -42      0     42     84    126
   1      -96    -54    -12     30     72    114    156
   2      -66    -24     18     60    102    144    186
   3      -36      6     48     90    132    174    216

Observe that all of the entries are divisible by 6. This is not surpris-

ing, because both 42 and 30 are divisible by 6, so every number of the form

42x + 30y = 6(7x + 5y) is a multiple of 6. Note also that gcd(42, 30) = 6 appears

in the table. In general, it can be shown that for given integers a and b, the smallest

positive value of ax + by is equal to gcd(a, b).

Now let us show how to extend the Euclidean algorithm to determine (x, y, d)

given a and b. We again go through the sequence of divisions indicated in Equation

(2.3), and we assume that at each step i we can find integers x_i and y_i that satisfy r_i = a x_i + b y_i. We end up with the following sequence.

a = q_1 b + r_1                    r_1 = a x_1 + b y_1
b = q_2 r_1 + r_2                  r_2 = a x_2 + b y_2
r_1 = q_3 r_2 + r_3                r_3 = a x_3 + b y_3
⋮                                  ⋮
r_{n-2} = q_n r_{n-1} + r_n        r_n = a x_n + b y_n
r_{n-1} = q_{n+1} r_n + 0

³ This example is taken from [SILV06].

Now, observe that we can rearrange terms to write

r_i = r_{i-2} - r_{i-1} q_i   (2.8)

Also, in rows i - 1 and i - 2, we find the values

r_{i-2} = a x_{i-2} + b y_{i-2} and r_{i-1} = a x_{i-1} + b y_{i-1}

Substituting into Equation (2.8), we have

r_i = (a x_{i-2} + b y_{i-2}) - (a x_{i-1} + b y_{i-1}) q_i
    = a(x_{i-2} - q_i x_{i-1}) + b(y_{i-2} - q_i y_{i-1})

But we have already assumed that r_i = a x_i + b y_i. Therefore,

x_i = x_{i-2} - q_i x_{i-1} and y_i = y_{i-2} - q_i y_{i-1}

We now summarize the calculations:

Extended Euclidean Algorithm

Calculate                       Which satisfies               Calculate                       Which satisfies
r_{-1} = a                                                    x_{-1} = 1; y_{-1} = 0          a = a x_{-1} + b y_{-1}
r_0 = b                                                       x_0 = 0; y_0 = 1                b = a x_0 + b y_0
r_1 = a mod b                   a = q_1 b + r_1               x_1 = x_{-1} - q_1 x_0 = 1      r_1 = a x_1 + b y_1
q_1 = ⌊a/b⌋                                                   y_1 = y_{-1} - q_1 y_0 = -q_1
r_2 = b mod r_1                 b = q_2 r_1 + r_2             x_2 = x_0 - q_2 x_1             r_2 = a x_2 + b y_2
q_2 = ⌊b/r_1⌋                                                 y_2 = y_0 - q_2 y_1
r_3 = r_1 mod r_2               r_1 = q_3 r_2 + r_3           x_3 = x_1 - q_3 x_2             r_3 = a x_3 + b y_3
q_3 = ⌊r_1/r_2⌋                                               y_3 = y_1 - q_3 y_2
⋮                               ⋮                             ⋮                               ⋮
r_n = r_{n-2} mod r_{n-1}       r_{n-2} = q_n r_{n-1} + r_n   x_n = x_{n-2} - q_n x_{n-1}     r_n = a x_n + b y_n
q_n = ⌊r_{n-2}/r_{n-1}⌋                                       y_n = y_{n-2} - q_n y_{n-1}
r_{n+1} = r_{n-1} mod r_n = 0   r_{n-1} = q_{n+1} r_n + 0                                     d = gcd(a, b) = r_n
q_{n+1} = ⌊r_{n-1}/r_n⌋                                                                       x = x_n; y = y_n

We need to make several additional comments here. In each row, we calculate a new remainder r_i based on the remainders of the previous two rows, namely r_{i-1} and r_{i-2}. To start the algorithm, we need values for r_{-1} and r_0, which are just a and b. It is then straightforward to determine the required values for x_{-1}, y_{-1}, x_0, and y_0.

We know from the original Euclidean algorithm that the process ends with a remainder of zero and that the greatest common divisor of a and b is d = gcd(a, b) = r_n. But we also have determined that d = r_n = a x_n + b y_n. Therefore, in Equation (2.7), x = x_n and y = y_n.

As an example, let us use a = 1759 and b = 550 and solve for 1759x + 550y = gcd(1759, 550). The results are shown in Table 2.4. Thus, we have 1759 × (-111) + 550 × 355 = -195249 + 195250 = 1.
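
The recurrences for r_i, x_i, and y_i translate almost line for line into code. The Python sketch below is one possible rendering; the function name extended_euclid and the variable names are chosen for this example, and only the two most recent rows are kept in memory.

```python
def extended_euclid(a, b):
    """Return (d, x, y) with a*x + b*y == d == gcd(a, b)."""
    r_prev, r = a, b          # rows i = -1 and i = 0
    x_prev, x = 1, 0
    y_prev, y = 0, 1
    while r != 0:
        q = r_prev // r                        # q_i = floor(r_{i-2} / r_{i-1})
        r_prev, r = r, r_prev - q * r          # r_i = r_{i-2} - q_i * r_{i-1}
        x_prev, x = x, x_prev - q * x          # x_i = x_{i-2} - q_i * x_{i-1}
        y_prev, y = y, y_prev - q * y          # y_i = y_{i-2} - q_i * y_{i-1}
    return r_prev, x_prev, y_prev


if __name__ == "__main__":
    d, x, y = extended_euclid(1759, 550)
    print(d, x, y, 1759 * x + 550 * y)
```

When d = 1, the value y reduced modulo a is the multiplicative inverse of b modulo a, which is the use this algorithm sees in later chapters.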

2.4 / PRIME NUMBERS 61

2.4 PRIME NUMBERS⁴

A central concern of number theory is the study of prime numbers. Indeed, whole

books have been written on the subject (e.g., [CRAN01], [RIBE96]). In this section,

we provide an overview relevant to the concerns of this book.

An integer p > 1 is a prime number if and only if its only divisors⁵ are ±1 and ±p. Prime numbers play a critical role in number theory and in the techniques discussed in this chapter. Table 2.5 shows the primes less than 2000. Note the way the primes are distributed. In particular, note the number of primes in each range of 100 numbers.

Any integer a > 1 can be factored in a unique way as

a = p_1^{a_1} × p_2^{a_2} × ⋯ × p_t^{a_t}   (2.9)

where p_1 < p_2 < … < p_t are prime numbers and where each a_i is a positive integer. This is known as the fundamental theorem of arithmetic; a proof can be found in any text on number theory.

⁴ In this section, unless otherwise noted, we deal only with the nonnegative integers. The use of negative integers would introduce no essential differences.

⁵ Recall from Section 2.1 that integer a is said to be a divisor of integer b if there is no remainder on division. Equivalently, we say that a divides b.

Table 2.4 Extended Euclidean Algorithm Example

i      r_i     q_i    x_i     y_i
-1     1759           1       0
0      550            0       1
1      109     3      1       -3
2      5       5      -5      16
3      4       21     106     -339
4      1       1      -111    355
5      0       4

Result: d = 1; x = -111; y = 355

91 = 7 × 13

3600 = 2^4 × 3^2 × 5^2

11011 = 7 × 11^2 × 13

It is useful for what follows to express this another way. If P is the set of

all prime numbers, then any positive integer a can be written uniquely in the

following form:

a = ∏_{p∈P} p^{a_p} where each a_p ≥ 0

Table 2.5 Primes Under 2000

The right-hand side is the product over all possible prime numbers p; for any particular value of a, most of the exponents a_p will be 0.

The value of any given positive integer can be specified by simply listing all the

nonzero exponents in the foregoing formulation.

The integer 12 is represented by {a_2 = 2, a_3 = 1}.

The integer 18 is represented by {a_2 = 1, a_3 = 2}.

The integer 91 is represented by {a_7 = 1, a_13 = 1}.

Multiplication of two numbers is equivalent to adding the corresponding exponents. Given a = ∏_{p∈P} p^{a_p}, b = ∏_{p∈P} p^{b_p}. Define k = ab. We know that the integer k can be expressed as the product of powers of primes: k = ∏_{p∈P} p^{k_p}. It follows that k_p = a_p + b_p for all p ∈ P.

k = 12 × 18 = (2^2 × 3) × (2 × 3^2) = 216

k_2 = 2 + 1 = 3; k_3 = 1 + 2 = 3

216 = 2^3 × 3^3 = 8 × 27

What does it mean, in terms of the prime factors of a and b, to say that a divides b? Any integer of the form p^n can be divided only by an integer that is of a lesser or equal power of the same prime number, p^j with j ≤ n. Thus, we can say the following.

Given

a = ∏_{p∈P} p^{a_p}, b = ∏_{p∈P} p^{b_p}

If a|b, then a_p ≤ b_p for all p.

a = 12; b = 36; 12|36

12 = 2^2 × 3; 36 = 2^2 × 3^2

a_2 = 2 = b_2

a_3 = 1 ≤ 2 = b_3

Thus, the inequality a_p ≤ b_p is satisfied for all prime numbers.

It is easy to determine the greatest common divisor of two positive integers if

we express each integer as the product of primes.

The following relationship always holds:

If k = gcd(a, b), then k_p = min(a_p, b_p) for all p.

Determining the prime factors of a large number is no easy task, so the pre-

ceding relationship does not directly lead to a practical method of calculating the

greatest common divisor.
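
To make the exponent view concrete, the Python sketch below factors small integers by trial division and then forms the gcd from the minimum exponents. Trial division is an assumption made here for illustration; as the text notes, it is impractical for large numbers. The function names are hypothetical.

```python
from collections import Counter


def prime_exponents(n):
    """Return a Counter mapping each prime p to its exponent a_p in n (n >= 1)."""
    exps = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            exps[p] += 1
            n //= p
        p += 1
    if n > 1:              # whatever remains is a single prime factor
        exps[n] += 1
    return exps


def gcd_from_factors(a, b):
    """gcd(a, b) as the product of p ** min(a_p, b_p) over shared primes p."""
    ea, eb = prime_exponents(a), prime_exponents(b)
    result = 1
    for p in ea.keys() & eb.keys():
        result *= p ** min(ea[p], eb[p])
    return result


if __name__ == "__main__":
    print(dict(prime_exponents(3600)))
    print(gcd_from_factors(18, 300))
```

Adding two Counter objects adds the exponents, which mirrors the rule k_p = a_p + b_p for a product k = ab.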

2.5 FERMAT’S AND EULER’S THEOREMS

Two theorems that play important roles in public-key cryptography are Fermat’s

theorem and Euler’s theorem.

Fermat’s Theorem⁶

Fermat’s theorem states the following: If p is prime and a is a positive integer not divisible by p, then

a^{p-1} ≡ 1 (mod p)   (2.10)

Proof: Consider the set of positive integers less than p: {1, 2, …, p - 1} and multiply each element by a, modulo p, to get the set X = {a mod p, 2a mod p, …, (p - 1)a mod p}. None of the elements of X is equal to zero because p does not divide a. Furthermore, no two of the integers in X are equal. To see this, assume that ja ≡ ka (mod p), where 1 ≤ j < k ≤ p - 1. Because a is relatively prime⁷ to p, we can eliminate a from both sides of the equation [see Equation (2.5)], resulting in j ≡ k (mod p). This last equality is impossible, because j and k are both positive integers less than p. Therefore, we know that the (p - 1) elements of X are all positive integers with no two elements equal. We can conclude that X consists of the set of integers {1, 2, …, p - 1} in some order. Multiplying the numbers in both sets ({1, 2, …, p - 1} and X) and taking the result mod p yields

a × 2a × ⋯ × (p - 1)a ≡ [1 × 2 × ⋯ × (p - 1)] (mod p)

a^{p-1} (p - 1)! ≡ (p - 1)! (mod p)

We can cancel the (p - 1)! term because it is relatively prime to p [see Equation (2.5)]. This yields Equation (2.10), which completes the proof.

⁶ This is sometimes referred to as Fermat’s little theorem.

⁷ Recall from Section 2.2 that two numbers are relatively prime if they have no prime factors in common; that is, their only common divisor is 1. This is equivalent to saying that two numbers are relatively prime if their greatest common divisor is 1.

300 = 2^2 × 3^1 × 5^2

18 = 2^1 × 3^2

gcd(18, 300) = 2^1 × 3^1 × 5^0 = 6

2.5 / FERMAT’S AND EULER’S THEOREMS 65

An alternative form of Fermat’s theorem is also useful: If p is prime and a is a

positive integer, then

a^p ≡ a (mod p)   (2.11)

Note that the first form of the theorem [Equation (2.10)] requires that a be relatively prime to p, but this form does not.

a = 7, p = 19

7^2 = 49 ≡ 11 (mod 19)

7^4 ≡ 121 ≡ 7 (mod 19)

7^8 ≡ 49 ≡ 11 (mod 19)

7^16 ≡ 121 ≡ 7 (mod 19)

a^{p-1} = 7^18 = 7^16 × 7^2 ≡ 7 × 11 ≡ 1 (mod 19)

p = 5, a = 3: a^p = 3^5 = 243 ≡ 3 (mod 5) = a (mod p)

p = 5, a = 10: a^p = 10^5 = 100000 ≡ 10 (mod 5) ≡ 0 (mod 5) = a (mod p)
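
Both forms of Fermat's theorem are easy to check numerically. The sketch below uses Python's built-in three-argument pow for modular exponentiation; the two function names are made up for this example, and the checks are only meaningful when p is prime.

```python
def fermat_holds(a, p):
    """Check a^(p-1) ≡ 1 (mod p); expected when p is prime and p does not divide a."""
    return pow(a, p - 1, p) == 1


def fermat_alt_holds(a, p):
    """Check the alternative form a^p ≡ a (mod p); expected for any a when p is prime."""
    return pow(a, p, p) == a % p


if __name__ == "__main__":
    print([pow(7, e, 19) for e in (2, 4, 8, 16, 18)])  # the repeated squarings above
    print(fermat_alt_holds(3, 5), fermat_alt_holds(10, 5))
```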

Euler’s Totient Function

Before presenting Euler’s theorem, we need to introduce an important quantity in number theory, referred to as Euler’s totient function. This function, written φ(n), is defined as the number of positive integers less than n and relatively prime to n. By convention, φ(1) = 1.

Determine φ(37) and φ(35).

Because 37 is prime, all of the positive integers from 1 through 36 are relatively prime to 37. Thus φ(37) = 36.

To determine φ(35), we list all of the positive integers less than 35 that are relatively prime to it:

1, 2, 3, 4, 6, 8, 9, 11, 12, 13, 16, 17, 18

19, 22, 23, 24, 26, 27, 29, 31, 32, 33, 34

There are 24 numbers on the list, so φ(35) = 24.

Table 2.6 lists the first 30 values of φ(n). The value φ(1) is without meaning but is defined to have the value 1.

It should be clear that, for a prime number p,

φ(p) = p - 1

Now suppose that we have two prime numbers p and q with p ≠ q. Then we can show that, for n = pq,

φ(n) = φ(pq) = φ(p) × φ(q) = (p - 1) × (q - 1)

To see that φ(n) = φ(p) × φ(q), consider that the set of positive integers less than n is the set {1, …, (pq - 1)}. The integers in this set that are not relatively prime to n are the set {p, 2p, …, (q - 1)p} and the set {q, 2q, …, (p - 1)q}. To see this, consider that any integer that has a factor in common with n must be a multiple of p or of q. Therefore, any integer that does not contain either p or q as a factor is relatively prime to n. Further note that the two sets just listed are non-overlapping: Because p and q are prime, we can state that none of the integers in the first set can be written as a multiple of q, and none of the integers in the second set can be written as a multiple of p. Thus the total number of unique integers in the two sets is (q - 1) + (p - 1). Accordingly,

φ(n) = (pq - 1) - [(q - 1) + (p - 1)]
     = pq - (p + q) + 1
     = (p - 1) × (q - 1)
     = φ(p) × φ(q)

φ(21) = φ(3) × φ(7) = (3 - 1) × (7 - 1) = 2 × 6 = 12

where the 12 integers are {1, 2, 4, 5, 8, 10, 11, 13, 16, 17, 19, 20}.

Table 2.6 Some Values of Euler’s Totient Function φ(n)

n     φ(n)        n     φ(n)        n     φ(n)
1     1           11    10          21    12
2     1           12    4           22    10
3     2           13    12          23    22
4     2           14    6           24    8
5     4           15    8           25    20
6     2           16    8           26    12
7     6           17    16          27    18
8     4           18    6           28    12
9     6           19    18          29    28
10    4           20    8           30    8
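
The values in Table 2.6 can be regenerated directly from the definition. The Python sketch below counts the integers relatively prime to n one by one, which is a deliberately naive approach chosen for clarity; the name phi is used here only for this example.

```python
from math import gcd


def phi(n):
    """Euler's totient by direct counting; phi(1) is 1 by convention."""
    if n == 1:
        return 1
    return sum(1 for k in range(1, n) if gcd(k, n) == 1)


if __name__ == "__main__":
    print([phi(n) for n in range(1, 31)])   # compare with Table 2.6
    print(phi(21), phi(3) * phi(7))         # phi(pq) = phi(p) * phi(q)
```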

Euler’s Theorem

Euler’s theorem states that for every a and n that are relatively prime:

a^{φ(n)} ≡ 1 (mod n)   (2.12)

Proof: Equation (2.12) is true if n is prime, because in that case, φ(n) = (n - 1) and Fermat’s theorem holds. However, it also holds for any integer n. Recall that φ(n) is the number of positive integers less than n that are relatively prime to n. Consider the set of such integers, labeled as

R = {x_1, x_2, …, x_{φ(n)}}

That is, each element x_i of R is a unique positive integer less than n with gcd(x_i, n) = 1. Now multiply each element by a, modulo n:

S = {(a x_1 mod n), (a x_2 mod n), …, (a x_{φ(n)} mod n)}

The set S is a permutation⁸ of R, by the following line of reasoning:

1. Because a is relatively prime to n and x_i is relatively prime to n, a x_i must also be relatively prime to n. Thus, all the members of S are integers that are less than n and that are relatively prime to n.

2. There are no duplicates in S. Refer to Equation (2.5). If a x_i mod n = a x_j mod n, then x_i = x_j.

Therefore,

∏_{i=1}^{φ(n)} (a x_i mod n) = ∏_{i=1}^{φ(n)} x_i

∏_{i=1}^{φ(n)} a x_i ≡ ∏_{i=1}^{φ(n)} x_i (mod n)

a^{φ(n)} × [∏_{i=1}^{φ(n)} x_i] ≡ ∏_{i=1}^{φ(n)} x_i (mod n)

a^{φ(n)} ≡ 1 (mod n)

which completes the proof. This is the same line of reasoning applied to the proof of Fermat’s theorem.

⁸ A permutation of a finite set of elements S is an ordered sequence of all the elements of S, with each element appearing exactly once.

a = 3; n = 10; φ(10) = 4; a^{φ(n)} = 3^4 = 81 ≡ 1 (mod 10) = 1 (mod n)

a = 2; n = 11; φ(11) = 10; a^{φ(n)} = 2^10 = 1024 ≡ 1 (mod 11) = 1 (mod n)

As is the case for Fermat’s theorem, an alternative form of the theorem is also

useful:

a^{φ(n)+1} ≡ a (mod n)   (2.13)

Again, similar to the case with Fermat’s theorem, the first form of Euler’s theorem [Equation (2.12)] requires that a be relatively prime to n. This form does not, provided that n is a product of distinct primes (for example, n = pq with p ≠ q).
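
A quick numerical check of Euler's theorem, written as a Python sketch: the totient is computed by naive counting, and the alternative form is exercised only for n = 35 = 5 × 7, a product of two distinct primes. The function names are assumptions made for this example.

```python
from math import gcd


def totient(n):
    """Count the integers in 1..n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def euler_holds(a, n):
    """Check a^phi(n) ≡ 1 (mod n); expected whenever gcd(a, n) == 1."""
    return pow(a, totient(n), n) == 1


if __name__ == "__main__":
    print(euler_holds(3, 10), euler_holds(2, 11))
    # Alternative form a^(phi(n)+1) ≡ a (mod n), tried for every a in Z35.
    print(all(pow(a, totient(35) + 1, 35) == a for a in range(35)))
```

Counting over 1..n rather than 1..n-1 gives the same result for n > 1 and returns 1 for n = 1, which matches the stated convention.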

2.6 TESTING FOR PRIMALITY

For many cryptographic algorithms, it is necessary to select one or more very large

prime numbers at random. Thus, we are faced with the task of determining whether

a given large number is prime. There is no simple yet efficient means of accomplish-

ing this task.

In this section, we present one attractive and popular algorithm. You may be

surprised to learn that this algorithm yields a number that is not necessarily a prime.

However, the algorithm can yield a number that is almost certainly a prime. This will

be explained presently. We also make reference to a deterministic algorithm for find-

ing primes. The section closes with a discussion concerning the distribution of primes.

Miller–Rabin Algorithm⁹

The algorithm due to Miller and Rabin [MILL75, RABI80] is typically used to test a large number for primality. Before explaining the algorithm, we need some background. First, any positive odd integer n ≥ 3 can be expressed as

n - 1 = 2^k q with k > 0, q odd

To see this, note that n - 1 is an even integer. Then, divide (n - 1) by 2 until the result is an odd number q, for a total of k divisions. If n - 1 is expressed as a binary number, then the result is achieved by shifting the number to the right until the rightmost digit is a 1, for a total of k shifts. We now develop two properties of prime numbers that we will need.

TWO PROPERTIES OF PRIME NUMBERS The first property is stated as follows: If p is prime and a is a positive integer less than p, then a^2 mod p = 1 if and only if either a mod p = 1 or a mod p = -1 mod p = p - 1. By the rules of modular arithmetic (a mod p)(a mod p) = a^2 mod p. Thus, if either a mod p = 1 or a mod p = -1, then a^2 mod p = 1. Conversely, if a^2 mod p = 1, then (a mod p)^2 = 1, which is true only for a mod p = 1 or a mod p = -1.

The second property is stated as follows: Let p be a prime number greater than 2. We can then write p - 1 = 2^k q with k > 0, q odd. Let a be any integer in the range 1 < a < p - 1. Then one of the two following conditions is true.

1. a^q is congruent to 1 modulo p. That is, a^q mod p = 1, or equivalently, a^q ≡ 1 (mod p).

2. One of the numbers a^q, a^{2q}, a^{4q}, …, a^{2^{k-1} q} is congruent to -1 modulo p. That is, there is some number j in the range (1 ≤ j ≤ k) such that a^{2^{j-1} q} mod p = -1 mod p = p - 1 or equivalently, a^{2^{j-1} q} ≡ -1 (mod p).

Proof: Fermat’s theorem [Equation (2.10)] states that a^{n-1} ≡ 1 (mod n) if n is prime. We have p - 1 = 2^k q. Thus, we know that a^{p-1} mod p = a^{2^k q} mod p = 1. Thus, if we look at the sequence of numbers

a^q mod p, a^{2q} mod p, a^{4q} mod p, …, a^{2^{k-1} q} mod p, a^{2^k q} mod p   (2.14)

⁹ Also referred to in the literature as the Rabin-Miller algorithm, or the Rabin-Miller test, or the Miller–Rabin test.

2.6 / TESTING FOR PRIMALITY 69

we know that the last number in the list has value 1. Further, each number in the list

is the square of the previous number. Therefore, one of the following possibilities

must be true.

1. The first number on the list, and therefore all subsequent numbers on the list,

equals 1.

2. Some number on the list does not equal 1, but its square mod p does equal 1.

By virtue of the first property of prime numbers defined above, we know that

the only number that satisfies this condition is p – 1. So, in this case, the list

contains an element equal to p – 1.

This completes the proof.

DETAILS OF THE ALGORITHM These considerations lead to the conclusion that, if n is prime, then either the first element in the list of residues, or remainders, (a^q, a^{2q}, …, a^{2^{k-1} q}, a^{2^k q}) modulo n equals 1; or some element in the list equals (n - 1); otherwise n is composite (i.e., not a prime). On the other hand, if the condition is met, that does not necessarily mean that n is prime. For example, if n = 2047 = 23 × 89, then n - 1 = 2 × 1023. We compute 2^1023 mod 2047 = 1, so that 2047 meets the condition but is not prime.

We can use the preceding property to devise a test for primality. The procedure

TEST takes a candidate integer n as input and returns the result composite if n is

definitely not a prime, and the result inconclusive if n may or may not be a prime.

TEST (n)

1. Find integers k, q, with k > 0, q odd, so that (n - 1 = 2^k q);
2. Select a random integer a, 1 < a < n - 1;
3. if a^q mod n = 1 then return("inconclusive");
4. for j = 0 to k - 1 do
5.     if a^{2^j q} mod n = n - 1 then return("inconclusive");
6. return("composite");
Let us apply the test to the prime number n = 29. We have (n - 1) = 28 = 2^2(7) = 2^k q. First, let us try a = 10. We compute 10^7 mod 29 = 17, which is neither 1 nor 28, so we continue the test. The next calculation finds that (10^7)^2 mod 29 = 28, and the test returns inconclusive (i.e., 29 may be prime). Let’s try again with a = 2. We have the following calculations: 2^7 mod 29 = 12; 2^14 mod 29 = 28; and the test again returns inconclusive. If we perform the test for all integers a in the range 1 through 28, we get the same inconclusive result, which is compatible with n being a prime number.
Now let us apply the test to the composite number n = 13 × 17 = 221. Then (n - 1) = 220 = 2^2(55) = 2^k q. Let us try a = 5. Then we have 5^55 mod 221 = 112, which is neither 1 nor 220; and (5^55)^2 mod 221 = 168, which is not 220 either. Because we have used all values of j (i.e., j = 0 and j = 1) in line 4 of the TEST algorithm, the test returns composite, indicating that 221 is definitely a composite number. But suppose we had selected a = 21. Then we have 21^55 mod 221 = 200; (21^55)^2 mod 221 = 220; and the test returns inconclusive, indicating that 221 may be prime. In fact, of the 218 integers from 2 through 219, four of these will return an inconclusive result, namely 21, 47, 174, and 200.
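The TEST procedure and the two worked examples can be reproduced with the Python sketch below. One deliberate departure from the pseudocode is that the base a is passed in as a parameter instead of being drawn at random, so that the examples are repeatable; the function names decompose and miller_rabin_round are chosen for this illustration.

```python
def decompose(n):
    """Write n - 1 = 2**k * q with q odd and return (k, q); assumes odd n >= 3."""
    k, q = 0, n - 1
    while q % 2 == 0:
        k += 1
        q //= 2
    return k, q


def miller_rabin_round(n, a):
    """One round of TEST for odd n and base 1 < a < n - 1."""
    k, q = decompose(n)
    x = pow(a, q, n)                 # a^q mod n
    if x == 1:
        return "inconclusive"
    for _ in range(k):               # x runs through a^(2^j q) mod n, j = 0..k-1
        if x == n - 1:
            return "inconclusive"
        x = (x * x) % n
    return "composite"


if __name__ == "__main__":
    print(miller_rabin_round(29, 10), miller_rabin_round(221, 5))
    liars = [a for a in range(2, 220) if miller_rabin_round(221, a) == "inconclusive"]
    print(liars)                     # bases that fail to expose 221 as composite
```

Squaring the running value x at each step avoids recomputing each power from scratch, since every entry in the list of residues is the square of the one before it.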
REPEATED USE OF THE MILLER–RABIN ALGORITHM How can we use the Miller–Rabin algorithm to determine with a high degree of confidence whether or not an integer is prime? It can be shown [KNUT98] that given an odd number n that is not prime and a randomly chosen integer a with 1 < a < n - 1, the probability that TEST will return inconclusive (i.e., fail to detect that n is not prime) is less than 1/4. Thus, if t different values of a are chosen, the probability that all of them will pass TEST (return inconclusive) for n is less than (1/4)^t. For example, for t = 10, the probability that a nonprime number will pass all ten tests is less than 10^{-6}. Thus, for a sufficiently large value of t, we can be confident that n is prime if the Miller–Rabin test always returns inconclusive.
This gives us a basis for determining whether an odd integer n is prime with
a reasonable degree of confidence. The procedure is as follows: Repeatedly invoke
TEST (n) using randomly chosen values for a. If, at any point, TEST returns
composite, then n is determined to be nonprime. If TEST continues to return
inconclusive for t tests, then for a sufficiently large value of t, assume that n
is prime.
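The repeated procedure just described might be sketched in Python as follows. The default t = 10 and the handling of small and even inputs are assumptions added for this example, and the names mr_round and is_probably_prime are hypothetical. A result of True means only that no base exposed n as composite.

```python
import random


def mr_round(n, a):
    """One round of TEST: "composite" or "inconclusive" for odd n and base a."""
    k, q = 0, n - 1
    while q % 2 == 0:                # write n - 1 = 2**k * q with q odd
        k, q = k + 1, q // 2
    x = pow(a, q, n)
    if x == 1:
        return "inconclusive"
    for _ in range(k):
        if x == n - 1:
            return "inconclusive"
        x = (x * x) % n
    return "composite"


def is_probably_prime(n, t=10):
    """Invoke TEST up to t times with random bases; False means definitely composite."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    for _ in range(t):
        a = random.randrange(2, n - 1)   # random a with 1 < a < n - 1
        if mr_round(n, a) == "composite":
            return False
    return True                          # error probability below (1/4)**t


if __name__ == "__main__":
    print(is_probably_prime(29), is_probably_prime(221, t=40))
```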
A Deterministic Primality Algorithm
Prior to 2002, there was no known method of efficiently proving the primality of very large numbers. All of the algorithms in use, including the most popular (Miller–Rabin), produced a probabilistic result. In 2002, Agrawal, Kayal, and Saxena developed a relatively simple deterministic algorithm that efficiently determines whether a given large number is a prime (announced in 2002, published in 2004 [AGRA04]). The algorithm, known as the AKS algorithm, does not appear to be as efficient as the Miller–Rabin algorithm. Thus far, it has not supplanted this older, probabilistic technique.
Distribution of Primes
It is worth noting how many numbers are likely to be rejected before a prime number is found using the Miller–Rabin test, or any other test for primality. A result from number theory, known as the prime number theorem, states that the primes near n are spaced on the average one every ln(n) integers. Thus, on average, one would have to test on the order of ln(n) integers before a prime is found. Because all even integers can be immediately rejected, the correct figure is 0.5 ln(n). For example, if a prime on the order of magnitude of 2^200 were sought, then about 0.5 ln(2^200) ≈ 69 trials would be needed to find a prime. However, this figure is just an average. In some places along the number line, primes are closely packed, and in other places there are large gaps.
The two consecutive odd integers 1,000,000,000,061 and 1,000,000,000,063 are both prime. On the other hand, 1001! + 2, 1001! + 3, …, 1001! + 1000, 1001! + 1001 is a sequence of 1000 consecutive composite integers.
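The arithmetic behind both remarks can be checked with a few lines of Python. The sketch assumes only the estimate 0.5 ln(n) quoted above and the observation that j divides n! + j whenever 2 ≤ j ≤ n; the function names are made up for this example.

```python
import math


def expected_trials(bits):
    """Rough number of odd candidates tried before a prime near 2**bits is found."""
    return 0.5 * bits * math.log(2)      # 0.5 * ln(2**bits)


def has_small_divisor(n, j):
    """True when j divides n! + j, which makes n! + j composite for 2 <= j <= n."""
    return (math.factorial(n) + j) % j == 0


if __name__ == "__main__":
    print(expected_trials(200))
    print(all(has_small_divisor(1001, j) for j in range(2, 1002)))
```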
2.7 / THE CHINESE REMAINDER THEOREM 71
2.7 THE CHINESE REMAINDER THEOREM
One of the most useful results of number theory is the Chinese remainder theorem (CRT).¹⁰ In essence, the CRT says it is possible to reconstruct integers in a certain range from their residues modulo a set of pairwise relatively prime moduli.
¹⁰ The CRT is so called because it is believed to have been discovered by the Chinese mathematician Sun-Tsu in around 100 A.D.
The 10 integers in Z10, that is the integers 0 through 9, can be reconstructed from their two residues modulo 2 and 5 (the relatively prime factors of 10). Say the known residues of a decimal digit x are r_2 = 0 and r_5 = 3; that is, x mod 2 = 0 and x mod 5 = 3. Therefore, x is an even integer in Z10 whose remainder, on division by 5, is 3. The unique solution is x = 8.
The CRT can be stated in several ways. We present here a formulation that is most
useful from the point of view of this text. An alternative formulation is explored in
Problem 2.33. Let
M = ∏_{i=1}^{k} m_i
where the m_i are pairwise relatively prime; that is, gcd(m_i, m_j) = 1 for 1 ≤ i, j ≤ k, and i ≠ j. We can represent any integer A in Z_M by a k-tuple whose elements are in Z_{m_i} using the following correspondence:
A ↔ (a_1, a_2, …, a_k)   (2.15)
where A ∈ Z_M, a_i ∈ Z_{m_i}, and a_i = A mod m_i for 1 ≤ i ≤ k. The CRT makes two assertions.
1. The mapping of Equation (2.15) is a one-to-one correspondence (called a bijection) between Z_M and the Cartesian product Z_{m_1} × Z_{m_2} × ⋯ × Z_{m_k}. That is, for every integer A such that 0 ≤ A < M, there is a unique k-tuple (a_1, a_2, …, a_k) with 0 ≤ a_i < m_i that represents it, and for every such k-tuple (a_1, a_2, …, a_k), there is a unique integer A in Z_M.
2. Operations performed on the elements of Z_M can be equivalently performed on the corresponding k-tuples by performing the operation independently in each coordinate position in the appropriate system.
Let us demonstrate the first assertion. The transformation from $A$ to $(a_1, a_2, \ldots, a_k)$ is obviously unique; that is, each $a_i$ is uniquely calculated as $a_i = A \bmod m_i$. Computing $A$ from $(a_1, a_2, \ldots, a_k)$ can be done as follows. Let $M_i = M/m_i$ for $1 \le i \le k$. Note that $M_i = m_1 \times m_2 \times \cdots \times m_{i-1} \times m_{i+1} \times \cdots \times m_k$, so that $M_i \equiv 0 \pmod{m_j}$ for all $j \ne i$. Then let
$$c_i = M_i \times (M_i^{-1} \bmod m_i) \quad \text{for } 1 \le i \le k \qquad (2.16)$$
By the definition of $M_i$, it is relatively prime to $m_i$ and therefore has a unique multiplicative inverse mod $m_i$. So Equation (2.16) is well defined and produces a unique value $c_i$. We can now compute
$$A \equiv \left( \sum_{i=1}^{k} a_i c_i \right) \pmod{M} \qquad (2.17)$$
To show that the value of $A$ produced by Equation (2.17) is correct, we must show that $a_i = A \bmod m_i$ for $1 \le i \le k$. Note that $c_j \equiv M_j \equiv 0 \pmod{m_i}$ if $j \ne i$, and that $c_i \equiv 1 \pmod{m_i}$. It follows that $a_i = A \bmod m_i$.
The second assertion of the CRT, concerning arithmetic operations, follows
from the rules for modular arithmetic. That is, the second assertion can be stated as
follows: If
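$$A \leftrightarrow (a_1, a_2, \ldots, a_k)$$
$$B \leftrightarrow (b_1, b_2, \ldots, b_k)$$
then
$$(A + B) \bmod M \leftrightarrow ((a_1 + b_1) \bmod m_1, \ldots, (a_k + b_k) \bmod m_k)$$
$$(A - B) \bmod M \leftrightarrow ((a_1 - b_1) \bmod m_1, \ldots, (a_k - b_k) \bmod m_k)$$
$$(A \times B) \bmod M \leftrightarrow ((a_1 \times b_1) \bmod m_1, \ldots, (a_k \times b_k) \bmod m_k)$$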
One of the useful features of the Chinese remainder theorem is that it provides
a way to manipulate (potentially very large) numbers mod M in terms of tuples of
smaller numbers. This can be useful when M is 150 digits or more. However, note
that it is necessary to know beforehand the factorization of M.
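The reconstruction in Equations (2.16) and (2.17) can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `to_residues` and `crt_reconstruct` are our own, and the sketch assumes Python 3.8 or later, where the built-in `pow(x, -1, m)` returns a modular inverse. The moduli 37 and 49 and the operands 973 and 678 are chosen for illustration.

```python
from math import prod

def to_residues(A, moduli):
    """Map A to its tuple (a_1, ..., a_k) with a_i = A mod m_i."""
    return tuple(A % m for m in moduli)

def crt_reconstruct(residues, moduli):
    """Recover A in Z_M from its residues using c_i = M_i * (M_i^-1 mod m_i).

    Assumes the moduli are pairwise relatively prime.
    """
    M = prod(moduli)
    total = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        inverse = pow(M_i, -1, m_i)  # multiplicative inverse of M_i mod m_i
        total += a_i * M_i * inverse
    return total % M

moduli = (37, 49)
a = to_residues(973, moduli)
b = to_residues(678, moduli)
# Add coordinate by coordinate, each in its own modulus.
s = tuple((x + y) % m for x, y, m in zip(a, b, moduli))
print(a, b, s, crt_reconstruct(s, moduli))
```

The coordinate-wise sum is computed entirely with numbers smaller than 49; only the final reconstruction touches the full modulus 1813.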
To represent 973 mod 1813 as a pair of numbers mod 37 and 49, define
m1 = 37
m2 = 49
M = 1813
A = 973
We also have $M_1 = 49$ and $M_2 = 37$. Using the extended Euclidean algorithm, we compute $M_1^{-1} = 34 \bmod m_1$ and $M_2^{-1} = 4 \bmod m_2$. (Note that we only need to compute each $M_i$ and each $M_i^{-1}$ once.) Taking residues modulo 37 and 49, our representation of 973 is (11, 42), because 973 mod 37 = 11 and 973 mod 49 = 42.
Now suppose we want to add 678 to 973. What do we do to (11, 42)? First we compute $(678) \leftrightarrow (678 \bmod 37, 678 \bmod 49) = (12, 41)$. Then we add the tuples element-wise and reduce: $((11 + 12) \bmod 37, (42 + 41) \bmod 49) = (23, 34)$.
To verify that this has the correct effect, we compute
$$(23, 34) \leftrightarrow (a_1 M_1 M_1^{-1} + a_2 M_2 M_2^{-1}) \bmod M$$
$$= [(23)(49)(34) + (34)(37)(4)] \bmod 1813$$
$$= 43350 \bmod 1813$$
$$= 1651$$
and check that it is equal to (973 + 678) mod 1813 = 1651. Remember that in the above derivation, $M_1^{-1}$ is the multiplicative inverse of $M_1$ modulo $m_1$ and $M_2^{-1}$ is the multiplicative inverse of $M_2$ modulo $m_2$.
Suppose we want to multiply 1651 (mod 1813) by 73. We multiply (23, 34) by 73 and reduce to get $(23 \times 73 \bmod 37, 34 \times 73 \bmod 49) = (14, 32)$. It is easily verified that
$$(14, 32) \leftrightarrow [(14)(49)(34) + (32)(37)(4)] \bmod 1813$$
$$= 865$$
$$= 1651 \times 73 \bmod 1813$$
2.8 DISCRETE LOGARITHMS
Discrete logarithms are fundamental to a number of public-key algorithms, including Diffie–Hellman key exchange and the digital signature algorithm (DSA). This section provides a brief overview of discrete logarithms. For the interested reader, more detailed developments of this topic can be found in [ORE67] and [LEVE90].
The Powers of an Integer, Modulo n
Recall from Euler’s theorem [Equation (2.12)] that, for every a and n that are relatively prime,
$$a^{\phi(n)} \equiv 1 \pmod{n}$$
where $\phi(n)$, Euler’s totient function, is the number of positive integers less than n and relatively prime to n. Now consider the more general expression:
$$a^{m} \equiv 1 \pmod{n} \qquad (2.18)$$
If a and n are relatively prime, then there is at least one integer m that satisfies Equation (2.18), namely, $m = \phi(n)$. The least positive exponent m for which Equation (2.18) holds is referred to in several ways:
■ The order of a (mod n)
■ The exponent to which a belongs (mod n)
■ The length of the period generated by a
Table 2.7 shows all the powers of a, modulo 19 for all positive a < 19. The
length of the sequence for each base value is indicated by shading. Note the
following:
1. All sequences end in 1. This is consistent with the reasoning of the preceding
few paragraphs.
2. The length of a sequence divides $\phi(19) = 18$. That is, an integral number of
sequences occur in each row of the table.
3. Some of the sequences are of length 18. In this case, it is said that the base inte-
ger a generates (via powers) the set of nonzero integers modulo 19. Each such
integer is called a primitive root of the modulus 19.
More generally, we can say that the highest possible exponent to which a number can belong (mod n) is $\phi(n)$. If a number is of this order, it is referred to as a primitive root of n. The importance of this notion is that if a is a primitive root of n, then its powers
$$a, a^2, \ldots, a^{\phi(n)}$$
are distinct (mod n) and are all relatively prime to n. In particular, for a prime number p, if a is a primitive root of p, then
$$a, a^2, \ldots, a^{p-1}$$
are distinct (mod p). For the prime number 19, its primitive roots are 2, 3, 10, 13, 14, and 15.
Not all integers have primitive roots. In fact, the only integers with primitive roots are those of the form 2, 4, $p^{\alpha}$, and $2p^{\alpha}$, where p is any odd prime and $\alpha$ is a positive integer. The proof is not simple but can be found in many number theory books, including [ORE76].
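The notions of order and primitive root can be checked by direct computation for small moduli. The following Python sketch is one straightforward way to do so; the helper names `order`, `phi`, and `primitive_roots` are our own, and the brute-force approach is only intended for small n.

```python
from math import gcd

def order(a, n):
    """Least positive m with a**m ≡ 1 (mod n); requires gcd(a, n) == 1."""
    if gcd(a, n) != 1:
        raise ValueError("a and n must be relatively prime")
    m, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        m += 1
    return m

def phi(n):
    """Euler's totient function, computed by counting (small n only)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def primitive_roots(n):
    """All a in 1..n-1 whose order (mod n) equals phi(n)."""
    target = phi(n)
    return [a for a in range(1, n) if gcd(a, n) == 1 and order(a, n) == target]

print(order(7, 19))         # the period of the powers of 7 modulo 19
print(primitive_roots(19))  # the primitive roots of 19
print(primitive_roots(8))   # 8 is not of the form 2, 4, p^alpha, or 2p^alpha
```

For n = 19 the list returned agrees with the primitive roots stated in the text, and for n = 8 the list is empty.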
To see how a sequence of powers becomes periodic, consider the powers of 7, modulo 19:
$$7^1 \equiv 7 \pmod{19}$$
$$7^2 = 49 = 2 \times 19 + 11 \equiv 11 \pmod{19}$$
$$7^3 = 343 = 18 \times 19 + 1 \equiv 1 \pmod{19}$$
$$7^4 = 2401 = 126 \times 19 + 7 \equiv 7 \pmod{19}$$
$$7^5 = 16807 = 884 \times 19 + 11 \equiv 11 \pmod{19}$$
There is no point in continuing because the sequence is repeating. This can be proven by noting that $7^3 \equiv 1 \pmod{19}$, and therefore, $7^{3+j} \equiv 7^3 7^j \equiv 7^j \pmod{19}$, and hence, any two powers of 7 whose exponents differ by 3 (or a multiple of 3) are congruent to each other (mod 19). In other words, the sequence is periodic, and the length of the period is the smallest positive exponent m such that $7^m \equiv 1 \pmod{19}$.
Logarithms for Modular Arithmetic
With ordinary positive real numbers, the logarithm function is the inverse of expo-
nentiation. An analogous function exists for modular arithmetic.
Let us briefly review the properties of ordinary logarithms. The logarithm of a
number is defined to be the power to which some positive base (except 1) must be
raised in order to equal the number. That is, for base x and for a value y,
$$y = x^{\log_x(y)}$$
The properties of logarithms include
$$\log_x(1) = 0$$
$$\log_x(x) = 1$$
$$\log_x(yz) = \log_x(y) + \log_x(z) \qquad (2.19)$$
$$\log_x(y^r) = r \times \log_x(y) \qquad (2.20)$$
Table 2.7 Powers of Integers, Modulo 19
a a^2 a^3 a^4 a^5 a^6 a^7 a^8 a^9 a^10 a^11 a^12 a^13 a^14 a^15 a^16 a^17 a^18
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
2 4 8 16 13 7 14 9 18 17 15 11 3 6 12 5 10 1
3 9 8 5 15 7 2 6 18 16 10 11 14 4 12 17 13 1
4 16 7 9 17 11 6 5 1 4 16 7 9 17 11 6 5 1
5 6 11 17 9 7 16 4 1 5 6 11 17 9 7 16 4 1
6 17 7 4 5 11 9 16 1 6 17 7 4 5 11 9 16 1
7 11 1 7 11 1 7 11 1 7 11 1 7 11 1 7 11 1
8 7 18 11 12 1 8 7 18 11 12 1 8 7 18 11 12 1
9 5 7 6 16 11 4 17 1 9 5 7 6 16 11 4 17 1
10 5 12 6 3 11 15 17 18 9 14 7 13 16 8 4 2 1
11 7 1 11 7 1 11 7 1 11 7 1 11 7 1 11 7 1
12 11 18 7 8 1 12 11 18 7 8 1 12 11 18 7 8 1
13 17 12 4 14 11 10 16 18 6 2 7 15 5 8 9 3 1
14 6 8 17 10 7 3 4 18 5 13 11 2 9 12 16 15 1
15 16 12 9 2 11 13 5 18 4 3 7 10 17 8 6 14 1
16 9 11 5 4 7 17 6 1 16 9 11 5 4 7 17 6 1
17 4 11 16 6 7 5 9 1 17 4 11 16 6 7 5 9 1
18 1 18 1 18 1 18 1 18 1 18 1 18 1 18 1 18 1
Consider a primitive root a for some prime number p (the argument can be developed for nonprimes as well). Then we know that the powers of a from 1 through (p - 1) produce each integer from 1 through (p - 1) exactly once. We
also know that any integer b satisfies
$$b \equiv r \pmod{p} \quad \text{for some } r, \text{ where } 0 \le r \le (p - 1)$$
by the definition of modular arithmetic. It follows that for any integer b and a primitive root a of prime number p, we can find a unique exponent i such that
$$b \equiv a^i \pmod{p} \quad \text{where } 0 \le i \le (p - 1)$$
This exponent i is referred to as the discrete logarithm of the number b for the base a (mod p). We denote this value as $\operatorname{dlog}_{a,p}(b)$.¹¹
Note the following:
$$\operatorname{dlog}_{a,p}(1) = 0 \quad \text{because } a^0 \bmod p = 1 \bmod p = 1 \qquad (2.21)$$
$$\operatorname{dlog}_{a,p}(a) = 1 \quad \text{because } a^1 \bmod p = a \qquad (2.22)$$
¹¹ Many texts refer to the discrete logarithm as the index. There is no generally agreed notation for this concept, much less an agreed name.
Here is an example using a nonprime modulus, n = 9. Here $\phi(n) = 6$ and a = 2 is a primitive root. We compute the various powers of a and find
$$2^0 = 1 \qquad 2^4 \equiv 7 \pmod{9}$$
$$2^1 = 2 \qquad 2^5 \equiv 5 \pmod{9}$$
$$2^2 = 4 \qquad 2^6 \equiv 1 \pmod{9}$$
$$2^3 = 8$$
This gives us the following table of the numbers with given discrete logarithms
(mod 9) for the root a = 2:
Logarithm 0 1 2 3 4 5
Number 1 2 4 8 7 5
To make it easy to obtain the discrete logarithms of a given number, we rearrange
the table:
Number 1 2 4 5 7 8
Logarithm 0 1 2 5 4 3
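The two tables above can be generated mechanically. The following Python sketch, with a function name `dlog_table` of our own choosing, walks through the successive powers of the root and records the exponent at which each number first appears; it assumes the caller supplies a primitive root a and the value of $\phi(n)$.

```python
def dlog_table(a, n, phi_n):
    """Map each power of a (mod n) to its exponent, for exponents 0..phi_n - 1.

    Assumes a is a primitive root of n, so the phi_n powers are all distinct.
    """
    table = {}
    x = 1  # a**0
    for i in range(phi_n):
        table[x] = i
        x = (x * a) % n
    return table

table = dlog_table(2, 9, 6)
# Print in the rearranged order: sorted by number, showing its logarithm.
for number in sorted(table):
    print(number, table[number])
```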
Now consider
$$x = a^{\operatorname{dlog}_{a,p}(x)} \bmod p \qquad y = a^{\operatorname{dlog}_{a,p}(y)} \bmod p$$
$$xy = a^{\operatorname{dlog}_{a,p}(xy)} \bmod p$$
Using the rules of modular multiplication,
$$xy \bmod p = [(x \bmod p)(y \bmod p)] \bmod p$$
$$a^{\operatorname{dlog}_{a,p}(xy)} \bmod p = \left[\left(a^{\operatorname{dlog}_{a,p}(x)} \bmod p\right)\left(a^{\operatorname{dlog}_{a,p}(y)} \bmod p\right)\right] \bmod p$$
$$= \left(a^{\operatorname{dlog}_{a,p}(x) + \operatorname{dlog}_{a,p}(y)}\right) \bmod p$$
But now consider Euler’s theorem, which states that, for every a and n that are relatively prime,
$$a^{\phi(n)} \equiv 1 \pmod{n}$$
Any positive integer z can be expressed in the form $z = q + k\phi(n)$, with $0 \le q < \phi(n)$. Therefore, by Euler’s theorem,
$$a^z \equiv a^q \pmod{n} \quad \text{if } z \equiv q \bmod \phi(n)$$
Applying this to the foregoing equality, we have
$$\operatorname{dlog}_{a,p}(xy) \equiv [\operatorname{dlog}_{a,p}(x) + \operatorname{dlog}_{a,p}(y)] \pmod{\phi(p)}$$
and generalizing,
$$\operatorname{dlog}_{a,p}(y^r) \equiv [r \times \operatorname{dlog}_{a,p}(y)] \pmod{\phi(p)}$$
This demonstrates the analogy between true logarithms and discrete logarithms.
Keep in mind that unique discrete logarithms mod m to some base a exist only if a is a primitive root of m.
Table 2.8, which is directly derived from Table 2.7, shows the sets of discrete
logarithms that can be defined for modulus 19.
Calculation of Discrete Logarithms
Consider the equation
$$y = g^x \bmod p$$
Given g, x, and p, it is a straightforward matter to calculate y. At the worst, we must
perform x repeated multiplications, and algorithms exist for achieving greater effi-
ciency (see Chapter 9).
However, given y, g, and p, it is, in general, very difficult to calculate x (take the discrete logarithm). The difficulty seems to be on the same order of magnitude as that of factoring the large composite integers (products of primes) used in RSA. At the time of this writing, the asymptotically fastest known algorithm for taking discrete logarithms modulo a prime number is on the order of [BETH91]:
$$e^{\left((\ln p)^{1/3}\,(\ln(\ln p))^{2/3}\right)}$$
which is not feasible for large primes.
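The asymmetry between the two directions can be made concrete with a small Python sketch. The function names `mod_exp` and `dlog_exhaustive` are our own. The forward direction uses square-and-multiply, one well-known way to reduce the number of multiplications to roughly the number of bits in x; the reverse direction here is plain exhaustive search, which is only a baseline for illustration and is not the subexponential algorithm referred to above.

```python
def mod_exp(g, x, p):
    """Square-and-multiply: g**x mod p using O(log x) multiplications."""
    result, base = 1, g % p
    while x > 0:
        if x & 1:                      # current low bit of the exponent is 1
            result = (result * base) % p
        base = (base * base) % p       # square for the next bit
        x >>= 1
    return result

def dlog_exhaustive(y, g, p):
    """Try x = 0, 1, 2, ... until g**x ≡ y (mod p); O(p) steps in the worst case.

    Returns None if no exponent in 0..p-2 works.
    """
    value = 1
    for x in range(p - 1):
        if value == y % p:
            return x
        value = (value * g) % p
    return None

p, g, x = 19, 10, 7
y = mod_exp(g, x, p)
print(y, dlog_exhaustive(y, g, p))
```

Doubling the bit length of p adds only a handful of squarings to `mod_exp`, whereas it squares the number of candidates `dlog_exhaustive` may have to try.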
Table 2.8 Tables of Discrete Logarithms, Modulo 19
(a) Discrete logarithms to the base 2, modulo 19
a 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
log_{2,19}(a) 18 1 13 2 16 14 6 3 8 17 12 15 5 7 11 4 10 9
(b) Discrete logarithms to the base 3, modulo 19
a 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
log_{3,19}(a) 18 7 1 14 4 8 6 3 2 11 12 15 17 13 5 10 16 9
(c) Discrete logarithms to the base 10, modulo 19
a 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
log_{10,19}(a) 18 17 5 16 2 4 12 15 10 1 6 3 13 11 7 14 8 9
(d) Discrete logarithms to the base 13, modulo 19
a 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
log_{13,19}(a) 18 11 17 4 14 10 12 15 16 7 6 3 1 5 13 8 2 9
(e) Discrete logarithms to the base 14, modulo 19
a 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
log_{14,19}(a) 18 13 7 8 10 2 6 3 14 5 12 15 11 1 17 16 4 9
(f) Discrete logarithms to the base 15, modulo 19
a 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
log_{15,19}(a) 18 5 11 10 8 16 12 15 4 13 6 3 7 17 1 2 14 9
2.9 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS
Key Terms
bijection
composite number
commutative
Chinese remainder theorem
discrete logarithm
divisor
Euclidean algorithm
Euler’s theorem
Euler’s totient function
Fermat’s theorem
greatest common divisor
identity element
index
modular arithmetic
modulus
order
prime number
primitive root
relatively prime
residue
Review Questions
2.1 What does it mean to say that b is a divisor of a?
2.2 What is the meaning of the expression a divides b?
2.3 What is the difference between modular arithmetic and ordinary arithmetic?
2.4 What is a prime number?
2.5 What is Euler’s totient function?
2.6 The Miller–Rabin test can determine if a number is not prime but cannot determine
if a number is prime. How can such an algorithm be used to test for primality?
2.7 What is a primitive root of a number?
2.8 What is the difference between an index and a discrete logarithm?
Problems
2.1 Reformulate Equation (2.1), removing the restriction that a is a nonnegative integer.
That is, let a be any integer.
2.2 Draw a figure similar to Figure 2.1 for a < 0.
2.3 For each of the following equations, find an integer x that satisfies the equation.
a. $4x \equiv 2 \pmod{3}$
b. $7x \equiv 4 \pmod{9}$
c. $5x \equiv 3 \pmod{11}$
2.4 In this text, we assume that the modulus is a positive integer. But the definition of the
expression a mod n also makes perfect sense if n is negative. Determine the following:
a. 7 mod 4
b. 7 mod - 4
c. - 7 mod 4
d. - 7 mod - 4
2.5 A modulus of 0 does not fit the definition but is defined by convention as follows:
a mod 0 = a. With this definition in mind, what does the following expression mean:
$a \equiv b \pmod{0}$?
2.6 In Section 2.3, we define the congruence relationship as follows: Two integers a and b are said to be congruent modulo n if (a mod n) = (b mod n). We then proved that $a \equiv b \pmod{n}$ if $n \mid (a - b)$. Some texts on number theory use this latter relationship as the definition of congruence: Two integers a and b are said to be congruent modulo n if $n \mid (a - b)$. Using this latter definition as the starting point, prove that, if (a mod n) = (b mod n), then n divides (a - b).
2.7 What is the smallest positive integer that has exactly k divisors? Provide answers for
values for $1 \le k \le 8$.
2.8 Prove the following:
a. $a \equiv b \pmod{n}$ implies $b \equiv a \pmod{n}$
b. $a \equiv b \pmod{n}$ and $b \equiv c \pmod{n}$ imply $a \equiv c \pmod{n}$
2.9 Prove the following:
a. [(a mod n) - (b mod n)] mod n = (a - b) mod n
b. [(a mod n) * (b mod n)] mod n = (a * b) mod n
2.10 Find the multiplicative inverse of each nonzero element in Z5.
2.11 Show that an integer N is congruent modulo 9 to the sum of its decimal digits. For example, $723 \equiv 7 + 2 + 3 \equiv 12 \equiv 1 + 2 \equiv 3 \pmod{9}$. This is the basis for the familiar procedure of “casting out 9’s” when checking computations in arithmetic.
2.12 a. Determine gcd(72345, 43215)
b. Determine gcd(3486, 10292)
2.13 The purpose of this problem is to set an upper bound on the number of iterations of
the Euclidean algorithm.
a. Suppose that m = qn + r with q > 0 and $0 \le r < n$. Show that m/2 > r.
b. Let $A_i$ be the value of A in the Euclidean algorithm after the ith iteration. Show that
$$A_{i+2} < \frac{A_i}{2}$$
c. Show that if m, n, and N are integers with $1 \le m, n \le 2^N$, then the Euclidean algorithm takes at most 2N steps to find gcd(m, n).
2.14 The Euclidean algorithm has been known for over 2000 years and has always been
a favorite among number theorists. After these many years, there is now a potential
competitor, invented by J. Stein in 1961. Stein’s algorithm is as follows: Determine gcd(A, B) with $A, B \ge 1$.
STEP 1 Set $A_1 = A$, $B_1 = B$, $C_1 = 1$
STEP 2 For n > 1,
(1) If $A_n = B_n$, stop. $\gcd(A, B) = A_n C_n$
(2) If $A_n$ and $B_n$ are both even, set $A_{n+1} = A_n/2$, $B_{n+1} = B_n/2$, $C_{n+1} = 2C_n$
(3) If $A_n$ is even and $B_n$ is odd, set $A_{n+1} = A_n/2$, $B_{n+1} = B_n$, $C_{n+1} = C_n$
(4) If $A_n$ is odd and $B_n$ is even, set $A_{n+1} = A_n$, $B_{n+1} = B_n/2$, $C_{n+1} = C_n$
(5) If $A_n$ and $B_n$ are both odd, set $A_{n+1} = |A_n - B_n|$, $B_{n+1} = \min(B_n, A_n)$, $C_{n+1} = C_n$
Continue to step n + 1.

a. To get a feel for the two algorithms, compute gcd(6150, 704) using both the Euclid-

ean and Stein’s algorithm.

b. What is the apparent advantage of Stein’s algorithm over the Euclidean algorithm?

2.15 a. Show that if Stein’s algorithm does not stop before the nth step, then

$$C_{n+1} \times \gcd(A_{n+1}, B_{n+1}) = C_n \times \gcd(A_n, B_n)$$

b. Show that if the algorithm does not stop before step (n – 1), then

$$A_{n+2} B_{n+2} \le \frac{A_n B_n}{2}$$

c. Show that if $1 \le A, B \le 2^N$, then Stein’s algorithm takes at most 4N steps to find gcd(A, B). Thus, Stein’s algorithm works in roughly the same number of steps as the Euclidean algorithm.

d. Demonstrate that Stein’s algorithm does indeed return gcd(A, B).

2.16 Using the extended Euclidean algorithm, find the multiplicative inverse of

a. 135 mod 61

b. 7465 mod 2464

c. 42828 mod 6407

2.17 The purpose of this problem is to determine how many prime numbers there

are. Suppose there are a total of n prime numbers, and we list these in order:

$p_1 = 2 < p_2 = 3 < p_3 = 5 < \cdots < p_n$.

a. Define $X = 1 + p_1 p_2 \cdots p_n$. That is, X is equal to one plus the product of all the primes. Can we find a prime number $P_m$ that divides X?

b. What can you say about m?

c. Deduce that the total number of primes cannot be finite.

d. Show that $P_{n+1} \le 1 + p_1 p_2 \cdots p_n$.

2.18 The purpose of this problem is to demonstrate that the probability that two random

numbers are relatively prime is about 0.6.

a. Let P = Pr[gcd(a, b) = 1]. Show that $\Pr[\gcd(a, b) = d] = P/d^2$. Hint: Consider the quantity $\gcd\left(\frac{a}{d}, \frac{b}{d}\right)$.

b. The sum of the result of part (a) over all possible values of d is 1. That is, $\sum_{d \ge 1} \Pr[\gcd(a, b) = d] = 1$. Use this equality to determine the value of P. Hint: Use the identity $\sum_{i=1}^{\infty} \frac{1}{i^2} = \frac{\pi^2}{6}$.

2.19 Why is gcd(n, n + 1) = 1 for two consecutive integers n and n + 1?

2.20 Using Fermat’s theorem, find $4^{225} \bmod 13$.

2.21 Use Fermat’s theorem to find a number a between 0 and 92 with a congruent to $7^{1013}$ modulo 93.

2.22 Use Fermat’s theorem to find a number x between 0 and 37 with $x^{73}$ congruent to 4 modulo 37. (You should not need to use any brute-force searching.)

2.23 Use Euler’s theorem to find a number a between 0 and 9 such that a is congruent to $9^{101}$ modulo 10. (Note: This is the same as the last digit of the decimal expansion of $9^{100}$.)

2.24 Use Euler’s theorem to find a number x between 0 and 14 with $x^{61}$ congruent to 7 modulo 15. (You should not need to use any brute-force searching.)

2.25 Notice in Table 2.6 that $\phi(n)$ is even for n > 2. This is true for all n > 2. Give a concise argument why this is so.

2.26 Prove the following: If p is prime, then $\phi(p^i) = p^i - p^{i-1}$. Hint: What numbers have a factor in common with $p^i$?

2.27 It can be shown (see any book on number theory) that if gcd(m, n) = 1 then $\phi(mn) = \phi(m)\phi(n)$. Using this property, the property developed in the preceding problem, and the property that $\phi(p) = p - 1$ for p prime, it is straightforward to determine the value of $\phi(n)$ for any n. Determine the following:

a. $\phi(29)$ b. $\phi(51)$ c. $\phi(455)$ d. $\phi(616)$

2.28 It can also be shown that for arbitrary positive integer a, $\phi(a)$ is given by

$$\phi(a) = \prod_{i=1}^{t} \left[ p_i^{a_i - 1} (p_i - 1) \right]$$

where a is given by Equation (2.9), namely: $a = p_1^{a_1} p_2^{a_2} \cdots p_t^{a_t}$. Demonstrate this result.

2.29 Consider the function: f(n) = number of elements in the set $\{a : 0 \le a < n \text{ and } \gcd(a, n) = 1\}$. What is this function?

2.30 Although ancient Chinese mathematicians did good work coming up with their

remainder theorem, they did not always get it right. They had a test for primality. The

test said that n is prime if and only if n divides $(2^n - 2)$.

a. Give an example that satisfies the condition using an odd prime.

b. The condition is obviously true for n = 2. Prove that the condition is true if n is an

odd prime (proving the if condition).

c. Give an example of an odd n that is not prime and that does not satisfy the condi-

tion. You can do this with nonprime numbers up to a very large value. This misled

the Chinese mathematicians into thinking that if the condition is true then n is prime.

d. Unfortunately, the ancient Chinese never tried n = 341, which is nonprime ($341 = 11 \times 31$), yet 341 divides $2^{341} - 2$ without remainder. Demonstrate that $2^{341} \equiv 2 \pmod{341}$ (disproving the only if condition). Hint: It is not necessary to calculate $2^{341}$; play around with the congruences instead.


2.31 Show that, if n is an odd composite integer, then the Miller–Rabin test will return

inconclusive for a = 1 and a = (n – 1).

2.32 If n is composite and passes the Miller–Rabin test for the base a, then n is called

a strong pseudoprime to the base a. Show that 2047 is a strong pseudoprime to the

base 2.

2.33 A common formulation of the Chinese remainder theorem (CRT) is as follows: Let

$m_1, \ldots, m_k$ be integers that are pairwise relatively prime for $1 \le i, j \le k$, and $i \ne j$. Define M to be the product of all the $m_i$’s. Let $a_1, \ldots, a_k$ be integers. Then the set of congruences:

$$x \equiv a_1 \pmod{m_1}$$
$$x \equiv a_2 \pmod{m_2}$$
$$\vdots$$
$$x \equiv a_k \pmod{m_k}$$

has a unique solution modulo M. Show that the theorem stated in this form is true.

2.34 The example used by Sun-Tsu to illustrate the CRT was

$$x \equiv 2 \pmod{3}; \quad x \equiv 3 \pmod{5}; \quad x \equiv 2 \pmod{7}$$

Solve for x.

2.35 Six professors begin courses on Monday, Tuesday, Wednesday, Thursday, Friday,

and Saturday, respectively, and announce their intentions of lecturing at intervals of

3, 2, 5, 6, 1, and 4 days, respectively. The regulations of the university forbid Sunday

lectures (so that a Sunday lecture must be omitted). When first will all six professors

find themselves compelled to omit a lecture? Hint: Use the CRT.

2.36 Find all primitive roots of 37.

2.37 Given 5 as a primitive root of 23, construct a table of discrete logarithms, and use it to

solve the following congruences.

a. $3x^5 \equiv 2 \pmod{23}$

b. $7x^{10} + 1 \equiv 0 \pmod{23}$

c. $5^x \equiv 6 \pmod{23}$

Programming Problems

2.1 Write a computer program that implements fast exponentiation (successive squaring)

modulo n.

2.2 Write a computer program that implements the Miller–Rabin algorithm for a user-

specified n. The program should allow the user two choices: (1) specify a possible

witness a to test using the Witness procedure or (2) specify a number s of random

witnesses for the Miller–Rabin test to check.

APPENDIX 2A THE MEANING OF MOD

The operator mod is used in this book and in the literature in two different ways: as

a binary operator and as a congruence relation. This appendix explains the distinc-

tion and precisely defines the notation used in this book regarding parentheses. This

notation is common but, unfortunately, not universal.


The Binary Operator mod

If a is an integer and n is a positive integer, we define a mod n to be the remainder

when a is divided by n. The integer n is called the modulus, and the remainder is

called the residue. Thus, for any integer a, we can always write

$$a = \lfloor a/n \rfloor \times n + (a \bmod n)$$

Formally, we define the operator mod as

$$a \bmod n = a - \lfloor a/n \rfloor \times n \quad \text{for } n \ne 0$$

As a binary operation, mod takes two integer arguments and returns the re-

mainder. For example, 7 mod 3 = 1. The arguments may be integers, integer vari-

ables, or integer variable expressions. For example, all of the following are valid,

with the obvious meanings:

7 mod 3

7 mod m

x mod 3

x mod m

$(x^2 + y + 1) \bmod (2m + n)$

where all of the variables are integers. In each case, the left-hand term is divided by

the right-hand term, and the resulting value is the remainder. Note that if either the

left- or right-hand argument is an expression, the expression is parenthesized. The

operator mod is not inside parentheses.

In fact, the mod operation also works if the two arguments are arbitrary real num-

bers, not just integers. In this book, we are concerned only with the integer operation.

The Congruence Relation mod

As a congruence relation, mod expresses that two arguments have the same remainder with respect to a given modulus. For example, $7 \equiv 4 \pmod{3}$ expresses the fact that both 7 and 4 have a remainder of 1 when divided by 3. The following two expressions are equivalent:

$$a \equiv b \pmod{m} \iff a \bmod m = b \bmod m$$

Another way of expressing it is to say that the expression $a \equiv b \pmod{m}$ is the same as saying that a – b is an integral multiple of m. Again, all the arguments may

be integers, integer variables, or integer variable expressions. For example, all of

the following are valid, with the obvious meanings:

$7 \equiv 4 \pmod{3}$

$x \equiv y \pmod{m}$

$(x^2 + y + 1) \equiv (a + 1) \pmod{[m + n]}$

where all of the variables are integers. Two conventions are used. The congruence sign is $\equiv$. The modulus for the relation is defined by placing the mod operator followed by the modulus in parentheses.

The congruence relation is used to define residue classes. Those numbers that

have the same remainder r when divided by m form a residue class (mod m). There

are m residue classes (mod m). For a given remainder r, the residue class to which it

belongs consists of the numbers

$$r, \; r \pm m, \; r \pm 2m, \; \ldots$$

According to our definition, the congruence

$$a \equiv b \pmod{m}$$

signifies that the numbers a and b differ by a multiple of m. Consequently, the con-

gruence can also be expressed in the terms that a and b belong to the same residue

class (mod m).
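The two uses of mod can be contrasted in a short Python sketch. The function names `mod` and `congruent` are our own. The sketch relies on the fact that Python's integer floor division `//` rounds toward negative infinity, so it matches the floor in the formal definition of the binary operator; other programming languages may define their remainder operator differently for negative operands.

```python
def mod(a, n):
    """Binary operator: a mod n = a - floor(a/n) * n, for n != 0."""
    if n == 0:
        raise ZeroDivisionError("a mod n is defined here only for n != 0")
    return a - (a // n) * n  # // is floor division on Python integers

def congruent(a, b, m):
    """Congruence relation: a ≡ b (mod m) iff a mod m equals b mod m."""
    return mod(a, m) == mod(b, m)

print(mod(7, 3))           # binary operator applied to 7 and 3
print(mod(-11, 7))         # a negative left-hand argument
print(congruent(7, 4, 3))  # 7 and 4 leave the same remainder mod 3
print(congruent(7, 5, 3))  # 7 and 5 do not

# Equivalent test: a ≡ b (mod m) exactly when a - b is a multiple of m.
print(mod(7 - 4, 3) == 0)
```

The last line checks the alternative characterization of congruence, which is also the statement that a and b lie in the same residue class (mod m).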


PART TWO: SYMMETRIC CIPHERS

CHAPTER 3

Classical Encryption Techniques

3.1 Symmetric Cipher Model

Cryptography

Cryptanalysis and Brute-Force Attack

3.2 Substitution Techniques

Caesar Cipher

Monoalphabetic Ciphers

Playfair Cipher

Hill Cipher

Polyalphabetic Ciphers

One-Time Pad

3.3 Transposition Techniques

3.4 Rotor Machines

3.5 Steganography

3.6 Key Terms, Review Questions, and Problems


Symmetric encryption, also referred to as conventional encryption or single-key

encryption, was the only type of encryption in use prior to the development of public-

key encryption in the 1970s. It remains by far the most widely used of the two types

of encryption. Part Two examines a number of symmetric ciphers. In this chapter, we

begin with a look at a general model for the symmetric encryption process; this will

enable us to understand the context within which the algorithms are used. Next, we

examine a variety of algorithms in use before the computer era. Finally, we look briefly

at a different approach known as steganography. Chapters 4 and 6 introduce the two

most widely used symmetric ciphers: DES and AES.

Before beginning, we define some terms. An original message is known as the

plaintext, while the coded message is called the ciphertext. The process of convert-

ing from plaintext to ciphertext is known as enciphering or encryption; restoring the

plaintext from the ciphertext is deciphering or decryption. The many schemes used

for encryption constitute the area of study known as cryptography. Such a scheme

is known as a cryptographic system or a cipher. Techniques used for deciphering a

message without any knowledge of the enciphering details fall into the area of crypt-

analysis. Cryptanalysis is what the layperson calls “breaking the code.” The areas of

cryptography and cryptanalysis together are called cryptology.

LEARNING OBJECTIVES

After studying this chapter, you should be able to:

◆ Present an overview of the main concepts of symmetric cryptography.

◆ Explain the difference between cryptanalysis and brute-force attack.

◆ Understand the operation of a monoalphabetic substitution cipher.

◆ Understand the operation of a polyalphabetic cipher.

◆ Present an overview of the Hill cipher.

◆ Describe the operation of a rotor machine.

3.1 SYMMETRIC CIPHER MODEL

A symmetric encryption scheme has five ingredients (Figure 3.1):

■ Plaintext: This is the original intelligible message or data that is fed into the algorithm as input.

■ Encryption algorithm: The encryption algorithm performs various substitutions and transformations on the plaintext.

■ Secret key: The secret key is also input to the encryption algorithm. The key is a value independent of the plaintext and of the algorithm. The algorithm will produce a different output depending on the specific key being used at the time. The exact substitutions and transformations performed by the algorithm depend on the key.

■ Ciphertext: This is the scrambled message produced as output. It depends on

the plaintext and the secret key. For a given message, two different keys will

produce two different ciphertexts. The ciphertext is an apparently random

stream of data and, as it stands, is unintelligible.

■ Decryption algorithm: This is essentially the encryption algorithm run in

reverse. It takes the ciphertext and the secret key and produces the original

plaintext.

There are two requirements for secure use of conventional encryption:

1. We need a strong encryption algorithm. At a minimum, we would like the algo-

rithm to be such that an opponent who knows the algorithm and has access to

one or more ciphertexts would be unable to decipher the ciphertext or figure

out the key. This requirement is usually stated in a stronger form: The oppo-

nent should be unable to decrypt ciphertext or discover the key even if he or

she is in possession of a number of ciphertexts together with the plaintext that

produced each ciphertext.

2. Sender and receiver must have obtained copies of the secret key in a secure

fashion and must keep the key secure. If someone can discover the key and

knows the algorithm, all communication using this key is readable.

We assume that it is impractical to decrypt a message on the basis of the

ciphertext plus knowledge of the encryption/decryption algorithm. In other words,

we do not need to keep the algorithm secret; we need to keep only the key secret.

This feature of symmetric encryption is what makes it feasible for widespread use.

The fact that the algorithm need not be kept secret means that manufacturers can develop, and have developed, low-cost chip implementations of data encryption algorithms.

These chips are widely available and incorporated into a number of products. With

the use of symmetric encryption, the principal security problem is maintaining the

secrecy of the key.

[Figure 3.1 Simplified Model of Symmetric Encryption: plaintext input X enters the encryption algorithm (e.g., AES), which produces the transmitted ciphertext Y = E(K, X); the decryption algorithm (reverse of encryption algorithm) computes X = D(K, Y) as plaintext output. The secret key K is shared by sender and recipient.]

Let us take a closer look at the essential elements of a symmetric encryption scheme, using Figure 3.2. A source produces a message in plaintext, $X = [X_1, X_2, \ldots, X_M]$. The M elements of X are letters in some finite alphabet. Traditionally, the alphabet usually consisted of the 26 capital letters. Nowadays, the binary alphabet {0, 1} is typically used. For encryption, a key of the form $K = [K_1, K_2, \ldots, K_J]$ is generated. If the key is generated at the message source, then it must also be provided to the destination by means of some secure channel. Alternatively, a third party could generate the key and securely deliver it to both source and destination.

With the message X and the encryption key K as input, the encryption algorithm forms the ciphertext $Y = [Y_1, Y_2, \ldots, Y_N]$. We can write this as

Y = E(K, X)

This notation indicates that Y is produced by using encryption algorithm E as a

function of the plaintext X, with the specific function determined by the value of

the key K.

The intended receiver, in possession of the key, is able to invert the

transformation:

X = D(K, Y)
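The notation Y = E(K, X) and X = D(K, Y) can be made concrete with a deliberately simple toy cipher. The Python sketch below is our own illustration, not an algorithm recommended for use: it takes the alphabet to be the 26 capital letters and lets the key K be a single shift amount, in the style of the Caesar cipher listed among the substitution techniques of this chapter. It offers no real security; its only purpose is to show that D with the same key inverts E.

```python
import string

ALPHABET = string.ascii_uppercase  # the 26 capital letters

def E(K, X):
    """Toy encryption: shift each letter of X forward by K positions."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + K) % 26] for ch in X)

def D(K, Y):
    """Toy decryption: shift each letter of Y back by K positions."""
    return "".join(ALPHABET[(ALPHABET.index(ch) - K) % 26] for ch in Y)

K = 3                # secret key shared by sender and recipient (assumed value)
X = "MEETME"         # plaintext; capital letters only in this sketch
Y = E(K, X)          # ciphertext
print(Y, D(K, Y))
```

Using a different key with D returns a different string, which is the sense in which the specific function is determined by the value of K.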

An opponent, observing Y but not having access to K or X, may attempt to recover X or K or both X and K. It is assumed that the opponent knows the encryption (E) and decryption (D) algorithms. If the opponent is interested in only this particular message, then the focus of the effort is to recover X by generating a plaintext estimate $\hat{X}$. Often, however, the opponent is interested in being able to read future messages as well, in which case an attempt is made to recover K by generating an estimate $\hat{K}$.

[Figure 3.2 Model of Symmetric Cryptosystem: the message source supplies X to the encryption algorithm, which uses the key K from the key source to produce Y = E(K, X); K is delivered to the decryption algorithm over a secure channel; the decryption algorithm passes X to the destination; a cryptanalyst observes the transmitted ciphertext.]

Cryptography

Cryptographic systems are characterized along three independent dimensions:

1. The type of operations used for transforming plaintext to ciphertext. All

encryption algorithms are based on two general principles: substitution,

in which each element in the plaintext (bit, letter, group of bits or letters)

is mapped into another element, and transposition, in which elements

in the plaintext are rearranged. The fundamental requirement is that no

information be lost (i.e., that all operations are reversible). Most systems,

referred to as product systems, involve multiple stages of substitutions and

transpositions.

2. The number of keys used. If both sender and receiver use the same key, the

system is referred to as symmetric, single-key, secret-key, or conventional

encryption. If the sender and receiver use different keys, the system is referred

to as asymmetric, two-key, or public-key encryption.

3. The way in which the plaintext is processed. A block cipher processes the input

one block of elements at a time, producing an output block for each input

block. A stream cipher processes the input elements continuously, producing

output one element at a time, as it goes along.

Cryptanalysis and Brute-Force Attack

Typically, the objective of attacking an encryption system is to recover the key in

use rather than simply to recover the plaintext of a single ciphertext. There are two

general approaches to attacking a conventional encryption scheme:

■ Cryptanalysis: Cryptanalytic attacks rely on the nature of the algorithm plus

perhaps some knowledge of the general characteristics of the plaintext or even

some sample plaintext–ciphertext pairs. This type of attack exploits the charac-

teristics of the algorithm to attempt to deduce a specific plaintext or to deduce

the key being used.

■ Brute-force attack: The attacker tries every possible key on a piece of cipher-

text until an intelligible translation into plaintext is obtained. On average, half

of all possible keys must be tried to achieve success.

If either type of attack succeeds in deducing the key, the effect is catastrophic:

All future and past messages encrypted with that key are compromised.

We first consider cryptanalysis and then discuss brute-force attacks.

Table 3.1 summarizes the various types of cryptanalytic attacks based on the

amount of information known to the cryptanalyst. The most difficult problem is

presented when all that is available is the ciphertext only. In some cases, not even

the encryption algorithm is known, but in general, we can assume that the opponent

does know the algorithm used for encryption. One possible attack under these cir-

cumstances is the brute-force approach of trying all possible keys. If the key space

is very large, this becomes impractical. Thus, the opponent must rely on an analysis

of the ciphertext itself, generally applying various statistical tests to it. To use this


approach, the opponent must have some general idea of the type of plaintext that

is concealed, such as English or French text, an EXE file, a Java source listing, an

accounting file, and so on.

The ciphertext-only attack is the easiest to defend against because the oppo-

nent has the least amount of information to work with. In many cases, however,

the analyst has more information. The analyst may be able to capture one or more

plaintext messages as well as their encryptions. Or the analyst may know that certain

plaintext patterns will appear in a message. For example, a file that is encoded in the

Postscript format always begins with the same pattern, or there may be a standard-

ized header or banner to an electronic funds transfer message, and so on. All these

are examples of known plaintext. With this knowledge, the analyst may be able to

deduce the key on the basis of the way in which the known plaintext is transformed.

Closely related to the known-plaintext attack is what might be referred to as a

probable-word attack. If the opponent is working with the encryption of some gen-

eral prose message, he or she may have little knowledge of what is in the message.

However, if the opponent is after some very specific information, then parts of the

message may be known. For example, if an entire accounting file is being transmit-

ted, the opponent may know the placement of certain key words in the header of the

file. As another example, the source code for a program developed by Corporation

X might include a copyright statement in some standardized position.

If the analyst is able somehow to get the source system to insert into the sys-

tem a message chosen by the analyst, then a chosen-plaintext attack is possible.

An example of this strategy is differential cryptanalysis, explored in Appendix S.

Type of Attack: Known to Cryptanalyst

Ciphertext Only:
■ Encryption algorithm
■ Ciphertext

Known Plaintext:
■ Encryption algorithm
■ Ciphertext
■ One or more plaintext–ciphertext pairs formed with the secret key

Chosen Plaintext:
■ Encryption algorithm
■ Ciphertext
■ Plaintext message chosen by cryptanalyst, together with its corresponding ciphertext generated with the secret key

Chosen Ciphertext:
■ Encryption algorithm
■ Ciphertext
■ Ciphertext chosen by cryptanalyst, together with its corresponding decrypted plaintext generated with the secret key

Chosen Text:
■ Encryption algorithm
■ Ciphertext
■ Plaintext message chosen by cryptanalyst, together with its corresponding ciphertext generated with the secret key
■ Ciphertext chosen by cryptanalyst, together with its corresponding decrypted plaintext generated with the secret key

Table 3.1 Types of Attacks on Encrypted Messages


In general, if the analyst is able to choose the messages to encrypt, the analyst may

deliberately pick patterns that can be expected to reveal the structure of the key.

Table 3.1 lists two other types of attack: chosen ciphertext and chosen text.

These are less commonly employed as cryptanalytic techniques but are nevertheless

possible avenues of attack.

Only relatively weak algorithms fail to withstand a ciphertext-only attack.

Generally, an encryption algorithm is designed to withstand a known-plaintext

attack.

Two more definitions are worthy of note. An encryption scheme is

unconditionally secure if the ciphertext generated by the scheme does not contain

enough information to determine uniquely the corresponding plaintext, no matter

how much ciphertext is available. That is, no matter how much time an opponent

has, it is impossible for him or her to decrypt the ciphertext simply because the

required information is not there. With the exception of a scheme known as the

one-time pad (described later in this chapter), there is no encryption algorithm that

is unconditionally secure. Therefore, all that the users of an encryption algorithm

can strive for is an algorithm that meets one or both of the following criteria:

■ The cost of breaking the cipher exceeds the value of the encrypted information.

■ The time required to break the cipher exceeds the useful lifetime of the

information.

An encryption scheme is said to be computationally secure if either of the

foregoing two criteria is met. Unfortunately, it is very difficult to estimate the

amount of effort required to cryptanalyze ciphertext successfully.

All forms of cryptanalysis for symmetric encryption schemes are designed

to exploit the fact that traces of structure or pattern in the plaintext may survive

encryption and be discernible in the ciphertext. This will become clear as we exam-

ine various symmetric encryption schemes in this chapter. We will see in Part Two

that cryptanalysis for public-key schemes proceeds from a fundamentally different

premise, namely, that the mathematical properties of the pair of keys may make it

possible for one of the two keys to be deduced from the other.

A brute-force attack involves trying every possible key until an intelligible

translation of the ciphertext into plaintext is obtained. On average, half of all pos-

sible keys must be tried to achieve success. That is, if there are X different keys, on

average an attacker would discover the actual key after X/2 tries. It is important to

note that there is more to a brute-force attack than simply running through all pos-

sible keys. Unless known plaintext is provided, the analyst must be able to recognize

plaintext as plaintext. If the message is just plain text in English, then the result pops

out easily, although the task of recognizing English would have to be automated. If

the text message has been compressed before encryption, then recognition is more

difficult. And if the message is some more general type of data, such as a numeri-

cal file, and this has been compressed, the problem becomes even more difficult to

automate. Thus, to supplement the brute-force approach, some degree of knowl-

edge about the expected plaintext is needed, and some means of automatically dis-

tinguishing plaintext from garble is also needed.
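The need to distinguish plaintext from garble automatically can be illustrated with a very crude scoring function. The sketch below is only an assumption about how such a test might look: the set of "common" letters and the names `english_score` and `best_candidate` are hypothetical, and a practical recognizer would use fuller language statistics.

```python
# Hypothetical helper: the choice of "common" letters is an assumption,
# not something specified in the text.
COMMON_LETTERS = set("etaoinshr")


def english_score(text: str) -> float:
    """Return the fraction of letters in text that are common English letters."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    if not letters:
        return 0.0
    return sum(ch in COMMON_LETTERS for ch in letters) / len(letters)


def best_candidate(candidates):
    """Pick the candidate decryption that looks most like English."""
    return max(candidates, key=english_score)


trial_decryptions = [
    "oggv og chvgt vjg vqic rctva",
    "nffu nf bgufs uif uphb qbsuz",
    "meet me after the toga party",
]
print(best_candidate(trial_decryptions))
```

A score of this kind would be of little use on compressed or non-textual data, which is the difficulty the paragraph above describes.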


3.2 SUBSTITUTION TECHNIQUES

In this section and the next, we examine a sampling of what might be called classical

encryption techniques. A study of these techniques enables us to illustrate the basic

approaches to symmetric encryption used today and the types of cryptanalytic at-

tacks that must be anticipated.

The two basic building blocks of all encryption techniques are substitution

and transposition. We examine these in the next two sections. Finally, we discuss a

system that combines both substitution and transposition.

A substitution technique is one in which the letters of plaintext are replaced

by other letters or by numbers or symbols.¹ If the plaintext is viewed as a sequence

of bits, then substitution involves replacing plaintext bit patterns with ciphertext bit

patterns.

Caesar Cipher

The earliest known, and the simplest, use of a substitution cipher was by Julius

Caesar. The Caesar cipher involves replacing each letter of the alphabet with the

letter standing three places further down the alphabet. For example,

plain: meet me after the toga party

cipher: PHHW PH DIWHU WKH WRJD SDUWB

Note that the alphabet is wrapped around, so that the letter following Z is A.

We can define the transformation by listing all possibilities, as follows:

plain: a b c d e f g h i j k l m n o p q r s t u v w x y z

cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

Let us assign a numerical equivalent to each letter:

a b c d e f g h i j k l m

0 1 2 3 4 5 6 7 8 9 10 11 12

n o p q r s t u v w x y z

13 14 15 16 17 18 19 20 21 22 23 24 25

Then the algorithm can be expressed as follows. For each plaintext letter p, substitute the ciphertext letter C:²

C = E(3, p) = (p + 3) mod 26

A shift may be of any amount, so that the general Caesar algorithm is

C = E(k, p) = (p + k) mod 26 (3.1)

¹ When letters are involved, the following conventions are used in this book. Plaintext is always in lowercase; ciphertext is in uppercase; key values are in italicized lowercase.

² We define a mod n to be the remainder when a is divided by n. For example, 11 mod 7 = 4. See Chapter 2 for a further discussion of modular arithmetic.

3.2 / SUBSTITUTION TECHNIQUES 93

where k takes on a value in the range 1 to 25. The decryption algorithm is simply

p = D(k, C) = (C – k) mod 26 (3.2)
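Equations (3.1) and (3.2) translate almost directly into code. The following is a minimal sketch, assuming the book's convention of lowercase plaintext and uppercase ciphertext; the function names are illustrative, and non-alphabetic characters are passed through unchanged as an assumption of this sketch.

```python
ALPHABET_SIZE = 26


def caesar_encrypt(plaintext: str, k: int) -> str:
    """C = (p + k) mod 26, with a = 0, ..., z = 25; output in uppercase."""
    out = []
    for ch in plaintext.lower():
        if "a" <= ch <= "z":
            p = ord(ch) - ord("a")
            out.append(chr((p + k) % ALPHABET_SIZE + ord("A")))
        else:
            out.append(ch)  # assumption: spaces and punctuation are kept as-is
    return "".join(out)


def caesar_decrypt(ciphertext: str, k: int) -> str:
    """p = (C - k) mod 26; output in lowercase."""
    out = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            c = ord(ch) - ord("A")
            out.append(chr((c - k) % ALPHABET_SIZE + ord("a")))
        else:
            out.append(ch)
    return "".join(out)


print(caesar_encrypt("meet me after the toga party", 3))
print(caesar_decrypt("PHHW PH DIWHU WKH WRJD SDUWB", 3))
```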

If it is known that a given ciphertext is a Caesar cipher, then a brute-force

cryptanalysis is easily performed: simply try all the 25 possible keys. Figure 3.3

shows the results of applying this strategy to the example ciphertext. In this case, the

plaintext leaps out as occupying the third line.

Three important characteristics of this problem enabled us to use a brute-

force cryptanalysis:

1. The encryption and decryption algorithms are known.

2. There are only 25 keys to try.

3. The language of the plaintext is known and easily recognizable.
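The brute-force procedure behind Figure 3.3 can be sketched in a few lines. The helper names here are hypothetical; the code simply applies Equation (3.2) for each of the 25 keys and leaves the recognition of the plaintext to the reader.

```python
def shift_back(ciphertext: str, k: int) -> str:
    """Decrypt a Caesar ciphertext with key k, producing lowercase plaintext."""
    out = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") - k) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)


def brute_force_caesar(ciphertext: str) -> dict:
    """Return the trial decryption for every key from 1 to 25."""
    return {k: shift_back(ciphertext, k) for k in range(1, 26)}


candidates = brute_force_caesar("PHHW PH DIWHU WKH WRJD SDUWB")
for key, text in candidates.items():
    print(f"{key:2d} {text}")
```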

In most networking situations, we can assume that the algorithms are known.

What generally makes brute-force cryptanalysis impractical is the use of an algo-

rithm that employs a large number of keys. For example, the triple DES algorithm,

Figure 3.3 Brute-Force Cryptanalysis of Caesar Cipher

PHHW PH DIWHU WKH WRJD SDUWB

KEY

1 oggv og chvgt vjg vqic rctva

2 nffu nf bgufs uif uphb qbsuz

3 meet me after the toga party

4 ldds ld zesdq sgd snfz ozqsx

5 kccr kc ydrcp rfc rmey nyprw

6 jbbq jb xcqbo qeb qldx mxoqv

7 iaap ia wbpan pda pkcw lwnpu

8 hzzo hz vaozm ocz ojbv kvmot

9 gyyn gy uznyl nby niau julns

10 fxxm fx tymxk max mhzt itkmr

11 ewwl ew sxlwj lzw lgys hsjlq

12 dvvk dv rwkvi kyv kfxr grikp

13 cuuj cu qvjuh jxu jewq fqhjo

14 btti bt puitg iwt idvp epgin

15 assh as othsf hvs hcuo dofhm

16 zrrg zr nsgre gur gbtn cnegl

17 yqqf yq mrfqd ftq fasm bmdfk

18 xppe xp lqepc esp ezrl alcej

19 wood wo kpdob dro dyqk zkbdi

20 vnnc vn jocna cqn cxpj yjach

21 ummb um inbmz bpm bwoi xizbg

22 tlla tl hmaly aol avnh whyaf

23 skkz sk glzkx znk zumg vgxze

24 rjjy rj fkyjw ymj ytlf ufwyd

25 qiix qi ejxiv xli xske tevxc


examined in Chapter 7, makes use of a 168-bit key, giving a key space of $2^{168}$ or greater than $3.7 \times 10^{50}$ possible keys.

The third characteristic is also significant. If the language of the plaintext is

unknown, then plaintext output may not be recognizable. Furthermore, the input

may be abbreviated or compressed in some fashion, again making recognition dif-

ficult. For example, Figure 3.4 shows a portion of a text file compressed using an

algorithm called ZIP. If this file is then encrypted with a simple substitution cipher

(expanded to include more than just 26 alphabetic characters), then the plaintext

may not be recognized when it is uncovered in the brute-force cryptanalysis.

Monoalphabetic Ciphers

With only 25 possible keys, the Caesar cipher is far from secure. A dramatic increase

in the key space can be achieved by allowing an arbitrary substitution. Before pro-

ceeding, we define the term permutation. A permutation of a finite set of elements S

is an ordered sequence of all the elements of S, with each element appearing exactly

once. For example, if S = {a, b, c}, there are six permutations of S:

abc, acb, bac, bca, cab, cba

In general, there are n! permutations of a set of n elements, because the first

element can be chosen in one of n ways, the second in n – 1 ways, the third in n – 2

ways, and so on.

Recall the assignment for the Caesar cipher:

plain: a b c d e f g h i j k l m n o p q r s t u v w x y z

cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

If, instead, the “cipher” line can be any permutation of the 26 alphabetic characters,

then there are 26! or greater than $4 \times 10^{26}$ possible keys. This is 10 orders of magnitude greater than the key space for DES and would seem to eliminate brute-force

techniques for cryptanalysis. Such an approach is referred to as a monoalphabetic

substitution cipher, because a single cipher alphabet (mapping from plain alphabet

to cipher alphabet) is used per message.
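The key-space figures quoted in this section can be checked with a short calculation. This is a sketch only; it assumes the standard 56-bit key length for DES, which the text refers to without stating the number.

```python
import math

caesar_keys = 25                           # shifts 1 through 25
des_keys = 2 ** 56                         # assumption: DES uses a 56-bit key
triple_des_keys = 2 ** 168                 # 168-bit key
monoalphabetic_keys = math.factorial(26)   # permutations of 26 letters

# How many orders of magnitude larger is 26! than the DES key space?
orders_of_magnitude = math.log10(monoalphabetic_keys / des_keys)

print(f"monoalphabetic: {monoalphabetic_keys:.2e} keys")
print(f"triple DES:     {triple_des_keys:.2e} keys")
print(f"26! exceeds the DES key space by about {orders_of_magnitude:.1f} orders of magnitude")
print(f"average brute-force tries for Caesar: {caesar_keys / 2}")
```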

There is, however, another line of attack. If the cryptanalyst knows the nature

of the plaintext (e.g., noncompressed English text), then the analyst can exploit the

regularities of the language. To see how such a cryptanalysis might proceed, we give

a partial example here that is adapted from one in [SINK09]. The ciphertext to be

solved is

Figure 3.4 Sample of Compressed Text


UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ

VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX

EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ

As a first step, the relative frequency of the letters can be determined and

compared to a standard frequency distribution for English, such as is shown in

Figure 3.5 (based on [LEWA00]). If the message were long enough, this technique

alone might be sufficient, but because this is a relatively short message, we cannot

expect an exact match. In any case, the relative frequencies of the letters in the

ciphertext (in percentages) are as follows:

P 13.33 H 5.83 F 3.33 B 1.67 C 0.00

Z 11.67 D 5.00 W 3.33 G 1.67 K 0.00

S 8.33 E 5.00 Q 2.50 Y 1.67 L 0.00

U 8.33 V 4.17 T 2.50 I 0.83 N 0.00

O 7.50 X 4.17 A 1.67 J 0.83 R 0.00

M 6.67
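The percentages in this table can be reproduced by counting letters in the ciphertext. The following sketch uses only the Python standard library; the function name is illustrative.

```python
from collections import Counter

CIPHERTEXT = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ"
    "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX"
    "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)


def letter_percentages(text: str) -> dict:
    """Return each letter's relative frequency in percent, most frequent first."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    counts = Counter(letters)
    total = len(letters)
    return {ch: round(100 * n / total, 2) for ch, n in counts.most_common()}


percentages = letter_percentages(CIPHERTEXT)
for letter, pct in percentages.items():
    print(f"{letter} {pct:5.2f}")
```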

Comparing this breakdown with Figure 3.5, it seems likely that cipher letters

P and Z are the equivalents of plain letters e and t, but it is not certain which is which.

The letters S, U, O, M, and H are all of relatively high frequency and probably

Figure 3.5 Relative Frequency of Letters in English Text

[Bar chart: relative frequency (%) of each letter, with the following values.]

A 8.167   B 1.492   C 2.782   D 4.253   E 12.702   F 2.228   G 2.015

H 6.094   I 6.996   J 0.153   K 0.772   L 4.025   M 2.406   N 6.749

O 7.507   P 1.929   Q 0.095   R 5.987   S 6.327   T 9.056   U 2.758

V 0.978   W 2.360   X 0.150   Y 1.974   Z 0.074


correspond to plain letters from the set {a, h, i, n, o, r, s}. The letters with the lowest

frequencies (namely, A, B, G, Y, I, J) are likely included in the set {b, j, k, q, v, x, z}.

There are a number of ways to proceed at this point. We could make some

tentative assignments and start to fill in the plaintext to see if it looks like a rea-

sonable “skeleton” of a message. A more systematic approach is to look for other

regularities. For example, certain words may be known to be in the text. Or we

could look for repeating sequences of cipher letters and try to deduce their plaintext

equivalents.

A powerful tool is to look at the frequency of two-letter combinations, known

as digrams. A table similar to Figure 3.5 could be drawn up showing the relative fre-

quency of digrams. The most common such digram is th. In our ciphertext, the most

common digram is ZW, which appears three times. So we make the correspondence

of Z with t and W with h. Then, by our earlier hypothesis, we can equate P with e.

Now notice that the sequence ZWP appears in the ciphertext, and we can translate

that sequence as “the.” This is the most frequent trigram (three-letter combination)

in English, which seems to indicate that we are on the right track.
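Digram and trigram counts of the kind used above are easy to obtain mechanically. The sketch below counts overlapping n-letter sequences in the example ciphertext; the function name `ngram_counts` is an assumption of this illustration.

```python
from collections import Counter

CIPHERTEXT = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ"
    "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX"
    "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)


def ngram_counts(text: str, n: int) -> Counter:
    """Count every overlapping run of n consecutive letters in text."""
    letters = "".join(ch for ch in text.upper() if "A" <= ch <= "Z")
    return Counter(letters[i:i + n] for i in range(len(letters) - n + 1))


digrams = ngram_counts(CIPHERTEXT, 2)
trigrams = ngram_counts(CIPHERTEXT, 3)
print("ZW occurs", digrams["ZW"], "times")
print("ZWP occurs", trigrams["ZWP"], "time(s)")
```

In a message this short, counts are small and ties are possible, so such tallies are best treated as hints to be confirmed against other evidence, as the text does with the trigram ZWP.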

Next, notice the sequence ZWSZ in the first line. We do not know that these

four letters form a complete word, but if they do, it is of the form th_t. If so, S

equates with a.

So far, then, we have (letters not yet identified are shown as underscores)

UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ

_t_a________e__e_te__a_that_e_e_a_______a___t

VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX

___e_t___ta_t_ha_e_ee__a_e__th____t__a_

EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ

_e__e_e_tat__e___the__et____________
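Filling in a partial solution like this one is a mechanical step. The sketch below applies the four assignments made so far (Z to t, W to h, P to e, S to a) and marks every other position with a placeholder; the underscore placeholder and the function name are choices made for this illustration.

```python
CIPHERTEXT_LINES = [
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ",
    "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX",
    "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ",
]

# Cipher letter -> plain letter, for the letters identified so far.
KNOWN = {"Z": "t", "W": "h", "P": "e", "S": "a"}


def apply_partial_key(ciphertext: str, known: dict, unknown: str = "_") -> str:
    """Replace identified cipher letters; mark the rest with a placeholder."""
    return "".join(known.get(ch, unknown) for ch in ciphertext)


for line in CIPHERTEXT_LINES:
    print(line)
    print(apply_partial_key(line, KNOWN))
```

Adding further tentative assignments to `KNOWN` and rerunning is one way to carry out the trial and error described next.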

Only four letters have been identified, but already we have quite a bit of the

message. Continued analysis of frequencies plus trial and error should easily yield a

solution from this point. The complete plaintext, with spaces added between words,

follows:

it was disclosed yesterday that several informal but

direct contacts have been made with political

representatives of the viet cong in moscow

Monoalphabetic ciphers are easy to break because they reflect the frequency

data of the original alphabet. A countermeasure is to provide multiple substi-

tutes, known as homophones, for a single letter. For example, the letter e could

be assigned a number of different cipher symbols, such as 16, 74, 35, and 21, with

each homophone assigned to a letter in rotation or randomly. If the number of

symbols assigned to each letter is proportional to the relative frequency of that let-

ter, then single-letter frequency information is completely obliterated. The great

mathematician Carl Friedrich Gauss believed that he had devised an unbreak-

able cipher using homophones. However, even with homophones, each element

of plaintext affects only one element of ciphertext, and multiple-letter patterns


(e.g., digram frequencies) still survive in the ciphertext, making cryptanalysis rela-

tively straightforward.

Two principal methods are used in substitution ciphers to lessen the extent to

which the structure of the plaintext survives in the ciphertext: One approach is to

encrypt multiple letters of plaintext, and the other is to use multiple cipher alpha-

bets. We briefly examine each.

Playfair Cipher

The best-known multiple-letter encryption cipher is the Playfair, which treats digrams in the plaintext as single units and translates these units into ciphertext digrams.³

The Playfair algorithm is based on the use of a 5 × 5 matrix of letters constructed using a keyword. Here is an example, solved by Lord Peter Wimsey in Dorothy Sayers’s Have His Carcase:⁴

M O N A R

C H Y B D

E F G I/J K

L P Q S T

U V W X Z

In this case, the keyword is monarchy. The matrix is constructed by filling

in the letters of the keyword (minus duplicates) from left to right and from top to

bottom, and then filling in the remainder of the matrix with the remaining letters in

alphabetic order. The letters I and J count as one letter. Plaintext is encrypted two

letters at a time, according to the following rules:

1. Repeating plaintext letters that are in the same pair are separated with a filler

letter, such as x, so that balloon would be treated as ba lx lo on.

2. Two plaintext letters that fall in the same row of the matrix are each replaced

by the letter to the right, with the first element of the row circularly following

the last. For example, ar is encrypted as RM.

3. Two plaintext letters that fall in the same column are each replaced by the let-

ter beneath, with the top element of the column circularly following the last.

For example, mu is encrypted as CM.

4. Otherwise, each plaintext letter in a pair is replaced by the letter that lies in

its own row and the column occupied by the other plaintext letter. Thus, hs

becomes BP and ea becomes IM (or JM, as the encipherer wishes).
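The matrix construction and the four rules above can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: j is folded into i, the filler letter is x, and a trailing unpaired letter is padded with the filler, a case the rules above do not address.

```python
def build_matrix(keyword: str) -> list:
    """Fill a 5 x 5 matrix with the keyword letters, then the rest of the alphabet."""
    seen = []
    for ch in keyword.lower() + "abcdefghijklmnopqrstuvwxyz":
        ch = "i" if ch == "j" else ch          # I and J count as one letter
        if "a" <= ch <= "z" and ch not in seen:
            seen.append(ch)
    return [seen[r * 5:(r + 1) * 5] for r in range(5)]


def make_pairs(plaintext: str, filler: str = "x") -> list:
    """Rule 1: split into digrams, separating repeated letters with a filler."""
    letters = ["i" if c == "j" else c for c in plaintext.lower() if "a" <= c <= "z"]
    pairs, i = [], 0
    while i < len(letters):
        if i + 1 < len(letters) and letters[i + 1] != letters[i]:
            pairs.append(letters[i] + letters[i + 1])
            i += 2
        else:
            pairs.append(letters[i] + filler)  # assumption: also pads a final odd letter
            i += 1
    return pairs


def encrypt_pair(matrix: list, pair: str) -> str:
    """Rules 2 to 4: same row, same column, or rectangle."""
    pos = {matrix[r][c]: (r, c) for r in range(5) for c in range(5)}
    (r1, c1), (r2, c2) = pos[pair[0]], pos[pair[1]]
    if r1 == r2:                               # same row: take letter to the right
        out = matrix[r1][(c1 + 1) % 5] + matrix[r2][(c2 + 1) % 5]
    elif c1 == c2:                             # same column: take letter beneath
        out = matrix[(r1 + 1) % 5][c1] + matrix[(r2 + 1) % 5][c2]
    else:                                      # own row, other letter's column
        out = matrix[r1][c2] + matrix[r2][c1]
    return out.upper()


def playfair_encrypt(plaintext: str, keyword: str) -> str:
    matrix = build_matrix(keyword)
    return "".join(encrypt_pair(matrix, p) for p in make_pairs(plaintext))


matrix = build_matrix("monarchy")
for row in matrix:
    print(" ".join(row).upper())
print(make_pairs("balloon"))
print(encrypt_pair(matrix, "ar"), encrypt_pair(matrix, "mu"), encrypt_pair(matrix, "hs"))
```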

The Playfair cipher is a great advance over simple monoalphabetic ciphers.

For one thing, whereas there are only 26 letters, there are 26 × 26 = 676 digrams,

³ This cipher was actually invented by British scientist Sir Charles Wheatstone in 1854, but it bears the name of his friend Baron Playfair of St. Andrews, who championed the cipher at the British foreign office.

⁴ The book provides an absorbing account of a probable-word attack.


so that identification of individual digrams is more difficult. Furthermore, the rela-

tive frequencies of individual letters exhibit a much greater range than that of

digrams, making frequency analysis much more difficult. For these reasons, the

Playfair cipher was for a long time considered unbreakable. It was used as the stan-

dard field system by the British Army in World War I and still enjoyed considerable

use by the U.S. Army and other Allied forces during World War II.

Despite this level of confidence in its security, the Playfair cipher is relatively

easy to break, because it still leaves much of the structure of the plaintext language

intact. A few hundred letters of ciphertext are generally sufficient.

One way of revealing the effectiveness of the Playfair and other ciphers is

shown in Figure 3.6. The line labeled plaintext plots a typical frequency distribution

of the 26 alphabetic characters (no distinction between upper and lower case) in

ordinary text. This is also the frequency distribution of any monoalphabetic substi-

tution cipher, because the frequency values for individual letters are the same, just

with different letters substituted for the original letters. The plot is developed in the

following way: The number of occurrences of each letter in the text is counted and

divided by the number of occurrences of the most frequently used letter. Using the

results of Figure 3.5, we see that e is the most frequently used letter. As a result, e

has a relative frequency of 1, t of 9.056/12.702 ≈ 0.71, and so on. The points on the

horizontal axis correspond to the letters in order of decreasing frequency.
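The normalization used for the plaintext curve can be sketched as follows, using the percentages from Figure 3.5. The function name is illustrative; the code divides every frequency by that of the most frequent letter and ranks the letters in decreasing order.

```python
# Relative frequencies (%) of letters in English text, from Figure 3.5.
ENGLISH_FREQ = {
    "A": 8.167, "B": 1.492, "C": 2.782, "D": 4.253, "E": 12.702, "F": 2.228,
    "G": 2.015, "H": 6.094, "I": 6.996, "J": 0.153, "K": 0.772, "L": 4.025,
    "M": 2.406, "N": 6.749, "O": 7.507, "P": 1.929, "Q": 0.095, "R": 5.987,
    "S": 6.327, "T": 9.056, "U": 2.758, "V": 0.978, "W": 2.360, "X": 0.150,
    "Y": 1.974, "Z": 0.074,
}


def normalized_ranked(freq: dict) -> list:
    """Rank letters by decreasing frequency, scaled so the top letter is 1."""
    top = max(freq.values())
    ranked = sorted(freq.items(), key=lambda item: item[1], reverse=True)
    return [(letter, value / top) for letter, value in ranked]


for rank, (letter, value) in enumerate(normalized_ranked(ENGLISH_FREQ), start=1):
    print(f"{rank:2d} {letter} {value:.3f}")
```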

Figure 3.6 also shows the frequency distribution that results when the text is

encrypted using the Playfair cipher. To normalize the plot, the number of occur-

rences of each letter in the ciphertext was again divided by the number of occur-

rences of e in the plaintext. The resulting plot therefore shows the extent to which

the frequency distribution of letters, which makes it trivial to solve substitution

Figure 3.6 Relative Frequency of Occurrence of Letters

[Plot: normalized relative frequency (vertical axis, 0 to 1.0) against frequency-ranked letters 1 to 26 in decreasing frequency (horizontal axis), with curves for Plaintext, Playfair, Vigenère, and Random polyalphabetic.]


ciphers, is masked by encryption. If the frequency distribution information were

totally concealed in the encryption process, the ciphertext plot of frequencies would

be flat, and cryptanalysis using ciphertext only would be effectively impossible. As

the figure shows, the Playfair cipher has a flatter distribution than does plaintext,

but nevertheless, it reveals plenty of structure for a cryptanalyst to work with. The

plot also shows the Vigenère cipher, discussed subsequently. The Hill and Vigenère

curves on the plot are based on results reported in [SIMM93].

Hill Cipher⁵

Another interesting multiletter cipher is the Hill cipher, developed by the math-

ematician Lester Hill in 1929.

CONCEPTS FROM LINEAR ALGEBRA Before describing the Hill cipher, let us briefly

review some terminology from linear algebra. In this discussion, we are concerned

with matrix arithmetic modulo 26. For the reader who needs a refresher on matrix

multiplication and inversion, see Appendix E.

We define the inverse $M^{-1}$ of a square matrix M by the equation $M(M^{-1}) = M^{-1}M = I$, where I is the identity matrix. I is a square matrix that is all zeros except for ones along the main diagonal from upper left to lower right. The inverse of a matrix does not always exist, but when it does, it satisfies the preceding equation. For example,

$$A = \begin{pmatrix} 5 & 8 \\ 17 & 3 \end{pmatrix} \qquad A^{-1} \bmod 26 = \begin{pmatrix} 9 & 2 \\ 1 & 15 \end{pmatrix}$$

$$AA^{-1} = \begin{pmatrix} (5 \times 9) + (8 \times 1) & (5 \times 2) + (8 \times 15) \\ (17 \times 9) + (3 \times 1) & (17 \times 2) + (3 \times 15) \end{pmatrix} = \begin{pmatrix} 53 & 130 \\ 156 & 79 \end{pmatrix} \bmod 26 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

To explain how the inverse of a matrix is computed, we begin with the concept of determinant. For any square matrix ($m \times m$), the determinant equals the sum of all the products that can be formed by taking exactly one element from each row and exactly one element from each column, with certain of the product terms preceded by a minus sign. For a $2 \times 2$ matrix,

$$\begin{pmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{pmatrix}$$

the determinant is $k_{11}k_{22} - k_{12}k_{21}$. For a $3 \times 3$ matrix, the value of the determinant is $k_{11}k_{22}k_{33} + k_{21}k_{32}k_{13} + k_{31}k_{12}k_{23} - k_{31}k_{22}k_{13} - k_{21}k_{12}k_{33} - k_{11}k_{32}k_{23}$. If a square

⁵ This cipher is somewhat more difficult to understand than the others in this chapter, but it illustrates an important point about cryptanalysis that will be useful later on. This subsection can be skipped on a first reading.


matrix A has a nonzero determinant, then the inverse of the matrix is computed as $[A^{-1}]_{ij} = (\det A)^{-1}(-1)^{i+j}(D_{ji})$, where $(D_{ji})$ is the subdeterminant formed by deleting the jth row and the ith column of A, det(A) is the determinant of A, and $(\det A)^{-1}$ is the multiplicative inverse of (det A) mod 26.

Continuing our example,

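$$\det \begin{pmatrix} 5 & 8 \\ 17 & 3 \end{pmatrix} = (5 \times 3) - (8 \times 17) = -121 \bmod 26 = 9$$

We can show that $9^{-1} \bmod 26 = 3$, because $9 \times 3 = 27 \bmod 26 = 1$ (see Chapter 2 or Appendix E). Therefore, we compute the inverse of A as

$$A = \begin{pmatrix} 5 & 8 \\ 17 & 3 \end{pmatrix}$$

$$A^{-1} \bmod 26 = 3\begin{pmatrix} 3 & -8 \\ -17 & 5 \end{pmatrix} = 3\begin{pmatrix} 3 & 18 \\ 9 & 5 \end{pmatrix} = \begin{pmatrix} 9 & 54 \\ 27 & 15 \end{pmatrix} = \begin{pmatrix} 9 & 2 \\ 1 & 15 \end{pmatrix}$$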
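The 2 × 2 case of this procedure is short enough to sketch in code. The function name is hypothetical, and the sketch relies on Python's built-in `pow` with a negative exponent (available from Python 3.8) to find the multiplicative inverse of the determinant; `pow` raises `ValueError` when the determinant has no inverse modulo 26.

```python
def inverse_2x2_mod(matrix, modulus=26):
    """Invert a 2 x 2 matrix modulo `modulus` using the determinant and adjugate."""
    (a, b), (c, d) = matrix
    det = (a * d - b * c) % modulus
    det_inv = pow(det, -1, modulus)   # multiplicative inverse of det mod modulus
    return [
        [(det_inv * d) % modulus, (det_inv * -b) % modulus],
        [(det_inv * -c) % modulus, (det_inv * a) % modulus],
    ]


A = [[5, 8], [17, 3]]
A_inv = inverse_2x2_mod(A)
print(A_inv)

# Check that A times its inverse is the identity mod 26.
product = [
    [sum(A[i][k] * A_inv[k][j] for k in range(2)) % 26 for j in range(2)]
    for i in range(2)
]
print(product)
```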

THE HILL ALGORITHM This encryption algorithm takes m successive plaintext letters and substitutes for them m ciphertext letters. The substitution is determined by m linear equations in which each character is assigned a numerical value (a = 0, b = 1, ..., z = 25). For m = 3, the system can be described as

$$c_1 = (k_{11}p_1 + k_{21}p_2 + k_{31}p_3) \bmod 26$$

$$c_2 = (k_{12}p_1 + k_{22}p_2 + k_{32}p_3) \bmod 26$$

$$c_3 = (k_{13}p_1 + k_{23}p_2 + k_{33}p_3) \bmod 26$$

This can be expressed in terms of row vectors and matrices:⁶

$$(c_1 \;\; c_2 \;\; c_3) = (p_1 \;\; p_2 \;\; p_3)\begin{pmatrix} k_{11} & k_{12} & k_{13} \\ k_{21} & k_{22} & k_{23} \\ k_{31} & k_{32} & k_{33} \end{pmatrix} \bmod 26$$

or

C = PK mod 26

where C and P are row vectors of length 3 representing the ciphertext and plaintext, respectively, and K is a 3 × 3 matrix representing the encryption key. Operations are performed mod 26.

⁶ Some cryptography books express the plaintext and ciphertext as column vectors, so that the column vector is placed after the matrix rather than the row vector placed before the matrix. Sage uses row vectors, so we adopt that convention.


For example, consider the plaintext “paymoremoney” and use the encryption key

$$K = \begin{pmatrix} 17 & 17 & 5 \\ 21 & 18 & 21 \\ 2 & 2 & 19 \end{pmatrix}$$

The first three letters of the plaintext are represented by the vector (15 0 24).

Then (15 0 24)K = (303 303 531) mod 26 = (17 17 11) = RRL. Continuing in this

fashion, the ciphertext for the entire plaintext is RRLMWBKASPDH.
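The encryption C = PK mod 26 can be sketched directly from the row-vector form. The function names are illustrative, and the sketch assumes the plaintext length is a multiple of the matrix size, since padding is not discussed here.

```python
def to_numbers(text: str) -> list:
    """Map letters to 0..25 (a = 0, ..., z = 25), ignoring other characters."""
    return [ord(ch) - ord("a") for ch in text.lower() if "a" <= ch <= "z"]


def hill_encrypt(plaintext: str, key: list) -> str:
    """Encrypt m letters at a time as a row vector times the key matrix, mod 26."""
    m = len(key)
    numbers = to_numbers(plaintext)
    if len(numbers) % m != 0:
        raise ValueError("plaintext length must be a multiple of the key size")
    out = []
    for start in range(0, len(numbers), m):
        block = numbers[start:start + m]
        for col in range(m):
            # c_col = (p_1 * k_1col + p_2 * k_2col + ... ) mod 26
            total = sum(block[row] * key[row][col] for row in range(m))
            out.append(total % 26)
    return "".join(chr(n + ord("A")) for n in out)


K = [[17, 17, 5], [21, 18, 21], [2, 2, 19]]
print(hill_encrypt("pay", K))
print(hill_encrypt("paymoremoney", K))
```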

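Decryption requires using the inverse of the matrix K. We can compute det K = 23, and therefore, $(\det K)^{-1} \bmod 26 = 17$. We can then compute the inverse as⁷

$$K^{-1} = \begin{pmatrix} 4 & 9 & 15 \\ 15 & 17 & 6 \\ 24 & 0 & 17 \end{pmatrix}$$

This is demonstrated as

$$\begin{pmatrix} 17 & 17 & 5 \\ 21 & 18 & 21 \\ 2 & 2 & 19 \end{pmatrix}\begin{pmatrix} 4 & 9 & 15 \\ 15 & 17 & 6 \\ 24 & 0 & 17 \end{pmatrix} = \begin{pmatrix} 443 & 442 & 442 \\ 858 & 495 & 780 \\ 494 & 52 & 365 \end{pmatrix} \bmod 26 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$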

It is easily seen that if the matrix $K^{-1}$ is applied to the ciphertext, then the plaintext is recovered.

In general terms, the Hill system can be expressed as

$$C = E(K, P) = PK \bmod 26$$

$$P = D(K, C) = CK^{-1} \bmod 26 = PKK^{-1} = P$$
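Decryption uses the same row-vector multiplication with the inverse matrix in place of the key. The sketch below takes the inverse given above as an input rather than computing it; the function name is hypothetical.

```python
def hill_decrypt(ciphertext: str, key_inverse: list) -> str:
    """Recover plaintext as P = C K^{-1} mod 26, m letters at a time."""
    m = len(key_inverse)
    numbers = [ord(ch) - ord("A") for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    if len(numbers) % m != 0:
        raise ValueError("ciphertext length must be a multiple of the key size")
    out = []
    for start in range(0, len(numbers), m):
        block = numbers[start:start + m]
        for col in range(m):
            total = sum(block[row] * key_inverse[row][col] for row in range(m))
            out.append(total % 26)
    return "".join(chr(n + ord("a")) for n in out)


# Inverse of the example key matrix, as given in the text.
K_INV = [[4, 9, 15], [15, 17, 6], [24, 0, 17]]
print(hill_decrypt("RRLMWBKASPDH", K_INV))
```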

As with Playfair, the strength of the Hill cipher is that it completely hides

single-letter frequencies. Indeed, with Hill, the use of a larger matrix hides more

frequency information. Thus, a 3 × 3 Hill cipher hides not only single-letter but

also two-letter frequency information.

Although the Hill cipher is strong against a ciphertext-only attack, it is easily

broken with a known plaintext attack. For an m × m Hill cipher, suppose we have m plaintext–ciphertext pairs, each of length m. We label the pairs $P_j = (p_{1j} \; p_{2j} \; \ldots \; p_{mj})$ and $C_j = (c_{1j} \; c_{2j} \; \ldots \; c_{mj})$ such that $C_j = P_jK$ for $1 \le j \le m$ and for some unknown key matrix K. Now define two m × m matrices $X = (p_{ij})$ and $Y = (c_{ij})$, taking the vectors $P_j$ and $C_j$ as the jth rows of X and Y, respectively. Then we can form the matrix equation Y = XK. If X has an inverse, then we can determine $K = X^{-1}Y$. If X is not invertible, then a new version of X can be formed with additional plaintext–ciphertext pairs until an invertible X is obtained.

Consider this example. Suppose that the plaintext “hillcipher” is encrypted

using a 2 × 2 Hill cipher to yield the ciphertext HCRZSSXNSP. Thus, we know

that (7 8)K mod 26 = (7 2); (11 11)K mod 26 = (17 25); and so on. Using

the first two plaintext-ciphertext pairs, we have

⁷ The calculations for this example are provided in detail in Appendix E.


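$$\begin{pmatrix} 7 & 2 \\ 17 & 25 \end{pmatrix} = \begin{pmatrix} 7 & 8 \\ 11 & 11 \end{pmatrix} K \bmod 26$$

The inverse of X can be computed:

$$\begin{pmatrix} 7 & 8 \\ 11 & 11 \end{pmatrix}^{-1} = \begin{pmatrix} 25 & 22 \\ 1 & 23 \end{pmatrix}$$

so

$$K = \begin{pmatrix} 25 & 22 \\ 1 & 23 \end{pmatrix}\begin{pmatrix} 7 & 2 \\ 17 & 25 \end{pmatrix} = \begin{pmatrix} 549 & 600 \\ 398 & 577 \end{pmatrix} \bmod 26 = \begin{pmatrix} 3 & 2 \\ 8 & 5 \end{pmatrix}$$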

This result is verified by testing the remaining plaintext–ciphertext pairs.
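The known-plaintext attack on this 2 × 2 example can be reproduced in a few lines. This is a sketch with hypothetical helper names; it inverts X modulo 26 with the determinant and adjugate, using Python's `pow` with a negative exponent (Python 3.8 or later), and then checks the recovered key against the remaining pairs.

```python
def inverse_2x2_mod(matrix, modulus=26):
    """Invert a 2 x 2 matrix modulo `modulus`; raises ValueError if not invertible."""
    (a, b), (c, d) = matrix
    det_inv = pow((a * d - b * c) % modulus, -1, modulus)
    return [
        [(det_inv * d) % modulus, (det_inv * -b) % modulus],
        [(det_inv * -c) % modulus, (det_inv * a) % modulus],
    ]


def mat_mul_mod(left, right, modulus=26):
    """Multiply two matrices and reduce every entry modulo `modulus`."""
    return [
        [sum(left[i][k] * right[k][j] for k in range(len(right))) % modulus
         for j in range(len(right[0]))]
        for i in range(len(left))
    ]


X = [[7, 8], [11, 11]]     # plaintext pairs "hi" and "ll" as rows
Y = [[7, 2], [17, 25]]     # ciphertext pairs "HC" and "RZ" as rows

recovered_key = mat_mul_mod(inverse_2x2_mod(X), Y)
print(recovered_key)

# Verify against the remaining pairs: "ci" -> "SS", "ph" -> "XN", "er" -> "SP".
remaining_plain = [[2, 8], [15, 7], [4, 17]]
print(mat_mul_mod(remaining_plain, recovered_key))
```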

Polyalphabetic Ciphers

Another way to improve on the simple monoalphabetic technique is to use differ-

ent monoalphabetic substitutions as one proceeds through the plaintext message.

The general name for this approach is polyalphabetic substitution cipher. All these

techniques have the following features in common:

1. A set of related monoalphabetic substitution rules is used.

2. A key determines which particular rule is chosen for a given transformation.

VIGENÈRE CIPHER The best known, and one of the simplest, polyalphabetic ciphers

is the Vigenère cipher. In this scheme, the set of related monoalphabetic substitu-

tion rules consists of the 26 Caesar ciphers with shifts of 0 through 25. Each cipher is

denoted by a key letter, which is the ciphertext letter that substitutes for the plaintext letter a. Thus, a Caesar cipher with a shift of 3 is denoted by the key letter d (key value 3) (footnote 8).

We can express the Vigenère cipher in the following manner. Assume a sequence of plaintext letters $P = p_0, p_1, p_2, \ldots, p_{n-1}$ and a key consisting of the sequence of letters $K = k_0, k_1, k_2, \ldots, k_{m-1}$, where typically $m < n$. The sequence of ciphertext letters $C = C_0, C_1, C_2, \ldots, C_{n-1}$ is calculated as follows:

$$\begin{aligned}
C &= C_0, C_1, C_2, \ldots, C_{n-1} = E(K, P) = E[(k_0, k_1, k_2, \ldots, k_{m-1}), (p_0, p_1, p_2, \ldots, p_{n-1})] \\
  &= (p_0 + k_0) \bmod 26,\ (p_1 + k_1) \bmod 26,\ \ldots,\ (p_{m-1} + k_{m-1}) \bmod 26, \\
  &\quad\ (p_m + k_0) \bmod 26,\ (p_{m+1} + k_1) \bmod 26,\ \ldots,\ (p_{2m-1} + k_{m-1}) \bmod 26,\ \ldots
\end{aligned}$$

Thus, the first letter of the key is added to the first letter of the plaintext, mod 26,

the second letters are added, and so on through the first m letters of the plaintext.

For the next m letters of the plaintext, the key letters are repeated. This process continues until all of the plaintext sequence is encrypted.

Footnote 8: To aid in understanding this scheme and also to aid in its use, a matrix known as the Vigenère tableau is often used. This tableau is discussed in a document at box.com/Crypto7e.

A general equation of the encryption process is

$C_i = (p_i + k_{i \bmod m}) \bmod 26$ (3.3)

Compare this with Equation (3.1) for the Caesar cipher. In essence, each plain-

text character is encrypted with a different Caesar cipher, depending on the corre-

sponding key character. Similarly, decryption is a generalization of Equation (3.2):

$p_i = (C_i - k_{i \bmod m}) \bmod 26$ (3.4)

To encrypt a message, a key is needed that is as long as the message. Usually,

the key is a repeating keyword. For example, if the keyword is deceptive, the mes-

sage “we are discovered save yourself” is encrypted as

key: deceptivedeceptivedeceptive

plaintext: wearediscoveredsaveyourself

ciphertext: ZICVTWQNGRZGVTWAVZHCQYGLMGJ

Expressed numerically, we have the following result.

key 3 4 2 4 15 19 8 21 4 3 4 2 4 15

plaintext 22 4 0 17 4 3 8 18 2 14 21 4 17 4

ciphertext 25 8 2 21 19 22 16 13 6 17 25 6 21 19

key 19 8 21 4 3 4 2 4 15 19 8 21 4

plaintext 3 18 0 21 4 24 14 20 17 18 4 11 5

ciphertext 22 0 21 25 7 2 16 24 6 11 12 6 9
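Equations (3.3) and (3.4) translate almost directly into code. The sketch below is a minimal illustration, assuming the plaintext and keyword contain only the letters a through z; the function names are chosen here for illustration and are not from the text.

```python
A = ord("a")

def vigenere_encrypt(plaintext, keyword):
    """C_i = (p_i + k_{i mod m}) mod 26, with a = 0, ..., z = 25."""
    out = []
    for i, ch in enumerate(plaintext.lower()):
        k = ord(keyword[i % len(keyword)].lower()) - A
        out.append(chr((ord(ch) - A + k) % 26 + A))
    return "".join(out).upper()

def vigenere_decrypt(ciphertext, keyword):
    """p_i = (C_i - k_{i mod m}) mod 26."""
    out = []
    for i, ch in enumerate(ciphertext.lower()):
        k = ord(keyword[i % len(keyword)].lower()) - A
        out.append(chr((ord(ch) - A - k) % 26 + A))
    return "".join(out)

ciphertext = vigenere_encrypt("wearediscoveredsaveyourself", "deceptive")
print(ciphertext)                                 # ZICVTWQNGRZGVTWAVZHCQYGLMGJ
print(vigenere_decrypt(ciphertext, "deceptive"))  # wearediscoveredsaveyourself
```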

The strength of this cipher is that there are multiple ciphertext letters for

each plaintext letter, one for each unique letter of the keyword. Thus, the letter fre-

quency information is obscured. However, not all knowledge of the plaintext struc-

ture is lost. For example, Figure 3.6 shows the frequency distribution for a Vigenère

cipher with a keyword of length 9. An improvement is achieved over the Playfair

cipher, but considerable frequency information remains.

It is instructive to sketch a method of breaking this cipher, because the method

reveals some of the mathematical principles that apply in cryptanalysis.

First, suppose that the opponent believes that the ciphertext was encrypted

using either monoalphabetic substitution or a Vigenère cipher. A simple test can

be made to make a determination. If a monoalphabetic substitution is used, then

the statistical properties of the ciphertext should be the same as those of the language of the plaintext. Thus, referring to Figure 3.5, there should be one cipher letter with a relative frequency of occurrence of about 12.7%, one with about 9.06%,

and so on. If only a single message is available for analysis, we would not expect

an exact match of this small sample with the statistical profile of the plaintext lan-

guage. Nevertheless, if the correspondence is close, we can assume a monoalpha-

betic substitution.


If, on the other hand, a Vigenère cipher is suspected, then progress depends on

determining the length of the keyword, as will be seen in a moment. For now, let us

concentrate on how the keyword length can be determined. The important insight

that leads to a solution is the following: If two identical sequences of plaintext let-

ters occur at a distance that is an integer multiple of the keyword length, they will

generate identical ciphertext sequences. In the foregoing example, two instances

of the sequence “red” are separated by nine character positions. Consequently, in

both cases, r is encrypted using key letter e, e is encrypted using key letter p, and d

is encrypted using key letter t. Thus, in both cases, the ciphertext sequence is VTW.

In the example above, these are the ciphertext letters (and the corresponding ciphertext numbers) in positions 4 through 6 and 13 through 15, which the original layout marks by underlining and shading.

An analyst looking at only the ciphertext would detect the repeated sequences

VTW at a displacement of 9 and make the assumption that the keyword is either

three or nine letters in length. The appearance of VTW twice could be by chance

and may not reflect identical plaintext letters encrypted with identical key letters.

However, if the message is long enough, there will be a number of such repeated

ciphertext sequences. By looking for common factors in the displacements of the vari-

ous sequences, the analyst should be able to make a good guess of the keyword length.
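The search for repeated ciphertext sequences and common factors in their displacements can be sketched as follows. This is an illustrative helper rather than a complete attack: the function names and the default sequence length of three letters are assumptions made here, and on a message this short a single repeat is weak evidence.

```python
from collections import defaultdict
from functools import reduce
from math import gcd

def repeated_sequences(ciphertext, length=3):
    """Map each substring of the given length that occurs more than once to its start positions."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return {seq: pos for seq, pos in positions.items() if len(pos) > 1}

def displacement_gcd(ciphertext, length=3):
    """Greatest common divisor of the gaps between successive repeats (None if no repeats)."""
    gaps = [b - a
            for pos in repeated_sequences(ciphertext, length).values()
            for a, b in zip(pos, pos[1:])]
    return reduce(gcd, gaps) if gaps else None

ct = "ZICVTWQNGRZGVTWAVZHCQYGLMGJ"
print(repeated_sequences(ct))  # {'VTW': [3, 12]}
print(displacement_gcd(ct))    # 9, so candidate keyword lengths are 3 and 9
```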

Solution of the cipher now depends on an important insight. If the keyword

length is m, then the cipher, in effect, consists of m monoalphabetic substitution

ciphers. For example, with the keyword DECEPTIVE, the letters in positions 1, 10,

19, and so on are all encrypted with the same monoalphabetic cipher. Thus, we can

use the known frequency characteristics of the plaintext language to attack each of

the monoalphabetic ciphers separately.
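To make the decomposition concrete, the following sketch splits the example ciphertext into m = 9 streams and, using the known plaintext purely for illustration, shows that every letter in a stream was shifted by the same amount. The helper names are hypothetical; a real attack would estimate each shift from letter frequencies and would need far more ciphertext than this example provides.

```python
def split_streams(text, m):
    """Stream i holds the letters at positions i, i + m, i + 2m, ... (counting from 0)."""
    return [text[i::m] for i in range(m)]

def shifts(plain_stream, cipher_stream):
    """Shift applied to each letter of a stream, as a number from 0 to 25."""
    return [(ord(c.lower()) - ord(p.lower())) % 26
            for p, c in zip(plain_stream, cipher_stream)]

plaintext = "wearediscoveredsaveyourself"
ciphertext = "ZICVTWQNGRZGVTWAVZHCQYGLMGJ"

p_streams = split_streams(plaintext, 9)
c_streams = split_streams(ciphertext, 9)

print(c_streams[0])                        # ZRH
print(shifts(p_streams[0], c_streams[0]))  # [3, 3, 3], key letter d
print(shifts(p_streams[4], c_streams[4]))  # [15, 15, 15], key letter p
```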

The periodic nature of the keyword can be eliminated by using a nonrepeating

keyword that is as long as the message itself. Vigenère proposed what is referred to

as an autokey system, in which a keyword is concatenated with the plaintext itself to

provide a running key. For our example,

key: deceptivewearediscoveredsav

plaintext: wearediscoveredsaveyourself

ciphertext: ZICVTWQNGKZEIIGASXSTSLVVWLA
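A minimal sketch of the autokey scheme follows, assuming lowercase letters only; the function names are illustrative. Note that the receiver must rebuild the running key one letter at a time, because each recovered plaintext letter becomes key material for a later position.

```python
A = ord("a")

def autokey_encrypt(plaintext, keyword):
    """Running key = keyword followed by the plaintext itself, truncated to message length."""
    key = (keyword + plaintext)[:len(plaintext)]
    return "".join(chr((ord(p) - A + ord(k) - A) % 26 + A)
                   for p, k in zip(plaintext, key)).upper()

def autokey_decrypt(ciphertext, keyword):
    """Recover the plaintext letter by letter, extending the key as we go."""
    key = list(keyword)
    plain = []
    for i, c in enumerate(ciphertext.lower()):
        p = chr((ord(c) - ord(key[i])) % 26 + A)
        plain.append(p)
        key.append(p)  # the recovered letter extends the running key
    return "".join(plain)

ct = autokey_encrypt("wearediscoveredsaveyourself", "deceptive")
print(ct)                                # ZICVTWQNGKZEIIGASXSTSLVVWLA
print(autokey_decrypt(ct, "deceptive"))  # wearediscoveredsaveyourself
```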

Even this scheme is vulnerable to cryptanalysis. Because the key and the

plaintext share the same frequency distribution of letters, a statistical technique can

be applied. For example, e enciphered by e, by Figure 3.5, can be expected to occur with a frequency of $(0.127)^2 \approx 0.016$, whereas t enciphered by t would occur only about half as often. These regularities can be exploited to achieve successful cryptanalysis (footnote 9).

VERNAM CIPHER The ultimate defense against such a cryptanalysis is to choose a

keyword that is as long as the plaintext and has no statistical relationship to it. Such

a system was introduced by an AT&T engineer named Gilbert Vernam in 1918.

Footnote 9: Although the techniques for breaking a Vigenère cipher are by no means complex, a 1917 issue of Scientific American characterized this system as “impossible of translation.” This is a point worth remembering when similar claims are made for modern algorithms.

His system works on binary data (bits) rather than letters. The system can be

expressed succinctly as follows (Figure 3.7):

$c_i = p_i \oplus k_i$

where

$p_i$ = ith binary digit of plaintext

$k_i$ = ith binary digit of key

$c_i$ = ith binary digit of ciphertext

$\oplus$ = exclusive-or (XOR) operation

Compare this with Equation (3.3) for the Vigenère cipher.

Thus, the ciphertext is generated by performing the bitwise XOR of the plain-

text and the key. Because of the properties of the XOR, decryption simply involves

the same bitwise operation:

$p_i = c_i \oplus k_i$

which compares with Equation (3.4).
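Because XOR is its own inverse, one function serves for both encryption and decryption. The sketch below works on bytes rather than individual bits, which is equivalent to applying the bit equations eight at a time. The function name and the use of Python's secrets module to produce key bytes are choices made for this illustration, and the key is assumed to be at least as long as the data.

```python
import secrets

def vernam(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the corresponding key byte (c_i = p_i XOR k_i)."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # illustrative key source
ciphertext = vernam(message, key)
recovered = vernam(ciphertext, key)      # the same operation decrypts
print(recovered == message)              # True
```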

The essence of this technique is the means of construction of the key. Vernam

proposed the use of a running loop of tape that eventually repeated the key, so that

in fact the system worked with a very long but repeating keyword. Although such

a scheme, with a long key, presents formidable cryptanalytic difficulties, it can be

broken with sufficient ciphertext, the use of known or probable plaintext sequences,

or both.

One-Time Pad

An Army Signal Corps officer, Joseph Mauborgne, proposed an improvement to the

Vernam cipher that yields the ultimate in security. Mauborgne suggested using a

random key that is as long as the message, so that the key need not be repeated. In

addition, the key is to be used to encrypt and decrypt a single message, and then is

discarded. Each new message requires a new key of the same length as the new mes-

sage. Such a scheme, known as a one-time pad, is unbreakable. It produces random

output that bears no statistical relationship to the plaintext. Because the ciphertext contains no information whatsoever about the plaintext, there is simply no way to break the code.

Figure 3.7 Vernam Cipher (at the sender, a key stream generator produces the cryptographic bit stream ki, which is combined with the plaintext pi to give the ciphertext ci; at the receiver, an identical key stream generator produces ki, which is combined with ci to recover pi)

An example should illustrate our point. Suppose that we are using a Vigenère

scheme with 27 characters in which the twenty-seventh character is the space

character, but with a one-time key that is as long as the message. Consider the

ciphertext

ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS

We now show two different decryptions using two different keys:

ciphertext: ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS

key: pxlmvmsydofuyrvzwc tnlebnecvgdupahfzzlmnyih

plaintext: mr mustard with the candlestick in the hall

ciphertext: ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS

key: pftgpmiydgaxgoufhklllmhsqdqogtewbqfgyovuhwt

plaintext: miss scarlet with the knife in the library

Suppose that a cryptanalyst had managed to find these two keys. Two plau-

sible plaintexts are produced. How is the cryptanalyst to decide which is the correct

decryption (i.e., which is the correct key)? If the actual key were produced in a truly

random fashion, then the cryptanalyst cannot say that one of these two keys is more

likely than the other. Thus, there is no way to decide which key is correct and there-

fore which plaintext is correct.

In fact, given any plaintext of equal length to the ciphertext, there is a key that

produces that plaintext. Therefore, if you did an exhaustive search of all possible

keys, you would end up with many legible plaintexts, with no way of knowing which

was the intended plaintext. Therefore, the code is unbreakable.
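The argument can be reproduced with a short sketch of the 27-character scheme used in the example (a = 0 through z = 25, space = 26, arithmetic mod 27). The function names are illustrative, and the third target plaintext is invented here simply to show that a key exists for any message of the right length.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # space is the twenty-seventh character

def otp_decrypt(ciphertext, key):
    """p_i = (C_i - k_i) mod 27."""
    return "".join(ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 27]
                   for c, k in zip(ciphertext.lower(), key))

def key_for(ciphertext, plaintext):
    """The unique key that decrypts ciphertext to the chosen plaintext: k_i = (C_i - p_i) mod 27."""
    return "".join(ALPHABET[(ALPHABET.index(c) - ALPHABET.index(p)) % 27]
                   for c, p in zip(ciphertext.lower(), plaintext))

CT = "ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS"
K1 = "pxlmvmsydofuyrvzwc tnlebnecvgdupahfzzlmnyih"
K2 = "pftgpmiydgaxgoufhklllmhsqdqogtewbqfgyovuhwt"

print(otp_decrypt(CT, K1))  # mr mustard with the candlestick in the hall
print(otp_decrypt(CT, K2))  # miss scarlet with the knife in the library (plus a trailing space)

# Any other message of the same length has a key as well (hypothetical third plaintext).
target = "colonel mustard with the rope in the study".ljust(len(CT))
print(otp_decrypt(CT, key_for(CT, target)) == target)  # True
```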

The security of the one-time pad is entirely due to the randomness of the key.

If the stream of characters that constitute the key is truly random, then the stream

of characters that constitute the ciphertext will be truly random. Thus, there are no

patterns or regularities that a cryptanalyst can use to attack the ciphertext.

In theory, we need look no further for a cipher. The one-time pad offers com-

plete security but, in practice, has two fundamental difficulties:

1. There is the practical problem of making large quantities of random keys. Any

heavily used system might require millions of random characters on a regular

basis. Supplying truly random characters in this volume is a significant task.

2. Even more daunting is the problem of key distribution and protection. For

every message to be sent, a key of equal length is needed by both sender and

receiver. Thus, a mammoth key distribution problem exists.

Because of these difficulties, the one-time pad is of limited utility and is useful

primarily for low-bandwidth channels requiring very high security.

The one-time pad is the only cryptosystem that exhibits what is referred to as

perfect secrecy. This concept is explored in Appendix F.


3.3 TRANSPOSITION TECHNIQUES

All the techniques examined so far involve the substitution of a ciphertext symbol

for a plaintext symbol. A very different kind of mapping is achieved by performing

some sort of permutation on the plaintext letters. This technique is referred to as a

transposition cipher.

The simplest such cipher is the rail fence technique, in which the plaintext is

written down as a sequence of diagonals and then read off as a sequence of rows.

For example, to encipher the message “meet me after the toga party” with a rail

fence of depth 2, we write the following:

m   e   m   a   t   r   h   t   g   p   r   y
  e   t   e   f   e   t   e   o   a   a   t

The encrypted message is

MEMATRHTGPRYETEFETEOAAT
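The rail fence procedure can be sketched for any depth of at least 2. The helper names below are illustrative, and the message is assumed to have its spaces removed beforehand, as in the example.

```python
def _rail_pattern(n, depth):
    """Rail index (0 .. depth-1) visited by each of n letters along the zigzag."""
    pattern, row, step = [], 0, 1
    for _ in range(n):
        pattern.append(row)
        if row == 0:
            step = 1
        elif row == depth - 1:
            step = -1
        row += step
    return pattern

def rail_fence_encrypt(text, depth=2):
    if depth < 2:
        return text
    pattern = _rail_pattern(len(text), depth)
    # Read off the rails one after another, top to bottom.
    return "".join(ch for r in range(depth)
                   for ch, p in zip(text, pattern) if p == r)

def rail_fence_decrypt(cipher, depth=2):
    if depth < 2:
        return cipher
    pattern = _rail_pattern(len(cipher), depth)
    # Stable sort lists the plaintext positions in the order the rails were read.
    order = sorted(range(len(cipher)), key=lambda i: pattern[i])
    plain = [""] * len(cipher)
    for ch, i in zip(cipher, order):
        plain[i] = ch
    return "".join(plain)

ct = rail_fence_encrypt("meetmeafterthetogaparty").upper()
print(ct)                               # MEMATRHTGPRYETEFETEOAAT
print(rail_fence_decrypt(ct).lower())   # meetmeafterthetogaparty
```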

This sort of thing would be trivial to cryptanalyze. A more complex scheme is

to write the message in a rectangle, row by row, and read the message off, column

by column, but permute the order of the columns. The order of the columns then

becomes the key to the algorithm. For example,

Key:        4 3 1 2 5 6 7
Plaintext:  a t t a c k p
            o s t p o n e
            d u n t i l t
            w o a m x y z
Ciphertext: TTNAAPTMTSUOAODWCOIXKNLYPETZ

Thus, in this example, the key is 4312567. To encrypt, start with the column

that is labeled 1, in this case column 3. Write down all the letters in that column.

Proceed to column 4, which is labeled 2, then column 2, then column 1, then

columns 5, 6, and 7.
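The column-reading rule just described can be written as a short function. This sketch assumes the key is given as a list of column labels, such as [4, 3, 1, 2, 5, 6, 7], and that the message has already been padded to fill the rectangle (the example uses the filler letters x, y, z); the function name is illustrative.

```python
def columnar_encrypt(plaintext, key):
    """Write the text row by row under the key, then read columns in label order 1, 2, 3, ..."""
    ncols = len(key)
    rows = [plaintext[i:i + ncols] for i in range(0, len(plaintext), ncols)]
    out = []
    for label in range(1, ncols + 1):
        col = key.index(label)  # position of the column carrying this label
        out.append("".join(row[col] for row in rows if col < len(row)))
    return "".join(out)

key = [4, 3, 1, 2, 5, 6, 7]
print(columnar_encrypt("attackpostponeduntiltwoamxyz", key).upper())
# TTNAAPTMTSUOAODWCOIXKNLYPETZ
```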

A pure transposition cipher is easily recognized because it has the same letter

frequencies as the original plaintext. For the type of columnar transposition just

shown, cryptanalysis is fairly straightforward and involves laying out the cipher-

text in a matrix and playing around with column positions. Digram and trigram fre-

quency tables can be useful.

The transposition cipher can be made significantly more secure by perform-

ing more than one stage of transposition. The result is a more complex permutation

that is not easily reconstructed. Thus, if the foregoing message is reencrypted using

the same algorithm,

Key:    4 3 1 2 5 6 7
Input:  t t n a a p t
        m t s u o a o
        d w c o i x k
        n l y p e t z
Output: NSCYAUOPTTWLTMDNAOIEPAXTTOKZ

To visualize the result of this double transposition, designate the letters in the

original plaintext message by the numbers designating their position. Thus, with 28

letters in the message, the original sequence of letters is

01 02 03 04 05 06 07 08 09 10 11 12 13 14

15 16 17 18 19 20 21 22 23 24 25 26 27 28

After the first transposition, we have

03 10 17 24 04 11 18 25 02 09 16 23 01 08

15 22 05 12 19 26 06 13 20 27 07 14 21 28

which has a somewhat regular structure. But after the second transposition, we have

17 09 05 27 24 16 12 07 10 02 22 20 03 25

15 13 04 23 19 14 11 01 26 21 18 08 06 28

This is a much less structured permutation and is much more difficult to cryptanalyze.
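The position sequences above can be regenerated by applying the transposition to the numbers 1 through 28 instead of to letters. The sketch below is illustrative: the function name transpose is an assumption, and the key is taken as the list of column labels.

```python
def transpose(seq, key):
    """Columnar transposition of any sequence: fill rows under the key, read columns by label."""
    ncols = len(key)
    rows = [seq[i:i + ncols] for i in range(0, len(seq), ncols)]
    out = []
    for label in range(1, ncols + 1):
        col = key.index(label)
        out.extend(row[col] for row in rows if col < len(row))
    return out

key = [4, 3, 1, 2, 5, 6, 7]
first = transpose(list(range(1, 29)), key)
second = transpose(first, key)

print(first[:8])   # [3, 10, 17, 24, 4, 11, 18, 25]: steps of 7 within each column
print(second[:8])  # [17, 9, 5, 27, 24, 16, 12, 7]: no comparably regular step
```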

3.4 ROTOR MACHINES

The example just given suggests that multiple stages of encryption can produce an

algorithm that is significantly more difficult to cryptanalyze. This is as true of substi-

tution ciphers as it is of transposition ciphers. Before the introduction of DES, the

most important application of the principle of multiple stages of encryption was a

class of systems known as rotor machines (footnote 10).

The basic principle of the rotor machine is illustrated in Figure 3.8. The

machine consists of a set of independently rotating cylinders through which electri-

cal pulses can flow. Each cylinder has 26 input pins and 26 output pins, with internal

wiring that connects each input pin to a unique output pin. For simplicity, only three

of the internal connections in each cylinder are shown.

If we associate each input and output pin with a letter of the alphabet, then a single cylinder defines a monoalphabetic substitution. For example, in Figure 3.8, if an operator depresses the key for the letter A, an electric signal is applied to the first pin of the first cylinder and flows through the internal connection to the twenty-fifth output pin.

Footnote 10: Machines based on the rotor principle were used by both Germany (Enigma) and Japan (Purple) in World War II. The breaking of both codes by the Allies was a significant factor in the war’s outcome.

Consider a machine with a single cylinder. After each input key is depressed,

the cylinder rotates one position, so that the internal connections are shifted accord-

ingly. Thus, a different monoalphabetic substitution cipher is defined. After 26 let-

ters of plaintext, the cylinder would be back to the initial position. Thus, we have a

polyalphabetic substitution algorithm with a period of 26.

A single-cylinder system is trivial and does not present a formidable crypt-

analytic task. The power of the rotor machine is in the use of multiple cylinders, in

which the output pins of one cylinder are connected to the input pins of the next.

Figure 3.8 shows a three-cylinder system. The left half of the figure shows a position

in which the input from the operator to the first pin (plaintext letter a) is routed

through the three cylinders to appear at the output of the second pin (ciphertext

letter B).

With multiple cylinders, the one closest to the operator input rotates one

pin position with each keystroke. The right half of Figure 3.8 shows the system’s

configuration after a single keystroke. For every complete rotation of the inner

cylinder, the middle cylinder rotates one pin position. Finally, for every complete

rotation of the middle cylinder, the outer cylinder rotates one pin position. This

is the same type of operation seen with an odometer. The result is that there are

26 × 26 × 26 = 17,576 different substitution alphabets used before the system repeats. The addition of fourth and fifth rotors results in periods of 456,976 and 11,881,376 letters, respectively. Thus, a given setting of a 5-rotor machine is equivalent to a Vigenère cipher with a key length of 11,881,376.

Figure 3.8 Three-Rotor Machine with Wiring Represented by Numbered Contacts: (a) Initial setting; (b) Setting after one keystroke. Each panel shows the fast, medium, and slow rotors and their direction of motion, with the letters A through Z at the input and at the output.

Such a scheme presents a formidable cryptanalytic challenge. If, for example,

the cryptanalyst attempts to use a letter frequency analysis approach, the analyst

is faced with the equivalent of over 11 million monoalphabetic ciphers. We might

need on the order of 50 letters in each monoalphabetic cipher for a solution, which

means that the analyst would need to be in possession of a ciphertext with a length

of over half a billion letters.
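The period figures follow from the odometer-style stepping alone, so they can be checked without modeling any wiring. The sketch below is an illustration under that simplification: the function name is hypothetical, and it counts keystrokes until all rotors return to their starting positions.

```python
def rotor_period(n_rotors, pins=26):
    """Keystrokes until every rotor is back at its starting position (odometer stepping)."""
    positions = [0] * n_rotors
    steps = 0
    while True:
        i = 0
        while i < n_rotors:
            positions[i] = (positions[i] + 1) % pins
            if positions[i] != 0:  # no full rotation, so slower rotors stay put
                break
            i += 1                 # full rotation: advance the next rotor as well
        steps += 1
        if all(p == 0 for p in positions):
            return steps

print(rotor_period(1))   # 26
print(rotor_period(3))   # 17576
print(26**4, 26**5)      # 456976 11881376
print(50 * 26**5)        # 594068800 letters at about 50 per alphabet
```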

The significance of the rotor machine today is that it points the way to a large

class of symmetric ciphers, of which the Data Encryption Standard (DES) is the

most prominent. DES is introduced in Chapter 4.

3.5 STEGANOGRAPHY

We conclude with a discussion of a technique that, strictly speaking, is not encryption, namely, steganography.

A plaintext message may be hidden in one of two ways. The methods of

steganography conceal the existence of the message, whereas the methods of cryptography render the message unintelligible to outsiders by various transformations of the text (footnote 11).

A simple form of steganography, but one that is time-consuming to construct,

is one in which an arrangement of words or letters within an apparently innocuous

text spells out the real message. For example, the sequence of first letters of each

word of the overall message spells out the hidden message. Figure 3.9 shows an

example in which a subset of the words of the overall message is used to convey the

hidden message. See if you can decipher this; it’s not too hard.

Various other techniques have been used historically; some examples are the

following [MYER91]:

■ Character marking: Selected letters of printed or typewritten text are overwritten in pencil. The marks are ordinarily not visible unless the paper is held at an angle to bright light.

■ Invisible ink: A number of substances can be used for writing but leave no visible trace until heat or some chemical is applied to the paper.

■ Pin punctures: Small pin punctures on selected letters are ordinarily not visible unless the paper is held up in front of a light.

■ Typewriter correction ribbon: Used between lines typed with a black ribbon, the results of typing with the correction tape are visible only under a strong light.

Footnote 11: Steganography was an obsolete word that was revived by David Kahn and given the meaning it has today [KAHN96].

Although these techniques may seem archaic, they have contemporary equivalents. [WAYN09] proposes hiding a message by using the least significant bits of frames on a CD. For example, the Kodak Photo CD format’s maximum resolution is 3096 × 6144 pixels, with each pixel containing 24 bits of RGB color information.

The least significant bit of each 24-bit pixel can be changed without greatly affecting

the quality of the image. The result is that you can hide a 130-kB message in a single

digital snapshot. There are now a number of software packages available that take

this type of approach to steganography.
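The least-significant-bit idea can be sketched in a few lines. This is a toy illustration, not any particular software package: each pixel is modeled as a 24-bit integer, one message bit is stored per pixel, and the function names are hypothetical. Under these assumptions, an image of 3096 × 6144 pixels has room for 2,377,728 bytes, so a 130-kB message fits with a wide margin.

```python
def capacity_bytes(width, height):
    """Bytes that fit when one bit is hidden in each pixel."""
    return (width * height) // 8

def embed(pixels, message: bytes):
    """Overwrite the least significant bit of successive pixels with the message bits."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this image")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit
    return stego

def extract(pixels, n_bytes):
    """Rebuild n_bytes of message from the least significant bits."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[8 * b + i] & 1)
        out.append(value)
    return bytes(out)

cover = [0x80A0C0] * 64          # 64 identical pixels as a stand-in image
stego = embed(cover, b"hi")
print(extract(stego, 2))         # b'hi'
print(capacity_bytes(3096, 6144))  # 2377728
```

Each pixel value changes by at most 1, which is why the alteration is hard to see.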

Steganography has a number of drawbacks when compared to encryption.

It requires a lot of overhead to hide a relatively few bits of information, although

using a scheme like that proposed in the preceding paragraph may make it more

effective. Also, once the system is discovered, it becomes virtually worthless. This

problem, too, can be overcome if the insertion method depends on some sort of key

(e.g., see Problem 3.22). Alternatively, a message can be first encrypted and then

hidden using steganography.

The advantage of steganography is that it can be employed by parties who

have something to lose should the fact of their secret communication (not necessar-

ily the content) be discovered. Encryption flags traffic as important or secret or may

identify the sender or receiver as someone with something to hide.

Figure 3.9 A Puzzle for Inspector Morse

(From The Silent World of Nicholas Quinn, by Colin Dexter)


3.6 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

block cipher

brute-force attack

Caesar cipher

cipher

ciphertext

computationally secure

conventional encryption

cryptanalysis

cryptographic system

cryptography

cryptology

deciphering

decryption

digram

enciphering

encryption

Hill cipher

monoalphabetic cipher

one-time pad

plaintext

Playfair cipher

polyalphabetic cipher

rail fence cipher

single-key encryption

steganography

stream cipher

symmetric encryption

transposition cipher

unconditionally secure

Vigenère cipher

Review Questions

3.1 Describe the main requirements for the secure use of symmetric encryption.

3.2 What are the two basic functions used in encryption algorithms?

3.3 Differentiate between secret-key encryption and public-key encryption.

3.4 What is the difference between a block cipher and a stream cipher?

3.5 What are the two general approaches to attacking a cipher?

3.6 List and briefly define types of cryptanalytic attacks based on what is known to the

attacker.

3.7 What is the difference between an unconditionally secure cipher and a computationally secure cipher?

3.8 Why is the Caesar cipher substitution technique vulnerable to a brute-force cryptanalysis?

3.9 How much key space is available when a monoalphabetic substitution cipher is used

to replace plaintext with ciphertext?

3.10 What is the drawback of a Playfair cipher?

3.11 What is the difference between a monoalphabetic cipher and a polyalphabetic cipher?

3.12 What are two problems with the one-time pad?

3.13 What is a transposition cipher?

3.14 What are the drawbacks of steganography?

Problems

3.1 A generalization of the Caesar cipher, known as the affine Caesar cipher, has the fol-

lowing form: For each plaintext letter p, substitute the ciphertext letter C:

C = E([a, b], p) = (ap + b) mod 26

A basic requirement of any encryption algorithm is that it be one-to-one. That is, if

p ≠ q, then E(k, p) ≠ E(k, q). Otherwise, decryption is impossible, because more

than one plaintext character maps into the same ciphertext character. The affine

Caesar cipher is not one-to-one for all values of a. For example, for a = 2 and b = 3,

then E([a, b], 0) = E([a, b], 13) = 3.

a. Are there any limitations on the value of b? Explain why or why not.

b. Determine which values of a are not allowed.


c. Provide a general statement of which values of a are and are not allowed. Justify

your statement.
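The collision quoted in Problem 3.1 is easy to reproduce, and a small helper makes it possible to test any candidate pair [a, b] when working the problem. The sketch below is only an aid for experimentation; the function names are illustrative and it does not state which values of a are allowed.

```python
def affine_encrypt(p, a, b):
    """C = (a*p + b) mod 26 for a plaintext letter value p in 0..25."""
    return (a * p + b) % 26

def is_one_to_one(a, b):
    """True if the 26 plaintext values map to 26 distinct ciphertext values."""
    return len({affine_encrypt(p, a, b) for p in range(26)}) == 26

print(affine_encrypt(0, 2, 3), affine_encrypt(13, 2, 3))  # 3 3
print(is_one_to_one(2, 3))  # False
print(is_one_to_one(1, 3))  # True (the ordinary Caesar cipher with shift 3)
```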

3.2 How many one-to-one affine Caesar ciphers are there?

3.3 A ciphertext has been generated with an affine cipher. The most frequent letter of

the ciphertext is “C,” and the second most frequent letter of the ciphertext is “Z.”

Break this code.

3.4 The following ciphertext was generated using a simple substitution algorithm.

hzsrnqc klyy wqc flo mflwf ol zqdn nsoznj wskn lj xzsrbjnf,

wzsxz gqv zqhhnf ol ozn glco zlfnco hnlhrn; nsoznj jnrqosdnc

lj fnqj kjsnfbc, wzsxz sc xnjoqsfrv gljn efeceqr. zn rsdnb

qrlfn sf zsc zlecn sf cqdsrrn jlw, wzsoznj flfn hnfnojqonb.

q csfyrn blgncosx cekksxnb ol cnjdn zsg. zn pjnqmkqconb qfb

bsfnb qo ozn xrep, qo zlejc gqozngqosxqrrv ksanb, sf ozn cqgn

jllg, qo ozn cqgn oqprn, fndnj oqmsfy zsc gnqrc wsoz loznj

gngpnjc, gexz rncc pjsfysfy q yenco wsoz zsg; qfb wnfo zlgn

qo naqxorv gsbfsyzo, lfrv ol jnosjn qo lfxn ol pnb. zn fndnj

ecnb ozn xlcv xzqgpnjc wzsxz ozn jnkljg hjldsbnc klj soc

kqdlejnb gngpnjc. zn hqccnb onf zlejc leo lk ozn ownfov-klej

sf cqdsrrn jlw, nsoznj sf crnnhsfy lj gqmsfy zsc olsrno.

Decrypt this message.

Hints:

1. As you know, the most frequently occurring letter in English is e. Therefore, the

first or second (or perhaps third?) most common character in the message is likely

to stand for e. Also, e is often seen in pairs (e.g., meet, fleet, speed, seen, been,

agree, etc.). Try to find a character in the ciphertext that decodes to e.

2. The most common word in English is “the.” Use this fact to guess the characters

that stand for t and h.

3. Decipher the rest of the message by deducing additional words.

Warning: The resulting message is in English but may not make much sense on a first

reading.

3.5 One way to solve the key distribution problem is to use a line from a book that both

the sender and the receiver possess. Typically, at least in spy novels, the first sentence

of a book serves as the key. The particular scheme discussed in this problem is from

one of the best suspense novels involving secret codes, Talking to Strange Men, by

Ruth Rendell. Work this problem without consulting that book!

Consider the following message:

SIDKHKDM AF HCRKIABIE SHIMC KD LFEAILA

This ciphertext was produced using the first sentence of The Other Side of Silence

(a book about the spy Kim Philby):

The snow lay thick on the steps and the snowflakes driven by the wind

looked black in the headlights of the cars.

A simple substitution cipher was used.

a. What is the encryption algorithm?

b. How secure is it?

c. To make the key distribution problem simple, both parties can agree to use the first or

last sentence of a book as the key. To change the key, they simply need to agree on a

new book. The use of the first sentence would be preferable to the use of the last. Why?

3.6 In one of his cases, Sherlock Holmes was confronted with the following message.

534 C2 13 127 36 31 4 17 21 41

DOUGLAS 109 293 5 37 BIRLSTONE

26 BIRLSTONE 9 127 171


Although Watson was puzzled, Holmes was able immediately to deduce the type of

cipher. Can you?

3.7 This problem uses a real-world example, from an old U.S. Special Forces manual (public domain). The document, filename SpecialForces, is available at box.com/Crypto7e.

a. Using the two keys (memory words) cryptographic and network security, encrypt

the following message:

Be at the third pillar from the left outside the lyceum theatre tonight at seven.

If you are distrustful bring two friends.

Make reasonable assumptions about how to treat redundant letters and excess

letters in the memory words and how to treat spaces and punctuation. Indicate

what your assumptions are. Note: The message is from the Sherlock Holmes novel,

The Sign of Four.

b. Decrypt the ciphertext. Show your work.

c. Comment on when it would be appropriate to use this technique and what its

advantages are.

3.8 A disadvantage of the general monoalphabetic cipher is that both sender and receiver

must commit the permuted cipher sequence to memory. A common technique for

avoiding this is to use a keyword from which the cipher sequence can be gener-

ated. For example, using the keyword CRYPTO, write out the keyword followed by

unused letters in normal order and match this against the plaintext letters:

plain: a b c d e f g h i j k l m n o p q r s t u v w x y z

cipher: C R Y P T O A B D E F G H I J K L M N Q S U V W X Z

If it is felt that this process does not produce sufficient mixing, write the remain-

ing letters on successive lines and then generate the sequence by reading down the

columns:

C R Y P T O

A B D E F G

H I J K L M

N Q S U V W

X Z

This yields the sequence:

C A H N X R B I Q Z Y D J S P E K U T F L V O G M W

Such a system is used in the example in Section 3.2 (the one that begins “it was

disclosed yesterday”). Determine the keyword.
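Both constructions described in Problem 3.8 can be generated mechanically. The sketch below is illustrative: the function names are not from the text, the keyword is assumed to contain letters only, and the column width is taken to be the number of distinct letters in the keyword.

```python
import string

def keyword_alphabet(keyword):
    """Keyword (duplicate letters dropped) followed by the unused letters in normal order."""
    # dict.fromkeys keeps the first occurrence of each letter, in order.
    return "".join(dict.fromkeys(keyword.upper() + string.ascii_uppercase))

def columnar_mix(sequence, width):
    """Write the sequence in rows of the given width, then read down the columns."""
    rows = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "".join(row[c] for c in range(width) for row in rows if c < len(row))

base = keyword_alphabet("CRYPTO")
print(base)                   # CRYPTOABDEFGHIJKLMNQSUVWXZ
print(columnar_mix(base, 6))  # CAHNXRBIQZYDJSPEKUTFLVOGMW
```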

3.9 When the PT-109 American patrol boat, under the command of Lieutenant John F.

Kennedy, was sunk by a Japanese destroyer, a message was received at an Australian

wireless station in Playfair code:

KXJEY UREBE ZWEHE WRYTU HEYFS

KREHE GOYFI WTTTU OLKSY CAJPO

BOTEI ZONTX BYBNT GONEY CUZWR

GDSON SXBOU YWRHE BAAHY USEDQ

The key used was royal new zealand navy. Decrypt the message. Translate TT into tt.


3.10 a. Construct a Playfair matrix with the key algorithm.

b. Construct a Playfair matrix with the key cryptography. Make a reasonable assump-

tion about how to treat redundant letters in the key.

3.11 a. Using this Playfair matrix:

J/K C D E F

U N P Q S

Z V W X Y

R A L G O

B I T H M

Encrypt this message:

I only regret that I have but one life to give for my country.

Note: This message is by Nathan Hale, a soldier in the American Revolutionary War.

b. Repeat part (a) using the Playfair matrix from Problem 3.10a.

c. How do you account for the results of this problem? Can you generalize your

conclusion?

3.12 a. How many possible keys does the Playfair cipher have? Ignore the fact that

some keys might produce identical encryption results. Express your answer as an

approximate power of 2.

b. Now take into account the fact that some Playfair keys produce the same encryp-

tion results. How many effectively unique keys does the Playfair cipher have?

3.13 What substitution system results when we use a 1 × 25 Playfair matrix?

3.14 a. Encrypt the message “meet me at the usual place at ten rather than eight o clock”

using the Hill cipher with the key $\begin{pmatrix} 7 & 3 \\ 2 & 5 \end{pmatrix}$. Show your calculations and the result.

b. Show the calculations for the corresponding decryption of the ciphertext to

recover the original plaintext.

3.15 We have shown that the Hill cipher succumbs to a known plaintext attack if sufficient

plaintext–ciphertext pairs are provided. It is even easier to solve the Hill cipher if a

chosen plaintext attack can be mounted. Describe such an attack.

3.16 It can be shown that the Hill cipher with the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ requires that (ad – bc)

is relatively prime to 26; that is, the only common positive integer factor of (ad – bc)

and 26 is 1. Thus, if (ad – bc) = 13 or is even, the matrix is not allowed. Determine

the number of different (good) keys there are for a 2 × 2 Hill cipher without counting them one by one, using the following steps:

a. Find the number of matrices whose determinant is even because one or both rows

are even. (A row is “even” if both entries in the row are even.)

b. Find the number of matrices whose determinant is even because one or both col-

umns are even. (A column is “even” if both entries in the column are even.)

c. Find the number of matrices whose determinant is even because all of the entries

are odd.

d. Taking into account overlaps, find the total number of matrices whose determi-

nant is even.

e. Find the number of matrices whose determinant is a multiple of 13 because the

first column is a multiple of 13.


f. Find the number of matrices whose determinant is a multiple of 13 where

the first column is not a multiple of 13 but the second column is a mul-

tiple of the first modulo 13.

g. Find the total number of matrices whose determinant is a multiple of 13.

h. Find the number of matrices whose determinant is a multiple of 26

because they fit cases parts (a) and (e), (b) and (e), (c) and (e), (a) and

(f), and so on.

i. Find the total number of matrices whose determinant is neither a mul-

tiple of 2 nor a multiple of 13.

3.17 Calculate the determinant mod 26 of

a. ¢2 3 5

1 3 7

≤ b. £ 2 1 1 3 2 55 7 1 8

3 1 4 1 2

≥

3.18 Determine the inverse mod 26 of

a. $\begin{pmatrix} 2 & 3 \\ 1 & 22 \end{pmatrix}$   b. $\begin{pmatrix} 6 & 24 & 1 \\ 13 & 16 & 10 \\ 20 & 17 & 15 \end{pmatrix}$

3.19 Using the Vigenère cipher, encrypt the word “cryptographic” using the word

“eng”.

3.20 This problem explores the use of a one-time pad version of the Vigenère

cipher. In this scheme, the key is a stream of random numbers between 0

and 26. For example, if the key is 3 19 5 . . . , then the first letter of plaintext

is encrypted with a shift of 3 letters, the second with a shift of 19 letters, the

third with a shift of 5 letters, and so on.

a. Encrypt the plaintext sendmoremoney with the key stream

3 11 5 7 17 21 0 11 14 8 7 13 9

b. Using the ciphertext produced in part (a), find a key so that the cipher-

text decrypts to the plaintext cashnotneeded.

3.21 What is the message embedded in Figure 3.9?

3.22 In one of Dorothy Sayers’s mysteries, Lord Peter is confronted with the

message shown in Figure 3.10. He also discovers the key to the message,

which is a sequence of integers:

787656543432112343456567878878765654

3432112343456567878878765654433211234

a. Decrypt the message. Hint: What is the largest integer value?

b. If the algorithm is known but not the key, how secure is the scheme?

c. If the key is known but not the algorithm, how secure is the scheme?

Figure 3.10 A Puzzle for Lord Peter

I thought to see the fairies in the fields, but I saw only the evil elephants with their black

backs. Woe! how that sight awed me! The elves danced all around and about while I heard

voices calling clearly. Ah! how I tried to see—throw off the ugly cloud—but no blind eye

of a mortal was permitted to spy them. So then came minstrels, having gold trumpets, harps

and drums. These played very loudly beside me, breaking that spell. So the dream vanished,

whereat I thanked Heaven. I shed many tears before the thin moon rose up, frail and faint as

a sickle of straw. Now though the Enchanter gnash his teeth vainly, yet shall he return as the

Spring returns. Oh, wretched man! Hell gapes, Erebus now lies open. The mouths of Death

wait on thy end.


Programming Problems

3.23 Write a program that can encrypt and decrypt using the general Caesar

cipher, also known as an additive cipher.

3.24 Write a program that can encrypt and decrypt using the affine cipher

described in Problem 3.1.

3.25 Write a program that can perform a letter frequency attack on an additive

cipher without human intervention. Your software should produce possible

plaintexts in rough order of likelihood. It would be good if your user inter-

face allowed the user to specify “give me the top 10 possible plaintexts.”

3.26 Write a program that can perform a letter frequency attack on any mono-

alphabetic substitution cipher without human intervention. Your software

should produce possible plaintexts in rough order of likelihood. It would

be good if your user interface allowed the user to specify “give me the top

10 possible plaintexts.”

3.27 Create software that can encrypt and decrypt using a 2 × 2 Hill cipher.

3.28 Create software that can perform a fast known plaintext attack on a Hill cipher,

given the dimension m. How fast are your algorithms, as a function of m?
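The keyword construction described in Problem 3.8 can be sketched in a few lines of Python. This is only an illustration of the two ways of generating the cipher sequence; the function names `keyword_alphabet` and `columnar_alphabet` are hypothetical, and the sketch assumes an uppercase 26-letter alphabet with repeated keyword letters dropped.

```python
import string


def keyword_alphabet(keyword: str) -> str:
    """Keyword letters (duplicates dropped) followed by the unused letters in normal order."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def columnar_alphabet(keyword: str) -> str:
    """Write the keyword alphabet in rows as wide as the keyword, then read down the columns."""
    base = keyword_alphabet(keyword)
    # Row width = number of distinct letters in the keyword (assumption for repeated letters).
    width = len({ch for ch in keyword.upper() if ch in string.ascii_uppercase})
    return "".join(base[col::width] for col in range(width))


def encrypt(plaintext: str, cipher_alphabet: str) -> str:
    """Monoalphabetic substitution: lowercase plaintext letters map to uppercase cipher letters."""
    table = str.maketrans(string.ascii_lowercase, cipher_alphabet)
    return plaintext.lower().translate(table)


print(keyword_alphabet("CRYPTO"))
print(columnar_alphabet("CRYPTO"))
```

With the keyword CRYPTO, the two calls reproduce the two sequences shown in Problem 3.8.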


4.1 Traditional Block Cipher Structure

Stream Ciphers and Block Ciphers

Motivation for the Feistel Cipher Structure

The Feistel Cipher

4.2 The Data Encryption Standard

DES Encryption

DES Decryption

4.3 A DES Example

Results

The Avalanche Effect

4.4 The Strength of DES

The Use of 56-Bit Keys

The Nature of the DES Algorithm

Timing Attacks

4.5 Block Cipher Design Principles

Number of Rounds

Design of Function F

Key Schedule Algorithm

4.6 Key Terms, Review Questions, and Problems

CHAPTER 4

Block Ciphers and the Data Encryption Standard


The objective of this chapter is to illustrate the principles of modern symmetric

ciphers. For this purpose, we focus on the most widely used symmetric cipher: the Data

Encryption Standard (DES). Although numerous symmetric ciphers have been devel-

oped since the introduction of DES, and although it is destined to be replaced by the

Advanced Encryption Standard (AES), DES remains the most important such algo-

rithm. Furthermore, a detailed study of DES provides an understanding of the prin-

ciples used in other symmetric ciphers.

This chapter begins with a discussion of the general principles of symmetric block

ciphers, which are the principal type of symmetric ciphers studied in this book. The
other form of symmetric cipher, the stream cipher, is discussed in Chapter 8. Next, we

cover full DES. Following this look at a specific algorithm, we return to a more general

discussion of block cipher design.

Compared to public-key ciphers, such as RSA, the structure of DES and most

symmetric ciphers is very complex and cannot be explained as easily as RSA and simi-

lar algorithms. Accordingly, the reader may wish to begin with a simplified version of

DES, which is described in Appendix G. This version allows the reader to perform

encryption and decryption by hand and gain a good understanding of the working of

the algorithm details. Classroom experience indicates that a study of this simplified

version enhances understanding of DES.¹

4.1 TRADITIONAL BLOCK CIPHER STRUCTURE

Several important symmetric block encryption algorithms in current use are based

on a structure referred to as a Feistel block cipher [FEIS73]. For that reason, it is

important to examine the design principles of the Feistel cipher. We begin with a

comparison of stream ciphers and block ciphers. Then we discuss the motivation for

the Feistel block cipher structure. Finally, we discuss some of its implications.

¹ However, you may safely skip Appendix G, at least on a first reading. If you get lost or bogged down in

the details of DES, then you can go back and start with simplified DES.

LEARNING OBJECTIVES

After studying this chapter, you should be able to

◆ Understand the distinction between stream ciphers and block ciphers.

◆ Present an overview of the Feistel cipher and explain how decryption is

the inverse of encryption.

◆ Present an overview of Data Encryption Standard (DES).

◆ Explain the concept of the avalanche effect.

◆ Discuss the cryptographic strength of DES.

◆ Summarize the principal block cipher design principles.


Stream Ciphers and Block Ciphers

A stream cipher is one that encrypts a digital data stream one bit or one byte at a

time. Examples of classical stream ciphers are the autokeyed Vigenère cipher and

the Vernam cipher. In the ideal case, a one-time pad version of the Vernam cipher

would be used (Figure 3.7), in which the keystream ($k_i$) is as long as the plaintext bit

stream (pi). If the cryptographic keystream is random, then this cipher is unbreakable

by any means other than acquiring the keystream. However, the keystream must be

provided to both users in advance via some independent and secure channel. This

introduces insurmountable logistical problems if the intended data traffic is very large.

Accordingly, for practical reasons, the bit-stream generator must be imple-

mented as an algorithmic procedure, so that the cryptographic bit stream can be

produced by both users. In this approach (Figure 4.1a), the bit-stream generator is

a key-controlled algorithm and must produce a bit stream that is cryptographically

strong. That is, it must be computationally impractical to predict future portions of

the bit stream based on previous portions of the bit stream. The two users need only

share the generating key, and each can produce the keystream.
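The idea of a key-controlled bit-stream generator can be illustrated with a small sketch. The construction below (hashing the key together with a counter using SHA-256) is an assumption chosen only because it is easy to read; it is not the generator of any particular stream cipher and should not be taken as a vetted design. The names `keystream` and `xor_stream` are hypothetical.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy key-controlled generator: concatenate SHA-256(key || counter) blocks."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing the data with the keystream (the operation is its own inverse)."""
    ks = keystream(key, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))


shared_key = b"generating key"  # the only value the two users need to share
ciphertext = xor_stream(shared_key, b"attack at dawn")
recovered = xor_stream(shared_key, ciphertext)
print(recovered)
```

Both users run the same generator from the same key, so neither has to receive the keystream itself. Note that this sketch has no nonce, so reusing a key would reuse the keystream.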

A block cipher is one in which a block of plaintext is treated as a whole and
used to produce a ciphertext block of equal length. Typically, a block size of 64 or
128 bits is used. As with a stream cipher, the two users share a symmetric encryption
key (Figure 4.1b). Using some of the modes of operation explained in Chapter 7, a
block cipher can be used to achieve the same effect as a stream cipher.

Figure 4.1 Stream Cipher and Block Cipher. (a) Stream cipher using algorithmic bit-stream generator: on both the encryption and decryption sides, the key (K) drives a bit-stream generation algorithm whose output $k_i$ is combined with the plaintext ($p_i$) or the ciphertext ($c_i$). (b) Block cipher: under the key (K), the encryption algorithm maps b bits of plaintext to b bits of ciphertext, and the decryption algorithm maps b bits of ciphertext back to b bits of plaintext. [Diagram not reproduced.]

Far more effort has gone into analyzing block ciphers. In general, they seem

applicable to a broader range of applications than stream ciphers. The vast majority

of network-based symmetric cryptographic applications make use of block ciphers.

Accordingly, the concern in this chapter, and in our discussions throughout the

book of symmetric encryption, will primarily focus on block ciphers.

Motivation for the Feistel Cipher Structure

A block cipher operates on a plaintext block of n bits to produce a ciphertext block

of n bits. There are $2^n$ possible different plaintext blocks and, for the encryption

to be reversible (i.e., for decryption to be possible), each must produce a unique

ciphertext block. Such a transformation is called reversible, or nonsingular. The fol-

lowing examples illustrate nonsingular and singular transformations for n = 2.

Reversible Mapping            Irreversible Mapping
Plaintext   Ciphertext        Plaintext   Ciphertext
   00          11                00          11
   01          10                01          10
   10          00                10          01
   11          01                11          01

In the latter case, a ciphertext of 01 could have been produced by one of two plain-

text blocks. So if we limit ourselves to reversible mappings, the number of different

transformations is $2^n!$.²
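The two n = 2 mappings above, and the count of reversible mappings, can be checked with a short sketch. The dictionary and function names are hypothetical; the count is read as the factorial of $2^n$.

```python
from math import factorial

# The two n = 2 mappings from the text, plaintext block -> ciphertext block.
REVERSIBLE = {"00": "11", "01": "10", "10": "00", "11": "01"}
IRREVERSIBLE = {"00": "11", "01": "10", "10": "01", "11": "01"}


def is_reversible(mapping: dict) -> bool:
    """A mapping is reversible when no two plaintext blocks share a ciphertext block."""
    return len(set(mapping.values())) == len(mapping)


def count_reversible_mappings(n: int) -> int:
    """Number of reversible mappings on n-bit blocks: (2^n)!"""
    return factorial(2 ** n)


print(is_reversible(REVERSIBLE), is_reversible(IRREVERSIBLE))
print(count_reversible_mappings(2))
```

For n = 2 there are 4 blocks and therefore 4! = 24 reversible mappings; the count grows very quickly with n.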

Figure 4.2 illustrates the logic of a general substitution cipher for n = 4.

A 4-bit input produces one of 16 possible input states, which is mapped by the sub-

stitution cipher into a unique one of 16 possible output states, each of which is repre-

sented by 4 ciphertext bits. The encryption and decryption mappings can be defined

by a tabulation, as shown in Table 4.1. This is the most general form of block cipher

and can be used to define any reversible mapping between plaintext and ciphertext.

Feistel refers to this as the ideal block cipher, because it allows for the maximum

number of possible encryption mappings from the plaintext block [FEIS75].

But there is a practical problem with the ideal block cipher. If a small block

size, such as n = 4, is used, then the system is equivalent to a classical substitution

cipher. Such systems, as we have seen, are vulnerable to a statistical analysis of the

plaintext. This weakness is not inherent in the use of a substitution cipher but rather

results from the use of a small block size. If n is sufficiently large and an arbitrary

reversible substitution between plaintext and ciphertext is allowed, then the statisti-

cal characteristics of the source plaintext are masked to such an extent that this type

of cryptanalysis is infeasible.

² The reasoning is as follows: For the first plaintext, we can choose any of $2^n$ ciphertext blocks. For the
second plaintext, we choose from among $2^n - 1$ remaining ciphertext blocks, and so on.


Figure 4.2 General n-bit-n-bit Block Substitution (shown with n = 4). Elements, top to bottom: 4-bit input; 4 to 16 decoder with outputs 0 to 15; 16 to 4 encoder with inputs 0 to 15; 4-bit output. [Diagram not reproduced.]

Table 4.1 Encryption and Decryption Tables for Substitution Cipher of Figure 4.2

Encryption                    Decryption
Plaintext   Ciphertext        Ciphertext   Plaintext
0000        1110              0000         1110
0001        0100              0001         0011
0010        1101              0010         0100
0011        0001              0011         1000
0100        0010              0100         0001
0101        1111              0101         1100
0110        1011              0110         1010
0111        1000              0111         1111
1000        0011              1000         0111
1001        1010              1001         1101
1010        0110              1010         1001
1011        1100              1011         0110
1100        0101              1100         1011
1101        1001              1101         0010
1110        0000              1110         0000
1111        0111              1111         0101

An arbitrary reversible substitution cipher (the ideal block cipher) for a large
block size is not practical, however, from an implementation and performance
point of view. For such a transformation, the mapping itself constitutes the key.
Consider again Table 4.1, which defines one particular reversible mapping from
plaintext to ciphertext for n = 4. The mapping can be defined by the entries in the

second column, which show the value of the ciphertext for each plaintext block.

This, in essence, is the key that determines the specific mapping from among all

possible mappings. In this case, using this straightforward method of defining the

key, the required key length is (4 bits) × (16 rows) = 64 bits. In general, for an
n-bit ideal block cipher, the length of the key defined in this fashion is $n \times 2^n$ bits.
For a 64-bit block, which is a desirable length to thwart statistical attacks, the
required key length is $64 \times 2^{64} = 2^{70} \approx 10^{21}$ bits.
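To make the "mapping is the key" point concrete, the sketch below stores the ciphertext column of Table 4.1 as a 16-entry list, derives the decryption table by inverting it, and evaluates the key-length formula. The names `ENC`, `DEC`, `invert` and `ideal_key_bits` are hypothetical; the table values are transcribed from Table 4.1.

```python
# Ciphertext column of Table 4.1, indexed by the plaintext value 0..15.
# These 16 four-bit entries are, in effect, the 64-bit key.
ENC = [0b1110, 0b0100, 0b1101, 0b0001, 0b0010, 0b1111, 0b1011, 0b1000,
       0b0011, 0b1010, 0b0110, 0b1100, 0b0101, 0b1001, 0b0000, 0b0111]


def invert(table: list) -> list:
    """Build the decryption table: position c holds the plaintext p with table[p] == c."""
    inverse = [0] * len(table)
    for plaintext, ciphertext in enumerate(table):
        inverse[ciphertext] = plaintext
    return inverse


def ideal_key_bits(n: int) -> int:
    """Key length of an n-bit ideal block cipher defined by tabulation: n * 2^n bits."""
    return n * 2 ** n


DEC = invert(ENC)
print(format(DEC[0b0001], "04b"))   # plaintext that encrypts to 0001
print(ideal_key_bits(4))            # bits needed for n = 4
print(ideal_key_bits(64) == 2 ** 70)
```

The inverted list agrees with the decryption half of Table 4.1, and the formula gives 64 bits for n = 4 and $2^{70}$ bits for n = 64.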

In considering these difficulties, Feistel points out that what is needed is an

approximation to the ideal block cipher system for large n, built up out of compo-

nents that are easily realizable [FEIS75]. But before turning to Feistel’s approach,

let us make one other observation. We could use the general block substitution

cipher but, to make its implementation tractable, confine ourselves to a subset of

the $2^n!$ possible reversible mappings. For example, suppose we define the mapping

in terms of a set of linear equations. In the case of n = 4, we have

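$$
\begin{aligned}
y_1 &= k_{11}x_1 + k_{12}x_2 + k_{13}x_3 + k_{14}x_4 \\
y_2 &= k_{21}x_1 + k_{22}x_2 + k_{23}x_3 + k_{24}x_4 \\
y_3 &= k_{31}x_1 + k_{32}x_2 + k_{33}x_3 + k_{34}x_4 \\
y_4 &= k_{41}x_1 + k_{42}x_2 + k_{43}x_3 + k_{44}x_4
\end{aligned}
$$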

where the $x_i$ are the four binary digits of the plaintext block, the $y_i$ are the four binary
digits of the ciphertext block, the $k_{ij}$ are the binary coefficients, and arithmetic
is mod 2. The key size is just $n^2$, in this case 16 bits. The danger with this kind of
formulation is that it may be vulnerable to cryptanalysis by an attacker that is aware of

the structure of the algorithm. In this example, what we have is essentially the Hill

cipher discussed in Chapter 3, applied to binary data rather than characters. As we

saw in Chapter 3, a simple linear system such as this is quite vulnerable.
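One way to see the vulnerability is a small chosen-plaintext sketch: because the mapping is linear, encrypting the n unit vectors reveals the n columns of the key matrix. The matrix `K` below is an arbitrary example chosen for illustration, and the helper names are hypothetical; this is one possible attack, not necessarily the one the text has in mind.

```python
def encrypt(K: list, x: list) -> list:
    """y_i = sum_j k_ij * x_j (mod 2), the linear mapping from the equations above."""
    return [sum(k * xj for k, xj in zip(row, x)) % 2 for row in K]


def xor(a: list, b: list) -> list:
    """Bitwise addition mod 2 of two bit vectors."""
    return [p ^ q for p, q in zip(a, b)]


def recover_key(oracle, n: int) -> list:
    """Encrypt each unit vector e_j; the result is column j of the key matrix."""
    columns = [oracle([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    # Transpose the list of columns back into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(n)]


# Hypothetical 16-bit key (n = 4); any binary matrix works for this demonstration.
K = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1],
     [0, 0, 0, 1]]

recovered = recover_key(lambda x: encrypt(K, x), 4)
print(recovered == K)
```

Four chosen plaintext blocks are enough to recover all 16 key bits, which is far less work than trying the $2^{16}$ possible keys.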

The Feistel Cipher

Feistel proposed [FEIS73] that we can approximate the ideal block cipher by utiliz-

ing the concept of a product cipher, which is the execution of two or more simple

ciphers in sequence in such a way that the final result or product is cryptographi-

cally stronger than any of the component ciphers. The essence of the approach is

to develop a block cipher with a key length of k bits and a block length of n bits,

allowing a total of $2^k$ possible transformations, rather than the $2^n!$ transformations

available with the ideal block cipher.

In particular, Feistel proposed the use of a cipher that alternates substitutions

and permutations, where these terms are defined as follows:

■ Substitution: Each plaintext element or group of elements is uniquely replaced

by a corresponding ciphertext element or group of elements.

■ Permutation: A sequence of plaintext elements is replaced by a permutation

of that sequence. That is, no elements are added or deleted or replaced in the

sequence, rather the order in which the elements appear in the sequence is

changed.


In fact, Feistel’s is a practical application of a proposal by Claude Shannon

to develop a product cipher that alternates confusion and diffusion functions

[SHAN49].³ We look next at these concepts of diffusion and confusion and then

present the Feistel cipher. But first, it is worth commenting on this remarkable fact:

The Feistel cipher structure, which dates back over a quarter century and which, in

turn, is based on Shannon’s proposal of 1945, is the structure used by a number of

significant symmetric block ciphers currently in use. In particular, the Feistel struc-

ture is used for Triple Data Encryption Algorithm (TDEA), which is one of the two

encryption algorithms (along with AES), approved for general use by the National

Institute of Standards and Technology (NIST). The Feistel structure is also used for

several schemes for format-preserving encryption, which have recently come into

prominence. In addition, the Camellia block cipher is a Feistel structure; it is one

of the possible symmetric ciphers in TLS and a number of other Internet security

protocols. Both TDEA and format-preserving encryption are covered in Chapter 7.

DIFFUSION AND CONFUSION The terms diffusion and confusion were introduced by

Claude Shannon to capture the two basic building blocks for any cryptographic sys-

tem [SHAN49]. Shannon’s concern was to thwart cryptanalysis based on statisti-

cal analysis. The reasoning is as follows. Assume the attacker has some knowledge

of the statistical characteristics of the plaintext. For example, in a human-readable

message in some language, the frequency distribution of the various letters may be

known. Or there may be words or phrases likely to appear in the message (probable

words). If these statistics are in any way reflected in the ciphertext, the cryptanalyst

may be able to deduce the encryption key, part of the key, or at least a set of keys

likely to contain the exact key. In what Shannon refers to as a strongly ideal cipher,

all statistics of the ciphertext are independent of the particular key used. The arbi-

trary substitution cipher that we discussed previously (Figure 4.2) is such a cipher,

but as we have seen, it is impractical.⁴

Other than recourse to ideal systems, Shannon suggests two methods for

frustrating statistical cryptanalysis: diffusion and confusion. In diffusion, the sta-

tistical structure of the plaintext is dissipated into long-range statistics of the

ciphertext. This is achieved by having each plaintext digit affect the value of many

ciphertext digits; generally, this is equivalent to having each ciphertext digit be

affected by many plaintext digits. An example of diffusion is to encrypt a message

$M = m_1, m_2, m_3, \ldots$ of characters with an averaging operation:

$$
y_n = \left( \sum_{i=1}^{k} m_{n+i} \right) \bmod 26
$$

³ The paper is available at box.com/Crypto7e. Shannon’s 1949 paper appeared originally as a classified

report in 1945. Shannon enjoys an amazing and unique position in the history of computer and informa-

tion science. He not only developed the seminal ideas of modern cryptography but is also responsible for

inventing the discipline of information theory. Based on his work in information theory, he developed

a formula for the capacity of a data communications channel, which is still used today. In addition, he

founded another discipline, the application of Boolean algebra to the study of digital circuits; this last he

managed to toss off as a master’s thesis.

⁴ Appendix F expands on Shannon’s concepts concerning measures of secrecy and the security of cryptographic algorithms.


adding k successive letters to get a ciphertext letter $y_n$. One can show that the statistical
structure of the plaintext has been dissipated. Thus, the letter frequencies in

the ciphertext will be more nearly equal than in the plaintext; the digram frequen-

cies will also be more nearly equal, and so on. In a binary block cipher, diffusion can

be achieved by repeatedly performing some permutation on the data followed by

applying a function to that permutation; the effect is that bits from different positions
in the original plaintext contribute to a single bit of ciphertext.⁵
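The averaging operation can be tried out directly. The sketch below assumes lowercase letters a to z numbered 0 to 25 and, since the text does not say what happens at the end of the message, treats the indices cyclically; both choices and the function names are assumptions made for illustration.

```python
def diffuse(message: str, k: int) -> str:
    """y_n = (m_{n+1} + ... + m_{n+k}) mod 26, with indices wrapping around the message."""
    nums = [ord(ch) - ord("a") for ch in message]
    size = len(nums)
    out = []
    for n in range(size):
        total = sum(nums[(n + i) % size] for i in range(1, k + 1))
        out.append(chr(total % 26 + ord("a")))
    return "".join(out)


def changed_positions(a: str, b: str) -> list:
    """Positions at which two equal-length strings differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Change a single plaintext letter (position 5, 'k' -> 'x') and compare the outputs.
c1 = diffuse("attackatdawn", 4)
c2 = diffuse("attacxatdawn", 4)
print(changed_positions(c1, c2))
```

With k = 4, the single changed plaintext letter alters four ciphertext letters (positions 1 to 4), which is the sense in which each plaintext digit affects many ciphertext digits.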

Every block cipher involves a transformation of a block of plaintext into a

block of ciphertext, where the transformation depends on the key. The mechanism

of diffusion seeks to make the statistical relationship between the plaintext and

ciphertext as complex as possible in order to thwart attempts to deduce the key. On

the other hand, confusion seeks to make the relationship between the statistics of

the ciphertext and the value of the encryption key as complex as possible, again to

thwart attempts to discover the key. Thus, even if the attacker can get some handle

on the statistics of the ciphertext, the way in which the key was used to produce that

ciphertext is so complex as to make it difficult to deduce the key. This is achieved by

the use of a complex substitution algorithm. In contrast, a simple linear substitution

function would add little confusion.

As [ROBS95b] points out, so successful are diffusion and confusion in captur-

ing the essence of the desired attributes of a block cipher that they have become the

cornerstone of modern block cipher design.

FEISTEL CIPHER STRUCTURE The left-hand side of Figure 4.3 depicts the encryption
structure proposed by Feistel. The inputs to the encryption algorithm are a plaintext
block of length 2w bits and a key K. The plaintext block is divided into two halves,
$LE_0$ and $RE_0$. The two halves of the data pass through n rounds of processing and
then combine to produce the ciphertext block. Each round i has as inputs $LE_{i-1}$ and
$RE_{i-1}$ derived from the previous round, as well as a subkey $K_i$ derived from the overall
K. In general, the subkeys $K_i$ are different from K and from each other. In Figure
4.3, 16 rounds are used, although any number of rounds could be implemented.

All rounds have the same structure. A substitution is performed on the left

half of the data. This is done by applying a round function F to the right half of the

data and then taking the exclusive-OR of the output of that function and the left

half of the data. The round function has the same general structure for each round

but is parameterized by the round subkey $K_i$. Another way to express this is to say
that F is a function of a right-half block of w bits and a subkey of y bits, which produces
an output value of length w bits: $F(RE_i, K_{i+1})$. Following this substitution, a
permutation is performed that consists of the interchange of the two halves of the
data.⁶ This structure is a particular form of the substitution-permutation network
(SPN) proposed by Shannon.

⁵ Some books on cryptography equate permutation with diffusion. This is incorrect. Permutation, by itself,

does not change the statistics of the plaintext at the level of individual letters or permuted blocks. For exam-

ple, in DES, the permutation swaps two 32-bit blocks, so statistics of strings of 32 bits or less are preserved.

⁶ The final round is followed by an interchange that undoes the interchange that is part of the final round.

One could simply leave both interchanges out of the diagram, at the sacrifice of some consistency of pre-

sentation. In any case, the effective lack of a swap in the final round is done to simplify the implementa-

tion of the decryption process, as we shall see.

Figure 4.3 Feistel Encryption and Decryption (16 rounds). Left side (encryption): the plaintext input $LE_0 \| RE_0$ passes through rounds 1 to 16, each applying F with subkeys $K_1, K_2, \ldots, K_{15}, K_{16}$, followed by a final swap to give the ciphertext output. Right side (decryption): the ciphertext input $LD_0 = RE_{16}$, $RD_0 = LE_{16}$ passes through rounds 1 to 16 with subkeys $K_{16}, K_{15}, \ldots, K_2, K_1$, where $LD_i = RE_{16-i}$ and $RD_i = LE_{16-i}$, followed by a final swap to give the plaintext output. [Diagram not reproduced.]

The exact realization of a Feistel network depends on the choice of the following
parameters and design features:

■ Block size: Larger block sizes mean greater security (all other things being
equal) but reduced encryption/decryption speed for a given algorithm. The
greater security is achieved by greater diffusion. Traditionally, a block size of
64 bits has been considered a reasonable tradeoff and was nearly universal in
block cipher design. However, the new AES uses a 128-bit block size.

■ Key size: Larger key size means greater security but may decrease encryption/

decryption speed. The greater security is achieved by greater resistance to

brute-force attacks and greater confusion. Key sizes of 64 bits or less are now

widely considered to be inadequate, and 128 bits has become a common size.

■ Number of rounds: The essence of the Feistel cipher is that a single round

offers inadequate security but that multiple rounds offer increasing security.

A typical size is 16 rounds.

■ Subkey generation algorithm: Greater complexity in this algorithm should

lead to greater difficulty of cryptanalysis.

■ Round function F: Again, greater complexity generally means greater resis-

tance to cryptanalysis.

There are two other considerations in the design of a Feistel cipher:

■ Fast software encryption/decryption: In many cases, encryption is embedded

in applications or utility functions in such a way as to preclude a hardware im-

plementation. Accordingly, the speed of execution of the algorithm becomes a

concern.

■ Ease of analysis: Although we would like to make our algorithm as difficult as

possible to cryptanalyze, there is great benefit in making the algorithm easy

to analyze. That is, if the algorithm can be concisely and clearly explained, it is

easier to analyze that algorithm for cryptanalytic vulnerabilities and therefore

develop a higher level of assurance as to its strength. DES, for example, does

not have an easily analyzed functionality.

FEISTEL DECRYPTION ALGORITHM The process of decryption with a Feistel cipher

is essentially the same as the encryption process. The rule is as follows: Use the

ciphertext as input to the algorithm, but use the subkeys $K_i$ in reverse order. That
is, use $K_n$ in the first round, $K_{n-1}$ in the second round, and so on, until $K_1$ is used in
the last round. This is a nice feature, because it means we need not implement two
different algorithms, one for encryption and one for decryption.

To see that the same algorithm with a reversed key order produces the cor-

rect result, Figure 4.3 shows the encryption process going down the left-hand side

and the decryption process going up the right-hand side for a 16-round algorithm.

For clarity, we use the notation $LE_i$ and $RE_i$ for data traveling through the encryption
algorithm and $LD_i$ and $RD_i$ for data traveling through the decryption algorithm.
The diagram indicates that, at every round, the intermediate value of the
decryption process is equal to the corresponding value of the encryption process
with the two halves of the value swapped. To put this another way, let the output
of the ith encryption round be $LE_i \| RE_i$ ($LE_i$ concatenated with $RE_i$). Then the corresponding
output of the (16 – i)th decryption round is $RE_i \| LE_i$ or, equivalently,
$LD_{16-i} \| RD_{16-i}$.

Let us walk through Figure 4.3 to demonstrate the validity of the preceding

assertions. After the last iteration of the encryption process, the two halves of the

output are swapped, so that the ciphertext is $RE_{16} \| LE_{16}$. The output of that round
is the ciphertext. Now take that ciphertext and use it as input to the same algorithm.
The input to the first round is $RE_{16} \| LE_{16}$, which is equal to the 32-bit swap of the

output of the sixteenth round of the encryption process.


Now we would like to show that the output of the first round of the decryption

process is equal to a 32-bit swap of the input to the sixteenth round of the encryp-

tion process. First, consider the encryption process. We see that

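$$
\begin{aligned}
LE_{16} &= RE_{15} \\
RE_{16} &= LE_{15} \oplus F(RE_{15}, K_{16})
\end{aligned}
$$

On the decryption side,

$$
\begin{aligned}
LD_1 &= RD_0 = LE_{16} = RE_{15} \\
RD_1 &= LD_0 \oplus F(RD_0, K_{16}) \\
     &= RE_{16} \oplus F(RE_{15}, K_{16}) \\
     &= [LE_{15} \oplus F(RE_{15}, K_{16})] \oplus F(RE_{15}, K_{16})
\end{aligned}
$$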

The XOR has the following properties:

[A ⊕ B] ⊕ C = A ⊕ [B ⊕ C]

D ⊕ D = 0

E ⊕ 0 = E
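Applying these three properties in turn to the last expression for $RD_1$, with $A = LE_{15}$ and $B = C = F(RE_{15}, K_{16})$, fills in the intermediate steps:

```latex
\begin{aligned}
RD_1 &= [LE_{15} \oplus F(RE_{15}, K_{16})] \oplus F(RE_{15}, K_{16}) \\
     &= LE_{15} \oplus [F(RE_{15}, K_{16}) \oplus F(RE_{15}, K_{16})] \\
     &= LE_{15} \oplus 0 \\
     &= LE_{15}
\end{aligned}
```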

Thus, we have $LD_1 = RE_{15}$ and $RD_1 = LE_{15}$. Therefore, the output of the first
round of the decryption process is $RE_{15} \| LE_{15}$, which is the 32-bit swap of the input

to the sixteenth round of the encryption. This correspondence holds all the way

through the 16 iterations, as is easily shown. We can cast this process in general

terms. For the ith iteration of the encryption algorithm,

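$$
\begin{aligned}
LE_i &= RE_{i-1} \\
RE_i &= LE_{i-1} \oplus F(RE_{i-1}, K_i)
\end{aligned}
$$

Rearranging terms:

$$
\begin{aligned}
RE_{i-1} &= LE_i \\
LE_{i-1} &= RE_i \oplus F(RE_{i-1}, K_i) = RE_i \oplus F(LE_i, K_i)
\end{aligned}
$$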

Thus, we have described the inputs to the ith iteration as a function of the outputs, and

these equations confirm the assignments shown in the right-hand side of Figure 4.3.

Finally, we see that the output of the last round of the decryption process is

RE0 ‖ LE0. A 32-bit swap recovers the original plaintext, demonstrating the validity

of the Feistel decryption process.

Note that the derivation does not require that F be a reversible function. To

see this, take a limiting case in which F produces a constant output (e.g., all ones)

regardless of the values of its two arguments. The equations still hold.
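The argument above can be checked numerically. The following is a minimal Python sketch, not DES itself: the names `feistel`, `lossy_f`, and `constant_f` are invented here, the subkeys are arbitrary, and `lossy_f` is a hypothetical round function chosen only because it is clearly many-to-one.

```python
from typing import Callable, List, Tuple

MASK32 = 0xFFFFFFFF
Block = Tuple[int, int]                      # (left half, right half), 32 bits each
RoundFunction = Callable[[int, int], int]


def lossy_f(right: int, subkey: int) -> int:
    """Hypothetical round function: drops the low 16 bits of its input,
    so it is many-to-one and cannot be inverted."""
    return ((right & 0xFFFF0000) ^ subkey) & MASK32


def constant_f(right: int, subkey: int) -> int:
    """Limiting case from the text: all ones, regardless of the arguments."""
    return MASK32


def feistel(block: Block, subkeys: List[int], f: RoundFunction) -> Block:
    """Run one Feistel pass: L_i = R_{i-1}, R_i = L_{i-1} XOR F(R_{i-1}, K_i)."""
    left, right = block
    for k in subkeys:
        left, right = right, left ^ f(right, k)
    return right, left                       # final swap of the two halves


def encrypt(block: Block, subkeys: List[int], f: RoundFunction) -> Block:
    return feistel(block, subkeys, f)


def decrypt(block: Block, subkeys: List[int], f: RoundFunction) -> Block:
    # Same algorithm, subkeys applied in reverse order.
    return feistel(block, list(reversed(subkeys)), f)


if __name__ == "__main__":
    keys = [(0x0F1571C9 * (i + 1)) & MASK32 for i in range(16)]  # arbitrary subkeys
    plaintext = (0x02468ACE, 0xECA86420)
    for f in (lossy_f, constant_f):
        assert decrypt(encrypt(plaintext, keys, f), keys, f) == plaintext
    print("round trip succeeded for both round functions")
```

Decryption never computes an inverse of F. It recomputes the same F value from the half that passed through the round unchanged and cancels it with XOR, which is why the round trip succeeds even for these non-invertible functions.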

To help clarify the preceding concepts, let us look at a specific example

(Figure 4.4) and focus on the fifteenth round of encryption, corresponding to the second

round of decryption. Suppose that the blocks at each stage are 32 bits (two 16-bit

halves) and that the key size is 24 bits. Suppose that at the end of encryption round

fourteen, the value of the intermediate block (in hexadecimal) is DE7F03A6. Then

LE14 = DE7F and RE14 = 03A6. Also assume that the value of K15 is 12DE52.

After round 15, we have LE15 = 03A6 and RE15 = F(03A6, 12DE52) ⊕ DE7F.

4.2 / THE DATA ENCRYPTION STANDARD 129

Now let’s look at the decryption. We assume that LD1 = RE15 and

RD1 = LE15, as shown in Figure 4.3, and we want to demonstrate that LD2 = RE14

and RD2 = LE14. So, we start with LD1 = F(03A6, 12DE52) ⊕ DE7F and

RD1 = 03A6. Then, from Figure 4.3, LD2 = 03A6 = RE14 and RD2 =

F(03A6, 12DE52) ⊕ [F(03A6, 12DE52) ⊕ DE7F] = DE7F = LE14.
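The same example can be traced in a few lines of Python. This is only a sketch: the text does not define F for this example, so the function `f` below is a hypothetical stand-in, and the point is that the result does not depend on which F is used.

```python
MASK16 = 0xFFFF


def f(right: int, subkey: int) -> int:
    """Hypothetical stand-in for F: any function of (right, subkey)
    returning 16 bits would work equally well here."""
    return (right ^ (subkey & MASK16) ^ (subkey >> 8)) & MASK16


K15 = 0x12DE52
LE14, RE14 = 0xDE7F, 0x03A6          # halves of DE7F03A6 after round 14

# Encryption round 15
LE15 = RE14
RE15 = LE14 ^ f(RE14, K15)

# Decryption round 2 starts from LD1 = RE15 and RD1 = LE15
LD1, RD1 = RE15, LE15
LD2 = RD1
RD2 = LD1 ^ f(RD1, K15)              # the two F terms cancel

assert LD2 == RE14 == 0x03A6
assert RD2 == LE14 == 0xDE7F
print(f"LD2 = {LD2:04X}, RD2 = {RD2:04X}")
```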

4.2 THE DATA ENCRYPTION STANDARD

Until the introduction of the Advanced Encryption Standard (AES) in 2001, the

Data Encryption Standard (DES) was the most widely used encryption scheme.

DES was issued in 1977 by the National Bureau of Standards, now the National

Institute of Standards and Technology (NIST), as Federal Information Processing

Standard 46 (FIPS PUB 46). The algorithm itself is referred to as the Data

Encryption Algorithm (DEA).7 For DEA, data are encrypted in 64-bit blocks using

a 56-bit key. The algorithm transforms 64-bit input in a series of steps into a 64-bit

output. The same steps, with the same key, are used to reverse the encryption.

Over the years, DES became the dominant symmetric encryption algorithm,

especially in financial applications. In 1994, NIST reaffirmed DES for federal use

for another five years; NIST recommended the use of DES for applications other

than the protection of classified information. In 1999, NIST issued a new version

of its standard (FIPS PUB 46-3) that indicated that DES should be used only

for legacy systems and that triple DES (which in essence involves repeating the

DES algorithm three times on the plaintext using two or three different keys to

produce the ciphertext) be used. We study triple DES in Chapter 7. Because the

underlying encryption and decryption algorithms are the same for DES and triple

DES, it remains important to understand the DES cipher. This section provides an

overview. For the interested reader, Appendix S provides further detail.

7The terminology is a bit confusing. Until recently, the terms DES and DEA could be used interchange-

ably. However, the most recent edition of the DES document includes a specification of the DEA

described here plus the triple DEA (TDEA) described in Chapter 7. Both DEA and TDEA are part of

the Data Encryption Standard. Further, until the recent adoption of the official term TDEA, the triple

DEA algorithm was typically referred to as triple DES and written as 3DES. For the sake of convenience,

we will use the term 3DES.

Figure 4.4 Feistel Example (encryption round 15 and decryption round 2, both using subkey 12DE52, operating on the halves DE7F and 03A6)

DES Encryption

The overall scheme for DES encryption is illustrated in Figure 4.5. As with any

encryption scheme, there are two inputs to the encryption function: the plaintext to

be encrypted and the key. In this case, the plaintext must be 64 bits in length and the

key is 56 bits in length.8

Looking at the left-hand side of the figure, we can see that the processing

of the plaintext proceeds in three phases. First, the 64-bit plaintext passes through

an initial permutation (IP) that rearranges the bits to produce the permuted input.

8Actually, the function expects a 64-bit key as input. However, only 56 of these bits are ever used; the

other 8 bits can be used as parity bits or simply set arbitrarily.

Figure 4.5 General Depiction of DES Encryption Algorithm (left side: 64-bit plaintext, initial permutation, rounds 1 through 16 using subkeys K1 through K16, 32-bit swap, inverse initial permutation, 64-bit ciphertext; right side: 64-bit key, permuted choice 1, then for each round a left circular shift on 56 bits followed by permuted choice 2, yielding a 48-bit subkey)

4.3 / A DES EXAMPLE 131

This is followed by a phase consisting of sixteen rounds of the same function, which

involves both permutation and substitution functions. The output of the last (six-

teenth) round consists of 64 bits that are a function of the input plaintext and the

key. The left and right halves of the output are swapped to produce the preoutput.

Finally, the preoutput is passed through a permutation [IP⁻¹] that is the inverse of

the initial permutation function, to produce the 64-bit ciphertext. With the excep-

tion of the initial and final permutations, DES has the exact structure of a Feistel

cipher, as shown in Figure 4.3.

The right-hand portion of Figure 4.5 shows the way in which the 56-bit key is

used. Initially, the key is passed through a permutation function. Then, for each of

the sixteen rounds, a subkey (Ki) is produced by the combination of a left circular

shift and a permutation. The permutation function is the same for each round, but a

different subkey is produced because of the repeated shifts of the key bits.
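The left circular shift mentioned above can be sketched in Python. Several details here are assumptions that this section does not spell out: standard DES applies the shift separately to two 28-bit halves of the permuted key, and the per-round shift amounts come from a fixed schedule given in Appendix S. The helper names and the placeholder schedule of one position per round are illustrative only, and the permuted-choice steps are omitted.

```python
from typing import List, Tuple

MASK28 = (1 << 28) - 1


def rotl28(value: int, shift: int) -> int:
    """Left circular shift of a 28-bit value."""
    value &= MASK28
    shift %= 28
    return ((value << shift) | (value >> (28 - shift))) & MASK28


def rotated_halves(c0: int, d0: int, shifts: List[int]) -> List[Tuple[int, int]]:
    """Return the (C, D) register pair after each round's shift.
    A real key schedule would feed each pair through a fixed permutation
    to select the 48 subkey bits; that step is not modeled here."""
    c, d = c0, d0
    states = []
    for s in shifts:
        c, d = rotl28(c, s), rotl28(d, s)
        states.append((c, d))
    return states


if __name__ == "__main__":
    placeholder_shifts = [1] * 16        # NOT the real DES schedule
    for rnd, (c, d) in enumerate(rotated_halves(0x0000001, 0x8000000, placeholder_shifts), 1):
        print(f"round {rnd:2d}: C = {c:07x}  D = {d:07x}")
```

Because the registers hold a different rotation in every round, the same selection permutation picks out different key bits each time, which is the effect the paragraph describes.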

DES Decryption

As with any Feistel cipher, decryption uses the same algorithm as encryption, except

that the application of the subkeys is reversed. Additionally, the initial and final

permutations are reversed.

4.3 A DES EXAMPLE

We now work through an example and consider some of its implications. Although

you are not expected to duplicate the example by hand, you will find it informative

to study the hex patterns that occur from one step to the next.

For this example, the plaintext is a hexadecimal palindrome. The plaintext,

key, and resulting ciphertext are as follows:

Plaintext: 02468aceeca86420

Key: 0f1571c947d9e859

Ciphertext: da02ce3a89ecac3b

Results

Table 4.2 shows the progression of the algorithm. The first row shows the 32-bit

values of the left and right halves of data after the initial permutation. The next 16

rows show the results after each round. Also shown is the value of the 48-bit subkey

generated for each round. Note that Li = R_{i−1}. The final row shows the left- and

right-hand values after the inverse initial permutation. These two values combined

form the ciphertext.

The Avalanche Effect

A desirable property of any encryption algorithm is that a small change in either

the plaintext or the key should produce a significant change in the ciphertext. In

particular, a change in one bit of the plaintext or one bit of the key should produce

a change in many bits of the ciphertext. This is referred to as the avalanche effect.

If the change were small, this might provide a way to reduce the size of the plaintext

or key space to be searched.

Using the example from Table 4.2, Table 4.3 shows the result when the fourth

bit of the plaintext is changed, so that the plaintext is 12468aceeca86420. The

second column of the table shows the intermediate 64-bit values at the end of each

round for the two plaintexts. The third column shows the number of bits that differ

between the two intermediate values. The table shows that, after just three rounds,

18 bits differ between the two blocks. On completion, the two ciphertexts differ in

32 bit positions.

Table 4.4 shows a similar test using the original plaintext with two keys that

differ in only the fourth bit position: the original key, 0f1571c947d9e859, and

the altered key, 1f1571c947d9e859. Again, the results show that about half of

the bits in the ciphertext differ and that the avalanche effect is pronounced after just

a few rounds.
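The bit counts reported in Tables 4.3 and 4.4 are Hamming distances between pairs of 64-bit values. A short Python sketch (the function name `bit_difference` is chosen here for illustration) reproduces them from the hexadecimal strings given in the example:

```python
def bit_difference(hex_a: str, hex_b: str) -> int:
    """Number of bit positions in which two hexadecimal strings differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Plaintexts that differ only in the fourth bit
print(bit_difference("02468aceeca86420", "12468aceeca86420"))   # 1

# Intermediate values after round 3 for the two plaintexts (Table 4.3)
print(bit_difference("99e9b7230bae3b9e", "39a9b7a3171cb8b3"))   # 18

# Final ciphertexts for the two plaintexts (Table 4.3)
print(bit_difference("da02ce3a89ecac3b", "057cde97d7683f2a"))   # 32

# Final ciphertexts for the two keys (Table 4.4)
print(bit_difference("da02ce3a89ecac3b", "ee92b50606b62b0b"))   # 30
```

For a 64-bit block, a difference of about 32 bits is what two unrelated random blocks would show on average, so a one-bit input change leaves no obvious trace in the output.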

Table 4.2 DES Example

Round  Ki                Li        Ri
IP                       5a005a00  3cf03c0f
1      1e030f03080d2930  3cf03c0f  bad22845
2      0a31293432242318  bad22845  99e9b723
3      23072318201d0c1d  99e9b723  0bae3b9e
4      05261d3824311a20  0bae3b9e  42415649
5      3325340136002c25  42415649  18b3fa41
6      123a2d0d04262a1c  18b3fa41  9616fe23
7      021f120b1c130611  9616fe23  67117cf2
8      1c10372a2832002b  67117cf2  c11bfc09
9      04292a380c341f03  c11bfc09  887fbc6c
10     2703212607280403  887fbc6c  600f7e8b
11     2826390c31261504  600f7e8b  f596506e
12     12071c241a0a0f08  f596506e  738538b8
13     300935393c0d100b  738538b8  c6a62c4e
14     311e09231321182a  c6a62c4e  56b0bd75
15     283d3e0227072528  56b0bd75  75e8fd8f
16     2921080b13143025  75e8fd8f  25896490
IP⁻¹                     da02ce3a  89ecac3b

Note: DES subkeys are shown as eight 6-bit values in hex format

Table 4.3 Avalanche Effect in DES: Change in Plaintext

Round  Original plaintext  Altered plaintext  δ (bits that differ)
       02468aceeca86420    12468aceeca86420   1
1      3cf03c0fbad22845    3cf03c0fbad32845   1
2      bad2284599e9b723    bad3284539a9b7a3   5
3      99e9b7230bae3b9e    39a9b7a3171cb8b3   18
4      0bae3b9e42415649    171cb8b3ccaca55e   34
5      4241564918b3fa41    ccaca55ed16c3653   37
6      18b3fa419616fe23    d16c3653cf402c68   33
7      9616fe2367117cf2    cf402c682b2cefbc   32
8      67117cf2c11bfc09    2b2cefbc99f91153   33
9      c11bfc09887fbc6c    99f911532eed7d94   32
10     887fbc6c600f7e8b    2eed7d94d0f23094   34
11     600f7e8bf596506e    d0f23094455da9c4   37
12     f596506e738538b8    455da9c47f6e3cf3   31
13     738538b8c6a62c4e    7f6e3cf34bc1a8d9   29
14     c6a62c4e56b0bd75    4bc1a8d91e07d409   33
15     56b0bd7575e8fd8f    1e07d4091ce2e6dc   31
16     75e8fd8f25896490    1ce2e6dc365e5f59   32
IP⁻¹   da02ce3a89ecac3b    057cde97d7683f2a   32

Table 4.4 Avalanche Effect in DES: Change in Key

Round  Original key        Altered key        δ (bits that differ)
       02468aceeca86420    02468aceeca86420   0
1      3cf03c0fbad22845    3cf03c0f9ad628c5   3
2      bad2284599e9b723    9ad628c59939136b   11
3      99e9b7230bae3b9e    9939136b768067b7   25
4      0bae3b9e42415649    768067b75a8807c5   29
5      4241564918b3fa41    5a8807c5488dbe94   26
6      18b3fa419616fe23    488dbe94aba7fe53   26
7      9616fe2367117cf2    aba7fe53177d21e4   27
8      67117cf2c11bfc09    177d21e4548f1de4   32
9      c11bfc09887fbc6c    548f1de471f64dfd   34
10     887fbc6c600f7e8b    71f64dfd4279876c   36
11     600f7e8bf596506e    4279876c399fdc0d   32
12     f596506e738538b8    399fdc0d6d208dbb   28
13     738538b8c6a62c4e    6d208dbbb9bdeeaa   33
14     c6a62c4e56b0bd75    b9bdeeaad2c3a56f   30
15     56b0bd7575e8fd8f    d2c3a56f2765c1fb   27
16     75e8fd8f25896490    2765c1fb01263dc4   30
IP⁻¹   da02ce3a89ecac3b    ee92b50606b62b0b   30

4.4 THE STRENGTH OF DES

Since its adoption as a federal standard, there have been lingering concerns about

the level of security provided by DES. These concerns, by and large, fall into two

areas: key size and the nature of the algorithm.

The Use of 56-Bit Keys

With a key length of 56 bits, there are 2^56 possible keys, which is approximately

7.2 × 10^16 keys. Thus, on the face of it, a brute-force attack appears impractical.

Assuming that, on average, half the key space has to be searched, a single machine

performing one DES encryption per microsecond would take more than a thousand

years to break the cipher.

However, the assumption of one encryption per microsecond is overly con-

servative. As far back as 1977, Diffie and Hellman postulated that the technology

existed to build a parallel machine with 1 million encryption devices, each of which

could perform one encryption per microsecond [DIFF77]. This would bring the

average search time down to about 10 hours. The authors estimated that the cost

would be about $20 million in 1977 dollars.

With current technology, it is not even necessary to use special, purpose-built

hardware. Rather, the speed of commercial, off-the-shelf processors threatens the

security of DES. A recent paper from Seagate Technology [SEAG08] suggests that

a rate of 1 billion (10^9) key combinations per second is reasonable for today’s multicore

computers. Recent offerings confirm this. Both Intel and AMD now offer

hardware-based instructions to accelerate the use of AES. Tests run on a contemporary

multicore Intel machine resulted in an encryption rate of about half a billion

encryptions per second [BASU12]. Another recent analysis suggests that with

contemporary supercomputer technology, a rate of 10^13 encryptions per second is

reasonable [AROR12].

With these results in mind, Table 4.5 shows how much time is required for a

brute-force attack for various key sizes. As can be seen, a single PC can break DES in

about a year; if multiple PCs work in parallel, the time is drastically shortened. And

today’s supercomputers should be able to find a key in about an hour. Key sizes of

128 bits or greater are effectively unbreakable using simply a brute-force approach.

Even if we managed to speed up the attacking system by a factor of 1 trillion (10^12),

it would still take over 100,000 years to break a code using a 128-bit key.
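The estimates in this subsection all follow from one formula: on average half the key space, 2^(k−1) keys, must be tried. The Python sketch below reproduces the orders of magnitude; the function name and the 365-day year are assumptions made here, so the figures may differ slightly from the rounded values in Table 4.5.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 365 * 24 * SECONDS_PER_HOUR   # assumes a 365-day year


def average_search_seconds(key_bits: int, keys_per_second: float) -> float:
    """Expected brute-force time when half the key space is searched."""
    return 2 ** (key_bits - 1) / keys_per_second


if __name__ == "__main__":
    # One encryption per microsecond on a single machine
    print(average_search_seconds(56, 1e6) / SECONDS_PER_YEAR, "years")
    # Diffie-Hellman machine: 10^6 devices at 10^6 encryptions per second each
    print(average_search_seconds(56, 1e12) / SECONDS_PER_HOUR, "hours")
    # Single PC at 10^9 keys per second
    print(average_search_seconds(56, 1e9) / SECONDS_PER_YEAR, "years")
    # Supercomputer at 10^13 keys per second
    print(average_search_seconds(56, 1e13) / SECONDS_PER_HOUR, "hours")
    # 128-bit key, supercomputer sped up by a further factor of 10^12
    print(average_search_seconds(128, 1e25) / SECONDS_PER_YEAR, "years")
```

Each extra key bit doubles the expected time, which is why moving from 56 to 128 bits overwhelms any realistic hardware speedup.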

Fortunately, there are a number of alternatives to DES, the most important of

which are AES and triple DES, discussed in Chapters 6 and 7, respectively.

The Nature of the DES Algorithm

Another concern is the possibility that cryptanalysis is possible by exploiting

the characteristics of the DES algorithm. The focus of concern has been on the

eight substitution tables, or S-boxes, that are used in each iteration (described in

Appendix S). Because the design criteria for these boxes, and indeed for the entire

algorithm, were not made public, there is a suspicion that the boxes were con-

structed in such a way that cryptanalysis is possible for an opponent who knows

4.5 / BLOCK CIPHER DESIGN PRINCIPLES 135

Table 4.5 Average Time Required for Exhaustive Key Search

Key Size (bits)              Cipher          Number of Alternative Keys  Time Required at 10^9 Decryptions/s  Time Required at 10^13 Decryptions/s
56                           DES             2^56 ≈ 7.2 × 10^16          2^55 ns = 1.125 years                1 hour
128                          AES             2^128 ≈ 3.4 × 10^38         2^127 ns = 5.3 × 10^21 years         5.3 × 10^17 years
168                          Triple DES      2^168 ≈ 3.7 × 10^50         2^167 ns = 5.8 × 10^33 years         5.8 × 10^29 years
192                          AES             2^192 ≈ 6.3 × 10^57         2^191 ns = 9.8 × 10^40 years         9.8 × 10^36 years
256                          AES             2^256 ≈ 1.2 × 10^77         2^255 ns = 1.8 × 10^60 years         1.8 × 10^56 years
26 characters (permutation)  Monoalphabetic  26! = 4 × 10^26             2 × 10^26 ns = 6.3 × 10^9 years      6.3 × 10^6 years

the weaknesses in the S-boxes. This assertion is tantalizing, and over the years a

number of regularities and unexpected behaviors of the S-boxes have been discov-

ered. Despite this, no one has so far succeeded in discovering the supposed fatal

weaknesses in the S-boxes.9

Timing Attacks

We discuss timing attacks in more detail in Part Two, as they relate to public-key

algorithms. However, the issue may also be relevant for symmetric ciphers. In

essence, a timing attack is one in which information about the key or the plaintext is

obtained by observing how long it takes a given implementation to perform decryp-

tions on various ciphertexts. A timing attack exploits the fact that an encryption

or decryption algorithm often takes slightly different amounts of time on different

inputs. [HEVI99] reports on an approach that yields the Hamming weight (number

of bits equal to one) of the secret key. This is a long way from knowing the actual

key, but it is an intriguing first step. The authors conclude that DES appears to be

fairly resistant to a successful timing attack but suggest some avenues to explore.

Although this is an interesting line of attack, it so far appears unlikely that this tech-

nique will ever be successful against DES or more powerful symmetric ciphers such

as triple DES and AES.

4.5 BLOCK CIPHER DESIGN PRINCIPLES

Although much progress has been made in designing block ciphers that are cryp-

tographically strong, the basic principles have not changed all that much since the

work of Feistel and the DES design team in the early 1970s. In this section we look

at three critical aspects of block cipher design: the number of rounds, design of the

function F, and key scheduling.

9At least, no one has publicly acknowledged such a discovery.

Number of Rounds

The cryptographic strength of a Feistel cipher derives from three aspects of the

design: the number of rounds, the function F, and the key schedule algorithm. Let

us look first at the choice of the number of rounds.

The greater the number of rounds, the more difficult it is to perform crypt-

analysis, even for a relatively weak F. In general, the criterion should be that the

number of rounds is chosen so that known cryptanalytic efforts require greater

effort than a simple brute-force key search attack. This criterion was certainly used

in the design of DES. Schneier [SCHN96] observes that for 16-round DES, a differential

cryptanalysis attack is slightly less efficient than brute force: The differential

cryptanalysis attack requires 2^55.1 operations,10 whereas brute force requires 2^55.

If DES had 15 or fewer rounds, differential cryptanalysis would require less effort

than a brute-force key search.

This criterion is attractive, because it makes it easy to judge the strength of

an algorithm and to compare different algorithms. In the absence of a cryptana-

lytic breakthrough, the strength of any algorithm that satisfies the criterion can be

judged solely on key length.

Design of Function F

The heart of a Feistel block cipher is the function F, which provides the element of

confusion in a Feistel cipher. Thus, it must be difficult to “unscramble” the substitu-

tion performed by F. One obvious criterion is that F be nonlinear, as we discussed

previously. The more nonlinear F, the more difficult any type of cryptanalysis will be.

There are several measures of nonlinearity, which are beyond the scope of this

book. In rough terms, the more difficult it is to approximate F by a set of linear

equations, the more nonlinear F is.

Several other criteria should be considered in designing F. We would like the

algorithm to have good avalanche properties. Recall that, in general, this means that

a change in one bit of the input should produce a change in many bits of the output.

A more stringent version of this is the strict avalanche criterion (SAC) [WEBS86],

which states that any output bit j of an S-box (see Appendix S for a discussion of

S-boxes) should change with probability 1/2 when any single input bit i is inverted

for all i, j. Although SAC is expressed in terms of S-boxes, a similar criterion could

be applied to F as a whole. This is important when considering designs that do not

include S-boxes.
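The strict avalanche criterion can be measured directly for a small substitution table. The Python sketch below is illustrative only: `sac_matrix` is a name chosen here, and the 4-bit table is arbitrary sample data rather than a table taken from this section.

```python
from typing import List, Sequence


def sac_matrix(sbox: Sequence[int], n_in: int, n_out: int) -> List[List[float]]:
    """Entry [i][j] is the fraction of inputs for which inverting
    input bit i changes output bit j. SAC asks for 1/2 everywhere."""
    total = 1 << n_in
    counts = [[0] * n_out for _ in range(n_in)]
    for i in range(n_in):
        for x in range(total):
            diff = sbox[x] ^ sbox[x ^ (1 << i)]   # output bits that changed
            for j in range(n_out):
                counts[i][j] += (diff >> j) & 1
    return [[c / total for c in row] for row in counts]


if __name__ == "__main__":
    # Arbitrary 4-bit permutation used purely as sample data
    sample = [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7]
    for i, row in enumerate(sac_matrix(sample, 4, 4)):
        print(f"input bit {i}:", row)
```

An identity table is the worst case: inverting input bit i always changes output bit i and never changes any other, so every entry is 1 or 0 rather than 1/2. The same measurement could be applied to F as a whole by treating F as one large table, although exhaustive enumeration is only practical for small input widths.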

Another criterion proposed in [WEBS86] is the bit independence criterion

(BIC), which states that output bits j and k should change independently when any

single input bit i is inverted for all i, j, and k. The SAC and BIC criteria appear to

strengthen the effectiveness of the confusion function.

10Differential cryptanalysis of DES requires 2^47 chosen plaintexts. If all you have to work with is known

plaintext, then you must sort through a large quantity of known plaintext–ciphertext pairs looking for the

useful ones. This brings the level of effort up to 2^55.1.

4.6 / KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS 137

Key Schedule Algorithm

With any Feistel block cipher, the key is used to generate one subkey for each round.

In general, we would like to select subkeys to maximize the difficulty of deducing

individual subkeys and the difficulty of working back to the main key. No general

principles for this have yet been promulgated.

Adams suggests [ADAM94] that, at minimum, the key schedule should guar-

antee key/ciphertext Strict Avalanche Criterion and Bit Independence Criterion.

4.6 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

avalanche effect

block cipher

confusion

Data Encryption Standard

(DES)

diffusion

Feistel cipher

irreversible mapping

key

permutation

product cipher

reversible mapping

round

round function

subkey

substitution

Review Questions

4.1 Briefly define a nonsingular transformation.

4.2 What is the difference between a block cipher and a stream cipher?

4.3 Why is it not practical to use an arbitrary reversible substitution cipher of the kind

shown in Table 4.1?

4.4 Briefly define the terms substitution and permutation.

4.5 What is the difference between diffusion and confusion?

4.6 Which parameters and design choices determine the actual algorithm of a Feistel

cipher?

4.7 What are the critical aspects of Feistel cipher design?

Problems

4.1 a. In Section 4.1, under the subsection on the motivation for the Feistel cipher structure,

it was stated that, for a block of n bits, the number of different reversible

mappings for the ideal block cipher is (2^n)!. Justify.

b. In that same discussion, it was stated that for the ideal block cipher, which allows all

possible reversible mappings, the size of the key is n × 2^n bits. But, if there are (2^n)!

possible mappings, it should take log2((2^n)!) bits to discriminate among the different

mappings, and so the key length should be log2((2^n)!). However, log2((2^n)!) < n × 2^n.

Explain the discrepancy.

4.2 Consider a Feistel cipher composed of sixteen rounds with a block length of 128 bits

and a key length of 128 bits. Suppose that, for a given k, the key scheduling algorithm

determines values for the first eight round keys, k1, k2, …, k8, and then sets

k9 = k8, k10 = k7, k11 = k6, …, k16 = k1

Suppose you have a ciphertext c. Explain how, with access to an encryption oracle,

you can decrypt c and determine m using just a single oracle query. This shows that

such a cipher is vulnerable to a chosen plaintext attack. (An encryption oracle can be

thought of as a device that, when given a plaintext, returns the corresponding cipher-

text. The internal details of the device are not known to you and you cannot break

open the device. You can only gain information from the oracle by making queries to

it and observing its responses.)

4.3 Let p be a permutation of the integers 0, 1, 2, …, (2^n − 1), such that p(m) gives the

permuted value of m, 0 ≤ m < 2^n. Put another way, p maps the set of n-bit integers

into itself and no two integers map into the same integer. DES is such a permutation

for 64-bit integers. We say that p has a fixed point at m if p(m) = m. That is, if p is

an encryption mapping, then a fixed point corresponds to a message that encrypts to

itself. We are interested in the number of fixed points in a randomly chosen permutation

p. Show the somewhat unexpected result that the number of fixed points for p is

1 on average, and this number is independent of the size of the permutation.

4.4 Consider a block encryption algorithm that encrypts blocks of length n, and let

N = 2^n. Say we have t plaintext–ciphertext pairs Pi, Ci = E(K, Pi), where we assume

that the key K selects one of the N! possible mappings. Imagine that we wish to find K

by exhaustive search. We could generate key K′ and test whether Ci = E(K′, Pi) for

1 ≤ i ≤ t. If K′ encrypts each Pi to its proper Ci, then we have evidence that K = K′.

However, it may be the case that the mappings E(K, ·) and E(K′, ·) exactly agree

on the t plaintext–ciphertext pairs Pi, Ci and agree on no other pairs.

a. What is the probability that E(K, ·) and E(K′, ·) are in fact distinct mappings?

b. What is the probability that E(K, ·) and E(K′, ·) agree on another t′ plaintext–ciphertext

pairs where 0 ≤ t′ ≤ N − t?

4.5 For any block cipher, the fact that it is a nonlinear function is crucial to its security. To

see this, suppose that we have a linear block cipher EL that encrypts 256-bit blocks

of plaintext into 256-bit blocks of ciphertext. Let EL(k, m) denote the encryption of a

256-bit message m under a key k (the actual bit length of k is irrelevant). Thus,

EL(k, [m1 ⊕ m2]) = EL(k, m1) ⊕ EL(k, m2) for all 256-bit patterns m1, m2.

Describe how, with 256 chosen ciphertexts, an adversary can decrypt any ciphertext

without knowledge of the secret key k. (A “chosen ciphertext” means that an adver-

sary has the ability to choose a ciphertext and then obtain its decryption. Here, you

have 256 plaintext/ciphertext pairs to work with and you have the ability to choose

the value of the ciphertexts.)

4.6 Suppose the DES F function mapped every 32-bit input R, regardless of the value of

the input K, to:

a. a 32-bit string of zeros

b. R

Then

1. What function would DES then compute?

2. What would the decryption look like?

Hint: Use the following properties of the XOR operation:

(A ⊕ B) ⊕ C = A ⊕ (B ⊕ C)

(A ⊕ A) = 0

(A⊕ 0 ) = A

A ⊕ 1 = bitwise complement of A

where

A,B,C are n-bit strings of bits

0 is an n-bit string of zeros

1 is an n-bit string of ones

4.7 Show that DES decryption is, in fact, the inverse of DES encryption.

4.8 The 32-bit swap after the sixteenth iteration of the DES algorithm is needed to make

the encryption process invertible by simply running the ciphertext back through the

algorithm with the key order reversed. This was demonstrated in the preceding prob-

lem. However, it still may not be entirely clear why the 32-bit swap is needed. To

demonstrate why, solve the following exercises. First, some notation:

A ‖ B = the concatenation of the bit strings A and B

Ti(R ‖ L) = the transformation defined by the ith iteration of the encryption

algorithm for 1 ≤ i ≤ 16

TDi(R ‖ L) = the transformation defined by the ith iteration of the decryption

algorithm for 1 ≤ i ≤ 16

T17(R ‖ L) = L ‖ R, where this transformation occurs after the sixteenth iteration

of the encryption algorithm

a. Show that the composition TD1(IP(IP⁻¹(T17(T16(L15 ‖ R15))))) is equivalent to the

transformation that interchanges the 32-bit halves, L15 and R15. That is, show that

TD1(IP(IP⁻¹(T17(T16(L15 ‖ R15))))) = R15 ‖ L15

b. Now suppose that we did away with the final 32-bit swap in the encryption algorithm.

Then we would want the following equality to hold:

TD1(IP(IP⁻¹(T16(L15 ‖ R15)))) = L15 ‖ R15

Does it?

Note: The following problems refer to details of DES that are described in Appendix S.

4.9 Consider the substitution defined by row 1 of S-box S1 in Table S.2. Show a block

diagram similar to Figure 4.2 that corresponds to this substitution.

4.10 Compute the bits number 4, 17, 41, and 45 at the output of the first round of the DES

decryption, assuming that the ciphertext block is composed of all ones and the exter-

nal key is composed of all ones.

4.11 This problem provides a numerical example of encryption using a one-round version

of DES. We start with the same bit pattern for the key K and the plaintext, namely:

Hexadecimal notation: 0 1 2 3 4 5 6 7 8 9 A B C D E F

Binary notation: 0000 0001 0010 0011 0100 0101 0110 0111

1000 1001 1010 1011 1100 1101 1110 1111

a. Derive K1, the first-round subkey.

b. Derive L0, R0.

c. Expand R0 to get E[R0], where E[·] is the expansion function of Table S.1.

d. Calculate A = E[R0] ⊕ K1.

e. Group the 48-bit result of (d) into sets of 6 bits and evaluate the corresponding

S-box substitutions.

f. Concatenate the results of (e) to get a 32-bit result, B.

g. Apply the permutation to get P(B).

h. Calculate R1 = P(B) ⊕ L0.

i. Write down the ciphertext.

4.12 Analyze the amount of left shifts in the DES key schedule by studying Table S.3 (d).

Is there a pattern? What could be the reason for the choice of these constants?

4.13 When using the DES algorithm for decryption, the 16 keys (K1, K2, …, K16) are

used in reverse order. Therefore, the right-hand side of Figure S.1 is not valid for

decryption. Design a key-generation scheme with the appropriate shift schedule

(analogous to Table S.3d) for the decryption process.

4.14 a. Let X′ be the bitwise complement of X. Prove that if the complement of the

plaintext block is taken and the complement of an encryption key is taken, then

the result of DES encryption with these values is the complement of the original

ciphertext. That is,

If Y = E(K, X)

Then Y′ = E(K′, X′)

Hint: Begin by showing that for any two bit strings of equal length, A and B,

(A ⊕ B)′ = A′ ⊕ B.

b. It has been said that a brute-force attack on DES requires searching a key space of

2^56 keys. Does the result of part (a) change that?

4.15 a. We say that a DES key K is weak if DES_K is an involution. Exhibit four weak

keys for DES.

b. We say that a DES key K is semi-weak if it is not weak and if there exists a key K′

such that DES_K⁻¹ = DES_K′. Exhibit four semi-weak keys for DES.

Note: The following problems refer to simplified DES, described in Appendix G.

4.16 Refer to Figure G.3, which explains encryption function for S-DES.

a. How important is the initial permutation IP?

b. How important is the SW function in the middle?

4.17 The equations for the variables q and r for S-DES are defined in the section on

S-DES analysis. Provide the equations for s and t.

4.18 Using S-DES, decrypt the string 01000110 using the key 1010000010 by hand.

Show intermediate results after each function (IP, FK, SW, FK, IP⁻¹). Then decode

the first 4 bits of the plaintext string to a letter and the second 4 bits to another letter

where we encode A through P in base 2 (i.e., A = 0000, B = 0001, …, P = 1111).

Hint: As a midway check, after the xoring with K2, the string should be 11000001.

Programming Problems

4.19 Create software that can encrypt and decrypt using a general substitution block

cipher.

4.20 Create software that can encrypt and decrypt using S-DES. Test data: use plaintext,

ciphertext, and key of Problem 4.18.

CHAPTER 5

Finite Fields

5.1 Groups
    Groups
    Abelian Group
    Cyclic Group
5.2 Rings
5.3 Fields
5.4 Finite Fields of the Form GF(p)
    Finite Fields of Order p
    Finding the Multiplicative Inverse in GF(p)
    Summary
5.5 Polynomial Arithmetic
    Ordinary Polynomial Arithmetic
    Polynomial Arithmetic with Coefficients in Zp
    Finding the Greatest Common Divisor
    Summary
5.6 Finite Fields of the Form GF(2^n)
    Motivation
    Modular Polynomial Arithmetic
    Finding the Multiplicative Inverse
    Computational Considerations
    Using a Generator
    Summary
5.7 Key Terms, Review Questions, and Problems

142 CHAPTER 5 / FINITE FIELDS

Finite fields have become increasingly important in cryptography. A number of

cryptographic algorithms rely heavily on properties of finite fields, notably the

Advanced Encryption Standard (AES) and elliptic curve cryptography. Other exam-

ples include the message authentication code CMAC and the authenticated encryption

scheme GCM.

This chapter provides the reader with sufficient background on the concepts of

finite fields to be able to understand the design of AES and other cryptographic algo-

rithms that use finite fields. Because students unfamiliar with abstract algebra may find

the concepts behind finite fields somewhat difficult to grasp, we approach the topic in a

way designed to enhance understanding. Our plan of attack is as follows:

1. Fields are a subset of a larger class of algebraic structures called rings, which

are in turn a subset of the larger class of groups. In fact, as shown in Figure 5.1,

both groups and rings can be further differentiated. Groups are defined by

a simple set of properties and are easily understood. Each successive subset

(abelian group, ring, commutative ring, and so on) adds additional properties

and is thus more complex. Sections 5.1 through 5.3 will examine groups, rings,

and fields, successively.

2. Finite fields are a subset of fields, consisting of those fields with a finite num-

ber of elements. These are the class of fields that are found in cryptographic

algorithms. With the concepts of fields in hand, we turn in Section 5.4 to a

specific class of finite fields, namely those with p elements, where p is prime.

Certain asymmetric cryptographic algorithms make use of such fields.

3. A more important class of finite fields, for cryptography, comprises those with
2^n elements, depicted as fields of the form GF(2^n). These are used in a wide
variety of cryptographic algorithms. However, before discussing these fields, we
need to analyze the topic of polynomial arithmetic, which is done in Section 5.5.

4. With all of this preliminary work done, we are able at last, in Section 5.6, to
discuss finite fields of the form GF(2^n).

Before proceeding, the reader may wish to review Sections 2.1 through 2.3, which

cover relevant topics in number theory.

LEARNING OBJECTIVES

After studying this chapter, you should be able to:

◆ Distinguish among groups, rings, and fields.

◆ Define finite fields of the form GF(p).

◆ Explain the differences among ordinary polynomial arithmetic, polynomial
arithmetic with coefficients in Zp, and modular polynomial arithmetic in
GF(2^n).

◆ Define finite fields of the form GF(2^n).

◆ Explain the two different uses of the mod operator.


5.1 GROUPS

Groups, rings, and fields are the fundamental elements of a branch of mathematics

known as abstract algebra, or modern algebra. In abstract algebra, we are concerned

with sets on whose elements we can operate algebraically; that is, we can combine

two elements of the set, perhaps in several ways, to obtain a third element of the set.

These operations are subject to specific rules, which define the nature of the set. By

convention, the notation for the two principal classes of operations on set elements is

usually the same as the notation for addition and multiplication on ordinary numbers.

However, it is important to note that, in abstract algebra, we are not limited to ordi-

nary arithmetical operations. All this should become clear as we proceed.

Groups

A group G, sometimes denoted by {G, # }, is a set of elements with a binary
operation denoted by # that associates to each ordered pair (a, b) of elements in G
an element (a # b) in G, such that the following axioms are obeyed:1

(A1) Closure: If a and b belong to G, then a # b is also in G.

(A2) Associative: a # (b # c) = (a # b) # c for all a, b, c in G.

(A3) Identity element: There is an element e in G such that
a # e = e # a = a for all a in G.

(A4) Inverse element: For each a in G, there is an element a′ in G
such that a # a′ = a′ # a = e.

1 The operator # is generic and can refer to addition, multiplication, or some other mathematical operation.

Figure 5.1 Groups, Rings, and Fields (nested classes, from outermost to innermost:
groups, abelian groups, rings, commutative rings, integral domains, fields, finite fields)

Let Nn denote a set of n distinct symbols that, for convenience, we represent as
{1, 2, ..., n}. A permutation of n distinct symbols is a one-to-one mapping from
Nn to Nn.2 Define Sn to be the set of all permutations of n distinct symbols. Each
element of Sn is represented by a permutation p of the integers in 1, 2, ..., n.
It is easy to demonstrate that Sn is a group:

A1: If p, r ∈ Sn, then the composite mapping p # r is formed by permuting
the elements of r according to the permutation p. For example,
{3, 2, 1} # {1, 3, 2} = {2, 3, 1}. The notation for this mapping is explained
as follows: The value of the first element of p indicates which element of r
is to be in the first position in p # r; the value of the second element of p
indicates which element of r is to be in the second position in p # r; and so
on. Clearly, p # r ∈ Sn.

A2: The composition of mappings is also easily seen to be associative.

A3: The identity mapping is the permutation that does not alter the
order of the n elements. For Sn, the identity element is {1, 2, ..., n}.

A4: For any p ∈ Sn, the mapping that undoes the permutation defined
by p is the inverse element for p. There will always be such an
inverse. For example, {2, 3, 1} # {3, 1, 2} = {1, 2, 3}.

2 This is equivalent to the definition of permutation in Chapter 2, which stated that a permutation of a
finite set of elements S is an ordered sequence of all the elements of S, with each element appearing
exactly once.
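The composition rule in the example above can be checked mechanically. The following is a minimal sketch, not part of the original text; the function names `compose` and `inverse` are illustrative assumptions, and permutations are stored as tuples of 1-based integers.

```python
from itertools import permutations

def compose(p, r):
    """Return p # r: position i of the result holds the element of r
    selected by the i-th value of p (values are 1-based)."""
    return tuple(r[i - 1] for i in p)

def inverse(p):
    """Return the permutation q such that compose(p, q) is the identity."""
    q = [0] * len(p)
    for position, value in enumerate(p, start=1):
        q[value - 1] = position
    return tuple(q)

# Brute-force check of axioms A1 through A4 on S3 (6 elements).
S3 = list(permutations((1, 2, 3)))
identity = (1, 2, 3)
closure = all(compose(p, r) in S3 for p in S3 for r in S3)
associative = all(
    compose(p, compose(r, s)) == compose(compose(p, r), s)
    for p in S3 for r in S3 for s in S3
)
has_identity = all(compose(p, identity) == p == compose(identity, p) for p in S3)
has_inverses = all(compose(p, inverse(p)) == identity for p in S3)
# S3 is not abelian: at least one pair does not commute.
commutative = all(compose(p, r) == compose(r, p) for p in S3 for r in S3)
```

Here `compose((3, 2, 1), (1, 3, 2))` reproduces the worked example from A1, and `commutative` comes out false, consistent with Sn not being abelian for n > 2.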

The set of integers (positive, negative, and 0) under addition is an abelian group.

The set of nonzero real numbers under multiplication is an abelian group. The

set Sn from the preceding example is a group but not an abelian group for n > 2.

If a group has a finite number of elements, it is referred to as a finite group, and

the order of the group is equal to the number of elements in the group. Otherwise,

the group is an infinite group.

Abelian Group

A group is said to be abelian if it satisfies the following additional condition:

(A5) Commutative: a # b = b # a for all a, b in G.


When the group operation is addition, the identity element is 0; the in-

verse element of a is – a; and subtraction is defined with the following rule:

a – b = a + ( – b).

Cyclic Group

We define exponentiation within a group as a repeated application of the group

operator, so that a^3 = a # a # a. Furthermore, we define a^0 = e as the identity
element, and a^-n = (a′)^n, where a′ is the inverse element of a within the group.

A group G is cyclic if every element of G is a power a^k (k is an integer) of a fixed

element a ∈ G. The element a is said to generate the group G or to be a generator

of G. A cyclic group is always abelian and may be finite or infinite.

The additive group of integers is an infinite cyclic group generated by the element

1. In this case, powers are interpreted additively, so that n is the nth power of 1.

5.2 RINGS

A ring R, sometimes denoted by {R, + , * }, is a set of elements with two binary

operations, called addition and multiplication,3 such that for all a, b, c in R the fol-

lowing axioms are obeyed.

(A1–A5) R is an abelian group with respect to addition; that is, R satisfies axioms

A1 through A5. For the case of an additive group, we denote the identity element

as 0 and the inverse of a as – a.

(M1) Closure under multiplication: If a and b belong to R, then ab is also in R.

(M2) Associativity of multiplication: a(bc) = (ab)c for all a, b, c in R.

(M3) Distributive laws: a(b + c) = ab + ac for all a, b, c in R.

(a + b)c = ac + bc for all a, b, c in R.

In essence, a ring is a set of elements in which we can do addition, subtraction

[a – b = a + ( – b)], and multiplication without leaving the set.

3Generally, we do not use the multiplication symbol, * , but denote multiplication by the concatenation

of two elements.

With respect to addition and multiplication, the set of all n-square matrices over

the real numbers is a ring.

A ring is said to be commutative if it satisfies the following additional condition:

(M4) Commutativity of multiplication: ab = ba for all a, b in R.


Next, we define an integral domain, which is a commutative ring that obeys

the following axioms.

(M5) Multiplicative identity: There is an element 1 in R such that

a1 = 1a = a for all a in R.

(M6) No zero divisors: If a, b in R and ab = 0, then either a = 0

or b = 0.

Let S be the set of even integers (positive, negative, and 0) under the usual

operations of addition and multiplication. S is a commutative ring. The set of all

n-square matrices defined in the preceding example is not a commutative ring.

The set Zn of integers {0, 1, ..., n – 1}, together with the arithmetic
operations modulo n, is a commutative ring (Table 2.5).

Let S be the set of integers (positive, negative, and 0) under the usual operations

of addition and multiplication. S is an integral domain.

Familiar examples of fields are the rational numbers, the real numbers, and the

complex numbers. Note that the set of all integers is not a field, because not every

element of the set has a multiplicative inverse; in fact, only the elements 1 and – 1

have multiplicative inverses in the integers.

5.3 FIELDS

A field F, sometimes denoted by {F, + , * }, is a set of elements with two binary

operations, called addition and multiplication, such that for all a, b, c in F the follow-

ing axioms are obeyed.

(A1–M6) F is an integral domain; that is, F satisfies axioms A1 through A5 and

M1 through M6.

(M7) Multiplicative inverse: For each a in F, except 0, there is an element
a^-1 in F such that a(a^-1) = (a^-1)a = 1.

In essence, a field is a set of elements in which we can do addition, subtraction,
multiplication, and division without leaving the set. Division is defined with the
following rule: a/b = a(b^-1).

In gaining insight into fields, the following alternate characterization may be
useful. A field F, denoted by {F, + , * }, is a set of elements with two binary
operations, called addition and multiplication, such that the following conditions hold:

1. F forms an abelian group with respect to addition.

2. The nonzero elements of F form an abelian group with respect to multiplication.

3. The distributive law holds. That is, for all a, b, c in F,
a(b + c) = ab + ac
(a + b)c = ac + bc

Figure 5.2 summarizes the axioms that define groups, rings, and fields.

5.4 FINITE FIELDS OF THE FORM GF(p)

In Section 5.3, we defined a field as a set that obeys all of the axioms of Figure 5.2

and gave some examples of infinite fields. Infinite fields are not of particular inter-

est in the context of cryptography. However, in addition to infinite fields, there are

two types of finite fields, as illustrated in Figure 5.3. Finite fields play a crucial role

in many cryptographic algorithms.

It can be shown that the order of a finite field (number of elements in the
field) must be a power of a prime p^n, where n is a positive integer. The finite field
of order p^n is generally written GF(p^n); GF stands for Galois field, in honor of the
mathematician who first studied finite fields. Two special cases are of interest for
our purposes. For n = 1, we have the finite field GF(p); this finite field has a
different structure than that for finite fields with n > 1 and is studied in this
section. For finite fields of the form GF(p^n), GF(2^n) fields are of particular
cryptographic interest, and these are covered in Section 5.6.

Finite Fields of Order p

For a given prime, p, we define the finite field of order p, GF(p), as the set Zp of integers

{0, 1, ..., p – 1} together with the arithmetic operations modulo p. Note therefore

that we are using ordinary modular arithmetic to define the operations over these fields.

Figure 5.2 Properties of Groups, Rings, and Fields

(A1) Closure under addition: If a and b belong to S, then a + b is also in S

(A2) Associativity of addition: a + (b + c) = (a + b) + c for all a, b, c in S

(A3) Additive identity: There is an element 0 in S such that

a + 0 = 0 + a = a for all a in S

(A4) Additive inverse: For each a in S there is an element –a in S

such that a + (–a) = (–a) + a = 0

(A5) Commutativity of addition: a + b = b + a for all a, b in S

(M1) Closure under multiplication: If a and b belong to S, then ab is also in S

(M2) Associativity of multiplication: a(bc) = (ab)c for all a, b, c in S

(M3) Distributive laws: a(b + c) = ab + ac for all a, b, c in S

(a + b)c = ac + bc for all a, b, c in S

(M4) Commutativity of multiplication: ab = ba for all a, b in S

(M5) Multiplicative identity: There is an element 1 in S such that

a1 = 1a = a for all a in S

(M6) No zero divisors: If a, b in S and ab = 0, then either

a = 0 or b = 0

(M7) Multiplicative inverse: If a belongs to S and a ≠ 0, there is an
element a^-1 in S such that a(a^-1) = (a^-1)a = 1

Group: A1 through A4. Abelian group: A1 through A5. Ring: A1 through M3.
Commutative ring: A1 through M4. Integral domain: A1 through M6. Field: A1 through M7.


Recall that we showed in Section 5.2 that the set Zn of integers {0, 1, ..., n – 1},
together with the arithmetic operations modulo n, is a commutative ring (Table 2.5).
We further observed that any integer in Zn has a multiplicative inverse if and only if
that integer is relatively prime to n [see discussion of Equation (2.5)].4 If n is prime,
then all of the nonzero integers in Zn are relatively prime to n, and therefore there
exists a multiplicative inverse for all of the nonzero integers in Zn. Thus, for Zp we
can add the following property to those listed in Table 2.5:

Multiplicative inverse (w^-1): For each w ∈ Zp, w ≠ 0, there exists a z ∈ Zp
such that w * z ≡ 1 (mod p)

Because w is relatively prime to p, if we multiply all the elements of Zp by
w, the resulting residues are all of the elements of Zp permuted. Thus, exactly one
of the residues has the value 1. Therefore, there is some integer in Zp that, when
multiplied by w, yields the residue 1. That integer is the multiplicative inverse of w,
designated w^-1. Therefore, Zp is in fact a finite field. Furthermore, Equation (2.5) is
consistent with the existence of a multiplicative inverse and can be rewritten
without the condition:

if (a * b) ≡ (a * c) (mod p) then b ≡ c (mod p) (5.1)

Multiplying both sides of Equation (5.1) by the multiplicative inverse of a, we have

((a^-1) * a * b) ≡ ((a^-1) * a * c) (mod p)
b ≡ c (mod p)

4As stated in the discussion of Equation (2.5), two integers are relatively prime if their only common

positive integer factor is 1.

Figure 5.3 Types of Fields (fields divide into fields with an infinite number of
elements and finite fields; finite fields divide into GF(p), with p elements, and
GF(p^n), with p^n elements)

The simplest finite field is GF(2). Its arithmetic operations are easily summarized:

Addition
+ | 0 1
0 | 0 1
1 | 1 0

Multiplication
* | 0 1
0 | 0 0
1 | 0 1

Inverses
w | –w  w^-1
0 |  0   –
1 |  1   1

In this case, addition is equivalent to the exclusive-OR (XOR) operation, and

multiplication is equivalent to the logical AND operation.


The right-hand side of Table 5.1 shows arithmetic operations in GF(7). This is a

field of order 7 using modular arithmetic modulo 7. As can be seen, it satisfies all

of the properties required of a field (Figure 5.2). Compare with the left-hand side

of Table 5.1, which reproduces Table 2.2. In the latter case, we see that the set Z8,

using modular arithmetic modulo 8, is not a field. Later in this chapter, we show

how to define addition and multiplication operations on Z8 in such a way as to

form a finite field.
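The contrast between Z8 and GF(7) can be reproduced by brute force. The sketch below is illustrative only; the helper names `multiplicative_inverses` and `is_field` are assumptions, and the exhaustive search is practical only for small n.

```python
def multiplicative_inverses(n):
    """Map each w in Z_n to its multiplicative inverse modulo n,
    or to None when no inverse exists."""
    return {
        w: next((z for z in range(n) if (w * z) % n == 1), None)
        for w in range(n)
    }

def is_field(n):
    """Z_n under arithmetic modulo n is a field exactly when every
    nonzero element has a multiplicative inverse."""
    inverses = multiplicative_inverses(n)
    return n > 1 and all(inverses[w] is not None for w in range(1, n))

inverses_mod_7 = multiplicative_inverses(7)
inverses_mod_8 = multiplicative_inverses(8)
```

The two dictionaries correspond to the w^-1 rows of Table 5.1f and Table 5.1c: every nonzero residue modulo 7 has an inverse, while modulo 8 only the odd residues do.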

Finding the Multiplicative Inverse in GF(p)

It is easy to find the multiplicative inverse of an element in GF(p) for small values

of p. You simply construct a multiplication table, such as shown in Table 5.1e, and

the desired result can be read directly. However, for large values of p, this approach

is not practical.

If a and b are relatively prime, that is, if gcd(a, b) = 1, then b has a
multiplicative inverse modulo a. In other words, for positive integer b < a, there
exists a b^-1 < a such that b(b^-1) = 1 mod a. If a is a prime number and b < a,
then clearly a and b are relatively prime and have a greatest common divisor of 1.
We now show that we can easily compute b^-1 using the extended Euclidean algorithm.

We repeat here Equation (2.7), which we showed can be solved with the ex-

tended Euclidean algorithm:

ax + by = d = gcd(a, b)

Now, if gcd(a, b) = 1, then we have ax + by = 1. Using the basic equalities of

modular arithmetic, defined in Section 2.3, we can say

[(ax mod a) + (by mod a)] mod a = 1 mod a

0 + (by mod a) = 1

But if by mod a = 1, then y = b^-1. Thus, applying the extended Euclidean
algorithm to Equation (2.7) yields the value of the multiplicative inverse of b if
gcd(a, b) = 1.

Consider the example that was shown in Table 2.4. Here we have a = 1759,
which is a prime number, and b = 550. The solution of the equation
1759x + 550y = d yields a value of y = 355. Thus, b^-1 = 355. To verify, we
calculate 550 * 355 mod 1759 = 195250 mod 1759 = 1.

More generally, the extended Euclidean algorithm can be used to find a
multiplicative inverse in Zn for any n. If we apply the extended Euclidean algorithm
to the equation nx + by = d, and the algorithm yields d = 1, then y = b^-1 in Zn.
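The procedure just described can be sketched in a few lines. This is one common iterative formulation of the extended Euclidean algorithm, offered as an illustration rather than as the book's own listing; the names `extended_gcd` and `mod_inverse` are assumptions.

```python
def extended_gcd(a, b):
    """Return (d, x, y) such that a*x + b*y == d == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def mod_inverse(b, n):
    """Return b^-1 in Z_n by solving n*x + b*y = 1 for y."""
    d, _, y = extended_gcd(n, b)
    if d != 1:
        raise ValueError(f"{b} has no multiplicative inverse modulo {n}")
    return y % n  # normalize y into the range 0 .. n-1

inverse_of_550 = mod_inverse(550, 1759)
```

Applied to the example above, `mod_inverse(550, 1759)` recovers the value 355. The final reduction `y % n` is needed because the algorithm may return a negative y.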


+ 0 1 2 3 4 5 6 7

0 0 1 2 3 4 5 6 7

1 1 2 3 4 5 6 7 0

2 2 3 4 5 6 7 0 1

3 3 4 5 6 7 0 1 2

4 4 5 6 7 0 1 2 3

5 5 6 7 0 1 2 3 4

6 6 7 0 1 2 3 4 5

7 7 0 1 2 3 4 5 6

(a) Addition modulo 8

* 0 1 2 3 4 5 6 7

0 0 0 0 0 0 0 0 0

1 0 1 2 3 4 5 6 7

2 0 2 4 6 0 2 4 6

3 0 3 6 1 4 7 2 5

4 0 4 0 4 0 4 0 4

5 0 5 2 7 4 1 6 3

6 0 6 4 2 0 6 4 2

7 0 7 6 5 4 3 2 1

(b) Multiplication modulo 8

w 0 1 2 3 4 5 6 7

– w 0 7 6 5 4 3 2 1

w -1 — 1 — 3 — 5 — 7

(c) Additive and multiplicative

inverses modulo 8

+ 0 1 2 3 4 5 6

0 0 1 2 3 4 5 6

1 1 2 3 4 5 6 0

2 2 3 4 5 6 0 1

3 3 4 5 6 0 1 2

4 4 5 6 0 1 2 3

5 5 6 0 1 2 3 4

6 6 0 1 2 3 4 5

(d) Addition modulo 7

* 0 1 2 3 4 5 6

0 0 0 0 0 0 0 0

1 0 1 2 3 4 5 6

2 0 2 4 6 1 3 5

3 0 3 6 2 5 1 4

4 0 4 1 5 2 6 3

5 0 5 3 1 6 4 2

6 0 6 5 4 3 2 1

(e) Multiplication modulo 7

w 0 1 2 3 4 5 6

– w 0 6 5 4 3 2 1

w -1 — 1 4 5 2 3 6

(f) Additive and multiplicative

inverses modulo 7

Table 5.1 Arithmetic Modulo 8 and Modulo 7

Summary

In this section, we have shown how to construct a finite field of order p, where p is

prime. Specifically, we defined GF(p) with the following properties.

1. GF(p) consists of p elements.

2. The binary operations + and * are defined over the set. The operations of

addition, subtraction, multiplication, and division can be performed without

leaving the set. Each element of the set other than 0 has a multiplicative in-

verse, and division is performed by multiplication by the multiplicative inverse.

We have shown that the elements of GF(p) are the integers {0, 1, ..., p – 1}

and that the arithmetic operations are addition and multiplication mod p.


5.5 POLYNOMIAL ARITHMETIC

Before continuing our discussion of finite fields, we need to introduce the interest-

ing subject of polynomial arithmetic. We are concerned with polynomials in a single

variable x, and we can distinguish three classes of polynomial arithmetic (Figure 5.4).

■ Ordinary polynomial arithmetic, using the basic rules of algebra.

■ Polynomial arithmetic in which the arithmetic on the coefficients is performed

modulo p; that is, the coefficients are in GF(p).

■ Polynomial arithmetic in which the coefficients are in GF(p), and the poly-

nomials are defined modulo a polynomial m(x) whose highest power is some

integer n.

This section examines the first two classes, and the next section covers the

last class.

Ordinary Polynomial Arithmetic

A polynomial of degree n (integer n ≥ 0) is an expression of the form

f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = \sum_{i=0}^{n} a_i x^i

where the a_i are elements of some designated set of numbers S, called the coefficient
set, and a_n ≠ 0. We say that such polynomials are defined over the coefficient set S.

A zero-degree polynomial is called a constant polynomial and is simply an
element of the set of coefficients. An nth-degree polynomial is said to be a monic
polynomial if a_n = 1.

In the context of abstract algebra, we are usually not interested in evaluating a

polynomial for a particular value of x [e.g., f(7)]. To emphasize this point, the vari-

able x is sometimes referred to as the indeterminate.

Polynomial arithmetic includes the operations of addition, subtraction, and
multiplication. These operations are defined in a natural way as though the variable
x was an element of S. Division is similarly defined, but requires that S be a field.
Examples of fields include the real numbers, rational numbers, and Zp for p prime.
Note that the set of all integers is not a field and does not support polynomial
division.

Figure 5.4 Treatment of Polynomials (x is treated either as a variable, evaluated for
a particular value of x, or as an indeterminate; for an indeterminate, the three cases
are ordinary polynomial arithmetic, arithmetic on coefficients performed modulo p, and
arithmetic on coefficients performed modulo p with polynomials defined modulo a
polynomial m(x))

Addition and subtraction are performed by adding or subtracting corresponding
coefficients. Thus, if

f(x) = \sum_{i=0}^{n} a_i x^i;  g(x) = \sum_{i=0}^{m} b_i x^i;  n \geq m

then addition is defined as

f(x) + g(x) = \sum_{i=0}^{m} (a_i + b_i) x^i + \sum_{i=m+1}^{n} a_i x^i

and multiplication is defined as

f(x) * g(x) = \sum_{i=0}^{n+m} c_i x^i

where

c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_{k-1} b_1 + a_k b_0

In the last formula, we treat a_i as zero for i > n and b_i as zero for i > m. Note that
the degree of the product is equal to the sum of the degrees of the two polynomials.

As an example, let f(x) = x^3 + x^2 + 2 and g(x) = x^2 – x + 1, where S is the set
of integers. Then

f(x) + g(x) = x^3 + 2x^2 – x + 3
f(x) – g(x) = x^3 + x + 1
f(x) * g(x) = x^5 + 3x^2 – 2x + 2

Figures 5.5a through 5.5c show the manual calculations. We comment on division

subsequently.
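The three results above follow directly from the coefficient formulas. As a minimal sketch, assume a polynomial is stored as a list of integer coefficients with index i holding the coefficient of x^i; this representation and the names `poly_add`, `poly_sub`, and `poly_mul` are illustrative choices, not notation from the text.

```python
from itertools import zip_longest

def trim(coeffs):
    """Drop trailing zero coefficients, keeping at least one entry."""
    coeffs = list(coeffs)
    while len(coeffs) > 1 and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs

def poly_add(f, g):
    """Add corresponding coefficients; the shorter list is padded with zeros."""
    return trim(a + b for a, b in zip_longest(f, g, fillvalue=0))

def poly_sub(f, g):
    """Subtract corresponding coefficients."""
    return trim(a - b for a, b in zip_longest(f, g, fillvalue=0))

def poly_mul(f, g):
    """Compute c_k as the sum of a_i * b_j over all i + j = k."""
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return trim(c)

f = [2, 0, 1, 1]   # x^3 + x^2 + 2
g = [1, -1, 1]     # x^2 - x + 1
total, difference, product = poly_add(f, g), poly_sub(f, g), poly_mul(f, g)
```

Reading each result from the highest index down gives the same three polynomials as the worked example.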

Polynomial Arithmetic with Coefficients in Zp

Let us now consider polynomials in which the coefficients are elements of some

field F; we refer to this as a polynomial over the field F. In this case, it is easy to

show that the set of such polynomials is a ring, referred to as a polynomial ring. That

is, if we consider each distinct polynomial to be an element of the set, then that set

is a ring.5

When polynomial arithmetic is performed on polynomials over a field, then
division is possible. Note that this does not mean that exact division is possible. Let
us clarify this distinction. Within a field, given two elements a and b, the quotient
a/b is also an element of the field. However, given a ring R that is not a field, in
general, division will result in both a quotient and a remainder; this is not exact division.

5 In fact, the set of polynomials whose coefficients are elements of a commutative ring forms a polynomial
ring, but that is of no interest in the present context.

Figure 5.5 Examples of Polynomial Arithmetic: (a) Addition, (b) Subtraction,
(c) Multiplication, (d) Division

Consider the division 5/3 within a set S. If S is the set of rational numbers, which

is a field, then the result is simply expressed as 5/3 and is an element of S. Now

suppose that S is the field Z7. In this case, we calculate (using Table 5.1f)

5/3 = (5 * 3^-1) mod 7 = (5 * 5) mod 7 = 4

which is an exact solution. Finally, suppose that S is the set of integers, which is a

ring but not a field. Then 5/3 produces a quotient of 1 and a remainder of 2:

5/3 = 1 + 2/3

5 = 1 * 3 + 2

Thus, division is not exact over the set of integers.

Now, if we attempt to perform polynomial division over a coefficient set that

is not a field, we find that division is not always defined.

If the coefficient set is the integers, then (5x^2)/(3x) does not have a solution,
because it would require a coefficient with a value of 5/3, which is not in the
coefficient set. Suppose that we perform the same polynomial division over Z7. Then
we have (5x^2)/(3x) = 4x, which is a valid polynomial over Z7.
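Division in Zp, as used in the two examples above, is multiplication by an inverse. The sketch below assumes Python 3.8 or later, where the built-in `pow` accepts an exponent of -1 together with a modulus; the function name `divide_mod_p` is a hypothetical helper.

```python
def divide_mod_p(a, b, p):
    """Return a/b in GF(p), computed as a * b^-1 mod p.

    Assumes p is prime and b is not a multiple of p; pow raises
    ValueError when b has no inverse modulo p.
    """
    return (a * pow(b, -1, p)) % p

# 5/3 in Z7; this is also the coefficient of the quotient (5x^2)/(3x) = 4x.
quotient_coefficient = divide_mod_p(5, 3, 7)
```

The exponent of the quotient term is found separately by subtracting exponents (2 - 1 = 1), so only the coefficient needs modular division.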

However, as we demonstrate presently, even if the coefficient set is a field,
polynomial division is not necessarily exact. In general, division will produce a
quotient and a remainder. We can restate the division algorithm of Equation (2.1) for
polynomials over a field as follows. Given polynomials f(x) of degree n and g(x)
of degree m, (n ≥ m), if we divide f(x) by g(x), we get a quotient q(x) and a
remainder r(x) that obey the relationship

f(x) = q(x)g(x) + r(x) (5.2)

with polynomial degrees:

Degree f(x) = n
Degree g(x) = m
Degree q(x) = n – m
Degree r(x) ≤ m – 1

With the understanding that remainders are allowed, we can say that poly-

nomial division is possible if the coefficient set is a field. One common technique

used for polynomial division is polynomial long division, similar to long division for

integers. Examples of this are shown subsequently.

In an analogy to integer arithmetic, we can write f(x) mod g(x) for the
remainder r(x) in Equation (5.2). That is, r(x) = f(x) mod g(x). If there is no remainder
[i.e., r(x) = 0], then we can say g(x) divides f(x), written as g(x) | f(x). Equivalently,
we can say that g(x) is a factor of f(x) or g(x) is a divisor of f(x).

For the preceding example [f(x) = x^3 + x^2 + 2 and g(x) = x^2 – x + 1], f(x)/g(x)
produces a quotient of q(x) = x + 2 and a remainder r(x) = x, as shown in
Figure 5.5d. This is easily verified by noting that

q(x)g(x) + r(x) = (x + 2)(x^2 – x + 1) + x = (x^3 + x^2 – x + 2) + x
= x^3 + x^2 + 2 = f(x)
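Polynomial long division over a field can be sketched as repeated cancellation of the leading term. The example below works over the rationals using `fractions.Fraction`, with coefficient lists ordered from the constant term upward; both the representation and the name `poly_divmod` are assumptions made for illustration.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Divide f by g over the rationals and return (quotient, remainder).

    Polynomials are coefficient lists, lowest degree first, so that
    f = q*g + r with degree(r) < degree(g).
    """
    r = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while len(g) > 1 and g[-1] == 0:
        g.pop()
    if g == [0]:
        raise ZeroDivisionError("division by the zero polynomial")
    while len(r) > 1 and r[-1] == 0:
        r.pop()
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 1)
    while len(r) >= len(g) and any(r):
        shift = len(r) - len(g)          # degree of the next quotient term
        factor = r[-1] / g[-1]           # cancels the leading term of r
        q[shift] = factor
        for i, c in enumerate(g):
            r[i + shift] -= factor * c
        while len(r) > 1 and r[-1] == 0:
            r.pop()
    return q, r

quotient, remainder = poly_divmod([2, 0, 1, 1], [1, -1, 1])
```

For f(x) = x^3 + x^2 + 2 and g(x) = x^2 - x + 1 this yields the quotient x + 2 and remainder x, matching the verification above. Over the integers the step `r[-1] / g[-1]` is not always defined, which is why the coefficient set must be a field.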

For our purposes, polynomials over GF(2) are of most interest. Recall from

Section 5.4 that in GF(2), addition is equivalent to the XOR operation, and multi-

plication is equivalent to the logical AND operation. Further, addition and subtrac-

tion are equivalent mod 2:

1 + 1 = 1 – 1 = 0

1 + 0 = 1 – 0 = 1

0 + 1 = 0 – 1 = 1

Figure 5.6 shows an example of polynomial arithmetic over GF(2). For
f(x) = (x^7 + x^5 + x^4 + x^3 + x + 1) and g(x) = (x^3 + x + 1), the figure shows
f(x) + g(x); f(x) – g(x); f(x) * g(x); and f(x)/g(x). Note that g(x) | f(x).

A polynomial f(x) over a field F is called irreducible if and only if f(x)
cannot be expressed as a product of two polynomials, both over F, and both of degree
lower than that of f(x). By analogy to integers, an irreducible polynomial is also
called a prime polynomial.

The polynomial6 f(x) = x^4 + 1 over GF(2) is reducible, because
x^4 + 1 = (x + 1)(x^3 + x^2 + x + 1).

6 In the remainder of this chapter, unless otherwise noted, all examples are of polynomials over GF(2).


Consider the polynomial f(x) = x^3 + x + 1. It is clear by inspection that x is not
a factor of f(x). We easily show that x + 1 is not a factor of f(x):

              x^2 + x
        -----------------
x + 1 ) x^3       + x + 1
        x^3 + x^2
        -----------------
              x^2 + x
              x^2 + x
        -----------------
                        1

Thus, f(x) has no factors of degree 1. But it is clear by inspection that if f(x) is
reducible, it must have one factor of degree 2 and one factor of degree 1.
Therefore, f(x) is irreducible.
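Polynomials over GF(2) map naturally onto bit patterns, which makes the examples in this section easy to check. In the sketch below, bit i of an integer holds the coefficient of x^i, so addition is XOR; this encoding and the function names are assumptions for illustration, and the irreducibility test uses plain trial division, which is only sensible for small degrees.

```python
def gf2_add(f, g):
    """Add (or subtract) two polynomials over GF(2): coefficient-wise XOR."""
    return f ^ g

def gf2_mul(f, g):
    """Multiply two polynomials over GF(2) by shift-and-XOR."""
    result = 0
    while g:
        if g & 1:
            result ^= f
        f <<= 1
        g >>= 1
    return result

def gf2_divmod(f, g):
    """Long division over GF(2); returns (quotient, remainder)."""
    if g == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    q = 0
    deg_g = g.bit_length() - 1
    while f.bit_length() - 1 >= deg_g:
        shift = f.bit_length() - 1 - deg_g
        q ^= 1 << shift
        f ^= g << shift
    return q, f

def is_irreducible(f):
    """Trial-divide by every polynomial of degree 1 up to deg(f) // 2."""
    deg = f.bit_length() - 1
    if deg < 1:
        return False
    return all(gf2_divmod(f, d)[1] != 0 for d in range(2, 1 << (deg // 2 + 1)))

f = 0b10111011   # x^7 + x^5 + x^4 + x^3 + x + 1
g = 0b1011       # x^3 + x + 1
```

With these values, `gf2_divmod(f, g)` returns a zero remainder, in line with the statement that g(x) divides f(x), and `is_irreducible` separates x^3 + x + 1 from the reducible x^4 + 1.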

Figure 5.6 Examples of Polynomial Arithmetic over GF(2)
(a) Addition: (x^7 + x^5 + x^4 + x^3 + x + 1) + (x^3 + x + 1) = x^7 + x^5 + x^4
(b) Subtraction: (x^7 + x^5 + x^4 + x^3 + x + 1) – (x^3 + x + 1) = x^7 + x^5 + x^4
(c) Multiplication: (x^7 + x^5 + x^4 + x^3 + x + 1) * (x^3 + x + 1) = x^10 + x^4 + x^2 + 1
(d) Division: (x^7 + x^5 + x^4 + x^3 + x + 1) / (x^3 + x + 1) = x^4 + 1


Finding the Greatest Common Divisor

We can extend the analogy between polynomial arithmetic over a field and integer

arithmetic by defining the greatest common divisor as follows. The polynomial c(x)

is said to be the greatest common divisor of a(x) and b(x) if the following are true.

1. c(x) divides both a(x) and b(x).

2. Any divisor of a(x) and b(x) is a divisor of c(x).

An equivalent definition is the following: gcd[a(x), b(x)] is the polynomial of

maximum degree that divides both a(x) and b(x).

We can adapt the Euclidean algorithm to compute the greatest common divisor

of two polynomials. Recall Equation (2.6), from Chapter 2, which is the basis of the

Euclidean algorithm: gcd(a, b) = gcd(b, a mod b). This equality can be rewritten as the

following equation:

gcd[a(x), b(x)] = gcd[b(x), a(x) mod b(x)] (5.3)

Equation (5.3) can be used repetitively to determine the greatest common divisor.

Compare the following scheme to the definition of the Euclidean algorithm for integers.

Euclidean Algorithm for Polynomials

Calculate | Which satisfies
r_1(x) = a(x) mod b(x) | a(x) = q_1(x)b(x) + r_1(x)
r_2(x) = b(x) mod r_1(x) | b(x) = q_2(x)r_1(x) + r_2(x)
r_3(x) = r_1(x) mod r_2(x) | r_1(x) = q_3(x)r_2(x) + r_3(x)
... | ...
r_n(x) = r_{n-2}(x) mod r_{n-1}(x) | r_{n-2}(x) = q_n(x)r_{n-1}(x) + r_n(x)
r_{n+1}(x) = r_{n-1}(x) mod r_n(x) = 0 | r_{n-1}(x) = q_{n+1}(x)r_n(x) + 0

d(x) = gcd(a(x), b(x)) = r_n(x)

At each iteration, we have d(x) = gcd(r_{i+1}(x), r_i(x)) until finally
d(x) = gcd(r_n(x), 0) = r_n(x). Thus, we can find the greatest common divisor of two
polynomials by repetitive application of the division algorithm. This is the Euclidean
algorithm for polynomials. The algorithm assumes that the degree of a(x) is greater
than the degree of b(x).
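The scheme above translates almost line for line into code when the polynomials are over GF(2). The following sketch assumes the same bit-pattern encoding used for illustration elsewhere (bit i is the coefficient of x^i); `gf2_mod` and `gf2_gcd` are hypothetical names.

```python
def gf2_mod(a, b):
    """Return a(x) mod b(x) for polynomials over GF(2) stored as bit patterns."""
    if b == 0:
        raise ZeroDivisionError("modulus is the zero polynomial")
    deg_b = b.bit_length() - 1
    while a.bit_length() - 1 >= deg_b:
        # Cancel the leading term of a with a shifted copy of b.
        a ^= b << (a.bit_length() - 1 - deg_b)
    return a

def gf2_gcd(a, b):
    """Apply gcd[a(x), b(x)] = gcd[b(x), a(x) mod b(x)] until the remainder is 0."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a

a = 0b1111111   # x^6 + x^5 + x^4 + x^3 + x^2 + x + 1
b = 0b10111     # x^4 + x^2 + x + 1
common_divisor = gf2_gcd(a, b)
```

For these inputs the first remainder is x^3 + x^2 + 1 and the second is 0, so the result is the bit pattern 1101.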

Find gcd[a(x), b(x)] for a(x) = x^6 + x^5 + x^4 + x^3 + x^2 + x + 1 and b(x) =
x^4 + x^2 + x + 1. First, we divide a(x) by b(x):

                          x^2 + x
                    -----------------------------------
x^4 + x^2 + x + 1 ) x^6 + x^5 + x^4 + x^3 + x^2 + x + 1
                    x^6       + x^4 + x^3 + x^2
                    -----------------------------------
                          x^5                   + x + 1
                          x^5       + x^3 + x^2 + x
                    -----------------------------------
                                      x^3 + x^2     + 1

This yields r_1(x) = x^3 + x^2 + 1 and q_1(x) = x^2 + x.

Then, we divide b(x) by r_1(x).

                      x + 1
                -----------------------
x^3 + x^2 + 1 ) x^4       + x^2 + x + 1
                x^4 + x^3       + x
                -----------------------
                      x^3 + x^2     + 1
                      x^3 + x^2     + 1
                -----------------------
                                      0

This yields r_2(x) = 0 and q_2(x) = x + 1.

Therefore, gcd[a(x), b(x)] = r_1(x) = x^3 + x^2 + 1.

Summary

We began this section with a discussion of arithmetic with ordinary polynomials. In
ordinary polynomial arithmetic, the variable is not evaluated; that is, we do not plug
a value in for the variable of the polynomials. Instead, arithmetic operations are
performed on polynomials (addition, subtraction, multiplication, division) using the
ordinary rules of algebra. Polynomial division is not allowed unless the coefficients
are elements of a field.

Next, we discussed polynomial arithmetic in which the coefficients are
elements of GF(p). In this case, polynomial addition, subtraction, multiplication, and
division are allowed. However, division is not exact; that is, in general division
results in a quotient and a remainder.

Finally, we showed that the Euclidean algorithm can be extended to find the
greatest common divisor of two polynomials whose coefficients are elements of a
field.

All of the material in this section provides a foundation for the following
section, in which polynomials are used to define finite fields of order p^n.

5.6 FINITE FIELDS OF THE FORM GF(2^n)

Earlier in this chapter, we mentioned that the order of a finite field must be of the
form p^n, where p is a prime and n is a positive integer. In Section 5.4, we looked at
the special case of finite fields with order p. We found that, using modular
arithmetic in Zp, all of the axioms for a field (Figure 5.2) are satisfied. For polynomials
over p^n, with n > 1, operations modulo p^n do not produce a field. In this section,
we show what structure satisfies the axioms for a field in a set with p^n elements and
concentrate on GF(2^n).

Motivation

Virtually all encryption algorithms, both symmetric and asymmetric, involve
arithmetic operations on integers. If one of the operations that is used in the algorithm is
division, then we need to work in arithmetic defined over a field. For convenience
and for implementation efficiency, we would also like to work with integers that fit
exactly into a given number of bits with no wasted bit patterns. That is, we wish to
work with integers in the range 0 through 2^n – 1, which fit into an n-bit word.

Suppose we wish to define a conventional encryption algorithm that operates on data 8 bits at a time, and we wish to perform division. With 8 bits, we can represent integers in the range 0 through 255. However, 256 is not a prime number, so that if arithmetic is performed in Z_256 (arithmetic modulo 256), this set of integers will not be a field. The closest prime number less than 256 is 251. Thus, the set Z_251, using arithmetic modulo 251, is a field. However, in this case the 8-bit patterns representing the integers 251 through 255 would not be used, resulting in inefficient use of storage.

As the preceding example points out, if all arithmetic operations are to be used and we wish to represent a full range of integers in n bits, then arithmetic modulo 2^n will not work. Equivalently, the set of integers modulo 2^n, for n > 1, is not a field. Furthermore, even if the encryption algorithm uses only addition and multiplication, but not division, the use of the set Z_{2^n} is questionable, as the following example illustrates.

Suppose we wish to use 3-bit blocks in our encryption algorithm and use only the operations of addition and multiplication. Then arithmetic modulo 8 is well defined, as shown in Table 5.1. However, note that in the multiplication table, the nonzero integers do not appear an equal number of times. For example, there are only four occurrences of 3, but twelve occurrences of 4. On the other hand, as was mentioned, there are finite fields of the form GF(2^n), so there is in particular a finite field of order 2^3 = 8. Arithmetic for this field is shown in Table 5.2. In this case, the number of occurrences of the nonzero integers is uniform for multiplication. To summarize,

    Integer                   1   2   3   4   5   6   7
    Occurrences in Z_8        4   8   4  12   4   8   4
    Occurrences in GF(2^3)    7   7   7   7   7   7   7
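The occurrence counts above can be reproduced with a short script. The following is a minimal sketch, not part of the original text: the function names gf8_mul and nonzero_counts are illustrative, and the field multiplication assumes the irreducible polynomial x^3 + x + 1 (bit pattern 1011) that the chapter uses later to build Table 5.2.

```python
from collections import Counter


def gf8_mul(a: int, b: int, modulus: int = 0b1011) -> int:
    """Multiply two 3-bit values as polynomials over GF(2), reduced mod x^3 + x + 1."""
    product = 0
    for _ in range(3):           # b has at most 3 bits
        if b & 1:                # add (XOR) the current multiple of a
            product ^= a
        b >>= 1
        a <<= 1                  # multiply a by x
        if a & 0b1000:           # degree reached 3: reduce by the modulus
            a ^= modulus
    return product


def nonzero_counts(mul) -> dict:
    """Count how often each nonzero value 1..7 appears in an 8 x 8 multiplication table."""
    counts = Counter(mul(a, b) for a in range(8) for b in range(8))
    return {value: counts[value] for value in range(1, 8)}


z8_counts = nonzero_counts(lambda a, b: (a * b) % 8)   # ordinary arithmetic modulo 8
gf8_counts = nonzero_counts(gf8_mul)                   # arithmetic in GF(2^3)

print("Z_8    :", z8_counts)
print("GF(2^3):", gf8_counts)
```

The uneven counts for Z_8 come from the zero divisors 2, 4, and 6; in the field every nonzero row of the multiplication table is a permutation of the nonzero elements, so each value appears exactly seven times.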

For the moment, let us set aside the question of how the matrices of Table 5.2

were constructed and instead make some observations.

1. The addition and multiplication tables are symmetric about the main diagonal, in conformance to the commutative property of addition and multiplication. This property is also exhibited in Table 5.1, which uses mod 8 arithmetic.

2. All the nonzero elements defined by Table 5.2 have a multiplicative inverse, unlike the case with Table 5.1.

3. The scheme defined by Table 5.2 satisfies all the requirements for a finite field. Thus, we can refer to this scheme as GF(2^3).

4. For convenience, we show the 3-bit assignment used for each of the elements of GF(2^3).

Intuitively, it would seem that an algorithm that maps the integers unevenly onto themselves might be cryptographically weaker than one that provides a uniform mapping. That is, a cryptanalytic technique might be able to exploit the fact that some integers occur more frequently and some less frequently in the ciphertext. Thus, the finite fields of the form GF(2^n) are attractive for cryptographic algorithms.

To summarize, we are looking for a set consisting of 2^n elements, together with a definition of addition and multiplication over the set that define a field. We can assign a unique integer in the range 0 through 2^n – 1 to each element of the set. Keep in mind that we will not use modular arithmetic, as we have seen that this does not result in a field. Instead, we will show how polynomial arithmetic provides a means for constructing the desired field.

               000  001  010  011  100  101  110  111
          +      0    1    2    3    4    5    6    7
    000   0      0    1    2    3    4    5    6    7
    001   1      1    0    3    2    5    4    7    6
    010   2      2    3    0    1    6    7    4    5
    011   3      3    2    1    0    7    6    5    4
    100   4      4    5    6    7    0    1    2    3
    101   5      5    4    7    6    1    0    3    2
    110   6      6    7    4    5    2    3    0    1
    111   7      7    6    5    4    3    2    1    0
    (a) Addition

               000  001  010  011  100  101  110  111
          ×      0    1    2    3    4    5    6    7
    000   0      0    0    0    0    0    0    0    0
    001   1      0    1    2    3    4    5    6    7
    010   2      0    2    4    6    3    1    7    5
    011   3      0    3    6    5    7    4    1    2
    100   4      0    4    3    7    6    2    5    1
    101   5      0    5    1    4    2    7    3    6
    110   6      0    6    7    1    5    3    2    4
    111   7      0    7    5    2    1    6    4    3
    (b) Multiplication

    w    –w   w^{-1}
    0    0    –
    1    1    1
    2    2    5
    3    3    6
    4    4    7
    5    5    2
    6    6    3
    7    7    4
    (c) Additive and multiplicative inverses

Table 5.2 Arithmetic in GF(2^3)

Modular Polynomial Arithmetic

Consider the set S of all polynomials of degree n – 1 or less over the field Z_p. Thus, each polynomial has the form

    f(x) = a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1 x + a_0 = \sum_{i=0}^{n-1} a_i x^i

where each a_i takes on a value in the set {0, 1, ..., p – 1}. There are a total of p^n different polynomials in S.

For p = 3 and n = 2, the 3^2 = 9 polynomials in the set are

    0, 1, 2, x, x + 1, x + 2, 2x, 2x + 1, 2x + 2

For p = 2 and n = 3, the 2^3 = 8 polynomials in the set are

    0, 1, x, x + 1, x^2, x^2 + 1, x^2 + x, x^2 + x + 1

With the appropriate definition of arithmetic operations, each such set S is a

finite field. The definition consists of the following elements.

1. Arithmetic follows the ordinary rules of polynomial arithmetic using the basic

rules of algebra, with the following two refinements.

2. Arithmetic on the coefficients is performed modulo p. That is, we use the rules

of arithmetic for the finite field Zp.

3. If multiplication results in a polynomial of degree greater than n – 1, then the

polynomial is reduced modulo some irreducible polynomial m(x) of degree n.

That is, we divide by m(x) and keep the remainder. For a polynomial f(x), the

remainder is expressed as r(x) = f(x) mod m(x).

The Advanced Encryption Standard (AES) uses arithmetic in the finite field GF(2^8), with the irreducible polynomial m(x) = x^8 + x^4 + x^3 + x + 1. Consider the two polynomials f(x) = x^6 + x^4 + x^2 + x + 1 and g(x) = x^7 + x + 1. Then

    f(x) + g(x) = x^6 + x^4 + x^2 + x + 1 + x^7 + x + 1
                = x^7 + x^6 + x^4 + x^2

    f(x) × g(x) = x^13 + x^11 + x^9 + x^8 + x^7
                  + x^7 + x^5 + x^3 + x^2 + x
                  + x^6 + x^4 + x^2 + x + 1
                = x^13 + x^11 + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + 1

                                          x^5 + x^3
    x^8 + x^4 + x^3 + x + 1 ) x^13 + x^11 + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + 1
                              x^13        + x^9 + x^8 + x^6 + x^5
                              ---------------------------------------------------
                                     x^11                         + x^4 + x^3
                                     x^11 + x^7 + x^6             + x^4 + x^3
                                     ---------------------------------------------
                                            x^7 + x^6                         + 1

Therefore, f(x) × g(x) mod m(x) = x^7 + x^6 + 1.
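The worked example can be checked mechanically by storing each polynomial over GF(2) as an integer whose bit i is the coefficient of x^i. The sketch below is illustrative only: poly_mul and poly_mod are hypothetical helper names, and the approach (multiply without carries, then reduce by long division) simply mirrors the hand calculation above.

```python
def poly_mul(a: int, b: int) -> int:
    """Carry-less product of two GF(2) polynomials stored as bit masks."""
    result = 0
    shift = 0
    while b:
        if b & 1:                 # this term of b is present
            result ^= a << shift  # add a(x) * x^shift (addition is XOR)
        b >>= 1
        shift += 1
    return result


def poly_mod(a: int, m: int) -> int:
    """Remainder of a(x) divided by m(x), by long division over GF(2)."""
    deg_m = m.bit_length() - 1
    while a.bit_length() - 1 >= deg_m:
        # cancel the leading term of a with a shifted copy of m
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a


M_AES = 0b1_0001_1011   # m(x) = x^8 + x^4 + x^3 + x + 1
f = 0b0101_0111         # f(x) = x^6 + x^4 + x^2 + x + 1
g = 0b1000_0011         # g(x) = x^7 + x + 1

total = f ^ g                       # f(x) + g(x)
product = poly_mul(f, g)            # unreduced product, degree 13
reduced = poly_mod(product, M_AES)  # f(x) * g(x) mod m(x)

print(f"sum     = {total:08b}")
print(f"product = {product:014b}")
print(f"reduced = {reduced:08b}")
```

The reduced result 11000001 corresponds to x^7 + x^6 + 1, in agreement with the long division.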

As with ordinary modular arithmetic, we have the notion of a set of residues in modular polynomial arithmetic. The set of residues modulo m(x), an nth-degree polynomial, consists of p^n elements. Each of these elements is represented by one of the p^n polynomials of degree m < n.

The residue class [x + 1], (mod m(x)), consists of all polynomials a(x) such that a(x) ≡ (x + 1) (mod m(x)). Equivalently, the residue class [x + 1] consists of all polynomials a(x) that satisfy the equality a(x) mod m(x) = x + 1.

It can be shown that the set of all polynomials modulo an irreducible nth-degree polynomial m(x) satisfies the axioms in Figure 5.2, and thus forms a finite field. Furthermore, all finite fields of a given order are isomorphic; that is, any two finite-field structures of a given order have the same structure, but the representation or labels of the elements may be different.

To construct the finite field GF(2^3), we need to choose an irreducible polynomial of degree 3. There are only two such polynomials: (x^3 + x^2 + 1) and (x^3 + x + 1). Using the latter, Table 5.3 shows the addition and multiplication tables for GF(2^3). Note that this set of tables has the identical structure to those of Table 5.2. Thus, we have succeeded in finding a way to define a field of order 2^3.

We can now read additions and multiplications from the table easily. For example, consider binary 100 + 010 = 110. This is equivalent to x^2 + x. Also consider 100 × 010 = 011, which is equivalent to x^2 × x = x^3 and reduces to x + 1. That is, x^3 mod (x^3 + x + 1) = x + 1, which is equivalent to 011.

Finding the Multiplicative Inverse

Just as the Euclidean algorithm can be adapted to find the greatest common divisor

of two polynomials, the extended Euclidean algorithm can be adapted to find the

multiplicative inverse of a polynomial. Specifically, the algorithm will find the mul-

tiplicative inverse of b(x) modulo a(x) if the degree of b(x) is less than the degree of

a(x) and gcd[a(x), b(x)] = 1. If a(x) is an irreducible polynomial, then it has no fac-

tor other than itself or 1, so that gcd[a(x), b(x)] = 1. The algorithm can be charac-

terized in the same way as we did for the extended Euclidean algorithm for integers.

Given polynomials a(x) and b(x) with the degree of a(x) greater than the degree

of b(x), we wish to solve the following equation for the values v(x), w(x), and d(x),

where d(x) = gcd[a(x), b(x)]:

a(x)v(x) + b(x)w(x) = d(x)

If d(x) = 1, then w(x) is the multiplicative inverse of b(x) modulo a(x). The calcula-

tions are as follows.

                    000       001       010       011       100       101       110       111
         +          0         1         x         x+1       x^2       x^2+1     x^2+x     x^2+x+1
    000  0          0         1         x         x+1       x^2       x^2+1     x^2+x     x^2+x+1
    001  1          1         0         x+1       x         x^2+1     x^2       x^2+x+1   x^2+x
    010  x          x         x+1       0         1         x^2+x     x^2+x+1   x^2       x^2+1
    011  x+1        x+1       x         1         0         x^2+x+1   x^2+x     x^2+1     x^2
    100  x^2        x^2       x^2+1     x^2+x     x^2+x+1   0         1         x         x+1
    101  x^2+1      x^2+1     x^2       x^2+x+1   x^2+x     1         0         x+1       x
    110  x^2+x      x^2+x     x^2+x+1   x^2       x^2+1     x         x+1       0         1
    111  x^2+x+1    x^2+x+1   x^2+x     x^2+1     x^2       x+1       x         1         0
    (a) Addition

                    000       001       010       011       100       101       110       111
         ×          0         1         x         x+1       x^2       x^2+1     x^2+x     x^2+x+1
    000  0          0         0         0         0         0         0         0         0
    001  1          0         1         x         x+1       x^2       x^2+1     x^2+x     x^2+x+1
    010  x          0         x         x^2       x^2+x     x+1       1         x^2+x+1   x^2+1
    011  x+1        0         x+1       x^2+x     x^2+1     x^2+x+1   x^2       1         x
    100  x^2        0         x^2       x+1       x^2+x+1   x^2+x     x         x^2+1     1
    101  x^2+1      0         x^2+1     1         x^2       x         x^2+x+1   x+1       x^2+x
    110  x^2+x      0         x^2+x     x^2+x+1   1         x^2+1     x+1       x         x^2
    111  x^2+x+1    0         x^2+x+1   x^2+1     x         1         x^2+x     x^2       x+1
    (b) Multiplication

Table 5.3 Polynomial Arithmetic Modulo (x^3 + x + 1)

Extended Euclidean Algorithm for Polynomials

Calculate | Which satisfies | Calculate | Which satisfies
r_{-1}(x) = a(x) | | v_{-1}(x) = 1; w_{-1}(x) = 0 | a(x) = a(x)v_{-1}(x) + b(x)w_{-1}(x)
r_0(x) = b(x) | | v_0(x) = 0; w_0(x) = 1 | b(x) = a(x)v_0(x) + b(x)w_0(x)
r_1(x) = a(x) mod b(x); q_1(x) = quotient of a(x)/b(x) | a(x) = q_1(x)b(x) + r_1(x) | v_1(x) = v_{-1}(x) – q_1(x)v_0(x) = 1; w_1(x) = w_{-1}(x) – q_1(x)w_0(x) = –q_1(x) | r_1(x) = a(x)v_1(x) + b(x)w_1(x)
r_2(x) = b(x) mod r_1(x); q_2(x) = quotient of b(x)/r_1(x) | b(x) = q_2(x)r_1(x) + r_2(x) | v_2(x) = v_0(x) – q_2(x)v_1(x); w_2(x) = w_0(x) – q_2(x)w_1(x) | r_2(x) = a(x)v_2(x) + b(x)w_2(x)
r_3(x) = r_1(x) mod r_2(x); q_3(x) = quotient of r_1(x)/r_2(x) | r_1(x) = q_3(x)r_2(x) + r_3(x) | v_3(x) = v_1(x) – q_3(x)v_2(x); w_3(x) = w_1(x) – q_3(x)w_2(x) | r_3(x) = a(x)v_3(x) + b(x)w_3(x)
... | ... | ... | ...
r_n(x) = r_{n-2}(x) mod r_{n-1}(x); q_n(x) = quotient of r_{n-2}(x)/r_{n-1}(x) | r_{n-2}(x) = q_n(x)r_{n-1}(x) + r_n(x) | v_n(x) = v_{n-2}(x) – q_n(x)v_{n-1}(x); w_n(x) = w_{n-2}(x) – q_n(x)w_{n-1}(x) | r_n(x) = a(x)v_n(x) + b(x)w_n(x)
r_{n+1}(x) = r_{n-1}(x) mod r_n(x) = 0; q_{n+1}(x) = quotient of r_{n-1}(x)/r_n(x) | r_{n-1}(x) = q_{n+1}(x)r_n(x) + 0 | | d(x) = gcd(a(x), b(x)) = r_n(x); v(x) = v_n(x); w(x) = w_n(x)

Table 5.4 shows the calculation of the multiplicative inverse of (x^7 + x + 1) mod (x^8 + x^4 + x^3 + x + 1). The result is that (x^7 + x + 1)^{-1} = (x^7). That is, (x^7 + x + 1)(x^7) ≡ 1 (mod (x^8 + x^4 + x^3 + x + 1)).
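The extended Euclidean procedure for polynomials over GF(2) can be sketched in a few lines. This is an illustrative sketch rather than a reference implementation: polynomials are stored as integers (bit i is the coefficient of x^i), the helper names poly_mul, poly_divmod, and poly_inverse are hypothetical, and only the w column of the algorithm is tracked because that is the one that yields the inverse. Subtraction over GF(2) is the same as addition, so the update w_n = w_{n-2} – q_n w_{n-1} becomes an XOR.

```python
def poly_mul(a: int, b: int) -> int:
    """Carry-less product of two GF(2) polynomials stored as bit masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_divmod(a: int, b: int):
    """Quotient and remainder of a(x) / b(x) over GF(2); b must be nonzero."""
    quotient = 0
    deg_b = b.bit_length() - 1
    while a and a.bit_length() - 1 >= deg_b:
        shift = a.bit_length() - 1 - deg_b
        quotient ^= 1 << shift    # record the quotient term x^shift
        a ^= b << shift           # cancel the leading term of a
    return quotient, a


def poly_inverse(b: int, a: int) -> int:
    """Return w(x) such that b(x) * w(x) = 1 (mod a(x)), if gcd[a(x), b(x)] = 1."""
    r_prev, r_cur = a, b          # r_{-1}, r_0
    w_prev, w_cur = 0, 1          # w_{-1}, w_0
    while r_cur != 0:
        q, r_next = poly_divmod(r_prev, r_cur)
        w_next = w_prev ^ poly_mul(q, w_cur)   # w_n = w_{n-2} - q_n * w_{n-1}
        r_prev, r_cur = r_cur, r_next
        w_prev, w_cur = w_cur, w_next
    if r_prev != 1:
        raise ValueError("no inverse: gcd is not 1")
    return w_prev


M_AES = 0x11B                     # x^8 + x^4 + x^3 + x + 1
b = 0x83                          # x^7 + x + 1
inverse = poly_inverse(b, M_AES)
print(f"inverse = {inverse:08b}")
```

For the inputs of Table 5.4 the loop runs four times and stops with the last nonzero remainder equal to 1, returning 10000000, that is, x^7.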

Computational Considerations

A polynomial f(x) in GF(2^n)

    f(x) = a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1 x + a_0 = \sum_{i=0}^{n-1} a_i x^i

can be uniquely represented by the sequence of its n binary coefficients (a_{n-1}, a_{n-2}, ..., a_0). Thus, every polynomial in GF(2^n) can be represented by an n-bit number.

Tables 5.2 and 5.3 show the addition and multiplication tables for GF(2^3) modulo m(x) = (x^3 + x + 1). Table 5.2 uses the binary representation, and Table 5.3 uses the polynomial representation.

    Initialization   a(x) = x^8 + x^4 + x^3 + x + 1; v_{-1}(x) = 1; w_{-1}(x) = 0
                     b(x) = x^7 + x + 1; v_0(x) = 0; w_0(x) = 1
    Iteration 1      q_1(x) = x; r_1(x) = x^4 + x^3 + x^2 + 1
                     v_1(x) = 1; w_1(x) = x
    Iteration 2      q_2(x) = x^3 + x^2 + 1; r_2(x) = x
                     v_2(x) = x^3 + x^2 + 1; w_2(x) = x^4 + x^3 + x + 1
    Iteration 3      q_3(x) = x^3 + x^2 + x; r_3(x) = 1
                     v_3(x) = x^6 + x^2 + x + 1; w_3(x) = x^7
    Iteration 4      q_4(x) = x; r_4(x) = 0
                     v_4(x) = x^7 + x + 1; w_4(x) = x^8 + x^4 + x^3 + x + 1
    Result           d(x) = r_3(x) = gcd(a(x), b(x)) = 1
                     w(x) = w_3(x) = (x^7 + x + 1)^{-1} mod (x^8 + x^4 + x^3 + x + 1) = x^7

Table 5.4 Extended Euclid [(x^8 + x^4 + x^3 + x + 1), (x^7 + x + 1)]

ADDITION We have seen that addition of polynomials is performed by adding corresponding coefficients, and, in the case of polynomials over Z_2, addition is just the XOR operation. So, addition of two polynomials in GF(2^n) corresponds to a bitwise XOR operation.

Consider the two polynomials in GF(2^8) from our earlier example: f(x) = x^6 + x^4 + x^2 + x + 1 and g(x) = x^7 + x + 1.

    (x^6 + x^4 + x^2 + x + 1) + (x^7 + x + 1) = x^7 + x^6 + x^4 + x^2    (polynomial notation)
    (01010111) ⊕ (10000011) = (11010100)                                 (binary notation)
    {57} ⊕ {83} = {D4}                                                   (hexadecimal notation)⁷

⁷ A basic refresher on number systems (decimal, binary, hexadecimal) can be found at the Computer Science Student Resource Site at WilliamStallings.com/StudentSupport.html. Here each of two groups of 4 bits in a byte is denoted by a single hexadecimal character, and the two characters are enclosed in brackets.
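The three notations used in the addition example are easy to relate in code. The sketch below is illustrative; poly_to_int is a hypothetical helper that sets bit e for each exponent e present in the polynomial, after which polynomial addition in GF(2^8) is a single XOR.

```python
def poly_to_int(exponents):
    """Build the bit-mask form of a GF(2) polynomial from its list of exponents."""
    value = 0
    for e in exponents:
        value |= 1 << e
    return value


f = poly_to_int([6, 4, 2, 1, 0])   # x^6 + x^4 + x^2 + x + 1
g = poly_to_int([7, 1, 0])         # x^7 + x + 1
s = f ^ g                          # addition in GF(2^8) is bitwise XOR

print(f"({f:08b}) XOR ({g:08b}) = ({s:08b})")
print(f"{{{f:02X}}} XOR {{{g:02X}}} = {{{s:02X}}}")
```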

MULTIPLICATION There is no simple XOR operation that will accomplish multiplication in GF(2^n). However, a reasonably straightforward, easily implemented technique is available. We will discuss the technique with reference to GF(2^8) using m(x) = x^8 + x^4 + x^3 + x + 1, which is the finite field used in AES. The technique readily generalizes to GF(2^n).

The technique is based on the observation that

    x^8 mod m(x) = [m(x) – x^8] = (x^4 + x^3 + x + 1)    (5.4)

A moment's thought should convince you that Equation (5.4) is true; if you are not sure, divide it out. In general, in GF(2^n) with an nth-degree polynomial p(x), we have x^n mod p(x) = [p(x) – x^n].

Now, consider a polynomial in GF(2^8), which has the form f(x) = b_7 x^7 + b_6 x^6 + b_5 x^5 + b_4 x^4 + b_3 x^3 + b_2 x^2 + b_1 x + b_0. If we multiply by x, we have

    x × f(x) = (b_7 x^8 + b_6 x^7 + b_5 x^6 + b_4 x^5 + b_3 x^4 + b_2 x^3 + b_1 x^2 + b_0 x) mod m(x)    (5.5)

If b_7 = 0, then the result is a polynomial of degree less than 8, which is already in reduced form, and no further computation is necessary. If b_7 = 1, then reduction modulo m(x) is achieved using Equation (5.4):

    x × f(x) = (b_6 x^7 + b_5 x^6 + b_4 x^5 + b_3 x^4 + b_2 x^3 + b_1 x^2 + b_0 x) + (x^4 + x^3 + x + 1)

It follows that multiplication by x (i.e., 00000010) can be implemented as a 1-bit left shift followed by a conditional bitwise XOR with (00011011), which represents (x^4 + x^3 + x + 1). To summarize,

    x × f(x) = (b_6 b_5 b_4 b_3 b_2 b_1 b_0 0)                 if b_7 = 0
    x × f(x) = (b_6 b_5 b_4 b_3 b_2 b_1 b_0 0) ⊕ (00011011)    if b_7 = 1    (5.6)

Multiplication by a higher power of x can be achieved by repeated application of Equation (5.6). By adding intermediate results, multiplication by any constant in GF(2^8) can be achieved.

In an earlier example, we showed that for f(x) = x^6 + x^4 + x^2 + x + 1, g(x) = x^7 + x + 1, and m(x) = x^8 + x^4 + x^3 + x + 1, we have f(x) × g(x) mod m(x) = x^7 + x^6 + 1. Redoing this in binary arithmetic, we need to compute (01010111) × (10000011). First, we determine the results of multiplication by powers of x:

    (01010111) × (00000010) = (10101110)
    (01010111) × (00000100) = (01011100) ⊕ (00011011) = (01000111)
    (01010111) × (00001000) = (10001110)
    (01010111) × (00010000) = (00011100) ⊕ (00011011) = (00000111)
    (01010111) × (00100000) = (00001110)
    (01010111) × (01000000) = (00011100)
    (01010111) × (10000000) = (00111000)

So,

    (01010111) × (10000011) = (01010111) × [(00000001) ⊕ (00000010) ⊕ (10000000)]
                            = (01010111) ⊕ (10101110) ⊕ (00111000) = (11000001)

which is equivalent to x^7 + x^6 + 1.
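The shift-and-conditional-XOR rule of Equation (5.6) translates directly into code. The following is a minimal sketch for GF(2^8) with the AES polynomial; the names xtime and gf256_mul are illustrative choices, not something defined in the text. Multiplication by an arbitrary byte is built by repeatedly multiplying by x and XORing in the intermediate results selected by the bits of the multiplier.

```python
def xtime(a: int) -> int:
    """Multiply a byte by x (i.e., {02}) in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    shifted = (a << 1) & 0xFF                        # 1-bit left shift, drop b7
    return shifted ^ 0x1B if a & 0x80 else shifted   # conditional XOR with 00011011


def gf256_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) by adding (XORing) selected multiples of a."""
    result = 0
    while b:
        if b & 1:          # this power of x is present in b
            result ^= a
        a = xtime(a)       # advance to the next power of x times the original a
        b >>= 1
    return result


# Multiples of 01010111 by x^0 .. x^7, as in the worked example
multiples = [0x57]
for _ in range(7):
    multiples.append(xtime(multiples[-1]))

for power, value in enumerate(multiples):
    print(f"(01010111) x x^{power} = ({value:08b})")

print(f"(01010111) x (10000011) = ({gf256_mul(0x57, 0x83):08b})")
```

Only the multiples by x^0, x^1, and x^7 are needed for the multiplier 10000011, and XORing those three values gives 11000001.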

Using a Generator

An equivalent technique for defining a finite field of the form GF(2^n), using the same irreducible polynomial, is sometimes more convenient. To begin, we need two definitions: A generator g of a finite field F of order q (contains q elements) is an element whose first q – 1 powers generate all the nonzero elements of F. That is, the elements of F consist of 0, g^0, g^1, ..., g^{q-2}. Consider a field F defined by a polynomial f(x). An element b contained in F is called a root of the polynomial if f(b) = 0. Finally, it can be shown that a root g of an irreducible polynomial is a generator of the finite field defined on that polynomial.

    Power            Polynomial       Binary           Decimal (Hex)
    Representation   Representation   Representation   Representation
    0                0                000              0
    g^0 (= g^7)      1                001              1
    g^1              g                010              2
    g^2              g^2              100              4
    g^3              g + 1            011              3
    g^4              g^2 + g          110              6
    g^5              g^2 + g + 1      111              7
    g^6              g^2 + 1          101              5

Table 5.5 Generator for GF(2^3) using x^3 + x + 1

Let us consider the finite field GF(2^3), defined over the irreducible polynomial x^3 + x + 1, discussed previously. Thus, the generator g must satisfy f(g) = g^3 + g + 1 = 0. Keep in mind, as discussed previously, that we need not find a numerical solution to this equality. Rather, we deal with polynomial arithmetic in which arithmetic on the coefficients is performed modulo 2. Therefore, the solution to the preceding equality is g^3 = –g – 1 = g + 1. We now show that g in fact generates all of the polynomials of degree less than 3. We have the following.

    g^4 = g(g^3) = g(g + 1) = g^2 + g
    g^5 = g(g^4) = g(g^2 + g) = g^3 + g^2 = g^2 + g + 1
    g^6 = g(g^5) = g(g^2 + g + 1) = g^3 + g^2 + g = g^2 + g + g + 1 = g^2 + 1
    g^7 = g(g^6) = g(g^2 + 1) = g^3 + g = g + g + 1 = 1 = g^0

We see that the powers of g generate all the nonzero polynomials in GF(2^3). Also, it should be clear that g^k = g^{k mod 7} for any integer k. Table 5.5 shows the power representation, as well as the polynomial and binary representations.
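The successive powers of g can be generated by repeatedly multiplying by g and substituting g^3 = g + 1 whenever the degree reaches 3. The sketch below is illustrative; times_g is a hypothetical helper, and elements are stored as 3-bit integers in which bit i is the coefficient of g^i.

```python
def times_g(a: int, modulus: int = 0b1011, n: int = 3) -> int:
    """Multiply an element of GF(2^3) by g, reducing with g^3 + g + 1 = 0."""
    a <<= 1                # multiply by g
    if a >> n:             # a g^3 term appeared: replace it using the modulus
        a ^= modulus
    return a


powers = [1]               # g^0
for _ in range(6):
    powers.append(times_g(powers[-1]))

for k, value in enumerate(powers):
    print(f"g^{k} = {value:03b} ({value})")

# One more multiplication wraps around to g^7 = g^0 = 1
print("g^7 =", times_g(powers[-1]))
```

The seven values produced are the binary column of Table 5.5 in order, and together they cover every nonzero element of the field exactly once.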

In general, for GF(2^n) with irreducible polynomial f(x), determine g^n = f(g) – g^n. Then calculate all of the powers of g from g^{n+1} through g^{2^n - 2}. The elements of the field correspond to the powers of g from g^0 through g^{2^n - 2} plus the value 0. For multiplication of two elements in the field, use the equality g^k = g^{k mod (2^n - 1)} for any integer k.

Summary

In this section, we have shown how to construct a finite field of order 2^n. Specifically, we defined GF(2^n) with the following properties.

1. GF(2^n) consists of 2^n elements.

2. The binary operations + and × are defined over the set. The operations of addition, subtraction, multiplication, and division can be performed without leaving the set. Each element of the set other than 0 has a multiplicative inverse.

We have shown that the elements of GF(2^n) can be defined as the set of all polynomials of degree n – 1 or less with binary coefficients. Each such polynomial can be represented by a unique n-bit value. Arithmetic is defined as polynomial arithmetic modulo some irreducible polynomial of degree n. We have also seen that an equivalent definition of a finite field GF(2^n) makes use of a generator and that arithmetic is defined using powers of the generator.

This power representation makes multiplication easy. To multiply in the power notation, add exponents modulo 7. For example, g^4 × g^6 = g^{(10 mod 7)} = g^3 = g + 1. The same result is achieved using polynomial arithmetic: We have g^4 = g^2 + g and g^6 = g^2 + 1. Then, (g^2 + g) × (g^2 + 1) = g^4 + g^3 + g^2 + g. Next, we need to determine (g^4 + g^3 + g^2 + g) mod (g^3 + g + 1) by division:

                      g + 1
    g^3 + g + 1 ) g^4 + g^3 + g^2 + g
                  g^4       + g^2 + g
                  -------------------
                        g^3
                        g^3       + g + 1
                        -----------------
                                    g + 1

We get a result of g + 1, which agrees with the result obtained using the power representation.
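Multiplication by adding exponents is usually implemented with a pair of lookup tables. The sketch below is illustrative: EXP is copied from the binary column of Table 5.5, LOG is its inverse mapping, and mul_by_logs is a hypothetical function name. Zero has no power representation, so it is handled separately.

```python
# EXP[k] is g^k as a 3-bit value, taken from Table 5.5 (g^0 .. g^6)
EXP = [1, 2, 4, 3, 6, 7, 5]
# LOG[value] is the exponent k with g^k == value
LOG = {value: k for k, value in enumerate(EXP)}


def mul_by_logs(a: int, b: int) -> int:
    """Multiply two elements of GF(2^3) by adding their exponents modulo 7."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 7]


g4 = EXP[4]                      # g^2 + g  -> 110
g6 = EXP[6]                      # g^2 + 1  -> 101
result = mul_by_logs(g4, g6)     # g^(10 mod 7) = g^3
print(f"{g4:03b} x {g6:03b} = {result:03b}")
```

The printed result 011 is g + 1, matching the long division in the example.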

Table 5.6 shows the addition and multiplication tables for GF(2^3) using the power representation. Note that this yields the identical results to the polynomial representation (Table 5.3) with some of the rows and columns interchanged.

                    000       001       010       100       011       110       111       101
         +          0         1         g         g^2       g^3       g^4       g^5       g^6
    000  0          0         1         g         g^2       g+1       g^2+g     g^2+g+1   g^2+1
    001  1          1         0         g+1       g^2+1     g         g^2+g+1   g^2+g     g^2
    010  g          g         g+1       0         g^2+g     1         g^2       g^2+1     g^2+g+1
    100  g^2        g^2       g^2+1     g^2+g     0         g^2+g+1   g         g+1       1
    011  g^3        g+1       g         1         g^2+g+1   0         g^2+1     g^2       g^2+g
    110  g^4        g^2+g     g^2+g+1   g^2       g         g^2+1     0         1         g+1
    111  g^5        g^2+g+1   g^2+g     g^2+1     g+1       g^2       1         0         g
    101  g^6        g^2+1     g^2       g^2+g+1   1         g^2+g     g+1       g         0
    (a) Addition

                    000       001       010       100       011       110       111       101
         ×          0         1         g         g^2       g^3       g^4       g^5       g^6
    000  0          0         0         0         0         0         0         0         0
    001  1          0         1         g         g^2       g+1       g^2+g     g^2+g+1   g^2+1
    010  g          0         g         g^2       g+1       g^2+g     g^2+g+1   g^2+1     1
    100  g^2        0         g^2       g+1       g^2+g     g^2+g+1   g^2+1     1         g
    011  g^3        0         g+1       g^2+g     g^2+g+1   g^2+1     1         g         g^2
    110  g^4        0         g^2+g     g^2+g+1   g^2+1     1         g         g^2       g+1
    111  g^5        0         g^2+g+1   g^2+1     1         g         g^2       g+1       g^2+g
    101  g^6        0         g^2+1     1         g         g^2       g+1       g^2+g     g^2+g+1
    (b) Multiplication

Table 5.6 GF(2^3) Arithmetic Using Generator for the Polynomial (x^3 + x + 1)

5.7 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

abelian group

associative

coefficient set

commutative

commutative ring

cyclic group

divisor

Euclidean algorithm

field

finite field

finite group

generator

greatest common divisor

group

identity element

infinite field

infinite group

integral domain

inverse element

irreducible polynomial

modular arithmetic

modular polynomial arithmetic

monic polynomial

order

polynomial

polynomial arithmetic

polynomial ring

prime number

prime polynomial

relatively prime

residue

ring

Review Questions

5.1 Briefly define a group.

5.2 Briefly define a ring.

5.3 Briefly define a field.

5.4 List three classes of polynomial arithmetic.

Problems

5.1 For the group S_n of all permutations of n distinct symbols,

a. what is the number of elements in S_n?

b. show that S_n is not abelian for n > 2.

5.2 Does the set of residue classes (mod 3) form a group

a. with respect to modular addition?

b. with respect to modular multiplication?

5.3 Let S = {0, a, b, c}. The addition and multiplication on the set S are defined in the following tables:

    +   0   a   b   c
    0   0   a   b   c
    a   a   0   c   b
    b   b   c   0   a
    c   c   b   a   0

    ×   0   a   b   c
    0   0   0   0   0
    a   0   a   b   c
    b   0   a   b   c
    c   0   0   0   0

Is S a noncommutative ring? Justify your answer.

5.4 Develop a set of tables similar to Table 5.1 for GF(5).

5.5 Demonstrate that the set of polynomials whose coefficients form a field is a ring.

5.6 Demonstrate whether each of these statements is true or false for polynomials over a

field.


a. The product of monic polynomials is monic.

b. The product of polynomials of degrees m and n has degree m + n.

c. The sum of polynomials of degrees m and n has degree max [m, n].

5.7 For polynomial arithmetic with coefficients in Z_11, perform the following calculations.

a. (x^2 + 2x + 9)(x^3 + 11x^2 + x + 7)

b. (8x^2 + 3x + 2)(5x^2 + 6)

5.8 Determine which of the following polynomials are reducible over GF(2).

a. x^2 + 1

b. x^2 + x + 1

c. x^4 + x + 1

5.9 Determine the gcd of the following pairs of polynomials.

a. (x^3 + 1) and (x^2 + x + 1) over GF(2)

b. (x^3 + x + 1) and (x^2 + 1) over GF(3)

c. (x^3 – 2x + 1) and (x^2 – x – 2) over GF(5)

d. (x^4 + 8x^3 + 7x + 8) and (2x^3 + 9x^2 + 10x + 1) over GF(11)

5.10 Develop a set of tables similar to Table 5.3 for GF(3) with m(x) = x^2 + x + 1.

5.11 Determine the multiplicative inverse of x^2 + 1 in GF(2^3) with m(x) = x^3 + x – 1.

5.12 Develop a table similar to Table 5.5 for GF(2^5) with m(x) = x^5 + x^4 + x^3 + x + 1.

Programming Problems

5.13 Write a simple four-function calculator in GF(2^4). You may use table lookups for the multiplicative inverses.

5.14 Write a simple four-function calculator in GF(2^8). You should compute the multiplicative inverses on the fly.


6.1 Finite Field Arithmetic

6.2 AES Structure

General Structure

Detailed Structure

6.3 AES Transformation Functions

Substitute Bytes Transformation

ShiftRows Transformation

MixColumns Transformation

AddRoundKey Transformation

6.4 AES Key Expansion

Key Expansion Algorithm

Rationale

6.5 An AES Example

Results

Avalanche Effect

6.6 AES Implementation

Equivalent Inverse Cipher

Implementation Aspects

6.7 Key Terms, Review Questions, and Problems

Appendix 6A Polynomials with Coefficients in GF(2^8)

CHAPTER

Advanced Encryption Standard

172 CHAPTER 6 / ADVANCED ENCRYPTION STANDARD

The Advanced Encryption Standard (AES) was published by the National Institute of

Standards and Technology (NIST) in 2001. AES is a symmetric block cipher that is

intended to replace DES as the approved standard for a wide range of applications.

Compared to public-key ciphers such as RSA, the structure of AES and most symmet-

ric ciphers is quite complex and cannot be explained as easily as many other

cryptographic algorithms. Accordingly, the reader may wish to begin with a simplified

version of AES, which is described in Appendix I. This version allows the reader to

perform encryption and decryption by hand and gain a good understanding of the

working of the algorithm details. Classroom experience indicates that a study of this simplified version enhances understanding of AES.¹ One possible approach is to read the chapter first, then carefully read Appendix I, and then re-read the main body of the chapter.

Appendix H looks at the evaluation criteria used by NIST to select from among

the candidates for AES, plus the rationale for picking Rijndael, which was the winning

candidate. This material is useful in understanding not just the AES design but also the

criteria by which to judge any symmetric encryption algorithm.

6.1 FINITE FIELD ARITHMETIC

In AES, all operations are performed on 8-bit bytes. In particular, the arithmetic operations of addition, multiplication, and division are performed over the finite field GF(2^8). Section 5.6 discusses such operations in some detail. For the reader who has not studied Chapter 5, and as a quick review for those who have, this section summarizes the important concepts.

In essence, a field is a set in which we can do addition, subtraction, multiplication, and division without leaving the set. Division is defined with the following rule: a/b = a(b^{-1}). An example of a finite field (one with a finite number of elements) is the set Z_p consisting of all the integers {0, 1, ..., p – 1}, where p is a prime number and in which arithmetic is carried out modulo p.

¹ However, you may safely skip Appendix I, at least on a first reading. If you get lost or bogged down in the details of AES, then you can go back and start with simplified AES.

LEARNING OBJECTIVES

After studying this chapter, you should be able to:

◆ Present an overview of the general structure of Advanced Encryption

Standard (AES).

◆ Understand the four transformations used in AES.

◆ Explain the AES key expansion algorithm.

◆ Understand the use of polynomials with coefficients in GF(2^8).

Virtually all encryption algorithms, both conventional and public-key, involve arithmetic operations on integers. If one of the operations used in the algorithm is division, then we need to work in arithmetic defined over a field; this is because division requires that each nonzero element have a multiplicative inverse. For convenience and for implementation efficiency, we would also like to work with integers that fit exactly into a given number of bits, with no wasted bit patterns. That is, we wish to work with integers in the range 0 through 2^n – 1, which fit into an n-bit word. Unfortunately, the set of such integers, Z_{2^n}, using modular arithmetic, is not a field. For example, the integer 2 has no multiplicative inverse in Z_{2^n}, that is, there is no integer b, such that 2b mod 2^n = 1.
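The absence of an inverse for 2 can be confirmed by exhaustive search for small word sizes. This is a brute-force sketch for illustration only; inverses_mod is a hypothetical helper name. The reason is simple: 2b is always even, so 2b mod 2^n is even and can never equal 1.

```python
def inverses_mod(value: int, modulus: int) -> list:
    """Return every b in 0..modulus-1 with (value * b) mod modulus == 1."""
    return [b for b in range(modulus) if (value * b) % modulus == 1]


# 2 has no multiplicative inverse modulo 2^n for any word size tried here
for n in range(1, 9):
    print(f"n = {n}: inverses of 2 mod 2^{n} -> {inverses_mod(2, 2 ** n)}")

# By contrast, odd values are invertible modulo 256, and every nonzero
# value is invertible modulo a prime such as 251
print("inverse of 3 mod 256:", inverses_mod(3, 256))
print("inverse of 2 mod 251:", inverses_mod(2, 251))
```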

There is a way of defining a finite field containing 2^n elements; such a field is referred to as GF(2^n). Consider the set, S, of all polynomials of degree n – 1 or less with binary coefficients. Thus, each polynomial has the form

    f(x) = a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1 x + a_0 = \sum_{i=0}^{n-1} a_i x^i

where each a_i takes on the value 0 or 1. There are a total of 2^n different polynomials in S. For n = 3, the 2^3 = 8 polynomials in the set are

    0    x        x^2        x^2 + x
    1    x + 1    x^2 + 1    x^2 + x + 1

With the appropriate definition of arithmetic operations, each such set S is a

finite field. The definition consists of the following elements.

1. Arithmetic follows the ordinary rules of polynomial arithmetic using the basic

rules of algebra with the following two refinements.

2. Arithmetic on the coefficients is performed modulo 2. This is the same as the

XOR operation.

3. If multiplication results in a polynomial of degree greater than n – 1, then the

polynomial is reduced modulo some irreducible polynomial m(x) of degree n.

That is, we divide by m(x) and keep the remainder. For a polynomial f(x),

the remainder is expressed as r(x) = f(x) mod m(x). A polynomial m(x) is

called irreducible if and only if m(x) cannot be expressed as a product of two

polynomials, both of degree lower than that of m(x).

For example, to construct the finite field GF(2^3), we need to choose an irreducible polynomial of degree 3. There are only two such polynomials: (x^3 + x^2 + 1) and (x^3 + x + 1). Addition is equivalent to taking the XOR of like terms. Thus, (x + 1) + x = 1.
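The definition of irreducibility can be tested by trial division. The sketch below is a brute-force illustration, not an efficient algorithm: poly_mod, is_irreducible, and irreducibles_of_degree are hypothetical helper names, and polynomials over GF(2) are stored as integers with bit i holding the coefficient of x^i. A polynomial of degree d is reducible exactly when some polynomial of degree between 1 and d/2 divides it, so only those candidates are tried.

```python
def poly_mod(a: int, m: int) -> int:
    """Remainder of a(x) divided by m(x) over GF(2)."""
    deg_m = m.bit_length() - 1
    while a.bit_length() - 1 >= deg_m:
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a


def is_irreducible(p: int) -> bool:
    """True if p(x) has no factor of degree 1 .. deg(p) // 2 over GF(2)."""
    degree = p.bit_length() - 1
    if degree < 1:
        return False
    # divisors 2 .. 2^(degree//2 + 1) - 1 are all polynomials of degree 1 .. degree//2
    for divisor in range(2, 1 << (degree // 2 + 1)):
        if poly_mod(p, divisor) == 0:
            return False
    return True


def irreducibles_of_degree(degree: int) -> list:
    """All irreducible polynomials of exactly the given degree over GF(2)."""
    return [p for p in range(1 << degree, 1 << (degree + 1)) if is_irreducible(p)]


degree3 = irreducibles_of_degree(3)
degree8 = irreducibles_of_degree(8)
print("degree 3:", [f"{p:04b}" for p in degree3])
print("degree 8 count:", len(degree8))
print("AES polynomial irreducible:", is_irreducible(0x11B))
```

For degree 3 the search returns 1011 and 1101, that is, x^3 + x + 1 and x^3 + x^2 + 1, the two polynomials named in the text.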

A polynomial in GF(2n) can be uniquely represented by its n binary coeffi cients

(an – 1an – 2 c a0). Therefore, every polynomial in GF(2

n) can be represented by

an n-bit number. Addition is performed by taking the bitwise XOR of the two n-bit

elements. There is no simple XOR operation that will accomplish multiplication in

GF(2n). However, a reasonably straightforward, easily implemented, technique is

available. In essence, it can be shown that multiplication of a number in GF(2n) by

174 CHAPTER 6 / ADVANCED ENCRYPTION STANDARD

2 consists of a left shift followed by a conditional XOR with a constant. Multiplication

by larger numbers can be achieved by repeated application of this rule.

For example, AES uses arithmetic in the finite field GF(2^8) with the irreducible polynomial m(x) = x^8 + x^4 + x^3 + x + 1. Consider two elements A = (a_7 a_6 ... a_1 a_0) and B = (b_7 b_6 ... b_1 b_0). The sum A + B = (c_7 c_6 ... c_1 c_0), where c_i = a_i ⊕ b_i. The multiplication {02} · A equals (a_6 ... a_1 a_0 0) if a_7 = 0 and equals (a_6 ... a_1 a_0 0) ⊕ (00011011) if a_7 = 1 (footnote 2).

To summarize, AES operates on 8-bit bytes. Addition of two bytes is defined as the bitwise XOR operation. Multiplication of two bytes is defined as multiplication in the finite field GF(2^8), with the irreducible polynomial (footnote 3) m(x) = x^8 + x^4 + x^3 + x + 1. The developers of Rijndael give as their motivation for selecting this one of the 30 possible irreducible polynomials of degree 8 that it is the first one on the list given in [LIDL94].
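The shift-and-conditional-XOR rule can be illustrated with a minimal Python sketch. The function names `xtime` and `gf_mul` are illustrative choices, not names used by the text; the constant 0x1B is the bit pattern (00011011) given above.

```python
AES_MODULUS_LOW = 0x1B  # (00011011): x^4 + x^3 + x + 1, the low bits of m(x)


def xtime(a: int) -> int:
    """Multiply a byte by {02} in GF(2^8): left shift, then conditional XOR."""
    shifted = (a << 1) & 0xFF
    # The XOR is applied only when the leftmost bit a_7 of the input was 1.
    return shifted ^ AES_MODULUS_LOW if a & 0x80 else shifted


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) by repeated application of xtime."""
    result = 0
    while b:
        if b & 1:          # this power of x is present in b
            result ^= a    # addition in GF(2^8) is XOR
        a = xtime(a)       # move on to the next power of x
        b >>= 1
    return result


if __name__ == "__main__":
    print(f"{{02}} * {{87}} = {{{xtime(0x87):02X}}}")
    print(f"{{03}} * {{6E}} = {{{gf_mul(0x03, 0x6E):02X}}}")
```

Multiplication by a larger value is decomposed into multiplications by powers of {02}, which is the "repeated application" mentioned above.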

6.2 AES STRUCTURE

General Structure

Figure 6.1 shows the overall structure of the AES encryption process. The cipher

takes a plaintext block size of 128 bits, or 16 bytes. The key length can be 16, 24, or

32 bytes (128, 192, or 256 bits). The algorithm is referred to as AES-128, AES-192,

or AES-256, depending on the key length.

The input to the encryption and decryption algorithms is a single 128-bit block. In FIPS PUB 197, this block is depicted as a 4 × 4 square matrix of bytes. This block is copied into the State array, which is modified at each stage of encryption or decryption. After the final stage, State is copied to an output matrix. These operations are depicted in Figure 6.2a. Similarly, the key is depicted as a square matrix of bytes. This key is then expanded into an array of key schedule words. Figure 6.2b shows the expansion for the 128-bit key. Each word is four bytes, and the total key schedule is 44 words for the 128-bit key. Note that the ordering of bytes within a matrix is by column. So, for example, the first four bytes of a 128-bit plaintext input to the encryption cipher occupy the first column of the in matrix, the second four bytes occupy the second column, and so on. Similarly, the first four bytes of the expanded key, which form a word, occupy the first column of the w matrix.
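The column-by-column ordering is easy to get wrong in an implementation, so a small Python sketch may help. The helper names `bytes_to_state` and `state_to_bytes` are hypothetical; the State is modeled here as a list of four rows.

```python
def bytes_to_state(block: bytes) -> list[list[int]]:
    """Fill a 4 x 4 State column by column: state[row][col] = block[row + 4*col]."""
    if len(block) != 16:
        raise ValueError("AES operates on 16-byte blocks")
    return [[block[row + 4 * col] for col in range(4)] for row in range(4)]


def state_to_bytes(state: list[list[int]]) -> bytes:
    """Read the State back out column by column."""
    return bytes(state[row][col] for col in range(4) for row in range(4))


if __name__ == "__main__":
    state = bytes_to_state(bytes(range(16)))
    for row in state:
        print(row)
```

With the input bytes 0 through 15, the first row of State holds bytes 0, 4, 8, and 12, because the first four input bytes go down the first column.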

The cipher consists of N rounds, where the number of rounds depends on the key length: 10 rounds for a 16-byte key, 12 rounds for a 24-byte key, and 14 rounds for a 32-byte key (Table 6.1). The first N – 1 rounds consist of four distinct transformation functions: SubBytes, ShiftRows, MixColumns, and AddRoundKey, which are described subsequently. The final round contains only three transformations, and there is an initial single transformation (AddRoundKey) before the first round, which can be considered Round 0. Each transformation takes one or more 4 × 4 matrices as input and produces a 4 × 4 matrix as output. Figure 6.1 shows that the output of each round is a 4 × 4 matrix, with the output of the final round being the ciphertext. Also, the key expansion function generates N + 1 round keys, each of which is a distinct 4 × 4 matrix. Each round key serves as one of the inputs to the AddRoundKey transformation in each round.

Footnote 2: In FIPS PUB 197, a hexadecimal number is indicated by enclosing it in curly brackets. We use that convention in this chapter.

Footnote 3: In the remainder of this discussion, references to GF(2^8) refer to the finite field defined with this polynomial.

Figure 6.1 AES Encryption Process [figure: the 16-byte (128-bit) plaintext enters as the input state and passes through the initial transformation (Round 0 key), Rounds 1 through N – 1 (4 transformations each), and Round N (3 transformations) to give the 16-byte (128-bit) ciphertext; key expansion derives the 16-byte round keys for Rounds 0 through N from the M-byte key. Number of rounds / key length in bytes: 10 / 16, 12 / 24, 14 / 32.]

Figure 6.2 AES Data Structures [figure: (a) Input, state array, and output: the input bytes in0 through in15 fill the State array s(r,c) column by column, and the final State is copied to out0 through out15; (b) Key and expanded key: the key bytes k0 through k15 are expanded into the words w0, w1, w2, ..., w42, w43.]

Key Size (words/bytes/bits)               4/16/128   6/24/192   8/32/256
Plaintext Block Size (words/bytes/bits)   4/16/128   4/16/128   4/16/128
Number of Rounds                          10         12         14
Round Key Size (words/bytes/bits)         4/16/128   4/16/128   4/16/128
Expanded Key Size (words/bytes)           44/176     52/208     60/240

Table 6.1 AES Parameters
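The rows of Table 6.1 are related: N rounds need N + 1 round keys of four words each, which fixes the expanded key size. The following Python sketch recomputes those rows from the key length. The function name `aes_parameters` and the dictionary keys are illustrative only, and the round counts are simply looked up from the table rather than derived.

```python
ROUNDS_BY_KEY_BYTES = {16: 10, 24: 12, 32: 14}  # taken from Table 6.1


def aes_parameters(key_bytes: int) -> dict:
    """Return the Table 6.1 parameters for a given key length in bytes."""
    if key_bytes not in ROUNDS_BY_KEY_BYTES:
        raise ValueError("AES keys are 16, 24, or 32 bytes long")
    rounds = ROUNDS_BY_KEY_BYTES[key_bytes]
    # One 4-word round key for the initial AddRoundKey plus one per round.
    expanded_words = 4 * (rounds + 1)
    return {
        "key_words": key_bytes // 4,
        "rounds": rounds,
        "expanded_key_words": expanded_words,
        "expanded_key_bytes": 4 * expanded_words,
    }


if __name__ == "__main__":
    for size in (16, 24, 32):
        print(size, aes_parameters(size))
```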

Detailed Structure

Figure 6.3 shows the AES cipher in more detail, indicating the sequence of transformations in each round and showing the corresponding decryption function. As was done in Chapter 4, we show encryption proceeding down the page and decryption proceeding up the page.

Before delving into details, we can make several comments about the overall

AES structure.

1. One noteworthy feature of this structure is that it is not a Feistel structure.

Recall that, in the classic Feistel structure, half of the data block is used to

modify the other half of the data block and then the halves are swapped. AES

instead processes the entire data block as a single matrix during each round

using substitutions and permutation.

2. The key that is provided as input is expanded into an array of forty-four 32-bit

words, w[i]. Four distinct words (128 bits) serve as a round key for each round;

these are indicated in Figure 6.3.

3. Four different stages are used, one of permutation and three of substitution:

■ Substitute bytes: Uses an S-box to perform a byte-by-byte substitution of the block.

■ ShiftRows: A simple permutation.

■ MixColumns: A substitution that makes use of arithmetic over GF(2^8).

■ AddRoundKey: A simple bitwise XOR of the current block with a portion of the expanded key.

4. The structure is quite simple. For both encryption and decryption, the cipher begins with an AddRoundKey stage, followed by nine rounds that each include all four stages, followed by a tenth round of three stages. Figure 6.4 depicts the structure of a full encryption round.

5. Only the AddRoundKey stage makes use of the key. For this reason, the cipher

begins and ends with an AddRoundKey stage. Any other stage, applied at the

beginning or end, is reversible without knowledge of the key and so would add

no security.

6. The AddRoundKey stage is, in effect, a form of Vernam cipher and by itself would not be formidable. The other three stages together provide confusion, diffusion, and nonlinearity, but by themselves would provide no security because they do not use the key. We can view the cipher as alternating operations of XOR encryption (AddRoundKey) of a block, followed by scrambling of the block (the other three stages), followed by XOR encryption, and so on. This scheme is both efficient and highly secure.

7. Each stage is easily reversible. For the Substitute Byte, ShiftRows, and MixColumns stages, an inverse function is used in the decryption algorithm. For the AddRoundKey stage, the inverse is achieved by XORing the same round key to the block, using the result that A ⊕ B ⊕ B = A.

8. As with most block ciphers, the decryption algorithm makes use of the expanded key in reverse order. However, the decryption algorithm is not identical to the encryption algorithm. This is a consequence of the particular structure of AES.

9. Once it is established that all four stages are reversible, it is easy to verify that decryption does recover the plaintext. Figure 6.3 lays out encryption and decryption going in opposite vertical directions. At each horizontal point (e.g., the dashed line in the figure), State is the same for both encryption and decryption.

10. The final round of both encryption and decryption consists of only three stages. Again, this is a consequence of the particular structure of AES and is required to make the cipher reversible.

Figure 6.3 AES Encryption and Decryption [figure: (a) Encryption: the 16-byte plaintext passes through an initial Add round key, Rounds 1 through 9 (Substitute bytes, Shift rows, Mix columns, Add round key), and Round 10 (Substitute bytes, Shift rows, Add round key) to give the 16-byte ciphertext; (b) Decryption: the 16-byte ciphertext passes through the stages Add round key, Inverse sub bytes, Inverse shift rows, and Inverse mix cols to recover the 16-byte plaintext; Expand key turns the 16-byte key into the round-key words w[0, 3], w[4, 7], ..., w[36, 39], w[40, 43].]

Figure 6.4 AES Encryption Round [figure: State passes through SubBytes (sixteen S-boxes), ShiftRows, MixColumns (four M blocks, one per column), and AddRoundKey (round-key bytes r0 through r15).]

6.3 AES TRANSFORMATION FUNCTIONS

We now turn to a discussion of each of the four transformations used in AES. For each stage, we describe the forward (encryption) algorithm, the inverse (decryption) algorithm, and the rationale for the stage.

Substitute Bytes Transformation

FORWARD AND INVERSE TRANSFORMATIONS The forward substitute byte transformation, called SubBytes, is a simple table lookup (Figure 6.5a). AES defines a 16 × 16 matrix of byte values, called an S-box (Table 6.2a), that contains a permutation of all possible 256 8-bit values. Each individual byte of State is mapped into a new byte in the following way: The leftmost 4 bits of the byte are used as a row value and the rightmost 4 bits are used as a column value. These row and column values serve as indexes into the S-box to select a unique 8-bit output value. For example, the hexadecimal value {95} references row 9, column 5 of the S-box, which contains the value {2A}. Accordingly, the value {95} is mapped into the value {2A}.
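The row and column indexing can be sketched in a few lines of Python. Only row 9 of the S-box is included here, copied from Table 6.2a, so that the example stays short; the helper name `sbox_indexes` is hypothetical.

```python
# Row 9 of the forward S-box, copied from Table 6.2a.
SBOX_ROW_9 = [0x60, 0x81, 0x4F, 0xDC, 0x22, 0x2A, 0x90, 0x88,
              0x46, 0xEE, 0xB8, 0x14, 0xDE, 0x5E, 0x0B, 0xDB]


def sbox_indexes(byte: int) -> tuple[int, int]:
    """Split a byte into (row, column): leftmost 4 bits, rightmost 4 bits."""
    return byte >> 4, byte & 0x0F


if __name__ == "__main__":
    row, col = sbox_indexes(0x95)
    print(row, col, f"{SBOX_ROW_9[col]:02X}")
```

A full implementation would typically store all 256 entries in one flat array, in which case the byte value itself is the index and no explicit split is needed.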

Figure 6.5 AES Byte-Level Operations [figure: (a) Substitute byte transformation: each State byte s(r,c) is replaced through the S-box, indexed by x and y, to give s′(r,c); (b) Add round key transformation: the State columns are XORed with the round-key words w_i, w_{i+1}, w_{i+2}, w_{i+3} to give the new State.]

y

0 1 2 3 4 5 6 7 8 9 A B C D E F

0 63 7C 77 7B F2 6B 6F C5 30 01 67 2B FE D7 AB 76

1 CA 82 C9 7D FA 59 47 F0 AD D4 A2 AF 9C A4 72 C0

2 B7 FD 93 26 36 3F F7 CC 34 A5 E5 F1 71 D8 31 15

3 04 C7 23 C3 18 96 05 9A 07 12 80 E2 EB 27 B2 75

4 09 83 2C 1A 1B 6E 5A A0 52 3B D6 B3 29 E3 2F 84

5 53 D1 00 ED 20 FC B1 5B 6A CB BE 39 4A 4C 58 CF

6 D0 EF AA FB 43 4D 33 85 45 F9 02 7F 50 3C 9F A8

x

7 51 A3 40 8F 92 9D 38 F5 BC B6 DA 21 10 FF F3 D2

8 CD 0C 13 EC 5F 97 44 17 C4 A7 7E 3D 64 5D 19 73

9 60 81 4F DC 22 2A 90 88 46 EE B8 14 DE 5E 0B DB

A E0 32 3A 0A 49 06 24 5C C2 D3 AC 62 91 95 E4 79

B E7 C8 37 6D 8D D5 4E A9 6C 56 F4 EA 65 7A AE 08

C BA 78 25 2E 1C A6 B4 C6 E8 DD 74 1F 4B BD 8B 8A

D 70 3E B5 66 48 03 F6 0E 61 35 57 B9 86 C1 1D 9E

E E1 F8 98 11 69 D9 8E 94 9B 1E 87 E9 CE 55 28 DF

F 8C A1 89 0D BF E6 42 68 41 99 2D 0F B0 54 BB 16

(a) S-box

y

0 1 2 3 4 5 6 7 8 9 A B C D E F

0 52 09 6A D5 30 36 A5 38 BF 40 A3 9E 81 F3 D7 FB

1 7C E3 39 82 9B 2F FF 87 34 8E 43 44 C4 DE E9 CB

2 54 7B 94 32 A6 C2 23 3D EE 4C 95 0B 42 FA C3 4E

3 08 2E A1 66 28 D9 24 B2 76 5B A2 49 6D 8B D1 25

4 72 F8 F6 64 86 68 98 16 D4 A4 5C CC 5D 65 B6 92

5 6C 70 48 50 FD ED B9 DA 5E 15 46 57 A7 8D 9D 84

6 90 D8 AB 00 8C BC D3 0A F7 E4 58 05 B8 B3 45 06

x

7 D0 2C 1E 8F CA 3F 0F 02 C1 AF BD 03 01 13 8A 6B

8 3A 91 11 41 4F 67 DC EA 97 F2 CF CE F0 B4 E6 73

9 96 AC 74 22 E7 AD 35 85 E2 F9 37 E8 1C 75 DF 6E

A 47 F1 1A 71 1D 29 C5 89 6F B7 62 0E AA 18 BE 1B

B FC 56 3E 4B C6 D2 79 20 9A DB C0 FE 78 CD 5A F4

C 1F DD A8 33 88 07 C7 31 B1 12 10 59 27 80 EC 5F

D 60 51 7F A9 19 B5 4A 0D 2D E5 7A 9F 93 C9 9C EF

E A0 E0 3B 4D AE 2A F5 B0 C8 EB BB 3C 83 53 99 61

F 17 2B 04 7E BA 77 D6 26 E1 69 14 63 55 21 0C 7D

(b) Inverse S-box

Table 6.2 AES S-Boxes

Here is an example of the SubBytes transformation:

EA 04 65 85      87 F2 4D 97
83 45 5D 96  →   EC 6E 4C 90
5C 33 98 B0      4A C3 46 E7
F0 2D AD C5      8C D8 95 A6

The S-box is constructed in the following fashion (Figure 6.6a).

Figure 6.6 Construction of S-Box and IS-Box [figure: (a) Calculation of byte at row y, column x of S-box: the byte initialized to yx is replaced by its inverse in GF(2^8), converted to a bit column vector, passed through the matrix transformation of Equation (6.2), and converted back to the byte S(yx); (b) Calculation of byte at row y, column x of IS-box: the byte initialized to yx is converted to a bit column vector, passed through the inverse matrix transformation, converted back to a byte, and replaced by its inverse in GF(2^8) to give IS(yx).]

1. Initialize the S-box with the byte values in ascending sequence row by row. The first row contains {00}, {01}, {02}, ..., {0F}; the second row contains {10}, {11}, etc.; and so on. Thus, the value of the byte at row y, column x is {yx}.

2. Map each byte in the S-box to its multiplicative inverse in the finite field GF(2^8); the value {00} is mapped to itself.

3. Consider that each byte in the S-box consists of 8 bits labeled (b_7, b_6, b_5, b_4, b_3, b_2, b_1, b_0). Apply the following transformation to each bit of each byte in the S-box:

$$b'_i = b_i \oplus b_{(i+4) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+6) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus c_i \tag{6.1}$$

where c_i is the ith bit of byte c with the value {63}; that is, (c_7 c_6 c_5 c_4 c_3 c_2 c_1 c_0) = (01100011). The prime (′) indicates that the variable is to be updated by the value on the right. The AES standard depicts this transformation in matrix form as follows.

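$$
\begin{bmatrix} b'_0 \\ b'_1 \\ b'_2 \\ b'_3 \\ b'_4 \\ b'_5 \\ b'_6 \\ b'_7 \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix}
+
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
\tag{6.2}
$$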

Equation (6.2) has to be interpreted carefully. In ordinary matrix multiplication (footnote 4), each element in the product matrix is the sum of products of the elements of one row and one column. In this case, each element in the product matrix is the bitwise XOR of products of elements of one row and one column. Furthermore, the final addition shown in Equation (6.2) is a bitwise XOR. Recall from Section 5.6 that the bitwise XOR is addition in GF(2^8).

Footnote 4: For a brief review of the rules of matrix and vector multiplication, refer to Appendix E.

As an example, consider the input value {95}. The multiplicative inverse in GF(2^8) is {95}^{-1} = {8A}, which is 10001010 in binary. Using Equation (6.2),

$$
\begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
\oplus
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}
\oplus
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
$$

The result is {2A}, which should appear in row {09} column {05} of the S-box. This is verified by checking Table 6.2a.
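The construction can be checked with a short Python sketch that follows the same two steps: the multiplicative inverse in GF(2^8), then the bit transformation of Equation (6.1). The function names are illustrative, and the inverse is found by brute-force search purely for clarity; this is not how the text or the standard computes it.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:      # degree 8 reached: reduce by m(x) = 0x11B
            a ^= 0x11B
        b >>= 1
    return result


def gf_inverse(a: int) -> int:
    """Multiplicative inverse by exhaustive search; {00} maps to itself."""
    return next((x for x in range(1, 256) if gf_mul(a, x) == 1), 0)


def affine(b: int, c: int = 0x63) -> int:
    """Apply Equation (6.1) to every bit of the byte b."""
    out = 0
    for i in range(8):
        bit = (c >> i) & 1
        for offset in (0, 4, 5, 6, 7):
            bit ^= (b >> ((i + offset) % 8)) & 1
        out |= bit << i
    return out


def sbox_entry(value: int) -> int:
    return affine(gf_inverse(value))


if __name__ == "__main__":
    print(f"inverse of {{95}} = {{{gf_inverse(0x95):02X}}}")
    print(f"S-box({{95}}) = {{{sbox_entry(0x95):02X}}}")
```

Calling `sbox_entry` for all 256 inputs would regenerate Table 6.2a; production code normally ships the table as a constant instead.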

The inverse substitute byte transformation, called InvSubBytes, makes use of the inverse S-box shown in Table 6.2b. Note, for example, that the input {2A} produces the output {95}, and the input {95} to the S-box produces {2A}. The inverse S-box is constructed (Figure 6.6b) by applying the inverse of the transformation in Equation (6.1) followed by taking the multiplicative inverse in GF(2^8). The inverse transformation is

$$b'_i = b_{(i+2) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus d_i$$

where byte d = {05}, or 00000101. We can depict this transformation as follows.

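$$
\begin{bmatrix} b'_0 \\ b'_1 \\ b'_2 \\ b'_3 \\ b'_4 \\ b'_5 \\ b'_6 \\ b'_7 \end{bmatrix}
=
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix}
+
\begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
$$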

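To see that InvSubBytes is the inverse of SubBytes, label the matrices in SubBytes and InvSubBytes as X and Y, respectively, and the vector versions of constants c and d as C and D, respectively. For some 8-bit vector B, Equation (6.2) becomes B′ = XB ⊕ C. We need to show that Y(XB ⊕ C) ⊕ D = B. Multiplying out, we must show YXB ⊕ YC ⊕ D = B. This becomes

$$
\begin{aligned}
&\begin{bmatrix}
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix}
\oplus
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
\oplus
\begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \\[1ex]
&=
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix}
\oplus
\begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\oplus
\begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
=
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix}
\end{aligned}
$$

We have demonstrated that YX equals the identity matrix, and that YC = D, so that YC ⊕ D equals the null vector.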
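Both claims, YX = I and YC = D, can be confirmed mechanically. The sketch below does the arithmetic over GF(2) in Python. It relies on the observation that each row of X and Y is the previous row rotated right by one position, and the helper names are hypothetical.

```python
def circulant(first_row: list[int]) -> list[list[int]]:
    """Build a matrix whose row i is first_row rotated right by i positions."""
    n = len(first_row)
    return [first_row[n - i:] + first_row[:n - i] for i in range(n)]


X = circulant([1, 0, 0, 0, 1, 1, 1, 1])   # SubBytes matrix, Equation (6.2)
Y = circulant([0, 0, 1, 0, 0, 1, 0, 1])   # InvSubBytes matrix
C = [1, 1, 0, 0, 0, 1, 1, 0]              # {63}, listed from bit 0 to bit 7
D = [1, 0, 1, 0, 0, 0, 0, 0]              # {05}, listed from bit 0 to bit 7


def mat_mul_gf2(a, b):
    """Matrix product where sums are XORs and products are ANDs."""
    n = len(a)
    return [[sum(a[i][k] & b[k][j] for k in range(n)) % 2 for j in range(n)]
            for i in range(n)]


def mat_vec_gf2(m, v):
    return [sum(row[k] & v[k] for k in range(len(v))) % 2 for row in m]


IDENTITY = [[int(i == j) for j in range(8)] for i in range(8)]

if __name__ == "__main__":
    print("YX == I:", mat_mul_gf2(Y, X) == IDENTITY)
    print("YC == D:", mat_vec_gf2(Y, C) == D)
```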

RATIONALE The S-box is designed to be resistant to known cryptanalytic attacks. Specifically, the Rijndael developers sought a design that has a low correlation between input bits and output bits and the property that the output is not a linear mathematical function of the input [DAEM01]. The nonlinearity is due to the use of the multiplicative inverse. In addition, the constant in Equation (6.1) was chosen so that the S-box has no fixed points [S-box(a) = a] and no "opposite fixed points" [S-box(a) = ā], where ā is the bitwise complement of a.

Of course, the S-box must be invertible, that is, IS-box[S-box(a)] = a. However, the S-box does not self-inverse in the sense that it is not true that S-box(a) = IS-box(a). For example, S-box({95}) = {2A}, but IS-box({95}) = {AD}.

ShiftRows Transformation

FORWARD AND INVERSE TRANSFORMATIONS The forward shift row transformation, called ShiftRows, is depicted in Figure 6.7a. The first row of State is not altered. For the second row, a 1-byte circular left shift is performed. For the third row, a 2-byte circular left shift is performed. For the fourth row, a 3-byte circular left shift is performed. The following is an example of ShiftRows.

87 F2 4D 97      87 F2 4D 97
EC 6E 4C 90  →   6E 4C 90 EC
4A C3 46 E7      46 E7 4A C3
8C D8 95 A6      A6 8C D8 95

The inverse shift row transformation, called InvShiftRows, performs the circular shifts in the opposite direction for each of the last three rows, with a 1-byte circular right shift for the second row, and so on.
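As a minimal sketch, both directions can be written as list rotations in Python, with State held as a list of four rows. The function names `shift_rows` and `inv_shift_rows` are illustrative.

```python
def shift_rows(state: list[list[int]]) -> list[list[int]]:
    """Rotate row r of State left by r bytes (row 0 is unchanged)."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def inv_shift_rows(state: list[list[int]]) -> list[list[int]]:
    """Rotate row r of State right by r bytes, undoing shift_rows."""
    return [row[len(row) - r:] + row[:len(row) - r] for r, row in enumerate(state)]


if __name__ == "__main__":
    state = [
        [0x87, 0xF2, 0x4D, 0x97],
        [0xEC, 0x6E, 0x4C, 0x90],
        [0x4A, 0xC3, 0x46, 0xE7],
        [0x8C, 0xD8, 0x95, 0xA6],
    ]
    for row in shift_rows(state):
        print(" ".join(f"{b:02X}" for b in row))
```

The input in the usage example is the left-hand matrix of the ShiftRows example above.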

RATIONALE The shift row transformation is more substantial than it may first appear. This is because the State, as well as the cipher input and output, is treated as an array of four 4-byte columns. Thus, on encryption, the first 4 bytes of the plaintext are copied to the first column of State, and so on. Furthermore, as will be seen, the round key is applied to State column by column. Thus, a row shift moves an individual byte from one column to another, which is a linear distance of a multiple of 4 bytes. Also note that the transformation ensures that the 4 bytes of one column are spread out to four different columns. Figure 6.4 illustrates the effect.

MixColumns Transformation

FORWARD AND INVERSE TRANSFORMATIONS The forward mix column transformation, called MixColumns, operates on each column individually. Each byte of a column is mapped into a new value that is a function of all four bytes in that column. The transformation can be defined by the following matrix multiplication on State (Figure 6.7b):

$$
\begin{bmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{bmatrix}
\begin{bmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\
s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\
s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3}
\end{bmatrix}
=
\begin{bmatrix}
s'_{0,0} & s'_{0,1} & s'_{0,2} & s'_{0,3} \\
s'_{1,0} & s'_{1,1} & s'_{1,2} & s'_{1,3} \\
s'_{2,0} & s'_{2,1} & s'_{2,2} & s'_{2,3} \\
s'_{3,0} & s'_{3,1} & s'_{3,2} & s'_{3,3}
\end{bmatrix}
\tag{6.3}
$$

Each element in the product matrix is the sum of products of elements of one row and one column. In this case, the individual additions and multiplications (footnote 5) are performed in GF(2^8). The MixColumns transformation on a single column of State can be expressed as

$$
\begin{aligned}
s'_{0,j} &= (2 \cdot s_{0,j}) \oplus (3 \cdot s_{1,j}) \oplus s_{2,j} \oplus s_{3,j} \\
s'_{1,j} &= s_{0,j} \oplus (2 \cdot s_{1,j}) \oplus (3 \cdot s_{2,j}) \oplus s_{3,j} \\
s'_{2,j} &= s_{0,j} \oplus s_{1,j} \oplus (2 \cdot s_{2,j}) \oplus (3 \cdot s_{3,j}) \\
s'_{3,j} &= (3 \cdot s_{0,j}) \oplus s_{1,j} \oplus s_{2,j} \oplus (2 \cdot s_{3,j})
\end{aligned}
\tag{6.4}
$$

Footnote 5: We follow the convention of FIPS PUB 197 and use the symbol · to indicate multiplication over the finite field GF(2^8) and ⊕ to indicate bitwise XOR, which corresponds to addition in GF(2^8).

Figure 6.7 AES Row and Column Operations [figure: (a) Shift row transformation: row r of State is rotated left by r bytes; (b) Mix column transformation: each column of State is multiplied by the matrix with rows (2 3 1 1), (1 2 3 1), (1 1 2 3), (3 1 1 2) to give the corresponding column of the new State.]

The following is an example of MixColumns:

87 F2 4D 97      47 40 A3 4C
6E 4C 90 EC  →   37 D4 70 9F
46 E7 4A C3      94 E4 3A 42
A6 8C D8 95      ED A5 A6 BC

Let us verify the first column of this example. Recall from Section 5.6 that, in GF(2^8), addition is the bitwise XOR operation and that multiplication can be performed according to the rule established in Equation (4.14). In particular, multiplication of a value by x (i.e., by {02}) can be implemented as a 1-bit left shift followed by a conditional bitwise XOR with (0001 1011) if the leftmost bit of the original value (prior to the shift) is 1. Thus, to verify the MixColumns transformation on the first column, we need to show that

({02} · {87}) ⊕ ({03} · {6E}) ⊕ {46} ⊕ {A6} = {47}
{87} ⊕ ({02} · {6E}) ⊕ ({03} · {46}) ⊕ {A6} = {37}
{87} ⊕ {6E} ⊕ ({02} · {46}) ⊕ ({03} · {A6}) = {94}
({03} · {87}) ⊕ {6E} ⊕ {46} ⊕ ({02} · {A6}) = {ED}

For the first equation, we have {02} · {87} = (0000 1110) ⊕ (0001 1011) = (0001 0101) and {03} · {6E} = {6E} ⊕ ({02} · {6E}) = (0110 1110) ⊕ (1101 1100) = (1011 0010). Then,

{02} · {87} = 0001 0101
{03} · {6E} = 1011 0010
{46}        = 0100 0110
{A6}        = 1010 0110
              ---------
              0100 0111 = {47}

The other equations can be similarly verified.
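The remaining three equations can be checked with a short Python sketch of Equation (6.4) applied to one column. The names `xtime` and `mix_single_column` are illustrative; multiplication by {03} is written as a multiplication by {02} followed by an XOR with the original value, as in the verification above.

```python
def xtime(a: int) -> int:
    """Multiply by {02} in GF(2^8): 1-bit left shift, conditional XOR with 0x1B."""
    shifted = (a << 1) & 0xFF
    return shifted ^ 0x1B if a & 0x80 else shifted


def mix_single_column(column: list[int]) -> list[int]:
    """Apply Equation (6.4) to one 4-byte column of State."""
    s0, s1, s2, s3 = column

    def times3(v: int) -> int:
        # {03} * v = ({02} * v) XOR v
        return xtime(v) ^ v

    return [
        xtime(s0) ^ times3(s1) ^ s2 ^ s3,
        s0 ^ xtime(s1) ^ times3(s2) ^ s3,
        s0 ^ s1 ^ xtime(s2) ^ times3(s3),
        times3(s0) ^ s1 ^ s2 ^ xtime(s3),
    ]


if __name__ == "__main__":
    first_column = [0x87, 0x6E, 0x46, 0xA6]
    print(" ".join(f"{b:02X}" for b in mix_single_column(first_column)))
```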

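The inverse mix column transformation, called InvMixColumns, is defined by the following matrix multiplication:

$$
\begin{bmatrix}
\text{0E} & \text{0B} & \text{0D} & \text{09} \\
\text{09} & \text{0E} & \text{0B} & \text{0D} \\
\text{0D} & \text{09} & \text{0E} & \text{0B} \\
\text{0B} & \text{0D} & \text{09} & \text{0E}
\end{bmatrix}
\begin{bmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\
s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\
s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3}
\end{bmatrix}
=
\begin{bmatrix}
s'_{0,0} & s'_{0,1} & s'_{0,2} & s'_{0,3} \\
s'_{1,0} & s'_{1,1} & s'_{1,2} & s'_{1,3} \\
s'_{2,0} & s'_{2,1} & s'_{2,2} & s'_{2,3} \\
s'_{3,0} & s'_{3,1} & s'_{3,2} & s'_{3,3}
\end{bmatrix}
\tag{6.5}
$$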

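It is not immediately clear that Equation (6.5) is the inverse of Equation (6.3). We need to show

$$
\begin{bmatrix}
\text{0E} & \text{0B} & \text{0D} & \text{09} \\
\text{09} & \text{0E} & \text{0B} & \text{0D} \\
\text{0D} & \text{09} & \text{0E} & \text{0B} \\
\text{0B} & \text{0D} & \text{09} & \text{0E}
\end{bmatrix}
\begin{bmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{bmatrix}
\begin{bmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\
s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\
s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3}
\end{bmatrix}
=
\begin{bmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\
s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\
s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3}
\end{bmatrix}
$$

which is equivalent to showing

$$
\begin{bmatrix}
\text{0E} & \text{0B} & \text{0D} & \text{09} \\
\text{09} & \text{0E} & \text{0B} & \text{0D} \\
\text{0D} & \text{09} & \text{0E} & \text{0B} \\
\text{0B} & \text{0D} & \text{09} & \text{0E}
\end{bmatrix}
\begin{bmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{6.6}
$$

That is, the inverse transformation matrix times the forward transformation matrix equals the identity matrix. To verify the first column of Equation (6.6), we need to show

({0E} · {02}) ⊕ {0B} ⊕ {0D} ⊕ ({09} · {03}) = {01}
({09} · {02}) ⊕ {0E} ⊕ {0B} ⊕ ({0D} · {03}) = {00}
({0D} · {02}) ⊕ {09} ⊕ {0E} ⊕ ({0B} · {03}) = {00}
({0B} · {02}) ⊕ {0D} ⊕ {09} ⊕ ({0E} · {03}) = {00}

For the first equation, we have {0E} · {02} = 00011100 and {09} · {03} = {09} ⊕ ({09} · {02}) = 00001001 ⊕ 00010010 = 00011011. Then

{0E} · {02} = 00011100
{0B}        = 00001011
{0D}        = 00001101
{09} · {03} = 00011011
              --------
              00000001

The other equations can be similarly verified.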
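All sixteen entries of Equation (6.6) can be checked at once with a small Python sketch that multiplies the two matrices using GF(2^8) arithmetic. The names `gf_mul` and `gf_matmul` are illustrative.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_matmul(a, b):
    """Matrix product in which additions are XORs and products use gf_mul."""
    size = len(a)
    product = [[0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            for k in range(size):
                product[i][j] ^= gf_mul(a[i][k], b[k][j])
    return product


FORWARD = [[0x02, 0x03, 0x01, 0x01],
           [0x01, 0x02, 0x03, 0x01],
           [0x01, 0x01, 0x02, 0x03],
           [0x03, 0x01, 0x01, 0x02]]

INVERSE = [[0x0E, 0x0B, 0x0D, 0x09],
           [0x09, 0x0E, 0x0B, 0x0D],
           [0x0D, 0x09, 0x0E, 0x0B],
           [0x0B, 0x0D, 0x09, 0x0E]]

if __name__ == "__main__":
    for row in gf_matmul(INVERSE, FORWARD):
        print(row)
```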

The AES document describes another way of characterizing the MixColumns transformation, which is in terms of polynomial arithmetic. In the standard, MixColumns is defined by considering each column of State to be a four-term polynomial with coefficients in GF(2^8). Each column is multiplied modulo (x^4 + 1) by the fixed polynomial a(x), given by

$$a(x) = \{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\} \tag{6.7}$$

Appendix 5A demonstrates that multiplication of each column of State by a(x) can be written as the matrix multiplication of Equation (6.3). Similarly, it can be seen that the transformation in Equation (6.5) corresponds to treating each column as a four-term polynomial and multiplying each column by b(x), given by

$$b(x) = \{0B\}x^3 + \{0D\}x^2 + \{09\}x + \{0E\} \tag{6.8}$$

It readily can be shown that b(x) = a^{-1}(x) mod (x^4 + 1).

RATIONALE The coefficients of the matrix in Equation (6.3) are based on a linear code with maximal distance between code words, which ensures a good mixing among the bytes of each column. The mix column transformation combined with the shift row transformation ensures that after a few rounds all output bits depend on all input bits. See [DAEM99] for a discussion.

In addition, the choice of coefficients in MixColumns, which are all {01}, {02}, or {03}, was influenced by implementation considerations. As was discussed, multiplication by these coefficients involves at most a shift and an XOR. The coefficients in InvMixColumns are more formidable to implement. However, encryption was deemed more important than decryption for two reasons:

1. For the CFB and OFB cipher modes (Figures 7.5 and 7.6; described in Chapter 7), only encryption is used.

2. As with any block cipher, AES can be used to construct a message authentication code (Chapter 13), and for this, only encryption is used.

AddRoundKey Transformation

FORWARD AND INVERSE TRANSFORMATIONS In the forward add round key transformation, called AddRoundKey, the 128 bits of State are bitwise XORed with the 128 bits of the round key. As shown in Figure 6.5b, the operation is viewed as a columnwise operation between the 4 bytes of a State column and one word of the round key; it can also be viewed as a byte-level operation. The following is an example of AddRoundKey:

47 40 A3 4C      AC 19 28 57      EB 59 8B 1B
37 D4 70 9F  ⊕   77 FA D1 5C  =   40 2E A1 C3
94 E4 3A 42      66 DC 29 00      F2 38 13 42
ED A5 A6 BC      F3 21 41 6A      1E 84 E7 D6

The first matrix is State, and the second matrix is the round key.

The inverse add round key transformation is identical to the forward add round key transformation, because the XOR operation is its own inverse.
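A minimal Python sketch of the byte-level view follows, using the State and round key from the example above. The function name `add_round_key` is illustrative. Applying it twice with the same round key returns the original State, which is the self-inverse property just described.

```python
def add_round_key(state, round_key):
    """XOR every byte of State with the matching byte of the round key."""
    return [[s ^ k for s, k in zip(state_row, key_row)]
            for state_row, key_row in zip(state, round_key)]


STATE = [[0x47, 0x40, 0xA3, 0x4C],
         [0x37, 0xD4, 0x70, 0x9F],
         [0x94, 0xE4, 0x3A, 0x42],
         [0xED, 0xA5, 0xA6, 0xBC]]

ROUND_KEY = [[0xAC, 0x19, 0x28, 0x57],
             [0x77, 0xFA, 0xD1, 0x5C],
             [0x66, 0xDC, 0x29, 0x00],
             [0xF3, 0x21, 0x41, 0x6A]]

if __name__ == "__main__":
    result = add_round_key(STATE, ROUND_KEY)
    for row in result:
        print(" ".join(f"{b:02X}" for b in row))
    # XORing with the same round key again undoes the transformation.
    print(add_round_key(result, ROUND_KEY) == STATE)
```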

RATIONALE The add round key transformation is as simple as possible and affects every bit of State. The complexity of the round key expansion, plus the complexity of the other stages of AES, ensure security.

Figure 6.8 is another view of a single round of AES, emphasizing the mechanisms and inputs of each transformation.

6.4 AES KEY EXPANSION

Key Expansion Algorithm

The AES key expansion algorithm takes as input a four-word (16-byte) key and produces a linear array of 44 words (176 bytes). This is sufficient to provide a four-word round key for the initial AddRoundKey stage and each of the 10 rounds of the cipher. The pseudocode that follows describes the expansion.

The key is copied into the first four words of the expanded key. The remainder of the expanded key is filled in four words at a time. Each added word w[i] depends on the immediately preceding word, w[i – 1], and the word four positions back, w[i – 4]. In three out of four cases, a simple XOR is used. For a word whose position in the w array is a multiple of 4, a more complex function is used. Figure 6.9 illustrates the generation of the expanded key, using the symbol g to represent that complex function. The function g consists of the following subfunctions.

Figure 6.8 Inputs for Single AES Round [figure: the State matrix at the beginning of the round passes through SubBytes, ShiftRows, MixColumns, and AddRoundKey to give the State matrix at the end of the round; the constant inputs are the S-box (for SubBytes) and the MixColumns matrix of Equation (6.3); the variable input is the round key (for AddRoundKey).]

KeyExpansion (byte key[16], word w[44])
{
    word temp
    for (i = 0; i < 4; i++)
        w[i] = (key[4*i], key[4*i+1], key[4*i+2], key[4*i+3]);
    for (i = 4; i < 44; i++)
    {
        temp = w[i − 1];
        if (i mod 4 = 0)
            temp = SubWord (RotWord (temp)) ⊕ Rcon[i/4];
        w[i] = w[i − 4] ⊕ temp
    }
}
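Figure 6.9 AES Key Expansion [figure: (a) Overall algorithm: the key bytes k0 through k15 form the words w0, w1, w2, w3; each later group of four words (w4 through w7, ..., w40 through w43) is produced from the previous group by XORs, with the function g applied to the last word of the previous group; (b) Function g: the word (B0, B1, B2, B3) is rotated to (B1, B2, B3, B0), each byte is passed through the S-box, and the result is XORed with (RC_j, 0, 0, 0).]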
1. RotWord performs a one-byte circular left shift on a word. This means that an input word [B0, B1, B2, B3] is transformed into [B1, B2, B3, B0].

2. SubWord performs a byte substitution on each byte of its input word, using the S-box (Table 6.2a).

3. The result of steps 1 and 2 is XORed with a round constant, Rcon[j].

The round constant is a word in which the three rightmost bytes are always 0. Thus, the effect of an XOR of a word with Rcon is to only perform an XOR on the leftmost byte of the word. The round constant is different for each round and is defined as Rcon[j] = (RC[j], 0, 0, 0), with RC[1] = 1, RC[j] = 2 · RC[j – 1] and with multiplication defined over the field GF(2^8). The values of RC[j] in hexadecimal are

j       1   2   3   4   5   6   7   8   9   10
RC[j]   01  02  04  08  10  20  40  80  1B  36

For example, suppose that the round key for round 8 is

EA D2 73 21 B5 8D BA D2 31 2B F5 60 7F 8D 29 2F

Then the first 4 bytes (first column) of the round key for round 9 are calculated as follows:

i (decimal)                 36
temp                        7F8D292F
After RotWord               8D292F7F
After SubWord               5DA515D2
Rcon (9)                    1B000000
After XOR with Rcon         46A515D2
w[i – 4]                    EAD27321
w[i] = temp ⊕ w[i – 4]      AC7766F3
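The key expansion can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than a reference implementation: words are held as 32-bit integers, the S-box is regenerated from its definition (with a brute-force inverse search) to keep the block self-contained, and all function names are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox() -> list[int]:
    """Regenerate Table 6.2a: multiplicative inverse, then Equation (6.1)."""
    sbox = []
    for value in range(256):
        inv = next((x for x in range(1, 256) if gf_mul(value, x) == 1), 0)
        out = 0
        for i in range(8):
            bit = (0x63 >> i) & 1
            for offset in (0, 4, 5, 6, 7):
                bit ^= (inv >> ((i + offset) % 8)) & 1
            out |= bit << i
        sbox.append(out)
    return sbox


SBOX = build_sbox()


def rot_word(word: int) -> int:
    """One-byte circular left shift: [B0, B1, B2, B3] -> [B1, B2, B3, B0]."""
    return ((word << 8) & 0xFFFFFFFF) | (word >> 24)


def sub_word(word: int) -> int:
    """Apply the S-box to each of the four bytes of a word."""
    return int.from_bytes(bytes(SBOX[b] for b in word.to_bytes(4, "big")), "big")


def round_constants(count: int) -> list[int]:
    """RC[1] = 1 and RC[j] = 2 * RC[j - 1] in GF(2^8)."""
    values, rc = [], 1
    for _ in range(count):
        values.append(rc)
        rc = gf_mul(rc, 2)
    return values


def key_expansion(key: bytes) -> list[int]:
    """Expand a 16-byte key into 44 words, following the pseudocode above."""
    w = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(4)]
    rc = round_constants(10)
    for i in range(4, 44):
        temp = w[i - 1]
        if i % 4 == 0:
            # Rcon[i/4] = (RC[i/4], 0, 0, 0): only the leftmost byte is nonzero.
            temp = sub_word(rot_word(temp)) ^ (rc[i // 4 - 1] << 24)
        w.append(w[i - 4] ^ temp)
    return w


if __name__ == "__main__":
    print(" ".join(f"{rc:02X}" for rc in round_constants(10)))
    step = sub_word(rot_word(0x7F8D292F)) ^ (0x1B << 24) ^ 0xEAD27321
    print(f"{step:08X}")
```

The usage example reproduces the RC[j] row and the worked calculation for i = 36 shown above.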
Rationale
The Rijndael developers designed the expansion key algorithm to be resistant to
known cryptanalytic attacks. The inclusion of a round-dependent round constant
eliminates the symmetry, or similarity, between the ways in which round keys are
generated in different rounds. The specific criteria that were used are [DAEM99]
■ Knowledge of a part of the cipher key or round key does not enable calcula-
tion of many other round-key bits.
■ An invertible transformation [i.e., knowledge of any Nk consecutive words of
the expanded key enables regeneration of the entire expanded key (Nk = key
size in words)].
■ Speed on a wide range of processors.
■ Usage of round constants to eliminate symmetries.
■ Diffusion of cipher key differences into the round keys; that is, each key bit
affects many round key bits.
■ Enough nonlinearity to prohibit the full determination of round key differ-
ences from cipher key differences only.
■ Simplicity of description.
The authors do not quantify the first point on the preceding list, but the idea
is that if you know fewer than Nk consecutive words of either the cipher key or one of
the round keys, then it is difficult to reconstruct the remaining unknown bits. The
fewer bits one knows, the more difficult it is to do the reconstruction or to
determine other bits in the key expansion.
6.5 AN AES EXAMPLE
We now work through an example and consider some of its implications. Although
you are not expected to duplicate the example by hand, you will find it informative
to study the hex patterns that occur from one step to the next.
For this example, the plaintext is a hexadecimal palindrome. The plaintext,
key, and resulting ciphertext are
Plaintext: 0123456789abcdeffedcba9876543210
Key: 0f1571c947d9e8590cb7add6af7f6798
Ciphertext: ff0b844a0853bf7c6934ab4364148fb9
Results
Table 6.3 shows the expansion of the 16-byte key into 10 round keys. As previously
explained, this process is performed word by word, with each four-byte word
occupying one column of the word round-key matrix. The left-hand column shows
the four round-key words generated for each round. The right-hand column shows
the steps used to generate the auxiliary word used in key expansion. We begin, of
course, with the key itself serving as the round key for round 0.

Table 6.3 Key Expansion for AES Example
Key Words                           Auxiliary Function
w0 = 0f 15 71 c9                    RotWord (w3) = 7f 67 98 af = x1
w1 = 47 d9 e8 59                    SubWord (x1) = d2 85 46 79 = y1
w2 = 0c b7 ad d6                    Rcon (1) = 01 00 00 00
w3 = af 7f 67 98                    y1 ⊕ Rcon (1) = d3 85 46 79 = z1

w4 = w0 ⊕ z1 = dc 90 37 b0          RotWord (w7) = 81 15 a7 38 = x2
w5 = w4 ⊕ w1 = 9b 49 df e9          SubWord (x2) = 0c 59 5c 07 = y2
w6 = w5 ⊕ w2 = 97 fe 72 3f          Rcon (2) = 02 00 00 00
w7 = w6 ⊕ w3 = 38 81 15 a7          y2 ⊕ Rcon (2) = 0e 59 5c 07 = z2

w8 = w4 ⊕ z2 = d2 c9 6b b7          RotWord (w11) = ff d3 c6 e6 = x3
w9 = w8 ⊕ w5 = 49 80 b4 5e          SubWord (x3) = 16 66 b4 8e = y3
w10 = w9 ⊕ w6 = de 7e c6 61         Rcon (3) = 04 00 00 00
w11 = w10 ⊕ w7 = e6 ff d3 c6        y3 ⊕ Rcon (3) = 12 66 b4 8e = z3

w12 = w8 ⊕ z3 = c0 af df 39         RotWord (w15) = ae 7e c0 b1 = x4
w13 = w12 ⊕ w9 = 89 2f 6b 67        SubWord (x4) = e4 f3 ba c8 = y4
w14 = w13 ⊕ w10 = 57 51 ad 06       Rcon (4) = 08 00 00 00
w15 = w14 ⊕ w11 = b1 ae 7e c0       y4 ⊕ Rcon (4) = ec f3 ba c8 = z4

w16 = w12 ⊕ z4 = 2c 5c 65 f1        RotWord (w19) = 8c dd 50 43 = x5
w17 = w16 ⊕ w13 = a5 73 0e 96       SubWord (x5) = 64 c1 53 1a = y5
w18 = w17 ⊕ w14 = f2 22 a3 90       Rcon (5) = 10 00 00 00
w19 = w18 ⊕ w15 = 43 8c dd 50       y5 ⊕ Rcon (5) = 74 c1 53 1a = z5

w20 = w16 ⊕ z5 = 58 9d 36 eb        RotWord (w23) = 40 46 bd 4c = x6
w21 = w20 ⊕ w17 = fd ee 38 7d       SubWord (x6) = 09 5a 7a 29 = y6
w22 = w21 ⊕ w18 = 0f cc 9b ed       Rcon (6) = 20 00 00 00
w23 = w22 ⊕ w19 = 4c 40 46 bd       y6 ⊕ Rcon (6) = 29 5a 7a 29 = z6

w24 = w20 ⊕ z6 = 71 c7 4c c2        RotWord (w27) = a5 a9 ef cf = x7
w25 = w24 ⊕ w21 = 8c 29 74 bf       SubWord (x7) = 06 d3 df 8a = y7
w26 = w25 ⊕ w22 = 83 e5 ef 52       Rcon (7) = 40 00 00 00
w27 = w26 ⊕ w23 = cf a5 a9 ef       y7 ⊕ Rcon (7) = 46 d3 df 8a = z7

w28 = w24 ⊕ z7 = 37 14 93 48        RotWord (w31) = 7d a1 4a f7 = x8
w29 = w28 ⊕ w25 = bb 3d e7 f7       SubWord (x8) = ff 32 d6 68 = y8
w30 = w29 ⊕ w26 = 38 d8 08 a5       Rcon (8) = 80 00 00 00
w31 = w30 ⊕ w27 = f7 7d a1 4a       y8 ⊕ Rcon (8) = 7f 32 d6 68 = z8

w32 = w28 ⊕ z8 = 48 26 45 20        RotWord (w35) = be 0b 38 3c = x9
w33 = w32 ⊕ w29 = f3 1b a2 d7       SubWord (x9) = ae 2b 07 eb = y9
w34 = w33 ⊕ w30 = cb c3 aa 72       Rcon (9) = 1B 00 00 00
w35 = w34 ⊕ w31 = 3c be 0b 38       y9 ⊕ Rcon (9) = b5 2b 07 eb = z9

w36 = w32 ⊕ z9 = fd 0d 42 cb        RotWord (w39) = 6b 41 56 f9 = x10
w37 = w36 ⊕ w33 = 0e 16 e0 1c       SubWord (x10) = 7f 83 b1 99 = y10
w38 = w37 ⊕ w34 = c5 d5 4a 6e       Rcon (10) = 36 00 00 00
w39 = w38 ⊕ w35 = f9 6b 41 56       y10 ⊕ Rcon (10) = 49 83 b1 99 = z10

w40 = w36 ⊕ z10 = b4 8e f3 52
w41 = w40 ⊕ w37 = ba 98 13 4e
w42 = w41 ⊕ w38 = 7f 4d 59 20
w43 = w42 ⊕ w39 = 86 26 18 76
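The key expansion traced in Table 6.3 can be reproduced with a short Python sketch. It is a minimal illustration rather than a reference implementation: the function names (`gf_mul`, `build_sbox`, `rot_word`, `sub_word`, `expand_key_128`) are assumptions introduced here, and the S-box is derived from its algebraic definition (inverse in GF(2^8) followed by the affine transformation) instead of being copied from Table 6.2a.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return product


def build_sbox():
    """Derive the S-box: multiplicative inverse, then the affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):      # x^254 is the inverse of x in GF(2^8)
                inv = gf_mul(inv, x)
        y = inv
        for shift in (1, 2, 3, 4):    # XOR with four left rotations of inv
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(y ^ 0x63)
    return sbox


SBOX = build_sbox()


def rot_word(word):
    """One-byte circular left shift of a 32-bit word."""
    return ((word << 8) & 0xFFFFFFFF) | (word >> 24)


def sub_word(word):
    """Apply the S-box to each of the four bytes of a word."""
    return int.from_bytes(bytes(SBOX[b] for b in word.to_bytes(4, "big")), "big")


def expand_key_128(key):
    """Return the 44 words w[0..43] of the AES-128 key schedule."""
    w = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(4)]
    rc = 0x01
    for i in range(4, 44):
        temp = w[i - 1]
        if i % 4 == 0:
            # Auxiliary function: RotWord, SubWord, then XOR with Rcon
            temp = sub_word(rot_word(temp)) ^ (rc << 24)
            rc = gf_mul(rc, 0x02)
        w.append(w[i - 4] ^ temp)
    return w


words = expand_key_128(bytes.fromhex("0f1571c947d9e8590cb7add6af7f6798"))
for i in range(40, 44):
    print(f"w{i} = {words[i]:08x}")
```

With the example key, the last four words printed should agree with w40 through w43 in Table 6.3.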
Next, Table 6.4 shows the progression of State through the AES encryption
process. The first column shows the value of State at the start of a round. For the
first row, State is just the matrix arrangement of the plaintext. The second, third, and
fourth columns show the value of State for that round after the SubBytes, ShiftRows,
and MixColumns transformations, respectively. The fifth column shows the round
key. You can verify that these round keys equate with those shown in Table 6.3. The
first column shows the value of State resulting from the bitwise XOR of State after
the preceding MixColumns with the round key for the preceding round.
Table 6.4 AES Example
Start of Round    After SubBytes    After ShiftRows   After MixColumns  Round Key
01 89 fe 76                                                             0f 47 0c af
23 ab dc 54                                                             15 d9 b7 7f
45 cd ba 32                                                             71 e8 ad 67
67 ef 98 10                                                             c9 59 d6 98

0e ce f2 d9       ab 8b 89 35       ab 8b 89 35       b9 94 57 75       dc 9b 97 38
36 72 6b 2b       05 40 7f f1       40 7f f1 05       e4 8e 16 51       90 49 fe 81
34 25 17 55       18 3f f0 fc       f0 fc 18 3f       47 20 9a 3f       37 df 72 15
ae b6 4e 88       e4 4e 2f c4       c4 e4 4e 2f       c5 d6 f5 3b       b0 e9 3f a7

65 0f c0 4d       4d 76 ba e3       4d 76 ba e3       8e 22 db 12       d2 49 de e6
74 c7 e8 d0       92 c6 9b 70       c6 9b 70 92       b2 f2 dc 92       c9 80 7e ff
70 ff e8 2a       51 16 9b e5       9b e5 51 16       df 80 f7 c1       6b b4 c6 d3
75 3f ca 9c       9d 75 74 de       de 9d 75 74       2d c5 1e 52       b7 5e 61 c6

5c 6b 05 f4       4a 7f 6b bf       4a 7f 6b bf       b1 c1 0b cc       c0 89 57 b1
7b 72 a2 6d       21 40 3a 3c       40 3a 3c 21       ba f3 8b 07       af 2f 51 ae
b4 34 31 12       8d 18 c7 c9       c7 c9 8d 18       f9 1f 6a c3       df 6b ad 7e
9a 9b 7f 94       b8 14 d2 22       22 b8 14 d2       1d 19 24 5c       39 67 06 c0

71 48 5c 7d       a3 52 4a ff       a3 52 4a ff       d4 11 fe 0f       2c a5 f2 43
15 dc da a9       59 86 57 d3       86 57 d3 59       3b 44 06 73       5c 73 22 8c
26 74 c7 bd       f7 92 c6 7a       c6 7a f7 92       cb ab 62 37       65 0e a3 dd
24 7e 22 9c       36 f3 93 de       de 36 f3 93       19 b7 07 ec       f1 96 90 50

f8 b4 0c 4c       41 8d fe 29       41 8d fe 29       2a 47 c4 48       58 fd 0f 4c
67 37 24 ff       85 9a 36 16       9a 36 16 85       83 e8 18 ba       9d ee cc 40
ae a5 c1 ea       e4 06 78 87       78 87 e4 06       84 18 27 23       36 38 9b 46
e8 21 97 bc       9b fd 88 65       65 9b fd 88       eb 10 0a f3       eb 7d ed bd

72 ba cb 04       40 f4 1f f2       40 f4 1f f2       7b 05 42 4a       71 8c 83 cf
1e 06 d4 fa       72 6f 48 2d       6f 48 2d 72       1e d0 20 40       c7 29 e5 a5
b2 20 bc 65       37 b7 65 4d       65 4d 37 b7       94 83 18 52       4c 74 ef a9
00 6d e7 4e       63 3c 94 2f       2f 63 3c 94       94 c4 43 fb       c2 bf 52 ef

0a 89 c1 85       67 a7 78 97       67 a7 78 97       ec 1a c0 80       37 bb 38 f7
d9 f9 c5 e5       35 99 a6 d9       99 a6 d9 35       0c 50 53 c7       14 3d d8 7d
d8 f7 f7 fb       61 68 68 0f       68 0f 61 68       3b d7 00 ef       93 e7 08 a1
56 7b 11 14       b1 21 82 fa       fa b1 21 82       b7 22 72 e0       48 f7 a5 4a

db a1 f8 77       b9 32 41 f5       b9 32 41 f5       b1 1a 44 17       48 f3 cb 3c
18 6d 8b ba       ad 3c 3d f4       3c 3d f4 ad       3d 2f ec b6       26 1b c3 be
a8 30 08 4e       c2 04 30 2f       30 2f c2 04       0a 6b 2f 42       45 a2 aa 0b
ff d5 d7 aa       16 03 0e ac       ac 16 03 0e       9f 68 f3 b1       20 d7 72 38

f9 e9 8f 2b       99 1e 73 f1       99 1e 73 f1       31 30 3a c2       fd 0e c5 f9
1b 34 2f 08       af 18 15 30       18 15 30 af       ac 71 8c c4       0d 16 d5 6b
4f c9 85 49       84 dd 97 3b       97 3b 84 dd       46 65 48 eb       42 e0 4a 41
bf bf 81 89       08 08 0c a7       a7 08 08 0c       6a 1c 31 62       cb 1c 6e 56

cc 3e ff 3b       4b b2 16 e2       4b b2 16 e2                         b4 ba 7f 86
a1 67 59 af       32 85 cb 79       85 cb 79 32                         8e 98 4d 26
04 85 02 aa       f2 97 77 ac       77 ac f2 97                         f3 13 59 18
a1 00 5f 34       32 63 cf 18       18 32 63 cf                         52 4e 20 76

ff 08 69 64
0b 53 34 14
84 bf ab 8f
4a 7c 43 b9

Avalanche Effect
If a small change in the key or plaintext were to produce a corresponding small
change in the ciphertext, this might be used to effectively reduce the size of the
plaintext (or key) space to be searched. What is desired is the avalanche effect, in
which a small change in plaintext or key produces a large change in the ciphertext.

Table 6.5 Avalanche Effect in AES: Change in Plaintext
Round   State for the Two Plaintexts        Number of Bits that Differ
        0123456789abcdeffedcba9876543210     1
        0023456789abcdeffedcba9876543210
0       0e3634aece7225b6f26b174ed92b5588     1
        0f3634aece7225b6f26b174ed92b5588
1       657470750fc7ff3fc0e8e8ca4dd02a9c    20
        c4a9ad090fc7ff3fc0e8e8ca4dd02a9c
2       5c7bb49a6b72349b05a2317ff46d1294    58
        fe2ae569f7ee8bb8c1f5a2bb37ef53d5
3       7115262448dc747e5cdac7227da9bd9c    59
        ec093dfb7c45343d689017507d485e62
4       f867aee8b437a5210c24c1974cffeabc    61
        43efdb697244df808e8d9364ee0ae6f5
5       721eb200ba06206dcbd4bce704fa654e    68
        7b28a5d5ed643287e006c099bb375302
6       0ad9d85689f9f77bc1c5f71185e5fb14    64
        3bc2d8b6798d8ac4fe36a1d891ac181a
7       db18a8ffa16d30d5f88b08d777ba4eaa    67
        9fb8b5452023c70280e5c4bb9e555a4b
8       f91b4fbfe934c9bf8f2f85812b084989    65
        20264e1126b219aef7feb3f9b2d6de40
9       cca104a13e678500ff59025f3bafaa34    61
        b56a0341b2290ba7dfdfbddcd8578205
10      ff0b844a0853bf7c6934ab4364148fb9    58
        612b89398d0600cde116227ce72433f0
Using the example from Table 6.4, Table 6.5 shows the result when the
eighth bit of the plaintext is changed. The second column of the table shows the
value of the State matrix at the end of each round for the two plaintexts. Note
that after just one round, 20 bits of the State vector differ. After two rounds,
close to half the bits differ. This magnitude of difference propagates through
the remaining rounds. A bit difference in approximately half the positions is the
most desirable outcome. Clearly, if almost all the bits are changed, this would be
logically equivalent to almost none of the bits being changed. Put another way, if
we select two plaintexts at random, we would expect the two plaintexts to differ
in about half of the bit positions and the two ciphertexts to also differ in about
half the positions.
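The "Number of Bits that Differ" column is simply the Hamming distance between the two 128-bit State values. A minimal Python sketch, in which the function name `bits_that_differ` is an assumption made for illustration:

```python
def bits_that_differ(hex_a, hex_b):
    """Count the bit positions in which two equal-length hex strings differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Plaintext pair and round 1 pair from Table 6.5
print(bits_that_differ("0123456789abcdeffedcba9876543210",
                       "0023456789abcdeffedcba9876543210"))  # prints 1
print(bits_that_differ("657470750fc7ff3fc0e8e8ca4dd02a9c",
                       "c4a9ad090fc7ff3fc0e8e8ca4dd02a9c"))  # prints 20
```

For a 128-bit block, a value near 64 corresponds to the "about half the positions" outcome described above.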
Table 6.6 shows the change in State matrix values when the same plaintext
is used and the two keys differ in the eighth bit. That is, for the second case, the
key is 0e1571c947d9e8590cb7add6af7f6798. Again, one round produces
a significant change, and the magnitude of change after all subsequent rounds
is roughly half the bits. Thus, based on this example, AES exhibits a very strong
avalanche effect.
Table 6.6 Avalanche Effect in AES: Change in Key
Round   State for the Two Keys              Number of Bits that Differ
        0123456789abcdeffedcba9876543210     0
        0123456789abcdeffedcba9876543210
0       0e3634aece7225b6f26b174ed92b5588     1
        0f3634aece7225b6f26b174ed92b5588
1       657470750fc7ff3fc0e8e8ca4dd02a9c    22
        c5a9ad090ec7ff3fc1e8e8ca4cd02a9c
2       5c7bb49a6b72349b05a2317ff46d1294    58
        90905fa9563356d15f3760f3b8259985
3       7115262448dc747e5cdac7227da9bd9c    67
        18aeb7aa794b3b66629448d575c7cebf
4       f867aee8b437a5210c24c1974cffeabc    63
        f81015f993c978a876ae017cb49e7eec
5       721eb200ba06206dcbd4bce704fa654e    81
        5955c91b4e769f3cb4a94768e98d5267
6       0ad9d85689f9f77bc1c5f71185e5fb14    70
        dc60a24d137662181e45b8d3726b2920
7       db18a8ffa16d30d5f88b08d777ba4eaa    74
        fe8343b8f88bef66cab7e977d005a03c
8       f91b4fbfe934c9bf8f2f85812b084989    67
        da7dad581d1725c5b72fa0f9d9d1366a
9       cca104a13e678500ff59025f3bafaa34    59
        0ccb4c66bbfd912f4b511d72996345e0
10      ff0b844a0853bf7c6934ab4364148fb9    53
        fc8923ee501a7d207ab670686839996b
Note that this avalanche effect is stronger than that for DES (Table 4.2),
which requires three rounds to reach a point at which approximately half the bits
are changed, both for a bit change in the plaintext and a bit change in the key.
6.6 AES IMPLEMENTATION
Equivalent Inverse Cipher
As was mentioned, the AES decryption cipher is not identical to the encryption
cipher (Figure 6.3). That is, the sequence of transformations for decryption differs
from that for encryption, although the form of the key schedules for encryption
and decryption is the same. This has the disadvantage that two separate software
or firmware modules are needed for applications that require both encryption and
decryption. There is, however, an equivalent version of the decryption algorithm
that has the same structure as the encryption algorithm. The equivalent version has
the same sequence of transformations as the encryption algorithm (with transfor-
mations replaced by their inverses). To achieve this equivalence, a change in key
schedule is needed.
Two separate changes are needed to bring the decryption structure in line
with the encryption structure. As illustrated in Figure 6.3, an encryption round has
the structure SubBytes, ShiftRows, MixColumns, AddRoundKey. The standard
decryption round has the structure InvShiftRows, InvSubBytes, AddRoundKey,
InvMixColumns. Thus, the first two stages of the decryption round need to be inter-
changed, and the second two stages of the decryption round need to be interchanged.
INTERCHANGING INVSHIFTROWS AND INVSUBBYTES InvShiftRows affects the se-
quence of bytes in State but does not alter byte contents and does not depend on
byte contents to perform its transformation. InvSubBytes affects the contents of
bytes in State but does not alter byte sequence and does not depend on byte se-
quence to perform its transformation. Thus, these two operations commute and can
be interchanged. For a given State Si,
InvShiftRows [InvSubBytes (Si)] = InvSubBytes [InvShiftRows (Si)]
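The commutation argument does not depend on the particular inverse S-box: any bytewise substitution commutes with any fixed rearrangement of byte positions. The Python sketch below illustrates this with a randomly shuffled table standing in for the inverse S-box; the table, the seed, and the function names are assumptions made for illustration, and State is held as 16 bytes in column-major order.

```python
import random


def inv_shift_rows(state):
    """Rotate row r of the 4 x 4 State right by r positions (column-major bytes)."""
    out = [0] * 16
    for r in range(4):
        for c in range(4):
            out[r + 4 * c] = state[r + 4 * ((c - r) % 4)]
    return out


def substitute(state, table):
    """Apply a byte substitution table to every byte of State."""
    return [table[b] for b in state]


rng = random.Random(0)
table = list(range(256))
rng.shuffle(table)                      # stand-in for the inverse S-box
state = [rng.randrange(256) for _ in range(16)]

lhs = inv_shift_rows(substitute(state, table))
rhs = substitute(inv_shift_rows(state), table)
print(lhs == rhs)  # prints True
```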
INTERCHANGING ADDROUNDKEY AND INVMIXCOLUMNS The transformations
AddRoundKey and InvMixColumns do not alter the sequence of bytes in State. If we
view the key as a sequence of words, then both AddRoundKey and InvMixColumns
operate on State one column at a time. These two operations are linear with respect
to the column input. That is, for a given State Si and a given round key wj,
InvMixColumns (Si ⊕ wj) = [InvMixColumns (Si)] ⊕ [InvMixColumns (wj)]
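To see this, suppose that the first column of State Si is the sequence (y0, y1, y2, y3)
and the first column of the round key wj is (k0, k1, k2, k3). Then we need to show
\[
\begin{bmatrix} \text{0E} & \text{0B} & \text{0D} & \text{09} \\ \text{09} & \text{0E} & \text{0B} & \text{0D} \\ \text{0D} & \text{09} & \text{0E} & \text{0B} \\ \text{0B} & \text{0D} & \text{09} & \text{0E} \end{bmatrix}
\begin{bmatrix} y_0 \oplus k_0 \\ y_1 \oplus k_1 \\ y_2 \oplus k_2 \\ y_3 \oplus k_3 \end{bmatrix}
=
\begin{bmatrix} \text{0E} & \text{0B} & \text{0D} & \text{09} \\ \text{09} & \text{0E} & \text{0B} & \text{0D} \\ \text{0D} & \text{09} & \text{0E} & \text{0B} \\ \text{0B} & \text{0D} & \text{09} & \text{0E} \end{bmatrix}
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{bmatrix}
\oplus
\begin{bmatrix} \text{0E} & \text{0B} & \text{0D} & \text{09} \\ \text{09} & \text{0E} & \text{0B} & \text{0D} \\ \text{0D} & \text{09} & \text{0E} & \text{0B} \\ \text{0B} & \text{0D} & \text{09} & \text{0E} \end{bmatrix}
\begin{bmatrix} k_0 \\ k_1 \\ k_2 \\ k_3 \end{bmatrix}
\]
Let us demonstrate that for the first column entry. We need to show
\[
\begin{aligned}
&[\{\text{0E}\} \cdot (y_0 \oplus k_0)] \oplus [\{\text{0B}\} \cdot (y_1 \oplus k_1)] \oplus [\{\text{0D}\} \cdot (y_2 \oplus k_2)] \oplus [\{\text{09}\} \cdot (y_3 \oplus k_3)] \\
&\quad = [\{\text{0E}\} \cdot y_0] \oplus [\{\text{0B}\} \cdot y_1] \oplus [\{\text{0D}\} \cdot y_2] \oplus [\{\text{09}\} \cdot y_3] \\
&\qquad \oplus [\{\text{0E}\} \cdot k_0] \oplus [\{\text{0B}\} \cdot k_1] \oplus [\{\text{0D}\} \cdot k_2] \oplus [\{\text{09}\} \cdot k_3]
\end{aligned}
\]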
This equation is valid by inspection. Thus, we can interchange AddRoundKey
and InvMixColumns, provided that we first apply InvMixColumns to the round
key. Note that we do not need to apply InvMixColumns to the round key for the
input to the first AddRoundKey transformation (preceding the first round) nor to
the last AddRoundKey transformation (in round 10). This is because these two
AddRoundKey transformations are not interchanged with InvMixColumns to pro-
duce the equivalent decryption algorithm.
Figure 6.10 illustrates the equivalent decryption algorithm.
Figure 6.10 Equivalent Inverse Cipher [figure not reproduced: starting from the
ciphertext, an initial add round key with w[40, 43] is followed by rounds 1 through 9,
each consisting of inverse sub bytes, inverse shift rows, inverse mix cols, and add
round key, where the round key has itself passed through inverse mix cols; round 10
consists of inverse sub bytes, inverse shift rows, and add round key with w[0, 3],
producing the plaintext]
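The linearity property that justifies interchanging AddRoundKey and InvMixColumns can be checked numerically. The following Python sketch is illustrative only: the names `gf_mul`, `INV_MIX`, `inv_mix_column`, and `xor_columns` are assumptions, and the test columns are random bytes rather than values from the AES example.

```python
import random


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return product


# InvMixColumns matrix, one list per row
INV_MIX = [
    [0x0E, 0x0B, 0x0D, 0x09],
    [0x09, 0x0E, 0x0B, 0x0D],
    [0x0D, 0x09, 0x0E, 0x0B],
    [0x0B, 0x0D, 0x09, 0x0E],
]


def inv_mix_column(column):
    """Apply InvMixColumns to one 4-byte column."""
    result = []
    for row in INV_MIX:
        acc = 0
        for coeff, byte in zip(row, column):
            acc ^= gf_mul(coeff, byte)
        result.append(acc)
    return result


def xor_columns(a, b):
    return [x ^ y for x, y in zip(a, b)]


rng = random.Random(1)
y = [rng.randrange(256) for _ in range(4)]   # a State column
k = [rng.randrange(256) for _ in range(4)]   # a round-key column

lhs = inv_mix_column(xor_columns(y, k))
rhs = xor_columns(inv_mix_column(y), inv_mix_column(k))
print(lhs == rhs)  # prints True
```

The equality holds for every choice of y and k because multiplication in GF(2^8) distributes over XOR.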
Implementation Aspects
The Rijndael proposal [DAEM99] provides some suggestions for efficient im-
plementation on 8-bit processors, typical for current smart cards, and on 32-bit
processors, typical for PCs.
8-BIT PROCESSOR AES can be implemented very efficiently on an 8-bit proces-
sor. AddRoundKey is a bytewise XOR operation. ShiftRows is a simple byte-
shifting operation. SubBytes operates at the byte level and only requires a table
of 256 bytes.
The transformation MixColumns requires matrix multiplication in the field
GF(2^8), which means that all operations are carried out on bytes. MixColumns only
requires multiplication by {02} and {03}, which, as we have seen, involves simple
shifts, conditional XORs, and XORs. This can be implemented in a more efficient
way that eliminates the shifts and conditional XORs. Equation set (6.4) shows the
equations for the MixColumns transformation on a single column. Using the
identity {03} · x = ({02} · x) ⊕ x, we can rewrite Equation set (6.4) as follows.
\[
\begin{aligned}
Tmp &= s_{0,j} \oplus s_{1,j} \oplus s_{2,j} \oplus s_{3,j} \\
s'_{0,j} &= s_{0,j} \oplus Tmp \oplus [2 \cdot (s_{0,j} \oplus s_{1,j})] \\
s'_{1,j} &= s_{1,j} \oplus Tmp \oplus [2 \cdot (s_{1,j} \oplus s_{2,j})] \\
s'_{2,j} &= s_{2,j} \oplus Tmp \oplus [2 \cdot (s_{2,j} \oplus s_{3,j})] \\
s'_{3,j} &= s_{3,j} \oplus Tmp \oplus [2 \cdot (s_{3,j} \oplus s_{0,j})]
\end{aligned}
\tag{6.9}
\]
Equation set (6.9) is verified by expanding and eliminating terms.
The multiplication by {02} involves a shift and a conditional XOR. Such
an implementation may be vulnerable to a timing attack of the sort described in
Section 4.4. To counter this attack and to increase processing efficiency at the
cost of some storage, the multiplication can be replaced by a table lookup. Define
the 256-byte table X2, such that X2[i] = {02} · i. Then Equation set (6.9) can be
rewritten as
\[
\begin{aligned}
Tmp &= s_{0,j} \oplus s_{1,j} \oplus s_{2,j} \oplus s_{3,j} \\
s'_{0,j} &= s_{0,j} \oplus Tmp \oplus X2[s_{0,j} \oplus s_{1,j}] \\
s'_{1,j} &= s_{1,j} \oplus Tmp \oplus X2[s_{1,j} \oplus s_{2,j}] \\
s'_{2,j} &= s_{2,j} \oplus Tmp \oplus X2[s_{2,j} \oplus s_{3,j}] \\
s'_{3,j} &= s_{3,j} \oplus Tmp \oplus X2[s_{3,j} \oplus s_{0,j}]
\end{aligned}
\]
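As a rough check of the Tmp formulation and the X2 table, the Python sketch below computes MixColumns on one column two ways: directly from the {02}/{03} coefficients, and through the table-based equations. The function names (`xtime`, `mix_column_direct`, `mix_column_tmp`) are assumptions introduced for illustration; the sample column (ab, 40, f0, c4) is the first column after ShiftRows in round 1 of Table 6.4.

```python
def xtime(b):
    """Multiply a byte by {02} in GF(2^8)."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B
    return b


# 256-byte lookup table: X2[i] = {02} . i
X2 = [xtime(i) for i in range(256)]


def mix_column_direct(s):
    """MixColumns on one column using the 02 03 01 01 matrix rows."""
    s0, s1, s2, s3 = s

    def times3(x):
        return X2[x] ^ x          # {03} . x = ({02} . x) XOR x

    return [
        X2[s0] ^ times3(s1) ^ s2 ^ s3,
        s0 ^ X2[s1] ^ times3(s2) ^ s3,
        s0 ^ s1 ^ X2[s2] ^ times3(s3),
        times3(s0) ^ s1 ^ s2 ^ X2[s3],
    ]


def mix_column_tmp(s):
    """MixColumns on one column using Tmp and one X2 lookup per byte."""
    tmp = s[0] ^ s[1] ^ s[2] ^ s[3]
    return [s[i] ^ tmp ^ X2[s[i] ^ s[(i + 1) % 4]] for i in range(4)]


column = [0xAB, 0x40, 0xF0, 0xC4]
print([f"{b:02x}" for b in mix_column_tmp(column)])
# prints: ['b9', 'e4', '47', 'c5']
```

The result matches the first column of "After MixColumns" for round 1 in Table 6.4, and both functions agree on every input column.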
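32-BIT PROCESSOR The implementation described in the preceding subsection uses
only 8-bit operations. For a 32-bit processor, a more efficient implementation can be
achieved if operations are defined on 32-bit words. To show this, we first define the
four transformations of a round in algebraic form. Suppose we begin with a State
matrix consisting of elements a_{i,j} and a round-key matrix consisting of elements k_{i,j}.
Then the transformations can be expressed as follows.
SubBytes
\[ b_{i,j} = S[a_{i,j}] \]
ShiftRows
\[
\begin{bmatrix} c_{0,j} \\ c_{1,j} \\ c_{2,j} \\ c_{3,j} \end{bmatrix}
=
\begin{bmatrix} b_{0,j} \\ b_{1,j+1} \\ b_{2,j+2} \\ b_{3,j+3} \end{bmatrix}
\]
MixColumns
\[
\begin{bmatrix} d_{0,j} \\ d_{1,j} \\ d_{2,j} \\ d_{3,j} \end{bmatrix}
=
\begin{bmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{bmatrix}
\begin{bmatrix} c_{0,j} \\ c_{1,j} \\ c_{2,j} \\ c_{3,j} \end{bmatrix}
\]
AddRoundKey
\[
\begin{bmatrix} e_{0,j} \\ e_{1,j} \\ e_{2,j} \\ e_{3,j} \end{bmatrix}
=
\begin{bmatrix} d_{0,j} \\ d_{1,j} \\ d_{2,j} \\ d_{3,j} \end{bmatrix}
\oplus
\begin{bmatrix} k_{0,j} \\ k_{1,j} \\ k_{2,j} \\ k_{3,j} \end{bmatrix}
\]
In the ShiftRows equation, the column indices are taken mod 4 (row i is rotated
left by i positions, as Table 6.4 shows). We can
combine all of these expressions into a single equation:
\[
\begin{aligned}
\begin{bmatrix} e_{0,j} \\ e_{1,j} \\ e_{2,j} \\ e_{3,j} \end{bmatrix}
&=
\begin{bmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{bmatrix}
\begin{bmatrix} S[a_{0,j}] \\ S[a_{1,j+1}] \\ S[a_{2,j+2}] \\ S[a_{3,j+3}] \end{bmatrix}
\oplus
\begin{bmatrix} k_{0,j} \\ k_{1,j} \\ k_{2,j} \\ k_{3,j} \end{bmatrix} \\
&=
\left( \begin{bmatrix} 02 \\ 01 \\ 01 \\ 03 \end{bmatrix} \cdot S[a_{0,j}] \right)
\oplus
\left( \begin{bmatrix} 03 \\ 02 \\ 01 \\ 01 \end{bmatrix} \cdot S[a_{1,j+1}] \right)
\oplus
\left( \begin{bmatrix} 01 \\ 03 \\ 02 \\ 01 \end{bmatrix} \cdot S[a_{2,j+2}] \right)
\oplus
\left( \begin{bmatrix} 01 \\ 01 \\ 03 \\ 02 \end{bmatrix} \cdot S[a_{3,j+3}] \right)
\oplus
\begin{bmatrix} k_{0,j} \\ k_{1,j} \\ k_{2,j} \\ k_{3,j} \end{bmatrix}
\end{aligned}
\]
In the second equation, we are expressing the matrix multiplication as a linear
combination of vectors. We define four 256-word (1024-byte) tables as follows.
\[
T_0[x] = \begin{bmatrix} 02 \\ 01 \\ 01 \\ 03 \end{bmatrix} \cdot S[x] \quad
T_1[x] = \begin{bmatrix} 03 \\ 02 \\ 01 \\ 01 \end{bmatrix} \cdot S[x] \quad
T_2[x] = \begin{bmatrix} 01 \\ 03 \\ 02 \\ 01 \end{bmatrix} \cdot S[x] \quad
T_3[x] = \begin{bmatrix} 01 \\ 01 \\ 03 \\ 02 \end{bmatrix} \cdot S[x]
\]
Thus, each table takes as input a byte value and produces a column vector (a 32-bit
word) that is a function of the S-box entry for that byte value. These tables can be
calculated in advance.
We can define a round function operating on a column in the following fashion.
\[
\begin{bmatrix} s'_{0,j} \\ s'_{1,j} \\ s'_{2,j} \\ s'_{3,j} \end{bmatrix}
= T_0[s_{0,j}] \oplus T_1[s_{1,j+1}] \oplus T_2[s_{2,j+2}] \oplus T_3[s_{3,j+3}] \oplus
\begin{bmatrix} k_{0,j} \\ k_{1,j} \\ k_{2,j} \\ k_{3,j} \end{bmatrix}
\]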
As a result, an implementation based on the preceding equation requires only
four table lookups and four XORs per column per round, plus 4 Kbytes to store the
four tables. The developers of Rijndael believe that this compact, efficient
implementation was probably one of the most important factors in the selection of
Rijndael for AES.
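The table-driven round can be sketched in Python as follows. This is an illustrative sketch, not the Rijndael reference code: all function and variable names are assumptions, each column is packed into a 32-bit word with row 0 in the most significant byte, and the byte for row i of output column j is read from input column (j + i) mod 4, which is the left rotation that ShiftRows performs in Table 6.4. The check at the end uses the State at the start of round 1 and round key 1 from that table.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return product


def build_sbox():
    """Derive the S-box: multiplicative inverse, then the affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):
                inv = gf_mul(inv, x)
        y = inv
        for shift in (1, 2, 3, 4):
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(y ^ 0x63)
    return sbox


def pack(b0, b1, b2, b3):
    """Pack a column (row 0 first) into one 32-bit word."""
    return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3


SBOX = build_sbox()
# Each table is one column of the MixColumns matrix scaled by S[x]
T0 = [pack(gf_mul(2, s), s, s, gf_mul(3, s)) for s in SBOX]
T1 = [pack(gf_mul(3, s), gf_mul(2, s), s, s) for s in SBOX]
T2 = [pack(s, gf_mul(3, s), gf_mul(2, s), s) for s in SBOX]
T3 = [pack(s, s, gf_mul(3, s), gf_mul(2, s)) for s in SBOX]


def table_round(state, round_key):
    """One full round on four column words: 4 lookups and 4 XORs per column."""
    out = []
    for j in range(4):
        out.append(
            T0[(state[j] >> 24) & 0xFF]
            ^ T1[(state[(j + 1) % 4] >> 16) & 0xFF]
            ^ T2[(state[(j + 2) % 4] >> 8) & 0xFF]
            ^ T3[state[(j + 3) % 4] & 0xFF]
            ^ round_key[j]
        )
    return out


start_round_1 = [0x0E3634AE, 0xCE7225B6, 0xF26B174E, 0xD92B5588]
round_key_1 = [0xDC9037B0, 0x9B49DFE9, 0x97FE723F, 0x388115A7]
print(" ".join(f"{w:08x}" for w in table_round(start_round_1, round_key_1)))
```

The printed words should equal the State at the start of round 2 in Table 6.4 (657470750fc7ff3fc0e8e8ca4dd02a9c when read column by column).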
6.7 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS
Key Terms
Advanced Encryption Standard (AES)
avalanche effect
field
finite field
irreducible polynomial
key expansion
National Institute of Standards and Technology (NIST)
Rijndael
S-box
Review Questions
6.1 What was the original set of criteria used by NIST to evaluate candidate AES ciphers?
6.2 What was the final set of criteria used by NIST to evaluate candidate AES ciphers?
6.3 What is the difference between Rijndael and AES?
6.4 What is the purpose of the State array?
6.5 How is the S-box constructed?
6.6 Briefly describe SubBytes.
6.7 Briefly describe ShiftRows.
6.8 How many bytes in State are affected by ShiftRows?
6.9 Briefly describe MixColumns.
6.10 Briefly describe AddRoundKey.
6.11 Briefly describe the key expansion algorithm.
6.12 What is the difference between SubBytes and SubWord?
6.13 What is the difference between ShiftRows and RotWord?
6.14 What is the difference between the AES decryption algorithm and the equivalent
inverse cipher?
Problems
6.1 In the discussion of MixColumns and InvMixColumns, it was stated that
b(x) = a^-1(x) mod (x^4 + 1)
where a(x) = {03}x^3 + {01}x^2 + {01}x + {02} and b(x) = {0B}x^3 + {0D}x^2 + {09}x +
{0E}. Show that this is true.
6.2 a. What is {02}^-1 in GF(2^8)?
b. Verify the entry for {02} in the S-box.
6.3 Show the first eight words of the key expansion for a 128-bit key of all ones.
6.4 Given the plaintext {0F0E0D0C0B0A09080706050403020100} and the key
{02020202020202020202020202020202}:
a. Show the original contents of State, displayed as a 4 * 4 matrix.
b. Show the value of State after initial AddRoundKey.
c. Show the value of State after SubBytes.
d. Show the value of State after ShiftRows.
e. Show the value of State after MixColumns.
6.5 Verify Equation (6.13) in Appendix 6A. That is, show that x^i mod (x^4 + 1) = x^(i mod 4).
6.6 Compare AES to DES. For each of the following elements of DES, indicate the
comparable element in AES or explain why it is not needed in AES.
a. XOR of subkey material with the input to the f function
b. XOR of the f function output with the left half of the block
c. f function
d. permutation P
e. swapping of halves of the block
6.7 In the subsection on implementation aspects, it is mentioned that the use of tables
helps thwart timing attacks. Suggest an alternative technique.
6.8 In the subsection on implementation aspects, a single algebraic equation is developed
that describes the four stages of a typical round of the encryption algorithm. Provide
the equivalent equation for the tenth round.
6.9 Compute the output of the MixColumns transformation for the following sequence
of input bytes “A1 B2 C3 D4.” Apply the InvMixColumns transformation to the
obtained result to verify your calculations. Change the first byte of the input from “A1”
to “A3,” perform the MixColumns transformation again for the new input, and
determine how many bits have changed in the output.
Note: You can perform all calculations by hand or write a program supporting these
computations. If you choose to write a program, it should be written entirely by you;
no use of libraries or public domain source code is allowed in this assignment.
6.10 Use the key 1010 1001 1100 0011 to encrypt the plaintext “hi” as expressed in ASCII
as 0110 1000 0110 1001. The designers of S-AES got the ciphertext 0011 1110 1111
1011. Do you?
6.11 Show that the matrix given here, with entries in GF(2^4), is the inverse of the matrix
used in the MixColumns step of S-AES.
\[
\begin{pmatrix} x^3 + 1 & x \\ x & x^3 + 1 \end{pmatrix}
\]
6.12 Carefully write up a complete decryption of the ciphertext 0011 1110 1111 1011 using
the key 1010 1001 1100 0011 and the S-AES algorithm. You should get the plaintext
we started with in Problem 6.10. Note that the inverse of the S-boxes can be done
with a reverse table lookup. The inverse of the MixColumns step is given by the
matrix in the previous problem.
6.13 Demonstrate that Equation (6.9) is equivalent to Equation (6.4).
Programming Problems
6.14 Create software that can encrypt and decrypt using S-AES. Test data: A binary
plaintext of 0110 1111 0110 1011 encrypted with a binary key of 1010 0111 0011 1011
should give a binary ciphertext of 0000 0111 0011 1000. Decryption should work
correspondingly.
6.15 Implement a differential cryptanalysis attack on 1-round S-AES.
APPENDIX 6A POLYNOMIALS WITH COEFFICIENTS IN GF(2^8)
In Section 5.5, we discussed polynomial arithmetic in which the coefficients are in Zp
and the polynomials are defined modulo a polynomial m(x) whose highest power
is some integer n. In this case, addition and multiplication of coefficients occurred
within the field Zp; that is, addition and multiplication were performed modulo p.
The AES document defines polynomial arithmetic for polynomials of degree 3
or less with coefficients in GF(2^8). The following rules apply.
1. Addition is performed by adding corresponding coefficients in GF(2^8). As was
pointed out in Section 5.4, if we treat the elements of GF(2^8) as 8-bit strings, then
addition is equivalent to the XOR operation. So, if we have
\[ a(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0 \tag{6.10} \]
and
\[ b(x) = b_3 x^3 + b_2 x^2 + b_1 x + b_0 \tag{6.11} \]
then
\[ a(x) + b(x) = (a_3 \oplus b_3) x^3 + (a_2 \oplus b_2) x^2 + (a_1 \oplus b_1) x + (a_0 \oplus b_0) \]
2. Multiplication is performed as in ordinary polynomial multiplication with two
refinements:
a. Coefficients are multiplied in GF(2^8).
b. The resulting polynomial is reduced mod (x^4 + 1).
We need to keep straight which polynomial we are talking about. Recall from
Section 5.6 that each element of GF(2^8) is a polynomial of degree 7 or less with
binary coefficients, and multiplication is carried out modulo a polynomial of degree
8. Equivalently, each element of GF(2^8) can be viewed as an 8-bit byte whose bit
values correspond to the binary coefficients of the corresponding polynomial. For
the sets defined in this section, we are defining a polynomial ring in which each
element of this ring is a polynomial of degree 3 or less with coefficients in GF(2^8), and
multiplication is carried out modulo a polynomial of degree 4. Equivalently, each
element of this ring can be viewed as a 4-byte word whose byte values are elements
of GF(2^8) that correspond to the 8-bit coefficients of the corresponding polynomial.
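We denote the modular product of a(x) and b(x) by a(x) ⊗ b(x). To
compute d(x) = a(x) ⊗ b(x), the first step is to perform a multiplication without the
modulo operation and to collect coefficients of like powers. Let us express this as
c(x) = a(x) × b(x). Then
\[ c(x) = c_6 x^6 + c_5 x^5 + c_4 x^4 + c_3 x^3 + c_2 x^2 + c_1 x + c_0 \tag{6.12} \]
where
\[
\begin{aligned}
c_0 &= a_0 \cdot b_0 \\
c_1 &= (a_1 \cdot b_0) \oplus (a_0 \cdot b_1) \\
c_2 &= (a_2 \cdot b_0) \oplus (a_1 \cdot b_1) \oplus (a_0 \cdot b_2) \\
c_3 &= (a_3 \cdot b_0) \oplus (a_2 \cdot b_1) \oplus (a_1 \cdot b_2) \oplus (a_0 \cdot b_3) \\
c_4 &= (a_3 \cdot b_1) \oplus (a_2 \cdot b_2) \oplus (a_1 \cdot b_3) \\
c_5 &= (a_3 \cdot b_2) \oplus (a_2 \cdot b_3) \\
c_6 &= a_3 \cdot b_3
\end{aligned}
\]
The final step is to perform the modulo operation
\[ d(x) = c(x) \bmod (x^4 + 1) \]
That is, d(x) must satisfy the equation
\[ c(x) = [(x^4 + 1) \times q(x)] \oplus d(x) \]
such that the degree of d(x) is 3 or less.
A practical technique for performing multiplication over this polynomial ring
is based on the observation that
\[ x^i \bmod (x^4 + 1) = x^{i \bmod 4} \tag{6.13} \]
If we now combine Equations (6.12) and (6.13), we end up with
\[
\begin{aligned}
d(x) &= c(x) \bmod (x^4 + 1) \\
&= [c_6 x^6 + c_5 x^5 + c_4 x^4 + c_3 x^3 + c_2 x^2 + c_1 x + c_0] \bmod (x^4 + 1) \\
&= c_3 x^3 + (c_2 \oplus c_6) x^2 + (c_1 \oplus c_5) x + (c_0 \oplus c_4)
\end{aligned}
\]
Expanding the c_i coefficients, we have the following equations for the
coefficients of d(x).
\[
\begin{aligned}
d_0 &= (a_0 \cdot b_0) \oplus (a_3 \cdot b_1) \oplus (a_2 \cdot b_2) \oplus (a_1 \cdot b_3) \\
d_1 &= (a_1 \cdot b_0) \oplus (a_0 \cdot b_1) \oplus (a_3 \cdot b_2) \oplus (a_2 \cdot b_3) \\
d_2 &= (a_2 \cdot b_0) \oplus (a_1 \cdot b_1) \oplus (a_0 \cdot b_2) \oplus (a_3 \cdot b_3) \\
d_3 &= (a_3 \cdot b_0) \oplus (a_2 \cdot b_1) \oplus (a_1 \cdot b_2) \oplus (a_0 \cdot b_3)
\end{aligned}
\]
This can be written in matrix form:
\[
\begin{bmatrix} d_0 \\ d_1 \\ d_2 \\ d_3 \end{bmatrix}
=
\begin{bmatrix} a_0 & a_3 & a_2 & a_1 \\ a_1 & a_0 & a_3 & a_2 \\ a_2 & a_1 & a_0 & a_3 \\ a_3 & a_2 & a_1 & a_0 \end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix}
\tag{6.14}
\]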
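The reduction rule in Equation (6.13) makes the modular product easy to program: the term a_i · b_j simply contributes to the coefficient of x^((i + j) mod 4). The Python sketch below is an illustration under that reading; the names `gf_mul` and `poly_mul_mod` are assumptions, and polynomials are stored as coefficient lists [x0, x1, x2, x3], lowest power first.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return product


def poly_mul_mod(a, b):
    """Return d(x) = a(x) (x) b(x), the product reduced modulo x^4 + 1."""
    d = [0, 0, 0, 0]
    for i in range(4):
        for j in range(4):
            # x^(i+j) mod (x^4 + 1) = x^((i+j) mod 4), per Equation (6.13)
            d[(i + j) % 4] ^= gf_mul(a[i], b[j])
    return d


# MixColumns and InvMixColumns polynomials, lowest-order coefficient first
a = [0x02, 0x01, 0x01, 0x03]
b = [0x0E, 0x09, 0x0D, 0x0B]
print(poly_mul_mod(a, b))  # prints [1, 0, 0, 0]
```

The product {01} indicates that the two fixed polynomials are inverses of each other modulo x^4 + 1, which is the relationship between MixColumns and InvMixColumns.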
MixColumns Transformation
In the discussion of MixColumns, it was stated that there were two equivalent
ways of defining the transformation. The first is the matrix multiplication shown in
Equation (6.3), which is repeated here:
\[
\begin{bmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{bmatrix}
\begin{bmatrix} s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\ s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\ s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\ s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3} \end{bmatrix}
=
\begin{bmatrix} s'_{0,0} & s'_{0,1} & s'_{0,2} & s'_{0,3} \\ s'_{1,0} & s'_{1,1} & s'_{1,2} & s'_{1,3} \\ s'_{2,0} & s'_{2,1} & s'_{2,2} & s'_{2,3} \\ s'_{3,0} & s'_{3,1} & s'_{3,2} & s'_{3,3} \end{bmatrix}
\]
The second method is to treat each column of State as a four-term polynomial
with coefficients in GF(2^8). Each column is multiplied modulo (x^4 + 1) by the fixed
polynomial a(x), given by
\[ a(x) = \{03\} x^3 + \{01\} x^2 + \{01\} x + \{02\} \]
From Equation (6.10), we have a3 = {03}; a2 = {01}; a1 = {01}; and
a0 = {02}. For the jth column of State, we have the polynomial
\[ col_j(x) = s_{3,j} x^3 + s_{2,j} x^2 + s_{1,j} x + s_{0,j} \]
Substituting into Equation (6.14), we can express
d(x) = a(x) × col_j(x) as
\[
\begin{bmatrix} d_0 \\ d_1 \\ d_2 \\ d_3 \end{bmatrix}
=
\begin{bmatrix} a_0 & a_3 & a_2 & a_1 \\ a_1 & a_0 & a_3 & a_2 \\ a_2 & a_1 & a_0 & a_3 \\ a_3 & a_2 & a_1 & a_0 \end{bmatrix}
\begin{bmatrix} s_{0,j} \\ s_{1,j} \\ s_{2,j} \\ s_{3,j} \end{bmatrix}
=
\begin{bmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{bmatrix}
\begin{bmatrix} s_{0,j} \\ s_{1,j} \\ s_{2,j} \\ s_{3,j} \end{bmatrix}
\]
which is equivalent to Equation (6.3).
Multiplication by x
Consider the multiplication of a polynomial in the ring by x: c(x) = x ⊗ b(x).
We have
\[
\begin{aligned}
c(x) = x \otimes b(x) &= [x \times (b_3 x^3 + b_2 x^2 + b_1 x + b_0)] \bmod (x^4 + 1) \\
&= (b_3 x^4 + b_2 x^3 + b_1 x^2 + b_0 x) \bmod (x^4 + 1) \\
&= b_2 x^3 + b_1 x^2 + b_0 x + b_3
\end{aligned}
\]
Thus, multiplication by x corresponds to a 1-byte circular left shift of the
4 bytes in the word representing the polynomial. If we represent the polynomial as
a 4-byte column vector, then we have
\[
\begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{bmatrix}
=
\begin{bmatrix} 00 & 00 & 00 & 01 \\ 01 & 00 & 00 & 00 \\ 00 & 01 & 00 & 00 \\ 00 & 00 & 01 & 00 \end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix}
\]
CHAPTER 7
Block Cipher Operation
7.1 Multiple Encryption and Triple DES
Double DES
Triple DES with Two Keys
Triple DES with Three Keys
7.2 Electronic Codebook
7.3 Cipher Block Chaining Mode
7.4 Cipher Feedback Mode
7.5 Output Feedback Mode
7.6 Counter Mode
7.7 XTS-AES Mode for Block-Oriented Storage Devices
Tweakable Block Ciphers
Storage Encryption Requirements
Operation on a Single Block
Operation on a Sector
7.8 Format-Preserving Encryption
Motivation
Difficulties in Designing an FPE
Feistel Structure for Format-Preserving Encryption
NIST Methods for Format-Preserving Encryption
7.9 Key Terms, Review Questions, and Problems
LEARNING OBJECTIVES
After studying this chapter, you should be able to:
◆ Analyze the security of multiple encryption schemes.
◆ Explain the meet-in-the-middle attack.
◆ Compare and contrast ECB, CBC, CFB, OFB, and counter modes of operation.
◆ Present an overview of the XTS-AES mode of operation.
This chapter continues our discussion of symmetric ciphers. We begin with the topic of
multiple encryption, looking in particular at the most widely used multiple-encryption
scheme: triple DES.
The chapter next turns to the subject of block cipher modes of operation. We
find that there are a number of different ways to apply a block cipher to plaintext, each
with its own advantages and particular applications.
7.1 MULTIPLE ENCRYPTION AND TRIPLE DES
Because of its vulnerability to brute-force attack, DES, once the most widely used
symmetric cipher, has been largely replaced by stronger encryption schemes. Two
approaches have been taken. One approach is to design a completely new
algorithm that is resistant to both cryptanalytic and brute-force attacks, of which AES
is a prime example. Another alternative, which preserves the existing investment in
software and equipment, is to use multiple encryption with DES and multiple keys.
We begin by examining the simplest example of this second alternative. We then
look at the widely accepted triple DES (3DES) algorithm.
Double DES
The simplest form of multiple encryption has two encryption stages and two keys
(Figure 7.1a). Given a plaintext P and two encryption keys K1 and K2, ciphertext C
is generated as
C = E(K2, E(K1, P))
Decryption requires that the keys be applied in reverse order:
P = D(K1, D(K2, C))
For DES, this scheme apparently involves a key length of 56 × 2 = 112 bits, and
should result in a dramatic increase in cryptographic strength. But we need to
examine the algorithm more closely.
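The structure of double encryption, and the reversal of key order on decryption, can be illustrated with a toy cipher. The sketch below is not DES: `toy_encrypt` and `toy_decrypt` are a made-up, insecure 16-bit block cipher introduced only so that the composition C = E(K2, E(K1, P)) and P = D(K1, D(K2, C)) can be run end to end.

```python
MASK = 0xFFFF  # 16-bit toy block and key size (an assumption; DES uses 64/56 bits)


def rotl16(x, n):
    return ((x << n) | (x >> (16 - n))) & MASK


def rotr16(x, n):
    return ((x >> n) | (x << (16 - n))) & MASK


def toy_encrypt(key, block):
    """Toy invertible mapping standing in for E(K, P); not secure."""
    return rotl16((block + key) & MASK, 5) ^ key


def toy_decrypt(key, block):
    """Inverse of toy_encrypt, standing in for D(K, C)."""
    return (rotr16(block ^ key, 5) - key) & MASK


def double_encrypt(k1, k2, plaintext):
    return toy_encrypt(k2, toy_encrypt(k1, plaintext))


def double_decrypt(k1, k2, ciphertext):
    # Keys are applied in reverse order: undo K2 first, then K1
    return toy_decrypt(k1, toy_decrypt(k2, ciphertext))


k1, k2, p = 0x1A2B, 0x3C4D, 0x6869
c = double_encrypt(k1, k2, p)
print(double_decrypt(k1, k2, c) == p)  # prints True
```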
REDUCTION TO A SINGLE STAGE Suppose it were true for DES, for all 56-bit key val-
ues, that given any two keys K1 and K2, it would be possible to find a key K3 such that
E(K2, E(K1, P)) = E(K3, P) (7.1)
If this were the case, then double encryption, and indeed any number of stages of
multiple encryption with DES, would be useless because the result would be equiv-
alent to a single encryption with a single 56-bit key.
On the face of it, it does not appear that Equation (7.1) is likely to hold.
Consider that encryption with DES is a mapping of 64-bit blocks to 64-bit blocks.
In fact, the mapping can be viewed as a permutation. That is, if we consider all 2^64
possible input blocks, DES encryption with a specific key will map each block into a
unique 64-bit block. Otherwise, if, say, two given input blocks mapped to the same
output block, then decryption to recover the original plaintext would be impossible.
Figure 7.1 Multiple Encryption: (a) Double encryption; (b) Triple encryption, where the third stage uses K1 (2-key) or K3 (3-key)
210 CHAPTER 7 / BLOCK CIPHER OPERATION
With 2^64 possible inputs, how many different mappings are there that generate a
permutation of the input blocks? The value is easily seen to be
(2^64)! = 10^347380000000000000000 > 10^(10^20)
On the other hand, DES defines one mapping for each different key, for a total
number of mappings:
2^56 < 10^17
Therefore, it is reasonable to assume that if DES is used twice with different keys, it
will produce one of the many mappings that are not defined by a single application
of DES. Although there was much supporting evidence for this assumption, it was
not until 1992 that the assumption was proven [CAMP92].
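The two counts above can be checked numerically. The following is a minimal sketch that uses the log-gamma function to estimate the base-10 logarithm of (2^64)!; the variable names are illustrative only.

```python
import math

# lgamma(x + 1) is the natural logarithm of x!, so dividing by ln(10)
# gives the base-10 exponent of (2^64)!.
log10_permutations = math.lgamma(2.0**64 + 1) / math.log(10)

# DES defines one mapping per 56-bit key.
log10_des_mappings = 56 * math.log10(2)

print(f"(2^64)! is roughly 10^({log10_permutations:.4e})")
print(f"2^56 is roughly 10^{log10_des_mappings:.2f}")
```

The gap between the two exponents is what makes it plausible that a double encryption lands on a mapping that no single DES key produces.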
MEET-IN-THE-MIDDLE ATTACK Thus, the use of double DES results in a mapping
that is not equivalent to a single DES encryption. But there is a way to attack this
scheme, one that does not depend on any particular property of DES but that will
work against any block encryption cipher.
The algorithm, known as a meet-in-the-middle attack, was first described in
[DIFF77]. It is based on the observation that, if we have
C = E(K2, E(K1, P))
then (see Figure 7.1a)
X = E(K1, P) = D(K2, C)
Given a known pair, (P, C), the attack proceeds as follows. First, encrypt P for all
2^56 possible values of K1. Store these results in a table and then sort the table by the
values of X. Next, decrypt C using all 2^56 possible values of K2. As each decryption
is produced, check the result against the table for a match. If a match occurs, then
test the two resulting keys against a new known plaintext–ciphertext pair. If the two
keys produce the correct ciphertext, accept them as the correct keys.
For any given plaintext P, there are 2^64 possible ciphertext values that could be
produced by double DES. Double DES uses, in effect, a 112-bit key, so that there
are 2^112 possible keys. Therefore, for a given plaintext P, the average number
of different 112-bit keys that could produce a given ciphertext C is 2^112/2^64 = 2^48.
Thus, the foregoing procedure can produce about 2^48 false alarms on the first (P, C)
pair. A similar argument indicates that with an additional 64 bits of known plaintext
and ciphertext, the false alarm rate is reduced to 2^(48 - 64) = 2^-16. Put another way,
if the meet-in-the-middle attack is performed on two blocks of known plaintext–
ciphertext, the probability that the correct keys are determined is 1 - 2^-16. The
result is that a known plaintext attack will succeed against double DES, which has a
key size of 112 bits, with an effort on the order of 2^56, which is not much more than
the 2^55 required for single DES.
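The procedure can be sketched on a scaled-down cipher. The code below is an illustration only: `enc` and `dec` are a hypothetical 32-bit toy Feistel cipher with 12-bit keys (not DES), and a dictionary lookup stands in for the sorted table described in the text.

```python
import hashlib

KEY_BITS = 12   # toy key size (assumption); DES uses 56 bits
ROUNDS = 4

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key, rnd, half):
    # Round function: truncated SHA-256 of key, round index, and half-block.
    data = key.to_bytes(2, "big") + bytes([rnd]) + half
    return hashlib.sha256(data).digest()[:2]

def enc(key, block):
    """Toy 4-byte Feistel encryption (stand-in for E)."""
    left, right = block[:2], block[2:]
    for rnd in range(ROUNDS):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right

def dec(key, block):
    """Inverse of enc (stand-in for D)."""
    left, right = block[:2], block[2:]
    for rnd in reversed(range(ROUNDS)):
        left, right = _xor(right, _f(key, rnd, left)), left
    return left + right

def double_encrypt(k1, k2, p):
    return enc(k2, enc(k1, p))

def meet_in_the_middle(pairs):
    """Return all (K1, K2) consistent with every known (P, C) pair."""
    (p, c), rest = pairs[0], pairs[1:]
    # Forward half: X = E(K1, P) for every K1, indexed by X.
    table = {}
    for k1 in range(2**KEY_BITS):
        table.setdefault(enc(k1, p), []).append(k1)
    # Backward half: X = D(K2, C) for every K2, looked up in the table.
    found = []
    for k2 in range(2**KEY_BITS):
        for k1 in table.get(dec(k2, c), []):
            # Weed out false alarms with the remaining known pairs.
            if all(double_encrypt(k1, k2, p2) == c2 for p2, c2 in rest):
                found.append((k1, k2))
    return found

secret = (0x3A7, 0xC15)
pairs = [(p, double_encrypt(*secret, p)) for p in (b"ABCD", b"wxyz")]
candidates = meet_in_the_middle(pairs)
print(candidates)
```

The work is about 2 × 2^12 cipher operations rather than the 2^24 of an exhaustive search over both keys, which mirrors the 2^56 versus 2^112 comparison for double DES.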
Triple DES with Two Keys
An obvious counter to the meet-in-the-middle attack is to use three stages of
encryption with three different keys. Using DES as the underlying algorithm,
this approach is commonly referred to as 3DES, or Triple Data Encryption
Algorithm (TDEA). As shown in Figure 7.1b, there are two versions of 3DES;
one using two keys and one using three keys. NIST SP 800-67 (Recommendation
for the Triple Data Encryption Block Cipher, January 2012) defines the two-key
and three-key versions. We look first at the strength of the two-key version and
then examine the three-key version.
Two-key triple encryption was first proposed by Tuchman [TUCH79]. The
function follows an encrypt-decrypt-encrypt (EDE) sequence (Figure 7.1b):
C = E(K1, D(K2, E(K1, P)))
P = D(K1, E(K2, D(K1, C)))
There is no cryptographic significance to the use of decryption for the second
stage. Its only advantage is that it allows users of 3DES to decrypt data encrypted by
users of the older single DES:
C = E(K1, D(K1, E(K1, P))) = E(K1, P)
P = D(K1, E(K1, D(K1, C))) = D(K1, C)
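The backward-compatibility property can be illustrated in a few lines. This is a sketch under stated assumptions: `toy_encrypt` and `toy_decrypt` are a hypothetical 64-bit Feistel cipher built from SHA-256, used here only as a stand-in for DES.

```python
import hashlib

BLOCK = 8    # block size in bytes
ROUNDS = 4

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key, rnd, half):
    # Round function: truncated SHA-256 of key, round index, and half-block.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:len(half)]

def toy_encrypt(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(ROUNDS):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right

def toy_decrypt(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in reversed(range(ROUNDS)):
        left, right = _xor(right, _f(key, rnd, left)), left
    return left + right

def ede_encrypt(k1, k2, p):
    """C = E(K1, D(K2, E(K1, P)))"""
    return toy_encrypt(k1, toy_decrypt(k2, toy_encrypt(k1, p)))

def ede_decrypt(k1, k2, c):
    """P = D(K1, E(K2, D(K1, C)))"""
    return toy_decrypt(k1, toy_encrypt(k2, toy_decrypt(k1, c)))

k1, k2 = b"key-one", b"key-two"
p = b"8 bytes!"

# With K2 = K1 the middle stage undoes the first, leaving single encryption.
print(ede_encrypt(k1, k1, p) == toy_encrypt(k1, p))
print(ede_decrypt(k1, k2, ede_encrypt(k1, k2, p)) == p)
```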
3DES with two keys is a relatively popular alternative to DES and has been
adopted for use in the key management standards ANSI X9.17 and ISO 8732.¹
Currently, there are no practical cryptanalytic attacks on 3DES. Coppersmith
[COPP94] notes that the cost of a brute-force key search on 3DES is on the order of
2^112 ≈ 5 × 10^33 and estimates that the cost of differential cryptanalysis suffers an
exponential growth, compared to single DES, exceeding 10^52.
It is worth looking at several proposed attacks on 3DES that, although not
practical, give a flavor for the types of attacks that have been considered and that
could form the basis for more successful future attacks.
The first serious proposal came from Merkle and Hellman [MERK81]. Their
plan involves finding plaintext values that produce a first intermediate value of
A = 0 (Figure 7.1b) and then using the meet-in-the-middle attack to determine
the two keys. The level of effort is 2^56, but the technique requires 2^56 chosen
plaintext–ciphertext pairs, which is a number unlikely to be provided by the holder of
the keys.
A known-plaintext attack is outlined in [VANO90]. This method is an im-
provement over the chosen-plaintext approach but requires more effort. The attack
is based on the observation that if we know A and C (Figure 7.1b), then the problem
reduces to that of an attack on double DES. Of course, the attacker does not know
A, even if P and C are known, as long as the two keys are unknown. However, the
attacker can choose a potential value of A and then try to find a known (P, C) pair
that produces A. The attack proceeds as follows.
1. Obtain n (P, C) pairs. This is the known plaintext. Place these in a table
(Table 1) sorted on the values of P (Figure 7.2b).
¹ American National Standards Institute (ANSI): Financial Institution Key Management (Wholesale).
From its title, X9.17 appears to be a somewhat obscure standard. Yet a number of techniques specified in
this standard have been adopted for use in other standards and applications, as we shall see throughout
this book.
2. Pick an arbitrary value a for A, and create a second table (Figure 7.2c) with
entries defined in the following fashion. For each of the 2^56 possible keys K1 = i,
calculate the plaintext value Pi such that
Pi = D(i, a)
For each Pi that matches an entry in Table 1, create an entry in Table 2 consist-
ing of the K1 value and the value of B that is produced for the (P, C) pair from
Table 1, assuming that value of K1:
B = D(i, C)
At the end of this step, sort Table 2 on the values of B.
3. We now have a number of candidate values of K1 in Table 2 and are in a
position to search for a value of K2. For each of the 2^56 possible keys K2 = j,
calculate the second intermediate value for our chosen value of a:
Bj = D(j, a)
At each step, look up Bj in Table 2. If there is a match, then the corresponding
key i from Table 2 plus this value of j are candidate values for the unknown
keys (K1, K2). Why? Because we have found a pair of keys (i, j) that produce a
known (P, C) pair (Figure 7.2a).
4. Test each candidate pair of keys (i, j) on a few other plaintext–ciphertext pairs.
If a pair of keys produces the desired ciphertext, the task is complete. If no pair
succeeds, repeat from step 1 with a new value of a.
Figure 7.2 Known-Plaintext Attack on Triple DES: (a) Two-key triple encryption with candidate pair of keys; (b) Table of n known plaintext–ciphertext pairs, sorted on P; (c) Table of intermediate values and candidate keys
For a given known (P, C), the probability of selecting the unique value of a
that leads to success is 1/2^64. Thus, given n (P, C) pairs, the probability of success for
a single selected value of a is n/2^64. A basic result from probability theory is that the
expected number of draws required to draw one red ball out of a bin containing n
red balls and N - n green balls is (N + 1)/(n + 1) if the balls are not replaced. So
the expected number of values of a that must be tried is, for large n,
(2^64 + 1)/(n + 1) ≈ 2^64/n
Thus, the expected running time of the attack is on the order of
(2^56) × (2^64/n) = 2^(120 - log2 n)
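Both the probability result and the final estimate are easy to check. The sketch below computes the expected number of draws exactly for a small bin and evaluates the work factor for a sample n; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, log2

def expected_draws(N, n):
    """Exact expected number of draws, without replacement, until the
    first red ball appears in a bin of n red and N - n green balls."""
    total = Fraction(0)
    for k in range(1, N - n + 2):
        # The first red ball is at draw k when the other n - 1 red balls
        # all lie among the N - k later positions.
        total += k * Fraction(comb(N - k, n - 1), comb(N, n))
    return total

def attack_work_log2(n):
    """log2 of the expected running time, 2^(120 - log2 n)."""
    return 120 - log2(n)

print(expected_draws(10, 3))      # compare with (10 + 1)/(3 + 1)
print(attack_work_log2(2**32))    # work factor exponent for n = 2^32 known pairs
```

Even with 2^32 known pairs the estimate stays at about 2^88 operations, which is why the text treats the attack as impractical.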
Triple DES with Three Keys
Although the attacks just described appear impractical, anyone using two-key 3DES
may feel some concern. Thus, many researchers now feel that three-key 3DES is the
preferred alternative (e.g., [KALI96a]). In SP 800-57, Part 1 (Recommendation for
Key Management—Part 1: General, July 2012) NIST recommends that 2-key 3DES
be retired as soon as practical and replaced with 3-key 3DES.
Three-key 3DES is defined as
C = E(K3, D(K2, E(K1, P)))
Backward compatibility with DES is provided by putting K3 = K2 or K1 = K2. One
might expect that 3TDEA would provide 56 × 3 = 168 bits of strength. However,
there is an attack on 3TDEA that reduces the strength to the work that would be
involved in exhausting a 112-bit key [MERK81].
A number of Internet-based applications have adopted three-key 3DES, in-
cluding PGP and S/MIME, both discussed in Chapter 19.
7.2 ELECTRONIC CODEBOOK
A block cipher takes a fixed-length block of text of length b bits and a key as input
and produces a b-bit block of ciphertext. If the amount of plaintext to be encrypted
is greater than b bits, then the block cipher can still be used by breaking the plain-
text up into b-bit blocks. When multiple blocks of plaintext are encrypted using the
same key, a number of security issues arise. To apply a block cipher in a variety of
applications, five modes of operation have been defined by NIST (SP 800-38A).
In essence, a mode of operation is a technique for enhancing the effect of a cryp-
tographic algorithm or adapting the algorithm for an application, such as applying
a block cipher to a sequence of data blocks or a data stream. The five modes are
intended to cover a wide variety of applications of encryption for which a block
cipher could be used. These modes are intended for use with any symmetric block
cipher, including triple DES and AES. The modes are summarized in Table 7.1 and
described in this and the following sections.
The simplest mode is the electronic codebook (ECB) mode, in which plaintext
is handled one block at a time and each block of plaintext is encrypted using the
same key (Figure 7.3). The term codebook is used because, for a given key, there is
a unique ciphertext for every b-bit block of plaintext. Therefore, we can imagine a
gigantic codebook in which there is an entry for every possible b-bit plaintext pat-
tern showing its corresponding ciphertext.
For a message longer than b bits, the procedure is simply to break the message
into b-bit blocks, padding the last block if necessary. Decryption is performed one
block at a time, always using the same key. In Figure 7.3, the plaintext (padded as
necessary) consists of a sequence of b-bit blocks, P_1, P_2, ..., P_N; the corresponding
sequence of ciphertext blocks is C_1, C_2, ..., C_N. We can define ECB mode as
follows.
ECB
Encryption: C_j = E(K, P_j), j = 1, ..., N
Decryption: P_j = D(K, C_j), j = 1, ..., N
The ECB mode should be used only to secure messages shorter than a single
block of underlying cipher (i.e., 64 bits for 3DES and 128 bits for AES), such as to
encrypt a secret key. Because most messages are longer than the cipher's block
size, this mode has minimal practical value.
The most significant characteristic of ECB is that if the same b-bit block of
plaintext appears more than once in the message, it always produces the same
ciphertext.
Table 7.1 Block Cipher Modes of Operation

Mode | Description | Typical Application
Electronic Codebook (ECB) | Each block of plaintext bits is encoded independently using the same key. | Secure transmission of single values (e.g., an encryption key)
Cipher Block Chaining (CBC) | The input to the encryption algorithm is the XOR of the next block of plaintext and the preceding block of ciphertext. | General-purpose block-oriented transmission; authentication
Cipher Feedback (CFB) | Input is processed s bits at a time. Preceding ciphertext is used as input to the encryption algorithm to produce pseudorandom output, which is XORed with plaintext to produce next unit of ciphertext. | General-purpose stream-oriented transmission; authentication
Output Feedback (OFB) | Similar to CFB, except that the input to the encryption algorithm is the preceding encryption output, and full blocks are used. | Stream-oriented transmission over noisy channel (e.g., satellite communication)
Counter (CTR) | Each block of plaintext is XORed with an encrypted counter. The counter is incremented for each subsequent block. | General-purpose block-oriented transmission; useful for high-speed requirements
For lengthy messages, the ECB mode may not be secure. If the message is
highly structured, it may be possible for a cryptanalyst to exploit these regularities.
For example, if it is known that the message always starts out with certain predefined
fields, then the cryptanalyst may have a number of known plaintext–ciphertext pairs
to work with. If the message has repetitive elements with a period of repetition a
multiple of b bits, then these elements can be identified by the analyst. This may help
in the analysis or may provide an opportunity for substituting or rearranging blocks.
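The leak described above is easy to reproduce. The following is a minimal sketch, assuming a hypothetical 64-bit toy Feistel cipher (`toy_encrypt`, `toy_decrypt`) in place of 3DES or AES and a message whose length is already a multiple of the block size, so padding is omitted.

```python
import hashlib

BLOCK = 8    # block size in bytes
ROUNDS = 4

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key, rnd, half):
    # Round function: truncated SHA-256 of key, round index, and half-block.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:len(half)]

def toy_encrypt(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(ROUNDS):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right

def toy_decrypt(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in reversed(range(ROUNDS)):
        left, right = _xor(right, _f(key, rnd, left)), left
    return left + right

def _blocks(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def ecb_encrypt(key, plaintext):
    # Each block is encrypted independently under the same key.
    return b"".join(toy_encrypt(key, b) for b in _blocks(plaintext))

def ecb_decrypt(key, ciphertext):
    return b"".join(toy_decrypt(key, b) for b in _blocks(ciphertext))

key = b"demo key"
msg = b"ATTACK!!" + b"RETREAT!" + b"ATTACK!!"
ct = _blocks(ecb_encrypt(key, msg))

# Blocks 1 and 3 of the plaintext are identical, and so are their ciphertexts.
print(ct[0] == ct[2], ct[0] == ct[1])
```

An observer who never learns the key can still see which blocks repeat, and can swap or replay whole ciphertext blocks without detection by the cipher itself.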
We now turn to more complex modes of operation. [KNUD00] lists the fol-
lowing criteria and properties for evaluating and constructing block cipher modes of
operation that are superior to ECB:
■ Overhead: The additional operations for the encryption and decryption opera-
tion when compared to encrypting and decrypting in the ECB mode.
■ Error recovery: The property that an error in the ith ciphertext block is inher-
ited by only a few plaintext blocks after which the mode resynchronizes.
■ Error propagation: The property that an error in the ith ciphertext block is
inherited by the ith and all subsequent plaintext blocks. What is meant here is
a bit error that occurs in the transmission of a ciphertext block, not a computa-
tional error in the encryption of a plaintext block.
Figure 7.3 Electronic Codebook (ECB) Mode: (a) Encryption; (b) Decryption
■ Diffusion: How the plaintext statistics are reflected in the ciphertext. Low en-
tropy plaintext blocks should not be reflected in the ciphertext blocks. Roughly,
low entropy equates to predictability or lack of randomness (see Appendix F).
■ Security: Whether or not the ciphertext blocks leak information about the
plaintext blocks.
7.3 CIPHER BLOCK CHAINING MODE
To overcome the security deficiencies of ECB, we would like a technique in which
the same plaintext block, if repeated, produces different ciphertext blocks. A
simple way to satisfy this requirement is the cipher block chaining (CBC) mode
(Figure 7.4). In this scheme, the input to the encryption algorithm is the XOR of the
current plaintext block and the preceding ciphertext block; the same key is used for
each block. In effect, we have chained together the processing of the sequence of
plaintext blocks. The input to the encryption function for each plaintext block bears
no fixed relationship to the plaintext block. Therefore, repeating patterns of b bits
are not exposed. As with the ECB mode, the CBC mode requires that the last block
be padded to a full b bits if it is a partial block.
Figure 7.4 Cipher Block Chaining (CBC) Mode: (a) Encryption; (b) Decryption
For decryption, each cipher block is passed through the decryption algorithm.
The result is XORed with the preceding ciphertext block to produce the plaintext
block. To see that this works, we can write
C_j = E(K, [C_{j-1} ⊕ P_j])
Then
D(K, C_j) = D(K, E(K, [C_{j-1} ⊕ P_j]))
D(K, C_j) = C_{j-1} ⊕ P_j
C_{j-1} ⊕ D(K, C_j) = C_{j-1} ⊕ C_{j-1} ⊕ P_j = P_j
To produce the first block of ciphertext, an initialization vector (IV) is XORed
with the first block of plaintext. On decryption, the IV is XORed with the output
of the decryption algorithm to recover the first block of plaintext. The IV is a data
block that is the same size as the cipher block. We can define CBC mode as
CBC
Encryption:
C_1 = E(K, [P_1 ⊕ IV])
C_j = E(K, [P_j ⊕ C_{j-1}]), j = 2, ..., N
Decryption:
P_1 = D(K, C_1) ⊕ IV
P_j = D(K, C_j) ⊕ C_{j-1}, j = 2, ..., N
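The CBC definition translates almost line for line into code. This is a sketch only: `toy_encrypt` and `toy_decrypt` are a hypothetical 64-bit Feistel stand-in for E and D, and the message length is assumed to be a multiple of the block size, so padding is left out.

```python
import hashlib
import secrets

BLOCK = 8    # block size in bytes
ROUNDS = 4

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key, rnd, half):
    # Round function: truncated SHA-256 of key, round index, and half-block.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:len(half)]

def toy_encrypt(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(ROUNDS):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right

def toy_decrypt(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in reversed(range(ROUNDS)):
        left, right = _xor(right, _f(key, rnd, left)), left
    return left + right

def cbc_encrypt(key, iv, plaintext):
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        # C_j = E(K, [P_j XOR C_{j-1}]), with C_0 = IV
        prev = toy_encrypt(key, _xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key, iv, ciphertext):
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        # P_j = D(K, C_j) XOR C_{j-1}
        out.append(_xor(toy_decrypt(key, block), prev))
        prev = block
    return b"".join(out)

key = b"demo key"
iv = secrets.token_bytes(BLOCK)     # fresh, unpredictable IV per message
msg = b"ATTACK!!" * 3               # three identical plaintext blocks
ct = cbc_encrypt(key, iv, msg)

print(ct[:BLOCK] != ct[BLOCK:2 * BLOCK])
print(cbc_decrypt(key, iv, ct) == msg)
```

Unlike ECB, the three identical plaintext blocks produce different ciphertext blocks because each one is mixed with the preceding ciphertext before encryption.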
The IV must be known to both the sender and receiver but be unpredictable
by a third party. In particular, for any given plaintext, it must not be possible to
predict the IV that will be associated to the plaintext in advance of the generation
of the IV. For maximum security, the IV should be protected against unauthorized
changes. This could be done by sending the IV using ECB encryption. One reason
for protecting the IV is as follows: If an opponent is able to fool the receiver into
using a different value for IV, then the opponent is able to invert selected bits in the
first block of plaintext. To see this, consider
C1 = E(K, [IV ⊕ P1])
P1 = IV ⊕ D(K, C1)
Now use the notation that X[i] denotes the ith bit of the b-bit quantity X. Then
P1[i] = IV[i] ⊕ D(K, C1)[i]
Then, using the properties of XOR, we can state
P1[i]′ = IV[i]′ ⊕ D(K, C1)[i]
where the prime notation denotes bit complementation. This means that if an oppo-
nent can predictably change bits in IV, the corresponding bits of the received value
of P1 can be changed.
For other possible attacks based on prior knowledge of IV, see [VOYD83].
So long as it is unpredictable, the specific choice of IV is unimportant.
SP 800-38A recommends two possible methods: The first method is to apply
the encryption function, under the same key that is used for the encryption of the
plaintext, to a nonce.² The nonce must be a data block that is unique to each
execution of the encryption operation. For example, the nonce may be a counter,
a timestamp, or a message number. The second method is to generate a random
data block using a random number generator.
² NIST SP 800-90 (Recommendation for Random Number Generation Using Deterministic Random Bit
Generators) defines nonce as follows: A time-varying value that has at most a negligible chance of
repeating, for example, a random value that is generated anew for each use, a timestamp, a sequence number,
or some combination of these.
In conclusion, because of the chaining mechanism of CBC, it is an appropriate
mode for encrypting messages of length greater than b bits.
In addition to its use to achieve confidentiality, the CBC mode can be used for
authentication. This use is described in Chapter 12.
7.4 CIPHER FEEDBACK MODE
For AES, DES, or any block cipher, encryption is performed on a block of b bits.
In the case of DES, b = 64 and in the case of AES, b = 128. However, it is pos-
sible to convert a block cipher into a stream cipher, using one of the three modes
to be discussed in this and the next two sections: cipher feedback (CFB) mode,
output feedback (OFB) mode, and counter (CTR) mode. A stream cipher elimi-
nates the need to pad a message to be an integral number of blocks. It also can
operate in real time. Thus, if a character stream is being transmitted, each char-
acter can be encrypted and transmitted immediately using a character-oriented
stream cipher.
One desirable property of a stream cipher is that the ciphertext be of the same
length as the plaintext. Thus, if 8-bit characters are being transmitted, each charac-
ter should be encrypted to produce a ciphertext output of 8 bits. If more than 8 bits
are produced, transmission capacity is wasted.
Figure 7.5 depicts the CFB scheme. In the figure, it is assumed that the unit of
transmission is s bits; a common value is s = 8. As with CBC, the units of plaintext
are chained together, so that the ciphertext of any plaintext unit is a function of all
the preceding plaintext. In this case, rather than blocks of b bits, the plaintext is
divided into segments of s bits.
First, consider encryption. The input to the encryption function is a b-bit shift
register that is initially set to some initialization vector (IV). The leftmost (most
significant) s bits of the output of the encryption function are XORed with the first
segment of plaintext P1 to produce the first unit of ciphertext C1, which is then
transmitted. In addition, the contents of the shift register are shifted left by s bits,
and C1 is placed in the rightmost (least significant) s bits of the shift register. This
process continues until all plaintext units have been encrypted.
For decryption, the same scheme is used, except that the received ciphertext
unit is XORed with the output of the encryption function to produce the plaintext
unit. Note that it is the encryption function that is used, not the decryption function.
This is easily explained. Let MSB_s(X) be defined as the most significant s bits of X.
Then
C_1 = P_1 ⊕ MSB_s[E(K, IV)]
Therefore, by rearranging terms:
P_1 = C_1 ⊕ MSB_s[E(K, IV)]
The same reasoning holds for subsequent steps in the process.
We can define CFB mode as follows.
CFB
Encryption:
I_1 = IV
I_j = LSB_{b-s}(I_{j-1}) || C_{j-1}, j = 2, ..., N
O_j = E(K, I_j), j = 1, ..., N
C_j = P_j ⊕ MSB_s(O_j), j = 1, ..., N
Decryption:
I_1 = IV
I_j = LSB_{b-s}(I_{j-1}) || C_{j-1}, j = 2, ..., N
O_j = E(K, I_j), j = 1, ..., N
P_j = C_j ⊕ MSB_s(O_j), j = 1, ..., N
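The shift-register mechanics for s = 8 can be sketched as follows. The forward function `toy_encrypt` is a hypothetical 64-bit Feistel stand-in for E; note that no decryption function is defined, because CFB uses E in both directions.

```python
import hashlib

BLOCK = 8    # b = 64 bits
ROUNDS = 4

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key, rnd, half):
    # Round function: truncated SHA-256 of key, round index, and half-block.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:len(half)]

def toy_encrypt(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(ROUNDS):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right

def cfb8(key, iv, data, decrypt=False):
    """8-bit CFB: one byte of input yields one byte of output."""
    shift_register = iv                       # I_1 = IV
    out = bytearray()
    for unit in data:
        o = toy_encrypt(key, shift_register)  # O_j = E(K, I_j)
        result = unit ^ o[0]                  # XOR with MSB_s(O_j), s = 8
        out.append(result)
        # The ciphertext unit is what gets fed back, in both directions.
        c = unit if decrypt else result
        # I_{j+1} = LSB_{b-s}(I_j) || C_j: shift left by s bits, append C_j.
        shift_register = shift_register[1:] + bytes([c])
    return bytes(out)

key = b"demo key"
iv = bytes(range(BLOCK))      # fixed IV, for reproducibility only
msg = b"stream of characters"
ct = cfb8(key, iv, msg)
print(len(ct) == len(msg), cfb8(key, iv, ct, decrypt=True) == msg)
```

The ciphertext has exactly the length of the plaintext, so no padding is needed and each character can be sent as soon as it is encrypted.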
Although CFB can be viewed as a stream cipher, it does not conform to the
typical construction of a stream cipher. In a typical stream cipher, the cipher takes
as input some initial value and a key and generates a stream of bits, which is then
XORed with the plaintext bits (see Figure 4.1). In the case of CFB, the stream of
bits that is XORed with the plaintext also depends on the plaintext.
Figure 7.5 s-bit Cipher Feedback (CFB) Mode: (a) Encryption; (b) Decryption
In CFB encryption, like CBC encryption, the input block to each forward
cipher function (except the first) depends on the result of the previous forward
cipher function; therefore, multiple forward cipher operations cannot be performed
in parallel. In CFB decryption, the required forward cipher operations can be per-
formed in parallel if the input blocks are first constructed (in series) from the IV
and the ciphertext.
7.5 OUTPUT FEEDBACK MODE
The output feedback (OFB) mode is similar in structure to that of CFB. For OFB,
the output of the encryption function is fed back to become the input for encrypting
the next block of plaintext (Figure 7.6). In CFB, the output of the XOR unit is fed
back to become input for encrypting the next block. The other difference is that the
OFB mode operates on full blocks of plaintext and ciphertext, whereas CFB oper-
ates on an s-bit subset. OFB encryption can be expressed as
C_j = P_j ⊕ E(K, O_{j-1})
where
O_{j-1} = E(K, O_{j-2})
Some thought should convince you that we can rewrite the encryption expression as:
C_j = P_j ⊕ E(K, [C_{j-1} ⊕ P_{j-1}])
By rearranging terms, we can demonstrate that decryption works.
P_j = C_j ⊕ E(K, [C_{j-1} ⊕ P_{j-1}])
We can define OFB mode as follows.
OFB
Encryption:
I_1 = Nonce
I_j = O_{j-1}, j = 2, ..., N
O_j = E(K, I_j), j = 1, ..., N
C_j = P_j ⊕ O_j, j = 1, ..., N - 1
C*_N = P*_N ⊕ MSB_u(O_N)
Decryption:
I_1 = Nonce
I_j = O_{j-1}, j = 2, ..., N
O_j = E(K, I_j), j = 1, ..., N
P_j = C_j ⊕ O_j, j = 1, ..., N - 1
P*_N = C*_N ⊕ MSB_u(O_N)
Let the size of a block be b. If the last block of plaintext contains u bits
(indicated by *), with u < b, the most significant u bits of the last output block O_N are
used for the XOR operation; the remaining b - u bits of the last output block are
discarded.
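The OFB definition, including the handling of a partial final block, can be sketched in a few lines. As before, `toy_encrypt` is a hypothetical 64-bit Feistel stand-in for E, and one function serves for both encryption and decryption.

```python
import hashlib

BLOCK = 8    # block size in bytes
ROUNDS = 4

def _xor(a, b):
    # zip stops at the shorter input, so a partial final block is XORed
    # with only the leading (most significant) bytes of the output block.
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key, rnd, half):
    # Round function: truncated SHA-256 of key, round index, and half-block.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:len(half)]

def toy_encrypt(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(ROUNDS):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right

def ofb(key, nonce, data):
    """OFB encryption and decryption are the same operation."""
    out = bytearray()
    o = nonce                                   # I_1 = Nonce
    for i in range(0, len(data), BLOCK):
        o = toy_encrypt(key, o)                 # O_j = E(K, I_j), I_j = O_{j-1}
        out += _xor(data[i:i + BLOCK], o)       # C_j = P_j XOR O_j
    return bytes(out)

key = b"demo key"
nonce = (1).to_bytes(BLOCK, "big")   # must be unique per message under one key
msg = b"thirteen byte"               # 13 bytes: one full block plus 5 bytes
ct = ofb(key, nonce, msg)
print(len(ct) == len(msg), ofb(key, nonce, ct) == msg)
```

Because the output blocks depend only on the key and the nonce, they could be computed before the plaintext is available.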
As with CBC and CFB, the OFB mode requires an initialization vector. In
the case of OFB, the IV must be a nonce; that is, the IV must be unique to each
execution of the encryption operation. The reason for this is that the sequence of
encryption output blocks, Oi, depends only on the key and the IV and does not de-
pend on the plaintext. Therefore, for a given key and IV, the stream of output bits
used to XOR with the stream of plaintext bits is fixed. If two different messages had
an identical block of plaintext in the identical position, then an attacker would be
able to determine that portion of the Oi stream.
One advantage of the OFB method is that bit errors in transmission do not
propagate. For example, if a bit error occurs in C1, only the recovered value of P1 is
affected; subsequent plaintext units are not corrupted. With CFB, C1 also serves as
input to the shift register and therefore causes additional corruption downstream.
The disadvantage of OFB is that it is more vulnerable to a message stream
modification attack than is CFB. Consider that complementing a bit in the
ciphertext complements the corresponding bit in the recovered plaintext. Thus, controlled
changes to the recovered plaintext can be made. This may make it possible for an
opponent, by making the necessary changes to the checksum portion of the message
as well as to the data portion, to alter the ciphertext in such a way that it is not
detected by an error-correcting code. For a further discussion, see [VOYD83].
Figure 7.6 Output Feedback (OFB) Mode: (a) Encryption; (b) Decryption
OFB has the structure of a typical stream cipher, because the cipher gener-
ates a stream of bits as a function of an initial value and a key, and that stream of
bits is XORed with the plaintext bits (see Figure 4.1). The generated stream that is
XORed with the plaintext is itself independent of the plaintext; this is highlighted
by dashed boxes in Figure 7.6. One distinction from the stream ciphers we discuss
in Chapter 8 is that OFB encrypts plaintext a full block at a time, where typically a
block is 64 or 128 bits. Many stream ciphers encrypt one byte at a time.
7.6 COUNTER MODE
Although interest in the counter (CTR) mode has increased recently with appli-
cations to ATM (asynchronous transfer mode) network security and IPsec
(IP security), this mode was proposed in 1979 (e.g., [DIFF79]).
Figure 7.7 depicts the CTR mode. A counter equal in size to the plaintext block
is used. The only requirement stated in SP 800-38A is that the counter value must be
different for each plaintext block that is encrypted. Typically, the counter is
initialized to some value and then incremented by 1 for each subsequent block (modulo 2^b,
where b is the block size). For encryption, the counter is encrypted and then XORed
with the plaintext block to produce the ciphertext block; there is no chaining. For
decryption, the same sequence of counter values is used, with each encrypted coun-
ter XORed with a ciphertext block to recover the corresponding plaintext block.
Thus, the initial counter value must be made available for decryption. Given a
sequence of counters T_1, T_2, ..., T_N, we can define CTR mode as follows.
CTR
Encryption:
C_j = P_j ⊕ E(K, T_j), j = 1, ..., N - 1
C*_N = P*_N ⊕ MSB_u[E(K, T_N)]
Decryption:
P_j = C_j ⊕ E(K, T_j), j = 1, ..., N - 1
P*_N = C*_N ⊕ MSB_u[E(K, T_N)]
For the last plaintext block, which may be a partial block of u bits, the most
significant u bits of the last output block are used for the XOR operation; the re-
maining b - u bits are discarded. Unlike the ECB, CBC, and CFB modes, we do
not need to use padding because of the structure of the CTR mode.
As with the OFB mode, the initial counter value must be a nonce; that is, T1
must be different for all of the messages encrypted using the same key. Further,
all Ti values across all messages must be unique. If, contrary to this requirement, a
counter value is used multiple times, then the confidentiality of all of the plaintext
blocks corresponding to that counter value may be compromised. In particular, if
any plaintext block that is encrypted using a given counter value is known, then
the output of the encryption function can be determined easily from the associated
ciphertext block. This output allows any other plaintext blocks that are encrypted
using the same counter value to be easily recovered from their associated ciphertext
blocks.
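The following sketch implements CTR mode and then shows why a repeated counter value is fatal. It assumes a hypothetical 64-bit toy Feistel function `toy_encrypt` in place of a real block cipher, and the messages and counter values are made up for illustration.

```python
import hashlib

BLOCK = 8    # block size in bytes
ROUNDS = 4

def _xor(a, b):
    # zip truncates, so a partial final block uses only the leading bytes.
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key, rnd, half):
    # Round function: truncated SHA-256 of key, round index, and half-block.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:len(half)]

def toy_encrypt(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(ROUNDS):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right

def ctr(key, initial_counter, data):
    """CTR encryption and decryption are the same operation."""
    out = bytearray()
    for j, i in enumerate(range(0, len(data), BLOCK)):
        # T_j: the counter is incremented by 1 per block, modulo 2^b.
        t = ((initial_counter + j) % 2**(8 * BLOCK)).to_bytes(BLOCK, "big")
        out += _xor(data[i:i + BLOCK], toy_encrypt(key, t))
    return bytes(out)

key = b"demo key"
m1 = b"known plaintext!"
m2 = b"secret message.."

# Contrary to the requirement, both messages start at the same counter value.
c1 = ctr(key, 0, m1)
c2 = ctr(key, 0, m2)

# An opponent who knows m1 recovers the encryption output without the key ...
keystream = _xor(c1, m1)
# ... and uses it to read the other message.
recovered = _xor(c2, keystream)
print(recovered)
```

Starting the second message at counter value 2 instead (one more than the last counter value used for the first message) would give it a fresh set of encrypted counters and defeat this recovery.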
One way to ensure the uniqueness of counter values is to continue to
increment the counter value by 1 across messages. That is, the first counter value of
each message is one more than the last counter value of the preceding message.
[LIPM00] lists the following advantages of CTR mode.
■ Hardware efficiency: Unlike the three chaining modes, encryption (or decryp-
tion) in CTR mode can be done in parallel on multiple blocks of plaintext or
ciphertext. For the chaining modes, the algorithm must complete the computa-
tion on one block before beginning on the next block. This limits the maximum
throughput of the algorithm to the reciprocal of the time for one execution of
block encryption or decryption. In CTR mode, the throughput is only limited
by the amount of parallelism that is achieved.
Figure 7.7 Counter (CTR) Mode: (a) Encryption; (b) Decryption
■ Software efficiency: Similarly, because of the opportunities for parallel execu-
tion in CTR mode, processors that support parallel features, such as aggressive
pipelining, multiple instruction dispatch per clock cycle, a large number of reg-
isters, and SIMD instructions, can be effectively utilized.
■ Preprocessing: The execution of the underlying encryption algorithm does
not depend on input of the plaintext or ciphertext. Therefore, if sufficient
memory is available and security is maintained, preprocessing can be used to
prepare the output of the encryption boxes that feed into the XOR functions,
as in Figure 7.7. When the plaintext or ciphertext input is presented, then
the only computation is a series of XORs. Such a strategy greatly enhances
throughput.
■ Random access: The ith block of plaintext or ciphertext can be processed in
random-access fashion. With the chaining modes, block Ci cannot be com-
puted until the i - 1 prior blocks are computed. There may be applications in
which a ciphertext is stored and it is desired to decrypt just one block; for such
applications, the random access feature is attractive.
■ Provable security: It can be shown that CTR is at least as secure as the other
modes discussed in this chapter.
■ Simplicity: Unlike ECB and CBC modes, CTR mode requires only the imple-
mentation of the encryption algorithm and not the decryption algorithm. This
matters most when the decryption algorithm differs substantially from the en-
cryption algorithm, as it does for AES. In addition, the decryption key schedul-
ing need not be implemented.
Note that, with the exception of ECB, all of the NIST-approved block ci-
pher modes of operation involve feedback. This is clearly seen in Figure 7.8. To
highlight the feedback mechanism, it is useful to think of the encryption function
as taking input from an input register whose length equals the encryption block
length and with output stored in an output register. The input register is updated
one block at a time by the feedback mechanism. After each update, the encryp-
tion algorithm is executed, producing a result in the output register. Meanwhile,
a block of plaintext is accessed. Note that both OFB and CTR produce output
that is independent of both the plaintext and the ciphertext. Thus, they are natu-
ral candidates for stream ciphers that encrypt plaintext by XOR one full block at
a time.
7.7 XTS-AES MODE FOR BLOCK-ORIENTED
STORAGE DEVICES
In 2010, NIST approved an additional block cipher mode of operation, XTS-AES.
This mode is also an IEEE standard, IEEE Std 1619-2007, which was developed
by the IEEE Security in Storage Working Group (P1619). The standard describes
a method of encryption for data stored in sector-based devices where the threat
model includes possible access to stored data by the adversary. The standard has
received widespread industry support.
Tweakable Block Ciphers
The XTS-AES mode is based on the concept of a tweakable block cipher, intro-
duced in [LISK02], which functions in much the same manner as a salt used with
passwords, as described in Chapter 22. The form of this concept used in XTS-AES
was first described in [ROGA04].
Figure 7.8 Feedback Characteristic of Modes of Operation: (a) Cipher block chaining (CBC) mode, (b) Cipher feedback (CFB) mode, (c) Output feedback (OFB) mode, (d) Counter (CTR) mode
Before examining XTS-AES, let us consider the general structure of a tweakable
block cipher. A tweakable block cipher is one that has three inputs: a plaintext
P, a symmetric key K, and a tweak T; and produces a ciphertext output C. We
can write this as C = E(K, T, P). The tweak need not be kept secret. Whereas the
purpose of the key is to provide security, the purpose of the tweak is to provide
variability. That is, the use of different tweaks with the same plaintext and same key
produces different outputs. The basic structure of several tweakable block ciphers
that have been implemented is shown in Figure 7.9. Encryption can be expressed as:
C = H(T) ⊕ E(K, H(T) ⊕ P)
where H is a hash function. For decryption, the same structure is used with the
ciphertext as input and decryption as the function instead of encryption. To see that
this works, we can write
H(T) ⊕ C = E(K, H(T) ⊕ P)
D(K, H(T) ⊕ C) = H(T) ⊕ P
H(T) ⊕ D(K, H(T) ⊕ C) = P
It is now easy to construct a block cipher mode of operation by using a differ-
ent tweak value on each block. In essence, the ECB mode is used but for each block
the tweak is changed. This overcomes the principal security weakness of ECB,
which is that two encryptions of the same block yield the same ciphertext.
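The construction C = H(T) ⊕ E(K, H(T) ⊕ P) can be sketched as follows. This is a minimal illustration under stated assumptions: the block cipher E is replaced by a toy four-round Feistel cipher built from SHA-256 (it is not AES and offers no security claim), H is a truncated SHA-256, and all function names are hypothetical.

```python
import hashlib

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, rnd: int, half: bytes) -> bytes:
    # Round function of the toy cipher (assumption: SHA-256 based).
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy 128-bit Feistel cipher standing in for E(K, .)."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, _xor(left, _round(key, rnd, right))
    return left + right

def toy_decrypt(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_encrypt, standing in for D(K, .)."""
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _round(key, rnd, left)), left
    return left + right

def tweak_hash(tweak: bytes) -> bytes:
    """H(T): maps a public tweak to a 128-bit mask."""
    return hashlib.sha256(b"tweak" + tweak).digest()[:16]

def tweak_encrypt(key: bytes, tweak: bytes, p: bytes) -> bytes:
    h = tweak_hash(tweak)
    return _xor(h, toy_encrypt(key, _xor(h, p)))   # C = H(T) xor E(K, H(T) xor P)

def tweak_decrypt(key: bytes, tweak: bytes, c: bytes) -> bytes:
    h = tweak_hash(tweak)
    return _xor(h, toy_decrypt(key, _xor(h, c)))   # P = H(T) xor D(K, H(T) xor C)

key = b"example-key"
p = b"same 16B block!!"
c1 = tweak_encrypt(key, b"block-0", p)
c2 = tweak_encrypt(key, b"block-1", p)   # same key and plaintext, different tweak
```

Using a different tweak for each block position gives ECB-like independence between blocks without ECB's repetition of identical ciphertext blocks.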
Figure 7.9 Tweakable Block Cipher: (a) Encryption, (b) Decryption
Storage Encryption Requirements
The requirements for encrypting stored data, also referred to as “data at rest,”
differ somewhat from those for transmitted data. The P1619 standard was designed to
have the following characteristics:
1. The ciphertext is freely available for an attacker. Among the circumstances
that lead to this situation:
a. A group of users has authorized access to a database. Some of the records in
the database are encrypted so that only specific users can successfully read/write
them. Other users can retrieve an encrypted record but are unable to
read it without the key.
b. An unauthorized user manages to gain access to encrypted records.
c. A data disk or laptop is stolen, giving the adversary access to the encrypted
data.
2. The data layout is not changed on the storage medium and in transit. The en-
crypted data must be the same size as the plaintext data.
3. Data are accessed in fixed sized blocks, independently from each other. That is,
an authorized user may access one or more blocks in any order.
4. Encryption is performed in 16-byte blocks, independently from other blocks
(except the last two plaintext blocks of a sector, if its size is not a multiple of
16 bytes).
5. There are no other metadata used, except the location of the data blocks
within the whole data set.
6. The same plaintext is encrypted to different ciphertexts at different locations,
but always to the same ciphertext when written to the same location again.
7. A standard conformant device can be constructed for decryption of data en-
crypted by another standard conformant device.
The P1619 group considered some of the existing modes of operation for use with
stored data. For CTR mode, an adversary with write access to the encrypted media can
flip any bit of the plaintext simply by flipping the corresponding ciphertext bit.
Next, consider requirement 6 and the use of CBC. To enforce the requirement
that the same plaintext encrypts to different ciphertext in different locations, the IV
could be derived from the sector number. Each sector contains multiple blocks. An
adversary with read/write access to the encrypted disk can copy a ciphertext sec-
tor from one position to another, and an application reading the sector off the new
location will still get the same plaintext sector (except perhaps the first 128 bits).
For example, this means that an adversary that is allowed to read a sector from the
second position but not the first can find the content of the sector in the first posi-
tion by manipulating the ciphertext. Another weakness is that an adversary can flip
any bit of the plaintext by flipping the corresponding ciphertext bit of the previous
block, with the side-effect of “randomizing” the previous block.
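The CTR weakness noted above (flipping a ciphertext bit flips the corresponding plaintext bit) can be demonstrated with a short sketch. It assumes a SHA-256 based keystream as a stand-in for a block cipher in counter mode; the key, message, and function names are illustrative only.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Counter-driven keystream (stand-in for E(K, counter); not AES)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def ctr_xor(key: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts by XOR with the keystream."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

key = b"disk-key"
plaintext = b"PAY 0100 USD"
ciphertext = bytearray(ctr_xor(key, plaintext))

# An adversary with write access who guesses that byte 4 holds the digit '0'
# can turn it into '9' without knowing the key.
ciphertext[4] ^= ord("0") ^ ord("9")

tampered = ctr_xor(key, bytes(ciphertext))
```

Decrypting the modified ciphertext yields a message that differs from the original only in the targeted byte, which is why CTR alone was considered unsuitable when the adversary can write to the medium.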
Operation on a Single Block
Figure 7.10 shows the encryption and decryption of a single block. The operation in-
volves two instances of the AES algorithm with two keys. The following parameters
are associated with the algorithm.
Key: The 256- or 512-bit XTS-AES key; this is parsed as a concatenation of two
fields of equal size called Key1 and Key2, such that Key = Key1 ‖ Key2.
Pj: The jth block of plaintext. All blocks except possibly the final block have a
length of 128 bits. A plaintext data unit, typically a disk sector, consists of a
sequence of plaintext blocks P1, P2, …, Pm.
Cj: The jth block of ciphertext. All blocks except possibly the final block have a
length of 128 bits.
j: The sequential number of the 128-bit block inside the data unit.
i: The value of the 128-bit tweak. Each data unit (sector) is assigned a
tweak value that is a nonnegative integer. The tweak values are assigned
consecutively, starting from an arbitrary nonnegative integer.
α: A primitive element of GF(2^128) that corresponds to polynomial x
(i.e., 0000…010 in binary).
α^j: α multiplied by itself j times, in GF(2^128).
⊕: Bitwise XOR.
⊗: Modular multiplication of two polynomials with binary coefficients modulo
x^128 + x^7 + x^2 + x + 1. Thus, this is multiplication in GF(2^128).
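Multiplication by α in GF(2^128) reduces to a one-bit shift followed by a conditional reduction. The sketch below assumes field elements are held as Python integers in which bit k is the coefficient of x^k; the byte ordering used by the IEEE standard is not modeled, and the function names are hypothetical.

```python
MASK = (1 << 128) - 1
REDUCTION = 0x87  # x^7 + x^2 + x + 1, the low-order terms of the modulus

def mul_alpha(t: int) -> int:
    """Multiply t by alpha (the polynomial x) modulo x^128 + x^7 + x^2 + x + 1."""
    t <<= 1
    if t >> 128:                      # an x^128 term appeared: reduce it
        t = (t & MASK) ^ REDUCTION
    return t

def mul_alpha_pow(t: int, j: int) -> int:
    """Multiply t by alpha^j by applying mul_alpha j times."""
    for _ in range(j):
        t = mul_alpha(t)
    return t
```

For example, `mul_alpha_pow(1, 128)` returns `0x87`, reflecting that x^128 is congruent to x^7 + x^2 + x + 1 under this modulus.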
Figure 7.10 XTS-AES Operation on Single Block: (a) Encryption, (b) Decryption
In essence, the parameter j functions much like the counter in CTR mode. It
assures that if the same plaintext block appears at two different positions within a
data unit, it will encrypt to two different ciphertext blocks. The parameter i functions
much like a nonce at the data unit level. It assures that, if the same plaintext block
appears at the same position in two different data units, it will encrypt to two differ-
ent ciphertext blocks. More generally, it assures that the same plaintext data unit will
encrypt to two different ciphertext data units for two different data unit positions.
The encryption and decryption of a single block can be described as
XTS-AES block operation
Encryption:
T = E(K2, i) ⊗ α^j
PP = P ⊕ T
CC = E(K1, PP)
C = CC ⊕ T
Decryption:
T = E(K2, i) ⊗ α^j
CC = C ⊕ T
PP = D(K1, CC)
P = PP ⊕ T
To see that decryption recovers the plaintext, let us expand the last line of both en-
cryption and decryption. For encryption, we have
C = CC ⊕ T = E(K1, PP) ⊕ T = E(K1, P ⊕ T) ⊕ T
and for decryption, we have
P = PP ⊕ T = D(K1, CC) ⊕ T = D(K1, C ⊕ T) ⊕ T
Now, we substitute for C:
P = D(K1, C ⊕ T) ⊕ T
= D(K1, [E(K1, P ⊕ T) ⊕ T] ⊕ T) ⊕ T
= D(K1, E(K1, P ⊕ T)) ⊕ T
= (P ⊕ T) ⊕ T = P
Operation on a Sector
The plaintext of a sector or data unit is organized into blocks of 128 bits. Blocks are
labeled P0, P1, …, Pm. The last block may be null or may contain from 1 to 127 bits.
In other words, the input to the XTS-AES algorithm consists of m 128-bit blocks
and possibly a final partial block.
For encryption and decryption, each block is treated independently and
encrypted/decrypted as shown in Figure 7.10. The only exception occurs when the
last block has fewer than 128 bits. In that case, the last two blocks are
encrypted/decrypted using a ciphertext-stealing technique instead of padding. Figure 7.11 shows
the scheme. $P_{m-1}$ is the last full plaintext block, and $P_m$ is the final plaintext block,
which contains s bits with 1 ≤ s ≤ 127. $C_{m-1}$ is the last full ciphertext block, and
$C_m$ is the final ciphertext block, which contains s bits. This technique is commonly
called ciphertext stealing because the processing of the last block “steals” a temporary
ciphertext of the penultimate block to complete the cipher block.
Let us label the block encryption and decryption algorithms of Figure 7.10 as
Block encryption: XTS-AES-blockEnc(K, Pj, i, j)
Block decryption: XTS-AES-blockDec(K, Cj, i, j)
Then, XTS-AES mode is defined as follows:
XTS-AES mode with null final block
Encryption:
$C_j$ = XTS-AES-blockEnc(K, $P_j$, i, j)    j = 0, …, m − 1
Decryption:
$P_j$ = XTS-AES-blockDec(K, $C_j$, i, j)    j = 0, …, m − 1
XTS-AES mode with final block containing s bits
Encryption:
$C_j$ = XTS-AES-blockEnc(K, $P_j$, i, j)    j = 0, …, m − 2
XX = XTS-AES-blockEnc(K, $P_{m-1}$, i, m − 1)
CP = $\mathrm{LSB}_{128-s}$(XX)
YY = $P_m$ ‖ CP
$C_{m-1}$ = XTS-AES-blockEnc(K, YY, i, m)
$C_m$ = $\mathrm{MSB}_s$(XX)
Decryption:
$P_j$ = XTS-AES-blockDec(K, $C_j$, i, j)    j = 0, …, m − 2
YY = XTS-AES-blockDec(K, $C_{m-1}$, i, m)
CP = $\mathrm{LSB}_{128-s}$(YY)
XX = $C_m$ ‖ CP
$P_{m-1}$ = XTS-AES-blockDec(K, XX, i, m − 1)
$P_m$ = $\mathrm{MSB}_s$(YY)
Figure 7.11 XTS-AES Mode: (a) Encryption, (b) Decryption
As can be seen, XTS-AES mode, like CTR mode, is suitable for parallel oper-
ation. Because there is no chaining, multiple blocks can be encrypted or decrypted
simultaneously. Unlike CTR mode, XTS-AES mode includes a nonce (the param-
eter i) as well as a counter (parameter j).
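The sector-level procedure, including ciphertext stealing, can be sketched as below. The sketch rests on several assumptions that are not part of the standard: AES is replaced by a toy Feistel cipher built from SHA-256, field elements are plain integers rather than the standard's byte ordering, the partial block length s is a whole number of bytes, and every function name is hypothetical. On decryption, the block $C_{m-1}$ is processed with index m and the reassembled block XX with index m − 1, mirroring the indices used during encryption.

```python
import hashlib

BLOCK = 16                      # bytes per block (128 bits)
MASK = (1 << 128) - 1

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]

def toy_enc(key: bytes, blk: bytes) -> bytes:
    """Toy 4-round Feistel cipher standing in for AES encryption."""
    l, r = blk[:8], blk[8:]
    for rnd in range(4):
        l, r = r, xor(l, _f(key, rnd, r))
    return l + r

def toy_dec(key: bytes, blk: bytes) -> bytes:
    l, r = blk[:8], blk[8:]
    for rnd in reversed(range(4)):
        l, r = xor(r, _f(key, rnd, l)), l
    return l + r

def tweak(k2: bytes, i: int, j: int) -> bytes:
    """T = E(K2, i) multiplied by alpha^j in GF(2^128)."""
    t = int.from_bytes(toy_enc(k2, i.to_bytes(BLOCK, "big")), "big")
    for _ in range(j):
        t <<= 1
        if t >> 128:
            t = (t & MASK) ^ 0x87
    return t.to_bytes(BLOCK, "big")

def block_enc(k1, k2, p, i, j):
    t = tweak(k2, i, j)
    return xor(toy_enc(k1, xor(p, t)), t)        # C = E(K1, P xor T) xor T

def block_dec(k1, k2, c, i, j):
    t = tweak(k2, i, j)
    return xor(toy_dec(k1, xor(c, t)), t)        # P = D(K1, C xor T) xor T

def sector_enc(k1, k2, data, i):
    m, s = divmod(len(data), BLOCK)              # m full blocks, s leftover bytes
    out = [block_enc(k1, k2, data[j*BLOCK:(j+1)*BLOCK], i, j) for j in range(m)]
    if s:
        xx = out.pop()                           # XX: P_{m-1} encrypted with index m-1
        yy = data[m*BLOCK:] + xx[s:]             # YY = P_m || CP
        out += [block_enc(k1, k2, yy, i, m), xx[:s]]   # C_{m-1}, then C_m
    return b"".join(out)

def sector_dec(k1, k2, data, i):
    m, s = divmod(len(data), BLOCK)
    if not s:
        return b"".join(block_dec(k1, k2, data[j*BLOCK:(j+1)*BLOCK], i, j)
                        for j in range(m))
    out = [block_dec(k1, k2, data[j*BLOCK:(j+1)*BLOCK], i, j) for j in range(m - 1)]
    yy = block_dec(k1, k2, data[(m-1)*BLOCK:m*BLOCK], i, m)   # C_{m-1} used index m
    xx = data[m*BLOCK:] + yy[s:]                               # XX = C_m || CP
    out += [block_dec(k1, k2, xx, i, m - 1), yy[:s]]           # P_{m-1}, then P_m
    return b"".join(out)

k1, k2 = b"data-key", b"tweak-key"
sector = bytes(range(40))                        # two full blocks plus 8 bytes
ct = sector_enc(k1, k2, sector, 5)
```

The ciphertext has the same length as the plaintext, as the storage requirements demand, and the sketch assumes at least one full block precedes any partial block.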
7.8 FORMAT-PRESERVING ENCRYPTION
Format-preserving encryption (FPE) refers to any encryption technique that takes
a plaintext in a given format and produces a ciphertext in the same format. For
example, credit cards consist of 16 decimal digits. An FPE that can accept this type of
input would produce a ciphertext output of 16 decimal digits. Note that the ciphertext
need not be, and in fact is unlikely to be, a valid credit card number. But it will have
the same format and can be stored in the same way as credit card number plaintext.
A simple encryption algorithm is not format preserving, with the exception
that it preserves the format of binary strings. For example, Table 7.2 shows three
types of plaintext for which it might be desired to perform FPE. The third row
shows examples of what might be generated by an FPE algorithm. The fourth row
shows (in hexadecimal) what is produced by AES with a given key.
Motivation
FPE facilitates the retrofitting of encryption technology to legacy applications,
where a conventional encryption mode might not be feasible because it would dis-
rupt data fields/pathways. FPE has emerged as a useful cryptographic tool, whose
applications include financial-information security, data sanitization, and transpar-
ent encryption of fields in legacy databases.
The principal benefit of FPE is that it enables protection of particular data
elements in a legacy database that did not provide encryption of those data ele-
ments, while still enabling workflows that were in place before FPE was in use. With
FPE, as opposed to ordinary AES encryption or TDEA encryption, no database
schema changes and minimal application changes are required. Only applications
that need to see the plaintext of a data element need to be modified and generally
these modifications will be minimal.
Some examples of legacy applications where FPE is desirable:
■ COBOL data-processing applications: Any changes in the structure of a record
require corresponding changes in all code that references that record
structure. Typical code sizes involve hundreds of modules, each containing
around 5,000–10,000 lines on average.
■ Database applications: Fields that are specified to take only character strings
cannot be used to store conventionally encrypted binary ciphertext. Base64
encoding of such binary ciphertext is not always feasible without increase in
data lengths, requiring augmentation of corresponding field lengths.
■ FPE-encrypted characters can be significantly compressed for efficient
transmission. This cannot be said about AES-encrypted binary ciphertext.
Table 7.2 Comparison of Format-Preserving Encryption and AES
            Credit Card           Tax ID             Bank Account Number
Plaintext   8123 4512 3456 6780   219-09-9999        800N2982K-22
FPE         8123 4521 7292 6780   078-05-1120        709G9242H-35
AES (hex)   af411326466add24      7b9af4f3f218ab25   9720ec7f793096ff
            c86abd8aa525db7a      07c7376869313afa   d37141242e1c51bd
Difficulties in Designing an FPE
A general-purpose standardized FPE should meet a number of requirements:
1. The ciphertext is of the same length and format as the plaintext.
2. It should be adaptable to work with a variety of character and number types.
Examples include decimal digits, lowercase alphabetic characters, and the full
character set of a standard keyboard or international keyboard.
3. It should work with variable plaintext lengths.
4. Security strength should be comparable to that achieved with AES.
5. Security should be strong even for very small plaintext lengths.
Meeting the first requirement is not at all straightforward. As illustrated in
Table 7.2, a straightforward encryption with AES yields a 128-bit binary block that
does not resemble the required format. Also, a standard symmetric block cipher is
not easily adaptable to produce an FPE.
Consider a simple example. Assume that we want an algorithm that can encrypt
decimal digit strings of maximum length of 32 digits. The input to the algorithm
can be stored in 16 bytes (128 bits) by encoding each digit as four bits and
using the corresponding binary value for each digit (e.g., 6 is encoded as 0110).
Next, we use AES to encrypt the 128-bit block, in the following fashion:
1. The plaintext input X is represented by the string of 4-bit decimal digits
X[1] . . . X[32]. If the plaintext is less than 32 digits long, it is padded out to the
left (most significant) with zeros.
2. Treating X as a 128-bit binary string and using key K, form ciphertext
Y = $\mathrm{AES}_K$(X).
3. Treat Y as a string of length 32 of 4-bit elements.
4. Some of the entries in Y may have values greater than 9 (e.g., 1100). To generate
ciphertext Z in the required format, calculate
Z[i] = Y[i] mod 10, 1 ≤ i ≤ 32
This generates a ciphertext of 32 decimal digits, which conforms to the desired
format. However, this algorithm does not meet a basic requirement of
any encryption algorithm: reversibility. It is impossible to decrypt Z to recover
the original plaintext X because the operation is one-way; that is, it is a
many-to-one function. For example, 12 mod 10 = 2 mod 10 = 2. Thus, we need to
design a reversible function that is both a secure encryption algorithm and format
preserving.
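The loss of reversibility in the mod 10 step can be seen directly. The following minimal sketch (function names are illustrative) splits a byte string into 4-bit elements, reduces each modulo 10, and shows two different inputs that collapse to the same output.

```python
def nibbles(block: bytes) -> list:
    """Split a byte string into 4-bit elements, most significant first."""
    out = []
    for byte in block:
        out.extend((byte >> 4, byte & 0x0F))
    return out

def to_decimal_digits(block: bytes) -> list:
    """Z[i] = Y[i] mod 10 for every 4-bit element of Y."""
    return [n % 10 for n in nibbles(block)]

y1 = bytes([0x2C])   # 4-bit elements 2 and 12
y2 = bytes([0x22])   # 4-bit elements 2 and 2

# Which 4-bit values map to each decimal digit?
preimages = {d: [n for n in range(16) if n % 10 == d] for d in range(10)}
```

Both `y1` and `y2` reduce to the digits 2, 2, and `preimages` shows that each of the digits 0 through 5 has two possible 4-bit sources, so Z cannot be mapped back to a unique Y.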
A second difficulty in designing an FPE is that some of the input strings are
quite short. For example, consider the 16-digit credit card number (CCN). The first
six digits provide the issuer identification number (IIN), which identifies the insti-
tution that issued the card. The final digit is a check digit to catch typographical
errors or other mistakes. The remaining nine digits are the user’s account number.
However, a number of applications require that the last four digits be in the clear
(the check digit plus three account digits) for applications such as credit card re-
ceipts, which leaves only six digits for encryption. Now suppose that an adversary
is able to obtain a number of plaintext/ciphertext pairs. Each such pair corresponds
to not just one CCN, but multiple CCNs that have the same middle six digits. In a
large database of credit card numbers, there may be multiple card numbers with
the same middle six digits. An adversary may be able to assemble a large dictionary
mapping known six-digit plaintexts to their corresponding ciphertexts. This
could be used to decrypt unknown ciphertexts from the database. As pointed out
in [BELL10a], in a database of 100 million entries, on average about 100 CCNs
will share any given middle-six digits. Thus, if the adversary has learned k CCNs
and gains access to such a database, the adversary can decrypt approximately
100k CCNs.
The solution to this second difficulty is to use a tweakable block cipher; this
concept is described in Section 7.7. For example, the tweak for CCNs could be the
first two and last four digits of the CCN. Prior to encryption, the tweak is added,
digit-by-digit mod 10, to the middle six-digit plaintext, and the result is then en-
crypted. Two different CCNs with identical middle six digits will yield different
tweaked inputs and therefore different ciphertexts. Consider the following:
CCN                   Tweak    Plaintext   Plaintext + Tweak
4012 8812 3456 1884   401884   123456      524230
5105 1012 3456 6782   516782   123456      639138
Two CCNs with the same middle six digits have different tweaks and there-
fore different values to the middle six digits after the tweak is added.
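The tweak arithmetic in the table can be checked with a few lines of code. This is a small sketch, with hypothetical helper names, that forms the tweak from the first two and last four digits of the CCN and adds it digit by digit, mod 10, to the middle six digits.

```python
def split_ccn(ccn: str):
    """Return (tweak, middle six digits) for a 16-digit card number."""
    digits = ccn.replace(" ", "")
    tweak = digits[:2] + digits[-4:]     # first two and last four digits
    middle = digits[6:12]                # the six digits to be encrypted
    return tweak, middle

def add_digits_mod10(a: str, b: str) -> str:
    """Digit-by-digit addition modulo 10 of two equal-length digit strings."""
    return "".join(str((int(x) + int(y)) % 10) for x, y in zip(a, b))

for ccn in ("4012 8812 3456 1884", "5105 1012 3456 6782"):
    tweak, middle = split_ccn(ccn)
    print(ccn, tweak, middle, add_digits_mod10(middle, tweak))
```

The two card numbers share the middle digits 123456 but produce the tweaked inputs 524230 and 639138, matching the table.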
Feistel Structure for Format-Preserving Encryption
As the preceding discussion shows, the challenge with FPE is to design an algo-
rithm for scrambling the plaintext that is secure, preserves format, and is reversible.
A number of approaches have been proposed in recent years [ROGA10, BELL09]
for FPE algorithms. The majority of these proposals use a Feistel structure.
Although IBM introduced this structure with their Lucifer cipher [SMIT71] almost
half a century ago, it remains a powerful basis for implementing ciphers.
This section provides a general description of how the Feistel structure can
be used to implement an FPE. In the following section, we look at three specific
Feistel-based algorithms that are in the process of receiving NIST approval.
ENCRYPTION AND DECRYPTION Figure 7.12 shows the Feistel structure used in all of
the NIST algorithms, with encryption shown on the left-hand side and decryption
on the right-hand side. The structure in Figure 7.12 is the same as that shown in
Figure 4.3 but, to simplify the presentation, it is untwisted, not illustrating the swap
that occurs at the end of each round.
The input to the encryption algorithm is a plaintext character string of
n = u + v characters. If n is even, then u = v, otherwise u and v differ by 1. The
two parts of the string pass through an even number of rounds of processing to
produce a ciphertext block of n characters and the same format as the plaintext.
Each round i has inputs Ai and Bi, derived from the preceding round (or plaintext
for round 0).
Figure 7.12 Feistel Structure for Format-Preserving Encryption: (a) Encryption, (b) Decryption
All rounds have the same structure. On even-numbered rounds, a substitution
is performed on the left part (length u) of the data, Ai. This is done by applying the
round function FK to the right part (length v) of the data, Bi, and then performing
a modular addition of the output of FK with Ai. The modular addition function and
the selection of modulus are described subsequently. On odd-numbered rounds,
the substitution is done on the right part of the data. FK is a one-way function that
converts the input into a binary string, performs a scrambling transformation on the
string, and then converts the result back into a character string of suitable format
and length. The function has as parameters the secret key K, the plaintext length n,
a tweak T, and the round number i.
Note that on even-numbered rounds, FK has an input of v characters, and that
the modular addition produces a result of u characters, whereas on odd-numbered
rounds, FK has an input of u characters, and that the modular addition produces a
result of v characters. The total number of rounds is even, so that the output consists
of an A portion of length u concatenated with a B portion of length v, matching the
partition of the plaintext.
The process of decryption is essentially the same as the encryption process.
The differences are: (1) the addition function is replaced by a subtraction function
that is its inverse; and (2) the order of the round indices is reversed.
To demonstrate that the decryption produces the correct result, Figure 7.12b
shows the encryption process going down the left-hand side and the decryption pro-
cess going up the right-hand side. The diagram indicates that, at every round, the
intermediate value of the decryption process is equal to the corresponding value of
the encryption process. We can walk through the figure to validate this, starting at
the bottom. The ciphertext is produced at the end of round r − 1 as a string of the
form $A_r \| B_r$, with $A_r$ and $B_r$ having string lengths u and v, respectively. Encryption
round r − 1 can be described with the following equations:
$A_r = B_{r-1}$
$B_r = A_{r-1} + F_K[B_{r-1}]$
Because we define the subtraction function to be the inverse of the addition
function, these equations can be rewritten:
$B_{r-1} = A_r$
$A_{r-1} = B_r - F_K[B_{r-1}]$
It can be seen that the last two equations describe the action of round 0 of the
decryption function, so that the output of round 0 of decryption equals the input
of round r - 1 of encryption. This correspondence holds all the way through the r
iterations, as is easily shown.
Note that the derivation does not require that F be a reversible function. To
see this, take a limiting case in which F produces a constant output (e.g., all ones)
regardless of the values of its input. The equations still hold.
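The structure described above can be sketched for decimal strings. This is an illustrative sketch, not one of the NIST algorithms: the round function is a SHA-256 hash of the key, tweak, length, round number, and input half (a one-way function, so clearly not reversible), the round count of 8 is an arbitrary even number, and all names are hypothetical.

```python
import hashlib

RADIX = 10

def num(digits: list) -> int:
    """Numeral string to integer, most significant numeral first."""
    x = 0
    for d in digits:
        x = x * RADIX + d
    return x

def to_str(x: int, m: int) -> list:
    """Integer to numeral string of length m."""
    out = []
    for _ in range(m):
        out.append(x % RADIX)
        x //= RADIX
    return out[::-1]

def round_fn(key: bytes, tweak: bytes, n: int, i: int, part: list, m: int) -> int:
    """One-way F_K: returns an integer in [0, RADIX**m)."""
    data = key + tweak + bytes([n, i]) + bytes(part)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % RADIX ** m

def fpe_encrypt(key: bytes, tweak: bytes, digits: list, rounds: int = 8) -> list:
    n = len(digits)
    u = n // 2
    v = n - u
    a, b = digits[:u], digits[u:]
    for i in range(rounds):
        m = u if i % 2 == 0 else v
        c = (num(a) + round_fn(key, tweak, n, i, b, m)) % RADIX ** m
        a, b = b, to_str(c, m)            # A <- B, B <- C
    return a + b

def fpe_decrypt(key: bytes, tweak: bytes, digits: list, rounds: int = 8) -> list:
    n = len(digits)
    u = n // 2
    v = n - u
    a, b = digits[:u], digits[u:]
    for i in reversed(range(rounds)):     # round indices in reverse order
        m = u if i % 2 == 0 else v
        c = (num(b) - round_fn(key, tweak, n, i, a, m)) % RADIX ** m
        a, b = to_str(c, m), a            # subtraction undoes the addition
    return a + b

plain = [1, 2, 3, 4, 5, 6, 7]
cipher = fpe_encrypt(b"key", b"tweak", plain)
```

The ciphertext has the same length and alphabet as the plaintext, and decryption recovers the input even though `round_fn` itself is never inverted.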
CHARACTER STRINGS The NIST algorithms, and the other FPE algorithms that have
been proposed, are used with plaintext consisting of a string of elements, called
characters. Specifically, a finite set of two or more symbols is called an alphabet,
and the elements of an alphabet are called characters. A character string is a finite
sequence of characters from an alphabet. Individual characters may repeat in the
string. The number of different characters in an alphabet is called the base, also
referred to as the radix of the alphabet. For example, the lowercase English alpha-
bet a, b, c, . . . has a radix, or base, of 26. For purposes of encryption and decryption,
the plaintext alphabet must be converted to numerals, where a numeral is a non-
negative integer that is less than the base. For example, for the lowercase alphabet,
the assignment could be characters a, b, c, . . . , z map into 0, 1, 2, . . . , 25.
A limitation of this approach is that all of the elements in a plaintext format
must have the same radix. So, for example, an identification number that consists
of an alphabetic character followed by nine numeric digits cannot be handled in
format-preserving fashion by the FPEs that have been implemented so far.
The NIST document defines notation for specifying these conversions
(Table 7.3a). To begin, it is assumed that the character string is represented by
a numeral string. To convert a numeral string X into a number x, the function
$\mathrm{NUM}_{radix}(X)$ is used. Viewing X as the string X[1] . . . X[m] with the most
significant numeral first, the function is defined as
$$\mathrm{NUM}_{radix}(X) = \sum_{i=1}^{m} X[i]\, radix^{m-i} = \sum_{i=0}^{m-1} X[m-i]\, radix^{i}$$
Observe that $0 \le \mathrm{NUM}_{radix}(X) < radix^{m}$ and that $0 \le X[i] < radix$.
Table 7.3 Notation and Parameters Used in FPE Algorithms
(a) Notation
$[x]^s$: Converts an integer into a byte string; it is the string of s bytes that encodes the
number x, with $0 \le x < 2^{8s}$. The equivalent notation is $\mathrm{STR}_2^{8s}(x)$.
LEN(X): Length of the character string X.
$\mathrm{NUM}_{radix}(X)$: Converts strings to numbers. The number that the numeral string X represents
in base radix, with the most significant character first. In other words, it is the
nonnegative integer less than $radix^{\mathrm{LEN}(X)}$ whose most-significant-character-first
representation in base radix is X.
$\mathrm{PRF}_K(X)$: A pseudorandom function that produces a 128-bit output with X as the input,
using encryption key K.
$\mathrm{STR}_{radix}^{m}(x)$: Given a nonnegative integer x less than $radix^m$, this function produces a
representation of x as a string of m characters in base radix, with the most significant
character first.
[i .. j]: The set of integers between two integers i and j, including i and j.
X[i .. j]: The substring of characters of a string X from X[i] to X[j], including X[i] and X[j].
REV(X): Given a bit string, X, the string that consists of the bits of X in reverse order.
(b) Parameters
radix: The base, or number of characters, in a given plaintext alphabet.
tweak: Input parameter to the encryption and decryption functions whose confidentiality
is not protected by the mode.
tweakradix: The base for tweak strings.
minlen: Minimum message length, in characters.
maxlen: Maximum message length, in characters.
maxTlen: Maximum tweak length.
For example, consider the string zaby in radix 26, which converts to the
numeral string 25 0 1 24. This converts to the number
$x = (25 \times 26^3) + (1 \times 26^1) + 24 = 439450$. To go in the opposite direction and
convert from a number $x < radix^m$ to a numeral string X of length m, the function
$\mathrm{STR}_{radix}^{m}(x)$ is used:
$$\mathrm{STR}_{radix}^{m}(x) = X[1] \ldots X[m], \quad \text{where } X[i] = \left\lfloor \frac{x}{radix^{m-i}} \right\rfloor \bmod radix, \quad i = 1, \ldots, m$$
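The two conversions can be written directly from their definitions. The sketch below uses hypothetical function names (`num_radix`, `str_radix`) and the mapping a, b, …, z to 0, 1, …, 25 from the text, then reproduces the zaby example.

```python
def num_radix(numerals: list, radix: int) -> int:
    """NUM_radix(X): most significant numeral first."""
    x = 0
    for d in numerals:
        x = x * radix + d
    return x

def str_radix(x: int, radix: int, m: int) -> list:
    """STR^m_radix(x): X[i] = floor(x / radix^(m-i)) mod radix, i = 1..m."""
    return [(x // radix ** (m - i)) % radix for i in range(1, m + 1)]

numerals = [ord(ch) - ord("a") for ch in "zaby"]   # a..z mapped to 0..25
x = num_radix(numerals, 26)
back = str_radix(x, 26, 4)
```

Here `numerals` is `[25, 0, 1, 24]`, `x` is 439450, and `back` equals `numerals`, confirming that the two functions are inverses for strings of a fixed length.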
With the mapping of characters to numerals and the use of the NUM func-
tion, a plaintext character string can be mapped to a number and stored as an
unsigned integer. We would like to treat this unsigned integer as a bit string that
can be input to a bit-scrambling algorithm in FK. However, different platforms store
unsigned integers differently, some in little-endian and some in big-endian fashion.
So one more step is needed. By the definition of the STR function, $\mathrm{STR}_2^{8s}(x)$ will
generate a bit string of length 8s, equivalently a byte string of length s, which is a
binary integer with the most significant bit first, regardless of how x is stored as an
unsigned integer. For convenience the following notation is used: $[x]^s = \mathrm{STR}_2^{8s}(x)$.
Thus, $[\mathrm{NUM}_{radix}(X)]^s$ will convert the character string X into an unsigned integer
and then convert that to a byte string of length s bytes with the most significant
bit first.
Continuing the preceding example should help clarify the issues involved.
Character string: “zaby”
Numeral string X representation of character string: 25 0 1 24
Convert X to number $x = \mathrm{NUM}_{26}(X)$:
    decimal: 439450
    hex: 6B49A
    binary: 1101011010010011010
x stored on big-endian byte order machine as a 32-bit unsigned integer:
    hex: 00 06 B4 9A
    binary: 00000000000001101011010010011010
x stored on little-endian byte order machine as a 32-bit unsigned integer:
    hex: 9A B4 06 00
    binary: 10011010101101000000011000000000
Convert x, regardless of endian format, to a bit string of length 32 bits (4 bytes), expressed as $[x]^4$:
    00000000000001101011010010011010
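In Python, the byte-string conversion corresponds to `int.to_bytes` with an explicit byte order, which removes any dependence on how the platform stores integers. The helper name `byte_string` below is an assumption used only for illustration.

```python
def byte_string(x: int, s: int) -> bytes:
    """[x]^s: the s-byte encoding of x, most significant byte first."""
    return x.to_bytes(s, "big")

x = 439450
big = byte_string(x, 4)                  # platform-independent representation
little = x.to_bytes(4, "little")         # what a little-endian machine stores
bits = format(int.from_bytes(big, "big"), "032b")
```

`big.hex()` gives `0006b49a` and `little.hex()` gives `9ab40600`, matching the two storage layouts in the example, while `bits` is the 32-bit string shown in the last row.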
THE FUNCTION FK We can now define in general terms the function FK. The
core of FK is some type of randomizing function whose input and output are bit
strings. For convenience, the strings should be multiples of 8 bits, forming byte
strings. Define m to be u for even rounds and v for odd rounds; this specifies
the desired output character string length. Define b to be the number of bytes
needed to store the number representing a character string of m characters. Then the
round, including FK, consists of the following general steps (A and B refer to Ai
and Bi for round i):
1. $Q \leftarrow [\mathrm{NUM}_{radix}(B)]^b$: Converts numeral string B into byte string Q of
length b bytes.
2. $Y \leftarrow \mathrm{RAN}[Q]$: A pseudorandom function PRNF that produces
a pseudorandom byte string Y as a function of the bits of Q.
3. $y \leftarrow \mathrm{NUM}_2(Y)$: Converts Y into unsigned integer.
4. $c \leftarrow (\mathrm{NUM}_{radix}(A) + y) \bmod radix^m$: Converts numeral string A into an integer and
adds to y, modulo $radix^m$.
5. $C \leftarrow \mathrm{STR}_{radix}^{m}(c)$: Converts c into a numeral string C of length m.
6. $A \leftarrow B$; $B \leftarrow C$: Completes the round by placing the unchanged
value of B from the preceding round into A, and placing C into B.
Steps 1 through 3 constitute the round function FK. Step 3 is presented with Y,
which is an unstructured bit string. Because different platforms may store unsigned
integers using different word lengths and endian conventions, it is necessary to per-
form NUM2(Y) to get an unsigned integer y. The stored bit sequence for y may or
may not be identical to the bit sequence for Y.
As mentioned, the pseudorandom function in step 2 need not be reversible. Its
purpose is to provide a randomized, scrambled bit string. For DES, this is achieved
by using fixed S-boxes, as described in Appendix S. Virtually all FPE schemes that
use the Feistel structure use AES as the basis for the scrambling function to achieve
stronger security.
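The six steps of a round can be followed one by one in code. This sketch makes several assumptions: HMAC-SHA-256 stands in for the pseudorandom function RAN (the schemes discussed in the text use AES), the inputs n, T, and the round number are omitted for brevity, the byte length is computed from the length of B, and the function names are hypothetical. An inverse round is included only to show that the step is reversible.

```python
import hashlib
import hmac

def num_radix(numerals: list, radix: int) -> int:
    x = 0
    for d in numerals:
        x = x * radix + d
    return x

def str_radix(x: int, radix: int, m: int) -> list:
    return [(x // radix ** (m - i)) % radix for i in range(1, m + 1)]

def _prf_value(key: bytes, part: list, radix: int) -> int:
    # Enough bytes to hold any number that `part` can represent.
    nbytes = max(1, ((radix ** len(part) - 1).bit_length() + 7) // 8)
    q = num_radix(part, radix).to_bytes(nbytes, "big")          # step 1
    y_bytes = hmac.new(key, q, hashlib.sha256).digest()         # step 2
    return int.from_bytes(y_bytes, "big")                       # step 3

def feistel_round(key: bytes, a: list, b: list, radix: int, m: int):
    y = _prf_value(key, b, radix)
    c = (num_radix(a, radix) + y) % radix ** m                  # step 4
    c_str = str_radix(c, radix, m)                              # step 5
    return b, c_str                                             # step 6

def inverse_round(key: bytes, a_new: list, b_new: list, radix: int, m: int):
    y = _prf_value(key, a_new, radix)       # a_new is the unchanged old B
    a_old = str_radix((num_radix(b_new, radix) - y) % radix ** m, radix, m)
    return a_old, a_new

a, b = [1, 2, 3], [4, 5, 6]
new_a, new_b = feistel_round(b"round-key", a, b, 10, 3)
```

After the round, `new_a` is the old B and `new_b` is a fresh three-digit numeral string; `inverse_round` recovers the original pair without inverting the pseudorandom function.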
RELATIONSHIP BETWEEN RADIX, MESSAGE LENGTH, AND BIT LENGTH Consider
a numeral string X of length len and base radix. If we convert this to a number
$x = \mathrm{NUM}_{radix}(X)$, then the maximum value of x is $radix^{len} - 1$. The number of bits
needed to encode x is
$$bitlen = \lceil \log_2(radix^{len}) \rceil = \lceil len \cdot \log_2(radix) \rceil$$

P = M ‖ pad[r](|M|)
s = 0^b
for i = 0 to |P|_r − 1 do
    s = s ⊕ (P_i ‖ 0^(b−r))
    s = f(s)
end for
Z = ⌊s⌋_r
while |Z|_r · r < ℓ do
    s = f(s)
    Z = Z ‖ ⌊s⌋_r
end while
return ⌊Z⌋_ℓ
In the algorithm definition, the following notation is used: |M| is the length
in bits of a bit string M. A bit string M can be considered as a sequence of blocks
of some fixed length x, where the last block may be shorter. The number of
blocks of M is denoted by |M|_x. The blocks of M are denoted by M_i and the index
ranges from 0 to |M|_x − 1. The expression ⌊M⌋_ℓ denotes the truncation of M to
its first ℓ bits.
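The absorbing and squeezing phases of the algorithm can be sketched at byte granularity. This is a toy under clearly stated assumptions: the state is 32 bytes rather than 1600 bits, the function `f` is a SHA-256 based mixing step standing in for Keccak-f (it is not a permutation), and the padding is a byte-level simplification of the pad10*1 rule. None of the names below come from a library.

```python
import hashlib

B = 32   # state width in bytes (toy value)
R = 8    # rate in bytes; the capacity is B - R bytes

def f(state: bytes) -> bytes:
    """Stand-in for the iteration function; NOT Keccak-f."""
    return hashlib.sha256(state).digest()          # 32 bytes, same as B

def pad(message_len: int) -> bytes:
    """Byte-level padding to a multiple of R: first bit set, last bit set."""
    n = R - (message_len % R)                      # between 1 and R bytes
    padding = bytearray(n)
    padding[0] ^= 0x01
    padding[-1] ^= 0x80
    return bytes(padding)

def sponge(message: bytes, out_len: int) -> bytes:
    p = message + pad(len(message))
    s = bytes(B)                                   # s = 0^b
    for i in range(len(p) // R):                   # absorbing phase
        block = p[i * R:(i + 1) * R] + bytes(B - R)    # P_i || 0^(b-r)
        s = f(bytes(x ^ y for x, y in zip(s, block)))
    z = s[:R]                                      # Z = first r bits of s
    while len(z) < out_len:                        # squeezing phase
        s = f(s)
        z += s[:R]
    return z[:out_len]                             # truncate Z to the output length

digest = sponge(b"abc", 20)
```

Only the first R bytes of the state are ever exposed or directly modified by input, which is the role the capacity plays in the security argument.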
11.6 / SHA-3 369
Table 11.5 SHA-3 Parameters

Message Digest Size          224         256         384         512
Message Size                 no maximum  no maximum  no maximum  no maximum
Block Size (bitrate r)       1152        1088        832         576
Word Size                    64          64          64          64
Number of Rounds             24          24          24          24
Capacity c                   448         512         768         1024
Collision Resistance         2^112       2^128       2^192       2^256
Second Preimage Resistance   2^224       2^256       2^384       2^512

Note: All sizes and security levels are measured in bits.
SHA-3 makes use of the iteration function f, labeled Keccak-f, which is
described in the next section. The overall SHA-3 function is a sponge function
expressed as Keccak[r, c] to reflect that SHA-3 has two operational parameters, r,
the message block size, and c, the capacity, with the default of r + c = 1600 bits.
Table 11.5 shows the supported values of r and c. As Table 11.5 shows, the hash
function security associated with the sponge construction is a function of the
capacity c.
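The regularities in Table 11.5 can be checked with a short sketch. The relations used below (capacity equal to twice the digest size, and security exponents of c/4 and c/2) are read off the table rather than quoted from a specification, and the function name is hypothetical.

```python
STATE_BITS = 1600  # default r + c


def sha3_parameters(digest_bits):
    """Derive the Table 11.5 row values for one digest size."""
    capacity = 2 * digest_bits                 # c, as observed in Table 11.5
    bitrate = STATE_BITS - capacity            # r = 1600 - c
    return {
        "bitrate": bitrate,
        "capacity": capacity,
        "collision_exponent": capacity // 4,         # collision resistance 2^(c/4)
        "second_preimage_exponent": capacity // 2,   # second preimage resistance 2^(c/2)
    }


for d in (224, 256, 384, 512):
    print(d, sha3_parameters(d))
```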
In terms of the sponge algorithm defined above, Keccak[r, c] is defined as
$$\text{Keccak}[r, c] \triangleq \text{SPONGE}[\text{Keccak-}f[r + c],\ \text{pad10*1},\ r]$$
We now turn to a discussion of the iteration function Keccak-f.
The SHA-3 Iteration Function f
We now examine the iteration function Keccak-f used to process each successive
block of the input message. Recall that f takes as input a 1600-bit variable s
consisting of r bits, corresponding to the message block size, followed by c bits,
referred to as the capacity. For internal processing within f, the input state
variable s is organized as a 5 × 5 × 64 array a. The 64-bit units are referred to
as lanes. For our purposes, we generally use the notation a[x, y, z] to refer to an
individual bit within the state array. When we are more concerned with operations
that affect entire lanes, we designate the 5 × 5 matrix as L[x, y], where each entry
in L is a 64-bit lane. The use of indices within this matrix is shown in
Figure 11.16 (see footnote 4). Thus, the columns are labeled x = 0 through x = 4,
the rows are labeled y = 0 through y = 4, and the individual bits within a lane are
labeled z = 0 through z = 63. The mapping between the bits of s and those of a is
s[64(5y + x) + z] = a[x, y, z]
4 Note that the first index (x) designates a column and the second index (y) designates a row. This is
in conflict with the convention used in most mathematics sources, where the first index designates a
row and the second index designates a column (e.g., Knuth, D. The Art of Computer Programming,
Volume 1, Fundamental Algorithms; and Korn, G., and Korn, T. Mathematical Handbook for Scientists
and Engineers).
We can visualize this with respect to the matrix in Figure 11.16. When treat-
ing the state as a matrix of lanes, the first lane in the lower left corner, L[0, 0], cor-
responds to the first 64 bits of s. The lane in the second column, lowest row, L[1,
0], corresponds to the next 64 bits of s. Thus, the array a is filled with the bits of s
starting with row y = 0 and proceeding row by row.
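The mapping s[64(5y + x) + z] = a[x, y, z] can be made concrete with a small sketch. It assumes the state is held as a list of 1600 bits and that bit z of a lane is stored with weight 2^z in a Python integer; both are representation choices made here for illustration, and the function names are hypothetical.

```python
def state_to_lanes(s):
    """Split a 1600-bit state (list of 0/1 ints) into lanes indexed L[x][y]."""
    if len(s) != 1600:
        raise ValueError("state must be 1600 bits")
    L = [[0] * 5 for _ in range(5)]
    for y in range(5):
        for x in range(5):
            for z in range(64):
                L[x][y] |= s[64 * (5 * y + x) + z] << z
    return L


def lanes_to_state(L):
    """Inverse mapping: emit bits row by row (y), then column (x), then z."""
    return [(L[x][y] >> z) & 1 for y in range(5) for x in range(5) for z in range(64)]


s = [0] * 1600
s[64] = 1          # first bit of the second lane: a[1, 0, 0]
s[320 + 3] = 1     # a[0, 1, 3], the first lane of row y = 1
L = state_to_lanes(s)
```

Here bit 64 of s lands in L[1, 0], the second lane of the lowest row, which matches the row-by-row filling order described above.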
STRUCTURE OF f The function f is executed once for each input block of the message
to be hashed. The function takes as input the 1600-bit state variable and converts
it into a 5 × 5 matrix of 64-bit lanes. This matrix then passes through 24 rounds of
processing. Each round consists of five steps, and each step updates the state matrix
by permutation or substitution operations. As shown in Figure 11.17, the rounds are
identical with the exception of the final step in each round, which is modified by a
round constant that differs for each round.
The application of the five steps can be expressed as the composition (see
footnote 5) of functions:
$$R = \iota \circ \chi \circ \pi \circ \rho \circ \theta$$
Table 11.6 summarizes the operation of the five steps. The steps have a sim-
ple description leading to a specification that is compact and in which no trapdoor
can be hidden. The operations on lanes in the specification are limited to bitwise
Boolean operations (XOR, AND, NOT) and rotations. There is no need for table
lookups, arithmetic operations, or data-dependent rotations. Thus, SHA-3 is easily
and efficiently implemented in either hardware or software.
We examine each of the step functions in turn.
Figure 11.16 SHA-3 State Matrix: (a) state variable as a 5 × 5 matrix A of 64-bit words, with columns x = 0 through x = 4, rows y = 0 through y = 4, and L[0, 0] in the lower left corner; (b) bit labeling of 64-bit words, from a[x, y, 0] through a[x, y, 63].
5 If f and g are two functions, then the function F with the equation y = F(x) = g[f(x)] is called the
composition of f and g and is denoted as F = g ∘ f.
Figure 11.17 SHA-3 Iteration Function f: the state s passes through rounds 0 to 23; each round applies the theta (θ), rho (ρ), pi (π), chi (χ), and iota (ι) steps in order, with side inputs rot(x, y) and the round constants RC[0] through RC[23].
Table 11.6 Step Functions in SHA-3

Function  Type          Description
θ         Substitution  New value of each bit in each word depends on its current
                        value and on one bit in each word of preceding column
                        and one bit of each word in succeeding column.
ρ         Permutation   The bits of each word are permuted using a circular bit
                        shift. W[0, 0] is not affected.
π         Permutation   Words are permuted in the 5 × 5 matrix. W[0, 0] is not
                        affected.
χ         Substitution  New value of each bit in each word depends on its current
                        value and on one bit in next word in the same row and one
                        bit in the second next word in the same row.
ι         Substitution  W[0, 0] is updated by XOR with a round constant.
THETA STEP FUNCTION The Keccak reference defines the θ function as follows. For
bit z in column x, row y,
$$\theta:\ a[x, y, z] \leftarrow a[x, y, z] \oplus \sum_{y'=0}^{4} a[(x - 1), y', z] \oplus \sum_{y'=0}^{4} a[(x + 1), y', (z - 1)] \qquad (11.1)$$
where the summations are XOR operations. We can see more clearly what this
operation accomplishes with reference to Figure 11.18a. First, define the bitwise
XOR of the lanes in column x as
C[x] = L[x, 0] ⊕ L[x, 1] ⊕ L[x, 2] ⊕ L[x, 3] ⊕ L[x, 4]
Consider lane L[x, y] in column x, row y. The first summation in Equation 11.1
performs a bitwise XOR of the lanes in column (x − 1) mod 5 to form the 64-bit
lane C[x − 1]. The second summation performs a bitwise XOR of the lanes in
column (x + 1) mod 5, and then rotates the bits within the 64-bit lane so that the
bit in position z is mapped into position (z + 1) mod 64. This forms the lane
ROT(C[x + 1], 1). These two lanes and L[x, y] are combined by bitwise XOR to form
the updated value of L[x, y]. This can be expressed as
L[x, y] ← L[x, y] ⊕ C[x − 1] ⊕ ROT(C[x + 1], 1)
Figure 11.18a illustrates the operation on L[2, 3]. The same operation is
performed on all of the other lanes in the matrix.
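The lane-level form of theta translates almost directly into code. The sketch below assumes lanes are Python integers in which bit z has weight 2^z, so that mapping bit z to position z + 1 is a left rotation; the grid is indexed L[x][y], and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def rot(lane, n):
    """Rotate a 64-bit lane so that bit z moves to position (z + n) mod 64."""
    n %= 64
    return ((lane << n) | (lane >> (64 - n))) & MASK64


def theta(L):
    """Apply the theta step to a 5x5 grid of lanes L[x][y]; returns a new grid."""
    # Column parities C[x]
    C = [L[x][0] ^ L[x][1] ^ L[x][2] ^ L[x][3] ^ L[x][4] for x in range(5)]
    return [[L[x][y] ^ C[(x - 1) % 5] ^ rot(C[(x + 1) % 5], 1) for y in range(5)]
            for x in range(5)]


# A state with a single 1 bit, in L[0, 0]
state = [[0] * 5 for _ in range(5)]
state[0][0] = 1
out = theta(state)
ones = sum(bin(out[x][y]).count("1") for x in range(5) for y in range(5))
print(ones)
```

Starting from a single 1 bit in L[0, 0], the result has that bit plus one bit in each lane of columns 1 and 4, which gives a quick feel for how far one application of theta spreads a change.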
Figure 11.18 Theta and Chi Step Functions: (a) θ step function, L[2, 3] ← L[2, 3] ⊕ C[1] ⊕ ROT(C[3], 1); (b) χ step function, L[2, 3] ← L[2, 3] ⊕ ((NOT L[3, 3]) AND L[4, 3]).
Several observations are in order. Each bit in a lane is updated using the bit itself
and one bit in the same bit position from each lane in the preceding column and one
bit in the adjacent bit position from each lane in the succeeding column. Thus the up-
dated value of each bit depends on 11 bits. This provides good mixing. Also, the theta
step provides good diffusion, as that term was defined in Chapter 4. The designers of
Keccak state that the theta step provides a high level of diffusion on average and that
without theta, the round function would not provide diffusion of any significance.
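RHO STEP FUNCTION The ρ function is defined as follows:
$$\rho:\ a[x, y, z] \leftarrow a[x, y, z] \quad \text{if } x = y = 0$$
otherwise,
$$\rho:\ a[x, y, z] \leftarrow a\left[x,\ y,\ \left(z - \frac{(t + 1)(t + 2)}{2}\right)\right] \qquad (11.2)$$
with t satisfying 0 ≤ t < 24 and
$$\begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix}^{t} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix} \quad \text{in } \mathrm{GF}(5)^{2 \times 2}$$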
It is not immediately obvious what this step performs, so let us look at the
process in detail.
1. The lane in position (x, y) = (0, 0), that is L[0, 0], is unaffected. For all other
words, a circular bit shift within the lane is performed.
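2. The variable t, with 0 ≤ t < 24, is used to determine both the amount of the
circular bit shift and which lane is assigned which shift value.
3. The 24 individual bit shifts that are performed have the respective values
(t + 1)(t + 2)/2 mod 64.
4. The shift determined by the value of t is performed on the lane in position
(x, y) in the 5 × 5 matrix of lanes. Specifically, for each value of t, the
corresponding matrix position is defined by
$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix}^{t} \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$
For example, for t = 3, we have
$$\begin{aligned}
\begin{pmatrix} x \\ y \end{pmatrix}
&= \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix}^{3} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \bmod 5 \\
&= \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \bmod 5 \\
&= \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 0 \\ 2 \end{pmatrix} \bmod 5 \\
&= \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 2 \\ 6 \end{pmatrix} \bmod 5
 = \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 2 \\ 1 \end{pmatrix} \bmod 5 \\
&= \begin{pmatrix} 1 \\ 7 \end{pmatrix} \bmod 5 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}
\end{aligned}$$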
Table 11.7 shows the calculations that are performed to determine the amount
of the bit shift and the location of each bit shift value. Note that all of the rotation
amounts are different.
The r function thus consists of a simple permutation (circular shift) within
each lane. The intent is to provide diffusion within each lane. Without this function,
diffusion between lanes would be very slow.
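The rotation amounts and their lane positions can be regenerated from the two formulas above. The following sketch does so and then applies the rotations; it assumes lanes are Python integers with bit z at weight 2^z, so that moving bit z − g(t) to position z is a left rotation. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def rho_offsets():
    """Return {(x, y): rotation amount} for all 25 lane positions."""
    offsets = {(0, 0): 0}                  # L[0, 0] is not rotated
    x, y = 1, 0                            # position for t = 0
    for t in range(24):
        offsets[(x, y)] = ((t + 1) * (t + 2) // 2) % 64
        x, y = y, (2 * x + 3 * y) % 5      # multiply by [[0, 1], [2, 3]] mod 5
    return offsets


def rho(L):
    """Rotate each lane L[x][y] by its offset; returns a new grid."""
    out = [[0] * 5 for _ in range(5)]
    for (x, y), n in rho_offsets().items():
        lane = L[x][y]
        out[x][y] = ((lane << n) | (lane >> (64 - n))) & MASK64
    return out


offsets = rho_offsets()
print(offsets[(1, 2)], offsets[(2, 0)])   # the t = 3 and t = 18 entries
```

The generated values can be compared entry by entry with both parts of Table 11.7.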
Table 11.7 Rotation Values Used in SHA-3

(a) Calculation of values and positions

t    g(t)  g(t) mod 64  x, y
0    1     1            1, 0
1    3     3            0, 2
2    6     6            2, 1
3    10    10           1, 2
4    15    15           2, 3
5    21    21           3, 3
6    28    28           3, 0
7    36    36           0, 1
8    45    45           1, 3
9    55    55           3, 1
10   66    2            1, 4
11   78    14           4, 4
12   91    27           4, 0
13   105   41           0, 3
14   120   56           3, 4
15   136   8            4, 3
16   153   25           3, 2
17   171   43           2, 2
18   190   62           2, 0
19   210   18           0, 4
20   231   39           4, 2
21   253   61           2, 4
22   276   20           4, 1
23   300   44           1, 1

Note: g(t) = (t + 1)(t + 2)/2 and
$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix}^{t} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \bmod 5$$

(b) Rotation values by word position in matrix

        x = 0  x = 1  x = 2  x = 3  x = 4
y = 4   18     2      61     56     14
y = 3   41     45     15     21     8
y = 2   3      10     43     25     39
y = 1   36     44     6      55     20
y = 0   0      1      62     28     27

PI STEP FUNCTION The π function is defined as follows:
$$\pi:\ a[x, y] \leftarrow a[x', y'], \quad \text{with } \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix} \qquad (11.3)$$
This can be rewritten as (x, y) → (y, (2x + 3y)). Thus, the lanes within the
5 × 5 matrix are moved so that the new x position equals the old y position and the
new y position is determined by (2x + 3y) mod 5. Figure 11.19 helps in visualizing
this permutation. Lanes that are along the same diagonal (increasing in y value,
going from left to right) prior to π are arranged on the same row in the matrix after
π is executed. Note that the position of L[0, 0] is unchanged.

Figure 11.19 Pi Step Function: (a) lane position at start of step, with lane Z[x, y] in column x, row y; (b) lane position after permutation:

        x = 0    x = 1    x = 2    x = 3    x = 4
y = 4   Z[2, 0]  Z[3, 1]  Z[4, 2]  Z[0, 3]  Z[1, 4]
y = 3   Z[4, 0]  Z[0, 1]  Z[1, 2]  Z[2, 3]  Z[3, 4]
y = 2   Z[1, 0]  Z[2, 1]  Z[3, 2]  Z[4, 3]  Z[0, 4]
y = 1   Z[3, 0]  Z[4, 1]  Z[0, 2]  Z[1, 3]  Z[2, 4]
y = 0   Z[0, 0]  Z[1, 1]  Z[2, 2]  Z[3, 3]  Z[4, 4]
Thus the π step is a permutation of lanes: The lanes move position within the
5 × 5 matrix. The ρ step is a permutation of bits: Bits within a lane are rotated.
Note that the π step matrix positions are calculated in the same way that, for the ρ
step, the one-dimensional sequence of rotation constants is mapped to the lanes of
the matrix.
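The lane permutation can be checked with a short sketch that moves labels rather than data. The grid is indexed L[x][y], and each lane at old position (x, y) is written to new position (y, (2x + 3y) mod 5); the function and variable names are hypothetical.

```python
def pi(L):
    """Move the lane at old position (x, y) to new position (y, (2x + 3y) mod 5)."""
    new = [[None] * 5 for _ in range(5)]
    for x in range(5):
        for y in range(5):
            new[y][(2 * x + 3 * y) % 5] = L[x][y]
    return new


# Label every lane with its starting coordinates, then permute
labels = [[(x, y) for y in range(5)] for x in range(5)]
after = pi(labels)
bottom_row = [after[x][0] for x in range(5)]
print(bottom_row)
```

The bottom row after the permutation holds the lanes that started on the main diagonal, and the lane at (0, 0) stays in place, in line with the description above.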
CHI STEP FUNCTION The χ function is defined as follows:
$$\chi:\ a[x] \leftarrow a[x] \oplus \big((a[x + 1] \oplus 1)\ \text{AND}\ a[x + 2]\big) \qquad (11.4)$$
This function operates to update each bit based on its current value and the
value of the corresponding bit position in the next two lanes in the same row. The
operation is more clearly seen if we consider a single bit a[x, y, z] and write out the
Boolean expression:
a[x, y, z] ← a[x, y, z] ⊕ (NOT(a[x + 1, y, z]) AND a[x + 2, y, z])
Figure 11.18b illustrates the operation of the χ function on the bits of the
lane L[2, 3]. This is the only one of the step functions that is a nonlinear mapping.
Without it, the SHA-3 round function would be linear.

Table 11.8 Round Constants in SHA-3

Round  Constant (hexadecimal)  Number of 1 bits
0      0000000000000001        1
1      0000000000008082        3
2      800000000000808A        5
3      8000000080008000        3
4      000000000000808B        5
5      0000000080000001        2
6      8000000080008081        5
7      8000000000008009        4
8      000000000000008A        3
9      0000000000000088        2
10     0000000080008009        4
11     000000008000000A        3
12     000000008000808B        6
13     800000000000008B        5
14     8000000000008089        5
15     8000000000008003        4
16     8000000000008002        3
17     8000000000000080        2
18     000000000000800A        3
19     800000008000000A        4
20     8000000080008081        5
21     8000000000008080        3
22     0000000080000001        2
23     8000000080008008        4
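Returning to the chi step, Equation 11.4 can be applied to whole lanes at once with bitwise operators. The sketch below assumes a grid of 64-bit Python integers indexed L[x][y] and uses XOR with an all-ones mask for NOT; the names are hypothetical. The small experiment at the end illustrates the nonlinearity claim by comparing chi(A ⊕ B) with chi(A) ⊕ chi(B).

```python
MASK64 = (1 << 64) - 1


def chi(L):
    """Apply chi to a 5x5 grid of 64-bit lanes L[x][y]; returns a new grid."""
    return [[L[x][y] ^ ((L[(x + 1) % 5][y] ^ MASK64) & L[(x + 2) % 5][y])
             for y in range(5)]
            for x in range(5)]


def xor_grids(A, B):
    return [[A[x][y] ^ B[x][y] for y in range(5)] for x in range(5)]


def single_bit(x, y):
    """Grid with only bit 0 of lane (x, y) set."""
    grid = [[0] * 5 for _ in range(5)]
    grid[x][y] = 1
    return grid


A = single_bit(2, 0)
B = single_bit(1, 0)
lhs = chi(xor_grids(A, B))
rhs = xor_grids(chi(A), chi(B))
print(lhs == rhs)   # a linear map would give True here
```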
IOTA STEP FUNCTION The ι function is defined as follows:
$$\iota:\ a \leftarrow a \oplus RC[i_r] \qquad (11.5)$$
This function combines an array element with a round constant that differs for
each round. It breaks up any symmetry induced by the other four step functions. In
fact, Equation 11.5 is somewhat misleading. The round constant is applied only to
the first lane of the internal state array. We express this as follows:
$$L[0, 0] \leftarrow L[0, 0] \oplus RC[i_r], \qquad 0 \le i_r \le 23$$
Table 11.8 lists the 24 64-bit round constants. Note that the Hamming weight,
or number of 1 bits, in the round constants ranges from 1 to 6. Most of the bit
positions are zero and thus do not change the corresponding bits in L[0, 0]. If we take
the cumulative OR of all 24 round constants, we get
RC[0] OR RC[1] OR ... OR RC[23] = 800000008000808B
Thus, only 7 bit positions are active and can affect the value of L[0, 0].
Of course, from round to round, the permutations and substitutions propagate the
effects of the ι function to all of the lanes and all of the bit positions in the matrix.
It is easily seen that the disruption diffuses through θ and χ to all lanes of the state
after a single round.
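The statements about the round constants are easy to check mechanically. The sketch below copies the 24 values from Table 11.8, applies the iota step to a grid of lanes indexed L[x][y], and computes the Hamming weights and the cumulative OR; the function name is hypothetical.

```python
from functools import reduce

# Round constants as listed in Table 11.8
RC = [
    0x0000000000000001, 0x0000000000008082, 0x800000000000808A, 0x8000000080008000,
    0x000000000000808B, 0x0000000080000001, 0x8000000080008081, 0x8000000000008009,
    0x000000000000008A, 0x0000000000000088, 0x0000000080008009, 0x000000008000000A,
    0x000000008000808B, 0x800000000000008B, 0x8000000000008089, 0x8000000000008003,
    0x8000000000008002, 0x8000000000000080, 0x000000000000800A, 0x800000008000000A,
    0x8000000080008081, 0x8000000000008080, 0x0000000080000001, 0x8000000080008008,
]


def iota(L, round_index):
    """XOR the round constant into lane L[0][0]; all other lanes are copied unchanged."""
    new = [column[:] for column in L]
    new[0][0] ^= RC[round_index]
    return new


weights = [bin(rc).count("1") for rc in RC]
cumulative_or = reduce(lambda a, b: a | b, RC)
print(min(weights), max(weights))
print(f"{cumulative_or:016X}", bin(cumulative_or).count("1"))
```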
11.7 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS
Key Terms
absorbing phase
big endian
birthday attack
birthday paradox
bitrate
capacity
Chi step function
collision resistant
compression function
cryptographic hash function
hash code
hash function
hash value
Iota step function
Keccak
keyed hash function
lane
little endian
MD4
MD5
message authentication code (MAC)
message digest
one-way hash function
Pi step function
preimage resistant
Rho step function
second preimage resistant
SHA-1
SHA-224
SHA-256
SHA-3
SHA-384
SHA-512
sponge construction
squeezing phase
strong collision resistance
Theta step function
weak collision resistance
Review Questions
11.1 What characteristics are needed in a secure hash function?
11.2 What is the difference between weak and strong collision resistance?
11.3 What is the role of a compression function in a hash function?
11.4 What is the difference between little-endian and big-endian format?
11.5 What basic arithmetical and logical functions are used in SHA?
11.6 Describe the set of criteria used by NIST to evaluate SHA-3 candidates.
11.7 Define the term sponge construction.
11.8 Briefly describe the internal structure of the iteration function f.
11.9 List and briefly describe the step functions that comprise the iteration function f.
Problems
11.1 The high-speed transport protocol XTP (Xpress Transfer Protocol) uses a 32-bit
checksum function defined as the concatenation of two 16-bit functions: XOR and
RXOR, defined in Section 11.4 as “two simple hash functions” and illustrated in
Figure 11.5.
a. Will this checksum detect all errors caused by an odd number of error bits?
Explain.
b. Will this checksum detect all errors caused by an even number of error bits? If not,
characterize the error patterns that will cause the checksum to fail.
c. Comment on the effectiveness of this function for use as a hash function for
authentication.
11.2 a. Consider the Davies and Price hash code scheme described in Section 11.4 and
assume that DES is used as the encryption algorithm:
$$H_i = H_{i-1} \oplus E(M_i, H_{i-1})$$
Recall the complementarity property of DES (Problem 3.14): If Y = E(K, X),
then Y′ = E(K′, X′). Use this property to show how a message consisting of
blocks M_1, M_2, ..., M_N can be altered without altering its hash code.
b. Show that a similar attack will succeed against the scheme proposed in [MEYE88]:
$$H_i = M_i \oplus E(H_{i-1}, M_i)$$
11.3 a. Consider the following hash function. Messages are in the form of a sequence of
numbers in Z_n, M = (a_1, a_2, ..., a_t). The hash value h is calculated as
$\left(\sum_{i=1}^{t} a_i\right)$ for some predefined value n. Does this hash function satisfy
any of the requirements for a hash function listed in Table 11.1? Explain your answer.
b. Repeat part (a) for the hash function $h = \left(\sum_{i=1}^{t} (a_i)^2\right) \bmod n$.
c. Calculate the hash function of part (b) for M = (189, 632, 900, 722, 349) and
n = 989.
11.4 It is possible to use a hash function to construct a block cipher with a structure similar
to DES. Because a hash function is one way and a block cipher must be reversible (to
decrypt), how is it possible?
11.5 Now consider the opposite problem: using an encryption algorithm to construct
a one-way hash function. Consider using RSA with a known key. Then process a
message consisting of a sequence of blocks as follows: Encrypt the first block, XOR
the result with the second block and encrypt again, etc. Show that this scheme is not
secure by solving the following problem. Given a two-block message B1, B2, and
its hash
RSAH(B1,B2) = RSA(RSA(B1) ⊕ B2)
Given an arbitrary block C1, choose C2 so that RSAH(C1, C2) = RSAH(B1, B2).
Thus, the hash function does not satisfy weak collision resistance.
11.6 Suppose H(m) is a collision-resistant hash function that maps a message of arbitrary
bit length into an n-bit hash value. Is it true that, for all messages x, x′ with x ≠ x′,
we have H(x) ≠ H(x′)? Explain your answer.
11.7 In Figure 11.12, it is assumed that an array of 80 64-bit words is available to store the
values of W_t, so that they can be precomputed at the beginning of the processing of
a block. Now assume that space is at a premium. As an alternative, consider the use
of a 16-word circular buffer that is initially loaded with W_0 through W_15. Design an
algorithm that, for each step t, computes the required input value W_t.
11.8 For SHA-512, show the equations for the values of W_16, W_18, W_23, and W_31.
11.9 State the value of the padding field in SHA-512 if the length of the message is
a. 2942 bits
b. 2943 bits
c. 2944 bits
11.10 State the value of the length field in SHA-512 if the length of the message is
a. 2942 bits
b. 2943 bits
c. 2944 bits
11.11 Suppose a_1 a_2 a_3 a_4 are the 4 bytes in a 32-bit word. Each a_i can be viewed as an integer
in the range 0 to 255, represented in binary. In a big-endian architecture, this word
represents the integer
$$a_1 2^{24} + a_2 2^{16} + a_3 2^{8} + a_4$$
In a little-endian architecture, this word represents the integer
$$a_4 2^{24} + a_3 2^{16} + a_2 2^{8} + a_1$$
a. Some hash functions, such as MD5, assume a little-endian architecture. It is
important that the message digest be independent of the underlying architecture.
Therefore, to perform the modulo 2^32 addition operation of MD5 or RIPEMD-160 on
a big-endian architecture, an adjustment must be made. Suppose X = x_1 x_2 x_3 x_4
and Y = y_1 y_2 y_3 y_4. Show how the MD5 addition operation (X + Y) would be
carried out on a big-endian machine.
b. SHA assumes a big-endian architecture. Show how the operation (X + Y) for
SHA would be carried out on a little-endian machine.
11.12 This problem introduces a hash function similar in spirit to SHA that operates on
letters instead of binary data. It is called the toy tetragraph hash (tth) (see
footnote 6). Given a message consisting of a sequence of letters, tth produces a hash
value consisting of four letters. First, tth divides the message into blocks of 16
letters, ignoring spaces, punctuation, and capitalization. If the message length is not
divisible by 16, it is padded out with nulls. A four-number running total is maintained
that starts out with the value (0, 0, 0, 0); this is input to the compression function
for processing the first block. The compression function consists of two rounds.
Round 1 Get the next block of text and arrange it as a row-wise 4 × 4 block of text
and convert it to numbers (A = 0, B = 1, etc.). For example, for the block
ABCDEFGHIJKLMNOP, we have

A B C D        0  1  2  3
E F G H        4  5  6  7
I J K L        8  9 10 11
M N O P       12 13 14 15

Then, add each column mod 26 and add the result to the running total, mod 26. In this
example, the running total is (24, 2, 6, 10).
Round 2 Using the matrix from round 1, rotate the first row left by 1, second row left by 2,
third row left by 3, and reverse the order of the fourth row.
In our example:

B C D A        1  2  3  0
G H E F        6  7  4  5
L I J K       11  8  9 10
P O N M       15 14 13 12

Now, add each column mod 26 and add the result to the running total. The new
running total is (5, 7, 9, 11). This running total is now the input into the first round
of the compression function for the next block of text. After the final block is
processed, convert the final running total to letters. For example, if the message is
ABCDEFGHIJKLMNOP, then the hash is FHJL.
a. Draw figures comparable to Figures 11.9 and 11.10 to depict the overall tth logic
and the compression function logic.
b. Calculate the hash function for the 22-letter message “Practice makes us perfect.”
c. To demonstrate the weakness of tth, find a 32-letter message that produces
the same hash.
6 I thank William K. Mason, of the magazine staff of The Cryptogram, for providing this example.
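For readers who want to experiment with tth, the description in Problem 11.12 can be sketched as follows. This is an illustration under stated assumptions: padding nulls are treated as the value 0, only the letters A to Z are kept, and the function name is hypothetical. It reproduces the worked example from the problem statement.

```python
def tth(message):
    """Toy tetragraph hash of a message; returns four uppercase letters."""
    nums = [ord(ch) - ord("A") for ch in message.upper() if "A" <= ch <= "Z"]
    nums += [0] * (-len(nums) % 16)        # assumption: a null contributes 0
    total = [0, 0, 0, 0]
    for start in range(0, len(nums), 16):
        rows = [nums[start + 4 * i:start + 4 * i + 4] for i in range(4)]
        # Round 1: add the column sums to the running total, mod 26
        total = [(total[c] + sum(row[c] for row in rows)) % 26 for c in range(4)]
        # Round 2: rotate rows left by 1, 2, 3 and reverse the fourth row
        rows = [rows[0][1:] + rows[0][:1], rows[1][2:] + rows[1][:2],
                rows[2][3:] + rows[2][:3], rows[3][::-1]]
        total = [(total[c] + sum(row[c] for row in rows)) % 26 for c in range(4)]
    return "".join(chr(t + ord("A")) for t in total)


print(tth("ABCDEFGHIJKLMNOP"))
```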
11.13 For each of the possible capacity values of SHA-3 (Table 11.5), which lanes in the
internal 5 × 5 state matrix start out as lanes of all zeros?
11.14 Consider the SHA-3 option with a block size of 1024 bits and assume that each of the
lanes in the first message block (P0) has at least one nonzero bit. To start, all of the
lanes in the internal state matrix that correspond to the capacity portion of the initial
state are all zeros. Show how long it will take before all of these lanes have at least
one nonzero bit. Note: Ignore the permutation. That is, keep track of the original zero
lanes even after they have changed position in the matrix.
11.15 Consider the state matrix as illustrated in Figure 11.16a. Now rearrange the rows and
columns of the matrix so that L[0, 0] is in the center. Specifically, arrange the columns
in the left-to-right order (x = 3, x = 4, x = 0, x = 1, x = 2) and arrange the rows in
the top-to-bottom order (y = 2, y = 1, y = 0, y = 4, y = 3). This should give you
some insight into the permutation algorithm used for the π function and for
permuting the rotation constants in the ρ function. Using this rearranged matrix,
describe the permutation algorithm.
11.16 The ι function only affects L[0, 0]. Section 11.6 states that the changes to L[0, 0] diffuse
through θ and χ to all lanes of the state after a single round.
a. Show that this is so.
b. How long before all of the bit positions in the matrix are affected by the changes
to L[0, 0]?
CHAPTER 12
Message Authentication Codes
12.1 Message Authentication Requirements
12.2 Message Authentication Functions
     Message Encryption
     Message Authentication Code
12.3 Requirements for Message Authentication Codes
12.4 Security of MACs
     Brute-Force Attacks
     Cryptanalysis
12.5 MACs Based on Hash Functions: HMAC
     HMAC Design Objectives
     HMAC Algorithm
     Security of HMAC
12.6 MACs Based on Block Ciphers: DAA and CMAC
     Data Authentication Algorithm
     Cipher-Based Message Authentication Code (CMAC)
12.7 Authenticated Encryption: CCM and GCM
     Counter with Cipher Block Chaining-Message Authentication Code
     Galois/Counter Mode
12.8 Key Wrapping
     Background
     The Key Wrapping Algorithm
     Key Unwrapping
12.9 Pseudorandom Number Generation Using Hash Functions and MACs
     PRNG Based on Hash Function
     PRNG Based on MAC Function
12.10 Key Terms, Review Questions, and Problems
One of the most fascinating and complex areas of cryptography is that of message
authentication and the related area of digital signatures. It would be impossible, in
anything less than book length, to exhaust all the cryptographic functions and proto-
cols that have been proposed or implemented for message authentication and digital
signatures. Instead, the purpose of this chapter and the next is to provide a broad
overview of the subject and to develop a systematic means of describing the various
approaches.
This chapter begins with an introduction to the requirements for authen-
tication and digital signature and the types of attacks to be countered. Then the
basic approaches are surveyed. The remainder of the chapter deals with the funda-
mental approach to message authentication known as the message authentication
code (MAC). Following an overview of this topic, the chapter looks at security
considerations for MACs. This is followed by a discussion of specific MACs in
two categories: those built from cryptographic hash functions and those built using
a block cipher mode of operation. Next, we look at a relatively recent approach
known as authenticated encryption. Finally, we look at the use of cryptographic
hash functions and MACs for pseudorandom number generation.
LEARNING OBJECTIVES
After studying this chapter, you should be able to:
◆ List and explain the possible attacks that are relevant to message
authentication.
◆ Define the term message authentication code.
◆ List and explain the requirements for a message authentication code.
◆ Present an overview of HMAC.
◆ Present an overview of CMAC.
◆ Explain the concept of authenticated encryption.
◆ Present an overview of CCM.
◆ Present an overview of GCM.
◆ Discuss the concept of key wrapping and explain its use.
◆ Understand how a hash function or a message authentication code can be
used for pseudorandom number generation.
12.1 MESSAGE AUTHENTICATION REQUIREMENTS
In the context of communications across a network, the following attacks can be
identified.
1. Disclosure: Release of message contents to any person or process not
possessing the appropriate cryptographic key.
2. Traffic analysis: Discovery of the pattern of traffic between parties. In a
connection-oriented application, the frequency and duration of connec-
tions could be determined. In either a connection-oriented or connectionless
environment, the number and length of messages between parties could be
determined.
3. Masquerade: Insertion of messages into the network from a fraudulent source.
This includes the creation of messages by an opponent that are purported to
come from an authorized entity. Also included are fraudulent acknowledg-
ments of message receipt or nonreceipt by someone other than the message
recipient.
4. Content modification: Changes to the contents of a message, including inser-
tion, deletion, transposition, and modification.
5. Sequence modification: Any modification to a sequence of messages between
parties, including insertion, deletion, and reordering.
6. Timing modification: Delay or replay of messages. In a connection-oriented
application, an entire session or sequence of messages could be a replay of
some previous valid session, or individual messages in the sequence could be
delayed or replayed. In a connectionless application, an individual message
(e.g., datagram) could be delayed or replayed.
7. Source repudiation: Denial of transmission of message by source.
8. Destination repudiation: Denial of receipt of message by destination.
Measures to deal with the first two attacks are in the realm of message
confidentiality and are dealt with in Part One. Measures to deal with items
(3) through (6) in the foregoing list are generally regarded as message authentica-
tion. Mechanisms for dealing specifically with item (7) come under the heading of
digital signatures. Generally, a digital signature technique will also counter some
or all of the attacks listed under items (3) through (6). Dealing with item (8) may
require a combination of the use of digital signatures and a protocol designed to
counter this attack.
In summary, message authentication is a procedure to verify that received
messages come from the alleged source and have not been altered. Message au-
thentication may also verify sequencing and timeliness. A digital signature is an
authentication technique that also includes measures to counter repudiation by the
source.
12.2 MESSAGE AUTHENTICATION FUNCTIONS
Any message authentication or digital signature mechanism has two levels of func-
tionality. At the lower level, there must be some sort of function that produces an
authenticator: a value to be used to authenticate a message. This lower-level func-
tion is then used as a primitive in a higher-level authentication protocol that enables
a receiver to verify the authenticity of a message.
This section is concerned with the types of functions that may be used to pro-
duce an authenticator. These may be grouped into three classes.
■ Hash function: A function that maps a message of any length into a fixed-length
hash value, which serves as the authenticator
■ Message encryption: The ciphertext of the entire message serves as its
authenticator
■ Message authentication code (MAC): A function of the message and a secret
key that produces a fixed-length value that serves as the authenticator
Hash functions, and how they may serve for message authentication, are dis-
cussed in Chapter 11. The remainder of this section briefly examines the remaining
two topics. The remainder of the chapter elaborates on the topic of MACs.
Message Encryption
Message encryption by itself can provide a measure of authentication. The analysis
differs for symmetric and public-key encryption schemes.
SYMMETRIC ENCRYPTION Consider the straightforward use of symmetric encryption
(Figure 12.1a). A message M transmitted from source A to destination B is encrypted
using a secret key K shared by A and B. If no other party knows the key, then confi-
dentiality is provided: No other party can recover the plaintext of the message.
Figure 12.1 Basic Uses of Message Encryption (Source A to Destination B): (a) Symmetric encryption: confidentiality and authentication, transmitting E(K, M); (b) Public-key encryption: confidentiality, transmitting E(PU_b, M); (c) Public-key encryption: authentication and signature, transmitting E(PR_a, M); (d) Public-key encryption: confidentiality, authentication, and signature, transmitting E(PU_b, E(PR_a, M)).
In addition, B is assured that the message was generated by A. Why? The
message must have come from A, because A is the only other party that possesses
K and therefore the only other party with the information necessary to construct
ciphertext that can be decrypted with K. Furthermore, if M is recovered, B knows
that none of the bits of M have been altered, because an opponent that does not
know K would not know how to alter bits in the ciphertext to produce the desired
changes in the plaintext.
So we may say that symmetric encryption provides authentication as well as
confidentiality. However, this flat statement needs to be qualified. Consider exactly
what is happening at B. Given a decryption function D and a secret key K, the
destination will accept any input X and produce output Y = D(K, X). If X is the
ciphertext of a legitimate message M produced by the corresponding encryption
function, then Y is some plaintext message M. Otherwise, Y will likely be a mean-
ingless sequence of bits. There may need to be some automated means of determin-
ing at B whether Y is legitimate plaintext and therefore must have come from A.
The implications of the line of reasoning in the preceding paragraph are pro-
found from the point of view of authentication. Suppose the message M can be any
arbitrary bit pattern. In that case, there is no way to determine automatically, at the
destination, whether an incoming message is the ciphertext of a legitimate message.
This conclusion is incontrovertible: If M can be any bit pattern, then regardless of
the value of X, the value Y = D(K, X) is some bit pattern and therefore must be
accepted as authentic plaintext.
Thus, in general, we require that only a small subset of all possible bit patterns
be considered legitimate plaintext. In that case, any spurious ciphertext is unlikely
to produce legitimate plaintext. For example, suppose that only one bit pattern in
$10^6$ is legitimate plaintext. Then the probability that any randomly chosen bit
pattern, treated as ciphertext, will produce a legitimate plaintext message is only $10^{-6}$.
For a number of applications and encryption schemes, the desired conditions
prevail as a matter of course. For example, suppose that we are transmitting English-
language messages using a Caesar cipher with a shift of one (K = 1). A sends the
following legitimate ciphertext:
nbsftfbupbutboeepftfbupbutboemjuumfmbnctfbujwz
B decrypts to produce the following plaintext:
mareseatoatsanddoeseatoatsandlittlelambseativy
A simple frequency analysis confirms that this message has the profile of ordinary
English. On the other hand, if an opponent generates the following random se-
quence of letters:
zuvrsoevgqxlzwigamdvnmhpmccxiuureosfbcebtqxsxq
this decrypts to
ytuqrndufpwkyvhfzlcumlgolbbwhttqdnreabdaspwrwp
which does not fit the profile of ordinary English.
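The automated check described above can be sketched in a few lines of Python. This is a minimal illustration rather than a robust language detector: the helper names, the choice of the six letters e, t, a, o, i, n, and the 0.5 threshold are all assumptions made for this example.

```python
# Toy plausibility test for Caesar-decrypted text (illustrative only).
COMMON = set("etaoin")  # six very frequent English letters (assumed basis for the test)

def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Shift each lowercase letter back by `shift` positions."""
    return "".join(chr((ord(c) - ord("a") - shift) % 26 + ord("a")) for c in ciphertext)

def common_letter_fraction(text: str) -> float:
    """Fraction of characters that belong to the COMMON set."""
    return sum(c in COMMON for c in text) / len(text)

def looks_like_english(text: str, threshold: float = 0.5) -> bool:
    """Accept the text if enough of its letters are common English letters."""
    return common_letter_fraction(text) >= threshold

legit = caesar_decrypt("nbsftfbupbutboeepftfbupbutboemjuumfmbnctfbujwz", 1)
bogus = caesar_decrypt("zuvrsoevgqxlzwigamdvnmhpmccxiuureosfbcebtqxsxq", 1)
print(legit, looks_like_english(legit))   # accepted: 29 of 46 letters are common
print(bogus, looks_like_english(bogus))   # rejected: 9 of 46 letters are common
```

A test this crude works only because English text is highly redundant; it would be of no use for plaintext that can be an arbitrary bit pattern.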
It may be difficult to determine automatically if incoming ciphertext de-
crypts to intelligible plaintext. If the plaintext is, say, a binary object file or digi-
tized X-rays, determination of properly formed and therefore authentic plaintext
may be difficult. Thus, an opponent could achieve a certain level of disruption
simply by issuing messages with random content purporting to come from a
legitimate user.
One solution to this problem is to force the plaintext to have some structure
that is easily recognized but that cannot be replicated without recourse to the en-
cryption function. We could, for example, append an error-detecting code, also
known as a frame check sequence (FCS) or checksum, to each message before en-
cryption, as illustrated in Figure 12.2a. A prepares a plaintext message M and then
provides this as input to a function F that produces an FCS. The FCS is appended to
M and the entire block is then encrypted. At the destination, B decrypts the incom-
ing block and treats the results as a message with an appended FCS. B applies the
same function F to attempt to reproduce the FCS. If the calculated FCS is equal to
the incoming FCS, then the message is considered authentic. It is unlikely that any
random sequence of bits would exhibit the desired relationship.
Note that the order in which the FCS and encryption functions are performed
is critical. The sequence illustrated in Figure 12.2a is referred to in [DIFF79] as
internal error control, which the authors contrast with external error control
(Figure 12.2b). With internal error control, authentication is provided because an
opponent would have difficulty generating ciphertext that, when decrypted, would
have valid error control bits. If instead the FCS is the outer code, an opponent can
construct messages with valid error-control codes. Although the opponent cannot
know what the decrypted plaintext will be, he or she can still hope to create confu-
sion and disrupt operations.
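As a rough sketch of internal error control, the following Python fragment appends a CRC-32 as the FCS before encrypting. The names `keystream_xor`, `send_internal`, and `receive_internal` are hypothetical, and the XOR keystream derived from SHA-256 is only a placeholder for E and D, not a secure cipher.

```python
import hashlib
import zlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in for E and D: XOR with a SHA-256 counter keystream. NOT secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, pad))
    return bytes(out)

def fcs(message: bytes) -> bytes:
    """F: a 32-bit CRC used as the frame check sequence."""
    return zlib.crc32(message).to_bytes(4, "big")

def send_internal(key: bytes, message: bytes) -> bytes:
    """Source A: E(K, [M || F(M)])."""
    return keystream_xor(key, message + fcs(message))

def receive_internal(key: bytes, block: bytes):
    """Destination B: decrypt, split off the FCS, recompute it, and compare."""
    plain = keystream_xor(key, block)
    message, received_fcs = plain[:-4], plain[-4:]
    return message if fcs(message) == received_fcs else None

key = b"shared secret K"
block = send_internal(key, b"transfer 100 to account 42")
print(receive_internal(key, block))             # the original message
damaged = bytes([block[0] ^ 0x01]) + block[1:]  # flip one ciphertext bit
print(receive_internal(key, damaged))           # None: the FCS no longer matches
```

The sketch shows only the ordering of F and E. Because the placeholder cipher is malleable and a CRC is linear, this particular pairing would not withstand deliberate tampering by an informed opponent.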
Figure 12.2 Internal and External Error Control (source A to destination B)
(a) Internal error control: A transmits E(K, [M || F(M)]); B decrypts with K, recomputes
F(M), and compares it with the received FCS.
(b) External error control: A transmits E(K, M) || F(E(K, M)); B recomputes F over the
received ciphertext, compares, and decrypts with K.
An error-control code is just one example; in fact, any sort of structuring
added to the transmitted message serves to strengthen the authentication capability.
Such structure is provided by the use of a communications architecture consisting
of layered protocols. As an example, consider the structure of messages transmit-
ted using the TCP/IP protocol architecture. Figure 12.3 shows the format of a TCP
segment, illustrating the TCP header. Now suppose that each pair of hosts shared
a unique secret key, so that all exchanges between a pair of hosts used the same
key, regardless of application. Then we could simply encrypt all of the datagram ex-
cept the IP header. Again, if an opponent substituted some arbitrary bit pattern for
the encrypted TCP segment, the resulting plaintext would not include a meaning-
ful header. In this case, the header includes not only a checksum (which covers the
header) but also other useful information, such as the sequence number. Because
successive TCP segments on a given connection are numbered sequentially, encryp-
tion assures that an opponent does not delay, misorder, or delete any segments.
PUBLIC-KEY ENCRYPTION The straightforward use of public-key encryption
(Figure 12.1b) provides confidentiality but not authentication. The source (A) uses
the public key PUb of the destination (B) to encrypt M. Because only B has the cor-
responding private key PRb, only B can decrypt the message. This scheme provides
no authentication, because any opponent could also use B’s public key to encrypt a
message and claim to be A.
To provide authentication, A uses its private key to encrypt the message, and
B uses A’s public key to decrypt (Figure 12.1c). This provides authentication using
the same type of reasoning as in the symmetric encryption case: The message must
have come from A because A is the only party that possesses PRa and therefore
the only party with the information necessary to construct ciphertext that can be
decrypted with PUa. Again, the same reasoning as before applies: There must be
some internal structure to the plaintext so that the receiver can distinguish between
well-formed plaintext and random bits.
Figure 12.3 TCP Segment (20-octet header, 32 bits per row): source port and destination
port; sequence number; acknowledgment number; data offset, reserved, flags, and window;
checksum and urgent pointer; options + padding; followed by application data.
Assuming there is such structure, then the scheme of Figure 12.1c does provide
authentication. It also provides what is known as digital signature (see footnote 1). Only A
could have constructed the ciphertext because only A possesses PRa. Not even B,
the recipient, could have constructed the ciphertext. Therefore, if B is in possession
of the ciphertext, B has the means to prove that the message must have come from
A. In effect, A has “signed” the message by using its private key to encrypt. Note
that this scheme does not provide confidentiality. Anyone in possession of A’s pub-
lic key can decrypt the ciphertext.
To provide both confidentiality and authentication, A can encrypt M first
using its private key, which provides the digital signature, and then using B’s pub-
lic key, which provides confidentiality (Figure 12.1d). The disadvantage of this ap-
proach is that the public-key algorithm, which is complex, must be exercised four
times rather than two in each communication.
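The four applications of the public-key algorithm can be counted in a small sketch. It uses textbook RSA with tiny primes and no padding, purely as an assumed stand-in for the generic public-key algorithm; the helper names `make_keypair` and `rsa` are hypothetical, and B's modulus is deliberately chosen larger than A's so that the signed value fits inside the outer encryption.

```python
# Textbook RSA with tiny primes, for illustration only (no padding, not secure).
def make_keypair(p: int, q: int, e: int):
    """Return (public key, private key) as (exponent, modulus) pairs."""
    n, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # modular inverse (Python 3.8+)
    return (e, n), (d, n)

def rsa(key, value: int) -> int:
    """One application of the public-key algorithm: value^exp mod n."""
    exp, n = key
    return pow(value, exp, n)

PUa, PRa = make_keypair(61, 53, 17)   # A's modulus is 3233
PUb, PRb = make_keypair(89, 97, 5)    # B's modulus is 8633, larger than A's

M = 65
signed = rsa(PRa, M)             # 1st operation: E(PRa, M), the signature
sent = rsa(PUb, signed)          # 2nd operation: E(PUb, E(PRa, M))
unwrapped = rsa(PRb, sent)       # 3rd operation: B removes the confidentiality layer
recovered = rsa(PUa, unwrapped)  # 4th operation: B applies A's public key
print(recovered == M)            # True
```

Two operations occur at the source and two at the destination, compared with one on each side for the schemes of Figure 12.1b or 12.1c.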
Message Authentication Code
An alternative authentication technique involves the use of a secret key to generate
a small fixed-size block of data, known as a cryptographic checksum or MAC, that is
appended to the message. This technique assumes that two communicating parties,
say A and B, share a common secret key K. When A has a message to send to B, it
calculates the MAC as a function of the message and the key:
MAC = C(K, M)
where
M = input message
C = MAC function
K = shared secret key
MAC = message authentication code
The message plus MAC are transmitted to the intended recipient. The recipient
performs the same calculation on the received message, using the same secret key,
to generate a new MAC. The received MAC is compared to the calculated MAC
(Figure 12.4a). If we assume that only the receiver and the sender know the identity
of the secret key, and if the received MAC matches the calculated MAC, then
1. The receiver is assured that the message has not been altered. If an attacker
alters the message but does not alter the MAC, then the receiver’s calculation of
the MAC will differ from the received MAC. Because the attacker is assumed
not to know the secret key, the attacker cannot alter the MAC to correspond
to the alterations in the message.
2. The receiver is assured that the message is from the alleged sender. Because
no one else knows the secret key, no one else could prepare a message with a
proper MAC.
3. If the message includes a sequence number (such as is used with HDLC, X.25,
and TCP), then the receiver can be assured of the proper sequence because an
attacker cannot successfully alter the sequence number.
Footnote 1: This is not the way in which digital signatures are constructed, as we shall
see, but the principle is the same.
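The exchange described above can be sketched with Python's standard `hmac` module. HMAC-SHA256 is used here only as an assumed stand-in for the generic MAC function C; the key, the message format, and the helper names `mac` and `verify` are made up for the example.

```python
import hashlib
import hmac

def mac(key: bytes, message: bytes) -> bytes:
    """C(K, M): HMAC-SHA256 stands in for the generic MAC function."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Recompute the MAC and compare it with the received one in constant time."""
    return hmac.compare_digest(mac(key, message), received_tag)

K = b"key shared by A and B"
M = b"seq=7;pay 100 to Bob"
tag = mac(K, M)                                  # A transmits M || tag

print(verify(K, M, tag))                         # True: message unaltered
print(verify(K, b"seq=7;pay 900 to Bob", tag))   # False: message altered
print(verify(K, b"seq=8;pay 100 to Bob", tag))   # False: sequence number altered
print(verify(b"some other key", M, tag))         # False: sender lacks the key
```

The three rejected cases correspond to the three assurances listed above: integrity of the content, origin from a holder of K, and integrity of the sequence number.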
A MAC function is similar to encryption. One difference is that the MAC
algorithm need not be reversible, as it must be for decryption. In general, the MAC
function is a many-to-one function. The domain of the function consists of messages
of some arbitrary length, whereas the range consists of all possible MACs and all
possible keys. If an n-bit MAC is used, then there are $2^n$ possible MACs, whereas
there are N possible messages with $N \gg 2^n$. Furthermore, with a k-bit key, there
are $2^k$ possible keys.
For example, suppose that we are using 100-bit messages and a 10-bit MAC.
Then, there are a total of $2^{100}$ different messages but only $2^{10}$ different MACs. So,
on average, each MAC value is generated by a total of $2^{100}/2^{10} = 2^{90}$ different
messages. If a 5-bit key is used, then there are $2^5 = 32$ different mappings from the set
of messages to the set of MAC values.
It turns out that, because of the mathematical properties of the authentication
function, it is less vulnerable to being broken than encryption.
The process depicted in Figure 12.4a provides authentication but not confidentiality,
because the message as a whole is transmitted in the clear. Confidentiality
can be provided by performing message encryption either after (Figure 12.4b) or
before (Figure 12.4c) the MAC algorithm. In both these cases, two separate keys are
needed, each of which is shared by the sender and the receiver. In the first case, the
MAC is calculated with the message as input and is then concatenated to the message.
The entire block is then encrypted. In the second case, the message is encrypted
first. Then the MAC is calculated using the resulting ciphertext and is concatenated
to the ciphertext to form the transmitted block. Typically, it is preferable to tie the
authentication directly to the plaintext, so the method of Figure 12.4b is used.
Figure 12.4 Basic Uses of Message Authentication Code (MAC) (source A to destination B)
(a) Message authentication: A transmits M || C(K, M).
(b) Message authentication and confidentiality; authentication tied to plaintext:
A transmits E(K2, [M || C(K1, M)]).
(c) Message authentication and confidentiality; authentication tied to ciphertext:
A transmits E(K2, M) || C(K1, E(K2, M)).
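The two orderings of Figures 12.4b and 12.4c can be compared side by side in a short sketch. Everything here is an assumption for illustration: HMAC-SHA256 plays the role of C, a toy XOR keystream plays the role of E and D (it is not a secure cipher), and the function names are hypothetical.

```python
import hashlib
import hmac

TAG_LEN = 32  # bytes in an HMAC-SHA256 tag

def C(k1: bytes, data: bytes) -> bytes:
    """MAC function; HMAC-SHA256 is an assumed stand-in."""
    return hmac.new(k1, data, hashlib.sha256).digest()

def E(k2: bytes, data: bytes) -> bytes:
    """Toy XOR keystream cipher (its own inverse). Placeholder only, NOT secure."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(k2 + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

D = E  # XOR with the same keystream decrypts

def send_b(k1, k2, m):
    """Figure 12.4b: E(K2, [M || C(K1, M)])."""
    return E(k2, m + C(k1, m))

def recv_b(k1, k2, block):
    """Decrypt first, then check the MAC over the recovered plaintext."""
    plain = D(k2, block)
    m, tag = plain[:-TAG_LEN], plain[-TAG_LEN:]
    return m if hmac.compare_digest(C(k1, m), tag) else None

def send_c(k1, k2, m):
    """Figure 12.4c: E(K2, M) || C(K1, E(K2, M))."""
    c = E(k2, m)
    return c + C(k1, c)

def recv_c(k1, k2, block):
    """Check the MAC over the ciphertext first, then decrypt."""
    c, tag = block[:-TAG_LEN], block[-TAG_LEN:]
    return D(k2, c) if hmac.compare_digest(C(k1, c), tag) else None

K1, K2 = b"authentication key", b"encryption key"
msg = b"meet at noon"
print(recv_b(K1, K2, send_b(K1, K2, msg)))
print(recv_c(K1, K2, send_c(K1, K2, msg)))
```

In the second ordering the receiver can reject a damaged block before spending any effort on decryption, whereas in the first the check is made against the plaintext itself.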
Because symmetric encryption will provide authentication and because it is
widely used with readily available products, why not simply use this instead of a
separate message authentication code? [DAVI89] suggests three situations in which
a message authentication code is used.
1. There are a number of applications in which the same message is broadcast to
a number of destinations. Examples are notification to users that the network
is now unavailable or an alarm signal in a military control center. It is cheaper
and more reliable to have only one destination responsible for monitoring au-
thenticity. Thus, the message must be broadcast in plaintext with an associated
message authentication code. The responsible system has the secret key and
performs authentication. If a violation occurs, the other destination systems
are alerted by a general alarm.
2. Another possible scenario is an exchange in which one side has a heavy load
and cannot afford the time to decrypt all incoming messages. Authentication is
carried out on a selective basis, messages being chosen at random for checking.
3. Authentication of a computer program in plaintext is an attractive service. The
computer program can be executed without having to decrypt it every time,
which would be wasteful of processor resources. However, if a message au-
thentication code were attached to the program, it could be checked whenever
assurance was required of the integrity of the program.
Three other rationales may be added.
4. For some applications, it may not be of concern to keep messages secret, but
it is important to authenticate messages. An example is the Simple Network
Management Protocol Version 3 (SNMPv3), which separates the functions of
confidentiality and authentication. For this application, it is usually important
for a managed system to authenticate incoming SNMP messages, particularly
if the message contains a command to change parameters at the managed sys-
tem. On the other hand, it may not be necessary to conceal the SNMP traffic.
5. Separation of authentication and confidentiality functions affords architec-
tural flexibility. For example, it may be desired to perform authentication at
the application level but to provide confidentiality at a lower level, such as the
transport layer.
6. A user may wish to prolong the period of protection beyond the time of recep-
tion and yet allow processing of message contents. With message encryption, the
protection is lost when the message is decrypted, so the message is protected
against fraudulent modifications only in transit but not within the target system.
Finally, note that the MAC does not provide a digital signature, because both
sender and receiver share the same key.
12.3 REQUIREMENTS FOR MESSAGE AUTHENTICATION CODES
A MAC, also known as a cryptographic checksum, is generated by a function C of
the form
T = MAC(K, M)
where M is a variable-length message, K is a secret key shared only by sender and re-
ceiver, and MAC(K, M) is the fixed-length authenticator, sometimes called a tag. The
tag is appended to the message at the source at a time when the message is assumed or
known to be correct. The receiver authenticates that message by recomputing the tag.
When an entire message is encrypted for confidentiality, using either symmetric
or asymmetric encryption, the security of the scheme generally depends on the
bit length of the key. Barring some weakness in the algorithm, the opponent must
resort to a brute-force attack using all possible keys. On average, such an attack will
require $2^{k-1}$ attempts for a k-bit key. In particular, for a ciphertext-only attack, the
opponent, given ciphertext C, performs $P_i = D(K_i, C)$ for all possible key values $K_i$
until a $P_i$ is produced that matches the form of acceptable plaintext.
In the case of a MAC, the considerations are entirely different. In general,
the MAC function is a many-to-one function: many different (key, message) inputs
yield the same tag. Using brute-force methods, how would an opponent attempt to
discover a key? If confidentiality is not employed, the opponent has access to
plaintext messages and their associated MACs. Suppose $k > n$; that is, suppose that
the key size is greater than the MAC size. Then, given a known $M_1$ and $T_1$, with
$T_1 = \mathrm{MAC}(K, M_1)$, the cryptanalyst can perform $T_i = \mathrm{MAC}(K_i, M_1)$ for all
possible key values $K_i$. At least one key is guaranteed to produce a match of $T_i = T_1$.
Note that a total of $2^k$ tags will be produced, but there are only $2^n < 2^k$ different tag
values. Thus, a number of keys will produce the correct tag and the opponent has no
way of knowing which is the correct key. On average, a total of $2^k/2^n = 2^{k-n}$ keys
will produce a match. Thus, the opponent must iterate the attack.
■ Round 1
Given: $M_1$, $T_1 = \mathrm{MAC}(K, M_1)$
Compute $T_i = \mathrm{MAC}(K_i, M_1)$ for all $2^k$ keys
Number of matches $\approx 2^{k-n}$
■ Round 2
Given: $M_2$, $T_2 = \mathrm{MAC}(K, M_2)$
Compute $T_i = \mathrm{MAC}(K_i, M_2)$ for the $2^{k-n}$ keys resulting from Round 1
Number of matches $\approx 2^{k-2n}$
And so on. On average, $\alpha$ rounds will be needed if $k = \alpha \times n$. For example, if an
80-bit key is used and the tag is 32 bits, then the first round will produce about $2^{48}$
possible keys. The second round will narrow the possible keys to about $2^{16}$
possibilities. The third round should produce only a single key, which must be the one used
by the sender.
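The narrowing of the candidate set from round to round can be observed at toy scale. The sketch below assumes a 16-bit key and an 8-bit tag, with HMAC-SHA256 truncated to one byte as a hypothetical MAC; none of these parameters come from the text, and they were chosen only so the exhaustive search finishes quickly.

```python
import hashlib
import hmac

K_BITS, TAG_BYTES = 16, 1          # toy sizes: k = 16, n = 8

def toy_mac(key: int, message: bytes) -> bytes:
    """HMAC-SHA256 truncated to n bits, keyed with a k-bit integer (assumed toy MAC)."""
    key_bytes = key.to_bytes(K_BITS // 8, "big")
    return hmac.new(key_bytes, message, hashlib.sha256).digest()[:TAG_BYTES]

secret_key = 0xBEEF                # unknown to the attacker
pairs = [(m, toy_mac(secret_key, m)) for m in (b"message 1", b"message 2", b"message 3")]

candidates = range(2 ** K_BITS)    # Round 1 starts from all 2^k keys
survivors_per_round = []
for message, tag in pairs:
    # Keep only the keys that reproduce the observed tag for this message.
    candidates = [k for k in candidates if toy_mac(k, message) == tag]
    survivors_per_round.append(len(candidates))

print(survivors_per_round)         # on average about 2^(k-n) = 256 after round 1
print(secret_key in candidates)    # the true key always survives
```

With k = 2n, the expected number of surviving keys is about 256 after the first round and about one after the second, in line with the estimate that roughly k/n rounds are needed. The exact counts depend on the MAC and on the messages used.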
If the key length is less than or equal to the tag length, then it is likely that a
first round will produce a single match. It is possible that more than one key will
produce such a match, in which case the opponent would need to perform the same
test on a new (message, tag) pair.
Thus, a brute-force attempt to discover the authentication key is no less ef-
fort and may be more effort than that required to discover a decryption key of the
same length. However, other attacks that do not require the discovery of the key
are possible.
Consider the following MAC algorithm. Let $M = (X_1 \| X_2 \| \cdots \| X_m)$ be a
message that is treated as a concatenation of 64-bit blocks $X_i$. Then define
$\Delta(M) = X_1 \oplus X_2 \oplus \cdots \oplus X_m$
$\mathrm{MAC}(K, M) = E(K, \Delta(M))$
where ⊕ is the exclusive-OR (XOR) operation and the encryption algorithm
is DES in electronic codebook mode. Thus, the key length is 56 bits, and the tag
length is 64 bits. If an opponent observes $\{M \| \mathrm{MAC}(K, M)\}$, a brute-force attempt
to determine K will require at least $2^{56}$ encryptions. But the opponent can attack the
system by replacing $X_1$ through $X_{m-1}$ with any desired values $Y_1$ through $Y_{m-1}$ and
replacing $X_m$ with $Y_m$, where $Y_m$ is calculated as
$Y_m = Y_1 \oplus Y_2 \oplus \cdots \oplus Y_{m-1} \oplus \Delta(M)$
The opponent can now concatenate the new message, which consists of $Y_1$
through $Y_m$, using the original tag to form a message that will be accepted as
authentic by the receiver. With this tactic, any message of length $64 \times (m - 1)$ bits can be
fraudulently inserted.
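The forgery can be reproduced in a few lines. Python's standard library has no DES, so in this sketch a truncated HMAC stands in for E(K, ·); that substitution is an assumption and is harmless here, because the attack exploits only the XOR structure of Δ(M) and never touches the cipher. The block contents and function names are invented for the example.

```python
import hashlib
import hmac
from functools import reduce

BLOCK = 8  # 64-bit blocks

def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def delta(blocks) -> bytes:
    """Delta(M): XOR of all 64-bit blocks of the message."""
    return reduce(xor_blocks, blocks, bytes(BLOCK))

def weak_mac(key: bytes, blocks) -> bytes:
    """MAC(K, M) = E(K, Delta(M)); a truncated HMAC stands in for DES-ECB (assumption)."""
    return hmac.new(key, delta(blocks), hashlib.sha256).digest()[:BLOCK]

key = b"unknown to the attacker"
original = [b"PAY BOB ", b"00000100", b" DOLLARS"]
tag = weak_mac(key, original)            # the attacker observes M || MAC(K, M)

# The attacker picks Y_1 .. Y_{m-1} freely, then solves for Y_m without the key.
chosen = [b"PAY EVE ", b"99999999"]
y_m = xor_blocks(delta(chosen), delta(original))
forged = chosen + [y_m]

# The receiver, who holds the key, recomputes the MAC and finds that it matches.
print(forged != original, weak_mac(key, forged) == tag)   # True True
```

The final block of the forged message is forced and will usually look like noise, which is why only the first m − 1 blocks are under the attacker's control.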
Thus, in assessing the security of a MAC function, we need to consider the
types of attacks that may be mounted against it. With that in mind, let us state the
requirements for the function. Assume that an opponent knows the MAC func-
tion but does not know K. Then the MAC function should satisfy the following
requirements.
1. If an opponent observes M and MAC(K, M), it should be computationally
infeasible for the opponent to construct a message M′ such that
MAC(K, M′) = MAC(K, M)
2. MAC(K, M) should be uniformly distributed in the sense that for randomly
chosen messages, M and M′, the probability that MAC(K, M) = MAC(K, M′)
is $2^{-n}$, where n is the number of bits in the tag.
3. Let M′ be equal to some known transformation on M. That is, M′ = f(M). For
example, f may involve inverting one or more specific bits. In that case,
$\Pr[\mathrm{MAC}(K, M) = \mathrm{MAC}(K, M')] = 2^{-n}$
The first requirement speaks to the earlier example, in which an opponent is
able to construct a new message to match a given tag, even though the opponent
does not know and does not learn the key. The second requirement deals with the
need to thwart a brute-force attack based on chosen plaintext. That is, if we assume
that the opponent does not know K but does have access to the MAC function and
can present messages for MAC generation, then the opponent could try various
messages until finding one that matches a given tag. If the MAC function exhibits
uniform distribution, then a brute-force method would require, on average, $2^{n-1}$
attempts before finding a message that fits a given tag.
The final requirement dictates that the authentication algorithm should not be
weaker with respect to certain parts or bits of the message than others. If this were
not the case, then an opponent who had M and MAC(K, M) could attempt varia-
tions on M at the known “weak spots” with a likelihood of early success at produc-
ing a new message that matched the old tags.
12.4 SECURITY OF MACs
Just as with encryption algorithms and hash functions, we can group attacks on
MACs into two categories: brute-force attacks and cryptanalysis.
Brute-Force Attacks
A brute-force attack on a MAC is a more difficult undertaking than a brute-force
attack on a hash function because it requires known message-tag pairs. Let us see
why this is so. To attack a hash code, we can proceed in the following way. Given
a fixed message x with n-bit hash code h = H(x), a brute-force method of finding
a collision is to pick a random bit string y and check if H(y) = H(x). The attacker
can do this repeatedly off line. Whether an off-line attack can be used on a MAC
algorithm depends on the relative size of the key and the tag.
To proceed, we need to state the desired security property of a MAC algo-
rithm, which can be expressed as follows.
■ Computation resistance: Given one or more text-MAC pairs [xi, MAC(K, xi)],
it is computationally infeasible to compute any text-MAC pair [x, MAC(K, x)]
for any new input x ≠ xi.
In other words, the attacker would like to come up with the valid MAC code for a
given message x. There are two lines of attack possible: attack the key space and at-
tack the MAC value. We examine each of these in turn.
If an attacker can determine the MAC key, then it is possible to generate a
valid MAC value for any input x. Suppose the key size is k bits and that the attacker
has one known text-tag pair. Then the attacker can compute the n-bit tag on the
known text for all possible keys. At least one key is guaranteed to produce the
correct tag, namely, the valid key that was initially used to produce the known text-tag
pair. This phase of the attack takes a level of effort proportional to $2^k$ (that is, one
operation for each of the $2^k$ possible key values). However, as was described earlier,
because the MAC is a many-to-one mapping, there may be other keys that produce
the correct value. Thus, if more than one key is found to produce the correct value,
additional text-tag pairs must be tested. It can be shown that the level of effort
drops off rapidly with each additional text-MAC pair and that the overall level of
effort is roughly $2^k$ [MENE97].
An attacker can also work on the tag without attempting to recover the key.
Here, the objective is to generate a valid tag for a given message or to find a message
that matches a given tag. In either case, the level of effort is comparable to that for
attacking the one-way or weak collision-resistant property of a hash code, or $2^n$.
In the case of the MAC, the attack cannot be conducted off line without further
input; the attacker will require chosen text-tag pairs or knowledge of the key.
To summarize, the level of effort for brute-force attack on a MAC algorithm
can be expressed as $\min(2^k, 2^n)$. The assessment of strength is similar to that for
symmetric encryption algorithms. It would appear reasonable to require that the
key length and tag length satisfy a relationship such as $\min(k, n) \ge N$, where N is
perhaps in the range of 128 bits.
Cryptanalysis
As with encryption algorithms and hash functions, cryptanalytic attacks on MAC
algorithms seek to exploit some property of the algorithm to perform some attack
other than an exhaustive search. The way to measure the resistance of a MAC algo-
rithm to cryptanalysis is to compare its strength to the effort required for a brute-
force attack. That is, an ideal MAC algorithm will require a cryptanalytic effort
greater than or equal to the brute-force effort.
There is much more variety in the structure of MACs than in hash functions,
so it is difficult to generalize about the cryptanalysis of MACs. Furthermore, far less
work has been done on developing such attacks. A useful survey of some methods
for specific MACs is [PREN96].
12.5 MACs BASED ON HASH FUNCTIONS: HMAC
Later in this chapter, we look at examples of a MAC based on the use of a symmetric
block cipher. This has traditionally been the most common approach to constructing
a MAC. In recent years, there has been increased interest in developing a MAC de-
rived from a cryptographic hash function. The motivations for this interest are
1. Cryptographic hash functions such as MD5 and SHA generally execute faster
in software than symmetric block ciphers such as DES.
2. Library code for cryptographic hash functions is widely available.
With the development of AES and the more widespread availability of code
for encryption algorithms, these considerations are less significant, but hash-based
MACs continue to be widely used.
A hash function such as SHA was not designed for use as a MAC and can-
not be used directly for that purpose, because it does not rely on a secret key.
There have been a number of proposals for the incorporation of a secret key into
an existing hash algorithm. The approach that has received the most support is
HMAC [BELL96a, BELL96b]. HMAC has been issued as RFC 2104, has been
chosen as the mandatory-to-implement MAC for IP security, and is used in other
Internet protocols, such as SSL. HMAC has also been issued as a NIST standard
(FIPS 198).
HMAC Design Objectives
RFC 2104 lists the following design objectives for HMAC.
■ To use, without modifications, available hash functions. In particular, to use
hash functions that perform well in software and for which code is freely and
widely available.
■ To allow for easy replaceability of the embedded hash function in case faster
or more secure hash functions are found or required.
■ To preserve the original performance of the hash function without incurring a
significant degradation.
■ To use and handle keys in a simple way.
■ To have a well understood cryptographic analysis of the strength of the au-
thentication mechanism based on reasonable assumptions about the embed-
ded hash function.
The first two objectives are important to the acceptability of HMAC. HMAC
treats the hash function as a “black box.” This has two benefits. First, an existing im-
plementation of a hash function can be used as a module in implementing HMAC.
In this way, the bulk of the HMAC code is prepackaged and ready to use without
modification. Second, if it is ever desired to replace a given hash function in an
HMAC implementation, all that is required is to remove the existing hash function
module and drop in the new module. This could be done if a faster hash function
were desired. More important, if the security of the embedded hash function were
compromised, the security of HMAC could be retained simply by replacing the em-
bedded hash function with a more secure one (e.g., replacing SHA-2 with SHA-3).
The last design objective in the preceding list is, in fact, the main advantage
of HMAC over other proposed hash-based schemes. HMAC can be proven secure
provided that the embedded hash function has some reasonable cryptographic
strengths. We return to this point later in this section, but first we examine the struc-
ture of HMAC.
HMAC Algorithm
Figure 12.5 illustrates the overall operation of HMAC. Define the following terms.
H = embedded hash function (e.g., MD5, SHA-1, RIPEMD-160)
IV = initial value input to hash function
M = message input to HMAC (including the padding specified in the embedded
hash function)
$Y_i$ = $i$th block of M, $0 \le i \le (L - 1)$
L = number of blocks in M
b = number of bits in a block
n = length of hash code produced by embedded hash function
K = secret key; recommended length is $\ge n$; if key length is greater than b, the
key is input to the hash function to produce an n-bit key
$K^+$ = K padded with zeros on the right (that is, zeros appended to the end of K, as
RFC 2104 specifies) so that the result is b bits in length
ipad = 00110110 (36 in hexadecimal) repeated b/8 times
opad = 01011100 (5C in hexadecimal) repeated b/8 times
Then HMAC can be expressed as
$\mathrm{HMAC}(K, M) = H[(K^+ \oplus \mathrm{opad}) \| H[(K^+ \oplus \mathrm{ipad}) \| M]]$
We can describe the algorithm as follows.
1. Append zeros to the end of K to create a b-bit string $K^+$ (e.g., if K is of
length 160 bits and b = 512, then 44 zero bytes are appended to K).
2. XOR (bitwise exclusive-OR) $K^+$ with ipad to produce the b-bit block $S_i$.
3. Append M to $S_i$.
4. Apply H to the stream generated in step 3.
5. XOR $K^+$ with opad to produce the b-bit block $S_o$.
6. Append the hash result from step 4 to $S_o$.
7. Apply H to the stream generated in step 6 and output the result.
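The seven steps translate almost line for line into Python. The sketch below assumes SHA-256 as the embedded hash H (so b = 512 and n = 256) and pads the key with zero bytes at its end, following RFC 2104; the function name `hmac_sha256` is chosen for the example. The result is checked against the standard library's `hmac` module.

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC(K, M) = H[(K+ xor opad) || H[(K+ xor ipad) || M]] with H = SHA-256."""
    block_bytes = hashlib.sha256().block_size        # b / 8 = 64 bytes
    if len(key) > block_bytes:                       # overlong key: hash it first
        key = hashlib.sha256(key).digest()
    k_plus = key.ljust(block_bytes, b"\x00")         # step 1: zero-pad K to b bits
    s_i = bytes(x ^ 0x36 for x in k_plus)            # step 2: K+ xor ipad
    inner = hashlib.sha256(s_i + message).digest()   # steps 3 and 4
    s_o = bytes(x ^ 0x5C for x in k_plus)            # step 5: K+ xor opad
    return hashlib.sha256(s_o + inner).digest()      # steps 6 and 7

key, msg = b"K" * 20, b"what do ya want for nothing?"
print(hmac_sha256(key, msg) == hmac.new(key, msg, hashlib.sha256).digest())  # True
```

Swapping in a different embedded hash only requires replacing `hashlib.sha256`, which is the "black box" property noted in the design objectives. Production code should call the library routine rather than a hand-written version like this one.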
Note that the XOR with ipad results in flipping one-half of the bits of K.
Similarly, the XOR with opad results in flipping one-half of the bits of K, using a
different set of bits. In effect, by passing $S_i$ and $S_o$ through the compression function
of the hash algorithm, we have pseudorandomly generated two keys from K.
Figure 12.5 HMAC Structure: $K^+$ is XORed with ipad to form $S_i$, which is prepended to
the message blocks $Y_0, Y_1, \ldots, Y_{L-1}$ (b bits each) and hashed from IV to give the n-bit
value $H(S_i \| M)$; that value is padded to b bits, prepended with $S_o = K^+ \oplus \mathrm{opad}$, and
hashed from IV again to give the n-bit HMAC(K, M).
HMAC should execute in approximately the same time as the embedded hash
function for long messages. HMAC adds three executions of the hash compression
function (for $S_i$, $S_o$, and the block produced from the inner hash).
A more efficient implementation is possible, as shown in Figure 12.6. Two
quantities are precomputed:
$f(IV, (K^+ \oplus \mathrm{ipad}))$
$f(IV, (K^+ \oplus \mathrm{opad}))$
where f(cv, block) is the compression function for the hash function, which takes as
arguments a chaining variable of n bits and a block of b bits and produces a
chaining variable of n bits. These quantities only need to be computed initially and every
time the key changes. In effect, the precomputed quantities substitute for the initial
value (IV) in the hash function. With this implementation, only one additional
instance of the compression function is added to the processing normally produced
by the hash function. This more efficient implementation is especially worthwhile if
most of the messages for which a MAC is computed are short.
Figure 12.6 Efficient Implementation of HMAC: the two n-bit values $f(IV, K^+ \oplus \mathrm{ipad})$
and $f(IV, K^+ \oplus \mathrm{opad})$ are precomputed; per message, the first replaces IV when hashing
$Y_0, Y_1, \ldots, Y_{L-1}$ to give $H(S_i \| M)$, which is padded to b bits and passed through f
with the second value to give HMAC(K, M).
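The precomputation can be approximated in Python by saving hash objects that have already absorbed $K^+ \oplus \mathrm{ipad}$ and $K^+ \oplus \mathrm{opad}$ and copying them for each message. This is a sketch under the assumption that H is SHA-256; the class name `PrecomputedHmac` is hypothetical, and `hashlib` exposes only the saved hash state, not the compression function f itself.

```python
import hashlib
import hmac

class PrecomputedHmac:
    """Caches the hash state after absorbing K+ xor ipad and K+ xor opad."""

    def __init__(self, key: bytes):
        block_bytes = hashlib.sha256().block_size
        if len(key) > block_bytes:
            key = hashlib.sha256(key).digest()
        k_plus = key.ljust(block_bytes, b"\x00")
        # Each object has consumed exactly one b-bit block, conceptually one call to f.
        self._inner = hashlib.sha256(bytes(x ^ 0x36 for x in k_plus))
        self._outer = hashlib.sha256(bytes(x ^ 0x5C for x in k_plus))

    def tag(self, message: bytes) -> bytes:
        inner = self._inner.copy()      # resume from the state after K+ xor ipad
        inner.update(message)
        outer = self._outer.copy()      # resume from the state after K+ xor opad
        outer.update(inner.digest())
        return outer.digest()

mac = PrecomputedHmac(b"long-lived key")
for m in (b"short", b"messages", b"benefit most"):
    print(mac.tag(m) == hmac.new(b"long-lived key", m, hashlib.sha256).digest())
```

The key-dependent work is done once in the constructor, so each call to `tag` costs only the hashing of the message plus the one extra block for the outer hash.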
Security of HMAC
The security of any MAC function based on an embedded hash function depends
in some way on the cryptographic strength of the underlying hash function. The
appeal of HMAC is that its designers have been able to prove an exact relation-
ship between the strength of the embedded hash function and the strength of
HMAC.
The security of a MAC function is generally expressed in terms of the prob-
ability of successful forgery with a given amount of time spent by the forger and
a given number of message-tag pairs created with the same key. In essence, it is
proved in [BELL96a] that for a given level of effort (time, message–tag pairs) on
messages generated by a legitimate user and seen by the attacker, the probability
of successful attack on HMAC is equivalent to one of the following attacks on the
embedded hash function.
1. The attacker is able to compute an output of the compression function even
with an IV that is random, secret, and unknown to the attacker.
2. The attacker finds collisions in the hash function even when the IV is random
and secret.
In the first attack, we can view the compression function as equivalent to the
hash function applied to a message consisting of a single b-bit block. For this attack,
the IV of the hash function is replaced by a secret, random value of n bits. An attack
on this hash function requires either a brute-force attack on the key, which is a level
of effort on the order of $2^{n}$, or a birthday attack, which is a special case of the second
attack, discussed next.
In the second attack, the attacker is looking for two messages M and M′ that
produce the same hash: H(M) = H(M′). This is the birthday attack discussed in
Chapter 11. We have shown that this requires a level of effort of $2^{n/2}$ for a hash
length of n. On this basis, the security of MD5 is called into question, because a
level of effort of $2^{64}$ looks feasible with today’s technology. Does this mean that
a 128-bit hash function such as MD5 is unsuitable for HMAC? The answer is no,
because of the following argument. To attack MD5, the attacker can choose any
set of messages and work on these off line on a dedicated computing facility to
find a collision. Because the attacker knows the hash algorithm and the default IV,
the attacker can generate the hash code for each of the messages that the attacker
generates. However, when attacking HMAC, the attacker cannot generate mes-
sage/code pairs off line because the attacker does not know K. Therefore, the at-
tacker must observe a sequence of messages generated by HMAC under the same
key and perform the attack on these known messages. For a hash code length of
128 bits, this requires $2^{64}$ observed blocks ($2^{72}$ bits) generated using the same key.
On a 1-Gbps link, one would need to observe a continuous stream of messages
with no change in key for about 150,000 years in order to succeed. Thus, if speed
is a concern, it is fully acceptable to use MD5 rather than SHA-1 as the embedded
hash function for HMAC.
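The observation time quoted above can be checked with a few lines of arithmetic. The sketch assumes a 365-day year and a link that carries exactly 10^9 bits per second with no overhead; the variable names are illustrative.

```python
BITS_TO_OBSERVE = 2 ** 72           # 2^64 observed blocks, as stated in the text
LINK_BITS_PER_SECOND = 10 ** 9      # 1-Gbps link
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

seconds = BITS_TO_OBSERVE / LINK_BITS_PER_SECOND
years = seconds / SECONDS_PER_YEAR
print(f"about {years:,.0f} years of continuous observation")
```

The result is on the order of 150,000 years, which agrees with the figure in the text.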
12.6 MACs BASED ON BLOCK CIPHERS: DAA AND CMAC
In this section, we look at two MACs that are based on the use of a block cipher
mode of operation. We begin with an older algorithm, the Data Authentication
Algorithm (DAA), which is now obsolete. Then we examine CMAC, which is de-
signed to overcome the deficiencies of DAA.
Data Authentication Algorithm
The Data Authentication Algorithm (DAA), based on DES, has been one of the
most widely used MACs for a number of years. The algorithm is both a FIPS pub-
lication (FIPS PUB 113) and an ANSI standard (X9.17). However, as we discuss
subsequently, security weaknesses in this algorithm have been discovered, and it is
being replaced by newer and stronger algorithms.
The algorithm can be defined as using the cipher block chaining (CBC) mode
of operation of DES (Figure 6.4) with an initialization vector of zero. The data (e.g.,
message, record, file, or program) to be authenticated are grouped into contiguous
64-bit blocks: $D_1, D_2, \ldots, D_N$. If necessary, the final block is padded on the right
with zeroes to form a full 64-bit block. Using the DES encryption algorithm E and a
secret key K, a data authentication code (DAC) is calculated as follows (Figure 12.7).
$$
\begin{aligned}
O_1 &= E(K, D_1) \\
O_2 &= E(K, [D_2 \oplus O_1]) \\
O_3 &= E(K, [D_3 \oplus O_2]) \\
&\;\;\vdots \\
O_N &= E(K, [D_N \oplus O_{N-1}])
\end{aligned}
$$
[Figure 12.7 Data Authentication Algorithm (FIPS PUB 113): at times 1 through N, each 64-bit block $D_i$ is XORed with the previous output $O_{i-1}$ and encrypted with DES under the 56-bit key K; the DAC (16 to 64 bits) is taken from $O_N$.]
The DAC consists of either the entire block $O_N$ or the leftmost M bits of the
block, with $16 \le M \le 64$.
Cipher-Based Message Authentication Code (CMAC)
As was mentioned, DAA has been widely adopted in government and industry.
[BELL00] demonstrated that this MAC is secure under a reasonable set of security
criteria, with the following restriction. Only messages of one fixed length of mn bits
are processed, where n is the cipher block size and m is a fixed positive integer. As
a simple example, notice that given the CBC MAC of a one-block message X, say
T = MAC(K, X), the adversary immediately knows the CBC MAC for the two-block
message $X \,\|\, (X \oplus T)$, since this is once again T.
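The forgery just described depends only on the chaining structure, not on the cipher, so it can be reproduced with a stand-in for E. In the sketch below, `toy_encrypt` is a hypothetical keyed function built from SHA-256 purely for illustration; it is not DES and not a real block cipher. The chaining follows the DAA definition (zero IV, 64-bit blocks, zero padding on the right).

```python
import hashlib

BLOCK = 8  # 64-bit blocks, as in DAA

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for E(K, block). NOT DES: any deterministic keyed function shows the flaw."""
    return hashlib.sha256(key + block).digest()[:BLOCK]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_mac(key: bytes, message: bytes) -> bytes:
    """O_1 = E(K, D_1), O_i = E(K, D_i xor O_{i-1}); returns O_N."""
    padded_len = max(1, -(-len(message) // BLOCK)) * BLOCK
    message = message.ljust(padded_len, b"\x00")  # zero padding on the right
    chain = bytes(BLOCK)                          # initialization vector of zero
    for i in range(0, len(message), BLOCK):
        chain = toy_encrypt(key, xor_bytes(message[i:i + BLOCK], chain))
    return chain

key = b"secret-k"
x = b"PAYMENT1"                      # one-block message X
t = cbc_mac(key, x)                  # T = MAC(K, X), seen by the adversary
forged = x + xor_bytes(x, t)         # X || (X xor T), built without knowing K
forged_tag_matches = cbc_mac(key, forged) == t
```

In the second step the chaining value T cancels against the T inside the second block, so the cipher is again applied to X and the output is again T.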
Black and Rogaway [BLAC00] demonstrated that this limitation could be
overcome using three keys: one key K of length k to be used at each step of the
cipher block chaining and two keys of length b, where b is the cipher block length.
This proposed construction was refined by Iwata and Kurosawa so that the two
b-bit keys could be derived from the encryption key, rather than being provided
separately [IWAT03]. This refinement, adopted by NIST, is the Cipher-based
Message Authentication Code (CMAC) mode of operation for use with AES and
triple DES. It is specified in NIST Special Publication 800-38B.
First, let us define the operation of CMAC when the message is an integer
multiple n of the cipher block length b. For AES, b = 128, and for triple DES,
b = 64. The message is divided into n blocks ($M_1, M_2, \ldots, M_n$). The algorithm
makes use of a k-bit encryption key K and a b-bit constant, $K_1$. For AES, the key
size k is 128, 192, or 256 bits; for triple DES, the key size is 112 or 168 bits. CMAC is
calculated as follows (Figure 12.8).
$$
\begin{aligned}
C_1 &= E(K, M_1) \\
C_2 &= E(K, [M_2 \oplus C_1]) \\
C_3 &= E(K, [M_3 \oplus C_2]) \\
&\;\;\vdots \\
C_n &= E(K, [M_n \oplus C_{n-1} \oplus K_1]) \\
T &= \mathrm{MSB}_{Tlen}(C_n)
\end{aligned}
$$
where
T = message authentication code, also referred to as the tag
Tlen = bit length of T
$\mathrm{MSB}_s(X)$ = the s leftmost bits of the bit string X
If the message is not an integer multiple of the cipher block length, then the
final block is padded to the right (least significant bits) with a 1 and as many 0s as
necessary so that the final block is also of length b. The CMAC operation then
proceeds as before, except that a different b-bit key $K_2$ is used instead of $K_1$.
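The choice between the two subkeys and the padding rule can be sketched as follows. The function name is hypothetical, messages are assumed to be whole bytes (so the padding bit pattern 10...0 begins with the byte 0x80), and an empty message is treated as an incomplete final block, which is how NIST SP 800-38B handles it.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def prepare_final_block(message: bytes, k1: bytes, k2: bytes) -> bytes:
    """Return the value XORed into the last CBC step: M_n xor K1, or padded M_n xor K2."""
    block_bytes = len(k1)                      # b / 8
    remainder = len(message) % block_bytes
    if message and remainder == 0:
        # Message is an integer multiple of the block length: use K1, no padding.
        return xor_bytes(message[-block_bytes:], k1)
    # Incomplete (or empty) final block: append a 1 bit, then 0s, and use K2.
    last = message[len(message) - remainder:]
    padded = last + b"\x80" + bytes(block_bytes - remainder - 1)
    return xor_bytes(padded, k2)

# Illustrative subkeys only; real subkeys are derived from the encryption key.
demo_k1 = bytes(16)
demo_k2 = bytes([0xFF]) * 16
full = prepare_final_block(b"A" * 16, demo_k1, demo_k2)
partial = prepare_final_block(b"AB", demo_k1, demo_k2)
```

Using a different subkey for padded messages keeps a padded message from colliding with an unpadded message that happens to end in the same bit pattern.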
The two b-bit keys are derived from the k-bit encryption key as follows.
$$
\begin{aligned}
L &= E(K, 0^{b}) \\
K_1 &= L \cdot x \\
K_2 &= L \cdot x^{2} = (L \cdot x) \cdot x
\end{aligned}
$$
where multiplication ($\cdot$) is done in the finite field $GF(2^{b})$ and $x$ and $x^{2}$ are first- and
second-order polynomials that are elements of $GF(2^{b})$. Thus, the binary
representation of $x$ consists of b - 2 zeros followed by 10; the binary representation of $x^{2}$
consists of b - 3 zeros followed by 100. The finite field is defined with respect to
an irreducible polynomial that is lexicographically first among all such polynomials
with the minimum possible number of nonzero terms. For the two approved block
sizes, the polynomials are $x^{64} + x^{4} + x^{3} + x + 1$ and $x^{128} + x^{7} + x^{2} + x + 1$.
To generate K1 and K2, the block cipher is applied to the block that consists
entirely of 0 bits. The first subkey is derived from the resulting ciphertext by a
left shift of one bit and, conditionally, by XORing a constant that depends on the
block size. The second subkey is derived in the same manner from the first subkey.
This property of finite fields of the form $GF(2^{b})$ was explained in the discussion of
MixColumns in Chapter 6.
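The shift-and-conditional-XOR rule can be written directly on integers. In this sketch the function names are hypothetical, and the constants 0x1B and 0x87 are simply the low-order terms of the two polynomials given above ($x^4 + x^3 + x + 1$ and $x^7 + x^2 + x + 1$). The block cipher call itself is left out: the value of L used in the example is the one I recall from the RFC 4493 test vectors for AES-128 and should be treated as an assumption.

```python
# Low-order bits of the reduction polynomials for b = 64 and b = 128.
R_CONSTANTS = {64: 0x1B, 128: 0x87}

def times_x(block: bytes) -> bytes:
    """Multiply a b-bit block by x in GF(2^b): shift left one bit, reduce if a 1 fell off."""
    b = len(block) * 8
    value = int.from_bytes(block, "big")
    shifted = (value << 1) & ((1 << b) - 1)
    if value >> (b - 1):                 # most significant bit was 1
        shifted ^= R_CONSTANTS[b]
    return shifted.to_bytes(len(block), "big")

def derive_subkeys(l_block: bytes):
    """K1 = L * x and K2 = K1 * x, where L = E(K, 0^b) is computed elsewhere."""
    k1 = times_x(l_block)
    k2 = times_x(k1)
    return k1, k2

# Assumed example value of L (AES-128 encryption of the all-zero block, per RFC 4493).
L_EXAMPLE = bytes.fromhex("7df76b0c1ab899b33e42f047b91b546f")
k1, k2 = derive_subkeys(L_EXAMPLE)
```

Here L begins with a 0 bit, so K1 is a plain shift, while K1 begins with a 1 bit, so deriving K2 also applies the XOR with 0x87.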
[Figure 12.8 Cipher-Based Message Authentication Code (CMAC): (a) Message length is integer multiple of block size, with $K_1$ XORed into the final block; (b) Message length is not integer multiple of block size, with the final block padded with 10...0 and $K_2$ XORed in. In both panels the tag T is MSB(Tlen) of the final encryption output.]
12.7 AUTHENTICATED ENCRYPTION: CCM AND GCM
Authenticated encryption (AE) is a term used to describe encryption systems that
simultaneously protect confidentiality and authenticity (integrity) of communica-
tions. Many applications and protocols require both forms of security, but until re-
cently the two services have been designed separately.
There are four common approaches to providing both confidentiality and
authentication for a message M.
■ Hashing followed by encryption (H → E): First compute the cryptographic
hash function over M as h = H(M). Then encrypt the message plus hash
value: $E(K, (M \,\|\, h))$.
■ Authentication followed by encryption (A → E): Use two keys. First
authenticate the plaintext by computing the MAC value as $T = \mathrm{MAC}(K_1, M)$. Then
encrypt the message plus tag: $E(K_2, [M \,\|\, T])$. This approach is taken by the
SSL/TLS protocols (Chapter 17).
■ Encryption followed by authentication (E → A): Use two keys. First encrypt
the message to yield the ciphertext $C = E(K_2, M)$. Then authenticate the
ciphertext with $T = \mathrm{MAC}(K_1, C)$ to yield the pair (C, T). This approach is
used in the IPSec protocol (Chapter 20).
■ Independently encrypt and authenticate (E + A): Use two keys. Encrypt
the message to yield the ciphertext $C = E(K_2, M)$. Authenticate the
plaintext with $T = \mathrm{MAC}(K_1, M)$ to yield the pair (C, T). These operations can
be performed in either order. This approach is used by the SSH protocol
(Chapter 17).
Both decryption and verification are straightforward for each approach. For
H → E, A → E, and E + A, decrypt first, then verify. For E → A, verify first, then
decrypt. There are security vulnerabilities with all of these approaches. The H → E
approach is used in the Wired Equivalent Privacy (WEP) protocol to protect WiFi
networks. This approach had fundamental weaknesses and led to the replacement of
the WEP protocol. [BLAC05] and [BELL00] point out that there are security con-
cerns in each of the three encryption/MAC approaches listed above. Nevertheless,
with proper design, any of these approaches can provide a high level of security.
This is the goal of the two approaches discussed in this section, both of which have
been standardized by NIST.
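As an illustration of the composition order, the following sketch implements the encryption-followed-by-authentication pattern with two keys and the verify-first, then-decrypt rule. Everything here is an assumption made for the example: the stream cipher is a toy keystream built from SHA-256 (not a vetted cipher), HMAC-SHA-256 plays the role of MAC, the function names are hypothetical, and the nonce is included in the MAC input in addition to C.

```python
import hashlib
import hmac

def toy_stream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy counter-mode keystream from SHA-256, for illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        out.extend(x ^ y for x, y in zip(data[offset:offset + 32], pad))
    return bytes(out)

def seal(enc_key: bytes, mac_key: bytes, nonce: bytes, plaintext: bytes):
    """C = E(K2, M), then T = MAC(K1, nonce || C)."""
    ciphertext = toy_stream_xor(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return ciphertext, tag

def open_sealed(enc_key: bytes, mac_key: bytes, nonce: bytes,
                ciphertext: bytes, tag: bytes) -> bytes:
    """Verify the tag first; decrypt only if verification succeeds."""
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("authentication failed")
    return toy_stream_xor(enc_key, nonce, ciphertext)

k2_enc, k1_mac, nonce = b"e" * 16, b"m" * 16, b"n" * 12
c, t = seal(k2_enc, k1_mac, nonce, b"attack at dawn")
recovered = open_sealed(k2_enc, k1_mac, nonce, c, t)
```

Because the tag is checked before any decryption takes place, a modified ciphertext is rejected without the receiver ever processing the forged plaintext.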
Counter with Cipher Block Chaining-Message Authentication Code
The CCM mode of operation was standardized by NIST specifically to sup-
port the security requirements of IEEE 802.11 WiFi wireless local area networks
(Chapter 18), but can be used in any networking application requiring authenti-
cated encryption. CCM is a variation of the encrypt-and-MAC approach to authen-
ticated encryption. It is defined in NIST SP 800-38C.
The key algorithmic ingredients of CCM are the AES encryption algorithm
(Chapter 6), the CTR mode of operation (Chapter 7), and the CMAC authentication
algorithm (Section 12.6). A single key K is used for both encryption and MAC algo-
rithms. The input to the CCM encryption process consists of three elements.
1. Data that will be both authenticated and encrypted. This is the plaintext
message P.
2. Associated data A that will be authenticated but not encrypted. An example
is a protocol header that must be transmitted in the clear for proper protocol
operation but which needs to be authenticated.
3. A nonce N that is assigned to the payload and the associated data. This is a
unique value that is different for every instance during the lifetime of a pro-
tocol association and is intended to prevent replay attacks and certain other
types of attacks.
Figure 12.9 illustrates the operation of CCM. For authentication, the input
includes the nonce, the associated data, and the plaintext. This input is formatted
as a sequence of blocks $B_0$ through $B_r$. The first block contains the nonce plus some
formatting bits that indicate the lengths of the N, A, and P elements. This is
followed by zero or more blocks that contain A, followed by zero or more blocks that
contain P. The resulting sequence of blocks serves as input to the CMAC algorithm,
which produces a MAC value with length Tlen, which is less than or equal to the
block length (Figure 12.9a).
For encryption, a sequence of counters is generated that must be independent
of the nonce. The authentication tag is encrypted in CTR mode using the single
counter $\mathrm{Ctr}_0$. The Tlen most significant bits of the output are XORed with the tag to
produce an encrypted tag. The remaining counters are used for the CTR mode en-
cryption of the plaintext (Figure 7.7). The encrypted plaintext is concatenated with
the encrypted tag to form the ciphertext output (Figure 12.9b).
SP 800-38C defines the authentication/encryption process as follows.
1. Apply the formatting function to (N, A, P) to produce the blocks $B_0, B_1, \ldots, B_r$.
2. Set $Y_0 = E(K, B_0)$.
3. For i = 1 to r, do $Y_i = E(K, (B_i \oplus Y_{i-1}))$.
4. Set $T = \mathrm{MSB}_{Tlen}(Y_r)$.
5. Apply the counter generation function to generate the counter blocks
$\mathrm{Ctr}_0, \mathrm{Ctr}_1, \ldots, \mathrm{Ctr}_m$, where m =