Understanding Checksums and Cyclic Redundancy Checks
Ebook · 723 pages · 6 hours


About this ebook

Understanding Checksums and Cyclic Redundancy Checks:
This book gives practical, comprehensive answers to common questions about checksums and CRCs. Descriptions are based mainly on intuitive rather than mathematical explanations for both algorithmic operation and limitations, improving accessibility to non-specialists. Coverage includes single-sum checksums, dual-sum checksums (Fletcher checksum, DualX, DualXP), the new Koopman checksum, Cyclic Redundancy Checks (CRCs), and system-level usage considerations.

For decades much of the practical use of checksums and CRCs was based, at least in part, on folklore. This book provides a solid, comprehensive foundation for addressing core issues such as the comparative fault detection effectiveness of each technique, insight into speed differences, intuitive explanations for how speed-up techniques work, which CRC polynomial you should use for any particular situation, and source code examples for each approach.

This is the most comprehensive treatment of checksums and CRCs to date. The emphasis is on intuitive explanations and empirical validation of insights for practical application of these techniques to addressing real-world problems.

Chapters:
1. Introduction
2. Decimal checksum examples
3. Checksum operation and terminology
4. Checksum fault model
5. Single-sum checksums
6. Dual-sum checksums (Fletcher, Adler, DualX)
7. Koopman checksum
8. Checksum plus parity for HD=4 (KoopmanP, DualXP)
9. Cyclic Redundancy Check (CRC)
10. CRC effectiveness
11. Other checksum considerations
12. System-level considerations
13. Resources
14. Conclusions
15. Appendix: Good CRC Polynomials
16. Glossary

Includes example source code in C for each type of checksum.
This e-book is best viewed in color, with some color-coded text and dozens of color illustrations.

Language: English
Release date: Mar 20, 2024
ISBN: 9798224117970
Author

Philip Koopman

Prof. Philip Koopman is an internationally recognized expert on Autonomous Vehicle (AV) safety whose work in that area spans over 25 years. He is also actively involved with AV policy and standards as well as more general embedded system design and software quality. His pioneering research work includes software robustness testing and run time monitoring of autonomous systems to identify how they break and how to fix them. He has extensive experience in software safety and software quality across numerous transportation, industrial, and defense application domains including conventional automotive software and hardware systems. He was the principal technical contributor to the UL 4600 standard for autonomous system safety issued in 2020. He is a faculty member of the Carnegie Mellon University ECE department where he teaches software skills for mission-critical systems. In 2018 he was awarded the highly selective IEEE-SSIT Carl Barus Award for outstanding service in the public interest for his work in promoting automotive computer-based system safety. In 2022 he was named to the National Safety Council's Mobility Safety Advisory Group. He is the author of the books: Better Embedded System Software (2010), How Safe is Safe Enough: measuring and predicting autonomous vehicle safety (2022), and The UL 4600 Guidebook (2022).


    Book preview

    Understanding Checksums and Cyclic Redundancy Checks - Philip Koopman

    Preface

    Checksums and Cyclic Redundancy Checks (CRCs) have been around for a long time. But surprisingly, it has taken decades to really understand them.

    There is a significant divide between mathematical treatments of the topic and practical applications to networking and lightweight data integrity checks. This book comes down firmly on the side of practical implementation, and intentionally minimizes the mathematical aspects. There are plenty of places to read about the math. My goals here are to build non-mathematical intuition and emphasize empirical validation of insights.

    An important realization I had early on was that a purely mathematical approach does not necessarily get you where you need to go on this topic. The math is seductive. But it tends to run out of steam just when things get interesting, and in particular does so for CRCs with a Hamming Distance of greater than 4 (that is when you want to be able to guarantee detection of 4 or more bit faults in a single network message or data structure). So I turned to a brute force evaluation approach that depended on dramatic algorithmic optimizations. Despite daunting odds that it would ever be efficient enough to be useful, a brute-force approach with suitable optimizations turned out to be the key to unlocking significant new discoveries and insights.

    I’ve spent many hours puzzling through coding theory books. While I can understand what they say, my brain has never really worked math-first for this topic – it is more math-last I guess! I build intuition, and often see the bit movement patterns in my head before the math makes sense. Not everyone thinks that way, and certainly that is not how the vast majority of CRC and checksum literature is written. But it turns out that this way of thinking has been helpful for achieving a more comprehensive understanding of the practical aspects of checksums and CRCs.

    Along the way, I have occasionally noted and commented upon mistakes, typos, or limitations of previous work. It is important to recognize that creating anything defect-free in this field is immensely difficult. Indeed, I have been hesitating to press the publish button for this book because I know there will inevitably be mistakes here too. Nonetheless, knowledge improves over time by identifying and correcting previous over-simplifications, false paths, gaps, and outright mistakes. I hope that this book provides a solid step in that direction.

    Any book must have limits to its scope. For this book the topic is generally limited to checksums (including CRCs) used for fault detection. This leaves out error correction codes, and also codes more complex than checksums and CRCs. Other books for the most part quickly gloss over the mundane topic of checksums to concentrate on more complex and sophisticated error codes. However, in many practical applications checksums or CRCs are the right tool for the job, so this book concentrates on them rather than fancier coding theory. Hash functions, which might be thought of as cousins to checksums, are similarly out of scope.

    One checksum I decided not to dig into deeper is the Weighted Sum Codes approach (Feldmeier 1995). While it is said to offer an interesting speed vs. fault detection tradeoff point, the computational algorithm is more complex than the codes I consider, and I have not seen any practical adoptions. So that is left as an area of future study for the reader.

    A topic area I discuss with a curtailed scope is proper vs. improper CRCs (see section 10.10). What I found on that topic raised more questions than it answered for practical applications, primarily in the area of what happens when bit error rates might be greater than 50%.¹ Again, this is left as an area of future study.

    For printing production I decided to go with color, because the graphs are very challenging to do well in black and white. That makes the printed version a bit more expensive (and for the hard cover edition, the raw printing cost is much more expensive). But let’s face it – this topic is not cut out to be a best seller beach read for the masses! So I’m going for presentation quality rather than cheaper printing cost. Additionally, I expect many copies of this book will be distributed as an e-book edition, where color just costs some file size. (As an aside, this is not a free book. If you didn’t pay for your copy, then you are reading an illicit copy. Please go buy a legitimate copy from the e-book store in the nearest convenient country. Thanks!)

    The journey to finally being in a position to write this book was a long one. I give grateful acknowledgement to students, colleagues, and others who helped along the way with research in this area, including: Tridib Chakravarty, Kobey DeVale, Kevin Driscoll, Brendan Hall, Beth Osyk, Theresa (Theta) Maxino, Jennifer Black, Michael Paulitsch, Justin Ray, and Eushiuan Tran. Additional contributions came from: Mark Adler, Bob DiSilvestro, Berthold Gick, Hubert Kirrmann, and Pierre Zuber. I appreciate the additional personal support from Charles Kilgore, Barbara Lingberg, and Priya Narasimhan.

    Some of the earliest work on this topic using an empirical evaluation approach was done by Dr. G. Funk of ASEA-Brown Boveri (ABB). His ABB colleagues pointed me toward his work early in my investigation of this topic, and it had a large impact on my thinking.

    While writing this book itself did not have research sponsorship, previous related research did, so I want to acknowledge supporters again from those previous research projects that paved the way: the General Motors Collaborative Research Laboratory at Carnegie Mellon University, Robert Bosch GmbH, Bombardier Transportation, Honeywell, the Federal Aviation Administration, the Office of Naval Research, US Department of Transportation, and the Pennsylvania Infrastructure Technology Alliance (PITA). Earlier work also benefitted from equipment grants from Digital Equipment Corporation and Intel. The opinions and findings in this book are entirely my own, and do not reflect upon those historical research sponsors.²

    Comments and corrections are welcome via email. Thanks to Steve Simpson for catching typos.

    I maintain an updated zoo of good CRC polynomials here:

    https://users.ece.cmu.edu/~koopman/crc/

    I also have a checksum and CRC blog with occasional entries. Look for an entry about this book that has errata and a clickable list of all web pointers to save having to try to type them in for those of you reading the print version:

    https://checksumcrc.blogspot.com/p/book.html

    Checksums and CRCs occupy a peculiar niche in the world of computer dependability. The classic techniques are decades old and ubiquitous. Yet they are arguably under-studied for such a core part of our computing infrastructure, and use of them is often (but decreasingly) based on folklore rather than solid engineering tradeoffs. I hope that publishing this book is a useful step in the direction of improving the practice of deploying appropriate and effective checksums and CRCs.

    Philip Koopman

    Pittsburgh, PA, February 2024

    koopman@cmu.edu


    1. Introduction

    The vast majority of data transferred between computers of all types has at some point been protected from potential accidental corruption by a checksum or Cyclic Redundancy Check (CRC).

    Any data transmitted on a computer network or placed into storage is subject to some level of corruption. Over time, the bits just rot, so to speak. The probability of any particular bit being corrupted is small, but the risk adds up as the amount of data and the exposure to potential sources of corruption increase. A number stored in memory comes back altered when it is read later. A message sent at one end of a network comes out a different message at the other end. This tendency toward data corruption applies to all data values, whether numbers, text, photos, videos, music, or otherwise.

    For the most part, data corruption and methods to mitigate its effects are invisible to end users. But those who build any computer system need to choose the right mitigation method and implement it in a robust way to achieve a happy outcome of apparent fault-free operation.

    The area of research and practice known as coding theory³ arose to deal with this issue of data corruption and related matters. This book does not attempt to be yet another book that tries to broadly cover that complex and vast topic. Rather, this book limits its coverage to the commonly used – but too often under-discussed – area of checksums and CRCs.

    Our scope covers variations on the following algorithms:

    Modular addition checksums (including parity computations)

    Dual-sum checksums (including the Fletcher and Adler checksums), plus an extended block size DualX version

    Parity versions of DualX and Koopman checksums

    The Koopman checksum, which is a new point in the complexity/error detection performance tradeoff space based on a pure modular reduction operation

    Cyclic Redundancy Checks (CRCs)

    Modular addition checksums, dual-sum checksums, and CRCs are the workhorse error detection codes widely used in different computing applications. Understanding the tradeoffs is especially important for embedded computer system designers, who struggle with meager resources in both computing power and bandwidth while creating highly reliable products.

    While checksums and CRCs are typically dismissed in a few pages of a deeply theoretical textbook, there is more to it than is typically discussed in such treatments.⁵ More importantly, that extra information can easily make the difference between a checksum or CRC that is effective for error detection, and one that is not.

    There is plenty of misinformation on what makes an effective checksum or CRC that has been circulated over the decades.⁶ All of it is well-intentioned, but much of it is incorrect nonetheless. Some of the issues are just typos, but even a single incorrect bit value can change an excellent CRC into a terrible one. Some are just advice that seemed reasonable in a previous era given a lack of robust quantitative analysis results, but have turned out to be inaccurate. And some of it is just engineers frustrated at a lack of solid research results doing the best they can given insufficient information.

    This book strives to provide comprehensive, detailed information that is generally applicable to a wide variety of uses for both checksums and CRCs. Moreover, it does so in a way that is intentionally less mathematical than typical coding theory texts so that it might serve a much broader audience of practicing engineers. Think of it as intuition-building with a software-forward instead of a math-forward approach.

    Specific design choices made in creating this book include:

    Using the least amount of mathematical notation that is practical, consistent with describing what is essentially a mathematical computation. This is enforced by limiting math to what can be expressed readily without using an equation editor. This means some of the mathematical rigor and proofs typically found in a coding theory approach are missing. But see the next point…

    Emphasizing a description-by-example approach rather than a mathematical proof approach. This makes the material accessible to a wider audience. If you already understand the math from coding theory treatments, there is still material here for you in terms of practical application. And you might find that the insight-based rather than mathematical-based treatments still provide value.

    Describing various topics example-first, then generalizing. This can make concepts more concrete than a math-first approach to many readers.

    Using an empirical evaluation approach to selecting moduli rather than a theoretical approach.⁷ This has been the key to unlocking deeper insights for both checksums and CRCs.

    Emphasizing the contributions of each specific number of fault bits (e.g., 2-bit, 3-bit, 4-bit, … faults) rather than varying the probability of random bit faults (bit error ratio) in analysis. We have found this provides useful insights.

    Conforming to established terminology as much as possible, while attempting to smooth the seams where terminology from diverse disciplines leads to conflicting terms. For example, one might say there is a bit fault or a bit error in a particular datum depending on technical background and point of view, but mean the same thing.

    To ensure we are not accused of complete heresy regarding the lack of fancy equations, indeed there will be a discussion of polynomial division regarding CRCs. But that is just the start of the discussion, not the end. And the emphasis is on intuition-building and algorithms, not rigorous mathematical notation.

    There will still need to be some math, and we assume the reader is familiar with basic probability calculations.⁸ Ultimately, evaluating checksums and CRC fault detection effectiveness is a probabilistic numbers game, so a minimum amount of math is required. That and long division.

    This section is a warm-up of terminology to prepare the reader for the introduction to checksum concepts in chapter 2. We revisit terminology in more detail in chapter 3. Also see the glossary at the end of the book if you need to quickly check the definition of a term.

    The basic arrangement of a checksum or CRC calculation as it relates to data in memory or a network message is shown in figure 1.1.


    Figure 1.1. Simplified representation of a code word that contains both a data word (data to be protected) and a check value that is the result of a checksum or CRC computation.


    A data word is the data we want to safeguard by using a checksum. It might be one bit, or it might be a megabyte or more of data.¹⁰ The idea is that it is some set of data that we want to ensure has not been corrupted. To be clear, for our purposes we cannot prevent it from being corrupted. But we can try our best to ensure that corrupted data does not sneak through unnoticed to create incorrect computational results.

    A check value is computed by running the data word through a checksum or CRC calculation. This mixes and condenses the data word into a resultant check value that is often 8, 16, or 32 bits in size. The check value is computed in a way that we hope makes detecting any corruption likely. But no such calculation can be perfect at detecting all possible faults. In particular, there will always be some fault patterns in the data word (potentially combined with faults in the check value) that in effect cancel each other out and result in an undetectably corrupted data word.

    A code word is the combination of a data word concatenated with a check value. The entire code word is handled as a single entity at the system level. For example, a single network message is likely to contain a code word. When a code word is received, the CRC or checksum algorithm is run again to see if the data word still matches the expected check value. If not, there has been a detected fault (i.e., the code word is corrupted, and that corruption has been detected). If they do match then either (1) there has been no corruption, or (2) there has been a corruption severe enough to be beyond the ability of the checksum or CRC algorithm to detect. Note that we consider corruption of the entire code word, not just corruption of the data word, because various causes of corruption at the bit-by-bit level generally do not know the structure of the code word – they see a string of bits the recipient happens to decide to treat as a particular type of code word.
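
    As a concrete illustration of that recompute-and-compare step, here is a minimal sketch in C (my own illustration, with hypothetical names and a trivial placeholder check value computation, not code from this book) of how a receiver might validate a code word that carries a 16-bit check value at its end:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Placeholder check value computation (a simple XOR of bytes); any of the
       checksum or CRC algorithms discussed in this book could be substituted. */
    static uint16_t compute_check_value(const uint8_t *data_word, size_t length) {
        uint16_t check = 0;
        for (size_t i = 0; i < length; i++) {
            check ^= data_word[i];
        }
        return check;
    }

    /* A code word is the data word followed by its check value. Returns true if
       the recomputed check value matches the stored one; false is a detected
       fault. True means either no corruption, or a corruption this check value
       cannot detect. */
    bool code_word_is_valid(const uint8_t *code_word, size_t code_word_length) {
        if (code_word_length < 2) {
            return false;  /* too short to hold a 16-bit check value */
        }
        size_t data_word_length = code_word_length - 2;
        uint16_t stored = (uint16_t)((code_word[data_word_length] << 8) |
                                     code_word[data_word_length + 1]);
        return compute_check_value(code_word, data_word_length) == stored;
    }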

    The Hamming Distance is the smallest number of bit faults that must occur to corrupt a code word to the point that it is potentially undetectable by a particular checksum or CRC. For example, HD=4 means Hamming Distance of 4, in which all 1-, 2- and 3-bit faults are detected, but some 4-bit faults are not.

    There are significant tradeoffs regarding the computational cost vs. fault detection capability of any particular checksum or CRC algorithm. In general, more computational cost gives better fault detection, but more sophisticated approaches have limits to the maximum size data word for which they are effective. Exploring that tradeoff space is a primary goal of this book.

    There are two basic mechanisms for checksums discussed in this book.

    Modular addition checksum. This computes a sum of blocks of data across the length of the data word to produce a check value. For example, a checksum might be a two’s complement 32-bit addition of all 32-bit values in a data word. The modular part comes into play in that a modulo operation (remainder after division) is performed after each addition so that the result fits within the defined check value size, even if the sum would otherwise be too big to fit. The modular addition mechanism is used by both single-sum and dual-sum checksums.
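
    To make that concrete, here is a minimal C sketch of such a single-sum checksum (one reasonable illustration, not necessarily any specific listing from this book). The modulo operation is implicit because unsigned 32-bit arithmetic in C wraps around, i.e., is performed mod 2³²:

    #include <stddef.h>
    #include <stdint.h>

    /* Single-sum checksum: 32-bit addition of all 32-bit blocks in the data
       word. The sum wraps around (implicit mod 2^32), so the result always
       fits within the 32-bit check value size. */
    uint32_t single_sum_checksum32(const uint32_t *blocks, size_t block_count) {
        uint32_t sum = 0;
        for (size_t i = 0; i < block_count; i++) {
            sum += blocks[i];  /* unsigned overflow wraps: implicit mod 2^32 */
        }
        return sum;
    }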

    Modular reduction checksum.  In this approach, the entire data word is treated as a single large number that is divided by some modulus. The check value is the remainder after division. This mechanism is used by pure modular reduction, Koopman checksums, and CRCs.
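
    A correspondingly minimal C sketch of the modular reduction mechanism (again just an illustration with an arbitrary caller-chosen modulus, not a Koopman checksum or CRC implementation) treats the data word as one large base-256 number and keeps only the running remainder, so intermediate values never grow too large:

    #include <stddef.h>
    #include <stdint.h>

    /* Modular reduction checksum: treat the whole data word as one large
       base-256 number and return its remainder after division by the modulus.
       Reducing after every byte is mathematically equivalent to dividing the
       entire number at the end, but avoids needing a huge integer. */
    uint32_t modular_reduction_checksum(const uint8_t *data_word, size_t length,
                                        uint32_t modulus) {
        uint64_t remainder = 0;
        for (size_t i = 0; i < length; i++) {
            remainder = ((remainder << 8) | data_word[i]) % modulus;
        }
        return (uint32_t)remainder;
    }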

    For both modular addition and modular reduction, there are two possible types of mathematical operations used for the addition and division operations: normal integer arithmetic for most checksums, and polynomial operations with 1-bit coefficients for CRCs. The details of each will be described as we go.

    Beyond the scope of this book are other types of error codes. They certainly have their uses. To understand those, have a look at a more traditional book on coding theory, with several of the usual suspects for that listed in the resources chapter. We’re here to take a deep dive into checksums and CRCs.

    The goal of this book is to be an accessible guide for practicing engineers. The style reflects this.

    Many of the references are to Wikipedia. Wikipedia is not a universally authoritative source. However, in our experience, the topics we refer to are pretty reasonable in content and style. Wikipedia provides references to useful, often authoritative sources for anyone who wants to dig further into a particular topic. Just as importantly, much of the work in this area is buried behind academic publication paywalls and might not be available to many readers. So we refer to Wikipedia in most cases to ensure that backup information is readily available, with the caution that the reader should be wary of any potential inaccuracies that might affect some particular point.¹¹

    In some places, the recommendations in this book are different than those found in our earlier publications. That does not mean the earlier publications are wrong (unless we note that explicitly). Rather, it means that after decades of studying this topic we have even better advice to offer. In particular, CRC polynomials previously recommended still perform as advertised at the time of their recommendation. But in a few cases, we have decided that recommendations can be changed to be even better.

    For those who want to dive deeper, Chapter 13 has a list of resources that have informed this work and are worth looking at for more details, especially regarding mathematical and theoretical topics that we do not cover in depth.

    Here is a summary of our coverage:

    Chapter 2 gives some decimal checksum examples, including some examples in use in the real world such as the ISBN 10 book numbering system and off-line credit card number validation. This serves as a gentle introduction to the concepts used in more general checksum examples. It also provides a quick survey of some practical checksum uses beyond binary numeric applications.

    Chapter 3 gives a more thorough set of terminology and definitions. It also presents a generic description of how a checksum/CRC assures data integrity.

    Chapter 4 covers fault models. If we are going to talk about how well any particular algorithm can detect faults, we need a solid basis for understanding what different fault characteristics might be detected. This includes a discussion of the notion of a Hamming Distance (HD), which is a key attribute of fault detection effectiveness.

    Chapter 5 covers single-sum checksums, which are what people often mean when they are mentioning a checksum without being more specific. A single running sum is computed across bytes, 32-bit integers, or other blocks of the data word to create a check value.

    Chapter 6 covers dual-sum checksums. These use a pair of cascaded running sums across the data word to provide better fault detection compared to single-sum checksums. The Fletcher and Adler checksums are examples of this technique. We introduce DualX checksums, which have an expanded block size. DualX checksums provide more capable fault detection and can run significantly faster on larger processors.

    Chapter 7 covers the Koopman checksum, which is a novel approach to computing a modular reduction operation that involves reformulating it to be a modular addition loop. This provides superior fault detection effectiveness for larger data words than a dual-sum checksum while still making use of any native CPU support for integer division. It is for practical purposes as effective as a particular class of CRCs (HD=3) for data word lengths of up to about half as long as a corresponding good CRC with the same check value size, but with much faster computational speed.

    Chapter 8 covers variations of DualX and Koopman checksums that include a parity bit in the check value. This provides fault detection effectiveness at HD=4, albeit for data word lengths approximately half as long as their non-parity equivalents.

    Chapter 9 explains how Cyclic Redundancy Check (CRC) computations work, including both left-shift and right-shift algorithms, as well as the table lookup computational speedup technique.

    Chapter 10 examines CRC effectiveness, which is independent of which CRC computation algorithm is being used.

    Chapter 11 covers loose ends that have arisen in previous chapters: the effect of data values on fault detection effectiveness, whether large moduli are better, what happens for large numbers of bit faults, and effectiveness for burst faults rather than random independent single-bit faults.

    Chapter 12 discusses practical system-level considerations for using checksums and CRCs.

    Chapter 13 is an annotated list of resources that might prove useful to readers interested in exploring further.

    Chapter 14 is a brief conclusion.

    The Appendix lists good general-purpose CRC polynomials based on CRC size, data word length, and maximum achievable HD. The polynomials listed provide one-stop shopping for good CRC polynomial selection.

    A Glossary has concise definitions for bold-faced terms defined throughout the book.

    Figure 1.2 is a preview of checksum and CRC performance for some of the checksum algorithms discussed in later chapters. The vertical axis is the probability of undetected fault Pud at a particular bit error rate. (Pud is the probability of an undetected faulty code word at a particular assumed bit error rate. The details of what that might mean are discussed in chapter 4. For now, up is bad; down is good; and the vertical axis is on a log scale.)

    With a span of many orders of magnitude of Pud performance between the most and least effective checksums on this chart, there is plenty of room for trading off better Pud vs. algorithmic complexity and computation speed. The Simple curves are notional references for an idealized checksum operating at the indicated HD. The balance of the book revisits all these curves with explanations, so this is just a preview of things to come.


    Figure 1.2. Survey of Pud for selected checksums. 16-bit check values; BER = 10⁻⁶


    Speed differences are discussed in section 12.1, with the general finding that CRCs are slowest, but provide the best fault detection capabilities. Checksums that are increasingly faster but less capable are, in order: KoopmanP, Koopman, DualXP, DualX, and single sum/parity. Dual-sum checksums (e.g., the Fletcher and Adler checksums) are in many cases both slower and less capable than the corresponding DualX checksum alternatives.

    Chapter 2 covers decimal digit checksums while subsequent chapters switch to computer-friendly binary number-based checksum algorithms.

    2. Decimal checksum examples

    In this chapter, we work some examples of decimal digit checksums to illustrate the general ideas that come into play. Subsequent chapters will revisit these ideas in a more rigorous way using binary integers.

    The emphasis here is on building insight to ground the more detailed discussions that follow. The examples are based on decimal notation that is generally suitable for manual data entry and, at least in principle, pencil-and-paper arithmetic.¹²

    A classic use of a checksum is detecting mis-typed numeric data entered via a keyboard. If a person is asked to enter a long string of arbitrary numbers, there is a chance they will make some sort of typographical error doing so. Some problems may come from pressing the wrong key, while others come from visually getting lost in the long string of digits. Swapping the order of pairs of adjacent digits during their trip through a person’s short-term memory is an especially common fault to guard against.

    There are notational tricks to help with the visual digit recognition task, such as breaking phone numbers into groups of digits. For example, using the North American phone numbering system, 4125550123 might be more difficult to interpret visually than 412.555.0123.¹³

    However, typos will still happen. To help detect such typos, we can use a checksum.

    Given a data word of a 10-digit phone number, the most straightforward checksum calculation is to add the digits of a number, taking only the lowest digit of the result to form a one-digit check value (algorithm 2.1). For a running example of 412.555.0123, we would get:

    4 + 1 + 2 + 5 + 5 + 5 + 0 + 1 + 2 + 3 = 28 ⇒ checksum of 8

    We’ll note that taking only the lowest digit is mathematically equivalent to taking the remainder after division by 10, or computing the sum modulo 10 (often written as mod 10).¹⁴

    We could then append this value "8" as a check digit to the number, creating a more robust number (a code word): 412.555.0123.8. The point of doing this is that if anyone enters this number with a wrong-digit typographical fault, there is a reasonable chance the check digit (the check value) will not match the phone number (the data word), and the fact that there has been a typo can be detected.


    Algorithm 2.1 Decimal addition mod 10.
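
    Since the algorithm listing itself does not reproduce in this preview, the following minimal C sketch (my reconstruction from the description above, not necessarily the book's exact listing) computes the mod-10 digit-sum check value for a string of decimal digits, skipping separator characters such as the dots:

    #include <stdio.h>

    /* Mod-10 digit-sum checksum: add the decimal digits and keep only the
       lowest digit of the sum (the sum modulo 10). Non-digit characters such
       as the '.' separators are ignored. */
    unsigned int decimal_checksum_mod10(const char *digits) {
        unsigned int sum = 0;
        for (const char *p = digits; *p != '\0'; p++) {
            if (*p >= '0' && *p <= '9') {
                sum += (unsigned int)(*p - '0');
            }
        }
        return sum % 10u;
    }

    int main(void) {
        /* 4+1+2+5+5+5+0+1+2+3 = 28, so the check digit is 8 */
        printf("%u\n", decimal_checksum_mod10("412.555.0123"));  /* prints 8 */
        return 0;
    }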


    With a checksum-based phone number, rather than bothering someone by reaching a wrong number, the caller can get a system error most of the time¹⁵ and try again. For example, let’s say someone mis-wrote the above phone number as: 412.556.0123.8 (the underline is there for the reader, but the person making a call would not know there was a "5"⇒"6" typo). When using this number to make a call, the phone system would compute the checksum of the first 10 digits:

    Intended number: 412.555.0123.8

    Faulty number:    412.556.0123.8

    Faulty number: 4 + 1 + 2 + 5 + 5 + 6 + 0 + 1 + 2 + 3 = 29

    29 mod 10 ⇒ computed checksum of 9

    That computed checksum "9" could be compared to the dialed checksum (the final "8" in the dialed number) to determine if there must be some typo in the number somewhere, making the mis-typed phone number invalid.

    For this example, we have the following situation, in which dialing a wrong number successfully would require two digits to be incorrect, resulting in a check digit that happens to match up with the main phone number:

    412.555.0123.8 valid phone number

    412.556.0123.8 invalid checksum means invalid number

    412.556.0123.9 some other phone number that is valid

    412.546.0123.8 some other phone number that is valid

    Given those examples, we can infer some fault detection properties of this checksum approach:

    Any single-digit fault (any incorrect single decimal digit within the number, including the check digit) will be detectable. There are two cases:

    If the digit fault is in the main phone number, it will give an incorrect sum that does not match the check value digit at the end of the phone number.

    If the check digit is incorrect, it will not match the sum of the correct phone number digits.

    Many, but not all, two-digit faults will be undetectable. There are two cases of undetectable faults:

    A one-digit change in the phone number is paired with a change to the check value, with that pair resulting in a correct checksum match. In the example above, 412.556.0123.9 might be an intended legitimate phone number, or might be the result of changing 5 to 6 as well as a second change of the check digit 8 to 9, which results in a check value that matches the computed checksum result.

    A two-digit change in the phone number in which those two digits happen to result in the same sum. In the example above, 412.546.0123.8 has a valid checksum, but that might be due to corrupting 555 to 546, which has the same net effect on the checksum, giving an overall checksum result of 8.

    Most multi-digit faults will be detectable. In general, entering a completely random 11-digit number has only a 10% chance of forming a valid phone number. As a simple thought experiment, if you choose a 10-digit primary phone number at random, then choose a random check digit, there is a 1 in 10 chance that the check digit just happens to match whatever one of the ten possible sums was created based on the primary digits.

    No ordering faults (digits moved to different positions) are detectable. If you change the order of the digits, the sum is the same. This is a significant weakness in practice, because reversing the order of adjacent digits is a somewhat common data entry fault. (In the example above, 412.555.1023.8 is a valid phone number, but might be the result of a transposition fault of 0123 to 1023.)

    An important fault detection characteristic is that this checksum algorithm is especially good at detecting single-digit faults (100% detection). If we think that in practice single-digit faults are the most common, this is an attractive property, even if the average performance across more random faults is only a 90% fault detection rate.

    Why modern phone numbers don’t have check digits is not obvious, but might have its roots in the evolution of phone numbers. Early phone systems used word/number pairs, presumably to aid in the ability to keep the whole number in a person’s short-term memory.¹⁶ Adding a check digit would have increased the burden of dealing with numbers, and would likely not have had any practical benefit until automated switching was pervasive – by which time the older numbering system would have been well established. Even though most North American calls now require a full 10 digits, with an eleventh digit potentially being a minor incremental burden, the legacy numbering system is far too entrenched to add check digits at this point.¹⁷

    An alternate approach to creating a checksum is to use division instead of addition as the primary operation. This is obviously not really a sum and feels more like a hash function,¹⁸ but can serve the purpose of a checksum in some applications, and lays important groundwork for later discussions.

    For a simple division-based checksum, the data value is divided by some fixed number, and the remainder of that division is used as the checksum value. Taking a remainder after division is called modular reduction (also known as a modulo operation) with the number being used as the divisor being the modulus.

    For example, if we use a divisor of 9, we get:

    4125550123 mod 9 = 1 ⇒ checksum value of 1

    In this system, a valid phone number would be treated as a single large integer for the checksum operation, with an added checksum digit that is the mod 9 result of "1", giving: 412.555.0123.1
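
    A minimal C sketch of this division-based check (again my own illustration, not the book's listing) reduces the running value after each digit, which is equivalent to dividing the full ten-digit number by 9 and keeping the remainder:

    #include <stdio.h>

    /* Division-based decimal checksum: treat the digit string as one large
       integer and return its remainder modulo 9. Reducing after each digit
       is equivalent to dividing the whole number at once and never overflows. */
    unsigned int decimal_checksum_mod9(const char *digits) {
        unsigned int remainder = 0;
        for (const char *p = digits; *p != '\0'; p++) {
            if (*p >= '0' && *p <= '9') {
                remainder = (remainder * 10u + (unsigned int)(*p - '0')) % 9u;
            }
        }
        return remainder;
    }

    int main(void) {
        printf("%u\n", decimal_checksum_mod9("4125550123"));  /* prints 1 */
        printf("%u\n", decimal_checksum_mod9("4125559123"));  /* also 1; see the third limitation below */
        return 0;
    }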

    While this approach works moderately well, it has three important limitations:

    It is more complex to compute than simple addition.

    It only uses nine out of the ten possibilities for the check digit (a mod 9 computation can only generate a check digit of {0..8}, with the value 9 being unused). This in turn means that its average effectiveness for randomly corrupted values is only 8/9 = 88.9% instead of the 90% for decimal digits with a range of {0..9}. That is a small difference, but a difference nonetheless.

    It misses some single-digit corruptions. In particular, it will miss any 0 changed to a 9, or any 9 changed to a 0, because either digit’s contribution to the remainder is zero. For example, 4125559123 mod 9 = 1, which is unchanged even though the 0 has been corrupted to a 9.

    So far, we
