Improving Product Reliability and Software Quality: Strategies, Tools, Process and Implementation
Ebook · 937 pages · 9 hours


About this ebook

The authoritative guide to the effective design and production of reliable technology products, revised and updated

While most manufacturers have mastered the process of producing quality products, product reliability, software quality and software security have lagged behind. The revised second edition of Improving Product Reliability and Software Quality offers a comprehensive and detailed guide to implementing a hardware reliability and software quality process for technology products. The authors – noted experts in the field – provide useful tools, forms and spreadsheets for executing an effective product reliability and software quality development process and explore proven software quality and product reliability concepts.

The authors discuss why so many companies fail after attempting to implement or improve their product reliability and software quality program. They outline the critical steps for implementing a successful program. Success hinges on establishing a reliability lab, hiring the right people and implementing a reliability and software quality process that does the right things well and works well together. Designed to be accessible, the book contains a decision matrix for small, medium and large companies. Throughout the book, the authors describe the hardware reliability and software quality process as well as the tools and techniques needed for putting it in place. The concepts, ideas and material presented are appropriate for any organization. This updated second edition: 

  • Contains new chapters on software tools, the software quality process and software security.
  • Expands the FMEA section to include software fault trees and software FMEAs.
  • Includes two new reliability tools to accelerate design maturity and reduce the risk of premature wearout.
  • Contains new material on preventative maintenance, predictive maintenance and Prognostics and Health Management (PHM) to better manage repair cost and unscheduled downtime.
  • Presents updated information on reliability modeling and hiring reliability and software engineers.
  • Includes a comprehensive review of the reliability process from a multi-disciplinary viewpoint including new material on uprating and counterfeit components.
  • Discusses aspects of competition, key quality and reliability concepts and presents the tools for implementation.

Written for engineers, managers and consultants lacking a background in product reliability and software quality theory and statistics, the updated second edition of Improving Product Reliability and Software Quality explores all phases of the product life cycle. 

Language: English
Publisher: Wiley
Release date: Apr 9, 2019
ISBN: 9781119179436

    Book preview

    Improving Product Reliability and Software Quality - Mark A. Levin

    About the Authors

    Mark A. Levin is the reliability manager at Teradyne, Inc. and is based in Agoura Hills, California. He received his bachelor of science degree in Electrical Engineering (1982) from the University of Arizona, a master of science degree in Technology Management (1999) from Pepperdine University, a master of science in Reliability Engineering (2009) from the University of Maryland, and all but dissertation for a PhD in Reliability Engineering from the University of Maryland. He has more than 36 years of electronics experience spanning the aerospace, defense, consumer, and medical electronics industries. He has held several management and research positions at Hughes Aircraft Missiles Systems Group, Hughes Aircraft Microwave Products Division, General Medical Company, and Medical Data Electronics. His experience is diverse, having worked in manufacturing, design, and research and development. He has developed manufacturing and reliability design guidelines, reliability training classes, workmanship standards, quality programs, JIT manufacturing, and ESD safe work environments, and has established a surface mount production facility.

    (Mark.levin@Teradyne.com)

    Ted T. Kalal is a reliability engineer (now retired) who has gained much of his understanding of reliability from hands‐on experience and from many great mentors. He is a graduate of the University of Wisconsin (1981) in Business Administration after completing much preliminary study in mathematics, physics, and electronics. He has held many positions as a contract engineer and as a consultant, where he was able to focus on design, quality, and reliability tasks. He has authored several papers on electronic circuitry and holds a patent in the field of power electronics. With two partners, he started a small manufacturing company that makes high‐tech power supplies and other scientific apparatus for the bioresearch community.

    Jonathan Rodin is a software engineering manager at Teradyne, Inc. A graduate of Columbia University (1981), Jon has 39 years of experience developing software, both working as a programmer and managing software development projects. His experience spans companies of many sizes, ranging from early stage startups to companies of greater than 100 000 employees. Prior to joining Teradyne, Jon held executive engineering management positions at FTP Software, NaviSite, and Percussion Software. He has led software process reengineering projects numerous times, most recently driving the effort to bring Teradyne's Semiconductor Test Division to CMMI Level 3.

    List of Figures

    Figure 1.1 Product cost is determined early in development.

    Figure 1.2 Cost to fix a design increases an order of magnitude with each subse...

    Figure 1.3 The reliability process reduces the number of ECOs required after pr...

    Figure 1.4 Including reliability in concurrent engineering reduces time to mark...

    Figure 1.5 Product introduction relative to competitors.

    Figure 1.6 The ICM process.

    Figure 2.1 Overcoming reliability hurdles bring significant rewards.

    Figure 5.1 The six phases of the product life cycle.

    Figure 5.2 The ICM process.

    Figure 5.3 A risk mitigation program (ICM) needs to address risk issues in all ...

    Figure 6.1 The bathtub curve (timescale is logarithmic).

    Figure 6.2 Cumulative failure curve.

    Figure 6.3 Light bulb theoretical example.

    Figure 6.4 Availability as a function of MTBF and MTTR. Note: The curve has a s...

    Figure 6.5 Design maturity testing – accept/reject criteria.

    Figure 6.6 Number of fan failures vs. run time.

    Figure 6.7 Mechanism that can cause degradation and failure.

    Figure 6.8 PHM data collection and processing to detect degradation.

    Figure 7.1 Functional block diagram.

    Figure 7.2 Filled‐out functional block diagram.

    Figure 7.3 Schematic diagram of a flashlight.

    Figure 7.4 Functional block diagram of a flashlight.

    Figure 7.5 Functional block diagram of a flashlight using Post‐its.

    Figure 7.6 Fault tree logic symbols.

    Figure 7.7 Fault tree diagram for flashlight using Post‐its.

    Figure 7.8 Logic flow diagram.

    Figure 7.9 Fault tree logic diagram.

    Figure 7.10 Flashlight fault tree logic diagram.

    Figure 7.11 Functional block diagram for the flashlight process.

    Figure 7.12 Example of a SFTA for an execution flow failure.

    Figure 8.1 Pareto of failures.

    Figure 8.2 HALT failure percentage by stress type.

    Figure 8.3 Product design specification limits.

    Figure 8.4 Design margin.

    Figure 8.5 Some products fail product spec.

    Figure 8.6 HALT increases design margin.

    Figure 8.7 Soft and hard failures.

    Figure 8.8 Impact of HALT on design margins.

    Figure 8.9 Two heat exchangers placed in front of chamber forced air.

    Figure 8.10 Test setup profile to checkout connections and functionality.

    Figure 8.11 Temperature step stress with power cycle and end of each step.

    Figure 8.12 Vibration step stress.

    Figure 8.13 Temperature and vibration step stress.

    Figure 8.14 Rapid thermal cycling.

    Figure 8.15 Slow temperature ramp.

    Figure 8.16 Slow temperature ramp with constantly varying vibration level.

    Figure 8.17 HASS stress levels.

    Figure 8.18 The bathtub curve.

    Figure 8.19 HASA plan.

    Figure 8.20 A HALT chamber has six simultaneous degrees of freedom (movement).

    Figure 8.21 ARG process flow.

    Figure 8.22 Accelerated reliability growth.

    Figure 8.23 ARG and ELT acceleration test plans.

    Figure 8.24 Selective process control.

    Figure 9.1 Quality ROI chart (financial impact of escapes is low).

    Figure 9.2 Quality ROI chart (financial impact of escapes is high).

    Figure 9.3 Sample line counts.

    Figure 9.4 Defect run chart 1.

    Figure 9.5 Defect run chart 2.

    Figure 9.6 Comparative escape rates.

    Figure 10.1 Generic fishbone diagram.

    Figure 10.2 Sample fishbone diagram.

    Figure 10.3 Sample Pareto chart.

    Figure 10.4 Code review root cause Pareto.

    Figure 10.5 Try‐catch code example.

    Figure 11.1 Waterfall life cycle.

    Figure 11.2 Quality processes in a waterfall life cycle.

    Figure 11.3 Sprint activities.

    Figure 11.4 Sprint activities in an epic.

    Figure 12.1 Sample requirements.

    Figure 12.2 Sample user stories.

    Figure 12.3 Code comments example.

    Figure 12.4 Sample UART HAL code.

    Figure 15.1 ESPEC/Qualmark HALT chamber.

    Figure 17.1 The six phases of the product life cycle.

    Figure 17.2 The hardware reliability process.

    Figure 17.3 Proactive activities in the product life cycle.

    Figure 18.1 Product concept phase risk mitigation form.

    Figure 18.2 Risk severity scale.

    Figure 18.3 ICM sign‐off required before proceeding to design concept.

    Figure 19.1 Opportunity to affect product cost.

    Figure 19.2 The bathtub curve.

    Figure 19.3 System MTBF requirement.

    Figure 19.4 Subsystem MTBF requirement.

    Figure 19.5 180° of reliability risk mitigation.

    Figure 19.6 Where to look for new reliability risks.

    Figure 19.7 The reliability risk mitigation process.

    Figure 19.8 The ICM is an effective gate to determine if the project should pro...

    Figure 20.1 The first phase of the product life cycle.

    Figure 20.2 Looking forward to identify risk issues.

    Figure 20.3 Risk mitigation strategies for reliability and performance.

    Figure 20.4 Risk growth curve shows the rate at which risk issues are identifie...

    Figure 20.5 DFR guideline for electrolytic capacitor usage.

    Figure 20.6 HALT planning flow.

    Figure 20.7 HALT planning checklist.

    Figure 20.8 HALT development phase.

    Figure 21.1 Reliability activities in the validation phase.

    Figure 21.2 HALT process flow.

    Figure 21.3 HALT test setup verification test.

    Figure 21.4 Temperature step stress.

    Figure 21.5 Vibration step stress.

    Figure 21.6 Temperature and vibration step stress.

    Figure 21.7 Rapid thermal cycling (60 °C min−1).

    Figure 21.8 Slow temperature ramp.

    Figure 21.9 Slow temperature ramp and sinusoidal amplitude vibration.

    Figure 21.10 HALT form to log failures.

    Figure 21.11 HALT graph paper for documenting test.

    Figure 21.12 HASS stress levels.

    Figure 21.13 HASS profile.

    Figure 22.1 Assert functions can be used with an appropriate header.

    Figure 22.2 Sample test plan.

    Figure 22.3 Sample log code.

    Figure 22.4 Example log file extract.

    Figure 24.1 Achieving quality in the production phase.

    Figure 24.2 Design issue tracking chart.

    Figure 24.3 Reliability growth chart.

    Figure 24.4 Reliability growth chart versus predicted.

    Figure 24.5 Duane curve.

    Figure 24.6 Phase 5 ARG process flow.

    Figure 24.7 Typical SPC chart.

    List of Tables

    Table 5.1 Functional activities for cross‐functional integration of reliability.

    Table 6.1 Failures in the warranty period w/different MTBFs.

    Table 6.2 Advantages of proactive reliability growth.

    Table 6.3 RDT multiplier for failure‐free runtime.

    Table 6.4 FMMEA for fan bearings (detection omitted).

    Table 6.5 Sensors to monitor for overstress in wearout degradation.

    Table 6.6 Sensors to monitor bearing degradation.

    Table 6.7 Component grade temperature classifications.

    Table 7.1 The FMEA spreadsheet.

    Table 7.2 RPN ranking table.

    Table 7.3 FMEA parking lot for important issues that are not part of the FMEA.

    Table 7.4 Common software failure modes.

    Table 7.5 Common causes for software failure.

    Table 7.6 Failure modes and associated possible causes.

    Table 8.1 Agreed upon HALT limits.

    Table 8.2 HALT profile for test setup checkout.

    Table 8.3 Temperature step stress with power cycle and end of each step.

    Table 8.4 Vibration step stress.

    Table 8.5 Temperature and vibration step stress.

    Table 8.6 Rapid thermal cycling.

    Table 8.7 Slow temperature ramp.

    Table 8.8 Slow temperature ramp with constantly varying vibration level.

    Table 11.1 CMMI process areas.

    Table 11.2 CMMI maturity levels.

    Table 11.3 Life cycle comparison.

    Table 14.1 Industry standards for managing counterfeit material risk.

    Table 15.1 Annual sales dollars relative to typical warranty costs.

    Table 15.2 HALT facility decision guide.

    Table 15.3 HALT machine decision matrix.

    Table 16.1 Reliability skill set for various positions.

    Table 17.1 Reliability activities for each phase of the product life cycle.

    Table 17.2 Reliability activities – what's required, recommended, and nice to have.

    Table 18.1 Product concept phase reliability activities.

    Table 19.1 Design concept phase reliability activities.

    Table 20.1 Reliability activities for the product design phase.

    Table 20.2 Common accelerated life test stresses.

    Table 20.3 Environmental stress tests.

    Table 21.1 Reliability activities in the design validation phase.

    Table 21.2 HALT Profile test limits and test times.

    Table 24.1 Reliability activities in the production ramp Phase 5.

    Table 24.2 Reliability activities in the production release Phase 6.

    Table B.1 Conversion tables for FIT to MTBF and PPM.

    Table B.2 Factorials.

    Table B.3 Repairable versus nonrepairable systems still operating (in MTBF time units).

    Series Editor's Foreword

    Engineering systems are becoming more and more complex, with added functions, capabilities and increasing complexity of the systems architecture. Systems modeling, performance assessment, risk analysis and reliability prediction present increasingly challenging tasks. Continuously growing computing power relegates more and more functions to the software, placing more pressure on delivering faultless hardware‐software interaction. Rapid development of autonomous vehicles and growing attention to functional safety brings quality and reliability to the forefront of the product development cycle.

    The book you are about to read presents a comprehensive and practical approach to reliability engineering as an integral part of the product design process. Various pieces of the puzzle, such as hardware reliability, physics of failure, FMEA, product validation and test planning, reliability growth, software quality, lifecycle engineering approach, supplier management and others fit nicely into a comprehensive picture of a successful reliability program.

    Despite its obvious importance, quality and reliability education is paradoxically lacking in today's engineering curriculum. Few engineering schools offer degree programs or even a sufficient variety of courses in quality or reliability methods. Therefore, a majority of the quality and reliability practitioners receive their professional training from colleagues, engineering seminars, publications and technical books. The lack of formal education opportunities in this field greatly emphasizes the importance of technical publications, such as this one, for professional development.

    We are confident that this book, as well as the whole series, will continue Wiley's tradition of excellence in technical publishing and provide a lasting and positive contribution to the teaching and practice of engineering.

    Dr. Andre Kleyner

    Editor of the Wiley Series in Quality & Reliability Engineering

    Series Foreword Second Edition

    There is a popular saying, "If you fail to plan, you are planning to fail." I don't know if there is another discipline in complex product development where this is more true than designing for product reliability. When products are simple, it is possible to achieve high reliability by observing good design practices, but as products become more complex, and include thousands of components and hundreds of thousands of lines of software, a systematic approach is required.

    This has played itself out inside of Teradyne over the last decade through two product lines in our Semiconductor Test Division. One product line, the UltraFLEX Test System, was designed internally. Another, the ETS‐800 Test System, was designed in a company that Teradyne acquired in 2008.

    The UltraFLEX platform was designed using Teradyne's internal Design for Reliability standards. The principles embodied in those standards are described by the authors. We religiously used an approved parts list of qualified components and suppliers, we analyzed the electrical stress on every circuit, and we calculated predicted reliability for every instrument and the whole system. Once the system was fielded, we tracked MTBF and executed our failure reporting, analysis, and corrective action system (FRACAS) on repeat failure modes. The result is that the UltraFLEX platform, our most complex product, has a field reliability about three times higher than prior‐generation products. What makes this more remarkable is that the UltraFLEX has the capability to test two or even four more semiconductor devices in parallel compared to prior testers.

    During the development of the UltraFLEX and over the past decade, we also began to deploy and came to rely upon more formal methods to improve software reliability. To be frank, our organizational maturity in software reliability lagged behind our hardware best practices. But through the application of tools like defect models, and especially tracking the reliability of deployed software through automated quality monitors, we were able to both improve the quality of the deployed product and also improve our development methods. A key tool we use to evaluate software reliability is a metric we call clean sessions. A clean session is a session where an operator starts up the tester, loads a program, executes a task like developing tests, debugging, or just testing devices, finishes the task, and then unloads the program, without encountering any anomalous behavior. When we started tracking this metric at the launch of the UltraFLEX, only about half of the sessions were clean. It took us nearly five years to get to 95% clean sessions, and this has set a benchmark that our competitors struggle to reach. Through the learning achieved in this long struggle, we have been able to achieve 95% clean sessions within three months of the release of our next‐generation product.
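    As an illustration only, the clean sessions metric described above can be sketched in a few lines of Python. The record layout (`Session`, `completed`, `anomaly_count`) and the function names are hypothetical, chosen for this sketch; they are not taken from Teradyne's quality monitors, whose actual data model is not described here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Session:
    """One operator session, from program load to program unload (hypothetical record)."""
    session_id: str
    completed: bool      # the task finished and the program was unloaded
    anomaly_count: int   # anomalous behaviors observed during the session


def is_clean(session: Session) -> bool:
    """A session is clean if it ran to completion with no anomalous behavior."""
    return session.completed and session.anomaly_count == 0


def clean_session_rate(sessions: Iterable[Session]) -> float:
    """Return the fraction of sessions that were clean (0.0 to 1.0)."""
    sessions = list(sessions)
    if not sessions:
        raise ValueError("at least one session is required")
    return sum(is_clean(s) for s in sessions) / len(sessions)


if __name__ == "__main__":
    # Made-up sample data: two of the four sessions are clean.
    sample = [
        Session("s1", completed=True, anomaly_count=0),
        Session("s2", completed=True, anomaly_count=2),
        Session("s3", completed=False, anomaly_count=0),
        Session("s4", completed=True, anomaly_count=0),
    ]
    print(f"{clean_session_rate(sample):.0%}")  # prints "50%"
```

    Tracking this ratio per software release, rather than as one lifetime figure, would make it possible to see a trend such as the move from roughly half of sessions to 95% that is described above.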

    The ETS‐800 is the next generation version of the successful tester for mixed signal and power devices. When Teradyne acquired the business in 2008, there was no formal reliability program in place, but their products were well regarded in the marketplace and reasonably reliable. The ETS‐800 was a big step up in terms of capability from the prior generation. The instruments were two to four times as dense, and the system could support almost twice as many instruments. Further, the tester included a promising new feature that would greatly simplify customer test programs by providing the switching needed to share tester resources between different device pins.

    From a functional and performance perspective, the ETS‐800 was a fantastic success. A single ETS‐800 could replace up to eight prior generation testers. But we found out the hard way that the informal approach to reliability that worked for simple products did not work for more complex ones. When we initially fielded the ETS‐800, it was not a reliable tester. The weak link in the design was the inclusion of thousands of mechanical relays. These relays provided superior electrical performance, but are challenging to use from a reliability perspective. Mechanical relays are highly reliable if they are not hot switched, that is, switched while a current is flowing through the contacts. A hot‐switching event causes an arc across the contact surface that rapidly degrades the surface and shortens the life of the relay. If the relays had been designed for reliability, the hot‐switching events could have been avoided. The ETS‐800 reliability was an order of magnitude below that of the much more complex UltraFLEX platform, and this put a blemish on the reputation we worked hard to develop for delivering highly reliable products.

    We worked for a long time to try to improve the robustness of the relays, and reduce the occurrence of hot switching without making much progress. Ultimately we decided to redesign all of the instrumentation using guidelines from the Teradyne reliability system. We are just beginning the deployment of the redesigned instruments, but in side‐by‐side testing, they are demonstrating about 100 times higher reliability than the ones that they replace. It was a hard but effective lesson that a systematic approach to hardware reliability and software quality as the authors have described is the best way to achieve both high customer satisfaction and good profits.

    Gregory S. Smith

    President, Semiconductor Test Division

    Teradyne, Inc.

    Series Foreword First Edition

    Modern engineering products, from individual components to large systems, must be designed and manufactured to be reliable. The manufacturing processes must be performed correctly and with the minimum of variation. All of these aspects impact upon the costs of design, development, manufacture, and use, or, as they are often called, the product's life cycle costs. The challenge of modern competitive engineering is to ensure that life cycle costs are minimized whilst achieving requirements for performance and time to market. If the market for the product is competitive, improved quality and reliability can generate very strong competitive advantages. We have seen the results of this in the way that many products, particularly Japanese cars, machine tools, earthmoving equipment, electronic components, and consumer electronic products have won dominant positions in world markets in the last 30–40 years. Their success has been largely the result of the teaching of the late W. E. Deming, who taught the fundamental connections between quality, productivity, and competitiveness. Today this message is well understood by nearly all the engineering companies that face the new competition, and those that do not understand lose position or fail.

    The customers for major systems, particularly the US military, drove the quality and reliability methods that were developed in the West. They reacted to a perceived low achievement by imposing standards and procedures, whilst their suppliers saw little motivation to improve, since they were paid for spares and repairs. The methods included formal systems for quality and reliability management (MIL‐Q‐9858 and MIL‐STD‐785) and methods for predicting and measuring reliability (MIL‐STD‐721, MIL‐HDBK‐217, MIL‐STD‐781). MIL‐Q‐9858 was the model for the international standard on quality systems (ISO 9000); the methods for quantifying reliability have been similarly developed and applied to other types of products and have been incorporated into other standards such as ISO 60300. These approaches have not proved to be effective and their application has been controversial.

    By contrast, the Japanese quality movement was led by an industry that learned how quality provided the key to greatly increased productivity and competitiveness, principally in commercial and consumer markets. The methods that they applied were based on an understanding of the causes of variation and failures, and continuous improvements through the application of process controls and the motivation and management of people at work. It is one of history's ironies that the foremost teachers of these ideas were Americans, notably P. Drucker, W.A. Shewhart, W.E. Deming, and J.M. Juran.

    These two streams of development epitomize the difference between the deductive mentality applied by the Japanese to industry in general, and to engineering in particular, in contrast to the more inductive approach that is typically applied in the West. The deductive approach seeks to generate continuous improvements across a broad front and new ideas are subjected to careful evaluation. The inductive approach leads to inventions and break‐throughs, and to greater reliance on systems for control of people and processes. The deductive approach allows a clearer view, particularly in discriminating between sense and nonsense. However, it is not as conducive to the development of radical new ideas. Obviously these traits are not exclusive, and most engineering work involves elements of both. However, the overall tendency of Japanese thinking shows in their enthusiasm and success in industrial teamwork and in the way that they have adopted the philosophies of western teachers such as Drucker and Deming, whilst their western competitors have found it more difficult to break away from the mold of scientific management, with its reliance on systems and more rigid organizations and procedures.

    Unfortunately, the development of quality and reliability engineering has been afflicted with more nonsense than any other branch of engineering. This has been the result of the development of methods and systems for analysis and control that contravene the deductive logic that quality and reliability are achieved by knowledge, attention to detail, and continuous improvement on the part of the people involved. Therefore, it can be difficult for students, teachers, engineers, and managers to discriminate effectively, and many have been led down wrong paths.

    In this series we will attempt to provide a balanced and practical source covering all aspects of quality and reliability engineering and management, related to present and future conditions, and to the range of new scientific and engineering developments that will shape future products. The goal of this series is to present practical, cost‐efficient and effective quality and reliability engineering methods and systems.

    I hope that the series will make a positive contribution to the teaching and the practice of engineering.

    Patrick D.T. O'Connor

    February 2003

    Foreword First Edition

    In my 26 years at Teradyne, I have seen the automated test industry emerge from its infancy and grow into a multibillion‐dollar industry. During that period, Teradyne evolved into the world's leading supplier of automated test equipment (ATE) for testing semiconductors, circuit boards, modules, voice, and broadband telephone networks. As our business grew, the technology necessary to design ATE became increasingly complex, often requiring leading‐edge electronics to meet customer performance needs. Our designs have pushed the envelope, demanding advancements in nearly every technological area including process capability, component density, cooling technology, ASIC complexity, and analog/digital signal accuracy.

    Our customers, too, insist on the highest performance systems possible to test their products. But performance alone does not provide the product differentiation that wins sales. Customers also demand incomparable reliability. Revenue lost when an ATE system goes down can be staggering, often in the area of tens of thousands of dollars per hour. Furthermore, because of design complexity and system cost, the warranty cost to maintain these systems is increasing. Low reliability severely impacts the bottom line and impedes the ability to gain and hold market share.

    To improve product reliability, changes had to be made to the reliability process. We learned that the process needed to be proactive. It had to start early in the product concept stage and include all phases of the product development cycle. In researching solutions for improving product reliability, we found the wealth of information available to be too theoretical and mathematically based. Clearly, we didn't want a solution that could only be implemented by reliability engineers and statisticians. If the training were overly statistical, the message would be lost. If the process required training everyone to become a reliability engineer, it would be useless. The process had to reduce technical reliability theory into practical processes easily understood by the product development team.

    For the reliability program to be successful, we needed a way to provide both management and engineering with practical tools that are easily applied to the product development process. The reliability processes presented in this book achieve this goal.

    The authors logically present the reliability processes and deliverables for each phase of the product development cycle. The reliability theory is thoughtful, easily grasped, and does not include a complex mathematical basis. Instead, concepts are described using simple analogies and practical processes that a competent product development team can understand and apply. Thus, the reliability process described can be implemented into any electronic or other business, regardless of its size or type, and ultimately helps give customers products with superior performance and superior reliability.

    Edward Rogas Jr.

    Senior Vice President

    Teradyne, Inc.

    Preface Second Edition

    When this book was first published, the primary focus was on improving product reliability, why reliability improvement efforts fail, and how poor reliability negatively impacts current and future business. We discussed the ease with which consumers can research a product to determine consumer satisfaction and discover issues related to product reliability. To improve product reliability, we presented a comprehensive process for product development and an implementation strategy that any business can start. We also discussed ways to change the corporate culture so that it strives to design reliable products.

    The importance of designing reliable products has not changed since the book was first published. However, much has changed in regard to the types of products being developed today compared to when the book was first published. The most significant change is the amount of software and firmware required for new products. The other significant change is the number of new products being developed that connect to the internet, known as the Internet of Things (IoT), to provide ease of use, communicate with other devices, aid in customer support and update software remotely. The internet provides the consumer with greater ease of use and a better user experience, but brings with it a new set of risks regarding security and privacy.

    We changed the book title to Improving Product Reliability and Software Quality to convey the importance of software in product development. There are many books written about hardware reliability and likewise about software quality. The hardware reliability books do not cover software quality and the software quality books do not address hardware reliability. However, successful product development is dependent on the synergy of these two functional groups working well together. Hardware engineers and software engineers are very different and communicate in different languages; therefore, they do not effectively integrate each other's requirements and dependencies. Assumptions are often made regarding what other functional groups are delivering, which later turn out to be wrong.

    Hardware and software engineers look at bugs very differently. The hardware development team strives to release products without any reliability issues and assumes last‐minute discoveries will be fixed with software. Hardware requirements can be fully defined and validated to ensure the release of a reliable product. The software development team does not set a requirement for a 100% bug‐free product before product release. In fact, for most products, the software requirements and validation cannot define every use condition and possible state. When the software is released, the team is already working on the next update, tier release, or patch.

    In addition to software quality, there is also the issue of software security. Many new products access the Internet as a way to quickly and efficiently send out software bug fixes and as a way to improve the customer experience through user apps. A good example is the Nest™ programmable learning thermostat. This connectivity raises software security concerns and new challenges that are often overlooked or underestimated. Some products can communicate via Bluetooth and the Global Positioning System (GPS), which also have the potential to be compromised.

    Each new generation of electronic products incorporates significantly more software and firmware than the previous generation. This holds for products ranging from simple ones, like a home thermostat, to complex ones. Even the mix of development engineers required for product development is shifting. The goal of the book is to provide insight, process, and tools to help meet these changing demands.

    Preface First Edition

    Nearly every day, we learn of another company that has failed. In the new millennium, this rate of failure will increase. Competitors are rapidly entering the marketplace using technology, innovation, and reliability as their weapons to gain market share. Profit margins are shrinking. Internet shopping challenges the conventional business model. The information highway is changing the way consumers make buying decisions. Consumers have more resources available for product information, bringing them new awareness about product reliability.

    These changes have made it easier for consumers to choose the best product for their individual needs. As better‐informed shoppers, consumers can now determine their product needs at any place, any time, and for the best price. The information age allows today's consumer to research an entire market efficiently at any time and with little effort. Conventional shopping is being replaced by smart shopping. And a big part of smart shopping is getting the best product for the best price.

    As the sources for product information continue to increase, the information available about the quality of the product increases as well. In the past, information on product quality was available through consumer magazines, newspapers, and television. The information was not always current and often did not cover the full breadth of the market. Today's consumer is using global information sources and internet chat to help in their product‐selection process. An important part of the consumer's selection process is information regarding a product's quality and reliability. Does it really do what the manufacturer claims? Is it easy to use? Is it safe? Will it meet customer expectations of trouble‐free use? The list can be very long and very specific to the individual consumer.

    Quality versus Reliability

    From automobiles to consumer electronics, the list of manufacturers who make high‐quality products is continuously evolving. Manufacturers who did not participate in the quality revolution of the last two decades were replaced by those who did. They went out of business because the companies with high‐quality systems were producing products at a lower cost. Today, consumers demand products that not only meet their individual needs, but also continue to meet these needs over time. Quality design and manufacturing were the benchmark in the 1980s and 1990s; quality over time (reliability) is becoming the requirement in the twenty‐first century. In today's marketplace, product quality is necessary in order to stay in business. In tomorrow's marketplace, reliability will be the norm.

    Quality and reliability are terms that are often used interchangeably. While strongly connected, they are not the same. In the simplest terms:

    Quality is conformance to specifications.

    Reliability is conformance to specifications over time.

    As an example, consider the quality and reliability in the color of a shirt. In solid‐color men's shirts, the color of the sleeves must match the color of the cuffs. They must match so closely that it appears that the material came from the same bolt of cloth. In today's manufacturing processes, several operations occur simultaneously. One bolt of cloth cannot serve several machines. The colors of several bolts of cloth must be the same, or the end product will be of poor quality. Every bolt of cloth has to match a specified color standard, or the newest manufacturing technologies cannot be applied to the process. Quality in the material that goes into the product is as important as the quality that comes out. In fact, the quality that goes in becomes a part of the quality that comes out. Now suppose that, after numerous washings, the shirt's color fades. The shirt conformed to the consumer's expectations at the time of first use (quality), but failed to live up to the consumer's expectations later (reliability).

    Reliability is the continuation of quality over time. It is simply the time period over which a product meets the standards of quality for the period of expected use. Quality is now the standard for doing business. In today's marketplace and beyond, reliability will be the standard for doing business. The quality revolution is not over; it has just evolved into the reliability revolution.

    This book is an effort to guide the user on how to implement and improve product reliability with a product life cycle process. It is written to appeal to most types of businesses regardless of size. To achieve this, the beginning of each chapter discusses issues and principles that are common to all businesses, independent of size. We also segregate business into three categories based on size: small, medium, and large. Definitions are summarized in Table I.1.

    Table I.1 Business size definition.

    The finance department can, more precisely, quantify the lost revenue due to warranty claims and poor quality. This loss represents the potential dollars that are recoverable after the reliability process improvements have been implemented and have begun to bear fruit.

    Gaining Competitive Advantage

    Manufacturers who have no reliability engineering in place typically have warranty costs as high as 10–12% of their gross sales dollar. A company that builds reliability into its processes can see warranty costs diminish to below 1% of the gross sales dollar. The total amount that can be recovered from the warranty budget represents the dollars that could be reinvested or added to earnings. If research and development is 10% of the gross sales dollars, then the annual warranty dollar savings from reliability can cover the costs to develop future products. Of course, this only addresses the tangible benefits from a reliability program. There are many intangible benefits that are gained by improving product reliability. Examples include better product image, reduced time to market, lower risk of product recall and engineering changes, and more efficient utilization of employee resources. These intangible benefits are addressed later in the book.
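
    The arithmetic behind this claim can be sketched in a few lines. The gross sales figure below is a hypothetical round number chosen only for illustration, and the post-improvement warranty rate is taken as exactly 1%, the upper bound of "below 1%"; the function name is likewise an assumption, not something defined in this book.

```python
def warranty_savings(gross_sales, before_pct, after_pct):
    """Annual warranty dollars recovered when the warranty cost rate
    falls from before_pct to after_pct (both in percent of gross sales)."""
    return gross_sales * (before_pct - after_pct) / 100


# Hypothetical company with $100M in gross sales (illustrative only).
gross_sales = 100_000_000

# Warranty cost falls from 10-12% of gross sales to 1% (upper bound of "below 1%").
savings_low = warranty_savings(gross_sales, 10, 1)
savings_high = warranty_savings(gross_sales, 12, 1)

# R&D budget assumed to be 10% of gross sales, as in the text.
rd_budget = gross_sales * 10 / 100

print(f"Recovered warranty dollars: ${savings_low:,.0f} to ${savings_high:,.0f}")
print(f"R&D budget:                 ${rd_budget:,.0f}")
```

    Under these assumptions the recovered warranty dollars amount to 9–11% of gross sales, which is roughly the size of a 10% research and development budget.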

    Acknowledgments

    Special thanks to Dana Levin, Molly Rodin, Larry Steinhardt, Anto Peter, Ken Turner, Steve Hlotyak, Chris Behling, Thomas Mayberry, Joel Justin, Jim McLinn, Pat O'Connor, Steve King, Kevin Giebel, and Debra Levin for their technical edits and proofreading of the second edition. Thanks to Glenn Hemanes for his patience and help with some of the artwork for the second edition. Finally, we would like to thank Gregory Smith for the second edition Foreword and for supporting our work.

    We would like to recognize and thank Harding Ounanian for his significant contribution to the first edit of the first edition of the book, and Ed Rogas for the first edition Foreword.

    We would like to thank the following people, who have brought a better awareness about reliability and continue to influence our way of thinking: Benton Au, Joe Denny, Dave Evans, Jim Galuska, Ray Hansen, Dr. Greg Hobbs, Jim McLinn, Pat O'Connor, Roy Porter, Dr. David Steinberg, and Michael Pecht.

    Glossary

    Part I

    Reliability and Software Quality – It's a Matter of Survival

    1

    The Need for a New Paradigm for Hardware Reliability and Software Quality

    1.1 Rapidly Shifting Challenges for Hardware Reliability and Software Quality

    Hardware reliability and software quality: why do you need them? The major US car manufacturers saw their dominance eroded by the Japanese automobile manufacturers during the 1970s because the vehicles produced by the big three had significantly more problems. The slow downward market slide of the US automobile industry was predictable when the defect rate of US automobiles was compared with that of the Japanese automobile industry. In 1981, a Japanese‐manufactured automobile averaged 240 defects per 100 cars. The US automobile manufacturers, during the same time period, were manufacturing vehicles with roughly 2.8 to 3.6 times as many defects per 100 vehicles (about 180–260% more). General Motors averaged 670 defects per 100 cars, Ford averaged 740 defects per 100 cars, and Chrysler was the highest, with 870 defects per 100 cars.
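
    The comparison can be checked directly from the 1981 figures quoted above. The short sketch below is illustrative only (the function and variable names are assumptions); it distinguishes "times as many" defects from "percent more" defects, two measures that are easy to conflate.

```python
# Defects per 100 cars in 1981, as quoted in the text.
JAPANESE_AVERAGE = 240
US_DEFECTS_PER_100 = {
    "General Motors": 670,
    "Ford": 740,
    "Chrysler": 870,
}


def compare_to_baseline(defects, baseline):
    """Return (ratio, percent_more) of a defect rate relative to a baseline."""
    ratio = defects / baseline
    percent_more = (ratio - 1) * 100
    return ratio, percent_more


for maker, defects in US_DEFECTS_PER_100.items():
    ratio, percent_more = compare_to_baseline(defects, JAPANESE_AVERAGE)
    print(f"{maker}: {ratio:.2f}x the Japanese average ({percent_more:.1f}% more)")
```

    With these inputs, the ratios run from about 2.8 to about 3.6 times the Japanese average, which corresponds to roughly 180% to 260% more defects.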

    Much has been written about how this came about and how the US manufacturers began implementing total quality management (TQM), quality circles, continuous improvement, and concurrent engineering to improve their products. Now the US automobile industry produces quality vehicles, and the perception that Japanese vehicles are better has eroded significantly. J.D. Power and Associates reported in its 1997 model year report that cars and trucks averaged about 100 defects per 100 vehicles. This represented a 22% increase from 1996 and a 100% decrease from 1987. Vehicles such as the GM Saturn and Ford Taurus are a tribute to that success, both in financial terms and in the improved perception that automobile manufacturers in the United States can produce reliable, quality automobiles. Quality programs like TQM have dramatically improved American manufacturing quality. The automotive industry has also benefited from the very high quality of the components going into automobiles. Counterfeit components and counterfeit materials are still a major concern for the electronics and automotive industries, which require constant diligence and an effective program to minimize the risk of counterfeit material entering the production stream.

    In the 1970s, the typical automobile warranty was for 12 months or 12 000 miles. In 1997, automobile manufacturers were offering 3‐year/36 000‐mile bumper‐to‐bumper warranties. Three years later, these same automobile manufacturers were offering 7‐year/100 000‐mile warranties. Jaguar is now advertising a 7‐year/100 000‐mile warranty on its used vehicles! BMW has responded with a similar type of program. These manufacturers can offer longer warranty periods because they understand why and how their vehicles are failing and can therefore produce more reliable vehicles.

    A 1997 Consumer Reports survey of 604 000 automobile owners showed a dramatic improvement in the perception of the reliability of US‐manufactured automobiles. The improvement by the big three automobile manufacturers did not occur overnight. It was the result of a commitment to provide the necessary resources along with a credible plan for producing reliable vehicles. It was a paradigm change that took years and evolved through many steps.

    The process to improve hardware reliability has made significant progress over the past 20 years. If you follow the process outlined, there can
