

Speaking Of Reliability: Friends Discussing Reliability Engineering Topics | Warranty | Plant Maintenance
Reliability.FM: Accendo Reliability, focused on improving your reliability program and career
Gain the experience of your peers to accelerate improvement of your program and career. Improve your product development process, reliability or warranty performance; or your plant uptime or asset performance. Learn about reliability and maintenance engineering practical approaches, skills, and techniques. Join the conversation today.

May 1, 2023
Selecting the Right Method
Abstract
Greg and Fred discuss the right method(s) for solving quality and reliability problems, specifically answering the question ‘Is the approach good enough?’
Key Points
Join Greg and Fred as they discuss how to approach a problem and make a decision. Asking the right questions in the front end saves time and money. Most importantly, it helps ensure that you’ll get a better answer.
Topics include:
How to approach a problem by asking better questions at the front end.
How to select the right tool and approach to solve a problem.
Is there one best approach or method to solve a problem?
Does problem solving come down to analytics, professional judgment, or something in between?
Enjoy an episode of Speaking of Reliability. Where you can join friends as they discuss reliability topics. Join us as we discuss topics ranging from design for reliability techniques to field data analysis approaches.
Related Topics:
SOR 875 What Should a Reliability Engineer Do
The post SOR 857 Selecting the Right Method appeared first on Accendo Reliability.

Apr 28, 2023
ChatGPT and Reliability
Abstract
Kirk and Fred discuss the use of artificial intelligence engines such as ChatGPT in Reliability Engineering. A copy of the ChatGPT questions and responses that we discuss on this podcast is listed in the show notes below.
Key Points
Join Kirk and Fred as they discuss the recent introduction of the ChatGPT artificial intelligence engine that has been in the news, and its implications and use for reliability guidance in design and failure analysis. Fred and I refer to the sample questions in the show notes below that were asked of ChatGPT (version 4.0).
Topics include:
ChatGPT’s response to the first question did not exactly answer how to test; instead, it listed good methods for reliability development.
ChatGPT’s answers are highly dependent on how the question is asked. A major difference in how Fred and I would answer is that we would first ask many more questions about what it is that you want to make reliable. Some of our questions to further define the answer might be: 1) Are you making a dozen or a million of these devices? 2) Is this a new or follow-up design? 3) What is the intended end-use environment going to be?
Way back in 1968, in the science fiction movie “2001: A Space Odyssey,” the HAL 9000 computer predicts that the major communication system will fail within the next 72 hours. The astronauts remove the circuit board and probe it to evaluate it, and the HAL 9000 cannot determine the potential latent failure. The astronauts fail to ask the HAL 9000 what its analysis of future failure is based on, a fatal mistake on their part.
In this first pass at using AI for reliability the answers are fairly generic, but this is only the first iteration, and future generations will be more interactive, asking specific questions in order to provide more detailed, specific answers.
Show Notes
Here is the text of the ChatGPT reliability questions and answers Fred and I discussed in this podcast. You can access ChatGPT at https://chat.openai.com/chat
First Question:
What is the best method for testing the reliability of electronics and
electronic systems for latent defects or design flaws
Chat GPT4 answer
The best method for testing the reliability of electronics and electronic systems for latent
defects or design flaws involves a combination of multiple techniques, as no single method can
provide comprehensive coverage. Some of the widely used methods include:
1. Design for Testability (DFT): Implement testability measures in the design phase to
improve the ease of testing and fault detection. Techniques include boundary-scan testing,
built-in self-test, and scan chains.
2. Design for Reliability (DFR): Design electronic systems to meet specified reliability and
lifetime requirements by considering factors like temperature, humidity, and mechanical
stresses.
3. Design Review: Conduct thorough design reviews to identify and address potential
issues, involving experienced engineers and cross-functional teams.
4. Simulation and Modeling: Use computer-aided engineering tools to simulate and model
the behavior of the electronic systems under various conditions, helping identify and fix issues
before prototyping.
5. Prototype Testing: Test early-stage prototypes to catch latent defects or design flaws
before moving to mass production.
6. Environmental Stress Testing: Subject the electronic system to environmental stresses
like temperature cycling, humidity, vibration, and shock to identify potential failure modes.
7. Accelerated Life Testing (ALT): Test the electronic system under elevated stress levels to
induce failure mechanisms more rapidly and estimate product life under normal conditions.
8. Highly Accelerated Life Testing (HALT): Test the system to its limits by applying extreme
levels of stress in order to identify potential weaknesses and design flaws.
9. Highly Accelerated Stress Screening (HASS): Screen production units using accelerated
stress testing to detect manufacturing defects and ensure reliability.
10. Failure Modes and Effects Analysis (FMEA): Analyze potential failure modes and their
impact on the electronic system, ranking them by severity and likelihood, to prioritize design
improvements and testing efforts.
11. Continuous Monitoring and Feedback: Monitor field performance and collect feedback
from users to identify failure trends, enabling further improvements to the design and testing
processes.
It is important to apply a combination of these methods and tailor them to the specific
electronic system being developed, as this will help improve the likelihood of detecting latent
defects and design flaws effectively.
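As a rough illustration of the FMEA item in the answer above, the sketch below ranks failure modes by severity times likelihood. The failure modes and the 1-to-10 scores are hypothetical and are not from the episode; many FMEA procedures also include a detection score, which is left out here to match the wording of the answer.

```python
# Hypothetical failure modes with illustrative 1-10 scores (not from the episode).
failure_modes = [
    {"mode": "cracked solder joint", "severity": 7, "likelihood": 5},
    {"mode": "connector corrosion", "severity": 5, "likelihood": 3},
    {"mode": "capacitor wear-out", "severity": 8, "likelihood": 2},
]


def rank_by_risk(modes):
    """Return the modes sorted from highest to lowest severity x likelihood."""
    return sorted(modes, key=lambda m: m["severity"] * m["likelihood"], reverse=True)


ranked = rank_by_risk(failure_modes)
for m in ranked:
    # Print each mode with its risk score, highest risk first.
    print(m["mode"], m["severity"] * m["likelihood"])
```

A ranking like this is only a starting point for deciding where to focus design and test effort; as the hosts point out, the right priorities depend on questions ChatGPT never asked.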
2nd Question
I have a electronic circuit board that has intermittent failure. How do I best troubleshoot to find
out where the failure mechanism is?
Chat GPT4 Answer
Intermittent failures can be challenging to diagnose, but with a systematic approach, you can
increase your chances of finding the issue. Here are some steps to help you troubleshoot an
electronic circuit board with intermittent failure:
1. Visual inspection: Start by examining the board for any visible signs of damage, such as
burnt components, broken traces, cracked solder joints, or loose connectors. Use a
magnifying glass or microscope if necessary.
2. Check for temperature-related issues: Intermittent failures can sometimes be caused by
temperature changes. Test the board by heating or cooling it gently (e.g., with a hairdryer
or a can of compressed air), and observe if the failure occurs more or less frequently. If
the problem seems temperature-related, look for components that are sensitive to
temperature changes or have poor thermal connections.
3. Reseat connectors and components: Unplug and re-plug all connectors, ensuring they
are seated properly. If there are socketed components, remove and re-insert them to
ensure good contact.
4. Test the power supply: Check the voltage levels of the power supply, and make sure they
are stable and within the specified range. Intermittent failures can be caused by an
unstable power supply.
5. Use an oscilloscope and/or logic analyzer: To analyze signals in the circuit, use an
oscilloscope for analog signals and a logic analyzer for digital signals. Look for signal
anomalies, such as noise, glitches, or incorrect voltages, that could cause the
intermittent failure.
6. Test components individually: Test components (resistors, capacitors, diodes, etc.) with a
multimeter to ensure they are functioning within their specified range. Replace any
suspicious components with known good ones.
7. Check for mechanical stress: Gently flex the board or tap on components while the
circuit is operating to see if the failure occurs. This can help identify poor solder joints,
hairline cracks, or damaged components.
8. Use freeze spray or a heat gun: Apply freeze spray on specific components or areas of
the board to lower their temperature and see if the issue goes away. Alternatively, use a
heat gun to heat components and observe if the failure occurs more frequently.
9. Record the failure: Document the exact circumstances under which the failure occurs,
including the input conditions, output conditions, and environmental factors. This can
help you establish a pattern and identify the problematic area or component.
10. Seek assistance: If you’re still having trouble diagnosing the problem, consider reaching
out to colleagues or online forums for advice. Sometimes, a fresh perspective can be
invaluable in identifying the root cause of an intermittent failure.
Remember, patience and persistence are key when troubleshooting intermittent issues. It may
take time to identify the cause, but a systematic approach can help you uncover the problem
and resolve it.
You can now purchase the recent recording of Kirk Gray’s 8-hour (two 4-hour sessions) Hobbs Engineering webinar “Rapid and Robust Reliability Development 2022 HALT & HASS Methodologies Online Seminar” from this link.
Please click on this link to access a relatively new article from the US Army and CALCE analyzing traditional reliability prediction methods, titled “Reliability Prediction – Continued Reliance on a Misleading Approach.”
For more information on the newest discovery testing methodology here is a link to the book “Next Generation HALT and HASS: Robust design of Electronics and Systems” written by Kirk Gray and John Paschkewitz.
Related Topics:
SOR 380 How to Find an Intermittent Failure Mechanism?
The post SOR 856 ChatGPT and Reliability appeared first on Accendo Reliability.

Apr 24, 2023
Mechanical Lessons Learned
Abstract
Kirk and Fred discuss Kirk’s recent trip to the Texas Pinball Festival (TPF) and the reliability of pinball machines, new designs of which have been produced for over six decades. Some of the mechanical assemblies have not changed in design in all those decades, even though some changes would be simple and useful; change is difficult in this very small group of manufacturers.
Key Points
Join Kirk and Fred as they discuss mechanical issues with Pinball machines.
Topics include:
Even though pinball flippers now have to shoot multiple balls at the same time, the flipper assemblies still use an analog squeeze grip to mechanically connect the flipper shaft to the solenoid arm. Two engineers at the TPF were trying to sell a better, mechanically locked flipper assembly to end users. The only way they are going to make money from their design is to sell it to the pinball manufacturers.
We want engineers to innovate and improve technology. When the first portable (luggable?) computers were made, the reliability was poor but it was new and cool. The market demanded better reliability and through innovation and testing they evolved to very lightweight laptop computers, with long battery life and high resolution displays.
“If it ain’t broke, don’t fix it” is the reason that flipper assemblies today use the same basic design that has been used for decades. The pinball manufacturers are building many new technologies into their machines and have little motivation to change old flipper assembly designs for a slight improvement in flipper alignment.
Related Topics:
SOR 131 Pinball Machine Reliability
SOR 236 Evolution of Pinball Machine Reliability
The post SOR 855 Mechanical Lessons Learned appeared first on Accendo Reliability.

Apr 21, 2023
Reliability Gap Assessment
Abstract
Carl and Fred discuss the second step in achieving high reliability: “Develop a Reliability Gap Assessment.”
Key Points
Join Carl and Fred as they discuss doing a gap assessment, how it is done and why it is essential to achieving high reliability.
Topics include:
The core principles of doing a gap assessment
You need to know the organization’s capabilities and shortcomings
Where are you today? How good are you?
With Reliability Strategic Vision, we ask “where are you going?”
With Reliability Gap Assessment, we ask “what is your current capability?”
Gaps can be organizational, cultural, or in methods
To learn gaps in methods, you will ask: To what extent do you use (method)? How do you use (method)? Why do you use (method)? What examples do you have for (method)?
Knowing the gaps helps you zero in on the areas where you need to improve
The essence of a gap is the tension between your goals and the barriers or risks to achieving your goals
There is no canned list of questions to identify gaps; you need to develop your own list of questions
Fact finding: where are you today compared to where you need to be?
Future decisions should be based on the gaps
Communication is key to learning gaps
Map gaps to the Maturity Matrix
May need to change the culture to achieve reliability objectives
Gap Assessment is not an audit
You don’t know where a Gap Assessment will end up
Related Topics:
SOR 075 Gap Assessment Process
SOR 343 Why Perform a Reliability Gap Analysis
The post SOR 854 Reliability Gap Assessment appeared first on Accendo Reliability.

Apr 17, 2023
Reliability Strategic Vision
Abstract
Carl and Fred discuss the first step in achieving high reliability: “Develop a Reliability Strategic Vision.”
Key Points
Join Carl and Fred as they discuss the elements that go into a reliability vision, and how it is developed.
Topics include:
Chapter 5 of the book “The Process of Reliability Engineering”
The first step is knowing where you’re going
Developing a Reliability Strategic Vision
Why strategy is so important
It’s not a generic Corporate mission statement
The elements of a reliability strategic vision
The four parts embedded in the definition of reliability: probability statement, primary function, environmental conditions, duration
The vision for reliability includes reliability and organizational goals
Where are you heading from a reliability point of view?
The importance of reputation, brand identity
Consider the overall lifecycle cost
Reliability can be a critical part of your marketing position; but it has to be real
Safety must be an essential part of your reliability vision
Related Topics:
SOR 005 Importance of Reliability Goal and Vision Discussions
The post SOR 853 Reliability Strategic Vision appeared first on Accendo Reliability.

Apr 14, 2023
Decisions and Value
Abstract
Chris and Fred discuss the importance of understanding value when you make decisions. And value needs to mean something to you and your organization. Not something in a dictionary.
Key Points
Join Chris and Fred as they discuss how important it is to understand the concept of ‘value’ as it relates to making the right decision. Many people almost ‘run’ past the discussion on what value means to them so they can start getting busy doing something. Like testing. Effort often feels like generating value. But that is not always the case.
Topics include:
How much fuel do you need for a car drive? Well … you need to start by understanding where you are going. What vehicle are you driving? What is its fuel efficiency? What is it towing? Many organizations (including reliability engineers) rush to come up with an estimate of how much fuel you need. It might be a literature review of other ‘car drives.’ It might be a standard that purports to say how much fuel the ‘average car drive’ needs. This approach never works. So … where are YOU going?
It usually starts with leadership. Is the leader simply trying to get the ‘reliability’ or ‘quality’ guy to OK his or her favorite (pre-ordained) outcome? If this is the case, then the leader ‘values’ something that is very different from organizational good.
Here is an example. A mining company wanted to reduce downtime … and so they demanded that Servicing or Preventive Maintenance Duration be halved. And of course … failures went through the roof. Where did they go wrong? They focused on maintenance duration and not operational availability. As soon as they got the right metrics, then the engineers got to work and were able to come up with a longer maintenance duration, but instead of doing it weekly, it was done every two weeks. So downtime decreased (good) while maintenance duration went up. So what do you value?
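The arithmetic behind the mining example can be sketched as follows. This is a simplified illustration that treats operational availability as uptime divided by total time; every number in it is hypothetical, chosen only to show how a shorter maintenance duration can still produce more total downtime.

```python
def availability(period_hours, planned_down_hours, unplanned_down_hours):
    """Fraction of the period in which the equipment is available to operate."""
    downtime = planned_down_hours + unplanned_down_hours
    return (period_hours - downtime) / period_hours


PERIOD = 14 * 24  # two weeks, in hours

# Hypothetical scenario A: weekly servicing halved to 2 h, failures rise sharply.
halved_weekly = availability(PERIOD, planned_down_hours=2 * 2, unplanned_down_hours=30)

# Hypothetical scenario B: one longer 6 h service every two weeks, fewer failures.
longer_biweekly = availability(PERIOD, planned_down_hours=6, unplanned_down_hours=5)

print(f"halved weekly servicing:   {halved_weekly:.3f}")
print(f"longer biweekly servicing: {longer_biweekly:.3f}")
```

With these assumed inputs the longer, less frequent service gives the higher availability, which is the point of the example: the metric you choose to value determines which option looks better.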
Related Topics:
Reliability Goals
The post SOR 852 Decisions and Value appeared first on Accendo Reliability.

Apr 10, 2023
ALT Sample Size
Abstract
Chris and Fred discuss how many samples you need to conduct Accelerated Life Testing or ALT … and a little bit about what ALT is.
Key Points
Join Chris and Fred as they discuss the challenge of selecting how many samples you need to conduct Accelerated Life Testing or ALT. Firstly … what is ALT?
Topics include:
What is ALT? ALT involves increasing the stress to a product (think force, temperature, humidity, power et cetera) to try and compress 3 years’ worth of use into something like 2 days. To be able to do this, we need to understand how your product fails, so you know what failure mechanism to focus on to identify the right stress. Is failure based on temperature? … force? … both?
What is an ‘accelerated life model’ and why are there so many of them? Once you know the dominant failure mechanism of your product, then you need to know the physical process behind it. For example, many chemical reactions are based on concepts like ‘activation energy,’ which helps you work out what effect additional temperature will have on failure. The ‘Arrhenius model’ can be helpful for this. But the reason there are a lot of accelerated life models is that there are lots of different failure mechanisms.
How many samples do I need? What decision are you trying to inform? Are you trying to differentiate between two potential materials? Then perhaps not many samples. Do you believe that there is a material that should outlast the service life of your product by orders of magnitude … but you need to confirm this? Then perhaps you need fewer samples again. But if your failure mechanism accounts for about 4% of warranty period failures, and you can tolerate up to 5% … then you might need lots of samples to get the level of confidence you need to make a decision.
But I want to test everything on my product! As in … every possible way it can fail. If you want to do this … then ALT is not for you.
Confidence is a measure of you … not your product. What does that mean? Many organizations test without any objective or decision in mind … simply because they want ‘confidence’ that reliability stuff is happening. You can’t test your way to reliability.
So always start with your decision!
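To make the topics above a little more concrete, here is a minimal sketch of two standard calculations: the Arrhenius acceleration factor for a temperature-driven failure mechanism, and the zero-failure (success-run) sample size from the binomial distribution. The activation energy, temperatures, and confidence level are assumptions chosen for illustration, and the success-run formula is one common way to size a test, not necessarily the approach discussed in the episode.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev, use_temp_c, stress_temp_c):
    """Arrhenius acceleration factor between a use and a stress temperature."""
    t_use = use_temp_c + 273.15
    t_stress = stress_temp_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use - 1.0 / t_stress))


def zero_failure_sample_size(reliability, confidence):
    """Units that must all survive to demonstrate `reliability` at `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


# Acceleration needed to compress 3 years of use into 2 days of testing.
required_af = (3 * 365) / 2

# Hypothetical values: 0.7 eV activation energy, 40 C use, 85 C stress.
af = arrhenius_acceleration(ea_ev=0.7, use_temp_c=40.0, stress_temp_c=85.0)

# Samples to show at least 95% survival (at most 5% failures) with 90% confidence.
n = zero_failure_sample_size(reliability=0.95, confidence=0.90)

print(f"required AF: {required_af:.1f}, Arrhenius AF: {af:.1f}, samples: {n}")
```

With these assumed inputs, temperature alone falls well short of the acceleration needed to squeeze three years into two days, which is one reason the choice of stress and model has to follow from the failure mechanism. The zero-failure count is also only a lower bound: if the true failure fraction sits close to the tolerable limit, most such tests would see at least one failure, and the sample size grows quickly once failures have to be allowed for.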
Related Topics:
Fundamentals of ALT
How to Create an ALT Plan
The post SOR 851 ALT Sample Size appeared first on Accendo Reliability.

Apr 7, 2023
Sedentary Inertia in Quality
Abstract
Chris and Dianna discuss the ‘sedentary inertia’ that many companies experience when they need to change … but can’t (don’t want to). Especially in the world of quality!
Key Points
Join Chris and Dianna as they discuss the challenges many organizations have when they need to change … but really don’t want to. We sometimes call this ‘sedentary inertia’ … where there is a tendency for things that aren’t moving to STAY not moving. Sound familiar?
Topics include:
How can quality help you stay competitive? Stop being resistant to change. For example, there are some quality teams out there that adamantly believe that if you can’t test for something in their laboratory (due to capacity, time, money and so on) … then it can’t be a requirement. It’s almost as if those quality guys think they are the customer!
It usually happens when leadership is lacking. Why? Because the ‘quality guys’ can only become the reinforced roadblocks that many organizations feel they are when they aren’t answerable or challengeable. If the boss isn’t competent or confident enough to push every part of their organization along, then there is always a chance that those parts can become reinforced roadblocks to support their own agendas.
You need good ‘quality’ guys and girls to innovate. Why? Because novel technologies still need to result in high-quality, easy-to-produce, defect-free, reliable and safe products. They just need to look outside what the standard says (… there is never a standard for emerging technology!)
So what do you need to do? Leadership. Culture. Ensuring quality is always important. Which starts with making sure quality never becomes irrelevant.
Related Topics:
3 Ideas to Overcome Organizational Inertia
The post SOR 850 Sedentary Inertia in Quality appeared first on Accendo Reliability.

Apr 3, 2023
Two Sides of a Coin: Quality and Reliability
Abstract
Dianna and Chris discuss their different viewpoints on quality and reliability: how they are really two sides of a coin, supported by the culture of an organization and by individual contributors.
Key Points
Join Dianna and Chris as they discuss two sides of a coin: quality and reliability.
Topics include:
quality and reliability as part of the leadership and development culture
practitioners modeling and showcasing quality and reliability in development
imaginary reliability and quality fairies
Show Notes
Dianna and Chris talk about both quality and reliability in development projects.
Quality and Reliability are different concepts. But there is not a clear distinction between them. The important part is – the customer!
Chris talks about the cultural motivations for involving quality and reliability early. His list of 5 motivating categories for finding problems, from worst to best:
Problems in the field (a disaster).
A problem uncovered during manufacturing (also a disaster).
Problems uncovered during a review.
Problems found during testing.
A problem that was prevented (the best situation).
Dianna talks about the power of the personality. Staff engineers can do what they can, where they are, to promote quality and reliability methods. They can effect change in the culture of a workplace from within. But there is a risk: without a culture that supports quality and reliability during development, that personality may be the only safety net for a group.
The best situation: a culture of quality and reliability with personalities that promote these fields through partnership.
Related Topics
SOR 742 Reliability and Quality
The post SOR 849 Two Sides of a Coin: Quality and Reliability appeared first on Accendo Reliability.

Mar 31, 2023
Monitoring Manufacturing Product Reliability
Abstract
Kirk and Fred discuss how to ensure that the quality and reliability of the last units produced on a manufacturing line are the same as those of the first units produced.
Key Points
Join Kirk and Fred as they discuss the fundamental concern of any manufacturing operation, sustaining quality throughout the manufacturing life of the product.
Topics include:
W. Edwards Deming, PhD, taught the manufacturing world how to monitor manufacturing process capability and uniformity through Statistical Process Control (SPC), but sometimes a process is monitored “because we can,” not because it is or has been a significant risk to reliability.
Many times a failure mechanism is introduced to a manufacturing line way back up in the supply chain. An example is a PEM (plastic encapsulated microcircuit) epoxy that does not have the correct formula for its TCE (thermal coefficient of expansion) to match the lead frame structure, which may lead to delamination or wire bond failures. With so many suppliers and so many potential failure mechanisms in the supply chain, a supplier may lose process control without knowing it.
The best and most relevant way to ensure the quality of the product is to perform detailed failure analysis of final test failures and field failures early in the production cycle, and then quickly apply corrective actions to minimize manufacturing process variation.
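For readers unfamiliar with SPC, the sketch below shows one common form of it, an individuals control chart whose limits are derived from the average moving range. The measurements are hypothetical, and which parameter is worth charting is exactly the judgment call raised above: the chart only helps if the parameter relates to a real reliability risk.

```python
def individuals_control_limits(baseline):
    """Center line and 3-sigma limits for an individuals (X) chart.

    Sigma is estimated from the average moving range divided by d2 = 1.128,
    the standard constant for moving ranges of two consecutive points.
    """
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    center = sum(baseline) / len(baseline)
    sigma = mr_bar / 1.128
    return center - 3 * sigma, center, center + 3 * sigma


def out_of_control(values, lower, upper):
    """Indices of points that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical measurements of one parameter from early production.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
lcl, center, ucl = individuals_control_limits(baseline)

# Hypothetical later measurements; the last one has drifted.
later = [10.0, 10.1, 9.9, 11.2]
flagged = out_of_control(later, lcl, ucl)
print(f"LCL={lcl:.2f} CL={center:.2f} UCL={ucl:.2f} flagged={flagged}")
```

A flagged point is a prompt to investigate, not a diagnosis; the failure analysis of the affected units is still what identifies the mechanism and the corrective action.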
Related Topics:
Fundamentals of Reliability Performance Monitoring
The post SOR 848 Monitoring Manufacturing Product Reliability appeared first on Accendo Reliability.