

Speaking Of Reliability: Friends Discussing Reliability Engineering Topics | Warranty | Plant Maintenance
Reliability.FM: Accendo Reliability, focused on improving your reliability program and career
Gain the experience of your peers to accelerate improvement of your program and career. Improve your product development process, reliability or warranty performance; or your plant uptime or asset performance. Learn about reliability and maintenance engineering practical approaches, skills, and techniques. Join the conversation today.

Aug 2, 2021
A Better Way to do Design Reviews
Abstract
Chris and Fred ask each other ‘what makes a good design review?’ This is a great question. Reliability engineering can help! If you want to learn more – listen to this podcast!
Key Points
Join Chris and Fred as they discuss what makes a good design review. There is a chance that you have been involved in a bad design review. Like the one where the mechanical engineering team lead busts out a 378-page PowerPoint presentation that is a wordy photojournal of effort.
Topics include:
Stop talking about requirements! What? How can this be? Because we only meet requirements at (or around) our final design. Which means the only design review where we talk about meeting requirements is the last one. So what do we do up until that point if we focus on requirements? We continually ask ourselves if we are ‘on track.’ So everyone turns up with as much material as possible to convince people that they have been working really, really hard.
… and design reviewers set the scene. Design reviewers need to prepare. They need to research. They do not need to be the most senior personnel. They need to be the people best placed to provide meaningful design help.
Design reviews are forums for helping to create awesome designs. A designer once took the initiative and sent a list of concerns or problems he was experiencing to the people who were going to be part of the design review. And this meant that instead of being an arduous ordeal of PowerPoint slides, the design review became a wonderful brainstorm of amazing design solutions.
Design reviews need to be ‘safe’ places. Not courtrooms where the defendants (… I mean … designers) have to justify how hard they have been working while ‘senior engineers’ try and pick holes. This is destructive. We need to encourage designers to be vulnerable and look for help. They will get accolades at the end of the day.
And the leader of the design review owns everything. If a reviewer doesn’t care or hasn’t prepared, get rid of them. If a design team lead turns up with 378 PowerPoint slides … ask them to stop and ask them what is keeping them up at night. Have a scribe. Someone who writes every action item down. Not ideas. Not concerns. Action items.
Enjoy an episode of Speaking of Reliability. Where you can join friends as they discuss reliability topics. Join us as we discuss topics ranging from design for reliability techniques to field data analysis approaches.
The post SOR 675 A Better Way to do Design Reviews appeared first on Accendo Reliability.

Jul 30, 2021
Safety and Reliability
Abstract
Carl and Fred discuss the tools of safety and reliability engineering, and how these tools overlap.
Key Points
Join Carl and Fred as they discuss the broad set of tools within the body of knowledge of reliability engineering, and the value or lack of value in categorizing the tools into subsets.
Topics include:
Are the tools of quality and safety subsets of reliability?
What is safe enough, reliable enough?
How does the product perform in real life, from a safety standpoint, reliability standpoint, quality standpoint?
Bucketing the tools into silos does not add value
How should a reliability engineering department be organized?
What is the overlap between safety tools and reliability tools?
Performance is included in reliability definition (intended function)
Reasonably anticipated misuse
The post SOR 674 Safety and Reliability appeared first on Accendo Reliability.

Jul 26, 2021
Safety – Binary or Variable
Abstract
Carl and Fred discuss the subject of safety. Specifically whether an item or device can be considered safe or not safe (binary), or whether there are degrees of safety (variable).
Key Points
Join Carl and Fred as they discuss the concept of developing and verifying safe products.
Topics include:
Is safety a quality characteristic?
Is an item safe or not safe (binary)?
Need to design safety into our products
Safety in the regulatory space
Risk needs to be reduced to an acceptable level (variable)
Safety, hazard, risk – differences and similarities
Is meeting regulatory requirements sufficient?
Regulatory compliance is necessary, but not sufficient
Safe and reliable is the objective
How to specify safety requirements
How to achieve safety
Safety considerations with tandem events
FMEA, FTA and other more sophisticated tools
Safety and reliability tools overlap
Show Notes
This book was mentioned in the podcast:
System Safety Engineering and Management, by Harold E. Roland and Brian Moriarty, John Wiley & Sons, copyright 1990
The post SOR 673 Safety – Binary or Variable appeared first on Accendo Reliability.

Jul 23, 2021
Make or Check Reliability
Abstract
Chris and Fred discuss the difference between ‘making’ and ‘checking’ reliability. And there is a difference. This podcast follows on from Chris’s article about a US Department of Defense (DoD) quick reference guide on a ‘Reliability and Maintainability Engineering Body of Knowledge.’ The problem with this document was it was all about ‘checking’ reliability – and not enough ‘making’ reliability.
Key Points
Join Chris and Fred as they discuss the difference between making and checking reliability. We sometimes confuse effort with outcomes. And many documents (like the one Chris references) only talk about ‘reviewing’ progress, generating ‘status reports,’ coming up with ‘test plans’ and so on. Not a lot about making reliability happen.
Topics include:
Checking is not making. Of the 58 activities described in the quick reference guide, 57 activities were about ‘reviewing,’ ‘evaluating,’ ‘preparing documents,’ ‘verifying,’ and other keywords that are all about checking reliability. Nothing about making reliability.
A list of activities is not a strategy. There are different strategies when it comes to reliability. And the strategy comes down to human beings. What do we want them to be good at? How do we train them? In that training, what key approaches that are specific to our organization do we want to embed? This means you need to sit down and think about how the users are going to interact with your systems.
What happens when something goes wrong? When we have an approach which is based entirely on ‘review’ activities plastered over a Gantt chart or other schedule diagram … what happens if something goes wrong? In practice, we can’t tolerate ‘do-overs.’ In this quick-reference guide, there was no allowance for something not ‘satisfying’ the review activity. There is in practice no time or money to redo stuff we thought we had already done well. So we muddle our way through it with concessions and all sorts of other mechanisms to wave away problems. What we actually want is to focus on making reliability happen by having design teams search for failures early in the design process. And this won’t happen if engineers are constantly preparing for the next project meeting they need to satisfy.
Do you have any design guidance? Do you like the idea of ‘modular’ maintenance? … where we take out an entire subsystem that has a failed component and replace it with another subsystem that is designed to make this swap easy? How on earth can you provide guidance to make this happen – if the only thing you do is check on what has been done?
… and engineers generally crave meaningful guidance. Not just waiting for you to tell them what they did wrong.
The post SOR 672 Make or Check Reliability appeared first on Accendo Reliability.

Jul 19, 2021
Supply Chain Controls and Checks
Abstract
Kirk and Fred discuss how we can ensure that the outsourced parts we receive meet our requirements when we may not know which variances in specifications will affect our particular system.
Key Points
Join Kirk and Fred as they discuss supply chain issues and controlling potential quality excursions.
Topics include:
Each component supplier has multiple suppliers that provide materials to build the components. It can be extremely difficult to monitor the many levels of supply chains and their long chain of suppliers.
Sometimes the specifications in assembly drawings are generically specified to a default tolerance and assumed to be sufficient without detailed analysis.
Fred discusses the specifications for car dashboards, including reflectivity, and how it is still difficult to make sure a dashboard is acceptable under all potential sunlight conditions.
Show Notes
The US Army and CALCE article “Reliability Prediction – A Continued Reliance on a Misleading Approach” offers a relatively new analysis of traditional reliability prediction methods.
For more information on the newest discovery testing methodology, see the book “Next Generation HALT and HASS: Robust Design of Electronics and Systems” by Kirk Gray and John Paschkewitz.
The post SOR 671 Supply Chain Controls and Checks appeared first on Accendo Reliability.

Jul 16, 2021
Kitchen Appliance Failure
Abstract
Kirk and Fred discuss a recent failure of an electric range/oven and the troubleshooting, failure analysis (FA), and repair. See the show notes for photos and details of the failure analysis.
Key Points
Join Kirk and Fred as they discuss the process of finding the cause of the range/oven failure. The show notes detail the location of the failed aluminum electrolytic (Al-E) capacitor on one of the two control boards.
Topics include:
Could the problem have been tin whiskers between the two very closely spaced circuit boards (as shown in the show notes photos)?
Could it have been some foreign object or bug that had crawled into the space between the backs of the two circuit boards?
Kirk discovered that the control module was a common model used for many different brands of ovens. The first replacement he ordered was almost completely functional, but its front panel layout was wrong.
Potential investigations and experiments we would run to drive corrective action if we worked for one of the manufacturers that found this problem.
Show Notes
Photo: The lead end of the capacitor that fell out.
Photo: Tight spacing between the PWAs, where the leads of the Al-E capacitor may have touched. Or could it have been tin whiskers?
Photo: Location of the Al-E capacitor burn mark on the opposite-facing bottom PWB, where the capacitor leads may have shorted.
Photo: Underside of the PWB, where the Al-E capacitor leads extend to be soldered.
The post SOR 670 Kitchen Appliance Failure appeared first on Accendo Reliability.

Jul 12, 2021
Wright’s Law and Reliability
Abstract
Chris and Fred discuss this thing called ‘Wright’s Law’ which is a really fascinating way of describing how things improve as we create more of them. And why is this relevant for reliability engineering? Does ‘reliability growth’ ring a bell?
Key Points
Join Chris and Fred as they discuss Wright’s Law, and how it relates to ongoing improvement in reliability.
Topics include:
What is Wright’s Law? Wright’s Law states that every time you double the cumulative number of items you produce or manufacture, the cost per unit decreases by a fixed percentage, here taken as 15 %. Why is this? There are a number of reasons. First and foremost – we learn as we go. So the mistakes we made for the first few products aren’t repeated later on. But then of course there are things like economy of scale, efficiencies, and so on. But a good chunk of this improvement (which we tend to see over and over again) is down to learning. Wright’s Law is a form of a power law that we see in lots of different applications across the world.
So how is this relevant for reliability? Well … much of the costs and delays we incur in production are reliability-related. If a component prototype can’t integrate with the rest of the product, then that incurs costs and delays. It also represents an interface function failure which things like FMEAs are wonderful at preventing.
… and reliability growth? Well … we know that traditional reliability growth (where we build-test-fix) sees reliability grow … following a power law very similar to Wright’s Law. In fact, this thing called Duane’s failure pattern is a power law with this thing called a ‘reliability growth rate.’ Lots of studies and observations have found that this reliability growth rate is around 0.3 to 0.4. And what does that mean? Well … at a growth rate of 0.35 (the middle of that range), every time we double the amount of reliability growth testing, we decrease the failure rate by about 21.5 %. This is pretty close to 15 %!
Are there any key takeaways? YES! Improving reliability through exploratory testing, learning from mistakes, and build-test-fixing is EXPENSIVE. So instead use tools like a Failure Mode and Effect Analysis (FMEA) to prevent as many problems as possible as part of your first design. For example, if we need to remove a certain number of defects and problems from our initial ‘raw’ design, you could halve your development time by eliminating 20 % of problems before you start. So why wouldn’t you? Why wouldn’t you take as many issues out of production as possible to do things like … incorporate design characteristics.
The post SOR 669 Wright’s Law and Reliability appeared first on Accendo Reliability.

Jul 9, 2021
Maintenance Culture Matters
Abstract
Chris and Fred discuss how important maintenance culture is – especially when it comes to safety-critical systems. Like ‘cable cars’ used to transport people up ski slopes. But unfortunately (like the recent accident that occurred in Italy that resulted in 14 deaths) toxic maintenance culture can lead to disastrous consequences. And this tends to happen across the world on a regular basis. Why does this happen?
Key Points
Join Chris and Fred as they discuss the recent cable car accident that occurred in Italy where 14 people died and how this relates to maintenance culture.
Topics include:
So what happened? As the cable car was reaching the top of the ski slope, a ‘thinner moving’ cable snapped, so the car slid back down the ‘thicker static’ cable that acts as a ‘rail’ for it. The cable car quickly accelerated to about 100 km per hour (roughly 60 miles per hour), flew off the cable, and hit the ground, killing all but one on board.
So what went wrong? An emergency brake that is designed to arrest the movement of the cable car if it goes too fast was deliberately disabled. It was intermittently applying when it shouldn’t, so a technician inserted a ‘steel staple’ to hold the emergency brake open.
Was it the technician’s fault? According to his lawyer …
He is not a criminal and would never have let people go up with the braking system blocked had he known that there was even a possibility that the cable would have broken. He can't even begin to get his head around the fact that the cable broke.
What does this say about culture? The term ‘plausible deniability’ might be fitting here. The technician’s supervisors claim they didn’t know about the emergency brake being disabled. But … why did the technician believe that disabling the emergency brake was even an option? The emergency brake was included in the design of the cable car (at some cost). How can we, as engineers, believe that an emergency component that was included in the design is superfluous to requirements? We all know of organizations where bypassing safety elements of a design feature is fundamentally unacceptable.
What does weak management look like? For example, a manager who deliberately over-allocates tasks to subordinates, knowing that they can’t possibly complete them all, but thinking they have avoided accountability for these tasks not being done. We can only speculate here, but we as a community have some experience in this regard. Safety is a special beast and needs to be ingrained in behaviors. Or culture. Some organizations empower every employee to push a hypothetical ‘emergency stop’ button if they see a problem. In fact, they encourage it. Identifying a problem early allows a fast and inexpensive remedy. Other organizations discourage that ‘emergency stop’ button from being pressed because it is all about throughput or profit.
The post SOR 668 Maintenance Culture Matters appeared first on Accendo Reliability.

Jul 5, 2021
Interpreting Distribution Parameters
Abstract
Chris and Fred discuss what ‘distribution parameters’ mean when it comes to random processes. Specifically, random failure processes. This is an interesting podcast in response to a question from one of our listeners – the kind of podcast we love!
Key Points
Join Chris and Fred as they discuss a question directed to us by a listener. In fact, there were two questions, as follows:
1) Think of probability distributions and the sequence you define your observation points. Neither the distribution type nor the parameters change e.g. when you reverse the sequence or change the order. It’s ambiguous to me, because if I have higher rate of failures in the past but better conditions now, I’d like to see it in my parameters and shape. Otherwise, how can I rely on e.g. beta in my Weibull distribution? 2) How may I determine the rate of events (say rate of TTR, TTF, or any other parameter) when my distribution is not Weibull? Which parameter should I use? Let me appreciate your time & willingness to help in advance. Keyvan.
Just for the uninitiated, a Weibull distribution is a type of probability distribution that is used a lot in reliability engineering. The ‘beta’ refers to what we call a shape parameter, which describes the nature in which failure occurs.
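To make the role of ‘beta’ concrete, here is a small Python sketch of the standard two-parameter Weibull hazard (failure rate) function. The scale parameter name `eta`, its value of 1000 hours, and the two comparison times are assumptions chosen only for illustration. The sketch shows the usual reading: a beta below 1 gives a falling failure rate (wear-in), a beta of 1 gives a constant failure rate, and a beta above 1 gives a rising failure rate (wear-out).

```python
import math


def weibull_hazard(t, beta, eta):
    """Instantaneous failure rate of a two-parameter Weibull distribution.

    beta is the shape parameter, eta the scale parameter (same units as t).
    """
    return (beta / eta) * (t / eta) ** (beta - 1)


def failure_rate_trend(beta, eta=1000.0, t_early=100.0, t_late=900.0):
    """Classify how the failure rate changes between an early and a late time."""
    early = weibull_hazard(t_early, beta, eta)
    late = weibull_hazard(t_late, beta, eta)
    if math.isclose(early, late):
        return "constant"
    return "decreasing" if late < early else "increasing"


if __name__ == "__main__":
    # Hypothetical shape parameters covering the three classic cases.
    for beta in (0.5, 1.0, 3.0):
        print(f"beta = {beta}: failure rate is {failure_rate_trend(beta)}")
```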
Topics include:
Order of data shouldn’t matter … if we are looking at time to failure. The first step of any random data analysis is to order the data from smallest to largest. Unless … we are talking about a ‘renewal process.’ This is where you might have a single machine that works until it fails, and then it is repaired, and it keeps working. In which case … the order of data does matter. In a renewal process, one machine might have lots of times to failure (noting it gets repaired). This means that we can’t use probability distributions to describe single times to failure (like a Weibull distribution).
But what if it is a renewal process? Then we can perhaps examine monthly failure rates, or the Mean Cumulative Function (MCF) to identify trends over time. And by trends, we mean failure rate behaviors that show wear-in, wear-out or something in between. If you are really interested in getting to the bottom of what is going on, then research this thing called the nonhomogeneous Poisson process (NHPP).
OK … so what if it is simple ‘time to failure?’ Well before we talk about ‘betas,’ we need to confirm the Weibull distribution is an appropriate model. It is sometimes useful to break down failure into failure modes. If there are different failure modes, they might be modelled by different Weibull distributions. For example, if a system fails due to wear-in around half the time, and wear-out the other half, then fitting a single Weibull distribution might try to ‘average’ the two and (incorrectly) conclude the system has a constant or non-changing failure rate. So always confirm you have the right model.
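The listener’s first question can be illustrated with a short Python sketch of a simplified Mean Cumulative Function. It assumes every machine is observed for the same full window (no staggered start or censoring), and the machine data and function names are hypothetical. The two fleets below have exactly the same set of times between failures, so a distribution fitted to those sorted values would be identical, yet the MCF separates the improving fleet from the worsening one.

```python
from itertools import accumulate


def failure_times(gaps):
    """Convert times between failures into running failure times for one machine."""
    return list(accumulate(gaps))


def mcf(fleet, t):
    """Mean cumulative number of failures per machine by time t.

    Simplification: every machine is assumed to be observed over the whole window.
    """
    failures = sum(1 for times in fleet for ft in times if ft <= t)
    return failures / len(fleet)


# Hypothetical hours between successive failures on two repaired machines.
gaps_a = [10, 20, 40, 80, 160]
gaps_b = [12, 24, 48, 96, 130]

# Same gaps, opposite order: failures thinning out versus failures bunching up.
improving = [failure_times(gaps_a), failure_times(gaps_b)]
worsening = [failure_times(gaps_a[::-1]), failure_times(gaps_b[::-1])]

if __name__ == "__main__":
    for t in (100, 200, 310):
        print(f"t = {t}: improving {mcf(improving, t)}, worsening {mcf(worsening, t)}")
```

Both fleets end the window with five failures per machine, but the improving fleet has most of them early and the worsening fleet has most of them late, which is the trend information that sorting the data throws away.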
The post SOR 667 Interpreting Distribution Parameters appeared first on Accendo Reliability.

Jul 2, 2021
Giving and Receiving Feedback
Abstract
Carl and Fred discuss the topic of giving and receiving feedback, and how it supports professional development.
Key Points
Join Carl and Fred as they discuss giving and receiving feedback.
Topics include:
Feedback aspects: giving, receiving, mandatory (like annual performance reviews) and voluntary (upon request)
Feedback is better when it is specific
Be open and receptive to hearing candid feedback; find the nugget
Ask clarifying questions about feedback you receive
Use active listening to enhance benefit of feedback you receive
Signal to people that you are open to *candid* feedback
Ask when giving feedback to a colleague
Be aware of body language when communicating feedback
Positive feedback first, then areas of improvement
Show Notes
Chapter 10 of Effective FMEAs outlines general facilitation skills. Many of these skills, such as active listening, can be used when giving or receiving feedback.
The post SOR 666 Giving and Receiving Feedback appeared first on Accendo Reliability.