What is MTBF? Understanding Mean Time Between Failures

Categories: Tech Explained · Published On: August 3rd, 2023 · 4.4 min read

MTBF, or Mean Time Between Failures, is defined as the average amount of time a system can run between failures. For industrial and manufacturing environments in particular, downtime can be very costly. In fact, according to a survey from ITIC, unplanned downtime can cost over $1 million in a single hour. 
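As a rough illustration of that definition, the sketch below computes an observed MTBF by dividing total operating time by the number of failures. The function name and the sample figures are hypothetical, chosen only to show the arithmetic, and are not drawn from any real system.

```python
def observed_mtbf(total_operating_hours: float, failure_count: int) -> float:
    """Return the average operating time between failures, in hours."""
    if failure_count <= 0:
        # With no recorded failures there is no interval to average.
        raise ValueError("MTBF is undefined without at least one failure")
    return total_operating_hours / failure_count


# Hypothetical fleet: 10 units each run for 8,760 hours (one year),
# with 4 failures recorded across the whole fleet.
fleet_hours = 10 * 8760
print(observed_mtbf(fleet_hours, 4))  # 21900.0
```

In this made-up case the fleet averages 21,900 operating hours between failures. That does not mean each unit is expected to last exactly that long.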

With so much money and time at stake, preventing downtime is at the forefront of many businesses’ minds. But how can a business maximize system uptime?

What is MTBF?

MTBF is often used as the defining standard for hardware reliability. There are many methods used to calculate MTBF, but one of the most common comes from the Military Handbook: Reliability Prediction of Electronic Equipment, commonly referred to as MIL-HDBK-217. Two methods from this handbook are widely used by system manufacturers.

The first, called the “parts count” method, is calculated using “typical conditions” of a given environment. The second, called the “part stress” method, uses conditions as close as possible to those the equipment will actually experience. While the part stress method is generally the more accurate of the two, it isn’t always clear which method a manufacturer used for its published MTBF data. As a result, it can be difficult to judge a system’s true mean time between failures in a particular environment from the published data alone.
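To make the parts count idea more concrete, here is a minimal sketch of its general shape: each part type contributes its quantity multiplied by a generic failure rate and a quality factor, the contributions are summed into a system failure rate, and MTBF is taken as the reciprocal of that rate (which assumes a constant failure rate). The `PartType` class, the function name, and every numeric value below are assumptions made for illustration. None of the figures come from MIL-HDBK-217 or from any manufacturer’s data.

```python
from dataclasses import dataclass


@dataclass
class PartType:
    name: str
    quantity: int                # number of parts of this type in the system
    generic_failure_rate: float  # failures per million hours (hypothetical value)
    quality_factor: float        # quality multiplier (hypothetical value)


def parts_count_mtbf_hours(parts: list[PartType]) -> float:
    """Estimate MTBF in hours from a summed, constant system failure rate."""
    # System failure rate, in failures per million hours.
    system_rate = sum(
        p.quantity * p.generic_failure_rate * p.quality_factor for p in parts
    )
    if system_rate <= 0:
        raise ValueError("system failure rate must be positive")
    # Convert failures per million hours into hours between failures.
    return 1_000_000 / system_rate


# Hypothetical bill of materials; the rates are illustrative only.
bill_of_materials = [
    PartType("capacitor", 20, 0.005, 1.0),
    PartType("resistor", 40, 0.0025, 1.0),
    PartType("integrated circuit", 5, 0.06, 1.0),
]
print(f"{parts_count_mtbf_hours(bill_of_materials):,.0f} hours")  # 2,000,000 hours
```

A part stress calculation would replace the generic rates with rates adjusted for each part’s actual operating conditions, which is why the two methods can produce different MTBF figures for the same hardware.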

For a more in-depth dive into mean time between failures and the math behind MTBF calculation, check out our whitepaper.

What are the main causes of downtime?

There are many factors that contribute to system downtime. Power outages, cyber attacks, and incorrect setup can all cause a system to fail. However, one of the biggest contributors to unplanned downtime is the hardware itself. If your system isn’t built to handle the environment it’s deployed in, or is otherwise unsuitable, it can fail more frequently. So what are the main features you should look for in hardware to promote maximum reliability?

Hardware considerations to increase MTBF

When considering the kind of hardware you need, it’s important to keep the specifics of its use case and environment in mind. Where will it be used? Will it be in a dusty or hot environment? Let’s go through a few of the most common industrial hardware features that can help to increase system uptime.

Fanless PCs

Fanless PCs address the reliability challenges presented by fanned computers. This is especially true in industrial settings, such as manufacturing lines, where levels of airborne contaminants may be high. These contaminants, such as dirt, dust, and even oil, can enter the chassis through the vents and build up on the fan. Eventually this can cause the fan to slow or even fail entirely.

Fan malfunction can cause the system to overheat and lead to irreversible damage to internal components like the CPU, motherboard, and hard drive. A fanless PC, which is cooled passively rather than by a fan, helps to mitigate these concerns. Check out our blog on the advantages of fanless PCs to learn more about how fanless cooling can help to increase MTBF.

Industrial components

One reason standard computer hardware fails in an industrial environment is its internal components. Consumer-grade PCs often aren’t built for the challenges of an industrial environment, so industrial PCs generally feature components that are specially designed to withstand harsh conditions and offer a longer life span.

Industrial PCs typically use SSDs (solid state drives) rather than HDDs (hard disk drives). HDDs are made of spinning platters and a read-write head. SSDs, however, feature a solid state design with no moving parts and rely on flash memory instead. SSDs are especially important for industrial PCs because of how these PCs are used. For example, an industrial PC used in a vehicle or on a manufacturing line can be subject to varying levels of vibration.

This vibration can cause any moving parts (such as those in an HDD) to become dislodged or shift out of place, which can ultimately lead to system failure. Having an SSD in place of an HDD not only helps to improve a system’s vibration resistance, but it can also reduce the need for frequent storage drive replacement since SSDs typically have a longer life span.

Operating temperature range

Equipment in manufacturing and industrial environments can cause the surrounding areas to get very hot, which can put strain on a PC and cause it to overheat. This risk goes up significantly if there is a large system load or if the environment is poorly ventilated. Having a system with a wide operating temperature range can make all the difference in these types of environments.

OnLogic’s Karbon 800 series can handle operating temperatures ranging from -40°C to 70°C (-40°F to 158°F) and offers fanless and fanless hybrid cooling options to help keep your system cool in tough environments. You can find the MTBF summary of OnLogic’s Karbon 800 series here.

The bottom line on MTBF

Using industrial-grade, reliable hardware suited for the environment and application will help you get the most out of your system and increase uptime. Not sure which system you need? The experts at OnLogic can help you find the right fit for you. Contact us with any questions today.

About the Author: Claireice Mathai

Claireice Mathai is a content creator for OnLogic. When not writing, she enjoys playing guitar and gaming.