The 737 Max Disaster: A Software Testing Nightmare

Naveen Alok
3 min read · Oct 15, 2022

When you think of airplane crashes, you probably don’t think of software as a leading cause. But software was at the center of what happened to Boeing when two brand-new 737 Max planes crashed within five months of each other, in October 2018 and March 2019. Investigations later found serious flaws in the plane’s anti-stall software, MCAS (Maneuvering Characteristics Augmentation System). The response from regulators, airlines, and the general public was swift: in March 2019 the 737 Max fleet was grounded worldwide, and it stayed grounded for more than a year and a half while the investigations and fixes were completed.

The Software Failures

Both Boeing 737 Max crashes involved the same part of the flight control system. The first happened on October 29, 2018, when Lion Air Flight 610 crashed into the Java Sea shortly after takeoff from Jakarta, Indonesia, killing all 189 passengers and crew members on board. The second was on March 10, 2019, when Ethiopian Airlines Flight 302 crashed shortly after takeoff from Addis Ababa, killing all 157 people on board. In both cases, investigators found that the Maneuvering Characteristics Augmentation System (MCAS) had repeatedly pushed the nose down in response to faulty data from an angle-of-attack sensor. MCAS is a feature on the 737 Max designed to prevent the plane from stalling.

What Went Wrong?

While many people think that airplane crashes are caused by human error and pilot mistakes, software can also contribute to a crash. There are several ways software could cause a crash or a near-miss, including incorrect inputs, incorrect outputs, incorrect calculations, or a plain software bug. The most dangerous case is a bug that is missed during testing, or one that never shows up under simulated and test conditions.

The Maneuvering Characteristics Augmentation System (MCAS) software on the Boeing 737 Max has been the subject of much controversy since the first Max crash. MCAS is meant to help keep the plane from stalling. When the angle of attack (the angle between the wing and the oncoming air) gets too high, the system automatically trims the horizontal stabilizer to push the nose down so the wings keep generating lift. The problem is that the original design acted on a single angle-of-attack sensor, so one bad reading could trigger it when the plane was nowhere near a stall. And once MCAS activated, the pilots couldn’t simply turn it off from the control column; using the electric trim switches only interrupted it, and it would activate again a few seconds later. To stop it for good, they had to follow the runaway stabilizer procedure, which comes down to two steps. The first step is to flip the stabilizer trim cutout switches. The second step is to trim the plane manually with the trim wheel. This is much easier said than done, particularly at low altitude just after takeoff, where there is little time or height to recover. It is even harder if the pilots don’t know MCAS is active, and the system was not described in the flight crew manuals before the first crash.
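To make the single-sensor weakness concrete, here is a minimal toy model in Python. It is only a sketch for illustration, not Boeing's implementation: the class name, the threshold, and the trim step are all assumptions made up for this example. It shows how logic that trusts one angle-of-attack reading keeps adding nose-down trim when that reading is stuck at a bad value.

```python
from dataclasses import dataclass

# Hypothetical values chosen only for illustration; not Boeing's actual parameters.
AOA_THRESHOLD_DEG = 14.0  # angle of attack above which the toy system reacts
TRIM_STEP_DEG = 2.5       # nose-down trim added per activation


@dataclass
class SingleSensorMcas:
    """Toy model: nose-down trim driven by a single angle-of-attack sensor."""

    stabilizer_trim_deg: float = 0.0  # negative means nose-down

    def update(self, aoa_sensor_deg: float) -> float:
        # With only one input there is nothing to cross-check against,
        # so a faulty reading is trusted as if it were real.
        if aoa_sensor_deg > AOA_THRESHOLD_DEG:
            self.stabilizer_trim_deg -= TRIM_STEP_DEG
        return self.stabilizer_trim_deg


def simulate(readings):
    """Feed a sequence of sensor readings to a fresh toy controller."""
    mcas = SingleSensorMcas()
    return [mcas.update(reading) for reading in readings]


# A sensor stuck at an implausibly high value triggers trim on every cycle.
stuck_sensor = [70.0] * 4
trim_history = simulate(stuck_sensor)
print(trim_history)  # [-2.5, -5.0, -7.5, -10.0]
```

A test suite that only feeds this function plausible readings would pass every time. The failure only appears when a test deliberately injects a broken sensor value.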

Fix

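After the first Boeing 737 Max crash, Boeing issued a bulletin reminding pilots of the existing procedure for handling uncommanded stabilizer trim and started work on a software update, but the update had not been rolled out when the second crash happened. The revised software, approved by regulators before the fleet returned to service, compares both angle-of-attack sensors, does not activate MCAS if they disagree significantly, and limits MCAS to a single activation per high angle-of-attack event. Operators also had to have their pilots train on the corrected system, including simulator sessions, before flying the plane again to make sure it worked as intended.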
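One publicly reported change in the updated software is that MCAS compares both angle-of-attack sensors and activates only once per event. The Python sketch below shows what that kind of cross-check could look like, together with the fault-injection check that a test suite should include. Everything here is an assumption for illustration: the class and function names are hypothetical, and the thresholds are example values, not a description of the certified system.

```python
# Hypothetical values chosen only for illustration.
AOA_THRESHOLD_DEG = 14.0    # angle of attack above which the toy system reacts
MAX_DISAGREEMENT_DEG = 5.5  # larger sensor disagreement inhibits activation
TRIM_STEP_DEG = 2.5         # nose-down trim added per activation


class CrossCheckedMcas:
    """Toy model: two sensors must agree, and trim fires once per event."""

    def __init__(self) -> None:
        self.stabilizer_trim_deg = 0.0  # negative means nose-down
        self.fired_this_event = False

    def update(self, left_aoa_deg: float, right_aoa_deg: float) -> float:
        # If the sensors disagree, at least one is wrong, so do nothing.
        if abs(left_aoa_deg - right_aoa_deg) > MAX_DISAGREEMENT_DEG:
            return self.stabilizer_trim_deg
        aoa_deg = (left_aoa_deg + right_aoa_deg) / 2
        if aoa_deg <= AOA_THRESHOLD_DEG:
            # The high angle-of-attack event is over, so re-arm.
            self.fired_this_event = False
        elif not self.fired_this_event:
            self.stabilizer_trim_deg -= TRIM_STEP_DEG
            self.fired_this_event = True
        return self.stabilizer_trim_deg


def run(sensor_pairs):
    """Feed (left, right) sensor readings to a fresh toy controller."""
    mcas = CrossCheckedMcas()
    return [mcas.update(left, right) for left, right in sensor_pairs]


# Fault injection: left sensor stuck high, right sensor reading normally.
faulty = run([(70.0, 5.0)] * 4)
# Genuine high angle of attack: both sensors agree.
genuine = run([(16.0, 15.5)] * 3)
print(faulty)   # [0.0, 0.0, 0.0, 0.0]
print(genuine)  # [-2.5, -2.5, -2.5]
```

The point for testers is the first scenario: a stuck sensor is exactly the kind of input that rarely occurs in nominal test conditions, so it has to be written into the test plan on purpose.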

Bottom line

One size doesn't fit all, and the complexity of testing varies with the domain you are working in: a bug in a payroll system has a very different impact than a bug in aviation software. However, this can’t be an excuse for failing. While the Boeing 737 Max aircraft have been cleared to fly again, it’s important to remember that this wasn’t a problem with the design of the plane itself; it was a problem with a piece of software that could have been tested better.
