When High Performers Fail: Lessons from NASA's Greatest Losses

Interviews

Oct 8, 2025

Jowanza Joseph

CEO, Parakeet Risk

Michael Ciannilli

Program Manager at NASA


In Episode 5 of Industrial Risk: Beyond The Blueprint, Jowanza and Mike Ciannilli explore the deeply human side of engineering failure, revealing how NASA's most devastating losses became the foundation for revolutionary safety practices that now guide industries worldwide.


In a riveting conversation, Mike Ciannilli, NASA's Program Manager for the Apollo, Challenger, and Columbia Lessons Learned Program, opens up about his journey from launch director to keeper of spaceflight's most painful memories. What began as an effort to memorialize fallen heroes has evolved into something far more profound: a blueprint for preventing catastrophic failure in any high-stakes organization.


The Genesis of Understanding: From Recovery to Revelation


When the Space Shuttle Columbia broke apart over Texas on February 1, 2003, Mike wasn't just watching from mission control—he was searching the skies above East Texas in a helicopter, helping bring Columbia and her crew home. That experience, combined with similar roles during earlier missions, transformed his understanding of failure from an abstract concept to a visceral reality.


NASA workplace safety culture training session showing personnel engaged in a safety briefing in an industrial space environment. Source: appel.nasa.gov


"The questions started to become why? Why did it happen?" Mike explains to Jowanza during their interview. "I think that's really the central question that I try to focus on and spend most of my time on—the why. People want to know beyond the basic facts and figures. How could teams that are such high-performance, talented, and passionate—make a mistake like this and then do it repeatedly?" 


This shift from "what happened" to "why it happened" became the cornerstone of NASA's Apollo Challenger Columbia Lessons Learned Program (ACCLLP), established in 2016. The program represents a fundamental departure from traditional accident analysis, focusing not on technical failures but on the human and organizational factors that enable those failures to become catastrophic.


Apollo 1: The "Non-Hazardous" Test That Changed Everything


Apollo 1 spacecraft showing burn damage from the 1967 disaster that led to critical NASA lessons on safety and risk management. Source: space.com


The story of Apollo 1 reveals perhaps the most sobering lesson about the danger of assumptions. On January 27, 1967, astronauts Virgil "Gus" Grissom, Edward White, and Roger Chaffee died in what was labeled a "non-hazardous test": a routine ground rehearsal that NASA believed posed no significant risk.


Mike's analysis of this tragedy introduces a critical concept: failure of imagination. As he explains, the Apollo 1 team had become comfortable with the individual elements of their system (a pure oxygen atmosphere, added Velcro and netting, the electrical systems) without considering how those elements might interact.


"We'd been flying with a pure oxygen environment in the capsules... We were also adding different substances to the capsule," Mike recounts. "Well, over the course of the early part of the program, the astronauts wanted more Velcro introduced and netting in different ways to help with orbit operations. All good purposes, right? We're learning from spaceflight how to solve certain problems, and we're adding different commodities on board. However, we had a failure to really understand that a larger volume of oxygen in the spacecraft means higher levels of oxidizer, and when you're adding more commodities like the netting or Velcro, you're adding a fuel source."


Pure oxygen (oxidizer), increased combustible materials (fuel), and an electrical spark (ignition source) resulted in a catastrophic fire in what should have been a routine test. The tragedy killed three astronauts in seconds and nearly ended the Apollo program before it truly began.


This disaster established a fundamental principle that Mike emphasizes: questioning what we label as "non-hazardous." In our own personal and professional lives, what operations do we consider routine simply because we've done them countless times before?


The Normalization of Deviance: How Success Becomes Failure's Enabler


The seven crew members of Space Shuttle Challenger in their flight suits, photographed before the 1986 disaster. Source: nasa.gov


The Challenger and Columbia disasters, separated by 17 years, share a disturbing commonality that reveals one of the most insidious threats to safety culture: normalization of deviance. This phenomenon, first identified by sociologist Diane Vaughan in her analysis of the Challenger disaster, describes how organizations gradually accept lower standards until what was once unacceptable becomes routine.


Challenger: When Warning Signs Become Normal


The Challenger disaster of January 28, 1986, stemmed from O-ring degradation in the solid rocket boosters, a problem that had been observed on multiple previous flights. "Throughout the flight history of the space shuttle program (this was the 25th flight), on multiple flights we had seen degradation of the O-rings: sometimes a little burning, sometimes soot. We had seen evidence that something was not going right," Mike explains.


The critical failure wasn't a result of technical ignorance; it was a matter of psychological adaptation. "We saw it, but we thought we understood it well enough not to stand down and fix it. We thought we could fix it on the fly," Mike continues. Each successful launch despite O-ring damage became justification for accepting the next instance of damage, creating a dangerous cycle of rationalization.


Columbia: The Deadly Evolution of Acceptable Risk


Columbia's fate was sealed 81.7 seconds after launch, when a briefcase-sized chunk of foam insulation broke away from the external tank and struck the leading edge of the shuttle's left wing. But foam shedding wasn't new; it had been occurring since the very first shuttle flights.
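Why would lightweight foam destroy a wing panel? Kinetic energy scales with the square of velocity, so even a light fragment shed at high relative speed strikes with surprising force. The back-of-envelope sketch below uses round numbers of roughly the right order of magnitude as illustrative assumptions, not official CAIB accident-report figures:

```python
# Back-of-envelope estimate of the foam strike's impact energy.
# Mass and relative speed are illustrative assumptions (order of
# magnitude only), not official CAIB accident-report figures.

mass_kg = 0.76         # assumed ~1.7 lb chunk of foam insulation
rel_speed_m_s = 240.0  # assumed ~540 mph relative impact speed

kinetic_energy_j = 0.5 * mass_kg * rel_speed_m_s ** 2
print(f"Impact energy: {kinetic_energy_j / 1000:.1f} kJ")  # ~21.9 kJ
# Roughly the muzzle energy of several high-powered rifle rounds,
# delivered to a brittle reinforced carbon-carbon wing panel.
```

The point of the arithmetic is the intuition, not the exact number: "just foam" stops being harmless once relative velocity enters the picture.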


Mike's analysis reveals how this normalization process operates at a granular level:

"We got so comfortable with the deviance in the system that subliminally, and sometimes actively, we actually used the deviance experience to give us permission that that was okay in the future."


The process follows a predictable pattern (a toy simulation after this list makes the ratchet concrete):

  1. Initial deviation occurs (foam shedding)

  2. No immediate catastrophic consequences

  3. Deviation becomes "acceptable risk"

  4. Previous successful outcomes justify accepting larger deviations

  5. Risk envelope continues expanding until catastrophic failure occurs
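For readers who think in code, steps 1 through 5 can be expressed as a minimal simulation. This is an illustrative sketch with arbitrary units and assumed thresholds, not a model from the episode or a validated risk framework: the physical failure limit never moves, while each uneventful out-of-spec flight widens what the organization accepts.

```python
import random

# Toy model of normalization of deviance (illustrative only).
# Arbitrary units; thresholds are assumptions, not engineering data.
random.seed(42)

PHYSICAL_LIMIT = 3.0  # where the hardware actually fails; never moves
accepted = 1.0        # deviation the organization currently calls "normal"

for flight in range(1, 101):
    # Operations drift: observed deviation scales with what is tolerated.
    observed = random.uniform(0.0, accepted * 1.5)
    if observed >= PHYSICAL_LIMIT:
        print(f"Flight {flight}: catastrophic failure at deviation {observed:.2f}")
        break
    if observed > accepted:
        # The mission succeeded anyway, so the success itself becomes
        # permission: the out-of-spec value is the new normal (step 4).
        accepted = observed
        print(f"Flight {flight}: deviation {observed:.2f} survived; "
              f"'normal' widens to {accepted:.2f}")
else:
    print("No failure in 100 flights; the envelope was still expanding.")
```

With a different seed the timing changes, but the shape is always the same: the accepted envelope creeps toward the fixed physical limit until a routine flight crosses it.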


Mike uses a powerful analogy to illustrate this phenomenon: "I can tell you, if you ever saw my car, it does not look like it did when I first bought it, right? I've got the little thump over here, the little noise in the engine, the little orange blinky light on the dashboard that wasn't there, but to me that's normal. But it wasn't normal when I bought the car."



Beyond Aerospace: Universal Lessons for High-Stakes Organizations


Mike's work has expanded far beyond NASA's gates, reaching organizations across diverse industries including energy, healthcare, banking, and manufacturing. This broad applicability reflects a fundamental truth: the human factors that contributed to NASA's failures exist in virtually every complex organization.


The Healthcare Connection


Healthcare organizations have embraced NASA's lessons with particular enthusiasm, recognizing parallels between spaceflight and medical environments where small failures can cascade into catastrophic outcomes. The concept of psychological safety—the belief that one can speak up without risk of punishment or humiliation—has become central to patient safety initiatives worldwide.


High Reliability Organization Principles


NASA's transformation incorporates principles from High Reliability Organization (HRO) theory, which was developed by researchers studying organizations that operate successfully in high-risk environments despite complexity and the potential for catastrophic failure. These principles include:


  1. Preoccupation with failure: Constant awareness of what could go wrong

  2. Reluctance to simplify: Accepting complexity rather than seeking simple explanations

  3. Sensitivity to operations: Understanding that front-line workers often have the best situational awareness

  4. Commitment to resilience: Building capacity to respond to unexpected events

  5. Deference to expertise: Prioritizing knowledge over hierarchy when making critical decisions



Building Psychological Safety: The Foundation of Learning Organizations


NASA's cultural transformation following the Columbia disaster centered on creating psychological safety—an environment where employees feel comfortable raising concerns without fear of retribution. This concept, pioneered by Harvard Business School professor Amy Edmondson, has become fundamental to understanding how high-performing organizations maintain safety in complex environments.


The Columbia Accident Investigation Board found that engineer Rodney Rocha had concerns about potential wing damage but felt unable to escalate them effectively due to hierarchical barriers. His response when asked why he hadn't spoken up more forcefully captured the essence of the cultural problem: "I just couldn't do it. I'm too low [in the organization]... and she [Mission Management Team leader Linda Ham] is way up here."


NASA's response involved systematic cultural changes designed to flatten hierarchies when safety is at stake and ensure that expertise, rather than organizational level, determines whose voice carries weight in critical decisions.


Practical Applications: Implementing NASA's Lessons


Organizations seeking to implement NASA's lessons can focus on several practical areas:


1. Challenge Assumptions Regularly

Create formal processes for questioning what has become "normal" or "routine." Regularly ask: What are we accepting as standard that wasn't originally designed to be standard?


2. Create Safe-to-Fail Experiments

Before implementing changes that could affect safety, create controlled environments where potential failures can be identified and addressed without catastrophic consequences.


3. Develop "Reading the Room" Skills

Train leaders to recognize non-verbal cues and unspoken concerns. Create multiple channels for raising safety concerns, including anonymous reporting systems.


4. Focus on Why, Not Just What

When incidents occur, investigate organizational and cultural factors alongside technical causes. Ask not just what failed, but why competent, well-intentioned people made decisions that enabled the failure.


5. Make Failure Stories Personal

Connect safety lessons to human stories that create emotional resonance. Abstract principles are easily forgotten; human stories that illustrate consequences are remembered and acted upon.


The Continuing Mission: From Tragedy to Transformation


Mike Ciannilli's work represents more than accident analysis or safety training—it embodies a fundamental commitment to ensuring that lives lost in the pursuit of human advancement are not lost in vain. By transforming NASA's most painful moments into universal lessons about human performance, organizational culture, and the prevention of catastrophic failure, the Apollo Challenger Columbia Lessons Learned Program serves organizations worldwide.


The program's impact extends far beyond aerospace, reaching healthcare systems seeking to prevent medical errors, energy companies managing high-risk operations, and any organization where the stakes are high and the margin for error is small. Each presentation, tour, and training session carries forward the voices of 17 astronauts whose sacrifice illuminates the path to safer, more reliable operations.


Summary

Through continued vigilance, cultural transformation, and the systematic application of hard-won wisdom, NASA's greatest losses have become the foundation for creating safer, more reliable operations across industries. The mission continues: honor the fallen by ensuring their lessons prevent future tragedies, wherever and whenever they might threaten to occur.


Sources:

Karl E. Weick and Kathleen M. Sutcliffe, Managing the Unexpected: Resilient Performance in an Age of Uncertainty, 2nd ed. (2007).


🎧 Listen to the whole conversation between Jowanza and Mike: 

Industrial Risk: Beyond The Blueprint, Episode 5

Copyright © 2025, All Rights Reserved.