Mars Climate Orbiter Report
Table of Contents
Prologue
Background
  Why Mars?
  First Mars Missions
  Faster, Better, Cheaper (FBC) Philosophy
  Mars Missions Under FBC
  Mars Climate Orbiter Mission
Project Management Analysis
  Human Resource Management
  Project Communications Management
  Quality Management
  Risk Management
Summary
References
Prologue
The search for extraterrestrial life has long been one of mankind’s greatest curiosities.
Throughout history, many attempts have been made to gain insight into the celestial bodies
around us. In particular, the exploration of Mars has been of great interest because of its
proximity to Earth and because it is considered the second most habitable planet in our
solar system.
NASA has been actively exploring Mars since the 1960s, but it was in the 1990s that the agency
significantly increased the number of missions to Mars. One of these missions was the Mars
Climate Orbiter, which received a lot of public attention after it failed and the root causes
were made public. In this report we study the failure of this mission from a project
management perspective.
First, we give some background on mankind’s initial interest in Mars. We then discuss the
Faster, Better, Cheaper (FBC) philosophy adopted by NASA in the 1990s, a strategy used to cut
costs and increase the number of missions. In addition, we describe in plain terms the events
that led to the Mars Climate Orbiter (MCO) failure.
For our analysis of this mission from a project management point of view, we focus on the
knowledge areas and processes that we believe had the greatest impact on the development,
operation and failure of the mission. As such, we discuss these knowledge areas:
− Human Resource Management
− Communication Management
− Quality Management
− Risk Management
For each knowledge area, we present a brief overview of its key concepts, as defined by the
Project Management Institute; then we assess how the knowledge area and its processes were
applied to the management of the mission; and finally we summarize the lessons learned from
their application in this mission.
Finally, we summarize the main causes of the failure of the mission and extract general
recommendations that should be taken into account in any project.
Why Mars?
In developing this project management case analysis, our first question was: why Mars?
Our research produced the following answers:
1- Interest in and observation of Mars began in the 1600s, with the invention and
development of the telescope. In 1877 the Italian astronomer Giovanni Schiaparelli reported
seeing “canali” (channels) on the red planet. Increasingly detailed views of the planet
inspired speculation about its environment and about intelligent civilizations.
2- Years after Schiaparelli’s finding, the American astronomer Percival Lowell, together with
other astronomers, suggested that the features Schiaparelli had observed were an irrigation
system created by intelligent beings.
3- Mars exploration became part of the space race, which arose from the ideological
confrontation known as the Cold War between the U.S. and the Soviet Union from 1945
(end of WWII) until 1991 (end of the USSR).
4- On December 27, 1984, a Martian meteorite (ALH 84001) was found in Allan
Hills, Antarctica. In 1996 it was announced that carbonate globules in this meteorite
appeared to contain microscopic fossils of Martian bacteria.
Figure 1. Causes of Mars explorations
First Mars Missions
As shown in Figure 2, NASA’s early Mars missions in the 1960s and 1970s had a high success
rate compared with those of the Soviet space program, but also a low launch rate and an
extremely high cost per project.
Figure 2. Comparison of the first missions
Faster, Better, Cheaper (FBC) Philosophy
Daniel Saul Goldin returned to NASA as administrator on April 1, 1992 and implemented the
"faster, better, cheaper" philosophy that proposed “Missions should be smaller, launched more
often and cost less money”. This new approach emerged in response to shrinking budgets and
fears that the agency was placing too much emphasis on too few missions.
Other important factors that contributed to the FBC approach were internal pressures, which
came from multibillion-dollar NASA missions that took decades to move from concept to
operation and return of data, and external pressures, which derived from the Government-wide
initiative to do more with less.
Goldin initially promoted a low-cost manned lunar project, but after the 1996 finding about
the Martian meteorite (ALH 84001), the focus shifted to unmanned Mars probes.
The Mars Program Independent Assessment Team’s report, dated March 14, 2000, defined FBC as:
(1) Utilizing new and innovative technology.
(2) Creating smaller spacecraft, thereby reducing costs and allowing more frequent
missions.
(3) Accepting prudent risk where warranted by return.
(4) Reducing cycle time by eliminating inefficient and redundant processes.
(5) Utilizing proven engineering and management practices to maximize success.
Figure 3. Faster, better, cheaper approach.
Mars Missions under FBC
After 10 years without any Mars exploration missions, NASA was ready to return to the red
planet with a new philosophy (FBC). A total of six missions related to Mars were developed
and launched during the period starting in 1994 and ending in 2000. The missions launched
during this period are:
− Mars Global Surveyor (1996)
− Pathfinder (1996)
− Deep Space 1 (1998)
− Mars Climate Orbiter (1998)
− Mars Polar Lander (early 1999)
− Deep Space 2 (early 1999)
The main objectives and budgets of these missions are presented below:
• The Mars Global Surveyor was the first mission under FBC, and was built by
Lockheed Martin Astronautics under the supervision of NASA’s Jet Propulsion
Laboratory (JPL). Development and construction cost US$ 154 million, and the launch
cost US$ 65 million. The spacecraft’s objectives were to orbit Mars, take pictures
to map its surface and serve as a relay station for future missions. The Surveyor
remained operational until late 2006, when NASA lost contact with it.
• The Pathfinder landed on Mars on July 4, 1997. This mission was directed by the JPL with
a budget under US$ 280 million. The mission deployed the Mars rover Sojourner, which
was able to analyze rocks on Mars’ surface, and sent back pictures of Mars’ topography
and sky. The mission was planned to last around one month but ended up lasting
almost three months, ending when its batteries wore out.
• The Deep Space 1 was also under the responsibility of the JPL, with a cost of US$ 95
million. The goal of this mission was to test new technologies, such as ion propulsion
and onboard autonomous operations. A flyby of an asteroid was also part of the mission
and was a partial success. Most of the new technologies were proven in space, and this
experience became available for future missions.
• The Mars Climate Orbiter was developed by Lockheed Martin Astronautics, and the project
was under the responsibility of the JPL. This robotic spacecraft was supposed to arrive at
Mars about nine months after its launch in December 1998, to study Mars’ climate,
atmosphere and surface and to act as a communication relay for the Mars Polar
Lander mission. The total budget of the mission was US$ 125 million. However, the mission
did not accomplish its objective: the spacecraft was destroyed during its orbital
insertion, as discussed in more detail later in this report.
• The Mars Polar Lander and Deep Space 2 missions were part of the same spacecraft. The
Polar Lander’s main purpose was to study the soil of Mars’ south pole, investigate the
possibility of finding ice and study the climate. The Deep Space 2 mission consisted
of two identical impact probes, each the size of a basketball, which would be released by
the Polar Lander during its Mars landing. The probes would penetrate one meter into Mars’
soil to study its subsurface composition. Again, the development of the spacecraft and
probes was the responsibility of Lockheed Martin Astronautics, and project management
was the responsibility of the JPL. The budget for these missions was around US$
222.6 million, including development and launch. The mission failed: the Polar
Lander was apparently destroyed during landing, and it also failed to release the two
Deep Space 2 probes.
Mars Climate Orbiter Mission
The overall scientific objectives of the Mars Climate Orbiter (MCO) were to monitor daily
weather and atmospheric conditions, record changes on the surface and look for evidence of
past climate changes on the red planet. In addition, it would also serve as a relay between the
Mars Polar Lander and the ground systems on our planet. The MCO was expected to operate for
5 years after a complete orbital insertion on Mars.
NASA launched the MCO spacecraft from Cape Canaveral, Florida, on December 11, 1998, on a
Delta II Lite launch vehicle; the launch sequence lasted 42 minutes in total. The sequence
went through various stages using different propellants at different altitudes to ensure an
optimal escape from the Earth’s atmosphere. The orbits of Earth and Mars around the Sun allow
for a launch window every 26 months (2 years and 2 months), when the energy required for
the trip is at a minimum. Expeditions to Mars are considered only during these launch
windows.
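The 26-month spacing between launch windows can be checked from the synodic period of Earth
and Mars, the time it takes for the two planets to return to the same relative positions.
The following is a minimal sketch; the orbital periods are assumed reference values that do
not come from this report, and the function name is illustrative.

```python
# Sidereal orbital periods in days (assumed reference values, not from this report).
EARTH_PERIOD_DAYS = 365.256
MARS_PERIOD_DAYS = 686.98


def synodic_period_days(inner_period: float, outer_period: float) -> float:
    """Time between successive identical alignments of two planets."""
    return 1.0 / (1.0 / inner_period - 1.0 / outer_period)


window_days = synodic_period_days(EARTH_PERIOD_DAYS, MARS_PERIOD_DAYS)
window_months = window_days / (365.25 / 12)  # average calendar month
print(f"Launch windows repeat every {window_days:.0f} days "
      f"(about {window_months:.1f} months)")
```

Under these assumptions the result is roughly 780 days, which is consistent with the
26-month figure quoted above.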
After the spacecraft left the Earth’s atmosphere, it started what is called a Hohmann transfer
to leave Earth’s orbit and reach Mars’ orbit. The spacecraft’s trajectory was approximately
416 million miles long, and the cruise lasted 9 months and 12 days, after which communication
with the spacecraft was lost.
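For comparison, the flight time of an idealized Hohmann transfer can be estimated with
Kepler’s third law. This is only a sketch: it assumes circular, coplanar orbits and uses
orbit radii that are reference values not given in this report, so it is not a model of the
trajectory the MCO actually flew.

```python
# Orbit radii in astronomical units (assumed reference values, not from this report).
EARTH_ORBIT_AU = 1.0
MARS_ORBIT_AU = 1.524


def hohmann_transfer_days(r_departure_au: float, r_arrival_au: float) -> float:
    """Flight time of an idealized Hohmann transfer between two circular orbits.

    Uses Kepler's third law in solar units (period in years = a ** 1.5, a in AU).
    The spacecraft covers half of the transfer ellipse.
    """
    semi_major_axis = (r_departure_au + r_arrival_au) / 2.0
    transfer_orbit_years = semi_major_axis ** 1.5
    return 0.5 * transfer_orbit_years * 365.25


transfer_days = hohmann_transfer_days(EARTH_ORBIT_AU, MARS_ORBIT_AU)
print(f"Idealized Earth-Mars Hohmann transfer: about {transfer_days:.0f} days")
```

The idealized estimate of roughly 259 days is somewhat shorter than the cruise time reported
above, which is expected because real trajectories are constrained by the actual planetary
positions and by mission design.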
During the first 4 months after launch, because of problems in the ground navigation
software, the navigation team could not communicate with the spacecraft and therefore could
not model its trajectory. During that time, the navigation team had to rely on emails from
Lockheed Martin Astronautics to model the spacecraft’s trajectory. After the communication
issue was resolved, the navigation team noted that the ground navigation software was
generating anomalous data, specifically the thruster performance data.
The navigation team failed to use Incident/Surprise/Anomaly (ISA) reports to handle the issue.
NASA had established this type of report to identify, solve and document issues that
arise during the operational phase of a mission. Because the navigation team did not
understand the ISA process, the issue was never formally tracked and was ultimately
ignored.
Because of the spacecraft’s odd design, the solar wind pushed slightly on its solar
panels, making it rotate around the pitch axis. A reaction wheel fitted inside the spacecraft
canceled those small disturbances. However, reaction wheels have a limited speed, so they
lose their effectiveness as momentum builds up.
This momentum buildup is brought back down to controllable levels using hydrazine thrusters.
It is important to know that this process is part of the spacecraft’s control loop and is
expected. Trajectory Correction Maneuvers (TCMs) were planned ahead and executed to aim
the spacecraft for the Mars Orbital Insertion (MOI).
However, because the ground navigation software logged thruster performance data in Imperial
units instead of metric units, the effects of the trajectory correction events were
underestimated by a factor of 4.45 (1 lbf·s ≈ 4.45 N·s). This meant that for each correction
issued, the thrusters were fired 4.45 times longer than necessary. The need to correct the
spacecraft’s trajectory became more and more frequent: about 10 to 14 trajectory corrections
had to be issued to the spacecraft, whereas in other missions the average was about 2.
Each of these errors was small, but they accumulated gradually.
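The factor of 4.45 is simply the conversion between pound-force seconds and newton seconds.
The sketch below shows how a bare number read in the wrong unit produces that factor, and how
a value that carries its unit avoids the ambiguity. The `Impulse` class and the 10 lbf·s
sample value are hypothetical illustrations, not part of the MCO ground software.

```python
from dataclasses import dataclass

LBF_S_TO_N_S = 4.4482216152605  # one pound-force second expressed in newton seconds


@dataclass(frozen=True)
class Impulse:
    """An impulse value that always carries its unit (hypothetical helper)."""
    value: float
    unit: str  # "N*s" or "lbf*s"

    def to_newton_seconds(self) -> float:
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        raise ValueError(f"unknown impulse unit: {self.unit!r}")


# Hypothetical thruster event written by the ground software in Imperial units.
reported = Impulse(10.0, "lbf*s")

# Reading the bare number as if it were N*s reproduces the mix-up described above.
misread = reported.value
actual = reported.to_newton_seconds()
print(f"assumed {misread:.2f} N*s, actual {actual:.2f} N*s, "
      f"ratio {actual / misread:.2f}")
```

Carrying the unit with the value means a consumer either converts explicitly or fails loudly
on an unknown unit, instead of silently accepting a number that is off by a constant factor.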
Because of the erroneous data it was working with, the navigation team could not quantify the
uncertainty in its trajectory calculations. As the spacecraft approached Mars, the navigation
team executed TCM-4 to prepare the spacecraft for orbital insertion a week later. The
underlying error was not the unit mix-up itself, but NASA’s failure to detect and correct
it.
Using more information to model the spacecraft’s trajectory, the navigation team calculated
that the spacecraft was barely 18 miles above the survivable altitude. TCM-5, a planned
emergency trajectory correction maneuver to steer the spacecraft away from Mars, was
discussed but not implemented because the navigation team did not fully understand the risk
and because the maneuver could also put the Mars Polar Lander mission at risk.
Just before the Mars Orbital Insertion took place, the navigation team lost contact with the
spacecraft, as expected, when it passed behind Mars. However, the team could not re-establish
communication when the spacecraft was supposed to reappear from the occultation. Later,
corrected trajectory simulations placed the spacecraft at 35 miles above the surface of Mars,
well below the 50-mile critical altitude. It was then understood that the spacecraft had
deteriorated and ultimately disintegrated while entering Mars’ atmosphere.
Human Resource Management
Project Human Resource Management is the knowledge area responsible for the selection,
training, evaluation and rewarding of employees. It includes the processes that organize,
manage, and lead the project team.
The project team is comprised of the people with assigned roles and responsibilities for
completing the different phases of the project. The type and number of team members may
change as the project progresses, and the team members may also be referred to as the
project’s staff.
An overview of the four processes involved in Human Resource Management is shown in the
next graphic.
Comparing the basic concepts of this knowledge area with how the MCO project was managed
reveals negligence in several situations. Some important points of the first two processes
shown above, from left to right, are discussed below.
Develop Human Resource Plan
• The PM is responsible for making sure the team members have the appropriate training
for the tasks they will be performing.
• The PM must clearly identify the roles and responsibilities of the team.
Acquire project team
• The PM must evaluate the risk of resources becoming unavailable.
MCO’s Human Resource Management
Next, we provide some examples of how Human Resource Management responsibilities such as
“Training”, “Staffing” and “Roles definition” were not properly addressed. These, together
with other issues analyzed later under other knowledge areas, constitute the basic reasons
for the MCO failure.
Training
Training is a critical consideration that no project manager should underestimate. A team
without proper training can be compared to a time bomb: everything goes fine until time runs
out and all the project objectives collapse. Something similar happened with the MCO
mission: the operations navigation team did not have adequate training on the MCO
spacecraft’s design and operations. The main issues in which the lack of training was
critical for the project are summarized below:
• The team did not recognize the purpose and use of the Incident-Surprise-Anomaly
(ISA) procedure.
The ISA is a formal problem resolution process used to address any incident,
surprise or anomaly during the project. Because the team had not been trained in the use
of this procedure, it resorted to informal ways of communicating when problems arose.
This severely affected the effectiveness of communication.
• The small forces software development team needed additional training in the ground
software development process.
Understanding this software was critical, since it was responsible for
determining the spacecraft’s trajectory. Because of the team’s lack of training, an
end-to-end test of the software was not performed. Such a test could have prevented the
MCO’s trajectory failure.
• The training on following the Mission Operations Software Interface
Specification (SIS) was inadequate.
• The team did not have deep knowledge of the spacecraft’s attitude operations.
As a result, the MCO attitude control system and related subsystem parameters
were not fully understood, even though many errors were being generated for those
systems.
• The Trajectory Correction Maneuver (TCM) was a contingency plan to execute a
trajectory correction if needed. It could have been applied to raise the MCO to a
safe altitude, but the team had never received practical training on it. Analyses,
tests and procedures for committing to a TCM in the event of a safety issue were
neither completed nor attempted. Therefore, when the maneuver became urgent, the team was
not ready to perform it.
Staffing
The FBC approach reduced personnel in a way that significantly affected the project.
For example, the operations navigation team was not staffed well enough to deal effectively
with all the issues that arose. Due to the lack of staff, the Mars Surveyor Operations Project
(MSOP) was running 3 missions simultaneously. When small problems began to appear in
the MCO, the team could not focus on them.
The MSOP had no systems engineering and no mission assurance personnel, which was critical for
the mission. The presence of a mission assurance manager, for example, would have helped to
improve project communication. It would also have helped to make sure that specifications
or standards, such as the AMD file requirements or the ISA resolutions, were being followed
correctly.
In addition, the success of space missions in general requires the full involvement of the
mission science personnel in the management process. Science personnel with relevant
expertise are an important resource throughout the progress of a project. They
should be included in all decisions where expert knowledge is required. Such experts were not
present in the decisions prior to MCO’s Mars orbit insertion.
Roles and responsibilities
The roles and responsibilities of part of the team were unclear. Two recurring questions were
“Who’s in charge?” and “Who’s the mission manager?” The people attempting to answer these
questions showed hesitancy and wavering. The clearest example is the Flight Operations
Manager (FOM): because there was no mission assurance manager, the FOM became the
improvised mission manager, which was not part of his designated responsibility.
Finally, another problem observed in the MCO mission was that the role of the systems
engineers was not clearly defined. This left the navigation team without an understanding
of essential spacecraft design characteristics. Systems engineering support would have
improved the operations navigation team’s ability to reach critical decisions and would have
provided oversight of navigation mission assurance. The role of these engineers was so
critical that it was the responsibility of the systems engineering organization to identify
the units problem that led to the loss of the MCO.
Lessons Learned
• The team should be provided with proper training and detailed information regarding
systems that may have a high impact on the success of the project.
• The project manager should identify or provide backup personnel who could be available
to serve in some of the critical positions when needed.
• The human resource department should make sure that the staff has clear and
well-defined roles and responsibilities.
Project Communications Management
The key words of this knowledge area are “handling information”. According to the PMBOK, this
area includes the processes required to ensure timely and appropriate generation, collection,
distribution, storage, retrieval, and ultimate disposition of project information. The
processes involved in this knowledge area are:
− Identify Stakeholders
− Plan Communications
− Distribute Information
− Manage Stakeholder Expectations
− Report Performance
Throughout our analysis of the MCO project management information, we identified two of
these processes that were not used correctly:
Report Performance
This process focuses on collecting performance information, examining it, comparing it
with the baselines and sending it to the defined stakeholders. It also focuses on the
reliability and accuracy of the information reported, in order to forecast which preventive
actions will be needed later in the project.
Distribute Information
This process focuses on delivering to each stakeholder the type of information that
stakeholder requires, in a specific format and at a specific moment of the project. It
also ensures efficient and effective communication among all parties
involved in the project.
MCO’s Communication Management
These processes were not properly applied during the development of the MCO project,
specifically in the following situations:
FBC Implementation
Based on different NASA reports, we identified that the failure of this project began with
the implementation of the FBC philosophy: NASA Headquarters did not enact a formal definition
of FBC, which resulted in different interpretations by project managers of what prudent risk
was. Because FBC was never formalized into written policies or guidance, the communication of
the approach to the project managers and contractors was ineffective.
Interfaces and Relationships
Among the interfaces and relationships reviewed, two significant areas of concern, and the
communication problems within them, were identified: “The interface between
NASA Headquarters and JPL” and “The interface between JPL and Lockheed Martin.”
NASA- JPL Relationship
This interface was highly ineffective from the communications point of view for two reasons:
First, each party interpreted the initial information differently. Figure 5 shows the
intended versus perceived communications.
Figure 5. Communication NASA-JPL
For the MCO project, NASA defined and supplied the program objectives, requirements and
constraints to JPL. JPL management interpreted these terms as mandates and deduced that
no cost increase was allowed, even when one was necessary to mitigate risks.
JPL responded with a supportive attitude toward the program in order to present a positive
image, rather than with a rigorous risk assessment that raised appropriate concerns. As a
result, NASA understood that JPL was in agreement with the objectives, requirements, and
constraints for the MCO project.
Second, there was no single MCO project interface at NASA responsible for all requirements,
including those from other NASA organizations. This resulted in multiple inputs to the JPL
MCO project that were in some instances conflicting and that in general aggravated the
communication problem.
Jet Propulsion Laboratory – Lockheed Martin Relationship
The relationship between JPL and LMA was effective during the development of the MCO project,
but it became ineffective when it came to informing NASA’s senior management about risks
that had not been formally identified.
Figure 6. Communication JPL-LMA
Communication Barriers Between Project Elements
Communication barriers between project teams were the main cause of the MCO mission failure,
because each team worked independently and with little cross-communication.
In the MCO project, there is evidence of poor communication among the project management,
operations navigation and spacecraft teams, for example:
The operations navigation team discussed the trajectory concerns among themselves, but they
did not communicate it effectively to the spacecraft operations team or project management.
When conflicts in the data were discovered, the team relied on e-mail to track and solve the
problem, instead of formal problem resolution processes such as the Incident, Surprise, and
Anomaly (ISA) reporting procedure.
Figure 7. Communication barriers between project teams
Lessons Learned
• Senior management must be receptive to communications of problems and risks.
• A dedicated single interface at NASA Headquarters for the Mars Program is essential.
• Contractor responsibilities must include formal notification to the customer of project
risk and deviations.
• Increase the amount of formal and informal face-to-face communications with all team
elements and especially for those elements that have critical interfaces.
Quality Management
A quality management plan should contain relevant existing quality practices, standards and
requirements not only for project deliverables but for the management of the project as well.
Processes and procedures should be defined on how to conduct and improve quality. Quality
should be measurable, which means that the quality management plan should clearly identify
what, when and how to measure. Quality should be controllable, therefore, metrics should be
compared against defined thresholds (control limits) and corrections performed as needed.
Under the FBC philosophy, great emphasis was put on reducing cost and schedule despite an
already steep scope. This philosophy, albeit partially implemented, inadvertently neglected
quality and its importance in risk mitigation. As such, several missions under this philosophy
failed when pushing the boundaries of FBC (Samovinski & Judd & Richards & Bauer &
Cipolla & Purcarey, 2001). The MCO’s project management team failed to adequately perform
all quality management processes as defined by the PMI in the PMBOK: plan quality, perform
quality assurance and perform quality control.
However, in this report, we concentrate only on perform quality assurance and perform
quality control.
Perform Quality Assurance
The Verification, Validation & Accreditation (VV&A) processes should have determined early on
that the navigation software was not to be used, because the implemented model and
associated data (thruster performance) did not conform to specifications and therefore did
not represent the “real world”. However, the VV&A processes were not thoroughly performed on
the navigation software of the MCO spacecraft and ground system. As a consequence, the
reviewers failed to catch a discrepancy that caused the spacecraft trajectory to be modeled
incorrectly.
Even though the navigation software was developed by Lockheed Martin Astronautics (LMA),
which introduced the software bug, NASA should have caught the mistake through its
internal VV&A.
Although there was a document containing the specifications of interoperability between
systems, namely the Mars Surveyor Operations Project Software Interface Specification (SIS), it
was not followed by the software programmers nor used by the software testing team (Lilley,
2009).
Within LMA, the faulty code came from a reused software package that was not mission-critical
in the past and not bound to the SIS. At some point, this software package was promoted to be
part of the navigation software of the MCO (which made it critical) and a formal code review
process was once again not performed.
This faulty code later became the root cause of the mission failure: use of Imperial units in the
ground navigation software. The SIS clearly defined the use of metric units and format. NASA
used (and continues to use) the International System of Units (SI) throughout the whole agency
due to the Metric Conversion Act (MCA) of 1975. However, LMA has long used the Imperial
system for aircraft, since the MCA only affects U.S. Government programs.
Early on when the spacecraft was already en route to Mars, the navigation team noticed
anomalous data in the files generated by the ground navigation software. They knew something
was wrong but they didn’t know what exactly. Increasingly frequent anomalous events used to
correct the spacecraft rotation were noted by the navigation team. However, they only
discussed these events informally. Due to the odd shape of the spacecraft, the navigation team
incorrectly believed that these events were to be expected.
Perform Quality Control
The anomalous events that occurred were not tracked using the multi-mission, institutional
defect database for mission operations that was in use during the lifetime of the MCO mission.
An Incident/Surprise/Anomaly (ISA) report documenting the operational issue should have been
submitted to the global issue-tracking database. Had this been done, significant effort would
have been devoted to solving the problem. Instead, the issue “fell through the cracks”.
NASA had a separate database for defects during the development phase of a mission.
Rigorous and extensive validation of the software interfaces, as specified by the VV&A
processes, was not found to have taken place. Had it been done, the navigation team would
have had a chance to determine to what degree the model and its associated data failed to
represent the real world. To make matters worse, the engineers of the navigation team did not
fully understand the navigation system of the MCO and were not qualified to operate it. The
physical consequence of thrusters firing over four times longer than required was that the
spacecraft drifted off course.
Control limits for how often trajectory correction events can occur in a given timeframe were
not defined during planning, which led to the team not knowing that there was a serious
problem. Control limits for thruster performance data were not defined either, so
the team could not know that the data was incorrect by a factor of 4.45.
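A control limit of this kind can be as simple as a tolerance band around an expected value.
The sketch below is illustrative only: the function name, the tolerances and the choice of
12 corrections (the midpoint of the 10 to 14 range mentioned earlier) are assumptions, not
values taken from NASA procedures.

```python
def within_control_limits(observed: float, expected: float, tolerance: float) -> bool:
    """Return True when observed lies inside expected * (1 +/- tolerance)."""
    lower = expected * (1.0 - tolerance)
    upper = expected * (1.0 + tolerance)
    return lower <= observed <= upper


# Hypothetical limits: 50% deviation allowed in the number of corrections,
# 10% deviation allowed in the ratio of modeled to reported thruster impulse.
checks = {
    "trajectory corrections": within_control_limits(observed=12, expected=2, tolerance=0.5),
    "thruster impulse ratio": within_control_limits(observed=4.45, expected=1.0, tolerance=0.1),
}
for name, ok in checks.items():
    print(f"{name}: {'OK' if ok else 'OUT OF LIMITS, file a report'}")
```

With limits like these defined during planning, both metrics would have been flagged long
before orbital insertion, which is the point of the paragraph above.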
Lessons Learned
• Following the Verification, Validation & Accreditation processes is crucial for mission
success. These quality audit processes should be conducted thoroughly, throughout the
mission’s lifetime, by qualified independent reviewers who are experts on the matter.
• A comprehensive test verification matrix for the whole mission should be defined early
on. This matrix should contain all mission requirements down to the subsystem level.
The project management team should make an effort to ensure that everybody in the team
uses the matrix.
• Continuous review of all the mission’s integrated systems should be conducted by the
project team, and review attendance should be tracked.
• A unified global issue-tracking database that incorporates all phases (development,
operations, etc.) of a mission should be implemented. Frequent or common issues of
previous missions should be analyzed in an effort to improve quality.
• Metrication (the adoption of the International System of Units) should take place across
all NASA organizations as well as for all deliverables procured from
contractors.
Risk Management
In this analysis of the extent to which Risk Management (RM) was implemented in the MCO
mission, we start by summarizing the most important RM concepts and tools, so that
the impact of the RM process, or the lack of it, on the MCO mission becomes more evident and
easier to appreciate.
RM deals with the uncertainties, positive or negative, that a project will face throughout
its life cycle. This means that RM should start with the initiation process and finish when
the project is closed. High-level risks are first assessed in the project charter, but most of
the work related to RM is done during the planning process group. This does not mean that
the risk effort ends when planning is done: RM is an iterative process with a monitoring
and control aspect that is carried out after planning and through the closing of the project.
This systematic approach to risk assessment has the purpose of identifying events that could
impact, positively or negatively, the project schedule, cost, quality, customer satisfaction,
stakeholders’ interests, scope, etc.
When RM is fully integrated in a project, the project can be executed without huge fires to
put out every day, because they should have been eliminated by a risk response plan; risks are
brought up in every meeting to be addressed before they happen; and, if a risk event does occur,
there is a plan in place to deal with it, meaning no more chaotic meetings to develop a response.
To better understand RM, we need to know the processes that comprise it. These processes
have a logical sequence, but RM as a whole is highly iterative. The RM process comprises:
− Plan Risk Management
− Identify Risks
− Perform Qualitative Risk Analysis
− Perform Quantitative Risk Analysis
− Plan Risk Responses
− Monitor and Control Risks
Plan Risk Management (PRM) answers the question of how to approach risk management
depending on the complexity of the project. The project manager, sponsor, team,
customer, experts and other stakeholders could all be involved in this phase. Specifically, PRM
details the methodology, roles and responsibilities, budget, timing, risk categories, definitions
of probability and impact, stakeholder tolerance, reporting format and tracking.
Now that we know how we are going to do RM, the next step is to Identify Risks (IR). This step
is done mainly during project initiation and planning, but it should continue until project
closing. There are many tools and techniques associated with this step, such as reviewing past
documentation, brainstorming, interviewing, root cause analysis, assumption analysis, and
strengths, weaknesses, opportunities and threats (SWOT) analysis. The result of this step is a
document called the Risk Register, which includes the list of risks, potential responses, root
causes of risks and updated risk categories.
With the risks identified, we Perform Qualitative Risk Analysis (PQLRA). Here we try to filter
the risks according to their probability of occurrence and potential impact. One of its tools is
the probability and impact matrix, which helps separate the risks that require more
analysis from the ones that can go to a watch-list. After this step the Risk Register is updated
with a risk ranking, probabilities and impacts, a list of risks for additional analysis, a risk
watch-list and trends.
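A minimal sketch of a probability and impact matrix is shown here to make the triage step
concrete. The 1 to 5 scales, the score threshold of 10 and the example risks are all
hypothetical choices for illustration; a real project would define them in its risk
management plan.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: int  # 1 (rare) to 5 (almost certain); hypothetical scale
    impact: int       # 1 (negligible) to 5 (severe); hypothetical scale

    @property
    def score(self) -> int:
        """Cell value in the probability and impact matrix."""
        return self.probability * self.impact


def triage(risks, threshold=10):
    """Rank risks by score and split them into (further_analysis, watch_list)."""
    ranked = sorted(risks, key=lambda r: r.score, reverse=True)
    further_analysis = [r for r in ranked if r.score >= threshold]
    watch_list = [r for r in ranked if r.score < threshold]
    return further_analysis, watch_list


# Hypothetical risk register entries, invented for this example.
example_risks = [
    Risk("Unit mismatch in ground software interface", probability=2, impact=5),
    Risk("Missed trajectory correction window", probability=3, impact=5),
    Risk("Reviewer unavailable for design review", probability=3, impact=2),
]

further, watch = triage(example_risks)
for risk in further:
    print("analyze further:", risk.name, risk.score)
for risk in watch:
    print("watch-list:", risk.name, risk.score)
```

Multiplying probability by impact is only one common scoring convention; some matrices weight
impact more heavily so that rare but catastrophic risks are never relegated to the watch-list.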
Using the list of risks that require more analysis, we Perform Quantitative Risk Analysis
(PQTRA), but it’s important that the project manager weighs the costs against the benefits of
doing this analysis.
Plan Risk Responses (PRR) deals with the question of what to do with the top risks previously
identified and quantified. Its goal is to eliminate or reduce threats, and to find ways to
promote opportunities. A threat that can’t be eliminated is handled with a contingency plan, or
with a fallback plan if the contingency fails. Here the work of dealing with risks is assigned to
owners, who are responsible for implementing the contingency and fallback responses. There are
several strategies to mitigate threats and exploit opportunities. The result of this step is an
update of the project management plan, project documents and risk register. The updates to the
risk register include a list of residual risks, contingency plans, risk response owners, risk
triggers, fallback plans, reserves, etc.
But RM doesn’t end with PRR; we need to Monitor and Control Risks (MCR). Here the project
manager makes sure that everything that has been planned to eliminate or reduce uncertainty is
being implemented, and takes into account any new risks that have been identified. Therefore, the
project manager has to continuously reassess risks, be prepared for a risk audit,
check the level of the project’s reserves, organize status meetings and close risks that no
longer apply. The result of this step is an update of the risk register, change requests,
project management plan updates, project document updates and organizational process asset
updates.
MCO’s Risk Management
To understand the level of RM implemented in the MCO mission we referred to the Mars
Climate Orbiter Mishap Investigation Board Phase I Report (Nov 10, 1999), the Report on Project
Management in NASA by the Mars Climate Orbiter Mishap Investigation Board (Mar 13, 2000)
and the Mars Program Independent Assessment Team Summary Report (Mar 14, 2000). In these
reports we couldn’t find any evidence that proper RM was implemented or that there was a
formal risk management plan for the mission. We infer this because all three
documents mention the need to implement a risk assessment and management process in
future missions.
In an endeavor as complex as space exploration, risk is a natural factor that should
always be considered. The identification and management of risks is one of the main tasks of the
project manager, and its oversight frequently determines the result of a project. As we
discussed earlier, FBC promoted doing more with less and taking risks that were justified. However,
NASA administration failed to adequately define FBC and the policies and procedures
that would guide its application in all projects, especially with respect to what constitutes
prudent risk. This resulted in different interpretations of what FBC means when dealing with the
constraints of schedule, cost, scope and risk.
The reports found that the project managers of the MCO mission placed more emphasis on cost and
schedule without paying attention to project risks. The next graph shows the balance that the
reports propose when dealing with FBC and project constraints.
Scope should not be increased, nor cost and schedule reduced, beyond the point at which risk
grows rapidly.
Evidence that the project managers of the MCO placed more emphasis on cost control than on
risk, while keeping a very ambitious scope, is apparent when we compare the cost and scope of
the MCO ($125 million) with those of the Mars Global Surveyor ($250 million). We could therefore
say that MCO was underfunded, resulting in testing and analysis shortages that significantly
increased the risks and reduced the probability of success.
Another example of the lack of a proper RM plan for MCO, and of the large impact this had on
the failure of the mission, is found in the hesitation to implement Trajectory
Correction Maneuver 5 (TCM-5). The navigation operations team knew that the
spacecraft was off course and had the opportunity to apply TCM-5 as a contingency action, but
there was no procedure for it and the team was not prepared to implement TCM-5 in an
emergency.
In addition, the reports found that the systems engineering team, which performs critical
studies to improve the mission in terms of scope, cost and schedule, and helps the project
manager identify and mitigate risks, did not adequately carry out its
responsibilities: (i) it failed to identify what an acceptable risk level for the mission was;
(ii) it produced no analysis, such as fault tree analysis, that could have helped identify risks
in the MCO mission; (iii) it documented the critical elements of the mission poorly; and (iv) it
had no contingency plan for TCM-5 and other possible critical situations.
All this leads us to understand that the deficiency or absence of a good RM plan denied
the mission a chance to succeed in the event of any significant threat, and that the
mistake in the calculation of the trajectory was a threat that could have been overcome with a
good contingency plan.
Lessons Learned
The reports place great stress on the importance of implementing sound RM plans in all
NASA missions. The RM lessons drawn from the MCO failure include:
• Risks associated with deviation from recognized project management principles
should be accepted.
• Risks must be assessed and accepted by all accountable parties, including senior
management, program and project managers.
• Risk must be assessed and controlled throughout the life cycle of the project, and should be
considered as important as cost, schedule and scope; it should therefore be considered a
fourth dimension of project management.
• Progress in the implementation of the risk mitigation plan should be reported
periodically to the project and program managers.
• A clear and detailed definition of what constitutes acceptable risk should be communicated to
all team members.
• Qualitative and/or quantitative risk analysis tools should be used in all missions to
determine the probability and impact of all identified risks.
• All risks that couldn’t be eliminated with a mitigation plan should have a risk owner, who
is responsible for managing that risk.
• At all meetings and project reviews, the status of the risk mitigation plan should be reported
and reassessed.
• The concepts of earned value management should be applied to RM, allowing the project
manager to know the progress of the plan at any moment during the life of the project.
Summary
Throughout our analysis of the MCO mission failure, we identified a series of technical and
project management related errors and deviations from established procedures, but we
conclude that this catastrophe had one root technical cause and one root management cause.
The technical cause is found in the difference in unit systems between two pieces of software in
charge of calculating the trajectory and the trajectory correction forces commanded to the
spacecraft. The root management cause, on the other hand, is clearly found in the precarious
implementation of the FBC philosophy, which gave rise to more missions, more ambitious scopes,
reduced costs, reduced time, and more risks. In principle, FBC should have resulted in more
successful missions, but it ended up putting more missions in the hands of capable, though
inexperienced, project managers who understood FBC’s values in different ways.
The general recommendations that we can draw from the MCO failure, and that should be
considered in any project, are the following:
• Senior management should make sure that new strategies are well understood in all levels
of the organization and incorporated in detail in all the relevant documents and procedures.
• Personnel availability and training in any project is critical to the success of the venture.
• No scope, cost or schedule change should degenerate into the project manager setting
aside sound project management principles.
• Effective communication improves the integration among team members, stakeholders and
senior management, while its absence can have a negative impact on the project.
• All processes and deliverables should be continuously assessed and controlled to assure that
they meet the requirements.
• Every project manager should determine to what depth to apply the Risk Management
process, but since risks are present in all projects, applying it shouldn’t be optional.
• Risk management should be considered a fourth dimension of the project management
endeavor, with the same relevance as scope, costs and schedule.
References
(Nov 10, 1999) Mars Climate Orbiter Mishap Investigation Board Phase I
Report.
(Mar 13, 2000) Report on Project Management in NASA by the Mars Climate
Orbiter Mishap Investigation Board.
(Mar 14, 2000) Mars Program Independent Assessment Team Summary Report.
Samoviski, D. J., & Judd, E., & Richards, J., & Bauer, E., & Cipolla, N., &
Purcarey, I. (Mar 13, 2001). Faster, Better, Cheaper: Policy, Strategic
Planning, and Human Resource Alignment [Audit report]. Washington, DC: NASA
Headquarters.
Edralin, D.M. (2004). Training: A strategic HRM function. Notes on business
education, 7 (4), 1-2.
Lilley, S. (Aug 2009). Lost In Translation. System Failure Cases, 3(05) 4.
(June 2011). [Personal communication between Larry O’Brien and Peter Norvig]
Retrieved November 20, 2012 from
http://skeptics.stackexchange.com/questions/7276/was-nasas-mars-climate-orbiter-lost-because-engineering-teams-used-different-me
Williams, D.R. (Jan 2005). A Crewed Mission to Mars. Retrieved November 20,
2012 from http://nssdc.gsfc.nasa.gov/planetary/mars/marslaun.html
Grayzeck, E. (May 2012). NASA NSSDC Mars Climate Orbiter Spacecraft-Details.
Retrieved November 21, 2012 from
http://nssdc.gsfc.nasa.gov/nmc/spacecraftDisplay.do?id=1998-073A
Graham, R. (Nov 2008). FAQ on Failures, Part Two: Mars Climate Orbiter
Failure. Retrieved November 21, 2012 from
http://www.designnotes.com/companion/failures/MCOcraft.html
Department of Defense. (Sep 2006). Retrieved November 26, 2012 from
http://vva.msco.mil/Key/key.htm/
Moore, M. (May 2010). NASA-International System of Units - The Metric
Measurement System. Retrieved November 21, 2012 from
http://www.nasa.gov/offices/oce/functions/standards/isu.html