Mars Climate Orbiter Report


Table of Contents

Prologue

Background
    Why Mars?
    First Mars Missions
    Faster, Better, Cheaper (FBC) Philosophy
    Mars Missions under FBC
    Mars Climate Orbiter Mission

Project Management Analysis
    Human Resource Management
    Project Communications Management
    Quality Management
    Risk Management

Summary

References


Prologue

The search for extraterrestrial life has long been one of mankind’s greatest curiosities. Throughout history, many attempts have been made to learn more about the celestial bodies around us. The exploration of Mars in particular has attracted great interest because of the planet’s proximity to Earth and because it is considered the second most habitable planet in our solar system.

NASA has been actively exploring Mars since the 1960s, but it was in the 1990s that the agency significantly increased its number of Mars missions. One of these missions was the Mars Climate Orbiter, which received a great deal of public attention after it failed and the root causes were made public. In this report we study the failure of this mission from a project management perspective.

We begin with some background on mankind’s early interest in Mars. We then discuss the Faster, Better, Cheaper (FBC) philosophy that NASA adopted in the 1990s to cut costs and increase the number of missions, and we describe in plain words the events that led to the failure of the Mars Climate Orbiter (MCO).

For our project management analysis of the mission, we focus on the knowledge areas and processes that we believe had the greatest impact on the development, operation and failure of the mission. We discuss the following knowledge areas:

− Human Resource Management

− Communication Management

− Quality Management

− Risk Management

For each knowledge area, we present a brief overview of its key concepts, as defined by the Project Management Institute; we then assess how the knowledge area and its processes were applied to the management of the mission; and we close by summarizing the lessons to be learned from that application.

Finally, we summarize the main causes of the mission’s failure and draw general recommendations that should be taken into account in any project.


Why Mars?

In developing this project management case analysis, our first question was: why Mars? Our research produced the following answers:

1- Interest in observing Mars took hold in the 1600s with the invention and development of the telescope. In 1877 the Italian astronomer Giovanni Schiaparelli reported seeing a network of linear features, which he called “canali”, on the red planet. Increasingly detailed views of the planet inspired speculation about its environment and about intelligent civilizations.

2- Years after Schiaparelli’s finding, the American astronomer Percival Lowell, together with other astronomers, suggested that the features Schiaparelli had observed were an irrigation system created by intelligent beings.

3- The exploration of Mars began in the context of the space race, a result of the ideological confrontation between the U.S. and the Soviet Union known as the Cold War, which lasted from 1945 (end of WWII) until 1991 (end of the USSR).

4- On December 27, 1984, a Martian meteorite (ALH 84001) was found in Allan Hills, Antarctica. In 1996 it was announced that possible microscopic fossils of Martian bacteria had been found in this meteorite, based on the carbonate globules it contains.

Figure 1. Causes of Mars explorations


First Mars Missions

As shown in Figure 2, in its early attempts during the 1960s and 1970s NASA’s Mars exploration program, compared with the Soviet space program, had a high success rate but a low launch rate and an extremely high cost per project.

Figure 2. Comparison of the first missions

Faster, Better, Cheaper (FBC) Philosophy

Daniel Saul Goldin returned to NASA as administrator on April 1, 1992 and implemented the

"faster, better, cheaper" philosophy that proposed “Missions should be smaller, launched more

often and cost less money”. This new approach emerged in response to shrinking budgets and

fears that the agency was placing too much emphasis on too few missions.

Other important factors behind the FBC approach were internal pressure, arising from multibillion-dollar NASA missions that took decades to move from concept to operation and return of data, and external pressure, arising from the government-wide initiative to do more with less.

Goldin initially promoted a low-cost manned lunar project, but after the 1996 finding about the Martian meteorite (ALH 84001), the focus shifted to unmanned Mars probes.


The Mars Program Independent Assessment Team’s report, dated March 14, 2000, defined FBC as:

(1) Utilizing new and innovative technology.

(2) Creating smaller spacecraft, thereby reducing costs and allowing more frequent missions.

(3) Accepting prudent risk where warranted by return.

(4) Reducing cycle time by eliminating inefficient and redundant processes.

(5) Utilizing proven engineering and management practices to maximize success.

Figure 3. Faster, better, cheaper approach.


Mars Missions under FBC

After 10 years without any Mars exploration missions, NASA was ready to return to the red planet with a new philosophy (FBC). A total of six missions related to Mars were developed and launched between 1994 and 2000:

− Mars Global Surveyor (1996)

− Pathfinder (1996)

− Deep Space 1 (1998)

− Mars Climate Orbiter (1998)

− Mars Polar Lander (early 1999)

− Deep Space 2 (early 1999)

The main objectives and budget of these missions are presented below:

• The Mars Global Surveyor was the first mission under FBC. It was built by Lockheed Martin Astronautics under the supervision of NASA’s Jet Propulsion Laboratory (JPL). Development and construction cost US$ 154 million, and the launch cost US$ 65 million. The spacecraft’s objectives were to orbit Mars, take pictures to map the planet’s surface and serve as a relay station for future missions. The Surveyor remained operational until late 2006, when NASA lost contact with it.

• The Pathfinder landed on Mars on July 4, 1997. This mission was directed by JPL with a budget under US$ 280 million. It deployed the Mars rover Sojourner, which analyzed rocks on the Martian surface and sent back pictures of the planet’s topography and sky. The mission was planned to last around one month but ended up lasting almost three months, ending when its batteries wore out.

• Deep Space 1 was also under the responsibility of JPL, with a cost of US$ 95 million. The goal of this mission was to test new technology, such as ion propulsion and onboard autonomous operations. The mission also included an asteroid flyby, which was a partial success. Most of the new technologies were proven in space, and this experience became available for future missions.


• The Mars Climate Orbiter was developed by Lockheed Martin Astronautics, and the project was under the responsibility of JPL. This robotic spacecraft, launched in December 1998, was supposed to reach Mars about nine months later to study the planet’s climate, atmosphere and surface and to act as a communication relay for the Mars Polar Lander mission. The total budget of the mission was US$ 125 million. However, the mission did not accomplish its objective: the spacecraft was destroyed during its orbital insertion, as discussed in more detail later in this report.

• The Mars Polar Lander and Deep Space 2 missions flew on the same spacecraft. The Polar Lander’s main purpose was to study the soil of Mars’ south pole, investigate the possibility of finding ice and study the climate. The Deep Space 2 mission comprised two identical impact probes, each the size of a basketball, which the Polar Lander would release during its Mars landing. The probes would penetrate one meter into the Martian soil to study its subsurface composition. Again, Lockheed Martin Astronautics was responsible for developing the spacecraft and probes, and JPL was responsible for project management. The budget for these missions was around US$ 222.6 million, including development and launch. The mission failed: the Polar Lander was apparently destroyed during landing, and it also failed to release the two Deep Space 2 probes.


Mars Climate Orbiter Mission

The overall scientific objectives of the Mars Climate Orbiter (MCO) were to monitor daily

weather and atmospheric conditions, record changes on the surface and look for evidence of

past climate changes on the red planet. In addition, it would also serve as a relay between the

Mars Polar Lander and the ground systems on our planet. The MCO was expected to operate for

5 years after a complete orbital insertion on Mars.

NASA launched the MCO spacecraft from Cape Canaveral, Florida, on December 11, 1998, aboard a Delta II Lite launch vehicle; the launch lasted 42 minutes in total. The launch sequence went through various stages, using different propellants at different altitudes to ensure an optimal escape from the Earth’s atmosphere. The orbits of Earth and Mars around the Sun open a launch window roughly every 26 months (2 years and 2 months), when the energy required for the trip is at a minimum. Expeditions to Mars are considered only during these launch windows.
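The 26-month spacing can be checked with the synodic period formula, 1/S = 1/T_Earth − 1/T_Mars. The following is a minimal sketch, assuming approximate published sidereal periods for both planets; the constant and function names are illustrative and are not taken from the mission documentation.

```python
# Approximate sidereal orbital periods in days (assumed reference values).
EARTH_PERIOD_DAYS = 365.256
MARS_PERIOD_DAYS = 686.980


def synodic_period_days(inner_period: float, outer_period: float) -> float:
    """Days between two successive repeats of the same planetary alignment."""
    return 1.0 / (1.0 / inner_period - 1.0 / outer_period)


window_days = synodic_period_days(EARTH_PERIOD_DAYS, MARS_PERIOD_DAYS)
window_months = window_days / (EARTH_PERIOD_DAYS / 12)

print(f"Launch window spacing: {window_days:.0f} days (~{window_months:.1f} months)")
```

With these inputs the spacing comes out at roughly 780 days, a little under 26 months, which is consistent with the figure quoted above.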

After the spacecraft left the Earth’s atmosphere, it began what is called a Hohmann transfer to leave Earth’s orbit and reach the orbit of Mars. The spacecraft’s trajectory was approximately 416 million miles long and lasted 9 months and 12 days, after which communication with the spacecraft was lost.
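For a sense of scale, the duration of an idealized transfer of this kind follows from Kepler’s third law: the transfer ellipse has a semi-major axis equal to the average of the two orbital radii, and the trip takes half of its period. The sketch below is a rough illustration only, assuming circular, coplanar orbits and a mean Mars distance of 1.524 AU; these are assumed reference values, not mission data.

```python
# Mean orbital radii in astronomical units (assumed reference values).
EARTH_ORBIT_AU = 1.0
MARS_ORBIT_AU = 1.524
DAYS_PER_YEAR = 365.25


def hohmann_transfer_days(r_start_au: float, r_end_au: float) -> float:
    """Flight time of an idealized transfer between two circular orbits."""
    semi_major_axis_au = (r_start_au + r_end_au) / 2
    # Kepler's third law around the Sun: period in years = a^(3/2), a in AU.
    ellipse_period_years = semi_major_axis_au ** 1.5
    # The spacecraft travels only half of the transfer ellipse.
    return ellipse_period_years / 2 * DAYS_PER_YEAR


transfer_days = hohmann_transfer_days(EARTH_ORBIT_AU, MARS_ORBIT_AU)
print(f"Idealized transfer time: {transfer_days:.0f} days")
```

The idealized result is close to 260 days, about eight and a half months. The 9 months and 12 days reported for the MCO are somewhat longer, since a real trajectory need not match the idealized case.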

During the first 4 months after the launch, because of problems in the ground navigation software, the navigation team could not communicate with the spacecraft and therefore could not model its trajectory. During that time, the navigation team had to rely on emails from Lockheed Martin Astronautics to model the spacecraft’s trajectory. After the communication issue was resolved, the navigation team noted that the ground navigation software was generating anomalous data, specifically the thruster performance data.

The navigation team did not use Incident/Surprise/Anomaly (ISA) reports to handle the issue. NASA had established this type of report to identify, solve and document issues arising during the operational phase of a mission. The navigation team’s lack of familiarity with the ISA process meant that the issue was never formally tracked and was therefore ignored.

Because of the spacecraft’s asymmetric design, solar pressure pushed slightly on the spacecraft’s solar panels, making it rotate around the pitch axis. A reaction wheel fitted inside the spacecraft canceled those small disturbances. However, reaction wheels have a limited speed, so they lose their effectiveness as momentum builds up.


This momentum buildup is brought back down to controllable levels using hydrazine thrusters. This process is an expected part of the spacecraft’s control loop. Trajectory Correction Maneuvers (TCMs) were planned ahead and executed to aim the spacecraft for the Mars Orbital Insertion (MOI).

However, because the ground navigation software logged thruster performance data in Imperial units instead of metric units, the effects of the trajectory correction events were underestimated by a factor of 4.45 (1 lbf·s ≈ 4.45 N·s). This meant that for each correction issued, the thrusters were fired 4.45 times longer than necessary. The need to correct the spacecraft’s trajectory became more and more frequent: about 10 to 14 trajectory corrections had to be issued to the spacecraft, whereas the average in other missions was about 2.

These errors were individually small, but they accumulated gradually.
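The factor of 4.45 is simply one pound-force expressed in newtons. A minimal sketch of the mismatch follows, assuming a hypothetical impulse value written by one program in lbf·s and read by another as if it were N·s; the variable names and the sample value are illustrative only.

```python
# One pound-force in newtons (exact by definition of the pound-force).
LBF_TO_NEWTON = 4.4482216152605


def lbf_s_to_n_s(impulse_lbf_s: float) -> float:
    """Convert an impulse from pound-force seconds to newton seconds."""
    return impulse_lbf_s * LBF_TO_NEWTON


# Hypothetical impulse from one thruster event, as written to the data file.
reported_lbf_s = 1.0

# The consumer assumes the number is already in N*s and uses it unchanged.
assumed_n_s = reported_lbf_s

# What the thruster event actually delivered, in N*s.
actual_n_s = lbf_s_to_n_s(reported_lbf_s)

underestimate_factor = actual_n_s / assumed_n_s
print(f"Effect underestimated by a factor of {underestimate_factor:.2f}")
```

The ratio does not depend on the sample value chosen: any number passed through this interface without conversion is off by the same factor.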

Because of the erroneous data it was working with, the navigation team could not quantify the uncertainty in its trajectory calculations. As the spacecraft approached Mars, the navigation team executed TCM-4 to prepare the spacecraft for orbital insertion a week later. The underlying error was not the unit mix-up itself, but NASA’s failure to detect and resolve it.

Using more information to model the spacecraft trajectory, calculations performed by the

navigation team indicated that the spacecraft was just barely 18 miles over the survivable

altitude. TCM-5, a planned emergency trajectory correction maneuver to steer the spacecraft

away from Mars was discussed but not implemented because the risk was not fully understood

by the navigation team and could also put the Mars Polar Lander mission at risk.

Just before the Mars Orbital Insertion took place, the navigation team lost contact with the spacecraft, as expected, when it passed behind Mars. The team was unable to re-establish communication when the spacecraft was supposed to reappear from the occultation. Later, corrected trajectory simulations placed the spacecraft 35 miles above the surface of Mars, well below the 50-mile critical altitude. It was then understood that the spacecraft had deteriorated and ultimately disintegrated while entering Mars’ atmosphere.


Human Resource Management

Project Human Resource Management is the knowledge area responsible for the selection, training, evaluation and rewarding of employees. It includes the processes that organize, manage, and lead the project team.

The project team comprises the people with assigned roles and responsibilities for completing the different phases of the project. The project team members may also be referred to as the project’s staff.

An overview of the four processes involved in Human Resource Management is shown in the

next graphic.

Comparing the basic concepts of this knowledge area with how the MCO project was managed reveals negligence in several situations. Some important points of the first two processes (from left to right in the graphic) are discussed below.

Develop Human Resource Plan

• The PM is responsible for making sure the team members have appropriate training for the tasks they will be performing.

• The PM must clearly identify the roles and responsibilities of the team.

Acquire project team

• The PM must evaluate the risk of resources becoming unavailable.


MCO’s Human Resource Management

Next, we provide some examples of how Human Resource Management responsibilities such as “Training”, “Staffing” and “Roles definition” were not adequately addressed. Together with other issues analyzed later under other knowledge areas, these are the basic reasons for the MCO failure.

Training

Personnel training is a critical consideration that no project manager should underestimate. A team without proper training can be compared to a time bomb: everything goes fine until time runs out and all the project objectives collapse. Something similar happened with the MCO mission, where the operations navigation team did not have adequate training on the MCO spacecraft’s design and operations. The main issues in which the lack of training was critical for the project are summarized below:

• The team did not recognize the purpose and use of the Incident-Surprise-Anomaly (ISA) procedure.

The ISA is a formal problem resolution process used to address any incident, surprise or anomaly during the project. Because the team had not been trained in the use of this procedure, it resorted to informal means of communication when problems arose. This severely reduced the effectiveness of communication.

• The small forces software development team needed additional training in the ground software development process.

Understanding this software was critical, since it was responsible for determining the spacecraft’s trajectory. Because of the team’s lack of training, an end-to-end test of the software was not performed. Such a test would have prevented the MCO’s trajectory failure.

• The team was inadequately trained in following the Mission Operations Software Interface Specification (SIS).

• The team lacked deep knowledge of the spacecraft’s attitude operations. As a result, the MCO attitude control system and related subsystem parameters were not fully understood, even though so many errors were being generated for those systems.


• Trajectory Correction Maneuver 5 (TCM-5) was a contingency plan to execute a trajectory correction if needed. It could have been used to raise the MCO to a safe altitude, but the team had never received practical training on it. No investigation, experiment or procedure to commit to a TCM in the event of a safety issue was completed or even attempted. Therefore, when the maneuver became urgent, the team was not ready for it.

Staffing

The FBC approach reduced personnel in a way that significantly affected the project. For example, the operations navigation team was not staffed well enough to deal effectively with all the issues that arose. Because of the lack of staff, the Mars Surveyor Operations Project (MSOP) was running 3 missions simultaneously. When small problems began to appear in the MCO, the team could not give them its full attention.

The MSOP had no systems engineering and no mission assurance personnel. This was critical for the mission. The presence of a mission assurance manager, for example, would have helped to improve project communication. It would also have helped to ensure that specifications and standards, such as the AMD file requirements or the ISA resolutions, were being followed correctly.

More generally, the success of space missions requires the full involvement of the mission science personnel in the management process. Science personnel with relevant expertise are an important resource throughout the progress of a project and should be included in all decisions where expert knowledge is required. Such experts were not present in the decisions prior to MCO’s Mars orbit insertion.

Roles and responsibilities

Roles and responsibilities of part of the team were unclear. A recurring question was “Who’s in charge?” Another was “Who’s the mission manager?” The people attempting to answer these questions showed hesitancy and wavering. The clearest example is the Flight Operations Manager (FOM). Because there was no mission assurance manager, the FOM became the improvised mission manager, which was not part of his designated responsibility.


Finally, another problem observed in the MCO mission was that the role of systems engineering was not clearly defined. This left the navigation team without an understanding of essential spacecraft design characteristics. Systems engineering support would have improved the operations navigation team’s ability to reach critical decisions and would have provided oversight in navigation mission assurance. The role of these engineers was so critical that it was the responsibility of the systems engineering organization to identify the units problem that led to the loss of the MCO.

Lessons Learned

• The team should be provided with proper training and detailed information regarding systems that may have a high impact on the success of the project.

• The project manager should identify or provide backup personnel that could be available

to serve in some of the critical positions when needed.

• The human resource department should make sure that the staff has clear and well

defined roles and responsibilities.


Project Communications Management

The key words of this knowledge area are handling information. According to the PMBOK this

area includes the processes required to ensure timely and appropriate generation, collection,

distribution, storage, retrieval, and ultimate disposition of project information. The processes

involved in this knowledge area are:

-Identify Stakeholders

-Plan Communications

-Distribute Information

-Manage Stakeholder Expectations

-Report Performance

In analyzing the MCO project management information, we identified two of these processes that were not used correctly:

Report Performance

This process focuses on collecting performance information, examining it, comparing it with the baselines and sending it to the defined stakeholders. It also focuses on the reliability and accuracy of the reported information, in order to forecast which preventive actions will be needed later in the project.

Distribute Information

This process focuses on distributing information to the stakeholders, each of whom requires a different type of information, in a specific format and at a specific moment of the project. It also ensures efficient and effective communication among all parties involved in the project.


MCO’s Communication Management

These processes were not applied properly throughout the development of the MCO project, specifically in the following situations:

FBC Implementation

Based on different NASA reports, we identified that the failure of this project began with the implementation of the FBC philosophy: NASA Headquarters did not enact a formal definition of FBC, which resulted in different interpretations by project managers of what prudent risk was.

Because the FBC approach was never formalized into written policies or guidance, its communication to the project managers and contractors was ineffective.

Interfaces and Relationships

Among the interfaces and relationships reviewed, two significant areas of concern, and the communication problems within them, were identified: “The interface between NASA Headquarters and JPL” and “The interface between JPL and Lockheed Martin.”

NASA-JPL Relationship

This interface was highly ineffective from the communications point of view for two reasons:

First, the two parties interpreted the initial information differently. Figure 5 shows the intended versus perceived communications.


Figure 5. Communication NASA-JPL

For the MCO project, NASA defined and supplied the program objectives, requirements and constraints to JPL. JPL management interpreted these terms as mandates and deduced that no cost increase was allowed, even when one was necessary to mitigate some risks.

JPL responded with a supportive attitude toward the program in order to present a positive image, rather than with a rigorous risk assessment raising appropriate concerns. As a result, NASA understood that JPL was in agreement with the objectives, requirements, and constraints for the MCO project.

Second, the lack of a single MCO project interface at NASA responsible for all requirements, including those from other NASA organizations, resulted in multiple inputs to the JPL MCO project that were in some instances conflicting and in general aggravated the communication problem.

Jet Propulsion Laboratory – Lockheed Martin Relationship

The relationship between JPL and LMA was effective during the development of the MCO project, but it became ineffective when it came to communicating to NASA senior management the risks that had not been formally identified.

Figure 6. Communication JPL-LMA


Communication Barriers Between Project Elements

Communication barriers between project teams were the main cause of the MCO mission failure, because each team worked independently and with little cross-communication.

In the MCO project, there is evidence of poor communication among project management, the operations navigation team and the spacecraft team. For example:

The operations navigation team discussed its trajectory concerns internally, but did not communicate them effectively to the spacecraft operations team or to project management.

When conflicts in the data were discovered, the team relied on e-mail to track and solve the problem instead of formal problem resolution processes such as the Incident, Surprise, and Anomaly (ISA) reporting procedure.

Figure 7. Communication barriers between project teams

Lessons Learned

• Senior management must be receptive to communications of problems and risks.

• A dedicated single interface at NASA Headquarters for the Mars Program is essential.

• Contractor responsibilities must include formal notification to the customer of project

risk and deviations.


• Increase the amount of formal and informal face-to-face communications with all team

elements and especially for those elements that have critical interfaces.

Quality Management

A quality management plan should contain relevant existing quality practices, standards and

requirements not only for project deliverables but for the management of the project as well.

Processes and procedures should be defined on how to conduct and improve quality. Quality

should be measurable, which means that the quality management plan should clearly identify

what, when and how to measure. Quality should be controllable, therefore, metrics should be

compared against defined thresholds (control limits) and corrections performed as needed.

Under the FBC philosophy, great emphasis was placed on reducing cost and schedule despite an already demanding scope. This philosophy, albeit partially implemented, inadvertently neglected quality and its importance in risk mitigation. As such, several missions under this philosophy failed when pushing the boundaries of FBC (Samovinski & Judd & Richards & Bauer & Cipolla & Purcarey, 2001). The MCO’s project management team failed to adequately perform all quality management processes as defined by the PMI in the PMBOK: plan quality, perform quality assurance and perform quality control.

However, in this report, we concentrate only on perform quality assurance and perform quality control.

Perform Quality Assurance

The Verification, Validation & Accreditation (VV&A) processes should have determined early on

that the navigation software was not to be used because the implemented model and

associated data (thruster performance) did not conform to specifications, therefore it did not

represent the “real world”. The VV&A processes were not thoroughly performed against the

navigation software of the MCO spacecraft and ground system. In consequence, they failed to

catch a discrepancy that caused them to incorrectly model the spacecraft trajectory.

Even though the navigation software was developed by Lockheed Martin Astronautics (LMA), which introduced the software bug, NASA should have caught the mistake through its internal VV&A.

Although there was a document containing the specifications of interoperability between systems, namely the Mars Surveyor Operations Project Software Interface Specification (SIS), it was neither followed by the software programmers nor used by the software testing team (Lilley, 2009).

Within LMA, the faulty code came from a reused software package that was not mission-critical

in the past and not bound to the SIS. At some point, this software package was promoted to be

part of the navigation software of the MCO (which made it critical) and a formal code review

process was once again not performed.

This faulty code later became the root cause of the mission failure: the use of Imperial units in the ground navigation software. The SIS clearly specified the use of metric units and the data format. NASA used (and continues to use) the International System of Units (SI) throughout the agency because of the Metric Conversion Act (MCA) of 1975. However, LMA has long used the Imperial system for aircraft, since the MCA only affects U.S. Government programs.
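One way to make an interface specification of this kind harder to violate silently is to carry the unit alongside every value and to reject anything the interface does not allow. The sketch below is purely illustrative: the `Impulse` type, the unit labels and `check_interface_record` are hypothetical names, not part of the SIS or of any NASA or LMA software.

```python
from dataclasses import dataclass

LBF_TO_NEWTON = 4.4482216152605  # one pound-force in newtons

# Conversion factors to newton seconds for the units this sketch knows about.
TO_NEWTON_SECONDS = {"N*s": 1.0, "lbf*s": LBF_TO_NEWTON}


@dataclass(frozen=True)
class Impulse:
    """An impulse value that always carries its unit (hypothetical type)."""
    value: float
    unit: str

    def to_newton_seconds(self) -> float:
        if self.unit not in TO_NEWTON_SECONDS:
            raise ValueError(f"unsupported impulse unit: {self.unit!r}")
        return self.value * TO_NEWTON_SECONDS[self.unit]


def check_interface_record(record: Impulse, required_unit: str = "N*s") -> Impulse:
    """Reject a record whose unit differs from the one the interface requires."""
    if record.unit != required_unit:
        raise ValueError(f"interface requires {required_unit!r}, got {record.unit!r}")
    return record


metric = check_interface_record(Impulse(4.45, "N*s"))  # accepted as-is
imperial = Impulse(1.0, "lbf*s")
try:
    check_interface_record(imperial)  # rejected instead of silently misread
except ValueError as error:
    print(f"Rejected: {error}")
print(f"Converted explicitly: {imperial.to_newton_seconds():.2f} N*s")
```

The design choice here is that a mismatch fails loudly at the interface boundary, so a conversion has to be requested explicitly rather than assumed.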

Early on when the spacecraft was already en route to Mars, the navigation team noticed

anomalous data in the files generated by the ground navigation software. They knew something

was wrong but they didn’t know what exactly. Increasingly frequent anomalous events used to

correct the spacecraft rotation were noted by the navigation team. However, they only

discussed these events informally. Due to the odd shape of the spacecraft, the navigation team

incorrectly believed that these events were to be expected.

Perform Quality Control

The anomalous events that occurred were not tracked in the multi-mission, institutional defect database for mission operations that was in use during the lifetime of the MCO mission. An Incident/Surprise/Anomaly (ISA) report documenting the operational issue should have been submitted to the global issue-tracking database. Had this been done, significant effort would have been devoted to solving the problem. Instead, the issue “fell through the cracks”. NASA had a separate database for defects found during the development phase of a mission.

No evidence was found that the rigorous and extensive validation of the software interfaces specified by the VV&A processes had taken place. Had it been done, the navigation team would have had a chance to determine to what degree the model and its associated data failed to represent the real world. To make matters worse, the engineers of the navigation team did not fully understand the MCO’s navigation system and were not qualified to operate it. The physical implication of thrusters firing over four times longer than required was that the spacecraft drifted off course.


Control limits for how often trajectory correction events may occur in a given timeframe were not defined during planning, which led to the team not realizing that a serious problem was developing. Control limits for thruster performance data were not defined either, so the team could not know that the data was incorrect by a factor of 4.45.
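As a minimal sketch of what such control limits could look like, the Python snippet below converts an impulse from pound-force seconds to newton-seconds and checks values against a predefined band. The nominal impulse, the tolerance, the weekly event limit and all function names are illustrative assumptions, not figures from the MCO project; only the conversion factor (1 lbf = 4.4482216152605 N) is a fixed constant.

```python
LBF_TO_N = 4.4482216152605  # newtons per pound-force (exact by definition)


def lbf_s_to_n_s(impulse_lbf_s):
    """Convert an impulse from pound-force seconds to newton-seconds."""
    return impulse_lbf_s * LBF_TO_N


def within_limits(value, nominal, tolerance):
    """Return True when value lies within nominal * (1 +/- tolerance)."""
    return abs(value - nominal) <= tolerance * abs(nominal)


# Hypothetical control limits that would be agreed on during planning.
NOMINAL_IMPULSE_N_S = 4.45   # expected impulse per event, in N*s (assumed)
TOLERANCE = 0.20             # accept +/- 20 % around the nominal value (assumed)
MAX_EVENTS_PER_WEEK = 7      # assumed upper limit on correction events

# A value written to the data file in lbf*s but read as if it were N*s.
file_value = 1.0

raw_ok = within_limits(file_value, NOMINAL_IMPULSE_N_S, TOLERANCE)
converted_ok = within_limits(lbf_s_to_n_s(file_value), NOMINAL_IMPULSE_N_S, TOLERANCE)

events_this_week = 12        # hypothetical observed count
too_frequent = events_this_week > MAX_EVENTS_PER_WEEK

print("raw value within limits:      ", raw_ok)
print("converted value within limits:", converted_ok)
print("event frequency out of limits:", too_frequent)
```

With limits like these in place, a value that is low by a factor of about 4.45 fails the check as soon as it is read, instead of relying on someone noticing the discrepancy informally.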

Lessons Learned

• Following the Verification, Validation & Accreditation (VV&A) processes is crucial for mission success. These quality audits should be conducted thoroughly, throughout the mission’s lifetime, by qualified independent reviewers who are experts in the subject matter.

• A comprehensive test verification matrix for the whole mission should be defined early on. This matrix should contain all mission requirements down to the subsystem level. The project management team should make sure that everybody on the team uses the matrix.

• Continuous reviews of all the mission’s integrated systems should be conducted by the project team, and attendance at these reviews should be tracked.

• A unified global issue-tracking database that covers all phases of a mission (development, operations, etc.) should be implemented. Frequent or common issues from previous missions should be analyzed in an effort to improve quality.

• Metrication (adoption of the International System of Units) should be carried out across all NASA departments as well as in all deliverables procured from contractors.

Risk Management

To analyze the extent to which Risk Management (RM) was implemented in the MCO mission, we start with a summary of the most important RM concepts and tools, so that the impact of the RM process, or the lack of it, on the MCO mission becomes more evident.

RM deals with the uncertainties, positive or negative, that a project will face through its life cycle. This means that RM should start with the initiation process and finish when the project is closed. High-level risks are first assessed in the project charter, but most of the RM work is done during the planning process group. This does not mean that the risk effort ends when planning does: RM is an iterative process with a monitoring and control aspect that is carried out after planning and through the closing of the project. This systematic approach to risk assessment aims to identify events that could impact, positively or negatively, the project’s schedule, cost, quality, customer satisfaction, stakeholders’ interests, scope, etc.

When RM is fully integrated in a project, the project can be executed without huge fires to put out every day, because they have already been eliminated by a risk response plan; risks are brought up in every meeting so they can be addressed before they happen; and, if risk events do occur, there is a plan in place to deal with them, so no chaotic meetings are needed to develop a response.

To better understand RM, we need to know the processes that comprise it. These processes follow a logical sequence, but RM as a whole is highly iterative. The RM processes are:

− Plan Risk Management

− Identify Risks

− Perform Qualitative Risk Analysis

− Perform Quantitative Risk Analysis

− Plan Risk Responses

− Monitor and Control Risks

Plan Risk Management (PRM) answers the question of how to approach risk management given the complexity of the project. The project manager, sponsor, team, customer, experts and other stakeholders may therefore be involved in this process. Specifically, PRM details the methodology, roles and responsibilities, budget, timing, risk categories, definitions of probability and impact, stakeholders’ tolerance, reporting format and tracking.

Once we know how RM will be done, the next step is to Identify Risks (IR). This step is done mainly during project initiation and planning, but it should continue until project closing. Many tools and techniques are associated with this step, such as reviews of past documentation, brainstorming, interviewing, root cause analysis, assumption analysis, and strengths, weaknesses, opportunities and threats (SWOT) analysis. The result of this step is a document called the Risk Register, which includes the list of risks, potential responses, root causes of risks and updated risk categories.

With the risks identified, we Perform Qualitative Risk Analysis (PQLRA), in which we filter the risks according to their probability of occurrence and potential impact. One of its tools is the probability and impact matrix, which helps separate the risks that require more analysis from those that can go to a watch-list. After this step the Risk Register is updated with a risk ranking, probabilities and impacts, a list of risks for additional analysis, a risk watch-list and trends.
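The probability and impact matrix described above can be sketched in a few lines of Python. The 1-to-5 scales, the score threshold and the example risks are illustrative assumptions, not values taken from the MCO project or from any NASA standard.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int       # assumed scale: 1 (negligible) to 5 (severe)

    @property
    def score(self):
        """Probability times impact: the value of one matrix cell."""
        return self.probability * self.impact


def classify(risks, threshold=12):
    """Rank risks by score and split them into (further analysis, watch-list)."""
    ranked = sorted(risks, key=lambda r: r.score, reverse=True)
    analyze = [r for r in ranked if r.score >= threshold]
    watch = [r for r in ranked if r.score < threshold]
    return analyze, watch


# Hypothetical risk register entries, for illustration only.
register = [
    Risk("Unit mismatch at a software interface", probability=3, impact=5),
    Risk("Minor delay in a status report", probability=4, impact=1),
    Risk("Key navigator unavailable", probability=2, impact=4),
]

analyze, watch = classify(register)
for risk in analyze:
    print("analyze:", risk.name, risk.score)
for risk in watch:
    print("watch:  ", risk.name, risk.score)
```

The threshold is the design choice that matters here: it encodes how much risk the stakeholders tolerate before a risk is promoted from the watch-list to further analysis.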

Using the list of risks that require more analysis, we then Perform Quantitative Risk Analysis (PQTRA), although it is important that the project manager weighs the costs against the benefits of doing this analysis.

Plan Risk Responses (PRR) deals with the question of what to do with the top risks previously identified and quantified. Its goal is to eliminate or reduce threats and to find ways to promote opportunities. Threats that cannot be eliminated are handled with a contingency plan, or with a fallback plan if the contingency fails. In this step the work of dealing with risks is assigned to owners, who are responsible for implementing the contingency and fallback responses. There are several strategies to mitigate threats and exploit opportunities. The result of this step is an update of the project management plan, project documents and risk register. The updates to the risk register include a list of residual risks, contingency plans, risk response owners, risk triggers, fallback plans, reserves, etc.

RM does not end with PRR, however: we also need to Monitor and Control Risks (MCR). Here the project manager makes sure that everything planned to eliminate or reduce uncertainty is being implemented, and takes into account any new risks that have been identified. The project manager therefore has to continuously reassess risks, be prepared for a risk audit, check the level of the project’s reserves, organize status meetings and close risks that no longer apply. The result of this step is an update of the risk register, change requests, project management plan updates, project document updates and organizational process asset updates.

MCO’s Risk Management

To understand the level of RM implemented in the MCO mission, we referred to the Mars Climate Orbiter Mishap Investigation Board Phase I Report (Nov 10, 1999), the Report on Project Management in NASA by the Mars Climate Orbiter Mishap Investigation Board (Mar 13, 2000) and the Mars Program Independent Assessment Team Summary Report (Mar 14, 2000). In these reports we could not find any evidence that proper RM was implemented or that there was a formal risk management plan for the mission. We reach this conclusion because all three documents mention the need to implement a risk assessment and management process in future missions.

In complex endeavors such as space exploration, risk is a natural factor that should always be considered. The identification and management of risks is one of the main tasks of the project manager, and how well it is done frequently determines the outcome of a project. As we discussed earlier, FBC promoted doing more with less and taking risks that were justified. However, NASA administration failed to adequately define FBC and the policies and procedures that would guide its application in all projects, especially with respect to what constitutes prudent risk. This resulted in different interpretations of what FBC means when dealing with the constraints of schedule, cost, scope and risk.

The reports found that the project managers of the MCO mission put more emphasis on cost and schedule without paying attention to project risks. The next graph shows the balance that the reports propose when dealing with FBC and project constraints.

Scope should not be increased, nor cost and schedule reduced, beyond the point at which risk grows rapidly.

Evidence that the MCO project managers put more emphasis on cost control than on risk, while keeping a very ambitious scope, is clear when we compare the cost and scope of the MCO ($125 million) with those of the Mars Global Surveyor ($250 million). We could therefore say that MCO was underfunded, which resulted in testing and analysis shortfalls that significantly increased risk and reduced the probability of success.

Another example of the lack of a proper RM plan for MCO, and of the large impact this had on the failure of the mission, is found in the hesitation to perform Trajectory Correction Maneuver 5 (TCM-5). The navigation operations team knew that the spacecraft was off course and had the opportunity to apply TCM-5 as a contingency action, but there was no procedure for it and the team was not prepared to perform TCM-5 in an emergency.

In addition, the reports found that the systems engineering team, which performs critical studies to improve the mission in terms of scope, cost and schedule and helps the project manager identify and mitigate risks, did not adequately carry out its responsibilities: (i) it failed to identify what an acceptable risk level for the mission was; (ii) it did not perform any analysis, such as fault tree analysis, that could have helped identify risks in the MCO mission; (iii) it poorly documented the critical elements of the mission; and (iv) it had no contingency plan for TCM-5 and other possible critical situations.

All this leads us to conclude that the deficiency or absence of a good RM plan denied the mission a chance of success in the event of any significant threat, and that the error in the trajectory calculation was a threat that could have been overcome with a good contingency plan.

Lessons Learned

The reports place great stress on the importance of implementing sound RM plans in all NASA missions. The RM lessons drawn from the MCO failure include:

• Risks associated with any deviation from recognized project management principles should be explicitly assessed and formally accepted.

• Risks must be assessed and accepted by all accountable parties, including senior management and program and project managers.

• Risk must be assessed and controlled throughout the life cycle of the project and should be considered as important as cost, schedule and scope; it should therefore be treated as a fourth dimension of project management.

• Progress in implementing the risk mitigation plan should be reported periodically to the project and program managers.

• A clear and detailed definition of what constitutes acceptable risk should be communicated to all team members.

• Qualitative and/or quantitative risk analysis tools should be used in all missions to determine the probability and impact of all identified risks.

• Every risk that cannot be eliminated with a mitigation plan should have a risk owner, who is responsible for managing that risk.

• At all meetings and project reviews, the status of the risk mitigation plans should be reported and reassessed.

• The concepts of earned value management should be applied to RM, allowing the project manager to know the progress of the plan at any moment during the life of the project.
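One possible reading of the last lesson is to track the risk mitigation plan with the standard earned value indicators, SPI = EV / PV and CPI = EV / AC. The sketch below assumes that the mitigation plan is broken down into budgeted tasks; the task figures and the function name are hypothetical and do not come from the MCO reports.

```python
def earned_value_indices(planned_value, earned_value, actual_cost):
    """Return (SPI, CPI) using the standard earned value formulas."""
    spi = earned_value / planned_value  # schedule performance index
    cpi = earned_value / actual_cost    # cost performance index
    return spi, cpi


# Hypothetical mitigation tasks:
# (budget, fraction planned to be done by now, fraction actually done, cost so far)
tasks = [
    (100.0, 1.0, 1.0, 110.0),
    (200.0, 0.5, 0.25, 60.0),
]

pv = sum(budget * planned for budget, planned, _, _ in tasks)  # planned value
ev = sum(budget * done for budget, _, done, _ in tasks)        # earned value
ac = sum(cost for _, _, _, cost in tasks)                      # actual cost

spi, cpi = earned_value_indices(pv, ev, ac)
print(f"SPI = {spi:.2f}, CPI = {cpi:.2f}")
```

An SPI or CPI below 1 indicates that the mitigation work is behind schedule or over budget, which is the kind of signal that can be reported at every status meeting.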

Summary

Throughout our analysis of the MCO mission failure, we identified a series of technical and project management errors and deviations from established procedures. We conclude, however, that this catastrophe had one root technical cause and one root management cause. The technical cause is the difference in the unit systems used by two pieces of software responsible for calculating the trajectory and the trajectory correction forces commanded to the spacecraft. The root management cause, on the other hand, is the poor implementation of the FBC philosophy, which gave rise to more missions, more ambitious scopes, reduced costs, reduced time and more risks. In principle, FBC should have resulted in more successful missions, but it ended up putting more missions in the hands of capable, though inexperienced, project managers who interpreted FBC’s values in different ways.

The general recommendations that we can draw from the MCO failure, and that should be considered in any project, are the following:

• Senior management should make sure that new strategies are well understood at all levels of the organization and incorporated in detail in all the relevant documents and procedures.

• Personnel availability and training in any project are critical to the success of the venture.

• No change in scope, cost or schedule should lead the project manager to set aside sound project management principles.

• Effective communication improves the integration among team members, stakeholders and senior management; its absence can have a negative impact on the project.

• All processes and deliverables should be continuously assessed and controlled to ensure that they meet the requirements.

• Every project manager should determine how deeply to apply the Risk Management process, but since risks are present in all projects, applying it should not be optional.

• Risk management should be considered a fourth dimension of the project management endeavor, with the same relevance as scope, cost and schedule.

References

(Nov 10, 1999). Mars Climate Orbiter Mishap Investigation Board Phase I Report.

(Mar 13, 2000). Report on Project Management in NASA by the Mars Climate Orbiter Mishap Investigation Board.

(Mar 14, 2000). Mars Program Independent Assessment Team Summary Report.

Samoviski, D. J., Judd, E., Richards, J., Bauer, E., Cipolla, N., & Purcarey, I. (Mar 13, 2001). Faster, Better, Cheaper: Policy, Strategic Planning, and Human Resource Alignment [Audit report]. Washington, DC: NASA Headquarters.

Edralin, D.M. (2004). Training: A strategic HRM function. Notes on business education, 7 (4), 1-2.

Lilley, S. (Aug 2009). Lost In Translation. System Failure Cases, 3(05) 4.

(June 2011). [Personal communication between Larry O’Brien and Peter Norvig]. Retrieved November 20, 2012 from http://skeptics.stackexchange.com/questions/7276/was-nasas-mars-climate-orbiter-lost-because-engineering-teams-used-different-me

Williams, D.R. (Jan 2005). A Crewed Mission to Mars. Retrieved November 20, 2012 from http://nssdc.gsfc.nasa.gov/planetary/mars/marslaun.html

Grayzeck, E. (May 2012). NASA NSSDC Mars Climate Orbiter Spacecraft-Details. Retrieved November 21, 2012 from http://nssdc.gsfc.nasa.gov/nmc/spacecraftDisplay.do?id=1998-073A

Graham, R. (Nov 2008). FAQ on Failures, Part Two: Mars Climate Orbiter Failure. Retrieved November 21, 2012 from http://www.designnotes.com/companion/failures/MCOcraft.html

Department of Defense. (Sep 2006). Retrieved November 26, 2012 from http://vva.msco.mil/Key/key.htm/

Moore, M. (May 2010). NASA-International System of Units - The Metric Measurement System. Retrieved November 21, 2012 from http://www.nasa.gov/offices/oce/functions/standards/isu.html