Managing Regression Test Program With Orthogonal Defect Classification ODC

7th Annual International Software Testing Conference in India, 2007

Author: C. Rangarajan ([email protected]), Motorola India Private Ltd, Bagmane Tech Park, C V Raman Nagar, Bangalore – 560093



Abstract

Regression testing involves selectively re-testing a system or component to verify that modifications have not caused unintended effects and that the system or component still complies with its specified requirements. It is a necessary but expensive maintenance activity aimed at ensuring that functionality has not been adversely affected by changes. Regression test selection techniques reuse tests from an existing test suite to test a modified program.

One of the challenging issues in regression testing is the test selection problem. Although existing research has addressed and solved some of these problems, the proposed techniques are code-based. Code-based regression test selection works well for unit testing but has a scalability problem: as the size of the system under test grows, it becomes harder to manage test information and to maintain the corresponding traceability matrices.

This paper addresses the test selection problem by applying the Orthogonal Defect Classification (ODC) technique to the existing regression test cases. The information contained in defect data, both from the field and from in-phase testing, is analyzed, and technical recommendations are made to improve the existing regression test program.

The methods explained in this paper cover improvements in the following areas of the test selection problem:

• Test selection and addition
• Test minimization
• Test prioritization
• Test configuration

The benefits of this method can be realized in complex system test environments such as CDMA or GSM systems, or in most product tests in the telecom domain where multiple network elements are integrated together.

1. Introduction

In today’s changing business environment, time to market is a key factor in achieving project success. For a project to be successful, quality must be maximized while minimizing cost and keeping delivery time short. Quality can be measured by the number of defects that escape after deployment of the product.

In order to minimize the number of escaped defects, a substantial amount of testing is needed on the vendor side before releasing the product to the customer.

Software testing is a strenuous and expensive process [4], [5]. Research has shown that testing activities account for at least 50% of the total software cost [6], [7]. Companies are often faced with a lack of time and resources, which limits their ability to complete testing efforts effectively. As a product goes through multiple releases, more new features are introduced and the system becomes more complex. The challenge then is to ensure that existing features that are running successfully at the customer site are not disturbed. This depends primarily on the effectiveness of the regression testing.

Regression testing is a testing process applied after a program is modified. It involves testing the modified program with some test cases in order to re-establish confidence that the program will perform according to the (possibly modified) specification. Regression testing is a major component of the maintenance phase, where the software system may be corrected, adapted to a new environment, or enhanced to improve its performance. Modifying a program involves creating new logic to correct an error or implement a change, and incorporating that logic into an existing program. The new logic may involve minor modifications, such as adding, deleting, or rewriting a few lines of code, or major modifications, such as adding, deleting, or replacing one or more modules or subsystems. Regression testing aims to check the correctness of the new logic, to ensure the continued working of the unmodified portions of the program, and to validate that the modified program as a whole functions correctly.


1.1 Regression Test Selection Problem

The initial phase of regression testing is selecting the test cases for execution. Software engineers save test cases developed for a prior version/release in a test repository and re-run them as regression tests in later versions. Running the entire set of test cases on a revised version/release is a safe technique, but it can be cost- and time-prohibitive.

Several parameters need to be considered while selecting a regression test suite: the cost of quality, the effectiveness of the suite, its defect detection capability, the priority of test cases, past learning, and assurance.

Cost of quality – the cost incurred in designing and executing test cases to ensure the quality of the product. This always needs to be minimal for any product to be successful.

Defect detection capability – the ability of the suite to reveal the faults in the modified version/release.

Test suite effectiveness – the test suite should not contain ineffective test cases that are not relevant to the new changes, and test cases that have no capability to reveal faults should not be considered.

Test priority – the ability to find the severe failures at an earlier stage of system test.

Past learning – the test suite should incorporate the lessons learned from the customer, and the suite should not let the same pattern of defects escape to the customer again.

Above all, the test suite should be comprehensive enough to provide assurance of the quality of the given product. The test selection problem comprises all of the above parameters.

Although many techniques are available in the industry for selecting test cases, most rely on the code base, structural coverage criteria, flow graphs, and so on. All of these are code-based techniques, and they are well suited for unit testing. However, code-based techniques have a scalability problem: as the complexity of the product increases, mapping test cases to the code flow becomes mind-numbing.

In a complex system test environment such as CDMA or GSM, with many components and multiple past releases, the number of test cases that has evolved over time becomes massive. A good selection technique is needed to identify the test cases best suited for regression. Also, most system test cases have no correlation with the code flow or any other relation to the code base, so we need non-code-based techniques.

Our approach is not code-based; rather, it uses Orthogonal Defect Classification [1] to identify the test cases for system testing. Defects found in a given product across all components are classified with respect to ODC triggers, and weights are determined for all components, triggers, and features. This helps in determining the final weight of each test case.

1.2 ODC based Test Selection Approach

There have been several papers written on ODC that describe the concepts, define the ODC scheme, discuss the relationships among the ODC attributes, and include results from studies [1], [2]. For this reason, we include only a short description of ODC in this paper. The ODC methodology provides both a classification scheme for software defects and a set of concepts that guide the analysis of the classified aggregate defect data. “Orthogonal” refers to the non-redundant nature of the information captured by the defect attributes and their values. Much like Cartesian coordinates in geometry, nearly a decade of research has shown that these attributes (and their values) are adequate to “span” the interesting part of the defect information space for most software development issues.


Orthogonal Defect Classification (ODC) provides the framework for analyzing defects to address many questions pertaining to ongoing maintenance. The definitions, attributes, and values defined within ODC are included in the tables of Appendix 1.

For an in-depth understanding of ODC, please refer to [1], [2]. The scope of this document is limited to using the ODC triggers to arrive at test case selection.

To the best of our knowledge, the technical papers available in the testing industry do discuss test process improvements based on ODC, but no other paper has given a mathematical definition to solve the test selection problem using ODC.

2. Method

The steps for using classified data to solve test selection are straightforward:

1. Collect ODC-classified historical defect data, from both customer defects and in-phase test defects.

2. Collect ODC-classified test case data.

3. Correlate the ODC data for both test cases and defects.

4. Use the evaluation techniques to arrive at the test selection.

Output: the output sorts the test cases based on the component coverage of a given trigger, which gives the test case prioritization. It also indicates the number of test cases to add to and delete from the given suite.
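The four steps above operate on ODC-classified defect records. As a minimal sketch (the record values below are illustrative, not the paper's data set), such records can be tallied per component and per (component, trigger) pair, which is the raw input for the weights derived in the following sections:

```python
from collections import Counter
from typing import NamedTuple

class Defect(NamedTuple):
    component: str   # e.g. "BTS" (names are illustrative)
    trigger: str     # ODC trigger, e.g. "startup/restart"
    feature: str     # feature the defect was raised against

# Hypothetical sample of ODC-classified historical defects
# (field + in-phase); real data would come from the defect tracker.
defects = [
    Defect("BTS", "startup/restart", "F1"),
    Defect("BTS", "workload/stress", "F2"),
    Defect("Transcoder", "startup/restart", "F1"),
]

per_component = Counter(d.component for d in defects)
per_trigger = Counter((d.component, d.trigger) for d in defects)

print(per_component["BTS"])                     # 2
print(per_trigger[("BTS", "startup/restart")])  # 1
```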

The product we consider for analysis is a complex one with multiple components, recognized as a high-quality product that provides industrial-strength solutions for mission-critical applications. However, a few defects still escape to the field and need to be fixed under service. The cost of these defects to the customers arises both from the impact of the defect and from the expense of applying service to the products.

2.1 ODC Classified historical test defect data

What is historical test defect data? For a complex product that has gone through multiple releases, many defects will have been found during in-phase testing and by the customer. The cumulative set of all these defects is the historical test defect data. This data gives a maturity assessment of the product and can be used to optimize test selection.

Assume that a product’s (n-1)th release has gone to the customer and we now need to do test selection for the nth release. We need to consider all the defects up to the (n-1)th release in order to determine the test selection for the nth release.

Ideally, for comprehensive analysis and projections, the defect data must be complete, representing the “life of the product” set. We define “complete” as all defects discovered by internal defect removal activities (i.e., design review, code inspection, unit test, functional test, system test, etc.) as well as customer-reported defects for the entire period during which the product was in the customer’s or end user’s hands. This includes internal and customer-reported defects for all generations of the product since it was initially introduced.

But we restrict our data collection to system testing in-phase defects and customer defects across multiple releases, as our prime objective is to arrive at system test selection, though many other improvements in various phases of projects are possible with ODC. Only service-impacting defects are considered for the analysis; refer to the Appendix for the definition.

Given a product P = {C1, C2, C3, …, Ck} consisting of multiple components, based on the historical test defects we can find the number of defects in each component. If the number of defects in a given component is high, more focus is needed on improving the quality of that component, so more weight needs to be assigned to it during test selection.

Assume Dp = {D1, D2, D3, …, Dk} are the total numbers of defects in product P across all components, and Wp = {W1, W2, …, Wk} are the weights across all components. Then Wi > Wj if Di > Dj, where i, j are integers and Wi, Wj are the weights given to any two components.

For a given component Ci we can determine the weight:

Wc = (Total number of defects in component C) / (Total number of defects in product P)

i.e., Wc = Dc / Dp ……………… Equation (1)
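Equation (1) is straightforward to compute once defects are tallied per component. A small sketch in Python, using the three-component defect counts that appear later in Section 2.4 (29, 19, and 6 defects):

```python
# Component weights per Equation (1): Wc = Dc / Dp.
# Defect counts follow the paper's 3-component sample (Section 2.4).
defects_per_component = {"BTS": 29, "Transcoder": 19, "IP Controller": 6}
d_p = sum(defects_per_component.values())  # total defects in product P (54)

weights = {c: d / d_p for c, d in defects_per_component.items()}
print(round(weights["BTS"], 4))  # 0.537
```

By construction the weights sum to 1, so a component's weight is simply its share of the historical defects.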


Fig.1

Fig. 1 above captures sample data from a mobile radio access network system where multiple components are integrated together. The classification of defects across all triggers for each component can be seen in the figure.

The background activity involved in producing this snapshot is to first record all the defects that occurred in the field for the mobile radio access network across releases, and then identify the ODC triggers for those defects.

The figure indicates that BTS is the component with the most field defects, N/w Manager is the 2nd highest, and MSC is the component with the fewest defects. From this we can define the weight of each component based on its number of defects.

If we apply the weight formula, then WBTS = Number of defects in BTS / Total number of defects = 29/155 = 0.1870.

The component with the most defects gets the highest weight, and the network element with the fewest defects gets the lowest weight.

The order of weights from the figure is: BTS, N/w Manager, Transcoder, Call Control, Requirements, Alarm Manager, Customer Doc, L3 Switch, Vocoder, IP Controller, Tools, PDSN, Access Node, Util Reports, MSC.

Also from the historical data we can define the weights for the ODC triggers in a given component C.

Let T = {T1, T2, T3, …, Tk} be the set of triggers and Dt = {Dt1, Dt2, …, Dtk} the numbers of defects under each trigger in a given component C. Then Wt = {Wt1, Wt2, Wt3, …, Wtk} are the weights given to the corresponding triggers in the component. Wt can be decided by the number of defects under a particular trigger category.

[Fig. 1 chart: “ODC Triggers with Component Mapping” — field defect counts (y-axis 0–35) per component (Util Reports, BTS, L3 Switch, MSC, N/w Manager, Customer Doc, Requirements, Alarm Manager, Vocoder, PDSN, Access Node, Transcoder, Call Control, Packet Sel, Tools, IP Controller), each bar broken down by system test trigger: workload/stress, variation, startup/restart, sequencing, S/W configuration, recovery/exception, interaction, H/W configuration, coverage.]


For a given component C, the trigger weight is:

Wtc = (Defects due to the trigger in component C) / (Total number of defects in component C)

i.e., Wtc = Dtc / Dc ……………… Equation (2), where Wtc is the weight of the trigger, Dtc is the number of defects due to the particular trigger, and Dc is the number of defects in the component.

The trigger with the highest number of defects is assigned the highest weight, and triggers with fewer defects get lower weights.

If we apply the formula to BTS in Fig. 1 for the startup/restart trigger:

Trigger weight of startup/restart on BTS = (Defects in BTS due to startup/restart) / (Defects in BTS)

Wtc for startup/restart on BTS = 6 / 29 = 0.2069

So, with the help of the historical defects across all components in a given product, and by correlating the defect distribution due to triggers, we have found the component weight Wc and the corresponding trigger weight Wtc.
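Equation (2) can be sketched the same way; the numbers below are the paper's BTS example (6 startup/restart defects out of 29):

```python
# Trigger weight within a component per Equation (2): Wtc = Dtc / Dc.
# Numbers follow the paper's BTS example: 6 startup/restart defects out of 29.
defects_in_component = 29   # Dc for BTS
defects_for_trigger = 6     # Dtc, startup/restart defects in BTS

w_tc = defects_for_trigger / defects_in_component
print(round(w_tc, 4))  # 0.2069
```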

2.2 ODC Classified Test cases

Our objective is to classify the test cases in the test repository according to the ODC triggers and derive a weight for each test case.

In the earlier section we determined the weight of each trigger in a component and the weight of each component in the product. We use these weights as some of the parameters, and we also need to find weights for other parameters, all of which are used in determining the test selection.

To determine the test cases for regression testing in system testing, we need to classify the test cases in the test repository with ODC triggers. Our objective is to select the best test cases from the test repository.

We already said that a given product P has multiple components C. A product consists of multiple features, each providing specific functionality, and every feature has an impact on one or more components.

Since regression testing is black-box testing, each test case tests a high-level functionality of the system and ensures that the functionality is not broken.

Each test case in a specific regression suite must have the objective of testing a specific scenario that covers multiple components of a specific feature.

Assume TS = {tc1, tc2, …, tck} is the set of test cases in the test repository. Each test case in this suite has the objective of exercising a particular scenario, so each test case ends up under a particular trigger. We can therefore classify the test cases by trigger.

Classifying test cases with ODC triggers helps in understanding the gaps in the existing test suite. As indicated earlier, let T = {T1, T2, T3, …, Tk} be the set of triggers. Please refer to the Appendix for the complete set of triggers we use for system testing.


Test Case  Coverage  Variation  Sequencing  Interaction  Workload/Stress  Startup/Restart  …
Tc1        0         1          0           0            0                0                0
Tc2        1         0          0           0            0                0                0
Tc3        0         0          1           0            0                0                0
Tc4        0         0          0           1            0                0                0
Tc5        0         0          0           0            1                0                0
…          0         0          0           0            0                1                0
…          0         0          0           0            0                0                1
Tck        0         0          1           0            0                0                0

Table 1

“All test cases must have a trigger, and no test case can have more than one trigger from T.”

Let us assume F = {F1, F2, F3, …, Fn} is the set of features in a given product P = {C1, C2, C3, …, Ck} with multiple components. Then we can correlate the test cases in the test repository with the features, components, and triggers as indicated in the template below.

Test Case  Feature ID  ODC Trigger       C1  C2  C3  C4  …  C7  Ck
TC1        F2          Workload/Stress   1   0   0   1   0   0   0
TC2        F1          Workload/Stress   0   1   0   0   0   0   1
…          F3          …                 1   0   0   1   1   0
TC100      F25         Workload/Stress   1   0   1   0   0   1   0
TC500      F3          Workload/Stress   1   1   1   0   0   0   0
TC501      F1          Startup/Restart   0   1   1   1   0   1   0
TC502      F4          Startup/Restart   1   1   1   1   0   1   1
TC503      F17         Startup/Restart   1   0   0   1   0   0   0
TC504      F2          Startup/Restart   0   1   0   0   0   0   1
…          …           Startup/Restart   0   1   0   0   1   1   0
TC650      F25         Startup/Restart   1   0   1   0   0   1   0
…          F1          …                 1   0   1   0   0   1   0
…          F9          …                 1   1   1   0   0   0   0
…          F11         …                 0   1   0   0   1   0   1
…          F13         …                 0   1   1   1   0   1   0
…          F18         …                 1   1   1   1   0   1   1
…          F21         …                 1   0   0   1   0   0   0
TC1000     F15         Nth Trigger       0   1   0   0   0   0   1

Table 2
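The Table 2 template can be represented directly in code. The sketch below (component and test case entries are illustrative placeholders, not the paper's full repository) stores, for each test case, its feature, its single ODC trigger, and its component coverage vector:

```python
# A minimal sketch of the Table 2 mapping: each test case carries a feature,
# exactly one ODC trigger, and a 0/1 component-coverage vector.
test_cases = {
    "TC1": {"feature": "F2", "trigger": "workload/stress",
            "coverage": {"C1": 1, "C2": 0, "C3": 0, "C4": 1}},
    "TC501": {"feature": "F1", "trigger": "startup/restart",
              "coverage": {"C1": 0, "C2": 1, "C3": 1, "C4": 1}},
}

# Each test case maps to exactly one trigger, as the paper requires.
triggers = {tc["trigger"] for tc in test_cases.values()}
print(sorted(triggers))  # ['startup/restart', 'workload/stress']
```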


2.3 Determining Weight for Feature

From Fig. 1 we can find the number of defects for any given trigger T in product P.

The number of defects under a particular trigger classification, Dtp, is the sum of the defects under that trigger across all components:

Dtp = Dtc1 + Dtc2 + Dtc3 + … + Dtck, where tp denotes the trigger and Dtci is the number of defects classified under that trigger on component Ci.

The defects Dtp are the cumulative defects raised against multiple features, both in-phase and at the customer site. In Table 3 below we have done a sample mapping based on Fig. 1 for the startup/restart trigger.

The defects found during testing and by the customer for a given trigger T are the cumulative count of the defects introduced in various features of the product.

That is, the sum of the defects for a given trigger Ti = the number of defects among all the features (F1…Fn) for that trigger. We need to consider the contribution of defects by each feature during test selection: the more defects found in a feature F, the more weight needs to be assigned to the test cases covering that feature.

Trigger           Feature   Defects
Startup/Restart   F1        3
                  F2        7
                  F3        4
                  F4        12
                  F5        5
                  Total     31

Table 3

Based on Table 3 and the explanation above, we can say the following.

Let Dft1, Dft2, Dft3, …, Dftk be the defects across all the features for a given trigger Ti. Then:

Weight of feature Fi for a given trigger Ti = Defects contributed by feature Fi / Total number of defects under trigger Ti

Wf = Dfti / Dtp ……………… Equation (3), where Dfti is the number of defects contributed by feature Fi and Dtp is the total number of defects under trigger Ti.

If we apply the formula to F1 in Table 3, then Wf1 = 3/31; the weight of F2 is Wf2 = 7/31, and so on.
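Equation (3) applied to the Table 3 defect counts (startup/restart trigger) can be sketched as:

```python
# Feature weights for a trigger per Equation (3): Wf = Dfti / Dtp.
# Defect counts per feature follow Table 3 (startup/restart trigger).
defects_per_feature = {"F1": 3, "F2": 7, "F3": 4, "F4": 12, "F5": 5}
d_tp = sum(defects_per_feature.values())  # 31 defects under the trigger

w_f = {f: d / d_tp for f, d in defects_per_feature.items()}
print(round(w_f["F1"], 4), round(w_f["F2"], 4))  # 0.0968 0.2258
```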


2.4 Evaluation Techniques

In the earlier sections we covered the importance of historical test results and the analysis of ODC-classified defects and test cases. We also arrived at the weight of each component, the weight of each trigger for a given component, and the weight of each feature for a particular trigger in the product. All weights are derived from the service-impacting defects.

If we take a product P that has gone through multiple releases, some of its features will have been running for a long period. Features available from the initial release of the product become stable over time, because they have been exercised for a long duration both in in-phase system testing and during customer usage. This is called feature aging. If the aging of a feature is high, the probability of finding a defect in it is low, unless new code in the nth release disturbs that functionality.

If we consider a recently released feature, there is a higher chance of finding defects, as it has not been exercised for long, either in in-phase testing or at the customer site. While selecting test cases for regression testing, the aging factor of a feature is therefore an important parameter to consider.

Age factor of a feature = Number of years the feature has been available for testing / Total number of years the product has been available

Feature Age Factor = Feature Age / Product Age ……………… Equation (4)

Assume feature F1 in Table 3 was released in the (n-1)th release, one year before selecting the regression test suite, and that the product had its first release 3 years earlier. Then:

F1 Age Factor = 1/3 = 0.3333

The lower the age factor, the more weight needs to be given to that feature while selecting test cases.

We are in the process of determining the best possible test cases for the nth release from the test repository. The algorithm we derive is based entirely on the historical defects. We have classified the defects based on ODC, classified the test cases by trigger, and derived the weight of each feature based on the trigger.

For any test selection process for the nth release, we should always keep an eye on the number of test cases we select for execution. As the number of releases of a product increases, the number of features increases, which in turn results in more test cases and a higher test case maintenance cost. Any regression test selection should ensure that the number of test cases selected does not greatly exceed the previous test suite while still improving quality.

Assume there is a huge number of test cases in the test repository across all triggers. We have to select the optimal suite for release N from those test cases. In our model we apply the weights derived for components and triggers and arrive at a weight for each test case.

In Section 2.1 we found the weights for component and trigger, Wc and Wtc; each trigger contributes some portion of the defects of a component. Refer to Equations (1) and (2).

In Section 2.3 we found the weight of a feature based on its contribution to the defects under a particular trigger. Refer to Equation (3).

In Equation (4) we found the aging factor for a given feature.


Using Equations (1), (2), (3), and (4) we can derive the weight of each test case.

The weight of each test case in Table 2 can be defined as:

W(T) = (Wf / F_AgeFactor) · Σ_{Ci ∈ P} ( Wi · Cov(Ci,T) · Wtci ) ……………… Equation (5)

where Ci is a component, Wi is the weight given to component Ci, Wtci is the weight assigned to the trigger for component Ci, Cov(Ci,T) is the coverage of component Ci by test case T, T is the test case, P is the product under test, Wf is the weight given to the feature, and F_AgeFactor reflects the duration for which the feature has undergone testing.

If the test case covers a particular component, then Cov(Ci,T) = 1; otherwise Cov(Ci,T) = 0. Refer to Table 2 for the coverage of each test case.

In the case of system testing, any given test case covers more than one component, so when selecting a test case we should consider the combined weight of all the components under a particular trigger.

In order to illustrate our algorithm, we take a sample of the data from Fig. 1. Please refer to Fig. 1 and Table 4.

To keep it simple, we consider only three components: BTS, Transcoder, and IP Controller.

The total number of defects is: BTS = 29, Transcoder = 19, IP Controller = 6. The total number of defects in the product = 29 + 19 + 6 = 54.

Note: since we are considering only 3 components, the total is restricted to the sum of the defects of those 3 components.

We take Startup/Restart as the trigger. Based on Wc, Wtrigger, and feature age, we derive the weight for each test case.

Assumption: the first release of the product was 3 years ago, and there is a release every year; currently the product is on its 3rd release. F4 is 3 years old, F1, F3, and F5 are 1 year old, and F2 is 2 years old. Please refer to Table 4.

The defects contributed by features F1–F5 are given in the table below. Feature F4 is a legacy feature.

                           BTS               Transcoder        IP Controller     Total
Defects                    29                19                6                 54
Trigger (startup/restart)  6                 3                 4                 13
Wc                         29/54=0.537037    19/54=0.351851852 6/54=0.111111111
Wtrigger                   6/29=0.206897     3/19=0.157894737  4/6=0.666666667

Feature   Defects   Weight (Wf)           Age Factor
F1        2         2/13 = 0.153846154   1/3 = 0.3333
F2        3         3/13 = 0.230769231   2/3 = 0.6666
F3        2         2/13 = 0.153846154   1/3 = 0.3333
F4        4         4/13 = 0.307692308   3/3 = 1
F5        2         2/13 = 0.153846154   1/3 = 0.3333


Table 4
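Putting Equations (1)–(4) together, Equation (5) can be computed for the Table 4 sample. The sketch below uses exact fractions for the weights, so the results differ from the tabulated values only by rounding:

```python
# Test-case weight per Equation (5):
#   W(T) = (Wf / AgeFactor) * sum over components of (Wc * Cov * Wtc)
# Weights reproduce the 3-component sample of Table 4 as exact fractions.
w_c  = {"BTS": 29/54, "Transcoder": 19/54, "IPController": 6/54}
w_tc = {"BTS": 6/29,  "Transcoder": 3/19,  "IPController": 4/6}
w_f  = {"F1": 2/13, "F2": 3/13, "F3": 2/13, "F4": 4/13, "F5": 2/13}
age  = {"F1": 1/3,  "F2": 2/3,  "F3": 1/3,  "F4": 3/3,  "F5": 1/3}

def weight(feature, coverage):
    s = sum(w_c[c] * cov * w_tc[c] for c, cov in coverage.items())
    return (w_f[feature] / age[feature]) * s

# TC1: feature F1, covers BTS and IP controller only.
w_tc1 = weight("F1", {"BTS": 1, "IPController": 1, "Transcoder": 0})
print(round(w_tc1, 4))  # 0.0855

# TC3: feature F3, covers all three components.
w_tc3 = weight("F3", {"BTS": 1, "IPController": 1, "Transcoder": 1})
print(round(w_tc3, 4))  # 0.1111
```

Sorting all 15 test cases by this weight reproduces the ranking in the SORT column: TC3 and TC9 come out on top.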

Startup/Restart Trigger

Feature  Test Case  BTS  IpController  Transcoder  Σ(Wi.Cov(Ci,T).Wtci)  Wf·Σ          W(T)          SORT
F1       TC1        1    1             0           0.185117064           0.02847952    0.085447105   3
F2       TC2        0    1             1           0.1295733             0.029901501   0.044856737   10
F3       TC3        1    1             1           0.240631104           0.037020133   0.111071506   1
F2       TC4        0    1             1           0.1295733             0.029901501   0.044856737   x
F2       TC5        0    0             1           0.05551404            0.012810919   0.0192183     x
F4       TC6        1    1             1           0.240631104           0.074040266   0.07404266    7
F4       TC7        0    1             1           0.1295733             0.039868668   0.039868668   x
F5       TC8        1    0             1           0.166571844           0.025626412   0.076886925   5
F5       TC9        1    1             1           0.240631104           0.037020133   0.111071506   2
F2       TC10       1    0             1           0.166571844           0.038439618   0.057665194   8
F2       TC11       1    0             1           0.166571844           0.038439618   0.057665194   9
F4       TC12       0    1             1           0.1295733             0.039868668   0.039868668   x
F1       TC13       1    1             0           0.185117064           0.02847952    0.085447105   4
F4       TC14       0    1             0           0.07405926            0.022787442   0.022787442   x
F3       TC15       1    0             1           0.166571844           0.025626412   0.076886925   6

Table 5

The output is depicted in Table 5. We have a total of 15 test cases in the test repository, and we are selecting test cases for the fourth release. Looking at the SORT column, the test case order changes. If project management asks us to reduce the test count by 30%, the number of test cases to select from this repository is 10.

The challenge is then to select the best 10 test cases keeping in mind the customer usage, the feature aging, and the importance of each component; the test cases should also be bug-yielding. Applying the formula gives the output shown in Table 5.

Analysis

Looking at the test cases selected after applying Equation (5), we can make the following observations.

1. The number of test cases selected for features F1, F3, and F5 is 2 each, whereas 3 were selected for F2 and only 1 for F4.

Though feature F4 has historically yielded defects, F4 is a legacy feature and has become much more stable over time. So our algorithm balances this by not choosing more test cases for it, while still ensuring that it is covered, with test case TC6. Naturally, it selected the best test case. It selected more test cases for F2 because F2 was historically broken and the feature is not very old, so more test cases are selected to catch the defects earlier, during in-phase testing, before delivery to the customer.

Features F1, F3, and F5 have been selected in equal proportion. The best part is that the algorithm selects the strongest test cases, such as TC3 and TC9.

2. It selected all the test cases covering BTS. The reason is that most of the features newly deployed at the customer site have an impact on BTS, so more defect arrivals are possible in this area.

3. For the IP Controller, although fewer test cases were selected, the feature coverage is optimal: all features F1, F2, …, F5 are covered.

4. For the Transcoder, the number of test cases selected is balanced between BTS and IP Controller.

5. The order in which the test cases are prioritized is also well balanced. The algorithm assigned the highest weights based on defect prediction capability and product complexity.

Overall, the decisions made are well justified, based on multiple parameters: bug yielding, assurance, coverage, and the order of weights.

Test Case Reduction from the (N-1)th Suite and Test Addition to the Nth Suite

Test case reduction is one of the main concerns for a regression test suite. With more releases, the suite size may grow.

Let T = {T1, T2, T3, …, Tk} be the set of triggers and TS = {Ts1, Ts2, Ts3, …, Tsk}, where TS is the set of test suites and Tsi is the set of test cases for a given trigger Ti.

The trigger weight for each test suite Tsi can be derived from the percentage of defects due to the given trigger across the product:

Trigger Weight Twi = (Number of defects due to trigger Ti × 100) / Total number of defects Dp

For the test suite TS to be effective, the number of test cases for each trigger Ts1, Ts2, …, Tsk should be derived from the trigger weight of each Tsi:

Total number of test cases in Tsi = (Twi × Total number of test cases in TS) / 100
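The sizing formulas above can be sketched directly; the counts below follow the coverage and startup/restart examples discussed next (6 and 20 defects out of 100, with a 100-test-case suite):

```python
# Suite sizing from trigger weights: the share of test cases for a trigger
# should match the share of defects that trigger caused.
# Twi = (defects for trigger * 100) / total defects
# target count = (Twi * suite size) / 100
total_defects = 100
suite_size = 100
defects_by_trigger = {"coverage": 6, "startup/restart": 20}

def target_count(trigger):
    tw = defects_by_trigger[trigger] * 100 / total_defects  # trigger weight, %
    return tw * suite_size / 100

print(target_count("coverage"), target_count("startup/restart"))  # 6.0 20.0
```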

Let us do the analysis for the existing suite shown in Fig. 2 below. This analysis is for the Operations and Maintenance test cases. In the figure we can see that escaped defects under startup/restart are 20% of the total defects, while escaped defects under coverage are about 6% of the total. This is the actual data for the (n-1)th suite. If we apply the formula for test case reduction by finding the trigger weight for coverage, then:

Wcoverage = (Number of defects found under coverage × 100) / Total number of defects in the product = 6 × 100 / 100 = 6

Total number of test cases in Ts(coverage) = 6 × 100 / 100 = 6

In the case of startup/restart, Wstartup/restart = 20, so the total number of test cases in Ts(startup/restart) = 20.

But if we look at the current number of test cases in the existing suite, we can see that coverage test cases are around 45% of the total suite.


According to our equation, only 6% of the test cases need to be under Coverage, so there is huge scope for reducing the Coverage test cases.

Likewise, according to the equation, the test cases for Startup/Restart should be 20% of the suite, but the existing data shows they are only 5%. So there is 15% scope for increasing the number of Startup/Restart test cases in our nth suite.

By applying ODC we found the triggers for which we need to increase the number of test cases. When adding test cases under Startup/Restart, more focus should be given to the component with the highest percentage of startup/restart defects.

Fig.2

2.5 Test Case Prioritization for the nth Release

Test prioritization can be defined as the order in which the test cases need to be executed. In a time-bound project, more importance is given to test priority; particularly in regression system testing, executing the most effective test cases first results in a higher quality of delivery.

The advantage of prioritization is catching bugs early in the project so that we can avoid risk at a later stage. Based on the above equations, we derived the weightage of each test case corresponding to each trigger. Consider the table below: this is the final output of test selection after applying our ODC method.

With equation (5), we can prioritize the order of execution based on the weight given to each test case and on the percentage of defects for the given triggers. The table below shows a sample prioritization for test cases classified under any trigger. An ideal test suite should contain sample test cases that are assigned high weights from all kinds of triggers.

The order in which to execute the test cases in the table below is thus a sample of the higher-weight test cases covering multiple triggers, in the order Wt1, Wt2, …, Wtk.
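This ordering can be sketched as follows; the trigger names, trigger weights, and per-test-case weights below are hypothetical illustrations, not data from the paper:

```python
# Hypothetical per-trigger suites: the trigger weight Twi plus a weight
# for each test case classified under that trigger.
suites = {
    "Startup/Restart": {"weight": 20, "tests": {"Tc1": 0.4, "Tc2": 0.9}},
    "Coverage":        {"weight": 6,  "tests": {"Tc3": 0.7, "Tc4": 0.2}},
}

def prioritized_order(suites):
    """Visit triggers in descending trigger weight; within each trigger,
    run its test cases in descending test case weight."""
    order = []
    for name, suite in sorted(suites.items(),
                              key=lambda kv: kv[1]["weight"], reverse=True):
        for tc in sorted(suite["tests"], key=suite["tests"].get, reverse=True):
            order.append((name, tc))
    return order

# Startup/Restart (weight 20) is exercised before Coverage (weight 6),
# and within each trigger the heavier test cases run first.
```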

[Chart for Fig.2: escape defect percentage (% EDA) and test case percentage for each trigger — Coverage, Variation, Sequencing, Interaction, Workload Stress, Recovery Exception, Startup/Restart, Hardware Configuration, Software Configuration; y-axis 0%–50%]


Based on equation (5) we can come up with the details below.

Trigger Weight | Test Suite | Total Number of Defects at Customer Site | Test Cases
Wt1 | Ts1 | Dt1 | Sorted test cases based on test case weight (Tc1, Tc2, Tc3, …, Tck)
Wt2 | Ts2 | Dt2 | Sorted test cases based on test case weight (Tc1, Tc2, Tc3, …, Tck)
… | … | … | …
Wtk | Tsk | Dtk | Sorted test cases based on test case weight (Tc1, Tc2, Tc3, …, Tck)

Table 6

2.6 Test Configuration Using ODC

Sometimes, after a post-mortem of customer-found defects, we are surprised to see defects in specific areas that have good coverage. Further analysis shows that the customer environment may differ from our environment; for example, the customer's database may be huge while the lab database is much smaller.

Consider Fig.3 below, where C1, C2, C3, C4, C5, and C6 are various customers using the product. The figure makes it clear that the same release will show a wide pattern of defects across multiple customers, because customer configurations and usage vary. So, while selecting test cases for a particular customer, we need to understand the customer configuration. ODC helps achieve this by giving a clear picture of customer defects based on specific triggers.

If we analyze the defects raised by each customer with respect to the triggers, we get an altogether different pattern for each customer.

With that, we can find the gaps in testing corresponding to the triggers. If we want to test for a particular customer, test selection should be done based on the defect trends from that customer.
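The per-customer analysis can be sketched as follows; the customers, triggers, and defect counts below are hypothetical, chosen only to show how each customer gets its own trigger distribution:

```python
# Hypothetical escape-defect counts per trigger for two customers.
customer_defects = {
    "C1": {"Startup/Restart": 8, "Coverage": 2},
    "C2": {"Workload/Stress": 6, "Coverage": 4},
}

def customer_trigger_weights(defects_by_trigger):
    """Share of one customer's escape defects per trigger, as a percentage.
    A customer-specific regression suite is then sized from these weights,
    just as the product-wide suite was sized from product-wide weights."""
    total = sum(defects_by_trigger.values())
    return {t: d * 100 / total for t, d in defects_by_trigger.items()}

weights_c1 = customer_trigger_weights(customer_defects["C1"])
# Startup/Restart dominates for C1, so a suite assembled for C1 should
# draw most of its test cases from that trigger.
```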


Fig 3: Customer Release Captured Distribution (customer SRs per release R1–R4 for customers C1–C6)

3.0 Summary

We have provided a method for managing complex system regression testing using ODC and a component test coverage table, exploiting the classified data.

Of the many classification schemes that exist, ODC is ideally suited as an underlying standard for analyzing complex products and deriving a regression test suite. We derived weights for the components of the product, weights for triggers, each feature's contribution to a trigger, and an aging factor, all of which are used in test selection. Because ODC-based selection of test cases is derived from historical defects and customer feedback, the results can be seen immediately: quality of delivery and test area gaps become clearly visible.

While there are possibilities for developing a more rigorous and sophisticated model using ODC data in conjunction with other metrics and statistical models, the method we have described offers the advantages of a small investment of effort, immediate returns, and detailed information that is straightforward for technical teams, together with summary information for management.

4.0 Future Work

We have not considered feature interaction in the existing test suite. A system may have many features built on top of other features, and this dependency can be used as a factor for test reduction: based on the dependencies we can derive weights and select test cases accordingly, giving higher weight to any test case that covers more feature interactions. We have also not considered the usage of a given feature at the customer site; for each feature we can derive a customer priority, based on which we can select the assurance suite. These two parameters are planned for future work.

5.0 References

[1] [ODC Web] Orthogonal Defect Classification, www.chillarege.com/odc, www.research.ibm.com/softeng

[2] R. Chillarege et al., "Orthogonal Defect Classification: A Concept for In-Process Measurements", IEEE Transactions on Software Engineering, 18(11), pp. 943–956, 1992.

[3] G. Rothermel and M. J. Harrold. Analyzing regression test selection techniques. IEEE Transactions on Software Engineering, 22(8):529 – 551, 1996.

[4] B. Beizer, Software Testing Techniques. New York, NY: Van Nostrand Reinhold, 1990.

[5] R. Craig and S. Jaskiel, Systematic Software Testing. Norwood, MA: Artech House Publishers, 2002.

[6] M. Harrold, "Testing: A Roadmap," presented at International Conference on Software Engineering, Limerick, Ireland, 2000.

[7] L. Tahat, B. Vaysburg, B. Korel, and A. Bader, "Requirement­Based Automated Black­Box Test Generation," presented at 25th Annual International Computer Software and Applications Conference, Chicago, Illinois, 2001.

About the Author

Rangarajan received his Bachelor's degree in computer science from the University of Madras in 1999 and completed Personal Software Process for Engineers from the Software Engineering Institute, Carnegie Mellon University. He started his career in research and development of PSTN and Intelligent Network products, has worked in testing OSS and VoIP products, and currently leads a testing team in the CDMA System Testing Group at Motorola India Private Limited.

Appendix 1

Definition

Test Suite: Set of test cases
Test Repository: Location where all the test cases are stored
Escape Defect: Defect found at the customer site
In-Phase Defect: Defect found by the vendor team before the release of the product to the customer

Service Impacting Defect: Any defect that affects the basic service to the end user has high service impact; any defect that indirectly affects the customer has medium impact; and any defect that does not impact the end user but affects the maintainability of the product has medium or low service impact. Refer to the TL 9000 documentation.

ODC definition

• ODC is a measurement system that allows learning from experience and provides a means of communicating experiences between projects.


• It uses classification of defects to study the dynamics of software development.
– Provides a few vital measurements on the product and process that are extracted from defects.
– Assists software engineering decisions via measurement and analysis.

• Bridges the gap between quantitative methods and qualitative analysis.

• In order to provide a robust measurement system, ODC requires:
– Orthogonality
– Consistency across phases
– Uniformity across products

When tracking a software defect, different pieces of information become available at different times. Of the eight defect attributes used in ODC, the information needed to understand three of them (activity, trigger, and impact) usually becomes available when the defect is uncovered; the information needed for the remaining five becomes available when the defect is fixed or understood.

Attributes understood when a defect is uncovered:

• Activity: The actual activity being performed when the defect surfaced. For example, during scheduled function test, you might decide to do a code inspection; this constitutes the activity captured.

• Trigger: The environment or condition that had to exist for the defect to appear. During review and inspection activities, choose the selection that best describes what you were thinking about when you discovered the defect. For test defects, match the trigger that captures the intention behind the test case, or the environment or condition that served as catalyst for the failure.

• Impact: For in-process defects, select the impact you think the defect would have had on the customer if it had escaped to the field. For field-reported defects, select the impact the failure had on the customer.

Table A (shown on next page) shows how activities map to triggers; it is for illustrative purposes only. Pick the activities in your process and map each activity to the appropriate triggers from the full list of 21 triggers.

Attributes understood when the defect fix is known:

• Target: The high-level identity of the entity being fixed.

• Defect type: The nature of the actual correction made.

• Defect qualifier (applies to defect type): Indication of whether the defect was an omission or commission, or is extraneous.

• Source: The origin of the design or code containing the defect.

• Age: The history of the design/code containing the defect.
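This two-stage availability of attributes can be mirrored in a simple record type. The following Python is a sketch of ours, with field names following the ODC attribute names above; the sample values are illustrative:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ODCDefect:
    # Available when the defect is uncovered:
    activity: str
    trigger: str
    impact: str
    # Available only once the fix is understood, hence Optional:
    target: Optional[str] = None
    defect_type: Optional[str] = None
    qualifier: Optional[str] = None  # omission / commission / extraneous
    source: Optional[str] = None
    age: Optional[str] = None

# Recorded at defect-opening time with only the first three attributes...
d = ODCDefect(activity="System Test", trigger="Recovery/Exception",
              impact="Reliability")
# ...and completed later, after the fix is analyzed.
d.defect_type = "Assignment"
```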

Earlier versions of ODC [6] primarily targeted software design and code for capturing defect information in a traditional environment. Defects related to other software development activities, such as build/package and user documentation, were captured as defect types for convenience.


Orthogonal Defect Classification v 5.11

ODC Trigger – Valid Set for Functional Test Activities

Simple Path: Defect uncovered while specifically invoking a specific path, such as in white box testing.
Complex Path: The test case that found the defect was executing a contrived combination of code paths.
Coverage: The test case was a straightforward attempt to exercise code for a single function using no parameters or a single set of parameters.
Variation: The test case was a straightforward attempt to exercise code for a single function using a variety of inputs and parameters.
Sequencing: The test case executed multiple functions in a very specific sequence.
Interaction: The test case initiated an interaction among two or more functions that execute successfully when run independently but fail in this combination.

ODC Trigger – Valid Set for System Test Activities

Hardware Configuration: Was the problem encountered on one particular HW configuration and not another?
Workload/Stress: Was the system operating near a resource limit?
Software Configuration: Was the problem discovered on a specific SW configuration?
Startup/Restart: Was the system being initialized or restarted after a failure or shutdown?
Recovery/Exception: Was the system/subsystem in recovery mode?
Normal Mode (Operator Intervention): Did the operator initiate some action such as a command?
Normal Mode (Blocked Test): Did the defect surface without any specific strategy?