Assessing risk due to small sample size in probability of detection
analysis using tolerance intervals
Ajay M. Koshti, NASA Johnson Space Center, Houston, U.S.A.
ABSTRACT
A small sample size (e.g., 6-30) introduces risk into the results of probability of detection (POD) analysis using tolerance
intervals, a method also called limited sample POD or LS POD. The analysis is performed either during NDE procedure
qualification or to assess the reliability of an NDE procedure. The risk is primarily due to sampling error. Smaller
samples are less likely to be random draws from the population or representative of it, and are therefore likely to be
biased. Biased samples have a smaller standard deviation than the population, so POD analysis with a small biased
sample can lead to overestimation of POD. Many sampling schemes are available in statistics to mitigate sampling risk.
The primary objective of POD analysis is to determine, from signal response measurements of a sample, a decision
threshold that is less than or equal to the population decision threshold for 90% POD. Sampling error implies that this NDE
reliability condition is violated. One sampling type is the representative sample. Representative samples reduce the
variance in POD estimates and also reduce the magnitude of the error. A sampling sensitivity analysis is performed here
for several sampling types using repeated random sampling, i.e., the Monte Carlo method. Six sampling types are
considered for comparison, some of which are similar to drawing a representative sample. The LS POD model assumes
random sampling; therefore, random sampling is used as the basis for comparison with each sampling type. The sampling
types used in the analysis are: A. nominal and worst-case sampling; B. worst-case sampling; C. nominal case sampling;
D. random sampling; E. random target and sub-target sampling; and F. nominal target and sub-target sampling. Results of
the Monte Carlo simulation indicate that type F sampling can mitigate sampling risk and is also more practical to implement.
Type A sampling may also mitigate the sampling risk, but it may be less practical to implement.
Keywords: nondestructive evaluation, probability of detection, statistical sampling
1. INTRODUCTION
MIL-HDBK-1823A [1] and associated mh1823 [2] software address NDE POD testing for two types of datasets. The first type
of dataset is that of the signal response “â” (read as a-hat) versus known flaw size “a”. In analyzing such data, â is
plotted on the y-axis and a on the x-axis, and the data may be transformed using a logarithmic function, if needed, to
provide a linear relationship around the signal response decision threshold level. A generalized linear model (GLM) is
fitted to the data using maximum likelihood estimation (MLE). In this analysis, noise data, defined as the signal response
in a region where there is no flaw, are also obtained. The noise data are used to estimate the false call rate, or
probability of false positive calls (POF).
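The â versus a analysis described above can be illustrated with a minimal Python sketch. It assumes a log-log linear model with normally distributed residuals and no censored data, in which case an ordinary least-squares fit gives the same slope and intercept as MLE. The residual scatter uses the n-2 divisor, which differs slightly from the MLE value. The function names, the flaw sizes, the signal responses, and the decision threshold are all hypothetical and are not taken from MIL-HDBK-1823A or the mh1823 software, which also handle censoring and confidence bounds.

```python
import math
from statistics import NormalDist


def fit_ahat_vs_a(sizes, responses):
    """Least-squares fit of ln(a-hat) = b0 + b1*ln(a); returns (b0, b1, sigma)."""
    x = [math.log(a) for a in sizes]
    y = [math.log(r) for r in responses]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    # Residual standard deviation with the n-2 divisor (an assumption of this sketch).
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma = math.sqrt(sse / (n - 2))
    return b0, b1, sigma


def pod(a, b0, b1, sigma, threshold):
    """Probability that the signal response at flaw size a exceeds the threshold."""
    z = (b0 + b1 * math.log(a) - math.log(threshold)) / sigma
    return NormalDist().cdf(z)


# Hypothetical flaw sizes and signal responses, for illustration only.
sizes = [0.020, 0.030, 0.040, 0.050, 0.060, 0.080]
responses = [1.1, 1.4, 2.2, 2.4, 3.3, 3.9]
threshold = 2.0  # hypothetical decision threshold

b0, b1, sigma = fit_ahat_vs_a(sizes, responses)
for a in sizes:
    print(f"a = {a:.3f}  POD = {pod(a, b0, b1, sigma, threshold):.3f}")
```

In this sketch the POD curve is a point estimate only; with a sample this small, the confidence bound on the curve matters more than the curve itself.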
The second type of dataset considered is called hit-miss data, which contains the known flaw size and the corresponding
detection result (i.e., hit or miss). For numerical analysis, a hit is assigned a value of 1 and a miss a value of 0.
False call data (i.e., a hit recorded where no flaw exists) are also recorded and used to determine the POF using
the Clopper-Pearson binomial distribution function. Typically, POD increases and POF decreases with increasing flaw
size. POF must be kept below a specified limit to avoid the cost and schedule impacts of the corrective actions needed
to address false positive calls. ASTM E2862 [3] also provides a hit-miss POD data analysis method that is consistent
with MIL-HDBK-1823A.
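As an illustration of how a Clopper-Pearson bound on POF can be computed from false call counts, the following Python sketch solves for the one-sided upper bound by bisection on the binomial distribution function. The function names, the 95% default confidence level, and the example counts are assumptions made for this sketch; they are not taken from MIL-HDBK-1823A or ASTM E2862.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson_upper(false_calls, opportunities, confidence=0.95):
    """One-sided upper Clopper-Pearson bound on the false call probability.

    The bound is the value p at which observing `false_calls` or fewer
    false calls in `opportunities` trials has probability 1 - confidence.
    """
    if false_calls >= opportunities:
        return 1.0
    alpha = 1 - confidence
    lo, hi = 0.0, 1.0
    # The binomial CDF decreases as p increases, so bisect on p.
    for _ in range(100):
        mid = (lo + hi) / 2
        if binom_cdf(false_calls, opportunities, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical example: 2 false calls in 50 unflawed inspection opportunities.
print(f"observed rate = {2 / 50:.3f}")
print(f"upper bound   = {clopper_pearson_upper(2, 50):.3f}")
```

The upper bound, rather than the observed false call rate, is the quantity that would be compared against the allowable POF limit in this sketch.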
There are other POD analysis approaches that are not covered by MIL-HDBK-1823A. A binomial point estimate method for
verifying the reliably detectable flaw size is given by Rummel [4]. The current work concerns probability of detection
(POD) analysis using tolerance intervals [5,6]. It is limited to the POD analysis and simulation approach for single-hit
limited sample POD [7], or LS POD. Broadly, LS POD covers signal response-based data for both single-hit and multi-hit
POD applications, including modeling of transfer functions [7-13]. LS POD analysis can use smaller sample sizes. In
practice, however, signal responses from a smaller sample of flaws may not be a random draw from the population, whereas
LS POD assumes that the sample is randomly drawn from the population. Thus, there is an increased risk that the small
sample is not representative of the population and is biased. In statistics, sampling bias is defined as a bias in which
a sample is collected in such a way that some members of the intended population have a lower or higher relative
sampling probability than others. It results in a biased