Successive sampling strategy under non response

The present work studies the effect of non-response at both occasions in the search for a good successive (rotation) sampling strategy over two occasions. A chain-type ratio and regression estimator is proposed for estimating the population mean at the current occasion in the presence of non-response at both occasions in two-occasion successive (rotation) sampling. The detailed behavior of the proposed estimators is studied. The proposed estimators are compared with the estimators for the same situations but in the absence of non-response. The performance of the proposed estimators is demonstrated through empirical studies. Mathematics Subject Classification 2010: 62D05. Additional


INTRODUCTION
Surveys are often repeated on many occasions (over years or seasons) to estimate the same characteristics at different points of time. The information collected on a previous occasion can be used to study the change or the total value of the character over occasions, and also to study its average value on the most recent occasion. In many social surveys, the same population is sampled repeatedly and the same study variable is measured on each occasion, so that development over time can be followed. For example, labor force surveys are conducted monthly to estimate employment status, monthly or weekly data on the prices of goods are collected to determine the consumer price index, and political opinion surveys are conducted at regular intervals to learn voter preferences. In such cases, successive (rotation) sampling schemes may be an attractive way to provide reliable estimates at a desired point of time (occasion) or to measure the change between two points of time (occasions).
Unauthentifiziert | Heruntergeladen 30.08.19 07:52 UTC
The theory of successive (rotation) sampling appears to have started with the work of Jessen [10], who pioneered the use of the entire information collected in previous investigations. The theory was further extended by Patterson [11], Rao and Graham [12], Gupta [8], Das [5], Chaturvedi and Tripathi [2] and many others. Sen [13] developed estimators for the population mean on the current occasion using information on two auxiliary variables available on the previous occasion. Further, Sen [14,15] extended his work to p auxiliary variates.
Singh et al. [16] and Singh and Singh [17] used auxiliary information on the current occasion for estimating the current population mean in two-occasion successive sampling. Singh [18] extended the work of Singh and Singh [17] to h-occasion successive sampling.
In many situations, information on an auxiliary variate may be readily available on the first as well as on the second occasion. For example, the tonnage (or seat capacity) of each vehicle or ship is known in transportation surveys, the number of beds in different hospitals may be known in hospital surveys, the number of polluting industries is known in environmental surveys, and the employment status, educational status, food availability and medical aids of a locality are well known in advance when estimating various demographic parameters in demographic surveys.
Many other situations in the biological (life) sciences could be explored to show that information on an auxiliary variate is available on both occasions. Utilizing the auxiliary information on both occasions, Feng and Zou [7] and Biradar and Singh [1] proposed estimators for the current population mean in successive (rotation) sampling. Further, Singh [19] and Singh and Karna [21,22] proposed chain-type ratio and regression estimators for the population mean at the current (second) occasion in two-occasion successive (rotation) sampling.
It is a common experience in sample surveys that data cannot always be collected from all the units selected in the sample. For example, the selected families may not be at home at the first attempt, and some may refuse to co-operate with the interviewer even if contacted. This is particularly true in mail surveys, in which questionnaires are mailed to the sampled respondents, who are requested to send back their returns by some deadline. As many respondents do not reply, the available sample of returns is incomplete. The resulting incompleteness, called non-response, is sometimes so large as to completely vitiate the results.
Hansen and Hurwitz [9] suggested a technique for handling non-response in mail surveys. These surveys have the advantage that the data can be collected relatively inexpensively. However, non-response is a common problem with mail surveys.
Cochran [4] and Fabian and Hyunshik [6] extended the Hansen and Hurwitz technique to the case where, besides the information on the character under study, information is also available on an auxiliary character. More recently, Choudhary et al. [3], Singh and Kumar [20] and Singh and Karna [23] used the Hansen and Hurwitz [9] technique for estimating the population mean on the current occasion in the context of sampling on two occasions.
The objective of the present work is to study the effect of non-response at the current occasion in two-occasion successive (rotation) sampling. In two-occasion successive (rotation) sampling, a portion of the sample is matched from the previous occasion, and it is assumed that all units respond at the first occasion. Since the units in the matched portion of the sample are already familiar with the questionnaire from the first occasion, we may expect them to respond without hesitation at the second occasion. At the current occasion a sample is drawn afresh from the remaining units, so non-response may occur at the current occasion. Motivated by these points, and using the Hansen and Hurwitz [9] technique, estimators are proposed to study the effect of non-response at the current occasion in two-occasion successive (rotation) sampling. A relevant chain-type ratio and regression estimator is proposed for estimating the current population mean. The proposed estimator is compared under situations with and without non-response, and its behavior is examined through empirical studies.

PROPOSED ESTIMATORS
Let $U = (U_1, U_2, \ldots, U_N)$ be the finite population of $N$ units, which has been sampled over two occasions. The character under study is denoted by $x$ ($y$) on the first (second) occasion. It is assumed that information on an auxiliary variable $z$ (with known population mean) is available on both occasions. We assume that there is non-response at both occasions, so that the population can be divided into two classes: those who will respond at the first attempt and those who will not. Let the sizes of these two classes be $N_1$ and $N_2$ at the first occasion, and let the corresponding sizes at the current (second) occasion be $N_1^*$ and $N_2^*$.

A simple random sample (without replacement) of $n$ units is taken on the first occasion. We assume that, out of the $n$ selected units, $n_1$ units respond and $n_2$ units do not. Let $n_{2h}$ denote the size of the sub-sample drawn from the non-response class in the sample. A random sub-sample of $m = n\lambda$ units is retained (matched) from the $n_1$ responding units for use on the second occasion. At the current occasion, a simple random sample (without replacement) of $u = (n - m) = n\mu$ units is drawn afresh from the remaining non-sampled units of the population, so that the sample size on the second occasion is also $n$. It is assumed that the units in the matched portion of the sample respond fully at the current occasion. $\lambda$ and $\mu$ $(\lambda + \mu = 1)$ are the fractions of matched and fresh samples respectively at the second (current) occasion.

We assume that in the unmatched portion of the sample on the second (current) occasion $u_1$ units respond and $u_2$ units do not. Let $u_{2h}$ denote the size of the sub-sample drawn from the non-response class in the unmatched portion of the sample on the current occasion, whose responses are collected by direct contact or interview. The following notations are used hereafter:
$\bar{X}, \bar{Y}, \bar{Z}$: The population means of the variates $x$, $y$ and $z$ respectively.
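The sample structure described above can be sketched in code. The following is a minimal illustration only, not the authors' procedure: the function name `draw_two_occasion_sample`, the response indicator `responds_first` and the use of integer unit labels are all assumptions introduced here.

```python
import random

def draw_two_occasion_sample(N, n, lam, responds_first, seed=0):
    """Sketch of the two-occasion design (hypothetical helper).

    N              : population size
    n              : sample size on each occasion
    lam            : fraction of the sample to be matched (lambda)
    responds_first : callable(unit) -> bool, an assumed indicator of
                     response at the first attempt on the first occasion
    Returns (first_sample, matched, fresh) as lists of unit labels.
    """
    rng = random.Random(seed)
    units = list(range(N))

    # First occasion: simple random sample without replacement of n units.
    first = rng.sample(units, n)
    respondents = [i for i in first if responds_first(i)]

    # Matched part: m = n * lambda units retained from the n_1 respondents.
    m = round(n * lam)
    if m > len(respondents):
        raise ValueError("not enough first-occasion respondents to match")
    matched = rng.sample(respondents, m)

    # Fresh part: u = n - m = n * mu units from the non-sampled units.
    u = n - m
    first_set = set(first)
    remaining = [i for i in units if i not in first_set]
    fresh = rng.sample(remaining, u)
    return first, matched, fresh
```

With `lam = 0.6` and `n = 100`, the sketch returns 60 matched units and 40 fresh units, so the second-occasion sample size is again `n`.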

FORMULATION OF THE ESTIMATOR
To estimate the population mean $\bar{Y}$ on the second occasion, two different estimators are suggested. The first is a Hansen and Hurwitz [9] type estimator, say $\Delta_u$, based on the $u$ sample units drawn afresh at the current occasion, of which $u_1$ units respond and the remaining $u_2$ $(= u - u_1)$ units do not; $\Delta_u$ is defined in equation (1).

The second estimator is based on the sample of size $m$ that is common to both occasions and utilizes the information from the first occasion. Since there is non-response at the first occasion, a Hansen and Hurwitz [9] type estimator is again considered. This second estimator, say $\Delta_m$, for estimating the population mean at the current occasion is a chain-type regression in ratio estimator based on the sample of size $m$ $(= n\lambda)$ common to both occasions; it is defined in equation (2).

The final estimator $\Delta$ is the convex linear combination of the estimators $\Delta_u$ and $\Delta_m$.
The estimator $\Delta$ is defined as
$$\Delta = \psi\,\Delta_u + (1 - \psi)\,\Delta_m \qquad (3)$$
where $\psi$ is an unknown constant to be determined under a certain criterion.
REMARK 3.1. For estimating the mean on each occasion the estimator $\Delta_u$ is suitable, which implies that more weight could be given to $\Delta_u$ by choosing $\psi$ as 1 (or close to 1), while for estimating the change from one occasion to the next the estimator $\Delta_m$ could be more useful, so $\psi$ might be chosen as 0 (or close to 0). To address both problems simultaneously, a suitable (optimum) choice of $\psi$ is required.
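As an illustration of the two building blocks used in this section, the sketch below computes the standard Hansen and Hurwitz type mean (respondent mean and followed-up sub-sample mean, weighted by the respondent and non-respondent counts) and a convex combination of two estimates. The function names are hypothetical, and the sketch does not reproduce the ratio and regression adjustments that enter $\Delta_u$ and $\Delta_m$ themselves.

```python
def hansen_hurwitz_mean(resp_values, subsample_values, n_nonresp):
    """Hansen-Hurwitz type mean for a sample with non-response.

    resp_values      : observations from the n1 units that responded
    subsample_values : observations from the sub-sample of non-respondents
                       followed up by direct contact or interview
    n_nonresp        : total number n2 of non-respondents in the sample
    """
    n1 = len(resp_values)
    n = n1 + n_nonresp
    mean_resp = sum(resp_values) / n1
    mean_sub = sum(subsample_values) / len(subsample_values)
    # The sub-sample mean stands in for all n2 non-respondents.
    return (n1 * mean_resp + n_nonresp * mean_sub) / n


def combine(delta_u, delta_m, psi):
    """Convex linear combination psi * delta_u + (1 - psi) * delta_m."""
    return psi * delta_u + (1 - psi) * delta_m
```

Choosing `psi = 1` returns the fresh-sample estimate and `psi = 0` returns the matched-sample estimate, which mirrors the two extreme choices discussed in Remark 3.1.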

PROPERTIES OF THE ESTIMATOR Δ
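Since $\Delta_u$ and $\Delta_m$ are ratio and chain-type regression in ratio estimators, they are biased for the population mean $\bar{Y}$. Therefore, the resulting estimator $\Delta$ defined in equation (3) is also a biased estimator of $\bar{Y}$. The bias $B(\cdot)$ and mean square error $M(\cdot)$ up to the first order of approximation are derived under large-sample approximations, using transformations under which $\Delta_u$ and $\Delta_m$ take the forms given in equations (4) and (5). Thus, we have the following theorems:

THEOREM 4.1. The bias of the estimator $\Delta$ to the first order of approximation is obtained as
$$B(\Delta) = \psi\,B(\Delta_u) + (1 - \psi)\,B(\Delta_m) \qquad (6)$$
where $B(\Delta_u)$ and $B(\Delta_m)$ are the biases of $\Delta_u$ and $\Delta_m$ respectively.

PROOF. The bias of the estimator $\Delta$ is given by
$$B(\Delta) = E[\Delta - \bar{Y}] = \psi\,E[\Delta_u - \bar{Y}] + (1 - \psi)\,E[\Delta_m - \bar{Y}] \qquad (9)$$
Substituting the values of $\Delta_u$ and $\Delta_m$ from equations (4) and (5) in equation (9), expanding the terms binomially and taking expectations up to $o(n^{-1})$, we obtain the expression for the bias of the estimator $\Delta$ given in equation (6).

THEOREM 4.2. The mean square error of the estimator $\Delta$ to the first order of approximation is obtained as
$$M(\Delta) = \psi^2 M(\Delta_u) + (1 - \psi)^2 M(\Delta_m) + 2\psi(1 - \psi)\,C(\Delta_u, \Delta_m) \qquad (10)$$
where $M(\Delta_u)$, $M(\Delta_m)$ and $C(\Delta_u, \Delta_m)$ are given in equations (11), (12) and (13) respectively.

PROOF. The mean square error of the estimator $\Delta$ is given by
$$M(\Delta) = E[\Delta - \bar{Y}]^2 = E[\psi(\Delta_u - \bar{Y}) + (1 - \psi)(\Delta_m - \bar{Y})]^2 \qquad (14)$$
Using the expressions of $\Delta_u$ and $\Delta_m$ from equations (4) and (5) in equation (14), expanding the terms binomially and taking expectations up to $o(n^{-1})$, we obtain the expression for the mean square error of the estimator $\Delta$ given in equation (10).

The results shown in equations (11)-(13) are derived under the assumption that the coefficients of variation of $x$, $y$ and $z$ are approximately equal.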

MINIMUM MEAN SQUARE ERROR OF Δ
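Since the mean square error of $\Delta$ in equation (10) is a function of the unknown constant $\psi$, it is minimized with respect to $\psi$, and the optimum value of $\psi$ is obtained as
$$\psi_{opt} = \frac{M(\Delta_m) - C(\Delta_u, \Delta_m)}{M(\Delta_u) + M(\Delta_m) - 2\,C(\Delta_u, \Delta_m)}$$
Substituting the value of $\psi_{opt}$ in equation (10), we get the optimum mean square error of $\Delta$ as
$$M(\Delta)_{opt} = \frac{M(\Delta_u)\,M(\Delta_m) - \{C(\Delta_u, \Delta_m)\}^2}{M(\Delta_u) + M(\Delta_m) - 2\,C(\Delta_u, \Delta_m)} \qquad (18)$$
Further substituting the values of $M(\Delta_u)$, $M(\Delta_m)$ and $C(\Delta_u, \Delta_m)$ from equations (11)-(13) in equation (18), the simplified value of $M(\Delta)_{opt}$ is obtained as equation (19), where $\mu$ is the fraction of fresh sample at the second (current) occasion for the estimator $\Delta$.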
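The minimization over $\psi$ can be checked numerically. The sketch below assumes only that $\Delta$ is a convex combination of two estimators with mean square errors `mse_u` and `mse_m` and covariance `cov_um`; the function names are introduced here for illustration, and the simplified expressions of equations (11)-(13) and (19) are not reproduced.

```python
def mse_combined(psi, mse_u, mse_m, cov_um=0.0):
    """Mean square error of psi * D_u + (1 - psi) * D_m to first order."""
    return (psi ** 2 * mse_u
            + (1 - psi) ** 2 * mse_m
            + 2 * psi * (1 - psi) * cov_um)


def optimal_weight(mse_u, mse_m, cov_um=0.0):
    """Return (psi_opt, minimum MSE) for the convex combination."""
    denom = mse_u + mse_m - 2 * cov_um
    if denom <= 0:
        raise ValueError("degenerate case: the two estimators coincide")
    # Setting the derivative of mse_combined with respect to psi to zero
    # gives psi_opt; substituting it back gives the minimum MSE.
    psi_opt = (mse_m - cov_um) / denom
    mse_opt = (mse_u * mse_m - cov_um ** 2) / denom
    return psi_opt, mse_opt
```

For instance, with `mse_u = 2.0`, `mse_m = 1.0` and zero covariance, the weight on the less precise estimator is one third and the combined mean square error is two thirds, smaller than either component.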

OPTIMUM REPLACEMENT POLICY
To determine the optimum value of $\mu$ (the fraction of the sample to be taken afresh at the second occasion) so that the population mean $\bar{Y}$ may be estimated with maximum precision, we minimize the mean square error of $\Delta$ given in equation (19) with respect to $\mu$. This yields a quadratic equation in $\mu$, whose solution, say $\hat{\mu}$, is given in equation (21). From equation (21), a real value of $\hat{\mu}$ exists if and only if the quantity under the square root is greater than or equal to zero. For any combination of the correlations $\rho_{yx}$, $\rho_{xz}$ and $\rho_{yz}$ that satisfies the condition for real solutions, two real values of $\hat{\mu}$ are possible. Hence, while choosing the value of $\hat{\mu}$, it should be remembered that $0 \le \hat{\mu} \le 1$; all other values of $\hat{\mu}$ are inadmissible. Substituting the admissible value of $\hat{\mu}$, say $\mu^{(0)}$, from equation (21) into equation (19), we obtain the optimum value of the mean square error of $\Delta$.
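The selection of an admissible root can be illustrated as follows. This is a generic sketch: the coefficients `a`, `b` and `c` stand for whatever quadratic in $\mu$ results from the minimization, they are placeholders rather than the coefficients of the quadratic in the text, and the function name is hypothetical.

```python
import math

def admissible_roots(a, b, c, tol=1e-12):
    """Real roots of a*mu**2 + b*mu + c = 0 that lie in [0, 1]."""
    if abs(a) < tol:
        # Degenerate (linear) case.
        roots = [] if abs(b) < tol else [-c / b]
    else:
        disc = b * b - 4 * a * c
        if disc < 0:
            # Quantity under the square root is negative: no real solution.
            return []
        s = math.sqrt(disc)
        roots = [(-b - s) / (2 * a), (-b + s) / (2 * a)]
    # Only fractions between 0 and 1 are admissible replacement fractions.
    return sorted({r for r in roots if 0 <= r <= 1})
```

If both roots fall in the admissible range, a further criterion is needed to choose between them; the sketch simply returns both.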

SOME SPECIAL CASES
Case 1: When non-response occurs only at the first occasion

When non-response occurs only at the first occasion, the estimator for the population mean $\bar{Y}$ at the current occasion may be obtained as
$$\Delta^* = \psi^*\,\tau_u + (1 - \psi^*)\,\Delta_m \qquad (23)$$
where $\tau_u = \dfrac{\bar{y}_u}{\bar{z}_u}\,\bar{Z}$ and $\Delta_m$ is defined in equation (2). $\psi^*$ is an unknown constant to be determined so as to minimize the mean square error of the estimator $\Delta^*$. Since $\tau_u$ and $\Delta_m$ are ratio and chain-type regression in ratio estimators, they are biased for the population mean $\bar{Y}$. Therefore, the resulting estimator $\Delta^*$ defined in equation (23) is also a biased estimator of $\bar{Y}$.
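For reference, the ratio estimator $\tau_u$ used on the fresh sample can be computed as in the short sketch below. The function name and argument names are illustrative assumptions; the inputs are the fresh-sample observations on $y$ and $z$ and the known population mean of $z$.

```python
def ratio_estimate(y_sample, z_sample, z_pop_mean):
    """Ratio estimate of the population mean of y: (ybar / zbar) * Zbar."""
    y_bar = sum(y_sample) / len(y_sample)
    z_bar = sum(z_sample) / len(z_sample)
    # Scale the sample ratio by the known population mean of z.
    return y_bar / z_bar * z_pop_mean
```

When the sample mean of $z$ happens to equal its population mean, the estimate reduces to the plain sample mean of $y$.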
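THEOREM 7.1. The bias of the estimator $\Delta^*$ to the first order of approximation is the same as that of the estimator $\Delta$ discussed in Theorem 4.1.
THEOREM 7.2. The mean square error of the estimator $\Delta^*$ to the first order of approximation is obtained as
$$M(\Delta^*) = \psi^{*2} M(\tau_u) + (1 - \psi^*)^2 M(\Delta_m) + 2\psi^*(1 - \psi^*)\,C(\tau_u, \Delta_m) \qquad (24)$$
where $M(\tau_u)$ is the mean square error of $\tau_u$, $C(\tau_u, \Delta_m)$ is the covariance between $\tau_u$ and $\Delta_m$, and $M(\Delta_m)$ is the same as shown in equation (12).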
Substituting the value of $\psi^*_{opt}$ in equation (24), we get the optimum mean square error of $\Delta^*$. Further substituting the values of $M(\tau_u)$, $M(\Delta_m)$ and $C(\tau_u, \Delta_m)$, the simplified value of $M(\Delta^*)_{opt}$ is obtained, where $A_k$ $(k = 1, \ldots, 5)$ are defined in section 5 and $\mu^*$ is the fraction of fresh sample at the second (current) occasion for the estimator $\Delta^*$.

Optimum Replacement Policy
To determine the optimum value of $\mu^*$ (the fraction of the sample to be taken afresh at the second occasion) so that the population mean $\bar{Y}$ may be estimated with maximum precision, we minimize the mean square error of $\Delta^*$ given in equation (29) with respect to $\mu^*$.

In the mean square error of $\Delta^{**}$, $M(\Delta_u)$ is the same as shown in equation (11).

Remark 7.2: Results shown in equations (34) and (35) are derived under the assumption that the coefficients of variation of $x$, $y$ and $z$ are approximately equal.

Minimum Mean Square Error of the estimator Δ **
Since the mean square error of $\Delta^{**}$ in equation (33) is a function of the unknown constant $\psi^{**}$, it is minimized with respect to $\psi^{**}$, which gives the optimum value $\psi^{**}_{opt}$. Substituting the value of $\psi^{**}_{opt}$ in equation (33), we get the optimum mean square error of $\Delta^{**}$. Its constants are expressed in terms of $f$ and $A_k$ $(k = 1, 2, 3)$, where the $A_k$ are defined in section 5 and $\mu^{**}$ is the fraction of fresh sample at the second (current) occasion for the estimator $\Delta^{**}$.

Optimum Replacement Policy
To determine the optimum value of $\mu^{**}$ (the fraction of the sample to be taken afresh at the second occasion) so that the population mean $\bar{Y}$ may be estimated with maximum precision, we minimize the mean square error of $\Delta^{**}$ given in equation (38) with respect to $\mu^{**}$. This yields an equation in $\mu^{**}$, whose solution is given in equation (39).

EFFICIENCY COMPARISON
To examine the loss in efficiency of the estimators $\Delta$, $\Delta^*$ and $\Delta^{**}$ owing to non-response, the percent relative losses in efficiency of $\Delta$, $\Delta^*$ and $\Delta^{**}$ with respect to $\tau$, proposed by Singh and Karna [21], have been computed for different choices of $\rho_{yz}$ and $\rho_{yx}$. The estimator $\tau$ is defined under circumstances similar to those of the estimator $\Delta$ but in the absence of non-response. It involves an unknown constant $\varphi$, which is determined by minimizing the mean square error of $\tau$; the optimum mean square error of $\tau$ and the corresponding optimum value of $\mu$ follow from this minimization. To compare the estimators with respect to $\tau$, we introduce the following assumption: (i) $\rho_{xz} = \rho_{yz}$, which is an intuitive assumption considered, for example, by Cochran [4] and Feng and Zou [7].
The percent relative losses in precision of $\Delta$, $\Delta^*$ and $\Delta^{**}$ with respect to $\tau$ are obtained under their respective optimality conditions. The expressions for the optimum $\mu$ and the percent relative losses are given in terms of the population correlation coefficients. The percent relative losses in precision of the estimators $\Delta$, $\Delta^*$ and $\Delta^{**}$ have therefore been computed for different choices of $f$, $f_1$, $f_2$, $W$, $\rho_{yx}$ and $\rho_{yz}$, and compiled in Tables 1-3.
Table 1: Percent relative loss $L$ in precision of $\Delta$ with respect to $\tau$ for $f = 0.1$ (column headings: $\rho_{yx}$ = 0.3, 0.5, 0.7, 0.9)

Discussion on the Behavior of Estimator Δ
The following conclusions can be drawn from Table 1:
(a) For fixed values of $f_1$, $f_2$, $W$ and $\rho_{yx}$, the values of $\mu^{(0)}$ decrease while the loss in precision $L$ increases as the value of $\rho_{yz}$ increases. This suggests that if the study character is highly correlated with the auxiliary variate, a smaller fresh sample is required at the second (current) occasion, which reduces the cost of the survey.
(b) For fixed values of $f_1$, $W$, $\rho_{yx}$ and $\rho_{yz}$, the values of $\mu^{(0)}$ and $L$ increase with increasing values of $f_2$.
(c) For fixed values of $f_2$, $W$, $\rho_{yx}$ and $\rho_{yz}$, the values of $\mu^{(0)}$ decrease but the values of $L$ increase with increasing values of $f_1$.
(d) For fixed values of $f_1$, $f_2$, $\rho_{yx}$ and $\rho_{yz}$, the values of $\mu^{(0)}$ and $L$ increase with increasing values of $W$. This shows that the higher the non-response rate, the larger the fresh sample that must be drawn at the second (current) occasion.

(c) For fixed values of $f_1$, $\rho_{yx}$ and $\rho_{yz}$, the values of $\mu^*$ decrease and the values of $L^*$ increase with increasing values of $W$. This indicates that for a higher non-response rate, a smaller fresh sample is required at the current occasion.

The sample means of the respective variates, based on the sample sizes shown in the suffixes.
$\rho_{yx}, \rho_{xz}, \rho_{yz}$: The correlation coefficients between the variates shown in the suffixes.
$S_x^2, S_y^2, S_z^2$: The population mean squares of $x$, $y$ and $z$ respectively.
The proportion of non-responding units in the population at the first occasion.
The proportion of non-responding units in the population at the second (current) occasion.
The sample regression coefficient between the variables shown in the suffixes, based on the sample size shown in brackets.

REMARK 4.1. Following the Hansen and Hurwitz [9] technique, some expectations used in Theorems 4.1 and 4.2 are evaluated as given below:

The estimator for the population mean $\bar{Y}$ at the current occasion for this case may be obtained as
$$\Delta^{**} = \psi^{**}\,\Delta_u + (1 - \psi^{**})\,\tau_m \qquad (32)$$
where $\psi^{**}$ is an unknown constant to be determined so as to minimize the mean square error of the estimator $\Delta^{**}$. Since $\Delta_u$ and $\tau_m$ are ratio and chain-type regression in ratio estimators, they are biased for the population mean $\bar{Y}$. Therefore, the resulting estimator $\Delta^{**}$ defined in equation (32) is also a biased estimator of $\bar{Y}$.

Theorem 7.3: The bias of the estimator $\Delta^{**}$ to the first order of approximation is the same as that of the estimator $\Delta$ discussed in Theorem 4.1.

Theorem 7.4: The mean square error of the estimator $\Delta^{**}$ to the first order of approximation is obtained as
$$M(\Delta^{**}) = \psi^{**2} M(\Delta_u) + (1 - \psi^{**})^2 M(\tau_m) + 2\psi^{**}(1 - \psi^{**})\,C(\Delta_u, \tau_m) \qquad (33)$$

Further, substituting the values of $M(\Delta_u)$, $M(\tau_m)$ and $C(\Delta_u, \tau_m)$ in equation (37), the simplified value of $M(\Delta^{**})_{opt}$ is obtained as equation (38).

The admissible value of $\mu^{**}$ is obtained in a manner similar to that used for $\mu^{(0)}$ and $\mu^{*(0)}$. Substituting the admissible value of $\mu^{**}$, say $\mu^{**(0)}$, from equation (39) into equation (38), we obtain the optimum value of the mean square error of $\Delta^{**}$.

To compare the performance of the estimators Δ, Δ * and Δ **

Figure 1. Percent relative loss in precision of $\Delta$ over $\tau$ at optimum values of $\mu$ when non-response occurs at both occasions, for $f = 0.1$, $\rho_{yx} = 0.7$, $W = 0.2$ and $f_1 = 2.0$.

Figure 2.

Figure 3. Percent relative loss in precision of $\Delta^{**}$ over $\tau$ at optimum values of $\mu$ when non-response occurs only at the second (current) occasion, for $f = 0.1$, $\rho_{yx} = 0.7$ and $W = 0.2$.

Discussion on the Behavior of Estimator Δ*
We may conclude from Table 2 that:
(a) For fixed values of $f_1$, $W$ and $\rho_{yx}$, the values of $L^*$ increase while no pattern is visible in the values of $\mu^*$ with increasing values of $\rho_{yz}$.
(b) For fixed values of $W$, $\rho_{yx}$ and $\rho_{yz}$, the values of $\mu^*$ decrease and the values of $L^*$ increase with increasing values of $f_1$.

Discussion on the Behavior of Estimator Δ**
The following conclusions can be drawn from Table 3:
(a) For fixed values of $f_2$, $\rho_{yx}$ and $W$, the values of $\mu^{**(0)}$ decrease but the values of $L^{**}$ increase as the value of $\rho_{yz}$ increases. This indicates that if a highly correlated auxiliary variate is available, it pays in terms of reducing the cost of the survey.
(b) For fixed values of $\rho_{yx}$, $\rho_{yz}$ and $W$, the values of $\mu^{**(0)}$ and $L^{**}$ increase with increasing values of $f_2$.
(c) For fixed values of $\rho_{yx}$ and $\rho_{yz}$, the values of $\mu^{**(0)}$ decrease while the values of $L^{**}$ increase with increasing values of $W$. This shows that the higher the non-response rate, the greater the loss in precision.

Under the above transformations, the estimators $\Delta_u$ and $\Delta_m$ take the forms given in equations (4) and (5).