LEARNING 2000:

Educational Institution and the Creation of Human Capital

 

Technical Report 04-2

 

 

 

Methodological Issues in the Sampling Design of the

Third Wave of TEPS*

 

 

 

Tony Tam

Academia Sinica

 

 

(First Draft)  October 23, 2004

 

Approx. Word Count: 6,300

 


 

Methodological Issues in the Sampling Design of the

Third Wave of TEPS

 

ABSTRACT

 

This paper identifies the basic methodological issues involved in the sampling design of the third wave of the Taiwan Educational Panel Survey (TEPS). Central to the design are three questions: (1) the sampling of schools, (2) the sampling of panel members from the second-wave junior high students for two more waves of follow-up, and (3) the sampling of classes within school. A systematic discussion of successive hypothetical scenarios helps resolve potential confusions and concerns about the sampling design. The motivation for a partially fixed school sample design is discussed and justified. The proposed solutions to the three design issues are tightly integrated and can simultaneously address the need for intercohort tests of institutional and policy impacts, efficient estimation of school effects, avoidance of following up an overly dispersed panel, and maximizing the potential for obtaining class contexts not just for the fresh K11 sample but also for the continuing panel.

 


Methodological Issues in the Sampling Design of the

Third Wave of TEPS

The Taiwan Educational Panel Survey (TEPS) is a multistage stratified sample survey of Taiwanese high school students (Chang 2003). We take for granted that an appropriate sampling design should be one that can efficiently achieve the main analytic objectives for which the sample is designed. All sampling designs entail costs. The design with the lowest operational cost does not usually have the best efficiency in achieving the analytic objectives. An optimal design is one that strikes a good balance between analytic imperatives and cost concerns. It is reassuring to know that the sampling design for the first two waves of TEPS not only economized on cost but was also optimized for uniquely significant analytic objectives, such as the multilevel study of the relative roles of school, class, teacher, peer, and family influences on students.

From its inception, the sampling committee decided to give up the common practice of maintaining approximately equal-probability sampling or probability-proportional-to-size sampling. The payoff is the freedom to achieve multiple practical and analytic goals without complicating the usual procedure and requirement of applying sampling weights. The application of sampling weights in descriptive statistical analysis of TEPS should be routine. Weighting is not mandatory for statistical modeling, but applying sampling weights is the most conservative practice (DuMouchel and Duncan 1983; Winship and Radbill 1994), and it can be routinely carried out for virtually all common modeling procedures, for instance, in the widely used software Stata, without requiring extra programming or adjustment of standard errors (StataCorp 2001, pp. 263-265). This paper addresses basic issues of sampling design and weight construction for the third wave of TEPS — after the junior high cohort (grade seven in the fall of 2001, or the 2001-K7 cohort) is promoted to year two of senior high, vocational high, or five-year junior college programs (grade eleven in the fall of 2005, i.e., becoming the 2005-K11 cohort). The central sampling design question is how to draw a new sample of students (the 2005-K11 sample) to be compared with previously sampled students in the same grade level in 2001 (the 2001-K11 sample).
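To illustrate the routine mechanics of weighting, the following is a minimal sketch in Python with placeholder data and hypothetical variable names; it is not part of any TEPS processing code, and Stata's weighting options play the corresponding role there. It computes a weighted mean and probability-weighted regression point estimates.

import numpy as np

# Placeholder data only: x is a regressor, y an outcome, w a sampling weight.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)
w = rng.uniform(1, 10, size=n)

# Weighted mean for descriptive analysis.
print("weighted mean of y:", np.average(y, weights=w))

# Probability-weighted least squares point estimates: solve (X'WX) b = X'Wy.
# (Point estimates only; design-based standard errors are a separate matter.)
X = np.column_stack([np.ones(n), x])
XtW = X.T * w
beta = np.linalg.solve(XtW @ X, XtW @ y)
print("weighted intercept and slope:", beta)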

Before a systematic discussion of the design questions, it is important to discuss and clarify a number of methodological issues that may not be obvious to all readers.

 

KEY METHODOLOGICAL ISSUES

Irrelevance of Sampling by Nature

Throughout the following discussion of sampling design, we will repeatedly come across issues about the construction of sampling weights. It is useful to state at the outset two guiding principles for constructing sampling weights. First, the sampling design must consist of a well-defined and replicable procedure of randomly drawing a sample from a target population. A well-defined sampling procedure is one that permits the assignment of an unambiguous and appropriate sampling weight to each case in the sample. Second, properly normalized sampling weights reflect how many cases in the population each case in the sample should represent. The weighting question is determined by the sampling procedure.

The implication of the second guiding principle requires some explanation. To appreciate it, it is useful to distinguish between sampling steps introduced by the researcher and sampling processes introduced by nature. (By nature we mean every causal process not subject to the researcher’s intervention.) The former is what sampling weights are all about. The latter is what statistical modeling should be concerned with but is totally irrelevant to the construction of sampling weights. Improper attention to sampling by nature would only misguide and often disable the construction of sampling weights. To underscore this statement, it is useful to call it the Principle of Irrelevance of Sampling by Nature.

We have talked about explicit and implicit sampling steps introduced by the researcher. It is worth giving two specific examples of sampling by nature. One is the distribution of students across geographic areas; the other is the allocation of students across schools. Students have different probabilities of residing in northern or southern Taiwan, depending on where their grandparents grew up, what their parents do, and even on how well they did on the senior high school entrance exam. The probability depends on the area and the student. Similarly, students have different probabilities of attending any given school, depending on such factors as public exam scores and locational preferences. The probability also depends on the student and the school. Even with a perfect list of the universe of students, together with their residence areas and schools, the researcher cannot possibly estimate the probability of any student living in his/her area of residence or attending his/her school.

Sampling by nature is ubiquitous. In fact, the probability of any sampling unit falling into any stratum in a typical stratified sampling design is determined by nature and varies from unit to unit. Fortunately, our ignorance of these probabilities is perfectly fine. It does not handicap the construction of sampling weights at all, because these probabilities are determined by nature, not by anything the researcher does. Sampling weights are designed to aid statistical inference from a sample statistic — constructed by the researcher via a reproducible procedure of sampling from a target population — to the corresponding parameter in the target population. Whatever nature does to the distribution of sampling units (such as students) across aggregate units (such as geographic areas or schools), it does not tell us the sampling procedure executed by the researcher. All we have to know for proper inference is how each case is drawn from the target population. To the extent that what nature does would not affect how a case is drawn, it is absolutely irrelevant to the construction of sampling weights. This is the Principle of Irrelevance of Sampling by Nature.

 

Fixed School Sample Design: Setup A

To facilitate further exposition, it is useful to consider separately three hypothetical and idealized setups (A, B, and C) of the sampling of schools before considering the actual design to be applied to the third wave of TEPS. Each hypothetical setup allows us to discuss and resolve some basic issues without unnecessary complications. Successive scenarios introduce complications and new issues. By the time we get to the TEPS design, the reader will appreciate the nature of, and solutions to, various basic issues that contribute to a proper understanding of the TEPS design.

The first scenario is setup A and it is the simplest:

1.          The population of schools in W3 is identical to the 800 schools in W1.

2.          The mix of programs offered by, and other stratifying attributes of, each school are absolutely stable.

3.          In W1, 300 schools (call it W1-300 henceforth) were sampled using nonproportional stratified random sampling according to (a) program tracks, (b) over two dozen major administrative districts defined by the Executive Yuan as of January 1, 2000, (c) a maximum of four levels of urbanization, and (d) private and public sectors.

4.          In W3, 300 schools (call it W3-300) are to be sampled.

If the ideal is to maintain (a) as much overlap with, and (b) as close a replicate of the sampling design of, W1-300 as possible, setting W3-300 to be the W1-300 set exactly achieves the ideal. This W3-300 set by definition overlaps 100 percent with the W1 set, and its sampling design is identical to the one in W1. Note that the identical sample implies that the sampling weight of each school is identical to its weight for W1 inference to the population. We shall call the use of this ideal set the fixed school sample design (FXD).

 

A Cautionary Tale on Implicit Sampling

        In the computation of sampling weights, it is important to take account of all sampling steps explicitly or implicitly introduced by the researcher. An oversight of any step would lead to erroneous weights and even self-contradictory anomalies.

Consider the following three-part reasoning for constructing sampling weights for W3-300. (P1) First, the ideal set of W3-300 can be regarded as an outcome of the following stratified sampling procedure: (a) Step 1.—Stratify the population of schools for W3 into two strata: stratum S1 composed of the 300 schools from W1, and stratum S2 composed of the remaining 500 schools in the population. (b) Step 2.—Sample all schools in S1 and none from S2. W3-300 is the result. (P2) Second, the sampling weight for S1 is identically one for every one of the 300 schools, meaning that each of these 300 represents only one school in the population or, more precisely, is representative only of stratum S1. (P3) Third, since none is sampled from stratum S2, this two-step sampling procedure shows that the sample fails to represent the population, as it leaves stratum S2, in fact 5/8 of the population, totally unrepresented.

How sound is this argument? It implies that the ideal above is anything but ideal — using the identical set of schools from the W1 sample results in a grossly unrepresentative sample for the W3 population. However, the argument has self-contradictory implications. On the one hand, it acknowledges that W1-300 is representative of the W1 population, hence also acknowledging that W3-300 (being identical to W1-300) is representative of the W1 population. On the other hand, it asserts that W3-300 is grossly unrepresentative of the W3 population, even though the W1 and W3 populations are identical. In other words, the argument implies the following:

(A1) W1 population ≡ W3 population

(A2) W1-300 ≡ W3-300

(A3) Sampling weights for W3-300 ≠ those for W1-300

(A4) W1-300 is representative of W1 population

(A5) W3-300 is grossly unrepresentative of W3 population

A1-A5 cannot all be true; thus the argument is self-contradictory.

The error in the argument lies in its second part (P2). It ignores the fact that, in order to sort the population into the two strata in step 1, the researcher must implicitly undertake a random sampling procedure that should have been taken into account in defining the sampling weights. Imputing an equal weight of one to every one of the W3-300 schools erroneously ignores the implicit sampling step required before one can sort the W3 population into strata S1 and S2. Explicit accounting for the implicit sampling step would require using the W1 sampling weight for each of the 300 schools, implying that each school in fact represents more than one school in the population and vindicating that W3-300 is representative of the entire W3 population. Thus the representativeness of W3-300 for the W3 population is identical to that of W1-300 for the W1 population, and hence there is no self-contradiction. Schematically, the argument A1-A5 should be replaced by B1-B5:

(B1) W1 population ≡ W3 population

(B2) W1-300 ≡ W3-300

(B3) Sampling weights for W3-300 = those for W1-300

(B4) W1-300 is representative of W1 population

(B5) W3-300 is representative of W3 population

By now it should be clear that there is nothing wrong with the two-step stratified sampling conception of the W3-300 sample. What is wrong is the imputation of sampling weights in P2 of the argument, which ignores the implicit sampling step and leads to self-contradictory implications. To be exact, the two-step stratified sampling conception should have been formulated as a three-step conception, the first step of which is exactly the stratified sampling procedure for W1-300. Viewed in this way, the three-step conception adds nothing but two redundant and distracting steps.
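To make the resolution concrete, the following minimal simulation sketch (with hypothetical strata and numbers, not TEPS data) draws a nonproportional stratified sample from a fixed population and then reuses the same sample: with its original design weights the reused sample still recovers the population mean, whereas imputing a weight of one to every case, as P2 would have it, does not.

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population of 800 schools in two strata with different means and
# deliberately nonproportional sampling fractions (numbers are illustrative only).
strata = {
    "large": {"N": 600, "n": 150, "mean": 60.0},
    "small": {"N": 200, "n": 150, "mean": 40.0},
}

pop_vals, samp_vals, samp_wts = [], [], []
for s in strata.values():
    vals = rng.normal(s["mean"], 5.0, size=s["N"])
    pop_vals.append(vals)
    idx = rng.choice(s["N"], size=s["n"], replace=False)   # the W1 stratified draw
    samp_vals.append(vals[idx])
    samp_wts.append(np.full(s["n"], s["N"] / s["n"]))      # W1 design weights

pop_vals = np.concatenate(pop_vals)
samp_vals = np.concatenate(samp_vals)
samp_wts = np.concatenate(samp_wts)

# Reusing the same 300 schools in W3 with their W1 weights (B1-B5) still
# recovers the population mean; imputing a weight of one to each (P2) does not.
print("population mean:         ", pop_vals.mean())                          # ~55
print("weighted W3-300 estimate:", np.average(samp_vals, weights=samp_wts))  # ~55
print("weight-one estimate:     ", samp_vals.mean())                         # ~50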

 

Partially Fixed School Sample Design: Setup B

The FXD in setup A does have common parallels in the sampling design of many other large-scale surveys. Consider, for instance, the survey design of many micro surveys conducted by the government in Taiwan. Local geographical sampling units are usually fixed for an extended period of time and across different surveys. Only sampling within geographical units is conducted afresh for different surveys. These geographical units are analogous to the schools in the FXD.

The FXD can be easily modified, as is done in two of the cross-sectional labor surveys best known to Taiwanese economists: the Manpower Utilization Surveys (MUS) in Taiwan (Yu 2002; Lin and Chang 2003) and the Current Population Surveys (CPS) in the U.S. (Angrist and Krueger 1999). The MUS and CPS retain a fixed proportion of their sampled households from wave to wave. The households there are analogous to the schools here, except that a fraction of the households are replaced in consecutive waves.

Thus the second setup introduces a partially fixed school sample design, similar to what happens in the MUS and CPS. Instead of retaining the entire W1-300 sample, only a subset (N1) of the W1-300 schools is retained for W3, and a new set of schools (N2) outside the W1-300 sample is added. For specificity, consider the following scenario:

1.          The population of schools in W3 is identical to the 800 schools in W1.

2.          The mix of programs offered by, and other stratifying attributes of, each school are absolutely stable.

3.          In W1, 300 schools (call it W1-300 henceforth) were sampled using nonproportional stratified random sampling according to (a) program track, (b) over two dozen major administrative districts defined by the Executive Yuan as of January 1, 2000, (c) a maximum of four levels of urbanization, and (d) private-public sector.

4.          In W3, 250 of the 300 schools (call it W1-250) are to be sampled, and 50 new cases are drawn from the 500 schools outside W1-300 via the same stratified sampling procedure in the following way: (a) randomly select 25 strata out of a total of 40 school strata; (b) within each stratum of the W1-300 sample, each unit has already been assigned a random number from a generator for the uniform distribution; (c) all units are sorted within stratum according to the random numbers, which determine their priority order of being sampled; (d) for example, if in stratum S4 of W1-300 the sampling procedure stopped at the 5th unit, now draw the 6th and 7th units and drop the 4th and 5th units, hence replacing two of the cases in W1-300 without altering the sampling design; (e) repeat the procedure for all 25 strata.

If every stratum has at least two schools sampled and at least two schools not in W1-300, the sampling weights for W1-250 are simply their weights in W1-300, and the sampling weights for the 50 new schools should be the same as those of the schools they replace within their strata. In general, however, things may not work out so neatly, and the number of schools drawn from a stratum may differ between W1 and W3. In any event, the adjustment of weights involves no more than going back to the W1 stratification scheme and recalculating the sampling proportions wherever appropriate.
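To fix ideas, a minimal sketch of the within-stratum replacement procedure described in part 4 above follows; the stratum size, the number of schools drawn, and the school identifiers are hypothetical.

import numpy as np

rng = np.random.default_rng(2)

# Hypothetical stratum of 20 schools; ids and counts are illustrative only.
school_ids = np.arange(101, 121)
priority = rng.uniform(size=school_ids.size)   # step (b): fixed uniform random numbers
order = school_ids[np.argsort(priority)]       # step (c): priority order within stratum

k = 5                                          # schools drawn for W1 in this stratum
w1_draw = order[:k]                            # W1 sample: first k units in priority order

# Step (d): drop the 4th and 5th W1 units and draw the 6th and 7th instead,
# replacing two schools without altering the underlying sampling design.
w3_draw = np.concatenate([order[:k - 2], order[k:k + 2]])

print("W1 schools in this stratum:", w1_draw)
print("W3 schools in this stratum:", w3_draw)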

In the specific setup above, N1+N2=300. In fact, the partially FXD can be modified so that the number of new schools is larger, or smaller, than the number dropped from the W1-300 sample. This minor extension does not introduce any new issue; it does, however, necessitate the recalculation of all sampling weights.

 

A Shifting Target Population: Setup C

We now introduce yet another complication to the previous setup: instead of an absolutely stable population, the W3 population is allowed to differ from the W1 population. Table 1 provides a schematic summary of setups A, B, and C. Again, to fix ideas, consider the following scenario for the third setup:

INSERT TABLE 1 ABOUT HERE.

1.          The population of schools in W3 is (a) the 800 schools in W1 plus (b) 100 new schools. We call these strata S800 and S100, respectively.

2.          The mix of programs offered by, and other stratifying attributes of, each of the 800 W1 schools are absolutely stable.

3.          In W1, 300 schools (call it W1-300 henceforth) were sampled using nonproportional stratified random sampling according to (a) program track, (b) over two dozen major administrative districts defined by the Executive Yuan as of January 1, 2000, (c) a maximum of four levels of urbanization, and (d) private-public sector.

4.          In W3, 250 of the 300 schools (call it W1-250) are to be sampled, and 50 new cases are drawn from the 500 schools outside W1-300 via the same stratified sampling procedure in the following way: (a) randomly select 25 strata out of a total of 40 school strata; (b) within each stratum of the W1-300 sample, each unit has already been assigned a random number from a generator for the uniform distribution; (c) all units are sorted within stratum according to the random numbers, which determine their priority order of being sampled; (d) for example, if in stratum S4 of W1-300 the sampling procedure stopped at the 5th unit, now draw the 6th and 7th units and drop the 4th and 5th units, hence replacing two of the cases in W1-300 without altering the sampling design; (e) repeat the procedure for all 25 strata.

5.          In addition, 40 schools are drawn via a separate stratified random sampling procedure from stratum S100.

Although the change in the population (parts 1 and 5) may appear to be a major complication, in fact it is not. Part 4 of this scenario is clearly parallel to setup B, and its implication for the construction of sampling weights has already been resolved. What is new here is part 5. Yet its implication for sampling weights is also standard, as in any stratified sampling design.

An important extension of the scenario is to allow S800 to shrink, that is, to allow some schools in S800 to disappear by W3. For specificity, assume that 50 of the S800 schools have disappeared from the population, leaving only 750 of the original schools in the W3 population, and that, say, 20 sample schools are on the dropout list. What does this complication do to the computation of sampling weights? As it turns out, the complication is already anticipated in setup B. No new issue is raised here.

Moreover, the complication is exactly analogous to the common situation in which a survey has to go through post-survey adjustment of sampling weights after field work reveals that the sampling frame had changed just before interviews were conducted, or after interviews have been completed and a substantial refusal rate has been recorded. In the present context, we have to go back to the stratified sampling scheme and adjust the sampling proportion for each stratum in light of the cases lost from S800.
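A hedged sketch of this kind of adjustment follows; it assumes the simple rule that each stratum's design weight is recomputed from the population and sample counts remaining after the dropouts, and all numbers are illustrative rather than actual TEPS figures.

# Post-survey weight adjustment under an assumed rule: the design weight of a
# stratum becomes its remaining population count over its remaining sample count.
def adjusted_weights(pop_remaining, sample_remaining):
    """Per-stratum design weight N_h / n_h computed from post-dropout counts."""
    return {h: pop_remaining[h] / sample_remaining[h] for h in pop_remaining}

# Example: stratum S4 shrinks from 80 to 75 schools in the frame, and 2 of its
# 30 sampled schools are among the dropouts.
print(adjusted_weights({"S4": 75}, {"S4": 28}))   # {'S4': 2.678...}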

We now turn to the case of TEPS and the central questions of sampling design for the third wave of TEPS.

 

Sampling Design for the Third Wave of TEPS

To obtain the new 2005-K11 sample of students for the third wave (W3) of TEPS for (1) intercohort comparative analysis and (2) a follow-up of wave one (W1, conducted in 2001) TEPS students, we have to address three major questions. The first should be considered without regard to following up any TEPS students from W1; the second and third are in different ways related to the W1 panel students. The questions are: (I) sampling of W3 schools, (II) subsampling of W1 and W2 TEPS junior high students for the W3 and W4 follow-up surveys, and (III) sampling of W3 classes to obtain a fresh new sample of K11 students in 2005. The sampling of schools is a crucial first step because it is necessary for obtaining the sampling weights for classes and students within school.

 

The Setting

TEPS is entering its third wave. In the first two waves, two cohorts of students were the target populations. One is the 2001-K11 cohort, consisting of students entering year two of senior high, vocational high, or five-year junior college programs in the fall of 2001. The other is the 2001-K7 cohort, consisting of students entering year one of junior high in the fall of 2001. The 2001-K7 cohort was chosen to be surveyed again in the fall of 2005 to enable rigorous intercohort comparisons. The long-term nature of TEPS imposes certain imperatives on all waves in general and the third wave in particular. The context of data collection is also in flux, entailing some design complications specific to the third wave. This subsection briefly reviews the imperatives and the changing context before we address the three main design issues mentioned at the start.

 

Imperatives

As a long-term survey intended to provide opportunities for evaluating the institutional and policy impacts on high school students, the design of the third wave of TEPS must meet certain constraints on collecting data that would permit (1) intercohort comparison of learning experiences under different institutions and policies and (2) long-term panel analysis of learning over varying class and school contexts. Simply put, there are four imperatives:

1.          All of the fundamental design features of the 2001-K11 sample should be maximally replicated in the 2005-K11 sample to avoid unnecessary comparability problems — including the way multistage stratified sampling is done and, most significantly, sampling of classes within school and multiple students within class.

2.          School is a critically important source of institutional influences on student learning, and it is a key component in the organization of survey operations. We need an efficient sampling design for the estimation of school effects and must lay out the proper computation of the sampling weights for schools. Hence it would be ideal to retain as many W1 schools as possible to provide the maximal potential for measuring fixed effects of schools on student learning.

3.          As many of the 2001-K7 sample as possible should be followed up in the third wave.

4.          W3 questionnaire items must be sufficiently comparable with those in W1 to permit intercohort comparative analysis.

 

Changing Contexts

Even if all four imperatives can be met, we cannot control the context in which the sampling is to be conducted. Not only are the institutional and policy contexts affecting K11 students changing, but the population of high schools serving K11 students is shifting as well.

It is worth noting that the sampling frame for the 2001-K11 sample was based on the latest Ministry of Education data on the program offerings of every high school in Taiwan. In order to start data collection in the fall of 2001, TEPS had to complete sampling, conduct pilot testing, and obtain authorization to survey a school well ahead of September 2001. At the time of the first sampling operation, the Ministry of Education could only provide information up to the academic year 1998-1999. In other words, the target population for the 2001-K11 sample consists of schools on record in the Ministry of Education database of schools and program offerings for the academic year 1998-1999, not 2001-2002, even though the field work was conducted in the fall of 2001. Similarly, at the time of the follow-up field operation (fall of 2004) in preparation for the third wave, the latest sampling frame of high schools available was for academic year 2003-2004 (Min92). This list provides the sampling frame for the third-wave schools. It implies that only schools offering K10 classes as of academic year 2003-2004 would be eligible for the survey at the end of 2005. Post-survey sampling weights will take into account the possibility that some of those eligible schools may drop out of the population targeted by the survey.

A careful comparison of the two sampling frames identifies a substantial number of schools offering new programs, especially senior high or comprehensive programs, and a substantial number of schools terminating old programs, notably vocational high or five-year junior college programs. Some schools have closed since we completed our W2 survey. Some continuing schools simply closed the program for which TEPS sampled them. A few schools changed from privately owned to publicly owned. This feature of an unstable target population of schools is exactly what distinguishes setup C from the other two setups. Table 2 presents the differences between the two sampling frames, the preliminary results from the 2004 tracking survey of the 2001-K7 sample, and projections for two opposite sampling designs for schools (for simplicity, without breakdown into the four program tracks). It shows that the FXD would very likely capture at least 7,000 of the nearly 20,000 students in the TEPS junior high sample (2001-K7), a finding of key interest when we address issues II and III of the sampling design.

INSERT TABLE 2 ABOUT HERE.

 

Issue I. Sampling of Schools

Given the imperatives for the third wave of TEPS, the key decision is between two extreme options: either adopt the FXD or draw a fresh new sample of schools following, as closely as feasible, the sampling procedure for the 2001-K11 sample.

Since the target populations of the W1 and W3 surveys are partly disjoint, the discussion of setup C in the previous section is relevant to the problem here. The discussion shows that, even when the FXD or a partially FXD is desired, the shifting population necessitates the selection of a supplementary sample of schools to make sure that the new W3 school sample is representative of the entire target population. The other necessary adjustment is to the sampling weights for all schools from the 2001-K11 sample. But these adjustments are standard and do not involve any new methodological issue.

The issue then boils down to the merits and demerits of the FXD relative to a fresh school sample design. What is so good about the FXD? Surveying the same schools obviously economizes on the costs of learning about a new school and building a working relationship with its staff. In addition to savings on operational costs, there are two major virtues related to the estimation of school effects.

The first virtue concerns the efficient estimation of the effects of observed school attributes, while the second concerns the efficient estimation of, and control for, all school attributes stable from W1 to W3 (and possibly W4), using the maximum number of cases from the two cohorts of students. To understand the first virtue of the FXD, it is instructive to draw a comparison with the classical statistical analysis of linear regression. Holding everything else the same, it is well known that estimation efficiency is highest if X is fixed throughout the repeated sampling process in which only the unobserved heterogeneity variable (usually called the errors) is drawn afresh from a distribution. If X is not fixed but repeatedly sampled together with the errors, the sampling distribution of the coefficient estimator will have larger variance than if X is fixed. Hence estimation is less efficient in this case than in the case where X is fixed in the repeated sampling process. Similarly, when the schools are fixed from W1 to W3, we maintain maximal efficiency.
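A minimal simulation sketch in Python (not from the paper; the sample size, design, and error distribution are arbitrary choices) illustrates the point: across replications, the ordinary least squares slope has smaller sampling variance when the design X is held fixed than when X is redrawn together with the errors.

import numpy as np

rng = np.random.default_rng(3)
n, reps, sigma, beta = 10, 20_000, 1.0, 1.0

# A fixed design whose spread matches the expected spread of n standard-normal
# draws, so the two scenarios below differ only in whether X is redrawn.
x_fixed = np.linspace(-1.5, 1.5, n)
x_fixed = x_fixed * np.sqrt((n - 1) / np.sum((x_fixed - x_fixed.mean()) ** 2))

def ols_slope(x, y):
    xc = x - x.mean()
    return xc @ (y - y.mean()) / (xc @ xc)

b_fixed, b_random = [], []
for _ in range(reps):
    # Fixed-X replication: only the errors are drawn afresh.
    y = beta * x_fixed + rng.normal(0, sigma, n)
    b_fixed.append(ols_slope(x_fixed, y))
    # Random-X replication: X is redrawn together with the errors.
    x_new = rng.normal(0, 1, n)
    b_random.append(ols_slope(x_new, beta * x_new + rng.normal(0, sigma, n)))

print("sampling variance of slope, X fixed:  ", np.var(b_fixed))    # ~ 1/9 = 0.111
print("sampling variance of slope, X redrawn:", np.var(b_random))   # ~ 1/7 = 0.143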

The second virtue is related to the control for school fixed effects, as in a study of student-level processes in which the researcher needs to partial out ALL stable school-level influences — whether or not measured by TEPS in any way. The more overlap between the W1 and W3 schools, the more students are available to the fixed-effects analysis. Students in W1-only or W3-only schools cannot possibly contribute any information to the school fixed effects and are necessarily left out of the fixed-effects analysis. (For simplicity and illustration, consider the numbers specified in setup A: a 100-percent overlap of W3 schools means all students and schools contribute to the fixed-effects analysis. By contrast, if W1 and W3 are independent samples of the population, the expected number of schools co-present in the W1 and W3 samples is 800×(3/8)×(3/8)=112.5, many fewer than the ideal 300.) Notice that if school fixed effects exert powerful influences on a student outcome, so that student outcomes within school are highly correlated or homogeneous, there may be little residual variance available for the statistical identification of any effects outside the school fixed effects (see, e.g., Griliches [1986] for a generic econometric exposition and Tam [1997] for the problem in a major substantive literature). The penalty of losing nearly two-thirds of the schools can be very high. The loss not only threatens the representativeness of the analytic sample, but may also aggravate the unreliability of statistical estimation in a fixed-effects analysis.
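The expected overlap in the parenthetical can be checked directly; the short sketch below assumes independent equal-probability samples of 300 out of 800 schools.

# Each school appears in both independent samples with probability (3/8) * (3/8).
n_pop, n_sample = 800, 300
expected_overlap = n_pop * (n_sample / n_pop) ** 2
print(expected_overlap)   # 112.5, versus 300 schools under the FXD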

In sum, there are large and unique benefits to adopting the FXD of setup A. When the FXD is not feasible due to, e.g., a shifting target population, a partially FXD such as the one specified in setup C would still be superior to a fresh school sample design. As a result, TEPS will adopt a partially fixed school sample design similar to the one in setup C.

 

Issue II. Sampling of Existing TEPS Panel

While the first design issue focuses on the objective of intercohort comparative analysis, another key objective of TEPS is to provide a long-term panel of learning experience in high school. To provide such a panel, it is imperative that TEPS follow up as many of the 2001-K7 sample as is feasible. With an initial sample size of nearly 20,000 students in the 2001-K7 sample, the odds are heavily in favor of the bet that the continuing panel of students will be dispersed into practically every school in the target population of schools with K11 programs, and within school these students are likely to be distributed widely over the universe of K11 classes. The practical implication of following up every case would be to survey virtually every school with a K11 program. The cost would be much higher than the budget limit TEPS has been working with, and it would be much less cost-effective than the operation in the first two waves because the number of students available per school would go down while the number of classes and teachers involved would go up sharply. In short, the cost consequence of following the entire panel is prohibitively high; TEPS must subsample the 2001-K7 panel for long-term follow-up.

It is simple to randomly subsample the TEPS panel. One natural way would be to subsample a pre-set proportion of the panel with equal probability for everyone. A more efficient and cost-effective way would be to stratify the panel members according to measured attributes, such as family background and degree of urbanization of the residence area, and then select pre-set proportions within each stratum. In either case, however, this kind of procedure would not prevent the problem of traveling all across Taiwan to many K11 schools for the follow-up survey of the panel.

Recall that we proposed a partially fixed school sample design to address the first design issue. The same proposal suggests itself as the natural solution for this design issue as well. Is it not natural to subsample the panel members first at the level of K11 schools? This school-based subsampling automatically resolves the practical problem of an overly dispersed panel sample. There is also the invaluable convenience of piggybacking on the sampling of schools for the 2005-K11 cohort. We have discussed the important virtues of the partially FXD for the 2005-K11 sample and shall not repeat them here. Indeed, we have decided to adopt the partially FXD for subsampling the 2001-K7 panel, and we set the school sample for this purpose to be identical to the school sample for the fresh 2005-K11 sample.

The construction of sampling weights is standard. Given the chosen design, the sampling weight for each panel member in W3 is based on the product of two probabilities: the sampling probability (p1) of the member in W1 and the conditional sampling probability (q3) of the member in W3. The reciprocal of the product (p3 = p1×q3) is the appropriate sampling weight for the member.
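A minimal sketch of this computation, with hypothetical probabilities rather than actual TEPS values:

# The W3 weight of a continuing panel member is 1 / (p1 * q3).
def panel_weight_w3(p1: float, q3: float) -> float:
    """Reciprocal of the overall selection probability p3 = p1 * q3."""
    return 1.0 / (p1 * q3)

# Example: p1 = 1/400 at W1 and q3 = 0.6 given the W3 school subsample,
# so this member represents about 667 students in the cohort population.
print(panel_weight_w3(1 / 400, 0.6))   # 666.67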

 

Issue III. Sampling of Classes

        After choosing the design for the sampling of W3 schools for the 2005-K11 sample and the subsampling of the 2001-K7 panel, there remains the problem of sampling classes within schools.

The source of the problem is that many of the 2001-K7 panel will be in the roughly 300 W3 schools across the four program tracks, but they are distributed across classes in an uncontrolled way. As stated earlier, at least 7,000 TEPS panel members happen to be continuing their schooling in the W3 TEPS schools. On average, each W3 school will capture over 23 TEPS students. These students are likely to be distributed over most of the classes rather than, say, concentrated in one-third of the K11 classes. In addition, TEPS will follow up all or a random subset of the 2001-K7 panel members in W3 TEPS schools but, contrary to the design in W1 and W2, will not sample any classmates of the panel to provide class contexts. Sampling classmates is costly, and the number of classes needed might be much larger than we can handle, which would greatly limit the size of the panel TEPS can feasibly follow.

Fortunately, parallel to the TEPS panel entering K11 is a fresh sample of the 2001-K7 cohort now in K11 — the 2005-K11 sample of TEPS. We can kill two birds with one stone if we draw the new sample of classes for the 2005-K11 sample by oversampling classes containing 2001-K7 panel members, especially classes with multiple panel members.

The natural question is of course whether this efficient oversampling is statistically legitimate. The oversampling is perfectly legitimate because the procedure is equivalent to the standard use of stratified sampling. As far as sampling is concerned, identifying a student as a member of the TEPS panel and classifying the student into a panel-member stratum does not implicitly introduce any sampling. Moreover, whether a student is a member of the TEPS panel is uncorrelated with any characteristic of the classes; therefore the subpopulations of classes with and without panel members should be indistinguishable from each other. This equivalence renders the stratified sampling procedure as efficient as one without this stratification.

Within class, further sampling may or may not stratify students by whether they are members of the TEPS panel. With stratification, we can sample all TEPS panel members (i.e., with probability one) and non-panel members with probability less than one. This two-stage stratified sampling within school can maximize the overlap between the two samples, the 2005-K11 sample and the 2001-K7 panel sample, while retaining their respective sampling designs. The panel members in these classes can be used in two ways: (1) as part of the 2005-K11 sample and (2) as part of the 2001-K7 panel. Teacher evaluations for the subsample of TEPS panel members can be collected simply as part of the 2005-K11 sample.
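As an illustration, the following sketch implements a two-stage stratified selection within a single hypothetical school: classes containing panel members are oversampled, and within each sampled class panel members are taken with probability one while non-panel classmates are subsampled. The class sizes, panel rates, and sampling rates are all assumptions for illustration, not TEPS parameters.

import numpy as np

rng = np.random.default_rng(4)

# Hypothetical school: 10 K11 classes of 35 students; roughly 10 percent are
# continuing 2001-K7 panel members. All rates below are illustrative only.
classes = [[(f"c{c}s{s}", rng.random() < 0.10) for s in range(35)]
           for c in range(10)]

p_with, p_without, p_nonpanel = 0.8, 0.3, 0.5   # class and within-class sampling rates

roster = []
for cls in classes:
    has_panel = any(flag for _, flag in cls)
    class_prob = p_with if has_panel else p_without
    if rng.random() >= class_prob:      # stage 1: sample classes, oversampling
        continue                        # classes that contain panel members
    for sid, is_panel in cls:           # stage 2: stratified sampling within class
        student_prob = 1.0 if is_panel else p_nonpanel
        if is_panel or rng.random() < p_nonpanel:
            # Within-school weight; the school's own design weight multiplies on top.
            roster.append((sid, is_panel, 1.0 / (class_prob * student_prob)))

print(len(roster), "students sampled in this school")
print(roster[:3])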

 

Conclusion

Taken together, the sampling design for the third wave of TEPS is optimal for simultaneously addressing the need for intercohort tests of institutional and policy impacts, efficient estimation of school effects, subsampling of the TEPS panel from junior high, and maximizing the potential for obtaining class contexts not just for the fresh K11 sample but also for the TEPS panel. The simultaneous achievement of these goals is a unique accomplishment of the proposed design for the third wave of TEPS. We are not aware of similar accomplishments in a large national panel survey of high school (or primary school) students anywhere in the world. If properly exploited, the distinctive design features discussed in this paper will add value to future research based on TEPS.

 

REFERENCES

Angrist, Joshua D., and Alan B. Krueger. 1999. “Empirical Strategies in Labor Economics.” Pp. 1277-1366 in The Handbook of Labor Economics, Vol. 3A, edited by O. Ashenfelter and D. Card. Amsterdam: Elsevier Science.

Chang, Ly-yun. 2003. Taiwan Education Panel Survey: Base Year (2001) Parent Data [public release computer file]. Center for Survey Research, Academia Sinica [producer, distributor].

DuMouchel, William H., and Greg J. Duncan. 1983. “Using Sample Survey Weights in Multiple Regression Analyses of Stratified Samples.” Journal of the American Statistical Association 78:535-543.

Lin, Ji-Ping, and Ying-Hwa Chang. 2003. “Manpower Utilization Quasi-longitudinal Survey: Construction of Database, Applications, and Future Developments.” (in Chinese) Survey Research 13:39-69.

StataCorp. 2001. Stata User’s Guide: Release 7. College Station, TX: Stata Press.

Winship, Christopher, and Larry Radbill. 1994. “Sampling Weights and Regression Analysis.” Sociological Methods and Research 23(2):230-257.

Yu, Ruoh-Rong. 2002. “Attrition in Matched Manpower Utilization Surveys.” (in Chinese) Survey Research 11:5-30.

 


 

Table 1. Summary of Three Hypothetical Setups

Setup   W1 & W3 Populations of Schools        W3 Sample of Schools

A       {W3-800} = {W1-800}                   {W3-300} = {W1-300}

B       {W3-800} = {W1-800}                   {W3-300} = {W1-250} + {50}

C       {W3-900} > {W1-800}                   {W3-340} = {W3-300 of setup B}
        (i.e., 100 new schools added)                   + {40 out of 100 new schools}

        {W3-850} ≠ {W1-800}                   {W3-320} = {280 of W3-300 of setup B}
        (i.e., W1-800 left with 750)                    + {40 out of 100 new schools}

 

 

Table 2. Comparing the Sampling Frames of Wave One and Wave Three

                             Total # of     W1      W1 &     W3      Total # of
                             Schools W1     only    W3       only    Schools W3

Senior high                     275          21     254       74        328
Comprehensive                    61           3      58      105        163
Vocational                      301          47     254        5        259
Five-year junior college         69           7      62       11         73
Total                           706          78     628      195        823

 

Counts as of October 23, 2004.



* This research was supported by grants from Academia Sinica through the Learning 2000 project, National Academy of Educational Research and National Science Council through the Taiwan Educational Panel Survey (TEPS) project. I thank Yung-Tai Hung, Jing-Shiang Hwang, Lung-An Li, and Sam Peng for detailed comments and other members of the Sampling Design Committee of TEPS for helpful discussion. Any opinions expressed in the technical report series are those of the authors, not those of Academia Sinica and other funding agencies. Direct correspondence to Tony Tam (譚康榮), Academia Sinica, Institute of European and American Studies, Nankang, Taipei, 11529, TAIWAN (email: tam@sinica.edu.tw; tel: (02)3789-7249).

Copyright © 2004 by Tony Tam.  All rights reserved.