15 Matching Annotations
    1. SSUs in the same PSU are often spatially autocorrelated and provide less independent information than SSUs sampled across the full population

      In the Finnish NFI the distance between SSUs is set to 300 metres, because the range of autocorrelation is shorter than that. It might be worth mentioning that if information about the autocorrelation is available, this effect can be reduced.

    2. Weak or poorly measured covariates may provide little gain in precision and can mislead estimation if their uncertainty is not accounted for

      My experience is that using a model is always better than not using one, even when the model is very poor; I have tested this with very poor models. I am also not sure what you mean by accounting for the uncertainty here: since the formulas are based on the observed errors from the models, what else is there to account for?

    3. rely on model assumptions for validity

      I do not agree. In the design-based setting the working model does not need to be correct; in the model-based setting the situation is obviously different. I remember (from way back when I was teaching from it) that Cochran's book mentions some kind of bias in the regression estimator, but nowadays model-assisted estimation is generally considered design-unbiased, or at least approximately so. I need to check my yellow book, but in any case I do not think it is correct to state that the validity of regression estimation depends on the validity of the model.

    4. A common way to describe the relationship between two variables is with a straight line

      This part seems perhaps overly simplistic, as everybody and their dog nowadays uses model-assisted estimation with much more complicated models, or at least that is how it feels. While the one-predictor model is good for getting familiar with the idea, I think it would be good at least to mention the term model-assisted and to explain that much more complicated models are possible.

    5. SYS can produce biased estimates.

      Göran Ståhl once gave me a lesson about this. I know I wrote this sentence in my own chapter in 2006, but Göran later proved me wrong. It is not bias, it is just increased variance: if you go through all the possible starting points, calculate all the possible results, and take their mean, the estimator is exactly unbiased.
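
      The unbiasedness over all starting points is easy to check numerically. A minimal sketch, using a made-up trended population (not from the book):

      ```python
      import numpy as np

      rng = np.random.default_rng(0)
      N, k = 100, 10                          # population size and sampling interval (N = k * n)
      y = np.arange(N) + rng.normal(0, 5, N)  # hypothetical population with a linear trend

      # One systematic sample for each of the k equally likely starting points.
      means = [y[start::k].mean() for start in range(k)]

      # The k samples partition the population, so the average of the k sample
      # means equals the population mean exactly (up to floating point).
      print(np.mean(means) - y.mean())
      ```

      The individual sample means differ a lot because of the trend (that is the increased variance), but their average over all starts is exactly the population mean.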

    6. it may over- or

      Do you mean for periodic populations? Usually it is assumed to overestimate, as we generally assume a trend in the population. I think this should be made clear; the proof should be in Matérn's paper from 1960.

    7. The common workaround is to apply the SRS estimators for variance (11.17) and standard error (11.21).

      This is not the best approach: when there is a trend in the data and the systematic sample is truly more precise than SRS, the SRS estimator does not show it; instead, it overestimates the variance. Better estimators are available, based on differences between neighbours, e.g. by Grafström. See Räty, M., Kuronen, M., Myllymäki, M., Kangas, A., Mäkisara, K., Heikkinen, J. 2020. Comparison of the local pivotal method and systematic sampling for national forest inventories. Forest Ecosystems 7: 54.
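
      A successive-difference estimator is one example of the neighbour-based estimators this comment points to. A sketch under invented data (the trended population and all names are illustrative, not the estimator from the cited paper):

      ```python
      import numpy as np

      rng = np.random.default_rng(1)
      N, k = 1000, 20
      y = np.linspace(0, 100, N) + rng.normal(0, 1, N)  # strongly trended population

      sample = y[int(rng.integers(k))::k]  # one systematic sample
      n = len(sample)
      f = n / N                            # sampling fraction

      # Naive SRS variance estimator of the sample mean.
      v_srs = (1 - f) * sample.var(ddof=1) / n

      # Successive-difference estimator: built on differences between
      # neighbouring units in the systematic order, which removes most of
      # the trend that inflates the SRS estimator.
      d = np.diff(sample)
      v_sd = (1 - f) * (d @ d) / (2 * n * (n - 1))

      print(v_srs, v_sd)  # v_sd is far smaller for this trended population
      ```

      With the trend present, the naive SRS estimator is dominated by the trend variation, while the difference-based estimator reflects only the local noise.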

    8. If you really must have a specific sample size, then the best approach is to specify a denser grid than needed and randomly or systematically thin points until the target sample size is reached (see, e.g., K. Iles (2003)).

      I do not know if it is of any importance here, but one could also take a pseudo-systematic sample: make a grid of the desired size and draw a simple random sample of one unit from each cell. That would also be more like a real random sample.
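
      The pseudo-systematic idea (one uniformly placed point per grid cell) can be sketched in a few lines; the area and grid size here are made up for illustration:

      ```python
      import numpy as np

      rng = np.random.default_rng(2)

      side, cells = 100.0, 5   # hypothetical square area, divided into a 5 x 5 grid
      cell = side / cells      # cell width

      # Draw exactly one point uniformly at random inside each grid cell.
      pts = np.array([[(i + rng.random()) * cell, (j + rng.random()) * cell]
                      for i in range(cells) for j in range(cells)])

      print(pts.shape)  # (25, 2): the sample size is fixed by the grid
      ```

      The sample size is fixed in advance by the grid, as with thinning, but each point is individually randomised within its cell.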

    9. improving the reliability of mean and total estimates

      This is only true if there is a trend in the population. If the unit values are assigned completely at random, the accuracy is the same as in SRS.
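
      This is easy to check by computing the exact design variance of the systematic-sample mean for both kinds of populations. A small sketch with made-up populations:

      ```python
      import numpy as np

      rng = np.random.default_rng(3)
      N, k = 1000, 20
      n = N // k

      def true_sys_variance(y):
          # The k possible starting points are equally likely and their
          # samples partition the population, so this is the exact design
          # variance of the systematic-sample mean.
          means = np.array([y[s::k].mean() for s in range(k)])
          return means.var()

      y_random = rng.normal(size=N)                         # no trend
      y_trend = np.linspace(0, 10, N) + rng.normal(size=N)  # linear trend

      for label, y in [("random", y_random), ("trend", y_trend)]:
          v_srs = (1 - n / N) * y.var(ddof=1) / n  # SRS variance of the mean
          print(label, true_sys_variance(y), v_srs)
      ```

      For the random population the two variances are of the same size; for the trended population the systematic design is clearly more precise.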

    1. selected using some random mechanism

      I see that this book is meant to be very practice-oriented and directed at students not at all familiar with sampling beforehand. However, I think it would be very important to also include concepts such as the inclusion probability and its relation to the Cochranian notation style you have chosen to use, that is, the Horvitz-Thompson estimator. Students of statistics in Finland no longer learn the Cochranian style at all; all they learn about sampling is the inclusion-probability-based notation. Understanding inclusion probabilities would be important for the next-level courses, such as model-assisted methods; otherwise the step to the next level will be quite high. I noted that you explained the inclusion zone for trees, which is a related concept and could benefit from being linked to inclusion probabilities, as in the book by Mandallaz.
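
      For what it is worth, the Horvitz-Thompson idea is compact enough to show in a few lines. A hedged sketch under Poisson sampling with size-proportional inclusion probabilities (population, sample size and names invented for illustration):

      ```python
      import numpy as np

      rng = np.random.default_rng(4)
      N = 200
      x = rng.uniform(1, 10, N)         # size variable
      y = 2 * x + rng.normal(0, 0.5, N)

      n = 50
      pi = n * x / x.sum()              # inclusion probabilities, proportional to size
      assert pi.max() < 1               # required for valid probabilities

      # Poisson sampling: each unit is included independently with probability pi.
      sampled = rng.random(N) < pi

      # Horvitz-Thompson: weight each sampled value by the inverse of its
      # inclusion probability; this is design-unbiased for the true total.
      t_ht = (y[sampled] / pi[sampled]).sum()
      print(t_ht, y.sum())
      ```

      The same weighting view also covers the inclusion zones of trees: a tree's inclusion probability is proportional to its zone area, and the estimator weights each tree accordingly.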

    2. is positioned using a random mechanism

      But a systematic sample is still really one big cluster of plots, because when you select one plot, you select all of them. I noticed that you explain this later on, but I would prefer to mention the problem here as well, to avoid misunderstandings.