77 Matching Annotations
  1. Last 7 days
    1. point sampling is unique because it allows you to match information collection effort with the desired level of inference. Under point sampling, the minimum data collection effort is called a continuous tally, which means a count of measurement trees is kept across the n sampling locations (no additional information is recorded, not even how many measurement trees were observed at each sampling location). At the end of a continuous tally cruise, you have the total number of measurement trees m, which is used to compute the mean basal area per unit area estimate as

      In fact, this is no different from other plot designs; you can do exactly the same with fixed-area or nested plots. If you calculate the expansion factor per tree and expand, e.g., tree basal area to one hectare, you can sum this over all of your trees (from multiple plots) and divide by n. Same result! Sum(y_i) (sum of plot aggregates) is then equal to Sum(y_ij) (sum over trees). The only difference is that you need no expansion factor here, since you are already counting on a per-ha basis.
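
      The equivalence can be checked with a minimal R sketch. The tree list, the plot size of 0.05 ha, and all object names are made up for illustration; this is not the book's data or code.

```r
# Hypothetical data: 3 fixed-area plots of 0.05 ha each, tree DBH in cm.
plot_area_ha <- 0.05
trees <- data.frame(
  plot   = c(1, 1, 1, 2, 2, 3),
  dbh_cm = c(20, 35, 28, 40, 22, 31)
)
n_plots <- length(unique(trees$plot))

# Tree basal area in m^2 (circle area, DBH converted from cm to m).
trees$ba_m2 <- pi * (trees$dbh_cm / 100)^2 / 4

# Route 1: aggregate to plot level first, then expand and average over plots.
plot_ba_ha <- tapply(trees$ba_m2, trees$plot, sum) / plot_area_ha
mean_ba_route1 <- mean(plot_ba_ha)

# Route 2: expand every tree to a per-hectare basis, sum over all trees,
# and divide by the number of plots.
trees$ef <- 1 / plot_area_ha   # expansion factor per tree
mean_ba_route2 <- sum(trees$ba_m2 * trees$ef) / n_plots

all.equal(mean_ba_route1, mean_ba_route2)
```

      Under point sampling every counted tree already stands for one BAF unit per hectare (or acre), so route 2 reduces to BAF times m divided by n.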

    2. the constant k in feet is

      I am confused: is this only for counting factor k=1? For measurements in meters and cm, the ratios for k=1, 2, 4 are 1:50, 1:35.4 and 1:25 respectively (both in the same units). That means that with k=1 (every counted tree represents 1 m²/ha), a tree with a dbh of 35 cm (0.35 m) has a maximum distance of 0.35*50 = 17.5 meters. SORRY, I just realized that k is not the counting factor (as used here in Germany). Our "k" is your BAF, and what you denote as k is what we call c...
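
      A quick way to check these ratios in R, assuming a metric counting factor (BAF) in m²/ha and dbh and distance in the same units. The function name is hypothetical.

```r
# For a metric BAF in m^2/ha: BAF = 2500 * (dbh / R)^2, with dbh and the
# limiting distance R both in metres, so R / dbh = 50 / sqrt(BAF).
distance_ratio <- function(baf) 50 / sqrt(baf)

ratios <- distance_ratio(c(1, 2, 4))
round(ratios, 1)   # 50.0 35.4 25.0

# Limiting distance for a tree with dbh = 35 cm and BAF = 1.
limiting_distance_m <- 0.35 * distance_ratio(1)
limiting_distance_m   # 17.5
```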

    3. so it’s worth the extra time to conduct the limiting distance calculation

      Very true! For many trees it is beyond doubt that they are IN; for those close to the border it is difficult to guess, and a distance measurement is indicated...

    4. its probability of selection is proportional to its DBH

      No, it's proportional to its basal area (it is the radius of the inclusion circle that is proportional to dbh, so the circle area is proportional to dbh^2, i.e., to basal area).

    5. 3⋅TF

      If you leave out the 3*F it becomes more general. Trees per acre is just the sum of tree expansion factors. Then it can also be used for unequal probability designs in which expansion factors might vary with tree size
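
      A minimal R sketch of this more general formulation, assuming a point sample with a BAF of 20 ft²/ac. The tree data and all object names are invented for illustration.

```r
# Hypothetical point-sample data: BAF 20 (ft^2/ac), 2 sampling points,
# DBH in inches.
baf <- 20
n_points <- 2
trees <- data.frame(
  point  = c(1, 1, 1, 2, 2),
  dbh_in = c(8, 12, 16, 10, 14)
)

# Tree basal area in ft^2 from circle geometry (inches converted to feet).
trees$ba_ft2 <- pi * (trees$dbh_in / 12)^2 / 4

# Expansion factor: number of trees per acre each measurement tree represents.
# It varies with tree size because selection is with unequal probability.
trees$tree_factor <- baf / trees$ba_ft2

# Trees per acre: sum of expansion factors, averaged over sampling points.
trees_per_acre <- sum(trees$tree_factor) / n_points

# Basal area per acre from the same expansion factors.
ba_per_acre <- sum(trees$tree_factor * trees$ba_ft2) / n_points
```

      With this data `ba_per_acre` equals BAF times the tree count divided by the number of points, which is the continuous tally estimate. The same two sums work unchanged if the expansion factors come from a fixed-area or nested design.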

    6. A more common approach in forestry is to use an areal sampling frame

      OK, finally here it comes, thank you! This is quite late, and an explanation of the different population concepts might be nice at an earlier stage. Harvard Forest was a special case (rather uncommon in inventories, but maybe meaningful inside this experimental plot), and now you finally introduce the infinite population. From my own experience: we sometimes have similar problems here, since we teach general sampling statistics first (with fpc) and then switch to infinite populations.

    7. Under the plot sampling rule, all trees in the population have equal probability of being measured.

      Only for fixed-area plots (rather uncommon). For nested plots or Bitterlich sampling this is not the case.

    8. for each location

      "for each location" would only hold for fixed-area plots. As soon as an unequal probability design is used (nested plots, Bitterlich), we would expand on the tree level and aggregate to the plot level afterwards.

    1. The sample variance is, for the most part, out of our control—it’s a characteristic of the population

      Sorry, but this is completely wrong! We have full control over the population variance (apart from practical constraints). If the population consists of plots, then we are the ones who define the character of this population! Imagine you make your n=100 plots twice as large as the area of interest (meaning: 100 times a full census); then the population variance among these plots is zero! Any plot size smaller than the area of interest will introduce variability among the observations. And in your case of subdividing the area into cells: if you make them larger, the variance among them will get smaller...
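
      A small simulation can illustrate the last point. This is only a sketch: the "forest" is a simulated grid of independent Poisson tree counts (no spatial pattern), and the function and object names are hypothetical.

```r
set.seed(42)  # simulated forest, purely illustrative

# Tree counts on a 60 x 60 grid of smallest possible cells.
side <- 60
counts <- matrix(rpois(side^2, lambda = 5), nrow = side)

# Variance among cells (on a per-unit-area basis) when the area is
# partitioned into square cells of block x block grid units.
variance_among_cells <- function(x, block) {
  block_id <- paste((row(x) - 1) %/% block, (col(x) - 1) %/% block)
  cell_means <- tapply(as.vector(x), as.vector(block_id), mean)
  var(as.vector(cell_means))
}

blocks <- c(1, 2, 5, 10)
variances <- sapply(blocks, function(b) variance_among_cells(counts, b))
data.frame(cell_side = blocks, n_cells = (side / blocks)^2, variance = variances)
```

      The variance among cells drops as the cells get larger, so it is a consequence of how we define the units and not a fixed characteristic of the forest. In a real forest with spatial structure the decline would follow a different curve, but the direction is the same.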

    2. which is an astonishingly large number!

      And now assume you would allow such a square plot at any location; then the number becomes infinite. BUT: in contrast to a finite population, in which we expect a different response in each cell, this does not hold in an infinite population. Many sampling locations (also infinitely many) would lead to the same included trees. See https://youtu.be/-8CVcXOKRxM?si=MpWdojUe3FbcU-Mn

    3. Finite means there’s a limited, i.e., countable, number of units in the population

      OK, somehow you like to stick to this finite definition of the population. In the next sentence, however, you say (correctly) that each possible sample must have a non-zero probability. In real life, you could select a sample plot at any point in the total forest area. In a finite population view this is not the case (and you are somehow limited to square cells as plots, which is not in line with what you explain later). Why is it so important to define the population as finite here? Imagine we conduct SRS by random selection of x,y coordinates in the area (and install plots there). Then we are outside of your population concept (the population of possible plots is infinite).
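
      Selecting plot locations this way takes only a few lines of R. The sketch assumes a hypothetical rectangular area of 500 m by 400 m; a real forest boundary would need a polygon and a point-in-polygon check.

```r
set.seed(1)  # for a reproducible illustration only

# Hypothetical rectangular forest area, 500 m x 400 m.
x_max <- 500
y_max <- 400
n <- 10

# Every point in the area is a possible sampling location, so the
# population of possible plot locations is a continuum (infinite).
sample_points <- data.frame(
  point = seq_len(n),
  x = runif(n, min = 0, max = x_max),
  y = runif(n, min = 0, max = y_max)
)
sample_points
```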

    4. for finite populations

      Which means you need to consider the finite population correction in many estimators. Why not infinite populations (which is what we are dealing with in reality)? The concepts you introduce later (SRS, stratified, double sampling, ...) would hold in exactly the same way, only that you can ignore the fpc.

    1. For example, we might divide the 50-acre property into 250 1/5-th acre non-overlapping plots

      Which would be a very atypical strategy, right? We see such examples in old textbooks (like Shiver and Borders, and my former boss Akca used it too). Such a subdivision of the area would result in a finite population of cells. No overlapping cell in between would have a positive probability of being selected. If you would like to avoid this, you can just say that a sample of plots is selected from the total forest area.

    2. In some cases we’re able to observe (meaning measure) all units in the population

      If the population we are looking at allows it, yes. We can, e.g., measure all trees if we are interested in tree characteristics. It becomes impossible if the population consists of plots (i.e., all possible sample plots in an area).

    1. dplyr

      It is not at all a bad idea to introduce dplyr here. data.table would be an alternative and is often more efficient on large datasets, but the code is not as readable.

    1. indicates the log rule

      If these are standard, fine. Since this is just for the students to learn, you could also use simpler formulations like Smalian.

    2. for(i in 1:nrow(pef_trees)) { AGB_kg[i] <- agb(pef_trees$DBH_in[i], pef_trees$common_name[i], dbh_units = "in") }

      Instead of looping, we would rather use lapply here later.
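
      A sketch of what the loop-free version could look like. Because two vectors vary in parallel (DBH and species), it uses `mapply()`, a multi-argument relative of `lapply()`. The `agb()` function below is only a hypothetical stand-in with invented coefficients, and the small `pef_trees` table is made up; only the argument and column names are taken from the highlighted loop.

```r
# Hypothetical stand-in for the book's agb() function: same argument names as
# in the loop above, but the coefficients are invented for illustration.
agb <- function(dbh, species, dbh_units = "in") {
  dbh_cm <- if (dbh_units == "in") dbh * 2.54 else dbh
  b0 <- if (species == "red maple") -2.0 else -2.5
  exp(b0 + 2.4 * log(dbh_cm))
}

# Hypothetical tree list with the column names used in the loop.
pef_trees <- data.frame(
  DBH_in = c(6.2, 10.5, 14.1),
  common_name = c("red maple", "balsam fir", "red maple")
)

# Loop version, as in the highlighted code.
AGB_kg <- numeric(nrow(pef_trees))
for (i in 1:nrow(pef_trees)) {
  AGB_kg[i] <- agb(pef_trees$DBH_in[i], pef_trees$common_name[i], dbh_units = "in")
}

# Loop-free version: mapply() walks over DBH and species in parallel.
AGB_kg_apply <- mapply(agb, pef_trees$DBH_in, pef_trees$common_name,
                       MoreArgs = list(dbh_units = "in"), USE.NAMES = FALSE)

all.equal(AGB_kg, AGB_kg_apply)
```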

    3. # Assign the species-specific regression coefficients.

      For a simple example this is OK, but later you will want a left join here that reads the model coefficients for the different species from a separate, related table. In base R, merge() with all.x = TRUE does this; with dplyr it looks like this:

      # separate table: model coefficients by species
      coef_df <- data.frame(
        species = c("species_1", "species_2", "species_3"),  # placeholders for your species
        a = c(0.5, 0.7, 0.6),  # coefficients
        b = c(1.2, 1.5, 1.3)
      )

      # join coefficients onto trees (trees_df needs a matching "species" column)
      library(dplyr)

      trees_with_coef <- trees_df %>% left_join(coef_df, by = "species")

      But dplyr was not yet introduced, and I understand that they should first do it by hand...

    4. AGB=exp(β0+β1ln(DBH)),

      This is the log-transformed linear form that we often use for regression analysis (due to heteroskedasticity in the untransformed data). As you can see, the coefficients here are β and should be estimated by a linear regression. However, when it comes to applying an allometric model, we usually use the form AGB=a*DBH^b, where b is β1 and a is exp(β0). I would recommend substituting a and b for β0 and β1 and using the above formulation. It is easier to digest for the students and more common in practice.
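
      That the two forms give identical predictions is easy to show in R. The coefficient values below are invented for illustration only.

```r
# Hypothetical coefficients of the log-linear form, for illustration only.
beta0 <- -2.1
beta1 <- 2.4
dbh <- c(10, 20, 35)

# Log-transformed linear form, as fitted by linear regression.
agb_log_form <- exp(beta0 + beta1 * log(dbh))

# Equivalent power form for application: a = exp(beta0), b = beta1.
a <- exp(beta0)
b <- beta1
agb_power_form <- a * dbh^b

all.equal(agb_log_form, agb_power_form)
```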

    5. basal area is c⋅DBH^2, where c is often referred to as the “foresters constant” and, depending on your measurement system, is either 0.005454 or 0.00007854

      I would not recommend introducing any "foresters constants"; it is just simple geometry to calculate the area of a circle from its diameter. Students tend to learn such things by heart and forget about the fundamentals... Here the pi/10,000 comes in because DBH is given in centimeters (conversion from cm² to m²), and the (DBH/2)^2 accounts for the factor 1/4. I prefer to explain to my students how to calculate the area of a circle and remind them that we want to have this in square meters, instead of confusing them.
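
      The geometry behind both constants fits into a few lines of R. The function and object names are hypothetical.

```r
# Basal area is the area of a circle: pi * (DBH / 2)^2 = pi / 4 * DBH^2.
# The unit conversion of DBH is the only other ingredient.
ba_m2  <- function(dbh_cm) pi / 4 * (dbh_cm / 100)^2   # cm -> m
ba_ft2 <- function(dbh_in) pi / 4 * (dbh_in / 12)^2    # in -> ft

# The two "constants" are simply pi / 4 combined with the squared conversion.
c_metric   <- pi / (4 * 100^2)   # 0.00007854
c_imperial <- pi / (4 * 12^2)    # 0.005454

signif(c(c_metric, c_imperial), 4)
```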

    6. to some degree

      You can have the same basal area with a single big tree or with hundreds of small trees. There is not necessarily a relation between BA and density.
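
      A tiny numerical illustration in R, using two invented one-hectare stands:

```r
ba_m2 <- function(dbh_cm) pi / 4 * (dbh_cm / 100)^2

# Two hypothetical one-hectare stands.
stand_a <- 100            # one tree of 100 cm DBH
stand_b <- rep(10, 100)   # one hundred trees of 10 cm DBH

ba_a <- sum(ba_m2(stand_a))   # basal area in m^2/ha
ba_b <- sum(ba_m2(stand_b))

c(stems_a = length(stand_a), stems_b = length(stand_b), ba_a = ba_a, ba_b = ba_b)
```

      Both stands have the same basal area per hectare, while stem density differs by a factor of 100.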

    1. In this chapter, we introduce scripts and their use to facilitate reproducible workflows

      It may be good to start with a single script, but later you usually want a project with multiple scripts in which you can separate certain things from each other (e.g., functions and models from calculations or data import). I just saw that this comes in 3.4!

    1. or a complete enumeration of all units (e.g., trees) in a population

      Hmm, here you may introduce some confusion for the reader. It is correct that in this specific experimental plot, and for the specific ecological perspective, the interest is in trees. And yes, if the interest is in tree characteristics, the total population is all the trees. But this is very different from forest inventories (observational studies) that aim at describing the forest area. There, the population is the forest area and the sampling units are subsets of this area (plots). Maybe substitute "... is a full census or complete enumeration of all trees, which in this case represent the population".

    2. The HTML version of Figure 1.5 shows the PEF LiDAR canopy height surface, forest inventory plot locations, and MU boundaries. Clicking on a plot shows the MU in which the plot resides, its identification number, and current basal area (ft^2/ac). Clicking in a MU polygon (i.e., between plots) brings up a figure of the MU’s basal area changed over time. The printed version Figure 1.5 shows the PEF MU boundaries and plot locations colored by most current basal area (ft^2/ac).

      Somehow difficult to consider printed and online at the same time. Some of it might be shifted to the caption.

    3. high spatial resolution wall-to-wall data products such as gridded maps

      Well, this is what everybody says, but if we look at what is used for decision making in the end, I am not sure. Maps are great for communication, but calculations are done on tabular data. If we provide forest managers with high-resolution wall-to-wall maps, they usually ask for simpler products showing useful classes. My experience is that we often aggregate our high-resolution products back to something simpler. In the carbon business, emission factors are not provided at the pixel level, but, e.g., for a certain forest type.

    4. as well as extensive campaigns to collect field-based calibration data

      In the meantime, LiDAR data acquisition has become standard in many countries. The wording "calibration data" reflects a purely model-based perspective (common in remote sensing), while we usually look at it as ancillary data.

    5. no significant change

      Just a comment: here we look at a typical experimental study that allows hypothesis testing and statements about significance (in contrast to the observational studies mentioned above, in which we can maybe look at relationships but never at cause and effect). Such studies do not aim at describing a forest area, but are designed to investigate ecological or other relationships. Good to have such different examples!

    6. point

      I will stop commenting on point versus plot here, since you have consistently used point. In the end it is a question of personal style, and "point" is also fine as long as it is mentioned that trees around such a point are included based on a certain rule (and I assume you come to plot designs later). Speaking of points may help to explain the infinite character of our population later.

    7. If you’re reading the HTML version of this book, mousing-over a point gives the point number and a single click gives a list of AGS trees used to calculate the basal area color reflected in the figure legend.

      Considering a later printed book publication you might want to shift this into the figure caption

    8. points

      Sample plots? Maybe you come to observation units in a later chapter, but I see no reason not to mention them here already. Tree characteristics cannot be measured at a point.

    9. placement

      or "selection of measurement units". The typical target variables cannot be observed at a dimensionless point (as you say); usually our sampling elements are small subsets of the forest area.

    10. are best fit lines

      See, we are not looking at strict allometry here (which would be a process model perspective) but looking for the best fit. This is dangerous if you extrapolate beyond the range of the data. Look at Betula lenta, where the best fit to the current data would suggest that biomass decreases with increasing dbh. In regard to model choice, a data analyst has both in mind: a good fit to the current data and biological plausibility.

    11. allometric equation

      "Allometry" or allometric relations usually refer to a specific kind of relation (the relation between two relative growth rates in one individual). We tend to call many models "allometric" that are in fact other kinds of relationships. Taking the character of allometric relations seriously would mean applying a power model, but in fact we often use others. Anyway, I would not change the text here because it is in line with the general understanding.

    12. to identify features for consideration in the subsequent wrangling and analysis

      The target variables we are typically interested in are rarely "measurable" (volume, biomass, biodiversity, ...). It is another important task of the data analyst to identify and calculate the essential target variables or requested information from the data (features) at hand. In most cases this requires the application of models.

    13. For example, field crews collect forest inventory data to answer questions about the amount and location of timber or non-timber resources. Monitoring data are collected to understand change in forest characteristics. Highly detailed individual tree data are collected to understand allometry, which is the growth and size relationship between different parts of an organism. Experimental manipulations of trees, stands, or forests are used to better understand how environmental change and disturbance events impact individual growth rates and trends in population demographics.

      It is maybe helpful for the reader to distinguish between typical observational studies (forest inventories aim at estimating status and change) and experimental studies (investigating effects of treatments or researching into dynamics)

    14. forest sampling (Gregoire and Valentine 2007; Mandallaz 2007)

      Maybe include "Forest Inventory - Methodology and Applications" by Annika Kangas and Matti Maltamo, and "Sampling Methods, Remote Sensing and GIS Multiresource Forest Inventory" by Köhl and Steen Magnussen.

  2. Mar 2025
    1. trade-offs

      From a certain plot size onwards, the gain in precision is minor and increasing sample size instead of enlarging plots is statistically more beneficial. For typical variables like volume, empirical research shows that 15-20 trees per plot are enough. But there are other target variables that might justify larger plots...

    2. where there is no notion of area sampled and hence no sampling intensity to compute

      "point sampling" is just a form of "continuously nested plots" and not at all "plotless". There is no argument to look at it in a different way.

    3. substantial portion of the forest area is sampled and the FPC is warranted

      Not consistent with the infinite population view described above!! The population is the infinite number of possible plot locations (continuum of sampling locations), from which some are selected. In some textbooks you find examples of partitioning the area into a finite number of grid cells from which some are selected. Then FPC is required, but this view is not consistent with what we usually do in forest inventories...

    4. small

      Infinitely small! We draw a sample of plots from an infinite number of available plot locations in the area. The sample therefore does not shrink the remaining population size for the next draw...

    5. FPC only applies to SRS

      This is not correct! The FPC applies to any sampling design if it is applied in a finite population. It is a correction that ensures that the standard error becomes 0 if all elements of the population are in the sample. It is relevant if the sampled proportion is relatively large (and the remaining unsampled portion becomes small). The argument for not considering it in forest inventories is that we sample from an infinite population...
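
      The effect of the correction can be sketched in R for the simplest case, the standard error of a mean under SRS without replacement. The function name and the numbers are hypothetical.

```r
# Standard error of the mean with finite population correction.
# s2: sample variance, n: sample size, N: population size (Inf for a continuum).
se_mean <- function(s2, n, N = Inf) {
  fpc <- if (is.finite(N)) 1 - n / N else 1
  sqrt(s2 / n * fpc)
}

s2 <- 25                        # hypothetical sample variance
se_mean(s2, n = 50, N = 250)    # 20 % of a finite population sampled
se_mean(s2, n = 250, N = 250)   # full census: standard error is 0
se_mean(s2, n = 50)             # infinite population: no correction
```

      Setting `N = Inf` as the default mirrors the infinite population view: the correction factor is 1 and simply drops out of the estimator.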

    6. were summarized to the 0.25 ha plot in advance

      Since you talked about nested plots before (unequal probability) it might be better for understanding if expansion is done on the level of single trees and aggregated on plot level afterwards.
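
      A minimal R sketch of this order of operations for a nested design. The plot sizes, the DBH threshold, and the tree list are all invented for illustration.

```r
# Hypothetical nested design: small trees (DBH < 20 cm) on a 0.01 ha subplot,
# larger trees on a 0.05 ha plot. Two plots in the sample.
trees <- data.frame(
  plot   = c(1, 1, 1, 2, 2, 2, 2),
  dbh_cm = c(12, 25, 33, 9, 15, 28, 41)
)
trees$plot_area_ha <- ifelse(trees$dbh_cm < 20, 0.01, 0.05)

# Step 1: expand on the tree level.
trees$stems_ha <- 1 / trees$plot_area_ha            # stems per ha represented
trees$ba_m2 <- pi * (trees$dbh_cm / 100)^2 / 4      # tree basal area in m^2
trees$ba_m2_ha <- trees$ba_m2 * trees$stems_ha      # expanded basal area

# Step 2: aggregate the expanded values to plot level.
plot_summary <- aggregate(cbind(stems_ha, ba_m2_ha) ~ plot, data = trees, FUN = sum)
plot_summary
```

      The plot-level values can then be treated as the observations in any of the estimators, regardless of which plot design produced the expansion factors.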

    7. scale tree measurements

      Since we take a sample of the forest area (on which we might find trees), I would rather say we upscale the plot area to a common basis. Same result but different wording...

    8. at ground level

      Usually everything we measure or estimate refers to breast height. The decision whether a tree is in or out is also made according to the horizontal distance measured at breast height!