Hypothesis

77 Matching Annotations

Sep 2025
www.finley-lab.com

Chapter 12 Estimating forest parameters | Introduction to Forestry Data Analysis with R

16
1. Lfehrma 09 Sep 2025
  
  in Public
  
  point sampling is unique because it allows you to match information collection effort with the desired level of inference. Under point sampling, the minimum data collection effort is called a continuous tally, which means a count of measurement trees is kept across the n sampling locations (no additional information is recorded, not even how many measurement trees were observed at each sampling location). At the end of a continuous tally cruise, you have the total number of measurement trees m, which is used to compute the mean basal area per unit area estimate as
  
  In fact, this is no different from other plot designs; you can do exactly the same in fixed-area or nested plots. If you calculate the expansion factor per tree and expand, e.g., tree basal area to one hectare, then you can sum this over all of your trees (from multiple plots) and divide by n. Same result! Sum(y_i) (plot aggregate) is here equal to Sum(y_ij) (sum over trees). The only difference is that you need no expansion factor here, since you are already counting on a per-hectare basis.
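The equivalence described in this comment can be checked numerically. The following is a minimal R sketch, not code from the book: the basal area factor `baf`, the number of sampling locations `n`, and the tree diameters are invented illustration values, and metric units are assumed.

```r
# Hypothetical point-sampling cruise (all values invented for illustration).
baf <- 4                                  # basal area factor, m^2/ha per counted tree
n <- 3                                    # number of sampling locations
dbh_cm <- c(22, 35, 41, 18, 27, 30, 52)   # DBH of the m counted trees, pooled over locations
m <- length(dbh_cm)

# Route 1: continuous tally, only the count m is needed.
ba_tally <- baf * m / n

# Route 2: expand every tree to a per-hectare basis, sum over trees, divide by n.
ba_tree_m2 <- pi / 4 * (dbh_cm / 100)^2   # basal area of each tree, m^2
ef <- baf / ba_tree_m2                    # expansion factor: trees/ha represented by each tree
ba_expanded <- sum(ef * ba_tree_m2) / n

c(tally = ba_tally, expanded = ba_expanded)
```

Both routes give the same mean basal area per hectare, because each counted tree contributes exactly `baf` once it is expanded.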
2. Lfehrma 09 Sep 2025
  
  in Public
  
  the constant k in feet is
  
  I am confused: is this only for counting factor k=1? For measurements in meters and centimeters, the ratios for k=1, 2, 4 are 1:50, 1:35.4 and 1:25 respectively (both in the same units). That means, using k=1 (every counted tree represents 1 m²/ha), a tree with a dbh of 35 cm (0.35 m) has a maximum distance of 0.35*50 = 17.5 meters. SORRY, just realized that k is not the counting factor (as used here in Germany). Our "k" is your BAF, and what you denote as k is what we call c...
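The ratios quoted in this comment follow from simple geometry. The sketch below is an illustration under stated assumptions, not the book's code: it assumes metric units (DBH in centimeters, distance in meters, BAF in m²/ha), and the function names are hypothetical.

```r
# Limiting distance under angle-count (Bitterlich) sampling, metric units.
# From BAF = 10000 * (pi / 4 * d^2) / (pi * R^2), with d and R in meters,
# the plot radius factor is R / d = 50 / sqrt(BAF).
plot_radius_factor <- function(baf) 50 / sqrt(baf)

limiting_distance_m <- function(dbh_cm, baf) {
  (dbh_cm / 100) * plot_radius_factor(baf)
}

round(plot_radius_factor(c(1, 2, 4)), 1)   # 50.0 35.4 25.0
limiting_distance_m(dbh_cm = 35, baf = 1)  # 17.5
```

A tree is counted when its horizontal distance from the sampling location does not exceed its limiting distance.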
3. Lfehrma 09 Sep 2025
  
  in Public
  
  so it’s worth the extra time to conduct the limiting distance calculation
  
  Quite right! Many trees are definitely IN without any doubt; for those close to the border it is difficult to judge, and a distance measurement is indicated...
4. Lfehrma 09 Sep 2025
  
  in Public
  
  its probability of selection is proportional to its DBH
  
  No, it's proportional to its basal area (it is the radius of a circle that is proportional to dbh; the circle area is then proportional to dbh^2, i.e., basal area)
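A short numerical check can make the distinction between radius and area concrete. This is a sketch under assumptions: a BAF of 2 m²/ha and three invented diameters, with the plot radius factor taken as 50 / sqrt(BAF) for metric units.

```r
# Inclusion zone of a tree under point sampling (values are illustrative).
prf <- 50 / sqrt(2)                    # assumed plot radius factor for BAF = 2 m^2/ha
dbh_m <- c(0.10, 0.20, 0.40)

radius_m <- prf * dbh_m                # radius grows linearly with DBH
inclusion_area_m2 <- pi * radius_m^2   # area, and hence selection probability, grows with DBH^2
basal_area_m2 <- pi / 4 * dbh_m^2

data.frame(dbh_m, radius_m, inclusion_area_m2,
           ratio = inclusion_area_m2 / basal_area_m2)
```

The `ratio` column is constant across trees, which is what "probability proportional to basal area" means: doubling DBH doubles the radius but quadruples the inclusion area.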
5. Lfehrma 09 Sep 2025
  
  in Public
  
  The angle gauge
  
  Yes, e.g. our new Dendrometer III: https://doi.org/10.25625/USXNYO
6. Lfehrma 09 Sep 2025
  
  in Public
  
  DBH
  
  probability is proportional to basal area (while plot radius is proportional to dbh)!
7. Lfehrma 09 Sep 2025
  
  in Public
  
  3⋅TF
  
  If you leave out the 3*F it becomes more general. Trees per acre is just the sum of tree expansion factors. Then it can also be used for unequal probability designs in which expansion factors might vary with tree size
8. Lfehrma 09 Sep 2025
  
  in Public
  
  frequency
  
  density?
9. Lfehrma 09 Sep 2025
  
  in Public
  
  are spaced evenly across the landscape
  
  There are also statistical arguments: systematic sampling is known to be more precise.
10. Lfehrma 09 Sep 2025
  
  in Public
  
  A more common approach in forestry is to use an areal sampling frame
  
  Ok, finally here it comes, thank you! This is quite late, and an explanation of the different population concepts might be nice at an earlier stage. Harvard Forest was a special case (rather uncommon in inventories, but maybe meaningful inside this experimental plot), and now you finally introduce the infinite population. From my own experience: we sometimes have similar problems here, since we teach general sampling statistics first (with fpc) and then switch to infinite populations.
11. Lfehrma 09 Sep 2025
  
  in Public
  
  Point sampling
  
  No mention of Bitterlich in this paragraph?
12. Lfehrma 09 Sep 2025
  
  in Public
  
  Under the plot sampling rule, all trees in the population have equal probability of being measured.
  
  Only for fixed-area plots (rather uncommon). For nested plots or Bitterlich sampling this is not the case
13. Lfehrma 09 Sep 2025
  
  in Public
  
  point sampling
  
  which is in fact also plot sampling, only with plot size proportional to basal area
14. Lfehrma 09 Sep 2025
  
  in Public
  
  for each location
  
  "for each location" would only hold for fixed-area plots. As soon as an unequal probability design is used (nested plots, Bitterlich), we would expand at the tree level and aggregate to the plot level afterwards.
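The tree-level expansion described here can be sketched in base R. Everything in this example is hypothetical: the nested design (trees under 20 cm DBH measured on a 5 m radius plot, larger trees on a 10 m radius plot), the data, and the column names.

```r
# Hypothetical nested-plot tally (design and values are invented).
trees <- data.frame(
  plot   = c(1, 1, 1, 2, 2),
  dbh_cm = c(12, 25, 40, 9, 31)
)

# Assumed design: DBH < 20 cm on a 5 m radius plot, larger trees on a 10 m radius plot.
trees$radius_m <- ifelse(trees$dbh_cm < 20, 5, 10)

# Tree-level expansion factor: trees per hectare represented by each tree.
trees$ef <- 10000 / (pi * trees$radius_m^2)
trees$ba_ha <- trees$ef * pi / 4 * (trees$dbh_cm / 100)^2

# Aggregate the expanded values to plot level, then average over plots.
plot_totals <- aggregate(cbind(tph = ef, ba_ha = ba_ha) ~ plot,
                         data = trees, FUN = sum)
colMeans(plot_totals[, c("tph", "ba_ha")])
```

Trees per hectare on a plot is simply the sum of the tree expansion factors, so the same code also covers fixed-area plots (one constant factor) and point sampling (a factor that varies with basal area).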
15. Lfehrma 09 Sep 2025
  
  in Public
  
  random
  
  or systematic
16. Lfehrma 09 Sep 2025
  
  in Public
  
  sample observation locations
  
  or just "sampling locations"

Annotators

Lfehrma

URL

finley-lab.com/ifdar/foreststat
www.finley-lab.com

Chapter 11 Basic statistical concepts | Introduction to Forestry Data Analysis with R

10
1. Lfehrma 09 Sep 2025
  
  in Public
  
  The sample variance is, for the most part, out of our control—it’s a characteristic of the population
  
  Sorry, but this is completely wrong! We have full control over the population variance (except for practical constraints). If the population consists of plots, then we are the ones who define the character of this population! Imagine you make your n=100 plots twice as large as the area of interest (meaning: 100 times a full census); then the population variance among these plots is zero! Any plot size smaller than the area of interest will introduce variability among the observations. And in your case of subdividing the area into cells: if you make them larger, the variance among them will get smaller...
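The effect of cell size on the variance among cells can be illustrated with a small simulation. This is a sketch under strong assumptions: a hypothetical 100 ha square forest with 20,000 trees placed completely at random. Real stands are often clustered or regular, which changes the magnitudes but not the general direction.

```r
set.seed(42)

# Hypothetical 100 ha square forest (1000 m x 1000 m) with randomly placed trees.
side_m <- 1000
n_trees <- 20000
x <- runif(n_trees, 0, side_m)
y <- runif(n_trees, 0, side_m)

# Variance of trees per hectare among square cells with side length cell_m.
# cell_m is assumed to divide side_m evenly.
density_var <- function(cell_m) {
  n_side <- side_m / cell_m
  ix <- factor(floor(x / cell_m), levels = 0:(n_side - 1))
  iy <- factor(floor(y / cell_m), levels = 0:(n_side - 1))
  counts <- as.vector(table(ix, iy))      # trees per cell, empty cells included
  per_ha <- counts / (cell_m^2 / 10000)   # scale each cell to a per-hectare basis
  var(per_ha)
}

sapply(c(25, 50, 100, 250), density_var)
```

The variance of the per-hectare values drops as the cells get larger, so the "population variance" depends on how the analyst defines the plots.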
2. Lfehrma 09 Sep 2025
  
  in Public
  
  which is an astonishingly large number!
  
  And now assume you would allow such a square plot at any location; then the number becomes infinite. BUT: contrary to a finite population, in which we expect a different response in each cell, this does not hold in an infinite population. Many sampling locations (also infinitely many) would lead to the same included trees. See https://youtu.be/-8CVcXOKRxM?si=MpWdojUe3FbcU-Mn
3. Lfehrma 09 Sep 2025
  
  in Public
  
  sample variance
  
  the estimated population variance
4. Lfehrma 09 Sep 2025
  
  in Public
  
  non-selected units
  
  only in finite population
5. Lfehrma 09 Sep 2025
  
  in Public
  
  If the FPC isn’t needed
  
  which is the standard case if we consider an infinite population ;-)
6. Lfehrma 09 Sep 2025
  
  in Public
  
  smallest variance
  
  error variance (or standard error). Should be distinguished from population variance here!
7. Lfehrma 09 Sep 2025
  
  in Public
  
  sampling unit
  
  maybe "sampling element" is better
8. Lfehrma 09 Sep 2025
  
  in Public
  
  Finite means there’s a limited, i.e., countable, number of units in the population
  
  Ok, somehow you like to stick to this finite definition of the population. In the next sentence, however, you say (correctly) that each possible sample must have a non-zero probability. In real life, you could select a sample plot at any point in the total forest area. In a finite population view this is not the case (and somehow you are limited to square cells as plots, which is not in line with what you explain later)?? Why is it so important to define the population as finite here? Imagine we conduct SRS by random selection of x,y coordinates in the area (and install plots there). Then we are outside of your population concept (the population of possible plots is infinite).
9. Lfehrma 09 Sep 2025
  
  in Public
  
  units in the population
  
  ... or the total forest area
10. Lfehrma 09 Sep 2025
  
  in Public
  
  for finite populations
  
  Which means you need to consider finite population correction in many estimators. Why not infinite populations (which is what we are doing in reality)? The concepts you introduce later (SRS, stratified, double sampling, ... ) would hold in exactly the same way, only that you can ignore the fpc

Annotators

Lfehrma

URL

finley-lab.com/ifdar/statbasics
www.finley-lab.com

Chapter 10 Preliminary definitions and concepts | Introduction to Forestry Data Analysis with R

2
1. Lfehrma 09 Sep 2025
  
  in Public
  
  For example, we might divide the 50-acre property into 250 1/5-th acre non-overlapping plots
  
  Which would be a very atypical strategy, right? We see such examples in old textbooks (like Shiver and Borders, and my former boss Akca used it too). Such a subdivision of the area would result in a finite population of cells. No other overlapping cell in between would have a positive probability of being selected. If you would like to avoid this, you can just say that a sample of plots is selected from the total forest area.
2. Lfehrma 09 Sep 2025
  
  in Public
  
  In some cases we’re able to observe (meaning measure) all units in the population
  
  If the population we are looking at allows it, yes. We can, e.g., measure all trees if we are interested in tree characteristics. It becomes impossible if the population consists of plots (i.e., all possible sample plots in an area).

Annotators

Lfehrma

URL

finley-lab.com/ifdar/notation
www.finley-lab.com

Chapter 7 Manipulating and summarizing data with dplyr | Introduction to Forestry Data Analysis with R

1
1. Lfehrma 09 Sep 2025
  
  in Public
  
  dplyr
  
  It is not at all a bad idea to introduce dplyr here. data.table would be an alternative and is often more efficient on large datasets, but the code is not as readable

Annotators

Lfehrma

URL

finley-lab.com/ifdar/dplyr
www.finley-lab.com

Chapter 5 Functions and functional programming | Introduction to Forestry Data Analysis with R

7
1. Lfehrma 09 Sep 2025
  
  in Public
  
  indicates the log rule
  
  If these are standard, fine. Since this is just to learn for the students, you can also use simpler formulations like Smalian.
2. Lfehrma 09 Sep 2025
  
  in Public
  
  for(i in 1:nrow(pef_trees)) { AGB_kg[i] <- agb(pef_trees$DBH_in[i], pef_trees$common_name[i], dbh_units = "in") }
  
  instead of looping, we would rather use lapply here later
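The suggestion can be sketched as follows. Note the assumptions: `agb_demo()` is a hypothetical stand-in for the book's `agb()` function with invented coefficients, `trees_demo` is made-up data, and `mapply()` (a sibling of `lapply()` that iterates over several vectors in parallel) is used because the function takes two per-tree arguments.

```r
# Hypothetical stand-in for the book's agb() function; coefficients are invented.
agb_demo <- function(dbh, species, dbh_units = "cm") {
  if (dbh_units == "in") dbh <- dbh * 2.54   # convert inches to centimeters
  b <- if (species == "red maple") c(-2.0, 2.4) else c(-2.5, 2.5)
  exp(b[1] + b[2] * log(dbh))
}

trees_demo <- data.frame(
  DBH_in = c(8.1, 12.4, 5.6),
  common_name = c("red maple", "balsam fir", "red maple")
)

# Loop version, mirroring the quoted code.
agb_loop <- numeric(nrow(trees_demo))
for (i in seq_len(nrow(trees_demo))) {
  agb_loop[i] <- agb_demo(trees_demo$DBH_in[i], trees_demo$common_name[i],
                          dbh_units = "in")
}

# Apply-family version: mapply() walks over both columns in parallel.
agb_apply <- mapply(agb_demo, trees_demo$DBH_in, trees_demo$common_name,
                    MoreArgs = list(dbh_units = "in"), USE.NAMES = FALSE)

all.equal(agb_loop, agb_apply)
```

The apply version needs no pre-allocated result vector and no index bookkeeping, which is the main readability gain over the loop.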
3. Lfehrma 09 Sep 2025
  
  in Public
  
  # Assign the species-specific regression coefficients.
  
  For a simple example this is OK, but later you will want a left join here, reading the model coefficients for the different species from a separate, related table. But you can also do that in base R:
  
      # separate table: model coefficients by species
      coef_df <- data.frame(
        species = c("species_1", "species_2", "species_3"),  # your species 1, 2, 3, ...
        a = c(0.5, 0.7, 0.6),  # coefficients
        b = c(1.2, 1.5, 1.3)
      )
  
      # join coefficients on trees
      library(dplyr)
  
      trees_with_coef <- trees_df %>% left_join(coef_df, by = "species")
  
  But dplyr has not been introduced yet, and I understand that they should first do it by hand...
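For readers who have not yet met dplyr, the same left join can be written with base R's `merge()`. This is a sketch with hypothetical data: the species labels, coefficients, and the `trees_df` table are placeholders, not values from the book.

```r
# Separate table: model coefficients by species (placeholder values).
coef_df <- data.frame(
  species = c("species_1", "species_2", "species_3"),
  a = c(0.5, 0.7, 0.6),
  b = c(1.2, 1.5, 1.3)
)

# Hypothetical tree list.
trees_df <- data.frame(
  tree_id = 1:4,
  species = c("species_2", "species_1", "species_3", "species_1"),
  dbh_cm  = c(21, 34, 18, 27)
)

# all.x = TRUE keeps every tree, even if its species has no coefficients (left join).
trees_with_coef <- merge(trees_df, coef_df, by = "species", all.x = TRUE)

# merge() sorts by the key, so restore the original tree order.
trees_with_coef <- trees_with_coef[order(trees_with_coef$tree_id), ]

# Apply the model row by row using the joined coefficients.
trees_with_coef$agb <- with(trees_with_coef, a * dbh_cm^b)
trees_with_coef
```

Keeping the coefficients in their own table means a new species only needs a new row, not a new branch in the function body.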
4. Lfehrma 09 Sep 2025
  
  in Public
  
  AGB=exp(β0+β1ln(DBH)),
  
  This is the log-transformed linear form that we often use for regression analysis (due to heteroskedasticity in the untransformed data). As you see, the coefficients here are β and should be estimated by a linear regression. However, when it comes to applying an allometric model, we usually use the form AGB=a*DBH^b, where b is β1 and a is exp(β0). I would recommend substituting β0 and β1 with a and b and using the above formulation. It is easier for the students to digest and more common in practice.
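That the two formulations are the same model can be verified in a few lines of R. The coefficient values below are invented purely for illustration.

```r
# Invented coefficients on the log scale.
beta0 <- -2.1
beta1 <- 2.4

# Equivalent power-model parameters.
a <- exp(beta0)
b <- beta1

dbh <- c(10, 25, 40)
log_form   <- exp(beta0 + beta1 * log(dbh))   # AGB = exp(b0 + b1 * ln(DBH))
power_form <- a * dbh^b                       # AGB = a * DBH^b

all.equal(log_form, power_form)
```

The identity follows from exp(β0 + β1 ln(DBH)) = exp(β0) * exp(ln(DBH))^β1.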
5. Lfehrma 09 Sep 2025
  
  in Public
  
  basal area is c⋅DBH^2, where c is often referred to as the “foresters constant” and, depending on your measurement system, is either 0.005454 or 0.00007854
  
  I would not recommend introducing any "foresters constants"; it is just simple geometry if we want to calculate the area of a circle from its diameter. Students tend to learn such things by heart and forget about the fundamentals... Here the pi/10,000 comes in because DBH is given in centimeters (conversion from cm to meters), and the strange (DBH/2)^2 takes the place of pi/4. I prefer to explain to my students how to calculate the area of a circle and remind them that we want the result in square meters, instead of confusing them.
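The two constants in the quoted passage fall out of the circle-area formula plus a unit conversion. The function names in this sketch are hypothetical.

```r
# Basal area as the area of a circle, with the unit conversion made explicit.
basal_area_m2  <- function(dbh_cm) pi / 4 * (dbh_cm / 100)^2   # cm -> m
basal_area_ft2 <- function(dbh_in) pi / 4 * (dbh_in / 12)^2    # in -> ft

# The "constants" are just pi / 4 divided by the squared conversion factor.
signif(pi / (4 * 100^2), 4)   # 7.854e-05
signif(pi / (4 * 12^2), 4)    # 0.005454

basal_area_m2(30)    # basal area of a 30 cm tree, in m^2
basal_area_ft2(12)   # basal area of a 12 in tree, in ft^2
```

Writing the conversion inside the function keeps the geometry visible and avoids memorizing a magic number.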
6. Lfehrma 09 Sep 2025
  
  in Public
  
  to some degree
  
  you can have the same basal area with a single big tree or hundreds of small trees. There is not necessarily a relation between BA and density
7. Lfehrma 09 Sep 2025
  
  in Public
  
  directly related to stand volume
  
  closely related

Annotators

Lfehrma

URL

finley-lab.com/ifdar/functions
www.finley-lab.com

Chapter 3 Scripts and reproducible workflow | Introduction to Forestry Data Analysis with R

1
1. Lfehrma 09 Sep 2025
  
  in Public
  
  In this chapter, we introduce scripts and their use to facilitate reproducible workflows
  
  It may be good to start with a single script, but later you usually want a project with multiple scripts, in which you can separate certain things from each other (e.g., functions and models from calculations or data import). I just saw that this comes in 3.4!

Annotators

Lfehrma

URL

finley-lab.com/ifdar/scriptsworkflows
www.finley-lab.com

Chapter 1 Overview and motivating data | Introduction to Forestry Data Analysis with R

28
1. Lfehrma 09 Sep 2025
  
  in Public
  
  How to learn:
  
  Maybe shift this chapter up and describe the datasets after?
2. Lfehrma 09 Sep 2025
  
  in Public
  
  or a complete enumeration of all units (e.g., trees) in a population
  
  Mhhh, here you may introduce some confusion for the reader. It is correct that in this specific experimental plot, and from this specific ecological perspective, the interest is in trees. And yes, if the interest is in tree characteristics, the total population is all the trees. But this is very different from forest inventories (observational studies) that aim at describing the forest area. There, the population is the forest area and the sampling units are subsets of this area (plots). Maybe substitute with "... is a full census or complete enumeration of all trees, which in this case represent the population"
3. Lfehrma 09 Sep 2025
  
  in Public
  
  The HTML version of Figure 1.5 shows the PEF LiDAR canopy height surface, forest inventory plot locations, and MU boundaries. Clicking on a plot shows the MU in which the plot resides, its identification number, and current basal area (ft²/ac). Clicking in a MU polygon (i.e., between plots) brings up a figure of the MU’s basal area changed over time. The printed version Figure 1.5 shows the PEF MU boundaries and plot locations colored by most current basal area (ft²/ac).
  
  Somehow difficult to consider printed and online at the same time. Some of it might be shifted to the caption.
4. Lfehrma 09 Sep 2025
  
  in Public
  
  plot remeasurements
  
  ...and plot measurements instead of point measurements
5. Lfehrma 09 Sep 2025
  
  in Public
  
  permanent sample plots
  
  ok, you here mention sample plots for the first time
6. Lfehrma 09 Sep 2025
  
  in Public
  
  canopy height surface
  
  usually "canopy height model CHM". Which is a surface...
7. Lfehrma 09 Sep 2025
  
  in Public
  
  high spatial resolution wall-to-wall data products such as gridded maps
  
  Well, this is what everybody says, but if we look at what is used for decision making at the end, I am not sure. Maps are great for communication, but calculations are done in tabular data. If we provide forest managers with high resolution wall-to-wall maps, they usually ask for simpler products showing useful classes. My experience is that we often aggregate our high resolution back to something more simple. In the carbon business emission factors are not provided at the pixel level, but e.g. for a certain forest type.
8. Lfehrma 09 Sep 2025
  
  in Public
  
  ~50%
  
  I think wood chemistry suggests a value closer to 0.47, but I have no reference at hand
9. Lfehrma 09 Sep 2025
  
  in Public
  
  LiDAR-based forest mapping
  
  Maybe substitute "LiDAR-based mapping" with "LiDAR-based or -assisted estimation". "Mapping" is confusing here
10. Lfehrma 09 Sep 2025
  
  in Public
  
  as well as extensive campaigns to collect field-based calibration data
  
  In the meantime, LiDAR data acquisition has become standard in many countries. The wording "calibration data" reflects a purely model-based perspective (common in remote sensing), while we usually look at it as ancillary data
11. Lfehrma 09 Sep 2025
  
  in Public
  
  no significant change
  
  Just a comment: here we look at a typical experimental study that allows hypothesis testing and statements of significance (in contrast to the observational studies mentioned above, in which we can maybe look at relationships but never at cause and effect). Such studies do not aim at describing a forest area, but are designed to investigate ecological or other relationships. Good to have such different examples!
12. Lfehrma 09 Sep 2025
  
  in Public
  
  point
  
  I will stop commenting on point versus plot here, since you consistently use point. In the end it is a question of personal style, and "point" is also fine as long as it is mentioned that trees around such a point are included based on a certain rule (and I assume you come to plot designs later). Speaking of points may help to explain the infinite character of our population later.
13. Lfehrma 09 Sep 2025
  
  in Public
  
  cruise point
  
  sample plot
14. Lfehrma 09 Sep 2025
  
  in Public
  
  If you’re reading the HTML version of this book, mousing-over a point gives the point number and a single click gives a list of AGS trees used to calculate the basal area color reflected in the figure legend.
  
  Considering a later printed book publication you might want to shift this into the figure caption
15. Lfehrma 09 Sep 2025
  
  in Public
  
  Points
  
  Sampling locations were selected in a systematic grid
16. Lfehrma 09 Sep 2025
  
  in Public
  
  points
  
  sample plots? Maybe you come to observation units in a later chapter, but I see no reason not to mention them here already. Tree characteristics cannot be measured at a point
17. Lfehrma 09 Sep 2025
  
  in Public
  
  Location of timber cruise points
  
  Alternatively: Selected plot locations
18. Lfehrma 09 Sep 2025
  
  in Public
  
  placement
  
  or "selection of measurement units". The typical target variables cannot be observed at a dimensionless point (as you say); usually our sampling elements are small subsets of the forest area.
19. Lfehrma 09 Sep 2025
  
  in Public
  
  mathematical formula
  
  ..., which is a model prediction and in fact the mean of observed values at this dbh
20. Lfehrma 09 Sep 2025
  
  in Public
  
  an allometric equation
  
  here: "from a statistical model". If it would be an allometric model of the form BM=a*dbh^b it would look different
21. Lfehrma 09 Sep 2025
  
  in Public
  
  are best fit lines
  
  See, we are not looking at strict allometry here (which would be a process model perspective) but are looking for the best fit. This is dangerous if you extrapolate beyond the range of the data. Look at Betula lenta, where the best fit to the current data would suggest that biomass decreases with increasing dbh. With regard to model choice, a data analyst has both in mind: a good fit to the current data and biological plausibility.
22. Lfehrma 09 Sep 2025
  
  in Public
  
  create
  
  Instead of "create" maybe use "fit a model using regression techniques" or something
23. Lfehrma 09 Sep 2025
  
  in Public
  
  allometric equation
  
  "Allometry" or allometric relations usually refer to a specific kind of relation (the relation between two relative growth rates in one individual). We tend to call many models "allometric" that in fact describe other kinds of relationships. Taking the character of allometric relations seriously would mean applying a power model, but in fact we often use others. Anyway, I would not change the text here because it is in line with the general understanding.
24. Lfehrma 09 Sep 2025
  
  in Public
  
  difficult to measure
  
  If at all it would be very destructive
  
  general
25. Lfehrma 09 Sep 2025
  
  in Public
  
  to identify features for consideration in the subsequent wrangling and analysis
  
  The target variables we are typically interested in are rarely "measurable" (volume, biomass, biodiversity, ...). It is another important task of the data analyst to identify and calculate the essential target variables or requested information from the data (features) at hand. In most cases this requires the application of models.
  
  general
26. Lfehrma 09 Sep 2025
  
  in Public
  
  For example, field crews collect forest inventory data to answer questions about the amount and location of timber or non-timber resources. Monitoring data are collected to understand change in forest characteristics. Highly detailed individual tree data are collected to understand allometry, which is the growth and size relationship between different parts of an organism. Experimental manipulations of trees, stands, or forests are used to better understand how environmental change and disturbance events impact individual growth rates and trends in population demographics.
  
  It is maybe helpful for the reader to distinguish between typical observational studies (forest inventories aim at estimating status and change) and experimental studies (investigating effects of treatments or researching into dynamics)
  
  general
27. Lfehrma 09 Sep 2025
  
  in Public
  
  forest sampling (Gregoire and Valentine 2007; Mandallaz 2007)
  
  Maybe include Forest Inventory - Methodology and Applications by Annika Kangas and Matti Maltamo and Sampling Methods, Remote Sensing and GIS Multiresource Forest Inventory by Köhl and Steen Magnussen.
28. Lfehrma 09 Sep 2025
  
  in Public
  
  50% to 80% of an analyst’s time is spent data wrangling
  
  I can confirm ;-)
  
  general

Tags

general

Annotators

Lfehrma

URL

finley-lab.com/ifdar/overview
Mar 2025
www.finley-lab.com

Chapter 12 Estimating forest parameters | Introduction to Forestry Data Analysis with R

12
1. Lfehrma 09 Mar 2025
  
  in Public
  
  trade-offs
  
  From a certain plot size onwards, the gain in precision is minor and increasing sample size instead of enlarging plots is statistically more beneficial. For typical variables like volume, empirical research shows that 15-20 trees per plot are enough. But there are other target variables that might justify larger plots...
2. Lfehrma 09 Mar 2025
  
  in Public
  
  Illustration
  
  Beautiful figures!
3. Lfehrma 09 Mar 2025
  
  in Public
  
  horizontal distance for all measurements
  
  If horizontal distance is measured no slope correction is required...
4. Lfehrma 09 Mar 2025
  
  in Public
  
  where there is no notion of area sampled and hence no sampling intensity to compute
  
  "point sampling" is just a form of "continuously nested plots" and not at all "plotless". There is no argument to look at it in a different way.
5. Lfehrma 09 Mar 2025
  
  in Public
  
  substantial portion of the forest area is sampled and the FPC is warranted
  
  Not consistent with the infinite population view described above!! The population is the infinite number of possible plot locations (continuum of sampling locations), from which some are selected. In some textbooks you find examples of partitioning the area into a finite number of grid cells from which some are selected. Then FPC is required, but this view is not consistent with what we usually do in forest inventories...
6. Lfehrma 09 Mar 2025
  
  in Public
  
  small
  
  Infinitely small! We draw a sample of plots from an infinite number of available plot locations in the area. The sample therefore does not shrink the remaining population size for the next draw...
7. Lfehrma 09 Mar 2025
  
  in Public
  
  FPC only applies to SRS
  
  This is not correct! The FPC applies to any sampling design if it is applied in a finite population. It is a correction that ensures that the standard error becomes 0 if all elements of the population are in the sample. It is relevant if the sample proportion is relatively large (and the remaining unsampled portion becomes small). The argument for not considering it in forest inventories is that we sample from an infinite population...
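The behaviour described in this comment can be illustrated for the simplest case, the standard error of a mean under simple random sampling without replacement. The function name `se_mean()` and the numbers are hypothetical; other designs use the same idea within their own estimators.

```r
# Standard error of a sample mean, with an optional finite population correction.
# N = Inf represents the infinite-population view (no correction).
se_mean <- function(s2, n, N = Inf) {
  fpc <- if (is.finite(N)) 1 - n / N else 1
  sqrt(s2 / n * fpc)
}

s2 <- 120                       # assumed sample variance

se_mean(s2, n = 25)             # infinite population: no correction
se_mean(s2, n = 25, N = 250)    # 10% of a finite population sampled: slightly smaller
se_mean(s2, n = 250, N = 250)   # full census: standard error is 0
```

With plots drawn from a continuum of possible locations, the sampling fraction n/N is effectively zero, which is why the correction drops out in the infinite-population view.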
8. Lfehrma 09 Mar 2025
  
  in Public
  
  plot sampling
  
  For sampling of single fixed area plots... Nested plots is also "plot sampling"
9. Lfehrma 09 Mar 2025
  
  in Public
  
  tree factor (TF)
  
  Why not EF for expansion factor?
10. Lfehrma 09 Mar 2025
  
  in Public
  
  were summarized to the 0.25 ha plot in advance
  
  Since you talked about nested plots before (unequal probability) it might be better for understanding if expansion is done on the level of single trees and aggregated on plot level afterwards.
11. Lfehrma 09 Mar 2025
  
  in Public
  
  scale tree measurements
  
  Since we take a sample of the forest area (on which we might find trees), I would rather say we upscale the plot area to a common basis. Same result but different wording...
12. Lfehrma 09 Mar 2025
  
  in Public
  
  at ground level
  
  Usually everything we measure or estimate refers to dbh height. The decision whether a tree is in or out is also made according to the horizontal distance measured at dbh height!

Annotators

Lfehrma

URL

finley-lab.com/files/ifdar/foreststat
