8 Matching Annotations
  1. Oct 2024
    1. In this notebook we will analyze the metagenomic data from 3 different sampling campaigns made in Japan. The data was collected in the cities of Noto, Toyama and Kumamoto between 2014 and 2018. The DNA used as source for the metagenomic analysis was extracted from quartz filters which were used to capture the particles in the air for different periods of time with a high volume sampler (HVS) with either a 2.5 or 10 µm cutoff.

      In this notebook, we will analyze metagenomic data from three different air sampling campaigns conducted in Japan. The air samples were collected by HVS Sibata using PM2.5 and PM10 particle selection heads, concentrating the air biomass on quartz filters. Sampling locations included Noto, Toyama, and Kumamoto, and data were collected between 2014 and 2018.

      DNA was extracted from one-eighth of each quartz filter using an in-house protocol based on phenol-chloroform extraction with enzymatic treatment. The extracted DNA was analyzed by amplicon and whole-genome sequencing using Illumina and PacBio technologies.

    1. If we check the total reads assigned per sample day and head type, we observe that while for days 1 and 2 the total reads were rather high and similar for both head types, the 3rd day had a strong reduction, with the PM2.5 head having a much lower number of reads than the PM10 head (which still had a low number of reads compared to the previous two days).

      Upon reviewing the total number of reads assigned per sample day and head type, we observe that the number of reads detected on Days 1 and 2 is very similar for both heads. However, on the third day, the PM2.5 head shows a significantly lower number of reads compared to the PM10 head. This is an intriguing observation, as we would expect PM10 to include the PM2.5 fraction, suggesting that PM10 should have a higher mass. While this assumption holds true in terms of DNA yield, it is not reflected in two out of the three experiments.
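
The per-day, per-head comparison described above amounts to summing assigned reads over each (day, head) pair. A minimal sketch in plain Python follows; the record layout, the species labels, and every number in `records` are made-up assumptions for illustration and are not values from the campaigns.

```python
from collections import defaultdict

# Hypothetical records: (sampling day, head type, species label, assigned reads).
# All values are invented for illustration only.
records = [
    (1, "PM2.5", "species_a", 300.0),
    (1, "PM2.5", "species_b", 200.0),
    (1, "PM10", "species_a", 520.0),
    (2, "PM2.5", "species_b", 480.0),
    (2, "PM10", "species_b", 470.0),
    (3, "PM2.5", "species_a", 20.0),
    (3, "PM10", "species_a", 90.0),
]


def total_reads_by_day_and_head(rows):
    """Sum assigned reads for every (day, head type) combination."""
    totals = defaultdict(float)
    for day, head, _species, reads in rows:
        totals[(day, head)] += reads
    return dict(totals)


totals = total_reads_by_day_and_head(records)
for (day, head), reads in sorted(totals.items()):
    print(f"day {day} {head}: {reads:.0f} reads")
```

With real data, the same grouping makes it easy to see on which days one head falls below the other.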

    2. At a simple glance, it seems that the continuous sampling method yields more DNA (which makes sense since they are sampling for longer periods of time) but this does not seem to translate into a higher species richness. If we now check not only the richness but the actual species belonging to each of the sample types and see whether some species are more common in one type of sample than the other:

      The continuous sampling method yields more DNA, which is expected given the longer sampling durations. However, this higher DNA yield does not correspond to increased species richness. On the contrary, discrete sampling, which yields lower DNA amounts, shows relatively high species richness.

      When we look beyond species richness and examine the specific species present under each sampling method, we see that most species are common to both approaches. However, about 25% of species are unique to discrete sampling, compared with only around 3% unique to continuous sampling, suggesting that discrete sampling may capture a broader diversity of species.
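
The shared and unique shares mentioned above can be computed with set operations. The sketch below assumes that percentages are taken relative to all species seen by either method; the source does not state the denominator, so this is an assumption. The species labels are placeholders.

```python
def species_overlap(discrete, continuous):
    """Return the fraction of species that are shared or unique to each method.

    Fractions are relative to the union of both sets (an assumption).
    """
    all_species = discrete | continuous
    if not all_species:
        return {"shared": 0.0, "discrete_only": 0.0, "continuous_only": 0.0}
    n = len(all_species)
    return {
        "shared": len(discrete & continuous) / n,
        "discrete_only": len(discrete - continuous) / n,
        "continuous_only": len(continuous - discrete) / n,
    }


# Placeholder species sets, for illustration only.
discrete_species = {"sp_a", "sp_b", "sp_c", "sp_d"}
continuous_species = {"sp_a", "sp_b", "sp_e"}

overlap = species_overlap(discrete_species, continuous_species)
for label, fraction in overlap.items():
    print(f"{label}: {fraction:.0%}")
```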

    3. Let’s start by taking a look at how the DNA yield of the samples and the diversity (measured as the richness of species) compares:

      Let's begin by examining how the DNA yield of each sample compares to the species diversity (measured as species richness):
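
As a rough sketch of that comparison, species richness can be taken as the number of distinct species with at least one assigned read in a sample, then listed next to that sample's DNA yield. The sample IDs, yields (arbitrary units), read counts, and the `min_reads` threshold below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical DNA yield per sample, in arbitrary units.
dna_yield = {"S1": 12.0, "S2": 3.5, "S3": 8.0}

# Hypothetical assignments: (sample_id, species label, assigned reads).
assignments = [
    ("S1", "sp_a", 40.0),
    ("S1", "sp_b", 10.0),
    ("S2", "sp_a", 5.0),
    ("S2", "sp_b", 7.0),
    ("S2", "sp_c", 3.0),
    ("S3", "sp_a", 0.0),  # zero reads: not counted towards richness
    ("S3", "sp_c", 9.0),
]


def species_richness(rows, min_reads=1.0):
    """Count distinct species per sample with at least `min_reads` reads."""
    seen = defaultdict(set)
    for sample_id, species, reads in rows:
        if reads >= min_reads:
            seen[sample_id].add(species)
    return {sample_id: len(species) for sample_id, species in seen.items()}


richness = species_richness(assignments)
for sample_id in sorted(dna_yield):
    print(sample_id, dna_yield[sample_id], richness.get(sample_id, 0))
```

In this toy input the sample with the lowest yield has the highest richness, which is the kind of pattern the comparison is meant to reveal.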

    4.         name                          sample_id  reads
       128710  Krasilnikoviella flava        HVSP15      25.0
       337027  Marmoricola sp. Leaf446       HVSP45     599.0
       49047   Limibaculum sp. M0105         HVSP6       33.0
       157388  Nocardioides ungokensis       HVSP19     148.0
       269015  Rubripirellula reticaptiva    HVSP35      28.0
       167215  Streptomyces hoynatensis      HVSP20      40.0
       23868   Georgenia subflava            HVSP3      115.0
       461309  Jeotgalicoccus psychrophilus  HVSP59     133.0
       403987  Rubrobacter aplysinae         HVSP53      81.0
       208014  Phyllostachys edulis          HVSP26      28.0

      What is this doing here? Why is this list here?

    5. We have been given a list of species that are considered to be contaminants, which we will remove from the analysis:

      Species identified as contaminants will be excluded from the analysis.
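
The exclusion step could look like the sketch below. The first two rows are taken from the listing in annotation 4; the third row and the contaminant list are placeholders, since the actual list of contaminants is not shown in the source. Matching names case-insensitively is also an assumption.

```python
# Rows use the name / sample_id / reads columns of the listing above.
rows = [
    {"name": "Krasilnikoviella flava", "sample_id": "HVSP15", "reads": 25.0},
    {"name": "Georgenia subflava", "sample_id": "HVSP3", "reads": 115.0},
    # Placeholder row, not real data:
    {"name": "Placeholder contaminans", "sample_id": "HVSP0", "reads": 1.0},
]

# Hypothetical contaminant list; the real one is not given in the source.
contaminants = ["placeholder contaminans "]


def remove_contaminants(records, contaminant_names):
    """Drop records whose species name is on the contaminant list.

    Names are compared case-insensitively after trimming whitespace.
    """
    blocked = {name.strip().lower() for name in contaminant_names}
    return [r for r in records if r["name"].strip().lower() not in blocked]


cleaned = remove_contaminants(rows, contaminants)
for record in cleaned:
    print(record["name"], record["sample_id"], record["reads"])
```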