What exactly is the definition of "enriched" here? When I think of "enriched", I think of a statistical definition similar to hypergeometric enrichment, where proteins are present in this fraction more often than expected by chance.
It is also not clear, either here or from the methods, how exactly the protein comparison for the differential analysis is set up. The methods say that amyloid plaques are isolated using the SDS protocol and then digested for proteomics. Is the CSF-only control then prepared for proteomics without the amyloid isolation protocol?
That is the only approach that would make sense, because based on the ThT assay in Figure 1, there were essentially no amyloid plaques observed in the CSF-only samples after 24-48 hours.
But the proteins associated with amyloid tangles that were previously shown to be important, like APLP1, ApoE, etc., all show statistically significant decreases in the virus samples compared to CSF. If you are comparing amyloid plaque associated proteins to control CSF, they should be increased rather than decreased, correct? The only way I can see the results of Figure 1 making sense is if you are comparing the supernatant after removing the amyloid plaques, or if the design matrix in limma was inverted. This would be possible to evaluate if the data were deposited in an appropriate repository.
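To illustrate the design matrix concern, here is a minimal sketch in Python (not limma, and not the authors' pipeline) of how swapping the two groups in a contrast flips the sign of every log2 fold change. The intensity values and the function name `log2_fold_change` are hypothetical and chosen only for illustration.

```python
from math import log2
from statistics import mean


def log2_fold_change(group_a, group_b):
    """Mean log2 intensity of group_a minus mean log2 intensity of group_b."""
    return mean(log2(x) for x in group_a) - mean(log2(x) for x in group_b)


# Hypothetical intensities for one protein that is more abundant in the
# plaque fraction than in control CSF (made-up numbers, not study data).
plaque = [800.0, 1000.0, 1200.0]
csf = [100.0, 125.0, 80.0]

intended = log2_fold_change(plaque, csf)  # plaque vs. CSF: positive
inverted = log2_fold_change(csf, plaque)  # CSF vs. plaque: same size, negative

print(f"plaque - CSF: {intended:+.2f}")
print(f"CSF - plaque: {inverted:+.2f}")
```

If the contrast were inverted in this way, a protein that is truly increased in the plaque fraction would be reported as decreased, with the same magnitude and the same p-value.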
I realize this might be out of scope, but it would be nice to have proteomics quantification of:
* virus-induced amyloid tangles
* CSF after removal of virus-induced amyloid tangles
* control CSF after applying the amyloid tangle selection protocol
* control CSF
I think these four sets would make it easier to see where specific proteins actually appear, and to cross-check that the overall differential results are correct and consistent.
Finally, an actual enrichment test would be good to do. For example, using the set of proteins previously detected in amyloid plaques as the "annotated set", you could use a hypergeometric test to check whether the differential proteins are enriched in this set more than expected by chance, with the total set of proteins detected in both the amyloid plaques and the control CSF as the background.
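As a rough sketch of the test I have in mind, the following Python computes the one-sided (upper-tail) hypergeometric p-value using only the standard library. The function names and the placeholder protein identifiers (`P1` to `P10`) are hypothetical; the real inputs would be the detected proteins, the previously reported plaque proteins, and the differential proteins from the study.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_differential, n_overlap):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    X is the number of annotated proteins among n_differential proteins
    drawn without replacement from a background of n_background proteins,
    of which n_annotated are annotated.
    """
    upper = min(n_annotated, n_differential)
    numerator = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_differential - i)
        for i in range(n_overlap, upper + 1)
    )
    return numerator / comb(n_background, n_differential)


def enrichment_from_sets(background, annotated, differential):
    """Restrict both sets to the background, then test their overlap."""
    background = set(background)
    annotated = set(annotated) & background
    differential = set(differential) & background
    overlap = annotated & differential
    return hypergeom_enrichment_p(
        len(background), len(annotated), len(differential), len(overlap)
    )


# Toy example with placeholder identifiers (not real data).
background = {f"P{i}" for i in range(1, 11)}  # all proteins detected
annotated = {"P1", "P2", "P3"}                # previously seen in plaques
differential = {"P1", "P2", "P4"}             # called differential

p_value = enrichment_from_sets(background, annotated, differential)
print(f"enrichment p-value: {p_value:.4f}")
```

The choice of background matters here: it should contain only proteins that could have been detected in the experiment, not the whole proteome, otherwise the p-value will be too small.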