On 2019-07-08 04:08:39, user Fraser Lab wrote:
This work describes the implementation of a data processing pipeline for acquiring high-resolution maps of microtubules (MTs) from cryo-electron microscopy (cryoEM) data using the RELION software. As in other pipelines for processing microtubule EM data, this implementation requires extensive custom processing because of the pseudosymmetric nature of most MTs assembled in vitro (also observed in vivo): a "seam" down the length of the assembly disrupts the otherwise helical symmetry. The broken symmetry means that existing methods for processing purely helical particles equate nonequivalent positions and produce low-quality reconstructions. The authors implement a treatment of these particles that accounts for the seam and produces high-resolution structures of the MT α and β asymmetric units. It builds on implementations of similar pipelines for the same purpose using other software, with the key advantage of conducting all steps in a single program that most cryoEM users are already familiar with. The pipeline consists of a set of scripts and a series of steps the user should complete in the RELION graphical user interface (gui) in order to obtain the asymmetric unit reconstructions. The authors test their pipeline on three example datasets with different decorators on the α or β subunits that aid in initial alignment and discrimination between the two, and note that they have successfully (with minor modifications) applied the pipeline to a more challenging dataset with both subunits decorated.
The major success of this paper is the clear and thorough description of the steps necessary to produce high-resolution α and β subunit reconstructions, complete with clear justification for each step and descriptions of expected results so that an advanced user can intervene when intermediate results deviate from expectations. This tool meets an immediate need in the structural biology community for analysis of MTs, which are inadequately reconstructed from cryoEM data by existing helical or strictly single-particle methods, and which play an important role in the cell, interacting with a variety of other molecular machines. Ideally, it would be benchmarked against the other pipelines mentioned in the paper (e.g. https://github.com/nogalesl... from: https://www.ncbi.nlm.nih.go...), but one of us (JSF) knows from personal experience that it can be tricky to set up the necessary EMAN and frealign environments correctly to do such benchmarking properly. Here, the ability to complete this analysis without exporting steps to other programs could be a major boost in accessibility. Moreover, as the authors have built upon a popular cryoEM image processing program that has a gui highly accessible to novice and intermediate users as well as command-line tools that expert users may use for more advanced customizations and interventions, we anticipate this pipeline will be enthusiastically adopted by many users.
We also applaud the authors' choice to make the scripts open-source and publicly available on GitHub, which will facilitate the active conversation between users and developers (and sometimes software developers who did not author the original work) that leads to major breakthroughs and advancements in later versions. However, we cannot comment on the scripts themselves yet, as a full path to the source code is not provided in the manuscript and a search for “MiRP relion github” didn’t yield anything informative. We would like to request that the authors include it in the revised manuscript and also provide it to us during the review process so we may evaluate this important component of the work. We recommend using Zenodo (https://zenodo.org/) to generate a DOI for a snapshot of the repository, which will also produce a timestamp and facilitate formal versioning.
We identify a few major weaknesses in the manuscript in its current form, all of which we hope the authors can address in a revision. First, the final, high-resolution reconstruction is the αβ dimer, not the C1 reconstruction of a full helical turn, which may not serve the goals of all users. The authors identify the final averaging step that disrupts the density for all but the αβ dimer directly opposite the seam and describe alternative approaches, including one implemented by the Carter group while the manuscript was in preparation. We would strongly encourage them to implement one of these approaches so that biological questions that require examination of the whole MT can also be addressed. We are also unclear on how the present implementation would process both the microtubule and fiducial protein for datasets with dynein or EB3 bound, and would like to see this explicitly discussed (or better, tested if EMPIAR datasets are available).
Second, at times the authors describe what they expect from data they have not processed, for example on page 14 lines 1-4. Given that they have the necessary tools in-hand and this work describes the method, we would press them to test this type of claim and describe the supporting evidence. They have also described processing a dataset with fiducial markers on both α and β monomers but not described the modifications they had to make to the pipeline for this dataset, and they have not yet (to our knowledge) used the method to process undecorated MTs. As they cite successful processing of undecorated MTs by the Nogales group, proof-of-concept processing of undecorated MTs would be an important component of making this pipeline at least as useful as existing methods. The other case where we would strongly prefer to see the authors test their claims is on page 12 lines 46-48, where they speculate on the effects of excluding some MTs from further analysis, and although the authors do not make any claims about performance using automated MT picking, we would be very interested to see this tested (even if the result is that it is discouraged).
Third, while we agree with the statement: “strikingly, although application of MiRP compared to standard helical processing has a negligible effect on the reported reconstruction resolution by FSC (Fig. 6c), the structural details are clearly superior in quality,” based on the snapshots of density shown at a specific contour in Fig 6 and Supp Fig 3, it is possible to use tubulin model refinement and other quantitative evaluations of the map to validate this statement. For example, compared to the standard helical map we would expect a higher quality reconstruction to have a smaller convergence RMSD when multiple independent models are rebuilt into the map (as in: https://www.ncbi.nlm.nih.go...); a better EMRinger score (https://www.ncbi.nlm.nih.go...) when evaluated with the same starting model rigid body fitted into the map; and lower B-factors/better model geometry when an atomic model is refined.
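To illustrate the first of these metrics, here is a minimal sketch of one way a convergence RMSD could be computed. It assumes the independently rebuilt models share the same atoms in the same order and already sit in the frame of the map, so no superposition is needed. Defining the metric as the mean pairwise RMSD is our assumption for illustration, not necessarily the definition used in the cited work, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def rmsd(model_a, model_b):
    """RMSD between two models given as equal-length lists of (x, y, z) tuples."""
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(model_a, model_b)
        for p, q in zip(atom_a, atom_b)
    )
    return sqrt(squared / len(model_a))


def convergence_rmsd(models):
    """Mean pairwise RMSD over all pairs of independently rebuilt models.

    Assumption: lower values mean the map determines the model more tightly.
    """
    pairs = list(combinations(models, 2))
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)


# Toy example: three two-atom "models" displaced along z by 0, 1 and 2 Å.
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
]
spread = convergence_rmsd(models)
```

Running the same calculation on models rebuilt into the MiRP map and into the standard helical map would give two numbers that can be compared directly.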
Fourth, reiterating that this is a methods paper, we find it critical that the raw data be made available on EMDB for all datasets described in the manuscript, not just the C1 reconstructions and symmetrized asymmetric units. This is important for reproducibility, open science, and the development of exciting new methods like this one using publicly available test data.
In sum, we find this an important piece of work that will immediately improve the ability of groups working on MTs to recover high-resolution structures, pending our several major reservations that we hope the authors will resolve in a revision. We also identify several minor points that could be improved, mostly regarding readability, and a few suggestions for alternative implementations of some steps in this or a future version of the pipeline. As the line numbers do not appear to be spaced the same as the text, for these points, we have indicated the closest line number to the line when printed.
Some figures or tables referenced in the main text are not referenced correctly, such as on page 9 line 31, Fig. 6aii (referencing a panel that does not exist).
Table 1 should include accession codes in the EMDB.
The authors might comment on the biological relevance of MTs with seams, given that these are more often encountered in vitro and much less often in vivo, preferably in the introduction.
We suggest a figure that visually highlights the symptoms of misalignment of the seam and/or helical averaging of MTs with seams. Including correlation coefficients with this figure could help illustrate the challenge this pipeline overcomes. This could also illustrate the signal boost of the superaverage and the symptoms of out-of-register units. The figure could be referenced at several points later in the text to explain why certain steps are necessary.
At the end of the first paragraph on page 6, the authors describe "structural constraints of MT polymers" but apply restraints in orientational and translational searches. It would be helpful to expand on the rigidity of these restraints and whether it varies with distance to further neighbors, if applicable. Ideally (or possibly in a future version), the authors could consider restraining each α or β monomer relative to its immediate neighbors and using this approach in combination with variable restraint rigidity to aid in reconstructions of monomers at the seam and in distorted regions.
Several points regarding resolution starting at the end of page 6 and continuing in the first paragraph on page 7 describe increments of resolution or changing pixel size by binning, which have no meaning in isolation (e.g. a difference of 0.2 Å or binning x 4). The starting or ending absolute quantities should be included (e.g. improvement of the resolution to 3.2 Å or a final pixel size of 3 Å). This is repeated on page 11 lines 11-12.
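To make the point concrete, a binning factor only becomes interpretable once it is paired with the unbinned pixel size. The sketch below uses hypothetical values: the 0.75 Å/pixel starting size is our assumption, chosen so that 4x binning reproduces the 3 Å example above, and is not taken from the manuscript.

```python
def binned_pixel_size(pixel_size, bin_factor):
    """Pixel size (Å/pixel) after binning by an integer factor."""
    return pixel_size * bin_factor


def nyquist_limit(pixel_size):
    """Best resolution (Å) representable at a given pixel size: two pixels."""
    return 2.0 * pixel_size


unbinned = 0.75                            # hypothetical unbinned pixel size, Å/pixel
binned = binned_pixel_size(unbinned, 4)    # pixel size after "binning x 4"
limit = nyquist_limit(binned)              # resolution ceiling at that binning
```

Reporting the binned pixel size alongside the factor lets a reader see immediately what resolution ceiling applies at that stage.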
The authors describe that "there is a clear bias towards a certain range of Rot angles" on page 8 line 16, but as this is the expected behavior and not a ground truth, it should be described as such.
Similarly, on page 9 line 19, the authors intermix behavior on their test data ("As expected") with description of the method, and should more clearly separate these.
On page 8 lines 18-22, the authors describe their approach for using the bias toward one Rot angle to select the correct seam location. We recommend testing the alternative method of a grid search over correlation coefficients, or describing how this is effectively accomplished during the global search step.
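As a rough illustration of the grid search we have in mind, the sketch below scores one segment image against a reference projection for every candidate seam position and keeps the best-scoring candidate. It assumes one candidate per protofilament and that images and references are available as flattened lists of pixel values. The function names are hypothetical and do not correspond to anything in MiRP or RELION.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length pixel lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def seam_grid_search(image, references):
    """Score the image against every candidate and return (best_index, scores).

    references[k] is the reference projection for seam candidate k
    (assumption: one candidate per protofilament).
    """
    scores = [pearson(image, ref) for ref in references]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy example with three candidates; the second is closest to the image.
image = [1.0, 2.0, 3.0, 4.0]
references = [
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 2.0, 3.0, 5.0],
    [1.0, 1.0, 2.0, 1.0],
]
best, scores = seam_grid_search(image, references)
```

Keeping the full list of scores, rather than only the winner, would also let the authors report how decisively the best candidate beats the runner-up for each MT.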
On page 8 lines 21-22, the result of the Rot search is described as an approximation. It would be helpful to clarify whether this result is precise but sometimes inaccurate, or accurate but known to be imprecise.
On page 8 in the section on X-Y shift smoothing, the authors describe a remedy for out-of-register asymmetric units involving resetting excessively large shifts to zero and re-refining. We propose an alternative method by analysis of the distribution of X-Y shifts that identifies the out-of-register shift vector and adjusts excessively large shifts by modulo arithmetic. This would reduce the error in the reset shifts.
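A minimal sketch of the modulo adjustment follows. It assumes the out-of-register shift vector has already been identified from the distribution of shifts and is passed in as `repeat_vec`. The toy value of 41 Å along X is our assumption for illustration, and the function name is hypothetical.

```python
def remove_register_error(shift, repeat_vec):
    """Subtract the nearest whole number of repeat vectors from an (x, y) shift.

    The small residual shift is preserved instead of being reset to zero.
    """
    sx, sy = shift
    vx, vy = repeat_vec
    # Number of whole repeats contained in the shift, measured along repeat_vec.
    n = round((sx * vx + sy * vy) / (vx * vx + vy * vy))
    return (sx - n * vx, sy - n * vy)


repeat_vec = (41.0, 0.0)  # assumed out-of-register vector, Å
corrected = [
    remove_register_error(s, repeat_vec)
    for s in [(3.0, -2.0), (42.0, 1.0), (-80.0, 0.5)]
]
```

In this toy case the in-register particle is untouched, while the two out-of-register particles keep residuals of (1.0, 1.0) and (2.0, 0.5) that a reset to zero would discard.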
As part of the same description on page 8 lines 47-48, the authors describe enforcing all X/Y shifts in a MT to follow a single slope and intercept, and should clarify whether this is constrained or restrained.
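The distinction we are asking about can be sketched as follows, assuming a least-squares line fitted to one shift component as a function of position along the MT. With `weight=1.0` every shift is replaced by the fitted value (a constraint), and with a smaller weight each shift is only pulled toward the line (a restraint). The blending scheme and function names are our own illustration, not the authors' implementation.

```python
def fit_line(positions, shifts):
    """Ordinary least-squares slope and intercept of shift versus position."""
    n = len(positions)
    mean_x = sum(positions) / n
    mean_y = sum(shifts) / n
    sxx = sum((x - mean_x) ** 2 for x in positions)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(positions, shifts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def smooth_shifts(positions, shifts, weight):
    """Blend each shift with its fitted value.

    weight = 1.0 reproduces a hard constraint; 0 < weight < 1 acts as a restraint.
    """
    slope, intercept = fit_line(positions, shifts)
    return [
        (1.0 - weight) * y + weight * (slope * x + intercept)
        for x, y in zip(positions, shifts)
    ]


positions = [0.0, 1.0, 2.0]
shifts = [0.0, 3.0, 0.0]                                 # middle segment is an outlier
constrained = smooth_shifts(positions, shifts, 1.0)      # all forced onto the line
restrained = smooth_shifts(positions, shifts, 0.5)       # outlier only partly corrected
```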
The authors could expand on the process of 'segment average' image generation on page 4 line 29 and the source of known helical parameters on page 4 line 33.
The sample preparation for cryoEM section starting on page 3 could include greater detail, e.g. Vitrobot parameters during blotting and freezing.
The authors could clarify the difference between defects and switches in PF number on page 7, lines 27-28.
The description of a 'clean' seam on page 10 line 48 is confusing. Describing this as a MT with no seam might be clearer, if that is the correct interpretation.
There are a couple of creative uses of the word "allocation": on page 9 line 6 we suggest substituting "positioning" and on page 13 line 22 we suggest substituting "assignment".
The wording on page 5 line 52 seems to imply the RELION nomenclature preceded the Euler angle nomenclature used in many other applications, so we recommend dropping the modifier "former".
The wording on page 7 line 5 implies the previously implemented approaches are lacking in some way, and we recommend dropping the modifier "albeit". This is repeated on page 12 line 42.
The authors could describe which of the operations through the gui could alternatively be run on the command line on page 5, lines 59-60.
On page 13 line 21, the authors claim their implementation is "the only way to avoid introducing artefacts." This may be an overly bold claim.
We are unsure what the authors mean to do with the "41 Å shifted positions" on page 9 line 7.
Some of the word choices could be made more accessible to all readers, for example by using the more common and equivalent "while" instead of "whilst" in several instances.
The sentence "Initial Tilt ... picking coordinates" on page 6 lines 54-56 is unwieldy and could be rephrased.
We find the sentence beginning "In other words" on page 8 lines 2-4 redundant and unnecessary.
The qualifier "data collection parameters" should not accompany ice thickness on page 9 line 42.
We would prefer sticking to one set of units on page 10 line 1 and substituting "sub-10 Å" for "subnanometer".
The abbreviation "(DQE)" on page 9 line 43 is not used again and may be omitted.
There are several spacing errors throughout the text. Between a numeral and a unit, there should be a space, except where the next character is º or %.
On page 8 line 4 and in several other instances, where "however" is an interjection, it should also be preceded by a comma, e.g. "this register is, however, very error prone".
There is a typo on page 3 line 52 (1 mg/ml --> 1 mg/mL), an incorrect abbreviation on page 4 line 6 (sec --> s), an overly dense abbreviation on page 4 line 59 (4xbin), a typo on page 4 line 53 (smoothened --> smoothed), a typo on page 5 line 14 (psi/tilt/ranges --> Psi/Tilt ranges), use of redundant "around" and "~" modifiers on the same quantities on page 6 line 53, incorrect pluralisation on page 7 lines 33-34 (confidence --> confidences), an unnecessary word "score" on page 8 line 7, a missing word "good" in "good signal to noise and good angular distribution" on page 9 line 33, an unnecessary hyphen in "ice-thickness" on page 9 line 42, and an unnecessary comma in "reconstructions, remains" on page 9 line 56.
We review non-anonymously, Iris Young and James Fraser (UCSF).