4 Matching Annotations
  1. Jul 2018
    1. On 2013 Nov 06, Rafael Najmanovich commented:

      As someone involved in the development of the field (see Najmanovich R, 2008, http://f1000research.com/articles/2-117/v2 and Najmanovich RJ, 2005), I find the subject area of this article highly relevant to the goal of understanding protein function and predicting it from structure. However, I have several reservations concerning the current paper. They are presented below in decreasing order of importance, which is in fact the exact opposite of the order in which the corresponding results appear in the paper, and are followed by some final remarks.

      1 - Recovery of Holo Structures using the Apo Structure as a Query. In this section the authors calculate binding-site similarities between each Apo (unbound) protein form and the entire set of Holo (bound) protein forms. They set out, in their words, to assess ‘the likelihood of correctly identifying the cognate partner of each detected apo binding site’. This language is unclear: we don’t know whether they are assessing the probability that the most similar binding site is one that binds a similar ligand, or that it is the binding site of the Holo protein identical to the Apo form used as query. They find that in 87% of cases ‘the binding sites were correctly matched’. The authors thus appear to show that, based on binding-site similarities, they can detect the Holo form of the same protein for a given Apo form in 87% of cases. That is to say, the method accounts for whatever flexibility exists between Apo and Holo forms in 87% of cases, since it ranks an identical binding site (except for flexibility) at the top. This result does not in any way measure the ability of the method to predict function (defined as detecting the correct binding ligand). As a side note, a simple sequence alignment would obviously detect the correct Holo partner from the Apo form in 100% of cases. To test the predictive capabilities of the method, the authors should use a truly non-redundant dataset (see point 3) and check whether the high-scoring hits include the Holo forms of other proteins (not the one that pairs with the query) that bind the same or highly similar ligands (for details see Najmanovich R, 2008).
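
      The test suggested in point 1 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's protocol nor that of any published tool: the `Site` record, the `protein_id` and `ligand` fields, the `similarity` callback and the top-1 criterion are all hypothetical choices made for the example.

```python
from typing import Callable, List, NamedTuple


class Site(NamedTuple):
    # Hypothetical record: the apo and holo forms of one protein share a protein_id.
    protein_id: str
    # Ligand bound in the holo form (for an apo query, the ligand it is known to bind).
    ligand: str


def top1_ligand_accuracy(
    apo_queries: List[Site],
    holo_sites: List[Site],
    similarity: Callable[[Site, Site], float],
) -> float:
    """Fraction of apo queries whose best-scoring holo site from a
    DIFFERENT protein binds the same ligand as the query."""
    hits = 0
    scored = 0
    for query in apo_queries:
        # Exclude the query's own holo partner, so that recovering the
        # same protein cannot count as a successful prediction.
        candidates = [h for h in holo_sites if h.protein_id != query.protein_id]
        if not candidates:
            continue
        best = max(candidates, key=lambda h: similarity(query, h))
        scored += 1
        if best.ligand == query.ligand:
            hits += 1
    return hits / scored if scored else 0.0
```

      Excluding the cognate Holo form is the key design choice here: with it left in, the score mostly measures tolerance to Apo/Holo flexibility rather than the ability to predict the bound ligand.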

      2 - Detection of Binding Sites on Structures with or without Ligands Bound. In this section the authors seek to quantify the ability of the method to detect, out of all possible cavities, the cavity that represents the binding site (i.e., where the bound ligand is found). They search against the cavities in the Holo or the Apo sets and report 91% and 88% success rates, respectively. While these results may at first seem impressive, it is important to note that, as already reported in 1996, the largest-volume cavity is the one that contains the binding site in 83% of cases (Laskowski RA, 1996). The challenge is not to detect the cavity that contains the binding site but the actual binding site, that is, the residues with atoms in contact with ligand atoms.
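
      The distinction drawn in point 2 can be made concrete with a small sketch. It assumes cavities are given as a mapping from a hypothetical cavity label to its volume, and binding-site residues as sets of hypothetical residue labels; the Jaccard index is used purely as one illustrative residue-level measure and is not taken from the paper or from Laskowski RA, 1996.

```python
from typing import Dict, Set


def largest_cavity(volumes: Dict[str, float]) -> str:
    """Trivial baseline: pick the cavity with the largest volume."""
    return max(volumes, key=lambda name: volumes[name])


def residue_overlap(predicted: Set[str], contacts: Set[str]) -> float:
    """Jaccard index between the predicted binding-site residues and the
    residues actually in contact with the ligand (0.0 if both are empty)."""
    union = predicted | contacts
    return len(predicted & contacts) / len(union) if union else 0.0
```

      A method would then be judged both against the `largest_cavity` baseline at the cavity level and by `residue_overlap` at the residue level, since a cavity can contain the ligand while most of its lining residues make no contact with it.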

      3 - Curated LigASite Database. The authors mention that they use the non-redundant LigASite dataset (Dessailly BH, 2008). However, it is important to highlight that redundancy is always relative to the question at hand. The context here is the prediction of function from structure. In that context, it is necessary to use a dataset that, from the point of view of the proteins it contains, has no redundancy at the level of protein folds in addition to sequence. As an example, human protein kinases can have very low sequence identity, below the 25% threshold used in LigASite, but almost 100% binding-site similarity, sequence or otherwise. Additionally, from the point of view of the ligands, a quick inspection of the originally reported LigASite non-redundant dataset shows an over-representation of nucleotide-containing molecules (MG ADP MN ATP NAG NAD GOL GDP PO4 GLC NAP GAL COA AMP ANP).

      Finally, the authors apply some extra filters to the LigASite dataset available at the time of their study but do not publish the final dataset. This prevents any careful analysis of its non-redundancy and does not allow future work to compare results with the ones reported here. In summary, the data presented here do not allow a reader to evaluate in any way the quality of the method. To truly test a method, one has to compare it to existing methods that perform the same tasks (such as IsoCleft, referenced above, and many others that exist) and use a dataset that is non-redundant for the particular questions being asked.


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
