Completed on 3 Jan 2017 by NHGRI Journal Club. Sourced from http://biorxiv.org/content/early/2016/10/24/064915.
We reviewed this paper in our December preprint journal club. Overall, we found the paper to be well written and the conclusions to be convincing. We had only a few minor comments and suggestions:
· Please clarify what the coding score in Figures 3B and 4C means. It is difficult to move back and forth between the results and the methods to interpret the CS_hexamer equation, so it would help your readers if you gave a more intuitive interpretation of this value directly in the results. Also, how did you determine that 0.049 is the cutoff for a high coding score?
· It would have been nice to see two distinct tissues compared in Figure 4B, given that one might expect “brain” and hippocampus to be fairly similar. If this assumption is incorrect, please spell out why; otherwise, one or more confirmatory figures should be included in the supplement. Also, how did you choose 60% as the cutoff? Just by eye?
· Please add coding genes to Figure 4C.
· Figure 2B could be improved by adding density plots in the margins with asterisks indicating significance (such as those provided in Figure 4E).
· We would be interested to see the effect of the choice of algorithm for predicting coding potential. Do the results change significantly if you use, e.g., CPAT rather than CIPHER?
· In the discussion, you focus on lncRNAs as a potential intermediate step leading to de novo protein-coding genes. Isn’t it equally likely that lncRNAs (especially those that are highly conserved) were at some point functional and are degenerate in mouse? If so, please consider this in the discussion; otherwise, add a short explanation as to why this can’t be so.
Sofia de Pereira Barreira