Completed on 3 Aug 2016 by NHGRI Journal Club. Sourced from http://biorxiv.org/content/early/2016/06/07/057570.
We reviewed this paper in our July preprint journal club. Obviously, the potential for contamination to influence the results is the first question that all reviewers will ask. Although we were impressed with the steps the authors took to mitigate such concerns, we felt there are a couple of simple experiments that still need to be done:
1) The entire pipeline should be run in parallel with both a blood sample and a saline or ultra-pure water control sample. If the latter is free of contamination, and if the sequencing pipeline introduces no contamination, then the control library is expected to generate no data; thus, any data that it does generate provide a background that can be used to normalize the data from the blood sample.
2) You state that you expect the microbes found in blood originate in the gut or oral cavity. But isn’t it possible that you’re simply picking up microbe-derived RNA that has crossed into the blood, rather than RNA of microbes that are in the blood? If such a diversity of bacteria truly does exist in the blood, then shouldn’t it be possible to observe it by microscopy or by culture? Some of the species you report will certainly grow in culture. Alternatively, you should be able to perform flow sorting or some other single-cell approach to isolate non-human cells from the blood. Another experiment you could do to test your hypothesis (which is probably beyond the scope of this paper but would be interesting for a future study) is to compare the blood profiles of mice raised in germ-free versus dirty environments.
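To make the negative-control suggestion in point 1 concrete, here is a minimal sketch of how reads from the saline or water control might be used as a background for the blood sample. This is only an illustration of the idea, not the authors' pipeline: the function name, the `min_fold` threshold, and the taxon labels and counts are all hypothetical.

```python
def subtract_background(sample_counts, blank_counts, min_fold=10.0):
    """Remove background seen in a blank (saline/water) control.

    sample_counts, blank_counts: dict mapping taxon name -> read count.
    min_fold is an assumed, adjustable threshold: a taxon that also appears
    in the blank is kept only if its count in the blood sample is at least
    min_fold times its count in the blank.
    Returns a dict of taxon -> background-subtracted read count.
    """
    cleaned = {}
    for taxon, n_sample in sample_counts.items():
        n_blank = blank_counts.get(taxon, 0)
        if n_blank > 0 and n_sample < min_fold * n_blank:
            # Not distinguishable from reagent or pipeline background.
            continue
        cleaned[taxon] = n_sample - n_blank
    return cleaned


# Invented counts, for illustration only.
blood = {"Taxon A": 500, "Taxon B": 40, "Taxon C": 12}
blank = {"Taxon B": 35, "Taxon C": 1}
print(subtract_background(blood, blank))
```

In practice the two libraries would also need to be scaled to comparable sequencing depth before subtraction; that step is omitted here for brevity.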
Some additional concerns/suggestions:
1) It is not stated whether or not you used index (i.e. barcoded) adapters. As you probably know, Illumina instruments have some degree of carryover between runs (http://core-genomics.blogspot..... So if a metagenomic sample was run previously on the same instrument (or in a different lane during the same run), a fraction of the reads (consistent with what shows up as non-human in your samples) could be explained by carryover.
2) You state that you observed no eukaryotic species, but, to our knowledge, the PhyloSift reference database does not include any eukaryotic proteins by default. Were you specifically not looking for eukaryotes? If there were contamination from the skin during blood draw, we would expect to see some evidence of yeast species.
3) You state that you drew two vials of blood from each individual and randomly selected one for sequencing. Yes, this will randomly distribute errors, but it would still be informative to show a comparison between the microbial communities detected in first-draw vials versus second-draw vials.
4) You find the SCZ group to be different from the other three, but this group is also quite different in terms of age and/or sex ratio. Are you concerned about these potentially confounding factors? What happens if you restrict your analyses to only age- and sex-matched subsets of each cohort?
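As an illustration of the matched-subset analysis suggested in concern 4, the sketch below greedily pairs each subject in one cohort with an unused subject of the same sex and similar age from another cohort. It is one simple way to build such subsets under stated assumptions, not a recommendation of a specific method: the function name, the 5-year age window, and the subject records are hypothetical.

```python
def match_subjects(cases, controls, max_age_gap=5):
    """Greedily pair cases with same-sex controls of similar age.

    cases, controls: lists of (subject_id, age, sex) tuples.
    max_age_gap is an assumed tolerance in years.
    Each control is used at most once; cases with no eligible control
    are dropped. Returns a list of (case_id, control_id) pairs.
    """
    available = list(controls)
    pairs = []
    for case_id, case_age, case_sex in cases:
        candidates = [
            c for c in available
            if c[2] == case_sex and abs(c[1] - case_age) <= max_age_gap
        ]
        if not candidates:
            continue
        # Take the eligible control closest in age.
        best = min(candidates, key=lambda c: abs(c[1] - case_age))
        available.remove(best)
        pairs.append((case_id, best[0]))
    return pairs


# Invented subjects, for illustration only.
scz = [("case1", 30, "M"), ("case2", 45, "F")]
ctrl = [("ctrl1", 32, "M"), ("ctrl2", 60, "F"), ("ctrl3", 44, "F")]
print(match_subjects(scz, ctrl))
```

The downstream comparison would then be repeated on the matched pairs only, to see whether the SCZ group still stands apart.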
Sofia de Pereira Barreira