<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
<ui>1758-907X-2-2</ui>
<ji>1758-907X</ji>
<fm>
<dochead>Review</dochead>
<bibl>
<title><p>Experimental design, preprocessing, normalization and differential expression analysis of small RNA sequencing experiments</p></title>
<aug>
<au ce="yes" id="A1"><snm>McCormick</snm><mi>P</mi><fnm>Kevin</fnm><insr iid="I1"/><email>mccormic@cis.udel.edu</email></au>
<au ce="yes" id="A2"><snm>Willmann</snm><mi>R</mi><fnm>Matthew</fnm><insr iid="I2"/><email>willmann@sas.upenn.edu</email></au>
<au ca="yes" id="A3"><snm>Meyers</snm><mi>C</mi><fnm>Blake</fnm><insr iid="I1"/><email>meyers@dbi.udel.edu</email></au>
</aug>
<insg>
<ins id="I1"><p>Department of Plant and Soil Sciences and Delaware Biotechnology Institute, University of Delaware, Newark, DE 19711, USA</p></ins>
<ins id="I2"><p>Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA</p></ins>
</insg>
<source>Silence</source>
<issn>1758-907X</issn>
<pubdate>2011</pubdate>
<volume>2</volume>
<issue>1</issue>
<fpage>2</fpage>
<url>http://www.silencejournal.com/content/2/1/2</url>
<xrefbib><pubidlist><pubid idtype="pmpid">21356093</pubid><pubid idtype="doi">10.1186/1758-907X-2-2</pubid></pubidlist></xrefbib></bibl>
<history><rec><date><day>22</day><month>12</month><year>2010</year></date></rec><acc><date><day>28</day><month>2</month><year>2011</year></date></acc><pub><date><day>28</day><month>2</month><year>2011</year></date></pub></history><cpyrt><year>2011</year><collab>McCormick et al; licensee BioMed Central Ltd.</collab><note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
<abs>
<sec><st><p>Abstract</p></st>
<p>Prior to the advent of new, deep sequencing methods, small RNA (sRNA) discovery was dependent on Sanger sequencing, which was time-consuming and limited knowledge to only the most abundant sRNA. The innovation of large-scale, next-generation sequencing has exponentially increased knowledge of the biology, diversity and abundance of sRNA populations. In this review, we discuss issues involved in the design of sRNA sequencing experiments, including choosing a sequencing platform, inherent biases that affect sRNA measurements and replication. We outline the steps involved in preprocessing sRNA sequencing data and review both the principles behind and the current options for normalization. Finally, we discuss differential expression analysis in the absence and presence of biological replicates. While our focus is on sRNA sequencing experiments, many of the principles discussed are applicable to the sequencing of other RNA populations.</p>
</sec>
</abs>
</fm>
<bdy>
<sec><st><p>Introduction</p></st>
<p>Deep sequencing technologies have revolutionized the field of genomics since their inception in 2000, when Lynx Therapeutics' Massively Parallel Signature Sequencing (MPSS; Lynx Therapeutics, Hayward, CA, USA) was described as a way to quantify messenger RNA (mRNA) populations <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. MPSS allowed the parallel sequencing of 17- or 20-nucleotide (nt) signatures from hundreds of thousands of cloned RNA, but it has been made obsolete by newer systems enabling longer sequence reads with fewer biases. Next-generation sequencing has since been adapted to the study of a wide range of nucleic acid populations, including mRNA (RNA-seq) <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, small RNA (sRNA) <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, microRNA (miRNA)-directed mRNA cleavage sites (called parallel analysis of RNA ends (PARE), genome-wide mapping of uncapped transcripts (GMUCT) or degradome sequencing) <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>, double-stranded RNA (dsRNA) <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>, actively transcribing RNA (NET-seq) <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, translated mRNA <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, transcription factor DNA binding sites and histone modification sites (chromatin immunoprecipitation (ChIP)-seq) <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, methylated DNA (BS-seq) <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> and genomic DNA (DNA-seq) <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>. These applications vary with regard to the templates used, but they rely on the same sequencing technologies.</p>
<p>Prior to high-throughput sequencing, DNA microarrays were the predominant method of genome-wide transcriptional analysis. Microarrays have been used to quantify the levels of both known and unknown mRNA, alternative splicing products, translated mRNA and miRNA, as well as to detect miRNA cleavage sites, transcription factor binding sites, single-nucleotide polymorphisms and deletions. Now, however, high-throughput sequencing is often favored over microarrays for such experiments because sequencing avoids several problems encountered in microarray experiments. First, unlike microarrays, sequencing approaches do not require knowledge of the genome <it>a priori</it>, enabling any organism to be easily studied. Second, sequencing is not dependent on hybridization. Microarray data are obtained by hybridizing a labeled target to complementary DNA probes immobilized on a solid surface, and the strength of this hybridization is dependent on the base composition of the probe <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr></abbrgrp>. With arrays, it is possible for cross-hybridization to occur, such that the signal may come from sources besides the perfectly complementary intended target <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B21">21</abbr></abbrgrp>. Sequencing, however, has single-nucleotide resolution, which increases specificity and, for certain applications such as defining transcription factor binding sites, is far superior to the probe-defined resolution of microarrays. Third, sequencing produces digital data by counting the number of copies of a particular sequence, enabling accurate determination of low-, middle- and high-abundance species. Because microarray data are based on the intensity of the fluorescence label at each spot on the hybridized array and intensity falls on a continuum, the data are analog. 
The disadvantage of this is that it is hard to accurately quantify signals at the two extremes: signals near the lower limit of detection <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp> and those near the intensity saturation point <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. The proper quantification of intensity also depends on accurate measurement of background levels, which is not an issue for digital data <abbrgrp><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. Although sequencing is free from these intrinsic experimental limitations, microarray experiments are cheaper (at the moment) and do not suffer from ligation biases (discussed below in the section "Library preparation and inherent biases").</p>
<p>Next-generation sequencing has proved to be a boon to the study of sRNA. Sequencing of individual sRNA clones by traditional Sanger sequencing was laborious and did not achieve a sufficient sequencing depth to detect rare species <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr></abbrgrp>. There are several biologically relevant and functionally diverse classes of sRNA of specific sizes and produced by different, genetically separable pathways. These include miRNA, small interfering RNA (siRNA) and the animal-specific Piwi-interacting RNA (piRNA, originally called repeat-associated siRNA or rasiRNA). miRNA are 19 to 25 nt long and originate from noncoding RNA called pri-miRNA that have extensive secondary structure <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. miRNA posttranscriptionally silence non-self-targeted mRNA through imperfect base pairing, directing target cleavage <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr></abbrgrp> or translational inhibition <abbrgrp><abbr bid="B40">40</abbr><abbr bid="B43">43</abbr></abbrgrp>.</p>
<p>The biogenesis of miRNA is in contrast to that of siRNA (20 to 24 nt), which are formed from long dsRNA <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr></abbrgrp>. siRNA can direct the cleavage of perfectly base-paired mRNA, including the RNA from which they originate <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B46">46</abbr></abbrgrp>. Several subclasses of siRNA exist, which vary by name or by type in different organisms. In animals, siRNA are designated on the basis of their source: endogenous dsRNA (endo-siRNA, or esiRNA) and exogenous dsRNA (exo-siRNA) <abbrgrp><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr></abbrgrp>. esiRNA are derived from long dsRNA made by RNA-dependent RNA polymerases (RDRs) from sense transcripts, pairing between convergent transcripts (sense and natural antisense transcripts) or long self-complementary RNA, while exo-siRNA come from RNA viruses. The <it>Caenorhabditis elegans </it>and plant literature distinguish primary siRNA, that is, those that are formed from the dsRNA that initiates a silencing event, from secondary siRNA, that is, those that are formed from the cleaved target mRNA and perpetuate and amplify silencing <abbrgrp><abbr bid="B49">49</abbr><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr></abbrgrp>. In plants, siRNA are also defined based on their origin and/or function and include heterochromatic siRNA (hc-siRNA, sometimes also referred to as rasiRNA), natural antisense transcript-derived siRNA (nat-siRNA), and <it>trans</it>-acting siRNA (ta-siRNA). hc-siRNA are 23- to 24-nt siRNA found in plants and <it>Schizosaccharomyces pombe </it>that direct methylation of DNA and histones, leading to transcriptional gene silencing, particularly in repeat regions <abbrgrp><abbr bid="B53">53</abbr><abbr bid="B54">54</abbr><abbr bid="B55">55</abbr></abbrgrp>. 
A second subset of siRNA in plants, nat-siRNA, arise from the hybridization of sense transcripts with their naturally occurring antisense forms and subsequent cleavage <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. siRNA derived from natural antisense transcripts are also found in animals, but are not always referred to as nat-siRNA <abbrgrp><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr></abbrgrp>. ta-siRNA appear to be plant-specific and originate from noncoding RNA that are the targets of miRNA. Following miRNA cleavage, the cleavage products are made double-stranded and then chopped into 20- or 21-nt ta-siRNA. These ta-siRNA target non-self-targeted mRNA via imperfect base pairing for cleavage, similarly to miRNA <abbrgrp><abbr bid="B61">61</abbr><abbr bid="B62">62</abbr><abbr bid="B63">63</abbr><abbr bid="B64">64</abbr></abbrgrp>.</p>
<p>The most recently identified major class of sRNA is the piRNA, 25- to 30-nt sRNA that associate with the Piwi subclade of the Argonaute family of proteins and function in the germline of animals <abbrgrp><abbr bid="B65">65</abbr><abbr bid="B66">66</abbr><abbr bid="B67">67</abbr><abbr bid="B68">68</abbr><abbr bid="B69">69</abbr><abbr bid="B70">70</abbr><abbr bid="B71">71</abbr></abbrgrp>. All of these kinds of sRNA can be identified by generating sRNA sequencing libraries from size-selected populations of RNA that are approximately 18 to 30 nt long. Along with these biologically relevant sRNA, RNA degradation products, including fragments of transfer RNA (tRNA) and ribosomal RNA (rRNA), are also sequenced. Studies have found an abundance of specific tRNA-derived sRNA in <it>Saccharomyces cerevisiae</it>, <it>Arabidopsis </it>and human cells <abbrgrp><abbr bid="B72">72</abbr><abbr bid="B73">73</abbr><abbr bid="B74">74</abbr></abbrgrp>, at least some of which are Dicer cleavage products <abbrgrp><abbr bid="B73">73</abbr></abbrgrp>, and methionine tRNA, or tRNA<sup>Met</sup>, was found associated with the Argonaute 2 protein, or Ago2, in human cells <abbrgrp><abbr bid="B75">75</abbr></abbrgrp>. The finding by the Dutta laboratory <abbrgrp><abbr bid="B72">72</abbr></abbrgrp> that some of these tRNA sequences, called tRNA-derived RNA fragments, have a biological function further suggests that new classes of and roles for sRNA will likely continue to be identified.</p>
<p>Sequencing can also be used to study sRNA targets. RNA-seq can directly quantify expression levels of mRNA that are targets of sRNA. High-throughput sequencing has recently been applied to the identification of miRNA cleavage sites, a method alternately called degradome sequencing <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, PARE <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> and GMUCT <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. This approach is useful for identifying precise miRNA target sites because the fragment immediately downstream of the cleavage site will appear much more abundantly than any surrounding sequences produced by nonspecific decay. These methods will not detect the effects of miRNA on target translation, however. New approaches that combine immunopurification of polysomes (mRNA that are associated with ribosomes) with deep sequencing allow for the sequencing of RNA that are actively being translated and enable the detection of miRNA-mediated translational inhibition <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B76">76</abbr></abbrgrp>. In contrast to miRNA, the target of hc-siRNA is chromatin, and hc-siRNA-induced DNA and histone methylation can be identified using BS-seq and ChIP-seq, respectively.</p>
<p>Next-generation sequencing data sets are similar to one another in several aspects, regardless of the technology or template used. In all cases, raw data files in the form of images must be preprocessed and normalized before they can be stored for analysis or visualization. The preprocessing of data comprises a series of steps that involve converting image files to raw sequences (also called "reads"), handling low-quality base calls, trimming adapters from raw sequencing reads, tabulating numbers of trimmed reads per distinct sequence and aligning these reads to a reference genome if available. Normalization, the process of comparing raw sequence counts against some common denominator, is a critical step when processing expression data of all types. Normalization removes technical artefacts arising from the method itself or from unintended variation, with the goal that differences remaining between samples are truly or predominantly biological in nature. Figure <figr fid="F1">1</figr> demonstrates the flow of data for typical sequencing experiments.</p>
<fig id="F1"><title><p>Figure 1</p></title><caption><p>Flowchart of typical data-handling steps for small RNA (sRNA) libraries</p></caption><text>
   <p><b>Flowchart of typical data-handling steps for small RNA (sRNA) libraries</b>. Flowchart depicting the steps involved in creating, processing and normalizing next-generation sequencing libraries. In this article, we focus on sRNA data, but the methods for analyzing other RNA-based or even chromatin immunoprecipitation sequencing data are similar.</p>
</text><graphic file="1758-907X-2-2-1" hint_layout="double"/></fig>
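<p>As a minimal sketch of the preprocessing and normalization steps described above, the following Python fragment trims a 3' adapter, tabulates reads per distinct sequence and scales the counts to reads per million. The adapter string, the example reads, the 18- to 30-nt size window, the function names and the choice of total-count scaling as the common denominator are all assumptions made for illustration; they are not taken from any particular pipeline or kit.</p>

```python
from collections import Counter

# Hypothetical 3' adapter prefix, used only for illustration.
ADAPTER = "TCGTATGC"


def trim_adapter(read, adapter=ADAPTER):
    """Return the insert preceding the first adapter match, or None if absent."""
    pos = read.find(adapter)
    if pos == -1:
        return None
    return read[:pos]


def tabulate(reads, min_len=18, max_len=30):
    """Count trimmed reads per distinct sequence, keeping sRNA-sized inserts."""
    counts = Counter()
    for read in reads:
        insert = trim_adapter(read)
        if insert is not None and len(insert) in range(min_len, max_len + 1):
            counts[insert] += 1
    return counts


def reads_per_million(counts):
    """Scale raw counts by the library total (one simple common denominator)."""
    total = sum(counts.values())
    return {seq: n * 1_000_000 / total for seq, n in counts.items()}


# Made-up reads: two sRNA-sized inserts, one insert that is too short,
# and one read in which no adapter is found.
seq_a = "TGACAGAAGAGAGTGAGCAC"
seq_b = "TTGACAGAAGATAGAGAGCAC"
reads = [seq_a + ADAPTER + "AA"] * 3 + [
    seq_b + ADAPTER,
    "ACGT" + ADAPTER,
    "ACGTACGTACGTACGTACGTACGT",
]

counts = tabulate(reads)
rpm = reads_per_million(counts)
for seq, n in counts.most_common():
    print(seq, n, rpm[seq])
```

<p>In this sketch, reads without a recognizable adapter are discarded rather than rescued, which is a design choice rather than a requirement; handling of low-quality base calls and alignment to a reference genome are omitted.</p>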
<p>In this review, we consider the design of sRNA sequencing experiments, the preprocessing and normalization of sequencing data and basic differential expression analysis. We discuss various approaches for normalizing sequencing data, starting with what has been learned from microarrays about the fundamentals of normalizing large-scale transcriptional data sets. Because the cost of sequencing is still somewhat high (although it is dropping rapidly), many experiments do not currently involve biological replicates, so we discuss statistical approaches for differential expression analysis when replicates are and are not available.</p>
</sec>
<sec><st><p>Designing sRNA sequencing experiments</p></st>
<sec><st><p>Sequencing technologies and inherent biases</p></st>
<p>The first decision to make when designing a sequencing experiment is which sequencing technology to use. Today there are two main varieties of next-generation sequencing: (1) sequencing by synthesis (SBS), employed by 454 sequencing (<url>http://www.454.com/</url>; 454 Life Sciences/Roche, Branford, CT, USA) <abbrgrp><abbr bid="B77">77</abbr></abbrgrp>, Illumina (formerly called Solexa sequencing; <url>http://www.illumina.com/</url>; San Diego, CA, USA) <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, Helicos (<url>http://www.helicosbio.com/</url>; Helicos Biosciences Corp., Cambridge, MA, USA) <abbrgrp><abbr bid="B78">78</abbr><abbr bid="B79">79</abbr></abbrgrp> and the latest entrant into the market, single-molecule, real-time (SMRT) sequencing, introduced by Pacific BioSciences (<url>http://pacificbiosciences.com/</url>; Menlo Park, CA, USA) <abbrgrp><abbr bid="B80">80</abbr></abbrgrp>; and (2) sequencing by ligation (SBL), used in SOLiD (Sequencing by Oligonucleotide Ligation and Detection; <url>http://www.appliedbiosystems.com/</url>; Applied Biosystems, Carlsbad, CA, USA) <abbrgrp><abbr bid="B81">81</abbr></abbrgrp> and Polonator sequencing (<url>http://www.polonator.org/</url>; Dover Systems, Salem, NH, USA) <abbrgrp><abbr bid="B82">82</abbr></abbrgrp>. Table <tblr tid="T1">1</tblr> shows the current efficiency statistics for each of these methods as provided by the product websites, but the sequencing depth, speed and accuracy of these technologies are constantly increasing. Most of these approaches can be implemented as paired-end runs, in which both ends of each clone are sequenced, increasing the amount of information gleaned per fragment, but single-end runs are sufficient for the short length of sRNA <abbrgrp><abbr bid="B83">83</abbr><abbr bid="B84">84</abbr><abbr bid="B85">85</abbr></abbrgrp>.</p>
<tbl id="T1"><title><p>Table 1</p></title><caption><p>Comparison of next-generation sequencing technologies<sup>a</sup></p></caption><tblbdy cols="8">
      <r>
         <c ca="left">
            <p>
               <b>Technology</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Approach</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Approximate sequencing depth</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Read length, nt</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Paired ends</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Accuracy</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Individual molecule sequencing</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Optimal for sRNA</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Illumina (Solexa)</p>
         </c>
         <c ca="center">
            <p>Synthesis</p>
         </c>
         <c ca="center">
            <p>500 M reads/flow cell, 12 Gb/35-nt run</p>
         </c>
         <c ca="center">
            <p>35 to 75</p>
         </c>
         <c ca="center">
            <p>Optional</p>
         </c>
         <c ca="center">
            <p>&#8805;98% to 99%</p>
         </c>
         <c ca="center">
            <p>No</p>
         </c>
         <c ca="center">
            <p>Yes</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>454</p>
         </c>
         <c ca="center">
            <p>Synthesis</p>
         </c>
         <c ca="center">
            <p>1.6 M reads/flow cell or 500 Mb/run</p>
         </c>
         <c ca="center">
            <p>400</p>
         </c>
         <c ca="center">
            <p>Optional</p>
         </c>
         <c ca="center">
            <p>&#8805;99%</p>
         </c>
         <c ca="center">
            <p>No</p>
         </c>
         <c ca="center">
            <p>No</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Helicos</p>
         </c>
         <c ca="center">
            <p>Synthesis</p>
         </c>
         <c ca="center">
            <p>300 to 500 M reads/flow cell</p>
         </c>
         <c ca="center">
            <p>25 to 55</p>
         </c>
         <c ca="center">
            <p>Optional</p>
         </c>
         <c ca="center">
            <p>&gt; 99.995%</p>
         </c>
         <c ca="center">
            <p>Yes</p>
         </c>
         <c ca="center">
            <p>Yes</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>SMRT</p>
         </c>
         <c ca="center">
            <p>Synthesis</p>
         </c>
         <c ca="center">
            <p>75 K reads/flow cell</p>
         </c>
         <c ca="center">
            <p>1,000</p>
         </c>
         <c ca="center">
            <p>N/A</p>
         </c>
         <c ca="center">
            <p>99.30%</p>
         </c>
         <c ca="center">
            <p>Yes</p>
         </c>
         <c ca="center">
            <p>No</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>SOLiD</p>
         </c>
         <c ca="center">
            <p>Ligation</p>
         </c>
         <c ca="center">
            <p>2.4 B reads/flow cell or 300 Gb/run</p>
         </c>
         <c ca="center">
            <p>35 to 75</p>
         </c>
         <c ca="center">
            <p>Optional</p>
         </c>
         <c ca="center">
            <p>&#8805;99.94%</p>
         </c>
         <c ca="center">
            <p>No</p>
         </c>
         <c ca="center">
            <p>Yes</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Polonator</p>
         </c>
         <c ca="center">
            <p>Ligation</p>
         </c>
         <c ca="center">
            <p>64 to 80 M mappable reads or 2.2.5 Gb/flow cell</p>
         </c>
         <c ca="center">
            <p>13</p>
         </c>
         <c ca="center">
            <p>Mandatory</p>
         </c>
         <c ca="center">
            <p>98%</p>
         </c>
         <c ca="center">
            <p>No</p>
         </c>
         <c ca="center">
            <p>No</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p><sup>a</sup>nt, nucleotides; sRNA, small RNA; SMRT, single-molecule, real-time sequencing; SOLiD, sequencing by oligonucleotide ligation and detection.</p>
   </tblfn></tbl>
<p>The choice of sequencing method often comes down to cost, read length and sequencing depth. Because sRNA are in the range of approximately 18 to 30 nt and high sequencing depth is necessary to observe rare species, Illumina and SOLiD are currently the most appropriate methods for sRNA sequencing studies (Table <tblr tid="T1">1</tblr>). Illumina uses a four-color, reversible terminator sequencing-by-synthesis technology to sequence one base at a time. SOLiD uses 16 dinucleotide probes, each labeled with one of four fluorophores, to sequence by ligation two nucleotides of each clone at a time. This means that four dinucleotide pairs share the same label, making the analysis of SOLiD data a little more complicated. An algorithm generates the nucleotide sequence of a particular base <it>n </it>from this color space by examining the labels for the overlapping dinucleotides <it>n </it>- 1, <it>n </it>and <it>n</it>, <it>n </it>+ 1 <abbrgrp><abbr bid="B81">81</abbr></abbrgrp>. In this fashion, two different probes interrogate each base, which accounts for the reportedly high accuracy of this method. A single color call error, however, invalidates the sequence determination for all positions after this point. The read length and sequencing depth of Helicos sequencing make Helicos appropriate for sRNA sequencing as well, but this application has not been widely commercialized. For Helicos sequencing, cDNA molecules are polyadenylated and then annealed to immobilized oligo(dT) primers. Individual molecules are sequenced by sequential addition of each of the four nucleotides. One advantage of the Helicos method is that it allows for the sequencing of individual DNA molecules, eliminating the need for polymerase chain reaction (PCR) amplification and its inherent error rate. While Polonator sequencing allows for 26-nt reads at great sequencing depths, a 3- to 4-nt sequence gap remains in the middle of each read, which is not ideal for sRNA experiments.</p>
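<p>The color-space logic described above can be sketched as follows. This is an illustration that assumes the commonly described two-base encoding, in which the bases A, C, G and T are numbered 0 to 3 and the color of a dinucleotide is the bitwise XOR of its two base numbers, so that four dinucleotides share each color; the function names and the example read are hypothetical, and decoding is assumed to start from a known first base. It shows why a single miscalled color alters every base after that point when colors are translated directly into nucleotides.</p>

```python
BASES = "ACGT"


def encode(seq):
    """Translate a nucleotide sequence into colors, one per adjacent base pair."""
    return [BASES.index(a) ^ BASES.index(b) for a, b in zip(seq, seq[1:])]


def decode(first_base, colors):
    """Rebuild the sequence from a known first base and the color calls."""
    seq = [first_base]
    for color in colors:
        seq.append(BASES[BASES.index(seq[-1]) ^ color])
    return "".join(seq)


read = "TGACAGAAGAG"          # made-up example read
colors = encode(read)
print(colors)
print(decode(read[0], colors))  # round trip recovers the original read

# Corrupt one color call (index 3) and decode again.
bad_colors = list(colors)
bad_colors[3] ^= 1
miscalled = decode(read[0], bad_colors)
print(miscalled)  # bases 0 to 3 are intact; every later base differs
```

<p>Direct translation is used here only to make the error propagation visible; it is not meant to represent how production software handles color-space reads.</p>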
</sec>
<sec><st><p>Library preparation and inherent biases</p></st>
<p>Recent data have shown that the library preparation method, more than the sequencing technology, can significantly affect the diversity and abundance of the sRNA that are sequenced <abbrgrp><abbr bid="B86">86</abbr></abbrgrp>. For differential expression analyses comparing the relative abundance of the same sequence in different libraries, this is not a problem, provided that all libraries are prepared by the same method, because all libraries will then be affected equally by biases due to library preparation. Despite the digital nature of sequencing data, however, the relative levels of different sequences within the same library will be affected by these biases. Some sequences present in the biological samples may even be absent in the libraries because of preparation bias.</p>
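<p>The reasoning in the preceding paragraph can be made concrete with a small numerical sketch. It assumes, purely for illustration, a multiplicative model in which each sequence is captured with its own fixed efficiency that is identical in every library; the abundances, efficiencies and names below are invented. Under that assumption the efficiency cancels from between-library ratios of the same sequence but not from within-library ratios of different sequences.</p>

```python
# Hypothetical true abundances (molecules) of two sRNA in two libraries.
true_counts = {
    "control": {"sRNA_A": 1000, "sRNA_B": 1000},
    "treated": {"sRNA_A": 2000, "sRNA_B": 1000},
}

# Hypothetical sequence-specific capture efficiencies, assumed to be the
# same in both libraries because the same preparation method is used.
efficiency = {"sRNA_A": 0.5, "sRNA_B": 0.125}


def observe(library):
    """Apply the assumed per-sequence efficiency to the true abundances."""
    return {seq: n * efficiency[seq] for seq, n in library.items()}


observed = {name: observe(lib) for name, lib in true_counts.items()}

# Same sequence, different libraries: the efficiency cancels.
fold_change_a = observed["treated"]["sRNA_A"] / observed["control"]["sRNA_A"]

# Different sequences, same library: the efficiency does not cancel.
within_ratio = observed["control"]["sRNA_A"] / observed["control"]["sRNA_B"]

print(fold_change_a)  # true fold change is 2
print(within_ratio)   # true ratio is 1, but the observed ratio is distorted
```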
<p>Illumina and SOLiD sRNA sequencing libraries are made by ligating RNA adapters of known sequence to the 5' and 3' ends of single molecules in a purified sRNA population. Alternatively, SOLiD sequencing can be performed by <it>in vitro </it>polyadenylation of the 3' end of the sRNA and addition of a 5' adapter <abbrgrp><abbr bid="B86">86</abbr></abbrgrp>. In either case, the adapter-ligated sequences are reverse-transcribed, amplified by PCR to increase the size of the library, applied to the platform and amplified again <it>in situ </it>to form millions of clusters of DNA of the same sequence. Then these clusters are sequenced in parallel.</p>
<p>Three steps in this process have the potential to influence the sequencing results: adapter ligation, reverse transcription and PCR amplification. Adapter ligation is the most important. The adapters have typically been ligated to each sRNA using T4 RNA ligase 1, which is capable of ligating two single-stranded oligoribonucleotides, where the acceptor (&#8805;3 nt long) has a free 3'-hydroxyl group and the donor (&#8805;1 nt) has a 5'-monophosphate <abbrgrp><abbr bid="B87">87</abbr></abbrgrp>. However, the ligation efficiency of T4 RNA ligase 1 is very sensitive to nucleotide base composition at the ligation site and to sRNA modifications, and not all sRNA can act as donor substrates for the enzyme. Studies have suggested that the sequences of both the acceptor and the donor have an effect on ligation efficiency <abbrgrp><abbr bid="B86">86</abbr><abbr bid="B87">87</abbr><abbr bid="B88">88</abbr><abbr bid="B89">89</abbr><abbr bid="B90">90</abbr><abbr bid="B91">91</abbr></abbrgrp>, but the acceptor sequence is more important <abbrgrp><abbr bid="B87">87</abbr></abbrgrp>. The identity of at least the three 3'-most nucleotides of the acceptor affects ligation efficiency <abbrgrp><abbr bid="B87">87</abbr><abbr bid="B91">91</abbr></abbrgrp>, with a different base preference at each position (5'-nucleotide: A &gt; G &#8776; C &gt; U; middle nucleotide: A &gt; C &gt; U &gt; G; 3'-nucleotide: A &gt; C &gt; G &gt; U when using a pUUUCp donor) <abbrgrp><abbr bid="B91">91</abbr></abbrgrp>. The donor sequence appears to be less important, but the bias for the 5' nucleotide is C &gt; U &#8805; A &gt; G <abbrgrp><abbr bid="B88">88</abbr><abbr bid="B89">89</abbr></abbrgrp>.</p>
<p>Many sRNA are modified, and these modifications can also make them poor substrates for T4 RNA ligase 1. In particular, miRNA, siRNA, hc-siRNA, ta-siRNA and nat-siRNA in plants, siRNA and piRNA in insects and piRNA in animals are known to be 2'-<it>O</it>-methylated on the 3' end by the conserved methyltransferase HUA ENHANCER 1 (HEN1) (reviewed in <abbrgrp><abbr bid="B92">92</abbr></abbrgrp>), and this modification lowers ligation efficiency by T4 RNA ligase 1 by 30% to 72%, depending on assay conditions <abbrgrp><abbr bid="B93">93</abbr><abbr bid="B94">94</abbr><abbr bid="B95">95</abbr></abbrgrp>. The 2'-<it>O</it>-methylation also introduces a sequence bias for the 3' nucleotide of the acceptor at the ligation site, such that the efficiency is G = C &gt; A &gt; U <abbrgrp><abbr bid="B95">95</abbr></abbrgrp>. Unlike previous studies, the study by Munaf&#243; <it>et al</it>. <abbrgrp><abbr bid="B95">95</abbr></abbrgrp> did not find sequence bias at the acceptor site in unmethylated sRNA. Both of these issues are eliminated by using a truncated version of a closely related ligase, T4 RNA ligase 2, with a preadenylated 3'-RNA adapter <abbrgrp><abbr bid="B95">95</abbr></abbrgrp>, so this enzyme is increasingly used for library preparation. Illumina's first-generation sRNA library preparation kits used T4 RNA ligase 1 for the ligation of both the 5'- and 3'-adapters, but their Small RNA version 1.5 and TruSeq&#8482; RNA Sample Preparation kits use the truncated form of T4 RNA ligase 2 for the ligation of the 3'-adapter. T4 RNA ligase 1 is still required for the ligation of the 5'-adapter, however, because the truncated T4 RNA ligase 2 requires a preadenylated donor, which in this case is the sample itself. Thus, sequence bias is eliminated in only one of the two ligation reactions. 
To test whether an sRNA is 3'-modified or to specifically clone 3'-modified products, sRNA can be oxidized with NaIO<sub>4 </sub>and then &#946;-eliminated at an alkaline pH. This treatment removes the 3'-most nucleotide from all sequences with 2',3'-OH groups (that is, unmodified sRNA), but not from modified sRNA, leaving a 3'-phosphate <abbrgrp><abbr bid="B96">96</abbr><abbr bid="B97">97</abbr><abbr bid="B98">98</abbr></abbrgrp>, which is not a substrate for T4 RNA ligase 1 or 2.</p>
<p>Because T4 RNA ligase 1 requires a 5'-monophosphate on the donor sequence, sRNA lacking this group are absent from standard libraries. A large population of 5'-ligation-resistant secondary siRNA was found in <it>C. elegans </it><abbrgrp><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr></abbrgrp>. These secondary siRNA are involved in the perpetuation of RNA interference (RNAi) and have a 5'-triphosphate, which is not a substrate for T4 RNA ligase 1. sRNA with 5'-diphosphate or 5'-triphosphate have also been found in the single-celled eukaryote <it>Entamoeba histolytica </it><abbrgrp><abbr bid="B99">99</abbr></abbrgrp>. 5'-Caps similarly block ligation by the enzyme and have been seen on 18- to 25-nt sRNA associated with the human hepatitis delta virus and on some RNA under 200 nt in human cells <abbrgrp><abbr bid="B100">100</abbr><abbr bid="B101">101</abbr></abbrgrp>. Both of these ligase-resistant 5'-modifications can be removed by pretreatment with tobacco acid pyrophosphatase before ligation of a 5'-adapter <abbrgrp><abbr bid="B101">101</abbr></abbrgrp>. Alternatively, a 5'-adapter-independent method can be used <abbrgrp><abbr bid="B51">51</abbr><abbr bid="B99">99</abbr><abbr bid="B100">100</abbr></abbrgrp>, although this approach is not compatible with Illumina and SOLiD sequencing technologies. The importance of considering such a method is nevertheless highlighted by the study of Pak <it>et al</it>. <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>, who used a 5'-adapter-independent library preparation protocol to study RNAi-induced <it>C. elegans</it>. In contrast to work that did not account for the possibility of 5'-ligation-resistant sRNA, which suggested that miRNA vastly outnumbered siRNA, they demonstrated that the two classes are actually found in similar degrees of abundance <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>.</p>
<p>Because sRNA acts as the donor during the 5'-adapter ligation and as the acceptor during the 3'-adapter ligation, the best solution for avoiding this bias would be to use a ligation-independent library preparation. Such a method has been applied to the generation of Illumina sequencing libraries <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> and would be applicable to SOLiD sequencing as well. This method involves using <it>Escherichia coli </it>poly(A) polymerase (PAP) to polyadenylate the RNA molecules and then performing a reverse transcription reaction with an oligo(dT) primer having both 5'- and 3'-adapter sequences at the 5' end of the primer. The products are then circularized and cut with a restriction enzyme that cleaves between the 5'- and 3'-adapters, yielding the typical linear read of 5'-adapter, clone and 3'-adapter. Ligation-independent methods that rely on 3'-polyadenylation of the sRNA population, such as this technique and the one used for Helicos sequencing, may be better than ligation-dependent methods, but they are still not perfect. PAP has a bias for the 3'-nucleotide A = G &gt; C &gt; U, but the efficiencies of the different bases are within twofold of each other <abbrgrp><abbr bid="B95">95</abbr></abbrgrp>. As seen with T4 RNA ligase 1, 2'-<it>O</it>-methylation greatly reduces the efficiency of PAP by up to 10-fold, with the sequence bias altered to 2'-<it>O</it>-meG &gt; 2'-<it>O</it>-meA = 2'-<it>O</it>-meU &gt; 2'-<it>O</it>-meC <abbrgrp><abbr bid="B93">93</abbr><abbr bid="B94">94</abbr><abbr bid="B95">95</abbr></abbrgrp>.</p>
<p>While adapter ligation is probably the largest potential source of bias, bias can also be introduced during reverse transcription and amplification. The 2'-<it>O</it>-methylation of sRNA reduces the efficiency of reverse transcription as well as adapter ligation <abbrgrp><abbr bid="B95">95</abbr><abbr bid="B102">102</abbr></abbrgrp>. The step of PCR amplification during library preparation can be a problem with sequences that have very low or very high guanine-cytosine (GC) content, reducing the likelihood that these sequences will be represented in the final population. Two techniques that do not require the initial library amplification and are compatible with Illumina sequencing have been used for DNA-seq and RNA-seq, and both methods provide a less biased library preparation for low GC sequences <abbrgrp><abbr bid="B103">103</abbr><abbr bid="B104">104</abbr></abbrgrp>. These approaches remain to be tried with sRNA libraries and still require the standard amplification within the Illumina flow cell to generate clusters of identical sequences. The Helicos system will provide a truly amplification-independent sequencing protocol because it does not require PCR in the library preparation and sequences only single molecules, not clusters of molecules.</p>
</sec>
<sec><st><p>Multiplexing</p></st>
<p>High-throughput sequencing can be costly when loading only one sample per sequencing lane. To help improve cost efficiency, users can multiplex two or more samples in a single lane using bar coding <abbrgrp><abbr bid="B105">105</abbr><abbr bid="B106">106</abbr><abbr bid="B107">107</abbr><abbr bid="B108">108</abbr><abbr bid="B109">109</abbr><abbr bid="B110">110</abbr><abbr bid="B111">111</abbr><abbr bid="B112">112</abbr><abbr bid="B113">113</abbr></abbrgrp>. As the number of reads per run has increased (Table <tblr tid="T1">1</tblr>), sufficiently deep sequencing can be achieved even when running multiple samples in the same lane, with the number of multiplexed samples depending on the desired depth. Multiplexing either incorporates a unique sequence called a bar code into the 5'- or 3'-adapter of each library to be run in the same lane or adds the bar code during a PCR step after adapter ligation, an approach that minimizes ligation bias. All of the reads in a lane can be sorted into their respective libraries using their bar codes after sequencing has taken place. Because of the inherent error rate of sequencing, it is recommended that bar codes be long enough that each pair differs by multiple substitutions, thereby reducing the likelihood that sequencing errors in the bar code will result in assigning reads to the wrong sample <abbrgrp><abbr bid="B107">107</abbr><abbr bid="B112">112</abbr></abbrgrp>. In particular, Illumina sequencing has a tendency to erroneously incorporate adenine more than the other bases <abbrgrp><abbr bid="B114">114</abbr></abbrgrp>, which should also be taken into account when designing custom bar codes. Multiplexing library preparation kits are now available for both Illumina and SOLiD. In both cases, the bar code is located within one of the adapters and separated by multiple bases from the ligation site, reducing the likelihood that the bar code will introduce any ligation bias. 
Helicos is also compatible with bar coding, though it requires a ligation step not in the original protocol. The one downside of using a bar code is that it may reduce the maximum length of the sRNA that can be sequenced, trimmed and assigned to a sample. However, the latest multiplexing systems for the Illumina and SOLiD machines incorporate the index into the 3' PCR primer and perform a second reaction specifically to sequence the bar code. This type of approach has numerous advantages, such as reducing or eliminating ligation bias, ensuring long reads across the sRNA and enabling multiplexing that reduces sequencing costs.</p>
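<p>The bar code design and read-sorting considerations described above can be illustrated with a minimal sketch. The bar code sequences, sample names and the one-mismatch tolerance below are hypothetical choices made for illustration only; they are not taken from any commercial kit or from the cited studies.</p>

```python
from itertools import combinations
from typing import Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("bar codes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes: list[str]) -> int:
    """Smallest Hamming distance between any two bar codes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def assign_sample(read_barcode: str, barcodes: dict[str, str],
                  max_mismatches: int = 1) -> Optional[str]:
    """Return the sample whose bar code lies within max_mismatches, else None.

    A read stays unassigned when no bar code, or more than one, is close enough.
    """
    hits = [sample for sample, code in barcodes.items()
            if max_mismatches >= hamming(read_barcode, code)]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 6-nt bar codes (illustrative only).
BARCODES = {"wild_type": "ACGTAC", "mutant": "TGCATG", "treated": "CATGCA"}
```

<p>With bar codes that differ from each other by several substitutions, a read carrying a single sequencing error in its bar code (for example, ACGTAA) can still be assigned to one sample, whereas a bar code that is far from every expected sequence is left unassigned rather than guessed.</p>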
</sec>
<sec><st><p>Replication</p></st>
<p>Several reports have used technical replicates, that is, the same library sequenced multiple times or independent libraries constructed from the same biological sample, to demonstrate the high reliability of Illumina <abbrgrp><abbr bid="B86">86</abbr><abbr bid="B115">115</abbr><abbr bid="B116">116</abbr><abbr bid="B117">117</abbr><abbr bid="B118">118</abbr></abbrgrp> and SOLiD sequencing <abbrgrp><abbr bid="B86">86</abbr></abbrgrp>. Similar results are possible for biological replicates <abbrgrp><abbr bid="B115">115</abbr><abbr bid="B118">118</abbr><abbr bid="B119">119</abbr></abbrgrp>. Because of the high cost of deep sequencing, most experiments published to date have not used biological replicates, even though they can increase the statistical significance and reduce both false-positive and false-negative rates. With biological replicates, the significance analysis of microarrays (SAM) <abbrgrp><abbr bid="B115">115</abbr></abbrgrp> and the Bioconductor program edgeR <abbrgrp><abbr bid="B118">118</abbr><abbr bid="B120">120</abbr></abbrgrp> can be applied to differential expression analysis of sequencing data, as we discuss later in the section "Differential expression analysis". Standards for deep sequencing experiments remain to be agreed upon, but as sequencing costs go down, sequencing depths further increase and multiplexing becomes more widely adopted, the requirement for biological replicates in differential expression experiments will surely follow.</p>
</sec>
</sec>
<sec><st><p>Preprocessing of sequencing data</p></st>
<p>The raw data of a sequencing experiment typically comprise a series of image files: one image per cycle of nucleotide addition for Illumina or dinucleotide ligation for SOLiD. Because of the size of flow cells, each one is subdivided into a number of "tiles" for imaging purposes. Thus, there is a series of images for every nucleotide. The images contain thousands of spots, one spot for every cluster, with a cluster representing one read. Each of these files must be analyzed to designate one of the four nucleotide bases (Illumina) or color space call (SOLiD) for each spot on the image, and then the data from each image for the same spot must be combined to give full sequence reads, one per spot. Each technology has its own specifications regarding the file formats used; for example, Illumina recently changed its standard output format from .qseq, which uses ASCII-64 encoding of Phred quality scores (a widely accepted metric to characterize the quality of DNA sequences), to .bcl, a binary format containing base call and quality for each tile in each cycle. SOLiD systems use .csfasta to encode color space calls and .qual files to record the quality values for each sequence call. Because one color call error will affect the sequence of all 3'-nucleotides, SOLiD data are maintained in color space for much of the preprocessing. Figure <figr fid="F2">2</figr> demonstrates a sample pipeline for Illumina data files.</p>
<fig id="F2"><title><p>Figure 2</p></title><caption><p>Sample file formats for small RNA libraries</p></caption><text>
   <p><b>Sample file formats for small RNA libraries</b>. Illumina machines generate .bcl files, which are in binary form and are not human-readable. These files are converted into .qseq files, which record the most likely sequence and a quality score for each read. Scripts are available to convert files in .qseq format into .fastq or SCARF format (Solexa Compact ASCII Read Format). Files in these formats are often converted to a "tag count" format so that they can be easily stored and analyzed.</p>
</text><graphic file="1758-907X-2-2-2" hint_layout="single"/></fig>
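<p>As an illustration of the ASCII-64 quality encoding mentioned above, the following sketch decodes a quality string into Phred scores and error probabilities. It assumes Phred-scaled scores stored as the character code minus 64, and it uses the standard Phred definition Q = -10 log10(p); the function names are hypothetical, and files produced by other pipeline versions may use a different offset or scale.</p>

```python
def phred64_to_scores(quality: str) -> list[int]:
    """Decode an ASCII-64 quality string into integer Phred scores.

    Assumption: each character encodes (Phred score + 64).
    """
    return [ord(ch) - 64 for ch in quality]


def error_probability(q: int) -> float:
    """Convert a Phred score Q into the estimated probability of a wrong call."""
    return 10 ** (-q / 10)


def mean_error(quality: str) -> float:
    """Average estimated error probability across all bases of one read."""
    scores = phred64_to_scores(quality)
    return sum(error_probability(q) for q in scores) / len(scores)
```

<p>Under these assumptions, the characters "h", "^" and "T" decode to Phred scores of 40, 30 and 20, which correspond to estimated error probabilities of 0.0001, 0.001 and 0.01, respectively.</p>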
<p>For many sequenced reads, ambiguous bases will exist. Ambiguous bases are the result of low confidence in any particular nucleotide. In the case of Illumina, a probability is assigned for a given nucleotide being each of the four bases. For a sequence designation to be assigned, the likelihood of the most likely base has to be at least 1.5 times greater than that of the next highest base; otherwise, the position in question will be deemed an ambiguous base. Different sequencing platforms and/or software pipelines have alternative approaches for handling ambiguous reads, usually denoted with an "N" in a sequence. Some will simply discard any sequence with an ambiguous read if the sequencing depth is sufficient, while others will assign the most likely base call at that nucleotide in an attempt to maximize the number of reads. A very sophisticated approach to this step is to record each read as more than a static sequence by using a probability matrix to record the probability of each nucleotide at each position <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. This additional information can help recover reads that would otherwise be classified as sequencing errors. For example, the most likely sequence for an ambiguous read, according to its probability matrix, might not map to any genomic locus, but the second most likely sequence might correspond to a known miRNA. This approach is likely to increase the number of usable reads for any given library, but it remains unclear whether this increase is enough to warrant the increase in computational complexity that it brings. This approach will also likely assign the wrong sequence to some reads. The location of the ambiguities may also allow some reads to be saved. An ambiguity in the middle of a read requires that the read be discarded from further analysis, but if the ambiguity lies within the adapter sequence, the read may still be retained.</p>
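<p>The base-calling rule described above can be sketched as follows. This is a simplified illustration, not the vendor's implementation: it assumes that each position is represented by a dictionary of per-base probabilities (one row of a probability matrix) and interprets the threshold as the top probability being at least 1.5 times the second highest. The function names are hypothetical.</p>

```python
def call_base(probs: dict[str, float], min_ratio: float = 1.5) -> str:
    """Return the most likely base, or 'N' when the call is ambiguous.

    A base is called only if its probability is at least min_ratio times
    that of the second most likely base (assumed reading of the rule).
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    (best_base, best_p), (_, second_p) = ranked[0], ranked[1]
    return best_base if best_p >= min_ratio * second_p else "N"


def call_read(matrix: list[dict[str, float]], min_ratio: float = 1.5) -> str:
    """Call every position of a read from its per-position probabilities."""
    return "".join(call_base(row, min_ratio) for row in matrix)
```

<p>A pipeline that keeps the full matrix, rather than only the called string, retains the information needed to test the second most likely sequence when the first fails to map.</p>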
<p>The next step in processing next-generation sequencing data is to trim or remove any adapter sequences. Because these adapters are artificially introduced and are not part of the organism's transcriptome, it is necessary to remove any remnants of them before attempting to map the libraries against a reference genome. Trimming scripts require a minimum number of bases for adapter recognition, so the maximum usable read length of Illumina and SOLiD is less than the total number of sequenced bases. This also means that longer sRNA may be lost as a result of an insufficient adapter sequence for matching and trimming. This is not a problem for the typical 19- to 30-nt sRNA, as current technologies generate sequences &gt; 36 nt. The process of removing adapters can be inefficient because it is possible (even likely) that sRNA sequences contain subsequences of the adapter. Thus, researchers must be careful when defining exact rules for determining which sequences to keep, which ones to trim and which ones to throw out altogether.</p>
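<p>A minimal sketch of 3'-adapter trimming with a minimum recognition length is given below. The adapter string is a placeholder rather than a real kit sequence, the six-base minimum is an arbitrary choice and only exact matches are considered; production trimmers typically also tolerate mismatches.</p>

```python
from typing import Optional

# Placeholder adapter sequence for illustration only (not a real kit adapter).
ADAPTER = "TCGTATGCCGTC"


def trim_adapter(read: str, adapter: str = ADAPTER,
                 min_match: int = 6) -> Optional[str]:
    """Return the insert preceding the 3' adapter, or None if none is found.

    The adapter may be truncated by the end of the read, but at least
    min_match of its leading bases must be present (exact matching only).
    """
    for start in range(len(read) - min_match + 1):
        overlap = min(len(read) - start, len(adapter))
        if overlap >= min_match and read[start:start + overlap] == adapter[:overlap]:
            return read[:start]
    return None
```

<p>In this sketch, a read in which only four adapter bases were sequenced returns None, which illustrates how longer sRNA can be lost when too little adapter remains for recognition.</p>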
<p>The final steps before data analysis can begin are to count the abundance for each distinct tag in a library and to map distinct tags to a reference genome if one exists. Calculating the abundance is computationally trivial, given current sequencing depth and standard computational limitations, so many researchers use their own programs for this step. Genome mapping, on the other hand, can be computationally expensive, but fortunately there are a number of publicly available programs to perform this task, such as SOAP <abbrgrp><abbr bid="B121">121</abbr></abbrgrp> and Bowtie <abbrgrp><abbr bid="B122">122</abbr></abbrgrp>, each with its own benefits and limitations. Some programs use multithreading and efficient memory allocation to maximize mapping speed.</p>
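<p>Counting the abundance of each distinct tag, as described above, can be sketched in a few lines. The tab-delimited output layout shown here is an assumed "tag count" format chosen for illustration; individual laboratories use their own conventions.</p>

```python
from collections import Counter


def count_tags(reads: list[str]) -> Counter:
    """Count how many times each distinct trimmed sequence (tag) occurs."""
    return Counter(reads)


def tag_count_lines(counts: Counter) -> list[str]:
    """Format counts as 'sequence TAB abundance', most abundant first.

    Ties are broken alphabetically so that the output is reproducible.
    """
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{tag}\t{n}" for tag, n in ordered]
```

<p>The number of keys in the counter is the number of distinct reads, and the sum of its values is the number of total reads in the library.</p>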
<p>The number of trimmed reads in a given library that will align perfectly to a reference genome depends on issues specific to the organism, the sample or the sequencing run, as well as on decisions made during data analysis. The completeness of the genome sequence is a major factor. Even in so-called "complete" genomes, there are highly repetitive regions (such as in centromeres and telomeres) that remain undetermined. Because a large number of sRNA originate from these locations, many reads will incorrectly fail to map to the genome. The sequence divergence between the reference genome and the sample will also have an effect. Low-quality sequencing runs will have reads riddled with erroneous base calls, causing them to be classified as nongenomic as well.</p>
<p>There are also some data analysis decisions that will influence the number of reads that align to a genome, including minimum read length, how to handle reads mapping to multiple genomic loci and how many mismatches to allow. Shorter sequences are more likely to map to multiple loci in the genome. Because sRNA researchers are generally interested in Dicer-mediated cleavage events, and because the shortest known Dicer products are 19 nt in length, it is recommended that any reads shorter than 18 nt be excluded. In plants, because the dominant size classes are miRNA and hc-siRNA, with the bulk of these being 20 or 21 nt and 23 or 24 nt, respectively, the data should demonstrate a significant decrease in the number of both distinct and total 18- or 19-nt and &gt; 25-nt reads. Figure <figr fid="F3">3</figr> demonstrates how reads shorter than 20 nt or longer than 24 nt are mostly derived from tRNA, rRNA, small nuclear RNA (snRNA) or small nucleolar RNA (snoRNA) loci.</p>
<fig id="F3"><title><p>Figure 3</p></title><caption><p>Small RNA (sRNA) reads derived from structural RNA versus other sRNA-generated loci</p></caption><text>
   <p><b>Small RNA (sRNA) reads derived from structural RNA versus other sRNA-generated loci</b>. <b>(A) </b>The number of total and distinct reads for all genomic sequences divided into those derived from ribosomal RNA, transfer RNA (tRNA), small nuclear RNA (snRNA) or other "structural" noncoding RNA-derived and other categories for each size class from 18 to 34 nt across 51 publicly available <it>Arabidopsis </it>sRNA libraries. We typically refer to the sRNA from nonstructural loci as "good" sRNA. <b>(B) </b>The percentage of tRNA-derived reads for each size class from 18 to 34 nt across 24 publicly available wild-type <it>Arabidopsis </it>libraries. Because of variations in sequencing read lengths among libraries, some libraries are missing data for sizes above 27 nt or 31 nt.</p>
</text><graphic file="1758-907X-2-2-3" hint_layout="double"/></fig>
<p>Several strategies have been employed to handle reads that map to multiple loci, also known as multireads. Reads that map to only one locus are called unique reads, which should not be confused with distinct reads, which are reads with different nucleotide sequences. Figure <figr fid="F4">4</figr> shows the relative abundance of unique and nonunique reads across all sRNA size classes. In some cases, researchers have chosen to exclude all multireads from analysis <abbrgrp><abbr bid="B123">123</abbr></abbrgrp>, or to exclude those multireads mapping to more loci than some threshold <abbrgrp><abbr bid="B124">124</abbr><abbr bid="B125">125</abbr></abbrgrp>, as many of these will map to centromeres and telomeres. However, this will result in a loss of sequencing depth. When choosing to keep multireads, the problem arises of how to allocate those reads among the different possible source loci. The two most common approaches are to allocate the total number of copies of a read to each mapped locus or to divide the number of copies evenly among the mapped loci. Allocating all copies to each locus ignores the fact that this is biologically impossible, but allows for the possibility that any locus might be the sole transcriptional source of a read. Distributing the copies evenly, while reflecting a biologically possible scenario, precludes such a possibility. A more sophisticated approach is to estimate the proportion of multiread copies originating from each locus by examining the levels of uniquely mapping reads at nearby loci <abbrgrp><abbr bid="B126">126</abbr><abbr bid="B127">127</abbr></abbrgrp>. This approach has several names, but we shall refer to it as "probability mapping," since it involves estimating the probability that a transcript originated from each associated locus. The basic idea of probability mapping can be explained with this simple scenario. 
Suppose a multiread maps to genomic loci L1 and L2 and that the uniquely mapping reads overlapping L1 greatly outnumber those that overlap L2. Intuitively, we can presume that most of the copies of the multiread in question originated from L1, since there is likely a higher level of transcription occurring at L1 than at L2. The proportion of copies allocated to L1 is then approximately equal to the proportion of uniquely mapping reads overlapping L1 compared to those at L2. While it remains unknown whether the presence of uniquely mapping reads is an indication of a higher overall level of transcription, the data from applications of this technique seem to support the idea.</p>
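<p>The three allocation strategies discussed above can be compared with a short sketch. The function and strategy names are hypothetical, and the fallback to an even split when no unique reads overlap any locus is an assumption added for illustration; published probability mapping implementations may handle that case differently.</p>

```python
def allocate_multiread(copies: float, unique_counts: dict[str, float],
                       strategy: str) -> dict[str, float]:
    """Distribute the copies of one multiread among its mapped loci.

    unique_counts maps each locus to the number of uniquely mapping reads
    that overlap it. Strategies:
      "all"         every locus receives the full number of copies
      "even"        copies are divided equally among the loci
      "probability" copies are divided in proportion to unique reads
    """
    loci = list(unique_counts)
    if strategy == "all":
        return {locus: float(copies) for locus in loci}
    if strategy == "even":
        return {locus: copies / len(loci) for locus in loci}
    if strategy == "probability":
        total = sum(unique_counts.values())
        if total == 0:
            # Assumption: with no unique reads to inform us, split evenly.
            return {locus: copies / len(loci) for locus in loci}
        return {locus: copies * unique_counts[locus] / total for locus in loci}
    raise ValueError(f"unknown strategy: {strategy}")
```

<p>For a multiread with 100 copies that maps to L1 and L2, where 90 unique reads overlap L1 and 10 overlap L2, the three strategies allocate 100 and 100, 50 and 50, and 90 and 10 copies to L1 and L2, respectively.</p>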
<fig id="F4"><title><p>Figure 4</p></title><caption><p>Repetitiveness of small RNA (sRNA) reads measured across sizes</p></caption><text>
   <p><b>Repetitiveness of small RNA (sRNA) reads measured across sizes</b>. The number of total reads for all uniquely and nonuniquely mapping genomic sequences divided into ribosomal RNA- or transfer RNA-derived and other (also known as "good") categories for each size class from 18 to 34 nt across 51 publicly available <it>Arabidopsis </it>sRNA libraries. For each size class, structural RNA-derived reads are more likely to map nonuniquely (that is, to more than one genomic location), whereas good reads are more likely to map uniquely (that is, to one genomic location).</p>
</text><graphic file="1758-907X-2-2-4" hint_layout="single"/></fig>
<p>The number of mismatches to allow when performing genome mapping can be a difficult issue to resolve. Individual specific DNA polymorphisms and posttranscriptional sequence modifications, which have been seen in RNA from mitochondrial and plastid genomes, tRNA and miRNA, will also cause some reads not to map to the genome. Computational techniques that allow indels and mismatches when performing genome mapping are capable of "recovering" these modified reads that would otherwise be classified as nongenomic <abbrgrp><abbr bid="B125">125</abbr><abbr bid="B128">128</abbr><abbr bid="B129">129</abbr></abbrgrp>. Allowing mismatches increases the number of raw reads that will map to the genome but also decreases the likelihood that those reads originated from the matched loci. Because of the short length of sRNA, it is generally recommended that only perfectly matched reads be utilized, unless specific known polymorphisms or posttranscriptional RNA sequence modifications exist between the reference genome and the sample in question.</p>
<sec><st><p>Quality control</p></st>
<p>Once sRNA data have been preprocessed, it is common for researchers to verify the quality of the data before moving on to normalization and analysis. There are several ways to perform quality control on sRNA data. Each base of every Illumina sequenced read or each color call of every SOLiD sequenced read is given a quality score, which can be used to calculate an average error rate for each cycle of a sequencing run. While it is normal for the error rate to increase toward the end of a run, for a good run the average error rate throughout should be relatively similar and close to the expected rate for the technology. Creating size distribution graphs should reveal peaks of sequences corresponding to the dominant size classes. For example, in <it>Arabidopsis</it>, the dominant classes are 20 or 21 nt and 23 or 24 nt, which correspond to miRNA and hc-siRNA, respectively. Libraries made from high-quality RNA should have low levels of sRNA corresponding to highly abundant mRNA. Libraries made from green tissues of plants, for instance, should have low levels of sRNA for genes encoding the highly expressed photosynthetic proteins. Computing the levels of other RNA types, such as tRNA or rRNA, among different libraries in a data set may or may not be informative, as the relative level of tRNA can vary significantly. For example, from 51 public <it>Arabidopsis </it>sRNA libraries in our databases, tRNA represented from 4% to 40% of the total number of sequenced reads. Ideally, the level of nongenomic reads should also be similar between libraries to be compared.</p>
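<p>The size distribution check described above can be sketched from a table of tag counts. The input is assumed to be a dictionary that maps each distinct sequence to its abundance, and the 18-nt minimum length is used here only as a default; both are illustrative assumptions.</p>

```python
from collections import defaultdict


def size_distribution(tag_counts: dict[str, int], min_length: int = 18):
    """Return (total, distinct) read counts per sequence length.

    total[n]    sums the abundances of all tags of length n
    distinct[n] counts the different sequences of length n
    Tags shorter than min_length are ignored.
    """
    total = defaultdict(int)
    distinct = defaultdict(int)
    for tag, count in tag_counts.items():
        if len(tag) >= min_length:
            total[len(tag)] += count
            distinct[len(tag)] += 1
    return dict(total), dict(distinct)
```

<p>Plotting the two returned dictionaries against length should reveal peaks at the dominant size classes in a good library, for example at 20 or 21 nt and 23 or 24 nt in <it>Arabidopsis</it>.</p>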
</sec>
</sec>
<sec><st><p>Data normalization</p></st>
<sec><st><p>Lessons from microarrays</p></st>
<p>The more than 20-year history of microarray experiments provides a good starting point for considering how to normalize next-generation sequencing data. While there are many technology-specific issues involved when handling raw microarray and sequencing data, the basic problem is still the same: how to convert raw data, in the form of image files, to numerical data, such that any expression differences between samples are due solely to biological variation, not to technical, experimentally introduced variation. In the case of microarrays, technical bias can be introduced during sample preparation (differences in RNA isolation, quality and amplification; target labeling; total amount of target; dye biases for spotted arrays; and so on), array manufacture (array surface chemistry, sequences used for the probes, locations of the probes within a gene, array printing for spotted arrays, scratches and so on) and array processing (hybridization conditions and scanning intensity and settings). Failing to properly remove these biases can lead to false conclusions when making comparisons within a single array or between two different arrays. Normalization attempts to remove technical bias without introducing noise.</p>
<p>Normalization requires two basic decisions: (1) which subset of genes (also called the normalization baseline or reference population) to use to determine the normalization factor and (2) which normalization method to employ <abbrgrp><abbr bid="B130">130</abbr></abbrgrp>. These two choices are independent, such that a given reference population can be used in combination with any of the different normalization methods. A good reference population is invariant in expression, meaning that the true expression levels are constant across biological treatments and span the entire expression range. Reference populations that have been used previously for microarray normalization include housekeeping genes <abbrgrp><abbr bid="B131">131</abbr></abbrgrp>, spike-ins of nonendogenous RNA or genomic DNA, an algorithmically identified set of invariant genes <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B132">132</abbr><abbr bid="B133">133</abbr><abbr bid="B134">134</abbr><abbr bid="B135">135</abbr></abbrgrp> and all genes <abbrgrp><abbr bid="B130">130</abbr></abbrgrp>. Housekeeping genes are typically used for normalizing northern blot analysis results and quantitative reverse transcription PCR (qRT-PCR) because of their supposedly constant expression level, but it has become ever more apparent that even these genes can vary in their expression <abbrgrp><abbr bid="B136">136</abbr><abbr bid="B137">137</abbr><abbr bid="B138">138</abbr><abbr bid="B139">139</abbr><abbr bid="B140">140</abbr><abbr bid="B141">141</abbr></abbrgrp>. Commercial arrays typically have probes for nonendogenous genes, and <it>in vitro </it>transcribed RNA from these genes can be used as spike-ins at various steps in the target preparation and array hybridization procedure. The point chosen will determine how much and what kind of technical variation will be corrected by the normalization. Genomic DNA has also been used for normalization because the concentration of a control sequence is readily known. 
In the absence of knowledge regarding invariant genes, algorithms have been developed that identify a set of invariant genes from the set of arrays themselves. These genes are discovered by comparing expression-ranked lists of all of the probes in each array to find the most rank-invariant genes <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B132">132</abbr><abbr bid="B133">133</abbr><abbr bid="B134">134</abbr><abbr bid="B135">135</abbr></abbrgrp>. This method is advantageous because it makes no assumptions about the expression patterns of individual genes. Normalization is generally improved by increasing the size of the reference population, which has been a disadvantage of spike-ins because only a few sequences are typically added. As an alternative to using a subset of probes for normalization, all probes can be used. This type of normalization assumes that because the RNA content is constant between treatments and most of the genes do not change in expression between treatments, the median or mean expression across all of the genes is unchanged.</p>
<p>There are many different algorithms for normalizing microarray data on the basis of the chosen reference population, but they fall into four main categories: linear scaling (as in the MAS5.0 algorithm), nonlinear scaling (as in locally weighted linear regression (LOWESS), cyclic LOWESS), quantile normalization (as in robust multi-array average (RMA), GC-RMA (a variation of RMA), dChip) and variance stabilization normalization (VSN). Two of these (linear and nonlinear scaling) have been applied to sRNA sequencing data, as we will see later in the section "Normalization methods". Linear scaling uses the reference population to determine a single factor by which the population varies when compared to a set target, such as a predetermined mean or median expression value. The expression of each probe or gene on the array is multiplied by this factor to achieve the normalized expression value. The advantage of using linear scaling is that the scaling factor is determined independently for each sample, unlike the other approaches, which normalize the data with reference to the other arrays in the data set. Linear normalization of microarray data has been largely abandoned, though, because expression values are not necessarily linear, particularly at the extremes <abbrgrp><abbr bid="B142">142</abbr></abbrgrp>. In an attempt to overcome this problem, nonlinear scaling methods have been developed that, for a given pair of arrays or for an individual array and the mean or median data derived from all of the arrays in question, first fit a curve to the expression values of the reference using LOWESS or splines and then normalize the data such that the average fold change when comparing any two arrays is 1 (that is, no change) across the expression range. Thus, a scaling factor is determined independently for small windows across the entire expression range. 
Quantile normalization uses a nonscaling approach that assumes that most genes are not differentially expressed and that the true expression distribution is similar between different samples <abbrgrp><abbr bid="B142">142</abbr></abbrgrp>. The average distribution of the reference population is determined from all of the arrays in question, and then each array is normalized to have this same distribution. Variance stabilization normalization likewise assumes that most genes are not differentially expressed. Using a generalized logarithmic transformation, VSN methods fit the data such that the variance is equal across the expression range, allowing for greater precision for low expression values, which are generally subject to greater variance <abbrgrp><abbr bid="B143">143</abbr><abbr bid="B144">144</abbr><abbr bid="B145">145</abbr></abbrgrp>. Many studies have been performed comparing these different normalization methods, but beyond the opinion that linear scaling is not as ideal because of the analog nature of microarray data, the general conclusion is that there is no single "best" normalization method <abbrgrp><abbr bid="B31">31</abbr><abbr bid="B131">131</abbr><abbr bid="B142">142</abbr><abbr bid="B146">146</abbr><abbr bid="B147">147</abbr><abbr bid="B148">148</abbr><abbr bid="B149">149</abbr></abbrgrp>. Even though the data are digital, the same is likely to be true in the case of RNA sequencing experiments as discussed below in the section "Normalization methods".</p>
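<p>The quantile normalization procedure described above can be sketched as follows. This is a simplified version written for illustration: it assumes that every array reports the same probes in the same order, and it breaks ties by input order rather than averaging tied ranks as some implementations do.</p>

```python
def quantile_normalize(arrays: list[list[float]]) -> list[list[float]]:
    """Give every array the same distribution: the mean of the sorted arrays.

    Each value is replaced by the average, across arrays, of the values
    that share its rank. Ties are broken by input order (simplification).
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [sum(column) / len(arrays) for column in zip(*sorted_arrays)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # probe indices by rank
        out = [0.0] * n
        for rank, probe in enumerate(order):
            out[probe] = rank_means[rank]
        normalized.append(out)
    return normalized
```

<p>After normalization, the sorted values of every array are identical, while the rank order of probes within each array is preserved.</p>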
</sec>
<sec><st><p>Sources of nonbiological variation in sRNA sequencing experiments</p></st>
<p>There are a number of nonbiological sources of variation that can add noise to sRNA sequencing experiments. RNA quality is a major issue because low quality can result in an increase in sequencing of degradation products. As discussed above in the section "Library preparation and inherent biases", the choice of library preparation methods has a significant impact on the makeup of the library because of biases in ligation, reverse transcription, PCR amplification or polyadenylation efficiency. While not currently done, it may be possible to develop methods to correct for these biases. One issue that can be dealt with to some extent by normalization is differences in sequencing depth between libraries. More total reads equate to a higher likelihood of any particular sequence's appearing in a library, and standardizing the total number of reads per library or sequence run is not a realizable goal. One way to reduce the impact of this kind of variation (or other technical variations encountered as a result of the sequencing procedure itself) is to sequence all of the libraries to be compared at the same time or to use multiplexing to run the samples in the same lane or at least on the same flow cell.</p>
<p>Microarray and sequencing experiments start with equal amounts of total RNA when constructing a library or a labeled target. When performing differential expression analyses using such data, an inherent assumption is that a set amount of starting RNA comes from the same number of cells in each sample. It is well known, though, that transcription rates change depending on the stage of growth, development or environment of the cell, tissue, organ or organism. Thus, this assumption can result in over- or underestimation of differences between samples. This issue is probably most significant when comparing different stages of growth or development. Studies of the per-cell abundance of sRNA in different experimental conditions have not been performed, but such studies might help improve our estimates of differential expression as well as our knowledge of the biology of sRNA.</p>
</sec>
<sec><st><p>Selecting a normalization baseline for sRNA sequencing experiments</p></st>
<p>Three reference populations for normalization have been used with sRNA sequencing experiments: spike-ins, all "good reads" and all reads. As discussed earlier in the section "Lessons from microarrays", housekeeping genes have been shown to be nonideal for normalizing microarray data because of their variable expression <abbrgrp><abbr bid="B136">136</abbr><abbr bid="B137">137</abbr><abbr bid="B138">138</abbr><abbr bid="B139">139</abbr><abbr bid="B140">140</abbr><abbr bid="B141">141</abbr></abbrgrp>. In the case of sRNA, few "housekeeping" sequences have been delineated. The identification of rank-invariant sRNA sequences would help to establish a statistically significant baseline for normalization, but this has not been done to date. RNA spike-ins of foreign sequences have proven useful, however, to account for multiple sources of variation in sequencing experiments, particularly when the spike-in RNA have been added to the total sample RNA prior to library preparation <abbrgrp><abbr bid="B115">115</abbr></abbrgrp>. Fahlgren <it>et al</it>. <abbrgrp><abbr bid="B115">115</abbr></abbrgrp> added multiple spike-ins at different concentrations to cover a range of abundances. Some sequences were more likely sequenced than others even when added at the same concentration, possibly as a result of sequence biases, so it is probably best to include multiple spike-ins of varying base compositions for each of the concentrations to be tested. Spike-ins also have proven useful in demonstrating the accuracy of some downstream data analyses <abbrgrp><abbr bid="B126">126</abbr><abbr bid="B150">150</abbr></abbrgrp>.</p>
<p>Many other studies have used all reads or, more often, all "good reads" for the normalization baseline, which is comparable to using all probe sets when normalizing microarrays. Good reads are defined as all tags that map to a reference genome, except those associated with tRNA, rRNA, snRNA, snoRNA or other structural RNA <abbrgrp><abbr bid="B124">124</abbr><abbr bid="B151">151</abbr></abbrgrp>. This approach helps to mitigate the effects of bad sequencing runs and contamination with foreign RNA, both of which result in higher numbers of sequences that do not map to the reference genome. Experiments focusing on a specific RNA type, such as miRNA, may choose to use only these sequences for the normalization baseline <abbrgrp><abbr bid="B152">152</abbr><abbr bid="B153">153</abbr></abbrgrp>.</p>
<p>In sRNA sequencing experiments, the majority of distinct reads will be sequenced in only one copy and often will be observed in only a single library. Because these sequences can act as outliers, it is sometimes best to eliminate them from the normalization baseline as discussed in the next section.</p>
</sec>
<sec><st><p>Normalization methods</p></st>
<p>Once a normalization baseline has been chosen, a normalization method must still be selected. Existing methods can be classified as either linear or nonlinear. Linear total count scaling is perhaps the simplest of all existing methods. It involves using the summation of all reads belonging to the normalization baseline as a "library size," choosing an appropriate "control" library size (either the actual size of a control library or the average size of all libraries in the experiment) and then multiplying the abundance of each individual read by the normalization value (control library size divided by the size of the library being normalized). This method has been widely applied to different types of data, including sRNA Illumina data, mRNA Illumina data <abbrgrp><abbr bid="B154">154</abbr></abbrgrp> and PARE Illumina data <abbrgrp><abbr bid="B151">151</abbr></abbrgrp>. Linear total count scaling has been shown to be no better than the analog data of microarray experiments for detecting differentially expressed genes <abbrgrp><abbr bid="B154">154</abbr></abbrgrp>. A slight variation of this method is to use the number of distinct sequences, rather than the total abundance, as the size of each library <abbrgrp><abbr bid="B155">155</abbr></abbrgrp>.</p>
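<p>The arithmetic of linear total count scaling can be illustrated with a minimal sketch. It is not taken from any of the cited studies; the function name, the library names and the sequence labels are hypothetical, and the counts are assumed to be restricted already to the chosen normalization baseline.</p>

```python
def total_count_scale(libraries, control_size=None):
    """Scale read abundances so that every library has the same baseline total.

    libraries: dict mapping a library name to {sequence: raw count}.
    control_size: target library size; defaults to the mean size of all libraries.
    """
    sizes = {name: sum(counts.values()) for name, counts in libraries.items()}
    if control_size is None:
        control_size = sum(sizes.values()) / len(sizes)
    normalized = {}
    for name, counts in libraries.items():
        factor = control_size / sizes[name]  # the normalization value
        normalized[name] = {seq: n * factor for seq, n in counts.items()}
    return normalized


# Hypothetical toy libraries (labels are placeholders, not real sRNA names).
libs = {
    "A": {"miR-x": 100, "siR-y": 300},  # library size 400
    "B": {"miR-x": 50, "siR-y": 50},    # library size 100
}
scaled = total_count_scale(libs)        # control size = (400 + 100) / 2 = 250
print(scaled["A"]["miR-x"], scaled["B"]["miR-x"])  # 62.5 125.0
```

<p>Passing the size of a designated control library as control_size, instead of relying on the default mean, corresponds to the other choice of "control" size mentioned above.</p>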
<p>Total count scaling is computationally simple but, for some experiments, biologically na&#239;ve. Consider this hypothetical scenario in which total count scaling fails: If sample A contains all reads from sample B, as well as a novel set of reads equal in size to the first set, total count scaling will result in underrepresenting reads from sample A and overrepresenting reads from sample B <abbrgrp><abbr bid="B120">120</abbr></abbrgrp>. Total count scaling is particularly ill suited to sRNA sequencing because it ignores the number of distinct reads within each sample. One proposed method that incorporates this number is quantile-based normalization, which uses the upper quartile of expressed genes (after excluding genes not expressed in any library) as a linear scaling factor <abbrgrp><abbr bid="B154">154</abbr></abbrgrp>. (Note that this differs from quantile normalization, which scales data within each quantile separately.) The quantile-based method has been shown to yield better concordance with qRT-PCR results (with a bias near zero) than linear total count scaling, making quantile-based normalization better at detecting differentially expressed genes <abbrgrp><abbr bid="B154">154</abbr></abbrgrp>. This quantile-based method has been used with RNA-seq data, where all reads per gene have been grouped together to yield one total per gene, but it has not been used with sRNA sequencing data. When we attempted to apply this approach to sRNA sequencing data (about 0.5 to 2 million distinct reads per library), the 75th-percentile sRNA were present at only one or two copies per library. Even grouping sRNA by gene or by 500-bp sliding window yielded very low copy numbers at this percentile. As a result, this method may need further modification to be applied to sRNA data, such as not considering distinct reads sequenced only one time or raising the percentile used for the normalization.</p>
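<p>A minimal sketch of quantile-based scaling follows. It is an illustration under stated assumptions rather than the published implementation: the percentile is computed by the nearest-rank rule, scaling factors are expressed relative to the mean upper quartile across libraries, and the function name and its percentile and min_count parameters are hypothetical. Raising min_count to 2 or raising percentile corresponds to the two modifications suggested above for sRNA data.</p>

```python
import math


def upper_quartile_factors(counts, percentile=75, min_count=0):
    """Return one linear scaling factor per library.

    counts: dict mapping a library name to a list of counts, with the same
    element (gene, window or distinct read) at the same index in every library.
    """
    names = list(counts)
    n_elements = len(counts[names[0]])
    # Exclude elements that are not observed in any library.
    kept = [i for i in range(n_elements) if any(counts[lib][i] > 0 for lib in names)]
    quartiles = {}
    for lib in names:
        values = sorted(counts[lib][i] for i in kept if counts[lib][i] >= min_count)
        rank = math.ceil(percentile / 100 * len(values))  # nearest-rank percentile
        quartiles[lib] = values[rank - 1]
    mean_q = sum(quartiles.values()) / len(quartiles)
    return {lib: mean_q / q for lib, q in quartiles.items()}


# Hypothetical counts; elements 0 and 6 are absent from both libraries.
counts = {
    "A": [0, 1, 1, 2, 4, 8, 0, 16],
    "B": [0, 2, 2, 4, 8, 16, 0, 32],
}
factors = upper_quartile_factors(counts)
print(factors)  # {'A': 1.5, 'B': 0.75}
```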
<p>Even quantile-based normalization has its limitations, because it assumes a similar distribution of abundances per distinct read among all libraries being normalized. It is not yet known how accurate next-generation sequencing is with regard to read distribution. It is possible, however, to properly normalize libraries that may not have similar abundance distributions by using linear regression <abbrgrp><abbr bid="B123">123</abbr></abbrgrp>. This method involves performing linear regression by comparing the abundance of each baseline element between two samples or between one sample and the mean or median of all samples, and then using the slope of the regression line as a linear scaling factor.</p>
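<p>The regression-based scaling factor can be sketched as follows. This is an illustrative least-squares fit with an intercept, which is an assumption on our part; the cited study may fit the line differently. The function name and the toy counts are hypothetical.</p>

```python
def regression_scale_factor(sample, reference):
    """Slope of the least-squares line: sample = intercept + slope * reference."""
    n = len(sample)
    mean_x = sum(reference) / n
    mean_y = sum(sample) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reference, sample))
    sxx = sum((x - mean_x) ** 2 for x in reference)
    return sxy / sxx


# Hypothetical baseline elements: the sample was sequenced twice as deeply.
reference = [10, 20, 40, 80]
sample = [20, 40, 80, 160]
slope = regression_scale_factor(sample, reference)
normalized = [y / slope for y in sample]  # bring the sample onto the reference scale
print(slope, normalized)
```

<p>The reference can be a second sample or the per-element mean or median of all samples, as described above.</p>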
<p>Because the total RNA output of each sample is unknown, linear total count scaling and other na&#239;ve methods can lead to underrepresentation of counts from high-output samples. Highly expressed genes (or other genomic elements) can sometimes take up too much "sequencing real estate" in a sample. The number of reads that map to a particular gene depends not only on gene length and expression level but also on the composition of the RNA population being sampled <abbrgrp><abbr bid="B120">120</abbr></abbrgrp>. In some studies, it is assumed that most genes are not differentially expressed and thus that their true relative expression levels should be similar. The trimmed mean of M value (TMM) normalization method exploits this assumption by calculating, for each baseline element, the log expression ratio (M value) of the experimental sample to a control sample (or the mean or median of all samples) and using the trimmed mean of these M values as a linear scaling factor. Although Robinson and Oshlack <abbrgrp><abbr bid="B120">120</abbr></abbrgrp> applied this method to genes using RNA-seq data, it could be applied to individual sRNA sequence counts as well.</p>
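<p>The core of the TMM idea can be sketched in a few lines. This is a deliberately simplified illustration, not the published algorithm: it uses an unweighted trimmed mean and trims only on the M values, whereas the published method, as we understand it, also trims on absolute expression and weights the mean. The function name, the trim fraction default and the toy counts are assumptions.</p>

```python
import math


def tmm_factor(sample, reference, trim=0.3):
    """Simplified TMM: 2 ** (trimmed mean of log2 ratios of read proportions)."""
    n_s, n_r = sum(sample), sum(reference)
    m_values = sorted(
        math.log2((s / n_s) / (r / n_r))
        for s, r in zip(sample, reference)
        if s > 0 and r > 0  # a log ratio needs both counts to be positive
    )
    k = int(len(m_values) * trim)            # number trimmed from each tail
    kept = m_values[k:len(m_values) - k]
    return 2 ** (sum(kept) / len(kept))


# Hypothetical case: nine elements are unchanged, but one highly abundant
# element doubles the size of the sample library.
reference = [10] * 10
sample = [10] * 9 + [110]
factor = tmm_factor(sample, reference)
effective_size = sum(sample) * factor        # library size after TMM adjustment
print(factor, effective_size)
```

<p>In this toy case, total count scaling would halve the nine unchanged elements, whereas the trimmed mean ignores the single dominant element and the adjusted library size again matches the reference.</p>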
<p>All of the normalization methods discussed thus far are linear scaling methods, and they suffer from an inherent flaw in assuming that the level of noise in an sRNA library is directly proportional to the size of the library. A two-step nonlinear regression method can be used to eliminate nonlinear noise without making any assumptions about its shape <abbrgrp><abbr bid="B156">156</abbr></abbrgrp>. A previously published implementation of this method is shown in Figure <figr fid="F5">5</figr>. This method uses the number of sequences mapping to each genomic window as well as the averages of these counts across the set of libraries. While this particular normalization method assumes that the data include only uniquely mapping sequences, multireads could be included by using probability mapping (described above in the section "Preprocessing of sRNA data") to estimate the total number of transcripts originating from within each genomic window. The first step is to regress the observed differences in counts (control minus sample) on the mean to estimate fitted values and then subtract these fitted values from the observed differences. This transforms each observed difference into a mean normalized difference. The second step is to estimate the moving mean absolute deviation (by regressing the absolute value of the mean normalized differences on absolute mean counts) and then divide the mean normalized differences by the estimated mean absolute deviation.</p>
<fig id="F5"><title><p>Figure 5</p></title><caption><p>Example of two-step nonlinear normalization</p></caption><text>
   <p><b>Example of two-step nonlinear normalization</b>. An example of the normalization process applied to the binding quantity difference regarding breast cancer data on human chromosome 1 between (1) MCF-7 control and (2) MCF-7 with E2 stimulation. <b>(A) </b>Raw data with clear bias toward the positive direction. <b>(B) </b>Data normalized with respect to the mean. <b>(C) </b>Data normalized with respect to both mean and variance (<it>x</it>-axis is zoomed in). Green dashed-dotted line and magenta dashed line represent the locally weighted linear regression line with respect to the mean and variance, respectively. Red dotted line represents the zero difference line. Reproduced with permission from Oxford University Press from Taslim <it>et al</it>. <abbrgrp><abbr bid="B156">156</abbr></abbrgrp>.</p>
</text><graphic file="1758-907X-2-2-5" hint_layout="single"/></fig>
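<p>The two steps can be sketched as follows. This is only an outline of the logic: a crude running mean over neighbouring windows stands in for the locally weighted regression used in the published method, and all function names, the half_window parameter and the toy counts are hypothetical.</p>

```python
def running_mean(x, y, half_window=2):
    """Crude smoother: average y over each point's neighbours in x order.

    A stand-in (assumption) for locally weighted regression.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    fitted = [0.0] * len(x)
    for pos, i in enumerate(order):
        first = max(0, pos - half_window)
        last = min(len(order), pos + half_window + 1)
        window = [y[j] for j in order[first:last]]
        fitted[i] = sum(window) / len(window)
    return fitted


def two_step_normalize(control, sample, half_window=2):
    """Return mean- and variance-normalized differences per genomic window."""
    mean = [(c + s) / 2 for c, s in zip(control, sample)]
    diff = [c - s for c, s in zip(control, sample)]
    # Step 1: remove the mean-dependent trend from the differences.
    trend = running_mean(mean, diff, half_window)
    centered = [d - t for d, t in zip(diff, trend)]
    # Step 2: divide by the mean-dependent absolute deviation.
    spread = running_mean(mean, [abs(v) for v in centered], half_window)
    return [v / s if s > 0 else 0.0 for v, s in zip(centered, spread)]


# Hypothetical window counts: a constant bias of 2 plus one genuine difference.
control = [12, 22, 32, 42, 52, 90]
sample = [10, 20, 30, 40, 50, 60]
scores = two_step_normalize(control, sample)
print(scores)
```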
<p>A summary of the normalization methods discussed here is given in Table <tblr tid="T2">2</tblr>. Because modern computational standards make most of the more advanced normalization methods relatively trivial, especially when compared to the task of genome mapping, we recommend that researchers not hesitate to use the more sophisticated approaches described herein. In particular, the methods implemented by Robinson <it>et al</it>. <abbrgrp><abbr bid="B120">120</abbr></abbrgrp> (TMM) and Taslim <it>et al</it>. <abbrgrp><abbr bid="B156">156</abbr></abbrgrp> (two-step nonlinear regression) seem to account for many flaws inherent in linear total count scaling, which has been the predominant normalization method of choice. A study comparing these two methods with each other and with the alternatives would help to provide a much-needed "gold standard" for normalizing sRNA data. We also recommend using absolute counts, rather than log ratios, when performing normalization, as log ratios fail to account for the vast differences in magnitude evident in many sRNA data sets but absent from microarray experiments.</p>
<tbl id="T2"><title><p>Table 2</p></title><caption><p>Comparison of sRNA normalization methods<sup>a</sup></p></caption><tblbdy cols="4">
      <r>
         <c ca="left">
            <p>
               <b>Method</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Computational complexity</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Control required</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Units normalized</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="4">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Total count scaling</p>
         </c>
         <c ca="left">
            <p>Low</p>
         </c>
         <c ca="left">
            <p>No</p>
         </c>
         <c ca="left">
            <p>Reads</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Quantile-based scaling</p>
         </c>
         <c ca="left">
            <p>Medium</p>
         </c>
         <c ca="left">
            <p>No</p>
         </c>
         <c ca="left">
            <p>Reads</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>TMM</p>
         </c>
         <c ca="left">
            <p>High</p>
         </c>
         <c ca="left">
            <p>Yes</p>
         </c>
         <c ca="left">
            <p>Reads</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Linear regression</p>
         </c>
         <c ca="left">
            <p>High</p>
         </c>
         <c ca="left">
            <p>Yes</p>
         </c>
         <c ca="left">
            <p>Reads</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Nonlinear regression</p>
         </c>
         <c ca="left">
            <p>Very high</p>
         </c>
         <c ca="left">
            <p>Yes</p>
         </c>
         <c ca="left">
            <p>Genomic windows</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p><sup>a</sup>sRNA, small RNA; TMM, trimmed mean of M value.</p>
   </tblfn></tbl>
</sec>
</sec>
<sec><st><p>Differential expression analysis</p></st>
<p>Once sRNA libraries have been normalized, there are many different analyses that can be performed on them, but most fall under some category of differential expression analysis. Differential expression analysis can be performed on (1) individual sequences of interest, such as miRNA; (2) genomic elements, such as genes or transposons; or (3) discrete sRNA-generating genomic loci, also known as "clusters" or "bins." Clustering or binning involves dividing the genome into windows of equal size and summing all normalized counts for tags mapping to each window. For experiments involving sRNA data, clustering is not ideal when comparing genomic elements with specific, singular mature sequences, such as miRNA, but can be useful in identifying differentially expressed regions in promoters, noncoding DNA or previously unannotated genes.</p>
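<p>Binning can be sketched as follows. The example assumes that each tag has already been mapped to a single zero-based start position; the 500-bp window size, the function name and the toy tags are illustrative choices, not recommendations.</p>

```python
from collections import defaultdict


def bin_counts(tags, window=500):
    """Sum normalized counts per fixed-size genomic window.

    tags: iterable of (chromosome, start position, normalized count).
    Returns {(chromosome, window index): summed count}.
    """
    bins = defaultdict(float)
    for chrom, start, count in tags:
        bins[(chrom, start // window)] += count
    return dict(bins)


# Hypothetical mapped tags.
tags = [
    ("chr1", 120, 3.0),
    ("chr1", 480, 2.0),
    ("chr1", 510, 4.0),
    ("chr2", 20, 1.0),
]
clusters = bin_counts(tags)
print(clusters)  # {('chr1', 0): 5.0, ('chr1', 1): 4.0, ('chr2', 0): 1.0}
```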
<p>The methods for identifying genes expressed differentially with statistical significance differ depending on whether biological replicates were included. The approach to identifying differential expression between digital tag counts first implemented by Audic and Claverie <abbrgrp><abbr bid="B157">157</abbr></abbrgrp> is particularly sensitive to small differences in low tag counts and is useful for comparing data sets without replicates. Their A-C statistic involves computing the probability that two independent digital measurements of a particular sequence (or set of sequences) come from similar populations. As the actual values being compared increase, the minimum fold change between them recognized as significant decreases. Although this approach relies upon a single measurement for establishing an assumed Poisson distribution for a given sequence, it has been shown that this assumed distribution is never far from the true (but unknown) Poisson distribution <abbrgrp><abbr bid="B158">158</abbr></abbrgrp>. The original implementations by Audic and Claverie <abbrgrp><abbr bid="B157">157</abbr></abbrgrp> were for relatively small data sets (&lt; 10 K reads) and modern sRNA data sets are several orders of magnitude bigger, but the statistical principles guiding the approach remain the same. Thus, the A-C statistic has become popular among biologists seeking to perform comparisons between large RNA data sets <abbrgrp><abbr bid="B124">124</abbr><abbr bid="B158">158</abbr><abbr bid="B159">159</abbr><abbr bid="B160">160</abbr></abbrgrp>. There has been at least one study, however, that demonstrated a poor fit between RNA-seq data and a Poisson distribution <abbrgrp><abbr bid="B161">161</abbr></abbrgrp>. The nature of these types of data makes it difficult to identify a "true" distribution, leaving researchers to assume whichever distribution they consider most appropriate. Other distributions assumed include binomial <abbrgrp><abbr bid="B123">123</abbr></abbrgrp> and negative binomial <abbrgrp><abbr bid="B120">120</abbr></abbrgrp>. It should also be noted that Audic and Claverie <abbrgrp><abbr bid="B157">157</abbr></abbrgrp> provided an alternative formula that allows for both normalization and differential expression analysis, but this alternative formula is not recommended for normalization purposes as it essentially implements a linear total count scaling and does not exclude tRNA or nongenomic reads.</p>
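<p>A minimal sketch of the A-C calculation is given below. It assumes that the conditional probability of observing y copies in a library of size N2, given x copies in a library of size N1, takes the form p(y|x) = (N2/N1)^y (x + y)! / [x! y! (1 + N2/N1)^(x + y + 1)]; readers should check this against the original publication before relying on it. The function names and the example counts are hypothetical, and log-gamma functions are used so that large counts do not overflow.</p>

```python
import math


def ac_probability(x, y, n1, n2):
    """p(y | x) under the assumed Audic-Claverie form, for library sizes n1, n2."""
    ratio = n2 / n1
    log_p = (
        y * math.log(ratio)
        + math.lgamma(x + y + 1)      # log (x + y)!
        - math.lgamma(x + 1)          # log x!
        - math.lgamma(y + 1)          # log y!
        - (x + y + 1) * math.log(1 + ratio)
    )
    return math.exp(log_p)


def ac_upper_tail(x, y, n1, n2):
    """P(Y >= y | x): chance of a count at least as large as y in library 2."""
    return 1.0 - sum(ac_probability(x, k, n1, n2) for k in range(y))


# Hypothetical comparison: 5 copies versus 20 copies in equally sized libraries.
p_value = ac_upper_tail(5, 20, 1_000_000, 1_000_000)
print(p_value)
```

<p>Because the tail is computed by direct summation, this sketch is practical only for modest values of y; an implementation for highly abundant sequences would need a cumulative distribution function instead.</p>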
<p>For differential expression analyses on data sets with replicates, at least two approaches have been implemented recently. Bioconductor <url>http://bioconductor.org/</url> offers a software package called edgeR (empirical analysis of digital gene expression in R) that detects differentially expressed genes in a replicated experiment using an overdispersed Poisson model (a Poisson model allowing for greater variability) and an empirical Bayes procedure to moderate the degree of overdispersion <abbrgrp><abbr bid="B162">162</abbr></abbrgrp>. By using a parameter to estimate the dispersion between replicates, the model can separate biological variation from technical variation. The edgeR program takes raw sequence counts and total library counts as input parameters, so the data do not have to be normalized first. This approach was used by Eveland <it>et al</it>. <abbrgrp><abbr bid="B118">118</abbr></abbrgrp> to identify differentially expressed genes from maize RNA-seq libraries. Using qRT-PCR, significant differences were validated for 80% of genes identified as differentially expressed. Differential expression detection was possible on tags found in more than 10 copies, but the statistical strength increased with higher counts. The results of analyzing individual tags also corresponded well with the results of analyzing entire genes.</p>
<p>Fahlgren <it>et al</it>. <abbrgrp><abbr bid="B115">115</abbr></abbrgrp> provided another approach for identifying differentially expressed genes from sequencing data sets with replicates by adapting the significance analysis of microarrays (SAM) to sequencing data, a method they call SAM-seq. The differential expression score between the samples incorporates the average abundance across each replicate set for a given sRNA as well as the standard deviation across all samples (from all replicate sets). It also incorporates a small but positive constant to minimize the coefficient of variation for the data set. Therefore, the differential expression score is essentially a <it>t</it>-statistic that has been modified to increase inferential power. This approach also uses a <it>Q</it>-value to allow for control of the false discovery rate. The power to detect differentially expressed genes (1 - false-negative rate) using this approach increases with the number of replicates as well as with the number of differentially expressed sRNA, but even with five replicates, it still remained in the 75% to 95% range. Conversely, the false discovery rate remained under 5%, even with as few as two replicates.</p>
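<p>The form of such a moderated score can be sketched as follows. This is an assumption-laden illustration rather than the published procedure: the spread is taken here to be the pooled within-condition standard deviation, the constant s0 is set arbitrarily, and the permutation-based estimation of the false discovery rate is omitted. The function name and the toy replicate values are hypothetical.</p>

```python
import math
import statistics


def de_score(rep_a, rep_b, s0=1.0):
    """Moderated t-like score: difference of means over (spread + s0)."""
    mean_a = statistics.mean(rep_a)
    mean_b = statistics.mean(rep_b)
    df_a, df_b = len(rep_a) - 1, len(rep_b) - 1
    pooled_var = (
        df_a * statistics.variance(rep_a) + df_b * statistics.variance(rep_b)
    ) / (df_a + df_b)
    # s0 keeps low-abundance sRNA with tiny variances from dominating the ranking.
    return (mean_a - mean_b) / (math.sqrt(pooled_var) + s0)


# Hypothetical normalized abundances of one sRNA in two replicate sets.
condition_a = [10.0, 12.0, 14.0]
condition_b = [4.0, 6.0, 8.0]
score = de_score(condition_a, condition_b)
print(score)
```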
</sec>
<sec><st><p>Conclusions</p></st>
<p>The use of next-generation sequencing to analyze small RNA populations is driving a large number of discoveries in many different organisms. The digital nature and the vast sequencing depth afforded by these approaches provide data that are both qualitatively and quantitatively highly informative. The technologies themselves, including read lengths, sequencing depths, cost and methods of library preparation, continue to improve. While standards for these experiments are still lacking, approaches for designing these experiments, preprocessing and normalizing the data and identifying differentially expressed genes continue to develop. To date, most experiments still do not use biological replicates because of cost. The application of the A-C statistic can still allow statistically meaningful conclusions to be drawn from such experiments, but replicates are still ideal. The ability to multiplex samples in single lanes combined with greater sequencing depths will make replication financially more feasible, and we expect that in the near future replication will be required as it is for other genomic approaches. While next-generation sequencing is a vast improvement over microarrays for differential gene expression studies, it is not free from bias; the relative levels of different sequences within the same sample do not necessarily represent the biological situation, owing to bias during library preparation. No method is completely free of bias, but it can be reduced by using T4 RNA ligase 2 for adapter ligation, ligation-free library preparation and/or amplification-free sequencing methods. To date, normalization primarily accounts for differences in sequencing depths between libraries, but further experimental study of these biases may enable the biases to be corrected for during normalization. Normalization is still generally done by linear total count scaling, but positive results from RNA-seq and ChIP-seq experiments suggest that quantile-based or nonlinear scaling methods may be more appropriate for sRNA sequencing studies as well because of the abundance of low copy number reads. The issue of multireads complicates all of these analyses. We have attempted to use probability mapping in our studies, but we have found that a single, highly abundant, distinct sequence within a highly conserved region may throw off the apportioning between loci. Probability mapping approaches are also likely affected by sequencing biases, so both issues will need to be accounted for in improved methods.</p>
</sec>
<sec><st><p>Abbreviations</p></st>
<p>dsRNA: double-stranded RNA; endo-siRNA or esiRNA: endogenous siRNA; exo-siRNA: exogenous siRNA; GMUCT: genome-wide mapping of uncapped transcripts; hc-siRNA: heterochromatic siRNA; LOWESS: locally weighted linear regression; RMA: robust multi-array average; miRNA: microRNA; MPSS: massively parallel signature sequencing; nat-siRNA: natural antisense transcript-derived siRNA; NET-seq: native elongating transcript sequencing; PAP: poly(A) polymerase; PARE: parallel analysis of RNA ends; piRNA: Piwi-interacting RNA; rasiRNA: repeat-associated siRNA; RDR: RNA-dependent RNA polymerase; RNAi: RNA interference; SAM: significance analysis of microarrays; SBL: sequencing by ligation; SBS: sequencing by synthesis; siRNA: small interfering RNA; sRNA: small RNA; ta-siRNA: <it>trans</it>-acting siRNA; TMM: trimmed mean of M value; VSN: variance stabilization normalization.</p>
</sec>
<sec><st><p>Competing interests</p></st>
<p>The authors declare that they have no competing interests.</p>
</sec>
<sec><st><p>Authors' contributions</p></st>
<p>KPM and MRW prepared this manuscript with input, feedback and advice from BCM. All authors read and approved the final manuscript.</p>
</sec>
</bdy>
<bm>
<ack><sec><st><p>Acknowledgements</p></st>
<p>KPM and MRW are funded by National Science Foundation Arabidopsis 2010 award 0725968 to BCM and R. Scott Poethig of the Department of Biology, University of Pennsylvania, Philadelphia, PA, USA.</p>
</sec>
</ack>
<refgrp><bibl id="B1"><title><p>Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays</p></title><aug><au><snm>Brenner</snm><fnm>S</fnm></au><au><snm>Johnson</snm><fnm>M</fnm></au><au><snm>Bridgham</snm><fnm>J</fnm></au><au><snm>Golda</snm><fnm>G</fnm></au><au><snm>Lloyd</snm><fnm>DH</fnm></au><au><snm>Johnson</snm><fnm>D</fnm></au><au><snm>Luo</snm><fnm>S</fnm></au><au><snm>McCurdy</snm><fnm>S</fnm></au><au><snm>Foy</snm><fnm>M</fnm></au><au><snm>Ewan</snm><fnm>M</fnm></au><au><snm>Roth</snm><fnm>R</fnm></au><au><snm>George</snm><fnm>D</fnm></au><au><snm>Eletr</snm><fnm>S</fnm></au><au><snm>Albrecht</snm><fnm>G</fnm></au><au><snm>Vermaas</snm><fnm>E</fnm></au><au><snm>Williams</snm><fnm>SR</fnm></au><au><snm>Moon</snm><fnm>K</fnm></au><au><snm>Burcham</snm><fnm>T</fnm></au><au><snm>Pallas</snm><fnm>M</fnm></au><au><snm>DuBridge</snm><fnm>RB</fnm></au><au><snm>Kirchner</snm><fnm>J</fnm></au><au><snm>Fearon</snm><fnm>K</fnm></au><au><snm>Mao</snm><fnm>J</fnm></au><au><snm>Corcoran</snm><fnm>K</fnm></au></aug><source>Nat Biotechnol</source><pubdate>2000</pubdate><volume>18</volume><fpage>630</fpage><lpage>634</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/76469</pubid><pubid idtype="pmpid" link="fulltext">10835600</pubid></pubidlist></xrefbib></bibl><bibl id="B2"><title><p>The transcriptional landscape of the yeast genome defined by RNA sequencing</p></title><aug><au><snm>Nagalakshmi</snm><fnm>U</fnm></au><au><snm>Wang</snm><fnm>Z</fnm></au><au><snm>Waern</snm><fnm>K</fnm></au><au><snm>Shou</snm><fnm>C</fnm></au><au><snm>Raha</snm><fnm>D</fnm></au><au><snm>Gerstein</snm><fnm>M</fnm></au><au><snm>Snyder</snm><fnm>M</fnm></au></aug><source>Science</source><pubdate>2008</pubdate><volume>320</volume><fpage>1344</fpage><lpage>1349</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1158441</pubid><pubid idtype="pmcid">2951732</pubid><pubid idtype="pmpid">18451266</pubid></pubidlist></xrefbib></bibl><bibl 
id="B3"><title><p>Elucidation of the small RNA component of the transcriptome</p></title><aug><au><snm>Lu</snm><fnm>C</fnm></au><au><snm>Tej</snm><fnm>SS</fnm></au><au><snm>Luo</snm><fnm>S</fnm></au><au><snm>Haudenschild</snm><fnm>CD</fnm></au><au><snm>Meyers</snm><fnm>BC</fnm></au><au><snm>Green</snm><fnm>PJ</fnm></au></aug><source>Science</source><pubdate>2005</pubdate><volume>309</volume><fpage>1567</fpage><lpage>1569</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1114112</pubid><pubid idtype="pmpid" link="fulltext">16141074</pubid></pubidlist></xrefbib></bibl><bibl id="B4"><title><p>Endogenous siRNA and miRNA targets identified by sequencing of the Arabidopsis degradome</p></title><aug><au><snm>Addo-Quaye</snm><fnm>C</fnm></au><au><snm>Eshoo</snm><fnm>TW</fnm></au><au><snm>Bartel</snm><fnm>DP</fnm></au><au><snm>Axtell</snm><fnm>MJ</fnm></au></aug><source>Curr Biol</source><pubdate>2008</pubdate><volume>18</volume><fpage>758</fpage><lpage>762</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.cub.2008.04.042</pubid><pubid idtype="pmcid">2583427</pubid><pubid idtype="pmpid">18472421</pubid></pubidlist></xrefbib></bibl><bibl id="B5"><title><p>Global identification of microRNA-target RNA pairs by parallel analysis of RNA ends</p></title><aug><au><snm>German</snm><fnm>MA</fnm></au><au><snm>Pillay</snm><fnm>M</fnm></au><au><snm>Jeong</snm><fnm>DH</fnm></au><au><snm>Hetawal</snm><fnm>A</fnm></au><au><snm>Luo</snm><fnm>S</fnm></au><au><snm>Janardhanan</snm><fnm>P</fnm></au><au><snm>Kannan</snm><fnm>V</fnm></au><au><snm>Rymarquis</snm><fnm>LA</fnm></au><au><snm>Nobuta</snm><fnm>K</fnm></au><au><snm>German</snm><fnm>R</fnm></au><au><snm>De Paoli</snm><fnm>E</fnm></au><au><snm>Lu</snm><fnm>C</fnm></au><au><snm>Schroth</snm><fnm>G</fnm></au><au><snm>Meyers</snm><fnm>BC</fnm></au><au><snm>Green</snm><fnm>PJ</fnm></au></aug><source>Nat 
Biotechnol</source><pubdate>2008</pubdate><volume>26</volume><fpage>941</fpage><lpage>946</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nbt1417</pubid><pubid idtype="pmpid">18542052</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>A link between RNA metabolism and silencing affecting <it>Arabidopsis </it>development</p></title><aug><au><snm>Gregory</snm><fnm>BD</fnm></au><au><snm>O'Malley</snm><fnm>RC</fnm></au><au><snm>Lister</snm><fnm>R</fnm></au><au><snm>Urich</snm><fnm>MA</fnm></au><au><snm>Tonti-Filippini</snm><fnm>J</fnm></au><au><snm>Chen</snm><fnm>H</fnm></au><au><snm>Millar</snm><fnm>AH</fnm></au><au><snm>Ecker</snm><fnm>JR</fnm></au></aug><source>Dev Cell</source><pubdate>2008</pubdate><volume>14</volume><fpage>854</fpage><lpage>866</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.devcel.2008.04.005</pubid><pubid idtype="pmpid" link="fulltext">18486559</pubid></pubidlist></xrefbib></bibl><bibl id="B7"><title><p>Genome-wide measurement of RNA secondary structure in yeast</p></title><aug><au><snm>Kertesz</snm><fnm>M</fnm></au><au><snm>Wan</snm><fnm>Y</fnm></au><au><snm>Mazor</snm><fnm>E</fnm></au><au><snm>Rinn</snm><fnm>JL</fnm></au><au><snm>Nutter</snm><fnm>RC</fnm></au><au><snm>Chang</snm><fnm>HY</fnm></au><au><snm>Segal</snm><fnm>E</fnm></au></aug><source>Nature</source><pubdate>2010</pubdate><volume>467</volume><fpage>103</fpage><lpage>107</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature09322</pubid><pubid idtype="pmpid" link="fulltext">20811459</pubid></pubidlist></xrefbib></bibl><bibl id="B8"><title><p>Genome-wide double-stranded RNA sequencing reveals the functional significance of base-paired RNAs in 
<it>Arabidopsis</it></p></title><aug><au><snm>Zheng</snm><fnm>Q</fnm></au><au><snm>Ryvkin</snm><fnm>P</fnm></au><au><snm>Li</snm><fnm>F</fnm></au><au><snm>Dragomir</snm><fnm>I</fnm></au><au><snm>Valladares</snm><fnm>O</fnm></au><au><snm>Yang</snm><fnm>J</fnm></au><au><snm>Cao</snm><fnm>K</fnm></au><au><snm>Wang</snm><fnm>LS</fnm></au><au><snm>Gregory</snm><fnm>BD</fnm></au></aug><source>PLoS Genet</source><pubdate>2010</pubdate><volume>6</volume><fpage>e1001141</fpage><note>pii</note><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pgen.1001141</pubid><pubid idtype="pmcid">2947979</pubid><pubid idtype="pmpid">20941385</pubid></pubidlist></xrefbib></bibl><bibl id="B9"><title><p>Nascent transcript sequencing visualizes transcription at nucleotide resolution</p></title><aug><au><snm>Churchman</snm><fnm>LS</fnm></au><au><snm>Weissman</snm><fnm>JS</fnm></au></aug><source>Nature</source><pubdate>2011</pubdate><volume>469</volume><fpage>368</fpage><lpage>373</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature09652</pubid><pubid idtype="pmpid" link="fulltext">21248844</pubid></pubidlist></xrefbib></bibl><bibl id="B10"><title><p>Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling</p></title><aug><au><snm>Ingolia</snm><fnm>NT</fnm></au><au><snm>Ghaemmaghami</snm><fnm>S</fnm></au><au><snm>Newman</snm><fnm>JR</fnm></au><au><snm>Weissman</snm><fnm>JS</fnm></au></aug><source>Science</source><pubdate>2009</pubdate><volume>324</volume><fpage>218</fpage><lpage>223</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1168978</pubid><pubid idtype="pmcid">2746483</pubid><pubid idtype="pmpid">19213877</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel 
sequencing</p></title><aug><au><snm>Robertson</snm><fnm>G</fnm></au><au><snm>Hirst</snm><fnm>M</fnm></au><au><snm>Bainbridge</snm><fnm>M</fnm></au><au><snm>Bilenky</snm><fnm>M</fnm></au><au><snm>Zhao</snm><fnm>Y</fnm></au><au><snm>Zeng</snm><fnm>T</fnm></au><au><snm>Euskirchen</snm><fnm>G</fnm></au><au><snm>Bernier</snm><fnm>B</fnm></au><au><snm>Varhol</snm><fnm>R</fnm></au><au><snm>Delaney</snm><fnm>A</fnm></au><au><snm>Thiessen</snm><fnm>N</fnm></au><au><snm>Griffith</snm><fnm>OL</fnm></au><au><snm>He</snm><fnm>A</fnm></au><au><snm>Marra</snm><fnm>M</fnm></au><au><snm>Snyder</snm><fnm>M</fnm></au><au><snm>Jones</snm><fnm>S</fnm></au></aug><source>Nat Methods</source><pubdate>2007</pubdate><volume>4</volume><fpage>651</fpage><lpage>657</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nmeth1068</pubid><pubid idtype="pmpid" link="fulltext">17558387</pubid></pubidlist></xrefbib></bibl><bibl id="B12"><title><p>Shotgun bisulphite sequencing of the <it>Arabidopsis </it>genome reveals DNA methylation patterning</p></title><aug><au><snm>Cokus</snm><fnm>SJ</fnm></au><au><snm>Feng</snm><fnm>S</fnm></au><au><snm>Zhang</snm><fnm>X</fnm></au><au><snm>Chen</snm><fnm>Z</fnm></au><au><snm>Merriman</snm><fnm>B</fnm></au><au><snm>Haudenschild</snm><fnm>CD</fnm></au><au><snm>Pradhan</snm><fnm>S</fnm></au><au><snm>Nelson</snm><fnm>SF</fnm></au><au><snm>Pellegrini</snm><fnm>M</fnm></au><au><snm>Jacobsen</snm><fnm>SE</fnm></au></aug><source>Nature</source><pubdate>2008</pubdate><volume>452</volume><fpage>215</fpage><lpage>219</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature06745</pubid><pubid idtype="pmcid">2377394</pubid><pubid idtype="pmpid">18278030</pubid></pubidlist></xrefbib></bibl><bibl id="B13"><title><p>Accurate whole human genome sequencing using reversible terminator 
chemistry</p></title><aug><au><snm>Bentley</snm><fnm>DR</fnm></au><au><snm>Balasubramanian</snm><fnm>S</fnm></au><au><snm>Swerdlow</snm><fnm>HP</fnm></au><au><snm>Smith</snm><fnm>GP</fnm></au><au><snm>Milton</snm><fnm>J</fnm></au><au><snm>Brown</snm><fnm>CG</fnm></au><au><snm>Hall</snm><fnm>KP</fnm></au><au><snm>Evers</snm><fnm>DJ</fnm></au><au><snm>Barnes</snm><fnm>CL</fnm></au><au><snm>Bignell</snm><fnm>HR</fnm></au><au><snm>Boutell</snm><fnm>JM</fnm></au><au><snm>Bryant</snm><fnm>J</fnm></au><au><snm>Carter</snm><fnm>RJ</fnm></au><au><snm>Keira Cheetham</snm><fnm>R</fnm></au><au><snm>Cox</snm><fnm>AJ</fnm></au><au><snm>Ellis</snm><fnm>DJ</fnm></au><au><snm>Flatbush</snm><fnm>MR</fnm></au><au><snm>Gormley</snm><fnm>NA</fnm></au><au><snm>Humphray</snm><fnm>SJ</fnm></au><au><snm>Irving</snm><fnm>LJ</fnm></au><au><snm>Karbelashvili</snm><fnm>MS</fnm></au><au><snm>Kirk</snm><fnm>SM</fnm></au><au><snm>Li</snm><fnm>H</fnm></au><au><snm>Liu</snm><fnm>X</fnm></au><au><snm>Maisinger</snm><fnm>KS</fnm></au><au><snm>Murray</snm><fnm>LJ</fnm></au><au><snm>Obradovic</snm><fnm>B</fnm></au><au><snm>Ost</snm><fnm>T</fnm></au><au><snm>Parkinson</snm><fnm>ML</fnm></au><au><snm>Pratt</snm><fnm>MR</fnm></au><etal/></aug><source>Nature</source><pubdate>2008</pubdate><volume>456</volume><fpage>53</fpage><lpage>59</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature07517</pubid><pubid idtype="pmcid">2581791</pubid><pubid idtype="pmpid">18987734</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>The diploid genome sequence of an Asian 
individual</p></title><aug><au><snm>Wang</snm><fnm>J</fnm></au><au><snm>Wang</snm><fnm>W</fnm></au><au><snm>Li</snm><fnm>R</fnm></au><au><snm>Li</snm><fnm>Y</fnm></au><au><snm>Tian</snm><fnm>G</fnm></au><au><snm>Goodman</snm><fnm>L</fnm></au><au><snm>Fan</snm><fnm>W</fnm></au><au><snm>Zhang</snm><fnm>J</fnm></au><au><snm>Li</snm><fnm>J</fnm></au><au><snm>Zhang</snm><fnm>J</fnm></au><au><snm>Guo</snm><fnm>Y</fnm></au><au><snm>Feng</snm><fnm>B</fnm></au><au><snm>Li</snm><fnm>H</fnm></au><au><snm>Lu</snm><fnm>Y</fnm></au><au><snm>Fang</snm><fnm>X</fnm></au><au><snm>Liang</snm><fnm>H</fnm></au><au><snm>Du</snm><fnm>Z</fnm></au><au><snm>Li</snm><fnm>D</fnm></au><au><snm>Zhao</snm><fnm>Y</fnm></au><au><snm>Hu</snm><fnm>Y</fnm></au><au><snm>Yang</snm><fnm>Z</fnm></au><au><snm>Zheng</snm><fnm>H</fnm></au><au><snm>Hellmann</snm><fnm>I</fnm></au><au><snm>Inouye</snm><fnm>M</fnm></au><au><snm>Pool</snm><fnm>J</fnm></au><au><snm>Yi</snm><fnm>X</fnm></au><au><snm>Zhao</snm><fnm>J</fnm></au><au><snm>Duan</snm><fnm>J</fnm></au><au><snm>Zhou</snm><fnm>Y</fnm></au><au><snm>Qin</snm><fnm>J</fnm></au><etal/></aug><source>Nature</source><pubdate>2008</pubdate><volume>456</volume><fpage>60</fpage><lpage>65</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature07484</pubid><pubid idtype="pmcid">2716080</pubid><pubid idtype="pmpid">18987735</pubid></pubidlist></xrefbib></bibl><bibl id="B15"><title><p>The complete genome of an individual by massively parallel DNA 
sequencing</p></title><aug><au><snm>Wheeler</snm><fnm>DA</fnm></au><au><snm>Srinivasan</snm><fnm>M</fnm></au><au><snm>Egholm</snm><fnm>M</fnm></au><au><snm>Shen</snm><fnm>Y</fnm></au><au><snm>Chen</snm><fnm>L</fnm></au><au><snm>McGuire</snm><fnm>A</fnm></au><au><snm>He</snm><fnm>W</fnm></au><au><snm>Chen</snm><fnm>YJ</fnm></au><au><snm>Makhijani</snm><fnm>V</fnm></au><au><snm>Roth</snm><fnm>GT</fnm></au><au><snm>Gomes</snm><fnm>X</fnm></au><au><snm>Tartaro</snm><fnm>K</fnm></au><au><snm>Niazi</snm><fnm>F</fnm></au><au><snm>Turcotte</snm><fnm>CL</fnm></au><au><snm>Irzyk</snm><fnm>GP</fnm></au><au><snm>Lupski</snm><fnm>JR</fnm></au><au><snm>Chinault</snm><fnm>C</fnm></au><au><snm>Song</snm><fnm>XZ</fnm></au><au><snm>Liu</snm><fnm>Y</fnm></au><au><snm>Yuan</snm><fnm>Y</fnm></au><au><snm>Nazareth</snm><fnm>L</fnm></au><au><snm>Qin</snm><fnm>X</fnm></au><au><snm>Muzny</snm><fnm>DM</fnm></au><au><snm>Margulies</snm><fnm>M</fnm></au><au><snm>Weinstock</snm><fnm>GM</fnm></au><au><snm>Gibbs</snm><fnm>RA</fnm></au><au><snm>Rothberg</snm><fnm>JM</fnm></au></aug><source>Nature</source><pubdate>2008</pubdate><volume>452</volume><fpage>872</fpage><lpage>876</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature06884</pubid><pubid idtype="pmpid" link="fulltext">18421352</pubid></pubidlist></xrefbib></bibl><bibl id="B16"><title><p>A novel design of whole-genome microarray probes for Saccharomyces cerevisiae which minimizes cross-hybridization</p></title><aug><au><snm>Talla</snm><fnm>E</fnm></au><au><snm>Tekaia</snm><fnm>F</fnm></au><au><snm>Brino</snm><fnm>L</fnm></au><au><snm>Dujon</snm><fnm>B</fnm></au></aug><source>BMC Genomics</source><pubdate>2003</pubdate><volume>4</volume><fpage>38</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-4-38</pubid><pubid idtype="pmcid">239980</pubid><pubid idtype="pmpid">14499002</pubid></pubidlist></xrefbib></bibl><bibl id="B17"><title><p>Sensitivity of microarray oligonucleotide probes: variability and effect of base 
composition</p></title><aug><au><snm>Binder</snm><fnm>H</fnm></au><au><snm>Kirsten</snm><fnm>T</fnm></au><au><snm>Loeffler</snm><fnm>M</fnm></au><au><snm>Stadler</snm><fnm>PF</fnm></au></aug><source>J Phys Chem B</source><pubdate>2004</pubdate><volume>108</volume><fpage>18003</fpage><lpage>18014</lpage><xrefbib><pubid idtype="doi">10.1021/jp049593g</pubid></xrefbib></bibl><bibl id="B18"><title><p>Determinants of sensitivity and specificity in spotted DNA microarrays with unmodified oligonucleotides</p></title><aug><au><snm>Kucho</snm><fnm>K</fnm></au><au><snm>Yoneda</snm><fnm>H</fnm></au><au><snm>Harada</snm><fnm>M</fnm></au><au><snm>Ishiura</snm><fnm>M</fnm></au></aug><source>Genes Genet Syst</source><pubdate>2004</pubdate><volume>79</volume><fpage>189</fpage><lpage>197</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1266/ggs.79.189</pubid><pubid idtype="pmpid" link="fulltext">15514438</pubid></pubidlist></xrefbib></bibl><bibl id="B19"><title><p>Assessing the need for sequence-based normalization in tiling microarray experiments</p></title><aug><au><snm>Royce</snm><fnm>TE</fnm></au><au><snm>Rozowsky</snm><fnm>JS</fnm></au><au><snm>Gerstein</snm><fnm>MB</fnm></au></aug><source>Bioinformatics</source><pubdate>2007</pubdate><volume>23</volume><fpage>988</fpage><lpage>997</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btm052</pubid><pubid idtype="pmpid" link="fulltext">17387113</pubid></pubidlist></xrefbib></bibl><bibl id="B20"><title><p>Removal of AU bias from microarray mRNA expression data enhances computational identification of active microRNAs</p></title><aug><au><snm>Elkon</snm><fnm>R</fnm></au><au><snm>Agami</snm><fnm>R</fnm></au></aug><source>PLoS Comput Biol</source><pubdate>2008</pubdate><volume>4</volume><fpage>e1000189</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pcbi.1000189</pubid><pubid idtype="pmcid">2533120</pubid><pubid idtype="pmpid">18833292</pubid></pubidlist></xrefbib></bibl><bibl 
id="B21"><title><p>Exploration, normalization, and summaries of high density oligonucleotide array probe level data</p></title><aug><au><snm>Irizarry</snm><fnm>RA</fnm></au><au><snm>Hobbs</snm><fnm>B</fnm></au><au><snm>Collin</snm><fnm>F</fnm></au><au><snm>Beazer-Barclay</snm><fnm>YD</fnm></au><au><snm>Antonellis</snm><fnm>KJ</fnm></au><au><snm>Scherf</snm><fnm>U</fnm></au><au><snm>Speed</snm><fnm>TP</fnm></au></aug><source>Biostatistics</source><pubdate>2003</pubdate><volume>4</volume><fpage>249</fpage><lpage>264</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/biostatistics/4.2.249</pubid><pubid idtype="pmpid" link="fulltext">12925520</pubid></pubidlist></xrefbib></bibl><bibl id="B22"><title><p>Comparison and calibration of transcriptome data from RNA-Seq and tiling arrays</p></title><aug><au><snm>Agarwal</snm><fnm>A</fnm></au><au><snm>Koppstein</snm><fnm>D</fnm></au><au><snm>Rozowsky</snm><fnm>J</fnm></au><au><snm>Sboner</snm><fnm>A</fnm></au><au><snm>Habegger</snm><fnm>L</fnm></au><au><snm>Hillier</snm><fnm>LW</fnm></au><au><snm>Sasidharan</snm><fnm>R</fnm></au><au><snm>Reinke</snm><fnm>V</fnm></au><au><snm>Waterston</snm><fnm>RH</fnm></au><au><snm>Gerstein</snm><fnm>M</fnm></au></aug><source>BMC Genomics</source><pubdate>2010</pubdate><volume>11</volume><fpage>383</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-11-383</pubid><pubid idtype="pmpid" link="fulltext">20565764</pubid></pubidlist></xrefbib></bibl><bibl id="B23"><title><p>A comparison of massively parallel nucleotide sequencing with oligonucleotide microarrays for global transcription profiling</p></title><aug><au><snm>Bradford</snm><fnm>JR</fnm></au><au><snm>Hey</snm><fnm>Y</fnm></au><au><snm>Yates</snm><fnm>T</fnm></au><au><snm>Li</snm><fnm>Y</fnm></au><au><snm>Pepper</snm><fnm>SD</fnm></au><au><snm>Miller</snm><fnm>CJ</fnm></au></aug><source>BMC Genomics</source><pubdate>2010</pubdate><volume>11</volume><fpage>282</fpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1186/1471-2164-11-282</pubid><pubid idtype="pmcid">2877694</pubid><pubid idtype="pmpid">20444259</pubid></pubidlist></xrefbib></bibl><bibl id="B24"><title><p>A comparison of microarray and MPSS technology platforms for expression analysis of <it>Arabidopsis</it></p></title><aug><au><snm>Chen</snm><fnm>J</fnm></au><au><snm>Agrawal</snm><fnm>V</fnm></au><au><snm>Rattray</snm><fnm>M</fnm></au><au><snm>West</snm><fnm>MA</fnm></au><au><snm>St Clair</snm><fnm>DA</fnm></au><au><snm>Michelmore</snm><fnm>RW</fnm></au><au><snm>Coughlan</snm><fnm>SJ</fnm></au><au><snm>Meyers</snm><fnm>BC</fnm></au></aug><source>BMC Genomics</source><pubdate>2007</pubdate><volume>8</volume><fpage>414</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-8-414</pubid><pubid idtype="pmcid">2190774</pubid><pubid idtype="pmpid">17997849</pubid></pubidlist></xrefbib></bibl><bibl id="B25"><title><p>A comparison of global gene expression measurement technologies in <it>Arabidopsis thaliana</it></p></title><aug><au><snm>Coughlan</snm><fnm>SJ</fnm></au><au><snm>Agrawal</snm><fnm>V</fnm></au><au><snm>Meyers</snm><fnm>B</fnm></au></aug><source>Comp Funct Genomics</source><pubdate>2004</pubdate><volume>5</volume><fpage>245</fpage><lpage>252</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/cfg.397</pubid><pubid idtype="pmcid">2447440</pubid><pubid idtype="pmpid">18629150</pubid></pubidlist></xrefbib></bibl><bibl id="B26"><title><p>Direct comparison of GeneChip and SAGE on the quantitative accuracy in transcript profiling analysis</p></title><aug><au><snm>Ishii</snm><fnm>M</fnm></au><au><snm>Hashimoto</snm><fnm>S</fnm></au><au><snm>Tsutsumi</snm><fnm>S</fnm></au><au><snm>Wada</snm><fnm>Y</fnm></au><au><snm>Matsushima</snm><fnm>K</fnm></au><au><snm>Kodama</snm><fnm>T</fnm></au><au><snm>Aburatani</snm><fnm>H</fnm></au></aug><source>Genomics</source><pubdate>2000</pubdate><volume>68</volume><fpage>136</fpage><lpage>143</lpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1006/geno.2000.6284</pubid><pubid idtype="pmpid" link="fulltext">10964511</pubid></pubidlist></xrefbib></bibl><bibl id="B27"><title><p>Comparison of hybridization-based and sequencing-based gene expression technologies on biological replicates</p></title><aug><au><snm>Liu</snm><fnm>F</fnm></au><au><snm>Jenssen</snm><fnm>TK</fnm></au><au><snm>Trimarchi</snm><fnm>J</fnm></au><au><snm>Punzo</snm><fnm>C</fnm></au><au><snm>Cepko</snm><fnm>CL</fnm></au><au><snm>Ohno-Machado</snm><fnm>L</fnm></au><au><snm>Hovig</snm><fnm>E</fnm></au><au><snm>Kuo</snm><fnm>WP</fnm></au></aug><source>BMC Genomics</source><pubdate>2007</pubdate><volume>8</volume><fpage>153</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-8-153</pubid><pubid idtype="pmcid">1899500</pubid><pubid idtype="pmpid">17555589</pubid></pubidlist></xrefbib></bibl><bibl id="B28"><title><p>Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms</p></title><aug><au><snm>'t Hoen</snm><fnm>PA</fnm></au><au><snm>Ariyurek</snm><fnm>Y</fnm></au><au><snm>Thygesen</snm><fnm>HH</fnm></au><au><snm>Vreugdenhil</snm><fnm>E</fnm></au><au><snm>Vossen</snm><fnm>RH</fnm></au><au><snm>de Menezes</snm><fnm>RX</fnm></au><au><snm>Boer</snm><fnm>JM</fnm></au><au><snm>van Ommen</snm><fnm>GJ</fnm></au><au><snm>den Dunnen</snm><fnm>JT</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2008</pubdate><volume>36</volume><fpage>e141</fpage><xrefbib><pubidlist><pubid idtype="pmcid">2588528</pubid><pubid idtype="pmpid">18927111</pubid></pubidlist></xrefbib></bibl><bibl id="B29"><title><p>Analyzing high-density oligonucleotide gene expression array data</p></title><aug><au><snm>Schadt</snm><fnm>EE</fnm></au><au><snm>Li</snm><fnm>C</fnm></au><au><snm>Su</snm><fnm>C</fnm></au><au><snm>Wong</snm><fnm>WH</fnm></au></aug><source>J Cell 
Biochem</source><pubdate>2000</pubdate><volume>80</volume><fpage>192</fpage><lpage>202</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/1097-4644(20010201)80:2&lt;192::AID-JCB50&gt;3.0.CO;2-W</pubid><pubid idtype="pmpid" link="fulltext">11074587</pubid></pubidlist></xrefbib></bibl><bibl id="B30"><title><p>Correcting for signal saturation errors in the analysis of microarray data</p></title><aug><au><snm>Hsiao</snm><fnm>LL</fnm></au><au><snm>Jensen</snm><fnm>RV</fnm></au><au><snm>Yoshida</snm><fnm>T</fnm></au><au><snm>Clark</snm><fnm>KE</fnm></au><au><snm>Blumenstock</snm><fnm>JE</fnm></au><au><snm>Gullans</snm><fnm>SR</fnm></au></aug><source>Biotechniques</source><pubdate>2002</pubdate><volume>32</volume><fpage>330</fpage><lpage>332</lpage><note>334, 336</note><xrefbib><pubid idtype="pmpid">11848410</pubid></xrefbib></bibl><bibl id="B31"><title><p>Comparison of Affymetrix GeneChip expression measures</p></title><aug><au><snm>Irizarry</snm><fnm>RA</fnm></au><au><snm>Wu</snm><fnm>Z</fnm></au><au><snm>Jaffee</snm><fnm>HA</fnm></au></aug><source>Bioinformatics</source><pubdate>2006</pubdate><volume>22</volume><fpage>789</fpage><lpage>794</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btk046</pubid><pubid idtype="pmpid" link="fulltext">16410320</pubid></pubidlist></xrefbib></bibl><bibl id="B32"><title><p>Empirical evaluation of data transformations and ranking statistics for microarray analysis</p></title><aug><au><snm>Qin</snm><fnm>LX</fnm></au><au><snm>Kerr</snm><fnm>KF</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2004</pubdate><volume>32</volume><fpage>5471</fpage><lpage>5479</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkh866</pubid><pubid idtype="pmcid">524279</pubid><pubid idtype="pmpid">15479783</pubid></pubidlist></xrefbib></bibl><bibl id="B33"><title><p>A comparison of background correction methods for two-colour 
microarrays</p></title><aug><au><snm>Ritchie</snm><fnm>ME</fnm></au><au><snm>Silver</snm><fnm>J</fnm></au><au><snm>Oshlack</snm><fnm>A</fnm></au><au><snm>Holmes</snm><fnm>M</fnm></au><au><snm>Diyagama</snm><fnm>D</fnm></au><au><snm>Holloway</snm><fnm>A</fnm></au><au><snm>Smyth</snm><fnm>GK</fnm></au></aug><source>Bioinformatics</source><pubdate>2007</pubdate><volume>23</volume><fpage>2700</fpage><lpage>2707</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btm412</pubid><pubid idtype="pmpid" link="fulltext">17720982</pubid></pubidlist></xrefbib></bibl><bibl id="B34"><title><p>RNA interference is mediated by 21- and 22-nucleotide RNAs</p></title><aug><au><snm>Elbashir</snm><fnm>SM</fnm></au><au><snm>Lendeckel</snm><fnm>W</fnm></au><au><snm>Tuschl</snm><fnm>T</fnm></au></aug><source>Genes Dev</source><pubdate>2001</pubdate><volume>15</volume><fpage>188</fpage><lpage>200</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gad.862301</pubid><pubid idtype="pmcid">312613</pubid><pubid idtype="pmpid">11157775</pubid></pubidlist></xrefbib></bibl><bibl id="B35"><title><p>Identification of novel genes coding for small expressed RNAs</p></title><aug><au><snm>Lagos-Quintana</snm><fnm>M</fnm></au><au><snm>Rauhut</snm><fnm>R</fnm></au><au><snm>Lendeckel</snm><fnm>W</fnm></au><au><snm>Tuschl</snm><fnm>T</fnm></au></aug><source>Science</source><pubdate>2001</pubdate><volume>294</volume><fpage>853</fpage><lpage>858</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1064921</pubid><pubid idtype="pmpid" link="fulltext">11679670</pubid></pubidlist></xrefbib></bibl><bibl id="B36"><title><p>An abundant class of tiny RNAs with probable regulatory roles in <it>Caenorhabditis 
elegans</it></p></title><aug><au><snm>Lau</snm><fnm>NC</fnm></au><au><snm>Lim</snm><fnm>LP</fnm></au><au><snm>Weinstein</snm><fnm>EG</fnm></au><au><snm>Bartel</snm><fnm>DP</fnm></au></aug><source>Science</source><pubdate>2001</pubdate><volume>294</volume><fpage>858</fpage><lpage>862</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1065062</pubid><pubid idtype="pmpid" link="fulltext">11679671</pubid></pubidlist></xrefbib></bibl><bibl id="B37"><title><p>An extensive class of small RNAs in <it>Caenorhabditis elegans</it></p></title><aug><au><snm>Lee</snm><fnm>RC</fnm></au><au><snm>Ambros</snm><fnm>V</fnm></au></aug><source>Science</source><pubdate>2001</pubdate><volume>294</volume><fpage>862</fpage><lpage>864</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1065329</pubid><pubid idtype="pmpid" link="fulltext">11679672</pubid></pubidlist></xrefbib></bibl><bibl id="B38"><title><p>MicroRNAs in plants</p></title><aug><au><snm>Reinhart</snm><fnm>BJ</fnm></au><au><snm>Weinstein</snm><fnm>EG</fnm></au><au><snm>Rhoades</snm><fnm>MW</fnm></au><au><snm>Bartel</snm><fnm>B</fnm></au><au><snm>Bartel</snm><fnm>DP</fnm></au></aug><source>Genes Dev</source><pubdate>2002</pubdate><volume>16</volume><fpage>1616</fpage><lpage>1626</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gad.1004402</pubid><pubid idtype="pmcid">186362</pubid><pubid idtype="pmpid">12101121</pubid></pubidlist></xrefbib></bibl><bibl id="B39"><title><p>Endogenous and silencing-associated small RNAs in plants</p></title><aug><au><snm>Llave</snm><fnm>C</fnm></au><au><snm>Kasschau</snm><fnm>KD</fnm></au><au><snm>Rector</snm><fnm>MA</fnm></au><au><snm>Carrington</snm><fnm>JC</fnm></au></aug><source>Plant Cell</source><pubdate>2002</pubdate><volume>14</volume><fpage>1605</fpage><lpage>1619</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1105/tpc.003210</pubid><pubid idtype="pmcid">150710</pubid><pubid idtype="pmpid">12119378</pubid></pubidlist></xrefbib></bibl><bibl 
id="B40"><title><p>The <it>C. elegans </it>heterochronic gene <it>lin-4 </it>encodes small RNAs with antisense complementarity to <it>lin-14</it></p></title><aug><au><snm>Lee</snm><fnm>RC</fnm></au><au><snm>Feinbaum</snm><fnm>RL</fnm></au><au><snm>Ambros</snm><fnm>V</fnm></au></aug><source>Cell</source><pubdate>1993</pubdate><volume>75</volume><fpage>843</fpage><lpage>854</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0092-8674(93)90529-Y</pubid><pubid idtype="pmpid" link="fulltext">8252621</pubid></pubidlist></xrefbib></bibl><bibl id="B41"><title><p>A microRNA in a multiple-turnover RNAi enzyme complex</p></title><aug><au><snm>Hutvagner</snm><fnm>G</fnm></au><au><snm>Zamore</snm><fnm>PD</fnm></au></aug><source>Science</source><pubdate>2002</pubdate><volume>297</volume><fpage>2056</fpage><lpage>2060</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1073827</pubid><pubid idtype="pmpid" link="fulltext">12154197</pubid></pubidlist></xrefbib></bibl><bibl id="B42"><title><p>Cleavage of <it>Scarecrow-like </it>mRNA targets directed by a class of <it>Arabidopsis </it>miRNA</p></title><aug><au><snm>Llave</snm><fnm>C</fnm></au><au><snm>Xie</snm><fnm>Z</fnm></au><au><snm>Kasschau</snm><fnm>KD</fnm></au><au><snm>Carrington</snm><fnm>JC</fnm></au></aug><source>Science</source><pubdate>2002</pubdate><volume>297</volume><fpage>2053</fpage><lpage>2056</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1076311</pubid><pubid idtype="pmpid" link="fulltext">12242443</pubid></pubidlist></xrefbib></bibl><bibl id="B43"><title><p>Posttranscriptional regulation of the heterochronic gene <it>lin-14 </it>by <it>lin-4 </it>mediates temporal pattern formation in <it>C. 
elegans</it></p></title><aug><au><snm>Wightman</snm><fnm>B</fnm></au><au><snm>Ha</snm><fnm>I</fnm></au><au><snm>Ruvkun</snm><fnm>G</fnm></au></aug><source>Cell</source><pubdate>1993</pubdate><volume>75</volume><fpage>855</fpage><lpage>862</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0092-8674(93)90530-4</pubid><pubid idtype="pmpid" link="fulltext">8252622</pubid></pubidlist></xrefbib></bibl><bibl id="B44"><title><p>Potent and specific genetic interference by double-stranded RNA in <it>Caenorhabditis elegans</it></p></title><aug><au><snm>Fire</snm><fnm>A</fnm></au><au><snm>Xu</snm><fnm>S</fnm></au><au><snm>Montgomery</snm><fnm>MK</fnm></au><au><snm>Kostas</snm><fnm>SA</fnm></au><au><snm>Driver</snm><fnm>SE</fnm></au><au><snm>Mello</snm><fnm>CC</fnm></au></aug><source>Nature</source><pubdate>1998</pubdate><volume>391</volume><fpage>806</fpage><lpage>811</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/35888</pubid><pubid idtype="pmpid" link="fulltext">9486653</pubid></pubidlist></xrefbib></bibl><bibl id="B45"><title><p>A species of small antisense RNA in posttranscriptional gene silencing in plants</p></title><aug><au><snm>Hamilton</snm><fnm>AJ</fnm></au><au><snm>Baulcombe</snm><fnm>DC</fnm></au></aug><source>Science</source><pubdate>1999</pubdate><volume>286</volume><fpage>950</fpage><lpage>952</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.286.5441.950</pubid><pubid idtype="pmpid" link="fulltext">10542148</pubid></pubidlist></xrefbib></bibl><bibl id="B46"><title><p>RNAi: double-stranded RNA directs the ATP-dependent cleavage of mRNA at 21 to 23 nucleotide intervals</p></title><aug><au><snm>Zamore</snm><fnm>PD</fnm></au><au><snm>Tuschl</snm><fnm>T</fnm></au><au><snm>Sharp</snm><fnm>PA</fnm></au><au><snm>Bartel</snm><fnm>DP</fnm></au></aug><source>Cell</source><pubdate>2000</pubdate><volume>101</volume><fpage>25</fpage><lpage>33</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0092-8674(00)80620-0</pubid><pubid idtype="pmpid" 
link="fulltext">10778853</pubid></pubidlist></xrefbib></bibl><bibl id="B47"><title><p>Functional proteomics reveals the biochemical niche of <it>C. elegans </it>DCR-1 in multiple small-RNA-mediated pathways</p></title><aug><au><snm>Duchaine</snm><fnm>TF</fnm></au><au><snm>Wohlschlegel</snm><fnm>JA</fnm></au><au><snm>Kennedy</snm><fnm>S</fnm></au><au><snm>Bei</snm><fnm>Y</fnm></au><au><snm>Conte</snm><fnm>D</fnm><suf>Jr</suf></au><au><snm>Pang</snm><fnm>K</fnm></au><au><snm>Brownell</snm><fnm>DR</fnm></au><au><snm>Harding</snm><fnm>S</fnm></au><au><snm>Mitani</snm><fnm>S</fnm></au><au><snm>Ruvkun</snm><fnm>G</fnm></au><au><snm>Yates</snm><fnm>JR</fnm></au><au><snm>Mello</snm><fnm>C</fnm></au></aug><source>Cell</source><pubdate>2006</pubdate><volume>124</volume><fpage>343</fpage><lpage>354</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.cell.2005.11.036</pubid><pubid idtype="pmpid" link="fulltext">16439208</pubid></pubidlist></xrefbib></bibl><bibl id="B48"><title><p>Interacting endogenous and exogenous RNAi pathways in <it>Caenorhabditis elegans</it></p></title><aug><au><snm>Lee</snm><fnm>RC</fnm></au><au><snm>Hammell</snm><fnm>CM</fnm></au><au><snm>Ambros</snm><fnm>V</fnm></au></aug><source>RNA</source><pubdate>2006</pubdate><volume>12</volume><fpage>589</fpage><lpage>597</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1261/rna.2231506</pubid><pubid idtype="pmcid">1421084</pubid><pubid idtype="pmpid">16489184</pubid></pubidlist></xrefbib></bibl><bibl id="B49"><title><p>On the role of RNA amplification in dsRNA-triggered gene 
silencing</p></title><aug><au><snm>Sijen</snm><fnm>T</fnm></au><au><snm>Fleenor</snm><fnm>J</fnm></au><au><snm>Simmer</snm><fnm>F</fnm></au><au><snm>Thijssen</snm><fnm>KL</fnm></au><au><snm>Parrish</snm><fnm>S</fnm></au><au><snm>Timmons</snm><fnm>L</fnm></au><au><snm>Plasterk</snm><fnm>RH</fnm></au><au><snm>Fire</snm><fnm>A</fnm></au></aug><source>Cell</source><pubdate>2001</pubdate><volume>107</volume><fpage>465</fpage><lpage>476</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0092-8674(01)00576-1</pubid><pubid idtype="pmpid" link="fulltext">11719187</pubid></pubidlist></xrefbib></bibl><bibl id="B50"><title><p>Spreading of RNA targeting and DNA methylation in RNA silencing requires transcription of the target gene and a putative RNA-dependent RNA polymerase</p></title><aug><au><snm>Vaistij</snm><fnm>FE</fnm></au><au><snm>Jones</snm><fnm>L</fnm></au><au><snm>Baulcombe</snm><fnm>DC</fnm></au></aug><source>Plant Cell</source><pubdate>2002</pubdate><volume>14</volume><fpage>857</fpage><lpage>867</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1105/tpc.010480</pubid><pubid idtype="pmcid">150687</pubid><pubid idtype="pmpid">11971140</pubid></pubidlist></xrefbib></bibl><bibl id="B51"><title><p>Distinct populations of primary and secondary effectors during RNAi in <it>C. 
elegans</it></p></title><aug><au><snm>Pak</snm><fnm>J</fnm></au><au><snm>Fire</snm><fnm>A</fnm></au></aug><source>Science</source><pubdate>2007</pubdate><volume>315</volume><fpage>241</fpage><lpage>244</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1132839</pubid><pubid idtype="pmpid" link="fulltext">17124291</pubid></pubidlist></xrefbib></bibl><bibl id="B52"><title><p>Secondary siRNAs result from unprimed RNA synthesis and form a distinct class</p></title><aug><au><snm>Sijen</snm><fnm>T</fnm></au><au><snm>Steiner</snm><fnm>FA</fnm></au><au><snm>Thijssen</snm><fnm>KL</fnm></au><au><snm>Plasterk</snm><fnm>RH</fnm></au></aug><source>Science</source><pubdate>2007</pubdate><volume>315</volume><fpage>244</fpage><lpage>247</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1136699</pubid><pubid idtype="pmpid" link="fulltext">17158288</pubid></pubidlist></xrefbib></bibl><bibl id="B53"><title><p>Transcriptional silencing and promoter methylation triggered by double-stranded RNA</p></title><aug><au><snm>Mette</snm><fnm>MF</fnm></au><au><snm>Aufsatz</snm><fnm>W</fnm></au><au><snm>van der Winden</snm><fnm>J</fnm></au><au><snm>Matzke</snm><fnm>MA</fnm></au><au><snm>Matzke</snm><fnm>AJ</fnm></au></aug><source>EMBO J</source><pubdate>2000</pubdate><volume>19</volume><fpage>5194</fpage><lpage>5201</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/emboj/19.19.5194</pubid><pubid idtype="pmcid">302106</pubid><pubid idtype="pmpid">11013221</pubid></pubidlist></xrefbib></bibl><bibl id="B54"><title><p>Two classes of short interfering RNA in RNA silencing</p></title><aug><au><snm>Hamilton</snm><fnm>A</fnm></au><au><snm>Voinnet</snm><fnm>O</fnm></au><au><snm>Chappell</snm><fnm>L</fnm></au><au><snm>Baulcombe</snm><fnm>D</fnm></au></aug><source>EMBO J</source><pubdate>2002</pubdate><volume>21</volume><fpage>4671</fpage><lpage>4679</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/emboj/cdf464</pubid><pubid idtype="pmcid">125409</pubid><pubid 
idtype="pmpid">12198169</pubid></pubidlist></xrefbib></bibl><bibl id="B55"><title><p>Small RNAs correspond to centromere heterochromatic repeats</p></title><aug><au><snm>Reinhart</snm><fnm>BJ</fnm></au><au><snm>Bartel</snm><fnm>DP</fnm></au></aug><source>Science</source><pubdate>2002</pubdate><volume>297</volume><fpage>1831</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1077183</pubid><pubid idtype="pmpid" link="fulltext">12193644</pubid></pubidlist></xrefbib></bibl><bibl id="B56"><title><p>Endogenous siRNAs derived from a pair of natural <it>cis</it>-antisense transcripts regulate salt tolerance in <it>Arabidopsis</it></p></title><aug><au><snm>Borsani</snm><fnm>O</fnm></au><au><snm>Zhu</snm><fnm>J</fnm></au><au><snm>Verslues</snm><fnm>PE</fnm></au><au><snm>Sunkar</snm><fnm>R</fnm></au><au><snm>Zhu</snm><fnm>JK</fnm></au></aug><source>Cell</source><pubdate>2005</pubdate><volume>123</volume><fpage>1279</fpage><lpage>1291</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.cell.2005.11.035</pubid><pubid idtype="pmpid" link="fulltext">16377568</pubid></pubidlist></xrefbib></bibl><bibl id="B57"><title><p>Processing of naturally occurring sense/antisense transcripts of the vertebrate Slc34a gene into short RNAs</p></title><aug><au><snm>Carlile</snm><fnm>M</fnm></au><au><snm>Nalbant</snm><fnm>P</fnm></au><au><snm>Preston-Fayers</snm><fnm>K</fnm></au><au><snm>McHaffie</snm><fnm>GS</fnm></au><au><snm>Werner</snm><fnm>A</fnm></au></aug><source>Physiol Genomics</source><pubdate>2008</pubdate><volume>34</volume><fpage>95</fpage><lpage>100</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1152/physiolgenomics.00004.2008</pubid><pubid idtype="pmpid" link="fulltext">18413783</pubid></pubidlist></xrefbib></bibl><bibl id="B58"><title><p>An endogenous small interfering RNA pathway in 
<it>Drosophila</it></p></title><aug><au><snm>Czech</snm><fnm>B</fnm></au><au><snm>Malone</snm><fnm>CD</fnm></au><au><snm>Zhou</snm><fnm>R</fnm></au><au><snm>Stark</snm><fnm>A</fnm></au><au><snm>Schlingeheyde</snm><fnm>C</fnm></au><au><snm>Dus</snm><fnm>M</fnm></au><au><snm>Perrimon</snm><fnm>N</fnm></au><au><snm>Kellis</snm><fnm>M</fnm></au><au><snm>Wohlschlegel</snm><fnm>JA</fnm></au><au><snm>Sachidanandam</snm><fnm>R</fnm></au><au><snm>Hannon</snm><fnm>GJ</fnm></au><au><snm>Brennecke</snm><fnm>J</fnm></au></aug><source>Nature</source><pubdate>2008</pubdate><volume>453</volume><fpage>798</fpage><lpage>802</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature07007</pubid><pubid idtype="pmcid">2895258</pubid><pubid idtype="pmpid">18463631</pubid></pubidlist></xrefbib></bibl><bibl id="B59"><title><p>Endogenous siRNAs from naturally formed dsRNAs regulate transcripts in mouse oocytes</p></title><aug><au><snm>Watanabe</snm><fnm>T</fnm></au><au><snm>Totoki</snm><fnm>Y</fnm></au><au><snm>Toyoda</snm><fnm>A</fnm></au><au><snm>Kaneda</snm><fnm>M</fnm></au><au><snm>Kuramochi-Miyagawa</snm><fnm>S</fnm></au><au><snm>Obata</snm><fnm>Y</fnm></au><au><snm>Chiba</snm><fnm>H</fnm></au><au><snm>Kohara</snm><fnm>Y</fnm></au><au><snm>Kono</snm><fnm>T</fnm></au><au><snm>Nakano</snm><fnm>T</fnm></au><au><snm>Surani</snm><fnm>MA</fnm></au><au><snm>Sakaki</snm><fnm>Y</fnm></au><au><snm>Sasaki</snm><fnm>H</fnm></au></aug><source>Nature</source><pubdate>2008</pubdate><volume>453</volume><fpage>539</fpage><lpage>543</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature06908</pubid><pubid idtype="pmpid" link="fulltext">18404146</pubid></pubidlist></xrefbib></bibl><bibl id="B60"><title><p>Two distinct mechanisms generate endogenous siRNAs from bidirectional transcription in <it>Drosophila 
melanogaster</it></p></title><aug><au><snm>Okamura</snm><fnm>K</fnm></au><au><snm>Balla</snm><fnm>S</fnm></au><au><snm>Martin</snm><fnm>R</fnm></au><au><snm>Liu</snm><fnm>N</fnm></au><au><snm>Lai</snm><fnm>EC</fnm></au></aug><source>Nat Struct Mol Biol</source><pubdate>2008</pubdate><volume>15</volume><fpage>998</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nsmb0908-998c</pubid><pubid idtype="pmpid" link="fulltext">18769470</pubid></pubidlist></xrefbib></bibl><bibl id="B61"><title><p><it>SGS3 </it>and <it>SGS2</it>/<it>SDE1</it>/<it>RDR6 </it>are required for juvenile development and the production of <it>trans</it>-acting siRNAs in <it>Arabidopsis</it></p></title><aug><au><snm>Peragine</snm><fnm>A</fnm></au><au><snm>Yoshikawa</snm><fnm>M</fnm></au><au><snm>Wu</snm><fnm>G</fnm></au><au><snm>Albrecht</snm><fnm>HL</fnm></au><au><snm>Poethig</snm><fnm>RS</fnm></au></aug><source>Genes Dev</source><pubdate>2004</pubdate><volume>18</volume><fpage>2368</fpage><lpage>2379</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gad.1231804</pubid><pubid idtype="pmcid">522987</pubid><pubid idtype="pmpid">15466488</pubid></pubidlist></xrefbib></bibl><bibl id="B62"><title><p>Endogenous <it>trans</it>-acting siRNAs regulate the accumulation of <it>Arabidopsis </it>mRNAs</p></title><aug><au><snm>Vazquez</snm><fnm>F</fnm></au><au><snm>Vaucheret</snm><fnm>H</fnm></au><au><snm>Rajagopalan</snm><fnm>R</fnm></au><au><snm>Lepers</snm><fnm>C</fnm></au><au><snm>Gasciolli</snm><fnm>V</fnm></au><au><snm>Mallory</snm><fnm>AC</fnm></au><au><snm>Hilbert</snm><fnm>JL</fnm></au><au><snm>Bartel</snm><fnm>DP</fnm></au><au><snm>Crete</snm><fnm>P</fnm></au></aug><source>Mol Cell</source><pubdate>2004</pubdate><volume>16</volume><fpage>69</fpage><lpage>79</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.molcel.2004.09.028</pubid><pubid idtype="pmpid" link="fulltext">15469823</pubid></pubidlist></xrefbib></bibl><bibl id="B63"><title><p>A pathway for the biogenesis of 
<it>trans</it>-acting siRNAs in <it>Arabidopsis</it></p></title><aug><au><snm>Yoshikawa</snm><fnm>M</fnm></au><au><snm>Peragine</snm><fnm>A</fnm></au><au><snm>Park</snm><fnm>MY</fnm></au><au><snm>Poethig</snm><fnm>RS</fnm></au></aug><source>Genes Dev</source><pubdate>2005</pubdate><volume>19</volume><fpage>2164</fpage><lpage>2175</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gad.1352605</pubid><pubid idtype="pmcid">1221887</pubid><pubid idtype="pmpid">16131612</pubid></pubidlist></xrefbib></bibl><bibl id="B64"><title><p>microRNA-directed phasing during <it>trans</it>-acting siRNA biogenesis in plants</p></title><aug><au><snm>Allen</snm><fnm>E</fnm></au><au><snm>Xie</snm><fnm>Z</fnm></au><au><snm>Gustafson</snm><fnm>AM</fnm></au><au><snm>Carrington</snm><fnm>JC</fnm></au></aug><source>Cell</source><pubdate>2005</pubdate><volume>121</volume><fpage>207</fpage><lpage>221</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.cell.2005.04.004</pubid><pubid idtype="pmpid" link="fulltext">15851028</pubid></pubidlist></xrefbib></bibl><bibl id="B65"><title><p>A novel class of small RNAs bind to MILI protein in mouse testes</p></title><aug><au><snm>Aravin</snm><fnm>A</fnm></au><au><snm>Gaidatzis</snm><fnm>D</fnm></au><au><snm>Pfeffer</snm><fnm>S</fnm></au><au><snm>Lagos-Quintana</snm><fnm>M</fnm></au><au><snm>Landgraf</snm><fnm>P</fnm></au><au><snm>Iovino</snm><fnm>N</fnm></au><au><snm>Morris</snm><fnm>P</fnm></au><au><snm>Brownstein</snm><fnm>MJ</fnm></au><au><snm>Kuramochi-Miyagawa</snm><fnm>S</fnm></au><au><snm>Nakano</snm><fnm>T</fnm></au><au><snm>Chien</snm><fnm>M</fnm></au><au><snm>Russo</snm><fnm>JJ</fnm></au><au><snm>Ju</snm><fnm>J</fnm></au><au><snm>Sheridan</snm><fnm>R</fnm></au><au><snm>Sander</snm><fnm>C</fnm></au><au><snm>Zavolan</snm><fnm>M</fnm></au><au><snm>Tuschl</snm><fnm>T</fnm></au></aug><source>Nature</source><pubdate>2006</pubdate><volume>442</volume><fpage>203</fpage><lpage>207</lpage><xrefbib><pubid idtype="pmpid" 
link="fulltext">16751777</pubid></xrefbib></bibl><bibl id="B66"><title><p>A germline-specific class of small RNAs binds mammalian Piwi proteins</p></title><aug><au><snm>Girard</snm><fnm>A</fnm></au><au><snm>Sachidanandam</snm><fnm>R</fnm></au><au><snm>Hannon</snm><fnm>GJ</fnm></au><au><snm>Carmell</snm><fnm>MA</fnm></au></aug><source>Nature</source><pubdate>2006</pubdate><volume>442</volume><fpage>199</fpage><lpage>202</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">16751776</pubid></xrefbib></bibl><bibl id="B67"><title><p>A novel class of small RNAs in mouse spermatogenic cells</p></title><aug><au><snm>Grivna</snm><fnm>ST</fnm></au><au><snm>Beyret</snm><fnm>E</fnm></au><au><snm>Wang</snm><fnm>Z</fnm></au><au><snm>Lin</snm><fnm>H</fnm></au></aug><source>Genes Dev</source><pubdate>2006</pubdate><volume>20</volume><fpage>1709</fpage><lpage>1714</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gad.1434406</pubid><pubid idtype="pmcid">1522066</pubid><pubid idtype="pmpid">16766680</pubid></pubidlist></xrefbib></bibl><bibl id="B68"><title><p>Characterization of the piRNA complex from rat testes</p></title><aug><au><snm>Lau</snm><fnm>NC</fnm></au><au><snm>Seto</snm><fnm>AG</fnm></au><au><snm>Kim</snm><fnm>J</fnm></au><au><snm>Kuramochi-Miyagawa</snm><fnm>S</fnm></au><au><snm>Nakano</snm><fnm>T</fnm></au><au><snm>Bartel</snm><fnm>DP</fnm></au><au><snm>Kingston</snm><fnm>RE</fnm></au></aug><source>Science</source><pubdate>2006</pubdate><volume>313</volume><fpage>363</fpage><lpage>367</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1130164</pubid><pubid idtype="pmpid" link="fulltext">16778019</pubid></pubidlist></xrefbib></bibl><bibl id="B69"><title><p>Identification and characterization of two novel classes of small RNAs in the mouse germline: retrotransposon-derived siRNAs in oocytes and germline small RNAs in 
testes</p></title><aug><au><snm>Watanabe</snm><fnm>T</fnm></au><au><snm>Takeda</snm><fnm>A</fnm></au><au><snm>Tsukiyama</snm><fnm>T</fnm></au><au><snm>Mise</snm><fnm>K</fnm></au><au><snm>Okuno</snm><fnm>T</fnm></au><au><snm>Sasaki</snm><fnm>H</fnm></au><au><snm>Minami</snm><fnm>N</fnm></au><au><snm>Imai</snm><fnm>H</fnm></au></aug><source>Genes Dev</source><pubdate>2006</pubdate><volume>20</volume><fpage>1732</fpage><lpage>1743</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gad.1425706</pubid><pubid idtype="pmcid">1522070</pubid><pubid idtype="pmpid">16766679</pubid></pubidlist></xrefbib></bibl><bibl id="B70"><title><p>Specific association of Piwi with rasiRNAs derived from retrotransposon and heterochromatic regions in the <it>Drosophila </it>genome</p></title><aug><au><snm>Saito</snm><fnm>K</fnm></au><au><snm>Nishida</snm><fnm>KM</fnm></au><au><snm>Mori</snm><fnm>T</fnm></au><au><snm>Kawamura</snm><fnm>Y</fnm></au><au><snm>Miyoshi</snm><fnm>K</fnm></au><au><snm>Nagami</snm><fnm>T</fnm></au><au><snm>Siomi</snm><fnm>H</fnm></au><au><snm>Siomi</snm><fnm>MC</fnm></au></aug><source>Genes Dev</source><pubdate>2006</pubdate><volume>20</volume><fpage>2214</fpage><lpage>2222</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gad.1454806</pubid><pubid idtype="pmcid">1553205</pubid><pubid idtype="pmpid">16882972</pubid></pubidlist></xrefbib></bibl><bibl id="B71"><title><p>A distinct small RNA pathway silences selfish genetic elements in the germline</p></title><aug><au><snm>Vagin</snm><fnm>VV</fnm></au><au><snm>Sigova</snm><fnm>A</fnm></au><au><snm>Li</snm><fnm>C</fnm></au><au><snm>Seitz</snm><fnm>H</fnm></au><au><snm>Gvozdev</snm><fnm>V</fnm></au><au><snm>Zamore</snm><fnm>PD</fnm></au></aug><source>Science</source><pubdate>2006</pubdate><volume>313</volume><fpage>320</fpage><lpage>324</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1129333</pubid><pubid idtype="pmpid" link="fulltext">16809489</pubid></pubidlist></xrefbib></bibl><bibl 
id="B72"><title><p>A novel class of small RNAs: tRNA-derived RNA fragments (tRFs)</p></title><aug><au><snm>Lee</snm><fnm>YS</fnm></au><au><snm>Shibata</snm><fnm>Y</fnm></au><au><snm>Malhotra</snm><fnm>A</fnm></au><au><snm>Dutta</snm><fnm>A</fnm></au></aug><source>Genes Dev</source><pubdate>2009</pubdate><volume>23</volume><fpage>2639</fpage><lpage>2649</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gad.1837609</pubid><pubid idtype="pmcid">2779758</pubid><pubid idtype="pmpid">19933153</pubid></pubidlist></xrefbib></bibl><bibl id="B73"><title><p>Filtering of deep sequencing data reveals the existence of abundant Dicer-dependent small RNAs derived from tRNAs</p></title><aug><au><snm>Cole</snm><fnm>C</fnm></au><au><snm>Sobala</snm><fnm>A</fnm></au><au><snm>Lu</snm><fnm>C</fnm></au><au><snm>Thatcher</snm><fnm>SR</fnm></au><au><snm>Bowman</snm><fnm>A</fnm></au><au><snm>Brown</snm><fnm>JW</fnm></au><au><snm>Green</snm><fnm>PJ</fnm></au><au><snm>Barton</snm><fnm>GJ</fnm></au><au><snm>Hutvagner</snm><fnm>G</fnm></au></aug><source>RNA</source><pubdate>2009</pubdate><volume>15</volume><fpage>2147</fpage><lpage>2160</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1261/rna.1738409</pubid><pubid idtype="pmcid">2779667</pubid><pubid idtype="pmpid">19850906</pubid></pubidlist></xrefbib></bibl><bibl id="B74"><title><p>tRNA cleavage is a conserved response to oxidative stress in eukaryotes</p></title><aug><au><snm>Thompson</snm><fnm>DM</fnm></au><au><snm>Lu</snm><fnm>C</fnm></au><au><snm>Green</snm><fnm>PJ</fnm></au><au><snm>Parker</snm><fnm>R</fnm></au></aug><source>RNA</source><pubdate>2008</pubdate><volume>14</volume><fpage>2095</fpage><lpage>2103</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1261/rna.1232808</pubid><pubid idtype="pmcid">2553748</pubid><pubid idtype="pmpid">18719243</pubid></pubidlist></xrefbib></bibl><bibl id="B75"><title><p>Human mitochondrial tRNA<sup>Met </sup>is exported to the cytoplasm and associates with the Argonaute 2 
protein</p></title><aug><au><snm>Maniataki</snm><fnm>E</fnm></au><au><snm>Mourelatos</snm><fnm>Z</fnm></au></aug><source>RNA</source><pubdate>2005</pubdate><volume>11</volume><fpage>849</fpage><lpage>852</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1261/rna.2210805</pubid><pubid idtype="pmcid">1370769</pubid><pubid idtype="pmpid">15872185</pubid></pubidlist></xrefbib></bibl><bibl id="B76"><title><p>Cell-type specific analysis of translating RNAs in developing flowers reveals new levels of control</p></title><aug><au><snm>Jiao</snm><fnm>Y</fnm></au><au><snm>Meyerowitz</snm><fnm>EM</fnm></au></aug><source>Mol Syst Biol</source><pubdate>2010</pubdate><volume>6</volume><fpage>419</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/msb.2010.76</pubid><pubid idtype="pmcid">2990639</pubid><pubid idtype="pmpid">20924354</pubid></pubidlist></xrefbib></bibl><bibl id="B77"><title><p>Genome sequencing in microfabricated high-density picolitre reactors</p></title><aug><au><snm>Margulies</snm><fnm>M</fnm></au><au><snm>Egholm</snm><fnm>M</fnm></au><au><snm>Altman</snm><fnm>WE</fnm></au><au><snm>Attiya</snm><fnm>S</fnm></au><au><snm>Bader</snm><fnm>JS</fnm></au><au><snm>Bemben</snm><fnm>LA</fnm></au><au><snm>Berka</snm><fnm>J</fnm></au><au><snm>Braverman</snm><fnm>MS</fnm></au><au><snm>Chen</snm><fnm>YJ</fnm></au><au><snm>Chen</snm><fnm>Z</fnm></au><au><snm>Dewell</snm><fnm>SB</fnm></au><au><snm>Du</snm><fnm>L</fnm></au><au><snm>Fierro</snm><fnm>JM</fnm></au><au><snm>Gomes</snm><fnm>XV</fnm></au><au><snm>Godwin</snm><fnm>BC</fnm></au><au><snm>He</snm><fnm>W</fnm></au><au><snm>Helgesen</snm><fnm>S</fnm></au><au><snm>Ho</snm><fnm>CH</fnm></au><au><snm>Irzyk</snm><fnm>GP</fnm></au><au><snm>Jando</snm><fnm>SC</fnm></au><au><snm>Alenquer</snm><fnm>ML</fnm></au><au><snm>Jarvie</snm><fnm>TP</fnm></au><au><snm>Jirage</snm><fnm>KB</fnm></au><au><snm>Kim</snm><fnm>JB</fnm></au><au><snm>Knight</snm><fnm>JR</fnm></au><au><snm>Lanza</snm><fnm>JR</fnm></au><au><snm>Leamon</snm><fnm>JH</fnm></au>
<au><snm>Lefkowitz</snm><fnm>SM</fnm></au><au><snm>Lei</snm><fnm>M</fnm></au><au><snm>Li</snm><fnm>J</fnm></au><etal/></aug><source>Nature</source><pubdate>2005</pubdate><volume>437</volume><fpage>376</fpage><lpage>380</lpage><xrefbib><pubidlist><pubid idtype="pmcid">1464427</pubid><pubid idtype="pmpid">16056220</pubid></pubidlist></xrefbib></bibl><bibl id="B78"><title><p>Single-molecule DNA sequencing of a viral genome</p></title><aug><au><snm>Harris</snm><fnm>TD</fnm></au><au><snm>Buzby</snm><fnm>PR</fnm></au><au><snm>Babcock</snm><fnm>H</fnm></au><au><snm>Beer</snm><fnm>E</fnm></au><au><snm>Bowers</snm><fnm>J</fnm></au><au><snm>Braslavsky</snm><fnm>I</fnm></au><au><snm>Causey</snm><fnm>M</fnm></au><au><snm>Colonell</snm><fnm>J</fnm></au><au><snm>Dimeo</snm><fnm>J</fnm></au><au><snm>Efcavitch</snm><fnm>JW</fnm></au><au><snm>Giladi</snm><fnm>E</fnm></au><au><snm>Gill</snm><fnm>J</fnm></au><au><snm>Healy</snm><fnm>J</fnm></au><au><snm>Jarosz</snm><fnm>M</fnm></au><au><snm>Lapen</snm><fnm>D</fnm></au><au><snm>Moulton</snm><fnm>K</fnm></au><au><snm>Quake</snm><fnm>SR</fnm></au><au><snm>Steinmann</snm><fnm>K</fnm></au><au><snm>Thayer</snm><fnm>E</fnm></au><au><snm>Tyurina</snm><fnm>A</fnm></au><au><snm>Ward</snm><fnm>R</fnm></au><au><snm>Weiss</snm><fnm>H</fnm></au><au><snm>Xie</snm><fnm>Z</fnm></au></aug><source>Science</source><pubdate>2008</pubdate><volume>320</volume><fpage>106</fpage><lpage>109</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1150427</pubid><pubid idtype="pmpid" link="fulltext">18388294</pubid></pubidlist></xrefbib></bibl><bibl id="B79"><title><p>Quantification of the yeast transcriptome by single-molecule
sequencing</p></title><aug><au><snm>Lipson</snm><fnm>D</fnm></au><au><snm>Raz</snm><fnm>T</fnm></au><au><snm>Kieu</snm><fnm>A</fnm></au><au><snm>Jones</snm><fnm>DR</fnm></au><au><snm>Giladi</snm><fnm>E</fnm></au><au><snm>Thayer</snm><fnm>E</fnm></au><au><snm>Thompson</snm><fnm>JF</fnm></au><au><snm>Letovsky</snm><fnm>S</fnm></au><au><snm>Milos</snm><fnm>P</fnm></au><au><snm>Causey</snm><fnm>M</fnm></au></aug><source>Nat Biotechnol</source><pubdate>2009</pubdate><volume>27</volume><fpage>652</fpage><lpage>658</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nbt.1551</pubid><pubid idtype="pmpid" link="fulltext">19581875</pubid></pubidlist></xrefbib></bibl><bibl id="B80"><title><p>Real-time DNA sequencing from single polymerase molecules</p></title><aug><au><snm>Eid</snm><fnm>J</fnm></au><au><snm>Fehr</snm><fnm>A</fnm></au><au><snm>Gray</snm><fnm>J</fnm></au><au><snm>Luong</snm><fnm>K</fnm></au><au><snm>Lyle</snm><fnm>J</fnm></au><au><snm>Otto</snm><fnm>G</fnm></au><au><snm>Peluso</snm><fnm>P</fnm></au><au><snm>Rank</snm><fnm>D</fnm></au><au><snm>Baybayan</snm><fnm>P</fnm></au><au><snm>Bettman</snm><fnm>B</fnm></au><au><snm>Bibillo</snm><fnm>A</fnm></au><au><snm>Bjornson</snm><fnm>K</fnm></au><au><snm>Chaudhuri</snm><fnm>B</fnm></au><au><snm>Christians</snm><fnm>F</fnm></au><au><snm>Cicero</snm><fnm>R</fnm></au><au><snm>Clark</snm><fnm>S</fnm></au><au><snm>Dalal</snm><fnm>R</fnm></au><au><snm>Dewinter</snm><fnm>A</fnm></au><au><snm>Dixon</snm><fnm>J</fnm></au><au><snm>Foquet</snm><fnm>M</fnm></au><au><snm>Gaertner</snm><fnm>A</fnm></au><au><snm>Hardenbol</snm><fnm>P</fnm></au><au><snm>Heiner</snm><fnm>C</fnm></au><au><snm>Hester</snm><fnm>K</fnm></au><au><snm>Holden</snm><fnm>D</fnm></au><au><snm>Kearns</snm><fnm>G</fnm></au><au><snm>Kong</snm><fnm>X</fnm></au><au><snm>Kuse</snm><fnm>R</fnm></au><au><snm>Lacroix</snm><fnm>Y</fnm></au><au><snm>Lin</snm><fnm>S</fnm></au><etal/></aug><source>Science</source><pubdate>2009</pubdate><volume>323</volume><fpage>133</fpage>
<lpage>138</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1162986</pubid><pubid idtype="pmpid" link="fulltext">19023044</pubid></pubidlist></xrefbib></bibl><bibl id="B81"><title><p>A high-resolution, nucleosome position map of <it>C. elegans </it>reveals a lack of universal sequence-dictated positioning</p></title><aug><au><snm>Valouev</snm><fnm>A</fnm></au><au><snm>Ichikawa</snm><fnm>J</fnm></au><au><snm>Tonthat</snm><fnm>T</fnm></au><au><snm>Stuart</snm><fnm>J</fnm></au><au><snm>Ranade</snm><fnm>S</fnm></au><au><snm>Peckham</snm><fnm>H</fnm></au><au><snm>Zeng</snm><fnm>K</fnm></au><au><snm>Malek</snm><fnm>JA</fnm></au><au><snm>Costa</snm><fnm>G</fnm></au><au><snm>McKernan</snm><fnm>K</fnm></au><au><snm>Sidow</snm><fnm>A</fnm></au><au><snm>Fire</snm><fnm>A</fnm></au><au><snm>Johnson</snm><fnm>SM</fnm></au></aug><source>Genome Res</source><pubdate>2008</pubdate><volume>18</volume><fpage>1051</fpage><lpage>1063</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.076463.108</pubid><pubid idtype="pmcid">2493394</pubid><pubid idtype="pmpid">18477713</pubid></pubidlist></xrefbib></bibl><bibl id="B82"><title><p>Accurate multiplex polony sequencing of an evolved bacterial genome</p></title><aug><au><snm>Shendure</snm><fnm>J</fnm></au><au><snm>Porreca</snm><fnm>GJ</fnm></au><au><snm>Reppas</snm><fnm>NB</fnm></au><au><snm>Lin</snm><fnm>X</fnm></au><au><snm>McCutcheon</snm><fnm>JP</fnm></au><au><snm>Rosenbaum</snm><fnm>AM</fnm></au><au><snm>Wang</snm><fnm>MD</fnm></au><au><snm>Zhang</snm><fnm>K</fnm></au><au><snm>Mitra</snm><fnm>RD</fnm></au><au><snm>Church</snm><fnm>GM</fnm></au></aug><source>Science</source><pubdate>2005</pubdate><volume>309</volume><fpage>1728</fpage><lpage>1732</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1117389</pubid><pubid idtype="pmpid" link="fulltext">16081699</pubid></pubidlist></xrefbib></bibl><bibl id="B83"><title><p>Identification of somatically acquired rearrangements in cancer using genome-wide
massively parallel paired-end sequencing</p></title><aug><au><snm>Campbell</snm><fnm>PJ</fnm></au><au><snm>Stephens</snm><fnm>PJ</fnm></au><au><snm>Pleasance</snm><fnm>ED</fnm></au><au><snm>O'Meara</snm><fnm>S</fnm></au><au><snm>Li</snm><fnm>H</fnm></au><au><snm>Santarius</snm><fnm>T</fnm></au><au><snm>Stebbings</snm><fnm>LA</fnm></au><au><snm>Leroy</snm><fnm>C</fnm></au><au><snm>Edkins</snm><fnm>S</fnm></au><au><snm>Hardy</snm><fnm>C</fnm></au><au><snm>Teague</snm><fnm>JW</fnm></au><au><snm>Menzies</snm><fnm>A</fnm></au><au><snm>Goodhead</snm><fnm>I</fnm></au><au><snm>Turner</snm><fnm>DJ</fnm></au><au><snm>Clee</snm><fnm>CM</fnm></au><au><snm>Quail</snm><fnm>MA</fnm></au><au><snm>Cox</snm><fnm>A</fnm></au><au><snm>Brown</snm><fnm>C</fnm></au><au><snm>Durbin</snm><fnm>R</fnm></au><au><snm>Hurles</snm><fnm>ME</fnm></au><au><snm>Edwards</snm><fnm>PA</fnm></au><au><snm>Bignell</snm><fnm>GR</fnm></au><au><snm>Stratton</snm><fnm>MR</fnm></au><au><snm>Futreal</snm><fnm>PA</fnm></au></aug><source>Nat Genet</source><pubdate>2008</pubdate><volume>40</volume><fpage>722</fpage><lpage>729</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/ng.128</pubid><pubid idtype="pmcid">2705838</pubid><pubid idtype="pmpid">18438408</pubid></pubidlist></xrefbib></bibl><bibl id="B84"><title><p>Multiplex sequencing of paired-end ditags (MS-PET): a strategy for the ultra-high-throughput analysis of transcriptomes and genomes</p></title><aug><au><snm>Ng</snm><fnm>P</fnm></au><au><snm>Tan</snm><fnm>JJ</fnm></au><au><snm>Ooi</snm><fnm>HS</fnm></au><au><snm>Lee</snm><fnm>YL</fnm></au><au><snm>Chiu</snm><fnm>KP</fnm></au><au><snm>Fullwood</snm><fnm>MJ</fnm></au><au><snm>Srinivasan</snm><fnm>KG</fnm></au><au><snm>Perbost</snm><fnm>C</fnm></au><au><snm>Du</snm><fnm>L</fnm></au><au><snm>Sung</snm><fnm>WK</fnm></au><au><snm>Wei</snm><fnm>CL</fnm></au><au><snm>Ruan</snm><fnm>Y</fnm></au></aug><source>Nucleic Acids 
Res</source><pubdate>2006</pubdate><volume>34</volume><fpage>e84</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkl444</pubid><pubid idtype="pmcid">1524903</pubid><pubid idtype="pmpid">16840528</pubid></pubidlist></xrefbib></bibl><bibl id="B85"><title><p>Comparing platforms for <it>C. elegans </it>mutant identification using high-throughput whole-genome sequencing</p></title><aug><au><snm>Shen</snm><fnm>Y</fnm></au><au><snm>Sarin</snm><fnm>S</fnm></au><au><snm>Liu</snm><fnm>Y</fnm></au><au><snm>Hobert</snm><fnm>O</fnm></au><au><snm>Pe'er</snm><fnm>I</fnm></au></aug><source>PLoS One</source><pubdate>2008</pubdate><volume>3</volume><fpage>e4012</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pone.0004012</pubid><pubid idtype="pmcid">2603312</pubid><pubid idtype="pmpid">19107202</pubid></pubidlist></xrefbib></bibl><bibl id="B86"><title><p>Limitations and possibilities of small RNA digital gene expression profiling</p></title><aug><au><snm>Linsen</snm><fnm>SE</fnm></au><au><snm>de Wit</snm><fnm>E</fnm></au><au><snm>Janssens</snm><fnm>G</fnm></au><au><snm>Heater</snm><fnm>S</fnm></au><au><snm>Chapman</snm><fnm>L</fnm></au><au><snm>Parkin</snm><fnm>RK</fnm></au><au><snm>Fritz</snm><fnm>B</fnm></au><au><snm>Wyman</snm><fnm>SK</fnm></au><au><snm>de Bruijn</snm><fnm>E</fnm></au><au><snm>Voest</snm><fnm>EE</fnm></au><au><snm>Kuersten</snm><fnm>S</fnm></au><au><snm>Tewari</snm><fnm>M</fnm></au><au><snm>Cuppen</snm><fnm>E</fnm></au></aug><source>Nat Methods</source><pubdate>2009</pubdate><volume>6</volume><fpage>474</fpage><lpage>476</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nmeth0709-474</pubid><pubid idtype="pmpid" link="fulltext">19564845</pubid></pubidlist></xrefbib></bibl><bibl id="B87"><title><p>Enzymatic oligoribonucleotide synthesis with T4 RNA 
ligase</p></title><aug><au><snm>England</snm><fnm>TE</fnm></au><au><snm>Uhlenbeck</snm><fnm>OC</fnm></au></aug><source>Biochemistry</source><pubdate>1978</pubdate><volume>17</volume><fpage>2069</fpage><lpage>2076</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1021/bi00604a008</pubid><pubid idtype="pmpid">667012</pubid></pubidlist></xrefbib></bibl><bibl id="B88"><title><p>Addition of mononucleotides to oligoribonucleotide acceptors with T4 RNA ligase</p></title><aug><au><snm>Kikuchi</snm><fnm>Y</fnm></au><au><snm>Hishinuma</snm><fnm>F</fnm></au><au><snm>Sakaguchi</snm><fnm>K</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>1978</pubdate><volume>75</volume><fpage>1270</fpage><lpage>1273</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.75.3.1270</pubid><pubid idtype="pmcid">411452</pubid><pubid idtype="pmpid">274717</pubid></pubidlist></xrefbib></bibl><bibl id="B89"><title><p>Donor activation in the T4 RNA ligase reaction</p></title><aug><au><snm>McLaughlin</snm><fnm>LW</fnm></au><au><snm>Piel</snm><fnm>N</fnm></au><au><snm>Graeser</snm><fnm>E</fnm></au></aug><source>Biochemistry</source><pubdate>1985</pubdate><volume>24</volume><fpage>267</fpage><lpage>273</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1021/bi00323a005</pubid><pubid idtype="pmpid">3978074</pubid></pubidlist></xrefbib></bibl><bibl id="B90"><title><p>Joining of ribooligonucleotides with T4 RNA ligase and identification of the oligonucleotide-adenylate intermediate</p></title><aug><au><snm>Ohtsuka</snm><fnm>E</fnm></au><au><snm>Nishikawa</snm><fnm>S</fnm></au><au><snm>Sugiura</snm><fnm>M</fnm></au><au><snm>Ikehara</snm><fnm>M</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>1976</pubdate><volume>3</volume><fpage>1613</fpage><lpage>1623</lpage><xrefbib><pubidlist><pubid idtype="pmcid">343018</pubid><pubid idtype="pmpid">183186</pubid></pubidlist></xrefbib></bibl><bibl id="B91"><title><p>The effect of acceptor oligoribonucleotide sequence on the T4 RNA ligase 
reaction</p></title><aug><au><snm>Romaniuk</snm><fnm>E</fnm></au><au><snm>McLaughlin</snm><fnm>LW</fnm></au><au><snm>Neilson</snm><fnm>T</fnm></au><au><snm>Romaniuk</snm><fnm>PJ</fnm></au></aug><source>Eur J Biochem</source><pubdate>1982</pubdate><volume>125</volume><fpage>639</fpage><lpage>643</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1432-1033.1982.tb06730.x</pubid><pubid idtype="pmpid" link="fulltext">7117259</pubid></pubidlist></xrefbib></bibl><bibl id="B92"><title><p>The growing catalog of small RNAs and their association with distinct Argonaute/Piwi family members</p></title><aug><au><snm>Farazi</snm><fnm>TA</fnm></au><au><snm>Juranek</snm><fnm>SA</fnm></au><au><snm>Tuschl</snm><fnm>T</fnm></au></aug><source>Development</source><pubdate>2008</pubdate><volume>135</volume><fpage>1201</fpage><lpage>1214</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1242/dev.005629</pubid><pubid idtype="pmpid" link="fulltext">18287206</pubid></pubidlist></xrefbib></bibl><bibl id="B93"><title><p>Extensive 3' modification of plant small RNAs is modulated by helper component-proteinase expression</p></title><aug><au><snm>Ebhardt</snm><fnm>HA</fnm></au><au><snm>Thi</snm><fnm>EP</fnm></au><au><snm>Wang</snm><fnm>MB</fnm></au><au><snm>Unrau</snm><fnm>PJ</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>2005</pubdate><volume>102</volume><fpage>13398</fpage><lpage>13403</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.0506597102</pubid><pubid idtype="pmcid">1224661</pubid><pubid idtype="pmpid">16157869</pubid></pubidlist></xrefbib></bibl><bibl id="B94"><title><p>HEN1 recognizes 21-24 nt small RNA duplexes and deposits a methyl group onto the 2' OH of the 3' terminal nucleotide</p></title><aug><au><snm>Yang</snm><fnm>Z</fnm></au><au><snm>Ebright</snm><fnm>YW</fnm></au><au><snm>Yu</snm><fnm>B</fnm></au><au><snm>Chen</snm><fnm>X</fnm></au></aug><source>Nucleic Acids 
Res</source><pubdate>2006</pubdate><volume>34</volume><fpage>667</fpage><lpage>675</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkj474</pubid><pubid idtype="pmcid">1356533</pubid><pubid idtype="pmpid">16449203</pubid></pubidlist></xrefbib></bibl><bibl id="B95"><title><p>Optimization of enzymatic reaction conditions for generating representative pools of cDNA from small RNA</p></title><aug><au><snm>Munaf&#243;</snm><fnm>DB</fnm></au><au><snm>Robb</snm><fnm>GB</fnm></au></aug><source>RNA</source><pubdate>2010</pubdate><volume>16</volume><fpage>2537</fpage><lpage>2552</lpage><xrefbib><pubidlist><pubid idtype="pmcid">2995414</pubid><pubid idtype="pmpid">20921270</pubid></pubidlist></xrefbib></bibl><bibl id="B96"><title><p>Incorporation of terminal phosphorothioates into oligonucleotides</p></title><aug><au><snm>Alefelder</snm><fnm>S</fnm></au><au><snm>Patel</snm><fnm>BK</fnm></au><au><snm>Eckstein</snm><fnm>F</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>1998</pubdate><volume>26</volume><fpage>4983</fpage><lpage>4988</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/26.21.4983</pubid><pubid idtype="pmcid">147945</pubid><pubid idtype="pmpid">9776763</pubid></pubidlist></xrefbib></bibl><bibl id="B97"><title><p>A cellular function for the RNA-interference enzyme Dicer in the maturation of the <it>let-7 </it>small temporal RNA</p></title><aug><au><snm>Hutv&#225;gner</snm><fnm>G</fnm></au><au><snm>McLachlan</snm><fnm>J</fnm></au><au><snm>Pasquinelli</snm><fnm>AE</fnm></au><au><snm>Balint</snm><fnm>E</fnm></au><au><snm>Tuschl</snm><fnm>T</fnm></au><au><snm>Zamore</snm><fnm>PD</fnm></au></aug><source>Science</source><pubdate>2001</pubdate><volume>293</volume><fpage>834</fpage><lpage>838</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">11452083</pubid></xrefbib></bibl><bibl id="B98"><title><p>Methylation as a crucial step in plant microRNA 
biogenesis</p></title><aug><au><snm>Yu</snm><fnm>B</fnm></au><au><snm>Yang</snm><fnm>Z</fnm></au><au><snm>Li</snm><fnm>J</fnm></au><au><snm>Minakhina</snm><fnm>S</fnm></au><au><snm>Yang</snm><fnm>M</fnm></au><au><snm>Padgett</snm><fnm>RW</fnm></au><au><snm>Steward</snm><fnm>R</fnm></au><au><snm>Chen</snm><fnm>X</fnm></au></aug><source>Science</source><pubdate>2005</pubdate><volume>307</volume><fpage>932</fpage><lpage>935</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1107130</pubid><pubid idtype="pmpid" link="fulltext">15705854</pubid></pubidlist></xrefbib></bibl><bibl id="B99"><title><p>Small RNAs with 5'-polyphosphate termini associate with a Piwi-related protein and regulate gene expression in the single-celled eukaryote <it>Entamoeba histolytica</it></p></title><aug><au><snm>Zhang</snm><fnm>H</fnm></au><au><snm>Ehrenkaufer</snm><fnm>GM</fnm></au><au><snm>Pompey</snm><fnm>JM</fnm></au><au><snm>Hackney</snm><fnm>JA</fnm></au><au><snm>Singh</snm><fnm>U</fnm></au></aug><source>PLoS Pathog</source><pubdate>2008</pubdate><volume>4</volume><fpage>e1000219</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.ppat.1000219</pubid><pubid idtype="pmcid">2582682</pubid><pubid idtype="pmpid">19043551</pubid></pubidlist></xrefbib></bibl><bibl id="B100"><title><p>Capped small RNAs and MOV10 in human hepatitis delta virus replication</p></title><aug><au><snm>Haussecker</snm><fnm>D</fnm></au><au><snm>Cao</snm><fnm>D</fnm></au><au><snm>Huang</snm><fnm>Y</fnm></au><au><snm>Parameswaran</snm><fnm>P</fnm></au><au><snm>Fire</snm><fnm>AZ</fnm></au><au><snm>Kay</snm><fnm>MA</fnm></au></aug><source>Nat Struct Mol Biol</source><pubdate>2008</pubdate><volume>15</volume><fpage>714</fpage><lpage>721</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nsmb.1440</pubid><pubid idtype="pmcid">2876191</pubid><pubid idtype="pmpid">18552826</pubid></pubidlist></xrefbib></bibl><bibl id="B101"><title><p>Post-transcriptional processing generates a diversity of 5'-modified 
long and short RNAs</p></title><aug><au><cnm>Affymetrix ENCODE Transcriptome Project; Cold Spring Harbor Laboratory ENCODE Transcriptome Project</cnm></au></aug><source>Nature</source><pubdate>2009</pubdate><volume>457</volume><fpage>1028</fpage><lpage>1032</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature07759</pubid><pubid idtype="pmcid">2719882</pubid><pubid idtype="pmpid">19169241</pubid></pubidlist></xrefbib></bibl><bibl id="B102"><title><p>Classical and novel approaches to the detection and localization of the numerous modified nucleotides in eukaryotic ribosomal RNA</p></title><aug><au><snm>Maden</snm><fnm>BE</fnm></au><au><snm>Corbett</snm><fnm>ME</fnm></au><au><snm>Heeney</snm><fnm>PA</fnm></au><au><snm>Pugh</snm><fnm>K</fnm></au><au><snm>Ajuh</snm><fnm>PM</fnm></au></aug><source>Biochimie</source><pubdate>1995</pubdate><volume>77</volume><fpage>22</fpage><lpage>29</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0300-9084(96)88100-4</pubid><pubid idtype="pmpid" link="fulltext">7599273</pubid></pubidlist></xrefbib></bibl><bibl id="B103"><title><p>Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes</p></title><aug><au><snm>Kozarewa</snm><fnm>I</fnm></au><au><snm>Ning</snm><fnm>Z</fnm></au><au><snm>Quail</snm><fnm>MA</fnm></au><au><snm>Sanders</snm><fnm>MJ</fnm></au><au><snm>Berriman</snm><fnm>M</fnm></au><au><snm>Turner</snm><fnm>DJ</fnm></au></aug><source>Nat Methods</source><pubdate>2009</pubdate><volume>6</volume><fpage>291</fpage><lpage>295</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nmeth.1311</pubid><pubid idtype="pmcid">2664327</pubid><pubid idtype="pmpid">19287394</pubid></pubidlist></xrefbib></bibl><bibl id="B104"><title><p>FRT-seq: amplification-free, strand-specific transcriptome 
sequencing</p></title><aug><au><snm>Mamanova</snm><fnm>L</fnm></au><au><snm>Andrews</snm><fnm>RM</fnm></au><au><snm>James</snm><fnm>KD</fnm></au><au><snm>Sheridan</snm><fnm>EM</fnm></au><au><snm>Ellis</snm><fnm>PD</fnm></au><au><snm>Langford</snm><fnm>CF</fnm></au><au><snm>Ost</snm><fnm>TW</fnm></au><au><snm>Collins</snm><fnm>JE</fnm></au><au><snm>Turner</snm><fnm>DJ</fnm></au></aug><source>Nat Methods</source><pubdate>2010</pubdate><volume>7</volume><fpage>130</fpage><lpage>132</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nmeth.1417</pubid><pubid idtype="pmcid">2861772</pubid><pubid idtype="pmpid">20081834</pubid></pubidlist></xrefbib></bibl><bibl id="B105"><title><p>Polony multiplex analysis of gene expression (PMAGE) in mouse hypertrophic cardiomyopathy</p></title><aug><au><snm>Kim</snm><fnm>JB</fnm></au><au><snm>Porreca</snm><fnm>GJ</fnm></au><au><snm>Song</snm><fnm>L</fnm></au><au><snm>Greenway</snm><fnm>SC</fnm></au><au><snm>Gorham</snm><fnm>JM</fnm></au><au><snm>Church</snm><fnm>GM</fnm></au><au><snm>Seidman</snm><fnm>CE</fnm></au><au><snm>Seidman</snm><fnm>JG</fnm></au></aug><source>Science</source><pubdate>2007</pubdate><volume>316</volume><fpage>1481</fpage><lpage>1484</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1137325</pubid><pubid idtype="pmpid" link="fulltext">17556586</pubid></pubidlist></xrefbib></bibl><bibl id="B106"><title><p>Genome-wide profiling and analysis of <it>Arabidopsis </it>siRNAs</p></title><aug><au><snm>Kasschau</snm><fnm>KD</fnm></au><au><snm>Fahlgren</snm><fnm>N</fnm></au><au><snm>Chapman</snm><fnm>EJ</fnm></au><au><snm>Sullivan</snm><fnm>CM</fnm></au><au><snm>Cumbie</snm><fnm>JS</fnm></au><au><snm>Givan</snm><fnm>SA</fnm></au><au><snm>Carrington</snm><fnm>JC</fnm></au></aug><source>PLoS Biol</source><pubdate>2007</pubdate><volume>5</volume><fpage>e57</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pbio.0050057</pubid><pubid idtype="pmcid">1820830</pubid><pubid 
idtype="pmpid">17298187</pubid></pubidlist></xrefbib></bibl><bibl id="B107"><title><p>Parallel tagged sequencing on the 454 platform</p></title><aug><au><snm>Meyer</snm><fnm>M</fnm></au><au><snm>Stenzel</snm><fnm>U</fnm></au><au><snm>Hofreiter</snm><fnm>M</fnm></au></aug><source>Nat Protoc</source><pubdate>2008</pubdate><volume>3</volume><fpage>267</fpage><lpage>278</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nprot.2007.520</pubid><pubid idtype="pmpid" link="fulltext">18274529</pubid></pubidlist></xrefbib></bibl><bibl id="B108"><title><p>Error-correcting barcoded primers for pyrosequencing hundreds of samples in multiplex</p></title><aug><au><snm>Hamady</snm><fnm>M</fnm></au><au><snm>Walker</snm><fnm>JJ</fnm></au><au><snm>Harris</snm><fnm>JK</fnm></au><au><snm>Gold</snm><fnm>NJ</fnm></au><au><snm>Knight</snm><fnm>R</fnm></au></aug><source>Nat Methods</source><pubdate>2008</pubdate><volume>5</volume><fpage>235</fpage><lpage>237</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nmeth.1184</pubid><pubid idtype="pmpid" link="fulltext">18264105</pubid></pubidlist></xrefbib></bibl><bibl id="B109"><title><p>Identification of genetic variants using bar-coded multiplexed sequencing</p></title><aug><au><snm>Craig</snm><fnm>DW</fnm></au><au><snm>Pearson</snm><fnm>JV</fnm></au><au><snm>Szelinger</snm><fnm>S</fnm></au><au><snm>Sekar</snm><fnm>A</fnm></au><au><snm>Redman</snm><fnm>M</fnm></au><au><snm>Corneveaux</snm><fnm>JJ</fnm></au><au><snm>Pawlowski</snm><fnm>TL</fnm></au><au><snm>Laub</snm><fnm>T</fnm></au><au><snm>Nunn</snm><fnm>G</fnm></au><au><snm>Stephan</snm><fnm>DA</fnm></au><au><snm>Homer</snm><fnm>N</fnm></au><au><snm>Huentelman</snm><fnm>MJ</fnm></au></aug><source>Nat Methods</source><pubdate>2008</pubdate><volume>5</volume><fpage>887</fpage><lpage>893</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nmeth.1251</pubid><pubid idtype="pmpid" link="fulltext">18794863</pubid></pubidlist></xrefbib></bibl><bibl id="B110"><title><p>Highly-multiplexed 
barcode sequencing: an efficient method for parallel analysis of pooled samples</p></title><aug><au><snm>Smith</snm><fnm>AM</fnm></au><au><snm>Heisler</snm><fnm>LE</fnm></au><au><snm>St Onge</snm><fnm>RP</fnm></au><au><snm>Farias-Hesson</snm><fnm>E</fnm></au><au><snm>Wallace</snm><fnm>IM</fnm></au><au><snm>Bodeau</snm><fnm>J</fnm></au><au><snm>Harris</snm><fnm>AN</fnm></au><au><snm>Perry</snm><fnm>KM</fnm></au><au><snm>Giaever</snm><fnm>G</fnm></au><au><snm>Pourmand</snm><fnm>N</fnm></au><au><snm>Nislow</snm><fnm>C</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2010</pubdate><volume>38</volume><fpage>e142</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkq368</pubid><pubid idtype="pmcid">2910071</pubid><pubid idtype="pmpid">20460461</pubid></pubidlist></xrefbib></bibl><bibl id="B111"><title><p>The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing</p></title><aug><au><snm>Binladen</snm><fnm>J</fnm></au><au><snm>Gilbert</snm><fnm>MT</fnm></au><au><snm>Bollback</snm><fnm>JP</fnm></au><au><snm>Panitz</snm><fnm>F</fnm></au><au><snm>Bendixen</snm><fnm>C</fnm></au><au><snm>Nielsen</snm><fnm>R</fnm></au><au><snm>Willerslev</snm><fnm>E</fnm></au></aug><source>PLoS One</source><pubdate>2007</pubdate><volume>2</volume><fpage>e197</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pone.0000197</pubid><pubid idtype="pmcid">1797623</pubid><pubid idtype="pmpid">17299583</pubid></pubidlist></xrefbib></bibl><bibl id="B112"><title><p>Targeted high-throughput sequencing of tagged nucleic acid samples</p></title><aug><au><snm>Meyer</snm><fnm>M</fnm></au><au><snm>Stenzel</snm><fnm>U</fnm></au><au><snm>Myles</snm><fnm>S</fnm></au><au><snm>Prufer</snm><fnm>K</fnm></au><au><snm>Hofreiter</snm><fnm>M</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2007</pubdate><volume>35</volume><fpage>e97</fpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1093/nar/gkm566</pubid><pubid idtype="pmcid">1976447</pubid><pubid idtype="pmpid">17670798</pubid></pubidlist></xrefbib></bibl><bibl id="B113"><title><p>Direct multiplex sequencing (DMPS): a novel method for targeted high-throughput sequencing of ancient and highly degraded DNA</p></title><aug><au><snm>Stiller</snm><fnm>M</fnm></au><au><snm>Knapp</snm><fnm>M</fnm></au><au><snm>Stenzel</snm><fnm>U</fnm></au><au><snm>Hofreiter</snm><fnm>M</fnm></au><au><snm>Meyer</snm><fnm>M</fnm></au></aug><source>Genome Res</source><pubdate>2009</pubdate><volume>19</volume><fpage>1843</fpage><lpage>1848</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.095760.109</pubid><pubid idtype="pmcid">2765274</pubid><pubid idtype="pmpid">19635845</pubid></pubidlist></xrefbib></bibl><bibl id="B114"><title><p>Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology</p></title><aug><au><snm>Cronn</snm><fnm>R</fnm></au><au><snm>Liston</snm><fnm>A</fnm></au><au><snm>Parks</snm><fnm>M</fnm></au><au><snm>Gernandt</snm><fnm>DS</fnm></au><au><snm>Shen</snm><fnm>R</fnm></au><au><snm>Mockler</snm><fnm>T</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2008</pubdate><volume>36</volume><fpage>e122</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkn502</pubid><pubid idtype="pmcid">2577356</pubid><pubid idtype="pmpid">18753151</pubid></pubidlist></xrefbib></bibl><bibl id="B115"><title><p>Computational and analytical framework for small RNA profiling by high-throughput 
sequencing</p></title><aug><au><snm>Fahlgren</snm><fnm>N</fnm></au><au><snm>Sullivan</snm><fnm>CM</fnm></au><au><snm>Kasschau</snm><fnm>KD</fnm></au><au><snm>Chapman</snm><fnm>EJ</fnm></au><au><snm>Cumbie</snm><fnm>JS</fnm></au><au><snm>Montgomery</snm><fnm>TA</fnm></au><au><snm>Gilbert</snm><fnm>SD</fnm></au><au><snm>Dasenko</snm><fnm>M</fnm></au><au><snm>Backman</snm><fnm>TWH</fnm></au><au><snm>Givan</snm><fnm>SA</fnm></au><au><snm>Carrington</snm><fnm>JC</fnm></au></aug><source>RNA</source><pubdate>2009</pubdate><volume>15</volume><fpage>992</fpage><lpage>1002</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1261/rna.1473809</pubid><pubid idtype="pmcid">2673065</pubid><pubid idtype="pmpid">19307293</pubid></pubidlist></xrefbib></bibl><bibl id="B116"><title><p>RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays</p></title><aug><au><snm>Marioni</snm><fnm>JC</fnm></au><au><snm>Mason</snm><fnm>CE</fnm></au><au><snm>Mane</snm><fnm>SM</fnm></au><au><snm>Stephens</snm><fnm>M</fnm></au><au><snm>Gilad</snm><fnm>Y</fnm></au></aug><source>Genome Res</source><pubdate>2008</pubdate><volume>18</volume><fpage>1509</fpage><lpage>1517</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.079558.108</pubid><pubid idtype="pmcid">2527709</pubid><pubid idtype="pmpid">18550803</pubid></pubidlist></xrefbib></bibl><bibl id="B117"><title><p>Detecting heteroplasmy from high-throughput sequencing of complete human mitochondrial DNA genomes</p></title><aug><au><snm>Li</snm><fnm>M</fnm></au><au><snm>Schonberg</snm><fnm>A</fnm></au><au><snm>Schaefer</snm><fnm>M</fnm></au><au><snm>Schroeder</snm><fnm>R</fnm></au><au><snm>Nasidze</snm><fnm>I</fnm></au><au><snm>Stoneking</snm><fnm>M</fnm></au></aug><source>Am J Hum Genet</source><pubdate>2010</pubdate><volume>87</volume><fpage>237</fpage><lpage>249</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.ajhg.2010.07.014</pubid><pubid idtype="pmcid">2917713</pubid><pubid 
idtype="pmpid">20696290</pubid></pubidlist></xrefbib></bibl><bibl id="B118"><title><p>Digital gene expression signatures for maize development</p></title><aug><au><snm>Eveland</snm><fnm>AL</fnm></au><au><snm>Satoh-Nagasawa</snm><fnm>N</fnm></au><au><snm>Goldshmidt</snm><fnm>A</fnm></au><au><snm>Meyer</snm><fnm>S</fnm></au><au><snm>Beatty</snm><fnm>M</fnm></au><au><snm>Sakai</snm><fnm>H</fnm></au><au><snm>Ware</snm><fnm>D</fnm></au><au><snm>Jackson</snm><fnm>D</fnm></au></aug><source>Plant Physiol</source><pubdate>2010</pubdate><volume>154</volume><fpage>1024</fpage><lpage>1039</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1104/pp.110.159673</pubid><pubid idtype="pmpid" link="fulltext">20833728</pubid></pubidlist></xrefbib></bibl><bibl id="B119"><title><p>Using next-generation sequencing for high resolution multiplex analysis of copy number variation from nanogram quantities of DNA from formalin-fixed paraffin-embedded specimens</p></title><aug><au><snm>Wood</snm><fnm>HM</fnm></au><au><snm>Belvedere</snm><fnm>O</fnm></au><au><snm>Conway</snm><fnm>C</fnm></au><au><snm>Daly</snm><fnm>C</fnm></au><au><snm>Chalkley</snm><fnm>R</fnm></au><au><snm>Bickerdike</snm><fnm>M</fnm></au><au><snm>McKinley</snm><fnm>C</fnm></au><au><snm>Egan</snm><fnm>P</fnm></au><au><snm>Ross</snm><fnm>L</fnm></au><au><snm>Hayward</snm><fnm>B</fnm></au><au><snm>Morgan</snm><fnm>J</fnm></au><au><snm>Davidson</snm><fnm>L</fnm></au><au><snm>MacLennan</snm><fnm>K</fnm></au><au><snm>Ong</snm><fnm>TK</fnm></au><au><snm>Papagiannopoulos</snm><fnm>K</fnm></au><au><snm>Cook</snm><fnm>I</fnm></au><au><snm>Adams</snm><fnm>DJ</fnm></au><au><snm>Taylor</snm><fnm>GR</fnm></au><au><snm>Rabbitts</snm><fnm>P</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2010</pubdate><volume>38</volume><fpage>e151</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkq510</pubid><pubid idtype="pmcid">2919738</pubid><pubid idtype="pmpid">20525786</pubid></pubidlist></xrefbib></bibl><bibl 
id="B120"><title><p>A scaling normalization method for differential expression analysis of RNA-seq data</p></title><aug><au><snm>Robinson</snm><fnm>MD</fnm></au><au><snm>Oshlack</snm><fnm>A</fnm></au></aug><source>Genome Biol</source><pubdate>2010</pubdate><volume>11</volume><fpage>R25</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2010-11-3-r25</pubid><pubid idtype="pmcid">2864565</pubid><pubid idtype="pmpid">20196867</pubid></pubidlist></xrefbib></bibl><bibl id="B121"><title><p>SOAP: short oligonucleotide alignment program</p></title><aug><au><snm>Li</snm><fnm>R</fnm></au><au><snm>Li</snm><fnm>Y</fnm></au><au><snm>Kristiansen</snm><fnm>K</fnm></au><au><snm>Wang</snm><fnm>J</fnm></au></aug><source>Bioinformatics</source><pubdate>2008</pubdate><volume>24</volume><fpage>713</fpage><lpage>714</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btn025</pubid><pubid idtype="pmpid" link="fulltext">18227114</pubid></pubidlist></xrefbib></bibl><bibl id="B122"><title><p>Ultrafast and memory-efficient alignment of short DNA sequences to the human genome</p></title><aug><au><snm>Langmead</snm><fnm>B</fnm></au><au><snm>Trapnell</snm><fnm>C</fnm></au><au><snm>Pop</snm><fnm>M</fnm></au><au><snm>Salzberg</snm><fnm>SL</fnm></au></aug><source>Genome Biol</source><pubdate>2009</pubdate><volume>10</volume><fpage>R25</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2009-10-3-r25</pubid><pubid idtype="pmcid">2690996</pubid><pubid idtype="pmpid">19261174</pubid></pubidlist></xrefbib></bibl><bibl id="B123"><title><p>PeakSeq enables systematic scoring of ChIP-seq experiments relative to 
controls</p></title><aug><au><snm>Rozowsky</snm><fnm>J</fnm></au><au><snm>Euskirchen</snm><fnm>G</fnm></au><au><snm>Auerbach</snm><fnm>RK</fnm></au><au><snm>Zhang</snm><fnm>ZD</fnm></au><au><snm>Gibson</snm><fnm>T</fnm></au><au><snm>Bjornson</snm><fnm>R</fnm></au><au><snm>Carriero</snm><fnm>N</fnm></au><au><snm>Snyder</snm><fnm>M</fnm></au><au><snm>Gerstein</snm><fnm>MB</fnm></au></aug><source>Nat Biotechnol</source><pubdate>2009</pubdate><volume>27</volume><fpage>66</fpage><lpage>75</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nbt.1518</pubid><pubid idtype="pmcid">2924752</pubid><pubid idtype="pmpid">19122651</pubid></pubidlist></xrefbib></bibl><bibl id="B124"><title><p>Genome-wide analysis of small RNA and novel microRNA discovery in human acute lymphoblastic leukemia based on extensive sequencing approach</p></title><aug><au><snm>Zhang</snm><fnm>H</fnm></au><au><snm>Yang</snm><fnm>JH</fnm></au><au><snm>Zheng</snm><fnm>YS</fnm></au><au><snm>Zhang</snm><fnm>P</fnm></au><au><snm>Chen</snm><fnm>X</fnm></au><au><snm>Wu</snm><fnm>J</fnm></au><au><snm>Xu</snm><fnm>L</fnm></au><au><snm>Luo</snm><fnm>XQ</fnm></au><au><snm>Ke</snm><fnm>ZY</fnm></au><au><snm>Zhou</snm><fnm>H</fnm></au><au><snm>Qu</snm><fnm>LH</fnm></au><au><snm>Chen</snm><fnm>YQ</fnm></au></aug><source>PLoS One</source><pubdate>2009</pubdate><volume>4</volume><fpage>e6849</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pone.0006849</pubid><pubid idtype="pmcid">2731166</pubid><pubid idtype="pmpid">19724645</pubid></pubidlist></xrefbib></bibl><bibl id="B125"><title><p>Cross-mapping and the identification of editing sites in mature microRNAs in high-throughput sequencing libraries</p></title><aug><au><snm>de 
Hoon</snm><fnm>MJL</fnm></au><au><snm>Taft</snm><fnm>RJ</fnm></au><au><snm>Hashimoto</snm><fnm>T</fnm></au><au><snm>Kanamori-Katayama</snm><fnm>M</fnm></au><au><snm>Kawaji</snm><fnm>H</fnm></au><au><snm>Kawano</snm><fnm>M</fnm></au><au><snm>Kishima</snm><fnm>M</fnm></au><au><snm>Lassmann</snm><fnm>T</fnm></au><au><snm>Faulkner</snm><fnm>GJ</fnm></au><au><snm>Mattick</snm><fnm>JS</fnm></au><au><snm>Daub</snm><fnm>CO</fnm></au><au><snm>Carninci</snm><fnm>P</fnm></au><au><snm>Kawai</snm><fnm>J</fnm></au><au><snm>Suzuki</snm><fnm>H</fnm></au><au><snm>Hayashizaki</snm><fnm>Y</fnm></au></aug><source>Genome Res</source><pubdate>2010</pubdate><volume>20</volume><fpage>257</fpage><lpage>264</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.095273.109</pubid><pubid idtype="pmcid">2813481</pubid><pubid idtype="pmpid">20051556</pubid></pubidlist></xrefbib></bibl><bibl id="B126"><title><p>Mapping and quantifying mammalian transcriptomes by RNA-seq</p></title><aug><au><snm>Mortazavi</snm><fnm>A</fnm></au><au><snm>Williams</snm><fnm>BA</fnm></au><au><snm>McCue</snm><fnm>K</fnm></au><au><snm>Schaeffer</snm><fnm>L</fnm></au><au><snm>Wold</snm><fnm>B</fnm></au></aug><source>Nat Methods</source><pubdate>2008</pubdate><volume>5</volume><fpage>621</fpage><lpage>628</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nmeth.1226</pubid><pubid idtype="pmpid" link="fulltext">18516045</pubid></pubidlist></xrefbib></bibl><bibl id="B127"><title><p>RNA-seq gene expression estimation with read mapping uncertainty</p></title><aug><au><snm>Li</snm><fnm>B</fnm></au><au><snm>Ruotti</snm><fnm>V</fnm></au><au><snm>Stewart</snm><fnm>RM</fnm></au><au><snm>Thomson</snm><fnm>JA</fnm></au><au><snm>Dewey</snm><fnm>CN</fnm></au></aug><source>Bioinformatics</source><pubdate>2010</pubdate><volume>26</volume><fpage>493</fpage><lpage>500</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btp692</pubid><pubid idtype="pmcid">2820677</pubid><pubid 
idtype="pmpid">20022975</pubid></pubidlist></xrefbib></bibl><bibl id="B128"><title><p>Meta-analysis of small RNA-sequencing errors reveals ubiquitous post-transcriptional RNA modifications</p></title><aug><au><snm>Ebhardt</snm><fnm>HA</fnm></au><au><snm>Tsang</snm><fnm>HH</fnm></au><au><snm>Dai</snm><fnm>DC</fnm></au><au><snm>Liu</snm><fnm>Y</fnm></au><au><snm>Bostan</snm><fnm>B</fnm></au><au><snm>Fahlman</snm><fnm>RP</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2009</pubdate><volume>37</volume><fpage>2461</fpage><lpage>2470</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkp093</pubid><pubid idtype="pmcid">2677864</pubid><pubid idtype="pmpid">19255090</pubid></pubidlist></xrefbib></bibl><bibl id="B129"><title><p>Bioinformatics analysis suggests base modifications of tRNAs and miRNAs in <it>Arabidopsis thaliana</it></p></title><aug><au><snm>Iida</snm><fnm>K</fnm></au><au><snm>Jin</snm><fnm>H</fnm></au><au><snm>Zhu</snm><fnm>JK</fnm></au></aug><source>BMC Genomics</source><pubdate>2009</pubdate><volume>10</volume><fpage>155</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-10-155</pubid><pubid idtype="pmcid">2674459</pubid><pubid idtype="pmpid">19358740</pubid></pubidlist></xrefbib></bibl><bibl id="B130"><title><p>Normalizing DNA microarray data</p></title><aug><au><snm>Bilban</snm><fnm>M</fnm></au><au><snm>Buehler</snm><fnm>LK</fnm></au><au><snm>Head</snm><fnm>S</fnm></au><au><snm>Desoye</snm><fnm>G</fnm></au><au><snm>Quaranta</snm><fnm>V</fnm></au></aug><source>Curr Issues Mol Biol</source><pubdate>2002</pubdate><volume>4</volume><fpage>57</fpage><lpage>64</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">11931570</pubid></xrefbib></bibl><bibl id="B131"><title><p>Comparison of Affymetrix data normalization methods using 6,926 experiments across five array 
generations</p></title><aug><au><snm>Autio</snm><fnm>R</fnm></au><au><snm>Kilpinen</snm><fnm>S</fnm></au><au><snm>Saarela</snm><fnm>M</fnm></au><au><snm>Kallioniemi</snm><fnm>O</fnm></au><au><snm>Hautaniemi</snm><fnm>S</fnm></au><au><snm>Astola</snm><fnm>J</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2009</pubdate><volume>10</volume><issue>Suppl 1</issue><fpage>S24</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-10-S1-S24</pubid><pubid idtype="pmcid">2648747</pubid><pubid idtype="pmpid">19208124</pubid></pubidlist></xrefbib></bibl><bibl id="B132"><title><p>Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection</p></title><aug><au><snm>Li</snm><fnm>C</fnm></au><au><snm>Wong</snm><fnm>WH</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>2001</pubdate><volume>98</volume><fpage>31</fpage><lpage>36</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.011404098</pubid><pubid idtype="pmcid">14539</pubid><pubid idtype="pmpid">11134512</pubid></pubidlist></xrefbib></bibl><bibl id="B133"><title><p>Global rank-invariant set normalization (GRSN) to reduce systematic distortions in microarray data</p></title><aug><au><snm>Pelz</snm><fnm>CR</fnm></au><au><snm>Kulesz-Martin</snm><fnm>M</fnm></au><au><snm>Bagby</snm><fnm>G</fnm></au><au><snm>Sears</snm><fnm>RC</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2008</pubdate><volume>9</volume><fpage>520</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-9-520</pubid><pubid idtype="pmcid">2644708</pubid><pubid idtype="pmpid">19055840</pubid></pubidlist></xrefbib></bibl><bibl id="B134"><title><p>Feature extraction and normalization algorithms for high-density oligonucleotide gene expression array data</p></title><aug><au><snm>Schadt</snm><fnm>EE</fnm></au><au><snm>Li</snm><fnm>C</fnm></au><au><snm>Ellis</snm><fnm>B</fnm></au><au><snm>Wong</snm><fnm>WH</fnm></au></aug><source>J Cell Biochem 
Suppl</source><pubdate>2001</pubdate><issue>Suppl 37</issue><fpage>120</fpage><lpage>125</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/jcb.10073</pubid><pubid idtype="pmpid">11842437</pubid></pubidlist></xrefbib></bibl><bibl id="B135"><title><p>Issues in cDNA microarray analysis: quality filtering, channel normalization, models of variations and assessment of gene effects</p></title><aug><au><snm>Tseng</snm><fnm>GC</fnm></au><au><snm>Oh</snm><fnm>MK</fnm></au><au><snm>Rohlin</snm><fnm>L</fnm></au><au><snm>Liao</snm><fnm>JC</fnm></au><au><snm>Wong</snm><fnm>WH</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2001</pubdate><volume>29</volume><fpage>2549</fpage><lpage>2557</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/29.12.2549</pubid><pubid idtype="pmcid">55725</pubid><pubid idtype="pmpid">11410663</pubid></pubidlist></xrefbib></bibl><bibl id="B136"><title><p>Housekeeping genes as internal standards: use and limits</p></title><aug><au><snm>Thellin</snm><fnm>O</fnm></au><au><snm>Zorzi</snm><fnm>W</fnm></au><au><snm>Lakaye</snm><fnm>B</fnm></au><au><snm>De Borman</snm><fnm>B</fnm></au><au><snm>Coumans</snm><fnm>B</fnm></au><au><snm>Hennen</snm><fnm>G</fnm></au><au><snm>Grisar</snm><fnm>T</fnm></au><au><snm>Igout</snm><fnm>A</fnm></au><au><snm>Heinen</snm><fnm>E</fnm></au></aug><source>J Biotechnol</source><pubdate>1999</pubdate><volume>75</volume><fpage>291</fpage><lpage>295</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0168-1656(99)00163-7</pubid><pubid idtype="pmpid">10617337</pubid></pubidlist></xrefbib></bibl><bibl id="B137"><title><p>Control selection for RNA quantitation</p></title><aug><au><snm>Suzuki</snm><fnm>T</fnm></au><au><snm>Higgins</snm><fnm>PJ</fnm></au><au><snm>Crawford</snm><fnm>DR</fnm></au></aug><source>Biotechniques</source><pubdate>2000</pubdate><volume>29</volume><fpage>332</fpage><lpage>337</lpage><xrefbib><pubid idtype="pmpid">10948434</pubid></xrefbib></bibl><bibl id="B138"><title><p>Control genes and 
variability: absence of ubiquitous reference transcripts in diverse mammalian expression studies</p></title><aug><au><snm>Lee</snm><fnm>PD</fnm></au><au><snm>Sladek</snm><fnm>R</fnm></au><au><snm>Greenwood</snm><fnm>CM</fnm></au><au><snm>Hudson</snm><fnm>TJ</fnm></au></aug><source>Genome Res</source><pubdate>2002</pubdate><volume>12</volume><fpage>292</fpage><lpage>297</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.217802</pubid><pubid idtype="pmcid">155273</pubid><pubid idtype="pmpid">11827948</pubid></pubidlist></xrefbib></bibl><bibl id="B139"><title><p>Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes</p></title><aug><au><snm>Vandesompele</snm><fnm>J</fnm></au><au><snm>De Preter</snm><fnm>K</fnm></au><au><snm>Pattyn</snm><fnm>F</fnm></au><au><snm>Poppe</snm><fnm>B</fnm></au><au><snm>Van Roy</snm><fnm>N</fnm></au><au><snm>De Paepe</snm><fnm>A</fnm></au><au><snm>Speleman</snm><fnm>F</fnm></au></aug><source>Genome Biol</source><pubdate>2002</pubdate><volume>3</volume><fpage>RESEARCH0034</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2002-3-7-research0034</pubid><pubid idtype="pmcid">126239</pubid><pubid idtype="pmpid">12184808</pubid></pubidlist></xrefbib></bibl><bibl id="B140"><title><p>Genome-wide identification and testing of superior reference genes for transcript normalization in <it>Arabidopsis</it></p></title><aug><au><snm>Czechowski</snm><fnm>T</fnm></au><au><snm>Stitt</snm><fnm>M</fnm></au><au><snm>Altmann</snm><fnm>T</fnm></au><au><snm>Udvardi</snm><fnm>MK</fnm></au><au><snm>Scheible</snm><fnm>WR</fnm></au></aug><source>Plant Physiol</source><pubdate>2005</pubdate><volume>139</volume><fpage>5</fpage><lpage>17</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1104/pp.105.063743</pubid><pubid idtype="pmcid">1203353</pubid><pubid idtype="pmpid">16166256</pubid></pubidlist></xrefbib></bibl><bibl id="B141"><title><p>Housekeeping genes in cancer: normalization of array 
data</p></title><aug><au><snm>Khimani</snm><fnm>AH</fnm></au><au><snm>Mhashilkar</snm><fnm>AM</fnm></au><au><snm>Mikulskis</snm><fnm>A</fnm></au><au><snm>O'Malley</snm><fnm>M</fnm></au><au><snm>Liao</snm><fnm>J</fnm></au><au><snm>Golenko</snm><fnm>EE</fnm></au><au><snm>Mayer</snm><fnm>P</fnm></au><au><snm>Chada</snm><fnm>S</fnm></au><au><snm>Killian</snm><fnm>JB</fnm></au><au><snm>Lott</snm><fnm>ST</fnm></au></aug><source>Biotechniques</source><pubdate>2005</pubdate><volume>38</volume><fpage>739</fpage><lpage>745</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2144/05385ST04</pubid><pubid idtype="pmpid" link="fulltext">15948292</pubid></pubidlist></xrefbib></bibl><bibl id="B142"><title><p>A comparison of normalization methods for high density oligonucleotide array data based on variance and bias</p></title><aug><au><snm>Bolstad</snm><fnm>BM</fnm></au><au><snm>Irizarry</snm><fnm>RA</fnm></au><au><snm>Astrand</snm><fnm>M</fnm></au><au><snm>Speed</snm><fnm>TP</fnm></au></aug><source>Bioinformatics</source><pubdate>2003</pubdate><volume>19</volume><fpage>185</fpage><lpage>193</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/19.2.185</pubid><pubid idtype="pmpid" link="fulltext">12538238</pubid></pubidlist></xrefbib></bibl><bibl id="B143"><title><p>A variance-stabilizing transformation for gene-expression microarray data</p></title><aug><au><snm>Durbin</snm><fnm>BP</fnm></au><au><snm>Hardin</snm><fnm>JS</fnm></au><au><snm>Hawkins</snm><fnm>DM</fnm></au><au><snm>Rocke</snm><fnm>DM</fnm></au></aug><source>Bioinformatics</source><pubdate>2002</pubdate><volume>18</volume><issue>Suppl 1</issue><fpage>S105</fpage><lpage>S110</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">12169537</pubid></xrefbib></bibl><bibl id="B144"><title><p>Variance stabilization applied to microarray data calibration and to the quantification of differential expression</p></title><aug><au><snm>Huber</snm><fnm>W</fnm></au><au><snm>von 
Heydebreck</snm><fnm>A</fnm></au><au><snm>Sultmann</snm><fnm>H</fnm></au><au><snm>Poustka</snm><fnm>A</fnm></au><au><snm>Vingron</snm><fnm>M</fnm></au></aug><source>Bioinformatics</source><pubdate>2002</pubdate><volume>18</volume><issue>Suppl 1</issue><fpage>S96</fpage><lpage>S104</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">12169536</pubid></xrefbib></bibl><bibl id="B145"><title><p>A 'consistency' test for determining the significance of gene expression changes on replicate samples and two convenient variance-stabilizing transformations</p></title><aug><au><snm>Munson</snm><fnm>P</fnm></au></aug><source>Proceedings of the GeneLogic Workshop on Low Level Analysis of Affymetrix GeneChip Data</source><publisher>Bethesda, MD</publisher><editor>Speed T</editor><pubdate>2001</pubdate><url>http://oz.berkeley.edu/users/terry/zarray/Affy/GL_Workshop/genelogic2001.html</url></bibl><bibl id="B146"><title><p>Effect of various normalization methods on Applied Biosystems expression array system data</p></title><aug><au><snm>Barbacioru</snm><fnm>CC</fnm></au><au><snm>Wang</snm><fnm>Y</fnm></au><au><snm>Canales</snm><fnm>RD</fnm></au><au><snm>Sun</snm><fnm>YA</fnm></au><au><snm>Keys</snm><fnm>DN</fnm></au><au><snm>Chan</snm><fnm>F</fnm></au><au><snm>Poulter</snm><fnm>KA</fnm></au><au><snm>Samaha</snm><fnm>RR</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2006</pubdate><volume>7</volume><fpage>533</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-7-533</pubid><pubid idtype="pmcid">1764432</pubid><pubid idtype="pmpid">17173684</pubid></pubidlist></xrefbib></bibl><bibl id="B147"><title><p>Calibration of microarray gene-expression data</p></title><aug><au><snm>Binder</snm><fnm>H</fnm></au><au><snm>Preibisch</snm><fnm>S</fnm></au><au><snm>Berger</snm><fnm>H</fnm></au></aug><source>Methods Mol Biol</source><pubdate>2010</pubdate><volume>576</volume><fpage>375</fpage><lpage>407</lpage><xrefbib><pubidlist><pubid 
idtype="pmpid" link="fulltext">19882273</pubid></pubidlist></xrefbib></bibl><bibl id="B148"><title><p>Comparison of algorithms for the analysis of Affymetrix microarray data as evaluated by co-expression of genes in known operons</p></title><aug><au><snm>Harr</snm><fnm>B</fnm></au><au><snm>Schlotterer</snm><fnm>C</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2006</pubdate><volume>34</volume><fpage>e8</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gnj010</pubid><pubid idtype="pmcid">1345700</pubid><pubid idtype="pmpid">16432259</pubid></pubidlist></xrefbib></bibl><bibl id="B149"><title><p>How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results</p></title><aug><au><snm>Millenaar</snm><fnm>FF</fnm></au><au><snm>Okyere</snm><fnm>J</fnm></au><au><snm>May</snm><fnm>ST</fnm></au><au><snm>van Zanten</snm><fnm>M</fnm></au><au><snm>Voesenek</snm><fnm>LA</fnm></au><au><snm>Peeters</snm><fnm>AJ</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2006</pubdate><volume>7</volume><fpage>137</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-7-137</pubid><pubid idtype="pmcid">1431565</pubid><pubid idtype="pmpid">16539732</pubid></pubidlist></xrefbib></bibl><bibl id="B150"><title><p>Transcript assembly and quantification by RNA-seq reveals unannotated transcripts and isoform switching during cell differentiation</p></title><aug><au><snm>Trapnell</snm><fnm>C</fnm></au><au><snm>Williams</snm><fnm>BA</fnm></au><au><snm>Pertea</snm><fnm>G</fnm></au><au><snm>Mortazavi</snm><fnm>A</fnm></au><au><snm>Kwan</snm><fnm>G</fnm></au><au><snm>van Baren</snm><fnm>MJ</fnm></au><au><snm>Salzberg</snm><fnm>SL</fnm></au><au><snm>Wold</snm><fnm>BJ</fnm></au><au><snm>Pachter</snm><fnm>L</fnm></au></aug><source>Nat Biotechnol</source><pubdate>2010</pubdate><volume>28</volume><fpage>511</fpage><lpage>515</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nbt.1621</pubid><pubid 
idtype="pmpid" link="fulltext">20436464</pubid></pubidlist></xrefbib></bibl><bibl id="B151"><title><p>Global identification of microRNA-target RNA pairs by parallel analysis of RNA ends</p></title><aug><au><snm>German</snm><fnm>MA</fnm></au><au><snm>Pillay</snm><fnm>M</fnm></au><au><snm>Jeong</snm><fnm>DH</fnm></au><au><snm>Hetawal</snm><fnm>A</fnm></au><au><snm>Luo</snm><fnm>S</fnm></au><au><snm>Janardhanan</snm><fnm>P</fnm></au><au><snm>Kannan</snm><fnm>V</fnm></au><au><snm>Rymarquis</snm><fnm>LA</fnm></au><au><snm>Nobuta</snm><fnm>K</fnm></au><au><snm>German</snm><fnm>R</fnm></au><au><snm>De Paoli</snm><fnm>E</fnm></au><au><snm>Lu</snm><fnm>C</fnm></au><au><snm>Schroth</snm><fnm>G</fnm></au><au><snm>Meyers</snm><fnm>BC</fnm></au><au><snm>Green</snm><fnm>PJ</fnm></au></aug><source>Nat Biotechnol</source><pubdate>2008</pubdate><volume>26</volume><fpage>941</fpage><lpage>946</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nbt1417</pubid><pubid idtype="pmpid" link="fulltext">18542052</pubid></pubidlist></xrefbib></bibl><bibl id="B152"><title><p>siRNAs from miRNA sites mediate DNA methylation of target genes</p></title><aug><au><snm>Chellappan</snm><fnm>P</fnm></au><au><snm>Xia</snm><fnm>J</fnm></au><au><snm>Zhou</snm><fnm>X</fnm></au><au><snm>Gao</snm><fnm>S</fnm></au><au><snm>Zhang</snm><fnm>X</fnm></au><au><snm>Coutino</snm><fnm>G</fnm></au><au><snm>Vazquez</snm><fnm>F</fnm></au><au><snm>Zhang</snm><fnm>W</fnm></au><au><snm>Jin</snm><fnm>H</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2010</pubdate><volume>38</volume><fpage>6883</fpage><lpage>6894</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkq590</pubid><pubid idtype="pmcid">2978365</pubid><pubid idtype="pmpid">20621980</pubid></pubidlist></xrefbib></bibl><bibl id="B153"><title><p>22-nucleotide RNAs trigger secondary siRNA biogenesis in 
plants</p></title><aug><au><snm>Chen</snm><fnm>HM</fnm></au><au><snm>Chen</snm><fnm>LT</fnm></au><au><snm>Patel</snm><fnm>K</fnm></au><au><snm>Li</snm><fnm>YH</fnm></au><au><snm>Baulcombe</snm><fnm>DC</fnm></au><au><snm>Wu</snm><fnm>SH</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>2010</pubdate><volume>107</volume><fpage>15269</fpage><lpage>15274</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.1001738107</pubid><pubid idtype="pmcid">2930544</pubid><pubid idtype="pmpid">20643946</pubid></pubidlist></xrefbib></bibl><bibl id="B154"><title><p>Evaluation of statistical methods for normalization and differential expression in mRNA-seq experiments</p></title><aug><au><snm>Bullard</snm><fnm>JH</fnm></au><au><snm>Purdom</snm><fnm>E</fnm></au><au><snm>Hansen</snm><fnm>KD</fnm></au><au><snm>Dudoit</snm><fnm>S</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2010</pubdate><volume>11</volume><fpage>94</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-11-94</pubid><pubid idtype="pmcid">2838869</pubid><pubid idtype="pmpid">20167110</pubid></pubidlist></xrefbib></bibl><bibl id="B155"><title><p>Characterization of unique small RNA populations from rice grain</p></title><aug><au><snm>Heisel</snm><fnm>SE</fnm></au><au><snm>Zhang</snm><fnm>Y</fnm></au><au><snm>Allen</snm><fnm>E</fnm></au><au><snm>Guo</snm><fnm>L</fnm></au><au><snm>Reynolds</snm><fnm>TL</fnm></au><au><snm>Yang</snm><fnm>X</fnm></au><au><snm>Kovalic</snm><fnm>D</fnm></au><au><snm>Roberts</snm><fnm>JK</fnm></au></aug><source>PLoS One</source><pubdate>2008</pubdate><volume>3</volume><fpage>e2871</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pone.0002871</pubid><pubid idtype="pmcid">2518513</pubid><pubid idtype="pmpid">18716673</pubid></pubidlist></xrefbib></bibl><bibl id="B156"><title><p>Comparative study on ChIP-seq data: normalization and binding pattern 
characterization</p></title><aug><au><snm>Taslim</snm><fnm>C</fnm></au><au><snm>Wu</snm><fnm>J</fnm></au><au><snm>Yan</snm><fnm>P</fnm></au><au><snm>Singer</snm><fnm>G</fnm></au><au><snm>Parvin</snm><fnm>J</fnm></au><au><snm>Huang</snm><fnm>T</fnm></au><au><snm>Lin</snm><fnm>S</fnm></au><au><snm>Huang</snm><fnm>K</fnm></au></aug><source>Bioinformatics</source><pubdate>2009</pubdate><volume>25</volume><fpage>2334</fpage><lpage>2340</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btp384</pubid><pubid idtype="pmcid">2800347</pubid><pubid idtype="pmpid">19561022</pubid></pubidlist></xrefbib></bibl><bibl id="B157"><title><p>The significance of digital gene expression profiles</p></title><aug><au><snm>Audic</snm><fnm>S</fnm></au><au><snm>Claverie</snm><fnm>JM</fnm></au></aug><source>Genome Res</source><pubdate>1997</pubdate><volume>7</volume><fpage>986</fpage><lpage>995</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">9331369</pubid></xrefbib></bibl><bibl id="B158"><title><p>Basic properties and information theory of Audic-Claverie statistic for analyzing cDNA arrays</p></title><aug><au><snm>Ti&#328;o</snm><fnm>P</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2009</pubdate><volume>10</volume><fpage>310</fpage><xrefbib><pubidlist><pubid idtype="pmcid">2761412</pubid><pubid idtype="pmpid">19775462</pubid></pubidlist></xrefbib></bibl><bibl id="B159"><title><p>A global view of gene activity and alternative splicing by deep sequencing of the human 
transcriptome</p></title><aug><au><snm>Sultan</snm><fnm>M</fnm></au><au><snm>Schulz</snm><fnm>MH</fnm></au><au><snm>Richard</snm><fnm>H</fnm></au><au><snm>Magen</snm><fnm>A</fnm></au><au><snm>Klingenhoff</snm><fnm>A</fnm></au><au><snm>Scherf</snm><fnm>M</fnm></au><au><snm>Seifert</snm><fnm>M</fnm></au><au><snm>Borodina</snm><fnm>T</fnm></au><au><snm>Soldatov</snm><fnm>A</fnm></au><au><snm>Parkhomchuk</snm><fnm>D</fnm></au><au><snm>Schmidt</snm><fnm>D</fnm></au><au><snm>O'Keeffe</snm><fnm>S</fnm></au><au><snm>Haas</snm><fnm>S</fnm></au><au><snm>Vingron</snm><fnm>M</fnm></au><au><snm>Lehrach</snm><fnm>H</fnm></au><au><snm>Yaspo</snm><fnm>ML</fnm></au></aug><source>Science</source><pubdate>2008</pubdate><volume>321</volume><fpage>956</fpage><lpage>960</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1160342</pubid><pubid idtype="pmpid" link="fulltext">18599741</pubid></pubidlist></xrefbib></bibl><bibl id="B160"><title><p>Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells</p></title><aug><au><snm>Morin</snm><fnm>RD</fnm></au><au><snm>O'Connor</snm><fnm>MD</fnm></au><au><snm>Griffith</snm><fnm>M</fnm></au><au><snm>Kuchenbauer</snm><fnm>F</fnm></au><au><snm>Delaney</snm><fnm>A</fnm></au><au><snm>Prabhu</snm><fnm>AL</fnm></au><au><snm>Zhao</snm><fnm>Y</fnm></au><au><snm>McDonald</snm><fnm>H</fnm></au><au><snm>Zeng</snm><fnm>T</fnm></au><au><snm>Hirst</snm><fnm>M</fnm></au><au><snm>Eaves</snm><fnm>CJ</fnm></au><au><snm>Marra</snm><fnm>MA</fnm></au></aug><source>Genome Res</source><pubdate>2008</pubdate><volume>18</volume><fpage>610</fpage><lpage>621</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.7179508</pubid><pubid idtype="pmcid">2279248</pubid><pubid idtype="pmpid">18285502</pubid></pubidlist></xrefbib></bibl><bibl id="B161"><title><p>Modeling non-uniformity in short-read rates in RNA-seq 
data</p></title><aug><au><snm>Li</snm><fnm>J</fnm></au><au><snm>Jiang</snm><fnm>H</fnm></au><au><snm>Wong</snm><fnm>W</fnm></au></aug><source>Genome Biol</source><pubdate>2010</pubdate><volume>11</volume><fpage>R50</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2010-11-5-r50</pubid><pubid idtype="pmcid">2898062</pubid><pubid idtype="pmpid">20459815</pubid></pubidlist></xrefbib></bibl><bibl id="B162"><title><p>edgeR: a Bioconductor package for differential expression analysis of digital gene expression data</p></title><aug><au><snm>Robinson</snm><fnm>MD</fnm></au><au><snm>McCarthy</snm><fnm>DJ</fnm></au><au><snm>Smyth</snm><fnm>GK</fnm></au></aug><source>Bioinformatics</source><pubdate>2009</pubdate><volume>26</volume><fpage>139</fpage><lpage>140</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btp616</pubid><pubid idtype="pmcid">2796818</pubid><pubid idtype="pmpid">19910308</pubid></pubidlist></xrefbib></bibl></refgrp>
</bm>
</art>