ISU FLUture: a veterinary diagnostic laboratory web-based platform to monitor the temporal genetic patterns of Influenza A virus in swine

Bioinformatics

The Iowa State University Veterinary Diagnostic Laboratory

The ISU VDL has provided IAV diagnostic and sequencing services since 2003, with significant increases in the past 2 years: since 2015, the ISU VDL has processed over 1,000 IAV-related cases annually, collected from 38 states across the continental US. The IAV diagnostic results and related sequences are stored in the ISU VDL Laboratory Information Management System (LIMS), a private database of client records. Approximately 49% of the IAV-positive diagnostic submissions were eligible for the USDA IAV swine surveillance program that began in 2009 [8], and for these submissions the sequence data are publicly released to NCBI GenBank [29]. Consequently, the ISU VDL has accumulated over thirteen years of IAV diagnostic and sequence data in swine, the majority of which is not currently accessible to the public or to stakeholders. These data include pig-specific information such as age and location, as well as HA and NA nucleotide sequences. This information is a unique and valuable resource for determining IAV evolutionary trends and spatial and temporal dynamics that were previously inaccessible.

Data collection

Veterinarians submit diagnostic samples (lung tissue, nasal swabs, or oral fluids) collected on farms from swine with clinical signs (e.g., coughing, dyspnea, fever, depression, lethargy) for IAV screening using real-time reverse transcription polymerase chain reaction (RT-qPCR). Information on age, weight, and farm location may also be provided at the discretion of the submitter. Samples that are RT-qPCR positive for IAV are subtyped to determine the HA and NA. Veterinarians may then elect to participate in the USDA IAV swine surveillance program, which subsidizes HA and NA sequencing and virus isolation if the case qualifies based on RT-qPCR cycle threshold (CT) values of ≤25 for lung and nasal swab samples and ≤20 for oral fluid samples. Samples that meet these criteria receive a unique USDA barcode (a nine-digit alphanumeric designation beginning with A0), and the resulting sequences are made publicly available in NCBI GenBank while maintaining ISU VDL client confidentiality. The client may elect to pay for private diagnostic services within the ISU VDL system with or without participation in the USDA system. If a veterinarian or producer does not want to participate in the USDA IAV surveillance program, or if the diagnostic sample does not meet USDA requirements for inclusion, they may choose to pay for sequencing under the ISU VDL criterion of a screening RT-qPCR CT of ≤38. If sequencing is successful, HA and NA sequences are obtained. Cases may also be submitted anonymously to the USDA program if initial CT values qualify and at the discretion of the ISU VDL. In an anonymous USDA submission, all ISU VDL client information is removed, and sequences and isolates are submitted with only state-level information. However, in the ISU VDL LIMS, the sequence data are linked with client-provided information on age, weight, and farm location, and with additional diagnostic information collected from the sample, such as influenza subtyping PCR results and the identification of other pathogens in cases of respiratory disease with multi-etiologic diagnoses.
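
The CT criteria described above can be summarized as a small decision function. This is a minimal sketch rather than the laboratory's actual logic: the thresholds come from the text, while the function name, the specimen labels, and the returned route names are assumptions made for illustration. It also ignores the discretionary anonymous submission route.

```python
from typing import Optional

# Thresholds from the text; specimen labels are illustrative assumptions.
USDA_CT_MAX = {"lung": 25.0, "nasal swab": 25.0, "oral fluid": 20.0}
ISU_VDL_CT_MAX = 38.0


def sequencing_route(specimen: str, ct: Optional[float], opts_into_usda: bool) -> str:
    """Return the sequencing route a screening result is eligible for.

    'usda'         -> subsidized sequencing under the USDA program
    'isu_vdl_paid' -> client-paid sequencing under the ISU VDL criterion
    'none'         -> negative result or CT too high for sequencing
    """
    if ct is None:  # no CT value recorded: treated here as IAV-negative
        return "none"
    usda_limit = USDA_CT_MAX.get(specimen.lower())
    if opts_into_usda and usda_limit is not None and ct <= usda_limit:
        return "usda"
    if ct <= ISU_VDL_CT_MAX:
        return "isu_vdl_paid"
    return "none"
```

For example, an oral fluid sample with a CT of 22 misses the USDA cutoff of 20 but still falls under the ISU VDL criterion of 38, so only client-paid sequencing remains an option.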

Data curation

The swine IAV cases were extracted from LIMS and curated in an independent SQL database for ISU FLUture, which allows additional processing of the data before display on the ISU FLUture webpage. Diagnostic cases maintained privately at the ISU VDL were combined non-redundantly, using a unique identifier, with ISU VDL cases submitted as part of the USDA swine IAV surveillance program, and curated in the ISU FLUture database. The ISU FLUture database is updated daily. The data were reduced to USDA accession ID (where applicable), received date, data source (USDA or ISU VDL diagnostic streams), specimen used for PCR detection, specimen used for sequencing, pig age in days, pig weight in pounds, geographic location (at US state resolution), the IAV subtypes detected in the specimen, the HA sequence, and the NA sequence (for cases included in the USDA IAV swine surveillance system). The case-associated information was provided voluntarily by clients, so not all variables were available for every case. Where multiple diagnostic samples were submitted from the same farm, duplicate cases were removed, retaining only the sample that contained the HA sequence or, when sequencing failed but a subtype was available, the sample that tested positive for IAV.
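
The duplicate-removal rule can be sketched as follows. This is an illustration under stated assumptions, not the ISU FLUture code: the field names (farm_key, ha_sequence, subtype) are hypothetical, and the text does not say how samples from the same farm are grouped, so a single grouping key is assumed.

```python
from typing import Dict, Iterable, List


def sample_rank(sample: dict) -> int:
    """Lower is better: HA sequence first, then subtype only, then neither."""
    if sample.get("ha_sequence"):
        return 0
    if sample.get("subtype"):
        return 1
    return 2


def deduplicate(samples: Iterable[dict]) -> List[dict]:
    """Keep one sample per farm_key (a hypothetical grouping key)."""
    best: Dict[str, dict] = {}
    for sample in samples:
        key = sample["farm_key"]
        kept = best.get(key)
        # Replace only on a strictly better rank, so ties keep the first sample.
        if kept is None or sample_rank(sample) < sample_rank(kept):
            best[key] = sample
    return list(best.values())
```

Ties are resolved in favor of the first sample encountered, which is an arbitrary choice; a real pipeline would likely need an explicit tie-break such as received date.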

The evolutionary dynamics of the sequenced samples were described by inferring the HA and NA phylogenetic clade for each case where applicable. HA clades are initially screened using a logistic regression one-vs-all multiclass classifier, trained on all cases currently in the database with known clades, to flag sequence data that need follow-up. HA clades for the H1 subtype were determined using the Swine H1 Clade Classification Tool available on the Influenza Research Database [26, 27, 30]. The results were reported using the clade terms familiar in the US, because the primary stakeholders for the ISU FLUture website, US veterinarians and producers, would not be versed in the global H1 nomenclature, and the global context would rarely be relevant to the US-restricted data. The H3, N1, and N2 clades were determined by phylogenetic analysis against a set of reference sequences (Additional file 1: Figure S1 and Additional file 2; Additional file 3: Figure S2 and Additional file 4; Additional file 5: Figure S3 and Additional file 6). Nucleic acid sequences for each case in question were combined with the reference sequences and aligned with MAFFT v7.271 [31] using default settings. FastTree2 v2.1.9 [32] was used to infer the best-known maximum-likelihood tree for each gene alignment, implementing a general time reversible model of nucleotide substitution with a CAT model of rate heterogeneity and with branch lengths rescaled to optimize the Gamma20 likelihood [32]. The HA and NA phylogenetic clade for each strain was subsequently assigned [10–14, 16, 17, 33]. The ISU VDL reports any novel IAV to the USDA as per the influenza surveillance guidelines in swine. Unique influenza viruses detected in swine may be reported to the World Organization for Animal Health (OIE) at the discretion of the USDA. The USDA is responsible for diagnosing and reporting OIE-listed IAV.
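
The alignment and tree-inference step can be sketched as a thin wrapper around the two command-line tools. The wrapper functions, file names, and output locations below are assumptions for illustration, and the executable names (mafft, FastTree) depend on how the tools are installed. The FastTree flags used are -nt for nucleotide input, -gtr for the general time reversible model, and -gamma for Gamma20 rescaling; CAT rate heterogeneity is FastTree's default.

```python
import subprocess
from pathlib import Path
from typing import List


def mafft_command(fasta: Path) -> List[str]:
    """MAFFT with default settings; the alignment is written to stdout."""
    return ["mafft", str(fasta)]


def fasttree_command(alignment: Path) -> List[str]:
    """FastTree on a nucleotide alignment with GTR and Gamma20 rescaling."""
    return ["FastTree", "-nt", "-gtr", "-gamma", str(alignment)]


def run_to_file(command: List[str], out_path: Path) -> None:
    """Run a command, capturing stdout in out_path; raises if the tool fails."""
    with out_path.open("w") as handle:
        subprocess.run(command, stdout=handle, check=True)


def infer_tree(fasta: Path, workdir: Path) -> Path:
    """Align case plus reference sequences, then infer a Newick tree."""
    alignment = workdir / (fasta.stem + ".aln.fasta")
    tree = workdir / (fasta.stem + ".nwk")
    run_to_file(mafft_command(fasta), alignment)
    run_to_file(fasttree_command(alignment), tree)
    return tree
```

Clade assignment from the resulting tree (for example, by locating each case relative to the reference sequences) is a separate step that is not shown here.
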
The resulting data were checked for irregularities, such as mismatched clades and subtypes, before being inserted into the underlying ISU FLUture database. The relational database stores the USDA or ISU VDL case accession number, sample receipt date, data source, animal age, animal weight, animal location, sample used for PCR subtyping, subtyping PCR results, the specimen used for sequencing, the HA clade, and the NA clade. Each record is identified internally by an auto-generated case name based on this information.
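
A check for mismatched clades and subtypes might look like the sketch below. It assumes a lookup table that maps each clade label to the gene subtype it belongs to; the table, the field names, and the placeholder clade labels in the usage example are hypothetical and do not reflect the actual ISU FLUture schema or nomenclature.

```python
from typing import Dict, List, Optional, Tuple


def split_subtype(subtype: str) -> Optional[Tuple[str, str]]:
    """Split 'H1N2' into ('H1', 'N2'); None for mixed or malformed values."""
    s = subtype.strip().upper()
    if not s.startswith("H") or "N" not in s:
        return None
    ha, _, na = s.partition("N")
    if not ha[1:].isdigit() or not na.isdigit():
        return None
    return ha, "N" + na


def find_mismatches(case: dict, clade_gene: Dict[str, str]) -> List[str]:
    """List disagreements between the PCR subtype and the assigned clades."""
    parsed = split_subtype(case.get("subtype") or "")
    if parsed is None:
        return ["subtype missing or not a single HxNy value"]
    problems = []
    for field, expected in (("ha_clade", parsed[0]), ("na_clade", parsed[1])):
        clade = case.get(field)
        if clade is None:
            continue  # no sequence, hence no clade, for this gene
        actual = clade_gene.get(clade)
        if actual is None:
            problems.append(f"{field}: unknown clade {clade!r}")
        elif actual != expected:
            problems.append(f"{field}: clade {clade!r} is {actual}, subtype says {expected}")
    return problems
```

A usage example with placeholder labels:

```python
clade_gene = {"h1-clade-a": "H1", "h3-clade-a": "H3", "n2-clade-a": "N2"}
case = {"subtype": "H1N2", "ha_clade": "h3-clade-a", "na_clade": "n2-clade-a"}
print(find_mismatches(case, clade_gene))
```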

Determining trends in IAV with interactive visualization tools

We developed multiple interactive tools within ISU FLUture to visualize IAV dynamics using the JavaScript libraries C3, D3, jQuery, and Raphael [34, 35]. Currently, four modes of interpretation exist: correlation, time series, regional, and heat-map. The correlation tool depicts the relationship between two database variables using C3’s bar graph functionality. The X-axis displays the unique or binned values of one variable, and the Y-axis displays the count of occurrences in the database. Data normalization is built into the system: the counts of the different variables in a bin are summed and each count is divided by the total. The time series tool uses C3’s time series chart functionality to display binned counts of data over time at a granularity of day, week, month, or year. The heat-map tool displays HA and NA clade pairings in a table over a selected period of time, for cases where both the neuraminidase and the hemagglutinin phylogenetic clades are available. Monochromatic coloration is applied to the table, with color intensity emphasizing higher representation. The regional tool shows the geographic provenance of the data in ISU FLUture. A map of the United States is drawn using the Raphael library, with each state shaded in proportion to the number of swine cases in the database over a selected range of time. The number of cases is also reported numerically in a table under the map.
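
The counting behind the heat-map tool can be illustrated in a few lines. The sketch below is written in Python for brevity, whereas the site itself uses JavaScript; the field names are hypothetical, and scaling each cell by the largest count is an assumed way of deriving color intensity, since the text does not specify the scale.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, Tuple

Pair = Tuple[str, str]


def clade_pair_counts(cases: Iterable[dict], start: date, end: date) -> Counter:
    """Count (HA clade, NA clade) pairings received within [start, end]."""
    counts: Counter = Counter()
    for case in cases:
        ha, na = case.get("ha_clade"), case.get("na_clade")
        # Only cases with both clades available contribute to the table.
        if ha and na and start <= case["received"] <= end:
            counts[(ha, na)] += 1
    return counts


def intensities(counts: Counter) -> Dict[Pair, float]:
    """Scale counts to (0, 1] relative to the most frequent pairing."""
    peak = max(counts.values(), default=0)
    if peak == 0:
        return {}
    return {pair: n / peak for pair, n in counts.items()}
```

Each intensity can then be mapped to a shade of a single hue, so that the most common pairing in the selected period is drawn darkest.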
