H2V: A Database Of Human Genes And Proteins That Respond To ...

Data collection

In this study, human proteins/genes responding to viral infections were defined as differentially expressed genes (DEGs), proteins that participate in human-virus protein–protein interactions (PPIs), differentially expressed proteins (DEPs), differentially phosphorylated proteins (DPPs), differentially translated proteins (DTPs), differentially ubiquitinated proteins (DUPs), and disease severity associated proteins (SAPs).

We used the Bing search engine (https://www.bing.com), NCBI resources (https://www.ncbi.nlm.nih.gov/), and the ProteomeXchange database (http://www.proteomexchange.org/) to search for studies of SARS-CoV-2, SARS-CoV, and MERS-CoV infection. Based on the definition of response genes/proteins, the studies were classified as DEG, PPI, DEP, DPP, DTP, DUP, or SAP. For each study type, three independent studies per virus were selected. If fewer than three studies were available, those that were identified were used. Because we focused on dynamic changes in response genes/proteins over time post infection, studies reporting time-course surveys were given the highest priority; studies without time-course examinations were selected only when too few time-course studies were available. After study selection, the journal articles reporting the selected studies were retrieved, and information about gene and protein responses was extracted from the main text and supplementary material of each article. When such information was not available in the journal article, raw data from the selected studies were downloaded from public repositories and subsequently analyzed. The selected studies [12,13,14,15,16,17,18,19,20,21,22,23,24] and the corresponding strategies used to identify response genes and proteins are summarized in Table 1.
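The selection rule described above can be illustrated with a minimal Python sketch. It is not the authors' code: the `Study` record, its field names, and the `select_studies` helper are hypothetical, and the text does not say how ties were broken when more than three time-course studies existed, so the sketch simply keeps the input order.

```python
from collections import defaultdict
from dataclasses import dataclass

MAX_STUDIES = 3  # "three independent studies per virus" for each study type


@dataclass(frozen=True)
class Study:
    """Hypothetical record for one candidate study (field names are assumptions)."""
    study_id: str
    virus: str        # e.g. "SARS-CoV-2", "SARS-CoV", "MERS-CoV"
    study_type: str   # one of DEG, PPI, DEP, DPP, DTP, DUP, SAP
    time_course: bool  # True if the study reports a time-course survey


def select_studies(candidates, max_studies=MAX_STUDIES):
    """Pick up to `max_studies` per (virus, study type), time-course studies first."""
    groups = defaultdict(list)
    for study in candidates:
        groups[(study.virus, study.study_type)].append(study)

    selected = {}
    for key, studies in groups.items():
        # sorted() is stable: time-course studies move to the front,
        # and the original order is preserved within each priority level.
        ranked = sorted(studies, key=lambda s: not s.time_course)
        selected[key] = ranked[:max_studies]
    return selected


# Made-up candidates, for illustration only.
candidates = [
    Study("A", "SARS-CoV-2", "DEG", False),
    Study("B", "SARS-CoV-2", "DEG", True),
    Study("C", "SARS-CoV-2", "DEG", True),
    Study("D", "SARS-CoV-2", "DEG", False),
    Study("E", "MERS-CoV", "PPI", False),
]
picked = select_studies(candidates)
picked_ids = {key: [s.study_id for s in group] for key, group in picked.items()}
```

In this example the two time-course studies are taken first and one study without a time course fills the remaining slot, while the single MERS-CoV PPI study is used even though fewer than three are available.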

Table 1 Studies and strategies used to identify response genes/proteins

Genome assemblies MN985325.1, NC_004718.3 and NC_019843.3 from the NCBI database (https://www.ncbi.nlm.nih.gov/) were used to annotate SARS-CoV-2, SARS-CoV and MERS-CoV genes, respectively. Drug information was collected from the DrugBank database [25]. Postprocessing of data was performed using R (https://www.r-project.org/) and Python (https://python.org/).

Implementation

H2V was developed using conventional web development techniques. The user interface was developed using HTML5, CSS3, and JavaScript. Bootstrap v4 (https://getbootstrap.com/) was used for layout design. DataTables (https://datatables.net/) was used to organize data in tables on the web page. Cytoscape.js was used for network visualization [26]. Plotly (https://plotly.com/) was used to create interactive plots. PHP (https://www.php.net/), Python (https://www.python.org) and Bash scripts were used for server-side development. The SQLite (https://www.sqlite.org/) database was used to manage the data. NCBI’s sequence viewer (https://www.ncbi.nlm.nih.gov/projects/sviewer/) was embedded on the web page to show the viral genome. The PANTHER API was used for pathway enrichment analysis [27]. Drug information is not stored in H2V; instead, it is automatically retrieved on request from the DrugBank database via UniProt’s REST API [28]. H2V is deployed on an Amazon AWS host running Ubuntu 16.04.
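As an illustration of how response records of this kind could be managed in SQLite, the following Python sketch uses the standard `sqlite3` module. The actual H2V schema is not described in the text, so the `response` table, its columns, and the `genes_for` helper are assumptions, and the rows are placeholders rather than real data.

```python
import sqlite3

# The seven response categories defined in the Data collection section.
RESPONSE_TYPES = ("DEG", "PPI", "DEP", "DPP", "DTP", "DUP", "SAP")


def build_demo_db():
    """Create an in-memory database with a hypothetical `response` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE response (
            gene TEXT NOT NULL,
            virus TEXT NOT NULL,
            response_type TEXT NOT NULL
                CHECK (response_type IN ('DEG','PPI','DEP','DPP','DTP','DUP','SAP')),
            hours_post_infection REAL  -- NULL when no time point applies
        )
        """
    )
    # Placeholder rows; the gene names and time points are invented.
    rows = [
        ("GENE_A", "SARS-CoV-2", "DEG", 6.0),
        ("GENE_B", "SARS-CoV-2", "DEG", 24.0),
        ("GENE_A", "SARS-CoV-2", "DEG", 24.0),
        ("GENE_A", "MERS-CoV", "PPI", None),
    ]
    conn.executemany("INSERT INTO response VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def genes_for(conn, virus, response_type):
    """Return the distinct genes recorded for one virus and response type."""
    cursor = conn.execute(
        "SELECT DISTINCT gene FROM response "
        "WHERE virus = ? AND response_type = ? ORDER BY gene",
        (virus, response_type),
    )
    return [row[0] for row in cursor.fetchall()]


conn = build_demo_db()
sars2_degs = genes_for(conn, "SARS-CoV-2", "DEG")
mers_ppis = genes_for(conn, "MERS-CoV", "PPI")
sars_degs = genes_for(conn, "SARS-CoV", "DEG")
```

Parameterized queries (the `?` placeholders) keep user-supplied virus or category names out of the SQL string, which matters for a web-facing database.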
