NOT PEER-REVIEWED
"PeerJ Preprints" is a venue for early communication or feedback before peer review. Data may be preliminary.

A peer-reviewed article of this Preprint also exists.

Supplemental Information

There are 19 different types of ConceptClass included in the dataset

Note: *Indicates data that was included in the updated dataset, used during this work.

DOI: 10.7287/peerj.preprints.1427v1/supp-1

Increasing the edgeset of a semantic subgraph can actually improve the search time (in seconds)

Note: Random semantic subgraphs were created with |V (Q)| of 4. Edgesets (|E (Q)|) of the subgraphs ranged from 3 to 6. Random target graphs were created with node sets ranging from 1 × 10^4 to 1 × 10^5. The algorithm used one of two parameters: i) all elements of the match must be greater than ST (top); or ii) all elements must cumulatively be greater than the ST (bottom).

DOI: 10.7287/peerj.preprints.1427v1/supp-2
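
The two threshold parameters named in the note above can be illustrated with a minimal Python sketch. It is not taken from the DReSMin code base: the function names are hypothetical, the per-element scores are assumed to be semantic similarity values in [0, 1], and "cumulatively" is read here as the mean of the element scores, which is one plausible interpretation only.

```python
from statistics import mean
from typing import Sequence


def passes_elementwise(scores: Sequence[float], st: float) -> bool:
    """Parameter (i): every element of the match must score above ST."""
    return len(scores) > 0 and all(s > st for s in scores)


def passes_cumulative(scores: Sequence[float], st: float) -> bool:
    """Parameter (ii), under the assumption stated in the lead-in:
    the mean of all element scores must be above ST."""
    return len(scores) > 0 and mean(scores) > st


# Hypothetical scores for the nodes and edges of one candidate match.
candidate = [0.9, 0.7, 0.95]
strict = passes_elementwise(candidate, 0.8)    # one element falls below ST
relaxed = passes_cumulative(candidate, 0.8)    # the mean is still above ST
```

Under this reading, parameter (ii) is the more permissive of the two: a match with one weak element can still be accepted if the remaining elements score highly enough.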

Accuracy of algorithm when altering target graph (G) connectivity

Note: Semantic subgraphs were created at random with a |V (Q)| of between 3 and 6. Runs were duplicated at least five times for each point in the graph, using the algorithm with two alternate parameters: i) all elements of the match must be greater than ST (top left and top right); and ii) all elements must cumulatively be greater than the ST (bottom left and bottom right). Graphs on the left show matches returned before and after spiking the target graph with 100 instances of the semantic subgraph that is to be searched for. Graphs on the right show the difference between the spiked (red) and non-spiked searches (black).

DOI: 10.7287/peerj.preprints.1427v1/supp-3

Accuracy of algorithm when altering target graph (G) size

Note: Semantic subgraphs were created at random with a |V (Q)| of between 3 and 6. Runs were duplicated at least five times for each point in the graph, using the algorithm with two alternate parameters: i) all elements of the match must be greater than ST (top left and top right); and ii) all elements must cumulatively be greater than the ST (bottom left and bottom right). Graphs on the left show matches returned before and after spiking the target graph with 100 instances of the semantic subgraph that is to be searched for. Graphs on the right show the difference between the spiked (red) and non-spiked searches (black).

DOI: 10.7287/peerj.preprints.1427v1/supp-4

Scored Subgraphs

The left-hand graph is a boxplot showing the semantic subgraph scores, with the mean score shown by a red point. The right-hand graph shows the scores for subgraphs ordered by ID. All subgraphs that scored > maximum are labelled with their ID.

DOI: 10.7287/peerj.preprints.1427v1/supp-5

There are 19 different types of ConceptClass included in the dataset

Note: *Indicates data that was included in the updated dataset, used during this work.

DOI: 10.7287/peerj.preprints.1427v1/supp-6

There are 42 different categories of RelationType included in the dataset

Note: *Indicates data that was included in the updated dataset, used during this work.

DOI: 10.7287/peerj.preprints.1427v1/supp-7

Valid RelationTypes and their context

DOI: 10.7287/peerj.preprints.1427v1/supp-8

Scored subgraphs

Note: # SP refers to the number of drug-target associations captured in DBv3, and not in our dataset, whose shortest path is captured via this subgraph. # Interactions refers to the number of drug-target associations inferred via S, and # Unique refers to non-redundant inferred drug-target associations. The number of inferred interactions that are captured in DBv3 is shown in the # Valid column. The S score is given to five decimal places. As we were using a semantic distance of 0.8, subgraphs that maintain a common topology and relatively similar semantics (e.g. 62, 63 and 64) can return the same set of mappings.

DOI: 10.7287/peerj.preprints.1427v1/supp-9

Scored associations

Top 100 ranked drug-target associations inferred by DReSMin.

DOI: 10.7287/peerj.preprints.1427v1/supp-10

Graph Split algorithm

Article provides a more concise description of the graph split component of DReSMin.

DOI: 10.7287/peerj.preprints.1427v1/supp-11

Calculating Semantic Threshold

Article provides an explanation of the semantic threshold (ST) employed during this work.

DOI: 10.7287/peerj.preprints.1427v1/supp-12

Additional Information

Competing Interests

Dr. Hannah Tipney and Peter Woollard are paid employees of GSK.

Author Contributions

Joseph Mullen conceived and designed the experiments, performed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Simon J Cockell conceived and designed the experiments, analyzed the data, wrote the paper, reviewed drafts of the paper.

Hannah Tipney conceived and designed the experiments, wrote the paper, reviewed drafts of the paper.

Peter M Woollard conceived and designed the experiments, wrote the paper, reviewed drafts of the paper.

Anil Wipat conceived and designed the experiments, analyzed the data, wrote the paper, reviewed drafts of the paper.

Data Deposition

The following information was supplied regarding data availability:

https://bitbucket.org/ncl-intbio/dresmin

Funding

JM receives funding as a CASE student from GSK and EPSRC (ref 1592752). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

