The overall goal of the following experiment is to identify potential regulatory signals in sets of co-regulated genes and to identify other genes that might functionally belong to this set. This is achieved by submitting a list of candidate genes or sequences to the SCOPE algorithm. As a second step, the analysis results are examined to identify high-scoring motifs.
One or more motifs are chosen to investigate further. Next, the extra genes identified by SCOPE for a given motif are combined with the original gene set and resubmitted to SCOPE for further analysis. Results are obtained that show motif patterns in the new, enlarged gene set. These patterns serve to confirm the original motif identification and to suggest extra genes that might be considered for further biological investigation for co-regulation with the original genes submitted to SCOPE.
The main advantage of this approach is that, unlike other motif finders, SCOPE uses an ensemble approach that consists of a number of different algorithms, each of which has been designed to find a specific kind of motif. In addition, SCOPE doesn't force the user to provide unknowable information such as the length of the motif or the number of occurrences of the motif that the user expects to find. The user interface is extremely simple and doesn't require any expertise in regulatory sequence finding.
Therefore, the user will be able to perform the analysis in a very simple and straightforward way. SCOPE output is highly graphical in nature, and that means that when you get the results back, it's very intuitive to find patterns and identify motifs that are of interest to you. To begin, prepare a list of names for genes that may be co-regulated for analysis by SCOPE.
Save the list as a text file or copy it to the clipboard to paste into SCOPE later. The file should contain one gene name per line with no additional information. Start the web browser and connect to the URL shown here.
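The file format described above (one gene name per line, nothing else) can be sketched in a few lines of Python. This is only an illustrative helper, not part of SCOPE; the function names `clean_gene_list` and `write_gene_list` are assumptions made for this example, and removing duplicate names is a convenience choice rather than a documented requirement.

```python
from pathlib import Path


def clean_gene_list(raw_names):
    """Strip whitespace and drop blank or duplicate names, keeping the original order."""
    seen = set()
    cleaned = []
    for name in raw_names:
        name = name.strip()
        if name and name not in seen:
            seen.add(name)
            cleaned.append(name)
    return cleaned


def write_gene_list(raw_names, path):
    """Write one gene name per line with no additional information."""
    names = clean_gene_list(raw_names)
    Path(path).write_text("\n".join(names) + "\n", encoding="utf-8")
    return names
```

Calling `write_gene_list(names, "genes.txt")` with your own gene names produces a plain text file that can be selected with the choose file button, or opened and pasted into the gene list text box.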
The information that SCOPE needs to perform the analysis can now be entered. Using the species popup menu, choose the species to be examined. It is important to choose the correct species because SCOPE refers to the genome to calculate background frequencies of occurrence for any candidate motif it is examining.
For the upstream sequence, radio buttons are used to choose either intergenic or fixed length. Intergenic will analyze all the sequence between the gene of interest and the previous gene, whereas fixed length will look at exactly the specified number of nucleotides upstream from the start of the current gene.
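The difference between the two upstream options can be illustrated with a small sketch. It assumes a simplified model that is not taken from SCOPE itself: a chromosome held as a Python string, 0-based coordinates, and a gene on the forward strand only. The function name `upstream_region` and its parameters are hypothetical.

```python
def upstream_region(chromosome, gene_start, prev_gene_end, mode, length=None):
    """Return the upstream sequence of a forward-strand gene.

    mode="intergenic": everything between the end of the previous gene
                       and the start of the gene of interest.
    mode="fixed":      exactly `length` nucleotides before the gene start
                       (fewer only if the chromosome start is reached).
    """
    if mode == "intergenic":
        begin = prev_gene_end
    elif mode == "fixed":
        if length is None or length < 0:
            raise ValueError("fixed mode needs a non-negative length")
        begin = max(0, gene_start - length)
    else:
        raise ValueError("mode must be 'intergenic' or 'fixed'")
    return chromosome[begin:gene_start]
```

Note that in this toy model a fixed length can reach back into the previous gene, while the intergenic option never does. A real implementation would also have to handle genes on the reverse strand.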
Next, inform SCOPE what gene set to analyze by either pasting the gene list or FASTA sequences into the gene list text box, or by pressing the choose file button to select the file. The "results must include" section can be used to enter a motif for SCOPE to include in its analysis and is useful when looking for a specific motif. The last section on the page can be used to enter an email address and a comment to be saved with the analysis.
If this is filled in, SCOPE will send an email with a link back to the webpage containing the results. Now that the SCOPE parameters have been described, the sample search button will be used to demonstrate how the SCOPE program is run. Pressing this button will automatically fill in all the necessary information that was described for a manual run.
Three genes are automatically entered and the appropriate choices are selected for the other fields. These three genes are involved in telomere maintenance via recombination in Saccharomyces cerevisiae. Keeping the extra genes option checked will instruct SCOPE to look through the genome and identify any other genes that have this upstream motif and that increase the motif score when added to the original gene set.
Finally, start the analysis by pressing the run SCOPE button at the bottom of the page. After the analysis is run, the results appear on the screen. The top of the page contains a table of information about the motifs that were found using SCOPE.
The first column contains a list of motifs that were found, and small colored squares serve as a legend for the graphical motif map shown below. The display of any given motif may be toggled on or off by clicking the colored box. Other columns of data include count, the number of occurrences of the motif in the entire gene set; sig value, an indication of the significance of that motif; coverage, the fraction of the submitted genes that contain at least one instance of that motif; and algorithm, which tells which of the three SCOPE component algorithms was used to detect the motif.
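The count and coverage columns follow directly from their definitions, which a short sketch can make concrete. It assumes exact string matching of the motif on the forward strand only, which is a simplification for illustration; the function names are hypothetical, and the sig value is not computed here because its formula is not described in this protocol.

```python
def find_occurrences(sequence, motif):
    """Return the start positions of every (possibly overlapping) exact match."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def count_and_coverage(upstream_by_gene, motif):
    """Count = total occurrences over all genes.
    Coverage = fraction of genes with at least one occurrence."""
    hits = {gene: find_occurrences(seq, motif)
            for gene, seq in upstream_by_gene.items()}
    count = sum(len(positions) for positions in hits.values())
    genes_with_hit = sum(1 for positions in hits.values() if positions)
    coverage = genes_with_hit / len(hits) if hits else 0.0
    return count, coverage
```

A motif that occurs twice upstream of one gene and not at all upstream of another therefore has a count of 2 but a coverage of only 0.5, which is why the two columns are reported separately.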
Clicking on any of the listed motifs will take the user to a page containing detailed information for that motif. On this page, the motif is represented in three ways: a sequence logo, a position weight matrix, and a list of all motif instances with their positions, strands, and genes.
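To show how a list of motif instances relates to a position weight matrix, here is a minimal sketch that turns aligned instances of equal length into per-position base frequencies. Whether SCOPE displays counts, frequencies, or log-odds scores is not stated in this protocol, so the plain frequency form and the function name `position_frequency_matrix` are assumptions for illustration.

```python
def position_frequency_matrix(instances, alphabet="ACGT"):
    """Return one dict per motif position mapping each base to its frequency."""
    if not instances:
        raise ValueError("at least one motif instance is required")
    width = len(instances[0])
    if any(len(instance) != width for instance in instances):
        raise ValueError("all motif instances must have the same length")
    matrix = []
    for col in range(width):
        # Collect the base seen at this position in every instance.
        column = [instance[col].upper() for instance in instances]
        matrix.append({base: column.count(base) / len(instances)
                       for base in alphabet})
    return matrix
```

A sequence logo is a graphical rendering of the same per-position information, so the three representations on the page are different views of one set of motif instances.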
Further down the page are additional details regarding the results of looking for other genes containing this motif. In this case, there were 65 other genes containing the motif, all of which improved the sig value when added to the original gene set. Pressing "add checked genes to search" will return to the SCOPE setup page
with these genes added to the original gene set and the parameters set as they were previously. In this example, 10 extra genes are added to the original three. The results of SCOPE analysis of the extra genes for this motif are shown here. The original three genes are on the bottom of the results.
Looking at the pattern of the motifs in the upstream region of these extra genes clearly shows that they are similar. The original motif is now the highest-scoring motif in the set. Shown here is an additional analysis result for a different set of genes involved in ribosome biogenesis. Clearly visible is a pattern of two motifs that can be seen for most of the genes being examined.
The graphical output of SCOPE makes these kinds of patterns very easy for the user to discern. Unlike many other motif finders, SCOPE does not require the user to guess at the length of the motif or the kinds of motifs the program should be looking for. Whether you're an expert or a novice, SCOPE has the same interface and will provide the same results for you.
Additionally, SCOPE's ability to find other genes that might be co-regulated with your initial gene set provides new ideas for biological experimentation. The ensemble approach usually means that SCOPE will return the most meaningful results, often much more accurately than other motif finders. Don't forget, though, that the ultimate interpreter of the results is you, the life scientist.
No matter what the computer says, the results have to make sense in terms of biological context.