NAME ____________________________________________

GENETICS (BIO 306)

RETRIEVING INFORMATION FROM BLAST SEARCH RESULTS 35 POINTS

Supply the following information about your fly genes. You may submit a separate document for each gene, or include both on one document.

1. Write the name of your fly gene.

2. In your own words, describe the putative normal function of the gene (i.e., of the wild-type allele). Include a properly formatted reference for the journal article that describes this function (see links on the assignment website for proper formatting guidelines).

3. Conduct a nucleotide BLAST search using your wild-type gene sequence. Exclude Drosophila (taxid:7215). [Note: If your initial search returns the message “No significant similarity found,” repeat the search using discontiguous megablast rather than megablast as your search algorithm. For some genes (e.g., white), you must increase the maximum number of results (try 500) to get a useful result. To change this setting, expand the “Algorithm parameters” menu in the bottom left corner.] Write the name of the top non-Drosophila hit obtained from the BLAST search, including the name of the organism (scientific and common names) and the name of the gene: basically, the entire Description, minus information about the data type (mRNA, complete sequence, complete cds, etc.). [Note: Sometimes the common name will not be readily available. If not, click the hit's link in the list of hits and, once the locus information comes up, click the link next to the word "ORGANISM". This will take you to the organism's taxonomy page, where the common name is listed.]

4. Write the query coverage % (“Query cover”), E value, and % identity (“Ident”) for your top non-Drosophila hit.

5. Using the Graphic Summary (i.e., the series of colored lines at the top of the BLAST results page), briefly describe the pattern of overlap between the query sequence and the top non-Drosophila hit. Does the entire hit show strong similarity to the query sequence? If not, which parts of the hit do? Provide approximate nucleotide ranges and the similarity level for each (the alignment score range, represented by the color of the bars).

6. Run a second BLAST search. In the “Choose Search Set” window, under “Organism,” include the following genetic model organisms (you can just copy-paste the information in the left column below to the Organism window, using the “+” button to add lines, one per organism):

Escherichia coli (taxid:562) [bacterium]

Saccharomyces cerevisiae (taxid:4932) [yeast]

Caenorhabditis elegans (taxid:6239) [nematode]

Arabidopsis thaliana (taxid:3702) [rock cress (plant)]

Danio rerio (taxid:7955) [zebrafish]

Mus musculus (taxid:10090) [mouse]

Homo sapiens (taxid:9606) [human]

Note: You may need to use the blastn algorithm rather than megablast or discontiguous megablast in order to obtain any hits. Even then, you may not obtain hits from all of these organisms. That is OK! If an organism is not represented in your results, type “NA” under percent query cover for that organism and leave the other two columns blank.

a. In the table below, provide the percent query coverage, percent identity, and the E-value for each model organism.

|Organism |Percent query cover |Percent identity |E-value |
|Escherichia coli | | | |
|Saccharomyces cerevisiae | | | |
|Caenorhabditis elegans | | | |
|Arabidopsis thaliana | | | |
|Danio rerio | | | |
|Mus musculus | | | |
|Homo sapiens | | | |

b. What organism yields the top hit to your Drosophila gene? Is this result what you expected? Why or why not?

c. What is the identity (title/description) of the top-hit gene? Does the top hit appear to have a biologically similar function to your Drosophila gene? Why or why not (use gene name/description, % coverage, % identity, and E value to explain)?
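If you prefer to organize your question 6 results with a script before typing them into the table, the following is a minimal Python sketch of one way to do it. It is not part of the assignment and not an NCBI tool. It assumes you have copied your hits by hand into a list of records, and that the "top hit" for an organism is the one with the smallest E-value. The function names, the record keys (organism, query_cover, identity, evalue), and the example numbers are all hypothetical placeholders. Organisms with no hit get "NA" under percent query cover and blanks in the other two columns, as the note in question 6 requests.

```python
# Organisms in the same order as the worksheet table (question 6a).
MODEL_ORGANISMS = [
    "Escherichia coli",
    "Saccharomyces cerevisiae",
    "Caenorhabditis elegans",
    "Arabidopsis thaliana",
    "Danio rerio",
    "Mus musculus",
    "Homo sapiens",
]


def best_hit_per_organism(hits):
    """Keep, for each organism, the hit with the smallest E-value.

    Assumption: "top hit" means lowest E-value; ties keep the first one seen.
    """
    best = {}
    for hit in hits:
        name = hit["organism"]
        if name not in best or hit["evalue"] < best[name]["evalue"]:
            best[name] = hit
    return best


def table_rows(hits):
    """Return one (organism, query cover, identity, E-value) row per organism."""
    best = best_hit_per_organism(hits)
    rows = []
    for name in MODEL_ORGANISMS:
        hit = best.get(name)
        if hit is None:
            # No hit for this organism: "NA" under query cover, rest blank.
            rows.append((name, "NA", "", ""))
        else:
            rows.append((
                name,
                f"{hit['query_cover']}%",
                f"{hit['identity']}%",
                format(hit["evalue"], "g"),
            ))
    return rows


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT real BLAST results.
    example_hits = [
        {"organism": "Homo sapiens", "query_cover": 45, "identity": 71.5, "evalue": 3e-12},
        {"organism": "Homo sapiens", "query_cover": 30, "identity": 68.0, "evalue": 0.002},
        {"organism": "Danio rerio", "query_cover": 40, "identity": 70.0, "evalue": 1e-9},
    ]
    print("|Organism |Percent query cover |Percent identity |E-value |")
    for row in table_rows(example_hits):
        print("|" + " |".join(row) + " |")
```

The script only tidies values you have already read from the BLAST results page. You still need to check each hit's Description yourself to answer parts b and c.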

7. Now, obtain the amino acid translation of your gene sequence from the GenBank record for your sequence (you should have these saved from when you did the fly sequence assignment), and rerun the BLAST search with the same model organisms as above, but this time use a protein BLAST (blastp) search. Record the top hit organism, gene name/description, % query coverage, % identity, and E value for the top hit. Do these results differ significantly from the results of the nucleotide BLAST search? Does your conclusion change about whether the top hit appears to have a biologically similar function to your Drosophila gene? Why do you think this might be the case?

8. What do your results suggest about the evolutionary history of your gene?

 

Submit your answers to the appropriate dropbox as a Word file (.doc or .docx).
