Assignment 8 – DNA Visualization

Dean Zeller Due: Thursday, April 19th by 5:00 pm

CS10061 10 points

Spring, 2007

Objective: The student will create a Python program implementing the chaos-game method of visualizing DNA sequences.

Readings: H. Joel Jeffrey (1990). “Chaos game representation of gene structure.” Nucleic Acids Research, Vol 18, No. 8, pp 2163-2170. (Available on class web page.)

Background: Visual statistics using the Chaos Game Representation (CGR)

It is difficult for humans to look at a long list of numbers and find meaningful patterns. It is far more effective to provide a visual representation of the statistics, which is why bar charts, pie charts, and line graphs are so widely used. The chaos game representation is just one of many DNA visualization algorithms. Analyzing a sequence in this format can reveal characteristics of its structure. DNA sequences can be found at the National Center for Biotechnology Information (NCBI) website. Interesting fractal patterns can be created from contrived examples of DNA. For example, a random sequence will fill the square uniformly, but a random sequence without any A’s will create the famous Sierpinski triangle fractal.
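The contrived sequences described above can be produced with a few lines of Python. The following is only a sketch: random_dna and its parameters are hypothetical names chosen for illustration and are not taken from the randomDNA.py program used in class.

```python
import random

def random_dna(length, letters="ACGT", seed=None):
    """Return a random string of `length` characters drawn from `letters`."""
    rng = random.Random(seed)  # private generator, so a seed gives repeatable output
    return "".join(rng.choice(letters) for _ in range(length))

uniform = random_dna(5000)               # all four letters: fills the square evenly
no_a = random_dna(5000, letters="CGT")   # no A's: the Sierpinski triangle case
```

Passing the same seed twice gives the same sequence, which makes it easier to reproduce a pattern you want to print.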

Input

The user must provide parameters for the drawing of the visualization. Collect these values from the user before execution. You may create more parameters as needed.

Filename: the file containing the DNA sequence

Interval size: how often to pause execution

Dot size: the radius of the dot created at each point
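Once the filename is known, the sequence has to be loaded from the file. The sketch below shows one way this might be done. It assumes a plain text file, skips any line starting with ">" (an assumed FASTA-style header), and keeps only the letters A, C, G, and T. The function name read_sequence is hypothetical; your program from the previous assignment may already do this differently.

```python
def read_sequence(filename, letters="ACGT"):
    """Return the DNA letters found in a text file as one uppercase string."""
    kept = []
    with open(filename) as handle:
        for line in handle:
            if line.startswith(">"):  # assumed FASTA-style header line; skip it
                continue
            for ch in line.upper():
                if ch in letters:     # drops newlines, digits, spaces, and N's
                    kept.append(ch)
    return "".join(kept)
```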

Output

In addition to the report created by the previous assignment, draw the dots produced by the CGR algorithm, pausing the drawing after every interval-size dots until the user chooses to continue.

Testing

Test your program on a wide range of actual DNA sequences, random sequences, and repeated patterns. Compare the visualization to the report generated, and make note of any patterns found.

Tasks

Implement at least three of the following tasks using your chaos automata visualization:

1. Implement the chaos-automata algorithm using Python.

2. Create visualizations for actual DNA sequences for organisms.

3. Use the randomDNA.py program to generate DNA files with specific characteristics to create different artistic patterns.

4. Create guidelines similar to the diagrams above to indicate the different quadrants.

5. Allow the user to specify the characters represented for each corner.

6. Change the color of the dot every so often, or allow the user to change the color at every interval.

7. Own idea: write up a task idea of your own and implement it within the program. Get your idea approved by your instructor first.

Documentation

This assignment builds on the previous assignment. Create documentation blocks for any new functions you create, and list yourself as the author.

Turning in your assignment

1. Print a copy of your CGR program. If you made significant modifications to the randomDNA program, print that as well.

2. Print at least three interesting patterns generated by your program.

3. Write a report briefly describing the tasks completed.

4. No presentation is required for this assignment. It would make an excellent Run-With-It presentation.

Grading

You will be graded on the following criteria:

Quantity: Variety of tasks correctly implemented

Readability: Documentation indicating the lines of code created and modified

Testing: Testing your program on a variety of input files

Creativity: Use new methods to solve the problem

Extra Credit

Extra points will be given for including the following features:

Extra quantity: Implement more than three tasks

Report/Analysis: Include a formal report and spreadsheet of your analysis

CGR algorithm

This assignment combines concepts from the previous assignment on DNA sequencing and material from the graphics assignments. This assignment will implement the CGR algorithm to visualize the input DNA. The algorithm is actually quite simple, once the graphics procedures are understood.

Step 0: Create the DNA Square.

Based on parameters from the user, create a square within the canvas where all points will be drawn. You will need to know the left, top, right, and bottom positions of the square's edges within the canvas.
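One possible way to compute the four edge positions is sketched below. It assumes the canvas size and a margin are the user-supplied parameters, and that y values grow downward as in most canvas libraries, so top is smaller than bottom. The name dna_square and the margin parameter are hypothetical.

```python
def dna_square(canvas_width, canvas_height, margin=20):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(canvas_width, canvas_height) - 2 * margin
    left = (canvas_width - side) / 2.0   # center the square horizontally
    top = (canvas_height - side) / 2.0   # center the square vertically
    return left, top, left + side, top + side

left, top, right, bottom = dna_square(600, 400, margin=20)
```

For a 600 by 400 canvas with a margin of 20, this gives a square 360 units on a side with left = 120, top = 20, right = 480, and bottom = 380.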

Step 1: Initial Point

Create two variables, x and y, denoting the point to draw. The initial point should be the center of the square.

x = (left + right)/2

y = (top + bottom)/2

Note: you do not draw the initial point; this just serves as a starting point.

Step 2: For each character in the DNA sequence

a) Calculate the next point to draw, which is halfway between the current point and the appropriate corner for the letter. (A: top-left, C: top-right, G: bottom-right, T: bottom-left)

b) Draw the current point
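Drawing a point usually means drawing a small filled circle whose radius is the dot size. The sketch below assumes a Tkinter-style canvas object with a create_oval method; the graphics procedures from your earlier assignments may differ. The name draw_dot and its parameters are hypothetical.

```python
def draw_dot(canvas, x, y, radius=1, color="black"):
    """Draw a filled circle of the given radius centered on (x, y)."""
    # create_oval takes the bounding box of the oval: two opposite corners.
    return canvas.create_oval(x - radius, y - radius,
                              x + radius, y + radius,
                              fill=color, outline=color)
```

The canvas is passed in as a parameter so the same function works with any canvas you have already created.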

Pseudocode

The following is pseudocode for the CGR algorithm. Your job is to implement the algorithm using Python.

x = (left + right)/2
y = (top + bottom)/2
for each index i in seq
    if seq[i]=='A'
        x = (x+left)/2
        y = (y+top)/2
    else if seq[i]=='C'
        x = (x+right)/2
        y = (y+top)/2
    else if seq[i]=='G'
        x = (x+right)/2
        y = (y+bottom)/2
    else if seq[i]=='T'
        x = (x+left)/2
        y = (y+bottom)/2
    drawDot(x,y)
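The pseudocode can be translated into Python in several ways. The sketch below is one of them, not the required solution: it returns the list of points instead of drawing them, and it looks up the corner in a dictionary instead of using a chain of if statements. The names cgr_points and CORNERS are hypothetical, and skipping unknown characters is an assumption that the pseudocode does not make.

```python
# Corner assignment from Step 2: each letter names an (x-edge, y-edge) pair.
CORNERS = {"A": ("left", "top"), "C": ("right", "top"),
           "G": ("right", "bottom"), "T": ("left", "bottom")}

def cgr_points(seq, left, top, right, bottom, corners=CORNERS):
    """Return the list of (x, y) points the CGR algorithm would draw."""
    edges = {"left": left, "top": top, "right": right, "bottom": bottom}
    x = (left + right) / 2.0  # start at the center; this point is not drawn
    y = (top + bottom) / 2.0
    points = []
    for letter in seq:
        if letter not in corners:
            continue  # assumption: skip anything that is not a known letter
        x_edge, y_edge = corners[letter]
        x = (x + edges[x_edge]) / 2.0  # move halfway toward the corner
        y = (y + edges[y_edge]) / 2.0
        points.append((x, y))
    return points

points = cgr_points("ATAG", left=0, top=0, right=1, bottom=1)
```

On a unit square, the four points for "ATAG" are (0.25, 0.25), (0.125, 0.625), (0.0625, 0.3125), and (0.53125, 0.65625). Passing a different corners dictionary is one way to approach the task of letting the user choose the character for each corner.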

Example: Sequence: “ATAGCCTGTGA”

[Figures: plots for the example sequence ATAGCCTGTGA]

Analysis

Consider the diagrams below. The first contains additional guidelines for patterns of different sizes. The final diagram is the chaos-automata result for the sequence ATAGCCTGTGA. Note the correspondence between the guidelines and the final pattern.

[Figures: guideline diagrams, followed by the final plot for the sequence ATAGCCTGTGA]
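Guidelines like these divide the square in half repeatedly: one level of subdivision gives the four quadrants (one per most recent letter), and two levels give sixteen cells (one per most recent pair of letters). The sketch below computes where the interior lines fall along one axis. It is one possible reading of the guidelines, and the name guideline_positions is hypothetical.

```python
def guideline_positions(low, high, levels):
    """Return the interior grid-line coordinates for `levels` of subdivision."""
    cells = 2 ** levels                 # number of cells along one axis
    step = (high - low) / float(cells)  # width of one cell
    return [low + k * step for k in range(1, cells)]

quadrant_lines = guideline_positions(0, 1, 1)   # [0.5]
finer_lines = guideline_positions(0, 1, 2)      # [0.25, 0.5, 0.75]
```

Call the function once with left and right for the vertical lines, and once with top and bottom for the horizontal lines.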

-----------------------

Interrupting Execution

The following Python code will interrupt a loop once every interval iterations.

interval = 10
for i in range(1000):
    print i,
    # true when i is 0, 10, 20, ..., so the loop pauses once every 10 passes
    if i%interval == 0:
        raw_input("press return to continue")
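The same idea can be applied to the drawing loop. The sketch below is a slight variation on the loop above: it counts the dots drawn so far and pauses after every interval-th dot, so it does not pause on the very first pass. The draw and pause actions are passed in as functions. The name plot_with_pauses and its parameters are hypothetical.

```python
def plot_with_pauses(points, interval, draw, pause):
    """Call draw(x, y) for each point, calling pause() after every `interval` dots."""
    drawn = 0
    for x, y in points:
        draw(x, y)
        drawn += 1
        if drawn % interval == 0:  # true after dot number interval, 2*interval, ...
            pause()
    return drawn
```

For the pause argument you might pass a small function that calls raw_input("press return to continue"), and for the draw argument your own dot-drawing procedure.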
