Physics 498: Statistical Physics of Biological Information and Complexity

Nigel Goldenfeld

Homework 3

Due date: This assignment is due Fri Nov 16, 2001 or earlier.  There will be a penalty for late entries.  If you know that you will be unable to make this deadline for a good reason, e.g. you have beam time at Argonne or something like that, let me know ahead of time so that we can arrange a mutually agreed upon deadline. 

HW 3--1: This assignment is a chance for you to play with some web-based bioinformatics tools in an open-ended way.

(a) Please go to this tutorial on using databases, BLAST etc. and work through it, as was done in class.   You need not hand anything in for this.  You may prefer to do the analysis using the Biology WorkBench environment.   If the web resources do not answer your questions on this, please feel free to ask Michelle Nahas, who will do her best to answer your questions.

(b) The tutorial ends in an assignment (exercise 6).  Please work with a mystery sequence from this page and try to figure out the family and probable function, and anything else you can.  You should choose the mystery sequence from the assignment web page, based on the first letter of your last name.

If your last name begins with:

the letter A-E then choose sequence 1

the letter F-J then choose sequence 2

the letter K-O then choose sequence 3

the letter P-T then choose sequence 4

the letter U-Z then choose sequence 5
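
For anyone who wants to check their assignment programmatically, the rule above can be sketched in a few lines of Python. This is only an illustration: the function name `mystery_sequence_for` is hypothetical, and the sketch assumes the last name starts with an unaccented letter A-Z.

```python
def mystery_sequence_for(last_name: str) -> int:
    """Return the mystery sequence number (1-5) for a given last name.

    Hypothetical helper: assumes the name begins with a plain letter A-Z.
    """
    initial = last_name.strip()[:1].upper()
    if not ("A" <= initial <= "Z"):
        raise ValueError(f"cannot assign a sequence to {last_name!r}")
    # The ranges A-E, F-J, K-O, P-T are blocks of five letters;
    # the last block, U-Z, has six, so cap the result at 5.
    return min((ord(initial) - ord("A")) // 5 + 1, 5)


if __name__ == "__main__":
    for name in ["Avery", "Goldenfeld", "Nahas", "Turing", "Zhang"]:
        print(name, "->", mystery_sequence_for(name))
```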

Try to explain your reasoning as best you can.  I do not expect you to spend more than 1-2 hours on this, after you have worked through the tutorial in (a).

HW 3--2: Write a brief paper about any aspect of sequence alignment, motif finding, pattern identification or bioinformatics that you find interesting.  Some suggested classes of topics:

Length of paper: No fewer than 2 and no more than 4 single-spaced, 12-point typed pages.  This will be strictly enforced (I will not read more than 4 pages).   Figures should be attached separately, i.e. they will not count towards the length limit.

Dissemination of papers: A few selected papers will be presented by their authors in a brief 10-minute talk later this semester.   I will endeavour to collect all the homework and term papers together in a book for all those taking this class.

Further details:

I expect that you will use the web resources on the course's home page, and/or scan recent issues of Proceedings of the National Academy of Sciences, Science, Nature, Physical Review Letters, the appropriate online preprint archives, or more biologically technical journals, such as Cell.

The paper should have a bibliography.

You should imagine that your reader is one of your classmates.

Format: Your paper should have approximately the following structure.  Here are some suggestions for the sorts of questions your paper should address to make it most useful to the reader.  As you will see, the purpose is not to focus too much on technical details.

Introduction and Background:

What hypotheses are being tested in this paper?

What information induced the authors to perform the experiments/theory?

What new methods or insights were brought to bear on the problem?

Why did you choose to write about this topic?

Why is this interesting or important?


Methods:

What are the critical methods of the paper?

What enabling technologies are used?

What are the weaknesses of the methods used?

Are there other or better approaches that could be used?

Results and Discussion:

What are the primary conclusions of the paper?

Did the authors prove their hypotheses?

What novel information or directions come from this work?

What control experiments were performed? (If appropriate)

What assumptions still remain in the work?

How could these assumptions be tested?

What other explanations for the observations are still possible?

What would you do next to advance this field?