Author: Tommy Angelini
Title: Protein Chips (13 kb)
Abstract:
In this course, our studies have been based entirely on methods of comparison, relying on the assumption that somewhere there is indeed real information. After all, comparisons are not made for their own sake, and meaningful comparisons must make reference to some known facts. Where does the information come from in bioinformatics? In this essay I will present one of the methods for collecting such information: protein chips.
Author: Anoush Aghajani-Talesh
Abstract:
This essay describes a heuristic method for the extraction of profiles from multiple sequence alignments.
Author: Marco V. Bayas
Abstract:
The use of Hidden Markov Models (HMMs) in protein modeling is described. Sequence alignment based on profile HMMs can help identify protein family members and offers some advantages. This possibility is discussed.
Author: Swarbhanu Chatterjee
Abstract:
Hidden Markov Models are a sophisticated and useful tool for sequence alignment. Conventional tools are not able to analyze the large amount of data that is available. In this review paper, I explain what HMMs are, describe the different kinds of HMM that exist, and show how they are useful.
Author: Soon Yong Chang
Abstract:
This paper focuses on the non-coding portion of DNA, where the regulatory elements of the genes are found. Unlike the coding portion, where the word size is limited to 3 letters (a codon), this region of DNA allows greater flexibility in the possible size of the words. A proper understanding of the segmentation holds the promise of a better understanding of the non-coding region of DNA.
Author: Jordi Cohen
Abstract:
Protein search engines such as BLAST use scoring matrices to assign a score to protein alignments. The usual scoring matrices, such as PAM and BLOSUM, are very well suited to general-purpose searches, but they perform sub-optimally when used to compare hydrophobic protein domains such as those found in membrane proteins. It has been suggested that new scoring matrices should be developed that would be especially suited to comparing the hydrophobic parts of proteins. New scoring matrices are introduced here that perform noticeably better in queries that involve such protein segments. These matrices have unusual properties, such as asymmetric off-diagonal components and negative diagonal elements. In this essay, I will present these matrices and describe their properties.
Author: Peter Fleck
Author: Parag Ghosh
Title: Determining protein function from Comparative Genome Analysis (19 kb)
Abstract:
This essay aims at identifying proteins that participate in a functional pathway. The underlying assumption is that proteins that function together in a pathway or structural complex are likely to be preserved together or eliminated together in organisms during the process of evolution. This property of correlated evolution is studied here by characterizing each protein by its phylogenetic profile. This method not only brings out functional correlations among proteins but also helps us to predict the functions of uncharacterized proteins.
Author: Matt Gordon
Abstract:
The need for fast comparison of genetic sequences has become evident with the rapid expansion of genetic sequence databases. This paper discusses the most popular sequence alignment algorithms, based on dynamic programming, and how they can be effectively sped up by the use of parallel processing algorithms that distribute the computing requirements over several processors. Key issues such as load balancing and efficient processor usage are addressed.
Author: Paul Grayson
Abstract:
This essay discusses a recent study comparing the entire genomes of three eukaryotic organisms. The study identified most of the protein sequences in each species, in order to examine their differences. Comparison to known features of the species allows us to begin to understand why their genomes contain differing numbers of copies of particular types of protein.
Author: Chalermpol Kanchanawarin
Abstract:
When aligning multiple DNA sequences, what would you do if you found regions in the sequences that do not look like they can be aligned because the alignment is ambiguous? In most studies, these alignment-ambiguous regions are simply removed before analysis is carried out. In an article in the opinion section of the December 2001 issue of TRENDS in Ecology & Evolution, the author, Michael S.Y. Lee, raises and stresses again the importance of proper study and analysis of these alignment-ambiguous regions (e.g. the rapidly evolving regions of genes) in molecular phylogenetic and evolutionary studies. Three promising methods are suggested, with examples, for the analysis of such regions.
Author: David Larson
Abstract:
This paper presents a definition of hidden Markov models and some of the mathematics behind them. It also discusses some of the uses and applications of these models, with a primary focus on categorizing DNA sequences.
Author: Yan Li
Abstract:
This article describes a reliable and efficient method for multiple sequence alignment: the CLUSTAL W method. A brief introduction to the progressive approach is followed by a summary of the improvements CLUSTAL W makes to its sensitivity. The modifications made by CLUSTAL W are discussed in detail, together with its limitations.
Author: Ian O'Dwyer
Abstract:
This essay reviews a statistical comparison of the two draft versions of the human genome. It is found that, although both genomes share some similar features at a macro level, they differ in the details.
Title: The Minimal-Gene-Set (676 kb)
Abstract:
An interesting challenge facing the biological community is the construction of a genome with the minimum required genes. This essay reviews a comparative genomics approach to the problem; this approach, combined with some biochemistry, may lead to a solution.
Author: Rahul Roy
Abstract:
The century of biological revolution, brought about by the large-scale sequencing of genomes, has provided us with more than we can handle. The number of sequences and genes in the databases has been growing exponentially.
Author: Prasanth Sankar
Abstract:
This essay describes a recently published ab-initio bioinformatic method for identifying the regulatory genes. The method requires no experimental input, unlike other methods used for the same purpose, and develops a scheme to generate words from randomly placed letters. The success of the method in identifying the regulatory genes is modest.
Author: Martin Ph. Stehno
Abstract:
Conventional motif-finding algorithms are optimized for either good soundness or completeness. Good soundness is achieved when the output lists only a few motifs that are very likely to represent binding sites. On the other hand, there are algorithms designed to give a complete list of binding sites. The downside of this is that the list will typically also contain hundreds of small variations of strong motifs, which are not considered to be motifs in their own right. Here I report a new method of post-processing the output of such an algorithm. The method was invented by Blanchette and Sinha (2001) and clears the motif list of artifacts of strong motifs. What remains is a small number of IUPAC ambiguity code sequences that very likely represent real binding sites.
Author: Kalin Vestigan
Abstract:
Hidden Markov Models (HMMs) are an implementation of the idea that the scoring parameters should guide the multiple alignment as much as the alignment should determine the scoring parameters.
Author: Elizabeth Villa Rodriguez
Abstract:
This essay reviews the scientific efforts of several groups to investigate, by means of molecular phylogenetics, whether a polio vaccine was to blame for the introduction of HIV into the landscape of viruses that affect humans.
Author: Qing-jun Wang
Abstract:
This essay describes recent progress in identifying the borders of biologically meaningful units in a genome. A segmentation procedure based on the Jensen-Shannon divergence, with a new stopping criterion based on the Bayesian information criterion, is introduced. The procedure was applied to the complete genome of E. coli and the left telomere of chromosome 12 of yeast, and yielded highly accurate segmentations.
Author: Paul Welander
Abstract:
Dr. Samuel Karlin and colleagues at Stanford University have developed a method for assessing genomic similarities based on the relative abundances of short nucleotide chains. The goal of such a method is to eliminate the need for homologous sequences that have been previously aligned by another procedure. This approach deviates from previous methods of genomic analysis by utilizing information derived from the entire genome rather than from specific subsequences. The resulting genomic comparisons are generally in agreement with accepted phylogenies.
Author: Kalin Vestigan
Author: Jian Xu
Title: Building a dictionary for DNA -- Decoding the regulatory regions of a genome (285 kb)
Abstract:
This essay explains how statistical physics can be used to find the genes in the regulatory region, which used to be referred to as "junk". Using a free energy analog, the authors built a dictionary for the regulatory region, then confirmed some old "words" (regulatory factors) and found some new "words".
Author: Jin Yu
Abstract:
Approaches to gene recognition are traditionally divided into intrinsic and extrinsic approaches. This essay introduces work that uses bacterial DNA regions significantly related to known proteins to extract codon usage statistics and other intrinsic recognition parameters, which are then applied to unexplored parts of a genome. The leading idea of this work is that extrinsic evidence should be given higher priority than intrinsic information.
Author: Guojun Zhu
Title: Bioinformatics Framework at Organism Level (113 kb)
Abstract:
This essay describes the bioinformatics framework and database at the organism level. I describe its structure, its compromises and limitations, its use, and its future, and discuss some examples.