March 2, 2005: Dr. Olgica Milenkovic (U. Colorado, Boulder)
Constrained and Error-Control Coding for DNA Computers
Place: 2460 A.V. Williams
Time: 3:00 PM
Host: Dr. Alexander Barg
Abstract: The last century was marked by the birth of two outstanding scientific and engineering disciplines: silicon based computing and the theory and technology of genetic data analysis. The research field very likely to dominate the area of scientific computing in the foreseeable future is the merger of these two disciplines, leading to unprecedented possibilities for applications in varied areas of biology and engineering. The first steps toward this goal were made in 1994, when Leonard Adleman solved a quite unremarkable computational problem with an outstanding method. The computational problem considered by Adleman was a simple instant of the directed travelling salesmen problem. The technique used for solving the problem was a new technological paradigm, termed DNA computing. Adleman's experiment represents a landmark demonstration of data processing and communication on the level of biological molecules.
Following Adleman's initial work, many other DNA computing schemes were developed, including solvers for the 3-SAT problem, DNA systems for analyzing chess games, DNA-based logical components, as well as an intelligent, interactive DNA "game machine", called MAYA. Other applications of DNA techniques include DNA-based storage, DNA biosensing, molecular signal processing and intra-cell disease diagnostics and treatment.
One of the major obstacles to efficient DNA computing, storage and signal processing is the very low reliability of DNA hybridization operations. DNA computing experiments require the creation of a controlled environment that allows for a set of DNA oligonucleotide strands (codewords) to hybridize with their complements in an appropriate fashion. If the DNA codewords are not carefully chosen, unwanted, or non-selective, hybridization may occur. Even more detrimental is the fact that an oligonucleotide sequence may fold back onto itself, forming a secondary structure which prevents the sequence from participating in the computational process.
In this talk, we describe some new DNA code design methods which meet the hybridization-constraint requirements. Several classes of such codes are cyclic, which is a particularly useful property in light of the fact that the presence of a cyclic structure reduces the complexity of testing DNA codes for secondary structure formation. Based on some simple properties of a dynamic-programming algorithm, known as Nussinov's method, which is an effective predictor of secondary structure given the sequence of bases in a DNA codeword, we identify some code design criteria that reduce the possibility of secondary structure formation in a codeword. These design criteria can be formulated in terms of the requirement that the Watson-Crick distance between a DNA codeword and a number of its shifts be larger than a given threshold or that the DNA codewords do not contain long subsequences and their reverse complements. In this talk we address both the issue of enumerating DNA sequences with such properties and the problem of practical DNA code construction under such constraints.
Joint work with Navin Kashyap from Queen's University, Kingston, Canada.
Biography: Olgica Milenkovic received the M.S. Degree in Mathematics in 2001, and the Ph.D. Degree in Electrical Engineering in 2002, both from the University of Michigan, Ann Arbor. She is currently an Assistant Professor in the Department of Electrical and Computer Engineering at the University of Colorado, Boulder. Her research interests are in the area of error-control coding, constrained coding, probability theory, analysis of algorithms and genetics.