CMU-CS-20-103 Computer Science Department School of Computer Science, Carnegie Mellon University
Zhen Zhou M.S. Thesis May 2020
The problem of constructing codes resilient to deletions, where bits go missing and a subsequence is received, has a long history. Optimal binary single-deletion c.podes, which have size Θ(2n/n) (or redundancy log n + O(1)), for length n, have been known since the 60s. Regardless of the alphabet size of the code, the redundancy of a code is measured in bits. Although the optimal binary single-deletion codes have been established, the situation even for two deletions is a lot more complex. A non-constructive greedy argument shows the existence of binary codes of redundancy 4 log2 n, whereas redundancy 2 log2 n is necessary. Compared with binary deletion codes, the topic of deletion codes over an alphabet of an arbitrary size q (or q-ary deletion codes) received much less attention. In this work, we describe 2-deletion codes over an alphabet of size q of redundancy 3 log2 n + Oq(log log n) + r2(n), assuming r2(n) is the minimum redundancy needed for binary 2-deletion code, thus opening up a new direction of study. Combining with Hastad's list decodable binary 2-deletion codes, which have redundancy 3 log2 n + O(log log n) and allow the received word to be list-decoded into at most two possibilities from 2 deletions, we obtain a list-decodable q-ary 2-deletion code with 6 log2 n + Oq(log log n) redundancy and a list size 2. Our approach in the proof uses indicator strings and the run numbers in the string to compute various check sums that together enable recovery of the two missing bits. We hope this new perspective will be a useful way to think about and construct further deletion codes. 43 pages
Thesis Committee:
Srinivasan Seshan, Head, Computer Science Department
| |
Return to:
SCS Technical Report Collection This page maintained by [email protected] |