Chanted DNA

Let's say you want to memorize what makes somebody unique. It might be possible to isolate the parts of their DNA that are more "human" than the parts shared with our closest evolutionary cousins, then take that information and transform it into a sequence of words that can be memorized.

The encoding could be as simple as steering the token choices made while decoding from a GPT-based language model, or just picking among paraphrases of a given text.
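As a minimal sketch of the paraphrase idea, assume a text template in which each position offers a few interchangeable phrasings. Choosing one phrasing per position then encodes a number in mixed radix. The `SLOTS` data and the `capacity`, `encode` and `decode` helpers are hypothetical names invented for this illustration, not part of any existing tool.

```python
# Hypothetical paraphrase slots: each position offers interchangeable phrasings.
SLOTS = [
    ["The quick", "A fast", "The swift", "A speedy"],
    ["fox", "vixen"],
    ["jumps over", "leaps over", "hops over", "clears"],
    ["the lazy dog.", "the sleepy dog.", "the idle hound.", "the tired hound."],
]


def capacity(slots):
    """Number of distinct values the slots can encode (product of option counts)."""
    total = 1
    for options in slots:
        total *= len(options)
    return total


def encode(number, slots):
    """Pick one phrasing per slot; the picks are the mixed-radix digits of number."""
    if not 0 <= number < capacity(slots):
        raise ValueError("number does not fit in the available choices")
    chosen = []
    for options in slots:
        number, digit = divmod(number, len(options))
        chosen.append(options[digit])
    return chosen


def decode(chosen, slots):
    """Recover the number from the phrasing picked in each slot."""
    number, weight = 0, 1
    for options, phrase in zip(slots, chosen):
        number += options.index(phrase) * weight
        weight *= len(options)
    return number


phrases = encode(77, SLOTS)
print(" ".join(phrases))       # the sentence to memorize
print(decode(phrases, SLOTS))  # the number recovered from it
```

These four slots allow 4 * 2 * 4 * 4 = 128 combinations, so one such sentence carries 7 bits. A longer text with more slots scales the same way.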

Whether the information needed sits at the level of DNA bases or gene expression, and whether full sequencing or only gene identification is required, are implementation details. The full genome has about 3 billion base pairs, but we share about 99% of it with chimps, which leaves about 30 million base pairs to memorize (each one a base-4 digit). With a vocabulary of 100,000 words, that comes to about 3.6 million words (roughly 5 times the size of a Christian Bible):

>>> import math
>>> "{:,}".format(30*1000*1000*math.log(4, 100 * 1000))
'3,612,359.9479677742'
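The conversion behind that estimate is a change of base: read the bases as digits of one large base-4 number, then write that number out in base V, where V is the vocabulary size. A rough sketch follows, assuming a toy 16-word vocabulary in place of the 100,000-word one. `VOCAB`, `bases_to_words` and `words_to_bases` are hypothetical names used only for illustration.

```python
BASES = "ACGT"
# Hypothetical 16-word vocabulary; the estimate in the text assumes 100,000 words.
VOCAB = ["ash", "bell", "cove", "dune", "elm", "fern", "glen", "hill",
         "isle", "jade", "kelp", "lake", "moss", "nest", "oak", "pond"]


def bases_to_words(seq, vocab=VOCAB):
    """Re-express a base-4 DNA string as a sequence of vocabulary words."""
    number = 0
    for base in seq:
        number = number * 4 + BASES.index(base)
    # Smallest word count that can hold every sequence of this length,
    # so leading "A"s (zeros) are not lost.
    n_words = 0
    while len(vocab) ** n_words < 4 ** len(seq):
        n_words += 1
    words = []
    for _ in range(n_words):
        number, digit = divmod(number, len(vocab))
        words.append(vocab[digit])
    return words[::-1]


def words_to_bases(words, length, vocab=VOCAB):
    """Invert bases_to_words; the original sequence length must be known."""
    number = 0
    for word in words:
        number = number * len(vocab) + vocab.index(word)
    bases = []
    for _ in range(length):
        number, digit = divmod(number, 4)
        bases.append(BASES[digit])
    return "".join(bases[::-1])


words = bases_to_words("GATTACA")
print(words)
print(words_to_bases(words, 7))
```

With 16 words each word carries exactly two bases, so seven bases need four words. A 100,000-word vocabulary carries a little over eight bases per word.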


Working at the level of genes will most probably work better. With an estimated 25,000 genes and about 3 million variants per gene, a vocabulary of 100,000 words leaves about 32,000 words to memorize:

>>> "{:,}".format(25*1000*math.log(3 * 1000 * 1000, 100 * 1000))
'32,385.60627359831'


(Turning this into intelligible prose from which the choices can still be extracted will need a bigger vocabulary in practice.) At 6.67 minutes per 1,000 words, or about 150 words per minute, that's about 3.6 hours of continuous speech:

>>> 25*1000*math.log(3 * 1000 * 1000, 100 * 1000) / 1000 * 6.67 / 60.
3.6001998974150125
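The same arithmetic can be wrapped in two small helpers to compare scenarios side by side. This is only a sketch that reuses the figures from the text (30 million base pairs, 25,000 genes with 3 million variants each, 100,000 words, 6.67 minutes per 1,000 words). The function names `words_needed` and `speaking_hours` are made up for illustration.

```python
import math


def words_needed(items, choices_per_item, vocab_size):
    """Words required to encode `items` symbols, each with `choices_per_item`
    possible values, using a vocabulary of `vocab_size` words."""
    return items * math.log(choices_per_item, vocab_size)


def speaking_hours(words, minutes_per_1000_words=6.67):
    """Hours of continuous speech at the given pace."""
    return words / 1000 * minutes_per_1000_words / 60


scenarios = [
    ("base pairs", 30_000_000, 4),
    ("genes", 25_000, 3_000_000),
]
for label, items, choices in scenarios:
    words = words_needed(items, choices, 100_000)
    print(f"{label}: {words:,.0f} words, {speaking_hours(words):,.1f} hours")
```

At the same pace, the base-pair scenario comes out at roughly 400 hours of speech, which is another reason to prefer the gene-level encoding.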