A Python implementation of a Markov chain Monte Carlo (MCMC) method for breaking monoalphabetic substitution ciphers.
The input ciphertext is assumed to consist only of characters from the specified alphabet.
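For readers unfamiliar with the approach, here is a minimal sketch of how a Metropolis-style MCMC search over substitution keys can work. It is not the code in breaksub.py: every name below (ALPHABET, bigram_log_probs, log_likelihood, decode, metropolis_decipher) is hypothetical, and the sketch assumes a simple bigram (letter transition) language model with add-one smoothing and a "swap two symbols" proposal.

```python
import math
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz ."  # assumed alphabet: a-z, space, period


def bigram_log_probs(reference, alphabet=ALPHABET):
    """Estimate log P(next | current) from reference text (add-one smoothing)."""
    counts = {a: {b: 1 for b in alphabet} for a in alphabet}
    text = [c for c in reference.lower() if c in alphabet]
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    logp = {}
    for a in alphabet:
        total = sum(counts[a].values())
        for b in alphabet:
            logp[a, b] = math.log(counts[a][b] / total)
    return logp


def log_likelihood(text, logp):
    """Sum of log transition probabilities over adjacent symbol pairs."""
    return sum(logp[a, b] for a, b in zip(text, text[1:]))


def decode(ciphertext, key):
    """Apply a key that maps each cipher symbol to a plaintext symbol."""
    return [key[c] for c in ciphertext]


def metropolis_decipher(ciphertext, logp, iters=5000, alphabet=ALPHABET, seed=0):
    """Search for a high-likelihood key; returns (best_key, best_score)."""
    rng = random.Random(seed)
    symbols = list(alphabet)
    key = dict(zip(symbols, rng.sample(symbols, len(symbols))))  # random start
    score = log_likelihood(decode(ciphertext, key), logp)
    best_key, best_score = dict(key), score
    for _ in range(iters):
        a, b = rng.sample(symbols, 2)
        key[a], key[b] = key[b], key[a]  # propose: swap two key entries
        new_score = log_likelihood(decode(ciphertext, key), logp)
        # Metropolis rule: always accept improvements, otherwise accept
        # with probability exp(new_score - score).
        if new_score >= score or rng.random() < math.exp(new_score - score):
            score = new_score
            if score > best_score:
                best_key, best_score = dict(key), score
        else:
            key[a], key[b] = key[b], key[a]  # reject: undo the swap
    return best_key, best_score
```

Occasionally accepting a worse key is what lets the search escape local optima. The quality of the result depends on the reference text used to build the model, the ciphertext length, and the number of iterations, so a short run may return only a partially correct key.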
Use the default alphabet with default parameters:
./breaksub.py test_data/ciphertext_russel.txt > russel.txt
Run 3000 iterations of the method (the default is 5000) and hide the progress bar:
./breaksub.py test_data/ciphertext_warandpeace.txt --no-progress --iters 3000 > warandpeace.txt
Specify the alphabet and the algorithm parameters explicitly:
./breaksub.py --alphabet example_options/alphabet.csv --letter-probs example_options/letter_probabilities.csv --transition-probs example_options/letter_transition_matrix.csv --iters 7000 test_data/ciphertext_paradiselost.txt > paradiselost.txt
from breaksub import DecipherOptions, default_alphabet, recover_plaintext, substitution_decipher
with open("test_data/ciphertext_warandpeace.txt") as f:
    ciphertext = f.read()
alphabet = default_alphabet() # a-z, space, period
ciphertext = [c for c in ciphertext if c in alphabet] # remove characters that are not in our alphabet
options = DecipherOptions(iters=4000, print_progress=True)
cipher = substitution_decipher(ciphertext, options) # break the cipher
print('cipher is', cipher)
plaintext = recover_plaintext(cipher, ciphertext) # recover plaintext given the cipher
print(plaintext)