We have on our hands an encrypted file, which is supposed to be a PDF judging from the challenge’s description, and the python script which was used to perform the encryption. It reads chunks of 9 bytes from the source file (padding it with zeroes if necessary), interprets them as a 3x3 matrix, multiplies this matrix by the redacted “key” matrix, flattens the result and writes it to the output file. Thing is, it does the multiplication modulo 256 to fit the resulting values in the same 9 bytes:
def matmult(a,b):
zip_b = zip(*b)
return [[sum(ele_a*ele_b for ele_a, ele_b in zip(row_a, col_b)) % 256
for col_b in zip_b] for row_a in a]
So I suppose the operation isn’t trivially reversible (though I may be mistaken), but we’ll come to that in a minute. Right now, we need to derive the key. We know that the encrypted file in all probability is a valid PDF, so we can assume it starts with “%PDF-1.x\n”, where x is of course the minor version of the format. We can’t say for sure which version it is, of course; moreover, “\n” could actually be “\r” in some files, but it’s a bit rare these days so we’ll ignore that for now. We also can assume that the file ends with either “%%EOF” or “%%EOF\n” (the newline at the end is optional).
What’s very convenient to us in the matrix multiplication scheme is that every resulting row depends only on the corresponding row of the source, that is, the first byte of the output would equal (a11 * b11 + a12 * b21 + a13 * b31) % 256, where a11-a13 are the first three bytes of the input and b11-b31 is the first column of the key matrix, and so on. It means that to restore, for example, the first column of the key matrix we can simply bruteforce every possible three-byte combination, checking if the known row multiplied by it would give us the first byte of the corresponding encrypted row. Let’s do just that, trying out every possible PDF version for the 9-byte header:
import itertools
for version in range(10):
print "Checking version 1.%d" % version
for guess in itertools.product(range(256),repeat=3):
if (((ord('%')*guess[0] + ord('P') * guess[1] + ord('D') * guess[2]) % 256 == 0x70)
and ((ord('F')*guess[0] + ord('-') * guess[1] + ord('1') * guess[2]) % 256 == 0x8e)
and ((ord('.')*guess[0] + ord(str(version)) * guess[1] + ord('\n') * guess[2]) % 256 == 0xc9)):
print guess
We get this in a couple minutes:
tr@karabut.com:~/work/hackyouctf16/crypto300$ python test.py
Checking version 1.0
Checking version 1.1
(84, 219, 199)
Checking version 1.2
Checking version 1.3
(28, 245, 229)
Checking version 1.4
Checking version 1.5
(132, 87, 27)
Checking version 1.6
Checking version 1.7
(204, 145, 153)
Checking version 1.8
Checking version 1.9
(52, 179, 15)
There are multiple collisions. Alright, we’ll need to winnow them, so we should also test for the EOF marker; noticing the last 6 bytes of the encrypted file are zeroes, we determine that the source file had to be padded with at least 6 and at most 8 zeroes for this to be true, so the first row of the last matrix would consist of 1 to 3 last bytes of the EOF marker padded with zeroes. Back to python, and we can narrow the possible versions down to 1.1, 1.3, 1.5 and 1.7 (1.9 doesn’t actually exist):
import itertools
for version in [1,3,5,7]:
for eof in ['EOF', 'OF\n', 'OF\x00', 'F\n\x00', 'F\x00\x00', '\n\x00\x00']:
print "Checking version 1.%d, eof %s" % (version, eof.encode('hex'))
for guess in itertools.product(range(256),repeat=3):
if (((ord('%') * guess[0] + ord('P') * guess[1] + ord('D') * guess[2]) % 256 == 0x70)
and ((ord('F') * guess[0] + ord('-') * guess[1] + ord('1') * guess[2]) % 256 == 0x8e)
and ((ord('.') * guess[0] + ord(str(version)) * guess[1] + ord('\n') * guess[2]) % 256 == 0xc9)
and ((ord(eof[0]) * guess[0] + ord(eof[1]) * guess[1] + ord(eof[2]) * guess[2]) % 256 == 0x28)):
print guess
tr@karabut.com:~/work/hackyouctf16/crypto300$ python test2.py
Checking version 1.1, eof 454f46
Checking version 1.1, eof 4f460a
Checking version 1.1, eof 4f4600
Checking version 1.1, eof 460a00
Checking version 1.1, eof 460000
Checking version 1.1, eof 0a0000
Checking version 1.3, eof 454f46
Checking version 1.3, eof 4f460a
Checking version 1.3, eof 4f4600
Checking version 1.3, eof 460a00
Checking version 1.3, eof 460000
Checking version 1.3, eof 0a0000
Checking version 1.5, eof 454f46
Checking version 1.5, eof 4f460a
Checking version 1.5, eof 4f4600
Checking version 1.5, eof 460a00
Checking version 1.5, eof 460000
Checking version 1.5, eof 0a0000
(132, 87, 27)
Checking version 1.7, eof 454f46
Checking version 1.7, eof 4f460a
Checking version 1.7, eof 4f4600
Checking version 1.7, eof 460a00
Checking version 1.7, eof 460000
Checking version 1.7, eof 0a0000
We get an unique result! So (132, 87, 27) is our first column then. Let’s see what the other two are:
import itertools
version = 5
eof = '\n\x00\x00'
for enc in [[0xe1, 0x65, 0xd0, 0xaa], [0xcf, 0xc6, 0xe2, 0xbe]]:
print "Testing " + str(enc)
for guess in itertools.product(range(256),repeat=3):
if (((ord('%') * guess[0] + ord('P') * guess[1] + ord('D') * guess[2]) % 256 == enc[0])
and ((ord('F') * guess[0] + ord('-') * guess[1] + ord('1') * guess[2]) % 256 == enc[1])
and ((ord('.') * guess[0] + ord(str(version)) * guess[1] + ord('\n') * guess[2]) % 256 == enc[2])
and ((ord(eof[0]) * guess[0] + ord(eof[1]) * guess[1] + ord(eof[2]) * guess[2]) % 256 == enc[3])):
print guess
tr@karabut.com:~/work/hackyouctf16/crypto300$ python test3.py
Testing [225, 101, 208, 170]
(17, 100, 27)
Testing [207, 198, 226, 190]
(19, 16, 4)
And we’ve got our key matrix. As I said earlier, I’m not sure matrix multiplicaton modulo 256 is reversible, but as the rows are independent of each other, we can again bruteforce every possible three-byte combination for every three-byte chunk of the file. Now, the file is about 500kb long, so we’d better build a look-up table for that purpose. Python’s usual data structures aren’t really adequate for building 16 million-entry look-up tables, so we’ll use a basic ctypes array:
import itertools
from ctypes import *
key = [
[132, 17, 19],
[ 87, 100, 16],
[ 27, 27, 4]
]
table = create_string_buffer(256**3 * 3)
print "Building the look-up table..."
for src in itertools.product(range(256),repeat=3):
index = (
(((key[0][0] * src[0] + key[1][0] * src[1] + key[2][0] * src[2]) % 256) << 16) |
(((key[0][1] * src[0] + key[1][1] * src[1] + key[2][1] * src[2]) % 256) << 8) |
(key[0][2] * src[0] + key[1][2] * src[1] + key[2][2] * src[2]) % 256
)
table[index*3] = c_char(chr(src[0]))
table[index*3+1] = c_char(chr(src[1]))
table[index*3+2] = c_char(chr(src[2]))
print "Table ready"
with open('cry300_flag.enc','rb') as f:
f2 = open('cry300_dec.pdf','wb')
last = False
while True:
block = f.read(9)
if len(block) < 9:
block += '\x00'*(9-len(block))
last = True
block = [ord(c) for c in block]
for i in range(3):
index = ((block[i*3] << 16) | (block[i*3+1] << 8) | block[i*3+2]) * 3
f2.write(table[index])
f2.write(table[index+1])
f2.write(table[index+2])
if last:
break
tr@karabut.com:~/work/hackyouctf16/crypto300$ python decrypt.py
Building the look-up table...
Table ready
And we get our cry300_dec.pdf
file, the first and only page of which looks like this: