So this challenge was comprised of a single 15-megabyte animated gif, flashing various symbols at you one at a time for more than a thousand frames, each different color and size, positioned randomly on a black background. It seemed wise to disregard everything about them except what the actual symbol was for starters, and flipping through the frames I was able to count exactly 26 different symbols, a couple of them only occuring near the end.
At that point I was already pretty sure that it should be just a basic sustitution cipher, and that I could just transcribe every symbol by hand and feed them into a solver (not that hard with only a thousand odd characters). But hackers should definitely be lazier than that.
A better plan:
Alright, this seems fine, but how to go about it? Well, let’s use PIL (and steal the nonblack pixels counter from StackExchange):
from glob import glob
from PIL import Image
filenames = sorted(glob("out_*.png"))
features = {}
def get_nonblack(im):
return sum(im.point(lambda x: 255 if x else 0)
.convert("L")
.point(bool)
.getdata())
print "Calculating features..."
for fn in filenames[1:-1]: # skip the first and last frames with words
img = Image.open(fn)
pixels = img.load()
# getbbox() crops out all zero value pixels,
# so we need to split the image into planes
# and use the red plane as a reference for that
# (and the green one as a fallback because there
# are frames with no red in them apparently)
r, g, b, a = img.split()
box = r.getbbox()
left, top, right, bottom = box if box else g.getbox()
cropped = img.crop((left, top, right, bottom))
nonblack = get_nonblack(cropped)
nonblack_topleft = get_nonblack(cropped.crop((0, 0, cropped.size[0]/2, cropped.size[1]/2)))
nonblack_topright = get_nonblack(cropped.crop((cropped.size[0]/2, 0, cropped.size[0], cropped.size[1]/2)))
nonblack_bottomleft = get_nonblack(cropped.crop((0, cropped.size[1]/2, cropped.size[0]/2, cropped.size[1])))
nonblack_bottomright = get_nonblack(cropped.crop((cropped.size[0]/2, cropped.size[1]/2, cropped.size[0], cropped.size[1])))
total = (cropped.size[0] * cropped.size[1])
features[fn[4:-4]] = (
(float(nonblack) / total),
(float(nonblack_topleft) / (total / 4)),
(float(nonblack_topright) / (total / 4)),
(float(nonblack_bottomleft) / (total / 4)),
(float(nonblack_bottomright) / (total / 4)),
(float(cropped.size[0]) / cropped.size[1])
)
And we get our features dictionary! Now, let’s move on to finding the right tolerance:
distinct = {}
tolerance = 0.01
cur = 0
for f in features.items():
if f[0] not in distinct:
distinct[f[0]] = cur
for f2 in features.items():
# should work alright without weighting in this case
distance = sum([(f[1][x] - f2[1][x]) ** 2 for x in range(len(f[1]))])
# this is the tolerance, should be tweaked
# until we catch 26 different symbols
if distance < tolerance:
distinct[f2[0]] = cur
cur +=1
print "Total no of distinct symbols: %s" % cur
Turns out 0.01 works exactly right, actually:
tr@karabut.com:~/work/ndh17/codetalkers$ python detect.py
Calculating features...
Finding distinct...
Total no of distinct symbols: 26
Of course some characters could be detected wrong, but with so many of them it’s hardly going to matter. All that’s left is to form the actual string:
out = ""
for k in sorted(distinct.keys()):
out += string.letters[distinct[k]]
print out
And we get this:
tr@karabut.com:~/work/ndh17/codetalkers$ python detect.py
Calculating features...
Finding distinct...
Total no of distinct symbols: 26
adkqmgafamlmbmeccxniasuxqerelaiadkumcakrafebkfilbrskhkdkenelsikobueltdkumcaokfdafcxvkseoklabjaiahkdrelmaiaehjsaomlasrjamfkglvefmelbikoadkqmnimlprfkrfmabefxibelsefsibjaomfiradkqmgeihfaebasmlqereluxijmpabetetnfmbegjkgeirefbkobjabaedgkftmlpkllbbskhkdkimdksadkumcamlbaflabrcebokfdtnfmbebkktmlirmfebmklofkdgaebjafokfaheibibjebniasixdukcibkijkggaebjafhjmlaiahjefehbafielsibfaabimplielsofkddelpebjebniasibkhtixdukcibkawrfaiiadkbmkliinhjeicmpjbuncuiimplmoxmlpmlirmfebmklbjaomfibiabkoadkqmgeihfaebaseirefbkomdksaidaiiepmlpoaebnfaibkjacroehmcmbebaacahbfklmhhkddnlmhebmklelsbkiafvaeiesmibmlpnmijmlpoaebnfaofkdkbjafiafvmhaitnfmbehfaebasbjaomfiradkqmueiasklbjaawrfaiimklibjebjakuiafvasrakrcadetmlpelskbjafbjmlpimlbjahmbxjnlsfasikoadkqmhjefehbafigafaalhksasmlbjanlmhksaibelsefsbjaessmbmklikfmpmleccxfaznaibasuxpkkpcatebdkdkmdeftsevmielsdeftniihjafafgfkbabjaomfibsfeobokfhklimsafebmkluxbjanlmhksabahjlmhechkddmbbaaelserrcamlhgjkiaxeinktmseelsrabafasuafpqkmlasbjaomfibkoomhmecnbhrfkrkiecgalbbjfknpjecklpiafmaikohkddalbmlpuxdaduafikobjanlmhksahklikfbmndelslebmklecibelsefsmyebmkluksmaikovefmknihknlbfmairefbmhmrebmlpgjalbfelidmbbasadkqmixdukciefairahmomaseiebgkuxbaiaznalhahklpfebncebmklimoxknhelfaesbjmidaiiepabjaocepmijpwaalbvwxibfaghyqndxognwpyflsrhtimnoehxzd
So we feed the string into quipquip, smash that Solve button hard and after a couple seconds are presented with a lovely sight:
Of course the spaces are lies, so we submit hgxeentvxystrewczjumyfwuxgzrndpcksiufacyqm and
By the way, it should be said that I’m in fact pretty dumb and messed up the distance calculation quite bad because of a couple typos at first, essentially adding extra weight to one of the quadrants and ditching the height/width ratio entirely. It didn’t go all bad, but the detector was able to find only 24 distinct symbols with decent results, and I settled with that, making some corrections by hand to establish some certainty and then transcribing the flag manually for about ten minutes or so. But let’s not dwell on that part and pretend it didn’t happen, alright? Forget I told you, really. Happy hacking people!