0xkasper

Nullcon HackIM CTF 2020 Write-Up

· ctf, crypto, misc

Introduction

This CTF was called HackIM CTF 2020 and was part of the Nullcon cybersecurity convention in India. The top 3 teams would get cash prizes and accommodation, and the top 25 teams would get passes for the convention. I must say that this was the most difficult CTF that I’ve played so far, even though that’s only 3 CTFs. I started with the cryptography challenges as always. The first challenge was quite easy but I had a lot of trouble with the second one, even though it’s actually not that difficult either. I also completed the only challenge in miscellaneous, but it took most of the hours I had. This CTF only lasted for 36 hours and I spend most of on finding Dora. Because of this I didn’t have much time to do anything else, but I did learn a lot more about computer vision.

The next upcoming CTF is in 3 weeks, Aero CTF, so I’ll be doing challenges from older CTFs to practice. I want to get better at pwn and reverse engineering as well, so I can do those challenges and make write-ups.

All the challenges, with source code and solution, can be found at the HackIM Github.


Cryptography


RockPaperScissors

Points: 250

Description: To get the flag you have to beat us in rock paper scissors but to make it fair we used a commitment base scheme. Server runs at: rps.py.

Attachments:

During the actual CTF, the challenge had a server running to which you could connect and play rock paper scissors. The goal is to win 20 times and it would give you the flag. Winning by luck is practically impossible, so we’ll have to find another way to win. Luckily the challenge also includes a file that might help: rps.py, the source code that is running on the server. So we can see everything the server does and has, except the flag of course.

When playing against the server, it will give us 3 hashes and it will tell us that it’s move is the first hash. Then we have gave an input of ‘r’, ‘p’ or ‘s’ for our move. As soon as we lose it’s game over. In the source code we can see that for every game, it generates a new secret and each hash is the secret string plus the move hashed using their own implemented hash function. It then shuffles the moves and returns them. To determine the server’s move, we therefore have to ‘unhash’ the given hashes. For a hash function such as SHA-256, this is practically impossible. But since they implemented their own, there must be some weakness that allows us to reverse it. Let’s take a closer look.

def hash(data):
    state = bytearray([208, 151, 71, 15, 101, 206, 50, 225, 223, 14, 14, 106, 22, 40, 20, 2])
    data = pad(data, 16)
    data = group(data)
    for roundkey in data:
        for _ in range(round):
            state = repeated_xor(state, roundkey)
            for i in range(len(state)):
                state[i] = sbox[state[i]]
            temp = bytearray(16)
            for i in range(len(state)):
                temp[p[i]] = state[i]
            state = temp
    return hex(bytes_to_int(state))[2:]

The hash function from rsa.py that runs on the server

The secret that is generated is 16 bytes long, so plus the character for the move which is 1 byte, the data parameter is 17 bytes long. Now knowing this, it is possible to spot why this hash function doesn’t work well. The start state is constant, it doesn’t change. The data is padded until a multiple of 16 with the amount of bytes to pad, so 15 0x0F bytes are always added to the data. The data is then grouped into groups of 16 bytes, so the secret is always by itself and the second group is the character of the move plus padding. These groups will be the roundkeys in the next step, so the secret doesn’t actually do that much on the actual important data.

We therefore can reverse the round for the second group, because we already know the roundkey is one of the three characters plus padding. If we do that, we are left with the hashed secret, which we can use to determine which of the three moves it actually is. We don’t have to unhash the secret. We only have to reverse the hashing for each of the 3 given hashes from the server for each of the 3 possible moves, giving us 9 potential hashed secrets. And of these 9 potential secrets, 3 will be the same, because that is the actual secret and shows which move is the right one for each hash.

def unhash(move, data):
    state = bytearray.fromhex(('0' if len(data) % 2 != 0 else '') + data)
    roundkey = pad(move.encode(), 16)
    for _ in range(round):
        temp = bytearray(16)
        for i in range(16):
            temp[i] = state[p[i]]
        state = temp
        for i in range(16):
            state[i] = sbox.index(state[i])
        state = repeated_xor(state, roundkey)
    return state

def find_moves(data):
    secrets = {}
    choices = ["r", "p", "s"]
    for key in data:
        for choice in choices:
            hx = ''.join([format(x, '0x') for x in unhash(choice, key)])
            if hx in secrets:
                secrets[hx].append((key, choice))
            else:
                secrets[hx] = [(key, choice)]
    secrets = list(filter(lambda lst: len(lst) == 3, secrets.values()))
    for s in secrets[0]:
        print(f'{s[1]} = {s[0]}')

while True:
    data = input('Hashes: ').split(' ')
    find_moves(data)

My Python script that finds the corresponding moves of the 3 given hashes

We input all 3 hashes and try to find which hash is which move. As explained above, the find moves function just generates 9 hashed secrets, takes the one that occurs 3 times and prints the corresponding moves.

The hash functions uses 16 rounds and one state of 16 bytes is kept. When the group of the move plus padding is hashed, the start state is equal to the hashed secret, which is why we get the hashed secret when reversing the hashing of the second group. In these 16 rounds, the state is first XOR’ed with the roundkey. After that the values in state are switched for values from the sbox list, which is constant and given. The value of the state is used as index of the sbox and the value at that index is set as the new value in the state. Then the state is merely shuffled using the values of the given list p. Reversing the shuffling and sbox is trivial and reversing the XOR is possible since the amount of possible roundkeys is only 3. Now we can write the unhash function and find the move of every given hash.

The only thing that’s left is to play rock paper scissors against the server, win 20 times and get the flag as reward.

Flag: hackim20{b4d_pr1mitiv3_beats_all!1!_7f65}


ayyMessage

Points: 410

Description: We have captured a message from Bob to Alice. We believe it contains sensitive information. Alice’s message server runs at: server.py. Can you find the contents of the message?

Attachments:

I completed this challenge just after the CTF had ended.

First we take a good look at the give server.py. This is the python script that is running on the server. In there we see a bunch of different cryptographic terms used. We see RSA, AES, SHA256, ECC and DSS. This may be a bit overwhelming, but it’s actually not that complicated. Let’s start at the beginning.

The server receives a message in JSON format. It decodes every element using base64 to obtain byte arrays. It verifies the message and if it’s correct, it will decrypt the actual message and return the SHA256 hash of the plaintext. If the message cannot be verified, it will exit. The ciphertext is encrypted using AES in CTR mode, of which the key is unknown. The AES key is send with the JSON message, but it is encrypted with RSA and we only have the RSA public key so we cannot decrypt it. Only the server has the RSA private key, which it uses to first decrypt the AES key and then uses the AES key to decrypt the ciphertext. The nonce is known and the ciphertext as well. The ECC public key in the message is the Elliptic-Curve public key from the sender. This key is used by the server to verify the signature, which is also in the message. So see what’s wrong there?

The server verifies the message using a client provided key and signature. So anyone can change the ciphertext, you just have to create a new signature and send your own generated ECC public key. Furthermore, the server leaks information about the plaintext: the SHA256 hash is return as the read receipt. We can use this hash to bruteforce the plaintext of the ciphertext we’ve been given. To do this, we have to build the plaintext byte by byte from the ciphertext.

First we generate our own ECC private and public key so we create signatures of our modified ciphertext and send the key and signature to the server. After that we only have to iterate over every byte in the ciphertext, generate a signature and send this to the server. For example, if we only take the first byte of the ciphertext and send this to the server, we get the SHA256 of the first character of the plaintext. And since the first character is in the ASCII range, there less than 256 possibilities. Now we can loop over all character possibilities, hash each character and compare it to the server’s hash. The hash that matches reveals the right character. We do this for every byte and build the plaintext.

from Crypto.PublicKey import ECC
from hashlib import sha256
from base64 import b64decode, b64encode
from Crypto.Signature import DSS
from Crypto.Hash import SHA256
from time import sleep
import pwn
import string
import json

def send_receive(msg):
	sleep(.1)
	msgb = json.dumps(message).encode()
	conn = pwn.remote('localhost', 1234)
	conn.recv()
	conn.send(msgb + b'\n')
	res = conn.recv()
	rr = res.decode().split('\n')[1]
	return rr

with open('message.txt', 'r') as f:
	data = f.read()
	message = json.loads(data)

aeskeyb = b64decode(message["aeskey"])
nonceb = b64decode(message["nonce"])
enc_message = b64decode(message["message"])

ecckey = ECC.generate(curve='NIST P-256')
signer = DSS.new(ecckey, 'fips-186-3')
message["eccpubkey"] = b64encode(ecckey.public_key().export_key(format='DER')).decode()

flag = ''
for i in range(len(enc_message)):
	messageb = enc_message[0:i+1]
	message["message"] = b64encode(messageb).decode()
	h = SHA256.new(aeskeyb + nonceb + messageb)
	signatureb = signer.sign(h)
	message["signature"] = b64encode(signatureb).decode()
	res = send_receive(message)
	for c in string.printable:
		h = sha256((flag + c).encode()).hexdigest()
		if h == res:
			flag += c
print(flag)

The ciphertext that is send to the server consists of all known bytes so far plus the next one we want to know. Then we can bruteforce the next character and ultimately obtain the flag from the decrypted ciphertext.

Flag: hackim20{digital_singatures_does_not_always_imply_authenticitaaayyy}


Miscellaneous


Dora

Points: 361

Description: Guess what? Dora is back and you’ll have to find her to get the flag.

The challenge had a server running that would prompt you with a huge base64 string and ask you a question: Where’s Dora? 1 for upper right, 2 for upper left, 3 for lower left, 4 for lower right. Figuring out that the base64 string is an image, is only the beginning of this challenge. You can go to CyberChef, insert the base64 string, convert from base64 and render an image. This shows an image with a somewhat distorted Dora and 5 characters from the Dora TV show. Giving the right answer will give you the next image and giving a wrong answer will disconnect you. We have to give a correct answer for 800 consecutive times in order to get the flag. And apart from that, the server disconnects everyone every 30 minutes. Doing it manually is not only tiresome but almost impossible due to the timeout.

So apparently the solution was to write a script that saves each image as a hash and the correct answer in a dictionary. There were only 800 different images the server chose from. Doing it this way, where the script asks for manual answering when it doesn’t know and then saves this answer for future answering, is the easiest way. Unfortunately I didn’t realize that the server kept a small set of images, instead I thought it generated a random image every time. So I used computer vision to try and actually find Dora in the image. This took quite a lot of time, but in the end my script can find Dora about 99% of the time by itself and allows for manual answering otherwise.

I used the opencv-python library for image processing and recognition. I’ve tried many different approaches, until I found something that worked. I started with allow myself to select a region on the image that opencv shows me. There is a function for that in the library called selectROI, which was perfect. Then I made the script ask me where Dora is in the image and it would save this part of the image in a list, so it could use it with the function matchTemplate, which tries to find an image in another image. This did not work well, in each image Dora was either distorted, scaled or flipped. I noticed however that Dora usually had color, while the other figures were always in grayscale. If we can therefore find all contours in the image and omit each grayscale one, we usually find Dora. I’d say Dora was colored maybe 70-80% of the time, so we’d still require manual answering 20-30% of the time.

In order to find all contours in an image, we have to find the contours. Since the given image always had a background with one color, I used the inRange function of opencv to create a mask (second image) of the image. This mask has the background color set to black and everything else to white. Now we can use the findContours function together with boundingRect to get the rectangle of every contour. This information can then be used in the original image.

Now that we have to location information of each contour in the original image, we can check if it’s grayscale or not. To do this we calculate the normalized grayscale-value of each contour. A pixel is grayscale if it’s red, green and blue values are all equal, such as 0x3F3F3F. The grayscale value is then the number of grayscale pixels divided by the total number of pixels. When Dora had color her grayscale-value was usually 0 and the other contours were always more than 0.3. So if we set a threshold of about 0.05, we can determine if Dora is grayscale or not. If she has color, we just convert her location to one of the choices and send the answer to the server. If she is grayscale, we don’t know which of the grayscale contours is Dora, and we’ll have to find another way. We could also just let the script as for manual answering, because it’ll answer most of the prompts anyway. But I wanted to do better.

Another thing I noticed is that, while Dora changes every picture, the other figures stay the same. Recognizing the other figures using template matching will therefore be a lot easier. I created 5 images, one of each figure and called them obj1 through obj5. I load them at the start of the script and used them for template matching after the grayscale method failed. It is important for template matching to set the background color to black. I used the color of the pixel at (0, 0) to find the background color. This highly improves template matching. This template matching should yield exactly 5 objects, where these 5 objects are all the other figures and not Dora. Then I went through all the contours again, and chose the one that did not overlap with any of the found object rectangles. This chosen contour would then be translated to an answer and send to the server. If it resulted in no contours, then manual answering was needed.

Combing these two techniques resulted in about 99% correct answering, with only about 10 pictures that required manual answering. You can find the contour matching and object matching in their functions in my script.

dora-1

from io import BytesIO
import socket
import base64
import cv2
import numpy as np
import imutils

objs = [
    cv2.imread('obj1.png', cv2.IMREAD_UNCHANGED).astype(np.uint8),
    cv2.imread('obj2.png', cv2.IMREAD_UNCHANGED).astype(np.uint8),
    cv2.imread('obj3.png', cv2.IMREAD_UNCHANGED).astype(np.uint8),
    cv2.imread('obj4.png', cv2.IMREAD_UNCHANGED).astype(np.uint8),
    cv2.imread('obj5.png', cv2.IMREAD_UNCHANGED).astype(np.uint8)
]

def rect_overlap(x, y):
    return not ((x[0]+x[2]) < y[0] or x[0] > (y[0]+y[2]) or x[1] > (y[1]+y[3]) or (x[1]+x[3]) < y[1])

def calc_choice(rect):
    cX = rect[0] + (rect[2] / 2)
    cY = rect[1] + (rect[3] / 2)
    if cX >= 360:
        if cY < 360:
            return '1'
        if cY >= 360:
            return '4'
    if cX < 360:
        if cY < 360:
            return '2'
        if cY >= 360:
            return '3'
    return 0

def get_contours(cv_image):
    bounds = (cv_image[0][0], cv_image[0][0])
    mask = cv2.inRange(cv_image, bounds[0], bounds[1])
    mask = cv2.bitwise_not(mask)
    cnts = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    cnt_locs = [cv2.boundingRect(cnt) for cnt in cnts]
    cnt_locs = list(filter(lambda r: r[2] > 5 and r[3] > 5, cnt_locs))
    return cnt_locs

def get_objects(cv_image):
    obj_locs = []
    for obj in objs:
        res = cv2.matchTemplate(cv_image, obj, cv2.TM_CCOEFF_NORMED)
        minVal, maxVal, minLoc, maxLoc = cv2.minMaxLoc(res)
        width, height = obj.shape[1::-1]
        rect = (maxLoc[0], maxLoc[1], width, height)
        obj_locs.append(rect)
    return obj_locs

def grayscale_value(cv_image, w, h):
    count = 0
    for line in img:
        for (r, g, b, a) in line:
            if r == g == b:
                count += 1
    count_norm = count / (w * h)
    return count_norm

def try_find_dora(cv_image):
    # Try grayscale method
    cnt_locs = get_contours(cv_image)
    lowest_gv = None
    for (x, y, w, h) in cnt_locs:
        img = cv_image[y:y+h, x:x+w]
        gv = grayscale_value(img)
        if not lowest_gv or gv < lowest_gv[0]:
            lowest_gv = (gv, (x, y, w, h))
    if lowest_gv and lowest_gv[0] < 0.05:
        return True, calc_choice(lowest_gv[1])
    # Try template matching
    bgcolor = cv_image[0][0]
    cv_image[np.where((cv_image==bgcolor).all(axis=2))] = [1, 2, 3, 255]
    obj_locs = get_objects(cv_image)
    dora = []
    for cnt in cnt_locs:
        if not any([rect_overlap(cnt, obj) for obj in obj_locs]):
            dora.append(cnt)
    if len(dora) < 1:
        return False, '0'
    dora_choices = [calc_choice(d) for d in dora]
    if all([dora_choices[0] == dc for dc in dora_choices]):
        return True, dora_choices[0]
    return False, '0'

def get_choice(image_b64): 
    np_arr = np.frombuffer(base64.b64decode(image_b64), np.uint8)
    cv_image = cv2.imdecode(np_arr, cv2.IMREAD_UNCHANGED)
    check, choice = try_find_dora(cv_image)
    if check:
        return choice
    # Manual answering
    roi = cv2.selectROI('dora', cv_image, True, False)
    return calc_choice(roi)

def show_image(image_b64):
    np_arr = np.frombuffer(base64.b64decode(image_b64), np.uint8)
    cv_image = cv2.imdecode(np_arr, cv2.IMREAD_UNCHANGED)
    cv2.imshow('dora', cv_image)
    cv2.waitKey(0)

host = 'misc.ctf.nullcon.net'
port = 8000
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((host, port))
sock.settimeout(5.0)

count = 0
while True:
    data = sock.recv(4096)
    if b'No flag for you\n' in data:
        sock.close()
        exit()
    elif b'Where\'s Dora?' in data:
        data = sock.recv(4096)
    image_b64 = []
    while b'\n\n' not in data:
        if count > 795:
            print(data)
        image_b64.append(data)
        data = sock.recv(4096)
    image_b64.append(data[:-2])
    image_b64 = ''.join([bts.decode() for bts in image_b64])
    choice = get_choice(image_b64)
    print(f'{count}: {choice}')
    sock.send(choice.encode() + b'\n')

After correctly answering 800 times, the server will send a final image, which is an image of the flag.

Flag: hackim20{swiper_no_swiping_the_flag_1f545a6d49bd6dc815ddd731d0c2a2ad}