image

During my usual sidetrack, I stumbled upon a quite amusing video through my colleague. The video was covering a section of PoC || GTFO, which was about a file polyglottery. Even though I was not able to find the original video I watched, here is link to another video that covers the same topic. I encourage you to check it out. Just like what you might have guessed or already known, polyglot files can be interpreted as multiple different file types. Just think about how many nasty exploits you can do with these polyglot files. How exciting! I couldn’t help myself but to jump right into making one myself.

The first polyglot file that I attempted to make was a GIF + JS polyglot because it looked the most approachable and interesting. This documentation that I found had a very thorough explanation of an approach. Learning step by step, I began by analyzing the byte format of GIF files. Image files like PNG, JPEG, and GIF always begin with header bytes that contain meta data about the files such as size of the file, the width of the image, and the height of the image. If an image file does not have a valid header, it can’t be correctly decoded and you won’t be able to see the image. So we need to figure out how to embed our JavaScript into our GIF file without corrupting the header of the GIF file. This is the general format of GIF that I got from a source:

typedef struct _GifHeader
{
  // Header
  BYTE Signature[3];     /* Header Signature (always "GIF") */
  BYTE Version[3];       /* GIF format version("87a" or "89a") */
  // Logical Screen Descriptor
  WORD ScreenWidth;      /* Width of Display Screen in Pixels */
  WORD ScreenHeight;     /* Height of Display Screen in Pixels */
  BYTE Packed;           /* Screen and Color Map Information */
  BYTE BackgroundColor;  /* Background Color Index */
  BYTE AspectRatio;      /* Pixel Aspect Ratio */
} GIFHEAD;

As you can see, GIF file always start with a header signature “GIF” followed by the GIF format version. Notice that these 6 bytes can be used as a valid JavaScript variable? Even though we do not know if other succeeding bytes can be validly interpreted in common browsers like Chrome or Firefox, we are certain that the first 6 bytes will always be a valid JavaScript variable name. This means that we are good to go as long as we can bypass the other bytes. The fact that the screen width comes right after our golden 6 bytes is the key, because we can define the screen width with any 2 bytes and still have a valid GIF header. So we can comment out the whole garbage bytes in the middle by setting our width to 0x2A2F (It will be a hell of a long GIF..) which translates to ‘/*’ and adding 0x2F2A ‘*/’, at the end of the file to create a valid JavaScript comment.

    GIF89a/*...... (GIF image data) .....*/

Now to create a valid Javascript syntax, we can simply add ‘=SOME VALUE’ to define the variable at the top of our file.

    GIF89a/*...... (GIF image data) .....*/=1;

Then we can append any JavaScript we want.

    GIF89a/*...... (GIF image data) .....*/=1;alert('This is cool!');console.log(':)'));

With such polyglot, we can do crazy things like using the same GIF file to render the GIF image and execute customized script as well in most browsers. Given our example above, the following HTML code will render our GIF, pop up an alert box, and log out ‘:)’.

    <html>
        <body>
            <img src="giphy.gif">
            <script src="giphy.gif"></script>
        </body>
    </html>

And we are done … But are we? What if there are bytes in the middle of the image that can translate into ‘*/’? This would surely break our method, because ‘/* */ */’ is not a valid JavaScript syntax and whatever that comes after the comment will not execute. Turns out, you will probably encounter bytes equivalent to ‘*/’ in the middle of the image in most GIFs, given how many bytes there are in a common GIF file. My attempt to bypass by removing and replacing the problematic bytes failed miserably, as it corrupted the file enough to make it look glitchy. Perhaps this doesn’t matter in most use cases of such polyglot, because it still will be intepreted as GIF with a valid header and successfully deliver desired scripts as long as there’s no CDR present. Anyways, here is the code that I wrote to assist generating polyglot files. Hope you found this fun to explore as well.


Ryan Jeon

Co-founder/CTO @ Immigo.io (Techstars 21)