
How to prevent gcc optimization breaking rep movsb code? [duplicate]

I tried to write my own memcpy using the rep movsb instruction. It works for every size when optimization is disabled, but with optimization enabled it does not behave as expected.

Questions

  1. How can I prevent GCC optimization from breaking the rep movsb code?
  2. Is there something wrong with my code that leads to undefined behavior?

Motivation to create my own memcpy:

I read about enhanced rep movsb for memcpy in the Intel® 64 and IA-32 Architectures Optimization Reference Manual, section 3.7.6. I then looked at the libc source code and saw that the default memcpy in libc uses SSE instructions instead of movsb.

So I want to compare the performance of SSE instructions and rep movsb for memcpy. But now I have found that my implementation misbehaves.

Simple code to reproduce the problem (test.c)

#include <stdio.h>
#include <string.h>

inline static void *my_memcpy(
  register void *dest,
  register const void *src,
  register size_t n
) {
  __asm__ volatile(
    "mov %0, %%rdi;"
    "mov %1, %%rsi;"
    "mov %2, %%rcx;"
    "rep movsb;"
    :
    : "r"(dest), "r"(src), "r"(n)
    : "rdi", "rsi", "rcx"
  );
  return dest;
}

#define to_boolean_str(A) ((A) ? "true" : "false")

int main()
{
  char src[32];
  char dst[32];

  memset(src, 'a', 32);
  memset(dst, 'b', 32);

  my_memcpy(dst, src, 1);
  printf("%s\n", to_boolean_str(!memcmp(dst, src, 1)));

  my_memcpy(dst, src, 2);
  printf("%s\n", to_boolean_str(!memcmp(dst, src, 2)));

  my_memcpy(dst, src, 3);
  printf("%s\n", to_boolean_str(!memcmp(dst, src, 3)));

  return 0;
}

Compile and run

ammarfaizi2@integral:~$ gcc --version
gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ammarfaizi2@integral:~$ gcc -O0 test.c -o test && ./test
true
true
true
ammarfaizi2@integral:~$ gcc -O1 test.c -o test && ./test
false
true
true
ammarfaizi2@integral:~$ gcc -O2 test.c -o test && ./test
false
true
true
ammarfaizi2@integral:~$ gcc -O3 test.c -o test && ./test
false
true
true
ammarfaizi2@integral:~$ 

Summary

my_memcpy(dst, src, 1); produces the wrong result when optimizations are enabled.

Ammar Faizi asked Sep 30 '20 15:09


1 Answer

As written, your asm constraints do not reflect that the asm statement can modify memory, so the compiler can freely reorder it with respect to operations that read or write the memory at dest or src. You need to add "memory" to the clobber list.
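A minimal sketch of that fix, assuming GCC (or Clang) extended asm on x86-64: it is the question's function with only "memory" added to the clobber list. The name my_memcpy_clobber is made up here to keep it apart from the original.

```c
#include <stddef.h>

/* Same asm body as the question's my_memcpy. The only change is the
 * "memory" clobber, which tells the compiler that the asm may read and
 * write memory it cannot see through the operands. */
static inline void *my_memcpy_clobber(void *dest, const void *src, size_t n)
{
  __asm__ volatile(
    "mov %0, %%rdi;"
    "mov %1, %%rsi;"
    "mov %2, %%rcx;"
    "rep movsb;"
    :
    : "r"(dest), "r"(src), "r"(n)
    : "rdi", "rsi", "rcx", "memory"
  );
  return dest;
}
```

With the clobber in place, the compiler has to treat the asm as a barrier for memory accesses: earlier stores to src and dst must be completed before it, and later reads of dst must be done after it.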

As others have noted, you should also change the constraints so that the mov instructions become unnecessary. If you do so, you'll also need to express in the constraints that the asm now modifies its arguments (e.g. make them all dual input/output operands) and back up the value of dest so you can return it. So you might skip this improvement until you have the code working and understand how constraints work.
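One way the constraint-based version might look, again assuming GCC extended asm on x86-64. The "D", "S" and "c" constraints place the operands in RDI, RSI and RCX, and the "+" modifier marks each one as both read and written. my_memcpy_movsb is an illustrative name, not something from the question.

```c
#include <stddef.h>

/* Sketch: let the compiler load RDI/RSI/RCX through constraints instead of
 * explicit mov instructions. rep movsb advances RDI and RSI and counts RCX
 * down to zero, so all three are declared as input/output operands. */
static inline void *my_memcpy_movsb(void *dest, const void *src, size_t n)
{
  void *ret = dest;  /* dest is modified by the asm, so save it first */
  __asm__ volatile(
    "rep movsb"
    : "+D"(dest), "+S"(src), "+c"(n)
    :
    : "memory"       /* the copy reads *src and writes *dest */
  );
  return ret;
}
```

Because the registers now appear as operands, they no longer belong in the clobber list; only "memory" remains there.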

R.. GitHub STOP HELPING ICE answered Sep 28 '22 01:09