Using Address Sanitizer

AddressSanitizer (ASan), available in compilers like GCC and Clang, is a powerful tool designed to unearth out-of-bounds accesses and use-after-free bugs in C and C++ programs. These bugs are common, elusive, and often lead to crashes or security vulnerabilities, so it’s vital to understand and use the sanitizer.

How Does Address Sanitizer Work?

The Address Sanitizer acts as a runtime monitor. When activated, the compiler inserts checks around memory operations. During runtime, these operations are cross-verified with a shadow memory to confirm their validity. Any discrepancies trigger a comprehensive report detailing the type of error and its location, proving invaluable in spotting and rectifying bugs that might otherwise lurk undetected and pose security risks.

Now, let’s examine it with the following sample code:

#include <stdio.h>
#include <stdlib.h>

struct test_ {
  int x[5];
  int z[5];
};

void print_struct(void *s) {
  struct test_ *tp = (struct test_ *)s;

  printf(" @%p\n .x: ", tp);
  for (int i = 0; i < 5; i++) {
    printf("%d, ", tp->x[i]);
  }
  printf("\n .z: ");
  for (int i = 0; i < 5; i++) {
    printf("%d, ", tp->z[i]);
  }

  printf("\n");
}

void fishy_func(int *z, int offset) {
  printf("\n\nin the fishy_function\n the .z values are \n ");
  for (int i = 0; i < 5; i++) {
    printf("%d, ", z[i]);
  }
  printf("\n");

  // change the value at the offset
  char *p = (char *)z;
  p[offset] = 1;
}

int main(void) {
  struct test_ t1 = {.x = {0, 0, 0, 0, 0}, .z = {0, 0, 0, 0, 0}};
  struct test_ t2 = {.x = {0, 0, 0, 0, 0}, .z = {0, 0, 0, 0, 0}};

  printf("t1\n");
  print_struct((void *)&t1);

  // the offset is -48 bytes, i.e. 48 bytes before the start of t1.z
  fishy_func(t1.z, -48);

  printf("\nafter fishy function \nt1\n");
  print_struct((void *)&t1);

  printf("\nt2\n");
  print_struct((void *)&t2);

  return (EXIT_SUCCESS);
}

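In this code, we are creating two structs, t1 and t2, each holding two arrays of five ints. We pass t1.z to fishy_func along with a byte offset of -48, and the function writes the value 1 to the byte at that offset from the start of the array.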

Let’s compile this code:

[girish@fedora test]$ gcc -o test test.c

It builds without error.

Let’s try to run it.

[girish@fedora test]$ ./test
t1
 @0x7ffcc6ef40d0
 .x: 0, 0, 0, 0, 0, 
 .z: 0, 0, 0, 0, 0, 


in the fishy_function
 the .z values are 
 0, 0, 0, 0, 0, 

after fishy function 
t1
 @0x7ffcc6ef40d0
 .x: 0, 0, 0, 0, 0, 
 .z: 0, 0, 0, 0, 0, 

t2
 @0x7ffcc6ef40a0
 .x: 0, 0, 0, 0, 0, 
 .z: 1, 0, 0, 0, 0, 
[girish@fedora test]$ 

There’s something peculiar. We passed the array z in the struct t1, but the function changed the first value in the array z of struct t2.

C does not perform bounds checking when a program writes to memory, so the program simply writes to whatever address the pointer and offset produce. With an arbitrary offset, such a write is undefined behavior and can silently corrupt unrelated data or crash the program. Because the offset can be computed at runtime, bugs like this can be very challenging to debug.

This is where the address sanitizer comes into play.

To use the address sanitizer, we need to compile the program with the -fsanitize=address flag.

Note: With GCC, this needs the libasan runtime library installed on the system. On Fedora, it can be installed using:

[girish@fedora test]$ sudo dnf install libasan

Once it is installed, the program can be compiled with:

[girish@fedora test]$ gcc -fsanitize=address -o test_asan test.c
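As an optional variation, two more standard GCC flags are often combined with the sanitizer. -g adds debug information so that stack frames in the report can show source file names and line numbers, and -fno-omit-frame-pointer is commonly recommended for more reliable stack traces. The report shown in this post was produced without them, so treat this as a suggestion rather than a required step.

```shell
gcc -g -fno-omit-frame-pointer -fsanitize=address -o test_asan test.c
```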

Running the program with Address Sanitizer active will detect the bug and provide a detailed report:

[girish@fedora test]$ ./test_asan
t1
 @0x7fff31bf2d20
 .x: 0, 0, 0, 0, 0, 
 .z: 0, 0, 0, 0, 0, 

in the fishy_function
 the .z values are 
 0, 0, 0, 0, 0, 

=================================================================
==71179==ERROR: AddressSanitizer: stack-buffer-underflow on address 0x7fff31bf2d04 at pc 0x0000004013ed bp 0x7fff31bf2cc0 sp 0x7fff31bf2cb8
WRITE of size 1 at 0x7fff31bf2d04 thread T0
    #0 0x4013ec in fishy_func (test_asan+0x4013ec)
    #1 0x4015b0 in main (test_asan+0x4015b0)
    #2 0x7f2de204a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)
    #3 0x7f2de204a5c8 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c8)
    #4 0x401104 in _start (test_asan+0x401104)
Address 0x7fff31bf2d04 is located in stack of thread T0 at offset 4 in frame
    #0 0x401402 in main (test_asan+0x401402)
  This frame has 2 object(s):
    [32, 72) 't1' (line 43)
    [112, 152) 't2' (line 44)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-underflow (test_asan+0x4013ec) in fishy_func
Shadow bytes around the buggy address:
  0x100066376550: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100066376560: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100066376570: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100066376580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100066376590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x1000663765a0:[f1]f1 f1 f1 00 00 00 00 00 f2 f2 f2 f2 f2 00 00
  0x1000663765b0: 00 00 00 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00
  0x1000663765c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000663765d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000663765e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000663765f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==71179==ABORTING

The output from the Address Sanitizer provides a lot of information.

Now, let’s dissect the main components of this output.

Error Type: ERROR: AddressSanitizer: stack-buffer-underflow. This indicates an attempt to access memory below the beginning of a local variable, often an array or buffer.

Address and Operation: WRITE of size 1 at 0x7fff31bf2d04. The program tried to write 1 byte of data at the memory address 0x7fff31bf2d04.

Call Stack: The numbered frames (#0 to #4) detail the sequence of function calls leading up to the error, innermost first. They show that the main function called fishy_func, where the error occurred.

Location of Buggy Address: Address 0x7fff31bf2d04 is located in stack of thread T0 at offset 4 in frame. This confirms that the error happened in the stack memory of the main thread (T0), at offset 4 within the stack frame of main.

This frame has 2 object(s): [32, 72) 't1' (line 43) and [112, 152) 't2' (line 44). There are two local objects in that frame, t1 and t2, and the ranges give their byte offsets within it. Offset 4 lies below the start of t1 at offset 32, which is why the access is reported as an underflow.

Hint: HINT: this may be a false positive... While ASan is generally accurate, this hint lists scenarios (a custom stack unwind mechanism, swapcontext, or vfork) where the reported issue might not be a genuine bug.

Shadow Bytes: The shadow bytes section shows the state of the memory around the buggy address. Each shadow byte represents 8 application bytes, and the legend at the bottom explains the meaning of each byte value. In this report, the row marked with => contains the shadow byte for the faulting address, shown in square brackets. Its value is f1, which corresponds to the “Stack left redzone,” reinforcing the fact that the write landed below the start of a stack variable.

Abort Message: ==71179==ABORTING. This final line indicates that ASan terminated the program because of the detected error.

Summary: AddressSanitizer detected a stack-buffer-underflow, which typically arises from accessing memory before the beginning of a local variable. The problematic write happened in fishy_func, which was called by main. The local variables in the vicinity of the error were t1 and t2. The hints and the shadow memory dump in the report help developers understand and resolve the issue.

Performance Implications

Using address sanitizers isn’t without its trade-offs. The added instrumentation increases the binary size and can roughly double the execution time. Memory usage typically grows two to three-fold as well. Yet, these overheads are justifiable, considering the potential costs and hazards of undetected memory bugs.

Limitations to Consider

For all their prowess, address sanitizers aren’t infallible. They overlook bugs in memory-mapped I/O regions and those arising from non-standard or custom allocators that replace malloc and free. Rarely, they might also flag false positives. Memory leaks are not caught by the address checks themselves; that is a job for LeakSanitizer, which GCC and Clang bundle with ASan and enable by default on x86-64 Linux.

Conclusion

Memory bugs in C and C++ can be exceptionally challenging to detect and remedy. They often manifest in subtle ways, leading to system crashes, unpredictable behavior, or glaring security vulnerabilities. AddressSanitizer, as demonstrated in this post, offers developers a powerful tool to identify such insidious issues. By instrumenting your code at compile time, it brings to light issues like out-of-bounds accesses and use-after-free errors.