384

I am executing my a.out file. After execution the program runs for some time then exits with the message:

**** stack smashing detected ***: ./a.out terminated*
*======= Backtrace: =========*
*/lib/tls/i686/cmov/libc.so.6(__fortify_fail+0x48)Aborted*

What could be the possible reasons for this and how do I rectify it?

2
  • 3
    Could you perhaps identify which parts of you code causes the stack smashing, and post it? Then we will probably be able to point out exactly why it happens and how to correct it. Commented Aug 30, 2009 at 8:45
  • 3
    I think it is synonym with overflow error. For example if you initialize and array of 5 elements this error will appear when trying to write the 6th element, or any element outside the bounds of the array. Commented Sep 12, 2018 at 20:09

10 Answers 10

493

Stack Smashing here is actually caused due to a protection mechanism used by gcc to detect buffer overflow errors. For example in the following snippet:

#include <stdio.h>

void func()
{
    char array[10];
    gets(array);
}

int main(int argc, char **argv)
{
    func();
}

The compiler, (in this case gcc) adds protection variables (called canaries) which have known values. An input string of size greater than 10 causes corruption of this variable resulting in SIGABRT to terminate the program.

To get some insight, you can try disabling this protection of gcc using option -fno-stack-protector while compiling. In that case you will get a different error, most likely a segmentation fault as you are trying to access an illegal memory location. Note that -fstack-protector should always be turned on for release builds as it is a security feature.

You can get some information about the point of overflow by running the program with a debugger. Valgrind doesn't work well with stack-related errors, but like a debugger, it may help you pin-point the location and reason for the crash.

13
  • 6
    thanks for this answer! I found that in my case I had not initialized the variable I was trying to write to Commented Jun 13, 2012 at 0:16
  • 5
    Valgrind doesn't work well for stack-related errors, since it can't add red zones there Commented Jan 7, 2014 at 10:35
  • 10
    This answer is incorrect, and provides dangerous advice. First of all, removing stack protector is not the right solution -- if you are getting a stack smashing error, you probably have a serious security vulnerability in your code. The correct response is to fix the buggy code. Second, as grasGendarme points out, the recommendation to try Valgrind is not going to be effective. Valgrind typically doesn't work for detecting illegal memory accesses to stack-allocated data.
    – D.W.
    Commented Feb 9, 2014 at 2:16
  • 53
    The OP asks for possible reasons for this behaviour, my answer provides an example and how it relates to a reasonably known error. Besides, removing the stack-protector is not a solution, it is sort of an experiment one could do to get more insights to the problem. The advice actually is to fix the error somehow, thanks for pointing about valgrind, i will edit my answer to reflect this.
    – sud03r
    Commented Feb 9, 2014 at 9:14
  • 4
    @D.W. the stack protection should be turned off in a release version, because at first -- the stack smashing detected message is a help only for a developers; at second -- an application could have yet chances to survive; and at third -- this is a tiny optimization.
    – Hi-Angel
    Commented Jun 4, 2014 at 12:44
101

Minimal reproduction example with disassembly analysis

main.c

void myfunc(char *const src, int len) {
    int i;
    for (i = 0; i < len; ++i) {
        src[i] = 42;
    }
}

int main(void) {
    char arr[] = {'a', 'b', 'c', 'd'};
    int len = sizeof(arr);
    myfunc(arr, len + 1); /* Cause smashing by writing one byte too many. */
    return 0;
}

GitHub upstream.

Compile and run:

gcc -fstack-protector-all -g -O0 -std=c99 main.c
ulimit -c unlimited && rm -f core
./a.out

fails as desired:

*** stack smashing detected ***: terminated
Aborted (core dumped)

Tested on Ubuntu 20.04, GCC 10.2.0.

On Ubuntu 16.04, GCC 6.4.0, I could reproduce with -fstack-protector instead of -fstack-protector-all, but it stopped blowing up when I tested on GCC 10.2.0 as per Geng Jiawen's comment. man gcc clarifies that as suggested by the option name, the -all version adds checks more aggressively, and therefore presumably incurs a larger performance loss:

-fstack-protector

Emit extra code to check for buffer overflows, such as stack smashing attacks. This is done by adding a guard variable to functions with vulnerable objects. This includes functions that call "alloca", and functions with buffers larger than or equal to 8 bytes. The guards are initialized when a function is entered and then checked when the function exits. If a guard check fails, an error message is printed and the program exits. Only variables that are actually allocated on the stack are considered, optimized away variables or variables allocated in registers don't count.

-fstack-protector-all

Like -fstack-protector except that all functions are protected.

Disassembly

Now we look at the disassembly:

objdump -D a.out

which contains:

int main (void){
  400579:       55                      push   %rbp
  40057a:       48 89 e5                mov    %rsp,%rbp

  # Allocate 0x10 of stack space.
  40057d:       48 83 ec 10             sub    $0x10,%rsp

  # Put the 8 byte canary from %fs:0x28 to -0x8(%rbp),
  # which is right at the bottom of the stack.
  400581:       64 48 8b 04 25 28 00    mov    %fs:0x28,%rax
  400588:       00 00 
  40058a:       48 89 45 f8             mov    %rax,-0x8(%rbp)

  40058e:       31 c0                   xor    %eax,%eax
    char arr[] = {'a', 'b', 'c', 'd'};
  400590:       c6 45 f4 61             movb   $0x61,-0xc(%rbp)
  400594:       c6 45 f5 62             movb   $0x62,-0xb(%rbp)
  400598:       c6 45 f6 63             movb   $0x63,-0xa(%rbp)
  40059c:       c6 45 f7 64             movb   $0x64,-0x9(%rbp)
    int len = sizeof(arr);
  4005a0:       c7 45 f0 04 00 00 00    movl   $0x4,-0x10(%rbp)
    myfunc(arr, len + 1);
  4005a7:       8b 45 f0                mov    -0x10(%rbp),%eax
  4005aa:       8d 50 01                lea    0x1(%rax),%edx
  4005ad:       48 8d 45 f4             lea    -0xc(%rbp),%rax
  4005b1:       89 d6                   mov    %edx,%esi
  4005b3:       48 89 c7                mov    %rax,%rdi
  4005b6:       e8 8b ff ff ff          callq  400546 <myfunc>
    return 0;
  4005bb:       b8 00 00 00 00          mov    $0x0,%eax
}
  # Check that the canary at -0x8(%rbp) hasn't changed after calling myfunc.
  # If it has, jump to the failure point __stack_chk_fail.
  4005c0:       48 8b 4d f8             mov    -0x8(%rbp),%rcx
  4005c4:       64 48 33 0c 25 28 00    xor    %fs:0x28,%rcx
  4005cb:       00 00 
  4005cd:       74 05                   je     4005d4 <main+0x5b>
  4005cf:       e8 4c fe ff ff          callq  400420 <__stack_chk_fail@plt>

  # Otherwise, exit normally.
  4005d4:       c9                      leaveq 
  4005d5:       c3                      retq   
  4005d6:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  4005dd:       00 00 00 

Notice the handy comments automatically added by objdump's artificial intelligence module.

If you run this program multiple times through GDB, you will see that:

  • the canary gets a different random value every time
  • the last loop of myfunc is exactly what modifies the address of the canary

The canary randomized by setting it with %fs:0x28, which contains a random value as explained at:

Debug attempts

From now on, we modify the code:

    myfunc(arr, len + 1);

to be instead:

    myfunc(arr, len);
    myfunc(arr, len + 1); /* line 12 */
    myfunc(arr, len);

to be more interesting.

We will then try to see if we can pinpoint the culprit + 1 call with a method more automated than just reading and understanding the entire source code.

gcc -fsanitize=address to enable Google's Address Sanitizer (ASan)

If you recompile with this flag and run the program, it outputs:

#0 0x4008bf in myfunc /home/ciro/test/main.c:4
#1 0x40099b in main /home/ciro/test/main.c:12
#2 0x7fcd2e13d82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
#3 0x400798 in _start (/home/ciro/test/a.out+0x40079

followed by some more colored output.

This clearly pinpoints the problematic line 12.

The source code for this is at: https://github.com/google/sanitizers but as we saw from the example it is already upstreamed into GCC.

ASan can also detect other memory problems such as memory leaks: How to find memory leak in a C++ code/project?

Valgrind SGCheck

As mentioned by others, Valgrind is not good at solving this kind of problem.

It does have an experimental tool called SGCheck:

SGCheck is a tool for finding overruns of stack and global arrays. It works by using a heuristic approach derived from an observation about the likely forms of stack and global array accesses.

So I was not very surprised when it did not find the error:

valgrind --tool=exp-sgcheck ./a.out

The error message should look like this apparently: Valgrind missing error

GDB

An important observation is that if you run the program through GDB, or examine the core file after the fact:

gdb -nh -q a.out core

then, as we saw on the assembly, GDB should point you to the end of the function that did the canary check:

(gdb) bt
#0  0x00007f0f66e20428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007f0f66e2202a in __GI_abort () at abort.c:89
#2  0x00007f0f66e627ea in __libc_message (do_abort=do_abort@entry=1, fmt=fmt@entry=0x7f0f66f7a49f "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007f0f66f0415c in __GI___fortify_fail (msg=<optimized out>, msg@entry=0x7f0f66f7a481 "stack smashing detected") at fortify_fail.c:37
#4  0x00007f0f66f04100 in __stack_chk_fail () at stack_chk_fail.c:28
#5  0x00000000004005f6 in main () at main.c:15
(gdb) f 5
#5  0x00000000004005f6 in main () at main.c:15
15      }
(gdb)

And therefore the problem is likely in one of the calls that this function made.

Next we try to pinpoint the exact failing call by first single stepping up just after the canary is set:

  400581:       64 48 8b 04 25 28 00    mov    %fs:0x28,%rax
  400588:       00 00 
  40058a:       48 89 45 f8             mov    %rax,-0x8(%rbp)

and watching the address:

(gdb) p $rbp - 0x8
$1 = (void *) 0x7fffffffcf18
(gdb) watch 0x7fffffffcf18
Hardware watchpoint 2: *0x7fffffffcf18
(gdb) c
Continuing.

Hardware watchpoint 2: *0x7fffffffcf18

Old value = 1800814336
New value = 1800814378
myfunc (src=0x7fffffffcf14 "*****?Vk\266", <incomplete sequence \355\216>, len=5) at main.c:3
3           for (i = 0; i < len; ++i) {
(gdb) p len
$2 = 5
(gdb) p i
$3 = 4
(gdb) bt
#0  myfunc (src=0x7fffffffcf14 "*****?Vk\266", <incomplete sequence \355\216>, len=5) at main.c:3
#1  0x00000000004005cc in main () at main.c:12

Now, this does leaves us at the right offending instruction: len = 5 and i = 4, and in this particular case, did point us to the culprit line 12.

However, the backtrace is corrupted, and contains some trash. A correct backtrace would look like:

#0  myfunc (src=0x7fffffffcf14 "abcd", len=4) at main.c:3
#1  0x00000000004005b8 in main () at main.c:11

so maybe this could corrupt the stack and prevent you from seeing the trace.

Also, this method requires knowing what is the last call of the canary checking function otherwise you will have false positives, which will not always be feasible, unless you use reverse debugging.

5
19

Please look at the following situation:

ab@cd-x:$ cat test_overflow.c 
#include <stdio.h>
#include <string.h>

int check_password(char *password){
    int flag = 0;
    char buffer[20];
    strcpy(buffer, password);

    if(strcmp(buffer, "mypass") == 0){
        flag = 1;
    }
    if(strcmp(buffer, "yourpass") == 0){
        flag = 1;
    }
    return flag;
}

int main(int argc, char *argv[]){
    if(argc >= 2){
        if(check_password(argv[1])){
            printf("%s", "Access granted\n");
        }else{
            printf("%s", "Access denied\n");
        }
    }else{
        printf("%s", "Please enter password!\n");
    }
}
ab@cd-x:$ gcc -g -fno-stack-protector test_overflow.c 
ab@cd-x:$ ./a.out mypass
Access granted
ab@cd-x:$ ./a.out yourpass
Access granted
ab@cd-x:$ ./a.out wepass
Access denied
ab@cd-x:$ ./a.out wepassssssssssssssssss
Access granted

ab@cd-x:$ gcc -g -fstack-protector test_overflow.c 
ab@cd-x:$ ./a.out wepass
Access denied
ab@cd-x:$ ./a.out mypass
Access granted
ab@cd-x:$ ./a.out yourpass
Access granted
ab@cd-x:$ ./a.out wepassssssssssssssssss
*** stack smashing detected ***: ./a.out terminated
======= Backtrace: =========
/lib/tls/i686/cmov/libc.so.6(__fortify_fail+0x48)[0xce0ed8]
/lib/tls/i686/cmov/libc.so.6(__fortify_fail+0x0)[0xce0e90]
./a.out[0x8048524]
./a.out[0x8048545]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xc16b56]
./a.out[0x8048411]
======= Memory map: ========
007d9000-007f5000 r-xp 00000000 08:06 5776       /lib/libgcc_s.so.1
007f5000-007f6000 r--p 0001b000 08:06 5776       /lib/libgcc_s.so.1
007f6000-007f7000 rw-p 0001c000 08:06 5776       /lib/libgcc_s.so.1
0090a000-0090b000 r-xp 00000000 00:00 0          [vdso]
00c00000-00d3e000 r-xp 00000000 08:06 1183       /lib/tls/i686/cmov/libc-2.10.1.so
00d3e000-00d3f000 ---p 0013e000 08:06 1183       /lib/tls/i686/cmov/libc-2.10.1.so
00d3f000-00d41000 r--p 0013e000 08:06 1183       /lib/tls/i686/cmov/libc-2.10.1.so
00d41000-00d42000 rw-p 00140000 08:06 1183       /lib/tls/i686/cmov/libc-2.10.1.so
00d42000-00d45000 rw-p 00000000 00:00 0 
00e0c000-00e27000 r-xp 00000000 08:06 4213       /lib/ld-2.10.1.so
00e27000-00e28000 r--p 0001a000 08:06 4213       /lib/ld-2.10.1.so
00e28000-00e29000 rw-p 0001b000 08:06 4213       /lib/ld-2.10.1.so
08048000-08049000 r-xp 00000000 08:05 1056811    /dos/hacking/test/a.out
08049000-0804a000 r--p 00000000 08:05 1056811    /dos/hacking/test/a.out
0804a000-0804b000 rw-p 00001000 08:05 1056811    /dos/hacking/test/a.out
08675000-08696000 rw-p 00000000 00:00 0          [heap]
b76fe000-b76ff000 rw-p 00000000 00:00 0 
b7717000-b7719000 rw-p 00000000 00:00 0 
bfc1c000-bfc31000 rw-p 00000000 00:00 0          [stack]
Aborted
ab@cd-x:$ 

When I disabled the stack smashing protector no errors were detected, which should have happened when I used "./a.out wepassssssssssssssssss"

So to answer your question above, the message "** stack smashing detected : xxx" was displayed because your stack smashing protector was active and found that there is stack overflow in your program.

Just find out where that occurs, and fix it.

0
8

You could try to debug the problem using valgrind:

The Valgrind distribution currently includes six production-quality tools: a memory error detector, two thread error detectors, a cache and branch-prediction profiler, a call-graph generating cache profiler, and a heap profiler. It also includes two experimental tools: a heap/stack/global array overrun detector, and a SimPoint basic block vector generator. It runs on the following platforms: X86/Linux, AMD64/Linux, PPC32/Linux, PPC64/Linux, and X86/Darwin (Mac OS X).

3
  • 2
    Yeah, but Valgrind doesn't work well for overflows of stack-allocated buffers, which is the situation that this error message indicates.
    – D.W.
    Commented Feb 9, 2014 at 2:17
  • 4
    How could we use that stack array overrun detector? Can you elaborate? Commented Mar 18, 2014 at 0:54
  • 1
    @CraigMcQueen I've tried to use Valgrind's experimental heuristic SGCheck stack smashing detector on a minimal example: stackoverflow.com/a/51897264/895245 but it failed. Commented Aug 17, 2018 at 16:49
6

It means that you wrote to some variables on the stack in an illegal way, most likely as the result of a Buffer overflow.

2
  • 9
    Stack overflow is the stack smashing into something else. Here it is the other way around: something has smashed into the stack. Commented Aug 28, 2009 at 9:27
  • 6
    Not really. It's one part of the stack smashing into another part. So it really is a buffer overflow, just not over the top of stack, but "only" into another part of the stack.
    – Bas Wijnen
    Commented Nov 30, 2012 at 7:48
4

What could be the possible reasons for this and how do I rectify it?

One scenario would be in the following example:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void swap ( char *a , char *b );
void revSTR ( char *const src );

int main ( void ){
    char arr[] = "A-B-C-D-E";

    revSTR( arr );
    printf("ARR = %s\n", arr );
}

void swap ( char *a , char *b ){
    char tmp = *a;
    *a = *b;
    *b = tmp;
}

void revSTR ( char *const src ){
    char *start = src;
    char *end   = start + ( strlen( src ) - 1 );

    while ( start < end ){
        swap( &( *start ) , &( *end ) );
        start++;
        end--;
    }
}

In this program you can reverse a String or a part of the string if you for example call reverse() with something like this:

reverse( arr + 2 );

If you decide to pass the length of the array like this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void swap ( char *a , char *b );
void revSTR ( char *const src, size_t len );

int main ( void ){
    char arr[] = "A-B-C-D-E";
    size_t len = strlen( arr );

    revSTR( arr, len );
    printf("ARR = %s\n", arr );
}

void swap ( char *a , char *b ){
    char tmp = *a;
    *a = *b;
    *b = tmp;
}

void revSTR ( char *const src, size_t len ){
    char *start = src;
    char *end   = start + ( len - 1 );

    while ( start < end ){
        swap( &( *start ) , &( *end ) );
        start++;
        end--;
    }
}

Works fine too.

But when you do this:

revSTR( arr + 2, len );

You get get:

==7125== Command: ./program
==7125== 
ARR = A-
*** stack smashing detected ***: ./program terminated
==7125== 
==7125== Process terminating with default action of signal 6 (SIGABRT)
==7125==    at 0x4E6F428: raise (raise.c:54)
==7125==    by 0x4E71029: abort (abort.c:89)
==7125==    by 0x4EB17E9: __libc_message (libc_fatal.c:175)
==7125==    by 0x4F5311B: __fortify_fail (fortify_fail.c:37)
==7125==    by 0x4F530BF: __stack_chk_fail (stack_chk_fail.c:28)
==7125==    by 0x400637: main (program.c:14)

And this happens because in the first code, the length of arr is checked inside of revSTR() which is fine, but in the second code where you pass the length:

revSTR( arr + 2, len );

the Length is now longer then the actually length you pass when you say arr + 2.

Length of strlen ( arr + 2 ) != strlen ( arr ).

1
  • 1
    I like this example because it does not rely on standard library functions like gets and scrcpy. I wonder if we could minimize if further. I would at least get rid of string.h with size_t len = sizeof( arr );. Tested on gcc 6.4, Ubuntu 16.04. I would also give the failing example with the arr + 2 to minimize copy pasting. Commented Aug 17, 2018 at 8:47
4

Stack corruptions ususally caused by buffer overflows. You can defend against them by programming defensively.

Whenever you access an array, put an assert before it to ensure the access is not out of bounds. For example:

assert(i + 1 < N);
assert(i < N);
a[i + 1] = a[i];

This makes you think about array bounds and also makes you think about adding tests to trigger them if possible. If some of these asserts can fail during normal use turn them into a regular if.

0

I got this error while using malloc() to allocate some memory to a struct * after spending some this debugging the code, I finally used free() function to free the allocated memory and subsequently the error message gone :)

0

Another source of stack smashing is (incorrect) use of vfork() instead of fork().

I just debugged a case of this, where the child process was unable to execve() the target executable and returned an error code rather than calling _exit().

Because vfork() had spawned that child, it returned while actually still executing within the parent's process space, not only corrupting the parent's stack, but causing two disparate sets of diagnostics to be printed by "downstream" code.

Changing vfork() to fork() fixed both problems, as did changing the child's return statement to _exit() instead.

But since the child code precedes the execve() call with calls to other routines (to set the uid/gid, in this particular case), it technically does not meet the requirements for vfork(), so changing it to use fork() is correct here.

(Note that the problematic return statement was not actually coded as such -- instead, a macro was invoked, and that macro decided whether to _exit() or return based on a global variable. So it wasn't immediately obvious that the child code was nonconforming for vfork() usage.)

For more information, see:

The difference between fork(), vfork(), exec() and clone()

0
0

I encountered this when I edited the struct, but didn't recompile libs that use that struct. In some big project I added new fields to struct, that later are getting parsed from json in lib_struct, and this lib is later used in widgets to show what is parsed. My make file didn't have the dependencies covered, so the lib didn't recompile after editing the struct. My solution was to recompile all things that use the struct.

6
  • This does not really answer the question. If you have a different question, you can ask it by clicking Ask Question. To get notified when this question gets new answers, you can follow this question. Once you have enough reputation, you can also add a bounty to draw more attention to this question. - From Review Commented Sep 29, 2021 at 21:24
  • 1
    @SangeerththanBalachandran I think it does answer the question, which is What could be the possible reasons for this and how do I rectify it?. I showed a reason that I did not see in the list of answers and added the solution that solved the problem for me.
    – ygroeg
    Commented Sep 30, 2021 at 6:37
  • this is not the problem OP was facing and your problem is with the makefile specific for a project you have worked. Commented Sep 30, 2021 at 7:32
  • @SangeerththanBalachandran I believe that, if the same problem has different reasons, why shouldn't I post the path to different solution and thought process? The solution that is marked as correct, won't be able to solve the makefile problem. The fact that, OP wasn't facing this problem, doesn't mean that all people that encounter this error later will solve it like OP did. A lot of people use makefiles for their projects and a lot of those can make mistakes in them.
    – ygroeg
    Commented Sep 30, 2021 at 10:03
  • in such a case it will be useful to further provide what kind of mistakes happened specifically. Commented Sep 30, 2021 at 10:16

Not the answer you're looking for? Browse other questions tagged or ask your own question.