All Questions
365 questions
1
vote
1
answer
130
views
How to tell GCC that it does not need to reload a value from memory after passing a pointer to a function with read-only access
Please consider the following program:
#include <stdio.h>
typedef char*(*g)(const char*)
__attribute__((__access__(__read_only__, 1)))
;
char *h(const char*)
__attribute__((__access__(...
1
vote
1
answer
48
views
Cortex-M loading 32-bit variable optimization
I'm trying to compile the test code below, which only writes a 32-bit variable through a pointer. I write it once using byte accesses and a second time using a word access.
void load_data_8(uint32_t ...
3
votes
3
answers
186
views
Why are these C functions optimized differently by GCC?
I am writing embedded code for an ARM Cortex-M processor. I wanted some generic function to copy memory to registers that are mapped contiguously into memory.
Since the C standard forbids aliasing ...
2
votes
1
answer
127
views
Why doesn't the compiler optimize strlen() calls in a loop, despite it being a pure function? And how would [[reproducible]] affect this?
I have two functions that convert a string to lowercase. The first one calls strlen() in every iteration, while the second one only calls it after modifying the string. I expected the compiler to ...
0
votes
0
answers
55
views
Optimizing declaration of symbols implemented in stdlib
I'm creating a static compiled application for an embedded device.
The device has implementation for common std functions like memset/strlen.
In order to reduce the size of the binary I'm compiling ...
2
votes
1
answer
123
views
Constant function pointer optimization
I am trying to implement an abstract interface in C using function pointers inside a struct.
Something like the following
typedef int (*fn_t)(int);
typedef struct
{
int x;
const fn_t fnp;
}...
0
votes
1
answer
58
views
Are these codes(concurrent C codes) necessary to use memory barrier?
I found this code in libuv 1.3.0, but I can't understand why (compiler) memory barriers are needed.
static int uv__async_make_pending(int* pending) {
/* Do a cheap read first. */
...
2
votes
2
answers
131
views
GCC optimisation damages system call stubs
Recent versions of GCC, including version 12, implement an inter-procedural analysis that fatally damages the following (GCC-only) code for a system call stub on ARM/Thumb.
typedef struct { int sender;...
2
votes
3
answers
271
views
Lock correctness with compiler optimizations (C / gcc)
When discussing concurrency with my professor, he mentioned that potential compiler optimizations around locks (re-ordering instructions, optimizing accesses, etc.), like pthread_mutex_lock() could ...
1
vote
3
answers
98
views
gcc optimization: Increment on condition
I've noticed that gcc12 does not optimize these two functions to the same code (with -O3):
int x = 0;
void f(bool a)
{
    if (a) {
        ++x;
    }
}
void f2(bool a)
{
    x += a;
}
Basically no ...
1
vote
0
answers
203
views
Why -O2 is different than -Os , when disabling corresponding optimization flags?
I am working on Ubuntu 18.04.6 LTS with Intel Core(TM) 3.20GHz.
The GCC documentation says the following about the -Os flag:
-Os
Optimize for size. -Os enables all -O2 optimizations except those that often ...
0
votes
1
answer
102
views
Does gcc cache the address of a global array at a static index by default?
Let's say I have the global array
char global_array[3] = {33, 17, 6}; and I access global_array[1]. When gcc interprets global_array[1], will it add 1 to the address of global_array at runtime, or ...
1
vote
2
answers
386
views
Why does the compiler not automatically remove branches?
I was curious about whether the compiler automatically removed branches when it could so I set up an easy case:
bool testConditionsIf(int i, int j) {
if (i == 1){
if (j == 2){
...
3
votes
0
answers
95
views
Why removing orphan functions will not help size of my statically linked file [duplicate]
Let's consider this minimal example:
main.c
#include <stdio.h>
int main()
{
printf("Hello World\n");
return 0;
}
and I compile it dynamically linked:
gcc main.c -o dl_out
...
1
vote
1
answer
182
views
Part of -ffast-math (FTZ, DAZ) is disabled when using optimization (GCC)
The gcc flag -funsafe-math-optimizations (part of -ffast-math) turns on FTZ and DAZ (flush-to-zero and denormals-are-zero). However, turning on optimization disables this behavior.
#include <stdio....
0
votes
1
answer
211
views
Fast file deletion in C (Linux)
I am trying to read from a file, perform an operation on the contents of the file, and then write to another file as fast as possible (for a competition). To do this, I mmap both the input and output ...
1
vote
1
answer
186
views
run-time large performance drop from gcc 7.5.0-6ubuntu2 to gcc 8.4.0-3ubuntu2
After starting to use the gcc 11 that ships with Ubuntu 22.04, I noticed a ~90% degradation in my C application's performance (the way I measure it).
Narrowing it down, I saw that the degradation has been present since gcc 8.4.0-...
1
vote
0
answers
83
views
gcc: "Boolean or"-optimization [duplicate]
Let's assume there's the following code:
#include <stdbool.h>
typedef struct {
bool a;
bool b;
} MyStruct2;
bool g(MyStruct2 *s) {
return s->a || s->b;
}
bool g2(MyStruct2 *...
2
votes
2
answers
501
views
Why is optimizing inline functions easier than normal functions?
I'm reading What Every Programmer Should Know About Memory
https://people.freebsd.org/~lstewart/articles/cpumemory.pdf and it says that inline functions make your code more optimizable
for example:
...
0
votes
1
answer
45
views
Strange behavior with -pg and optimization when generating hashes
I decided to make a program to find a sha3-512 hash with a certain number of zeroes at the start (like hashcash). It was working fine in my initial tests, so I decided to profile it with gprof to see ...
0
votes
3
answers
809
views
Will variable declaration inside infinite loop in c cause stack overflow
This is a simple question, but I would just like to throw this out there and appreciate if anyone could validate if my understanding is correct or provide some more insight. My apologies in advance if ...
0
votes
1
answer
69
views
Understanding Pointers and Scope in C With Different Compilers
I have recently encountered an issue with how gcc compilers handle a pointer in some legacy code. Taking the below as a quite cut down example of how the code is set up:
int someCondition; // Set ...
2
votes
1
answer
207
views
Why is GCC 11 compiler producing weird output when optimization is enabled?
Please take a look at this code:
#include <stdio.h>
#include <stdint.h>
int main()
{
uint8_t run = 1;
/* Define variable of interest and set to random value */
uint32_t ...
8
votes
1
answer
1k
views
Why is the XOR swap optimized into a normal swap using the MOV instruction?
While testing things around Compiler Explorer, I tried out the following overflow-free function for calculating the average of 2 unsigned 32-bit integers:
uint32_t average_1(uint32_t a, uint32_t b)
{
...
1
vote
1
answer
168
views
GCC optimization of (char) pointers makes no sense
I am using STM32 Cube IDE version 1.8.0 for a STM32 MCU project.
Builder settings are default. It is set to "External builder".
arm-none-eabi-gcc --version
arm-none-eabi-gcc (GNU Tools for ...
2
votes
0
answers
126
views
GCC optimization: killing variables or memory locations
Is there a way to tell the optimizer of recent versions of GCC that a particular variable or memory location is dead at a particular point? Is there a way to tell it that a memory location won't ...
0
votes
1
answer
150
views
OpenMP: gcc causes weird summation in case of -march=native (-march=skylake-avx512) and -O3
The following code will behave differently, depending on the optimization applied by gcc and on the target architecture:
#include <omp.h>
#include <stdlib.h>
#include <stdio.h>
#...
-3
votes
1
answer
220
views
How to inform GCC about hotspots(often executed) and coldspots(seldom executed) in the code?
Let's say there's this code:
int foo() {
//code
}
int bar() {
//code
}
And if it's known for a fact that foo is called many times and bar is not called very much, how can the compiler be informed of ...
0
votes
1
answer
207
views
gcc optimized out unused variable when it should not
Consider the following code, which comes mostly from the Bluedroid stack:
#include <stdint.h>
#include <assert.h>
#define STREAM_TO_UINT16(u16, p) {u16 = ((uint16_t)(*(p)) + (((uint16_t)...
0
votes
1
answer
135
views
Why GCC optimizes accessing global array variable
I have a problem with my code, as it seems GCC's reasoning is not the same as mine.
Here is a simplified version of the problem.
struct foo
{
uint16_t var;
};
struct foo foos[7];
struct foo foo;
...
7
votes
4
answers
519
views
GCC wrongly optimizes a pointer-equality test for a variable at a custom address
When optimizing, GCC seems to wrongly bypass a #define test.
First of all, I'm using my own link.ld linker script to provide a __foo__ symbol at the address 0xFFF (actually the lowest bits, not the ...
0
votes
1
answer
120
views
Why did GCC optimizes important assignment, making it into a different program behavior?
So, I am making a C program for my console interpreter to detect which type it is, then return it to typeret and store the pointer in the void pointer retpoint.
Now, what I am asking is not what is ...
1
vote
6
answers
2k
views
GCC Why Non Run Optimization All Time?
I wrote the well-known swap function in C and looked at the assembly output using gcc -S, then did the same again with -O2 optimizations.
The difference was pretty big as I saw only 5 lines ...
1
vote
1
answer
193
views
Rarely used function that is unused affecting the performance
There is an essential function in the program that is rarely used. However, I found out that even when this function is not used in a particular execution, this function causes an increase in runtime ...
3
votes
4
answers
902
views
will gcc optimization remove for loop if it's only one iteration?
I'm writing a real-time DSP processing library.
My intention is to give it a flexibility to define input samples blockSize, while also having best possible performance in case of sample-by-sample ...
6
votes
1
answer
642
views
Can anything be done to get GCC to compile a switch statement with thousands of cases faster?
I've been working on a decompiler (for Glulx) which produces C code. In one mode it outputs all of the code in one switch statement (switching on the program counter). A medium sized input file ...
2
votes
4
answers
643
views
What effect does Optimisation in GCC compiler have on overflow conditions
test.c
#include <limits.h>
#include <stdio.h>
int main()
{
    int a = INT_MAX - 1;
    if (a + 100 < a)
    {
        printf("overflow\n");
        return 1;
    }
    printf("%i\n", a + 100);
...
7
votes
0
answers
354
views
GCC optimization bug on weak const variable
I'm getting a weird gcc behaviour when dealing with weak const variables on different optimize levels (i.e. -O0 or -O1).
Here is the code:
def.h: declarations
const int var;
int copy;
int do_copy(void)...
2
votes
0
answers
308
views
Does using noipa or optnone attributes remove the need for asm("")
In GCC, one can use the noipa attribute to disable inter-procedural optimizations. Similarly, in Clang, optnone can be used to disable all optimization.
However, GCC's documentation for noinline states ...
0
votes
0
answers
1k
views
Controlling compiler inline optimization for memcpy
I like compiler inline optimization of memcpy instructions. On the arm processor I have found that it seems to inline up to 64 bytes with both gcc and clang. Without the inlining it seems to go to a ...
1
vote
0
answers
60
views
How can translate this statement in C to assembly language?
(C)
d = b * 7
(Assembly language)
(1)
movl b, %eax
movl $7, %ebx
mull %ebx
movl %eax, d
movl %edx, e
If a 32-bit integer is multiplied by another of the same type, the result needs 64 bits of space. ...
2
votes
6
answers
609
views
How can I make this loop run faster?
I'm using this code to find the highest temperature pixel in a thermal image and the coordinates of the pixel.
void _findMax(uint16_t *image, int sz, sPixelData *returnPixel)
{
int temp = 0;
...
0
votes
1
answer
83
views
Odd behaviour from compiler optimizing this loop
I wrote this code to find the highest temperature pixel in a thermal image. I also need to know the coordinates of the pixel in the image.
void _findMax(uint16_t* image, int sz, sPixelData* ...
3
votes
2
answers
645
views
Does gcc always do this kind of optimization? (common subexpression elimination)
As an example, assume that the expression sys->pot.atoms[item->P.kind].mass is evaluated inside a loop. The loop only changes item, so the expression can be simplified as atoms[item->P.kind]....
5
votes
1
answer
946
views
How to prevent gcc optimization breaking rep movsb code? [duplicate]
I tried to create my memcpy code with rep movsb instruction. It works perfectly with any size when the optimization is disabled. But, when I enable optimization, it does not work as expected.
...
0
votes
1
answer
276
views
if statement with empty block gcc compiler optimization behaviour
I have this C code
char msg[] = "hello";
int expr = some_function_giving_back_integers();
if(expr) {
_dbFunction(msg);
}
where _dbFunction(); is defined as follows
#define DBGFL 1
#...
3
votes
0
answers
103
views
Optimization for pure vs. const function
The source code I use in this post, is also available here: https://gcc.godbolt.org/z/dGvxnv
Given this C source code:
int pure_f(int a, int b) __attribute__((pure));
int const_f(int a, int b) ...
1
vote
1
answer
316
views
Reducing size of C program to fit in qr code
I wanted to fit minesweeper into a qr code, which has a maximum size allowance of 3KB. As it stands, my program's binary is 12KB, and its object file is 3KB. I tried using UPX on the binary and ...
2
votes
3
answers
1k
views
Is it necessary to use the "volatile" qualifier even in case the GCC optimisations are turned off?
My question is targeted towards the embedded development, specifically STM32.
I am well aware of the fact that the use of volatile qualifier for a variable is crucial when dealing with a program with ...
0
votes
1
answer
143
views
C: Weird label behavior with -O3?
Given this code (minimal example; there's no deeper meaning in it):
#include <stdio.h>
int main() {
start:
asm volatile(
".code64\n\t"
"push %rax\n\t"...