ARCGCC GettingStarted
ARC® GCC
GNU Compiler Collection
Getting Started
Copyright 2010 ~ Virage Logic Corporation ~ Proprietary Information ~ Do Not Duplicate
Contents
Chapter 1 — Overview
Installation
Tutorial
Chapter 2 — Documentation
Installation Instructions
Online Documentation
Chapter 1 — Overview
The ARC® GNU toolchain is a complete compiler toolchain for ARC 600 and ARC 700
processors.
The following sections provide further details on building and using the toolchain:
• The Installation section outlines the build and installation requirements.
• The Tutorial section outlines how to compile and execute a simple HelloWorld
application.
Installation
The ARC GCC toolchain is delivered as a source package, so that you can build it on your
specific host platform (Linux or Cygwin™ for Windows).
To build the ARC GCC toolchain from source, the following are required:
• A native GCC for the host operating system, version 3.4 or above (for example, GCC
x86 for Linux)
• Make version 3.80 or greater
• Texinfo version 3.8
• bison version 1.875 or greater
• byacc
• Building Insight (the graphical interface of GDB) requires the following additional
packages:
  • qt (or qt-devel)
  • libtermcap/termcap (or libtermcap-devel)
  • ncurses (or libtermcap-devel)
A set of shell scripts is provided to automate the build process from source. To build the
ARC GCC toolchain (ELF32 version) you can use the script build_elf32.sh as follows:
# cd $UNZIP_LOC
# ./build_elf32.sh $INSTALL_DIR
Tutorial
After installing the ARC GCC toolchain (or building it from source), try compiling and executing
hello.c with the ARC GCC toolchain (ELF32 version) as explained below.
Example 1 Listing of hello.c
#include <stdio.h>
int main(void)
{
    printf("Hello World \n");
    /* Return 0 to report success to the caller. */
    return 0;
}
Invoke the compiler to compile hello.c for ARC 700 (-mA7) with debug information (-g).
# arc-elf32-gcc -mA7 -g hello.c -o hello.out
Chapter 2 — Documentation
Installation Instructions
Specific installation instructions for the toolchain and any third-party tools are contained in
the readme file provided with the installation.
Online Documentation
The ARC GCC toolchain is supplied with built-in on-line documentation (info pages) that is
visible after the toolchain is installed.
In addition, PDF documentation is located in $UNZIP_DIR/docs. The starting point is
contents.pdf.
-mswap Allow generation of swap instruction through the use of builtins. For ARC
700, the -mswap option is turned on by default.
-mmul64 Allow generation of mul64 and mulu64 instructions, through the use of
builtins. Not available for ARC 700.
-mno-mpy Disallow generation of ARC 700 multiplier instructions by the compiler. The
ARC 700 multiplier instructions are enabled by default.
-mEA Generate extended arithmetic instructions. Presently, only the generation of
divaw with the help of compiler builtins is supported.
-msoft-float Instruct the compiler to use software emulation for floating point operations.
This is the default.
-mlong-calls Generate all function calls as register indirect. By default, the compiler
generates bl <<s25 offset>> instructions for function calls. This restricts the
range of target addresses for a function call to a signed 25-bit offset. Unless
-mno-cond-exec is in effect, function calls may be conditionalized, reducing
s25 offsets to s21 offsets. To call functions placed at a higher offset, a register
indirect jump-and-link instruction needs to be used. This switch instructs the
compiler to call functions with a register-indirect jump-and-link and thus enables
calling functions within the entire 32-bit address space. This flag can be overridden
with the short_call function attribute.
-mno-brcc Do not generate BRcc instructions for the ARC700 target. The default behavior is
to generate BRcc instructions with the -mA7/-mARC700 switch.
-mno-sdata Do not generate code using the small-data sections. For ARCompact targets,
the compiler allocates space for externally visible global variables in the small-
data sections and generates GP register based loads/stores to access them.
-mno-millicode Do not generate millicode for saving/restoring registers in function
prologues/epilogues. This flag is required only with the -Os switch. For other
optimization levels, millicode thunk generation is disabled by default.
-mdynamic Link with dynamic libraries, if these are available (only for the ARC-LINUX-
UCLIBC build of the toolchain).
-mspfp, -mspfp_compact Generate single-precision FPX instructions. The instruction scheduling is
done assuming latencies for the compact build of the FPX SPFP extensions.
-mspfp_fast Generate single-precision FPX instructions. The instruction scheduling is done
assuming latencies for the fast build of the FPX SPFP extensions.
-mdpfp, -mdpfp_compact Generate double-precision FPX instructions. The instruction scheduling is
done assuming latencies for the compact build of the FPX DPFP extensions.
-mdpfp_fast Generate double-precision FPX instructions. The instruction scheduling is done
assuming latencies for the fast build of the FPX DPFP extensions.
-msimd Allow generation of vector instructions, through the use of SIMD builtins
provided by the compiler.
The following additional assembler directives have been added for Position Independent
Code. These directives are available only with the ARC700 processor. They
generate relocation entries which are interpreted by the linker in the manner described below.
@gotpc Relative distance of the symbol's GOT entry from the current pc location.
@gotoff Distance of the symbol from the base of the GOT.
@plt32 Distance of the symbol's PLT entry from the current pc. Currently this is valid only
with branch-and-link instructions and pc-relative calls.
__GLOBAL_OFFSET_TABLE__
Symbol referring to the base of the GOT.
__DYNAMIC__
An alias for the GOT base. These symbols can be used only with @gotpc modifiers.
-marcelf-prof Like -marcelf, but preserve special sections used for profiling.
-marclinux Used for Linux binaries on ARC 700. This allows linking with shared
libraries. Any program that links against shared libraries for
ARC 700 must therefore pass the -marclinux switch to
the linker. This is the default for the ARC-LINUX-UCLIBC build of
the tools (arc-linux-uclibc-ld).
-marclinux-prof Like -marclinux, but preserve special sections used for profiling.
Register usage
GCC for ARC follows the register usage conventions specified by the ABI (as given in the
table below).
Function                             Registers
Result                               R0-R1
Arguments                            R0-R7
Caller Saved Registers               R0-R12
Callee Saved Registers               R13-R25
Static chain pointer (if required)   R11
Register for temp calculation        R12
Global Pointer                       R26 (GP)
Frame Pointer                        R27 (FP)
Stack Pointer                        R28 (SP)
Interrupt Link Register 1            R29 (ILINK1)
Interrupt Link Register 2            R30 (ILINK2)
Branch Link Register                 R31 (BLINK)
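The register table above can be captured as a small lookup, which is sometimes handy when reading disassembly or writing inline assembly clobber lists. The following is only a sketch: the enum and function names are hypothetical and are not part of the toolchain, and registers above R31 are reported as unknown because the table does not cover them.

```c
#include <stddef.h>

/* Hypothetical classification of core registers, following the ABI table. */
enum arc_reg_role {
    ARC_REG_CALLER_SAVED,   /* R0-R12  */
    ARC_REG_CALLEE_SAVED,   /* R13-R25 */
    ARC_REG_SPECIAL,        /* R26-R31: GP, FP, SP, ILINK1, ILINK2, BLINK */
    ARC_REG_UNKNOWN         /* not listed in the table */
};

enum arc_reg_role arc_reg_role_of(unsigned regno)
{
    if (regno <= 12) return ARC_REG_CALLER_SAVED;
    if (regno <= 25) return ARC_REG_CALLEE_SAVED;
    if (regno <= 31) return ARC_REG_SPECIAL;
    return ARC_REG_UNKNOWN;
}

/* Returns the ABI name of a special register, or NULL for any other number. */
const char *arc_special_reg_name(unsigned regno)
{
    switch (regno) {
    case 26: return "GP";
    case 27: return "FP";
    case 28: return "SP";
    case 29: return "ILINK1";
    case 30: return "ILINK2";
    case 31: return "BLINK";
    default: return NULL;
    }
}
```

For example, arc_reg_role_of(11) reports the static chain pointer R11 as caller saved, which matches the R0-R12 row of the table.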
Stack Structure
The ARCtangent runtime environment uses a grow-down stack that contains return register,
non-volatile registers, frame pointer, local variables, and a routine’s parameter information.
• Decrements the stack pointer (SP) to account for the new stack frame
At the end of the function, the compiler-generated epilogue does the following:
• Restores the stack pointer (SP) to the beginning of the saved register area
• Restores all non-volatile registers that were saved in the register area
• Restores caller’s frame pointer register (FP), if required
• Restores the return address register (BLINK)
• Restores caller’s stack pointer register (SP)
• Returns to caller through the address stored in the BLINK register.
NOTE The return address save/restore is elided for leaf functions.
Parameter Passing
Parameters are passed in registers first and then on the stack. The first 8 words (32 bytes) of
arguments are loaded into registers r0 to r7. The remaining arguments are passed by storing
them into the stack immediately above the stack pointer register (sp).
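The rule above can be modeled at word granularity. The sketch below is a simplified illustration, not the compiler's implementation: the type and function names are hypothetical, and it assumes that stack-passed words are laid out contiguously starting at offset 0 from sp. Alignment and splitting rules for multi-word arguments are not described in this section and are not modeled.

```c
#define ARG_REG_WORDS 8u   /* r0 to r7 hold the first 8 argument words */
#define WORD_BYTES    4u

/* Hypothetical description of where one argument word lives. */
typedef struct {
    int      in_register;  /* 1: passed in a register, 0: passed on the stack */
    unsigned index;        /* register number, or byte offset from sp (assumed) */
} arg_word_slot;

/* Maps the N-th argument word (counting from 0) to its assumed location. */
arg_word_slot locate_arg_word(unsigned word_index)
{
    arg_word_slot slot;
    if (word_index < ARG_REG_WORDS) {
        slot.in_register = 1;
        slot.index = word_index;                     /* r0 .. r7 */
    } else {
        slot.in_register = 0;
        slot.index = (word_index - ARG_REG_WORDS) * WORD_BYTES;
    }
    return slot;
}
```

Under these assumptions, the ninth argument word (index 8) is the first one stored on the stack.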
Return Value
• In the ARCtangent runtime environment, function values are returned in register r0
• A doubleword result (double or long double) is returned in registers r0 to r1
• Structures that can be contained in a single 4-byte register are returned in r0
• All other structures are returned by storing them at an address passed to the function as a
hidden first argument
• Sixty-four-bit integers (long long) are returned in registers r0 to r1
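As an illustration of the rules above, the following functions cover each return category. The struct and function names are made up for this sketch, and the comments restate the rules in this section rather than describing verified compiler output.

```c
/* 4 bytes on typical targets: fits in a single register. */
struct small_pair { short a; short b; };

/* 16 bytes: too large for a single register. */
struct quad { int v[4]; };

/* Word-sized result: returned in r0. */
int ret_word(void) { return 7; }

/* 64-bit integer: returned in r0 and r1. */
long long ret_dword(void) { return 1LL << 40; }

/* Small structure: returned in r0. */
struct small_pair ret_small(void)
{
    struct small_pair p = { 1, 2 };
    return p;
}

/* Larger structure: the caller passes the address of the result
   as a hidden first argument and the callee stores into it. */
struct quad ret_large(void)
{
    struct quad q = { { 10, 20, 30, 40 } };
    return q;
}
```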
ld %r11, [__GLOBAL_OFFSET_TABLE__+4]
ld %r10, [__GLOBAL_OFFSET_TABLE__+8]
j [%r10]
__GLOBAL_OFFSET_TABLE__
● Position Independent PLT0
ld %r11, [pcl,__GLOBAL_OFFSET_TABLE__@gotpc + 4]
ld %r10, [pcl,__GLOBAL_OFFSET_TABLE__@gotpc + 8]
j [%r10]
__GLOBAL_OFFSET_TABLE__
● The PLTn Entry has been defined as follows.
ld %r12, [pcl, func@gotpc]
j_s.d [%r12]
mov_s %r12, %pc
● DT_PLTGOT points to the first entry of the PLT table.
To make it easy for the dynamic linker to find the GOT base for every module, the last entry in
PLT0 is fixed to contain the GOT base address.
For more information on dynamic linking, refer to the System V ABI extension v2.0.
Generic built-ins
The GNU compiler for ARC processors supports the following built-in functions. The
compiler recognizes these as language extensions. The intrinsics can be used in C code to
generate the corresponding assembly instructions, as illustrated below:
void __builtin_arc_nop (void)
Usage: __builtin_arc_nop ();
Instruction generated: nop
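A built-in such as __builtin_arc_nop exists only in the ARC compiler, so portable sources may want to guard it. The sketch below assumes that the macro __arc__ is predefined when compiling for an ARC target (an assumption, not something stated in this guide); the macro DELAY_NOP and the function spin_nops are hypothetical names.

```c
/* Assumption: __arc__ is predefined by the ARC compiler. */
#if defined(__arc__)
#define DELAY_NOP() __builtin_arc_nop()
#else
#define DELAY_NOP() ((void)0)   /* host fallback so the file still builds */
#endif

/* Issues `count` nop slots and returns how many were issued. */
unsigned spin_nops(unsigned count)
{
    unsigned issued = 0;
    while (issued < count) {
        DELAY_NOP();
        issued++;
    }
    return issued;
}
```

On a host compiler the loop body is empty, so this only demonstrates the guard pattern, not timing behavior.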
SIMD Built-ins
SIMD builtins provided by the compiler can be used to generate the vector instructions. This
section describes the available builtins and how to use them in programs.
With the -msimd switch passed, the compiler provides 128-bit vector types, which can be
specified using the vector_size attribute. The header file <arc-simd.h> can be included to use
the following predefined types.
typedef int __v4si __attribute__((vector_size(16)));
typedef short __v8hi __attribute__((vector_size(16)));
These types can be used to define 128-bit variables. The built-in operations listed in the
following section can be used on these variables to generate the vector operations.
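The vector_size attribute is a general GCC extension, so the shape of these types can be tried on any GCC host. The sketch below declares a local type equivalent to __v4si instead of including <arc-simd.h>, and uses GCC's generic element-wise operators rather than the ARC SIMD built-ins; the names v4si_sketch and sum_lanes_after_add are hypothetical. It makes no claim about which instructions the ARC compiler emits for these operators.

```c
/* Same layout as the __v4si type above: four 32-bit ints in 16 bytes. */
typedef int v4si_sketch __attribute__((vector_size(16)));

/* Adds two vectors lane by lane and returns the sum of the result lanes. */
int sum_lanes_after_add(void)
{
    v4si_sketch a = { 1, 2, 3, 4 };
    v4si_sketch b = { 10, 20, 30, 40 };
    v4si_sketch c = a + b;   /* generic GCC vector addition */
    return c[0] + c[1] + c[2] + c[3];
}
```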
A list of the SIMD built-ins grouped by their signatures follows:
NOTE For all built-ins __builtin_arc_<someinsn>, the header file arc-simd.h also provides equivalent
macros called _<someinsn> which can be used for programming ease and improved readability.
Also provided are the following macros for DMA control:
#define _setup_dma_in_channel_reg _vdiwr
#define _setup_dma_out_channel_reg _vdowr
The second argument in these builtins has to be an unsigned 8-bit integer constant:
The first argument in these built-ins has to be an unsigned 3-bit integer constant, as it
indicates DR0-DR7 DMA channel setup registers. The file arc-simd.h also provides defines
which can be used in place of the DMA register numbers to facilitate better code readability:
NOTE Although the equivalent hardware instructions do not take a SIMD register as an operand, these
built-ins overwrite the relevant bits of the __v8hi quantity provided as the first argument with the
value loaded from [Ib, u8] location in the SDM.
Example
#include <arc-simd.h>
__v8hi vector_operand1;
__v8hi vector_operand2;
__v8hi vector_result;
Attributes
The following attributes for function names are supported in the ARC backend of GCC:
long_call Functions marked with this attribute are always called using register-indirect
jump-and-link instructions, thereby enabling the called function to be
placed anywhere within the 32-bit address space.
short_call This attribute indicates that the called function is close enough to be called
with a bl <<s25 offset>> instruction, i.e. within a signed 25-bit offset from the
call site.
These attributes override the -mlong-calls switch, so calls to functions declared with the
short_call attribute can use bl <<offset>> even if the -mlong-calls switch is passed (please
refer to the example below).
NOTE Multiple function declarations need to have consistent attribute specification. A failure to do so is
reported as an error.
Example of attribute usage:
extern void nearfunc() __attribute__((short_call));
extern void farfunc() __attribute__((long_call));
extern void nearfunc(); /* Error: Earlier declared as a short_call function */
void foo()
{
    nearfunc (); /* This can be called as 'bl nearfunc' */
    farfunc ();  /* This will be called as 'mov reg, @farfunc; jl [reg]' */
}
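To judge whether a callee is close enough for short_call, it helps to see what a signed 25-bit offset can reach. The helper below is a hypothetical sketch, not a toolchain API: it assumes the offset is measured in bytes from the call site and ignores any alignment requirement on the target address.

```c
/* Returns 1 if `offset` is representable as a signed value of `bits` bits
   (valid for bits between 1 and 63), 0 otherwise. */
int fits_signed_offset(long long offset, unsigned bits)
{
    long long limit = 1LL << (bits - 1);   /* 2^(bits-1) */
    return offset >= -limit && offset < limit;
}

/* s25: reach of an unconditional bl, per the text above. */
int fits_s25(long long offset) { return fits_signed_offset(offset, 25); }

/* s21: reach of a conditionalized call. */
int fits_s21(long long offset) { return fits_signed_offset(offset, 21); }
```

Under these assumptions, s25 covers about 16 MB in either direction and s21 about 1 MB; anything farther needs long_call.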
The constraint “=x” for retval makes sure that return value register r0 is given to retval, and
the constraint “L” for the constant ensures that the immediate passed to the norm instruction
is a valid 6-bit unsigned constant integer.
NOTE The norm instruction can also be generated by using the compiler provided builtin
__builtin_arc_norm.
This specifies an extension auxiliary register called _mulhi_ which is at address 0x12 in the
memory space and which is only writable.
.extCondCode SUFFIX,VALUE
The condition codes on the ARC cores are extensible and can be specified by means of this
assembler directive. They are specified by the suffix and the value for the condition code.
They can be used to specify extra condition codes with any values. For example:
.extCondCode is_busy,0x14
add.is_busy r1,r2,r3
b.is_busy _main
.extCoreRegister NAME,REGNUM,MODE,SHORTCUT
Specifies an extension core register NAME for the application. REGNUM must be a valid
register number between 0 and 60. The valid values for MODE are:
• r (read only)
• w (write only)
• r|w (read or write)
The SHORTCUT parameter describes whether the register has a shortcut in the pipeline.
The valid values are:
• can_shortcut
• cannot_shortcut
For example:
.extCoreRegister mlo,57,r,can_shortcut
This defines a read-only extension core register mlo with register number 57, which can
shortcut the pipeline.
.extInstruction NAME,OPCODE,SUBOPCODE,SUFFIXCLASS,SYNTAXCLASS
The ARC cores allow the user to specify extension instructions. The extension instructions
are not macros. The assembler creates encodings for use of these instructions according to
the specification by the user. The parameters are:
• NAME Name of the extension instruction. In the case of ARCompact, an
instruction name suffixed with _s indicates a 16-bit extension instruction.
• OPCODE Opcode to be used (bits 27:31 in the encoding). Valid values are 0x10 to 0x1f
or 0x03. In the case of ARCompact, valid values for 32-bit extension instructions
range from 0x04 to 0x07 (inclusive), and valid values for 16-bit instructions
range from 0x08 to 0x0B (inclusive).
• SUBOPCODE Sub-opcode to be used. Valid values are from 0x09 to 0x3f. However, the correct
value also depends on SYNTAXCLASS.
• SUFFIXCLASS Determines the kinds of suffixes to be allowed. This parameter indicates the
absence or presence of conditional suffixes and flag setting by the extension
instruction. Valid values are SUFFIX_NONE, SUFFIX_COND, and
SUFFIX_FLAG. It is also possible to specify that an instruction sets the flags
and is conditional by using SUFFIX_COND | SUFFIX_FLAG.
• SYNTAXCLASS Determines the syntax class for the instruction. It can have the following
values:
SYNTAX_NOP Two-operand instruction with no operands
SYNTAX_1OP Two-operand instruction with one source operand
SYNTAX_2OP Two-operand instruction with one source operand and one
destination
SYNTAX_3OP Three-operand Instruction with two source operands and
one destination
The syntax class also permits modifiers as described below:
OP1_MUST_BE_IMM Modifies syntax class SYNTAX_3OP, specifying
that the first operand of a three-operand
instruction must be an immediate (i.e., the result is
discarded). OP1_MUST_BE_IMM is used by
bitwise ORing it with SYNTAX_3OP as given in
the example below. This is usually used to set the
flags using specific instructions and not retain
results.
OP1_IMM_IMPLIED Modifies syntax class SYNTAX_2OP; it specifies
that there is an implied immediate destination
operand which does not appear in the syntax.
OP1_DEST_IGNORED Used with OP1_MUST_BE_IMM and
OP1_IMM_IMPLIED when the instruction
ignores the destination operand. It allows the
assembler to choose a more efficient encoding of
the instruction. GAS currently ignores this syntax
class modifier.
For example:
.extInstruction mul64,0x14,0x0,SUFFIX_COND,SYNTAX_3OP | OP1_MUST_BE_IMM
In this case, the first argument is an implied immediate (that is, the result is discarded). This
is as if the source code were inst 0,r1,r2. You use OP1_IMM_IMPLIED by bitwise ORing it
with SYNTAX_2OP. For example, defining a 64-bit multiplier with immediate operands:
.extInstruction mul64,0x14,0x0,SUFFIX_COND | SUFFIX_FLAG, SYNTAX_3OP | OP1_MUST_BE_IMM
The above specifies an extension instruction called mul64 that has 3 operands, sets the flags,
and can be used with a condition code, for which the first operand is an immediate
(equivalent to discarding the result of the operation).
.extInstruction mul64,0x14,0x00,SUFFIX_COND, SYNTAX_2OP | OP1_IMM_IMPLIED
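The OPCODE parameter described above occupies bits 27:31 of a 32-bit instruction word, and for ARCompact its valid range depends on the instruction width. The sketch below shows the bit extraction and the range check from the parameter list; the function and enum names are hypothetical, and the classification covers only the ARCompact ranges quoted in this section.

```c
#include <stdint.h>

/* Extracts bits 27:31 (the opcode field) from a 32-bit instruction word. */
unsigned major_opcode(uint32_t insn_word)
{
    return (unsigned)((insn_word >> 27) & 0x1Fu);
}

enum ext_opcode_kind {
    EXT_OPCODE_INVALID,   /* outside the ARCompact extension ranges */
    EXT_OPCODE_32BIT,     /* 0x04 to 0x07 inclusive */
    EXT_OPCODE_16BIT      /* 0x08 to 0x0B inclusive */
};

/* Classifies an OPCODE value against the ARCompact ranges listed above. */
enum ext_opcode_kind classify_arcompact_ext_opcode(unsigned opcode)
{
    if (opcode >= 0x04 && opcode <= 0x07) return EXT_OPCODE_32BIT;
    if (opcode >= 0x08 && opcode <= 0x0B) return EXT_OPCODE_16BIT;
    return EXT_OPCODE_INVALID;
}
```

Note that the opcode 0x14 used in the examples above falls in the 0x10 to 0x1f range, not in the ARCompact ranges, so this check reports it as invalid for ARCompact.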
In this example, the symbol floating_point is defined as zero. The symbol _etext is given
the same address as the location following the last .text input section. The symbol _bdata is
defined as sharing its address with the location following the .text output section aligned
upward to a four-byte boundary.
In some cases, it is desirable for a linker script to define a
symbol only if the symbol is referenced and is not defined by any object included in the link.
The PROVIDE keyword defines a symbol only if it is referenced but not defined. In the
above example, the symbol __provided_sym is defined by the linker only if none of the input
objects in the link command define it.