You can browse using gitweb at https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git
From there, go to the tree view and navigate to the parts of the AVR backend:
gcc/config/avr/: The GCC AVR backend
libgcc/config/avr/: The AVR-specific part of the libgcc runtime
gcc/testsuite/gcc.target/avr/: AVR-specific parts of the test suite
gcc/common/config/avr/: Part of the backend that's common to the compiler proper (cc1, cc1plus) and the compiler driver (avr-gcc, avr-g++)
It is most convenient to browse the files locally, so you would clone the repository:
$ git clone git://gcc.gnu.org/git/gcc.git SomeLocalDir
When you are interested in a specific branch or tag, like branches/releases/gcc-14 or tags/releases/gcc-14.1.0, you can navigate to that ref and browse from there.
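As a rough sketch of the local workflow, assuming a POSIX shell with git on the PATH: the two helpers below are hypothetical names, not part of GCC. The first shallow-clones a single branch or tag; the ref name you pass (for example releases/gcc-14) is an assumption about how the ref shown in gitweb appears in a clone. The second prints which of the AVR directories listed above exist in a checkout.

```shell
#!/bin/sh
# Hypothetical helper: shallow-clone one branch or tag of GCC.
# Usage: clone_gcc_ref releases/gcc-14 SomeLocalDir
clone_gcc_ref() {
    git clone --depth 1 --branch "$1" git://gcc.gnu.org/git/gcc.git "$2"
}

# Hypothetical helper: print the AVR-related directories found in a checkout.
# Usage: avr_dirs SomeLocalDir
avr_dirs() {
    for d in gcc/config/avr libgcc/config/avr \
             gcc/testsuite/gcc.target/avr gcc/common/config/avr; do
        if [ -d "$1/$d" ]; then
            echo "$1/$d"
        else
            echo "missing: $1/$d" >&2
        fi
    done
}
```

A shallow clone keeps the download small, at the cost of not having the full history available for git log or git blame.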
What is the relation to LLVM?
I don't understand that question. LLVM is a different compiler (infrastructure) with its own runtime. It has nothing to do with GCC (or so it claims, at least).
What do "they" use for regression testing architecture specific routines;
See Installing GCC: Testing → How to test GCC on a simulator → avr → AVRtest → README: Running the avr-gcc Testsuite using the AVRtest Simulator
is there something I could easily use myself?
Yes, it's easy enough to compile AVRtest, for example. Note that AVRtest is just an AVR core simulator; no peripherals are simulated. If you prefer avr-gdb, see SimulAVR and AVaRICE.
Answer from emacs drives me nuts on Stack Overflow
The .md (machine description) files of the GCC source contain the descriptions used to generate assembly. GCC contains several specialized C/C++ code generators (and some of them translate the .md files into code that emits assembly).
GCC is a very complex program. The documentation of GCC MELT (an obsolete project) contains several interesting links and slides, notably referring to the Indian GCC Resource Center.
Most of the optimizations in GCC happen in the middle-end (which is mostly independent of source language or target system), notably with many passes working on the GIMPLE representation.
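To look at what the middle-end works on, GCC can write its GIMPLE to dump files. The following is a minimal sketch, assuming a real GCC is installed as gcc (or named in the CC variable); dump_gimple is a hypothetical helper name, and the exact dump file names depend on the GCC version.

```shell
#!/bin/sh
# Hypothetical helper: compile one C file and leave GIMPLE dumps next to it.
# -fdump-tree-gimple    dumps the GIMPLE right after lowering from the front-end
# -fdump-tree-optimized dumps the GIMPLE after the tree-level optimizations
# Usage: dump_gimple path/to/file.c
dump_gimple() {
    (
        cd "$(dirname "$1")" || exit 1
        "${CC:-gcc}" -O2 -c -fdump-tree-gimple -fdump-tree-optimized \
            "$(basename "$1")"
    )
}
```

Comparing the two dumps for a small function gives a first impression of how much work the middle-end passes do before the backend is involved.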
The GCC repo was an SVN repository when this was written; GCC has since moved to Git.
See also this answer, notably the pictures inside it.
The actual source code for GCC is most accessible from here:
https://gcc.gnu.org/svn.html
The software is accessible via SVN (Subversion), a source code control system. Subversion is preinstalled on many Linux/UNIX systems; if it is not available on your platform, install it and then fetch the source using the following command:
svn checkout svn://gcc.gnu.org/svn/gcc/trunk SomeLocalDir
GCC is complex, and it takes significant experience to understand how it actually compiles for different architectures.
In a nutshell, GCC has three major components: the front-end, the middle-end and the back-end. The front-end parses the source and understands the syntax of a language (C, C++, Objective-C, etc.). It translates the code into a portable, language-independent representation, which is then passed on to the later stages for compilation to the target environment.
The middle part performs code analysis and optimisation, attempting to transform the code so that the best possible output is generated at the end of the full process. Technically, optimisation can occur at any part of the process as patterns are discovered during analysis.
The back-end processor compiles the code to an intermediate representation (not actually final executable code). Depending on the intended target, this "pseudo-code" is optimised for the available registers, bit sizes, endianness, and so on. The final code is then generated during the assembly phase, which converts the back-end output into machine-executable instructions.
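The stages described above can be observed from the outside by stopping the compiler driver after each step. This is only a sketch, assuming gcc (or the compiler named in CC) is installed; show_stages is a hypothetical helper name, and the intermediate files are simply written next to the source file.

```shell
#!/bin/sh
# Hypothetical helper: run the driver step by step and keep each result.
# Usage: show_stages path/to/file.c
show_stages() {
    base="${1%.c}"
    # Preprocess only: expand #include and macros into file.i
    "${CC:-gcc}" -E "$1" -o "$base.i" || return 1
    # Front-end, middle-end and back-end: emit target assembly into file.s
    "${CC:-gcc}" -S -O2 "$base.i" -o "$base.s" || return 1
    # Assemble: turn the assembly into an object file with machine code
    "${CC:-gcc}" -c "$base.s" -o "$base.o" || return 1
}
```

Reading the .s file for the same source under different targets or optimisation levels shows the effect of the back-end choices directly.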
It's important to note that the compiler has many options to deal with output formats so you can create output to many classes of architecture, usually out of the box. For cross-compiling and target compiler options, try checking out this link:
https://gcc.gnu.org/install/configure.html
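For the AVR case discussed earlier, a cross compiler is selected at configure time with --target. The sketch below only prints a candidate configure command instead of running it; avr_configure_cmd is a hypothetical helper, and the source directory and prefix are placeholders you would replace. It assumes a matching binutils for avr is already installed under the same prefix, and a real build usually needs further options from the page linked above.

```shell
#!/bin/sh
# Hypothetical helper: print a configure command for an AVR cross compiler.
# Usage: avr_configure_cmd /path/to/gcc-source /path/to/install-prefix
avr_configure_cmd() {
    echo "$1/configure" \
        "--target=avr" \
        "--prefix=$2" \
        "--enable-languages=c,c++" \
        "--disable-nls"
}
```

The printed command is meant to be run from a separate, empty build directory rather than from inside the source tree.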