Optimizing compiler produced by the GNU Project; a key component of the GNU toolchain and the standard compiler for most projects related to GNU and the Linux kernel.
The GNU Compiler Collection (GCC) (formerly GNU C Compiler) is a collection of compilers from the GNU Project that support various programming languages, hardware architectures, and operating systems. The Free Software Foundation … Wikipedia
Factsheet
Original author Richard Stallman
Developer GNU Project
Initial release March 22, 1987; 38 years ago (1987-03-22)
🌐
Wikipedia
en.wikipedia.org › wiki › GNU_Compiler_Collection
GNU Compiler Collection - Wikipedia
3 weeks ago - Users invoke a language-specific driver program (gcc for C, g++ for C++, etc.), which interprets command arguments, calls the actual compiler, runs the assembler on the output, and then optionally runs the linker to produce a complete executable binary. Each of the language compilers is a separate program that reads source code and outputs machine code.
Top answer
1 of 2
12

The .md (machine description) files in the GCC source describe how to generate assembly for each target. GCC contains several specialized C/C++ code generators, and some of them translate the .md files into code that emits assembly.

GCC is a very complex program. The documentation of GCC MELT (an obsolete project) contains several interesting links and slides, notably referring to the Indian GCC Resource Center.

Most of the optimizations in GCC happen in the middle end (which is mostly independent of source language and target system), notably through many passes that work on the GIMPLE representation.
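To make the idea of "many passes" concrete, here is a minimal sketch of a pass pipeline over a toy three-address IR. It is not GIMPLE and does not use any GCC API: the instruction format, the pass names, and `run_passes` are all hypothetical, and the sketch assumes each name is assigned once (an SSA-like form).

```python
# Toy middle end: an ordered list of passes over a tiny three-address IR.
# NOT GIMPLE: the (dest, op, args) format and all names are illustrative only.

def constant_propagate(ir):
    """Substitute names known to hold constants into later operands."""
    known, out = {}, []
    for dest, op, args in ir:
        args = tuple(known.get(a, a) if isinstance(a, str) else a for a in args)
        if op == "const":
            known[dest] = args[0]
        else:
            known.pop(dest, None)  # dest no longer holds a known constant
        out.append((dest, op, args))
    return out

def constant_fold(ir):
    """Replace add/mul of two integer literals with a const instruction."""
    ops = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
    out = []
    for dest, op, args in ir:
        if op in ops and all(isinstance(a, int) for a in args):
            out.append((dest, "const", (ops[op](*args),)))
        else:
            out.append((dest, op, args))
    return out

def dead_code_eliminate(ir):
    """Walk backwards and keep only 'ret' and instructions whose result is used."""
    used, out = set(), []
    for dest, op, args in reversed(ir):
        if op == "ret" or dest in used:
            used.update(a for a in args if isinstance(a, str))
            out.append((dest, op, args))
    out.reverse()
    return out

def run_passes(ir, passes):
    """Run each pass in order; every pass takes an IR list and returns a new one."""
    for p in passes:
        ir = p(ir)
    return ir

PASSES = [constant_propagate, constant_fold,
          constant_propagate, constant_fold,
          constant_propagate, dead_code_eliminate]

example_ir = [
    ("a", "const", (2,)),
    ("b", "const", (3,)),
    ("c", "add", ("a", "b")),
    ("d", "mul", ("c", 4)),
    ("unused", "add", ("a", 1)),
    (None, "ret", ("d",)),
]

optimized = run_passes(example_ir, PASSES)
```

The passes are repeated because each one exposes work for the next; a real pass manager schedules far more passes and in a fixed, documented order.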

The GCC repo was an SVN repository when this answer was written; the project has since migrated to Git.

See also this answer, notably the pictures inside it.

2 of 2
5

The actual source code for GCC is most accessible from here:

https://gcc.gnu.org/svn.html

The software is accessible via SVN (subversion), a source code control system. This would be installed on many versions of Linux/UNIX, but if not on your platform, you can install the svn kit and then fetch the source using the following command:

svn checkout svn://gcc.gnu.org/svn/gcc/trunk SomeLocalDir

GCC is complex and would take significant experience to understand the nature of how the application actually compiles to different architectures.

In a nutshell, GCC has three major components: the front end, the middle end, and the back end. The front end parses the source language and understands its syntax (C, C++, Objective-C, etc.). It deconstructs the code into a portable intermediate form, which is then passed on towards the back end for compilation to the target environment.

The middle part performs code analysis and optimisation, attempting to prioritise the code to generate the best possible output at the end of the full process. Technically, optimisation can occur at any part of the process as patterns are discovered during analysis.

The back end lowers the code to a target-oriented intermediate form (RTL, the Register Transfer Language), which is not yet final executable code. Based on the intended target, this "pseudo-code" is optimised for register usage, bit sizes, endianness, and so on. The final code is then generated during the assembly phase, which converts the back-end output into machine-executable instructions.
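The hand-off between the three components can be sketched with a toy expression compiler. This is only an illustration under stated assumptions: it borrows Python's own `ast` module as the "front end" parser, the nested-tuple tree stands in for a portable intermediate form, and the stack-machine mnemonics (`PUSH`, `LOAD`, `ADD`, `MUL`) are invented for the example.

```python
import ast

def front_end(source):
    """Language-specific step: parse names, integers, + and * into nested tuples."""
    def lower(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return ("num", node.value)
        if isinstance(node, ast.Name):
            return ("var", node.id)
        if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
            op = "add" if isinstance(node.op, ast.Add) else "mul"
            return (op, lower(node.left), lower(node.right))
        raise SyntaxError("unsupported construct")
    return lower(ast.parse(source, mode="eval").body)

def middle_end(tree):
    """Target-independent simplification: x+0 -> x, x*1 -> x, x*0 -> 0."""
    if tree[0] in ("num", "var"):
        return tree
    op, left, right = tree[0], middle_end(tree[1]), middle_end(tree[2])
    for a, b in ((left, right), (right, left)):
        if op == "add" and a == ("num", 0):
            return b
        if op == "mul" and a == ("num", 1):
            return b
        if op == "mul" and a == ("num", 0):
            return ("num", 0)
    return (op, left, right)

def back_end(tree):
    """Target-specific step: emit pseudo-assembly for a hypothetical stack machine."""
    if tree[0] == "num":
        return [f"PUSH {tree[1]}"]
    if tree[0] == "var":
        return [f"LOAD {tree[1]}"]
    return back_end(tree[1]) + back_end(tree[2]) + [tree[0].upper()]

def compile_expr(source):
    """Chain the three phases; only the tuple tree crosses phase boundaries."""
    return back_end(middle_end(front_end(source)))

listing = compile_expr("(x + 0) * 1 + y * 2")
```

Because the middle end sees only the tuple tree, a second front end or a second back end could be swapped in without touching it, which is the point of the three-way split.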

It's important to note that the compiler has many options to deal with output formats so you can create output to many classes of architecture, usually out of the box. For cross-compiling and target compiler options, try checking out this link:

https://gcc.gnu.org/install/configure.html

🌐
GitHub
github.com › gcc-mirror › gcc
GitHub - gcc-mirror/gcc
This directory contains the GNU Compiler Collection (GCC). The GNU Compiler Collection is free software. See the files whose names start with COPYING for copying permission. The manuals, and some of the runtime libraries, are under different terms; see the individual source files for details.
Starred by 10.7K users
Forked by 4.7K users
Languages   C++ 30.1% | C 29.3% | Ada 14.0% | D 5.9% | Go 5.3% | HTML 3.6%
🌐
YouTube
youtube.com › ants are everywhere
Let's read the GCC source code - YouTube
In this live stream, I start looking at the source for the GNU Compiler Collection, better known as GCC.
Published   February 14, 2023
Views   2K
🌐
Narkive
gcc.gcc.gnu.narkive.com › q2CaPOAd › understanding-source
Understanding gcc source
Post by Hari: I am trying to understand the source code of GCC because I want to learn its control flow. I am basically concentrating on the order in which the GCC compiler consults its different source files, like toplev.c, expr.c, etc. Reply: Please see the GCC documentation that comes with the GCC sources.
🌐
IIT Bombay
cse.iitb.ac.in › ~uday › courses › cs715-09 › gcc-code-view.pdf pdf
GCC Source Code: An Internal View Uday Khedker GCC Resource Center,
The Architecture of GCC [diagram. Inputs to the compiler generation framework: language-specific code; language- and machine-independent generic code; machine-dependent generator code; machine descriptions. Stages of the generated compiler (cc1), starting from the source program: parser, gimplifier, Tree SSA optimizer, RTL generator, optimizer, code generator.]
🌐
GNU
gcc.gnu.org
GCC, the GNU Compiler Collection - GNU Project
The GNU Compiler Collection includes front ends for C, C++, Objective-C, Objective-C++, Fortran, Ada, Go, D, Modula-2, COBOL, Rust, and Algol 68 as well as libraries for these languages (libstdc++,...). GCC was originally written as the compiler for the GNU operating system.
🌐
GNU
gcc.gnu.org › legacy-ml › gcc › 2008-03 › msg00903.html
Basile STARYNKEVITCH - Re: How to understand gcc source code?
* on the positive side, GCC is still alive and doing well, and has an active community; when you ask questions on this gcc@gcc.gnu.org mailing list, you get some answers, provided your question is precise enough and you did look into the code and existing documentation. * more concretely, there is a lot of material on GCC: of course the "official" documentation http://gcc.gnu.org/onlinedocs/, the source code (usually well commented), the mailing lists (including gcc-patches@gcc.gnu.org), the Wiki (feel free to contribute) http://gcc.gnu.org/wiki, and many others (which you can find by STFW).
🌐
Pepas
leopard-adc.pepas.com › documentation › DeveloperTools › gcc-4.2.1 › gcc › Source-Code.html
Source Code - Using the GNU Compiler Collection (GCC)
Each version will be tagged based on its build number, which you can find by executing `gcc --version'; for instance, if this prints
🌐
Wikibooks
en.wikibooks.org › wiki › GNU_C_Compiler_Internals › GNU_C_Compiler_Architecture
GNU C Compiler Internals/GNU C Compiler Architecture - Wikibooks, open books for an open world
It is a driver program that invokes the appropriate compilation programs depending on the language of the source file. For a C source file they are the preprocessor and compiler cc1, the assembler as, and the linker collect2. The first and the third programs come with a GCC distribution, the ...
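As a rough sketch of how a driver might pick its sub-programs from the source file's suffix: the program names `cc1`, `cc1plus`, `as`, and `collect2` are real GCC components, but the table and the `tool_sequence` function below are hypothetical and cover only a few suffixes. The real driver is configured through spec strings and handles many more cases.

```python
from pathlib import Path

# Compiler proper per suffix (simplified; the .i and .ii suffixes denote
# already-preprocessed C and C++ sources respectively).
COMPILER_PROPER = {
    ".c": "cc1", ".i": "cc1",
    ".cc": "cc1plus", ".cpp": "cc1plus", ".cxx": "cc1plus", ".ii": "cc1plus",
}

def tool_sequence(filename):
    """Return the sub-programs a driver would run for one input file."""
    suffix = Path(filename).suffix
    if suffix in COMPILER_PROPER:
        return [COMPILER_PROPER[suffix], "as", "collect2"]
    if suffix == ".s":
        # Assembly source skips the compiler proper.
        return ["as", "collect2"]
    # Anything else, including .o files, is handed straight to the link step.
    return ["collect2"]

plan_c = tool_sequence("main.c")
plan_cpp = tool_sequence("widget.cpp")
plan_asm = tool_sequence("start.s")
```

Running the real driver with `-v` prints the actual sub-program invocations, which is a good way to check a sketch like this against a particular installation.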
🌐
Red Hat
docs.redhat.com › en › documentation › red_hat_enterprise_linux › 7 › html › developer_guide › gcc-compiling-code
Chapter 15. Building Code with GCC | Developer Guide | Red Hat Enterprise Linux | 7 | Red Hat Documentation
Source code written in the C or C++ language, present as plain text files. The files typically use extensions such as .c, .cc, .cpp, .h, .hpp, .i, .inc. For a complete list of supported extensions and their interpretation, see the gcc manual pages:
🌐
Medium
medium.com › @tyastropheus › the-magic-black-box-of-gcc-explained-54f991f4f6a2
The Magic Black Box of GCC Explained | by Tanya Kryukova | Medium
May 16, 2017 - There are four steps in the GCC ... Essentially, the compiler takes the source code, which is written in human-understood form, and breaks it down to binaries that computers can read in order to execute it....
Top answer
1 of 2
9

As a starting point see Links and Selected Readings on GCC site. Of particular interest to you, I think, are:

  • GNU C Compiler Internals
  • Compilation of Functional Programming Languages using GCC -- Tail Calls by Andreas Bauer
  • Porting GCC for Dunces by Hans-Peter Nilsson

If you want to develop on Windows you probably need to start from MinGW (Minimalist GNU for Windows) Compiler Suite sources (it includes GNU GDB debugger), which is a port of GCC to Windows.

I cannot help much with a comfortable development environment because I don't develop in C++. But I suppose a good IDE for C/C++ is what you need: have a look at this comparison; there are plenty of free/open source IDEs for Windows.

Update: I think ICI can also be of interest to you:

The Interactive Compilation Interface (or 'ICI' for short) is a plugin system with a high-level compiler-independent and low-level compiler-dependent API to transform current compilers into collaborative open modular interactive toolsets. The ICI framework acts as a "middleware" interface between the compiler and the user-definable plugins. It opens up and reuses the production-quality compiler infrastructure to enable program analysis and instrumentation, fine-grain program optimizations, simple prototyping of new development and research ideas while avoiding building new compilation tools from scratch. For example, it is used in MILEPOST GCC to automate compiler and architecture design and program optimizations based on statistical analysis and machine learning. It should enable universal self-tuning compilers adaptable to heterogeneous, reconfigurable, multi-core architectures ranging from supercomputers to embedded systems.

.. as the rest of projects under the Collective TUNING umbrella.

Note: Writing "compilers are one of the most complex programs there are", as BlueRaja wrote in comments, is an overstatement: there are very simple compilers and very complex compilers. But there is nothing esoteric in compiler theory once you have studied it. GCC is as hard to understand as any big, poorly documented program out there [1]. So rizwanhudda, don't be discouraged: start by studying the available documentation, and then ask the GCC developers (on the GCC IRC channel, as suggested by nvl, or on the GCC developers mailing list) to explain what is poorly documented or not documented at all.

  1. In fact program comprehension is an active field of research.
2 of 2
1

I suggest using the GCC IRC channel; it is meant for discussing GCC development.

🌐
IIT Bombay
cse.iitb.ac.in › grc › intdocs › gcc-basic-info.html
Basic Information about GCC
Corresponding to each HLL, except C, is a directory within $GCCHOME/gcc in which all the code for processing that language exists. In particular, this involves scanning the tokens of that language and creating the ASTs. If necessary, the basic AST tree node types need to be augmented with variations for this language. The main compiler calls these routines to handle input of that language. To isolate itself from the details of the source language, the main compiler uses a table of function pointers that are to be used to perform each required task.
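The "table of function pointers" idea can be sketched as follows. This is a simplified illustration, not GCC's actual hook table: the `LangHooks` class, the two toy languages ("infix" and "prefix"), and `compile_unit` are all hypothetical names, and callables stand in for C function pointers.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class LangHooks:
    """One table of callables per source language (stand-in for function pointers)."""
    name: str
    tokenize: Callable[[str], List[str]]
    parse: Callable[[List[str]], tuple]

def infix_parse(tokens):
    """Toy language 1: '1 + 2'."""
    left, op, right = tokens
    if op != "+":
        raise SyntaxError(f"unexpected operator {op!r}")
    return ("add", int(left), int(right))

def prefix_parse(tokens):
    """Toy language 2: 'add 1 2'."""
    op, left, right = tokens
    if op != "add":
        raise SyntaxError(f"unexpected operator {op!r}")
    return ("add", int(left), int(right))

HOOKS = {
    "infix": LangHooks("infix", str.split, infix_parse),
    "prefix": LangHooks("prefix", str.split, prefix_parse),
}

def compile_unit(lang, source):
    """The 'main compiler': it only ever calls through the hook table."""
    hooks = HOOKS[lang]
    return hooks.parse(hooks.tokenize(source))

tree_a = compile_unit("infix", "1 + 2")
tree_b = compile_unit("prefix", "add 1 2")
```

Both toy languages yield the same tree, so everything after parsing can stay language-independent; adding a language means adding one more table entry rather than editing `compile_unit`.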
🌐
Opensource.com
opensource.com › article › 22 › 5 › gnu-c-compiler
A programmer's guide to GNU C Compiler | Opensource.com
Compilation: During this stage, the compiler converts pre-processed source code into assembly code for a specific CPU architecture. The resulting assembly file is named with a .s extension, such as hellogcc.s in this example. Assembly: The assembler (as) converts the assembly code into machine code in an object file, such as hellogcc.o. Linking: The linker (ld) links the object code with the library code to produce an executable file, such as hellogcc. When running GCC, use the -v option to see each step in detail.
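The stages can also be run one at a time. The sketch below only builds the command lines, using the standard driver options `-E` (stop after preprocessing), `-S` (stop after compilation to assembly), and `-c` (stop after assembling); the function name `stage_commands` is hypothetical, and the `hellogcc.c` file name follows the article's example.

```python
import shlex
from pathlib import Path

def stage_commands(source, driver="gcc"):
    """Build one command per stage: preprocess, compile, assemble, link."""
    stem = Path(source).stem
    return [
        [driver, "-E", source, "-o", f"{stem}.i"],       # preprocessing
        [driver, "-S", f"{stem}.i", "-o", f"{stem}.s"],  # compilation to assembly
        [driver, "-c", f"{stem}.s", "-o", f"{stem}.o"],  # assembly to object code
        [driver, f"{stem}.o", "-o", stem],               # linking
    ]

commands = stage_commands("hellogcc.c")
for cmd in commands:
    # Print only; to execute, pass each list to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Keeping each command as a list of arguments avoids shell-quoting problems if the commands are later executed rather than printed.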
🌐
GNU
gcc.gnu.org › codingconventions.html
GCC Coding Conventions - GNU Project
For example, appropriate uses of @code are in phrases such as "@code{const}-qualified type", or "@code{asm} statement", or "function returns @code{true}". Examples where @code should be avoided are phrases such as "const variable", "volatile access", or "condition is false." Some files and packages in the GCC source tree are imported from elsewhere, and we want to minimize divergence from their upstream sources.
🌐
ITU Online
ituonline.com › itu online › tech terms definitions › what is gnu compiler collection (gcc)?
What Is GNU Compiler Collection (GCC)? - ITU Online IT Training
April 2, 2024 - The GNU Compiler Collection (GCC) is a comprehensive suite of free software compilers for various programming languages, including C, C++, Objective-C, Fortran, Ada, and Go, among others. Developed by the GNU Project, GCC is crucial for the compilation process, transforming source code written ...
🌐
Medium
medium.com › @darrenjs › building-gcc-from-source-dcc368a3bb70
Building GCC from source. Step by step guide on building GCC 9 | by Darren Smith | Medium
June 2, 2019 - GCC’s build system has a helpful feature for building these dependencies. If the source code for a dependency is placed within the GCC source code folder, it will be automatically discovered and compiled during the main build of GCC.
🌐
Bitboom
bitboom.github.io › 2018-10-22 › an-overview-of-gcc
An Overview of GCC | Sangwan’s blog
October 22, 2018 - Translate source code from a high-level programming language to a lower-level language. - The high-level programming language: C, C++, Objective-C, Java, Fortran, or Ada - The lower-level language: assembly language, object code, or machine code · Each compiler includes three components: a front end, a middle end, and a back end. GCC is not a compiler.