Compiler Design Glossary
Alphabetical Index
- Abstract Syntax Tree
- Activation
- Activation Record
- Activation Tree
- Ambiguous Grammar
- Attributes
- Basic Block
- Bottom-Up Parsers
- Code Generation
- Code Optimization
- Compiler
- Compiler Design
- Constant Folding
- Context-Free Grammar
- Control Flow Graph
- Dead Code Elimination
- Identifiers
- Inherited Attributes
- Intermediate Code Generation
- Keywords
- L-Attributed Grammar
- Lexeme
- Lexical Analysis
- Linking
- LL(1) Parser
- Loading
- Loop Optimization
- LR Parser
- Literals
- Operators
- Parse Tree
- Parser
- Peephole Optimization
- Phases of a Compiler
- Punctuation
- Runtime Environment
- S-Attributed Grammar
- Semantic Analysis
- Shift-Reduce Actions
- Static Single Assignment
- Symbol Table
- Syntax Analysis
- Syntax Directed Translation
- Synthesized Attributes
- Three-Address Code
- Token
- Top-Down Parsers
General Concepts
- Compiler Design
- The study of how to build a compiler, which is a program that translates high-level programming languages into machine code. The focus is on the translation process, ensuring correctness and code efficiency.
- Compiler
- A program that translates high-level programming languages (e.g., Python, C++, or Java) into machine code that a computer’s hardware can execute directly.
- Phases of a Compiler
- The distinct stages of the compilation process. The main phases are: Lexical Analysis, Syntax Analysis, Semantic Analysis, Intermediate Code Generation, Code Optimization, and Code Generation.
Lexical Analysis
- Lexical Analysis (Scanning)
- The first phase of a compiler that breaks the source code into small, meaningful units called tokens. It involves reading the source program character by character and organizing them into tokens.
- Token
- A meaningful sequence of characters that can be treated as a single unit in the grammar of the programming language. Examples include keywords, identifiers, and operators.
- Lexeme
- An actual string of characters from the source code that matches a pattern and generates a token. For example, in the statement
float x;
, “float” is a lexeme for the “keyword” token, and “x” is a lexeme for the “identifier” token. - Keywords
- Reserved words with specific meanings in the language. They cannot be used as variable names or identifiers.
- Identifiers
- Names given to variables, functions, arrays, or other user-defined items.
- Literals (Constants)
- Fixed values in the code that cannot change during a program’s execution.
- Operators
- Symbols used to perform operations.
- Punctuation (Delimiters)
- Symbols that structure the program, such as commas, parentheses, and curly braces.
- Symbol Table
- A data structure used by the compiler to store information about variables, functions, and other identifiers.
Syntax Analysis and Parsing
- Syntax Analysis (Parsing)
- The second phase of a compiler that checks whether the tokens follow the rules of the programming language’s grammar.
- Parser
- A component of the compiler that performs syntax analysis by checking if the input tokens form a valid structure according to the language’s grammar. It takes a stream of tokens as input and produces a parse tree or syntax errors as output.
- Parse Tree
- A hierarchical data structure that represents the syntactic structure of a program according to the grammar rules. It contains all non-terminals and terminals.
- Abstract Syntax Tree (AST)
- A simplified, more abstract version of a parse tree that represents the semantic structure of the code and focuses on essential elements.
- Context-Free Grammar (CFG)
- A formal grammar used to define the syntax rules of programming languages.
- Ambiguous Grammar
- A grammar in which a single string can have more than one parse tree.
- Top-Down Parsers
- Parsers that build the parse tree from the root (start symbol) down to the leaves (terminals). They use a leftmost derivation.
- LL(1) Parser
- A type of top-down parser that reads input from left to right, constructs a leftmost derivation, and uses one “lookahead” token to decide parsing actions.
- Bottom-Up Parsers
- Parsers that build the parse tree from the leaves (terminals) up to the root (start symbol). They construct the rightmost derivation in reverse.
- LR Parser
- A type of bottom-up parser that reads input from left to right and constructs a rightmost derivation in reverse.
- Shift-Reduce Actions
- The two primary actions in bottom-up parsing. Shift moves the next input symbol onto a stack, and reduce replaces a sequence of symbols on the stack with a non-terminal.
Intermediate Code and Optimization
- Three-Address Code (3AC)
- A code representation where each statement has at most three operands. This is an intermediate representation used for optimization and portability.
- Static Single Assignment (SSA)
- A code representation where every variable in the code has a single assignment. This simplifies optimization.
- Control Flow Graph (CFG)
- A representation of a program using nodes (basic blocks) and edges (control flow).
- Basic Block
- A sequence of instructions with one entry point and one exit point.
- Code Optimization
- The process of improving the intermediate code to reduce execution time and memory usage.
- Constant Folding
- Evaluating constant expressions at compile time.
- Dead Code Elimination
- Removing code that does not affect the program’s output.
- Loop Optimization
- A category of optimizations that includes techniques like Code Motion (moving loop-invariant code outside the loop) and Loop Unrolling (reducing loop overhead by executing multiple iterations at once).
- Peephole Optimization
- Analyzing a small sequence of instructions (a “peephole”) and replacing it with a faster alternative.
Runtime Environment
- Runtime Environment
- The state of the target machine, including software libraries and environment variables, that provides services to a running program.
- Activation
- Each execution of a procedure.
- Activation Tree
- A tree that shows the way control enters and leaves procedure activations, with each node representing an activation. The root of the tree is the main function.
- Activation Record (Frame)
- A data structure that manages the information needed for a single execution of a procedure. It is pushed onto the Control Stack when a procedure is called and popped when it returns.
- Linking
- The process of combining multiple object files and resolving symbolic references to create a single executable file.
- Loading
- The process of placing the executable file into memory, resolving runtime addresses, and preparing it for execution by the CPU.
Syntax Directed Translation
- Syntax Directed Translation (SDT)
- A method that combines a context-free grammar with semantic rules to assign meaning or perform actions during parsing. It is used to generate code, check types, and build abstract syntax trees.
- Attributes
- Values attached to grammar symbols that are computed by semantic rules.
- Synthesized Attributes
- Attributes computed from the attributes of a node’s children in the parse tree (a bottom-up flow).
- Inherited Attributes
- Attributes computed from the attributes of a node’s parent or siblings (a top-down or lateral flow).
- S-Attributed Grammar
- A grammar that only uses synthesized attributes.
- L-Attributed Grammar
- A grammar that uses both synthesized attributes and a restricted form of inherited attributes (from the parent or left siblings).
TECH TALENTS
Average rating: 0 reviews