Unit IV: Instruction Set Architecture
⭐Numerical Formula:
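- CPI (Cycles Per Instruction): $\text{CPI} = \dfrac{\text{Total clock cycles}}{\text{Instruction count}}$
- Execution Time: $T_{\text{exec}} = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time} = \dfrac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}$
- MIPS: $\text{MIPS} = \dfrac{\text{Instruction count}}{T_{\text{exec}} \times 10^{6}} = \dfrac{\text{Clock rate}}{\text{CPI} \times 10^{6}}$
- Speedup and Amdahl's Law: $\text{Speedup} = \dfrac{T_{\text{old}}}{T_{\text{new}}}$, and $\text{Speedup}_{\text{overall}} = \dfrac{1}{(1 - f) + \dfrac{f}{s}}$, where $f$ is the fraction of execution time that is enhanced and $s$ is the speedup of that fraction.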
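As a quick check on the four quantities listed above, here is a minimal Python sketch using their standard textbook definitions. The function names and the workload numbers (2 million instructions, 3 million cycles, a 1 GHz clock) are made up for illustration.

```python
def cpi(total_cycles, instruction_count):
    """Average clock cycles per instruction."""
    return total_cycles / instruction_count

def execution_time(instruction_count, cpi_value, clock_hz):
    """CPU time in seconds: instructions x CPI / clock rate."""
    return instruction_count * cpi_value / clock_hz

def mips(instruction_count, exec_time_s):
    """Millions of instructions executed per second."""
    return instruction_count / (exec_time_s * 1e6)

def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when only part of the execution time is improved."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# Hypothetical workload, chosen only to make the arithmetic easy to follow.
n_instr = 2_000_000
cycles = 3_000_000
clock = 1e9  # 1 GHz

c = cpi(cycles, n_instr)               # 1.5 cycles per instruction
t = execution_time(n_instr, c, clock)  # about 0.003 s
m = mips(n_instr, t)                   # about 666.7 MIPS
s = amdahl_speedup(0.4, 2.0)           # 1 / (0.6 + 0.2), about 1.25

print(c, t, m, s)
```

The Amdahl result shows why the unimproved part dominates: doubling the speed of 40% of the program yields only about a 1.25x overall gain.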
⭐Introduction to Instruction Set Architecture and Microcode: Architecture & Microcode, Machine Models, ISA characteristics, Pipeline
Architecture & Microcode
- Instruction Set Architecture (ISA):
- Defines the set of instructions a computer can execute.
- Acts as the interface between hardware (processor) and software (programs).
- Determines fundamental aspects like the instruction formats, addressing modes, registers, and data types.
- Microcode:
- A lower-level set of instructions or signals used by complex instruction set computing (CISC) CPUs to implement higher-level instructions.
- Translates complex machine instructions into simpler, sequenced microinstructions.
- Microcode is stored in a dedicated, fast memory inside the CPU called the control store, typically implemented as ROM or writable RAM.
Machine Models
- Von Neumann Model:
- A traditional architecture where instructions and data share the same memory and data bus.
- Known for the Von Neumann Bottleneck, as the single data path for instructions and data can cause delays.
- Characteristics include simplicity and ease of use for general-purpose computing.
- Harvard Model:
- A design with separate memory and buses for instructions and data.
- Allows simultaneous access to instructions and data, increasing performance and reducing potential bottlenecks.
- Often used in digital signal processing and microcontrollers.
- Modified Harvard Model:
- Combines features of both Von Neumann and Harvard models, sometimes allowing transfer between instruction and data memories.
- Adds flexibility and is common in modern CPUs.
ISA Characteristics
- Complexity and Design (RISC vs. CISC):
- RISC (Reduced Instruction Set Computer): Simple, minimal instructions, focusing on a small set of efficient operations.
- CISC (Complex Instruction Set Computer): Larger set of complex instructions, often with microcoded implementation for complex instructions.
- Instruction Types and Formats:
- Instruction Types: Includes arithmetic, logical, control, and data movement instructions.
- Formats: Varying instruction sizes (fixed or variable) and formats based on operation types.
- Registers:
- Defines the types and numbers of registers (like general-purpose and floating-point).
- Registers are typically limited in number, which affects the efficiency of the ISA.
- Memory Addressing Modes:
- Different methods to access memory, such as immediate, direct, indirect, indexed, and base-plus-offset.
- Determines the flexibility and ease with which programs can access data.
- Data Types:
- Defines supported data types (integers, floating-point numbers, characters, etc.).
- Affects the range of applications the ISA can handle and the precision of operations.
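The addressing modes listed above differ only in how the operand is located. The following is a rough Python sketch of a toy machine; the memory contents, register names, and the `operand` helper are all hypothetical and not taken from any real ISA.

```python
# Toy machine state; every value here is made up for illustration.
memory = {100: 42, 200: 100, 308: 7}
registers = {"R1": 300, "R2": 8}

def operand(mode, value, reg=None):
    """Return the operand selected by the given addressing mode."""
    if mode == "immediate":      # the operand is the constant itself
        return value
    if mode == "direct":         # the instruction holds the operand's address
        return memory[value]
    if mode == "indirect":       # the instruction holds the address of the address
        return memory[memory[value]]
    if mode == "base_offset":    # address = base register + constant offset
        return memory[registers[reg] + value]
    if mode == "indexed":        # address = constant base + index register
        return memory[value + registers[reg]]
    raise ValueError(f"unknown addressing mode: {mode}")

print(operand("immediate", 100))            # 100
print(operand("direct", 100))               # 42
print(operand("indirect", 200))             # memory[memory[200]] = memory[100] = 42
print(operand("base_offset", 8, reg="R1"))  # memory[300 + 8] = 7
print(operand("indexed", 300, reg="R2"))    # memory[300 + 8] = 7
```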
Pipeline
- Basic Concept:
- A method of instruction execution that divides operations into multiple stages (e.g., fetch, decode, execute, memory access, and write-back).
- Each instruction is processed in different stages simultaneously, allowing overlap and improving throughput.
- Pipeline Stages:
- Fetch: Retrieves the instruction from memory.
- Decode: Decodes the instruction and prepares required control signals.
- Execute: Performs the operation (e.g., arithmetic or logic operation).
- Memory Access: Reads or writes data from/to memory if needed.
- Write-back: Stores the result in a register for later use.
- Pipeline Hazards:
- Data Hazards: Occur when instructions depend on the results of previous ones that are still in the pipeline.
- Control Hazards: Happen when the pipeline encounters branch instructions, potentially causing it to fetch the wrong instruction.
- Structural Hazards: Arise when hardware resources are insufficient to execute instructions in parallel.
- Advantages of Pipelining:
- Increased Throughput: More instructions are completed in a shorter time frame.
- Efficient Utilization of CPU Resources: Reduces idle time by overlapping instruction execution stages.
- Improved Performance: Enables higher instruction-per-cycle (IPC) execution, which improves overall CPU efficiency.
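The throughput gain can be estimated with a simple cycle count. This sketch assumes an ideal pipeline in which every stage takes one cycle and the only losses are explicitly counted stall cycles. The function names are illustrative.

```python
def unpipelined_cycles(n_instructions, n_stages):
    """Each instruction passes through all stages before the next one starts."""
    return n_instructions * n_stages

def pipelined_cycles(n_instructions, n_stages, stall_cycles=0):
    """The first instruction takes n_stages cycles; each later one finishes one cycle after."""
    return n_stages + (n_instructions - 1) + stall_cycles

def pipeline_speedup(n_instructions, n_stages, stall_cycles=0):
    return unpipelined_cycles(n_instructions, n_stages) / pipelined_cycles(
        n_instructions, n_stages, stall_cycles
    )

# 100 instructions on a 5-stage pipeline: 500 cycles unpipelined, 104 pipelined.
print(unpipelined_cycles(100, 5), pipelined_cycles(100, 5))
print(round(pipeline_speedup(100, 5), 2))      # approaches 5 as n grows
print(round(pipeline_speedup(100, 5, 20), 2))  # stalls reduce the gain
```

For long instruction streams the speedup approaches the number of stages, and every stall cycle moves it further from that limit.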
⭐Review: Microcoded Microarchitecture, Pipeline Basics, Structural and Data Hazards
Microcoded Microarchitecture
- Microarchitecture Overview:
- Refers to the internal architecture of the CPU, implementing the instructions defined by the ISA.
- Consists of functional blocks like the ALU (Arithmetic Logic Unit), control unit, registers, and cache.
- Role of Microcode:
- Microcode is a layer of low-level instructions that translates complex machine instructions into sequences of simpler micro-operations.
- Microinstructions control each CPU function in sequence, enabling execution of complex instructions in CPUs, especially in CISC (Complex Instruction Set Computing) architectures.
- Microprogram Control Unit:
- The part of the CPU that stores and manages microinstructions.
- Microcode is typically stored in a small, high-speed memory (e.g., control store) within the CPU.
- When an instruction is decoded, the control unit uses microcode to generate signals for various CPU components, guiding them through the steps required for that instruction.
- Advantages:
- Allows easier implementation of complex instructions by breaking them into manageable steps.
- Makes updating and extending instructions easier without changing the hardware, as changes can be made in microcode.
- Disadvantages:
- Adds latency due to the additional layer between machine instructions and execution.
- Increases complexity and may slow down execution speed, especially compared to RISC architectures that rely on fewer, simpler instructions.
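One way to picture the control store is as a table from each machine opcode to a short micro-routine. The sketch below is a deliberately simplified model: the opcodes `ADD` and `ADD_TO_MEM`, the register names, and the micro-operations are hypothetical and do not reflect any real CPU's microcode.

```python
# Each micro-operation changes a small piece of datapath state.
def fetch_operand(state):
    state["MAR"] = state["operand_addr"]          # MAR <- operand address
    state["MDR"] = state["memory"][state["MAR"]]  # MDR <- memory[MAR]

def alu_add(state):
    state["ACC"] = state["ACC"] + state["MDR"]    # ACC <- ACC + MDR

def store_acc(state):
    state["MAR"] = state["operand_addr"]          # MAR <- operand address
    state["memory"][state["MAR"]] = state["ACC"]  # memory[MAR] <- ACC

# Hypothetical control store: one micro-routine per machine instruction.
CONTROL_STORE = {
    "ADD": [fetch_operand, alu_add],                    # ACC <- ACC + M[addr]
    "ADD_TO_MEM": [fetch_operand, alu_add, store_acc],  # also writes the sum back
}

def run_instruction(opcode, operand_addr, state):
    """Sequence through the micro-routine for one machine instruction."""
    state["operand_addr"] = operand_addr
    trace = []
    for micro_op in CONTROL_STORE[opcode]:
        micro_op(state)
        trace.append(micro_op.__name__)
    return trace

state = {"ACC": 5, "MAR": 0, "MDR": 0, "memory": {10: 7}}
print(run_instruction("ADD", 10, state), state["ACC"])  # ACC becomes 12
print(run_instruction("ADD_TO_MEM", 10, state), state["memory"][10])  # memory[10] becomes 19
```

Adding a new instruction in this model means adding one entry to `CONTROL_STORE`. That mirrors the advantage stated above: the instruction set can be extended without changing the datapath.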
Pipeline Basics
- Concept of Pipelining:
- Pipelining divides the CPU’s instruction cycle into multiple stages, enabling multiple instructions to be processed simultaneously.
- Common pipeline stages include Fetch (IF), Decode (ID), Execute (EX), Memory Access (MEM), and Write-back (WB).
- Stages of a Pipeline:
- Instruction Fetch (IF): Retrieves the next instruction from memory.
- Instruction Decode (ID): Decodes the instruction to identify the operation and operands.
- Execute (EX): Performs the operation, like an arithmetic or logic function.
- Memory Access (MEM): Accesses memory for read/write operations, if needed.
- Write-back (WB): Writes the result back to the register file (stores to memory happen in the MEM stage).
- Pipeline Depth:
- Refers to the number of stages in the pipeline.
- More stages (a deeper pipeline) shorten each stage, which allows a higher clock rate and throughput, but they also increase the number of cycles lost when a hazard or branch misprediction occurs.
- Benefits of Pipelining:
- Higher Throughput: Enables more instructions to be completed in less time.
- Efficient Resource Usage: Minimizes idle time by utilizing each pipeline stage simultaneously.
- Shorter Clock Cycle: Each stage does only part of an instruction's work, so the clock period can be shorter. The latency of an individual instruction is not reduced and usually grows slightly because of pipeline register overhead.
- Challenges with Pipelining:
- Requires careful management to avoid stalling or inefficient use of resources.
- Susceptible to hazards, which can stall the pipeline and reduce efficiency.
Pipeline Hazards
- Structural Hazards:
- Definition: Occur when multiple instructions require the same hardware resources at the same time.
- Examples:
- When two instructions simultaneously need to access the same memory or the same arithmetic unit (ALU).
- A common problem in processors with limited resources or no duplicated functional units.
- Solution: Resource duplication (e.g., multiple ALUs) or effective resource scheduling to reduce conflicts.
- Data Hazards:
- Definition: Arise when an instruction depends on the results of a previous instruction that has not yet completed its execution in the pipeline.
- Types of Data Hazards:
- Read After Write (RAW): The current instruction depends on the result of a previous instruction still in the pipeline (most common).
- Write After Read (WAR): Occurs if an instruction writes to a register or memory location before a previous instruction has read from it.
- Write After Write (WAW): Happens if two instructions write to the same register or memory location in the wrong order.
- Solutions:
- Data Forwarding/Bypassing: Routes the result of a previous instruction directly from a later pipeline stage to the instruction that needs it, without waiting for the result to be written back to the register file.
- Stalling: Pauses the pipeline until the required data becomes available.
- Pipeline Interlocks: Control logic that automatically inserts stalls to prevent data hazards.
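The three hazard types above can be identified by comparing the destination and source registers of two instructions in program order. This is a minimal sketch. The `Instr` tuple and the example instructions are hypothetical, and only register dependences are modelled, not memory ones.

```python
from typing import NamedTuple, Optional, Set, Tuple

class Instr(NamedTuple):
    text: str
    dest: Optional[str]    # register written, or None
    srcs: Tuple[str, ...]  # registers read

def hazards(first: Instr, second: Instr) -> Set[str]:
    """Classify register dependences between two instructions in program order."""
    found = set()
    if first.dest is not None and first.dest in second.srcs:
        found.add("RAW")  # second reads what first writes
    if second.dest is not None and second.dest in first.srcs:
        found.add("WAR")  # second writes what first reads
    if first.dest is not None and first.dest == second.dest:
        found.add("WAW")  # both write the same register
    return found

i1 = Instr("ADD R1, R2, R3", "R1", ("R2", "R3"))
i2 = Instr("SUB R4, R1, R5", "R4", ("R1", "R5"))
i3 = Instr("ADD R2, R6, R7", "R2", ("R6", "R7"))
i4 = Instr("MUL R1, R8, R9", "R1", ("R8", "R9"))

print(hazards(i1, i2))  # {'RAW'}
print(hazards(i1, i3))  # {'WAR'}
print(hazards(i1, i4))  # {'WAW'}
```

An interlock would run a check like this in hardware and insert a stall only when a dependence cannot be resolved by forwarding.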
⭐Cache Review: Control Hazards - Jump, Branch, Others
Cache Review
- Purpose of Cache:
- A cache is a small, high-speed memory located close to the CPU that stores frequently accessed data and instructions.
- Reduces the average time to access data from main memory, improving CPU efficiency and performance.
- Cache Types:
- Instruction Cache (I-Cache): Stores instructions to reduce latency in instruction fetching.
- Data Cache (D-Cache): Stores data to reduce latency in data access.
- Unified Cache: A single cache that stores both instructions and data.
- Cache Levels:
- L1 Cache: Closest to the CPU, extremely fast but small in size.
- L2 Cache: Larger and slightly slower, provides data if not found in L1.
- L3 Cache: Found in some processors, shared among cores, larger and slower than L1 and L2.
- Cache Mapping Techniques:
- Direct-Mapped Cache: Each memory block maps to exactly one location in the cache.
- Fully Associative Cache: Any memory block can be stored in any cache location.
- Set-Associative Cache: A combination where each memory block maps to a set, and can be placed in any location within that set.
- Cache Performance:
- Hit Rate: Percentage of cache accesses where the data is found in the cache.
- Miss Rate: Percentage of accesses where the data is not found, requiring a fetch from main memory.
- Miss Penalty: The additional time needed to retrieve data from main memory when there is a cache miss.
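For a direct-mapped cache, the mapping from a memory address to a cache location comes from splitting the address into tag, index, and block offset fields. The sketch below assumes the block size and the number of lines are powers of two. The function name and the example geometry (16-byte blocks, 64 lines) are illustrative.

```python
def split_address(address, block_size, num_lines):
    """Split a byte address into (tag, index, offset) for a direct-mapped cache.

    Assumes block_size and num_lines are powers of two.
    """
    offset_bits = block_size.bit_length() - 1  # log2(block_size)
    index_bits = num_lines.bit_length() - 1    # log2(num_lines)
    offset = address & (block_size - 1)
    index = (address >> offset_bits) & (num_lines - 1)
    tag = address >> (offset_bits + index_bits)
    return tag, index, offset

# 16-byte blocks and 64 lines: 4 offset bits, 6 index bits, the rest is the tag.
print(split_address(0x1234, block_size=16, num_lines=64))  # (4, 35, 4)
```

Two addresses with the same index but different tags compete for the same line, which is the source of conflict misses in a direct-mapped cache.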
Control Hazards
- Overview of Control Hazards:
- Control hazards, or branch hazards, occur when the pipeline encounters instructions that alter the program counter (PC) and disrupt the flow of instruction execution.
- Common in instructions like jumps, branches, and other control operations, leading to potential pipeline stalls and reduced efficiency.
- Types of Control Hazards:
- Jump Instructions:
- Definition: Unconditional instructions that move the program counter to a new memory address.
- Example: JMP in assembly language, where the PC is set to a specific address.
- Impact on Pipeline: Causes the pipeline to discard instructions after the jump instruction if they are not part of the target address path.
- Branch Instructions:
- Definition: Conditional (like BEQ, BNE) or unconditional instructions that may alter the PC based on certain conditions.
- Example: A conditional branch that changes the PC only if a condition (like a comparison result) is true.
- Pipeline Impact:
- Requires the pipeline to wait until the branch decision is made, which can cause stalls.
- If the branch is taken after the pipeline has already fetched the fall-through instructions, those instructions must be discarded, wasting pipeline cycles.
- Solution: Branch prediction techniques attempt to guess the branch outcome to keep the pipeline filled.
- Other Control Instructions:
- Function Calls and Returns:
- Function Calls: Jump to a different code section, potentially adding complexity if registers or the stack needs to be saved/restored.
- Returns: Control flow returns to the calling function, which requires restoring previous program state.
- Interrupts and Exceptions:
- Interrupts: Hardware-initiated events that temporarily halt the normal program flow to execute an interrupt handler.
- Exceptions: Program-initiated control flow changes in response to errors (e.g., division by zero).
- Impact on Pipeline: Both interrupts and exceptions disrupt the normal flow, emptying or redirecting the pipeline and causing stalls.
Solutions to Control Hazards
- Branch Prediction:
- Static Branch Prediction: Assumes a predetermined outcome, like always assuming the branch will not be taken, which can be useful in predictable code patterns.
- Dynamic Branch Prediction: Uses historical information about previous branches to make more accurate predictions. This approach often uses hardware structures like the Branch Prediction Buffer.
- Types of Prediction Mechanisms:
- One-Bit Predictor: Predicts the same outcome until a misprediction occurs.
- Two-Bit Predictor: Uses a two-bit saturating counter, so two consecutive mispredictions are needed to flip a strongly held prediction. This improves accuracy for loop branches.
- Branch Target Buffer (BTB):
- A small cache memory that stores the target addresses of recently taken branches.
- Enables quick access to the target address if the branch is taken, reducing the need to calculate the target address from scratch.
- Delayed Branching:
- A compiler-based approach where independent instructions are rearranged to execute immediately after a branch instruction, filling potential pipeline bubbles.
- Fill Slots with Useful Instructions: The instruction placed in the branch delay slot executes whether or not the branch is taken, so the slot does useful work instead of becoming a pipeline bubble.
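The difference between the one-bit and two-bit predictors shows up on a loop branch. The sketch below simulates both on a branch that is taken four times and then falls through, repeated for four loop executions. The initial states are assumptions: the one-bit predictor starts at "not taken" and the two-bit counter starts at 0 (strongly not taken).

```python
def simulate_one_bit(outcomes, initial=False):
    """Predict the last observed outcome; return the number of correct predictions."""
    prediction = initial
    correct = 0
    for taken in outcomes:
        if prediction == taken:
            correct += 1
        prediction = taken  # always remember the most recent outcome
    return correct

def simulate_two_bit(outcomes, initial_state=0):
    """Two-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""
    state = initial_state
    correct = 0
    for taken in outcomes:
        if (state >= 2) == taken:
            correct += 1
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct

# Loop branch: taken 4 times, then not taken at loop exit; the loop runs 4 times.
pattern = ([True] * 4 + [False]) * 4

print(simulate_one_bit(pattern), "of", len(pattern))  # 12 of 20
print(simulate_two_bit(pattern), "of", len(pattern))  # 14 of 20
```

After warm-up, the one-bit predictor mispredicts twice per loop execution (at the exit and again at re-entry), while the two-bit counter mispredicts only at the exit.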
⭐Superscalar1: Classifying Caches, Cache Performance, Two-Way In-Order Superscalar, Fetch Logic and Alignment
Classifying Caches
- Purpose of Cache Classification:
- Classifying caches helps in understanding how data is stored, accessed, and managed within the cache.
- Different cache organizations impact data retrieval speed, efficiency, and overall CPU performance.
- Types of Cache Organizations:
- Direct-Mapped Cache:
- Each memory block maps to exactly one cache location.
- Simple to implement and faster due to minimal search logic but prone to high conflict misses if frequently accessed data maps to the same cache line.
- Fully Associative Cache:
- Any memory block can be stored in any cache line, providing the most flexibility and reducing conflict misses.
- More complex and slower due to the need to search all lines to locate data.
- Set-Associative Cache:
- A compromise between direct-mapped and fully associative caches.
- Memory blocks are mapped to a subset of cache lines (a set) and can be stored in any line within that set.
- Reduces conflict misses while maintaining simpler search logic.
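The three organizations can be compared with one small simulator, because direct-mapped and fully associative caches are the two extremes of set associativity (1 way, and as many ways as there are lines). This sketch works on block addresses and uses LRU replacement within each set. LRU is an assumption made here, not something the classification itself requires.

```python
from collections import OrderedDict

def count_misses(block_addresses, num_lines, ways):
    """Count misses for a cache with num_lines lines and the given associativity.

    ways=1 is direct-mapped; ways=num_lines is fully associative.
    Replacement within a set is LRU (an assumption of this sketch).
    """
    num_sets = num_lines // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in block_addresses:
        lines = sets[block % num_sets]
        if block in lines:
            lines.move_to_end(block)        # mark as most recently used
        else:
            misses += 1
            if len(lines) == ways:
                lines.popitem(last=False)   # evict the least recently used block
            lines[block] = True
    return misses

# Blocks 0 and 4 map to the same line of a 4-line direct-mapped cache.
trace = [0, 4, 0, 4, 0, 4]
print(count_misses(trace, num_lines=4, ways=1))  # 6: every access is a conflict miss
print(count_misses(trace, num_lines=4, ways=2))  # 2: both blocks fit in one set
print(count_misses(trace, num_lines=4, ways=4))  # 2: fully associative
```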
Cache Performance
- Performance Metrics:
- Cache Hit: Occurs when the requested data is found in the cache, resulting in faster data retrieval.
- Cache Miss: Occurs when the requested data is not found, requiring data retrieval from main memory.
- Hit Rate: The percentage of memory accesses that result in cache hits.
- Miss Rate: The percentage of memory accesses that result in cache misses, directly affecting performance.
- Miss Penalty: The additional time required to fetch data from main memory on a miss.
- Average Memory Access Time (AMAT): Calculated as: $\text{AMAT} = \text{Hit time} + \text{Miss rate} \times \text{Miss penalty}$
- Factors Affecting Cache Performance:
- Cache Size: Larger caches generally reduce miss rates but may increase access time.
- Block Size: Determines the amount of data transferred on each cache miss.
- Associativity: Higher associativity typically reduces conflict misses but increases search complexity.
- Techniques to Improve Cache Performance:
- Prefetching: Anticipating which data will be needed soon and loading it into the cache.
- Replacement Policies: Determines which cache line to replace on a cache miss (e.g., Least Recently Used (LRU), First-In-First-Out (FIFO)).
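The metrics above combine into the standard average memory access time: hit time plus miss rate times miss penalty. The sketch below also shows the usual two-level extension, in which the L1 miss penalty is itself the access time of L2. All cycle counts and miss rates are hypothetical.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time for a single cache level."""
    return hit_time + miss_rate * miss_penalty

def amat_two_level(l1_hit, l1_miss_rate, l2_hit, l2_miss_rate, memory_penalty):
    """L1 misses go to L2; L2 misses (local miss rate) go to main memory."""
    return amat(l1_hit, l1_miss_rate, amat(l2_hit, l2_miss_rate, memory_penalty))

# Hypothetical numbers: 1-cycle L1 hit, 5% miss rate, 100-cycle memory penalty.
print(amat(1, 0.05, 100))                     # about 6 cycles
# Adding an L2 with a 10-cycle hit time and a 20% local miss rate.
print(amat_two_level(1, 0.05, 10, 0.2, 100))  # about 2.5 cycles
```

The comparison illustrates why a second cache level helps: it lowers the effective L1 miss penalty from 100 cycles to 30.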
Two-Way In-Order Superscalar
- Superscalar Architecture:
- A processor design where multiple instructions are issued and executed per clock cycle, increasing instruction-level parallelism.
- Requires multiple execution units to allow simultaneous instruction processing.
- In-Order Execution:
- Instructions are issued and executed in the order they appear in the program, without reordering.
- Simpler design but can result in pipeline stalls if dependencies between instructions are not managed well.
- Two-Way Superscalar:
- Allows two instructions to be issued and executed simultaneously per clock cycle.
- Benefits: Improves throughput by allowing multiple operations without complex out-of-order execution mechanisms.
- Challenges: Requires careful handling of dependencies and may encounter stalls if both instructions need the same resources.
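The pairing rule of a two-way in-order machine can be sketched as a check on adjacent instructions. This simplified model assumes one ALU pipe and one memory pipe. It looks only at dependences and resource conflicts within a candidate pair and ignores multi-cycle latencies such as load-use delays. The `Instr` tuple and the program are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    name: str
    dest: Optional[str]    # register written, or None (e.g. a store)
    srcs: Tuple[str, ...]  # registers read
    unit: str              # "alu" or "mem" (one pipe of each kind is assumed)

def can_dual_issue(first: Instr, second: Instr) -> bool:
    """True if the second instruction may issue in the same cycle as the first."""
    raw = first.dest is not None and first.dest in second.srcs
    waw = first.dest is not None and first.dest == second.dest
    structural = first.unit == second.unit
    return not (raw or waw or structural)

def issue_groups(program: List[Instr]) -> List[List[str]]:
    """Group instructions in program order, at most two per cycle."""
    groups = []
    i = 0
    while i < len(program):
        group = [program[i]]
        if i + 1 < len(program) and can_dual_issue(program[i], program[i + 1]):
            group.append(program[i + 1])
        groups.append([ins.name for ins in group])
        i += len(group)
    return groups

program = [
    Instr("ADD R1,R2,R3", "R1", ("R2", "R3"), "alu"),
    Instr("LW R4,0(R5)", "R4", ("R5",), "mem"),
    Instr("SUB R6,R4,R1", "R6", ("R4", "R1"), "alu"),
    Instr("SW R6,0(R7)", None, ("R6", "R7"), "mem"),
    Instr("ADD R8,R9,R10", "R8", ("R9", "R10"), "alu"),
]

for cycle, group in enumerate(issue_groups(program), start=1):
    print(cycle, group)
```

In this example the five instructions issue in three cycles: the store cannot pair with the preceding SUB because it reads R6, so SUB issues alone.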
Fetch Logic and Alignment
- Fetch Logic:
- The mechanism that retrieves instructions from memory and loads them into the pipeline.
- Determines how quickly and efficiently instructions are fed to the processor, impacting overall performance.
- In a superscalar architecture, fetch logic must retrieve multiple instructions per cycle to keep the pipeline filled.
- Instruction Alignment:
- Ensures that instructions are correctly positioned in memory for fast and efficient retrieval.
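- Critical in superscalar processors, because the group of instructions fetched in one cycle should not straddle a fetch-block or cache-line boundary. Otherwise fewer instructions than the issue width are delivered.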
- Challenges in Superscalar Fetching and Alignment:
- Branch Instructions: Control hazards from branches can disrupt fetch logic, requiring branch prediction to maintain efficiency.
- Alignment Hardware: Special hardware may be needed to handle misaligned instructions to maintain the dual-instruction fetch rate in a two-way superscalar processor.
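The cost of misalignment can be seen by counting how many instructions one fetch delivers. The sketch assumes fixed 4-byte instructions and an aligned 8-byte fetch block (two instructions) with no special alignment hardware. Both sizes and the addresses are illustrative.

```python
INSTR_BYTES = 4  # assumed fixed instruction length
FETCH_BYTES = 8  # assumed aligned fetch block: two instructions per cycle

def instructions_fetched(pc, fetch_bytes=FETCH_BYTES, instr_bytes=INSTR_BYTES):
    """Useful instructions delivered by one aligned fetch starting at pc."""
    block_start = pc - (pc % fetch_bytes)
    block_end = block_start + fetch_bytes
    return (block_end - pc) // instr_bytes

def fetch_cycles(start_pc, count):
    """Cycles needed to fetch `count` sequential instructions from start_pc."""
    pc, remaining, cycles = start_pc, count, 0
    while remaining > 0:
        got = min(instructions_fetched(pc), remaining)
        pc += got * INSTR_BYTES
        remaining -= got
        cycles += 1
    return cycles

print(instructions_fetched(0x100))  # 2: pc is at the start of a fetch block
print(instructions_fetched(0x104))  # 1: pc is in the second half of the block
print(fetch_cycles(0x100, 4))       # 2 cycles
print(fetch_cycles(0x104, 4))       # 3 cycles, e.g. after a branch to a misaligned target
```

A branch target in the middle of a fetch block wastes half of that fetch. Alignment hardware or compiler alignment of branch targets aims to avoid this loss.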