Processor Simulator

Simulator

Getting Started

run <assembly-filename>
    [-s, --step-by-step](bool)       (run interactively step by step. default: false)
    [-v, --verbose](bool)            (verbose on debug mode. default: false)
    [-o, --output-folder](string)   (output folder where to store debug and memory files)
    [-c, --config-filename](string)  (processor config filename, a valid config filename is required)
    [--max-cycles](int)              (maximum number of cycles to execute. default: 3000)

Sample: run samples/programs/fibonacci.asm -c samples/configs/default.config -o results/my-test --max-cycles 1000 --step-by-step -v

Code structure

 bin/
   - Autogenerated binaries when the program is compiled
 benchmark/
   - Autogenerated output files/stats when benchmarks are executed
 samples/
    benchmark/
     - sh scripts for executing the different benchmarks in the simulator
    configs/
     - list of different configurations with different architectures to be benchmarked
    programs/
     - list of programs available to run as benchmarks
 src/ (source code)
    github.com/codegangsta/cli/
     - Open source library for command line application "styling". 
    app/ (Processor-simulator source code)
      logger/
       - Files for managing logging
      simulator/
         processor/
          - Definition of the processor models, components, config and all business logic
         standards/
          - Definition of standards used and its implementation (IEEE754)
         translator/
          - Translator in charge of compiling the assembly file (.asm) into the machine code file (.hex)
         main.go
          - Entry point :)
    utils/
     - Go utilities

Debugging

During simulation (Interactive)

If the program is executed with the flag -s or --step-by-step, you can inspect the state of the registers and/or data memory at the end of every executed step. If the flag is not provided, you can inspect only the final state at the end of the run.

The following menu will be presented:

Press the desired key and then hit [ENTER]...
 - (R) to see registers memory
 - (D) to see data memory
 - (E) to exit and quit
 - (*) Any other key to continue

If R or D is selected, the data will be displayed in the following format:

           0x00            0x04            0x08            0x0C
0x00    0x0000000A      0x0010000A      0x000C0000      0x00000000
0x10    0x000100E8      0x00000008      0x00000012      0x00000087
0x20    0x0000FF00      0x00000000      0x00D00068      0x002000A8
0x30    0x000000E8      0x0000C008      0x00000012      0x00000087
0x40    0x00000012      0x00000000      0x00100000      0x00000000
....
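
The layout above (one row per 16 bytes, each row labelled with the byte address of its first word) can be sketched in Go as follows. This is a minimal illustration, not the simulator's own code: the function name `FormatWords` and its signature are hypothetical, and the column widths are only an approximation of the dump shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// FormatWords renders 32-bit words four per row. Each row is labelled with the
// byte address of its first word and each column with its byte offset in the
// row. The name and signature are hypothetical, chosen for this sketch.
func FormatWords(words []uint32) string {
	const wordsPerRow = 4
	var b strings.Builder

	// Header row: byte offset of each column within a row.
	header := fmt.Sprintf("%-8s", "")
	for col := 0; col < wordsPerRow; col++ {
		header += fmt.Sprintf("%-16s", fmt.Sprintf("0x%02X", col*4))
	}
	b.WriteString(strings.TrimRight(header, " ") + "\n")

	for i := 0; i < len(words); i += wordsPerRow {
		// Row label: byte address of the first word in this row.
		row := fmt.Sprintf("%-8s", fmt.Sprintf("0x%02X", i*4))
		for j := i; j < i+wordsPerRow && j < len(words); j++ {
			row += fmt.Sprintf("%-16s", fmt.Sprintf("0x%08X", words[j]))
		}
		b.WriteString(strings.TrimRight(row, " ") + "\n")
	}
	return b.String()
}

func main() {
	fmt.Print(FormatWords([]uint32{0x0000000A, 0x0010000A, 0x000C0000, 0, 0x000100E8}))
}
```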

At the end of simulation (Persisted files)

At the end of the simulation, six files are generated with details of the execution, debugging information and the final memory states. The location of these output files can be selected with the flag -o or --output-folder.

 - assembly.hex: Machine code interpreted by the processor
 - memory.dat: Final state of the data memory.
 - registers.dat: Final state of the registers.
 - output.log: Execution resources according to the configuration and output statistics.
 - debug.log: Complete log for debugging purposes.
 - pipeline.dat: Pipeline diagram of the different executed instruction stages vs execution cycles

Compiler

This application has a built-in translator that converts human-readable assembly instructions into machine code. The available instructions are the ones defined in the Instruction Set section.

Syntax Rules

  • Only one instruction allowed per line
  • Comments prefix is ;
  • Comments may appear on their own line or after an instruction on the same line
  • The amount of whitespace (spaces or tabs) is not significant
  • Branch labels must be declared on their own line
  • No instruction may appear on the line where a branch label is declared
  • Blank lines are allowed

Examples

; Here are some comments on a new line

 PROCESS_LOOP:                      ; Here is a label followed by an inline comment
 ADDI    R1, R1, 1                  ; Here is an instruction along with its operands and an inline comment
                                    ; Here is a line holding only a comment, which is allowed
 ADD     R15, R15, R16              ; R15 += C[I]
 BLT     R1, R20, PROCESS_LOOP      ; Here is an instruction using a branch label followed by an inline comment
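
The syntax rules can be illustrated with a small line parser. The sketch below is written in Go under stated assumptions and is not the translator's actual implementation: the `Line` type and `ParseLine` function are hypothetical names, a label is assumed to be any text ending in `:`, and mnemonics are assumed to be case-insensitive.

```go
package main

import (
	"fmt"
	"strings"
)

// Line is a hypothetical parsed representation of one source line.
type Line struct {
	Label    string   // set when the line declares a branch label
	Mnemonic string   // set when the line holds an instruction
	Operands []string // instruction operands, with commas removed
}

// ParseLine strips ';' comments, ignores extra whitespace, and accepts either
// a label or a single instruction. ok is false for blank or comment-only lines.
func ParseLine(src string) (line Line, ok bool, err error) {
	if i := strings.Index(src, ";"); i >= 0 {
		src = src[:i] // drop the comment
	}
	src = strings.TrimSpace(src)
	if src == "" {
		return Line{}, false, nil
	}
	if i := strings.Index(src, ":"); i >= 0 {
		// A label must be alone on its line.
		if strings.TrimSpace(src[i+1:]) != "" {
			return Line{}, false, fmt.Errorf("no instruction allowed after label %q", src[:i])
		}
		return Line{Label: strings.TrimSpace(src[:i])}, true, nil
	}
	// Treat commas as separators so "R1, R1, 1" and "R1,R1,1" parse alike.
	fields := strings.Fields(strings.ReplaceAll(src, ",", " "))
	if len(fields) == 0 {
		return Line{}, false, fmt.Errorf("no mnemonic found in %q", src)
	}
	// Assumption: mnemonics are case-insensitive, so normalise to upper case.
	return Line{Mnemonic: strings.ToUpper(fields[0]), Operands: fields[1:]}, true, nil
}

func main() {
	for _, src := range []string{
		"; a comment on its own line",
		" PROCESS_LOOP:                 ; label",
		" ADDI    R1, R1, 1             ; instruction",
	} {
		line, ok, err := ParseLine(src)
		fmt.Printf("%+v ok=%v err=%v\n", line, ok, err)
	}
}
```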

Processor Architecture

Diagram

The following diagram shows the components of the architecture along with the pipeline and the interaction with its components:

[Diagram: Processor architecture]

Features

Overview

  • 32-bit architecture
  • Scalar, Pipelined or N-way superscalar
  • Out-of-order execution and non-blocking issue
  • 32 general purpose registers (32-bit) (used for integer & FP)
  • 1 MB Instructions Memory
  • 1 MB Data Memory

Five-stage pipeline

  • Fetch, Decode, Issue/Dispatch, Execute & Writeback

Execution units (EUs)

  • 2 ALU units
  • 2 Load/Store units
  • 1 Branch unit
  • 1 FPU unit

Branch Prediction

  • None (Stall)
  • Static: Always, Never, Forward, Backward
  • Dynamic: One-bit predictor, Two-bit predictor (BHT)
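
As an illustration of the two-bit dynamic scheme listed above, here is a minimal Go sketch of a branch history table of 2-bit saturating counters. It shows the general technique only; the type and method names are hypothetical, and the table indexing and the initial counter state (strongly not taken) are assumptions, not details taken from the simulator.

```go
package main

import "fmt"

// TwoBitPredictor is a branch history table of 2-bit saturating counters.
// Counter values 0 and 1 predict "not taken"; 2 and 3 predict "taken".
type TwoBitPredictor struct {
	counters []uint8
}

// NewTwoBitPredictor creates a table with the given number of entries
// (must be greater than zero). All counters start at 0 (an assumption).
func NewTwoBitPredictor(entries int) *TwoBitPredictor {
	return &TwoBitPredictor{counters: make([]uint8, entries)}
}

// index maps a word-aligned branch address to a table slot.
func (p *TwoBitPredictor) index(pc uint32) int {
	return int(pc>>2) % len(p.counters)
}

// Predict reports whether the branch at pc is predicted taken.
func (p *TwoBitPredictor) Predict(pc uint32) bool {
	return p.counters[p.index(pc)] >= 2
}

// Update moves the counter one step toward the actual outcome, saturating
// at 0 and 3 so a single misprediction does not flip a strong prediction.
func (p *TwoBitPredictor) Update(pc uint32, taken bool) {
	c := &p.counters[p.index(pc)]
	if taken && *c < 3 {
		*c++
	} else if !taken && *c > 0 {
		*c--
	}
}

func main() {
	p := NewTwoBitPredictor(16)
	for _, taken := range []bool{true, true, true, false, true} {
		fmt.Printf("predicted=%v actual=%v\n", p.Predict(0x40), taken)
		p.Update(0x40, taken)
	}
}
```

A one-bit predictor would keep only the last outcome per entry; the second bit adds hysteresis, which helps with loop branches that are taken many times and fall through once.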

Front-End Pipeline (In-order)

  • Instruction Fetch Unit (IFU):
      • 16 bytes fetched on each cycle (4 instructions)
  • Instruction Queue (IQ):
      • 18-instruction buffer
  • Instruction Decoding Unit (IDU):
      • 4 decoding units
  • Instructions Decoded Queue (IDQ):
      • 28-instruction buffer

Execution Pipeline (Out-of-order)

  • Common Data Bus (CDB)
  • Register Renaming:
      • Register Alias Table (RAT) with 32 entries
  • Reorder buffer (ROB):
      • 32 entries
      • Up to 4 instructions written back on each cycle
  • Unified reservation station (URS):
      • 128 entries
      • Up to 6 instructions dispatched on each cycle

Configuration

The architecture of the processor is configured through a JSON file that enables, disables or sets different features of the processor.

Some configurations are available at: samples/configs

Example

{
    "cycle_period_ms": 70,
    
    "registers_memory_size": 128,
    "instructions_memory_size": 1024,
    "data_memory_size": 1024,

    "branch_predictor_type": "one_bit",
    
    "pipelined": true,
    "instructions_fetched_per_cycle": 4,
    "instructions_queue": 18,
    "instructions_decoded_queue": 28,
    "instructions_dispatched_per_cycle": 6,
    "instructions_written_per_cycle": 6,
    "reservation_station_entries": 128,
    "reorder_buffer_entries": 32,
    "register_alias_table_entries": 32,

    "decoder_units": 4,

    "branch_units": 1,
    "load_store_units": 2,
    "alu_units": 2,
    "fpu_units": 1
}
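
A configuration file with these keys could be loaded in Go with `encoding/json` along the following lines. This is a sketch under assumptions: the `Config` struct, its field names and the `ParseConfig` helper are hypothetical and may not match the simulator's own types; only the JSON keys come from the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config mirrors the keys of the example configuration file. The struct and
// field names are hypothetical; only the JSON keys come from the example.
type Config struct {
	CyclePeriodMs                  int    `json:"cycle_period_ms"`
	RegistersMemorySize            int    `json:"registers_memory_size"`
	InstructionsMemorySize         int    `json:"instructions_memory_size"`
	DataMemorySize                 int    `json:"data_memory_size"`
	BranchPredictorType            string `json:"branch_predictor_type"`
	Pipelined                      bool   `json:"pipelined"`
	InstructionsFetchedPerCycle    int    `json:"instructions_fetched_per_cycle"`
	InstructionsQueue              int    `json:"instructions_queue"`
	InstructionsDecodedQueue       int    `json:"instructions_decoded_queue"`
	InstructionsDispatchedPerCycle int    `json:"instructions_dispatched_per_cycle"`
	InstructionsWrittenPerCycle    int    `json:"instructions_written_per_cycle"`
	ReservationStationEntries      int    `json:"reservation_station_entries"`
	ReorderBufferEntries           int    `json:"reorder_buffer_entries"`
	RegisterAliasTableEntries      int    `json:"register_alias_table_entries"`
	DecoderUnits                   int    `json:"decoder_units"`
	BranchUnits                    int    `json:"branch_units"`
	LoadStoreUnits                 int    `json:"load_store_units"`
	ALUUnits                       int    `json:"alu_units"`
	FPUUnits                       int    `json:"fpu_units"`
}

// ParseConfig decodes a JSON document into a Config.
func ParseConfig(data []byte) (Config, error) {
	var c Config
	if err := json.Unmarshal(data, &c); err != nil {
		return Config{}, fmt.Errorf("invalid config: %w", err)
	}
	return c, nil
}

func main() {
	sample := []byte(`{"branch_predictor_type": "one_bit", "pipelined": true, "alu_units": 2}`)
	c, err := ParseConfig(sample)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("predictor=%s pipelined=%v alu=%d\n", c.BranchPredictorType, c.Pipelined, c.ALUUnits)
}
```

Keys missing from the file are left at their Go zero values in this sketch; whether the simulator applies defaults or rejects such files is not stated here.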

Instruction Set

  • 32-bit wide instructions
  • Instruction formats: R, I & J
  • Instruction types: Arithmetic (ALU & FPU), Load/Store, Control/Branch
  • 32-bit registers used for integer operations or floating point operations

Instruction Formats

The following tables show the format of the instructions for each of the types R, I and J:

| Type | Format (32 bits) |        |        |        |          |          |
|------|------------------|--------|--------|--------|----------|----------|
| R    | Opcode (6)       | Rd (5) | Rs (5) | Rt (5) | Shmt (5) | Func (6) |

| Type | Format (32 bits) |        |        |                     |
|------|------------------|--------|--------|---------------------|
| I    | Opcode (6)       | Rd (5) | Rs (5) | Immediate (16 bits) |

| Type | Format (32 bits) |                   |
|------|------------------|-------------------|
| J    | Opcode (6)       | Address (26 bits) |

  • All instructions are 32 bits long (1 word)
  • Rs, Rt, and Rd are general purpose registers
  • PC stands for the program counter address
  • C denotes a constant (immediate)
  • - denotes don't-care values
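
The field widths above translate directly into shifts and masks. The Go sketch below extracts every field from a 32-bit word; it assumes the fields are packed from the most significant bit in the order listed (Opcode first), which the formats suggest but do not state explicitly. The `Fields` type and `Decode` function are hypothetical names.

```go
package main

import "fmt"

// Fields holds every field that can be extracted from a 32-bit instruction.
// Which fields are meaningful depends on the instruction type (R, I or J).
type Fields struct {
	Opcode    uint32 // bits 31-26, all types
	Rd        uint32 // bits 25-21, R and I types
	Rs        uint32 // bits 20-16, R and I types
	Rt        uint32 // bits 15-11, R type
	Shmt      uint32 // bits 10-6, R type
	Func      uint32 // bits 5-0, R type
	Immediate uint32 // bits 15-0, I type
	Address   uint32 // bits 25-0, J type
}

// Decode splits a word into its fields, assuming most-significant-first packing.
func Decode(word uint32) Fields {
	return Fields{
		Opcode:    word >> 26,
		Rd:        (word >> 21) & 0x1F,
		Rs:        (word >> 16) & 0x1F,
		Rt:        (word >> 11) & 0x1F,
		Shmt:      (word >> 6) & 0x1F,
		Func:      word & 0x3F,
		Immediate: word & 0xFFFF,
		Address:   word & 0x3FFFFFF,
	}
}

func main() {
	// Hypothetical I-type word: opcode 1, Rd = R2, Rs = R3, immediate 10.
	word := uint32(1)<<26 | 2<<21 | 3<<16 | 10
	fmt.Printf("%+v\n", Decode(word))
}
```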

List of Instructions

Arithmetic/Logic
  • From Opcode 000000 to 001111

  • ALU

    | Syntax            | Description      | Type |
    |-------------------|------------------|------|
    | add/addi Rd,Rs,Rt | Rd = Rs + Rt/C   | R    |
    | sub/subi Rd,Rs,Rt | Rd = Rs - Rt/C   | R    |
    | cmp Rd,Rs,Rt      | Rd = Rs <=> Rt   | R    |
    | mul Rd,Rs,Rt      | Rd = Rs * Rt     | R    |
    | shl/shli Rd,Rs,Rt | Rd = Rs << Rt/C  | R    |
    | shr/shrl Rd,Rs,Rt | Rd = Rs >> Rt/C  | R    |
    | and/andi Rd,Rs,Rt | Rd = Rs & Rt/C   | R    |
    | or/ori Rd,Rs,Rt   | Rd = Rs \| Rt/C  | R    |

  • FPU

    | Syntax        | Description  | Type |
    |---------------|--------------|------|
    | fadd Rd,Rs,Rt | Rd = Rs + Rt | R    |
    | fsub Rd,Rs,Rt | Rd = Rs - Rt | R    |
    | fmul Rd,Rs,Rt | Rd = Rs * Rt | R    |
    | fdiv Rd,Rs,Rt | Rd = Rs / Rt | R    |

Data Transfer
  • From Opcode 010000 to 011111

    | Syntax     | Description      | Type | Notes                   |
    |------------|------------------|------|-------------------------|
    | lw Rd,Rs,C | Rd = M[Rs + C]   | I    | load M[Rs + C] into Rd  |
    | sw Rd,Rs,C | M[Rd + C] = Rs   | I    | store Rd into M[Rs + C] |
    | lli Rd,C   | Rd = C           | I    | load lower immediate    |
    | sli Rd,C   | M[Rd] = C        | I    | store lower immediate   |
    | lui Rd,C   | Rd = C << 16     | I    | load upper immediate    |
    | sui Rd,C   | M[Rd] = C << 16  | I    | store upper immediate   |

Control
  • From Opcode 100000 to 101111

    | Syntax      | Description     | Type | Notes            |
    |-------------|-----------------|------|------------------|
    | beq Rd,Rs,C | br on equal     | I    | PC = PC + 4 + 4C |
    | bne Rd,Rs,C | br on not equal | I    | PC = PC + 4 + 4C |
    | blt Rd,Rs,C | br on less      | I    | PC = PC + 4 + 4C |
    | bgt Rd,Rs,C | br on greater   | I    | PC = PC + 4 + 4C |
    | j C         | jump to C       | J    | PC = 4*C         |
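
The two target formulas from the Notes column can be written out in Go as a quick check of the arithmetic. This is a sketch, not the simulator's code: the function names are hypothetical, and sign-extending the 16-bit immediate (so that backward branches yield a negative offset) is an assumption that the table does not state.

```go
package main

import "fmt"

// BranchTarget returns PC + 4 + 4C for a taken branch.
// Assumption: the 16-bit immediate C is a signed word offset, so it is
// sign-extended before being scaled by the 4-byte instruction size.
func BranchTarget(pc uint32, imm uint16) uint32 {
	offset := int32(int16(imm)) * 4
	return pc + 4 + uint32(offset) // unsigned wrap-around handles negative offsets
}

// JumpTarget returns 4*C for the 26-bit address field of a J-type instruction.
func JumpTarget(addr uint32) uint32 {
	return (addr & 0x3FFFFFF) * 4
}

func main() {
	fmt.Printf("0x%02X\n", BranchTarget(0x10, 2))      // forward: 0x10 + 4 + 8
	fmt.Printf("0x%02X\n", BranchTarget(0x10, 0xFFFD)) // backward: 0x10 + 4 - 12
	fmt.Printf("0x%02X\n", JumpTarget(5))              // absolute: 4 * 5
}
```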

Benchmarks

The following PDF file contains a brief description of the processor simulator along with the different experiments and benchmarks performed on this project