This repository contains a cycle-accurate RISC-V processor simulator developed as part of the CS209P Computer Architecture course project. The simulator models a multi-compute-unit architecture with realistic pipeline behavior, a two-level cache hierarchy, synchronization primitives, and programmer-managed scratchpad memory.
The goal of this project is to study memory system behavior, parallel execution, and performance trade-offs in modern processor designs.
Four Compute Units (CUs)
- Shared instruction fetch unit
- Independent Decode, Execute, Memory, and Writeback stages per CU
- Shared instruction and data memory
- Each CU has a unique CID (Compute ID) register
Single Fetch Unit
- All compute units fetch the same instruction
- Execution is selectively enabled or disabled based on CID
- Enables SIMD-like execution with control divergence
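A minimal sketch of this predication model, assuming hypothetical ComputeUnit and broadcast names (the simulator's actual internals may differ): every CU receives the same fetched instruction, but only CUs whose CID satisfies the current condition execute it.

```python
# Hypothetical sketch: class and field names are illustrative, not the simulator's actual API.

class ComputeUnit:
    def __init__(self, cid, num_registers=32):
        self.cid = cid               # unique Compute ID (CID) register
        self.regs = [0] * num_registers
        self.enabled = True          # set per instruction by CID-based predication

    def execute(self, operation):
        if self.enabled:             # disabled CUs treat the instruction as a no-op
            operation(self)

def broadcast(cus, operation, predicate=lambda cu: True):
    """The shared fetch unit issues the same instruction to every CU;
    only CUs whose CID satisfies the predicate actually execute it."""
    for cu in cus:
        cu.enabled = predicate(cu)
        cu.execute(operation)

def write_cid_times_ten(cu):
    cu.regs[1] = cu.cid * 10         # e.g. x1 <- CID * 10

cus = [ComputeUnit(cid) for cid in range(4)]
broadcast(cus, write_cid_times_ten, predicate=lambda cu: cu.cid % 2 == 0)
print([cu.regs[1] for cu in cus])    # [0, 0, 20, 0] -- only CU0 and CU2 executed
```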
L1 Caches
- L1 Instruction Cache (L1I)
- L1 Data Cache (L1D)
- Configurable parameters:
  - Cache size
  - Block size
  - Associativity
  - Access latency
- Instruction fetch is treated as a cacheable memory access
- Cache blocks of 64 bytes hold up to 16 instructions
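The configurable parameters above can be pictured as a small configuration record per cache. The sketch below is illustrative only; the dataclass name, field names, and values are assumptions, not the repository's actual configuration format.

```python
from dataclasses import dataclass

@dataclass
class CacheConfig:
    size_bytes: int       # total cache size
    block_size: int       # bytes per block; a 64 B block holds 16 4-byte instructions
    associativity: int    # 1 = direct-mapped; size/block = fully associative
    latency_cycles: int   # access latency in cycles

# Placeholder values, not the repository's defaults.
l1i = CacheConfig(size_bytes=4096, block_size=64, associativity=2, latency_cycles=1)
l1d = CacheConfig(size_bytes=4096, block_size=64, associativity=2, latency_cycles=1)
l2  = CacheConfig(size_bytes=32768, block_size=64, associativity=4, latency_cycles=8)
```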
L2 Cache
- Unified L2 cache shared by instructions and data
- Configurable size, associativity, and latency
- Accessed on L1 cache misses
Main Memory
- Accessed on L2 cache misses
- Configurable main memory latency
- Variable-latency memory operations introduce pipeline stalls
Replacement Policies
- LRU (Least Recently Used)
- One additional configurable replacement policy

The simulator tracks cache hits, misses, and stall cycles introduced by memory access delays.
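The sketch below illustrates set-level LRU bookkeeping, with FIFO shown as one possible choice for the additional configurable policy. The README does not name the second policy, so FIFO (and all class names) are assumptions for illustration.

```python
from collections import OrderedDict, deque

class LRUSet:
    """One cache set with LRU replacement: a hit refreshes the tag's recency;
    a miss into a full set evicts the least-recently-used tag."""
    def __init__(self, ways):
        self.ways = ways
        self.tags = OrderedDict()           # insertion order == recency order

    def access(self, tag):
        if tag in self.tags:                # hit: move to most-recently-used
            self.tags.move_to_end(tag)
            return True
        if len(self.tags) >= self.ways:     # miss with a full set: evict LRU
            self.tags.popitem(last=False)
        self.tags[tag] = True
        return False

class FIFOSet:
    """Possible second policy (an assumption): evict the oldest-inserted tag,
    ignoring hit recency."""
    def __init__(self, ways):
        self.ways = ways
        self.tags = deque()

    def access(self, tag):
        if tag in self.tags:
            return True
        if len(self.tags) >= self.ways:
            self.tags.popleft()
        self.tags.append(tag)
        return False

s = LRUSet(ways=2)
print([s.access(t) for t in (0x1, 0x2, 0x1, 0x3, 0x2)])
# [False, False, True, False, False] -- 0x2 was evicted as LRU before its re-access
```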
In addition to the cache hierarchy, the simulator includes a programmer-managed scratchpad memory:
- Same size and access latency as L1D cache
- No automatic replacement or tag lookup
- Entirely controlled by software
The SPM is accessed through two dedicated instructions:
- lw_spm rd, offset(rs1): loads a word from scratchpad memory into register rd
- sw_spm rs2, offset(rs1): stores a word from register rs2 into scratchpad memory
The SPM is used to compare cache-based and software-managed memory systems for strided access patterns.
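A minimal sketch of the SPM model, with assumed class and method names: because there is no tag lookup, every access is a deterministic, fixed-latency read or write into a flat word array.

```python
class Scratchpad:
    """Software-managed memory: a flat word array with a fixed access latency,
    no tag lookup, and no automatic replacement."""
    def __init__(self, size_bytes, latency_cycles):
        self.words = [0] * (size_bytes // 4)
        self.latency = latency_cycles

    def lw_spm(self, addr):
        # lw_spm rd, offset(rs1): addr = regs[rs1] + offset, word-aligned
        return self.words[addr // 4], self.latency

    def sw_spm(self, addr, value):
        # sw_spm rs2, offset(rs1)
        self.words[addr // 4] = value
        return self.latency

spm = Scratchpad(size_bytes=4096, latency_cycles=1)
spm.sw_spm(addr=8, value=42)            # every access costs the same latency
value, cycles = spm.lw_spm(addr=8)      # (42, 1) -- no hit/miss distinction
```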
SYNC Instruction
- Acts as a barrier synchronization primitive
- A compute unit stalls at SYNC until all compute units reach the same instruction
- Implemented as a hardware-modeled no-op
- Ensures correctness for parallel workloads such as reductions

This mechanism prevents premature reads of shared data before all compute units have completed their updates.
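A minimal sketch of the barrier semantics in a cycle-driven model, using assumed names (the simulator's implementation may differ): a CU that reaches SYNC registers its arrival and then stalls, polling each cycle until all CUs have arrived.

```python
class SyncBarrier:
    """Models SYNC: a CU that reaches the barrier stalls until every CU
    has arrived, then all are released together."""
    def __init__(self, num_cus=4):
        self.num_cus = num_cus
        self.waiting = set()
        self.generation = 0                     # bumps each time the barrier opens

    def arrive(self, cid):
        """Called once when a CU executes SYNC; returns the generation to wait on."""
        self.waiting.add(cid)
        gen = self.generation
        if len(self.waiting) == self.num_cus:   # last CU has reached SYNC
            self.waiting.clear()
            self.generation += 1                # release every waiting CU
        return gen

    def released(self, gen):
        """Polled each cycle by a stalled CU; True once its generation has passed."""
        return self.generation > gen

barrier = SyncBarrier()
tickets = {cid: barrier.arrive(cid) for cid in range(3)}    # CUs 0-2 stall here
print(barrier.released(tickets[0]))                         # False: CU 3 not yet at SYNC
tickets[3] = barrier.arrive(3)                              # last CU arrives, barrier opens
print(all(barrier.released(t) for t in tickets.values()))   # True: all CUs proceed
```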
At the end of execution, the simulator reports:
- Total number of stall cycles
- Cache miss rate
- IPC (Instructions Per Cycle)
These metrics are used to evaluate different cache configurations and memory access strategies.
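A minimal sketch of how these metrics can be computed from raw counters; the counter names and the numbers in the example are illustrative, not measured results.

```python
def report(total_instructions, total_cycles, cache_hits, cache_misses, stall_cycles):
    """End-of-run statistics: IPC, overall cache miss rate, and stall cycles."""
    ipc = total_instructions / total_cycles if total_cycles else 0.0
    accesses = cache_hits + cache_misses
    miss_rate = cache_misses / accesses if accesses else 0.0
    return {
        "IPC": ipc,                     # instructions per cycle
        "Cache miss rate": miss_rate,   # misses / (hits + misses)
        "Stall cycles": stall_cycles,   # cycles lost to memory delays
    }

print(report(total_instructions=1200, total_cycles=2000,
             cache_hits=450, cache_misses=50, stall_cycles=600))
# IPC = 0.6, miss rate = 0.1 (illustrative numbers only)
```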
Supported Workloads
- Parallel array addition using per-CU partial sums (see the sketch below)
- Strided array access benchmarks
- Cache vs scratchpad memory performance comparison
- Barrier synchronization using SYNC
The simulator supports evaluating both direct-mapped and fully associative cache configurations.
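A functional, Python-level sketch of the per-CU partial-sum workload (a stand-in for the actual assembly benchmark): each CU sums the elements whose index maps to its CID, and the final reduction is only valid after the SYNC barrier.

```python
def parallel_array_sum(arr, num_cus=4):
    """Each CU handles indices i with i % num_cus == cid and writes a partial
    sum; after the barrier (SYNC), one CU adds the partial sums together."""
    partial = [0] * num_cus
    for cid in range(num_cus):                  # conceptually runs in parallel
        partial[cid] = sum(arr[i] for i in range(cid, len(arr), num_cus))
    # --- SYNC barrier: all partial sums are guaranteed written past this point ---
    return sum(partial)                         # final reduction by a single CU

print(parallel_array_sum(list(range(1, 17))))   # 136
```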
Project Log
- Date: 10-03-2025
- Members: Anirudh A, Raghavendra P
- Decision: Raghavendra implemented the GUI and Anirudh connected it to the backend using Flask.
-
- Date: 08-03-2025
- Members: Anirudh A, Raghavendra P
- Decision: Anirudh completed the shared IF unit and shared memory implementation and worked on special-purpose registers.
-
- Date: 06-03-2025
- Members: Anirudh A, Raghavendra P
- Decision: Raghavendra started the GUI; Anirudh completed the latency implementation and worked on the shared IF unit.
-
- Date: 04-03-2025
- Members: Anirudh A, Raghavendra P
- Decision: Anirudh implemented pipelining with data forwarding, and Raghavendra experimented with latencies.
-
- Date: 02-03-2025
- Members: Anirudh A, Raghavendra P
- Decision: Raghavendra and Anirudh worked on stall detection and the correctness of the stall count, and completed the stall-count implementation.
-
- Date: 28-02-2025
- Members: Anirudh A, Raghavendra P
- Decision: Raghavendra prototyped pipelining without forwarding, while Anirudh worked on RAW hazard detection and completed the code for it, along with forwarding.
-
- Date: 25-02-2025
- Members: Anirudh A, Raghavendra P
- Decision: Raghavendra and Anirudh discussed how to implement pipelining and decided on an architecture.
-
- Date: 19-02-2025
- Members: Anirudh A, Raghavendra P
- Decision: Raghavendra completed the GUI using HTML, CSS, and JavaScript, while Anirudh worked on integrating the GUI with the Python backend. Anirudh decided to use Flask for this integration.
-
- Date: 17-02-2025
- Members: Anirudh A, Raghavendra P
- Decision: The team decided to implement a GUI for the simulator. Initially, Raghavendra developed a basic GUI using Tkinter (import tkinter as tk; from tkinter import messagebox). However, it was not visually appealing, so we decided to build the GUI using HTML, CSS, and JavaScript instead.
-
- Date: 15-02-2025
- Members: Anirudh A, Raghavendra P
- Decision: Anirudh tested the code with various programs and fixed several bugs involving the data segment format. We verified correct addressing of arrays and obtained the correct output for sum-of-elements problems.
-
- Date: 13-02-2025
- Members: Anirudh A, Raghavendra P
- Decision: The team collaboratively implemented the Bubble Sort algorithm. We also added a data segment (.word) to the code by creating an array to store input data in the format: arr: .word 0x4 ...
-
- Date: 11-02-2025
- Members: Anirudh A, Raghavendra P
- Decision: Anirudh realised that array indexing starts at 1, but since addi can also perform general arithmetic, logical and pointer arithmetic cannot be distinguished. Anirudh therefore decided to make memory allocations in multiples of 4 (4*x), with each index belonging to its modulo-4 core ID.
-
- Date: 09-02-2025
- Members: Anirudh A, Raghavendra P
- Decision: The team divided responsibilities: 1. Raghavendra was assigned to implement arithmetic operations. 2. Anirudh was responsible for memory operations. 3. We discussed defining unique instructions that differ from the RISC-V instruction set.
-
- Date: 07-02-2025
- Members: Anirudh A, Raghavendra P
- Decision: 1. Anirudh was assigned to complete the Software Design by 10-02-2025. 2. Raghavendra was tasked with reviewing relevant topics and enhancing his Python knowledge.
-
- Date: 06-02-2025
- Members: Anirudh A, Raghavendra P
- Decision: Decided to build the GPU simulator in Python because: 1. Python has a simpler syntax compared to C/C++, making it easier to implement and understand complex GPU architectures. 2. Python has great visualization tools like Matplotlib and Seaborn, which help analyze performance metrics.
Implementation Notes
- Special register: x31
- Instructions implemented: add addi sub la lw sw bne ble beq jal jr slt j li
- The .word directive is supported in the data segment
- Code must contain a .data and a .text segment to work
- Every label should have a corresponding instruction; for ease of parsing, labels should be written as standalone statements
- Memory is used starting from the end for storing .data segment values
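A minimal example of a program that follows these format rules, written to assembly.asm (the file the simulator reads). The program itself is an illustrative assumption and is not taken from the repository; it only uses instructions from the implemented list above.

```python
# Illustrative only: a tiny program with a .data segment, a .text segment,
# and standalone labels, written to assembly.asm for the simulator to read.
example_program = """\
.data
arr: .word 0x4
brr: .word 0x7

.text
main:
la x5, arr
lw x6, 0(x5)
la x7, brr
lw x8, 0(x7)
add x10, x6, x8
"""

with open("assembly.asm", "w") as f:
    f.write(example_program)
```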
- GUI: launch the web interface with:
cd Codes
cd Simulator/Phase2
pip install -r requirements.txt
python main.py
Open 127.0.0.1:5000 in your browser.
- File Reading: edit assembly.asm with your program, then run:
cd Codes
cd Simulator/Phase2
pip install -r requirements.txt
python main.py