Building a Basic PyTorch Transformer

This repository contains a Jupyter Notebook that breaks down the implementation of a Transformer model using PyTorch. It provides a "from-scratch" approach to understanding how modern NLP models process sequential data.

Project Overview

The project follows the architecture introduced in the landmark paper "Attention is All You Need." It focuses on the modular implementation of the encoder and decoder components.

Key Features:

  • Input Embeddings: Converting token IDs into dense vectors scaled by the square root of the model dimension.
  • Positional Encoding: Using sine and cosine functions to inject sequence order information into embeddings.
  • Multi-Head Attention: A custom class implementation that handles linear projections for Queries, Keys, and Values, head splitting, and attention weight computation.
  • Model Inspection: Visualizing the full nn.Transformer object structure, including encoder/decoder layers, normalization, and dropout.
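The first two features above — scaled embeddings and sinusoidal positional encoding — can be sketched as follows. Class names and hyperparameters here are illustrative; the notebook's own implementation may differ in detail:

```python
import math
import torch
import torch.nn as nn

class InputEmbeddings(nn.Module):
    """Token embedding scaled by sqrt(d_model), as in 'Attention Is All You Need'."""
    def __init__(self, d_model: int, vocab_size: int):
        super().__init__()
        self.d_model = d_model
        self.embedding = nn.Embedding(vocab_size, d_model)

    def forward(self, x):
        return self.embedding(x) * math.sqrt(self.d_model)

class PositionalEncoding(nn.Module):
    """Adds fixed sine/cosine position information to the embeddings."""
    def __init__(self, d_model: int, max_len: int = 5000):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)          # (max_len, 1)
        div_term = torch.exp(
            torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model)
        )
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)           # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)           # odd dimensions
        self.register_buffer("pe", pe.unsqueeze(0))            # (1, max_len, d_model)

    def forward(self, x):
        # x: (batch, seq_len, d_model)
        return x + self.pe[:, : x.size(1)]
```

Registering `pe` as a buffer keeps it out of the optimizer while still moving it with the module across devices.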

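The multi-head attention step — project, split into heads, compute scaled dot-product attention, recombine — can be sketched like this (an illustrative version, not the notebook's exact code):

```python
import math
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Scaled dot-product attention computed in parallel across several heads."""
    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
        self.d_k = d_model // num_heads
        self.num_heads = num_heads
        self.w_q = nn.Linear(d_model, d_model)  # query projection
        self.w_k = nn.Linear(d_model, d_model)  # key projection
        self.w_v = nn.Linear(d_model, d_model)  # value projection
        self.w_o = nn.Linear(d_model, d_model)  # output projection

    def forward(self, q, k, v, mask=None):
        batch = q.size(0)

        # Project, then split the last dimension into (num_heads, d_k).
        def split(x, proj):
            return proj(x).view(batch, -1, self.num_heads, self.d_k).transpose(1, 2)

        q, k, v = split(q, self.w_q), split(k, self.w_k), split(v, self.w_v)

        # Scaled dot-product attention per head: softmax(QK^T / sqrt(d_k)) V.
        scores = (q @ k.transpose(-2, -1)) / math.sqrt(self.d_k)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))
        attn = scores.softmax(dim=-1)

        # Recombine heads and apply the output projection.
        out = (attn @ v).transpose(1, 2).contiguous()
        out = out.view(batch, -1, self.num_heads * self.d_k)
        return self.w_o(out)
```

For the model-inspection step, printing PyTorch's built-in module displays the full encoder/decoder stack, e.g. `print(nn.Transformer(d_model=512, nhead=8))`.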
Installation

  1. Clone the repository:
    git clone https://github.com/Joe-Naz01/transformers.git
    cd transformers
  2. Create and activate a conda environment:
    conda create -n transformers
    conda activate transformers
  3. Install the dependencies:
    pip install -r requirements.txt

About

A deep learning project that implements and explains the fundamental building blocks of the Transformer model using PyTorch. The notebook covers the implementation of Input Embeddings, Positional Encoding, and a custom Multi-Head Attention mechanism, providing a step-by-step guide to how these components transform data.
