This repository contains a Jupyter Notebook that breaks down the implementation of a Transformer model using PyTorch. It provides a "from-scratch" approach to understanding how modern NLP models process sequential data.
The project follows the architecture introduced in the landmark paper "Attention is All You Need." It focuses on the modular implementation of the encoder and decoder components.
- Input Embeddings: Converting token IDs into dense vectors scaled by the square root of the model dimension.
- Positional Encoding: Using sine and cosine functions to inject sequence order information into embeddings.
- Multi-Head Attention: A custom class implementation that handles linear projections for Queries, Keys, and Values, head splitting, and attention weight computation.
- Model Inspection: Visualizing the full `nn.Transformer` object structure, including encoder/decoder layers, normalization, and dropout.
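The embedding and positional-encoding steps above can be sketched as follows. This is a minimal illustration under the paper's formulas, not the notebook's exact code; class and parameter names here are assumptions:

```python
import math
import torch
import torch.nn as nn

class InputEmbeddings(nn.Module):
    """Token embedding scaled by sqrt(d_model), as in 'Attention Is All You Need'."""
    def __init__(self, vocab_size: int, d_model: int):
        super().__init__()
        self.d_model = d_model
        self.embedding = nn.Embedding(vocab_size, d_model)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Scale embeddings so their magnitude matches the positional encodings.
        return self.embedding(token_ids) * math.sqrt(self.d_model)

class PositionalEncoding(nn.Module):
    """Fixed sine/cosine positional encoding added to the embeddings."""
    def __init__(self, d_model: int, max_len: int = 5000):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)  # (max_len, 1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)   # even dimensions: sine
        pe[:, 1::2] = torch.cos(position * div_term)   # odd dimensions: cosine
        self.register_buffer("pe", pe)                 # not a learned parameter

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); add the first seq_len position encodings.
        return x + self.pe[: x.size(1)]
```

For example, `PositionalEncoding(512)(InputEmbeddings(1000, 512)(torch.randint(0, 1000, (2, 10))))` yields a `(2, 10, 512)` tensor ready for the encoder.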
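The multi-head attention bullet (projections, head splitting, attention weights) can be sketched like this; again a hedged reconstruction of the standard technique, with names chosen for illustration rather than taken from the notebook:

```python
import math
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Project Q/K/V, split into heads, apply scaled dot-product attention."""
    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
        self.num_heads = num_heads
        self.d_k = d_model // num_heads
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, q, k, v, mask=None):
        batch = q.size(0)

        def split(x, proj):
            # Project, then reshape to (batch, heads, seq_len, d_k).
            return proj(x).view(batch, -1, self.num_heads, self.d_k).transpose(1, 2)

        q, k, v = split(q, self.w_q), split(k, self.w_k), split(v, self.w_v)

        # Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_k)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = scores.softmax(dim=-1)

        # Concatenate heads back to (batch, seq_len, d_model) and project.
        out = (weights @ v).transpose(1, 2).contiguous().view(batch, -1, self.num_heads * self.d_k)
        return self.w_o(out)
```

Splitting into heads lets each head attend over a lower-dimensional subspace (`d_k = d_model / num_heads`), so the total computation stays comparable to single-head attention.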
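The model-inspection step amounts to printing the built-in module, which PyTorch renders as a tree of encoder/decoder layers with their `LayerNorm` and `Dropout` submodules (the hyperparameters below are the paper's defaults, not necessarily the notebook's):

```python
import torch.nn as nn

# Build the reference nn.Transformer and print its module tree.
model = nn.Transformer(d_model=512, nhead=8,
                       num_encoder_layers=6, num_decoder_layers=6)
print(model)  # lists each TransformerEncoderLayer/TransformerDecoderLayer,
              # plus the LayerNorm and Dropout modules inside them
```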
- Clone the repository and set up the environment:

```bash
git clone https://github.com/Joe-Naz01/transformers.git
cd transformers
conda create -n transformers python
conda activate transformers
pip install -r requirements.txt
```