hpml

Here are 6 public repositories matching this topic...

BGU-CS-VIL / DPMMSubClusters.jl

Distributed MCMC Inference in Dirichlet Process Mixture Models (High Performance Machine Learning Workshop 2019)

machine-learning julia parallel-computing distributed-computing inference distributed gaussian bayesian-inference mixture-model dirichlet-process multinomial dirichlet-process-mixtures ccgrid hpml hyper-computing

Updated Jan 18, 2023
Julia

parisimaa / NYU-HPC

NYU HPC user instructions

machine-learning deep-learning hpc high-performance-computing hpml

Updated Sep 9, 2024
Shell

rutujaingole / Optimizing-LLM-Inference-using-NVIDIA-Dynamo-and-TorchDynamo

This project benchmarks and optimizes BERT inference across three backends: PyTorch eager mode, TorchDynamo (Inductor backend), and NVIDIA Triton Inference Server. It evaluates on GLUE SST-2 samples and compares performance through profiling, kernel timing, and latency analysis.

machine-learning machine-learning-algorithms pytorch high-performance-computing profiling bert nvidia-gpu hpml torchdynamo llm-inference nvidia-dynamo

Updated May 10, 2025
Jupyter Notebook

igopalakrishna / DyT-NoNorm-LLMs-REWILD

Replacing LayerNorm with Dynamic Tanh (DyT) in DistilGPT2 + LoRA, evaluated on RE-WILD, Alpaca, and ShareGPT.

deep-learning pytorch transformer research-project lora pythia fine-tuning peft huggingface hpml distilgpt2 dyt llms dynamic-tanh rewild

Updated May 10, 2025
Jupyter Notebook

alex-is-busy-coding / speculative-rag

An implementation of Speculative RAG exploring latency-quality trade-offs in multi-draft retrieval. Features batched parallel drafting via vLLM and log-probability verifier selection for fast, high-quality QA on a single A100 GPU.

nlp benchmarking gpu latency high-performance-computing throughput quantization faiss rag vector-search hpml a100 llm vllm retrieval-augmented-generation speculative-rag

Updated Mar 14, 2026
Python

MRROBOT401 / DyT-NoNorm-LLMs-REWILD

Replacing LayerNorm with Dynamic Tanh (DyT) in DistilGPT2 + LoRA, evaluated on RE-WILD, Alpaca, and ShareGPT.

deep-learning pytorch transformer research-project lora pythia fine-tuning peft huggingface hpml distilgpt2 dyt dynamic-tanh rewild

Updated Mar 14, 2026
Jupyter Notebook
