@kiw8 commented on Jun 21, 2025

This code builds a GemmaForCausalLM model on top of Google's pre-trained Gemma-2B language model, enhanced with the Infini-attention mechanism. The model safely inherits the pre-trained weights and then runs training and text-generation tests on a variety of input texts, as sketched below.
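A minimal sketch of the weight-inheritance step, assuming a hypothetical `InfiniGemmaForCausalLM` subclass whose attention layers add Infini-attention state on top of the stock Gemma-2B architecture. Only the HuggingFace loading APIs below are standard; note that `google/gemma-2b` is a gated checkpoint that requires accepting its license on the Hub.

```python
from transformers import AutoModelForCausalLM, GemmaForCausalLM

# Hypothetical stand-in for the PR's Infini-attention variant; in the PR,
# the attention modules would carry extra compressive-memory parameters.
class InfiniGemmaForCausalLM(GemmaForCausalLM):
    pass

pretrained = AutoModelForCausalLM.from_pretrained("google/gemma-2b")
model = InfiniGemmaForCausalLM(pretrained.config)

# strict=False loads every weight shared with Gemma-2B while leaving any
# new Infini-attention parameters at their random initialization.
missing, unexpected = model.load_state_dict(pretrained.state_dict(), strict=False)
print(f"newly initialized: {len(missing)}, ignored: {len(unexpected)}")
```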

In particular, it verifies the model's trainability by computing the loss and running backpropagation on both short and long input texts. It also exercises two styles of text generation: a manual step-by-step sampling loop and automated generation with HuggingFace's .generate() method; minimal sketches of both follow.
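A minimal sketch of the trainability check, reusing `model` from the sketch above; the texts, learning rate, and loop structure are illustrative. Passing `labels=input_ids` makes the HuggingFace forward pass compute the causal-LM loss internally.

```python
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

short_text = "The quick brown fox jumps over the lazy dog."
long_text = " ".join(["Infini-attention compresses distant context into memory."] * 200)

model.train()
for text in (short_text, long_text):
    batch = tokenizer(text, return_tensors="pt")
    outputs = model(**batch, labels=batch["input_ids"])  # loss computed internally
    outputs.loss.backward()                              # gradients must flow end to end
    optimizer.step()
    optimizer.zero_grad()
    print(f"{batch['input_ids'].shape[1]} tokens -> loss {outputs.loss.item():.3f}")
```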

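A minimal sketch of the two generation paths, again reusing `model`, `tokenizer`, and the `torch` import from above; the prompt, output length, and temperature are illustrative.

```python
model.eval()
prompt = tokenizer("Once upon a time", return_tensors="pt")

# 1) Step-by-step sampling: run a forward pass, sample one token from the
#    last position's distribution, append it, and repeat.
ids = prompt["input_ids"]
with torch.no_grad():
    for _ in range(20):
        logits = model(input_ids=ids).logits[:, -1, :]
        probs = torch.softmax(logits / 0.8, dim=-1)  # temperature 0.8
        ids = torch.cat([ids, torch.multinomial(probs, num_samples=1)], dim=-1)
print(tokenizer.decode(ids[0], skip_special_tokens=True))

# 2) Automated generation through HuggingFace's .generate().
out = model.generate(**prompt, max_new_tokens=20, do_sample=True, temperature=0.8)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```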