Open

Anna #18

32 commits
4427c13
Ignore virtual environment
Silv1357 Jan 27, 2025
a523f38
added a txt file
littlecelestedemon Jan 21, 2025
90aa292
added some articles about ai generation
littlecelestedemon Jan 21, 2025
bd3fd37
updated hugging-face-research.txt
littlecelestedemon Jan 22, 2025
c377e6b
Update hugging-face-research.txt
littlecelestedemon Jan 22, 2025
11c89e0
Update and rename hugging-face-research.txt to sources.txt
littlecelestedemon Jan 23, 2025
0ab6f85
Update sources.txt
littlecelestedemon Jan 25, 2025
0352e7e
readded sources
littlecelestedemon Jan 27, 2025
09dd6f2
added main and pegasus
littlecelestedemon Jan 27, 2025
d8f3109
LLM page
Silv1357 Jan 30, 2025
015f0cf
accidently modified this file, should not be diff
littlecelestedemon Jan 28, 2025
d3ac816
put in a way to find logo.png
littlecelestedemon Jan 28, 2025
969adeb
trying to get the logo to pop up
littlecelestedemon Jan 29, 2025
079108a
tinkered with adding a photo to the homepage
littlecelestedemon Jan 29, 2025
ce8c302
feat: add finetuning gui
Silv1357 Feb 2, 2025
c783acb
fix: remove max length and num beams parameters for training
ronantakizawa Jan 30, 2025
818c257
feat: Update CHANGELOG.md
ronantakizawa Jan 30, 2025
1d819bc
Ignore virtual environment
Silv1357 Jan 27, 2025
3ea4864
LLM page
Silv1357 Jan 30, 2025
c0692b8
feat: added evaluation gui
ronantakizawa Jan 31, 2025
61b33e8
feat: remove chat_history.py
ronantakizawa Jan 31, 2025
6f03c64
Changes to LLM page
Silv1357 Feb 2, 2025
5810b6d
Displaying downloaded LLMs
Silv1357 Feb 3, 2025
a0610ba
Multiple model upload feature for LLM Page
Silv1357 Feb 3, 2025
fc21db4
Change Download to Import for button
Silv1357 Feb 3, 2025
81e1b13
Updates to LLM Page UI
Silv1357 Feb 6, 2025
414a5ab
Updates to LLM UI
Silv1357 Feb 6, 2025
b20784b
Added LLM specific pagepop-out window
Silv1357 Feb 10, 2025
e791b64
Integration of Homepage with Kaylie
Silv1357 Feb 11, 2025
ce6888c
integration
Silv1357 Feb 11, 2025
1ba07be
integration
Silv1357 Feb 11, 2025
0db4739
Updates to Homepage
Silv1357 Feb 11, 2025
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,10 +1,13 @@

__pycache__/
*.py[cod]
*$py.class
model_files/
medscribe_env/
**/.DS_Store
chat_data/
.env/
finetunedmodels/
.env/
src/geval/mistral-7b-instruct-v0.2.Q4_K_M.gguf
eval_files/
1 change: 0 additions & 1 deletion CHANGELOG.md
@@ -18,7 +18,6 @@
- Support multiple models for evaluation and finetuning
- Support multiple models for chat


## 1-31
- Added Evaluation
- Added Evaluation via GUI
27 changes: 27 additions & 0 deletions pegasus.py
@@ -0,0 +1,27 @@
# import pegasus
from transformers import PegasusForConditionalGeneration, PegasusTokenizer
from GUI import *

class Pegasus:
    def __init__(self):
        # load pegasus for summarization
        self.model_name = "google/pegasus-xsum"
        self.tokenizer = PegasusTokenizer.from_pretrained(self.model_name)  # initialize pegasus' tokenizer
        self.model = PegasusForConditionalGeneration.from_pretrained(self.model_name)  # initialize pegasus model

    def tokenize(self, input_text):
        # tokenize text
        tokenized = self.tokenizer(input_text, truncation=True, padding="longest", return_tensors="pt")
        return tokenized

    def summarizer(self, tokenized_text):
        # summarize
        summarized = self.model.generate(**tokenized_text)
        return summarized

    def detokenize(self, summarized_text):
        # detokenize
        summary = self.tokenizer.batch_decode(summarized_text)
        return summary
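
The three methods in pegasus.py form a tokenize, generate, decode pipeline. A minimal sketch of chaining them is shown here, assuming two hypothetical helpers, `load_pegasus` and `summarize`, that are not part of the PR. The `skip_special_tokens=True` argument is also an addition: without it, `batch_decode` keeps markers such as `<pad>` and `</s>` in the returned text.

```python
def load_pegasus(model_name="google/pegasus-xsum"):
    """Hypothetical helper: load the tokenizer and model used in pegasus.py."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name)
    return tokenizer, model


def summarize(texts, tokenizer, model):
    """Hypothetical helper: tokenize, generate, and decode a batch of texts."""
    # Accept a single string as a batch of one.
    if isinstance(texts, str):
        texts = [texts]
    batch = tokenizer(texts, truncation=True, padding="longest", return_tensors="pt")
    output_ids = model.generate(**batch)
    # skip_special_tokens drops padding and end-of-sequence markers from the text.
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

With the weights downloaded, `summarize("some long article text", *load_pegasus())` returns a list holding one summary string. Passing the tokenizer and model in as arguments keeps the function easy to exercise with stand-ins.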

2 changes: 1 addition & 1 deletion requirements.txt
@@ -7,4 +7,4 @@ rouge-score>=0.1.2
nltk>=3.8.1
bert-score>=0.3.13
evaluate>=0.4.0
llama-cpp-python>=0.2.11
llama-cpp-python>=0.2.11
18 changes: 18 additions & 0 deletions sources.txt
@@ -0,0 +1,18 @@

Works Cited

Avinash. “LLM Evaluation Metrics — BLEU, ROGUE and METEOR Explained.” Medium, 7 Aug. 2024, avinashselvam.medium.com/llm-evaluation-metrics-bleu-rogue-and-meteor-explained-a5d2b129e87f. Accessed 11 Feb. 2025.
Bais, Gourav. “LLM Evaluation for Text Summarization.” Neptune.ai, 25 Sept. 2024, neptune.ai/blog/llm-evaluation-text-summarization. Accessed 7 Feb. 2025. (photo)
Banerjee, Satanjeev, and Alon Lavie. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. 2005.
Bajaj, Ganesh. “When to Use BLEU Score: Evaluating Text Generation with N-Gram Precision.” Medium, Artificial Intelligence in Plain English, 26 Sept. 2024, ai.plainenglish.io/when-to-use-bleu-score-evaluating-text-generation-with-n-gram-precision-3431829a641e. Accessed 7 Feb. 2025. (photo)
“BERT Score - a Hugging Face Space by Evaluate-Metric.” Huggingface.co, huggingface.co/spaces/evaluate-metric/bertscore.
courtzc. “Evaluating the Performance of LLM Summarization Prompts with G-Eval.” Microsoft.com, 25 June 2024, learn.microsoft.com/en-us/ai/playbook/technology-guidance/generative-ai/working-with-llms/evaluation/g-eval-metric-for-summarization. Accessed 11 Feb. 2025.
Falcão, Fabiano. “Metrics for Evaluating Summarization of Texts Performed by Transformers: How to Evaluate The….” Medium, 22 Apr. 2023, fabianofalcao.medium.com/metrics-for-evaluating-summarization-of-texts-performed-by-transformers-how-to-evaluate-the-b3ce68a309c3.
Issiaka Faissal Compaore, et al. “AI-Driven Generation of News Summaries: Leveraging GPT and Pegasus Summarizer for Efficient Information Extraction.” Hal.science, 5 Feb. 2024, hal.science/hal-04437765, https://hal.science/hal-04437765. Accessed 11 Feb. 2025.
Liu, Yang, et al. G-EVAL: NLG Evaluation Using GPT-4 with Better Human Alignment.
Otten, Neri Van, and Neri Van Otten. “METEOR Metric in NLP: How It Works & How to Tutorial in Python.” Spot Intelligence, 26 Aug. 2024, spotintelligence.com/2024/08/26/meteor-metric-in-nlp-how-it-works-how-to-tutorial-in-python/.
---. “ROUGE Metric in NLP: Complete Guide & How to Tutorial in Python.” Spot Intelligence, 12 Aug. 2024, spotintelligence.com/2024/08/12/rouge-metric-in-nlp/.
Ruman. “BERT Score Explained | Medium.” Medium, 17 May 2024, rumn.medium.com/bert-score-explained-8f384d37bb06. Accessed 7 Feb. 2025. (photo)
Yang, Xinyu, et al. “Navigating Dataset Documentations in AI: A Large-Scale Analysis of Dataset Cards on Hugging Face.” ArXiv.org, 24 Jan. 2024, arxiv.org/abs/2401.13822. Accessed 7 Feb. 2025.


1 change: 0 additions & 1 deletion src/components/__init__.py
@@ -6,4 +6,3 @@
from components.llm_input import LLMInput
from components.llm_list import LLMList
from components.evaluation_visualizer import EvaluationVisualizer
from components.llm_details import LLMDetails
5 changes: 2 additions & 3 deletions src/components/evaluation_form.py
@@ -60,6 +60,5 @@ def start_evaluation(self):
dataset_path,
model_path,
start_idx,
end_idx,
self.use_geval.get() # Pass the G-EVAL toggle state
)
end_idx
)
17 changes: 15 additions & 2 deletions src/components/llm_details.py
@@ -4,33 +4,40 @@
import os
from itertools import islice
from collections import OrderedDict

class LLMDetails(tk.Frame):
def __init__(self, parent, llm, **kwargs):
super().__init__(parent, bg="#D2E9FC", **kwargs)
self.llm = llm
self.setup_input_area()

def setup_input_area(self):
self.grid_columnconfigure(0, weight=1)
self.grid_rowconfigure(0, weight=1)

# Container frame for LLM Name, LLM Details
self.container = tk.Frame(self, bg="#D2E9FC")
self.container.grid(row=0, column=0, sticky="nsew", padx=0, pady=15)
self.container.grid_rowconfigure(0, weight=1)
self.container.grid_rowconfigure(1, weight=1)
self.container.grid_columnconfigure(0, weight=1) # for name frame
self.container.grid_columnconfigure(1, weight=1) # for details frame

# LLM Name container
self.llm_name_container = tk.Frame(self.container, bg="#D2E9FC")
self.llm_name_container.grid(row=0, column=0, sticky="ew", padx=(5, 10), pady=(20, 5))
self.llm_name_container.grid_columnconfigure(0, weight=1)
# LLM details container frame with rounded corners

# LLM details container frame with rounded corners
self.llm_details_container = RoundedFrame(self.container, "#FFFFFF", radius=50)
self.llm_details_container.grid(row=1, column=0, sticky="nsew", padx=(20, 5), pady=10)
self.llm_details_container.grid_rowconfigure(0, weight=1)
self.llm_details_container.grid_columnconfigure(0, weight=1)

# "LLM Name" string variable
string = tk.StringVar()
string.set("LLM: " + self.llm)

# "Import LLM" area
self.name_label = tk.Label(
self.llm_name_container,
@@ -42,6 +49,7 @@ def setup_input_area(self):
justify="left"
)
self.name_label.grid(row=0, column=0, sticky="w", padx=20, pady=(20, 10))

# LLM detail area
self.details = tk.Text(
self.llm_details_container,
@@ -54,9 +62,11 @@ def setup_input_area(self):
highlightthickness=0,
relief="flat"
)

self.details.grid(row=0, column=0, sticky="nsew", padx=20, pady=20)
self.details.grid_rowconfigure(0, weight=1)
config = self.get_model_info()

# get first 20 entries in config
first_twenty = dict(islice(config.items(), 20))
rev = OrderedDict(reversed(list(first_twenty.items())))
@@ -66,6 +76,8 @@ def setup_input_area(self):
self.details.insert("1.0", "\n")
self.details.insert("1.0", str_key + ": " + str_value)
self.details.insert("1.0", "\n")


def get_model_info(self):
# cd to model_files
cwd = os.getcwd()
@@ -74,4 +86,5 @@ def get_model_info(self):
model_dir = os.path.join(parent, file)
f = open(model_dir) # read config file of selected model
config = json.load(f) # load configurations
return config
return config
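
`get_model_info` opens the selected model's config file with `open()` and never closes it, and `setup_input_area` takes the first 20 entries, reverses them, and inserts each one at index "1.0" so they end up in their original order. A minimal sketch of the same idea with a context manager and a single formatted string is given here. The helper names `load_model_config` and `format_config_preview` are hypothetical, and the `config.json` file name is an assumption because the path is elided in the diff.

```python
import json
from itertools import islice
from pathlib import Path


def load_model_config(model_dir, filename="config.json"):
    """Hypothetical helper: read a model's JSON config and close the file afterwards."""
    config_path = Path(model_dir) / filename  # file name is an assumption
    with config_path.open(encoding="utf-8") as f:
        return json.load(f)


def format_config_preview(config, limit=20):
    """Hypothetical helper: render the first `limit` entries as 'key: value' lines."""
    first_entries = islice(config.items(), limit)
    return "\n".join(f"{key}: {value}" for key, value in first_entries)
```

The resulting string could be passed to the `tk.Text` widget in one `insert` call, which would make the reversal step unnecessary.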

2 changes: 1 addition & 1 deletion src/components/llm_input.py
@@ -208,4 +208,4 @@ def get_input(self):

def clear_input(self):
self.input_text.delete("1.0", "end")
self.input_text.focus()
self.input_text.focus()
13 changes: 4 additions & 9 deletions src/components/llm_list.py
@@ -75,16 +75,11 @@ def get_models(self):
cwd = os.getcwd() # current directory
parent = os.path.dirname(cwd) # parent directory
model_dir = os.path.join(parent, "model_files")

# Create model_files directory if it doesn't exist
if not os.path.exists(model_dir):
os.makedirs(model_dir)

# check all folders in model_files folder
for folder in os.listdir(model_dir):
# check if model_files
for folder in os.listdir(model_dir): # list all folders in model_files folder
if os.path.isdir(os.path.join(model_dir, folder)):
folder_list.append(folder)

return folder_list
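
With the directory-creation guard removed from `get_models`, `os.listdir(model_dir)` raises `FileNotFoundError` when `model_files` does not exist yet. A minimal sketch of one way to tolerate that case follows. It assumes a hypothetical standalone function, `list_model_folders`, rather than the PR's method, and the sorting is an addition for a stable display order.

```python
from pathlib import Path


def list_model_folders(model_dir):
    """Hypothetical helper: return the names of the sub-folders in model_dir, sorted."""
    root = Path(model_dir)
    if not root.is_dir():
        # No models imported yet: report an empty list instead of raising.
        return []
    # Keep directories only; loose files in model_files are ignored.
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())
```

Returning an empty list leaves the decision about creating the folder to the import step, which is the only place that needs to write to it.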

def write_list(self, list):
@@ -101,4 +96,4 @@ def mouse_scroll(self, event):
self.list.yview_scroll(-1, "units")
elif event.num == 5:
self.list.yview_scroll(1, "units")
return "break"
return "break"
10 changes: 5 additions & 5 deletions src/components/navbar.py
@@ -4,18 +4,18 @@

# Navigation items
nav_items = [
("Home","home"),
("Home", "home"),
("Evaluations", "evaluations"),
("LLMs", "llms"),
("Finetune", "finetune")
("Finetune", "finetune"),
]

class Navbar(tk.Frame):
def __init__(self, parent, show_page_callback, **kwargs):
super().__init__(parent, bg="#FFFFFF", **kwargs)
# Set callback function
self.show_page_callback = show_page_callback
self.current_page = "llms" # Default page
self.current_page = "home" # Default page
self.buttons = {} # Store button references
self.logo_photo = None
self._setup_navbar()
@@ -94,7 +94,7 @@ def _setup_navbar(self):
btn.bind("<Enter>", lambda e, b=btn_container, p=page: self._on_hover(b, p))
btn.bind("<Leave>", lambda e, b=btn_container, p=page: self._on_leave(b, p))

self.set_active_page("llms")
self.set_active_page("home")

def _handle_click(self, page):
if page != self.current_page:
@@ -137,4 +137,4 @@ def _on_leave(self, container, page):
else:
container.configure(bg="#E8F0FE")
for child in container.winfo_children():
child.configure(bg="#E8F0FE")
child.configure(bg="#E8F0FE")
11 changes: 10 additions & 1 deletion src/download_pegasus.py
@@ -20,10 +20,19 @@ def download_model():

"""
Model Details:
├── config.json (Architecture blueprint)
├── generation_config.json (Controls how the model generates text, e.g. beam search)
├── model.safetensors (Weights that change during finetuning)
├── special_tokens_map.json (Maps special tokens like [PAD], [EOS], [UNK] to their IDs; used for handling start/end of text, padding, unknown words)
├── spiece.model (The tokenizer's vocabulary and rules; defines how to split text into tokens)
└── tokenizer_config.json (Configuration for the tokenizer's behavior)
"""
9 changes: 6 additions & 3 deletions src/main.py
@@ -6,6 +6,7 @@
from pages.finetune_page import FinetunePage
from pages.home_page import HomePage


# Set environment variable
os.environ['TOKENIZERS_PARALLELISM'] = 'false'

@@ -20,7 +21,6 @@ def setup_gui(self):
self._configure_root()
self._configure_grid()
self._setup_navbar()
self.show_page("llms") # Start with llm page

def _configure_root(self):
self.root.title("Assess.ai")
@@ -45,14 +45,16 @@ def _clear_content(self):
def show_page(self, page_name):
# Show the new page
self._clear_content()

if page_name == "evaluations":
self.current_page = EvaluationPage(self.root)
elif page_name == "llms":
self.current_page = LLMsPage(self.root)
elif page_name == "finetune":
self.current_page = FinetunePage(self.root)
elif page_name == "home":
self.current_page = HomePage(self.root)
self.current_page = HomePage(self.root, self.show_page)
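
The `if`/`elif` chain in `show_page` maps a page name to a page class, and `HomePage` now also receives the `show_page` callback. A minimal sketch of the same mapping as a dictionary of factories is shown here. The page classes are stand-ins for the real Tk frames in `pages/`, and the names `PageRouter`, `_StubPage`, and `_factories` are hypothetical.

```python
class _StubPage:
    """Stand-in for a Tk page; the real classes are imported from pages/."""

    def __init__(self, root, show_page=None):
        self.root = root
        self.show_page = show_page


class EvaluationPage(_StubPage):
    pass


class LLMsPage(_StubPage):
    pass


class FinetunePage(_StubPage):
    pass


class HomePage(_StubPage):
    pass


class PageRouter:
    def __init__(self, root):
        self.root = root
        self.current_page = None
        # One factory per page name; only HomePage needs the navigation callback.
        self._factories = {
            "evaluations": lambda: EvaluationPage(self.root),
            "llms": lambda: LLMsPage(self.root),
            "finetune": lambda: FinetunePage(self.root),
            "home": lambda: HomePage(self.root, self.show_page),
        }

    def show_page(self, page_name):
        factory = self._factories.get(page_name)
        if factory is None:
            # Unlike the if/elif chain, an unknown name fails loudly here.
            raise ValueError(f"unknown page: {page_name!r}")
        self.current_page = factory()
```

Raising on an unknown name is a design choice of this sketch; the chain in the diff leaves `current_page` unchanged in that case.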


def main():
# Initialize root and components
Expand All @@ -66,9 +68,10 @@ def main():

# Initialize GUI
app = AssessAIGUI(root)
app.show_page("home")

# Start main loop
root.mainloop()

if __name__ == "__main__":
main()
main()