Add Knowledge Database to Kernel optimization #85
base: main
Conversation
Force-pushed from b9cb0d7 to 84708fd
Jack-Khuu left a comment:
Looks good. Is it hard to add the integration code using the RAG into this PR too?
- Remember to cite sources for the code_samples/docs
- Drop [Optimization 7/n] from the title just to avoid confusion
| """Adds a child node to the current node.""" | ||
| self.opt_parents.extend(parent_nodes) | ||
|
|
||
| def remove_parents(self, parent_nodes): |
Do we need this for any reason?
Good call - I've removed it in the following commit.
```python
        level_1_opts = [optnode_latency, optnode_memory, optnode_utilization]
        self.root.add_children(level_1_opts)
        optnode_latency.add_parents([self.root])
        optnode_memory.add_parents([self.root])
        optnode_utilization.add_parents([self.root])
```
nit: For legibility, can we add a helper like add_relation or something that updates the child + parent symmetrically? It's easy to parse here, but level 3 is harder to parse.
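A minimal sketch of such a helper (it reuses the add_children/add_parents methods shown in the diff; the name add_relation is illustrative):

```python
def add_relation(parent: "OptNode", children: list["OptNode"]) -> None:
    """Link a parent and its children symmetrically in one call."""
    parent.add_children(children)
    for child in children:
        child.add_parents([parent])
```

With this, the level-1 wiring above collapses to add_relation(self.root, level_1_opts).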
```python
# Default path
if database_path is None:
    database_path = (
```
database_path seems wrong; use Path(__file__).resolve().parents[...] until you hit the project root (where pyproject.toml is).
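A sketch of that lookup (the helper name find_project_root is illustrative, not part of the PR):

```python
from pathlib import Path

def find_project_root(start: Path) -> Path:
    """Walk upward from `start` until a directory containing pyproject.toml appears."""
    for parent in start.resolve().parents:
        if (parent / "pyproject.toml").exists():
            return parent
    raise FileNotFoundError(f"pyproject.toml not found above {start}")

# e.g. database_path = find_project_root(Path(__file__)) / "kernel_perf_agent" / "kernel_opt" / "database"
```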
```python
            return 0.0
        return dot_product / (norm_vec1 * norm_vec2)

    def retrieve(self, opt_prompt: str) -> tuple[OptNode | None, dict[OptNode, float]]:
```
retrieve() calls embeddings.embed_query(node.opt_desc) for every node on each call. Some nodes include full code examples, which makes this slow and costly.
- Precompute embeddings once at init and cache them per node, or at least cache an in-memory dict {OptNode: embedding} after the first compute (see the sketch below).
- Also consider embedding only the L1/L2 text nodes for retrieval, then traversing down for code examples; embedding code blobs is noisy and expensive.
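A minimal sketch of the in-memory fallback (assumes self.embeddings exposes embed_query as the PR already uses; the _embedding_cache attribute is illustrative):

```python
def _node_embedding(self, node: OptNode) -> list[float]:
    """Embed a node's description once, then reuse the cached vector."""
    cache = getattr(self, "_embedding_cache", None)
    if cache is None:
        cache = self._embedding_cache = {}
    if node not in cache:
        cache[node] = self.embeddings.embed_query(node.opt_desc)
    return cache[node]
```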
```python
        return best_node, opt_similarity

    def build_context(self, opt_node: OptNode) -> str:
```
It traverses from the selected node down and concatenates every descendant's opt_desc, including entire code files. That will quickly blow context limits and drown the signal. A few suggestions (see the sketch below):
- Put a max character/token budget in place and stop after N leaf examples.
- Add separators between nodes (right now it just concatenates).
- Optionally include only (a) the technique description and (b) the top-k leaf code examples.
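A minimal sketch of a budgeted version (the opt_children attribute name and the max_chars/max_leaves parameters are illustrative):

```python
def build_context(self, opt_node: OptNode, max_chars: int = 8000, max_leaves: int = 3) -> str:
    """Collect the technique description plus a bounded number of leaf examples."""
    parts = [opt_node.opt_desc]
    budget = max_chars - len(opt_node.opt_desc)
    leaves = 0
    queue = list(opt_node.opt_children)
    while queue and leaves < max_leaves and budget > 0:
        node = queue.pop(0)
        if node.opt_children:                 # inner node: keep walking down
            queue.extend(node.opt_children)
            continue
        snippet = node.opt_desc[:budget]      # leaf node: usually a code example
        parts.append(snippet)
        budget -= len(snippet)
        leaves += 1
    return "\n\n---\n\n".join(parts)          # explicit separators between nodes
```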
```python
from pathlib import Path

from kernel_perf_agent.kernel_opt.database.docs import (
```
I don’t see a kernel_perf_agent/kernel_opt/database/docs/__init__.py added in this PR.
Thanks for the catch, updated this in c57e13c.
pyproject.toml (outdated):

```toml
    "python-dotenv",
    "gradio>=5.5.0",
    "requests",
    "langchain-openai",
```
Adding langchain-openai is a big dependency. If you only need embeddings, consider using the project’s existing LLM client (if any) or a thinner dependency.
If you keep it, I’d suggest pinning compatible versions or adding it as an optional dependency for the RAG feature.
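If the optional-dependency route is taken, it might look like this (the group name rag and the version bounds are illustrative, not verified against the PR):

```toml
[project.optional-dependencies]
rag = [
    "langchain-openai>=0.2,<0.4",
]
```

Users would then opt in with pip install .[rag].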
Good point! We can use OpenAI's text-embedding model for simplicity.
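A minimal sketch of calling that model directly (assumes the openai package and an OPENAI_API_KEY in the environment; the model name is illustrative):

```python
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

def embed(text: str) -> list[float]:
    """Return the embedding vector for a single piece of text."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding
```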
This PR introduces a hierarchical optimization database that stores GPU kernel optimization techniques and code examples for RAG-based optimization.
Key components:
Optimization techniques covered:
This database enables the agent to retrieve relevant optimization strategies and reference implementations based on diagnosed performance bottlenecks.
Test