Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up-to-date status, view the checks section at the bottom of the pull request.
Autotune will not be run on CPU. I would suggest removing this endpoint in cpu_server.
I'd say let's keep cpu_server in case the TPU server is not available for customers, or for infra verification if they need it.
The current implementation will not fall back to the CPU backend: the default backend in autotune_kernel is TPU, and when it is used in the AutotuneRunner, no backend is specified, so it will still only use TPU. Good guidance for this kind of situation is go/tott/737.
OK, I added CPU fallback logic so that when there are no TPU resources, users can still use the CPU for infra verification.
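The fallback described above could look something like the following sketch. All names here (`tpu_available`, `select_backend`, `run_autotune`) are illustrative assumptions, not the actual APIs in this repository:

```python
# Hypothetical sketch of "prefer TPU, fall back to CPU" backend
# selection. Function and backend names are illustrative only.

def tpu_available() -> bool:
    """Stand-in for a real TPU availability probe."""
    return False  # assume no TPU resources in this example


def select_backend(preferred: str = "tpu") -> str:
    # Prefer TPU when resources exist; otherwise fall back to CPU so
    # infra verification still works without TPU.
    if preferred == "tpu" and tpu_available():
        return "tpu"
    return "cpu"


def run_autotune(kernel_name: str) -> str:
    backend = select_backend()
    # The real runner would dispatch to the chosen backend here.
    return f"autotuning {kernel_name} on {backend}"
```

With `tpu_available` returning `False`, `run_autotune("matmul")` would report the CPU backend, which matches the infra-verification use case mentioned above.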
I would suggest letting the eval_server handle the backend-selection logic if we want to use both. See examples in accelerator-agents/MaxKernel/hitl_agent/subagents/profiling/kernel_profile.py and accelerator-agents/MaxKernel/hitl_agent/subagents/kernel_writing/kernel_compilation.py.
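The suggestion above, centralizing backend selection in the eval server rather than in each tool, could be sketched as follows. The class and method names here are hypothetical, chosen only to illustrate the design; the real eval_server API may differ:

```python
# Hypothetical sketch: the eval server owns the backend policy, and
# callers (e.g. an autotune runner) just ask it which backend to use.

class EvalServer:
    def __init__(self, available_backends):
        # e.g. ["tpu", "cpu"] when TPU resources exist, ["cpu"] otherwise.
        self.available_backends = list(available_backends)

    def pick_backend(self, preferred: str = "tpu") -> str:
        # Single place where preference and fallback logic live.
        if preferred in self.available_backends:
            return preferred
        # Fall back to whatever is available (CPU in the degraded case).
        return self.available_backends[0]
```

The benefit is that tools like the autotune runner stay backend-agnostic: if the fallback policy changes, only the server needs updating.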
Add autotune agent
Add autotune tool
Integrate with the system