I realised that an ideal dataset for this would be one of those used for training LLMs, e.g. https://kili-technology.com/large-language-models-llms/9-open-sourced-datasets-for-training-large-language-models
A cool project would be to find the optimal layout for different spoken languages, as well as for the programming languages that are also included in LLM datasets.
It would be interesting to create an open source 'base layout' for writing, coding, and writing+coding, which could then be 'fine-tuned' using a person's own code and/or writing.
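To make the 'base layout' plus 'fine tuning' idea a bit more concrete, here is a rough sketch of how the underlying statistics might be combined. It assumes a layout optimiser that takes character-pair (bigram) frequencies as input, which is an assumption on my part and not something this project necessarily does. The names `bigram_frequencies`, `blend`, and `personal_weight` are hypothetical, and the two short strings stand in for a large LLM training corpus and a person's own files.

```python
from collections import Counter


def bigram_frequencies(text: str) -> dict[str, float]:
    """Return the relative frequency of each adjacent character pair in text."""
    text = text.lower()
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bigram: n / total for bigram, n in counts.items()}


def blend(base: dict[str, float],
          personal: dict[str, float],
          personal_weight: float = 0.3) -> dict[str, float]:
    """Mix a base profile with a personal one (hypothetical 'fine tuning' step).

    personal_weight is an illustrative knob: 0.0 keeps the base profile,
    1.0 uses only the personal profile.
    """
    keys = set(base) | set(personal)
    return {
        k: (1 - personal_weight) * base.get(k, 0.0)
        + personal_weight * personal.get(k, 0.0)
        for k in keys
    }


# Placeholder corpora: a real run would stream a large open dataset here.
base_profile = bigram_frequencies("the quick brown fox jumps over the lazy dog")
personal_profile = bigram_frequencies("def main(): return self.value")

profile = blend(base_profile, personal_profile, personal_weight=0.3)
```

The blended `profile` could then be passed to whatever scoring function evaluates a candidate layout. Separate base profiles for writing, coding, and writing+coding would just be three different corpora run through the same first step.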