Open
Labels
enhancement (New feature or request)
Description
Refactor: Reduce Complexity of sklearn_serializer.py
Summary
The file openmodels/serializers/sklearn/sklearn_serializer.py has grown quite large and complex, containing a mix of serialization logic, deserialization logic, type/dtype mapping, special-case handlers, and a large number of scikit-learn-specific workarounds. This makes the codebase harder to maintain, test, and extend.
Motivation
- Maintainability: The current file is long and carries many unrelated responsibilities, making it difficult to navigate and update.
- Testability: Isolating logic into smaller, focused modules or classes will make it easier to write targeted unit tests.
- Extensibility: Reducing complexity will make it easier to add support for new estimators, kernels, or serialization features in the future.
- Readability: A more modular structure will help new contributors understand and contribute to the codebase.
Suggested Refactoring Tasks
- Split the file into smaller modules: For example, move loss serialization, kernel serialization, tree serialization, and special-case handlers into their own files or classes.
- Group related helper functions: Consider grouping helpers (e.g., type/dtype mapping, attribute extraction) into utility modules.
- Reduce duplication: Identify and refactor repeated patterns (e.g., recursive serialization/deserialization) into reusable functions.
- Document module boundaries: Add docstrings and comments to clarify the responsibilities of each new module/class.
- Add or improve tests: Ensure that the refactored code is covered by unit tests, especially for edge cases and custom estimator support.
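One possible shape for the split is a small handler registry, so that each specialized module (losses, kernels, trees) registers its own serializer and the main file only dispatches. The sketch below is illustrative only: `register_handler`, `serialize_value`, and `FakeKernel` are hypothetical names, not existing openmodels or scikit-learn APIs, and the stand-in class is used so the example runs without scikit-learn installed.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: maps a Python type to a function that converts an
# instance of that type into a JSON-friendly value.
_HANDLERS: Dict[type, Callable[[Any], Any]] = {}


def register_handler(cls: type) -> Callable:
    """Decorator that registers a serializer for `cls` (assumed helper)."""
    def decorator(func: Callable[[Any], Any]) -> Callable[[Any], Any]:
        _HANDLERS[cls] = func
        return func
    return decorator


def serialize_value(value: Any) -> Any:
    """High-level orchestration: find a handler and delegate to it."""
    # Walk the MRO so a handler registered for a base class covers subclasses.
    for cls in type(value).__mro__:
        handler = _HANDLERS.get(cls)
        if handler is not None:
            return handler(value)
    # Plain JSON scalars pass through unchanged.
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    raise TypeError(f"No serializer registered for {type(value).__name__}")


# Generic containers: this single recursive path replaces repeated
# per-estimator recursion.
@register_handler(dict)
def _serialize_dict(value: dict) -> dict:
    return {str(k): serialize_value(v) for k, v in value.items()}


@register_handler(list)
@register_handler(tuple)
def _serialize_sequence(value) -> list:
    return [serialize_value(v) for v in value]


# In a real split this part would live in its own module, e.g. a kernels file.
class FakeKernel:
    """Stand-in for a scikit-learn kernel object (assumption for the demo)."""
    def __init__(self, length_scale: float) -> None:
        self.length_scale = length_scale


@register_handler(FakeKernel)
def _serialize_kernel(kernel: FakeKernel) -> dict:
    return {
        "type": type(kernel).__name__,
        "params": {"length_scale": serialize_value(kernel.length_scale)},
    }
```

With this layout, adding support for a new kind of object would mean adding one decorated function in the relevant module rather than another branch in the main serializer. Whether a registry, `functools.singledispatch`, or plain classes fits best depends on how the existing special cases are structured.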
Acceptance Criteria
- The main sklearn_serializer.py file should be significantly shorter and focused on high-level orchestration.
- Specialized logic (losses, kernels, trees, etc.) should be moved to dedicated modules or classes.
- All existing tests should pass, and new tests should be added for any newly isolated logic.
- The public API and behavior should remain unchanged.
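To check the "behavior should remain unchanged" criterion during the refactor, a round-trip helper along these lines could be reused across the new tests. This is a minimal sketch under stated assumptions: `check_round_trip`, `ToyScaler`, `toy_serialize`, and `toy_deserialize` are hypothetical names for illustration, and the real serializer entry points would be passed in instead of the toy functions.

```python
import json
from typing import Any, Callable, Iterable


def check_round_trip(
    serialize: Callable[[Any], Any],
    deserialize: Callable[[Any], Any],
    obj: Any,
    attributes: Iterable[str],
) -> Any:
    """Serialize, confirm the payload is JSON-encodable, deserialize, and
    compare the listed attributes. Returns the payload so a test can also
    compare it with a snapshot recorded before the refactor."""
    payload = serialize(obj)
    json.dumps(payload)  # raises TypeError if the payload is not JSON-friendly
    restored = deserialize(payload)
    for name in attributes:
        # Plain == is enough for scalars; array-valued attributes would need
        # an element-wise comparison instead.
        assert getattr(restored, name) == getattr(obj, name), name
    return payload


# Toy stand-ins so the sketch runs on its own (not real openmodels code).
class ToyScaler:
    def __init__(self, mean_: float = 0.0, scale_: float = 1.0) -> None:
        self.mean_ = mean_
        self.scale_ = scale_


def toy_serialize(model: ToyScaler) -> dict:
    return {
        "type": "ToyScaler",
        "params": {"mean_": model.mean_, "scale_": model.scale_},
    }


def toy_deserialize(payload: dict) -> ToyScaler:
    return ToyScaler(**payload["params"])


payload = check_round_trip(
    toy_serialize, toy_deserialize, ToyScaler(2.0, 0.5), ["mean_", "scale_"]
)
```

Recording the payloads produced by the current implementation before moving any code, and asserting that the refactored modules produce the same payloads, would give a direct regression check on top of the existing test suite.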
Related file: openmodels/serializers/sklearn/sklearn_serializer.py
Metadata
Projects
Status
Ready