feat: Keras3 migration#40
Conversation
- Installation moved further up, above the massive table - Quick start shows a quick example - Sklearn removed as a proper usage pattern - Reduced size by ~40% - Much more readable
- Remove Keras 2 version detection and TypeSpec support - Update dependencies: keras>=3.0.0, tensorflow>=2.16.0 - Add keras.core package with BaseLayer for multi-backend layers - Add keras.tensorflow package with TfBaseLayer for TF-specific layers - Port 5 MVP layers to multi-backend: identity, absolute_value, multiply, exp, log - Add 34 passing tests for portable layer infrastructure
Move 35 TensorFlow-specific layers from kamae.tensorflow.layers to kamae.keras.tensorflow.layers as part of Keras 3 multi-backend migration. These layers require TensorFlow backend and cannot be made portable: - 5 hash/encoding layers (BloomEncode, Bucketize, HashIndex, etc.) - 8 datetime layers (CurrentDate, DateParse, UnixTimestampToDateTime, etc.) - 7 list operations (ListMax, ListMean, ListMedian, etc.) - 14 string layers (StringConcatenate, StringIndex, StringContains, etc.) - 1 lambda layer (LambdaFunction) Also migrate TF-specific utilities to kamae.keras.tensorflow.utils: - date_utils.py: 18 datetime functions (unix_timestamp_to_datetime, etc.) - list_utils.py: 6 list operations (get_top_n, segmented_operation, etc.) - transform_utils.py: 4 map_fn functions - typing.py: TF-specific Tensor type (includes SparseTensor, RaggedTensor) All TensorFlow operations remain byte-identical to originals. Only changes: - Base class: BaseLayer → TfBaseLayer (adds require_tensorflow() check) - Import paths updated to new package structure - Input decorators now use portable keras.core.utils.input_utils Numeric layers (divide, subtract, sum, etc.) remain in old location to be properly ported to multi-backend in next commits.
Migrate divide, subtract, round, round_to_decimal, and modulo layers from kamae.tensorflow.layers to kamae.keras.core.layers. .
Changes:
- divide.py: Implemented divide_no_nan using ops.where to
handle division by zero (returns 0 instead of NaN/Inf)
- subtract.py: Direct port using ops.subtract
- round.py: Direct port using ops.ceil/floor/round
- round_to_decimal.py: Uses numpy.finfo/iinfo for dtype max values
instead of TF-specific tensor.dtype.max
- modulo.py: Port using ops.mod (equivalent to tf.math.floormod)
All layers:
- Use keras.ops instead of tf.math operations
- Import from keras.core.layers.base (BaseLayer)
- Use portable decorators from keras.core.utils.input_utils
- Use keras.saving.register_keras_serializable (not tf.keras.utils)
- Return string dtype names (not tf.dtypes.DType objects)
Migrate sum, max, min, mean, and exponent layers from kamae.tensorflow.layers to kamae.keras.core.layers. New layers: - SumLayer: Element-wise addition with addend constant or reduce multiple tensors - MaxLayer: Element-wise maximum with max_constant or reduce multiple tensors - MinLayer: Element-wise minimum with min_constant or reduce multiple tensors - MeanLayer: Element-wise mean with mean_constant or reduce multiple tensors - ExponentLayer: Raise tensor to power (x^exponent) Implementation: - sum.py: Uses ops.add with functools.reduce for multiple inputs - max.py: Uses ops.maximum with functools.reduce - min.py: Uses ops.minimum with functools.reduce - mean.py: Uses ops.add + ops.true_divide(result, len(inputs)) - exponent.py: Uses ops.power for x^y operation All layers follow portable patterns: - keras.ops instead of tf.math operations - keras.core.layers.base.BaseLayer as parent - keras.core.utils.input_utils decorators - keras.saving.register_keras_serializable - String dtype names (not tf.dtypes.DType objects)
Migrate logical_and, logical_or, and logical_not layers from kamae.tensorflow.layers to kamae.keras.core.layers. These layers are now backend-agnostic and work with TensorFlow, JAX, and PyTorch. New layers: - LogicalAndLayer: Element-wise AND operation on multiple boolean tensors - LogicalOrLayer: Element-wise OR operation on multiple boolean tensors - LogicalNotLayer: Element-wise NOT operation on a single boolean tensor Implementation: - logical_and.py: Uses ops.logical_and with functools.reduce - logical_or.py: Uses ops.logical_or with functools.reduce - logical_not.py: Uses ops.logical_not for single tensor All layers: - Only support "bool" dtype - Use enforce_multiple_tensor_input (and/or) or enforce_single_tensor_input (not) - Use keras.ops instead of tf.math operations - Follow portable layer patterns
Migrate numerical_if_statement to kamae.keras.core.layers (portable) and if_statement to kamae.keras.tensorflow.layers (TF-only). Decision rationale: - NumericalIfStatementLayer: Numeric-only, fully portable - IfStatementLayer: Supports strings, requires TensorFlow backend NumericalIfStatementLayer (portable): - Conditional element-wise selection for numeric tensors only - Uses ops.where for conditional selection - Uses Python's operator module via get_condition_operator - Replaced tf.constant with ops.convert_to_tensor - Only supports numeric dtypes: bfloat16, float16, float32, float64 - Removed deprecation TODO (serves different purpose than IfStatementLayer) - Works on TensorFlow, JAX, and PyTorch IfStatementLayer (TF-only): - Conditional element-wise selection for any dtype including strings - Supports string comparisons (eq, neq) and numeric comparisons (all operators) - Inherits from TfBaseLayer with updated imports - Keeps all TensorFlow operations (tf.where, tf.constant, dtype checks) - Requires TensorFlow backend for string operations Both layers support: - Constants or tensor inputs for value_to_compare, result_if_true, result_if_false - Six comparison operators: eq, neq, lt, leq, gt, geq - Dynamic input construction pattern
- Some layers can accept any type. These will be created as multi-backend layers but must fail for string inputs if the backend is not tensorflow
Migrate 4 array operation layers from kamae.tensorflow.layers to portable kamae.keras.core.layers with backend-agnostic operations. ArrayConcatenateLayer (portable): - Concatenates multiple input tensors along specified axis - Supports auto_broadcast feature to match tensor ranks before concatenation - Uses ops.concatenate, ops.shape, ops.broadcast_to, ops.stack, ops.max - compatible_dtypes = None (accepts any backend-supported dtype) - Key change: tf.reduce_max(list) → ops.max(ops.stack(list)) ArraySplitLayer (portable): - Splits single tensor into list of tensors along specified axis - Expands dimensions to preserve shape consistency - Uses ops.unstack, ops.expand_dims - compatible_dtypes = None (accepts any backend-supported dtype) - Direct 1:1 operation replacement ArrayCropLayer (portable): - Crops or pads tensor final dimension to fixed length - Uses ops.minimum, ops.maximum, ops.pad, ops.reshape - compatible_dtypes = None (accepts any backend-supported dtype) - Key changes: * inputs_shape.shape[0] → len(inputs.shape) for rank calculation * Added static vs dynamic shape handling for efficiency * Build reshape target using mix of static/dynamic dimensions ArraySubtractMinimumLayer (portable): - Computes difference from minimum non-padded value along axis - Supports optional pad_value to exclude from minimum calculation - Uses ops.min, ops.subtract, ops.expand_dims, ops.where, ops.equal - compatible_dtypes = explicit numeric list - Key change: inputs.dtype.max → numpy.finfo/iinfo portable introspection Supporting changes: Created portable shape_utils.py: - New module: kamae/keras/core/utils/shape_utils.py - Added reshape_to_equal_rank() function as portable equivalent - Uses ops.concatenate, ops.shape, ops.ones, ops.reshape All changes are mechanical API replacements: - tensorflow as tf → keras, from keras import ops - @tf.keras.utils.register_keras_serializable → @keras.saving.register_keras_serializable - kamae.tensorflow.* → kamae.keras.core.* - tf.operation → ops.operation - List[tf.dtypes.DType] → List[str] - Zero algorithmic changes, only API-level conversions
Migrate 5 scaling/normalization layers from kamae.tensorflow.layers to kamae.keras.core.layers with multi-backend support (TensorFlow, JAX, PyTorch). StandardScaleLayer (multi-backend): - Performs standard scaling: (x - mean) / sqrt(variance) - Supports optional mask_value to preserve certain values unchanged - Uses ops.subtract, ops.sqrt, ops.maximum, ops.where for multi-backend divide_no_nan - Inherits from multi-backend NormalizeLayer base class - compatible_dtypes = ["bfloat16", "float16", "float32", "float64"] - Key change: Implemented divide_no_nan using ops.where to handle zero division ConditionalStandardScaleLayer (multi-backend): - Performs standard scaling with conditional masking - Supports skip_zeros parameter to leave zero values unchanged - Supports epsilon parameter for zero comparison tolerance - Uses ops.subtract, ops.sqrt, ops.maximum, ops.where, ops.abs, ops.less_equal - Inherits from multi-backend NormalizeLayer base class - compatible_dtypes = ["bfloat16", "float16", "float32", "float64"] - Key change: Multi-backend divide_no_nan + conditional zero masking MinMaxScaleLayer (multi-backend): - Performs min-max scaling: (x - min) / (max - min) - Scales values to range [0, 1] - Supports optional mask_value to preserve certain values - Uses ops.subtract, ops.where for multi-backend divide_no_nan - Inherits from multi-backend BaseLayer with axis-aware broadcasting - compatible_dtypes = ["bfloat16", "float16", "float32", "float64"] - Key change: Simplified build() using ops.reshape and list-based shape handling ImputeLayer (multi-backend): - Replaces mask_value with impute_value in input tensor - Supports both numeric and non-numeric dtypes (strings, etc.) - Uses ops.equal, ops.where for conditional replacement - compatible_dtypes = None (accepts any backend-supported dtype) - Key changes: * inputs.dtype.is_floating → simplified string-based checking * tf.constant → ops.convert_to_tensor for constants BinLayer (multi-backend): - Performs binning operation based on condition operators - Evaluates conditions sequentially, returns first matching bin label - Uses Python's operator module via get_condition_operator - Uses ops.where, ops.convert_to_tensor - compatible_dtypes = all numeric types (bfloat16, floats, ints, uints) - Key change: tf.constant → ops.convert_to_tensor for label/default values Supporting infrastructure: Created multi-backend NormalizeLayer base: - New module: kamae/keras/core/utils/normalize_layer.py - Base class for StandardScaleLayer and ConditionalStandardScaleLayer - Handles axis-aware mean/variance broadcasting in build() method - Uses ops.reshape instead of tf.reshape - Implements get_build_config/build_from_config for serialization - Key changes: * tf.TensorShape handling → list-based shape manipulation * Removed complex multi-input shape handling (unnecessary with decorators) * Simplified build() method Created multi-backend tensor utilities: - New module: kamae/keras/core/utils/tensor_utils.py - Added listify_tensors() function for config serialization - Uses hasattr(x, 'numpy') for backend-agnostic tensor detection - Works across TensorFlow, JAX, PyTorch backends Dtype checking simplifications: - Simplified numeric dtype checks in BaseLayer and ImputeLayer - "float" in dtype catches both float* and bfloat* types - "int" in dtype catches both int* and uint* types - Removed redundant "bfloat" and "uint" substring checks All changes are mechanical API replacements: - tensorflow as tf → keras, from keras import ops - @tf.keras.utils.register_keras_serializable → @keras.saving.register_keras_serializable - tf.math.divide_no_nan → ops.where-based implementation - tf.math.subtract → ops.subtract - tf.math.maximum → ops.maximum - tf.sqrt → ops.sqrt - tf.equal → ops.equal - tf.abs → ops.abs - tf.constant → ops.convert_to_tensor - tf.reshape → ops.reshape - tf.TensorShape().as_list() → list(input_shape) - inputs.dtype.name → keras.backend.standardize_dtype(inputs.dtype) - inputs.dtype.is_floating → "float" in dtype string - inputs.dtype.is_integer → "int" in dtype string - x <= y → ops.less_equal(x, y) - Zero algorithmic changes, only API-level conversions
Migrate final 3 geometry layers from kamae.tensorflow.layers to
kamae.keras.core.layers with multi-backend support (TensorFlow, JAX, PyTorch).
BearingAngleLayer (multi-backend):
- Computes bearing angle between two lat/lon coordinate pairs
- Supports optional lat_lon_constant for fixed destination
- Uses ops.sin, ops.cos, ops.arctan2, ops.mod for trig calculations
- Implements get_radians/get_degrees helpers with float64 precision
- compatible_dtypes = ["bfloat16", "float16", "float32", "float64"]
- Key changes:
* tf.math.atan2 → ops.arctan2
* tf.math.sin/cos/mod → ops.sin/cos/mod
* tf.constant → ops.convert_to_tensor
* tf.cast → ops.cast
CosineSimilarityLayer (multi-backend):
- Computes cosine similarity between two input tensors
- Supports axis and keepdims parameters
- Implements custom l2_normalize() using ops.sqrt, ops.sum, ops.square
- Uses ops.multiply, ops.sum for dot product calculation
- compatible_dtypes = float types + ["complex64", "complex128"]
- Key changes:
* tf.nn.l2_normalize → custom implementation (not in keras.ops)
* Custom l2_normalize: x / sqrt(max(sum(x^2), 1e-12))
* tf.reduce_sum → ops.sum
* tf.multiply → ops.multiply
HaversineDistanceLayer (multi-backend):
- Computes haversine distance between two lat/lon coordinate pairs
- Supports optional lat_lon_constant for fixed destination
- Supports unit parameter ('km' or 'miles')
- Uses ops.sin, ops.cos, ops.arcsin, ops.power for haversine formula
- Implements get_radians helper with float64 precision
- compatible_dtypes = ["bfloat16", "float16", "float32", "float64"]
- Key changes:
* tf.math.sin/cos → ops.sin/cos
* tf.math.asin → ops.arcsin
* tf.math.pow → ops.power
* pow(a, 0.5) → ops.power(a, 0.5)
* tf.constant → ops.convert_to_tensor
* tf.cast → ops.cast
All changes are mechanical API replacements:
- tensorflow as tf → keras, from keras import ops
- @tf.keras.utils.register_keras_serializable → @keras.saving.register_keras_serializable
- tf.math.atan2 → ops.arctan2
- tf.math.asin → ops.arcsin
- tf.math.sin → ops.sin
- tf.math.cos → ops.cos
- tf.math.mod → ops.mod
- tf.math.pow → ops.power
- tf.nn.l2_normalize → custom ops-based implementation
- tf.reduce_sum → ops.sum
- tf.multiply → ops.multiply
- tf.constant → ops.convert_to_tensor
- tf.cast → ops.cast
- Zero algorithmic changes, only API-level conversions
Refactor all 31 multi-backend layers for better consistency, maintainability,
and correctness with zero functional changes.
Changes:
1. Terminology update (9 files):
- Changed "portable" → "multi-backend" throughout codebase
- Updated module docstrings in base.py, normalize_layer.py, shape_utils.py
- Updated utility docstrings in input_utils.py, tensor_utils.py, typing.py
- Updated layer docstrings and comments in divide.py, standard_scale.py,
conditional_standard_scale.py, min_max_scale.py, __init__.py
2. Extract divide_no_nan utility (NEW FILE + 4 files):
- Created src/kamae/keras/core/utils/ops_utils.py
- Added divide_no_nan(x, y) function for multi-backend safe division
- Replaced duplicate implementations in:
* DivideLayer: removed 14-line _divide_no_nan method
* StandardScaleLayer: replaced 8-line inline pattern
* ConditionalStandardScaleLayer: replaced 8-line inline pattern
* MinMaxScaleLayer: replaced 5-line inline pattern
- Eliminated ~35 lines of duplicated code
- Single source of truth for divide-by-zero handling
3. Fix serialization bug (1 file):
- MinMaxScaleLayer.get_config() now includes mask_value parameter
- Ensures proper layer serialization/deserialization
4. Standardize import ordering (4 files):
- All files now follow: stdlib → third-party → local
- Updated divide.py, standard_scale.py, conditional_standard_scale.py,
min_max_scale.py to import from ops_utils
Testing:
- All 31 layers import and function correctly
- Verified divide_no_nan utility works on TensorFlow backend
- Verified MinMaxScaleLayer serialization includes mask_value
- Zero "portable" references remain (confirmed via grep)
- Zero inline divide_no_nan patterns remain
Impact:
- Code quality: DRY principle, single source of truth
- Maintainability: centralized divide-by-zero logic
- Correctness: fixed MinMaxScaleLayer serialization bug
- Consistency: unified terminology and import style
- Backward compatibility: 100% - no API or functional changes
- Keras requires minimum of python 3.9
- Use new layers
- Single class for all multi-backend and tf specific - Minor fix on array_concat - Add back keras quirk for saving build shapes to norm layers
- String isin had incorrect dtype in tests - Improved imports and serialisation
- We planned to remove it for this release, we do it here as its complicating our testing and planning
- Removes kamae.tensorflow entirely - Moves tests to mirror src
|
Converted to draft as we will open it against |
- User can check the supported backends before construction - Base classes check for backend when __init__ called
- Apply hash indexer null behaviour (PR #41): reserve index 0 for nulls, num_bins > 1 validation, +1 offset in hash_index, min_hash_index, bloom_encode - Apply allow layer/output names equal (PR #42): remove IdentityLayer wrapping from pipeline_graph, skip self-loops in graph edges - Migrate ArrayReduceMax and PairwiseCosineSimilarity as multi-backend layers using keras.ops (PR #45) - Fix dtype bug in divide_no_nan (x.dtype -> y.dtype)
Integrates PRs #41 (hash indexer null behaviour), #42 (layer/output name equality), and #45 (ArrayReduceMax + PairwiseCosineSimilarity). New layers from PR #45 are already migrated to multi-backend at src/kamae/keras/core/layers/ so the TF-only versions added by release are removed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Move tf layers to use string dtypes to match the base layer - base.py now does not rely on inputs.dtype.name as this was tf specifc
- Fix serialization decorators in string_affix/concatenate layers (use package= kwarg) - Replace tf.float32 with "float32" string in base layer string casting - Remove redundant Union wrapper in pipeline_model type annotations - Add validate_backend() helper to eliminate duplication in BaseLayer/SparkOperation - Add discovery API for finding backend/JIT-compatible layers and transformers - Clarify README: TensorFlow is required, multi-backend is for numeric ops only
|
|
||
| - [ ] Use `kamae.keras.core.layers` for new numeric operations (multi-backend) | ||
| - [ ] Use `kamae.keras.tensorflow.layers` for string/datetime operations (TF-only) | ||
| - [ ] Import from `kamae.keras.core.base.BaseLayer` (not `tensorflow.layers.base`) |
There was a problem hiding this comment.
This is also a lie, we never used tensorflow.layers.base but had our own BaseLayer previously
| return dataset.withColumn(self.getOutputCol(), output_col) | ||
|
|
||
| def get_tf_layer(self) -> tf.keras.layers.Layer: | ||
| def get_keras_layer(self) -> tf.keras.layers.Layer: |
There was a problem hiding this comment.
Do we still want to be typehinting with TF here? I know it's a TF only layer but feels like an odd pattern
There was a problem hiding this comment.
Applies to all places where this happens
There was a problem hiding this comment.
I think is mainly a style/design consideration. E.g. do we:
- Keep tf only layers using only tf imports (and thus typing), even tho tf.keras.layers.Layer is now just an alias to the keras one (same for the serialisation decorators for example)
- Or do we mix keras ops/decorators/type hints into tf only layers where there is a valid keras 3 op. Might cause some confusion, but would give us a clearer understand which parts are not possible within keras 3
I opted for 1 to reduce git diff, we could do 2 in a later PR if we feel its beneficial.
| # limitations under the License. | ||
|
|
||
| """ | ||
| Discovery utilities for finding backend-compatible layers and transformers. |
There was a problem hiding this comment.
Do we make use of these utils anywhere? Also not sure I like the discovery naming convention, bit vague
There was a problem hiding this comment.
I added it from a request from Conor, I think it would be useful in the shared libraries to immediately find multi-backend / jit compatible transformers/layers. It's not directly used anywhere in kamae
Keras 3 Migration: Multi-Backend Support
Disclaimer: Claude completed this migration.
Overview
Migrates Kamae from Keras 2 (tf.keras) to Keras 3, enabling multi-backend support (TensorFlow, JAX, PyTorch).
Breaking Changes: Yes (v3.0.0 release)
Test Coverage: 100% maintained (all tests pass)
Key Changes
1. Package Restructure
kamae/tensorflow/→ moved tokeras/{core,tensorflow}kamae/sklearn/→ removed (was experimental)2. Backend-Agnostic API Naming
get_tf_layer()get_keras_layer()getInputTFDtype()getInputKerasDtype()getOutputTFDtype()getOutputKerasDtype()tf_input_schemainput_schema3. Other Changes
dtype.tf_dtype→dtype.keras_dtype(returns string, not TF object).keras(removed version detection code)from kamae.tensorflow.layers→from kamae.keras.core.layersBreaking Changes
See keras3_migration.md for complete migration guide.
Impact
Verification
✅ All tests pass (transformer tests, pipeline tests, graph tests)
✅ Example pipeline runs successfully
✅ No old naming patterns remain in codebase
✅ Spark pipelines continue to work identically
✅ TensorFlow Serving compatibility maintained