[AIROCMLIR-551] Completely support as_underlying_shape and as_logical_shape #2265
Conversation

Also, note that the code structure is really similar to the one found in the tosa pipeline, if that makes review easier.
// CHECK-DAG: %[[expanded:.*]] = tensor.expand_shape %[[arg1]]
// CHECK-DAG: %[[expanded_0:.*]] = tensor.expand_shape %[[arg0]]
// CHECK-DAG: linalg.sub ins(%[[expanded_0]], %[[expanded]] {{.*}})
// CHECK-DAG: linalg.sub ins(%[[arg0]], %[[arg1]] {{.*}})
This actually improves the previous IR. Notice that the input shape !migraphx.shaped<16xf32, 1> gets converted into tensor<16xf32>, which is the memory-layout shape. In that case, we don't emit tensor.expand_shape.
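A rough sketch of why the expand_shape disappears (plain Python, not the rocMLIR implementation; the helper names are invented for illustration): when a value's logical shape already matches its memory-layout shape, the reshape is the identity and can simply be skipped.

```python
# Hypothetical helpers, not part of rocMLIR: derive the memory-layout
# shape of a strided tensor and check whether it differs from the
# logical shape (only then would a tensor.expand_shape be needed).

def memory_layout_shape(shape, strides):
    """Reorder dims by decreasing stride, i.e. the order in which the
    data is actually laid out in memory."""
    order = sorted(range(len(shape)), key=lambda i: strides[i], reverse=True)
    return [shape[i] for i in order]

def needs_reshape(shape, strides):
    return memory_layout_shape(shape, strides) != list(shape)

# !migraphx.shaped<16xf32, 1>: logical and memory-layout shapes coincide,
# so the lowering can feed %arg0 / %arg1 to linalg.sub directly.
print(needs_reshape([16], [1]))    # False
# A transposed (column-major) 2x3 layout, by contrast, would need one.
print(needs_reshape([2, 3], [1, 2]))    # True
```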
Before:
func.func @func_sub(%arg0: tensor<16xf32>, %arg1: tensor<16xf32>) -> tensor<16xf32> {
%expanded = tensor.expand_shape %arg1 [[0]] output_shape [16] : tensor<16xf32> into tensor<16xf32>
%expanded_0 = tensor.expand_shape %arg0 [[0]] output_shape [16] : tensor<16xf32> into tensor<16xf32>
%0 = tensor.empty() : tensor<16xf32>
%1 = linalg.sub ins(%expanded_0, %expanded : tensor<16xf32>, tensor<16xf32>) outs(%0 : tensor<16xf32>) -> tensor<16xf32>
%collapsed = tensor.collapse_shape %1 [[0]] : tensor<16xf32> into tensor<16xf32>
return %collapsed : tensor<16xf32>
}

After:
func.func @func_sub(%arg0: tensor<16xf32>, %arg1: tensor<16xf32>) -> tensor<16xf32> {
%0 = tensor.empty() : tensor<16xf32>
%1 = linalg.sub ins(%arg0, %arg1 : tensor<16xf32>, tensor<16xf32>) outs(%0 : tensor<16xf32>) -> tensor<16xf32>
return %1 : tensor<16xf32>
}

ConversionPatternRewriter &rewriter) const {
/// The linalg-to-rock pass emits the following expression
/// for expanding the strides. We are matching the following IR
/// clang-format off
Just curious, but why do we need to turn clang-format off here? It doesn't look like the comment is too long...
I think in this case it is ok. Usually, two lines of MLIR get merged into one line if you don't turn off clang-format.
/// %empty = tensor.empty() : ....
/// %inserted_slice = tensor.insert_slice %actual_data into %empty ...
Is the expand-strides case the only one that will produce an empty tensor + insert_slice? This seems quite broad, and potentially susceptible to pattern-matching unwanted scenarios.
This is indeed true, but my understanding is that empty tensor + insert_slice is the same as a rock.expand_strides. rock.expand_strides is the same as extract_slice (of the result) + memcpy, which I think has the same semantics as the above.
I think we have to emit an attribute during migraphx-to-linalg if we only want some cases of tensor.insert_slice to be lowered into rock.
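A toy model of the equivalence being discussed (plain Python; the buffer model and function name are invented for illustration): inserting data as a contiguous slice of a fresh, otherwise-undefined buffer and then extracting that slice back round-trips the data, which matches the extract_slice + memcpy reading of rock.expand_strides above.

```python
def insert_into_empty(data, out_len, offset):
    """Model tensor.empty() + tensor.insert_slice: a fresh buffer whose
    unwritten elements stay undefined (None here)."""
    out = [None] * out_len
    out[offset:offset + len(data)] = data
    return out

buf = insert_into_empty([1, 2, 3], 8, 2)
# extract_slice of the result recovers the original data unchanged
print(buf[2:5])    # [1, 2, 3]
```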
#include "mlir/Dialect/Linalg/IR/Linalg.h"
#include "mlir/Dialect/Tensor/IR/Tensor.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/Statistic.h"
Is this header file still needed?
@@ -0,0 +1,64 @@
// RUN: rocmlir-opt -split-input-file --migraphx-to-linalg -verify-diagnostics %s | FileCheck %s

// testcase taken from migraphx-to-tosa-non-contiguous-strides.mlir
nit: no need to mention this here as this comment will no longer make sense once the Tosa path is removed. This probably makes more sense to have somewhere in the PR description.
RankedTensorType transposedType =
    dyn_cast<RankedTensorType>(result.getType());
if (transposedType.getShape() != ArrayRef(slicingShape)) {
  SmallVector<int64_t, 4> starts(permutation.size(), 0);
Nit: Can we remove this unused variable?
weird, as there are no compiler warnings.
}

RankedTensorType transposedType =
    dyn_cast<RankedTensorType>(result.getType());
We should have a null check after using dyn_cast in case we get a null value.
if (isPermutationStandardForm(permutation)) {
  // reshape into memory layout type
  RankedTensorType memoryType = inType.asMemoryLayoutTensor();
  if (result.getType() != inType.asMemoryLayoutTensor()) {
Suggested change:
- if (result.getType() != inType.asMemoryLayoutTensor()) {
+ if (result.getType() != memoryType) {
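For context, a minimal sketch of the check guarding this branch, assuming "standard form" means the identity permutation (an assumption for illustration; the patch's actual definition of isPermutationStandardForm is not shown in this hunk):

```python
# Assumed semantics: a permutation is in standard form when it is the
# identity, so the transpose is a no-op and the value can be reshaped
# straight into its memory-layout type.
def is_permutation_standard_form(perm):
    return list(perm) == list(range(len(perm)))

print(is_permutation_standard_form([0, 1, 2]))    # True
print(is_permutation_standard_form([0, 2, 1]))    # False
```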
migraphx::MIXRShapedType resultType = op.getResult().getType();
auto resultTensorType =
    cast<RankedTensorType>(getTypeConverter()->convertType(resultType));
RankedTensorType inTensorType = dyn_cast<RankedTensorType>(in.getType());
Same here: dyn_cast with no nullptr check.
Motivation

Support transposed memory layouts and long strides in both as_underlying_shape and as_logical_shape. Furthermore, broadcasting is supported in as_underlying_shape.

Technical Details
This PR essentially implements these two commits: #2198, and this one.
as_logical_shape is implemented as follows:
as_underlying_shape is implemented as follows:
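As a hedged illustration of the shape arithmetic involved (plain Python with an invented helper, not the MLIR lowering itself): the underlying shape reorders dimensions into memory order by decreasing stride, which handles transposed layouts, and drops broadcast dimensions (stride 0), while the logical view keeps the declared shape.

```python
def underlying_shape(shape, strides):
    """Sketch of the shape computation behind as_underlying_shape: drop
    broadcast dims (stride 0) and sort the rest into memory order."""
    dims = [(st, sz) for st, sz in zip(strides, shape) if st != 0]
    dims.sort(key=lambda p: p[0], reverse=True)  # largest stride first
    return [sz for _, sz in dims]

# transposed layout: a logical 2x3 stored column-major
print(underlying_shape([2, 3], [1, 2]))    # [3, 2]
# a broadcast dim (stride 0) has no backing elements in the buffer
print(underlying_shape([4, 5], [0, 1]))    # [5]
```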
Test Plan
Added e2e and lit tests.
Test Result
Passed both e2e and lit tests.
Submission Checklist