This repository was archived by the owner on Nov 16, 2023. It is now read-only.

Onnx export of ColumnSelector doesn't drop input columns #478

@antoniovs1029

This NimbusML issue is essentially the same as the one I filed against ML.NET, dotnet/machinelearning#4970; I am filing it here as well just for the record.

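Notice that the OnnxRunner correctly drops "case2" but keeps every other input column instead of returning only "age". Also note that the print labels in the script below appear to be swapped: the block printed under "ORT RESULT" comes from NimbusML's OnnxRunner, and the block printed under "ONNX RUNNER RESULT" comes from DataFrameTool.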

Code

import numpy
from nimbusml import Pipeline, FileDataStream
from nimbusml.datasets import get_dataset
from nimbusml.preprocessing import OnnxRunner
from nimbusml.preprocessing.schema import ColumnSelector, ColumnDuplicator
from data_frame_tool import DataFrameTool as DFT

path = get_dataset('infert').as_filepath()
dataset = FileDataStream.read_csv(path, sep=',',
                               numeric_dtype=numpy.float32,
                               names={0: 'row_num', 5: 'case'})

dataset = dataset.to_df()

pipeline = Pipeline([
    ColumnDuplicator(columns={'case2': 'case'}),
    ColumnSelector(columns=['age']),
])


print("\n\nML.NET RESULT")
result_expected = pipeline.fit_transform(dataset)
print(result_expected)

print("\n\nORT RESULT")
onnx_path = "C:\\Users\\anvelazq\\Desktop\\is25colsel\\colsel.onnx"
pipeline.export_to_onnx(onnx_path, 'com.microsoft.ml')
onnxrunner = OnnxRunner(model_file=onnx_path)
result_onnx = onnxrunner.fit_transform(dataset)
print(result_onnx)

print("\n\nONNX RUNNER RESULT")
df_tool = DFT(onnx_path)
result_ort = df_tool.execute(dataset, [])
print(result_ort)

Output

ML.NET RESULT
      age
0    26.0
1    42.0
2    39.0
3    34.0
4    35.0
..    ...
243  31.0
244  34.0
245  35.0
246  29.0
247  23.0

[248 rows x 1 columns]


ORT RESULT
     row_num education   age  parity  induced  case  spontaneous  stratum  pooled.stratum
0        1.0    0-5yrs  26.0     6.0      1.0   1.0          2.0      1.0             3.0
1        2.0    0-5yrs  42.0     1.0      1.0   1.0          0.0      2.0             1.0
2        3.0    0-5yrs  39.0     6.0      2.0   1.0          0.0      3.0             4.0
3        4.0    0-5yrs  34.0     4.0      2.0   1.0          0.0      4.0             2.0
4        5.0   6-11yrs  35.0     3.0      1.0   1.0          1.0      5.0            32.0
..       ...       ...   ...     ...      ...   ...          ...      ...             ...
243    244.0   12+ yrs  31.0     1.0      0.0   0.0          1.0     79.0            45.0
244    245.0   12+ yrs  34.0     1.0      0.0   0.0          0.0     80.0            47.0
245    246.0   12+ yrs  35.0     2.0      2.0   0.0          0.0     81.0            54.0
246    247.0   12+ yrs  29.0     1.0      0.0   0.0          1.0     82.0            43.0
247    248.0   12+ yrs  23.0     1.0      0.0   0.0          1.0     83.0            40.0

[248 rows x 9 columns]


ONNX RUNNER RESULT
     age.output
0          26.0
1          42.0
2          39.0
3          34.0
4          35.0
..          ...
243        31.0
244        34.0
245        35.0
246        29.0
247        23.0

[248 rows x 1 columns]
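
To make the discrepancy explicit, here is a minimal sketch that lists the columns which survive in the OnnxRunner output but are absent from the ML.NET output. The helper name `leaked_columns` is hypothetical (it is not part of NimbusML), and the column names are simply copied from the outputs above; with the repro script one could presumably pass `result_expected.columns` and `result_onnx.columns` instead.

```python
def leaked_columns(expected_cols, actual_cols):
    """Return the columns in actual_cols that are absent from expected_cols, keeping order.

    Hypothetical helper for illustration only; not a NimbusML API.
    """
    expected = set(expected_cols)
    return [c for c in actual_cols if c not in expected]


# Column names copied from the printed outputs above.
mlnet_cols = ["age"]
onnxrunner_cols = ["row_num", "education", "age", "parity", "induced",
                   "case", "spontaneous", "stratum", "pooled.stratum"]

# Everything except "age" should have been dropped by ColumnSelector.
leaked = leaked_columns(mlnet_cols, onnxrunner_cols)
print(leaked)
```

An empty list would mean the exported model drops the same columns as the original pipeline.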
