diff --git a/sample/sagemaker/2017-07-24/service-2.json b/sample/sagemaker/2017-07-24/service-2.json index cc75fd2..a7640e3 100644 --- a/sample/sagemaker/2017-07-24/service-2.json +++ b/sample/sagemaker/2017-07-24/service-2.json @@ -130,6 +130,50 @@ ], "documentation":"
Replaces specific nodes within a SageMaker HyperPod cluster with new hardware. BatchReplaceClusterNodes terminates the specified instances and provisions new replacement instances with the same configuration but fresh hardware. The Amazon Machine Image (AMI) and instance configuration remain the same.
This operation is useful for recovering from hardware failures or persistent issues that cannot be resolved through a reboot.
Data Loss Warning: Replacing nodes destroys all instance volumes, including both root and secondary volumes. All data stored on these volumes will be permanently lost and cannot be recovered.
To safeguard your work, back up your data to Amazon S3 or an FSx for Lustre file system before invoking the API on a worker node group. This will help prevent any potential data loss from the instance root volume. For more information about backup, see Use the backup script provided by SageMaker HyperPod.
If you want to invoke this API on an existing cluster, you'll first need to patch the cluster by running the UpdateClusterSoftware API. For more information about patching a cluster, see Update the SageMaker HyperPod platform software of a cluster.
You can replace up to 25 nodes in a single request.
" }, + "CreateAIBenchmarkJob":{ + "name":"CreateAIBenchmarkJob", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"CreateAIBenchmarkJobRequest"}, + "output":{"shape":"CreateAIBenchmarkJobResponse"}, + "errors":[ + {"shape":"ResourceNotFound"}, + {"shape":"ResourceInUse"}, + {"shape":"ResourceLimitExceeded"} + ], + "documentation":"Creates a benchmark job that runs performance benchmarks against inference infrastructure using a predefined AI workload configuration. The benchmark job measures metrics such as latency, throughput, and cost for your generative AI inference endpoints.
" + }, + "CreateAIRecommendationJob":{ + "name":"CreateAIRecommendationJob", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"CreateAIRecommendationJobRequest"}, + "output":{"shape":"CreateAIRecommendationJobResponse"}, + "errors":[ + {"shape":"ResourceNotFound"}, + {"shape":"ResourceInUse"}, + {"shape":"ResourceLimitExceeded"} + ], + "documentation":"Creates a recommendation job that generates intelligent optimization recommendations for generative AI inference deployments. The job analyzes your model, workload configuration, and performance targets to recommend optimal instance types, model optimization techniques (such as quantization and speculative decoding), and deployment configurations.
" + }, + "CreateAIWorkloadConfig":{ + "name":"CreateAIWorkloadConfig", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"CreateAIWorkloadConfigRequest"}, + "output":{"shape":"CreateAIWorkloadConfigResponse"}, + "errors":[ + {"shape":"ResourceInUse"}, + {"shape":"ResourceLimitExceeded"} + ], + "documentation":"Creates a reusable AI workload configuration that defines datasets, data sources, and benchmark tool settings for consistent performance testing of generative AI inference deployments on Amazon SageMaker AI.
" + }, "CreateAction":{ "name":"CreateAction", "http":{ @@ -1043,6 +1087,46 @@ ], "documentation":"Creates a new work team for labeling your data. A work team is defined by one or more Amazon Cognito user pools. You must first create the user pools before you can create a work team.
You cannot create more than 25 work teams in an account and region.
" }, + "DeleteAIBenchmarkJob":{ + "name":"DeleteAIBenchmarkJob", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"DeleteAIBenchmarkJobRequest"}, + "output":{"shape":"DeleteAIBenchmarkJobResponse"}, + "errors":[ + {"shape":"ResourceNotFound"} + ], + "documentation":"Deletes the specified AI benchmark job.
" + }, + "DeleteAIRecommendationJob":{ + "name":"DeleteAIRecommendationJob", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"DeleteAIRecommendationJobRequest"}, + "output":{"shape":"DeleteAIRecommendationJobResponse"}, + "errors":[ + {"shape":"ResourceNotFound"} + ], + "documentation":"Deletes the specified AI recommendation job.
" + }, + "DeleteAIWorkloadConfig":{ + "name":"DeleteAIWorkloadConfig", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"DeleteAIWorkloadConfigRequest"}, + "output":{"shape":"DeleteAIWorkloadConfigResponse"}, + "errors":[ + {"shape":"ResourceNotFound"}, + {"shape":"ResourceInUse"} + ], + "documentation":"Deletes the specified AI workload configuration. You cannot delete a configuration that is referenced by an active benchmark job.
" + }, "DeleteAction":{ "name":"DeleteAction", "http":{ @@ -1752,6 +1836,45 @@ "input":{"shape":"DeregisterDevicesRequest"}, "documentation":"Deregisters the specified devices. After you deregister a device, you will need to re-register the devices.
" }, + "DescribeAIBenchmarkJob":{ + "name":"DescribeAIBenchmarkJob", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"DescribeAIBenchmarkJobRequest"}, + "output":{"shape":"DescribeAIBenchmarkJobResponse"}, + "errors":[ + {"shape":"ResourceNotFound"} + ], + "documentation":"Returns details of an AI benchmark job, including its status, configuration, target endpoint, and timing information.
" + }, + "DescribeAIRecommendationJob":{ + "name":"DescribeAIRecommendationJob", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"DescribeAIRecommendationJobRequest"}, + "output":{"shape":"DescribeAIRecommendationJobResponse"}, + "errors":[ + {"shape":"ResourceNotFound"} + ], + "documentation":"Returns details of an AI recommendation job, including its status, model source, performance targets, optimization recommendations, and deployment configurations.
" + }, + "DescribeAIWorkloadConfig":{ + "name":"DescribeAIWorkloadConfig", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"DescribeAIWorkloadConfigRequest"}, + "output":{"shape":"DescribeAIWorkloadConfigResponse"}, + "errors":[ + {"shape":"ResourceNotFound"} + ], + "documentation":"Returns details of an AI workload configuration, including the dataset configuration, benchmark tool settings, tags, and creation time.
" + }, "DescribeAction":{ "name":"DescribeAction", "http":{ @@ -2774,6 +2897,36 @@ ], "documentation":"Import hub content.
" }, + "ListAIBenchmarkJobs":{ + "name":"ListAIBenchmarkJobs", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"ListAIBenchmarkJobsRequest"}, + "output":{"shape":"ListAIBenchmarkJobsResponse"}, + "documentation":"Returns a list of AI benchmark jobs in your account. You can filter the results by name, status, and creation time, and sort the results. The response is paginated.
" + }, + "ListAIRecommendationJobs":{ + "name":"ListAIRecommendationJobs", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"ListAIRecommendationJobsRequest"}, + "output":{"shape":"ListAIRecommendationJobsResponse"}, + "documentation":"Returns a list of AI recommendation jobs in your account. You can filter the results by name, status, and creation time, and sort the results. The response is paginated.
" + }, + "ListAIWorkloadConfigs":{ + "name":"ListAIWorkloadConfigs", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"ListAIWorkloadConfigsRequest"}, + "output":{"shape":"ListAIWorkloadConfigsResponse"}, + "documentation":"Returns a list of AI workload configurations in your account. You can filter the results by name and creation time, and sort the results. The response is paginated.
" + }, "ListActions":{ "name":"ListActions", "http":{ @@ -3921,6 +4074,32 @@ ], "documentation":"Initiates a remote connection session between a local integrated development environments (IDEs) and a remote SageMaker space.
" }, + "StopAIBenchmarkJob":{ + "name":"StopAIBenchmarkJob", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"StopAIBenchmarkJobRequest"}, + "output":{"shape":"StopAIBenchmarkJobResponse"}, + "errors":[ + {"shape":"ResourceNotFound"} + ], + "documentation":"Stops a running AI benchmark job.
" + }, + "StopAIRecommendationJob":{ + "name":"StopAIRecommendationJob", + "http":{ + "method":"POST", + "requestUri":"/" + }, + "input":{"shape":"StopAIRecommendationJobRequest"}, + "output":{"shape":"StopAIRecommendationJobResponse"}, + "errors":[ + {"shape":"ResourceNotFound"} + ], + "documentation":"Stops a running AI recommendation job.
" + }, "StopAutoMLJob":{ "name":"StopAutoMLJob", "http":{ @@ -4728,6 +4907,719 @@ } }, "shapes":{ + "AIBenchmarkEndpoint":{ + "type":"structure", + "required":["Identifier"], + "members":{ + "Identifier":{ + "shape":"AIResourceIdentifier", + "documentation":"The name or Amazon Resource Name (ARN) of the SageMaker endpoint to benchmark.
" + }, + "TargetContainerHostname":{ + "shape":"String", + "documentation":"The hostname of the specific container to target within a multi-container endpoint.
" + }, + "InferenceComponents":{ + "shape":"AIBenchmarkInferenceComponentList", + "documentation":"The list of inference components to benchmark on the endpoint.
" + } + }, + "documentation":"The SageMaker endpoint configuration for benchmarking.
" + }, + "AIBenchmarkInferenceComponent":{ + "type":"structure", + "required":["Identifier"], + "members":{ + "Identifier":{ + "shape":"AIResourceIdentifier", + "documentation":"The name or Amazon Resource Name (ARN) of the inference component.
" + } + }, + "documentation":"An inference component to benchmark.
" + }, + "AIBenchmarkInferenceComponentList":{ + "type":"list", + "member":{"shape":"AIBenchmarkInferenceComponent"} + }, + "AIBenchmarkJobArn":{ + "type":"string", + "max":256, + "min":0, + "pattern":"arn:aws[a-z\\-]*:sagemaker:[a-z0-9\\-]*:[0-9]{12}:ai-benchmark-job/[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}" + }, + "AIBenchmarkJobStatus":{ + "type":"string", + "enum":[ + "InProgress", + "Completed", + "Failed", + "Stopping", + "Stopped" + ] + }, + "AIBenchmarkJobSummary":{ + "type":"structure", + "required":[ + "AIBenchmarkJobName", + "AIBenchmarkJobArn", + "AIBenchmarkJobStatus", + "CreationTime" + ], + "members":{ + "AIBenchmarkJobName":{ + "shape":"AIEntityName", + "documentation":"The name of the benchmark job.
" + }, + "AIBenchmarkJobArn":{ + "shape":"AIBenchmarkJobArn", + "documentation":"The Amazon Resource Name (ARN) of the benchmark job.
" + }, + "AIBenchmarkJobStatus":{ + "shape":"AIBenchmarkJobStatus", + "documentation":"The status of the benchmark job.
" + }, + "CreationTime":{ + "shape":"Timestamp", + "documentation":"A timestamp that indicates when the benchmark job was created.
" + }, + "EndTime":{ + "shape":"Timestamp", + "documentation":"A timestamp that indicates when the benchmark job completed.
" + }, + "AIWorkloadConfigName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI workload configuration used by the benchmark job.
" + } + }, + "documentation":"Summary information about an AI benchmark job.
" + }, + "AIBenchmarkJobSummaryList":{ + "type":"list", + "member":{"shape":"AIBenchmarkJobSummary"} + }, + "AIBenchmarkNetworkConfig":{ + "type":"structure", + "members":{ + "VpcConfig":{ + "shape":"VpcConfig", + "documentation":"The VPC configuration, including security group IDs and subnet IDs.
" + } + }, + "documentation":"The network configuration for an AI benchmark job.
" + }, + "AIBenchmarkOutputConfig":{ + "type":"structure", + "required":["S3OutputLocation"], + "members":{ + "S3OutputLocation":{ + "shape":"S3Uri", + "documentation":"The Amazon S3 URI where benchmark results are stored.
" + } + }, + "documentation":"The output configuration for an AI benchmark job.
" + }, + "AIBenchmarkOutputResult":{ + "type":"structure", + "required":["S3OutputLocation"], + "members":{ + "S3OutputLocation":{ + "shape":"S3Uri", + "documentation":"The Amazon S3 URI where benchmark results are stored.
" + }, + "CloudWatchLogs":{ + "shape":"AICloudWatchLogsList", + "documentation":"The CloudWatch log information for the benchmark job.
" + } + }, + "documentation":"The output result of an AI benchmark job, including the Amazon S3 location and CloudWatch log information.
" + }, + "AIBenchmarkTarget":{ + "type":"structure", + "members":{ + "Endpoint":{ + "shape":"AIBenchmarkEndpoint", + "documentation":"The SageMaker endpoint to benchmark.
" + } + }, + "documentation":"The target for an AI benchmark job. This is a union type — specify one of the members.
", + "union":true + }, + "AICapacityReservationConfig":{ + "type":"structure", + "members":{ + "CapacityReservationPreference":{ + "shape":"AICapacityReservationPreference", + "documentation":"The capacity reservation preference. The only valid value is capacity-reservations-only.
" + }, + "MlReservationArns":{ + "shape":"AIMlReservationArnList", + "documentation":"The list of ML reservation ARNs to use.
" + } + }, + "documentation":"The capacity reservation configuration for an AI recommendation job.
" + }, + "AICapacityReservationPreference":{ + "type":"string", + "enum":["capacity-reservations-only"] + }, + "AIChannelName":{ + "type":"string", + "max":64, + "min":1, + "pattern":"[A-Za-z0-9\\.\\-_]+" + }, + "AICloudWatchLogs":{ + "type":"structure", + "members":{ + "LogGroupArn":{ + "shape":"String", + "documentation":"The Amazon Resource Name (ARN) of the CloudWatch log group.
" + }, + "LogStreamName":{ + "shape":"String", + "documentation":"The name of the CloudWatch log stream.
" + } + }, + "documentation":"CloudWatch log information for an AI benchmark or recommendation job.
" + }, + "AICloudWatchLogsList":{ + "type":"list", + "member":{"shape":"AICloudWatchLogs"} + }, + "AIDatasetConfig":{ + "type":"structure", + "members":{ + "InputDataConfig":{ + "shape":"AIWorkloadInputDataConfigList", + "documentation":"An array of input data channel configurations for the workload.
" + } + }, + "documentation":"The dataset configuration for an AI workload. This is a union type — specify one of the members.
", + "union":true + }, + "AIEntityName":{ + "type":"string", + "max":63, + "min":1, + "pattern":"[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}" + }, + "AIInferenceSpecificationName":{ + "type":"string", + "max":63, + "min":0, + "pattern":"[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}" + }, + "AIMlReservationArn":{ + "type":"string", + "max":256, + "min":0, + "pattern":"arn:aws[a-z\\-]*:sagemaker:[a-z0-9\\-]*:[0-9]{12}:[a-z0-9\\-]{1,14}/.*" + }, + "AIMlReservationArnList":{ + "type":"list", + "member":{"shape":"AIMlReservationArn"} + }, + "AIModelSource":{ + "type":"structure", + "members":{ + "S3":{ + "shape":"AIModelSourceS3", + "documentation":"The Amazon S3 location of the model artifacts.
" + } + }, + "documentation":"The source of the model for an AI recommendation job. This is a union type.
", + "union":true + }, + "AIModelSourceS3":{ + "type":"structure", + "members":{ + "S3Uri":{ + "shape":"S3Uri", + "documentation":"The Amazon S3 URI of the model artifacts.
" + } + }, + "documentation":"The Amazon S3 model source configuration.
" + }, + "AIRecommendation":{ + "type":"structure", + "members":{ + "RecommendationDescription":{ + "shape":"String", + "documentation":"A description of the recommendation.
" + }, + "OptimizationDetails":{ + "shape":"AIRecommendationOptimizationDetailList", + "documentation":"The optimization techniques applied in this recommendation.
" + }, + "ModelDetails":{ + "shape":"AIRecommendationModelDetails", + "documentation":"Details about the model package associated with this recommendation.
" + }, + "DeploymentConfiguration":{ + "shape":"AIRecommendationDeploymentConfiguration", + "documentation":"The deployment configuration for this recommendation, including the container image, instance type, instance count, and environment variables.
" + }, + "AIBenchmarkJobArn":{ + "shape":"AIBenchmarkJobArn", + "documentation":"The Amazon Resource Name (ARN) of the benchmark job associated with this recommendation.
" + }, + "ExpectedPerformance":{ + "shape":"ExpectedPerformanceList", + "documentation":"The expected performance metrics for this recommendation.
" + } + }, + "documentation":"An optimization recommendation generated by an AI recommendation job.
" + }, + "AIRecommendationAllowOptimization":{ + "type":"boolean", + "box":true + }, + "AIRecommendationComputeSpec":{ + "type":"structure", + "members":{ + "InstanceTypes":{ + "shape":"AIRecommendationInstanceTypeList", + "documentation":"The list of instance types to consider for recommendations. You can specify up to 3 instance types.
" + }, + "CapacityReservationConfig":{ + "shape":"AICapacityReservationConfig", + "documentation":"The capacity reservation configuration.
" + } + }, + "documentation":"The compute resource specification for an AI recommendation job.
" + }, + "AIRecommendationConstraint":{ + "type":"structure", + "required":["Metric"], + "members":{ + "Metric":{ + "shape":"AIRecommendationMetric", + "documentation":"The performance metric. Valid values are ttft-ms (time to first token in milliseconds), throughput, and cost.
" + } + }, + "documentation":"A performance constraint for an AI recommendation job.
" + }, + "AIRecommendationConstraintList":{ + "type":"list", + "member":{"shape":"AIRecommendationConstraint"} + }, + "AIRecommendationCopyCountPerInstance":{ + "type":"integer", + "box":true + }, + "AIRecommendationDeploymentConfiguration":{ + "type":"structure", + "members":{ + "S3":{ + "shape":"AIRecommendationDeploymentS3ChannelList", + "documentation":"The Amazon S3 data channels for the deployment.
" + }, + "ImageUri":{ + "shape":"String", + "documentation":"The URI of the container image for the deployment.
" + }, + "InstanceType":{ + "shape":"AIRecommendationInstanceType", + "documentation":"The recommended instance type for the deployment.
" + }, + "InstanceCount":{ + "shape":"AIRecommendationInstanceCount", + "documentation":"The recommended number of instances for the deployment.
" + }, + "CopyCountPerInstance":{ + "shape":"AIRecommendationCopyCountPerInstance", + "documentation":"The number of model copies per instance.
" + }, + "EnvironmentVariables":{ + "shape":"EnvironmentMap", + "documentation":"The environment variables for the deployment.
" + } + }, + "documentation":"The deployment configuration for a recommendation.
" + }, + "AIRecommendationDeploymentS3Channel":{ + "type":"structure", + "members":{ + "ChannelName":{ + "shape":"AIChannelName", + "documentation":"A custom name for this Amazon S3 data channel.
" + }, + "Uri":{ + "shape":"S3Uri", + "documentation":"The Amazon S3 URI of the data for this channel.
" + } + }, + "documentation":"An Amazon S3 data channel for a recommended deployment configuration, containing model artifacts or optimized model outputs.
" + }, + "AIRecommendationDeploymentS3ChannelList":{ + "type":"list", + "member":{"shape":"AIRecommendationDeploymentS3Channel"} + }, + "AIRecommendationInferenceFramework":{ + "type":"string", + "enum":[ + "LMI", + "VLLM" + ] + }, + "AIRecommendationInferenceSpecification":{ + "type":"structure", + "members":{ + "Framework":{ + "shape":"AIRecommendationInferenceFramework", + "documentation":"The inference framework. Valid values are LMI and VLLM.
" + } + }, + "documentation":"The inference framework for an AI recommendation job.
" + }, + "AIRecommendationInstanceCount":{ + "type":"integer", + "box":true + }, + "AIRecommendationInstanceDetail":{ + "type":"structure", + "members":{ + "InstanceType":{ + "shape":"AIRecommendationInstanceType", + "documentation":"The recommended instance type.
" + }, + "InstanceCount":{ + "shape":"AIRecommendationInstanceCount", + "documentation":"The recommended number of instances.
" + }, + "CopyCountPerInstance":{ + "shape":"AIRecommendationCopyCountPerInstance", + "documentation":"The number of model copies per instance.
" + } + }, + "documentation":"Instance details for a recommendation.
" + }, + "AIRecommendationInstanceDetailList":{ + "type":"list", + "member":{"shape":"AIRecommendationInstanceDetail"} + }, + "AIRecommendationInstanceType":{ + "type":"string", + "enum":[ + "ml.g5.xlarge", + "ml.g5.2xlarge", + "ml.g5.4xlarge", + "ml.g5.8xlarge", + "ml.g5.12xlarge", + "ml.g5.16xlarge", + "ml.g5.24xlarge", + "ml.g5.48xlarge", + "ml.g6.xlarge", + "ml.g6.2xlarge", + "ml.g6.4xlarge", + "ml.g6.8xlarge", + "ml.g6.12xlarge", + "ml.g6.16xlarge", + "ml.g6.24xlarge", + "ml.g6.48xlarge", + "ml.g6e.xlarge", + "ml.g6e.2xlarge", + "ml.g6e.4xlarge", + "ml.g6e.8xlarge", + "ml.g6e.12xlarge", + "ml.g6e.16xlarge", + "ml.g6e.24xlarge", + "ml.g6e.48xlarge", + "ml.g7e.2xlarge", + "ml.g7e.4xlarge", + "ml.g7e.8xlarge", + "ml.g7e.12xlarge", + "ml.g7e.24xlarge", + "ml.g7e.48xlarge", + "ml.p3.2xlarge", + "ml.p3.8xlarge", + "ml.p3.16xlarge", + "ml.p4d.24xlarge", + "ml.p4de.24xlarge", + "ml.p5.4xlarge", + "ml.p5.48xlarge", + "ml.p5e.48xlarge", + "ml.p5en.48xlarge" + ] + }, + "AIRecommendationInstanceTypeList":{ + "type":"list", + "member":{"shape":"AIRecommendationInstanceType"}, + "max":3, + "min":0 + }, + "AIRecommendationJobArn":{ + "type":"string", + "max":256, + "min":0, + "pattern":"arn:aws[a-z\\-]*:sagemaker:[a-z0-9\\-]*:[0-9]{12}:ai-recommendation-job/[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}" + }, + "AIRecommendationJobStatus":{ + "type":"string", + "enum":[ + "InProgress", + "Completed", + "Failed", + "Stopping", + "Stopped" + ] + }, + "AIRecommendationJobSummary":{ + "type":"structure", + "required":[ + "AIRecommendationJobName", + "AIRecommendationJobArn", + "AIRecommendationJobStatus", + "CreationTime" + ], + "members":{ + "AIRecommendationJobName":{ + "shape":"AIEntityName", + "documentation":"The name of the recommendation job.
" + }, + "AIRecommendationJobArn":{ + "shape":"AIRecommendationJobArn", + "documentation":"The Amazon Resource Name (ARN) of the recommendation job.
" + }, + "AIRecommendationJobStatus":{ + "shape":"AIRecommendationJobStatus", + "documentation":"The status of the recommendation job.
" + }, + "CreationTime":{ + "shape":"Timestamp", + "documentation":"A timestamp that indicates when the recommendation job was created.
" + }, + "EndTime":{ + "shape":"Timestamp", + "documentation":"A timestamp that indicates when the recommendation job completed.
" + } + }, + "documentation":"Summary information about an AI recommendation job.
" + }, + "AIRecommendationJobSummaryList":{ + "type":"list", + "member":{"shape":"AIRecommendationJobSummary"} + }, + "AIRecommendationList":{ + "type":"list", + "member":{"shape":"AIRecommendation"} + }, + "AIRecommendationMetric":{ + "type":"string", + "enum":[ + "ttft-ms", + "throughput", + "cost" + ] + }, + "AIRecommendationModelDetails":{ + "type":"structure", + "members":{ + "ModelPackageArn":{ + "shape":"ModelPackageArn", + "documentation":"The Amazon Resource Name (ARN) of the model package.
" + }, + "InferenceSpecificationName":{ + "shape":"AIInferenceSpecificationName", + "documentation":"The name of the inference specification within the model package.
" + }, + "InstanceDetails":{ + "shape":"AIRecommendationInstanceDetailList", + "documentation":"The instance details for this recommendation, including instance type, count, and model copies per instance.
" + } + }, + "documentation":"Details about the model package in a recommendation.
" + }, + "AIRecommendationOptimizationConfigMap":{ + "type":"map", + "key":{"shape":"String"}, + "value":{"shape":"String"} + }, + "AIRecommendationOptimizationDetail":{ + "type":"structure", + "required":["OptimizationType"], + "members":{ + "OptimizationType":{ + "shape":"AIRecommendationOptimizationType", + "documentation":"The type of optimization. Valid values are SpeculativeDecoding and KernelTuning.
" + }, + "OptimizationConfig":{ + "shape":"AIRecommendationOptimizationConfigMap", + "documentation":"A map of configuration parameters for the optimization technique.
" + } + }, + "documentation":"Details about an optimization technique applied in a recommendation.
" + }, + "AIRecommendationOptimizationDetailList":{ + "type":"list", + "member":{"shape":"AIRecommendationOptimizationDetail"} + }, + "AIRecommendationOptimizationType":{ + "type":"string", + "enum":[ + "SpeculativeDecoding", + "KernelTuning" + ] + }, + "AIRecommendationOutputConfig":{ + "type":"structure", + "members":{ + "S3OutputLocation":{ + "shape":"S3Uri", + "documentation":"The Amazon S3 URI where recommendation results are stored.
" + }, + "ModelPackageGroupIdentifier":{ + "shape":"AIResourceIdentifier", + "documentation":"The name or Amazon Resource Name (ARN) of the model package group where the optimized model is registered as a new model package version.
" + } + }, + "documentation":"The output configuration for an AI recommendation job.
" + }, + "AIRecommendationOutputResult":{ + "type":"structure", + "required":["S3OutputLocation"], + "members":{ + "S3OutputLocation":{ + "shape":"S3Uri", + "documentation":"The Amazon S3 URI where the recommendation job writes its output results.
" + }, + "ModelPackageGroupIdentifier":{ + "shape":"AIResourceIdentifier", + "documentation":"The name or Amazon Resource Name (ARN) of the model package group where deployment-ready model packages are registered.
" + } + }, + "documentation":"The output configuration for an AI recommendation job, including the S3 location for results and the model package group for deployment.
" + }, + "AIRecommendationPerformanceMetric":{ + "type":"structure", + "required":[ + "Metric", + "Value" + ], + "members":{ + "Metric":{ + "shape":"String", + "documentation":"The name of the performance metric.
" + }, + "Stat":{ + "shape":"String", + "documentation":"The statistical measure for the metric.
" + }, + "Value":{ + "shape":"String", + "documentation":"The value of the metric.
" + }, + "Unit":{ + "shape":"String", + "documentation":"The unit of the metric value.
" + } + }, + "documentation":"An expected performance metric for a recommendation.
" + }, + "AIRecommendationPerformanceTarget":{ + "type":"structure", + "required":["Constraints"], + "members":{ + "Constraints":{ + "shape":"AIRecommendationConstraintList", + "documentation":"An array of performance constraints that define the optimization objectives.
" + } + }, + "documentation":"The performance targets for an AI recommendation job.
" + }, + "AIResourceIdentifier":{ + "type":"string", + "max":256, + "min":1, + "pattern":"(arn:aws[a-z\\-]*:sagemaker:[a-z0-9\\-]*:[0-9]{12}:[a-z\\-]*/)?([a-zA-Z0-9]([a-zA-Z0-9\\-]){0,62})(?The name of the AI workload configuration." + }, + "AIWorkloadConfigArn":{ + "shape":"AIWorkloadConfigArn", + "documentation":"The Amazon Resource Name (ARN) of the AI workload configuration.
" + }, + "CreationTime":{ + "shape":"Timestamp", + "documentation":"A timestamp that indicates when the configuration was created.
" + } + }, + "documentation":"Summary information about an AI workload configuration.
" + }, + "AIWorkloadConfigSummaryList":{ + "type":"list", + "member":{"shape":"AIWorkloadConfigSummary"} + }, + "AIWorkloadConfigs":{ + "type":"structure", + "required":["WorkloadSpec"], + "members":{ + "WorkloadSpec":{ + "shape":"WorkloadSpec", + "documentation":"The workload specification that defines benchmark parameters.
" + } + }, + "documentation":"The benchmark tool configuration for an AI workload.
" + }, + "AIWorkloadDataSource":{ + "type":"structure", + "members":{ + "S3DataSource":{ + "shape":"AIWorkloadS3DataSource", + "documentation":"The Amazon S3 data source configuration.
" + } + }, + "documentation":"The data source for an AI workload input data channel.
" + }, + "AIWorkloadInputDataConfig":{ + "type":"structure", + "required":[ + "ChannelName", + "DataSource" + ], + "members":{ + "ChannelName":{ + "shape":"AIChannelName", + "documentation":"The logical name for the data channel.
" + }, + "DataSource":{ + "shape":"AIWorkloadDataSource", + "documentation":"The data source for this channel.
" + } + }, + "documentation":"A channel of input data for an AI workload configuration. Each channel has a name and a data source.
" + }, + "AIWorkloadInputDataConfigList":{ + "type":"list", + "member":{"shape":"AIWorkloadInputDataConfig"} + }, + "AIWorkloadS3DataSource":{ + "type":"structure", + "required":["S3Uri"], + "members":{ + "S3Uri":{ + "shape":"S3Uri", + "documentation":"The Amazon S3 URI of the data.
" + } + }, + "documentation":"The Amazon S3 data source for an AI workload.
" + }, "AbsoluteBorrowLimitResourceList":{ "type":"list", "member":{"shape":"ComputeQuotaResourceConfig"}, @@ -11256,6 +12148,151 @@ "min":2, "pattern":"[A-Z]{2}" }, + "CreateAIBenchmarkJobRequest":{ + "type":"structure", + "required":[ + "AIBenchmarkJobName", + "BenchmarkTarget", + "OutputConfig", + "AIWorkloadConfigIdentifier", + "RoleArn" + ], + "members":{ + "AIBenchmarkJobName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI benchmark job. The name must be unique within your Amazon Web Services account in the current Amazon Web Services Region.
" + }, + "BenchmarkTarget":{ + "shape":"AIBenchmarkTarget", + "documentation":"The target endpoint to benchmark. Specify a SageMaker endpoint by providing its name or Amazon Resource Name (ARN).
" + }, + "OutputConfig":{ + "shape":"AIBenchmarkOutputConfig", + "documentation":"The output configuration for the benchmark job, including the Amazon S3 location where benchmark results are stored.
" + }, + "AIWorkloadConfigIdentifier":{ + "shape":"AIResourceIdentifier", + "documentation":"The name or Amazon Resource Name (ARN) of the AI workload configuration to use for this benchmark job.
" + }, + "RoleArn":{ + "shape":"RoleArn", + "documentation":"The Amazon Resource Name (ARN) of an IAM role that enables Amazon SageMaker AI to perform tasks on your behalf.
" + }, + "NetworkConfig":{ + "shape":"AIBenchmarkNetworkConfig", + "documentation":"The network configuration for the benchmark job, including VPC settings.
" + }, + "Tags":{ + "shape":"TagList", + "documentation":"The metadata that you apply to Amazon Web Services resources to help you categorize and organize them. Each tag consists of a key and a value, both of which you define.
" + } + } + }, + "CreateAIBenchmarkJobResponse":{ + "type":"structure", + "required":["AIBenchmarkJobArn"], + "members":{ + "AIBenchmarkJobArn":{ + "shape":"AIBenchmarkJobArn", + "documentation":"The Amazon Resource Name (ARN) of the created benchmark job.
" + } + } + }, + "CreateAIRecommendationJobRequest":{ + "type":"structure", + "required":[ + "AIRecommendationJobName", + "ModelSource", + "OutputConfig", + "AIWorkloadConfigIdentifier", + "PerformanceTarget", + "RoleArn" + ], + "members":{ + "AIRecommendationJobName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI recommendation job. The name must be unique within your Amazon Web Services account in the current Amazon Web Services Region.
" + }, + "ModelSource":{ + "shape":"AIModelSource", + "documentation":"The source of the model to optimize. Specify the Amazon S3 location of the model artifacts.
" + }, + "OutputConfig":{ + "shape":"AIRecommendationOutputConfig", + "documentation":"The output configuration for the recommendation job, including the Amazon S3 location for results and an optional model package group where the optimized model is registered.
" + }, + "AIWorkloadConfigIdentifier":{ + "shape":"AIResourceIdentifier", + "documentation":"The name or Amazon Resource Name (ARN) of the AI workload configuration to use for this recommendation job.
" + }, + "PerformanceTarget":{ + "shape":"AIRecommendationPerformanceTarget", + "documentation":"The performance targets for the recommendation job. Specify constraints on metrics such as time to first token (ttft-ms), throughput, or cost.
" + }, + "RoleArn":{ + "shape":"RoleArn", + "documentation":"The Amazon Resource Name (ARN) of an IAM role that enables Amazon SageMaker AI to perform tasks on your behalf.
" + }, + "InferenceSpecification":{ + "shape":"AIRecommendationInferenceSpecification", + "documentation":"The inference framework configuration. Specify the framework (such as LMI or vLLM) for the recommendation job.
" + }, + "OptimizeModel":{ + "shape":"AIRecommendationAllowOptimization", + "documentation":"Whether to allow model optimization techniques such as quantization, speculative decoding, and kernel tuning. The default is true.
" + }, + "ComputeSpec":{ + "shape":"AIRecommendationComputeSpec", + "documentation":"The compute resource specification for the recommendation job. You can specify up to 3 instance types to consider, and optionally provide capacity reservation configuration.
" + }, + "Tags":{ + "shape":"TagList", + "documentation":"The metadata that you apply to Amazon Web Services resources to help you categorize and organize them.
" + } + } + }, + "CreateAIRecommendationJobResponse":{ + "type":"structure", + "required":["AIRecommendationJobArn"], + "members":{ + "AIRecommendationJobArn":{ + "shape":"AIRecommendationJobArn", + "documentation":"The Amazon Resource Name (ARN) of the created recommendation job.
" + } + } + }, + "CreateAIWorkloadConfigRequest":{ + "type":"structure", + "required":["AIWorkloadConfigName"], + "members":{ + "AIWorkloadConfigName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI workload configuration. The name must be unique within your Amazon Web Services account in the current Amazon Web Services Region.
" + }, + "DatasetConfig":{ + "shape":"AIDatasetConfig", + "documentation":"The dataset configuration for the workload. Specify input data channels with their data sources for benchmark workloads.
" + }, + "AIWorkloadConfigs":{ + "shape":"AIWorkloadConfigs", + "documentation":"The benchmark tool configuration and workload specification. Provide the specification as an inline YAML or JSON string.
" + }, + "Tags":{ + "shape":"TagList", + "documentation":"The metadata that you apply to Amazon Web Services resources to help you categorize and organize them. Each tag consists of a key and a value, both of which you define. For more information, see Tagging Amazon Web Services Resources in the Amazon Web Services General Reference.
" + } + } + }, + "CreateAIWorkloadConfigResponse":{ + "type":"structure", + "required":["AIWorkloadConfigArn"], + "members":{ + "AIWorkloadConfigArn":{ + "shape":"AIWorkloadConfigArn", + "documentation":"The Amazon Resource Name (ARN) of the created AI workload configuration.
" + } + } + }, "CreateActionRequest":{ "type":"structure", "required":[ @@ -15239,6 +16276,63 @@ "max":65535, "min":0 }, + "DeleteAIBenchmarkJobRequest":{ + "type":"structure", + "required":["AIBenchmarkJobName"], + "members":{ + "AIBenchmarkJobName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI benchmark job to delete.
" + } + } + }, + "DeleteAIBenchmarkJobResponse":{ + "type":"structure", + "members":{ + "AIBenchmarkJobArn":{ + "shape":"AIBenchmarkJobArn", + "documentation":"The Amazon Resource Name (ARN) of the deleted benchmark job.
" + } + } + }, + "DeleteAIRecommendationJobRequest":{ + "type":"structure", + "required":["AIRecommendationJobName"], + "members":{ + "AIRecommendationJobName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI recommendation job to delete.
" + } + } + }, + "DeleteAIRecommendationJobResponse":{ + "type":"structure", + "members":{ + "AIRecommendationJobArn":{ + "shape":"AIRecommendationJobArn", + "documentation":"The Amazon Resource Name (ARN) of the deleted recommendation job.
" + } + } + }, + "DeleteAIWorkloadConfigRequest":{ + "type":"structure", + "required":["AIWorkloadConfigName"], + "members":{ + "AIWorkloadConfigName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI workload configuration to delete.
" + } + } + }, + "DeleteAIWorkloadConfigResponse":{ + "type":"structure", + "members":{ + "AIWorkloadConfigArn":{ + "shape":"AIWorkloadConfigArn", + "documentation":"The Amazon Resource Name (ARN) of the deleted AI workload configuration.
" + } + } + }, "DeleteActionRequest":{ "type":"structure", "required":["ActionName"], @@ -16263,6 +17357,220 @@ }, "documentation":"Information that SageMaker Neo automatically derived about the model.
" }, + "DescribeAIBenchmarkJobRequest":{ + "type":"structure", + "required":["AIBenchmarkJobName"], + "members":{ + "AIBenchmarkJobName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI benchmark job to describe.
" + } + } + }, + "DescribeAIBenchmarkJobResponse":{ + "type":"structure", + "required":[ + "AIBenchmarkJobName", + "AIBenchmarkJobArn", + "AIBenchmarkJobStatus", + "BenchmarkTarget", + "OutputConfig", + "AIWorkloadConfigIdentifier", + "RoleArn", + "CreationTime" + ], + "members":{ + "AIBenchmarkJobName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI benchmark job.
" + }, + "AIBenchmarkJobArn":{ + "shape":"AIBenchmarkJobArn", + "documentation":"The Amazon Resource Name (ARN) of the AI benchmark job.
" + }, + "AIBenchmarkJobStatus":{ + "shape":"AIBenchmarkJobStatus", + "documentation":"The status of the AI benchmark job.
" + }, + "FailureReason":{ + "shape":"FailureReason", + "documentation":"If the benchmark job failed, the reason it failed.
" + }, + "BenchmarkTarget":{ + "shape":"AIBenchmarkTarget", + "documentation":"The target endpoint that was benchmarked.
" + }, + "OutputConfig":{ + "shape":"AIBenchmarkOutputResult", + "documentation":"The output configuration for the benchmark job, including the Amazon S3 output location and CloudWatch log information.
" + }, + "AIWorkloadConfigIdentifier":{ + "shape":"AIResourceIdentifier", + "documentation":"The name or Amazon Resource Name (ARN) of the AI workload configuration used for this benchmark job.
" + }, + "RoleArn":{ + "shape":"RoleArn", + "documentation":"The Amazon Resource Name (ARN) of the IAM role used by the benchmark job.
" + }, + "NetworkConfig":{ + "shape":"AIBenchmarkNetworkConfig", + "documentation":"The network configuration for the benchmark job.
" + }, + "CreationTime":{ + "shape":"Timestamp", + "documentation":"A timestamp that indicates when the benchmark job was created.
" + }, + "StartTime":{ + "shape":"Timestamp", + "documentation":"A timestamp that indicates when the benchmark job started running.
" + }, + "EndTime":{ + "shape":"Timestamp", + "documentation":"A timestamp that indicates when the benchmark job completed.
" + }, + "Tags":{ + "shape":"TagList", + "documentation":"The tags associated with the benchmark job.
" + } + } + }, + "DescribeAIRecommendationJobRequest":{ + "type":"structure", + "required":["AIRecommendationJobName"], + "members":{ + "AIRecommendationJobName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI recommendation job to describe.
" + } + } + }, + "DescribeAIRecommendationJobResponse":{ + "type":"structure", + "required":[ + "AIRecommendationJobName", + "AIRecommendationJobArn", + "AIRecommendationJobStatus", + "ModelSource", + "OutputConfig", + "AIWorkloadConfigIdentifier", + "RoleArn", + "CreationTime" + ], + "members":{ + "AIRecommendationJobName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI recommendation job.
" + }, + "AIRecommendationJobArn":{ + "shape":"AIRecommendationJobArn", + "documentation":"The Amazon Resource Name (ARN) of the AI recommendation job.
" + }, + "AIRecommendationJobStatus":{ + "shape":"AIRecommendationJobStatus", + "documentation":"The status of the AI recommendation job.
" + }, + "FailureReason":{ + "shape":"FailureReason", + "documentation":"If the recommendation job failed, the reason it failed.
" + }, + "ModelSource":{ + "shape":"AIModelSource", + "documentation":"The source of the model that was analyzed.
" + }, + "OutputConfig":{ + "shape":"AIRecommendationOutputResult", + "documentation":"The output configuration for the recommendation job.
" + }, + "InferenceSpecification":{ + "shape":"AIRecommendationInferenceSpecification", + "documentation":"The inference framework configuration.
" + }, + "AIWorkloadConfigIdentifier":{ + "shape":"AIResourceIdentifier", + "documentation":"The name or Amazon Resource Name (ARN) of the AI workload configuration used for this recommendation job.
" + }, + "OptimizeModel":{ + "shape":"AIRecommendationAllowOptimization", + "documentation":"Whether model optimization techniques were allowed.
" + }, + "PerformanceTarget":{ + "shape":"AIRecommendationPerformanceTarget", + "documentation":"The performance targets specified for the recommendation job.
" + }, + "Recommendations":{ + "shape":"AIRecommendationList", + "documentation":"The list of optimization recommendations generated by the job. Each recommendation includes optimization details, deployment configuration, expected performance metrics, and the associated benchmark job ARN.
" + }, + "RoleArn":{ + "shape":"RoleArn", + "documentation":"The Amazon Resource Name (ARN) of the IAM role used by the recommendation job.
" + }, + "ComputeSpec":{ + "shape":"AIRecommendationComputeSpec", + "documentation":"The compute resource specification for the recommendation job.
" + }, + "CreationTime":{ + "shape":"Timestamp", + "documentation":"A timestamp that indicates when the recommendation job was created.
" + }, + "StartTime":{ + "shape":"Timestamp", + "documentation":"A timestamp that indicates when the recommendation job started running.
" + }, + "EndTime":{ + "shape":"Timestamp", + "documentation":"A timestamp that indicates when the recommendation job completed.
" + }, + "Tags":{ + "shape":"TagList", + "documentation":"The tags associated with the recommendation job.
" + } + } + }, + "DescribeAIWorkloadConfigRequest":{ + "type":"structure", + "required":["AIWorkloadConfigName"], + "members":{ + "AIWorkloadConfigName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI workload configuration to describe.
" + } + } + }, + "DescribeAIWorkloadConfigResponse":{ + "type":"structure", + "required":[ + "AIWorkloadConfigName", + "AIWorkloadConfigArn", + "CreationTime" + ], + "members":{ + "AIWorkloadConfigName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI workload configuration.
" + }, + "AIWorkloadConfigArn":{ + "shape":"AIWorkloadConfigArn", + "documentation":"The Amazon Resource Name (ARN) of the AI workload configuration.
" + }, + "DatasetConfig":{ + "shape":"AIDatasetConfig", + "documentation":"The dataset configuration for the workload.
" + }, + "AIWorkloadConfigs":{ + "shape":"AIWorkloadConfigs", + "documentation":"The benchmark tool configuration and workload specification.
" + }, + "Tags":{ + "shape":"TagList", + "documentation":"The tags associated with the AI workload configuration.
" + }, + "CreationTime":{ + "shape":"Timestamp", + "documentation":"A timestamp that indicates when the AI workload configuration was created.
" + } + } + }, "DescribeActionRequest":{ "type":"structure", "required":["ActionName"], @@ -23218,6 +24526,10 @@ "min":0, "pattern":"[\\S\\s]*" }, + "ExpectedPerformanceList":{ + "type":"list", + "member":{"shape":"AIRecommendationPerformanceMetric"} + }, "Experiment":{ "type":"structure", "members":{ @@ -28082,6 +29394,178 @@ "Action" ] }, + "ListAIBenchmarkJobsRequest":{ + "type":"structure", + "members":{ + "MaxResults":{ + "shape":"MaxResults", + "documentation":"The maximum number of benchmark jobs to return in the response.
" + }, + "NextToken":{ + "shape":"NextToken", + "documentation":"If the previous call to ListAIBenchmarkJobs didn't return the full set of jobs, the call returns a token for getting the next set.
" + }, + "NameContains":{ + "shape":"NameContains", + "documentation":"A string in the job name. This filter returns only jobs whose name contains the specified string.
" + }, + "StatusEquals":{ + "shape":"AIBenchmarkJobStatus", + "documentation":"A filter that returns only benchmark jobs with the specified status.
" + }, + "CreationTimeAfter":{ + "shape":"Timestamp", + "documentation":"A filter that returns only jobs created after the specified time.
" + }, + "CreationTimeBefore":{ + "shape":"Timestamp", + "documentation":"A filter that returns only jobs created before the specified time.
" + }, + "SortBy":{ + "shape":"ListAIBenchmarkJobsSortBy", + "documentation":"The field to sort results by. The default is CreationTime.
" + }, + "SortOrder":{ + "shape":"SortOrder", + "documentation":"The sort order for results. The default is Descending.
" + } + } + }, + "ListAIBenchmarkJobsResponse":{ + "type":"structure", + "required":["AIBenchmarkJobSummaries"], + "members":{ + "AIBenchmarkJobSummaries":{ + "shape":"AIBenchmarkJobSummaryList", + "documentation":"An array of AIBenchmarkJobSummary objects, one for each benchmark job that matches the specified filters.
" + }, + "NextToken":{ + "shape":"NextToken", + "documentation":"If the response is truncated, Amazon SageMaker AI returns this token. To retrieve the next set of jobs, use it in the subsequent request.
" + } + } + }, + "ListAIBenchmarkJobsSortBy":{ + "type":"string", + "enum":[ + "Name", + "CreationTime", + "Status" + ] + }, + "ListAIRecommendationJobsRequest":{ + "type":"structure", + "members":{ + "MaxResults":{ + "shape":"MaxResults", + "documentation":"The maximum number of recommendation jobs to return in the response.
" + }, + "NextToken":{ + "shape":"NextToken", + "documentation":"If the previous call to ListAIRecommendationJobs didn't return the full set of jobs, the call returns a token for getting the next set.
" + }, + "NameContains":{ + "shape":"NameContains", + "documentation":"A string in the job name. This filter returns only jobs whose name contains the specified string.
" + }, + "StatusEquals":{ + "shape":"AIRecommendationJobStatus", + "documentation":"A filter that returns only recommendation jobs with the specified status.
" + }, + "CreationTimeAfter":{ + "shape":"Timestamp", + "documentation":"A filter that returns only jobs created after the specified time.
" + }, + "CreationTimeBefore":{ + "shape":"Timestamp", + "documentation":"A filter that returns only jobs created before the specified time.
" + }, + "SortBy":{ + "shape":"ListAIRecommendationJobsSortBy", + "documentation":"The field to sort results by. The default is CreationTime.
" + }, + "SortOrder":{ + "shape":"SortOrder", + "documentation":"The sort order for results. The default is Descending.
" + } + } + }, + "ListAIRecommendationJobsResponse":{ + "type":"structure", + "required":["AIRecommendationJobSummaries"], + "members":{ + "AIRecommendationJobSummaries":{ + "shape":"AIRecommendationJobSummaryList", + "documentation":"An array of AIRecommendationJobSummary objects, one for each recommendation job that matches the specified filters.
" + }, + "NextToken":{ + "shape":"NextToken", + "documentation":"If the response is truncated, Amazon SageMaker AI returns this token. To retrieve the next set of jobs, use it in the subsequent request.
" + } + } + }, + "ListAIRecommendationJobsSortBy":{ + "type":"string", + "enum":[ + "Name", + "CreationTime", + "Status" + ] + }, + "ListAIWorkloadConfigsRequest":{ + "type":"structure", + "members":{ + "MaxResults":{ + "shape":"MaxResults", + "documentation":"The maximum number of AI workload configurations to return in the response.
" + }, + "NextToken":{ + "shape":"NextToken", + "documentation":"If the previous call to ListAIWorkloadConfigs didn't return the full set of configurations, the call returns a token for getting the next set of configurations.
" + }, + "NameContains":{ + "shape":"NameContains", + "documentation":"A string in the configuration name. This filter returns only configurations whose name contains the specified string.
" + }, + "CreationTimeAfter":{ + "shape":"Timestamp", + "documentation":"A filter that returns only configurations created after the specified time.
" + }, + "CreationTimeBefore":{ + "shape":"Timestamp", + "documentation":"A filter that returns only configurations created before the specified time.
" + }, + "SortBy":{ + "shape":"ListAIWorkloadConfigsSortBy", + "documentation":"The field to sort results by. The default is CreationTime.
" + }, + "SortOrder":{ + "shape":"SortOrder", + "documentation":"The sort order for results. The default is Descending.
" + } + } + }, + "ListAIWorkloadConfigsResponse":{ + "type":"structure", + "required":["AIWorkloadConfigSummaries"], + "members":{ + "AIWorkloadConfigSummaries":{ + "shape":"AIWorkloadConfigSummaryList", + "documentation":"An array of AIWorkloadConfigSummary objects, one for each AI workload configuration that matches the specified filters.
" + }, + "NextToken":{ + "shape":"NextToken", + "documentation":"If the response is truncated, Amazon SageMaker AI returns this token. To retrieve the next set of configurations, use it in the subsequent request.
" + } + } + }, + "ListAIWorkloadConfigsSortBy":{ + "type":"string", + "enum":[ + "Name", + "CreationTime" + ] + }, "ListActionsRequest":{ "type":"structure", "members":{ @@ -32897,11 +34381,11 @@ "members":{ "EnableEnhancedMetrics":{ "shape":"EnableEnhancedMetrics", - "documentation":"Specifies whether to enable enhanced metrics for the endpoint. Enhanced metrics provide utilization data at instance and container granularity. Container granularity is supported for Inference Components. The default is False.
" + "documentation":"Specifies whether to enable enhanced metrics for the endpoint. Enhanced metrics provide utilization and invocation data at instance and container granularity. Container granularity is supported for Inference Components. The default is False.
The frequency, in seconds, at which utilization metrics are published to Amazon CloudWatch. The default is 60 seconds.
" + "documentation":"The interval, in seconds, at which metrics are published to Amazon CloudWatch. Defaults to 60. Valid values: 10, 30, 60, 120, 180, 240, 300. When EnableEnhancedMetrics is set to False, this interval applies to utilization metrics only; invocation metrics continue to be published at the default 60-second interval. When EnableEnhancedMetrics is set to True, this interval applies to both utilization and invocation metrics.
The configuration for Utilization metrics.
" @@ -34271,6 +35755,10 @@ "shape":"String", "documentation":"The name of a pre-trained machine learning benchmarked by Amazon SageMaker Inference Recommender model that matches your model. You can find a list of benchmarked models by calling ListModelMetadata.
" + }, + "AdditionalModelDataSources":{ + "shape":"AdditionalModelDataSources", + "documentation":"Data sources that are available to your model in addition to the one that you specify for ModelDataSource when you use the CreateModelPackage action.
The additional data source that is used during inference in the Docker container for your model package.
" @@ -36107,12 +37595,12 @@ }, "DisableGlueTableCreation":{ "shape":"Boolean", - "documentation":"Set to True to disable the automatic creation of an Amazon Web Services Glue table when configuring an OfflineStore. If set to False, Feature Store will name the OfflineStore Glue table following Athena's naming recommendations.
The default value is False.
" + "documentation":"Set to True to disable the automatic creation of an Amazon Web Services Glue table when configuring an OfflineStore. If set to True and DataCatalogConfig is provided, Feature Store associates the provided catalog configuration with the feature group without creating a table. In this case, you are responsible for creating and managing the Glue table. If set to True without DataCatalogConfig, no Glue table is created or associated with the feature group. The Iceberg table format is only supported when this is set to False.
If set to False and DataCatalogConfig is provided, Feature Store creates the table using the specified names. If set to False without DataCatalogConfig, Feature Store auto-generates the table name following Athena's naming recommendations. This applies to both Glue and Apache Iceberg table formats.
The default value is False.
" }, "DataCatalogConfig":{ "shape":"DataCatalogConfig", - "documentation":"The meta data of the Glue table that is autogenerated when an OfflineStore is created.
" + "documentation":"The meta data of the Glue table for the OfflineStore. If not provided, Feature Store auto-generates the table name, database, and catalog when the OfflineStore is created. You can optionally provide this configuration to specify custom values. This applies to both Glue and Apache Iceberg table formats.
The name of the AI benchmark job to stop.
" + } + } + }, + "StopAIBenchmarkJobResponse":{ + "type":"structure", + "required":["AIBenchmarkJobArn"], + "members":{ + "AIBenchmarkJobArn":{ + "shape":"AIBenchmarkJobArn", + "documentation":"The Amazon Resource Name (ARN) of the stopped benchmark job.
" + } + } + }, + "StopAIRecommendationJobRequest":{ + "type":"structure", + "required":["AIRecommendationJobName"], + "members":{ + "AIRecommendationJobName":{ + "shape":"AIEntityName", + "documentation":"The name of the AI recommendation job to stop.
" + } + } + }, + "StopAIRecommendationJobResponse":{ + "type":"structure", + "required":["AIRecommendationJobArn"], + "members":{ + "AIRecommendationJobArn":{ + "shape":"AIRecommendationJobArn", + "documentation":"The Amazon Resource Name (ARN) of the stopped recommendation job.
" + } + } + }, "StopAutoMLJobRequest":{ "type":"structure", "required":["AutoMLJobName"], @@ -48494,6 +50028,17 @@ "type":"list", "member":{"shape":"Workforce"} }, + "WorkloadSpec":{ + "type":"structure", + "members":{ + "Inline":{ + "shape":"String", + "documentation":"An inline YAML or JSON string that defines benchmark parameters.
" + } + }, + "documentation":"The workload specification for benchmark tool configuration. Provide an inline YAML or JSON string.
", + "union":true + }, "WorkspaceSettings":{ "type":"structure", "members":{ diff --git a/src/sagemaker_core/main/code_injection/shape_dag.py b/src/sagemaker_core/main/code_injection/shape_dag.py index 2dcb233..e904e59 100644 --- a/src/sagemaker_core/main/code_injection/shape_dag.py +++ b/src/sagemaker_core/main/code_injection/shape_dag.py @@ -1,4 +1,339 @@ SHAPE_DAG = { + "AIBenchmarkEndpoint": { + "members": [ + {"name": "Identifier", "shape": "AIResourceIdentifier", "type": "string"}, + {"name": "TargetContainerHostname", "shape": "String", "type": "string"}, + { + "name": "InferenceComponents", + "shape": "AIBenchmarkInferenceComponentList", + "type": "list", + }, + ], + "type": "structure", + }, + "AIBenchmarkInferenceComponent": { + "members": [{"name": "Identifier", "shape": "AIResourceIdentifier", "type": "string"}], + "type": "structure", + }, + "AIBenchmarkInferenceComponentList": { + "member_shape": "AIBenchmarkInferenceComponent", + "member_type": "structure", + "type": "list", + }, + "AIBenchmarkJobSummary": { + "members": [ + {"name": "AIBenchmarkJobName", "shape": "AIEntityName", "type": "string"}, + {"name": "AIBenchmarkJobArn", "shape": "AIBenchmarkJobArn", "type": "string"}, + {"name": "AIBenchmarkJobStatus", "shape": "AIBenchmarkJobStatus", "type": "string"}, + {"name": "CreationTime", "shape": "Timestamp", "type": "timestamp"}, + {"name": "EndTime", "shape": "Timestamp", "type": "timestamp"}, + {"name": "AIWorkloadConfigName", "shape": "AIEntityName", "type": "string"}, + ], + "type": "structure", + }, + "AIBenchmarkJobSummaryList": { + "member_shape": "AIBenchmarkJobSummary", + "member_type": "structure", + "type": "list", + }, + "AIBenchmarkNetworkConfig": { + "members": [{"name": "VpcConfig", "shape": "VpcConfig", "type": "structure"}], + "type": "structure", + }, + "AIBenchmarkOutputConfig": { + "members": [{"name": "S3OutputLocation", "shape": "S3Uri", "type": "string"}], + "type": "structure", + }, + "AIBenchmarkOutputResult": { + 
"members": [ + {"name": "S3OutputLocation", "shape": "S3Uri", "type": "string"}, + {"name": "CloudWatchLogs", "shape": "AICloudWatchLogsList", "type": "list"}, + ], + "type": "structure", + }, + "AIBenchmarkTarget": { + "members": [{"name": "Endpoint", "shape": "AIBenchmarkEndpoint", "type": "structure"}], + "type": "structure", + }, + "AICapacityReservationConfig": { + "members": [ + { + "name": "CapacityReservationPreference", + "shape": "AICapacityReservationPreference", + "type": "string", + }, + {"name": "MlReservationArns", "shape": "AIMlReservationArnList", "type": "list"}, + ], + "type": "structure", + }, + "AICloudWatchLogs": { + "members": [ + {"name": "LogGroupArn", "shape": "String", "type": "string"}, + {"name": "LogStreamName", "shape": "String", "type": "string"}, + ], + "type": "structure", + }, + "AICloudWatchLogsList": { + "member_shape": "AICloudWatchLogs", + "member_type": "structure", + "type": "list", + }, + "AIDatasetConfig": { + "members": [ + {"name": "InputDataConfig", "shape": "AIWorkloadInputDataConfigList", "type": "list"} + ], + "type": "structure", + }, + "AIMlReservationArnList": { + "member_shape": "AIMlReservationArn", + "member_type": "string", + "type": "list", + }, + "AIModelSource": { + "members": [{"name": "S3", "shape": "AIModelSourceS3", "type": "structure"}], + "type": "structure", + }, + "AIModelSourceS3": { + "members": [{"name": "S3Uri", "shape": "S3Uri", "type": "string"}], + "type": "structure", + }, + "AIRecommendation": { + "members": [ + {"name": "RecommendationDescription", "shape": "String", "type": "string"}, + { + "name": "OptimizationDetails", + "shape": "AIRecommendationOptimizationDetailList", + "type": "list", + }, + {"name": "ModelDetails", "shape": "AIRecommendationModelDetails", "type": "structure"}, + { + "name": "DeploymentConfiguration", + "shape": "AIRecommendationDeploymentConfiguration", + "type": "structure", + }, + {"name": "AIBenchmarkJobArn", "shape": "AIBenchmarkJobArn", "type": "string"}, + 
{"name": "ExpectedPerformance", "shape": "ExpectedPerformanceList", "type": "list"}, + ], + "type": "structure", + }, + "AIRecommendationComputeSpec": { + "members": [ + {"name": "InstanceTypes", "shape": "AIRecommendationInstanceTypeList", "type": "list"}, + { + "name": "CapacityReservationConfig", + "shape": "AICapacityReservationConfig", + "type": "structure", + }, + ], + "type": "structure", + }, + "AIRecommendationConstraint": { + "members": [{"name": "Metric", "shape": "AIRecommendationMetric", "type": "string"}], + "type": "structure", + }, + "AIRecommendationConstraintList": { + "member_shape": "AIRecommendationConstraint", + "member_type": "structure", + "type": "list", + }, + "AIRecommendationDeploymentConfiguration": { + "members": [ + {"name": "S3", "shape": "AIRecommendationDeploymentS3ChannelList", "type": "list"}, + {"name": "ImageUri", "shape": "String", "type": "string"}, + {"name": "InstanceType", "shape": "AIRecommendationInstanceType", "type": "string"}, + {"name": "InstanceCount", "shape": "AIRecommendationInstanceCount", "type": "integer"}, + { + "name": "CopyCountPerInstance", + "shape": "AIRecommendationCopyCountPerInstance", + "type": "integer", + }, + {"name": "EnvironmentVariables", "shape": "EnvironmentMap", "type": "map"}, + ], + "type": "structure", + }, + "AIRecommendationDeploymentS3Channel": { + "members": [ + {"name": "ChannelName", "shape": "AIChannelName", "type": "string"}, + {"name": "Uri", "shape": "S3Uri", "type": "string"}, + ], + "type": "structure", + }, + "AIRecommendationDeploymentS3ChannelList": { + "member_shape": "AIRecommendationDeploymentS3Channel", + "member_type": "structure", + "type": "list", + }, + "AIRecommendationInferenceSpecification": { + "members": [ + {"name": "Framework", "shape": "AIRecommendationInferenceFramework", "type": "string"} + ], + "type": "structure", + }, + "AIRecommendationInstanceDetail": { + "members": [ + {"name": "InstanceType", "shape": "AIRecommendationInstanceType", "type": 
"string"}, + {"name": "InstanceCount", "shape": "AIRecommendationInstanceCount", "type": "integer"}, + { + "name": "CopyCountPerInstance", + "shape": "AIRecommendationCopyCountPerInstance", + "type": "integer", + }, + ], + "type": "structure", + }, + "AIRecommendationInstanceDetailList": { + "member_shape": "AIRecommendationInstanceDetail", + "member_type": "structure", + "type": "list", + }, + "AIRecommendationInstanceTypeList": { + "member_shape": "AIRecommendationInstanceType", + "member_type": "string", + "type": "list", + }, + "AIRecommendationJobSummary": { + "members": [ + {"name": "AIRecommendationJobName", "shape": "AIEntityName", "type": "string"}, + {"name": "AIRecommendationJobArn", "shape": "AIRecommendationJobArn", "type": "string"}, + { + "name": "AIRecommendationJobStatus", + "shape": "AIRecommendationJobStatus", + "type": "string", + }, + {"name": "CreationTime", "shape": "Timestamp", "type": "timestamp"}, + {"name": "EndTime", "shape": "Timestamp", "type": "timestamp"}, + ], + "type": "structure", + }, + "AIRecommendationJobSummaryList": { + "member_shape": "AIRecommendationJobSummary", + "member_type": "structure", + "type": "list", + }, + "AIRecommendationList": { + "member_shape": "AIRecommendation", + "member_type": "structure", + "type": "list", + }, + "AIRecommendationModelDetails": { + "members": [ + {"name": "ModelPackageArn", "shape": "ModelPackageArn", "type": "string"}, + { + "name": "InferenceSpecificationName", + "shape": "AIInferenceSpecificationName", + "type": "string", + }, + { + "name": "InstanceDetails", + "shape": "AIRecommendationInstanceDetailList", + "type": "list", + }, + ], + "type": "structure", + }, + "AIRecommendationOptimizationConfigMap": { + "key_shape": "String", + "key_type": "string", + "type": "map", + "value_shape": "String", + "value_type": "string", + }, + "AIRecommendationOptimizationDetail": { + "members": [ + { + "name": "OptimizationType", + "shape": "AIRecommendationOptimizationType", + "type": "string", 
+ }, + { + "name": "OptimizationConfig", + "shape": "AIRecommendationOptimizationConfigMap", + "type": "map", + }, + ], + "type": "structure", + }, + "AIRecommendationOptimizationDetailList": { + "member_shape": "AIRecommendationOptimizationDetail", + "member_type": "structure", + "type": "list", + }, + "AIRecommendationOutputConfig": { + "members": [ + {"name": "S3OutputLocation", "shape": "S3Uri", "type": "string"}, + { + "name": "ModelPackageGroupIdentifier", + "shape": "AIResourceIdentifier", + "type": "string", + }, + ], + "type": "structure", + }, + "AIRecommendationOutputResult": { + "members": [ + {"name": "S3OutputLocation", "shape": "S3Uri", "type": "string"}, + { + "name": "ModelPackageGroupIdentifier", + "shape": "AIResourceIdentifier", + "type": "string", + }, + ], + "type": "structure", + }, + "AIRecommendationPerformanceMetric": { + "members": [ + {"name": "Metric", "shape": "String", "type": "string"}, + {"name": "Stat", "shape": "String", "type": "string"}, + {"name": "Value", "shape": "String", "type": "string"}, + {"name": "Unit", "shape": "String", "type": "string"}, + ], + "type": "structure", + }, + "AIRecommendationPerformanceTarget": { + "members": [ + {"name": "Constraints", "shape": "AIRecommendationConstraintList", "type": "list"} + ], + "type": "structure", + }, + "AIWorkloadConfigSummary": { + "members": [ + {"name": "AIWorkloadConfigName", "shape": "AIEntityName", "type": "string"}, + {"name": "AIWorkloadConfigArn", "shape": "AIWorkloadConfigArn", "type": "string"}, + {"name": "CreationTime", "shape": "Timestamp", "type": "timestamp"}, + ], + "type": "structure", + }, + "AIWorkloadConfigSummaryList": { + "member_shape": "AIWorkloadConfigSummary", + "member_type": "structure", + "type": "list", + }, + "AIWorkloadConfigs": { + "members": [{"name": "WorkloadSpec", "shape": "WorkloadSpec", "type": "structure"}], + "type": "structure", + }, + "AIWorkloadDataSource": { + "members": [ + {"name": "S3DataSource", "shape": 
"AIWorkloadS3DataSource", "type": "structure"} + ], + "type": "structure", + }, + "AIWorkloadInputDataConfig": { + "members": [ + {"name": "ChannelName", "shape": "AIChannelName", "type": "string"}, + {"name": "DataSource", "shape": "AIWorkloadDataSource", "type": "structure"}, + ], + "type": "structure", + }, + "AIWorkloadInputDataConfigList": { + "member_shape": "AIWorkloadInputDataConfig", + "member_type": "structure", + "type": "list", + }, + "AIWorkloadS3DataSource": { + "members": [{"name": "S3Uri", "shape": "S3Uri", "type": "string"}], + "type": "structure", + }, "AbsoluteBorrowLimitResourceList": { "member_shape": "ComputeQuotaResourceConfig", "member_type": "structure", @@ -2520,6 +2855,78 @@ ], "type": "structure", }, + "CreateAIBenchmarkJobRequest": { + "members": [ + {"name": "AIBenchmarkJobName", "shape": "AIEntityName", "type": "string"}, + {"name": "BenchmarkTarget", "shape": "AIBenchmarkTarget", "type": "structure"}, + {"name": "OutputConfig", "shape": "AIBenchmarkOutputConfig", "type": "structure"}, + { + "name": "AIWorkloadConfigIdentifier", + "shape": "AIResourceIdentifier", + "type": "string", + }, + {"name": "RoleArn", "shape": "RoleArn", "type": "string"}, + {"name": "NetworkConfig", "shape": "AIBenchmarkNetworkConfig", "type": "structure"}, + {"name": "Tags", "shape": "TagList", "type": "list"}, + ], + "type": "structure", + }, + "CreateAIBenchmarkJobResponse": { + "members": [{"name": "AIBenchmarkJobArn", "shape": "AIBenchmarkJobArn", "type": "string"}], + "type": "structure", + }, + "CreateAIRecommendationJobRequest": { + "members": [ + {"name": "AIRecommendationJobName", "shape": "AIEntityName", "type": "string"}, + {"name": "ModelSource", "shape": "AIModelSource", "type": "structure"}, + {"name": "OutputConfig", "shape": "AIRecommendationOutputConfig", "type": "structure"}, + { + "name": "AIWorkloadConfigIdentifier", + "shape": "AIResourceIdentifier", + "type": "string", + }, + { + "name": "PerformanceTarget", + "shape": 
"AIRecommendationPerformanceTarget", + "type": "structure", + }, + {"name": "RoleArn", "shape": "RoleArn", "type": "string"}, + { + "name": "InferenceSpecification", + "shape": "AIRecommendationInferenceSpecification", + "type": "structure", + }, + { + "name": "OptimizeModel", + "shape": "AIRecommendationAllowOptimization", + "type": "boolean", + }, + {"name": "ComputeSpec", "shape": "AIRecommendationComputeSpec", "type": "structure"}, + {"name": "Tags", "shape": "TagList", "type": "list"}, + ], + "type": "structure", + }, + "CreateAIRecommendationJobResponse": { + "members": [ + {"name": "AIRecommendationJobArn", "shape": "AIRecommendationJobArn", "type": "string"} + ], + "type": "structure", + }, + "CreateAIWorkloadConfigRequest": { + "members": [ + {"name": "AIWorkloadConfigName", "shape": "AIEntityName", "type": "string"}, + {"name": "DatasetConfig", "shape": "AIDatasetConfig", "type": "structure"}, + {"name": "AIWorkloadConfigs", "shape": "AIWorkloadConfigs", "type": "structure"}, + {"name": "Tags", "shape": "TagList", "type": "list"}, + ], + "type": "structure", + }, + "CreateAIWorkloadConfigResponse": { + "members": [ + {"name": "AIWorkloadConfigArn", "shape": "AIWorkloadConfigArn", "type": "string"} + ], + "type": "structure", + }, "CreateActionRequest": { "members": [ {"name": "ActionName", "shape": "ExperimentEntityName", "type": "string"}, @@ -4391,6 +4798,34 @@ ], "type": "structure", }, + "DeleteAIBenchmarkJobRequest": { + "members": [{"name": "AIBenchmarkJobName", "shape": "AIEntityName", "type": "string"}], + "type": "structure", + }, + "DeleteAIBenchmarkJobResponse": { + "members": [{"name": "AIBenchmarkJobArn", "shape": "AIBenchmarkJobArn", "type": "string"}], + "type": "structure", + }, + "DeleteAIRecommendationJobRequest": { + "members": [{"name": "AIRecommendationJobName", "shape": "AIEntityName", "type": "string"}], + "type": "structure", + }, + "DeleteAIRecommendationJobResponse": { + "members": [ + {"name": "AIRecommendationJobArn", "shape": 
"AIRecommendationJobArn", "type": "string"} + ], + "type": "structure", + }, + "DeleteAIWorkloadConfigRequest": { + "members": [{"name": "AIWorkloadConfigName", "shape": "AIEntityName", "type": "string"}], + "type": "structure", + }, + "DeleteAIWorkloadConfigResponse": { + "members": [ + {"name": "AIWorkloadConfigArn", "shape": "AIWorkloadConfigArn", "type": "string"} + ], + "type": "structure", + }, "DeleteActionRequest": { "members": [{"name": "ActionName", "shape": "ExperimentEntityName", "type": "string"}], "type": "structure", @@ -4894,6 +5329,93 @@ ], "type": "structure", }, + "DescribeAIBenchmarkJobRequest": { + "members": [{"name": "AIBenchmarkJobName", "shape": "AIEntityName", "type": "string"}], + "type": "structure", + }, + "DescribeAIBenchmarkJobResponse": { + "members": [ + {"name": "AIBenchmarkJobName", "shape": "AIEntityName", "type": "string"}, + {"name": "AIBenchmarkJobArn", "shape": "AIBenchmarkJobArn", "type": "string"}, + {"name": "AIBenchmarkJobStatus", "shape": "AIBenchmarkJobStatus", "type": "string"}, + {"name": "FailureReason", "shape": "FailureReason", "type": "string"}, + {"name": "BenchmarkTarget", "shape": "AIBenchmarkTarget", "type": "structure"}, + {"name": "OutputConfig", "shape": "AIBenchmarkOutputResult", "type": "structure"}, + { + "name": "AIWorkloadConfigIdentifier", + "shape": "AIResourceIdentifier", + "type": "string", + }, + {"name": "RoleArn", "shape": "RoleArn", "type": "string"}, + {"name": "NetworkConfig", "shape": "AIBenchmarkNetworkConfig", "type": "structure"}, + {"name": "CreationTime", "shape": "Timestamp", "type": "timestamp"}, + {"name": "StartTime", "shape": "Timestamp", "type": "timestamp"}, + {"name": "EndTime", "shape": "Timestamp", "type": "timestamp"}, + {"name": "Tags", "shape": "TagList", "type": "list"}, + ], + "type": "structure", + }, + "DescribeAIRecommendationJobRequest": { + "members": [{"name": "AIRecommendationJobName", "shape": "AIEntityName", "type": "string"}], + "type": "structure", + }, + 
"DescribeAIRecommendationJobResponse": { + "members": [ + {"name": "AIRecommendationJobName", "shape": "AIEntityName", "type": "string"}, + {"name": "AIRecommendationJobArn", "shape": "AIRecommendationJobArn", "type": "string"}, + { + "name": "AIRecommendationJobStatus", + "shape": "AIRecommendationJobStatus", + "type": "string", + }, + {"name": "FailureReason", "shape": "FailureReason", "type": "string"}, + {"name": "ModelSource", "shape": "AIModelSource", "type": "structure"}, + {"name": "OutputConfig", "shape": "AIRecommendationOutputResult", "type": "structure"}, + { + "name": "InferenceSpecification", + "shape": "AIRecommendationInferenceSpecification", + "type": "structure", + }, + { + "name": "AIWorkloadConfigIdentifier", + "shape": "AIResourceIdentifier", + "type": "string", + }, + { + "name": "OptimizeModel", + "shape": "AIRecommendationAllowOptimization", + "type": "boolean", + }, + { + "name": "PerformanceTarget", + "shape": "AIRecommendationPerformanceTarget", + "type": "structure", + }, + {"name": "Recommendations", "shape": "AIRecommendationList", "type": "list"}, + {"name": "RoleArn", "shape": "RoleArn", "type": "string"}, + {"name": "ComputeSpec", "shape": "AIRecommendationComputeSpec", "type": "structure"}, + {"name": "CreationTime", "shape": "Timestamp", "type": "timestamp"}, + {"name": "StartTime", "shape": "Timestamp", "type": "timestamp"}, + {"name": "EndTime", "shape": "Timestamp", "type": "timestamp"}, + {"name": "Tags", "shape": "TagList", "type": "list"}, + ], + "type": "structure", + }, + "DescribeAIWorkloadConfigRequest": { + "members": [{"name": "AIWorkloadConfigName", "shape": "AIEntityName", "type": "string"}], + "type": "structure", + }, + "DescribeAIWorkloadConfigResponse": { + "members": [ + {"name": "AIWorkloadConfigName", "shape": "AIEntityName", "type": "string"}, + {"name": "AIWorkloadConfigArn", "shape": "AIWorkloadConfigArn", "type": "string"}, + {"name": "DatasetConfig", "shape": "AIDatasetConfig", "type": "structure"}, + 
{"name": "AIWorkloadConfigs", "shape": "AIWorkloadConfigs", "type": "structure"}, + {"name": "Tags", "shape": "TagList", "type": "list"}, + {"name": "CreationTime", "shape": "Timestamp", "type": "timestamp"}, + ], + "type": "structure", + }, "DescribeActionRequest": { "members": [{"name": "ActionName", "shape": "ExperimentEntityNameOrArn", "type": "string"}], "type": "structure", @@ -7774,6 +8296,11 @@ "type": "structure", }, "ExecutionRoleArns": {"member_shape": "RoleArn", "member_type": "string", "type": "list"}, + "ExpectedPerformanceList": { + "member_shape": "AIRecommendationPerformanceMetric", + "member_type": "structure", + "type": "list", + }, "Experiment": { "members": [ {"name": "ExperimentName", "shape": "ExperimentEntityName", "type": "string"}, @@ -9769,6 +10296,69 @@ ], "type": "structure", }, + "ListAIBenchmarkJobsRequest": { + "members": [ + {"name": "MaxResults", "shape": "MaxResults", "type": "integer"}, + {"name": "NextToken", "shape": "NextToken", "type": "string"}, + {"name": "NameContains", "shape": "NameContains", "type": "string"}, + {"name": "StatusEquals", "shape": "AIBenchmarkJobStatus", "type": "string"}, + {"name": "CreationTimeAfter", "shape": "Timestamp", "type": "timestamp"}, + {"name": "CreationTimeBefore", "shape": "Timestamp", "type": "timestamp"}, + {"name": "SortBy", "shape": "ListAIBenchmarkJobsSortBy", "type": "string"}, + {"name": "SortOrder", "shape": "SortOrder", "type": "string"}, + ], + "type": "structure", + }, + "ListAIBenchmarkJobsResponse": { + "members": [ + {"name": "AIBenchmarkJobs", "shape": "AIBenchmarkJobSummaryList", "type": "list"}, + {"name": "NextToken", "shape": "NextToken", "type": "string"}, + ], + "type": "structure", + }, + "ListAIRecommendationJobsRequest": { + "members": [ + {"name": "MaxResults", "shape": "MaxResults", "type": "integer"}, + {"name": "NextToken", "shape": "NextToken", "type": "string"}, + {"name": "NameContains", "shape": "NameContains", "type": "string"}, + {"name": "StatusEquals", 
"shape": "AIRecommendationJobStatus", "type": "string"}, + {"name": "CreationTimeAfter", "shape": "Timestamp", "type": "timestamp"}, + {"name": "CreationTimeBefore", "shape": "Timestamp", "type": "timestamp"}, + {"name": "SortBy", "shape": "ListAIRecommendationJobsSortBy", "type": "string"}, + {"name": "SortOrder", "shape": "SortOrder", "type": "string"}, + ], + "type": "structure", + }, + "ListAIRecommendationJobsResponse": { + "members": [ + { + "name": "AIRecommendationJobs", + "shape": "AIRecommendationJobSummaryList", + "type": "list", + }, + {"name": "NextToken", "shape": "NextToken", "type": "string"}, + ], + "type": "structure", + }, + "ListAIWorkloadConfigsRequest": { + "members": [ + {"name": "MaxResults", "shape": "MaxResults", "type": "integer"}, + {"name": "NextToken", "shape": "NextToken", "type": "string"}, + {"name": "NameContains", "shape": "NameContains", "type": "string"}, + {"name": "CreationTimeAfter", "shape": "Timestamp", "type": "timestamp"}, + {"name": "CreationTimeBefore", "shape": "Timestamp", "type": "timestamp"}, + {"name": "SortBy", "shape": "ListAIWorkloadConfigsSortBy", "type": "string"}, + {"name": "SortOrder", "shape": "SortOrder", "type": "string"}, + ], + "type": "structure", + }, + "ListAIWorkloadConfigsResponse": { + "members": [ + {"name": "AIWorkloadConfigs", "shape": "AIWorkloadConfigSummaryList", "type": "list"}, + {"name": "NextToken", "shape": "NextToken", "type": "string"}, + ], + "type": "structure", + }, "ListActionsRequest": { "members": [ {"name": "SourceUri", "shape": "SourceUri", "type": "string"}, @@ -12300,6 +12890,11 @@ {"name": "Framework", "shape": "String", "type": "string"}, {"name": "FrameworkVersion", "shape": "ModelPackageFrameworkVersion", "type": "string"}, {"name": "NearestModelName", "shape": "String", "type": "string"}, + { + "name": "AdditionalModelDataSources", + "shape": "AdditionalModelDataSources", + "type": "list", + }, { "name": "AdditionalS3DataSource", "shape": "AdditionalS3DataSource", @@ 
-15625,6 +16220,24 @@ "value_shape": "SchedulerResourceStatus", "value_type": "string", }, + "StopAIBenchmarkJobRequest": { + "members": [{"name": "AIBenchmarkJobName", "shape": "AIEntityName", "type": "string"}], + "type": "structure", + }, + "StopAIBenchmarkJobResponse": { + "members": [{"name": "AIBenchmarkJobArn", "shape": "AIBenchmarkJobArn", "type": "string"}], + "type": "structure", + }, + "StopAIRecommendationJobRequest": { + "members": [{"name": "AIRecommendationJobName", "shape": "AIEntityName", "type": "string"}], + "type": "structure", + }, + "StopAIRecommendationJobResponse": { + "members": [ + {"name": "AIRecommendationJobArn", "shape": "AIRecommendationJobArn", "type": "string"} + ], + "type": "structure", + }, "StopAutoMLJobRequest": { "members": [{"name": "AutoMLJobName", "shape": "AutoMLJobName", "type": "string"}], "type": "structure", @@ -17933,6 +18546,10 @@ "type": "structure", }, "Workforces": {"member_shape": "Workforce", "member_type": "structure", "type": "list"}, + "WorkloadSpec": { + "members": [{"name": "Inline", "shape": "String", "type": "string"}], + "type": "structure", + }, "WorkspaceSettings": { "members": [ {"name": "S3ArtifactPath", "shape": "S3Uri", "type": "string"}, diff --git a/src/sagemaker_core/main/config_schema.py b/src/sagemaker_core/main/config_schema.py index 47fa38e..f352b2a 100644 --- a/src/sagemaker_core/main/config_schema.py +++ b/src/sagemaker_core/main/config_schema.py @@ -16,6 +16,33 @@ "Resources": { "type": "object", "properties": { + "AIBenchmarkJob": { + "type": "object", + "properties": { + "output_config": {"s3_output_location": {"type": "string"}}, + "role_arn": {"type": "string"}, + "network_config": { + "vpc_config": { + "security_group_ids": { + "type": "array", + "items": {"type": "string"}, + }, + "subnets": { + "type": "array", + "items": {"type": "string"}, + }, + } + }, + }, + }, + "AIRecommendationJob": { + "type": "object", + "properties": { + "model_source": {"s3": {"s3_uri": {"type": 
"string"}}}, + "output_config": {"s3_output_location": {"type": "string"}}, + "role_arn": {"type": "string"}, + }, + }, "Algorithm": { "type": "object", "properties": { diff --git a/src/sagemaker_core/main/resources.py b/src/sagemaker_core/main/resources.py index 013bf07..9f6d76c 100644 --- a/src/sagemaker_core/main/resources.py +++ b/src/sagemaker_core/main/resources.py @@ -142,6 +142,1148 @@ def wrapper(*args, **kwargs): return wrapper +class AIBenchmarkJob(Base): + """ + Class representing resource AIBenchmarkJob + + Attributes: + ai_benchmark_job_name: The name of the AI benchmark job. + ai_benchmark_job_arn: The Amazon Resource Name (ARN) of the AI benchmark job. + ai_benchmark_job_status: The status of the AI benchmark job. + benchmark_target: The target endpoint that was benchmarked. + output_config: The output configuration for the benchmark job, including the Amazon S3 output location and CloudWatch log information. + ai_workload_config_identifier: The name or Amazon Resource Name (ARN) of the AI workload configuration used for this benchmark job. + role_arn: The Amazon Resource Name (ARN) of the IAM role used by the benchmark job. + creation_time: A timestamp that indicates when the benchmark job was created. + failure_reason: If the benchmark job failed, the reason it failed. + network_config: The network configuration for the benchmark job. + start_time: A timestamp that indicates when the benchmark job started running. + end_time: A timestamp that indicates when the benchmark job completed. + tags: The tags associated with the benchmark job. 
+ + """ + + ai_benchmark_job_name: str + ai_benchmark_job_arn: Optional[str] = Unassigned() + ai_benchmark_job_status: Optional[str] = Unassigned() + failure_reason: Optional[str] = Unassigned() + benchmark_target: Optional[shapes.AIBenchmarkTarget] = Unassigned() + output_config: Optional[shapes.AIBenchmarkOutputResult] = Unassigned() + ai_workload_config_identifier: Optional[str] = Unassigned() + role_arn: Optional[str] = Unassigned() + network_config: Optional[shapes.AIBenchmarkNetworkConfig] = Unassigned() + creation_time: Optional[datetime.datetime] = Unassigned() + start_time: Optional[datetime.datetime] = Unassigned() + end_time: Optional[datetime.datetime] = Unassigned() + tags: Optional[List[shapes.Tag]] = Unassigned() + + def get_name(self) -> str: + attributes = vars(self) + resource_name = "ai_benchmark_job_name" + resource_name_split = resource_name.split("_") + attribute_name_candidates = [] + + l = len(resource_name_split) + for i in range(0, l): + attribute_name_candidates.append("_".join(resource_name_split[i:l])) + + for attribute, value in attributes.items(): + if attribute == "name" or attribute in attribute_name_candidates: + return value + logger.error("Name attribute not found for object ai_benchmark_job") + return None + + @classmethod + @Base.add_validate_call + def create( + cls, + ai_benchmark_job_name: str, + benchmark_target: shapes.AIBenchmarkTarget, + output_config: shapes.AIBenchmarkOutputConfig, + ai_workload_config_identifier: str, + role_arn: str, + network_config: Optional[shapes.AIBenchmarkNetworkConfig] = Unassigned(), + tags: Optional[List[shapes.Tag]] = Unassigned(), + session: Optional[Session] = None, + region: Optional[str] = None, + ) -> Optional["AIBenchmarkJob"]: + """ + Create a AIBenchmarkJob resource + + Parameters: + ai_benchmark_job_name: The name of the AI benchmark job. The name must be unique within your Amazon Web Services account in the current Amazon Web Services Region. 
+ benchmark_target: The target endpoint to benchmark. Specify a SageMaker endpoint by providing its name or Amazon Resource Name (ARN). + output_config: The output configuration for the benchmark job, including the Amazon S3 location where benchmark results are stored. + ai_workload_config_identifier: The name or Amazon Resource Name (ARN) of the AI workload configuration to use for this benchmark job. + role_arn: The Amazon Resource Name (ARN) of an IAM role that enables Amazon SageMaker AI to perform tasks on your behalf. + network_config: The network configuration for the benchmark job, including VPC settings. + tags: The metadata that you apply to Amazon Web Services resources to help you categorize and organize them. Each tag consists of a key and a value, both of which you define. + session: Boto3 session. + region: Region name. + + Returns: + The AIBenchmarkJob resource. + + Raises: + botocore.exceptions.ClientError: This exception is raised for AWS service related errors. + The error message and error code can be parsed from the exception as follows: + ``` + try: + # AWS service call here + except botocore.exceptions.ClientError as e: + error_message = e.response['Error']['Message'] + error_code = e.response['Error']['Code'] + ``` + ResourceInUse: Resource being accessed is in use. + ResourceLimitExceeded: You have exceeded a SageMaker resource limit. For example, you might have too many training jobs created. + ResourceNotFound: Resource being accessed is not found. 
+ ConfigSchemaValidationError: Raised when a configuration file does not adhere to the schema + LocalConfigNotFoundError: Raised when a configuration file is not found in local file system + S3ConfigNotFoundError: Raised when a configuration file is not found in S3 + """ + + logger.info("Creating ai_benchmark_job resource.") + client = Base.get_sagemaker_client( + session=session, region_name=region, service_name="sagemaker" + ) + + operation_input_args = { + "AIBenchmarkJobName": ai_benchmark_job_name, + "BenchmarkTarget": benchmark_target, + "OutputConfig": output_config, + "AIWorkloadConfigIdentifier": ai_workload_config_identifier, + "RoleArn": role_arn, + "NetworkConfig": network_config, + "Tags": tags, + } + + operation_input_args = Base.populate_chained_attributes( + resource_name="AIBenchmarkJob", operation_input_args=operation_input_args + ) + + logger.debug(f"Input request: {operation_input_args}") + # serialize the input request + operation_input_args = serialize(operation_input_args) + logger.debug(f"Serialized input request: {operation_input_args}") + + # create the resource + response = client.create_ai_benchmark_job(**operation_input_args) + logger.debug(f"Response: {response}") + + return cls.get(ai_benchmark_job_name=ai_benchmark_job_name, session=session, region=region) + + @classmethod + @Base.add_validate_call + def get( + cls, + ai_benchmark_job_name: str, + session: Optional[Session] = None, + region: Optional[str] = None, + ) -> Optional["AIBenchmarkJob"]: + """ + Get an AIBenchmarkJob resource + + Parameters: + ai_benchmark_job_name: The name of the AI benchmark job to describe. + session: Boto3 session. + region: Region name. + + Returns: + The AIBenchmarkJob resource. + + Raises: + botocore.exceptions.ClientError: This exception is raised for AWS service related errors. 
+ The error message and error code can be parsed from the exception as follows: + ``` + try: + # AWS service call here + except botocore.exceptions.ClientError as e: + error_message = e.response['Error']['Message'] + error_code = e.response['Error']['Code'] + ``` + ResourceNotFound: Resource being accessed is not found. + """ + + operation_input_args = { + "AIBenchmarkJobName": ai_benchmark_job_name, + } + # serialize the input request + operation_input_args = serialize(operation_input_args) + logger.debug(f"Serialized input request: {operation_input_args}") + + client = Base.get_sagemaker_client( + session=session, region_name=region, service_name="sagemaker" + ) + response = client.describe_ai_benchmark_job(**operation_input_args) + + logger.debug(response) + + # deserialize the response + transformed_response = transform(response, "DescribeAIBenchmarkJobResponse") + ai_benchmark_job = cls(**transformed_response) + return ai_benchmark_job + + @Base.add_validate_call + def refresh( + self, + ) -> Optional["AIBenchmarkJob"]: + """ + Refresh an AIBenchmarkJob resource + + Returns: + The AIBenchmarkJob resource. + + Raises: + botocore.exceptions.ClientError: This exception is raised for AWS service related errors. + The error message and error code can be parsed from the exception as follows: + ``` + try: + # AWS service call here + except botocore.exceptions.ClientError as e: + error_message = e.response['Error']['Message'] + error_code = e.response['Error']['Code'] + ``` + ResourceNotFound: Resource being accessed is not found. 
+ """ + + operation_input_args = { + "AIBenchmarkJobName": self.ai_benchmark_job_name, + } + # serialize the input request + operation_input_args = serialize(operation_input_args) + logger.debug(f"Serialized input request: {operation_input_args}") + + client = Base.get_sagemaker_client() + response = client.describe_ai_benchmark_job(**operation_input_args) + + # deserialize response and update self + transform(response, "DescribeAIBenchmarkJobResponse", self) + return self + + @Base.add_validate_call + def delete( + self, + ) -> None: + """ + Delete a AIBenchmarkJob resource + + Raises: + botocore.exceptions.ClientError: This exception is raised for AWS service related errors. + The error message and error code can be parsed from the exception as follows: + ``` + try: + # AWS service call here + except botocore.exceptions.ClientError as e: + error_message = e.response['Error']['Message'] + error_code = e.response['Error']['Code'] + ``` + ResourceNotFound: Resource being access is not found. + """ + + client = Base.get_sagemaker_client() + + operation_input_args = { + "AIBenchmarkJobName": self.ai_benchmark_job_name, + } + # serialize the input request + operation_input_args = serialize(operation_input_args) + logger.debug(f"Serialized input request: {operation_input_args}") + + client.delete_ai_benchmark_job(**operation_input_args) + + logger.info(f"Deleting {self.__class__.__name__} - {self.get_name()}") + + @Base.add_validate_call + def stop(self) -> None: + """ + Stop a AIBenchmarkJob resource + + Raises: + botocore.exceptions.ClientError: This exception is raised for AWS service related errors. + The error message and error code can be parsed from the exception as follows: + ``` + try: + # AWS service call here + except botocore.exceptions.ClientError as e: + error_message = e.response['Error']['Message'] + error_code = e.response['Error']['Code'] + ``` + ResourceNotFound: Resource being access is not found. 
+ """ + + client = SageMakerClient().client + + operation_input_args = { + "AIBenchmarkJobName": self.ai_benchmark_job_name, + } + # serialize the input request + operation_input_args = serialize(operation_input_args) + logger.debug(f"Serialized input request: {operation_input_args}") + + client.stop_ai_benchmark_job(**operation_input_args) + + logger.info(f"Stopping {self.__class__.__name__} - {self.get_name()}") + + @Base.add_validate_call + def wait( + self, + poll: int = 5, + timeout: Optional[int] = None, + ) -> None: + """ + Wait for a AIBenchmarkJob resource. + + Parameters: + poll: The number of seconds to wait between each poll. + timeout: The maximum number of seconds to wait before timing out. + + Raises: + TimeoutExceededError: If the resource does not reach a terminal state before the timeout. + FailedStatusError: If the resource reaches a failed state. + WaiterError: Raised when an error occurs while waiting. + + """ + terminal_states = ["Completed", "Failed", "Stopped"] + start_time = time.time() + + progress = Progress( + SpinnerColumn("bouncingBar"), + TextColumn("{task.description}"), + TimeElapsedColumn(), + ) + progress.add_task("Waiting for AIBenchmarkJob...") + status = Status("Current status:") + + with Live( + Panel( + Group(progress, status), + title="Wait Log Panel", + border_style=Style(color=Color.BLUE.value), + ), + transient=True, + ): + while True: + self.refresh() + current_status = self.ai_benchmark_job_status + status.update(f"Current status: [bold]{current_status}") + + if current_status in terminal_states: + logger.info(f"Final Resource Status: [bold]{current_status}") + + if "failed" in current_status.lower(): + raise FailedStatusError( + resource_type="AIBenchmarkJob", + status=current_status, + reason=self.failure_reason, + ) + + return + + if timeout is not None and time.time() - start_time >= timeout: + raise TimeoutExceededError(resouce_type="AIBenchmarkJob", status=current_status) + time.sleep(poll) + + @classmethod + 
@Base.add_validate_call + def get_all( + cls, + name_contains: Optional[str] = Unassigned(), + status_equals: Optional[str] = Unassigned(), + creation_time_after: Optional[datetime.datetime] = Unassigned(), + creation_time_before: Optional[datetime.datetime] = Unassigned(), + sort_by: Optional[str] = Unassigned(), + sort_order: Optional[str] = Unassigned(), + session: Optional[Session] = None, + region: Optional[str] = None, + ) -> ResourceIterator["AIBenchmarkJob"]: + """ + Get all AIBenchmarkJob resources + + Parameters: + max_results: The maximum number of benchmark jobs to return in the response. + next_token: If the previous call to ListAIBenchmarkJobs didn't return the full set of jobs, the call returns a token for getting the next set. + name_contains: A string in the job name. This filter returns only jobs whose name contains the specified string. + status_equals: A filter that returns only benchmark jobs with the specified status. + creation_time_after: A filter that returns only jobs created after the specified time. + creation_time_before: A filter that returns only jobs created before the specified time. + sort_by: The field to sort results by. The default is CreationTime. + sort_order: The sort order for results. The default is Descending. + session: Boto3 session. + region: Region name. + + Returns: + Iterator for listed AIBenchmarkJob resources. + + Raises: + botocore.exceptions.ClientError: This exception is raised for AWS service related errors. 
+ The error message and error code can be parsed from the exception as follows: + ``` + try: + # AWS service call here + except botocore.exceptions.ClientError as e: + error_message = e.response['Error']['Message'] + error_code = e.response['Error']['Code'] + ``` + """ + + client = Base.get_sagemaker_client( + session=session, region_name=region, service_name="sagemaker" + ) + + operation_input_args = { + "NameContains": name_contains, + "StatusEquals": status_equals, + "CreationTimeAfter": creation_time_after, + "CreationTimeBefore": creation_time_before, + "SortBy": sort_by, + "SortOrder": sort_order, + } + + # serialize the input request + operation_input_args = serialize(operation_input_args) + logger.debug(f"Serialized input request: {operation_input_args}") + + return ResourceIterator( + client=client, + list_method="list_ai_benchmark_jobs", + summaries_key="AIBenchmarkJobs", + summary_name="AIBenchmarkJobSummary", + resource_cls=AIBenchmarkJob, + list_method_kwargs=operation_input_args, + ) + + +class AIRecommendationJob(Base): + """ + Class representing resource AIRecommendationJob + + Attributes: + ai_recommendation_job_name: The name of the AI recommendation job. + ai_recommendation_job_arn: The Amazon Resource Name (ARN) of the AI recommendation job. + ai_recommendation_job_status: The status of the AI recommendation job. + model_source: The source of the model that was analyzed. + output_config: The output configuration for the recommendation job. + ai_workload_config_identifier: The name or Amazon Resource Name (ARN) of the AI workload configuration used for this recommendation job. + role_arn: The Amazon Resource Name (ARN) of the IAM role used by the recommendation job. + creation_time: A timestamp that indicates when the recommendation job was created. + failure_reason: If the recommendation job failed, the reason it failed. + inference_specification: The inference framework configuration. 
+ optimize_model: Whether model optimization techniques were allowed. + performance_target: The performance targets specified for the recommendation job. + recommendations: The list of optimization recommendations generated by the job. Each recommendation includes optimization details, deployment configuration, expected performance metrics, and the associated benchmark job ARN. + compute_spec: The compute resource specification for the recommendation job. + start_time: A timestamp that indicates when the recommendation job started running. + end_time: A timestamp that indicates when the recommendation job completed. + tags: The tags associated with the recommendation job. + + """ + + ai_recommendation_job_name: str + ai_recommendation_job_arn: Optional[str] = Unassigned() + ai_recommendation_job_status: Optional[str] = Unassigned() + failure_reason: Optional[str] = Unassigned() + model_source: Optional[shapes.AIModelSource] = Unassigned() + output_config: Optional[shapes.AIRecommendationOutputResult] = Unassigned() + inference_specification: Optional[shapes.AIRecommendationInferenceSpecification] = Unassigned() + ai_workload_config_identifier: Optional[str] = Unassigned() + optimize_model: Optional[bool] = Unassigned() + performance_target: Optional[shapes.AIRecommendationPerformanceTarget] = Unassigned() + recommendations: Optional[List[shapes.AIRecommendation]] = Unassigned() + role_arn: Optional[str] = Unassigned() + compute_spec: Optional[shapes.AIRecommendationComputeSpec] = Unassigned() + creation_time: Optional[datetime.datetime] = Unassigned() + start_time: Optional[datetime.datetime] = Unassigned() + end_time: Optional[datetime.datetime] = Unassigned() + tags: Optional[List[shapes.Tag]] = Unassigned() + + def get_name(self) -> str: + attributes = vars(self) + resource_name = "ai_recommendation_job_name" + resource_name_split = resource_name.split("_") + attribute_name_candidates = [] + + l = len(resource_name_split) + for i in range(0, l): + 
attribute_name_candidates.append("_".join(resource_name_split[i:l])) + + for attribute, value in attributes.items(): + if attribute == "name" or attribute in attribute_name_candidates: + return value + logger.error("Name attribute not found for object ai_recommendation_job") + return None + + @classmethod + @Base.add_validate_call + def create( + cls, + ai_recommendation_job_name: str, + model_source: shapes.AIModelSource, + output_config: shapes.AIRecommendationOutputConfig, + ai_workload_config_identifier: str, + performance_target: shapes.AIRecommendationPerformanceTarget, + role_arn: str, + inference_specification: Optional[ + shapes.AIRecommendationInferenceSpecification + ] = Unassigned(), + optimize_model: Optional[bool] = Unassigned(), + compute_spec: Optional[shapes.AIRecommendationComputeSpec] = Unassigned(), + tags: Optional[List[shapes.Tag]] = Unassigned(), + session: Optional[Session] = None, + region: Optional[str] = None, + ) -> Optional["AIRecommendationJob"]: + """ + Create an AIRecommendationJob resource + + Parameters: + ai_recommendation_job_name: The name of the AI recommendation job. The name must be unique within your Amazon Web Services account in the current Amazon Web Services Region. + model_source: The source of the model to optimize. Specify the Amazon S3 location of the model artifacts. + output_config: The output configuration for the recommendation job, including the Amazon S3 location for results and an optional model package group where the optimized model is registered. + ai_workload_config_identifier: The name or Amazon Resource Name (ARN) of the AI workload configuration to use for this recommendation job. + performance_target: The performance targets for the recommendation job. Specify constraints on metrics such as time to first token (ttft-ms), throughput, or cost. + role_arn: The Amazon Resource Name (ARN) of an IAM role that enables Amazon SageMaker AI to perform tasks on your behalf. 
+ inference_specification: The inference framework configuration. Specify the framework (such as LMI or vLLM) for the recommendation job. + optimize_model: Whether to allow model optimization techniques such as quantization, speculative decoding, and kernel tuning. The default is true. + compute_spec: The compute resource specification for the recommendation job. You can specify up to 3 instance types to consider, and optionally provide capacity reservation configuration. + tags: The metadata that you apply to Amazon Web Services resources to help you categorize and organize them. + session: Boto3 session. + region: Region name. + + Returns: + The AIRecommendationJob resource. + + Raises: + botocore.exceptions.ClientError: This exception is raised for AWS service related errors. + The error message and error code can be parsed from the exception as follows: + ``` + try: + # AWS service call here + except botocore.exceptions.ClientError as e: + error_message = e.response['Error']['Message'] + error_code = e.response['Error']['Code'] + ``` + ResourceInUse: Resource being accessed is in use. + ResourceLimitExceeded: You have exceeded a SageMaker resource limit. For example, you might have too many training jobs created. + ResourceNotFound: Resource being accessed is not found. 
+ ConfigSchemaValidationError: Raised when a configuration file does not adhere to the schema + LocalConfigNotFoundError: Raised when a configuration file is not found in local file system + S3ConfigNotFoundError: Raised when a configuration file is not found in S3 + """ + + logger.info("Creating ai_recommendation_job resource.") + client = Base.get_sagemaker_client( + session=session, region_name=region, service_name="sagemaker" + ) + + operation_input_args = { + "AIRecommendationJobName": ai_recommendation_job_name, + "ModelSource": model_source, + "OutputConfig": output_config, + "AIWorkloadConfigIdentifier": ai_workload_config_identifier, + "PerformanceTarget": performance_target, + "RoleArn": role_arn, + "InferenceSpecification": inference_specification, + "OptimizeModel": optimize_model, + "ComputeSpec": compute_spec, + "Tags": tags, + } + + operation_input_args = Base.populate_chained_attributes( + resource_name="AIRecommendationJob", operation_input_args=operation_input_args + ) + + logger.debug(f"Input request: {operation_input_args}") + # serialize the input request + operation_input_args = serialize(operation_input_args) + logger.debug(f"Serialized input request: {operation_input_args}") + + # create the resource + response = client.create_ai_recommendation_job(**operation_input_args) + logger.debug(f"Response: {response}") + + return cls.get( + ai_recommendation_job_name=ai_recommendation_job_name, session=session, region=region + ) + + @classmethod + @Base.add_validate_call + def get( + cls, + ai_recommendation_job_name: str, + session: Optional[Session] = None, + region: Optional[str] = None, + ) -> Optional["AIRecommendationJob"]: + """ + Get an AIRecommendationJob resource + + Parameters: + ai_recommendation_job_name: The name of the AI recommendation job to describe. + session: Boto3 session. + region: Region name. + + Returns: + The AIRecommendationJob resource. 
+
+        Raises:
+            botocore.exceptions.ClientError: This exception is raised for AWS service related errors.
+                The error message and error code can be parsed from the exception as follows:
+                ```
+                try:
+                    # AWS service call here
+                except botocore.exceptions.ClientError as e:
+                    error_message = e.response['Error']['Message']
+                    error_code = e.response['Error']['Code']
+                ```
+            ResourceNotFound: Resource being accessed is not found.
+        """
+
+        operation_input_args = {
+            "AIRecommendationJobName": ai_recommendation_job_name,
+        }
+        # serialize the input request
+        operation_input_args = serialize(operation_input_args)
+        logger.debug(f"Serialized input request: {operation_input_args}")
+
+        client = Base.get_sagemaker_client(
+            session=session, region_name=region, service_name="sagemaker"
+        )
+        response = client.describe_ai_recommendation_job(**operation_input_args)
+
+        logger.debug(response)
+
+        # deserialize the response
+        transformed_response = transform(response, "DescribeAIRecommendationJobResponse")
+        ai_recommendation_job = cls(**transformed_response)
+        return ai_recommendation_job
+
+    @Base.add_validate_call
+    def refresh(
+        self,
+    ) -> Optional["AIRecommendationJob"]:
+        """
+        Refresh a AIRecommendationJob resource
+
+        Returns:
+            The AIRecommendationJob resource.
+
+        Raises:
+            botocore.exceptions.ClientError: This exception is raised for AWS service related errors.
+                The error message and error code can be parsed from the exception as follows:
+                ```
+                try:
+                    # AWS service call here
+                except botocore.exceptions.ClientError as e:
+                    error_message = e.response['Error']['Message']
+                    error_code = e.response['Error']['Code']
+                ```
+            ResourceNotFound: Resource being accessed is not found.
+ """ + + operation_input_args = { + "AIRecommendationJobName": self.ai_recommendation_job_name, + } + # serialize the input request + operation_input_args = serialize(operation_input_args) + logger.debug(f"Serialized input request: {operation_input_args}") + + client = Base.get_sagemaker_client() + response = client.describe_ai_recommendation_job(**operation_input_args) + + # deserialize response and update self + transform(response, "DescribeAIRecommendationJobResponse", self) + return self + + @Base.add_validate_call + def delete( + self, + ) -> None: + """ + Delete a AIRecommendationJob resource + + Raises: + botocore.exceptions.ClientError: This exception is raised for AWS service related errors. + The error message and error code can be parsed from the exception as follows: + ``` + try: + # AWS service call here + except botocore.exceptions.ClientError as e: + error_message = e.response['Error']['Message'] + error_code = e.response['Error']['Code'] + ``` + ResourceNotFound: Resource being access is not found. + """ + + client = Base.get_sagemaker_client() + + operation_input_args = { + "AIRecommendationJobName": self.ai_recommendation_job_name, + } + # serialize the input request + operation_input_args = serialize(operation_input_args) + logger.debug(f"Serialized input request: {operation_input_args}") + + client.delete_ai_recommendation_job(**operation_input_args) + + logger.info(f"Deleting {self.__class__.__name__} - {self.get_name()}") + + @Base.add_validate_call + def stop(self) -> None: + """ + Stop a AIRecommendationJob resource + + Raises: + botocore.exceptions.ClientError: This exception is raised for AWS service related errors. + The error message and error code can be parsed from the exception as follows: + ``` + try: + # AWS service call here + except botocore.exceptions.ClientError as e: + error_message = e.response['Error']['Message'] + error_code = e.response['Error']['Code'] + ``` + ResourceNotFound: Resource being access is not found. 
+ """ + + client = SageMakerClient().client + + operation_input_args = { + "AIRecommendationJobName": self.ai_recommendation_job_name, + } + # serialize the input request + operation_input_args = serialize(operation_input_args) + logger.debug(f"Serialized input request: {operation_input_args}") + + client.stop_ai_recommendation_job(**operation_input_args) + + logger.info(f"Stopping {self.__class__.__name__} - {self.get_name()}") + + @Base.add_validate_call + def wait( + self, + poll: int = 5, + timeout: Optional[int] = None, + ) -> None: + """ + Wait for a AIRecommendationJob resource. + + Parameters: + poll: The number of seconds to wait between each poll. + timeout: The maximum number of seconds to wait before timing out. + + Raises: + TimeoutExceededError: If the resource does not reach a terminal state before the timeout. + FailedStatusError: If the resource reaches a failed state. + WaiterError: Raised when an error occurs while waiting. + + """ + terminal_states = ["Completed", "Failed", "Stopped"] + start_time = time.time() + + progress = Progress( + SpinnerColumn("bouncingBar"), + TextColumn("{task.description}"), + TimeElapsedColumn(), + ) + progress.add_task("Waiting for AIRecommendationJob...") + status = Status("Current status:") + + with Live( + Panel( + Group(progress, status), + title="Wait Log Panel", + border_style=Style(color=Color.BLUE.value), + ), + transient=True, + ): + while True: + self.refresh() + current_status = self.ai_recommendation_job_status + status.update(f"Current status: [bold]{current_status}") + + if current_status in terminal_states: + logger.info(f"Final Resource Status: [bold]{current_status}") + + if "failed" in current_status.lower(): + raise FailedStatusError( + resource_type="AIRecommendationJob", + status=current_status, + reason=self.failure_reason, + ) + + return + + if timeout is not None and time.time() - start_time >= timeout: + raise TimeoutExceededError( + resouce_type="AIRecommendationJob", status=current_status 
+                    )
+                time.sleep(poll)
+
+    @classmethod
+    @Base.add_validate_call
+    def get_all(
+        cls,
+        name_contains: Optional[str] = Unassigned(),
+        status_equals: Optional[str] = Unassigned(),
+        creation_time_after: Optional[datetime.datetime] = Unassigned(),
+        creation_time_before: Optional[datetime.datetime] = Unassigned(),
+        sort_by: Optional[str] = Unassigned(),
+        sort_order: Optional[str] = Unassigned(),
+        session: Optional[Session] = None,
+        region: Optional[str] = None,
+    ) -> ResourceIterator["AIRecommendationJob"]:
+        """
+        Get all AIRecommendationJob resources
+
+        Parameters:
+            name_contains: A string in the job name. This filter returns only jobs whose name contains the specified string.
+            status_equals: A filter that returns only recommendation jobs with the specified status.
+            creation_time_after: A filter that returns only jobs created after the specified time.
+            creation_time_before: A filter that returns only jobs created before the specified time.
+            sort_by: The field to sort results by. The default is CreationTime.
+            sort_order: The sort order for results. The default is Descending.
+            session: Boto3 session.
+            region: Region name.
+
+        Returns:
+            Iterator for listed AIRecommendationJob resources.
+
+        Raises:
+            botocore.exceptions.ClientError: This exception is raised for AWS service related errors.
+ The error message and error code can be parsed from the exception as follows: + ``` + try: + # AWS service call here + except botocore.exceptions.ClientError as e: + error_message = e.response['Error']['Message'] + error_code = e.response['Error']['Code'] + ``` + """ + + client = Base.get_sagemaker_client( + session=session, region_name=region, service_name="sagemaker" + ) + + operation_input_args = { + "NameContains": name_contains, + "StatusEquals": status_equals, + "CreationTimeAfter": creation_time_after, + "CreationTimeBefore": creation_time_before, + "SortBy": sort_by, + "SortOrder": sort_order, + } + + # serialize the input request + operation_input_args = serialize(operation_input_args) + logger.debug(f"Serialized input request: {operation_input_args}") + + return ResourceIterator( + client=client, + list_method="list_ai_recommendation_jobs", + summaries_key="AIRecommendationJobs", + summary_name="AIRecommendationJobSummary", + resource_cls=AIRecommendationJob, + list_method_kwargs=operation_input_args, + ) + + +class AIWorkloadConfig(Base): + """ + Class representing resource AIWorkloadConfig + + Attributes: + ai_workload_config_name: The name of the AI workload configuration. + ai_workload_config_arn: The Amazon Resource Name (ARN) of the AI workload configuration. + creation_time: A timestamp that indicates when the AI workload configuration was created. + dataset_config: The dataset configuration for the workload. + ai_workload_configs: The benchmark tool configuration and workload specification. + tags: The tags associated with the AI workload configuration. 
+ + """ + + ai_workload_config_name: str + ai_workload_config_arn: Optional[str] = Unassigned() + dataset_config: Optional[shapes.AIDatasetConfig] = Unassigned() + ai_workload_configs: Optional[shapes.AIWorkloadConfigs] = Unassigned() + tags: Optional[List[shapes.Tag]] = Unassigned() + creation_time: Optional[datetime.datetime] = Unassigned() + + def get_name(self) -> str: + attributes = vars(self) + resource_name = "ai_workload_config_name" + resource_name_split = resource_name.split("_") + attribute_name_candidates = [] + + l = len(resource_name_split) + for i in range(0, l): + attribute_name_candidates.append("_".join(resource_name_split[i:l])) + + for attribute, value in attributes.items(): + if attribute == "name" or attribute in attribute_name_candidates: + return value + logger.error("Name attribute not found for object ai_workload_config") + return None + + @classmethod + @Base.add_validate_call + def create( + cls, + ai_workload_config_name: str, + dataset_config: Optional[shapes.AIDatasetConfig] = Unassigned(), + ai_workload_configs: Optional[shapes.AIWorkloadConfigs] = Unassigned(), + tags: Optional[List[shapes.Tag]] = Unassigned(), + session: Optional[Session] = None, + region: Optional[str] = None, + ) -> Optional["AIWorkloadConfig"]: + """ + Create a AIWorkloadConfig resource + + Parameters: + ai_workload_config_name: The name of the AI workload configuration. The name must be unique within your Amazon Web Services account in the current Amazon Web Services Region. + dataset_config: The dataset configuration for the workload. Specify input data channels with their data sources for benchmark workloads. + ai_workload_configs: The benchmark tool configuration and workload specification. Provide the specification as an inline YAML or JSON string. + tags: The metadata that you apply to Amazon Web Services resources to help you categorize and organize them. Each tag consists of a key and a value, both of which you define. 
For more information, see Tagging Amazon Web Services Resources in the Amazon Web Services General Reference.
+            session: Boto3 session.
+            region: Region name.
+
+        Returns:
+            The AIWorkloadConfig resource.
+
+        Raises:
+            botocore.exceptions.ClientError: This exception is raised for AWS service related errors.
+                The error message and error code can be parsed from the exception as follows:
+                ```
+                try:
+                    # AWS service call here
+                except botocore.exceptions.ClientError as e:
+                    error_message = e.response['Error']['Message']
+                    error_code = e.response['Error']['Code']
+                ```
+            ResourceInUse: Resource being accessed is in use.
+            ResourceLimitExceeded: You have exceeded a SageMaker resource limit. For example, you might have too many training jobs created.
+            ConfigSchemaValidationError: Raised when a configuration file does not adhere to the schema
+            LocalConfigNotFoundError: Raised when a configuration file is not found in local file system
+            S3ConfigNotFoundError: Raised when a configuration file is not found in S3
+        """
+
+        logger.info("Creating ai_workload_config resource.")
+        client = Base.get_sagemaker_client(
+            session=session, region_name=region, service_name="sagemaker"
+        )
+
+        operation_input_args = {
+            "AIWorkloadConfigName": ai_workload_config_name,
+            "DatasetConfig": dataset_config,
+            "AIWorkloadConfigs": ai_workload_configs,
+            "Tags": tags,
+        }
+
+        operation_input_args = Base.populate_chained_attributes(
+            resource_name="AIWorkloadConfig", operation_input_args=operation_input_args
+        )
+
+        logger.debug(f"Input request: {operation_input_args}")
+        # serialize the input request
+        operation_input_args = serialize(operation_input_args)
+        logger.debug(f"Serialized input request: {operation_input_args}")
+
+        # create the resource
+        response = client.create_ai_workload_config(**operation_input_args)
+        logger.debug(f"Response: {response}")
+
+        return cls.get(
+            ai_workload_config_name=ai_workload_config_name, session=session, region=region
+        )
+
+    @classmethod
+    @Base.add_validate_call
+    def get(
+        cls,
+        ai_workload_config_name: str,
+        session: Optional[Session] = None,
+        region: Optional[str] = None,
+    ) -> Optional["AIWorkloadConfig"]:
+        """
+        Get a AIWorkloadConfig resource
+
+        Parameters:
+            ai_workload_config_name: The name of the AI workload configuration to describe.
+            session: Boto3 session.
+            region: Region name.
+
+        Returns:
+            The AIWorkloadConfig resource.
+
+        Raises:
+            botocore.exceptions.ClientError: This exception is raised for AWS service related errors.
+                The error message and error code can be parsed from the exception as follows:
+                ```
+                try:
+                    # AWS service call here
+                except botocore.exceptions.ClientError as e:
+                    error_message = e.response['Error']['Message']
+                    error_code = e.response['Error']['Code']
+                ```
+            ResourceNotFound: Resource being accessed is not found.
+        """
+
+        operation_input_args = {
+            "AIWorkloadConfigName": ai_workload_config_name,
+        }
+        # serialize the input request
+        operation_input_args = serialize(operation_input_args)
+        logger.debug(f"Serialized input request: {operation_input_args}")
+
+        client = Base.get_sagemaker_client(
+            session=session, region_name=region, service_name="sagemaker"
+        )
+        response = client.describe_ai_workload_config(**operation_input_args)
+
+        logger.debug(response)
+
+        # deserialize the response
+        transformed_response = transform(response, "DescribeAIWorkloadConfigResponse")
+        ai_workload_config = cls(**transformed_response)
+        return ai_workload_config
+
+    @Base.add_validate_call
+    def refresh(
+        self,
+    ) -> Optional["AIWorkloadConfig"]:
+        """
+        Refresh a AIWorkloadConfig resource
+
+        Returns:
+            The AIWorkloadConfig resource.
+
+        Raises:
+            botocore.exceptions.ClientError: This exception is raised for AWS service related errors.
+                The error message and error code can be parsed from the exception as follows:
+                ```
+                try:
+                    # AWS service call here
+                except botocore.exceptions.ClientError as e:
+                    error_message = e.response['Error']['Message']
+                    error_code = e.response['Error']['Code']
+                ```
+            ResourceNotFound: Resource being accessed is not found.
+        """
+
+        operation_input_args = {
+            "AIWorkloadConfigName": self.ai_workload_config_name,
+        }
+        # serialize the input request
+        operation_input_args = serialize(operation_input_args)
+        logger.debug(f"Serialized input request: {operation_input_args}")
+
+        client = Base.get_sagemaker_client()
+        response = client.describe_ai_workload_config(**operation_input_args)
+
+        # deserialize response and update self
+        transform(response, "DescribeAIWorkloadConfigResponse", self)
+        return self
+
+    @Base.add_validate_call
+    def delete(
+        self,
+    ) -> None:
+        """
+        Delete a AIWorkloadConfig resource
+
+        Raises:
+            botocore.exceptions.ClientError: This exception is raised for AWS service related errors.
+                The error message and error code can be parsed from the exception as follows:
+                ```
+                try:
+                    # AWS service call here
+                except botocore.exceptions.ClientError as e:
+                    error_message = e.response['Error']['Message']
+                    error_code = e.response['Error']['Code']
+                ```
+            ResourceInUse: Resource being accessed is in use.
+            ResourceNotFound: Resource being accessed is not found.
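The `try`/`except` snippet repeated throughout these docstrings can be exercised without a live AWS call; below, a hand-built dict mimics the shape of botocore's `e.response`, and a hypothetical `classify` helper shows one way to branch on the error code (the retry policy is illustrative, not part of the SDK):

```python
# Shape of botocore's e.response for a failed call (simulated, no AWS call made).
error_response = {
    "Error": {
        "Code": "ResourceNotFound",
        "Message": "AI workload config 'my-config' does not exist.",  # placeholder text
    }
}

error_message = error_response["Error"]["Message"]
error_code = error_response["Error"]["Code"]

def classify(response):
    """Map a botocore-style error response to a coarse retry decision (illustrative policy)."""
    # ResourceInUse is often transient; the other errors in these docstrings are not.
    return "retry" if response["Error"]["Code"] == "ResourceInUse" else "fail"
```

With a real `botocore.exceptions.ClientError`, the same two lookups apply to `e.response`.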
+ """ + + client = Base.get_sagemaker_client() + + operation_input_args = { + "AIWorkloadConfigName": self.ai_workload_config_name, + } + # serialize the input request + operation_input_args = serialize(operation_input_args) + logger.debug(f"Serialized input request: {operation_input_args}") + + client.delete_ai_workload_config(**operation_input_args) + + logger.info(f"Deleting {self.__class__.__name__} - {self.get_name()}") + + @classmethod + @Base.add_validate_call + def get_all( + cls, + name_contains: Optional[str] = Unassigned(), + creation_time_after: Optional[datetime.datetime] = Unassigned(), + creation_time_before: Optional[datetime.datetime] = Unassigned(), + sort_by: Optional[str] = Unassigned(), + sort_order: Optional[str] = Unassigned(), + session: Optional[Session] = None, + region: Optional[str] = None, + ) -> ResourceIterator["AIWorkloadConfig"]: + """ + Get all AIWorkloadConfig resources + + Parameters: + max_results: The maximum number of AI workload configurations to return in the response. + next_token: If the previous call to ListAIWorkloadConfigs didn't return the full set of configurations, the call returns a token for getting the next set of configurations. + name_contains: A string in the configuration name. This filter returns only configurations whose name contains the specified string. + creation_time_after: A filter that returns only configurations created after the specified time. + creation_time_before: A filter that returns only configurations created before the specified time. + sort_by: The field to sort results by. The default is CreationTime. + sort_order: The sort order for results. The default is Descending. + session: Boto3 session. + region: Region name. + + Returns: + Iterator for listed AIWorkloadConfig resources. + + Raises: + botocore.exceptions.ClientError: This exception is raised for AWS service related errors. 
+ The error message and error code can be parsed from the exception as follows: + ``` + try: + # AWS service call here + except botocore.exceptions.ClientError as e: + error_message = e.response['Error']['Message'] + error_code = e.response['Error']['Code'] + ``` + """ + + client = Base.get_sagemaker_client( + session=session, region_name=region, service_name="sagemaker" + ) + + operation_input_args = { + "NameContains": name_contains, + "CreationTimeAfter": creation_time_after, + "CreationTimeBefore": creation_time_before, + "SortBy": sort_by, + "SortOrder": sort_order, + } + + # serialize the input request + operation_input_args = serialize(operation_input_args) + logger.debug(f"Serialized input request: {operation_input_args}") + + return ResourceIterator( + client=client, + list_method="list_ai_workload_configs", + summaries_key="AIWorkloadConfigs", + summary_name="AIWorkloadConfigSummary", + resource_cls=AIWorkloadConfig, + list_method_kwargs=operation_input_args, + ) + + class Action(Base): """ Class representing resource Action diff --git a/src/sagemaker_core/main/shapes.py b/src/sagemaker_core/main/shapes.py index 168be4b..d6894ec 100644 --- a/src/sagemaker_core/main/shapes.py +++ b/src/sagemaker_core/main/shapes.py @@ -452,6 +452,515 @@ class RawMetricData(Base): step: Optional[int] = Unassigned() +class AIBenchmarkInferenceComponent(Base): + """ + AIBenchmarkInferenceComponent + An inference component to benchmark. + + Attributes + ---------------------- + identifier: The name or Amazon Resource Name (ARN) of the inference component. + """ + + identifier: str + + +class AIBenchmarkEndpoint(Base): + """ + AIBenchmarkEndpoint + The SageMaker endpoint configuration for benchmarking. + + Attributes + ---------------------- + identifier: The name or Amazon Resource Name (ARN) of the SageMaker endpoint to benchmark. + target_container_hostname: The hostname of the specific container to target within a multi-container endpoint. 
+ inference_components: The list of inference components to benchmark on the endpoint. + """ + + identifier: str + target_container_hostname: Optional[str] = Unassigned() + inference_components: Optional[List[AIBenchmarkInferenceComponent]] = Unassigned() + + +class AIBenchmarkJobSummary(Base): + """ + AIBenchmarkJobSummary + Summary information about an AI benchmark job. + + Attributes + ---------------------- + ai_benchmark_job_name: The name of the benchmark job. + ai_benchmark_job_arn: The Amazon Resource Name (ARN) of the benchmark job. + ai_benchmark_job_status: The status of the benchmark job. + creation_time: A timestamp that indicates when the benchmark job was created. + end_time: A timestamp that indicates when the benchmark job completed. + ai_workload_config_name: The name of the AI workload configuration used by the benchmark job. + """ + + ai_benchmark_job_name: str + ai_benchmark_job_arn: str + ai_benchmark_job_status: str + creation_time: datetime.datetime + end_time: Optional[datetime.datetime] = Unassigned() + ai_workload_config_name: Optional[str] = Unassigned() + + +class VpcConfig(Base): + """ + VpcConfig + Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC. + + Attributes + ---------------------- + security_group_ids: The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field. + subnets: The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones. 
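When a `VpcConfig` is serialized into a request, it becomes the PascalCase structure the API expects. A small sketch with placeholder IDs (the `as_vpc_config` helper is illustrative, not part of the SDK):

```python
def as_vpc_config(security_group_ids, subnets):
    """Build the serialized VpcConfig request structure (illustrative helper)."""
    assert all(sg.startswith("sg-") for sg in security_group_ids), "expected sg-xxxxxxxx form"
    assert all(sn.startswith("subnet-") for sn in subnets)
    return {"SecurityGroupIds": list(security_group_ids), "Subnets": list(subnets)}

# Placeholder IDs, not real resources.
vpc_config = as_vpc_config(["sg-0123456789abcdef0"], ["subnet-0123456789abcdef0"])
```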
+ """ + + security_group_ids: List[str] + subnets: List[str] + + +class AIBenchmarkNetworkConfig(Base): + """ + AIBenchmarkNetworkConfig + The network configuration for an AI benchmark job. + + Attributes + ---------------------- + vpc_config: The VPC configuration, including security group IDs and subnet IDs. + """ + + vpc_config: Optional[VpcConfig] = Unassigned() + + +class AIBenchmarkOutputConfig(Base): + """ + AIBenchmarkOutputConfig + The output configuration for an AI benchmark job. + + Attributes + ---------------------- + s3_output_location: The Amazon S3 URI where benchmark results are stored. + """ + + s3_output_location: str + + +class AICloudWatchLogs(Base): + """ + AICloudWatchLogs + CloudWatch log information for an AI benchmark or recommendation job. + + Attributes + ---------------------- + log_group_arn: The Amazon Resource Name (ARN) of the CloudWatch log group. + log_stream_name: The name of the CloudWatch log stream. + """ + + log_group_arn: Optional[str] = Unassigned() + log_stream_name: Optional[str] = Unassigned() + + +class AIBenchmarkOutputResult(Base): + """ + AIBenchmarkOutputResult + The output result of an AI benchmark job, including the Amazon S3 location and CloudWatch log information. + + Attributes + ---------------------- + s3_output_location: The Amazon S3 URI where benchmark results are stored. + cloud_watch_logs: The CloudWatch log information for the benchmark job. + """ + + s3_output_location: str + cloud_watch_logs: Optional[List[AICloudWatchLogs]] = Unassigned() + + +class AIBenchmarkTarget(Base): + """ + AIBenchmarkTarget + The target for an AI benchmark job. This is a union type — specify one of the members. + + Attributes + ---------------------- + endpoint: The SageMaker endpoint to benchmark. + """ + + endpoint: Optional[AIBenchmarkEndpoint] = Unassigned() + + +class AICapacityReservationConfig(Base): + """ + AICapacityReservationConfig + The capacity reservation configuration for an AI recommendation job. 
+ + Attributes + ---------------------- + capacity_reservation_preference: The capacity reservation preference. The only valid value is capacity-reservations-only. + ml_reservation_arns: The list of ML reservation ARNs to use. + """ + + capacity_reservation_preference: Optional[str] = Unassigned() + ml_reservation_arns: Optional[List[str]] = Unassigned() + + +class AIWorkloadS3DataSource(Base): + """ + AIWorkloadS3DataSource + The Amazon S3 data source for an AI workload. + + Attributes + ---------------------- + s3_uri: The Amazon S3 URI of the data. + """ + + s3_uri: str + + +class AIWorkloadDataSource(Base): + """ + AIWorkloadDataSource + The data source for an AI workload input data channel. + + Attributes + ---------------------- + s3_data_source: The Amazon S3 data source configuration. + """ + + s3_data_source: Optional[AIWorkloadS3DataSource] = Unassigned() + + +class AIWorkloadInputDataConfig(Base): + """ + AIWorkloadInputDataConfig + A channel of input data for an AI workload configuration. Each channel has a name and a data source. + + Attributes + ---------------------- + channel_name: The logical name for the data channel. + data_source: The data source for this channel. + """ + + channel_name: str + data_source: AIWorkloadDataSource + + +class AIDatasetConfig(Base): + """ + AIDatasetConfig + The dataset configuration for an AI workload. This is a union type — specify one of the members. + + Attributes + ---------------------- + input_data_config: An array of input data channel configurations for the workload. + """ + + input_data_config: Optional[List[AIWorkloadInputDataConfig]] = Unassigned() + + +class AIModelSourceS3(Base): + """ + AIModelSourceS3 + The Amazon S3 model source configuration. + + Attributes + ---------------------- + s3_uri: The Amazon S3 URI of the model artifacts. + """ + + s3_uri: Optional[str] = Unassigned() + + +class AIModelSource(Base): + """ + AIModelSource + The source of the model for an AI recommendation job. 
This is a union type. + + Attributes + ---------------------- + s3: The Amazon S3 location of the model artifacts. + """ + + s3: Optional[AIModelSourceS3] = Unassigned() + + +class AIRecommendationOptimizationDetail(Base): + """ + AIRecommendationOptimizationDetail + Details about an optimization technique applied in a recommendation. + + Attributes + ---------------------- + optimization_type: The type of optimization. Valid values are SpeculativeDecoding and KernelTuning. + optimization_config: A map of configuration parameters for the optimization technique. + """ + + optimization_type: str + optimization_config: Optional[Dict[str, str]] = Unassigned() + + +class AIRecommendationInstanceDetail(Base): + """ + AIRecommendationInstanceDetail + Instance details for a recommendation. + + Attributes + ---------------------- + instance_type: The recommended instance type. + instance_count: The recommended number of instances. + copy_count_per_instance: The number of model copies per instance. + """ + + instance_type: Optional[str] = Unassigned() + instance_count: Optional[int] = Unassigned() + copy_count_per_instance: Optional[int] = Unassigned() + + +class AIRecommendationModelDetails(Base): + """ + AIRecommendationModelDetails + Details about the model package in a recommendation. + + Attributes + ---------------------- + model_package_arn: The Amazon Resource Name (ARN) of the model package. + inference_specification_name: The name of the inference specification within the model package. + instance_details: The instance details for this recommendation, including instance type, count, and model copies per instance. 
+ """ + + model_package_arn: Optional[str] = Unassigned() + inference_specification_name: Optional[str] = Unassigned() + instance_details: Optional[List[AIRecommendationInstanceDetail]] = Unassigned() + + +class AIRecommendationDeploymentS3Channel(Base): + """ + AIRecommendationDeploymentS3Channel + An Amazon S3 data channel for a recommended deployment configuration, containing model artifacts or optimized model outputs. + + Attributes + ---------------------- + channel_name: A custom name for this Amazon S3 data channel. + uri: The Amazon S3 URI of the data for this channel. + """ + + channel_name: Optional[str] = Unassigned() + uri: Optional[str] = Unassigned() + + +class AIRecommendationDeploymentConfiguration(Base): + """ + AIRecommendationDeploymentConfiguration + The deployment configuration for a recommendation. + + Attributes + ---------------------- + s3: The Amazon S3 data channels for the deployment. + image_uri: The URI of the container image for the deployment. + instance_type: The recommended instance type for the deployment. + instance_count: The recommended number of instances for the deployment. + copy_count_per_instance: The number of model copies per instance. + environment_variables: The environment variables for the deployment. + """ + + s3: Optional[List[AIRecommendationDeploymentS3Channel]] = Unassigned() + image_uri: Optional[str] = Unassigned() + instance_type: Optional[str] = Unassigned() + instance_count: Optional[int] = Unassigned() + copy_count_per_instance: Optional[int] = Unassigned() + environment_variables: Optional[Dict[str, str]] = Unassigned() + + +class AIRecommendationPerformanceMetric(Base): + """ + AIRecommendationPerformanceMetric + An expected performance metric for a recommendation. + + Attributes + ---------------------- + metric: The name of the performance metric. + stat: The statistical measure for the metric. + value: The value of the metric. + unit: The unit of the metric value. 
+ """ + + metric: str + value: str + stat: Optional[str] = Unassigned() + unit: Optional[str] = Unassigned() + + +class AIRecommendation(Base): + """ + AIRecommendation + An optimization recommendation generated by an AI recommendation job. + + Attributes + ---------------------- + recommendation_description: A description of the recommendation. + optimization_details: The optimization techniques applied in this recommendation. + model_details: Details about the model package associated with this recommendation. + deployment_configuration: The deployment configuration for this recommendation, including the container image, instance type, instance count, and environment variables. + ai_benchmark_job_arn: The Amazon Resource Name (ARN) of the benchmark job associated with this recommendation. + expected_performance: The expected performance metrics for this recommendation. + """ + + recommendation_description: Optional[str] = Unassigned() + optimization_details: Optional[List[AIRecommendationOptimizationDetail]] = Unassigned() + model_details: Optional[AIRecommendationModelDetails] = Unassigned() + deployment_configuration: Optional[AIRecommendationDeploymentConfiguration] = Unassigned() + ai_benchmark_job_arn: Optional[str] = Unassigned() + expected_performance: Optional[List[AIRecommendationPerformanceMetric]] = Unassigned() + + +class AIRecommendationComputeSpec(Base): + """ + AIRecommendationComputeSpec + The compute resource specification for an AI recommendation job. + + Attributes + ---------------------- + instance_types: The list of instance types to consider for recommendations. You can specify up to 3 instance types. + capacity_reservation_config: The capacity reservation configuration. 
+ """ + + instance_types: Optional[List[str]] = Unassigned() + capacity_reservation_config: Optional[AICapacityReservationConfig] = Unassigned() + + +class AIRecommendationConstraint(Base): + """ + AIRecommendationConstraint + A performance constraint for an AI recommendation job. + + Attributes + ---------------------- + metric: The performance metric. Valid values are ttft-ms (time to first token in milliseconds), throughput, and cost. + """ + + metric: str + + +class AIRecommendationInferenceSpecification(Base): + """ + AIRecommendationInferenceSpecification + The inference framework for an AI recommendation job. + + Attributes + ---------------------- + framework: The inference framework. Valid values are LMI and VLLM. + """ + + framework: Optional[str] = Unassigned() + + +class AIRecommendationJobSummary(Base): + """ + AIRecommendationJobSummary + Summary information about an AI recommendation job. + + Attributes + ---------------------- + ai_recommendation_job_name: The name of the recommendation job. + ai_recommendation_job_arn: The Amazon Resource Name (ARN) of the recommendation job. + ai_recommendation_job_status: The status of the recommendation job. + creation_time: A timestamp that indicates when the recommendation job was created. + end_time: A timestamp that indicates when the recommendation job completed. + """ + + ai_recommendation_job_name: str + ai_recommendation_job_arn: str + ai_recommendation_job_status: str + creation_time: datetime.datetime + end_time: Optional[datetime.datetime] = Unassigned() + + +class AIRecommendationOutputConfig(Base): + """ + AIRecommendationOutputConfig + The output configuration for an AI recommendation job. + + Attributes + ---------------------- + s3_output_location: The Amazon S3 URI where recommendation results are stored. + model_package_group_identifier: The name or Amazon Resource Name (ARN) of the model package group where the optimized model is registered as a new model package version. 
+ """ + + s3_output_location: Optional[str] = Unassigned() + model_package_group_identifier: Optional[str] = Unassigned() + + +class AIRecommendationOutputResult(Base): + """ + AIRecommendationOutputResult + The output configuration for an AI recommendation job, including the S3 location for results and the model package group for deployment. + + Attributes + ---------------------- + s3_output_location: The Amazon S3 URI where the recommendation job writes its output results. + model_package_group_identifier: The name or Amazon Resource Name (ARN) of the model package group where deployment-ready model packages are registered. + """ + + s3_output_location: str + model_package_group_identifier: Optional[str] = Unassigned() + + +class AIRecommendationPerformanceTarget(Base): + """ + AIRecommendationPerformanceTarget + The performance targets for an AI recommendation job. + + Attributes + ---------------------- + constraints: An array of performance constraints that define the optimization objectives. + """ + + constraints: List[AIRecommendationConstraint] + + +class AIWorkloadConfigSummary(Base): + """ + AIWorkloadConfigSummary + Summary information about an AI workload configuration. + + Attributes + ---------------------- + ai_workload_config_name: The name of the AI workload configuration. + ai_workload_config_arn: The Amazon Resource Name (ARN) of the AI workload configuration. + creation_time: A timestamp that indicates when the configuration was created. + """ + + ai_workload_config_name: str + ai_workload_config_arn: str + creation_time: datetime.datetime + + +class WorkloadSpec(Base): + """ + WorkloadSpec + The workload specification for benchmark tool configuration. Provide an inline YAML or JSON string. + + Attributes + ---------------------- + inline: An inline YAML or JSON string that defines benchmark parameters. 
+ """ + + inline: Optional[str] = Unassigned() + + +class AIWorkloadConfigs(Base): + """ + AIWorkloadConfigs + The benchmark tool configuration for an AI workload. + + Attributes + ---------------------- + workload_spec: The workload specification that defines benchmark parameters. + """ + + workload_spec: WorkloadSpec + + class AcceleratorPartitionConfig(Base): """ AcceleratorPartitionConfig @@ -635,6 +1144,21 @@ class ModelInput(Base): data_input_config: str +class AdditionalModelDataSource(Base): + """ + AdditionalModelDataSource + Data sources that are available to your model in addition to the one that you specify for ModelDataSource when you use the CreateModel action. + + Attributes + ---------------------- + channel_name: A custom name for this AdditionalModelDataSource object. + s3_data_source + """ + + channel_name: str + s3_data_source: S3ModelDataSource + + class AdditionalS3DataSource(Base): """ AdditionalS3DataSource @@ -689,6 +1213,7 @@ class ModelPackageContainerDefinition(Base): framework: The machine learning framework of the model package container image. framework_version: The framework version of the Model Package Container Image. nearest_model_name: The name of a pre-trained machine learning benchmarked by Amazon SageMaker Inference Recommender model that matches your model. You can find a list of benchmarked models by calling ListModelMetadata. + additional_model_data_sources: Data sources that are available to your model in addition to the one that you specify for ModelDataSource when you use the CreateModelPackage action. additional_s3_data_source: The additional data source that is used during inference in the Docker container for your model package. model_data_e_tag: The ETag associated with Model Data URL. is_checkpoint: Specifies whether the model data is a training checkpoint. 
@@ -706,6 +1231,7 @@ class ModelPackageContainerDefinition(Base): framework: Optional[str] = Unassigned() framework_version: Optional[str] = Unassigned() nearest_model_name: Optional[str] = Unassigned() + additional_model_data_sources: Optional[List[AdditionalModelDataSource]] = Unassigned() additional_s3_data_source: Optional[AdditionalS3DataSource] = Unassigned() model_data_e_tag: Optional[str] = Unassigned() is_checkpoint: Optional[bool] = Unassigned() @@ -737,21 +1263,6 @@ class AdditionalInferenceSpecificationDefinition(Base): supported_response_mime_types: Optional[List[str]] = Unassigned() -class AdditionalModelDataSource(Base): - """ - AdditionalModelDataSource - Data sources that are available to your model in addition to the one that you specify for ModelDataSource when you use the CreateModel action. - - Attributes - ---------------------- - channel_name: A custom name for this AdditionalModelDataSource object. - s3_data_source - """ - - channel_name: str - s3_data_source: S3ModelDataSource - - class AgentVersion(Base): """ AgentVersion @@ -2107,21 +2618,6 @@ class AutoMLJobCompletionCriteria(Base): max_auto_ml_job_runtime_in_seconds: Optional[int] = Unassigned() -class VpcConfig(Base): - """ - VpcConfig - Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC. - - Attributes - ---------------------- - security_group_ids: The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field. - subnets: The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones. 
- """ - - security_group_ids: List[str] - subnets: List[str] - - class AutoMLSecurityConfig(Base): """ AutoMLSecurityConfig @@ -6291,8 +6787,8 @@ class MetricsConfig(Base): Attributes ---------------------- - enable_enhanced_metrics: Specifies whether to enable enhanced metrics for the endpoint. Enhanced metrics provide utilization data at instance and container granularity. Container granularity is supported for Inference Components. The default is False. - metric_publish_frequency_in_seconds: The frequency, in seconds, at which utilization metrics are published to Amazon CloudWatch. The default is 60 seconds. + enable_enhanced_metrics: Specifies whether to enable enhanced metrics for the endpoint. Enhanced metrics provide utilization and invocation data at instance and container granularity. Container granularity is supported for Inference Components. The default is False. + metric_publish_frequency_in_seconds: The interval, in seconds, at which metrics are published to Amazon CloudWatch. Defaults to 60. Valid values: 10, 30, 60, 120, 180, 240, 300. When EnableEnhancedMetrics is set to False, this interval applies to utilization metrics only; invocation metrics continue to be published at the default 60-second interval. When EnableEnhancedMetrics is set to True, this interval applies to both utilization and invocation metrics. """ enable_enhanced_metrics: Optional[bool] = Unassigned() @@ -6428,8 +6924,8 @@ class OfflineStoreConfig(Base): Attributes ---------------------- s3_storage_config: The Amazon Simple Storage (Amazon S3) location of OfflineStore. - disable_glue_table_creation: Set to True to disable the automatic creation of an Amazon Web Services Glue table when configuring an OfflineStore. If set to False, Feature Store will name the OfflineStore Glue table following Athena's naming recommendations. The default value is False. - data_catalog_config: The meta data of the Glue table that is autogenerated when an OfflineStore is created. 
+ disable_glue_table_creation: Set to True to disable the automatic creation of an Amazon Web Services Glue table when configuring an OfflineStore. If set to True and DataCatalogConfig is provided, Feature Store associates the provided catalog configuration with the feature group without creating a table. In this case, you are responsible for creating and managing the Glue table. If set to True without DataCatalogConfig, no Glue table is created or associated with the feature group. The Iceberg table format is only supported when this is set to False. If set to False and DataCatalogConfig is provided, Feature Store creates the table using the specified names. If set to False without DataCatalogConfig, Feature Store auto-generates the table name following Athena's naming recommendations. This applies to both Glue and Apache Iceberg table formats. The default value is False. + data_catalog_config: The meta data of the Glue table for the OfflineStore. If not provided, Feature Store auto-generates the table name, database, and catalog when the OfflineStore is created. You can optionally provide this configuration to specify custom values. This applies to both Glue and Apache Iceberg table formats. table_format: Format for the offline store table. Supported formats are Glue (Default) and Apache Iceberg. """ diff --git a/src/sagemaker_core/tools/api_coverage.json b/src/sagemaker_core/tools/api_coverage.json index 5a8993e..7155452 100644 --- a/src/sagemaker_core/tools/api_coverage.json +++ b/src/sagemaker_core/tools/api_coverage.json @@ -1 +1 @@ -{"SupportedAPIs": 374, "UnsupportedAPIs": 17} \ No newline at end of file +{"SupportedAPIs": 388, "UnsupportedAPIs": 17} \ No newline at end of file
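Taken together, the new `AIRecommendation*` shapes sketch out the request surface of `CreateAIRecommendationJob`: a compute spec with up to 3 instance types, a performance target built from constraints whose metric must be `ttft-ms`, `throughput`, or `cost`, and an output config pointing at S3. Below is a minimal, self-contained sketch of how those pieces might compose into a request payload as plain dictionaries. The `build_recommendation_request` helper and the PascalCase wire names are hypothetical (inferred from the snake_case attributes above); only the field structure and the validation limits come from the shape definitions.

```python
# Hypothetical helper: composes a CreateAIRecommendationJob-style payload
# from the shapes defined in this diff. Wire-level key casing is an
# assumption; consult the service model for the authoritative names.

ALLOWED_METRICS = {"ttft-ms", "throughput", "cost"}  # from AIRecommendationConstraint


def build_recommendation_request(job_name, instance_types, constraint_metrics, s3_output):
    # AIRecommendationComputeSpec allows at most 3 instance types.
    if len(instance_types) > 3:
        raise ValueError("You can specify up to 3 instance types")
    # Each AIRecommendationConstraint metric must be one of the valid values.
    for metric in constraint_metrics:
        if metric not in ALLOWED_METRICS:
            raise ValueError(f"unsupported constraint metric: {metric}")
    return {
        "AIRecommendationJobName": job_name,
        "ComputeSpec": {"InstanceTypes": list(instance_types)},
        "PerformanceTarget": {
            "Constraints": [{"Metric": m} for m in constraint_metrics]
        },
        "OutputConfig": {"S3OutputLocation": s3_output},
    }


request = build_recommendation_request(
    job_name="my-reco-job",
    instance_types=["ml.g5.2xlarge", "ml.g5.12xlarge"],
    constraint_metrics=["ttft-ms", "cost"],
    s3_output="s3://my-bucket/reco-output/",
)
```

In the generated SDK itself, the same structure would be expressed through the typed shape classes (`AIRecommendationComputeSpec`, `AIRecommendationPerformanceTarget`, and so on) rather than raw dicts; the dict form here is only meant to make the nesting of the new shapes visible at a glance.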