
Use provisioned concurrency to help with cold start #3

@bnusunny

Description

Thanks for this great example!

To help with cold start, I ran some experiments with provisioned concurrency: lazy-loading the transformers module inside sentiment, and priming the sentiment function at init time when provisioned concurrency is enabled. This reduced the cold-start time to 1 ~ 2 seconds, and the first predict call completes in about 1 second.

import json
import os

# Cache the pipeline at module level so the primed model is reused
# across invocations instead of being rebuilt on every call
clf = None

def sentiment(payload):
    global clf
    if clf is None:
        # Lazy import keeps on-demand cold starts fast
        from transformers import pipeline
        clf = pipeline("sentiment-analysis", model="model/")
    prediction = clf(payload, return_all_scores=True)

    # convert list to dict
    result = {}
    for pred in prediction[0]:
        result[pred["label"]] = pred["score"]
    return result

# Prime the sentiment function for provisioned concurrency
init_type = os.environ.get("AWS_LAMBDA_INITIALIZATION_TYPE", "on-demand")
if init_type == "provisioned-concurrency":
    payload = json.dumps({"fn_index": 0, "data": [
        "Running Gradio on AWS Lambda is amazing"], "session_hash": "fpx8ngrma3d"})
    sentiment(payload)
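The priming check above can be factored into a small reusable helper. This is just a sketch of the same pattern; warm_predict is a hypothetical name, not part of the example repo, and it relies on the documented AWS_LAMBDA_INITIALIZATION_TYPE environment variable, which is set to "provisioned-concurrency" for instances started by provisioned concurrency:

```python
import os

def warm_predict(predict_fn, sample):
    """Call predict_fn once at init time, but only when this instance
    was started by provisioned concurrency; on-demand cold starts
    stay lazy and skip the warm-up call."""
    init_type = os.environ.get("AWS_LAMBDA_INITIALIZATION_TYPE", "on-demand")
    if init_type == "provisioned-concurrency":
        predict_fn(sample)  # loads the model and warms any caches
        return True
    return False
```

Because the initialization-type check lives in one place, the same helper can prime any number of model functions at import time without slowing down on-demand cold starts.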

