
Launching the GPT-2 model in SageMaker

1. Subscribe to the offering

  1. Log in to AWS with a user with administrative privileges
  2. Navigate to the GPT-2 listing on the AWS Marketplace
  3. Click Continue to Subscribe
  4. Click on Accept offer (it might take 1 or 2 minutes for AWS to accept the offer).

    Note that subscribing to this offering is free; you are charged only when you launch the model on SageMaker.

  5. Once you are subscribed, click Continue to Configuration

  6. On the Configure and launch page
    1. Select SageMaker console as the Launch Method (you can also use the CLI if you prefer)
    2. Select the version and region where you want to launch the model
    3. On Amazon SageMaker options select Create a real-time inference endpoint
  7. Click on View in Amazon SageMaker

2. Create the endpoint

On the Create endpoint page:

  1. You will be sent to the Create endpoint wizard on the Amazon SageMaker console

    1. Name the model, e.g. gpt-2
    2. Select or create a new IAM role for executing the model
    3. Under Container definition be sure Use a model package subscription from AWS Marketplace is selected
    4. Click on Next
    5. Name the endpoint, e.g. gpt-2
    6. Under Attach endpoint configuration select Create a new endpoint configuration
    7. Be sure the named model (e.g. gpt-2) is listed under Production variants

      Here you can select the instance type you want for the endpoint. The minimum required is ml.m5.4xlarge

    8. Click on Create endpoint configuration

    9. Finally click on Submit

A new endpoint will be created (this can take a couple of minutes).
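The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 SDK is installed and that you know the ARN of the subscribed model package. The role ARN, the model package ARN and the `AllTraffic` variant name are placeholders, not values from this guide. Network isolation is enabled on the assumption that the Marketplace model package requires it; verify this for your subscription.

```python
def build_endpoint_requests(name, role_arn, model_package_arn,
                            instance_type="ml.m5.4xlarge"):
    """Build the three request payloads that mirror the console wizard."""
    model = {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        # Use the Marketplace subscription instead of a custom container image.
        "PrimaryContainer": {"ModelPackageName": model_package_arn},
        # Assumption: the Marketplace package requires network isolation.
        "EnableNetworkIsolation": True,
    }
    config = {
        "EndpointConfigName": name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",  # placeholder variant name
            "ModelName": name,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,  # ml.m5.4xlarge is the stated minimum
        }],
    }
    endpoint = {"EndpointName": name, "EndpointConfigName": name}
    return model, config, endpoint


def launch_endpoint(name, role_arn, model_package_arn, region):
    """Create the model, endpoint configuration and endpoint, then wait."""
    import boto3  # third-party SDK, imported here so the builder works without it

    sm = boto3.client("sagemaker", region_name=region)
    model, config, endpoint = build_endpoint_requests(
        name, role_arn, model_package_arn)
    sm.create_model(**model)
    sm.create_endpoint_config(**config)
    sm.create_endpoint(**endpoint)
    # Block until the endpoint reports InService (this can take several minutes).
    sm.get_waiter("endpoint_in_service").wait(EndpointName=name)
```

Calling `launch_endpoint("gpt-2", role_arn, model_package_arn, "us-east-1")` with your own ARNs would correspond to the wizard run described above; the region here is only an example.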

SageMaker GPT-2 endpoint

3. Making a query

With the SageMaker endpoint ready, you will have an HTTP endpoint for making predictions, for example:

How to query the Invocations endpoint

For complete documentation on how to query this endpoint, see the InvokeEndpoint documentation in the AWS docs.

The key part is handling AWS Signature Version 4 request signing, for example using Python.
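As a minimal sketch of the Python route, the boto3 SDK signs requests with Signature Version 4 automatically, so the call reduces to `invoke_endpoint` on the `sagemaker-runtime` client. The helper names below are hypothetical, the `{"input": ...}` body follows the test query used in this guide, and the response is assumed to be a JSON array of generated strings.

```python
import json


def build_payload(text):
    """Encode the prompt as the JSON request body."""
    return json.dumps({"input": text}).encode("utf-8")


def parse_completions(raw):
    """Decode the response body (assumed to be a JSON array of strings)."""
    return json.loads(raw.decode("utf-8"))


def query_endpoint(endpoint_name, text, region):
    """Send one prompt to the endpoint and return the generated texts."""
    import boto3  # third-party SDK; it applies SigV4 signing for us

    runtime = boto3.client("sagemaker-runtime", region_name=region)
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(text),
    )
    # The response body is a stream, so read it fully before decoding.
    return parse_completions(response["Body"].read())
```

For example, `query_endpoint("gpt-2", "This is an input text", "us-east-1")` would query an endpoint named `gpt-2`, using credentials from the standard AWS configuration; the region is only an example.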


To test the endpoint, you can use the Insomnia HTTP client, which supports AWS authentication.

Create a new POST request and select the Auth method AWS IAM v4, then fill in the credentials and region and use sagemaker as the service.

Insomnia Auth

Select JSON as the body type and use the following test query:

    {
      "input": "This is an input text"
    }

With a response like this:

[
  "This is an input text box that will be used to input the password. The user may select a user name and password to save for later.\n\nPassword fields allow users to save themselves as a user on our server, which may be useful for",
  "This is an input text for the next button.\n\nYou can also press the backspace key twice to erase the text to the right.\n\nPressing the backspace key again to clear the previously typed text will delete the previous line.",
  "This is an input text for the widget, and it must be in the correct format. The format must be one of the following:\n\n\nText to enter on the form\n\nExample: What is the total distance in miles to your next destination"
]
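To make the Insomnia settings above concrete, here is a hedged sketch of the same Signature Version 4 signing done by hand with the Python standard library. It assumes long-term credentials (no session token) and an endpoint name made only of letters, digits and hyphens, so no URI encoding is needed. The function name is hypothetical, and in practice a maintained SDK is the safer choice.

```python
import datetime
import hashlib
import hmac


def _hmac(key, msg):
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sign_invocation(access_key, secret_key, region, endpoint_name, body, now=None):
    """Return (url, headers) for a SigV4-signed POST to the Invocations endpoint."""
    service = "sagemaker"  # same service name entered in Insomnia
    host = f"runtime.sagemaker.{region}.amazonaws.com"
    path = f"/endpoints/{endpoint_name}/invocations"
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")

    # 1. Canonical request: method, path, empty query string, headers, payload hash.
    signed_headers = "content-type;host;x-amz-date"
    canonical_headers = (
        f"content-type:application/json\nhost:{host}\nx-amz-date:{amz_date}\n"
    )
    canonical_request = "\n".join([
        "POST", path, "", canonical_headers, signed_headers,
        hashlib.sha256(body).hexdigest(),
    ])

    # 2. String to sign, scoped to date, region and service.
    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # 3. Derive the signing key and compute the signature.
    key = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    for part in (region, service, "aws4_request"):
        key = _hmac(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()

    headers = {
        "Content-Type": "application/json",
        "X-Amz-Date": amz_date,
        "Authorization": (
            f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
            f"SignedHeaders={signed_headers}, Signature={signature}"
        ),
    }
    # The Host header is added by the HTTP client from the URL.
    return f"https://{host}{path}", headers
```

The returned URL and headers can be passed to any HTTP client, for instance `urllib.request.Request(url, data=body, headers=headers, method="POST")`, where `body` is the same bytes object that was signed.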

Full API docs

For the complete documentation of the API, including the different inputs and responses and more ways to query the Invocations endpoint, see the API page.