Pipelines API

Cortex’s Pipelines API provides programmatic access to any Machine Learning Pipeline built in your Cortex account. The API allows you to build automated workflows for managing, updating, and deploying your ML Pipelines. This means smoother integrations between Cortex and your business.

The following sections describe the various requests that may be made to the Pipelines API. Each description includes the optional and required parameters that should be submitted with the request.

All API requests should be made to api-us.vidora.com, using the methods listed below. Note that authentication is required for each of these methods.

Manage your ML Pipelines

The methods below allow you to list, filter, and fetch details for the ML pipelines that have been built in your Cortex account. This makes it practical to manage hundreds of unique pipelines.

List ML Pipelines

Returns an array of pipelines built in your Cortex account, ordered by how recently each pipeline was run. Optionally, the response can be filtered by the parameters listed below.

Request

Method	URL
GET	/v1/api/pipelines

Parameters

Param	Required?	Type	Description
recurring	No	Boolean	Whether or not the pipeline is set to run on a recurring schedule.
active	No	Boolean	Whether or not the pipeline is currently active.

Example Request

http://api-us.vidora.com/v1/api/pipelines?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Example Response

[
  { 
    id: "f90819314266a344",                      # Unique ID for the pipeline
    name: "Sample Classification Pipeline",      # Name of the pipeline
    type: "Classification",                      # Pipeline type
    active: true,                                # Whether the pipeline is active
    recurring: true,                             # Whether it runs on recurring schedule
    status: "complete",                          # Status ("running|complete|failed")
    created_at: "2020-04-05T00:00:00",           # When the pipeline was created
    last_trained_at: "2020-04-05T05:00:00",      # When the pipeline last trained
    last_prediction_at: "2020-04-05T05:00:00"    # When the pipeline last made predictions
  },
  { 
    id: "f8d544d30e43a550", 
    name: "Sample Look Alike Pipeline",
    type: "Look Alike",
    active: true,
    recurring: true,
    status: "running", 
    created_at: "2020-04-05T00:00:00",
    last_trained_at: "2020-04-05T05:00:00",
    last_prediction_at: "2020-04-05T05:00:00"
  }
]

The above example shows a signed GET request to return all pipelines. The response indicates that there are two pipelines in the account, and includes details for each pipeline such as id, name, type, and more.
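As a rough illustration, the signed URL for this request could be assembled as sketched below. The helper names sign_request and list_pipelines_url are hypothetical. The sketch assumes GET requests are signed with the same scheme as the Ruby label set example later in this document, with an empty body, and that the https host from that example applies here as well.

```ruby
require "digest/sha2"
require "uri"

# Hypothetical helper: joins secret, method, path, sorted params, and body
# with newlines, hashes with SHA-256, and keeps 43 Base64 characters.
# Assumption: this mirrors the signing scheme of the label set example.
def sign_request(secret, http_method, path, params, body = nil)
  query = params.sort_by { |key, _| key.to_s }.map { |k, v| "#{k}=#{v}" }.join("&")
  string_to_sign = [secret, http_method, path, query, body].join("\n")
  # pack("m0") produces strict Base64 with no line breaks.
  [Digest::SHA256.digest(string_to_sign)].pack("m0")[0, 43]
end

# Hypothetical helper: builds the signed URL for GET /v1/api/pipelines,
# with optional recurring/active filters.
def list_pipelines_url(api_key:, secret:, expires:, filters: {})
  path = "/v1/api/pipelines"
  params = { "api_key" => api_key, "expires" => expires }
  filters.each { |key, value| params[key.to_s] = value.to_s }
  params["signature"] = sign_request(secret, "GET", path, params)
  "https://api-us.vidora.com#{path}?#{URI.encode_www_form(params)}"
end

puts list_pipelines_url(
  api_key: "<YOUR_KEY>",
  secret:  "<YOUR_SECRET>",
  expires: "2020-06-01T00:00",
  filters: { recurring: true, active: true }
)
```

The filters are added before signing so that they are covered by the signature, and URI.encode_www_form takes care of escaping characters such as the colon in expires.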

Get ML Pipeline

Returns details for a given pipeline.

Request

Method	URL
GET	/v1/api/pipelines/<PIPELINE_ID>

Parameters

Param	Required?	Type	Description
pipeline_id	Yes	string	Unique ID for the pipeline.

Example Request

http://api-us.vidora.com/v1/api/pipelines/<PIPELINE_ID>?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Example Response

{
  id: "f90819314266a344",                       # Unique ID for the pipeline    
  name: "Sample Classification Pipeline",       # Name of the pipeline
  type: "Classification",                       # Pipeline type
  active: true,                                 # Whether the pipeline is active
  recurring: true,                              # Whether it runs on recurring schedule
  status: "complete",                           # Status ("running|complete|failed")
  created_at: "2020-04-05T00:00:00",            # When the pipeline was created
  last_trained_at: "2020-04-05T05:00:00",       # When the pipeline last trained
  last_prediction_at: "2020-04-05T05:00:00",    # When the pipeline last made predictions
  active_label_set_id: "j8daw02sz57hjwze"       # Unique ID for the active label set          
}

The above example shows a signed GET request to fetch details for pipeline <PIPELINE_ID>. If the pipeline requires uploaded labels (Classification, Look Alike, or Regression), these details will include a unique ID for the most recently uploaded label set assigned to the pipeline.

Update your ML Pipelines

The below methods allow you to update details of an existing pipeline. Most notably, you can add a new set of labels to trigger an automatic retraining of any Classification, Look Alike, or Regression pipeline. This process involves a POST request to create a new label set (Create Label Set), and a PUT request to assign that label set to a specific pipeline (Update Pipeline).

Create Label Set

Creates a new label set. Once a label set has been successfully validated, you may assign it to a pipeline (via the Update Pipeline method) in order to automatically begin retraining that pipeline with the new labels.

Request

Method	URL
POST	/v1/api/label_sets

Parameters

Param	Required?	Type	Description
label_type	Yes	string	Pipeline type that the labels are intended for. Valid values are “classification”, “look alike”, or “regression”.
name	Yes	string	A descriptive name for the label set.
labels	Yes	file	A file containing the training labels. Uploaded files must be in CSV format (optionally gzipped), with an id column (string) and a label column (boolean if label_type is “classification” or “look alike”, float if label_type is “regression”). Sample file formats are shown below.
start_date	No	date	Earliest date for which the uploaded labels are valid (e.g. “2020-02-01”). Default value is 90 days before today. Along with end_date, this value will define the training window for your pipeline.
end_date	No	date	Latest date for which the uploaded labels are valid (e.g. “2020-04-01”). Default value is today. Along with start_date, this value will define the training window for your pipeline.
ancestor_set_id	No	string	The ID of another label set to compare labels with. If your new label set is meant to refresh a previously-defined label set, it is useful to pass in an ancestor_set_id to ensure at least 80% of your labels overlap.

Sample File Formats

Classification	Look Alike	Regression
"id","label"	"id","label"	"id","label"
"A",1	"A",1	"A",15.3
"B",0	"B",1	"B",91.7
"C",1	"C",1	"C",54.8
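To make the file format concrete, the following sketch writes a small classification labels file and gzips it in memory. The helper name labels_csv and the sample rows are illustrative only; the layout (a quoted header, quoted ids, and 1/0 labels) is taken from the sample file formats above.

```ruby
require "zlib"

# Hypothetical helper: renders [id, label] pairs as CSV text with a quoted
# header and quoted ids, following the sample file formats.
def labels_csv(rows)
  lines = ['"id","label"']
  rows.each do |id, label|
    escaped_id = id.to_s.gsub('"', '""') # CSV escapes quotes by doubling them
    lines << "\"#{escaped_id}\",#{label}"
  end
  lines.join("\n") + "\n"
end

csv = labels_csv([["A", 1], ["B", 0], ["C", 1]])

# Gzip the CSV in memory; use File.binwrite to save the bytes if needed.
gzipped = Zlib.gzip(csv)
puts "CSV: #{csv.bytesize} bytes, gzipped: #{gzipped.bytesize} bytes"
```

For a regression label set, the same helper can be called with float labels, for example labels_csv([["A", 15.3]]).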

Example Request

Creating a new label set requires uploading a file, and, as with any other Cortex API request, the POST body must be included in the signature. Because you’re uploading a file, you cannot send JSON as the payload. Instead, the file is sent as form data, and that form-data body is what gets signed. The example below shows how to build and sign the request.

Payload of Form-Data

Just like signatures for any other Cortex API request, your POST body needs to be included in the signature generation process. The POST body for this request is sent as form-data, which requires a boundary to be defined between each parameter. In this example, we’ll use BOUNDARY as the boundary, so each part begins with a --BOUNDARY line. If we were to send a common CSV as the labels file, the payload would look like the following:

body =  "--BOUNDARY\r\n"                                                                  \
        "Content-Disposition: form-data; name=\"labels\"; filename=\"my_file.csv\"\r\n"   \
        "Content-Type: text/csv\r\n"                                                      \
        "\r\n"                                                                            \
        "#{file.read}\r\n"                                                                \
        "--BOUNDARY--\r\n"

A few things to note about the above:

To end the form-data, append “--” to the final boundary line. In this example, the end of the form-data is marked by --BOUNDARY--.
The Content-Disposition is form-data.
The file being uploaded is a common CSV, so the Content-Type is text/csv.
The file contents are read directly into the payload so it can be signed. In this example, file contents are read using Ruby code.
IMPORTANT: every line break in the form data must be a CRLF pair, i.e. \r followed by \n.

Sending a gzipped file

You will likely want to gzip your labels file to speed up the upload. When sending a gzipped file, set the content type to application/gzip or application/x-gzip, as shown below:

body =  "--BOUNDARY\r\n"                                                                    \
        "Content-Disposition: form-data; name=\"labels\"; filename=\"my_file.csv.gz\"\r\n"  \
        "Content-Type: application/gzip\r\n"                                                \
        "\r\n"                                                                              \
        "#{file.read}\r\n"                                                                  \
        "--BOUNDARY--\r\n"

Adding more params to the payload

When adding other parameters to the request, you can either put them in the form data or append them in the URL. Below is an example of how you would add more parameters to the form data. Note that each parameter is preceded by the same --BOUNDARY line used above.

body =  "--BOUNDARY\r\n"                                                                                  \
        "Content-Disposition: form-data; name=\"labels\"; filename=\"my_file.csv\"\r\n"                   \
        "Content-Type: text/csv\r\n"                                                                      \
        "\r\n"                                                                                            \
        "#{file.read}\r\n"                                                                                \
        "--BOUNDARY\r\n"                                                                                  \
        "Content-Disposition: form-data; name=\"label_type\"\r\n"                                         \
        "\r\n"                                                                                            \
        "classification\r\n"                                                                              \
        "--BOUNDARY\r\n"                                                                                  \
        "Content-Disposition: form-data; name=\"name\"\r\n"                                               \
        "\r\n"                                                                                            \
        "My Classification Label Set\r\n"                                                                 \
        "--BOUNDARY--\r\n"

If you were to append the additional parameters to the URL, they would look like the below. Keep in mind that the labels file would still be sent in the payload as form-data.

POST 

https://api-us.vidora.com/v1/api/label_sets?label_type=classification&name=My%20Classification%20Label%20Set
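The payload layouts shown above can also be generated by a small helper rather than written by hand. The sketch below is one possible way to do that; build_label_set_body is a hypothetical name, and the part order (file first, then extra fields, then the closing boundary) simply follows the examples in this section.

```ruby
BOUNDARY = "BOUNDARY"

# Hypothetical helper: assembles a multipart/form-data body from the labels
# file and optional extra fields, using CRLF line breaks throughout.
def build_label_set_body(file_bytes, filename, content_type: "text/csv", fields: {})
  parts = [
    "--#{BOUNDARY}\r\n",
    "Content-Disposition: form-data; name=\"labels\"; filename=\"#{filename}\"\r\n",
    "Content-Type: #{content_type}\r\n",
    "\r\n",
    file_bytes,
    "\r\n",
  ]
  fields.each do |name, value|
    parts << "--#{BOUNDARY}\r\n"
    parts << "Content-Disposition: form-data; name=\"#{name}\"\r\n"
    parts << "\r\n"
    parts << "#{value}\r\n"
  end
  parts << "--#{BOUNDARY}--\r\n" # closing boundary ends the form-data
  # Treat every part as raw bytes so gzipped content is left untouched.
  parts.map(&:b).join
end

body = build_label_set_body(
  "\"id\",\"label\"\n\"A\",1\n\"B\",0\n",
  "my_file.csv",
  fields: { label_type: "classification", name: "My Classification Label Set" }
)
puts body.bytesize
```

For a gzipped upload, pass the compressed bytes and content_type: "application/gzip". The returned string is the exact body that would need to be signed and sent.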

Generating the signature

Please see the example code in your Cortex account for how to generate a signature for your upload. Like any signed request, you must join the following to create the signature:

Secret Key
Method
Path
URL Params
Body

Sending the request

Since label uploads are sent as form-data, you cannot send a request header of Content-Type: application/json. Instead, it must be sent as form-data that defines the boundary. An example request header for the examples above would specify the following:

"Content-Type": "multipart/form-data; boundary=BOUNDARY"

Ruby gzip example

The below example shows how to create a label set with a gzipped file using Ruby. Ask your account manager for examples using other languages (e.g. Bash).

require "digest/sha2"
require "base64"
require "active_support/time"                      # provides 2.days
require "active_support/core_ext/object/to_query"  # provides Hash#to_query
require "rest-client"

# Function for generating signatures
def generate_signature(secret, http_method, request_path, params = {}, body = nil)
  string_to_sign = [
    secret,
    http_method,
    request_path,
    params.sort { |pair1, pair2| pair1[0] <=> pair2[0] }.map { |k, v| "#{k}=#{v}" }.join("&"),
    body,
  ].join("\n")

  Base64.strict_encode64(Digest::SHA256.digest(string_to_sign))[0, 43].chomp("=")
end

# Read the gzipped labels file from disk as raw bytes and create the body
labels_data = File.binread("<GZIPPED_FILE>")
body = "--BOUNDARY\r\nContent-Disposition: form-data; name=\"labels\"; filename=\"<GZIPPED_FILE>\"\r\n" \
       "Content-Type: application/gzip\r\n\r\n#{labels_data}\r\n--BOUNDARY--\r\n"

params = {
  api_key:    "<API_KEY>",
  expires:    (Time.now.utc + 2.days.to_i).strftime("%Y-%m-%dT%H:%M"),
  name:       "<LABEL_SET_NAME>",
  label_type: "classification",
}

secret = "<API_SECRET>"
params["signature"] = generate_signature(secret, "POST", "/v1/api/label_sets", params, body)

url = "https://api-us.vidora.com/v1/api/label_sets?#{params.to_query}"
request_headers = { "Content-Type": "multipart/form-data; boundary=BOUNDARY" }
response = RestClient.post(url, body, request_headers)
puts response

Example Response

{ 
  id: "0tbeb8mg9ljyod4h",                         # Unique ID for the label set
  name: "My New Label Set",                       # Name of the label set
  status: "validating",                           # Status ("validating|validated|failed")
  label_type: "classification",                   # Type of pipeline the labels will be applied to
  filename: "subscribers-data-2019.csv",          # Name of the uploaded file
  updated_at: "2020-04-05T00:00:00",              # When the label set was last updated
  start_date: "2020-02-01",                       # Earliest date for which the labels are valid
  end_date: "2020-04-01"                          # Last date for which the labels are valid
}

The response includes details about the new label set, including a status field which indicates that the set is currently being validated. When the status changes to “validated”, you may assign the label set to a pipeline via the Update Pipeline method. If the status shows “failed”, an error field in the response will show more information about the validation error.

Get Label Set

Returns details for a given label set. This is useful for checking the status of a label set after you’ve created it.

Request

Method	URL
GET	/v1/api/label_sets/<LABEL_SET_ID>

Parameters

Param	Required?	Type	Description
label_set_id	Yes	string	Unique ID for the label set.

Example Request

http://api-us.vidora.com/v1/api/label_sets/<LABEL_SET_ID>?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Example Response

{ 
  id: "0tbeb8mg9ljyod4h",                       # Unique ID for the label set
  name: "My New Label Set",                     # Name of the label set
  status: "validated",                          # Status ("validating|validated|failed")
  label_type: "classification",                 # Type of pipeline the labels will be applied to
  filename: "subscribers-data-2019.csv",        # Name of the uploaded file 
  updated_at: "2020-04-05T00:00:00",            # When the label set was last updated
  start_date: "2020-02-01",                     # Earliest date for which the labels are valid
  end_date: "2020-04-01",                       # Last date for which the labels are valid
  download_url: "http://s3...",                 # URL to download the file
  errors: [],                                   # Array of validation error messages, if any
  warnings: []                                  # Array of validation warning messages, if any
}

The above example shows a signed GET request to fetch details for label set <LABEL_SET_ID>. The response includes details such as the set’s name, label_type, and more.
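Because validation is asynchronous, a client would typically poll this method until the status leaves “validating”. The sketch below shows one way to structure that loop. The helper wait_for_validation is hypothetical, the polling interval and attempt limit are arbitrary, and the HTTP call is passed in as a callable that is assumed to return the parsed response as a Hash with string keys.

```ruby
# Hypothetical helper: polls a label set until validation finishes.
# `fetch` is any callable returning the parsed Get Label Set response.
# `sleeper` is injectable so the wait can be skipped in tests.
def wait_for_validation(fetch, interval: 30, max_attempts: 20,
                        sleeper: ->(seconds) { sleep(seconds) })
  max_attempts.times do
    label_set = fetch.call
    case label_set["status"]
    when "validated"
      return label_set
    when "failed"
      messages = Array(label_set["errors"]).join("; ")
      raise "Label set failed validation: #{messages}"
    end
    sleeper.call(interval) # still "validating", so wait and try again
  end
  raise "Label set was still validating after #{max_attempts} attempts"
end

# Usage with canned responses standing in for real API calls.
responses = [
  { "status" => "validating" },
  { "status" => "validated", "id" => "0tbeb8mg9ljyod4h" },
]
result = wait_for_validation(-> { responses.shift }, interval: 0)
puts result["id"]
```

Once the helper returns, the label set ID can be passed to the Update Pipeline method.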

Update Label Set

Updates details for a given label set. A label set may only be updated if it has not yet been assigned to a pipeline. If the label set has already been assigned to a pipeline, you should create a new label set with a POST request instead.

Request

Method	URL
PUT	/v1/api/label_sets/<LABEL_SET_ID>

Parameters

Param	Required?	Type	Description
label_set_id	Yes	string	Unique ID for the label set that you would like to update.
label_type	Yes	string	Pipeline type that the labels are intended for. Valid values are “classification”, “look alike”, or “regression”.
name	Yes	string	A descriptive name for the label set.
labels	Yes	file	A file containing the training labels, in the same CSV format as for Create Label Set. Sample file formats are shown below.
start_date	No	date	Earliest date for which the uploaded labels are valid (e.g. “2020-02-01”). Default value is 90 days before today. Along with end_date, this value will define the training window for your pipeline.
end_date	No	date	Latest date for which the uploaded labels are valid (e.g. “2020-04-01”). Default value is today. Along with start_date, this value will define the training window for your pipeline.
ancestor_set_id	No	string	The ID of another label set to compare labels with.

Sample File Formats

Classification	Look Alike	Regression
"id","label"	"id","label"	"id","label"
"A",1	"A",1	"A",15.3
"B",0	"B",1	"B",91.7
"C",1	"C",1	"C",54.8

Example Request

PUT http://api-us.vidora.com/v1/api/label_sets/<LABEL_SET_ID>?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Content-Type: application/json
{ "start_date": "2020-03-01" }

Example Response

{ 
  id: "0tbeb8mg9ljyod4h",                       # Unique ID for the label set
  name: "My New Label Set",                     # Name of the label set
  status: "validating",                         # Status ("validating|validated|failed")
  label_type: "classification",                 # Type of pipeline the labels will be applied to
  filename: "subscribers-data-2019.csv",        # Name of the uploaded file
  updated_at: "2020-04-05T00:00:00",            # When the label set was last updated
  start_date: "2020-03-01",                     # Earliest date for which the labels are valid
  end_date: "2020-04-01"                        # Last date for which the labels are valid
}

The above example shows a signed PUT request to update the start date for label set <LABEL_SET_ID>. The response includes details for the label set such as its name, status, and more.

List Label Sets

Returns an array of label sets. Optionally, the response can be filtered by the parameters listed below.

Request

Method	URL
GET	/v1/api/label_sets

Parameters

Param	Required?	Type	Description
status	No	string	Status of the label set (“validating”, “validated”, or “failed”).
label_type	No	string	Pipeline type that the labels are intended for (“classification”, “look alike”, or “regression”).

Example Request

http://api-us.vidora.com/v1/api/label_sets?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Example Response

[
  { 
    id: "0tbeb8mg9ljyod4h",                     # Unique ID for the label set
    name: "My New Label Set",                   # Name of the label set
    status: "validating",                       # Status ("validating|validated|failed")
    label_type: "classification",               # Type of pipeline the labels will be applied to
    filename: "subscribers-data-2019.csv",      # Name of the uploaded file
    updated_at: "2020-04-05T00:00:00"           # When the label set was last updated
  },
  { 
    id: "p54avfxnd8soj96g", 
    name: "Look Alike Label Set",
    status: "validated",
    label_type: "look alike",
    filename: "survey-data-2020.csv",
    updated_at: "2020-04-07T00:00:00"
  }
]

The above example shows a signed GET request to return all label sets in the account. The response indicates that there are two label sets, and includes details such as each set’s id, name, label_type, and more.

Update Pipeline

Updates details for a given pipeline. Editable information includes the pipeline’s name, and its active label set. If the active label set is updated, the pipeline will automatically begin retraining using the new label set.

Request

Method	URL
PUT	/v1/api/pipelines/<PIPELINE_ID>

Parameters

Param	Required?	Type	Description
pipeline_id	Yes	string	Unique ID for the pipeline.
name	No	string	Edits the name of the pipeline.
active_label_set_id	No	string	Sets the active label set for the pipeline and triggers an automatic retraining. This option applies only to Classification, Look Alike, and Regression pipelines.

Example Request

PUT 
http://api-us.vidora.com/v1/api/pipelines/<PIPELINE_ID>?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Content-Type: application/json
{ "active_label_set_id": "0tbeb8mg9ljyod4h" }

Example Response

{
  id: "f90819314266a344",                       # Unique ID for the pipeline
  name: "Sample Classification Pipeline",       # Name of the pipeline
  type: "Classification",                       # Pipeline type
  active: true,                                 # Whether the pipeline is active
  recurring: true,                              # Whether it runs on recurring schedule
  status: "running",                            # Status ("running|complete|failed")
  created_at: "2020-04-05T00:00:00",            # When the pipeline was created
  last_trained_at: "2020-04-05T05:00:00",       # When the pipeline last trained
  last_prediction_at: "2020-04-05T05:00:00",    # When the pipeline last made predictions
  active_label_set_id: "0tbeb8mg9ljyod4h"      # Unique ID for the active label set
}

The above example shows a signed PUT request to update the active label set for pipeline <PIPELINE_ID>. The response includes details for the pipeline such as its name, type, status, and more.
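The pieces of such a PUT request could be prepared as in the sketch below. The helper update_pipeline_request is hypothetical. It assumes that the JSON body is signed exactly as it is sent, using the same signing scheme as the Ruby label set example, and that the https host from that example applies.

```ruby
require "digest/sha2"
require "json"
require "uri"

# Hypothetical helper: returns the URL, headers, and JSON body for
# PUT /v1/api/pipelines/<PIPELINE_ID> that assigns a new active label set.
def update_pipeline_request(pipeline_id:, label_set_id:, api_key:, secret:, expires:)
  path = "/v1/api/pipelines/#{pipeline_id}"
  body = JSON.generate({ "active_label_set_id" => label_set_id })
  params = { "api_key" => api_key, "expires" => expires }

  # Assumption: secret, method, path, sorted params, and body are joined
  # with newlines, hashed with SHA-256, and truncated to 43 Base64 chars.
  query = params.sort.map { |k, v| "#{k}=#{v}" }.join("&")
  digest = Digest::SHA256.digest([secret, "PUT", path, query, body].join("\n"))
  params["signature"] = [digest].pack("m0")[0, 43]

  {
    url: "https://api-us.vidora.com#{path}?#{URI.encode_www_form(params)}",
    headers: { "Content-Type" => "application/json" },
    body: body,
  }
end

request = update_pipeline_request(
  pipeline_id:  "<PIPELINE_ID>",
  label_set_id: "0tbeb8mg9ljyod4h",
  api_key:      "<YOUR_KEY>",
  secret:       "<YOUR_SECRET>",
  expires:      "2020-06-01T00:00"
)
puts request[:body]
```

The returned hash can be handed to any HTTP client. Keeping the body as a single string avoids signing one serialization and sending another.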

Deploying Pipeline Predictions

The below methods allow you to access prediction files that you have exported from any existing pipeline. Most notably, you can fetch a download link for a given set of exported predictions, allowing you to deploy any pipeline in an automated workflow.

List Prediction Exports for a Pipeline

Returns an array of prediction exports for a given pipeline.

The ID for a pipeline can be fetched by making a request to the List ML Pipelines method.

Request

Method	URL
GET	/v1/api/pipelines/<PIPELINE_ID>/prediction_exports

Parameters

Param	Required?	Type	Description
pipeline_id	Yes	string	Unique ID for the pipeline.
recurring	No	boolean	Whether the prediction export is set to run on a recurring schedule.
exported_since	No	integer	A Unix timestamp which limits the response to Prediction Exports that have been exported at or after this point in time.

Example Request

http://api-us.vidora.com/v1/api/pipelines/<PIPELINE_ID>/prediction_exports?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Example Response

[
  {
    id: "vrhmjwf10b66zl9k",                     # Unique ID for the prediction export
    name: "Sample Recurring Export",            # Name of the prediction export
    status: "complete"                          # Status ("exporting|complete|failed")
  },
  {
    id: "q86yqblw2ejp39bd", 
    name: "Sample One-Time Export",
    status: "exporting"
  }
]

The above example shows a signed GET request to return all prediction exports for pipeline <PIPELINE_ID>. The response indicates that there are two such exports, and includes details for each one such as its name and status.
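In an automated workflow, a client would usually narrow this list to exports that have finished. The sketch below shows one way to do that and to compute a value for exported_since. Both helper names are hypothetical, and the sketch assumes the real response body is standard JSON with string keys (the annotated example above is for illustration).

```ruby
require "json"

# Hypothetical helper: converts a Time into the Unix timestamp (seconds)
# expected by the exported_since parameter.
def exported_since_param(time)
  time.to_i
end

# Hypothetical helper: parses a List Prediction Exports response body and
# keeps only the exports whose status is "complete".
def completed_exports(response_body)
  JSON.parse(response_body).select { |export| export["status"] == "complete" }
end

sample_body = <<~JSON
  [
    { "id": "vrhmjwf10b66zl9k", "name": "Sample Recurring Export", "status": "complete" },
    { "id": "q86yqblw2ejp39bd", "name": "Sample One-Time Export", "status": "exporting" }
  ]
JSON

puts exported_since_param(Time.now - 24 * 60 * 60) # exports from the last day
completed_exports(sample_body).each { |export| puts export["id"] }
```

Each remaining ID can then be passed to the Get Prediction Export method to obtain a download link.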

Get Prediction Export

Returns details for a given prediction export from a given pipeline.

The ID for a pipeline can be fetched by making a request to the List ML Pipelines method. The ID for a prediction export can be fetched by making a request to the List Prediction Exports for a Pipeline method.

Request

Method	URL
GET	/v1/api/pipelines/<PIPELINE_ID>/prediction_exports/<EXPORT_ID>

Parameters

Param	Required?	Type	Description
pipeline_id	Yes	string	Unique ID for the pipeline.
export_id	Yes	string	Unique ID for the prediction export.
recurring	No	boolean	Whether the prediction export is set to run on a recurring schedule.

Example Request

http://api-us.vidora.com/v1/api/pipelines/<PIPELINE_ID>/prediction_exports/<EXPORT_ID>?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Example Response

{
  id: "vrhmjwf10b66zl9k",                          # Unique ID for the prediction export
  name: "Sample Recurring Export",                 # Name of the prediction export
  recurring: true,                                 # Whether the export runs on a recurring basis
  status: "complete",                              # Status ("exporting|complete|failed")
  last_exported: "2020-04-05T00:00:00",            # When the export last ran
  columns: ["User ID", "Conversion Probability"],  # Columns included in the exported file
  total: 2581,                                     # Number of predictions exported
  download_url: "http://s3..."                     # Download URL for accessing the exported file
}

The above example shows a signed GET request to fetch details for prediction export <EXPORT_ID> from pipeline <PIPELINE_ID>. The response includes details for the export such as its name, number of predictions, and a link to download the exported file.

Response Codes

Response	Description
200 OK	The request was successful.
400 Bad Request	The request was invalid, possibly due to malformed parameters.
401 Unauthorized	The api_key and/or signature was invalid.
404 Not Found	The id used in the request was not found or the request URI does not exist.
500 Internal Error	There was a server side error, and we cannot serve the request at the current time.
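
A client might translate these codes into follow-up actions along the lines of the sketch below. The helper next_action and its symbols are hypothetical, and treating a 500 as retryable is a design choice of this sketch, not documented behavior of the API.

```ruby
# Hypothetical helper: maps a response code from the table above to a
# suggested follow-up action for an automated workflow.
def next_action(status_code)
  case status_code
  when 200 then :proceed            # request succeeded
  when 400 then :fix_parameters     # malformed or invalid parameters
  when 401 then :check_credentials  # api_key and/or signature invalid
  when 404 then :check_id_or_path   # unknown id or request URI
  when 500 then :retry_later        # server-side error
  else :unexpected
  end
end

[200, 401, 500].each { |code| puts "#{code}: #{next_action(code)}" }
```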
