
Quick Start

This quick start uses the MovieLens dataset as an example. Using curl on the command line, we train a model, deploy it, and call the recommendation service.

Besides the command line, you can also use the Postman Collection and the Postman Environment.

In what follows, $API_KEY stands for the developer's API key and $TOKEN stands for the authentication token.
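The Python examples in this guide call a create_headers helper that is not defined in this section. The following is only a hypothetical stand-in, assuming the token has already been obtained and that the request headers are the same three headers the curl examples send. How the scopes argument is turned into a token is not described here, so this sketch accepts it and ignores it; the environment variable names TOKEN and API_KEY are also assumptions.

```python
import os


def create_headers(scopes=None, token=None, api_key=None):
    """Hypothetical stand-in for the guide's create_headers helper.

    `scopes` is accepted for call compatibility but ignored, because this
    section does not describe how scopes are encoded into the token.
    """
    # Fall back to environment variables (assumed names) when not passed in.
    token = token or os.environ["TOKEN"]
    api_key = api_key or os.environ["API_KEY"]
    return {
        "Authorization": f"Bearer {token}",
        "APIKey": api_key,
        "Content-Type": "application/json",
    }
```

With this in place, the Python snippets below can be run as written once TOKEN and API_KEY are set in the environment.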

Create and Populate Datasets

Create a User Dataset (/create-user-dataset)

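Leave attribute_schema unspecified so that the system infers it, and set num_shards to 10. As the total size of the dataset grows, increase num_shards by one for every 100,000 records.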

curl --request POST 'https://api.raas.kklab.com/create-user-dataset' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
    "name": "movielens-users",
    "num_shards": 10,
    "attribute_schema": null
}'
import json
import requests
from pprint import pprint

headers = create_headers(scopes=["/create-user-dataset"])

url = "https://api.raas.kklab.com/create-user-dataset"

data={
    "name": "movielens-users",
    "num_shards": 10,
    "attribute_schema": None  # Python None; json.dumps serializes it as JSON null
}

response = requests.request("POST", url, headers=headers, data=json.dumps(data))

pprint(response.json())

In what follows, $USER_DATASET_ID stands for the ID of the user dataset just created. The response looks like this:

{
    "user_dataset_id": "$USER_DATASET_ID",
    "name": "movielens-users",
    "attribute_schema": null,
    "num_shards": 10
}

Import Users into the User Dataset (/upsert-dataset-users)

Import two users, with user_id 1 and 2, neither of which has any attributes.

curl --location --request POST 'https://api.raas.kklab.com/upsert-dataset-users' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
    "user_dataset_id": "$USER_DATASET_ID",
    "users": [
        {
            "user_id": "1",
            "attributes": {}
        },
        {
            "user_id": "2",
            "attributes": {}
        }
    ]
}'
import json
import requests
from pprint import pprint

headers = create_headers(scopes=["/upsert-dataset-users"])

url = "https://api.raas.kklab.com/upsert-dataset-users"

data={
    "user_dataset_id": "$USER_DATASET_ID",
    "users": [
        {
            "user_id": "1",
            "attributes": {}
        },
        {
            "user_id": "2",
            "attributes": {}
        }
    ]
}

response = requests.request("POST", url, headers=headers, data=json.dumps(data))

pprint(response.json())

The response looks like this:

{
    "user_dataset_id": "$USER_DATASET_ID",
    "succeed_ids": [
        "1",
        "2"
    ],
    "failed_ids": []
}

Both users were imported successfully.
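When importing more than a handful of records, it can help to check the response programmatically rather than by eye. The sketch below assumes only the succeed_ids and failed_ids fields shown in the response above; the function name is hypothetical.

```python
def require_all_upserted(response_body, expected_ids):
    """Raise if any expected ID failed or is missing from succeed_ids."""
    succeeded = set(response_body.get("succeed_ids", []))
    failed = list(response_body.get("failed_ids", []))
    # An ID that appears in neither list is treated as a problem too.
    missing = [i for i in expected_ids if i not in succeeded]
    if failed or missing:
        raise RuntimeError(f"upsert incomplete: failed={failed}, missing={missing}")
    return True
```

The same check applies to the item and event imports later in this guide, since their responses use the same two fields.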

Create an Item Dataset (/create-item-dataset)

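Leave attribute_schema unspecified so that the system infers it, and set num_shards to 10. As the total size of the dataset grows, increase num_shards by one for every 100,000 records.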

curl --request POST 'https://api.raas.kklab.com/create-item-dataset' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
    "name": "movielens-items",
    "num_shards": 10,
    "attribute_schema": null
}'
import json
import requests
from pprint import pprint

headers = create_headers(scopes=["/create-item-dataset"])

url = "https://api.raas.kklab.com/create-item-dataset"

data={
    "name": "movielens-items",
    "num_shards": 10,
    "attribute_schema": None  # Python None; json.dumps serializes it as JSON null
}

response = requests.request("POST", url, headers=headers, data=json.dumps(data))

pprint(response.json())

In what follows, $ITEM_DATASET_ID stands for the ID of the item dataset just created. The response looks like this:

{
    "item_dataset_id": "$ITEM_DATASET_ID",
    "name": "movielens-items",
    "attribute_schema": null,
    "num_shards": 10
}

Import Items into the Item Dataset (/upsert-dataset-items)

Import two movies and set two attributes on each: title and genres.

curl --location --request POST 'https://api.raas.kklab.com/upsert-dataset-items' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
    "item_dataset_id": "$ITEM_DATASET_ID",
    "items": [
        {
            "item_id": "1",
            "attributes": {
                "title": "Toy Story (1995)",
                "genres": [
                    "Adventure",
                    "Animation",
                    "Children",
                    "Comedy",
                    "Fantasy"
                ]
            }
        },
        {
            "item_id": "2",
            "attributes": {
                "title": "Jumanji (1995)",
                "genres": [
                    "Adventure",
                    "Children",
                    "Fantasy"
                ]
            }
        }
    ]
}'
import json
import requests
from pprint import pprint

headers = create_headers(scopes=["/upsert-dataset-items"])

url = "https://api.raas.kklab.com/upsert-dataset-items"

data={
    "item_dataset_id": "$ITEM_DATASET_ID",
    "items": [
        {
            "item_id": "1",
            "attributes": {
                "title": "Toy Story (1995)",
                "genres": [
                    "Adventure",
                    "Animation",
                    "Children",
                    "Comedy",
                    "Fantasy"
                ]
            }
        },
        {
            "item_id": "2",
            "attributes": {
                "title": "Jumanji (1995)",
                "genres": [
                    "Adventure",
                    "Children",
                    "Fantasy"
                ]
            }
        }
    ]
}

response = requests.request("POST", url, headers=headers, data=json.dumps(data))

pprint(response.json())

The response looks like this:

{
    "item_dataset_id": "$ITEM_DATASET_ID",
    "succeed_ids": [
        "1",
        "2"
    ],
    "failed_ids": []
}

Both items were imported successfully.
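The two items above match rows of the MovieLens movies.csv file. As a rough sketch, assuming the movieId,title,genres column layout with pipe-separated genres, rows can be converted into the items payload like this. The function names are hypothetical, and treating the "(no genres listed)" marker as an empty list is a choice made for this example.

```python
import csv
import io


def movie_row_to_item(row):
    """Convert one movies.csv row (as a dict) into an item for the payload."""
    raw = row["genres"]
    # MovieLens marks movies without genres with a literal placeholder string.
    genres = [] if raw in ("", "(no genres listed)") else raw.split("|")
    return {
        "item_id": row["movieId"],
        "attributes": {"title": row["title"], "genres": genres},
    }


def load_items(csv_text):
    """Parse movies.csv text into a list of item dicts."""
    return [movie_row_to_item(r) for r in csv.DictReader(io.StringIO(csv_text))]


sample = (
    "movieId,title,genres\n"
    "1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n"
    "2,Jumanji (1995),Adventure|Children|Fantasy\n"
)
items = load_items(sample)
```

The resulting list can be placed under the "items" key of the request body, next to "item_dataset_id".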

Create an Event Dataset (/create-event-dataset)

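Leave attribute_schema unspecified so that the system infers it, and set num_shards to 10. As the total size of the dataset grows, increase num_shards by one for every 100,000 records.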

curl --request POST 'https://api.raas.kklab.com/create-event-dataset' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
    "name": "movielens-events",
    "num_shards": 10,
    "attribute_schema": null
}'
import json
import requests
from pprint import pprint

headers = create_headers(scopes=["/create-event-dataset"])

url = "https://api.raas.kklab.com/create-event-dataset"

data={
    "name": "movielens-events",
    "num_shards": 10,
    "attribute_schema": None  # Python None; json.dumps serializes it as JSON null
}

response = requests.request("POST", url, headers=headers, data=json.dumps(data))

pprint(response.json())

In what follows, $EVENT_DATASET_ID stands for the ID of the event dataset just created. The response looks like this:

{
    "event_dataset_id": "$EVENT_DATASET_ID",
    "name": "movielens-events",
    "attribute_schema": null,
    "num_shards": 10
}

Import Events into the Event Dataset (/upsert-dataset-events)

Import two events, each a user's rating of a movie:

curl --location --request POST 'https://api.raas.kklab.com/upsert-dataset-events' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
    "event_dataset_id": "$EVENT_DATASET_ID",
    "events": [
        {
            "event_id": "1_1_964982703000",
            "user_id": "1",
            "item_id": "1",
            "ts": 964982703000,
            "event_type": "rating",
            "event_value": 4.0,
            "attributes": {}
        },
        {
            "event_id": "1_2_964981247000",
            "user_id": "1",
            "item_id": "2",
            "ts": 964981247000,
            "event_type": "rating",
            "event_value": 5.0,
            "attributes": {}
        }
    ]
}'
import json
import requests
from pprint import pprint

headers = create_headers(scopes=["/upsert-dataset-events"])

url = "https://api.raas.kklab.com/upsert-dataset-events"

data={
    "event_dataset_id": "$EVENT_DATASET_ID",
    "events": [
        {
            "event_id": "1_1_964982703000",
            "user_id": "1",
            "item_id": "1",
            "ts": 964982703000,
            "event_type": "rating",
            "event_value": 4.0,
            "attributes": {}
        },
        {
            "event_id": "1_2_964981247000",
            "user_id": "1",
            "item_id": "2",
            "ts": 964981247000,
            "event_type": "rating",
            "event_value": 5.0,
            "attributes": {}
        }
    ]
}

response = requests.request("POST", url, headers=headers, data=json.dumps(data))

pprint(response.json())

The response looks like this:

{
    "event_dataset_id": "$EVENT_DATASET_ID",
    "succeed_ids": [
        "1_1_964982703000",
        "1_2_964981247000"
    ],
    "failed_ids": []
}

Both events were imported successfully.
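The ts values in the events above have 13 digits, which is consistent with millisecond timestamps, while MovieLens ratings.csv stores timestamps in seconds. The sketch below shows one way to build an event from a rating row under that assumption. The event_id pattern (user, item, and millisecond timestamp joined by underscores) simply mirrors the example request; the function name is hypothetical.

```python
def rating_row_to_event(user_id, movie_id, rating, timestamp_seconds):
    """Build one rating event from the four fields of a ratings.csv row."""
    # Assumption: the service expects milliseconds, as in the examples above.
    ts_ms = int(timestamp_seconds) * 1000
    return {
        "event_id": f"{user_id}_{movie_id}_{ts_ms}",
        "user_id": str(user_id),
        "item_id": str(movie_id),
        "ts": ts_ms,
        "event_type": "rating",
        "event_value": float(rating),
        "attributes": {},
    }


event = rating_row_to_event("1", "1", "4.0", "964982703")
```

Deriving event_id from the row's own fields keeps it stable across repeated imports of the same file.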

Create a Project and an Algorithm

Create a Project (/create-project)

Create a project of type RECOMMEND_ITEMS_TO_USER that uses the datasets created and populated above.

curl --request POST 'https://api.raas.kklab.com/create-project' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
    "name": "movielens-project",
    "project_type": "RECOMMEND_ITEMS_TO_USER",
    "user_dataset_id": "$USER_DATASET_ID",
    "item_dataset_id": "$ITEM_DATASET_ID",
    "event_dataset_id": "$EVENT_DATASET_ID",
    "target_event_type": "rating"
}'
import json
import requests
from pprint import pprint

headers = create_headers(scopes=["/create-project"])

url = "https://api.raas.kklab.com/create-project"

data={
    "name": "movielens-project",
    "project_type": "RECOMMEND_ITEMS_TO_USER",
    "user_dataset_id": "$USER_DATASET_ID",
    "item_dataset_id": "$ITEM_DATASET_ID",
    "event_dataset_id": "$EVENT_DATASET_ID",
    "target_event_type": "rating"
}

response = requests.request("POST", url, headers=headers, data=json.dumps(data))

pprint(response.json())

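In what follows, $PROJECT_ID stands for the ID of the project just created, and $SERVING_URI stands for the URL of the project's recommendation API. The response looks like this: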

{
    "project_id": "$PROJECT_ID",
    "name": "movielens-project",
    "project_type": "RECOMMEND_ITEMS_TO_USER",
    "user_dataset_id": "$USER_DATASET_ID",
    "item_dataset_id": "$ITEM_DATASET_ID",
    "event_dataset_id": "$EVENT_DATASET_ID",
    "target_event_type": "rating",
    "serving_uri": "$SERVING_URI"
}

Create an Algorithm (/create-algorithm)

Create an algorithm to train the model with. We choose the G1 algorithm and let the system decide all of the algorithm parameters.

curl --request POST 'https://api.raas.kklab.com/create-algorithm' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
    "project_id": "$PROJECT_ID",
    "name": "default G1 algorithm",
    "algorithm_type": "G1",
    "algorithm_parameters": {}
}'
import json
import requests
from pprint import pprint

headers = create_headers(scopes=["/create-algorithm"])

url = "https://api.raas.kklab.com/create-algorithm"

data={
    "project_id": "$PROJECT_ID",
    "name": "default G1 algorithm",
    "algorithm_type": "G1",
    "algorithm_parameters": {}
}

response = requests.request("POST", url, headers=headers, data=json.dumps(data))

pprint(response.json())

In what follows, $ALGORITHM_ID stands for the ID of the algorithm just created. The response looks like this:

{
    "project_id": "$PROJECT_ID",
    "algorithm_id": "$ALGORITHM_ID",
    "name": "default G1 algorithm",
    "algorithm_type": "G1",
    "algorithm_parameters": {}
}

Train and Deploy a Model

Train a Model (/submit-training-job)

Train a model for the current project.

curl --location --request POST 'https://api.raas.kklab.com/submit-training-job' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
    "project_id": "$PROJECT_ID",
    "algorithm_id": "$ALGORITHM_ID",
    "name": "G1 model"
}'
import json
import requests
from pprint import pprint

headers = create_headers(scopes=["/submit-training-job"])

url = "https://api.raas.kklab.com/submit-training-job"

data={
    "project_id": "$PROJECT_ID",
    "algorithm_id": "$ALGORITHM_ID",
    "name": "G1 model"
}

response = requests.request("POST", url, headers=headers, data=json.dumps(data))

pprint(response.json())

In what follows, $TRAINING_JOB_ID stands for the ID of the training job. The response looks like this:

{
    "job_id": "$TRAINING_JOB_ID"
}

Get the Training Job's Progress (/get-job)

Query the progress of the training job.

curl --request GET 'https://api.raas.kklab.com/get-job?project_id=$PROJECT_ID&job_id=$TRAINING_JOB_ID' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY'
import requests
from pprint import pprint

headers = create_headers(scopes=["/get-job"])

url = "https://api.raas.kklab.com/get-job?project_id=$PROJECT_ID&job_id=$TRAINING_JOB_ID"

response = requests.request("GET", url, headers=headers)  # GET request: no body to send

pprint(response.json())

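Because the sample dataset is very small, the training job finishes within a few minutes. In what follows, $MODEL_ID stands for the ID of the trained model. Once the job has finished, the response looks like this: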

{
    "project_id": "$PROJECT_ID",
    "job_id": "$TRAINING_JOB_ID",
    "job_type": "TRAINING",
    "created_ts": 1624287847047,
    "name": "G1 model",
    "job_data": {
        "algorithm_type": "G1",
        "algorithm_parameters": {}
    },
    "job_result": {
        "model_id": "$MODEL_ID"
    },
    "job_status": "SUCCEED",
    "retry_count": 0
}
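Since a job takes a few minutes, a script usually has to call /get-job repeatedly until job_status changes. The following is a minimal polling sketch, not part of the service's tooling. It relies only on the "SUCCEED" value shown above, because other status values are not listed in this section, so a job that ends in any other state is only caught by the timeout. The fetch_job callable, the function name, and the default intervals are all assumptions.

```python
import time


def wait_for_job(fetch_job, poll_seconds=30.0, timeout_seconds=1800.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch_job() until it reports job_status "SUCCEED" or time runs out.

    fetch_job is any zero-argument callable returning the parsed /get-job
    response as a dict, for example a small wrapper around requests.
    """
    deadline = clock() + timeout_seconds
    while True:
        job = fetch_job()
        if job.get("job_status") == "SUCCEED":
            return job
        # Other status values are not documented here, so just keep waiting.
        if clock() >= deadline:
            raise TimeoutError(f"job not finished, last status: {job.get('job_status')}")
        sleep(poll_seconds)
```

For a training job, the returned dict's job_result holds the model_id needed for deployment. The same function can be reused for the deployment job.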

Deploy the Model (/submit-deployment-job)

Deploy the model so that the recommendation API can use it.

curl --request POST 'https://api.raas.kklab.com/submit-deployment-job' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
    "project_id": "$PROJECT_ID",
    "generator_model_id": "$MODEL_ID",
    "name": "my deployment job"
}'
import json
import requests
from pprint import pprint

headers = create_headers(scopes=["/submit-deployment-job"])

url = "https://api.raas.kklab.com/submit-deployment-job"

data={
    "project_id": "$PROJECT_ID",
    "generator_model_id": "$MODEL_ID",
    "name": "my deployment job"
}

response = requests.request("POST", url, headers=headers, data=json.dumps(data))

pprint(response.json())

In what follows, $DEPLOYMENT_JOB_ID stands for the ID of the deployment job. The response looks like this:

{
    "job_id": "$DEPLOYMENT_JOB_ID"
}

Get the Deployment Job's Progress (/get-job)

Query the progress of the deployment job.

curl --request GET 'https://api.raas.kklab.com/get-job?project_id=$PROJECT_ID&job_id=$DEPLOYMENT_JOB_ID' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY'
import requests
from pprint import pprint

headers = create_headers(scopes=["/get-job"])

url = "https://api.raas.kklab.com/get-job?project_id=$PROJECT_ID&job_id=$DEPLOYMENT_JOB_ID"

response = requests.request("GET", url, headers=headers)  # GET request: no body to send

pprint(response.json())

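Because the sample dataset is very small, the deployment job finishes within a few minutes. Once the job has finished, the response looks like this: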

{
    "project_id": "$PROJECT_ID",
    "job_id": "$DEPLOYMENT_JOB_ID",
    "job_type": "DEPLOYMENT",
    "created_ts": 1624341750746,
    "name": "my deployment job",
    "job_data": {
        "generator_model_id": "$MODEL_ID",
        "ranker_model_id": null
    },
    "job_result": {
        "serving_uri": "$SERVING_URI"
    },
    "job_status": "SUCCEED",
    "retry_count": 0
}

Get Recommendations

Get Recommended Items for a User (/recommend-items-to-user)

Get recommended items for a single user.

curl --request POST 'https://api.raas.kklab.com/recommend-items-to-user' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
    "user_id": "1",
    "attributes": [
        "title",
        "genres"
    ],
    "limit": 10
}'
import json
import requests
from pprint import pprint

headers = create_headers(scopes=["/recommend-items-to-user"])

url = "https://api.raas.kklab.com/recommend-items-to-user"

data={
    "user_id": "1",
    "attributes": [
        "title",
        "genres"
    ],
    "limit": 10
}

response = requests.request("POST", url, headers=headers, data=json.dumps(data))

pprint(response.json())

The response looks like this:

{
    "user_id": "1",
    "limit": 10,
    "attributes": [
        "title",
        "genres"
    ],
    "filter_query": null,
    "results": [
        {
            "item": {
                "item_id": "1",
                "title": "Toy Story (1995)",
                "genres": [
                    "Adventure",
                    "Animation",
                    "Children",
                    "Comedy",
                    "Fantasy"
                ]
            }
        },
        {
            "item": {
                "item_id": "2",
                "title": "Jumanji (1995)",
                "genres": [
                    "Adventure",
                    "Children",
                    "Fantasy"
                ]
            }
        }
    ]
}
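To use the recommendations in an application, the item entries have to be pulled out of the results list. The sketch below assumes the response shape shown above, where each result wraps an "item" object carrying the requested attributes. The function name is hypothetical.

```python
def recommended_titles(response_body):
    """Return (item_id, title) pairs in the order the service returned them."""
    pairs = []
    for result in response_body.get("results", []):
        item = result.get("item", {})
        # title is present only if it was listed in the request's "attributes".
        pairs.append((item.get("item_id"), item.get("title")))
    return pairs
```

Using .get keeps the function from failing when an attribute was not requested; in that case the corresponding value is None.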