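Quick Start
In this quick start, we use the MovieLens Dataset as an example and walk through training a model, deploying it, and calling the recommendation service with curl commands from the command line.
Besides the command line, you can also use the Postman Collection and Postman Environment.
Below, $API_KEY stands for the developer's API_KEY, and $TOKEN stands for the token used for authentication.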
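The Python snippets in this guide call a helper named create_headers that this section does not define. The following is only a hypothetical stand-in so the snippets can run: it assumes the token and API key were already obtained and are available in environment variables named TOKEN and API_KEY (both names are assumptions), and it ignores the scopes argument, since how the real helper uses scopes is not described here.

```python
import os


def create_headers(scopes, token=None, api_key=None):
    """Hypothetical stand-in for the create_headers helper used in this guide.

    `scopes` is accepted only to keep the call signature used in the snippets;
    this sketch does not use it to request or validate a token.
    """
    # Assumption: credentials come from the TOKEN and API_KEY environment variables.
    token = token if token is not None else os.environ.get("TOKEN", "")
    api_key = api_key if api_key is not None else os.environ.get("API_KEY", "")
    # Same three headers that the curl examples send.
    return {
        "Authorization": f"Bearer {token}",
        "APIKey": api_key,
        "Content-Type": "application/json",
    }
```

The returned dictionary mirrors the three --header options in the curl commands, so either form of a request should carry the same credentials.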
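Create and Import Datasets
Create a user dataset (/create-user-dataset)
Leave attribute_schema unspecified so that the system infers it, and set num_shards to 10. Depending on the total size of the dataset, increase num_shards by one for every 100,000 records.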
curl --request POST 'https://api.raas.kklab.com/create-user-dataset' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "movielens-users",
"num_shards": 10,
"attribute_schema": null
}'
import json
import requests
from pprint import pprint
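# create_headers is defined outside this section; it is expected to return
# the same Authorization, APIKey, and Content-Type headers as the curl example.
headers = create_headers(scopes=["/create-user-dataset"])
url = "https://api.raas.kklab.com/create-user-dataset"
data = {
"name": "movielens-users",
"num_shards": 10,
"attribute_schema": None  # Python uses None; json.dumps serializes it as null
}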
response = requests.request("POST", url, headers=headers, data=json.dumps(data))
pprint(response.json())
Below, $USER_DATASET_ID stands for the ID of the user dataset just created. The response is as follows:
{
"user_dataset_id": "$USER_DATASET_ID",
"name": "movielens-users",
"attribute_schema": null,
"num_shards": 10
}
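The num_shards guidance for dataset creation can be read in more than one way. The sketch below takes one reading, namely that 10 is the starting value and one shard is added for each full 100,000 records. The function name and that interpretation are assumptions, not something the API documentation confirms.

```python
def suggest_num_shards(total_records, base=10, records_per_extra_shard=100_000):
    """Suggest a num_shards value under one reading of the sizing rule.

    Assumption: start from `base` and add one shard for every full
    `records_per_extra_shard` records in the dataset.
    """
    if total_records < 0:
        raise ValueError("total_records must be non-negative")
    return base + total_records // records_per_extra_shard


# A tiny sample dataset stays at the starting value.
print(suggest_num_shards(2))
# A dataset of 250,000 records gets two extra shards under this reading.
print(suggest_num_shards(250_000))
```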
Import users into the user dataset (/upsert-dataset-users)
Import two users whose user_id values are 1 and 2, with no attributes.
curl --location --request POST 'https://api.raas.kklab.com/upsert-dataset-users' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"user_dataset_id": "$USER_DATASET_ID",
"users": [
{
"user_id": "1",
"attributes": {}
},
{
"user_id": "2",
"attributes": {}
}
]
}'
import json
import requests
from pprint import pprint
headers = create_headers(scopes=["/upsert-dataset-users"])
url = "https://api.raas.kklab.com/upsert-dataset-users"
data={
"user_dataset_id": "$USER_DATASET_ID",
"users": [
{
"user_id": "1",
"attributes": {}
},
{
"user_id": "2",
"attributes": {}
}
]
}
response = requests.request("POST", url, headers=headers, data=json.dumps(data))
pprint(response.json())
The response is as follows:
{
"user_dataset_id": "$USER_DATASET_ID",
"succeed_ids": [
"1",
"2"
],
"failed_ids": []
}
Both users were imported successfully.
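When importing more than a handful of records, it can help to compare the response against what was submitted instead of reading it by eye. The following is a minimal sketch that relies only on the succeed_ids and failed_ids fields shown in the response above; the function name and the shape of its return value are illustrative assumptions.

```python
def summarize_upsert(response_body, submitted_ids):
    """Compare an upsert response with the IDs that were submitted.

    Uses only the `succeed_ids` and `failed_ids` fields shown in this guide.
    """
    succeeded = set(response_body.get("succeed_ids", []))
    failed = list(response_body.get("failed_ids", []))
    # IDs that were sent but appear in neither list of the response.
    missing = [i for i in submitted_ids if i not in succeeded and i not in failed]
    return {"ok": not failed and not missing, "failed": failed, "missing": missing}


body = {"user_dataset_id": "$USER_DATASET_ID", "succeed_ids": ["1", "2"], "failed_ids": []}
print(summarize_upsert(body, ["1", "2"]))
```

The same check should apply to the item and event upsert responses, since they use the same two fields.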
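Create an item dataset (/create-item-dataset)
Leave attribute_schema unspecified so that the system infers it, and set num_shards to 10. Depending on the total size of the dataset, increase num_shards by one for every 100,000 records.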
curl --request POST 'https://api.raas.kklab.com/create-item-dataset' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "movielens-items",
"num_shards": 10,
"attribute_schema": null
}'
import json
import requests
from pprint import pprint
headers = create_headers(scopes=["/create-item-dataset"])
url = "https://api.raas.kklab.com/create-item-dataset"
data = {
"name": "movielens-items",
"num_shards": 10,
"attribute_schema": None  # Python uses None; json.dumps serializes it as null
}
response = requests.request("POST", url, headers=headers, data=json.dumps(data))
pprint(response.json())
Below, $ITEM_DATASET_ID stands for the ID of the item dataset just created. The response is as follows:
{
"item_dataset_id": "$ITEM_DATASET_ID",
"name": "movielens-items",
"attribute_schema": null,
"num_shards": 10
}
Import items into the item dataset (/upsert-dataset-items)
Import two movies and set two attributes, title and genres, on each:
curl --location --request POST 'https://api.raas.kklab.com/upsert-dataset-items' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"item_dataset_id": "$ITEM_DATASET_ID",
"items": [
{
"item_id": "1",
"attributes": {
"title": "Toy Story (1995)",
"genres": [
"Adventure",
"Animation",
"Children",
"Comedy",
"Fantasy"
]
}
},
{
"item_id": "2",
"attributes": {
"title": "Jumanji (1995)",
"genres": [
"Adventure",
"Children",
"Fantasy"
]
}
}
]
}'
import json
import requests
from pprint import pprint
headers = create_headers(scopes=["/upsert-dataset-items"])
url = "https://api.raas.kklab.com/upsert-dataset-items"
data={
"item_dataset_id": "$ITEM_DATASET_ID",
"items": [
{
"item_id": "1",
"attributes": {
"title": "Toy Story (1995)",
"genres": [
"Adventure",
"Animation",
"Children",
"Comedy",
"Fantasy"
]
}
},
{
"item_id": "2",
"attributes": {
"title": "Jumanji (1995)",
"genres": [
"Adventure",
"Children",
"Fantasy"
]
}
}
]
}
response = requests.request("POST", url, headers=headers, data=json.dumps(data))
pprint(response.json())
The response is as follows:
{
"item_dataset_id": "$ITEM_DATASET_ID",
"succeed_ids": [
"1",
"2"
],
"failed_ids": []
}
Both items were imported successfully.
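The item payload above can be produced from the MovieLens files rather than typed by hand. This sketch assumes the CSV layout of the ml-latest-small release, where movies.csv has the columns movieId, title, and genres, and genres are separated by a pipe character. The function name is illustrative, and an in-memory sample stands in for the real file.

```python
import csv
import io


def movie_row_to_item(row):
    """Convert one movies.csv row into an item for /upsert-dataset-items."""
    # MovieLens stores genres as a single pipe-separated string.
    genres = row["genres"].split("|") if row["genres"] else []
    return {
        "item_id": row["movieId"],
        "attributes": {"title": row["title"], "genres": genres},
    }


# In-memory sample in the assumed movies.csv layout.
sample = io.StringIO(
    "movieId,title,genres\n"
    "1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n"
    "2,Jumanji (1995),Adventure|Children|Fantasy\n"
)
items = [movie_row_to_item(row) for row in csv.DictReader(sample)]
print(items)
```

The resulting list can be placed under the items key of the request body, next to item_dataset_id.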
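Create an event dataset (/create-event-dataset)
Leave attribute_schema unspecified so that the system infers it, and set num_shards to 10. Depending on the total size of the dataset, increase num_shards by one for every 100,000 records.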
curl --request POST 'https://api.raas.kklab.com/create-event-dataset' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "movielens-events",
"num_shards": 10,
"attribute_schema": null
}'
import json
import requests
from pprint import pprint
headers = create_headers(scopes=["/create-event-dataset"])
url = "https://api.raas.kklab.com/create-event-dataset"
data = {
"name": "movielens-events",
"num_shards": 10,
"attribute_schema": None  # Python uses None; json.dumps serializes it as null
}
response = requests.request("POST", url, headers=headers, data=json.dumps(data))
pprint(response.json())
Below, $EVENT_DATASET_ID stands for the ID of the event dataset just created. The response is as follows:
{
"event_dataset_id": "$EVENT_DATASET_ID",
"name": "movielens-events",
"attribute_schema": null,
"num_shards": 10
}
Import events into the event dataset (/upsert-dataset-events)
Import two ratings that a user gave to movies:
curl --location --request POST 'https://api.raas.kklab.com/upsert-dataset-events' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"event_dataset_id": "$EVENT_DATASET_ID",
"events": [
{
"event_id": "1_1_964982703000",
"user_id": "1",
"item_id": "1",
"ts": 964982703000,
"event_type": "rating",
"event_value": 4.0,
"attributes": {}
},
{
"event_id": "1_2_964981247000",
"user_id": "1",
"item_id": "2",
"ts": 964981247000,
"event_type": "rating",
"event_value": 5.0,
"attributes": {}
}
]
}'
import json
import requests
from pprint import pprint
headers = create_headers(scopes=["/upsert-dataset-events"])
url = "https://api.raas.kklab.com/upsert-dataset-events"
data={
"event_dataset_id": "$EVENT_DATASET_ID",
"events": [
{
"event_id": "1_1_964982703000",
"user_id": "1",
"item_id": "1",
"ts": 964982703000,
"event_type": "rating",
"event_value": 4.0,
"attributes": {}
},
{
"event_id": "1_2_964981247000",
"user_id": "1",
"item_id": "2",
"ts": 964981247000,
"event_type": "rating",
"event_value": 5.0,
"attributes": {}
}
]
}
response = requests.request("POST", url, headers=headers, data=json.dumps(data))
pprint(response.json())
The response is as follows:
{
"event_dataset_id": "$EVENT_DATASET_ID",
"succeed_ids": [
"1_1_964982703000",
"1_2_964981247000"
],
"failed_ids": []
}
Both events were imported successfully.
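The event records above follow a visible pattern: ts is a 13-digit value and event_id is the user ID, item ID, and ts joined by underscores. The sketch below builds such an event from a MovieLens ratings.csv row, assuming the ml-latest-small columns userId, movieId, rating, and timestamp, with timestamp in seconds. Treating ts as milliseconds and the event_id convention are both inferred from the example values, not confirmed by the API documentation.

```python
def rating_row_to_event(row):
    """Convert one ratings.csv row into an event for /upsert-dataset-events."""
    # Assumption: the API expects milliseconds, while MovieLens stores seconds.
    ts_ms = int(row["timestamp"]) * 1000
    return {
        # Assumption: event_id follows the user_item_ts pattern seen in the example.
        "event_id": f"{row['userId']}_{row['movieId']}_{ts_ms}",
        "user_id": row["userId"],
        "item_id": row["movieId"],
        "ts": ts_ms,
        "event_type": "rating",
        "event_value": float(row["rating"]),
        "attributes": {},
    }


row = {"userId": "1", "movieId": "1", "rating": "4.0", "timestamp": "964982703"}
event = rating_row_to_event(row)
print(event)
```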
Create a Project and an Algorithm
Create a project (/create-project)
Create a project of type RECOMMEND_ITEMS_TO_USER that uses the datasets just created and populated.
curl --request POST 'https://api.raas.kklab.com/create-project' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "movielens-project",
"project_type": "RECOMMEND_ITEMS_TO_USER",
"user_dataset_id": "$USER_DATASET_ID",
"item_dataset_id": "$ITEM_DATASET_ID",
"event_dataset_id": "$EVENT_DATASET_ID",
"target_event_type": "rating"
}'
import json
import requests
from pprint import pprint
headers = create_headers(scopes=["/create-project"])
url = "https://api.raas.kklab.com/create-project"
data={
"name": "movielens-project",
"project_type": "RECOMMEND_ITEMS_TO_USER",
"user_dataset_id": "$USER_DATASET_ID",
"item_dataset_id": "$ITEM_DATASET_ID",
"event_dataset_id": "$EVENT_DATASET_ID",
"target_event_type": "rating"
}
response = requests.request("POST", url, headers=headers, data=json.dumps(data))
pprint(response.json())
Below, $PROJECT_ID stands for the ID of the created project, and $SERVING_URI stands for the project's recommendation API URL. The response is as follows:
{
"project_id": "$PROJECT_ID",
"name": "movielens-project",
"project_type": "RECOMMEND_ITEMS_TO_USER",
"user_dataset_id": "$USER_DATASET_ID",
"item_dataset_id": "$ITEM_DATASET_ID",
"event_dataset_id": "$EVENT_DATASET_ID",
"target_event_type": "rating",
"serving_uri": "$SERVING_URI"
}
Create an algorithm (/create-algorithm)
Create an algorithm to train the model with. We choose the G1 algorithm and let the system decide all algorithm parameters.
curl --request POST 'https://api.raas.kklab.com/create-algorithm' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"project_id": "$PROJECT_ID",
"name": "default G1 algorithm",
"algorithm_type": "G1",
"algorithm_parameters": {}
}'
import json
import requests
from pprint import pprint
headers = create_headers(scopes=["/create-algorithm"])
url = "https://api.raas.kklab.com/create-algorithm"
data={
"project_id": "$PROJECT_ID",
"name": "default G1 algorithm",
"algorithm_type": "G1",
"algorithm_parameters": {}
}
response = requests.request("POST", url, headers=headers, data=json.dumps(data))
pprint(response.json())
Below, $ALGORITHM_ID stands for the ID of the created algorithm. The response is as follows:
{
"project_id": "$PROJECT_ID",
"algorithm_id": "$ALGORITHM_ID",
"name": "default G1 algorithm",
"algorithm_type": "G1",
"algorithm_parameters": {}
}
Train and Deploy a Model
Train a model (/submit-training-job)
Train a model for the current project.
curl --location --request POST 'https://api.raas.kklab.com/submit-training-job' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"project_id": "$PROJECT_ID",
"algorithm_id": "$ALGORITHM_ID",
"name": "G1 model"
}'
import json
import requests
from pprint import pprint
headers = create_headers(scopes=["/submit-training-job"])
url = "https://api.raas.kklab.com/submit-training-job"
data={
"project_id": "$PROJECT_ID",
"algorithm_id": "$ALGORITHM_ID",
"name": "G1 model"
}
response = requests.request("POST", url, headers=headers, data=json.dumps(data))
pprint(response.json())
Below, $TRAINING_JOB_ID stands for the ID of the training job. The response is as follows:
{
"job_id": "$TRAINING_JOB_ID"
}
Get the training job status (/get-job)
Query the progress of the training job.
curl --request GET 'https://api.raas.kklab.com/get-job?project_id=$PROJECT_ID&job_id=$TRAINING_JOB_ID' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY'
import requests
from pprint import pprint
headers = create_headers(scopes=["/get-job"])
url = "https://api.raas.kklab.com/get-job?project_id=$PROJECT_ID&job_id=$TRAINING_JOB_ID"
response = requests.request("GET", url, headers=headers)  # a GET request sends no body
pprint(response.json())
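Because the sample dataset is very small, the training job finishes within a few minutes. Below, $MODEL_ID stands for the ID of the trained model. When the job completes, the response is as follows: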
{
"project_id": "$PROJECT_ID",
"job_id": "$TRAINING_JOB_ID",
"job_type": "TRAINING",
"created_ts": 1624287847047,
"name": "G1 model",
"job_data": {
"algorithm_type": "G1",
"algorithm_parameters": {}
},
"job_result": {
"model_id": "$MODEL_ID"
},
"job_status": "SUCCEED",
"retry_count": 0
}
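Since a job takes a few minutes, the /get-job call usually has to be repeated until job_status changes. The following is a minimal polling sketch. It recognizes only the SUCCEED status shown above, because this section does not list any other status values, so a failed job would surface as a timeout here. The function name, the interval, and the timeout are assumptions.

```python
import time


def wait_for_job(fetch_job, interval_seconds=30.0, timeout_seconds=1800.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch_job() repeatedly until job_status is SUCCEED or time runs out.

    fetch_job is any zero-argument callable that returns the parsed /get-job
    response, for example a small wrapper around the requests call shown above.
    """
    deadline = clock() + timeout_seconds
    while True:
        job = fetch_job()
        if job.get("job_status") == "SUCCEED":
            return job
        # Other status values are not documented here, so keep waiting.
        if clock() >= deadline:
            raise TimeoutError("job did not reach SUCCEED before the timeout")
        sleep(interval_seconds)
```

With the snippet above, fetch_job could be written as lambda: requests.get(url, headers=headers).json(), and the model ID would then be read from the returned job_result. The same loop should work for the deployment job, which reports its status through the same endpoint.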
Deploy the model (/submit-deployment-job)
Deploy the model so that the recommendation API can use it.
curl --request POST 'https://api.raas.kklab.com/submit-deployment-job' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"project_id": "$PROJECT_ID",
"generator_model_id": "$MODEL_ID",
"name": "my deployment job"
}'
import json
import requests
from pprint import pprint
headers = create_headers(scopes=["/submit-deployment-job"])
url = "https://api.raas.kklab.com/submit-deployment-job"
data={
"project_id": "$PROJECT_ID",
"generator_model_id": "$MODEL_ID",
"name": "my deployment job"
}
response = requests.request("POST", url, headers=headers, data=json.dumps(data))
pprint(response.json())
Below, $DEPLOYMENT_JOB_ID stands for the ID of the deployment job. The response is as follows:
{
"job_id": "$DEPLOYMENT_JOB_ID"
}
Get the deployment job status (/get-job)
Query the progress of the deployment job.
curl --request GET 'https://api.raas.kklab.com/get-job?project_id=$PROJECT_ID&job_id=$DEPLOYMENT_JOB_ID' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY'
import requests
from pprint import pprint
headers = create_headers(scopes=["/get-job"])
url = "https://api.raas.kklab.com/get-job?project_id=$PROJECT_ID&job_id=$DEPLOYMENT_JOB_ID"
response = requests.request("GET", url, headers=headers)  # a GET request sends no body
pprint(response.json())
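Because the sample dataset is very small, the deployment job finishes within a few minutes. When the job completes, the response is as follows: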
{
"project_id": "$PROJECT_ID",
"job_id": "$DEPLOYMENT_JOB_ID",
"job_type": "DEPLOYMENT",
"created_ts": 1624341750746,
"name": "my deployment job",
"job_data": {
"generator_model_id": "$MODEL_ID",
"ranker_model_id": null
},
"job_result": {
"serving_uri": "$SERVING_URI"
},
"job_status": "SUCCEED",
"retry_count": 0
}
Get Recommendations
Get recommended items for a user (/recommend-items-to-user)
Get recommended items for one user.
curl --request POST 'https://api.raas.kklab.com/recommend-items-to-user' \
--header 'Authorization: Bearer $TOKEN' \
--header 'APIKey: $API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
"user_id": "1",
"attributes": [
"title",
"genres"
],
"limit": 10
}'
import json
import requests
from pprint import pprint
headers = create_headers(scopes=["/recommend-items-to-user"])
url = "https://api.raas.kklab.com/recommend-items-to-user"
data={
"user_id": "1",
"attributes": [
"title",
"genres"
],
"limit": 10
}
response = requests.request("POST", url, headers=headers, data=json.dumps(data))
pprint(response.json())
The response is as follows:
{
"user_id": "1",
"limit": 10,
"attributes": [
"title",
"genres"
],
"filter_query": null,
"results": [
{
"item": {
"item_id": "1",
"title": "Toy Story (1995)",
"genres": [
"Adventure",
"Animation",
"Children",
"Comedy",
"Fantasy"
]
}
},
{
"item": {
"item_id": "2",
"title": "Jumanji (1995)",
"genres": [
"Adventure",
"Children",
"Fantasy"
]
}
}
]
}
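In the response above, each entry in results wraps the item's ID and the requested attributes under an item key. As a small sketch of consuming it, the following function pulls out the ID and title of each recommended item in order. The function name is illustrative, and it assumes the response has exactly the shape shown above.

```python
def recommended_titles(response_body):
    """Return (item_id, title) pairs in the order the API returned them."""
    pairs = []
    for result in response_body.get("results", []):
        item = result["item"]
        # title is present only if it was requested through `attributes`.
        pairs.append((item["item_id"], item.get("title")))
    return pairs


response_body = {
    "user_id": "1",
    "limit": 10,
    "results": [
        {"item": {"item_id": "1", "title": "Toy Story (1995)"}},
        {"item": {"item_id": "2", "title": "Jumanji (1995)"}},
    ],
}
print(recommended_titles(response_body))
```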