REST API Reference

The PaiTIENT Secure Model Service provides a REST API that you can call directly from your applications, without using our client SDKs.

Base URL

https://api.paitient.ai/v1

Authentication

All API requests require authentication. See the Authentication section for details.

Endpoints

Models

List Models

http
GET /models

Description: Returns a list of models that are available for deployment.

Response:

json
{
  "models": [
    {
      "id": "ZimaBlueAI/HuatuoGPT-o1-8B",
      "name": "HuatuoGPT",
      "version": "o1-8B",
      "description": "Clinical assistant model trained for healthcare applications",
      "parameters": 8000000000,
      "context_length": 4096,
      "supported_fine_tuning": true
    },
    {
      "id": "other-model-id",
      "name": "Other Model Name",
      "version": "1.0",
      "description": "Description of other model",
      "parameters": 7000000000,
      "context_length": 8192,
      "supported_fine_tuning": false
    }
  ]
}
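
As an illustration of consuming this response, the following minimal Python sketch filters the list down to models that support fine-tuning. It parses a trimmed copy of the sample payload above rather than calling the API, and the helper name `fine_tunable_model_ids` is hypothetical, not part of any PaiTIENT SDK.

```python
import json
from typing import List

# Trimmed copy of the sample response shown above (not fetched from the API).
SAMPLE_RESPONSE = """
{
  "models": [
    {"id": "ZimaBlueAI/HuatuoGPT-o1-8B", "context_length": 4096, "supported_fine_tuning": true},
    {"id": "other-model-id", "context_length": 8192, "supported_fine_tuning": false}
  ]
}
"""


def fine_tunable_model_ids(payload: dict) -> List[str]:
    """Return the IDs of models whose supported_fine_tuning flag is true."""
    return [
        model["id"]
        for model in payload.get("models", [])
        if model.get("supported_fine_tuning")
    ]


payload = json.loads(SAMPLE_RESPONSE)
print(fine_tunable_model_ids(payload))  # prints ['ZimaBlueAI/HuatuoGPT-o1-8B']
```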

Deployments

Create Deployment

http
POST /deployments

Description: Creates a new deployment for a model.

Request Body:

json
{
  "model_id": "ZimaBlueAI/HuatuoGPT-o1-8B",
  "deployment_name": "clinical-assistant",
  "compute_type": "gpu",
  "instance_type": "g4dn.xlarge",
  "min_replicas": 1,
  "max_replicas": 3,
  "auto_scaling": true,
  "tags": {
    "environment": "production",
    "department": "clinical-research"
  }
}

Response:

json
{
  "deployment_id": "dep_12345abcde",
  "status": "creating",
  "model_id": "ZimaBlueAI/HuatuoGPT-o1-8B",
  "deployment_name": "clinical-assistant",
  "created_at": "2023-11-28T15:42:53.912Z"
}
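
The request above could be assembled with Python's standard library roughly as follows. This is a sketch under stated assumptions: the `Authorization: Bearer` header is a guess at the authentication scheme (the Authentication section is the authority), and `build_create_deployment_request` is a hypothetical helper. The request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.paitient.ai/v1"


def build_create_deployment_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /deployments request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/deployments",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: bearer-token auth. Check the Authentication section.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_deployment_request(
    api_key="YOUR_API_KEY",  # placeholder value
    body={
        "model_id": "ZimaBlueAI/HuatuoGPT-o1-8B",
        "deployment_name": "clinical-assistant",
        "compute_type": "gpu",
        "instance_type": "g4dn.xlarge",
        "min_replicas": 1,
        "max_replicas": 3,
        "auto_scaling": True,
    },
)
print(request.get_method(), request.full_url)
```

To actually send it, pass the object to `urllib.request.urlopen(request)` and decode the JSON body of the response.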

Get Deployment

http
GET /deployments/:deployment_id

Description: Returns the details of a deployment.

Response:

json
{
  "deployment_id": "dep_12345abcde",
  "status": "running",
  "model_id": "ZimaBlueAI/HuatuoGPT-o1-8B",
  "deployment_name": "clinical-assistant",
  "endpoint": "https://api.paitient.ai/v1/deployments/dep_12345abcde/generate",
  "compute_type": "gpu",
  "instance_type": "g4dn.xlarge",
  "min_replicas": 1,
  "max_replicas": 3,
  "current_replicas": 1,
  "auto_scaling": true,
  "created_at": "2023-11-28T15:42:53.912Z",
  "updated_at": "2023-11-28T15:48:32.521Z",
  "tags": {
    "environment": "production",
    "department": "clinical-research"
  }
}
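
Because a new deployment starts in the `creating` state and later reports `running`, a client will typically poll this endpoint. The sketch below shows one way to do that in Python. The helper `wait_until_running`, its default attempt count, and its polling interval are all illustrative assumptions. The caller supplies `fetch_deployment`, a function that performs the GET request and returns the decoded JSON.

```python
import time
from typing import Callable


def wait_until_running(
    fetch_deployment: Callable[[], dict],
    max_attempts: int = 30,
    interval_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the deployment reports status "running" or attempts run out."""
    for attempt in range(max_attempts):
        deployment = fetch_deployment()
        if deployment.get("status") == "running":
            return deployment
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"deployment not running after {max_attempts} attempts")


# Usage with a stand-in for the real HTTP call: two "creating" polls, then "running".
fake_responses = iter(
    [
        {"deployment_id": "dep_12345abcde", "status": "creating"},
        {"deployment_id": "dep_12345abcde", "status": "creating"},
        {"deployment_id": "dep_12345abcde", "status": "running"},
    ]
)
final = wait_until_running(lambda: next(fake_responses), sleep=lambda _: None)
print(final["status"])  # prints running
```

The `sleep` parameter is injectable so that the loop can be exercised without real delays. Webhook notifications are an alternative to polling.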

List Deployments

http
GET /deployments

Description: Returns a list of deployments.

Response:

json
{
  "deployments": [
    {
      "deployment_id": "dep_12345abcde",
      "status": "running",
      "model_id": "ZimaBlueAI/HuatuoGPT-o1-8B",
      "deployment_name": "clinical-assistant",
      "created_at": "2023-11-28T15:42:53.912Z"
    }
  ]
}

Delete Deployment

http
DELETE /deployments/:deployment_id

Description: Deletes a deployment.

Response:

json
{
  "deployment_id": "dep_12345abcde",
  "status": "deleting"
}

Text Generation

Generate Text

http
POST /deployments/:deployment_id/generate

Description: Generates text using the deployed model.

Request Body:

json
{
  "prompt": "What are the potential side effects of metformin?",
  "max_tokens": 500,
  "temperature": 0.7,
  "top_p": 0.95,
  "stop": ["\n\n"]
}

Response:

json
{
  "id": "gen_67890fghij",
  "deployment_id": "dep_12345abcde",
  "prompt": "What are the potential side effects of metformin?",
  "completion": "Metformin, a commonly prescribed medication for type 2 diabetes, may cause several side effects. The most common side effects include gastrointestinal issues such as nausea, vomiting, diarrhea, abdominal discomfort, and a metallic taste in the mouth. These symptoms typically occur at the beginning of treatment and often resolve over time...",
  "finish_reason": "stop",
  "usage": {
    "prompt_tokens": 8,
    "completion_tokens": 482,
    "total_tokens": 490
  }
}
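
A small Python sketch of working with this endpoint follows. It derives the generation URL from a deployment ID and checks that the `usage` counts in a response add up. Both helpers (`generate_url`, `usage_is_consistent`) are hypothetical names introduced for illustration, and the response dictionary is a shortened copy of the sample above.

```python
from urllib.parse import quote

BASE_URL = "https://api.paitient.ai/v1"


def generate_url(deployment_id: str) -> str:
    """Return the text-generation URL for a deployment, escaping the ID."""
    return f"{BASE_URL}/deployments/{quote(deployment_id, safe='')}/generate"


def usage_is_consistent(response: dict) -> bool:
    """Check that prompt_tokens + completion_tokens equals total_tokens."""
    usage = response["usage"]
    return usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]


# Shortened copy of the sample response shown above.
sample_response = {
    "id": "gen_67890fghij",
    "deployment_id": "dep_12345abcde",
    "finish_reason": "stop",
    "usage": {"prompt_tokens": 8, "completion_tokens": 482, "total_tokens": 490},
}

print(generate_url(sample_response["deployment_id"]))
print(usage_is_consistent(sample_response))  # prints True
```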

Error Handling

See the Error Handling section for details on API error responses.

Rate Limits

The API enforces rate limits to ensure fair usage across all clients. Each response reports the current limits in the following headers:

  • X-RateLimit-Limit: The maximum number of requests allowed in a time window
  • X-RateLimit-Remaining: The number of requests remaining in the current time window
  • X-RateLimit-Reset: The time at which the current rate limit window resets, in UTC epoch seconds
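
A client can use these headers to decide how long to pause before its next request. The Python sketch below is one possible approach, not a prescribed one. The helper name `seconds_until_reset` is hypothetical, and treating a missing `X-RateLimit-Remaining` header as "not limited" is an assumption.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before retrying, based on the rate limit headers.

    Returns 0.0 while requests remain in the current window.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Assumption: a missing Remaining header means the request is not limited.
    remaining = int(normalized.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0

    reset_at = int(normalized["x-ratelimit-reset"])  # UTC epoch seconds
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)


example_headers = {
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1700000060",
}
print(seconds_until_reset(example_headers, now=1700000000.0))  # prints 60.0
```

Passing `now` explicitly keeps the calculation deterministic in tests. In production code, omit it so that the current time is used.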

Webhook Notifications

The API supports webhook notifications for asynchronous events such as deployment status changes. Configure webhooks in your client dashboard.

API Versions

The current API version is v1, as reflected in the base URL path. We maintain backward compatibility for all API versions.

Released under the MIT License.