
Python SDK: Text Generation

This guide covers the text generation capabilities of the PaiTIENT Secure Model Service Python SDK, which lets you securely generate text from deployed models in a HIPAA- and SOC 2-compliant environment.

Basic Text Generation

Generate text from a deployed model:

python
from paitient_secure_model import Client

# Initialize client
client = Client()

# Generate text
response = client.generate_text(
    deployment_id="dep_12345abcde",
    prompt="What are the potential side effects of metformin?"
)

print(response.text)

Generation Options

Text Parameters

Customize text generation with various parameters:

python
# Generation with parameters
response = client.generate_text(
    deployment_id="dep_12345abcde",
    prompt="What are the potential side effects of metformin?",
    max_tokens=500,              # Maximum length of the generated text
    temperature=0.7,             # Controls randomness (0.0-1.0)
    top_p=0.95,                  # Nucleus sampling parameter
    top_k=50,                    # Top-k sampling parameter
    stop=["\n\n", "END"],        # Stop sequences
    repetition_penalty=1.1,      # Penalize repeated tokens
    presence_penalty=0.0,        # Penalize tokens that have already appeared at all
    frequency_penalty=0.0        # Penalize tokens in proportion to how often they appear
)

System Messages

Use system messages to control the model's behavior:

python
# Generation with system message
response = client.generate_text(
    deployment_id="dep_12345abcde",
    prompt="What are the potential side effects of metformin?",
    system_message="You are a helpful medical assistant providing accurate information to healthcare professionals. Always include references to clinical guidelines and mention important warnings."
)

Context Management

Maintain conversation context:

python
# Conversation with context
conversation = client.create_conversation(deployment_id="dep_12345abcde")

# First message
response1 = conversation.generate_text(
    prompt="What are the potential side effects of metformin?"
)
print("Response 1:", response1.text)

# Follow-up question (context is automatically maintained)
response2 = conversation.generate_text(
    prompt="What about patients with kidney disease?"
)
print("Response 2:", response2.text)

# Another follow-up
response3 = conversation.generate_text(
    prompt="Are there any alternatives for these patients?"
)
print("Response 3:", response3.text)

Advanced Generation Features

Streaming Responses

Stream the response as it's generated:

python
# Stream the response
for chunk in client.generate_text_stream(
    deployment_id="dep_12345abcde",
    prompt="Write a detailed summary of diabetes management techniques.",
    max_tokens=1000
):
    print(chunk.text, end="", flush=True)
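
To keep the complete text while displaying it incrementally, accumulate the chunks as they arrive. A minimal sketch, assuming each chunk exposes .text as shown above:

python
# Stream to the console while accumulating the full response
chunks = []
for chunk in client.generate_text_stream(
    deployment_id="dep_12345abcde",
    prompt="Write a detailed summary of diabetes management techniques.",
    max_tokens=1000
):
    print(chunk.text, end="", flush=True)
    chunks.append(chunk.text)

full_text = "".join(chunks)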

Batch Processing

Process multiple prompts efficiently:

python
# Define a list of prompts
prompts = [
    "What are the symptoms of hypertension?",
    "What are common treatments for type 2 diabetes?",
    "Explain the mechanism of action for statins."
]

# Process in batch
results = client.generate_text_batch(
    deployment_id="dep_12345abcde",
    prompts=prompts,
    max_tokens=300
)

for i, result in enumerate(results):
    print(f"Prompt {i+1}: {prompts[i]}")
    print(f"Response: {result.text}")
    print()

Function Calling

Define functions that the model can call:

python
from paitient_secure_model import Client
from paitient_secure_model.functions import FunctionDefinition

# Initialize client
client = Client()

# Define functions
functions = [
    FunctionDefinition(
        name="search_medication_interactions",
        description="Search for potential interactions between medications",
        parameters={
            "type": "object",
            "properties": {
                "medications": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "List of medications to check for interactions"
                }
            },
            "required": ["medications"]
        }
    ),
    FunctionDefinition(
        name="calculate_dosage",
        description="Calculate medication dosage based on patient parameters",
        parameters={
            "type": "object",
            "properties": {
                "medication": {"type": "string", "description": "Medication name"},
                "weight_kg": {"type": "number", "description": "Patient weight in kg"},
                "age_years": {"type": "number", "description": "Patient age in years"},
                "kidney_function": {"type": "string", "description": "Kidney function (normal, impaired, severe)"}
            },
            "required": ["medication", "weight_kg", "age_years"]
        }
    )
]

# Generate text with function calling
response = client.generate_text(
    deployment_id="dep_12345abcde",
    prompt="Check for interactions between metformin, lisinopril, and simvastatin.",
    functions=functions,
    function_call="auto"  # Options: "auto", "none", or {"name": "specific_function"}
)

# Check if the model decided to call a function
if response.function_call:
    function_name = response.function_call.name
    function_args = response.function_call.arguments
    
    print(f"Model called function: {function_name}")
    print(f"Arguments: {function_args}")
    
    # Here you would actually execute the function with the provided arguments
    # and then potentially continue the conversation with the result
    if function_name == "search_medication_interactions":
        # Simulate function execution
        result = {"interactions": [
            {"medications": ["metformin", "lisinopril"], "severity": "low", "description": "..."},
            {"medications": ["metformin", "simvastatin"], "severity": "moderate", "description": "..."}
        ]}
        
        # Continue the conversation with the function result
        follow_up = client.generate_text(
            deployment_id="dep_12345abcde",
            prompt="What should I do about these interactions?",
            function_results=[{
                "name": function_name,
                "arguments": function_args,
                "result": result
            }]
        )
        
        print("Follow-up:", follow_up.text)
else:
    print("Model response:", response.text)

Security Controls

Apply security controls to text generation:

python
from paitient_secure_model import Client
from paitient_secure_model.security import SecuritySettings, DataFiltering

# Initialize client
client = Client()

# Generate text with security controls
response = client.generate_text(
    deployment_id="dep_12345abcde",
    prompt="Patient with diabetes and hypertension, currently taking metformin and lisinopril.",
    security_settings=SecuritySettings(
        data_filtering=DataFiltering(
            detect_pii=True,           # Detect personally identifiable information
            redact_phi=True,           # Redact protected health information
            content_filtering="strict", # Apply strict content filtering
            detect_toxic_content=True  # Detect potentially harmful content
        )
    )
)

Response Analysis

Response Metadata

Access additional information about the generation:

python
# Analyze response metadata
response = client.generate_text(
    deployment_id="dep_12345abcde",
    prompt="What are the potential side effects of metformin?",
    max_tokens=500,
    return_metadata=True
)

print("Response:", response.text)
print("Token count:", response.usage.total_tokens)
print("Prompt tokens:", response.usage.prompt_tokens)
print("Completion tokens:", response.usage.completion_tokens)
print("Finish reason:", response.finish_reason)
print("Model used:", response.model)
print("Created at:", response.created_at)

Content Analysis

Analyze the generated content:

python
# Analyze generated content
analysis = client.analyze_text(
    text=response.text,
    analyses=["toxicity", "factuality", "bias", "medical_accuracy"]
)

print("Toxicity score:", analysis.toxicity)
print("Factuality score:", analysis.factuality)
print("Bias score:", analysis.bias)
print("Medical accuracy score:", analysis.medical_accuracy)

Generation Control

Rate Limiting

Implement rate limiting for your application:

python
from paitient_secure_model import Client
from paitient_secure_model.rate_limit import RateLimiter
from paitient_secure_model.exceptions import RateLimitError

# Initialize client with rate limiter
client = Client()
limiter = RateLimiter(
    requests_per_minute=60,
    burst_size=10
)

# Generate text with rate limiting
try:
    with limiter:
        response = client.generate_text(
            deployment_id="dep_12345abcde",
            prompt="What are the potential side effects of metformin?"
        )
        print(response.text)
except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")

Timeout Control

Control timeouts for requests:

python
# Generate text with timeout control
try:
    response = client.generate_text(
        deployment_id="dep_12345abcde",
        prompt="Write a detailed analysis of current diabetes management guidelines.",
        max_tokens=2000,
        timeout=30.0  # Timeout in seconds
    )
    print(response.text)
except TimeoutError:
    print("Request timed out. Try again with fewer tokens or simpler prompt.")

Advanced Usage

Async Support

Use the async client to run multiple requests concurrently:

python
import asyncio
from paitient_secure_model import AsyncClient

async def generate_responses():
    # Initialize async client
    client = AsyncClient()
    
    # Generate multiple responses concurrently
    prompts = [
        "What are the symptoms of hypertension?",
        "What are common treatments for type 2 diabetes?",
        "Explain the mechanism of action for statins."
    ]
    
    tasks = [client.generate_text(
        deployment_id="dep_12345abcde",
        prompt=prompt,
        max_tokens=300
    ) for prompt in prompts]
    
    responses = await asyncio.gather(*tasks)
    
    for i, response in enumerate(responses):
        print(f"Prompt: {prompts[i]}")
        print(f"Response: {response.text}")
        print()

# Run the async function
asyncio.run(generate_responses())

Custom Models

Generate text from custom fine-tuned models:

python
# Get fine-tuned model ID
fine_tuning_job = client.get_fine_tuning_job("ft_12345abcde")
fine_tuned_model = fine_tuning_job.fine_tuned_model

# Deploy the fine-tuned model
deployment = client.create_deployment(
    model_name=fine_tuned_model,
    deployment_name="custom-medical-assistant"
)

# Wait for deployment to complete
deployment.wait_until_ready()

# Generate text from custom model
response = client.generate_text(
    deployment_id=deployment.id,
    prompt="What are the potential side effects of metformin?"
)

print(response.text)

Multi-tenant Usage

For applications serving multiple clients:

python
# Generate text in multi-tenant context
response = client.generate_text(
    deployment_id="dep_12345abcde",
    prompt="What are the potential side effects of metformin?",
    tenant_id="tenant_12345",  # Ensures strong isolation
    user_id="user_67890"       # For audit and attribution
)

Error Handling

Implement robust error handling:

python
from paitient_secure_model import Client
from paitient_secure_model.exceptions import (
    GenerationError,
    ResourceNotFoundError,
    RateLimitError,
    ContentFilterError,
    InvalidParameterError
)

client = Client()

try:
    response = client.generate_text(
        deployment_id="dep_12345abcde",
        prompt="What are the potential side effects of metformin?",
        max_tokens=500
    )
    print(response.text)
except ResourceNotFoundError:
    print("Deployment not found. Check your deployment ID.")
except RateLimitError as e:
    print(f"Rate limit exceeded. Retry after {e.retry_after} seconds.")
except ContentFilterError as e:
    print(f"Content filtered: {e.reason}")
except InvalidParameterError as e:
    print(f"Invalid parameter: {e}")
except GenerationError as e:
    print(f"Generation failed: {e}")
    print(f"Request ID for troubleshooting: {e.request_id}")

Best Practices

Prompt Engineering

Follow these best practices for effective prompts:

  1. Be Specific: Provide clear, detailed instructions
  2. Establish Context: Include relevant background information
  3. Structure Output: Specify desired format and structure
  4. Use Examples: Include examples for complex tasks
  5. Iterate: Refine prompts based on results

Example of a well-structured prompt:

python
prompt = """
You are a clinical assistant helping a healthcare provider. 
Provide information about metformin for diabetes management.

Please include:
1. Common side effects and their frequency
2. Contraindications
3. Recommended dosage adjustments for patients with renal impairment
4. Drug interactions to be aware of

Format the response with clear headings and bullet points.
Cite relevant clinical guidelines where appropriate.
"""

response = client.generate_text(
    deployment_id="dep_12345abcde",
    prompt=prompt,
    max_tokens=800
)

Performance Optimization

Optimize text generation performance:

  1. Batch Requests: Use batch API for multiple prompts
  2. Stream Long Responses: Use streaming for better UX
  3. Optimize Tokens: Keep prompts concise
  4. Cache Common Responses: Implement response caching (see the sketch after this list)
  5. Right-size Parameters: Adjust max_tokens to actual needs
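
As an illustration of response caching, a small in-memory cache can skip regeneration for repeated prompts. A minimal sketch; cached_generate and the keying scheme are hypothetical, and identical prompts are assumed to warrant identical answers. Avoid caching responses that may contain PHI unless your storage meets the same compliance requirements:

python
import hashlib

_cache = {}

def cached_generate(client, deployment_id, prompt, **kwargs):
    """Return a cached response for a repeated prompt; generate otherwise."""
    # Key on the deployment and prompt text only (illustrative; include
    # generation parameters in the key if you vary them per request)
    key = hashlib.sha256(f"{deployment_id}:{prompt}".encode()).hexdigest()
    if key not in _cache:
        _cache[key] = client.generate_text(
            deployment_id=deployment_id,
            prompt=prompt,
            **kwargs
        )
    return _cache[key]

response = cached_generate(
    client,
    "dep_12345abcde",
    "What are the symptoms of hypertension?",
    max_tokens=300
)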

Security

Ensure secure text generation:

  1. Sanitize Inputs: Validate and clean user inputs (see the sketch after this list)
  2. Enable Content Filtering: Prevent harmful outputs
  3. Use PII/PHI Detection: Protect sensitive information
  4. Audit Generations: Track and review outputs
  5. Implement Rate Limiting: Prevent abuse
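
As an example of input sanitization, a small validator can reject empty, oversized, or control-character-laden prompts before they reach the model. A minimal sketch with a hypothetical length limit; pair it with the SecuritySettings shown earlier for output-side controls:

python
MAX_PROMPT_CHARS = 4000  # hypothetical application-level limit

def sanitize_prompt(raw: str) -> str:
    """Basic validation and cleanup before sending a prompt to the model."""
    prompt = raw.strip()
    if not prompt:
        raise ValueError("Prompt must not be empty.")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"Prompt exceeds {MAX_PROMPT_CHARS} characters.")
    # Drop control characters that can corrupt logs or downstream parsing
    return "".join(ch for ch in prompt if ch.isprintable() or ch in "\n\t")

user_input = "  What are the potential side effects of metformin?  "
response = client.generate_text(
    deployment_id="dep_12345abcde",
    prompt=sanitize_prompt(user_input)
)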
