
Node.js SDK: Fine-tuning

This guide covers the fine-tuning capabilities of the PaiTIENT Secure Model Service Node.js SDK, allowing you to customize AI models for healthcare-specific use cases while maintaining HIPAA/SOC2 compliance.

Fine-tuning Overview

Fine-tuning allows you to adapt pre-trained models to your specific healthcare domain or use case, improving performance, accuracy, and adherence to guidelines.

The PaiTIENT Secure Model Service provides secure fine-tuning capabilities with:

  1. Data Security: All fine-tuning data is encrypted and processed in secure environments
  2. Compliance: The entire fine-tuning process adheres to HIPAA and SOC2 requirements
  3. Efficient Training: Parameter-efficient fine-tuning methods like LoRA
  4. Performance Tracking: Detailed metrics for fine-tuning progress and evaluation

Prerequisites

Before fine-tuning a model, ensure you have:

  1. Installed the PaiTIENT Node.js SDK
  2. Set up authentication credentials
  3. Prepared a fine-tuning dataset
  4. Selected a base model for fine-tuning
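
Authentication credentials are typically supplied through environment variables; the variable names below match the client initialization examples used throughout this guide, while the values are placeholders you would replace with credentials from your own account:

```shell
# Export placeholder credentials; replace the values with your own
export PAITIENT_API_KEY="pk_live_example"
export PAITIENT_CLIENT_ID="client_example"
```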

Installation

bash
# Install the PaiTIENT Node.js SDK
npm install paitient-secure-model
# or with yarn
yarn add paitient-secure-model

Dataset Preparation

Dataset Format

The PaiTIENT Secure Model Service supports various dataset formats for different fine-tuning approaches:

Instruction Dataset Format (JSONL)

For instruction fine-tuning, use a JSONL file with this structure:

jsonl
{"instruction": "Explain the mechanism of action for metformin", "response": "Metformin primarily works by reducing glucose production in the liver (hepatic gluconeogenesis), improving insulin sensitivity in peripheral tissues, and enhancing glucose uptake in skeletal muscle. It also decreases intestinal absorption of glucose."}
{"instruction": "What are the symptoms of diabetic ketoacidosis?", "response": "The main symptoms of diabetic ketoacidosis include excessive thirst, frequent urination, nausea and vomiting, abdominal pain, weakness or fatigue, shortness of breath, fruity-scented breath, and confusion. It may also present with high blood glucose levels, high ketone levels in the urine, and electrolyte imbalances."}

Conversation Dataset Format (JSONL)

For conversational fine-tuning, use a JSONL file with this structure:

jsonl
{"conversations": [{"role": "user", "content": "What are the side effects of atorvastatin?"}, {"role": "assistant", "content": "Atorvastatin may cause side effects including muscle pain and weakness, which in rare cases can lead to serious muscle damage (rhabdomyolysis). Other common side effects include joint pain, diarrhea, nausea, and elevated liver enzymes. Rare but serious side effects include liver damage, diabetes, and memory problems. Always consult your healthcare provider if you experience concerning symptoms while taking atorvastatin."}]}
{"conversations": [{"role": "user", "content": "I'm experiencing leg pain after starting atorvastatin. Should I be concerned?"}, {"role": "assistant", "content": "Leg pain after starting atorvastatin could be a sign of muscle-related side effects, which should be taken seriously. Please contact your healthcare provider immediately to discuss these symptoms. Your doctor might want to check your creatine kinase levels to rule out muscle damage. Don't stop taking the medication without medical advice, but this symptom warrants prompt medical attention."}]}

Dataset Validation

Validate your dataset before fine-tuning:

javascript
const { PaiTIENTClient } = require('paitient-secure-model');

// Initialize client
const client = new PaiTIENTClient({
  apiKey: process.env.PAITIENT_API_KEY,
  clientId: process.env.PAITIENT_CLIENT_ID
});

async function validateDataset() {
  try {
    // Validate dataset
    const validationResult = await client.validateDataset({
      filePath: "clinical_guidelines_dataset.jsonl",
      format: "instruction"  // or "conversation"
    });

    if (validationResult.isValid) {
      console.log("Dataset validation successful");
      console.log(`Number of examples: ${validationResult.numExamples}`);
      console.log(`Total tokens: ${validationResult.totalTokens}`);
    } else {
      console.log("Dataset validation failed:");
      for (const error of validationResult.errors) {
        console.log(`  - ${error}`);
      }
    }
  } catch (error) {
    console.error('Dataset validation failed:', error);
  }
}

validateDataset();

Dataset Upload

Upload your dataset to the secure environment:

javascript
// Upload dataset
async function uploadDataset() {
  try {
    const dataset = await client.uploadDataset({
      filePath: "clinical_guidelines_dataset.jsonl",
      name: "Clinical Guidelines Dataset",
      description: "Fine-tuning dataset for clinical guideline adherence",
      format: "instruction",
      tags: {domain: "cardiology", source: "guidelines"}
    });

    console.log(`Dataset ID: ${dataset.id}`);
    console.log(`Number of examples: ${dataset.numExamples}`);
    console.log(`Total tokens: ${dataset.totalTokens}`);
    
    return dataset.id;
  } catch (error) {
    console.error('Dataset upload failed:', error);
  }
}

Basic Fine-tuning

The simplest way to fine-tune a model:

javascript
const { PaiTIENTClient } = require('paitient-secure-model');

// Initialize client
const client = new PaiTIENTClient({
  apiKey: process.env.PAITIENT_API_KEY,
  clientId: process.env.PAITIENT_CLIENT_ID
});

async function fineTuneModel() {
  try {
    // Start a fine-tuning job
    const fineTuningJob = await client.createFineTuningJob({
      baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
      datasetId: "ds_12345abcde",
      fineTuningMethod: "lora",
      modelName: "clinical-assistant-cardiology-v1"
    });

    console.log(`Fine-tuning job ID: ${fineTuningJob.id}`);
    console.log(`Status: ${fineTuningJob.status}`);

    // Wait for fine-tuning to complete
    await fineTuningJob.waitUntilComplete();
    console.log(`Fine-tuning is now ${fineTuningJob.status}`);
    console.log(`Fine-tuned model: ${fineTuningJob.fineTunedModel}`);
    
    return fineTuningJob;
  } catch (error) {
    console.error('Fine-tuning failed:', error);
  }
}

fineTuneModel();

Fine-tuning Options

Training Parameters

Configure training parameters for your fine-tuning job:

javascript
// Fine-tuning with specific training parameters
const fineTuningJob = await client.createFineTuningJob({
  baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
  datasetId: "ds_12345abcde",
  fineTuningMethod: "lora",
  modelName: "clinical-assistant-cardiology-v1",
  hyperparameters: {
    learningRate: 1e-4,
    batchSize: 8,
    epochs: 3,
    warmupSteps: 100,
    loraRank: 16,
    loraAlpha: 32,
    loraDropout: 0.05
  }
});

Validation Dataset

Use a validation dataset to monitor training progress:

javascript
async function fineTuneWithValidation() {
  try {
    // Upload validation dataset
    const validationDataset = await client.uploadDataset({
      filePath: "clinical_guidelines_validation.jsonl",
      name: "Clinical Guidelines Validation Dataset",
      format: "instruction"
    });

    // Fine-tuning with validation
    const fineTuningJob = await client.createFineTuningJob({
      baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
      datasetId: "ds_12345abcde",
      validationDatasetId: validationDataset.id,
      fineTuningMethod: "lora",
      modelName: "clinical-assistant-cardiology-v1",
      validationFrequency: 0.1  // Run validation every 10% of training steps
    });
    
    return fineTuningJob;
  } catch (error) {
    console.error('Fine-tuning with validation failed:', error);
  }
}

Compute Configuration

Configure compute resources for your fine-tuning job:

javascript
// Fine-tuning with compute configuration
const fineTuningJob = await client.createFineTuningJob({
  baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
  datasetId: "ds_12345abcde",
  fineTuningMethod: "lora",
  modelName: "clinical-assistant-cardiology-v1",
  computeConfig: {
    instanceType: "g5.2xlarge",
    instanceCount: 1,
    maxRuntimeHours: 24
  }
});

Security Settings

Apply security settings to your fine-tuning job:

javascript
const { PaiTIENTClient, Security } = require('paitient-secure-model');

// Initialize client
const client = new PaiTIENTClient({
  apiKey: process.env.PAITIENT_API_KEY,
  clientId: process.env.PAITIENT_CLIENT_ID
});

async function secureFineTuning() {
  try {
    // Fine-tuning with security settings
    const fineTuningJob = await client.createFineTuningJob({
      baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
      datasetId: "ds_12345abcde",
      fineTuningMethod: "lora",
      modelName: "clinical-assistant-cardiology-v1",
      securitySettings: new Security.SecuritySettings({
        networkIsolation: true,        // Enable network isolation
        encryptionLevel: "maximum",    // Maximum encryption level
        auditLogging: true,            // Enable comprehensive audit logging
        complianceMode: "hipaa"        // Enable HIPAA compliance mode
      })
    });
    
    return fineTuningJob;
  } catch (error) {
    console.error('Secure fine-tuning failed:', error);
  }
}

Advanced Fine-tuning Methods

LoRA Fine-tuning

Low-Rank Adaptation (LoRA) is the default fine-tuning method:

javascript
// LoRA fine-tuning with advanced parameters
const fineTuningJob = await client.createFineTuningJob({
  baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
  datasetId: "ds_12345abcde",
  fineTuningMethod: "lora",
  modelName: "clinical-assistant-cardiology-v1",
  hyperparameters: {
    loraRank: 16,              // Rank of LoRA matrices
    loraAlpha: 32,             // LoRA scaling factor
    loraDropout: 0.05,         // Dropout probability for LoRA layers
    targetModules: ["q_proj", "v_proj"],  // Target modules for LoRA
    learningRate: 1e-4,
    batchSize: 8,
    epochs: 3
  }
});

QLoRA Fine-tuning

Quantized LoRA (QLoRA) for memory-efficient fine-tuning:

javascript
// QLoRA fine-tuning
const fineTuningJob = await client.createFineTuningJob({
  baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
  datasetId: "ds_12345abcde",
  fineTuningMethod: "qlora",
  modelName: "clinical-assistant-cardiology-v1",
  hyperparameters: {
    quantizationBits: 4,       // 4-bit quantization
    loraRank: 16,
    loraAlpha: 32,
    learningRate: 1e-4,
    batchSize: 8,
    epochs: 3
  }
});

Full Fine-tuning

Full parameter fine-tuning for maximum performance:

javascript
// Full fine-tuning (requires more compute resources)
const fineTuningJob = await client.createFineTuningJob({
  baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
  datasetId: "ds_12345abcde",
  fineTuningMethod: "full",
  modelName: "clinical-assistant-cardiology-v1",
  hyperparameters: {
    learningRate: 5e-5,
    batchSize: 4,
    epochs: 2,
    warmupSteps: 100,
    weightDecay: 0.01
  },
  computeConfig: {
    instanceType: "g5.12xlarge",  // More powerful GPU instance
    instanceCount: 2,
    maxRuntimeHours: 48
  }
});

Fine-tuning Management

Check Fine-tuning Status

Monitor the status of your fine-tuning job:

javascript
// Get fine-tuning job status
async function checkFineTuningStatus(jobId) {
  try {
    const job = await client.getFineTuningJob(jobId);
    console.log(`Status: ${job.status}`);
    console.log(`Created: ${job.createdAt}`);
    console.log(`Updated: ${job.updatedAt}`);
    console.log(`Base model: ${job.baseModel}`);

    // Get detailed job information
    const details = await job.getDetails();
    console.log(`Current epoch: ${details.currentEpoch}/${details.totalEpochs}`);
    console.log(`Training loss: ${details.trainingLoss}`);
    console.log(`Validation loss: ${details.validationLoss}`);
    console.log(`Training samples processed: ${details.processedSamples}`);
    
    return job;
  } catch (error) {
    console.error('Status check failed:', error);
  }
}
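
Where the earlier examples use waitUntilComplete(), you can also poll manually. The helper below is a generic sketch rather than an SDK method; pass it any async function that returns the current status string:

```javascript
// Generic polling helper: repeatedly calls getStatus until a terminal
// state is reached, waiting intervalMs between checks.
async function pollUntilTerminal(getStatus, { intervalMs = 5000, maxAttempts = 100 } = {}) {
  const terminal = new Set(['completed', 'failed', 'cancelled']);
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await getStatus();
    if (terminal.has(status)) return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Polling timed out');
}
```

With the client from the examples above, you might call it as `await pollUntilTerminal(async () => (await client.getFineTuningJob(jobId)).status)`.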

List Fine-tuning Jobs

Retrieve a list of all your fine-tuning jobs:

javascript
// List all fine-tuning jobs
async function listFineTuningJobs() {
  try {
    const jobs = await client.listFineTuningJobs();
    for (const job of jobs) {
      console.log(`${job.id}: ${job.modelName} - ${job.status}`);
    }

    // Filter fine-tuning jobs
    const completedJobs = await client.listFineTuningJobs({
      filters: {
        status: "completed",
        baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B"
      }
    });
    
    console.log("\nCompleted jobs:");
    for (const job of completedJobs) {
      console.log(`${job.id}: ${job.modelName} - Completed`);
    }
    
    return jobs;
  } catch (error) {
    console.error('List operation failed:', error);
  }
}

Cancel Fine-tuning Job

Cancel a fine-tuning job that's no longer needed:

javascript
// Cancel fine-tuning job
async function cancelFineTuningJob(jobId) {
  try {
    await client.cancelFineTuningJob(jobId);
    console.log(`Job ${jobId} has been cancelled`);
    
    // Verify the status
    const job = await client.getFineTuningJob(jobId);
    console.log(`Current status: ${job.status}`);
    
    return job;
  } catch (error) {
    console.error('Cancellation failed:', error);
  }
}

Fine-tuning Metrics

Retrieve metrics for your fine-tuning job:

javascript
// Get fine-tuning metrics
async function getFineTuningMetrics(jobId) {
  try {
    const metrics = await client.getFineTuningMetrics(jobId);
    
    console.log("Fine-tuning metrics:");
    console.log(`Training loss: ${metrics.trainingLoss.at(-1)}`);
    console.log(`Validation loss: ${metrics.validationLoss.at(-1)}`);
    console.log(`Steps completed: ${metrics.steps.at(-1)}`);

    // These arrays can be rendered with a charting library such as Chart.js
    // in a frontend application
    return metrics;
  } catch (error) {
    console.error('Metrics retrieval failed:', error);
  }
}
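
Per-step loss values are often noisy; a trailing moving average (an illustrative helper, not part of the SDK) can smooth them before plotting:

```javascript
// Trailing moving average: each output value averages up to windowSize
// preceding points, so early points use a smaller window.
function movingAverage(values, windowSize = 5) {
  return values.map((_, i) => {
    const start = Math.max(0, i - windowSize + 1);
    const window = values.slice(start, i + 1);
    return window.reduce((sum, v) => sum + v, 0) / window.length;
  });
}
```

Applied to `metrics.trainingLoss`, this yields a curve of the same length that is easier to read at a glance.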

Fine-tuned Model Deployment

Deploy your fine-tuned model:

javascript
// Deploy a fine-tuned model
async function deployFineTunedModel(jobId) {
  try {
    // Get fine-tuned model ID
    const fineTuningJob = await client.getFineTuningJob(jobId);
    const fineTunedModel = fineTuningJob.fineTunedModel;

    // Deploy the fine-tuned model
    const deployment = await client.createDeployment({
      modelName: fineTunedModel,
      deploymentName: "clinical-assistant-cardiology"
    });

    // Wait for deployment to complete
    await deployment.waitUntilReady();
    console.log(`Deployment is now ${deployment.status}`);
    console.log(`Endpoint: ${deployment.endpoint}`);

    // Test the fine-tuned model
    const response = await client.generateText({
      deploymentId: deployment.id,
      prompt: "What is the recommended first-line treatment for hypertension in diabetic patients?"
    });

    console.log("Model response:", response.text);
    
    return deployment;
  } catch (error) {
    console.error('Deployment failed:', error);
  }
}

Fine-tuning Evaluation

Evaluate your fine-tuned model against the base model:

javascript
const { PaiTIENTClient, Evaluation } = require('paitient-secure-model');

// Initialize client
const client = new PaiTIENTClient({
  apiKey: process.env.PAITIENT_API_KEY,
  clientId: process.env.PAITIENT_CLIENT_ID
});

async function evaluateFineTunedModel() {
  try {
    // Get fine-tuned model deployment
    const fineTunedDeployment = await client.getDeployment("dep_fine_tuned");

    // Get base model deployment
    const baseModelDeployment = await client.getDeployment("dep_base_model");

    // Create comparative evaluation
    const evaluation = new Evaluation.ComparativeEvaluation({
      name: "Clinical Guidelines Adherence Evaluation",
      deployments: [
        {id: baseModelDeployment.id, name: "Base Model"},
        {id: fineTunedDeployment.id, name: "Fine-tuned Model"}
      ],
      metrics: [
        "factuality",
        "guidelineAdherence",
        "completeness",
        "clinicalAccuracy"
      ]
    });

    // Run evaluation on test dataset
    const results = await evaluation.run({
      dataset: "clinical_test_cases.jsonl",
      numSamples: 50
    });

    // Print comparative results
    console.log("Comparative Evaluation Results:");
    for (const metric of Object.keys(results.scores["Base Model"])) {
      const baseScore = results.scores["Base Model"][metric];
      const fineTunedScore = results.scores["Fine-tuned Model"][metric];
      const improvement = fineTunedScore - baseScore;
      console.log(`${metric}: ${baseScore.toFixed(2)} → ${fineTunedScore.toFixed(2)} (${improvement >= 0 ? '+' : ''}${improvement.toFixed(2)})`);
    }
    
    return results;
  } catch (error) {
    console.error('Evaluation failed:', error);
  }
}

Advanced Use Cases

Multi-task Fine-tuning

Fine-tune a model for multiple medical specialties:

javascript
// Multi-task fine-tuning
async function multiTaskFineTuning() {
  try {
    // Upload datasets for different specialties
    const cardiologyDataset = await client.uploadDataset({
      filePath: "cardiology_guidelines.jsonl",
      name: "Cardiology Guidelines Dataset",
      format: "instruction"
    });

    const endocrinologyDataset = await client.uploadDataset({
      filePath: "endocrinology_guidelines.jsonl",
      name: "Endocrinology Guidelines Dataset",
      format: "instruction"
    });

    const oncologyDataset = await client.uploadDataset({
      filePath: "oncology_guidelines.jsonl",
      name: "Oncology Guidelines Dataset",
      format: "instruction"
    });

    // Combine datasets for multi-task fine-tuning
    const combinedDataset = await client.mergeDatasets({
      datasetIds: [
        cardiologyDataset.id,
        endocrinologyDataset.id,
        oncologyDataset.id
      ],
      name: "Multi-specialty Guidelines Dataset"
    });

    // Fine-tune on the combined dataset
    const fineTuningJob = await client.createFineTuningJob({
      baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
      datasetId: combinedDataset.id,
      fineTuningMethod: "lora",
      modelName: "clinical-assistant-multi-specialty-v1"
    });
    
    return fineTuningJob;
  } catch (error) {
    console.error('Multi-task fine-tuning failed:', error);
  }
}

Continual Fine-tuning

Update your fine-tuned model with new data:

javascript
// Continual fine-tuning
async function continualFineTuning() {
  try {
    // Get existing fine-tuned model
    const existingFineTunedModel = "ft:ZimaBlueAI/HuatuoGPT-o1-8B:clinical-assistant-v1";

    // Upload new dataset with updated guidelines
    const newGuidelinesDataset = await client.uploadDataset({
      filePath: "updated_guidelines_2023.jsonl",
      name: "Updated Guidelines 2023",
      format: "instruction"
    });

    // Fine-tune from the previously fine-tuned model
    const fineTuningJob = await client.createFineTuningJob({
      baseModel: existingFineTunedModel,  // Start from previously fine-tuned model
      datasetId: newGuidelinesDataset.id,
      fineTuningMethod: "lora",
      modelName: "clinical-assistant-v2",
      hyperparameters: {
        learningRate: 5e-5,  // Lower learning rate for continued fine-tuning
        epochs: 1            // Fewer epochs needed
      }
    });
    
    return fineTuningJob;
  } catch (error) {
    console.error('Continual fine-tuning failed:', error);
  }
}

Domain Adaptation with Synthetic Data

Generate and use synthetic data for domain adaptation:

javascript
const { PaiTIENTClient, DataGeneration } = require('paitient-secure-model');

// Initialize client
const client = new PaiTIENTClient({
  apiKey: process.env.PAITIENT_API_KEY,
  clientId: process.env.PAITIENT_CLIENT_ID
});

async function domainAdaptation() {
  try {
    // Generate synthetic medical conversations
    const syntheticData = await DataGeneration.generateSyntheticData({
      generationModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
      domain: "neurology",
      templateFile: "neurology_templates.json",
      numExamples: 500,
      outputFile: "synthetic_neurology_dataset.jsonl"
    });

    // Upload synthetic dataset
    const syntheticDataset = await client.uploadDataset({
      filePath: "synthetic_neurology_dataset.jsonl",
      name: "Synthetic Neurology Dataset",
      format: "conversation"
    });

    // Fine-tune with synthetic data
    const fineTuningJob = await client.createFineTuningJob({
      baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
      datasetId: syntheticDataset.id,
      fineTuningMethod: "lora",
      modelName: "neurology-assistant-v1"
    });
    
    return fineTuningJob;
  } catch (error) {
    console.error('Domain adaptation failed:', error);
  }
}

Error Handling

Implement proper error handling for fine-tuning:

javascript
const { PaiTIENTClient, errors } = require('paitient-secure-model');

const client = new PaiTIENTClient({
  apiKey: process.env.PAITIENT_API_KEY,
  clientId: process.env.PAITIENT_CLIENT_ID
});

async function robustFineTuning() {
  try {
    const fineTuningJob = await client.createFineTuningJob({
      baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
      datasetId: "ds_12345abcde",
      fineTuningMethod: "lora",
      modelName: "clinical-assistant-v1"
    });
    
    return fineTuningJob;
  } catch (error) {
    if (error instanceof errors.InvalidParameterError) {
      console.error(`Invalid parameter: ${error.message}`);
    } else if (error instanceof errors.QuotaExceededError) {
      console.error(`Quota exceeded: ${error.message}`);
    } else if (error instanceof errors.FineTuningError) {
      console.error(`Fine-tuning failed: ${error.message}`);
      console.error(`Job ID: ${error.jobId}`);
      console.error(`Status: ${error.status}`);
      console.error(`Reason: ${error.reason}`);
      
      // Get detailed error information
      if (error.jobId) {
        try {
          const logs = await client.getFineTuningLogs({
            jobId: error.jobId,
            limit: 10
          });
          
          console.error("Error logs:");
          for (const log of logs) {
            console.error(`  ${log.message}`);
          }
        } catch (logError) {
          console.error(`Failed to fetch logs: ${logError.message}`);
        }
      }
    } else {
      console.error(`Unexpected error: ${error.message}`);
    }
  }
}

Best Practices

Dataset Quality

Follow these best practices for dataset preparation:

  1. Ensure diversity in your examples
  2. Focus on domain-specific content relevant to your use case
  3. Include edge cases common in clinical scenarios
  4. Balance response length appropriate for your application
  5. Use high-quality, verified medical information
  6. De-identify any patient data according to HIPAA guidelines
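
Some of these checks can be automated locally before upload. The sketch below catches malformed JSON, missing fields, and duplicate instructions; it is illustrative only, and the server-side validateDataset call remains the authoritative check:

```javascript
// Lightweight local checks on JSONL lines: valid JSON, required fields,
// and duplicate instructions.
function checkInstructionLines(lines) {
  const problems = [];
  const seen = new Set();
  lines.forEach((line, i) => {
    let example;
    try {
      example = JSON.parse(line);
    } catch {
      problems.push(`line ${i + 1}: invalid JSON`);
      return;
    }
    if (!example.instruction || !example.response) {
      problems.push(`line ${i + 1}: missing instruction or response`);
      return;
    }
    if (seen.has(example.instruction)) {
      problems.push(`line ${i + 1}: duplicate instruction`);
    }
    seen.add(example.instruction);
  });
  return problems;
}
```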

Hyperparameter Optimization

Optimize hyperparameters for your specific use case:

  1. Start with defaults for your first fine-tuning job
  2. Experiment with learning rates between 1e-5 and 5e-4
  3. Adjust batch size based on available memory
  4. Try different LoRA ranks (8, 16, 32) to balance efficiency and effectiveness
  5. Use early stopping to prevent overfitting
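
For a small sweep over LoRA ranks and learning rates, a cartesian-grid helper (illustrative, not part of the SDK) keeps the combinations explicit; each resulting object can be passed as the `hyperparameters` of a separate fine-tuning job:

```javascript
// Build the cartesian product of a hyperparameter search space.
function buildGrid(space) {
  return Object.entries(space).reduce(
    (combos, [key, values]) =>
      combos.flatMap((combo) => values.map((v) => ({ ...combo, [key]: v }))),
    [{}]
  );
}

const grid = buildGrid({
  loraRank: [8, 16, 32],
  learningRate: [5e-5, 1e-4]
});
// grid holds 6 combinations, e.g. { loraRank: 8, learningRate: 5e-5 }
```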

Infrastructure Considerations

Optimize your infrastructure for fine-tuning:

  1. Choose appropriate instance types based on model size
  2. Monitor GPU memory usage to avoid out-of-memory errors
  3. Implement checkpointing for long-running jobs
  4. Use distributed training for very large models
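
As a rough rule of thumb (an assumption, not an official sizing guide), parameter-efficient methods fit on the single mid-size instance used in the LoRA examples above, while full fine-tuning or larger models need the heavier configuration from the full fine-tuning example:

```javascript
// Sketch of an instance-sizing heuristic; the thresholds and instance
// types mirror this guide's examples and are assumptions, not guarantees.
function suggestComputeConfig(method, paramCountBillions) {
  if (method === 'full' || paramCountBillions > 13) {
    return { instanceType: 'g5.12xlarge', instanceCount: 2, maxRuntimeHours: 48 };
  }
  return { instanceType: 'g5.2xlarge', instanceCount: 1, maxRuntimeHours: 24 };
}
```

The returned object can be passed directly as the `computeConfig` of createFineTuningJob.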

Evaluation Strategy

Develop a robust evaluation strategy:

  1. Create a dedicated test set not used in training
  2. Define domain-specific metrics relevant to clinical performance
  3. Compare against baseline models
  4. Involve domain experts in qualitative evaluation
  5. Test on realistic scenarios from your specific healthcare domain

javascript
// Example of structured evaluation with domain experts
const { PaiTIENTClient, Evaluation } = require('paitient-secure-model');

async function expertEvaluation(deploymentId) {
  try {
    // Create a human evaluation task
    const humanEval = new Evaluation.HumanEvaluationTask({
      name: "Expert Clinical Assessment",
      deploymentId: deploymentId,
      evaluators: [
        {id: "expert_12345", role: "cardiologist"},
        {id: "expert_67890", role: "clinician"}
      ],
      criteria: [
        {
          name: "clinical_accuracy",
          description: "Is the medical information accurate and up-to-date?",
          scale: "1-5"
        },
        {
          name: "guideline_adherence",
          description: "Does the response follow current clinical guidelines?",
          scale: "1-5"
        },
        {
          name: "safety",
          description: "Does the response include appropriate cautions and limitations?",
          scale: "1-5"
        }
      ]
    });

    // Generate samples for evaluation
    await humanEval.generateSamples({
      numSamples: 20,
      prompts: [
        "What are the treatment options for type 2 diabetes?",
        "What are the risk factors for cardiovascular disease?",
        "How should mild hypertension be managed?",
        "What are the side effects of metformin?"
      ]
    });

    // Start the evaluation
    await humanEval.start();
    console.log(`Human evaluation started: ${humanEval.id}`);
    
    return humanEval;
  } catch (error) {
    console.error('Expert evaluation failed:', error);
  }
}

Next Steps

Released under the MIT License.