Node.js SDK: Fine-tuning
This guide covers the fine-tuning capabilities of the PaiTIENT Secure Model Service Node.js SDK, allowing you to customize AI models for healthcare-specific use cases while maintaining HIPAA/SOC2 compliance.
Fine-tuning Overview
Fine-tuning allows you to adapt pre-trained models to your specific healthcare domain or use case, improving performance, accuracy, and adherence to guidelines.
The PaiTIENT Secure Model Service provides secure fine-tuning capabilities with:
- Data Security: All fine-tuning data is encrypted and processed in secure environments
- Compliance: The entire fine-tuning process adheres to HIPAA and SOC2 requirements
- Efficient Training: Parameter-efficient fine-tuning methods like LoRA
- Performance Tracking: Detailed metrics for fine-tuning progress and evaluation
Prerequisites
Before fine-tuning a model, ensure you have:
- Installed the PaiTIENT Node.js SDK
- Set up authentication credentials
- Prepared a fine-tuning dataset
- Selected a base model for fine-tuning
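Before initializing the client, it can help to confirm the required credentials are actually set. The sketch below is illustrative (the helper is not part of the SDK); the environment variable names match those used throughout this guide.

```javascript
// Illustrative pre-flight check: confirm required credentials are set
// before constructing the client. The environment variable names match
// those used in the examples in this guide.
const REQUIRED_ENV_KEYS = ['PAITIENT_API_KEY', 'PAITIENT_CLIENT_ID'];

function findMissingEnv(env, keys) {
  // Return the subset of keys that are unset or empty in the given env map
  return keys.filter((key) => !env[key] || env[key].trim() === '');
}

const missing = findMissingEnv(process.env, REQUIRED_ENV_KEYS);
if (missing.length > 0) {
  console.error(`Missing credentials: ${missing.join(', ')}`);
} else {
  console.log('All credentials are set');
}
```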
Installation
# Install the PaiTIENT Node.js SDK
npm install paitient-secure-model
# or with yarn
yarn add paitient-secure-model
Dataset Preparation
Dataset Format
The PaiTIENT Secure Model Service supports various dataset formats for different fine-tuning approaches:
Instruction Dataset Format (JSONL)
For instruction fine-tuning, use a JSONL file with this structure:
{"instruction": "Explain the mechanism of action for metformin", "response": "Metformin primarily works by reducing glucose production in the liver (hepatic gluconeogenesis), improving insulin sensitivity in peripheral tissues, and enhancing glucose uptake in skeletal muscle. It also decreases intestinal absorption of glucose."}
{"instruction": "What are the symptoms of diabetic ketoacidosis?", "response": "The main symptoms of diabetic ketoacidosis include excessive thirst, frequent urination, nausea and vomiting, abdominal pain, weakness or fatigue, shortness of breath, fruity-scented breath, and confusion. It may also present with high blood glucose levels, high ketone levels in the urine, and electrolyte imbalances."}
Conversation Dataset Format (JSONL)
For conversational fine-tuning, use a JSONL file with this structure:
{"conversations": [{"role": "user", "content": "What are the side effects of atorvastatin?"}, {"role": "assistant", "content": "Atorvastatin may cause side effects including muscle pain and weakness, which in rare cases can lead to serious muscle damage (rhabdomyolysis). Other common side effects include joint pain, diarrhea, nausea, and elevated liver enzymes. Rare but serious side effects include liver damage, diabetes, and memory problems. Always consult your healthcare provider if you experience concerning symptoms while taking atorvastatin."}]}
{"conversations": [{"role": "user", "content": "I'm experiencing leg pain after starting atorvastatin. Should I be concerned?"}, {"role": "assistant", "content": "Leg pain after starting atorvastatin could be a sign of muscle-related side effects, which should be taken seriously. Please contact your healthcare provider immediately to discuss these symptoms. Your doctor might want to check your creatine kinase levels to rule out muscle damage. Don't stop taking the medication without medical advice, but this symptom warrants prompt medical attention."}]}
Dataset Validation
Validate your dataset before fine-tuning:
const { PaiTIENTClient } = require('paitient-secure-model');
// Initialize client
const client = new PaiTIENTClient({
apiKey: process.env.PAITIENT_API_KEY,
clientId: process.env.PAITIENT_CLIENT_ID
});
async function validateDataset() {
try {
// Validate dataset
const validationResult = await client.validateDataset({
filePath: "clinical_guidelines_dataset.jsonl",
format: "instruction" // or "conversation"
});
if (validationResult.isValid) {
console.log("Dataset validation successful");
console.log(`Number of examples: ${validationResult.numExamples}`);
console.log(`Total tokens: ${validationResult.totalTokens}`);
} else {
console.log("Dataset validation failed:");
for (const error of validationResult.errors) {
console.log(` - ${error}`);
}
}
} catch (error) {
console.error('Dataset validation failed:', error);
}
}
validateDataset();
Dataset Upload
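The upload step reports exact example and token counts, but a rough local estimate can catch an oversized dataset before you transfer it. The helper below is illustrative, not part of the SDK; the ~4 characters per token figure is a common rule of thumb, not the service's actual tokenizer.

```javascript
// Rough local dataset size estimate before upload. The ~4 characters per
// token heuristic is a common rule of thumb, not the service's tokenizer;
// rely on validateDataset()/uploadDataset() for exact counts.
function estimateDatasetTokens(jsonlText, charsPerToken = 4) {
  const lines = jsonlText.split('\n').filter((l) => l.trim());
  const totalChars = lines.reduce((sum, line) => sum + line.length, 0);
  return {
    numExamples: lines.length,
    approxTokens: Math.ceil(totalChars / charsPerToken)
  };
}
```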
Upload your dataset to the secure environment:
// Upload dataset
async function uploadDataset() {
try {
const dataset = await client.uploadDataset({
filePath: "clinical_guidelines_dataset.jsonl",
name: "Clinical Guidelines Dataset",
description: "Fine-tuning dataset for clinical guideline adherence",
format: "instruction",
tags: {domain: "cardiology", source: "guidelines"}
});
console.log(`Dataset ID: ${dataset.id}`);
console.log(`Number of examples: ${dataset.numExamples}`);
console.log(`Total tokens: ${dataset.totalTokens}`);
return dataset.id;
} catch (error) {
console.error('Dataset upload failed:', error);
}
}
Basic Fine-tuning
The simplest way to fine-tune a model:
const { PaiTIENTClient } = require('paitient-secure-model');
// Initialize client
const client = new PaiTIENTClient({
apiKey: process.env.PAITIENT_API_KEY,
clientId: process.env.PAITIENT_CLIENT_ID
});
async function fineTuneModel() {
try {
// Start a fine-tuning job
const fineTuningJob = await client.createFineTuningJob({
baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
datasetId: "ds_12345abcde",
fineTuningMethod: "lora",
modelName: "clinical-assistant-cardiology-v1"
});
console.log(`Fine-tuning job ID: ${fineTuningJob.id}`);
console.log(`Status: ${fineTuningJob.status}`);
// Wait for fine-tuning to complete
await fineTuningJob.waitUntilComplete();
console.log(`Fine-tuning is now ${fineTuningJob.status}`);
console.log(`Fine-tuned model: ${fineTuningJob.fineTunedModel}`);
return fineTuningJob;
} catch (error) {
console.error('Fine-tuning failed:', error);
}
}
fineTuneModel();
Fine-tuning Options
Training Parameters
Configure training parameters for your fine-tuning job:
// Fine-tuning with specific training parameters
const fineTuningJob = await client.createFineTuningJob({
baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
datasetId: "ds_12345abcde",
fineTuningMethod: "lora",
modelName: "clinical-assistant-cardiology-v1",
hyperparameters: {
learningRate: 1e-4,
batchSize: 8,
epochs: 3,
warmupSteps: 100,
loraRank: 16,
loraAlpha: 32,
loraDropout: 0.05
}
});
Validation Dataset
Use a validation dataset to monitor training progress:
async function fineTuneWithValidation() {
try {
// Upload validation dataset
const validationDataset = await client.uploadDataset({
filePath: "clinical_guidelines_validation.jsonl",
name: "Clinical Guidelines Validation Dataset",
format: "instruction"
});
// Fine-tuning with validation
const fineTuningJob = await client.createFineTuningJob({
baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
datasetId: "ds_12345abcde",
validationDatasetId: validationDataset.id,
fineTuningMethod: "lora",
modelName: "clinical-assistant-cardiology-v1",
validationFrequency: 0.1 // Run validation every 10% of training steps
});
return fineTuningJob;
} catch (error) {
console.error('Fine-tuning with validation failed:', error);
}
}
Compute Configuration
Configure compute resources for your fine-tuning job:
// Fine-tuning with compute configuration
const fineTuningJob = await client.createFineTuningJob({
baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
datasetId: "ds_12345abcde",
fineTuningMethod: "lora",
modelName: "clinical-assistant-cardiology-v1",
computeConfig: {
instanceType: "g5.2xlarge",
instanceCount: 1,
maxRuntimeHours: 24
}
});
Security Settings
Apply security settings to your fine-tuning job:
const { PaiTIENTClient, Security } = require('paitient-secure-model');
// Initialize client
const client = new PaiTIENTClient({
apiKey: process.env.PAITIENT_API_KEY,
clientId: process.env.PAITIENT_CLIENT_ID
});
async function secureFineTuning() {
try {
// Fine-tuning with security settings
const fineTuningJob = await client.createFineTuningJob({
baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
datasetId: "ds_12345abcde",
fineTuningMethod: "lora",
modelName: "clinical-assistant-cardiology-v1",
securitySettings: new Security.SecuritySettings({
networkIsolation: true, // Enable network isolation
encryptionLevel: "maximum", // Maximum encryption level
auditLogging: true, // Enable comprehensive audit logging
complianceMode: "hipaa" // Enable HIPAA compliance mode
})
});
return fineTuningJob;
} catch (error) {
console.error('Secure fine-tuning failed:', error);
}
}
Advanced Fine-tuning Methods
LoRA Fine-tuning
Low-Rank Adaptation (LoRA) is the default fine-tuning method:
// LoRA fine-tuning with advanced parameters
const fineTuningJob = await client.createFineTuningJob({
baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
datasetId: "ds_12345abcde",
fineTuningMethod: "lora",
modelName: "clinical-assistant-cardiology-v1",
hyperparameters: {
loraRank: 16, // Rank of LoRA matrices
loraAlpha: 32, // LoRA scaling factor
loraDropout: 0.05, // Dropout probability for LoRA layers
targetModules: ["q_proj", "v_proj"], // Target modules for LoRA
learningRate: 1e-4,
batchSize: 8,
epochs: 3
}
});
QLoRA Fine-tuning
Quantized LoRA (QLoRA) for memory-efficient fine-tuning:
// QLoRA fine-tuning
const fineTuningJob = await client.createFineTuningJob({
baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
datasetId: "ds_12345abcde",
fineTuningMethod: "qlora",
modelName: "clinical-assistant-cardiology-v1",
hyperparameters: {
quantizationBits: 4, // 4-bit quantization
loraRank: 16,
loraAlpha: 32,
learningRate: 1e-4,
batchSize: 8,
epochs: 3
}
});
Full Fine-tuning
Full parameter fine-tuning for maximum performance:
// Full fine-tuning (requires more compute resources)
const fineTuningJob = await client.createFineTuningJob({
baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
datasetId: "ds_12345abcde",
fineTuningMethod: "full",
modelName: "clinical-assistant-cardiology-v1",
hyperparameters: {
learningRate: 5e-5,
batchSize: 4,
epochs: 2,
warmupSteps: 100,
weightDecay: 0.01
},
computeConfig: {
instanceType: "g5.12xlarge", // More powerful GPU instance
instanceCount: 2,
maxRuntimeHours: 48
}
});
Fine-tuning Management
Check Fine-tuning Status
Monitor the status of your fine-tuning job:
// Get fine-tuning job status
async function checkFineTuningStatus(jobId) {
try {
const job = await client.getFineTuningJob(jobId);
console.log(`Status: ${job.status}`);
console.log(`Created: ${job.createdAt}`);
console.log(`Updated: ${job.updatedAt}`);
console.log(`Base model: ${job.baseModel}`);
// Get detailed job information
const details = await job.getDetails();
console.log(`Current epoch: ${details.currentEpoch}/${details.totalEpochs}`);
console.log(`Training loss: ${details.trainingLoss}`);
console.log(`Validation loss: ${details.validationLoss}`);
console.log(`Training samples processed: ${details.processedSamples}`);
return job;
} catch (error) {
console.error('Status check failed:', error);
}
}
List Fine-tuning Jobs
Retrieve a list of all your fine-tuning jobs:
// List all fine-tuning jobs
async function listFineTuningJobs() {
try {
const jobs = await client.listFineTuningJobs();
for (const job of jobs) {
console.log(`${job.id}: ${job.modelName} - ${job.status}`);
}
// Filter fine-tuning jobs
const completedJobs = await client.listFineTuningJobs({
filters: {
status: "completed",
baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B"
}
});
console.log("\nCompleted jobs:");
for (const job of completedJobs) {
console.log(`${job.id}: ${job.modelName} - Completed`);
}
return jobs;
} catch (error) {
console.error('List operation failed:', error);
}
}
Cancel Fine-tuning Job
Cancel a fine-tuning job that's no longer needed:
// Cancel fine-tuning job
async function cancelFineTuningJob(jobId) {
try {
await client.cancelFineTuningJob(jobId);
console.log(`Job ${jobId} has been cancelled`);
// Verify the status
const job = await client.getFineTuningJob(jobId);
console.log(`Current status: ${job.status}`);
return job;
} catch (error) {
console.error('Cancellation failed:', error);
}
}
Fine-tuning Metrics
Retrieve metrics for your fine-tuning job:
// Get fine-tuning metrics
async function getFineTuningMetrics(jobId) {
try {
const metrics = await client.getFineTuningMetrics(jobId);
console.log("Fine-tuning metrics:");
console.log(`Training loss: ${metrics.trainingLoss.slice(-1)[0]}`);
console.log(`Validation loss: ${metrics.validationLoss.slice(-1)[0]}`);
console.log(`Steps completed: ${metrics.steps.slice(-1)[0]}`);
// If using a visualization library like Chart.js
// You could render these metrics in a frontend application
return metrics;
} catch (error) {
console.error('Metrics retrieval failed:', error);
}
}
Fine-tuned Model Deployment
Deploy your fine-tuned model:
// Deploy a fine-tuned model
async function deployFineTunedModel(jobId) {
try {
// Get fine-tuned model ID
const fineTuningJob = await client.getFineTuningJob(jobId);
const fineTunedModel = fineTuningJob.fineTunedModel;
// Deploy the fine-tuned model
const deployment = await client.createDeployment({
modelName: fineTunedModel,
deploymentName: "clinical-assistant-cardiology"
});
// Wait for deployment to complete
await deployment.waitUntilReady();
console.log(`Deployment is now ${deployment.status}`);
console.log(`Endpoint: ${deployment.endpoint}`);
// Test the fine-tuned model
const response = await client.generateText({
deploymentId: deployment.id,
prompt: "What is the recommended first-line treatment for hypertension in diabetic patients?"
});
console.log("Model response:", response.text);
return deployment;
} catch (error) {
console.error('Deployment failed:', error);
}
}
Fine-tuning Evaluation
Evaluate your fine-tuned model against the base model:
const { PaiTIENTClient, Evaluation } = require('paitient-secure-model');
// Initialize client
const client = new PaiTIENTClient({
apiKey: process.env.PAITIENT_API_KEY,
clientId: process.env.PAITIENT_CLIENT_ID
});
async function evaluateFineTunedModel() {
try {
// Get fine-tuned model deployment
const fineTunedDeployment = await client.getDeployment("dep_fine_tuned");
// Get base model deployment
const baseModelDeployment = await client.getDeployment("dep_base_model");
// Create comparative evaluation
const evaluation = new Evaluation.ComparativeEvaluation({
name: "Clinical Guidelines Adherence Evaluation",
deployments: [
{id: baseModelDeployment.id, name: "Base Model"},
{id: fineTunedDeployment.id, name: "Fine-tuned Model"}
],
metrics: [
"factuality",
"guidelineAdherence",
"completeness",
"clinicalAccuracy"
]
});
// Run evaluation on test dataset
const results = await evaluation.run({
dataset: "clinical_test_cases.jsonl",
numSamples: 50
});
// Print comparative results
console.log("Comparative Evaluation Results:");
for (const metric of Object.keys(results.scores["Base Model"])) {
const baseScore = results.scores["Base Model"][metric];
const fineTunedScore = results.scores["Fine-tuned Model"][metric];
const improvement = fineTunedScore - baseScore;
console.log(`${metric}: ${baseScore.toFixed(2)} → ${fineTunedScore.toFixed(2)} (${improvement >= 0 ? '+' : ''}${improvement.toFixed(2)})`);
}
return results;
} catch (error) {
console.error('Evaluation failed:', error);
}
}
Advanced Use Cases
Multi-task Fine-tuning
Fine-tune a model for multiple medical specialties:
// Multi-task fine-tuning
async function multiTaskFineTuning() {
try {
// Upload datasets for different specialties
const cardiologyDataset = await client.uploadDataset({
filePath: "cardiology_guidelines.jsonl",
name: "Cardiology Guidelines Dataset",
format: "instruction"
});
const endocrinologyDataset = await client.uploadDataset({
filePath: "endocrinology_guidelines.jsonl",
name: "Endocrinology Guidelines Dataset",
format: "instruction"
});
const oncologyDataset = await client.uploadDataset({
filePath: "oncology_guidelines.jsonl",
name: "Oncology Guidelines Dataset",
format: "instruction"
});
// Combine datasets for multi-task fine-tuning
const combinedDataset = await client.mergeDatasets({
datasetIds: [
cardiologyDataset.id,
endocrinologyDataset.id,
oncologyDataset.id
],
name: "Multi-specialty Guidelines Dataset"
});
// Fine-tune on the combined dataset
const fineTuningJob = await client.createFineTuningJob({
baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
datasetId: combinedDataset.id,
fineTuningMethod: "lora",
modelName: "clinical-assistant-multi-specialty-v1"
});
return fineTuningJob;
} catch (error) {
console.error('Multi-task fine-tuning failed:', error);
}
}
Continual Fine-tuning
Update your fine-tuned model with new data:
// Continual fine-tuning
async function continualFineTuning() {
try {
// Get existing fine-tuned model
const existingFineTunedModel = "ft:ZimaBlueAI/HuatuoGPT-o1-8B:clinical-assistant-v1";
// Upload new dataset with updated guidelines
const newGuidelinesDataset = await client.uploadDataset({
filePath: "updated_guidelines_2023.jsonl",
name: "Updated Guidelines 2023",
format: "instruction"
});
// Fine-tune from the previously fine-tuned model
const fineTuningJob = await client.createFineTuningJob({
baseModel: existingFineTunedModel, // Start from previously fine-tuned model
datasetId: newGuidelinesDataset.id,
fineTuningMethod: "lora",
modelName: "clinical-assistant-v2",
hyperparameters: {
learningRate: 5e-5, // Lower learning rate for continued fine-tuning
epochs: 1 // Fewer epochs needed
}
});
return fineTuningJob;
} catch (error) {
console.error('Continual fine-tuning failed:', error);
}
}
Domain Adaptation with Synthetic Data
Generate and use synthetic data for domain adaptation:
const { PaiTIENTClient, DataGeneration } = require('paitient-secure-model');
// Initialize client
const client = new PaiTIENTClient({
apiKey: process.env.PAITIENT_API_KEY,
clientId: process.env.PAITIENT_CLIENT_ID
});
async function domainAdaptation() {
try {
// Generate synthetic medical conversations
const syntheticData = await DataGeneration.generateSyntheticData({
generationModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
domain: "neurology",
templateFile: "neurology_templates.json",
numExamples: 500,
outputFile: "synthetic_neurology_dataset.jsonl"
});
// Upload synthetic dataset
const syntheticDataset = await client.uploadDataset({
filePath: "synthetic_neurology_dataset.jsonl",
name: "Synthetic Neurology Dataset",
format: "conversation"
});
// Fine-tune with synthetic data
const fineTuningJob = await client.createFineTuningJob({
baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
datasetId: syntheticDataset.id,
fineTuningMethod: "lora",
modelName: "neurology-assistant-v1"
});
return fineTuningJob;
} catch (error) {
console.error('Domain adaptation failed:', error);
}
}
Error Handling
Implement proper error handling for fine-tuning:
const { PaiTIENTClient, errors } = require('paitient-secure-model');
const client = new PaiTIENTClient({
apiKey: process.env.PAITIENT_API_KEY,
clientId: process.env.PAITIENT_CLIENT_ID
});
async function robustFineTuning() {
try {
const fineTuningJob = await client.createFineTuningJob({
baseModel: "ZimaBlueAI/HuatuoGPT-o1-8B",
datasetId: "ds_12345abcde",
fineTuningMethod: "lora",
modelName: "clinical-assistant-v1"
});
return fineTuningJob;
} catch (error) {
if (error instanceof errors.InvalidParameterError) {
console.error(`Invalid parameter: ${error.message}`);
} else if (error instanceof errors.QuotaExceededError) {
console.error(`Quota exceeded: ${error.message}`);
} else if (error instanceof errors.FineTuningError) {
console.error(`Fine-tuning failed: ${error.message}`);
console.error(`Job ID: ${error.jobId}`);
console.error(`Status: ${error.status}`);
console.error(`Reason: ${error.reason}`);
// Get detailed error information
if (error.jobId) {
try {
const logs = await client.getFineTuningLogs({
jobId: error.jobId,
limit: 10
});
console.error("Error logs:");
for (const log of logs) {
console.error(` ${log.message}`);
}
} catch (logError) {
console.error(`Failed to fetch logs: ${logError.message}`);
}
}
} else {
console.error(`Unexpected error: ${error.message}`);
}
}
}
Best Practices
Dataset Quality
Follow these best practices for dataset preparation:
- Ensure diversity in your examples
- Focus on domain-specific content relevant to your use case
- Include edge cases common in clinical scenarios
- Balance response length appropriate for your application
- Use high-quality, verified medical information
- De-identify any patient data according to HIPAA guidelines
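Several of these checks can be automated locally before upload. The sketch below audits an instruction-format JSONL dataset for invalid lines, missing fields, and duplicate instructions, and reports a simple response-length statistic; the function is illustrative, not part of the SDK, and the field names match the instruction format shown earlier in this guide.

```javascript
// A minimal local quality audit for an instruction-format JSONL dataset.
// Checks: valid JSON per line, required fields present, no duplicate
// instructions; also reports average response length for balance checks.
function auditInstructionDataset(jsonlText) {
  const seen = new Set();
  const issues = [];
  const lengths = [];
  jsonlText.split('\n').filter((l) => l.trim()).forEach((line, i) => {
    let example;
    try {
      example = JSON.parse(line);
    } catch {
      issues.push(`line ${i + 1}: invalid JSON`);
      return;
    }
    if (!example.instruction || !example.response) {
      issues.push(`line ${i + 1}: missing instruction or response`);
      return;
    }
    if (seen.has(example.instruction)) {
      issues.push(`line ${i + 1}: duplicate instruction`);
    }
    seen.add(example.instruction);
    lengths.push(example.response.length);
  });
  const avgResponseLength = lengths.length
    ? lengths.reduce((a, b) => a + b, 0) / lengths.length
    : 0;
  return { numExamples: lengths.length, avgResponseLength, issues };
}
```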
Hyperparameter Optimization
Optimize hyperparameters for your specific use case:
- Start with defaults for your first fine-tuning job
- Experiment with learning rates between 1e-5 and 5e-4
- Adjust batch size based on available memory
- Try different LoRA ranks (8, 16, 32) to balance efficiency and effectiveness
- Use early stopping to prevent overfitting
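One way to run these experiments systematically is to build a small grid of candidate configurations and submit one job per entry. The sketch below is illustrative: the grid values follow the ranges suggested above, and each entry is shaped like the `hyperparameters` object used in the fine-tuning examples earlier in this guide.

```javascript
// A small hyperparameter grid for comparison runs. Each entry could be
// passed as `hyperparameters` to createFineTuningJob; the values follow
// the ranges suggested above and are illustrative defaults.
function buildLoraGrid() {
  const grid = [];
  for (const loraRank of [8, 16, 32]) {
    for (const learningRate of [5e-5, 1e-4]) {
      grid.push({
        loraRank,
        loraAlpha: loraRank * 2, // common convention: alpha = 2 * rank
        learningRate,
        batchSize: 8,
        epochs: 3
      });
    }
  }
  return grid;
}
```

Each configuration would then get its own job, e.g. with a `modelName` suffixed by the rank and learning rate so runs are easy to compare later.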
Infrastructure Considerations
Optimize your infrastructure for fine-tuning:
- Choose appropriate instance types based on model size
- Monitor GPU memory usage to avoid out-of-memory errors
- Implement checkpointing for long-running jobs
- Use distributed training for very large models
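For long-running jobs you may prefer to monitor status yourself, with a bounded number of attempts, rather than relying solely on `waitUntilComplete()`. The polling helper below is a generic sketch, not part of the SDK; the terminal status strings are assumptions based on the statuses used elsewhere in this guide.

```javascript
// A generic polling helper with a bounded number of attempts. The terminal
// status names ('completed', 'failed', 'cancelled') are assumptions based
// on the statuses shown elsewhere in this guide.
async function pollUntilDone(getStatus, { intervalMs = 30000, maxAttempts = 100 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await getStatus();
    if (status === 'completed' || status === 'failed' || status === 'cancelled') {
      return status; // terminal state reached
    }
    // Wait before the next status check
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job still running after ${maxAttempts} attempts`);
}
```

Usage might look like `pollUntilDone(() => client.getFineTuningJob(jobId).then((j) => j.status))`.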
Evaluation Strategy
Develop a robust evaluation strategy:
- Create a dedicated test set not used in training
- Define domain-specific metrics relevant to clinical performance
- Compare against baseline models
- Involve domain experts in qualitative evaluation
- Test on realistic scenarios from your specific healthcare domain
// Example of structured evaluation with domain experts
const { PaiTIENTClient, Evaluation } = require('paitient-secure-model');
async function expertEvaluation(deploymentId) {
try {
// Create a human evaluation task
const humanEval = new Evaluation.HumanEvaluationTask({
name: "Expert Clinical Assessment",
deploymentId: deploymentId,
evaluators: [
{id: "expert_12345", role: "cardiologist"},
{id: "expert_67890", role: "clinician"}
],
criteria: [
{
name: "clinical_accuracy",
description: "Is the medical information accurate and up-to-date?",
scale: "1-5"
},
{
name: "guideline_adherence",
description: "Does the response follow current clinical guidelines?",
scale: "1-5"
},
{
name: "safety",
description: "Does the response include appropriate cautions and limitations?",
scale: "1-5"
}
]
});
// Generate samples for evaluation
await humanEval.generateSamples({
numSamples: 20,
prompts: [
"What are the treatment options for type 2 diabetes?",
"What are the risk factors for cardiovascular disease?",
"How should mild hypertension be managed?",
"What are the side effects of metformin?"
]
});
// Start the evaluation
await humanEval.start();
console.log(`Human evaluation started: ${humanEval.id}`);
return humanEval;
} catch (error) {
console.error('Expert evaluation failed:', error);
}
}
Next Steps
- Learn about Text Generation
- Explore Deployment Options
- Understand Model Evaluation
- Review our Security Best Practices