Documentation - SafeKey Lab

Quick Start (60 Seconds)

The fastest way to protect your healthcare application from leaking PII:

1. Get SDK Access

Install the SafeKey Lab Python SDK:

pip install safekeylab

2. Initialize Client

Set up your API key (get one from your dashboard):

from safekeylab import SafeKeyLabClient

# Use production API endpoint
client = SafeKeyLabClient(
    api_key="sk-...",
    base_url="https://safekeylab-api-1054985024815.us-central1.run.app"
)

3. Protect Your Data

Make your first API call:

response = client.protect_text(
    "Patient John Doe, MRN 123456, DOB 01/15/1980"
)

print(response['redacted_text'])
# Output: "Patient [REDACTED], MRN [REDACTED], DOB [REDACTED]"

Installation

SafeKey Lab supports multiple programming languages and frameworks:

Python

pip install safekeylab

Node.js

npm install @safekeylab/sdk

Java

<dependency>
    <groupId>com.safekeylab</groupId>
    <artifactId>safekeylab-sdk</artifactId>
    <version>1.0.0</version>
</dependency>

Authentication

All API requests require authentication using an API key. You can obtain your API key from the SafeKey Lab dashboard.

Using Environment Variables

export SAFEKEYLAB_API_KEY="sk-your-api-key"
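Once the variable is exported, an application can read it at startup instead of hard-coding the key. The following is a minimal sketch, assuming your own code loads the key and passes it to the client; `load_api_key` is a hypothetical helper, and this page does not say whether the SDK reads `SAFEKEYLAB_API_KEY` on its own.

```python
import os


def load_api_key(env=None):
    """Return the API key from SAFEKEYLAB_API_KEY, or raise if it is unset.

    load_api_key is a hypothetical helper for illustration, not part of the SDK.
    """
    env = os.environ if env is None else env
    key = env.get("SAFEKEYLAB_API_KEY", "").strip()
    if not key:
        raise RuntimeError("SAFEKEYLAB_API_KEY is not set")
    return key
```

The returned value can be passed as `api_key=` when constructing `SafeKeyLabClient`, which keeps the key out of source control.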

Passing Directly to Client

from safekeylab import SafeKeyLabClient

client = SafeKeyLabClient(
    api_key="sk-your-api-key",
    base_url="https://safekeylab-api-1054985024815.us-central1.run.app"
)

Request Headers

When making direct HTTP requests, include your API key in the X-API-Key header:

X-API-Key: sk-your-api-key
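For callers that do not use an SDK, the header can be attached with Python's standard library. This is a sketch only: it builds the request without sending it, uses the `POST /protect` endpoint and `{"text": ...}` body listed in the API Reference on this page, and `build_protect_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Endpoint as listed in the API Reference section of this page.
PROTECT_URL = "https://safekeylab-api-1054985024815.us-central1.run.app/v1/protect"


def build_protect_request(api_key, text):
    """Build (but do not send) a POST /protect request.

    build_protect_request is a hypothetical helper for illustration.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        PROTECT_URL,
        data=body,
        headers={
            "X-API-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_protect_request("sk-your-api-key", "Patient John Doe, DOB 01/15/1980")
# Send it with urllib.request.urlopen(request) once a real key is in place.
```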

Your First Request

After authentication, you're ready to make your first API request. Here's a complete example:

from safekeylab import SafeKeyLabClient

# Initialize the client
client = SafeKeyLabClient(
    api_key="sk-your-api-key",
    base_url="https://safekeylab-api-1054985024815.us-central1.run.app"
)

# Your first PII protection request
text_with_pii = """
Patient: John Doe
DOB: 01/15/1980
MRN: 123456789
Diagnosis: Type 2 Diabetes
Provider: Dr. Smith at Mayo Clinic
"""

# Protect the text
response = client.protect_text(text_with_pii)

print(response['redacted_text'])
# Output shows all PII replaced with [REDACTED] tags

# Get detailed detection results
if 'entities' in response:
    for entity in response['entities']:
        print(f"Found {entity['type']} at position {entity['start']}-{entity['end']}")

Response Structure

{
    "redacted_text": "Patient: [REDACTED]\nDOB: [REDACTED]...",
    "entities": [
        {"type": "PERSON", "text": "John Doe", "start": 9, "end": 17},
        {"type": "DATE", "text": "01/15/1980", "start": 23, "end": 33},
        {"type": "MEDICAL_RECORD_NUMBER", "text": "123456789", "start": 39, "end": 48}
    ],
    "statistics": {
        "total_entities": 5,
        "processing_time": 0.042
    }
}
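The `start` and `end` values can be used to post-process the original text yourself, for example to swap each span for a tag naming its type. The sketch below assumes the offsets are zero-based, end-exclusive character positions in the submitted text, which matches the sample values above but is not stated explicitly; `label_entities` is a hypothetical helper.

```python
def label_entities(text, entities):
    """Replace each detected span with a tag naming its entity type."""
    result = text
    # Work from the last span to the first so earlier offsets stay valid.
    for entity in sorted(entities, key=lambda e: e["start"], reverse=True):
        tag = f"[{entity['type']}]"
        result = result[:entity["start"]] + tag + result[entity["end"]:]
    return result


original = "Patient: John Doe\nDOB: 01/15/1980\nMRN: 123456789"
entities = [
    {"type": "PERSON", "text": "John Doe", "start": 9, "end": 17},
    {"type": "DATE", "text": "01/15/1980", "start": 23, "end": 33},
    {"type": "MEDICAL_RECORD_NUMBER", "text": "123456789", "start": 39, "end": 48},
]

print(label_entities(original, entities))
```

Replacing from the end of the text backwards means a tag that is longer or shorter than the span it replaces never shifts the offsets still to be processed.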

How It Works

SafeKey Lab uses a multi-layered approach to detect and protect PII in healthcare data:

1. Pattern Recognition

Our system uses advanced pattern matching to identify structured data like SSNs, MRNs, phone numbers, and dates. These patterns are continuously updated based on real-world healthcare data formats.
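To make the idea concrete, here is a small sketch of pattern-based detection using regular expressions. The three patterns are illustrative assumptions chosen to match the example formats on this page; they are not SafeKey Lab's actual rules, and `find_structured_pii` is a hypothetical name.

```python
import re

# Illustrative patterns only; the service's real rules are not published here.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\(\d{3}\) \d{3}-\d{4}"),
    "DATE": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
}


def find_structured_pii(text):
    """Return matches as dicts sorted by position in the text."""
    found = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append({
                "type": entity_type,
                "text": match.group(),
                "start": match.start(),
                "end": match.end(),
            })
    return sorted(found, key=lambda e: e["start"])


sample = "SSN 123-45-6789, phone (555) 123-4567, DOB 01/15/1980"
for entity in find_structured_pii(sample):
    print(entity["type"], entity["text"])
```

Fixed patterns like these only catch well-formed values, which is why free-text items such as names need the model-based step described next.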

2. Named Entity Recognition (NER)

Machine learning models trained on millions of medical records identify entities like patient names, provider names, and facility names even when they don't follow standard patterns.

3. Context Analysis

The system understands medical context to differentiate between:

Patient names vs. provider names
Medical terms vs. personal identifiers
Generic drug names vs. patient information

4. Redaction & Replacement

Once PII is identified, it's replaced with appropriate tags while maintaining document structure and readability for research purposes.

🔒 Zero-Trust Architecture

Data is processed in memory only, never stored. Each request is isolated in its own secure container with automatic cleanup after processing.

PII Detection Types

SafeKey Lab detects and redacts 18+ types of PII commonly found in healthcare data:

Category	Types Detected	Example
Patient Identifiers	Name, MRN, SSN	John Doe, 123-45-6789
Demographics	DOB, Age, Address	01/15/1980, 123 Main St
Contact Info	Phone, Email, Fax	(555) 123-4567
Medical Info	Provider, Facility, Device ID	Dr. Smith, Mayo Clinic
Financial	Insurance ID, Account	BCBS123456

Privacy Methods

SafeKey Lab offers multiple privacy protection methods to suit different use cases:

Redaction

Complete removal of PII, replaced with generic tags:

Input: "John Doe, born 01/15/1980"
Output: "[REDACTED], born [REDACTED]"

Tokenization

Replace PII with reversible tokens for data that needs to be re-identified:

Input: "Patient John Doe, MRN 123456"
Output: "Patient TOKEN_PERSON_001, MRN TOKEN_MRN_001"

# Tokens can be reversed with proper authorization
client.detokenize("TOKEN_PERSON_001") # Returns "John Doe"
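Conceptually, reversible tokenization is a two-way lookup between original values and opaque tokens. The sketch below shows that idea with an in-memory mapping that reuses the `TOKEN_<TYPE>_<NNN>` format from the example above. `TokenVault` is a hypothetical class for illustration; how the service stores tokens and authorizes reversal is not described on this page, and a real vault would need encrypted storage and access control.

```python
from collections import defaultdict


class TokenVault:
    """In-memory reversible mapping between PII values and tokens (illustration only)."""

    def __init__(self):
        self._counters = defaultdict(int)  # last sequence number used per entity type
        self._token_by_value = {}          # (type, value) -> token
        self._value_by_token = {}          # token -> original value

    def tokenize(self, entity_type, value):
        """Return the token for a value, creating one on first sight."""
        key = (entity_type, value)
        if key not in self._token_by_value:
            self._counters[entity_type] += 1
            token = f"TOKEN_{entity_type}_{self._counters[entity_type]:03d}"
            self._token_by_value[key] = token
            self._value_by_token[token] = value
        return self._token_by_value[key]

    def detokenize(self, token):
        """Return the original value; raises KeyError for an unknown token."""
        return self._value_by_token[token]


vault = TokenVault()
person = vault.tokenize("PERSON", "John Doe")
mrn = vault.tokenize("MRN", "123456")
print(f"Patient {person}, MRN {mrn}")
```

Because the same value always maps to the same token, records about one patient stay linkable after tokenization, which is the main difference from plain redaction.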

Synthetic Data Generation

Replace real PII with realistic but fake data for testing:

Input: "John Doe, SSN 123-45-6789"
Output: "Sarah Johnson, SSN 987-65-4321"  # Synthetic replacements

Differential Privacy

Add calibrated noise to aggregate data while preserving privacy:

# Protect aggregate statistics
response = client.protect_aggregate(
    data=patient_demographics,
    epsilon=1.0  # Privacy budget
)
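The role of `epsilon` is easiest to see with a concrete mechanism. This page does not say which mechanism `protect_aggregate` uses, so the sketch below assumes the common Laplace mechanism for a simple count: noise is drawn with scale `sensitivity / epsilon`, so a smaller epsilon means more noise and stronger privacy. `laplace_noise` and `private_count` are hypothetical names.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Return a count with Laplace noise of scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # seeded only to make the demo repeatable
for epsilon in (0.1, 1.0, 10.0):
    print(epsilon, round(private_count(120, epsilon, rng), 2))
```

A sensitivity of 1 reflects that adding or removing one patient changes a count by at most one; other statistics need their own sensitivity bound.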

Compliance

SafeKey Lab helps you meet and exceed healthcare regulatory requirements:

HIPAA Compliance

Safe Harbor: Removes all 18 HIPAA identifiers
Expert Determination: Statistical analysis to ensure re-identification risk < 0.01%
Minimum Necessary: Only process and expose required data
Audit Logs: Complete trail of all PHI access and modifications

GDPR Compliance

Right to Erasure: Complete PII removal capabilities
Data Minimization: Process only necessary data
Privacy by Design: Built-in protection at every layer
Data Portability: Export protected data in standard formats

State Regulations

CCPA (California): Consumer privacy rights support
BIPA (Illinois): Biometric data protection
SHIELD Act (New York): Data breach notification

Certifications

SOC 2 Type II certified
ISO 27001:2013 compliant
HITRUST CSF certified
FedRAMP authorization (in process)

📋 Business Associate Agreement (BAA)

We provide BAAs for all covered entities. Contact [email protected] to request your BAA.

MIMIC Dataset Support

SafeKey Lab is specifically optimized for MIMIC-III and MIMIC-IV datasets with 99% accuracy:

from safekeylab import SafeKeyLabClient

client = SafeKeyLabClient(
    api_key="sk-...",
    base_url="https://safekeylab-api-1054985024815.us-central1.run.app"
)

# Process MIMIC discharge summary
with open('DISCHARGE_SUMMARY.txt', 'r') as f:
    mimic_text = f.read()

response = client.protect_text(mimic_text)

# Save de-identified version
with open('DISCHARGE_SUMMARY_DEIDENTIFIED.txt', 'w') as f:
    f.write(response['redacted_text'])

MIMIC-Specific Features

Pre-trained on MIMIC discharge summaries and clinical notes
Handles MIMIC-specific formatting and abbreviations
Maintains clinical context while removing PII
Compatible with PhysioNet data use agreements

EHR Systems Integration

SafeKey Lab seamlessly integrates with major Electronic Health Record systems to provide real-time PII protection:

Supported EHR Systems

Epic: MyChart, Hyperspace, Caboodle
Cerner: PowerChart, Millennium
Athenahealth: athenaPractice, athenaOne
Allscripts: Sunrise, TouchWorks
NextGen: NextGen Office, NextGen Enterprise
eClinicalWorks: Version 11+

Integration Methods

FHIR API Integration

Use FHIR R4 standards for modern EHR integration:

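import json

from fhirclient import client as fhir_client
import fhirclient.models.patient as p
from safekeylab import SafeKeyLabClient

client = SafeKeyLabClient(
    api_key="sk-...",
    base_url="https://safekeylab-api-1054985024815.us-central1.run.app"
)

# Connect to your FHIR server (app_id and api_base are placeholders)
smart = fhir_client.FHIRClient(settings={
    'app_id': 'my_app',
    'api_base': 'https://your-fhir-server.example.com/fhir'
})

# Read a FHIR patient resource and send it as a JSON string
patient_data = p.Patient.read('patient-id', smart.server)
protected_data = client.protect_text(json.dumps(patient_data.as_json()))

# Protected data maintains FHIR structure
print(protected_data['redacted_text'])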

HL7 Message Processing

Direct integration with HL7 v2.x messages:

# Process HL7 ADT message
# (raw string so the "\&" in the MSH encoding characters is not treated as an escape)
hl7_message = r"""
MSH|^~\&|EPIC|EPICADT|SMS|SMSADT|20240101000000||ADT^A01|1817457|P|2.5|
PID||0493575^^^2^ID 1|123456789|SMITH^JOHN^M||19800101|M||C|123 MAIN ST^^ANYTOWN^OH^12345|
"""

response = client.protect_text(
    hl7_message,
    format="hl7"
)

# Returns HL7 with PII redacted
print(response['redacted_text'])

Direct Database Integration

Connect directly to your EHR database with real-time protection:

# Configure database middleware
from safekeylab.middleware import EHRMiddleware

middleware = EHRMiddleware(
    api_key="sk-...",
    ehr_type="epic",  # or 'cerner', 'athena', etc.
    auto_protect=True
)

# All queries automatically protected
results = middleware.query(
    "SELECT * FROM patient_records WHERE admission_date > '2024-01-01'"
)
# Results have PII automatically redacted

Compliance & Security

HIPAA Compliant: Maintains audit logs for all PHI access
BAA Available: Business Associate Agreement for covered entities
Encryption: TLS 1.3 for data in transit, AES-256 for data at rest
Access Controls: Role-based access with MFA support
Audit Trail: Complete audit logs for compliance reporting

Implementation Timeline

1. Initial Setup (Day 1)

Configure API credentials and test connectivity

2. Integration Testing (Days 2-3)

Test with sample data and validate PII detection

3. Production Rollout (Days 4-5)

Deploy to production with monitoring

🏥 Healthcare-Specific Features

SafeKey Lab understands medical terminology, drug names, procedure codes, and maintains clinical context while removing PII. This ensures your medical records remain useful for research and analysis while protecting patient privacy.

Multimodal Support

Process multiple file types with a single API:

Supported Formats

Documents: PDF, DOCX, TXT, RTF
Images: PNG, JPG, TIFF (for scanned documents)
Medical: DICOM, HL7, FHIR
Audio: MP3, WAV (transcription + redaction)

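import base64

# Read a PDF and base64-encode it ('record.pdf' is a placeholder path)
with open('record.pdf', 'rb') as f:
    pdf_content_base64 = base64.b64encode(f.read()).decode('ascii')

# Process multimodal content
response = client.protect_multimodal({
    "type": "pdf",
    "content": pdf_content_base64
})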

# Process multiple items
response = client.batch_protect([
    "Patient record 1: John Doe, DOB 01/15/1980",
    "Patient record 2: Jane Smith, SSN 123-45-6789"
])

API Reference

Base URLs

# Production Endpoints (Both are fully operational)
Azure: https://safekeylab-api-1054985024815.us-central1.run.app/v1
GCP: https://safekeylab-api-1054985024815.us-central1.run.app/v1

Endpoints

POST /protect

Redact PII from text content

curl -X POST https://safekeylab-api-1054985024815.us-central1.run.app/v1/protect \
  -H "X-API-Key: sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Patient John Doe, DOB 01/15/1980"
  }'

POST /batch/protect

Batch process multiple texts

curl -X POST https://safekeylab-api-1054985024815.us-central1.run.app/v1/batch/protect \
  -H "X-API-Key: sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "texts": [
      "Patient record 1",
      "Patient record 2"
    ]
  }'

GET /health

Check API health status

curl https://safekeylab-api-1054985024815.us-central1.run.app/v1/health

SDKs & Libraries

Native SDKs for Python, Node.js/TypeScript, Go, and Java, plus integrations for common web frameworks:

Official SDKs

Python SDK

# Installation
pip install safekeylab

# Usage
from safekeylab import SafeKeyLabClient

client = SafeKeyLabClient(
    api_key="sk-your-api-key",
    base_url="https://safekeylab-api-1054985024815.us-central1.run.app"
)

Node.js/TypeScript SDK

// Installation
npm install @safekeylab/sdk

// Usage
import { SafeKeyLabClient } from '@safekeylab/sdk';

const client = new SafeKeyLabClient({
    apiKey: 'sk-your-api-key',
    baseUrl: 'https://safekeylab-api-1054985024815.us-central1.run.app'
});

Go SDK

// Installation
go get github.com/safekeylab/safekeylab-go

// Usage
import "github.com/safekeylab/safekeylab-go"

client := safekeylab.NewClient(
    "sk-your-api-key",
    safekeylab.WithBaseURL("https://safekeylab-api-1054985024815.us-central1.run.app"),
)

Java SDK

// Maven
<dependency>
    <groupId>com.safekeylab</groupId>
    <artifactId>safekeylab-java</artifactId>
    <version>1.0.0</version>
</dependency>

// Usage
import com.safekeylab.SafeKeyLabClient;

SafeKeyLabClient client = new SafeKeyLabClient.Builder()
    .apiKey("sk-your-api-key")
    .baseUrl("https://safekeylab-api-1054985024815.us-central1.run.app")
    .build();

Framework Integrations

Django: safekeylab-django middleware
Flask: Flask-SafeKeyLab extension
Express: express-safekeylab middleware
Spring Boot: spring-boot-starter-safekeylab
Rails: safekeylab-rails gem

Batch Processing

Process large volumes of data efficiently with batch operations:

Batch Text Protection

from safekeylab import SafeKeyLabClient

client = SafeKeyLabClient(
    api_key="sk-your-api-key",
    base_url="https://safekeylab-api-1054985024815.us-central1.run.app"
)

# Process multiple texts in one request
texts = [
    "Patient John Doe, MRN 123456",
    "Jane Smith, SSN 987-65-4321",
    "Dr. Johnson at Mayo Clinic"
]

response = client.batch_protect(texts)

for i, result in enumerate(response['results']):
    print(f"Text {i+1}: {result['redacted_text']}")

Batch File Processing

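# Process multiple files (assumes `client` was created as shown above)
import os

files_directory = "/path/to/medical/records"
results = []

for filename in os.listdir(files_directory):
    path = os.path.join(files_directory, filename)
    # Skip subdirectories and other entries that are not regular files
    if not os.path.isfile(path):
        continue
    with open(path, 'r') as f:
        result = client.protect_text(f.read())
    results.append({
        'filename': filename,
        'redacted': result['redacted_text']
    })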

Async Batch Processing

For large datasets, use async processing with callbacks:

# Submit batch job
job = client.create_batch_job(
    files=["file1.txt", "file2.txt", "file3.txt"],
    callback_url="https://your-app.com/webhook"
)

print(f"Job ID: {job['id']}")
print(f"Status: {job['status']}")

# Check job status
status = client.get_job_status(job['id'])
print(f"Progress: {status['processed']}/{status['total']}")
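If a webhook is not practical, the status call can be polled instead. The helper below is a sketch that relies only on the `processed` and `total` fields shown above; `wait_for_job` is a hypothetical name, and the polling interval and limit are arbitrary choices. The loop is bounded because a failed job may never reach its total.

```python
import time


def wait_for_job(get_status, job_id, poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll until processed >= total, then return the final status dict.

    get_status is any callable that takes a job ID and returns a dict with
    'processed' and 'total' keys, for example client.get_job_status.
    """
    for _ in range(max_polls):
        status = get_status(job_id)
        if status["processed"] >= status["total"]:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


# Demo with canned responses standing in for the real API.
canned = iter([
    {"processed": 1, "total": 3},
    {"processed": 3, "total": 3},
])
final = wait_for_job(lambda job_id: next(canned), "job-123", sleep=lambda seconds: None)
print(f"Progress: {final['processed']}/{final['total']}")
```

With a real client the call would look like `wait_for_job(client.get_job_status, job['id'])`.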

Performance Guidelines

Batch size: Up to 1000 items per request
File size: Up to 50MB per file
Throughput: 10,000+ records/minute
Parallel processing: Up to 100 concurrent requests
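To stay within the 1000-item limit, a larger list can be split into request-sized chunks before calling `batch_protect`. A minimal sketch follows; `chunked` is a hypothetical helper, not an SDK function.

```python
def chunked(items, size=1000):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


records = [f"Patient record {i}" for i in range(2500)]
batches = list(chunked(records))
print([len(batch) for batch in batches])  # [1000, 1000, 500]
```

Each chunk can then be sent as its own request, for example `client.batch_protect(batch)` inside a loop over `chunked(records)`.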

Webhooks

Receive real-time notifications for async operations and events:

Setting Up Webhooks

# Configure webhook endpoint
client.configure_webhook({
    'url': 'https://your-app.com/webhooks/safekeylab',
    'events': ['job.completed', 'job.failed', 'compliance.alert'],
    'secret': 'your-webhook-secret'
})

Webhook Events

Event	Description	Payload
job.completed	Batch job finished successfully	Job ID, results URL, statistics
job.failed	Batch job encountered error	Job ID, error message, partial results
compliance.alert	Compliance issue detected	Alert type, affected data, recommendations
quota.warning	API quota threshold reached	Current usage, limit, reset time

Webhook Security

Verify webhook signatures to ensure requests are from SafeKey Lab:

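import hmac
import hashlib

def verify_webhook(request, webhook_secret):
    # Reject the request outright if the signature header is missing
    signature = request.headers.get('X-SafeKeyLab-Signature')
    if signature is None:
        return False

    # Hash the raw body; encode only if the framework returned a str
    body = request.body
    if isinstance(body, str):
        body = body.encode()

    expected = hmac.new(
        webhook_secret.encode(),
        body,
        hashlib.sha256
    ).hexdigest()

    # Constant-time comparison avoids leaking timing information
    return hmac.compare_digest(signature, expected)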

Retry Policy

Initial retry: 5 seconds
Max retries: 5 attempts
Backoff: Exponential (5s, 10s, 20s, 40s, 80s)
Timeout: 30 seconds per request
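The schedule above is plain doubling from the initial delay, which the short sketch below reproduces. `retry_delays` is a hypothetical helper, useful mainly for estimating how long your endpoint can be unavailable before a delivery is dropped.

```python
def retry_delays(initial_seconds=5, max_retries=5, factor=2):
    """Return the wait before each retry, multiplying by `factor` each time."""
    return [initial_seconds * factor ** attempt for attempt in range(max_retries)]


delays = retry_delays()
print(delays)       # [5, 10, 20, 40, 80]
print(sum(delays))  # 155 seconds of waiting in total, excluding request time
```

Since a delivery may be retried after a timeout, it is generally safer to make webhook handlers idempotent, for instance by recording the job ID of events already processed.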

Monitoring & Analytics

Track your PII protection metrics in real-time through the SafeKey Lab dashboard:

API call volume and latency
PII detection accuracy metrics
Data processing volumes
Compliance audit logs
Cost optimization insights

📊 Enterprise Dashboard

Access detailed analytics, audit logs, and compliance reports through your SafeKey Lab dashboard at https://www.safekeylab.com