Built for developers. Designed for scale.

SDKs, APIs, and infrastructure that get out of your way.

Quick Start

Get up and running in minutes

# Install the SDK
pip install catalyzed

# Connect and query
from catalyzed import Client

client = Client(api_key="your-api-key")

# Query across your private data and marketplace datasets
results = client.query("""
    SELECT c.*, p.citation
    FROM contracts c
    JOIN caselaw.precedents p
      ON c.jurisdiction = p.jurisdiction
    WHERE c.type = 'employment'
""")

# Capture expert feedback on the first result, if the query returned any
if results:
    client.feedback.submit(
        result_id=results[0].id,
        rating="correct",
        note="Precedent match confirmed"
    )

View full documentation →

Architecture

Built on open standards

Feedback Loop
Connectors: 500+
Storage: Parquet/Arrow
Query: SQL + Vector
Orchestration
Outputs

Open formats

Parquet, Arrow, and standard SQL. Your data stays portable.

No vendor lock-in

Export anytime in standard formats. No proprietary schemas.

Your cloud or ours

SaaS, hybrid, or fully on-prem for enterprise customers.
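
As a rough illustration of what "standard SQL" buys in portability, the sketch below runs the Quick Start join unchanged against Python's built-in sqlite3 module. The table columns and sample rows are invented for the example, and attaching a second in-memory database named caselaw is just one way to make the caselaw.precedents name resolve; none of this reflects how Catalyzed stores or executes queries.

```python
import sqlite3

# The join from the Quick Start, verbatim.
QUERY = """
    SELECT c.*, p.citation
    FROM contracts c
    JOIN caselaw.precedents p
      ON c.jurisdiction = p.jurisdiction
    WHERE c.type = 'employment'
"""

def run_portable_query():
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database so the schema-qualified name resolves.
    conn.execute("ATTACH DATABASE ':memory:' AS caselaw")
    # Hypothetical schemas and rows, invented for illustration only.
    conn.execute("CREATE TABLE contracts (id INTEGER, type TEXT, jurisdiction TEXT)")
    conn.execute("CREATE TABLE caselaw.precedents (citation TEXT, jurisdiction TEXT)")
    conn.executemany(
        "INSERT INTO contracts VALUES (?, ?, ?)",
        [(1, "employment", "CA"), (2, "lease", "CA"), (3, "employment", "NY")],
    )
    conn.executemany(
        "INSERT INTO caselaw.precedents VALUES (?, ?)",
        [("Example v. Sample (placeholder)", "CA")],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows

rows = run_portable_query()
print(rows)
```

Only the California employment contract comes back, because the inner join drops contracts whose jurisdiction has no matching precedent row.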

Technical Capabilities

What's under the hood

1 Catalyzed Data

500+ connectors

Databases, data lakes, S3, Google Sheets, CSVs, PDFs, and unstructured documents

Automatic indexing

Schema discovery and inference without manual configuration

Knowledge graph

Entities automatically linked across structured and unstructured data

Petabyte scale

Built for large-scale data workloads from day one

Billions of vectors

Native vector storage and similarity search at scale
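
To make "similarity search" concrete, here is a minimal brute-force sketch in plain Python: rank stored vectors by cosine similarity to a query vector and keep the top k. The function names and toy vectors are illustrative assumptions; a system holding billions of vectors would presumably rely on an approximate index rather than a linear scan like this one.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k=2):
    """Score every stored vector against the query and return the k best (id, score) pairs."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy two-dimensional "embeddings", invented for illustration.
vectors = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [1.0, 1.0],
}
nearest = top_k([1.0, 0.1], vectors, k=2)
print([doc_id for doc_id, _ in nearest])
```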

2 Catalyzed Orchestration

API/SDK-defined pipelines

Define workflows in code; we handle execution and scaling

Flexible routing

Route to our cloud, your cloud, external systems, or human reviewers

LLM integration

Any model provider with passthrough pricing: OpenAI, Anthropic, or local models

Human-in-the-loop

Built-in routing for expert review when confidence is low

Configuration-driven

Define once, run at scale with version-controlled configs
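
One way to picture configuration-driven routing with a human-in-the-loop fallback is sketched below. The config keys, the 0.8 threshold, and the destination names are assumptions made up for this example, not Catalyzed's actual pipeline schema; the point is only that the routing rule lives in data that can be version-controlled rather than in code.

```python
import json

# Hypothetical pipeline config; in practice this would live in a version-controlled file.
CONFIG_JSON = """
{
  "pipeline": "contract-review",
  "confidence_threshold": 0.8,
  "default_route": "automated_output",
  "low_confidence_route": "human_review"
}
"""

def route(result, config):
    """Return the destination for one model result based on its confidence."""
    if result["confidence"] < config["confidence_threshold"]:
        return config["low_confidence_route"]
    return config["default_route"]

config = json.loads(CONFIG_JSON)
results = [
    {"id": "r1", "confidence": 0.95},
    {"id": "r2", "confidence": 0.42},
]
routes = {r["id"]: route(r, config) for r in results}
print(routes)
```

Changing the threshold or a destination then becomes a reviewed config change rather than a code change.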

3 Catalyzed Control

Feedback capture

Structured API for capturing corrections, ratings, and annotations

Active learning

The system identifies edge cases and requests expert input

Decision audit trail

Every judgment logged and attributable

Model improvement hooks

Feedback data formatted for fine-tuning and evaluation

No ML expertise required

Domain experts teach the system through their normal workflow
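
As a sketch of how captured feedback might serve as both an audit trail and fine-tuning data, the example below stores each judgment with a reviewer and timestamp, then exports only the usable records as JSON Lines. The FeedbackRecord fields and the prompt/completion layout are assumptions for illustration; the rating and note fields mirror the Quick Start call, but the real export format is not specified here.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class FeedbackRecord:
    # Hypothetical record shape, not the SDK's actual schema.
    result_id: str
    reviewer: str        # who made the judgment (attribution)
    timestamp: str       # ISO 8601 time of the judgment
    rating: str          # e.g. "correct" or "incorrect"
    model_input: str
    model_output: str
    correction: Optional[str] = None
    note: str = ""

def audit_line(record):
    """Serialize one judgment as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)

def to_finetune_jsonl(records):
    """Keep confirmed outputs and expert corrections as prompt/completion pairs."""
    lines = []
    for rec in records:
        if rec.rating == "correct":
            target = rec.model_output
        elif rec.correction is not None:
            target = rec.correction
        else:
            continue  # rejected without a correction: no training target available
        lines.append(json.dumps({"prompt": rec.model_input, "completion": target}))
    return "\n".join(lines)

# Sample records, invented for illustration.
records = [
    FeedbackRecord("res-1", "reviewer-a", "2024-01-01T09:00:00Z", "correct",
                   "Clause text A", "Label A"),
    FeedbackRecord("res-2", "reviewer-b", "2024-01-01T09:05:00Z", "incorrect",
                   "Clause text B", "Label A", correction="Label B"),
    FeedbackRecord("res-3", "reviewer-a", "2024-01-01T09:10:00Z", "incorrect",
                   "Clause text C", "Label A"),
]
audit_log = [audit_line(r) for r in records]
finetune_jsonl = to_finetune_jsonl(records)
print(finetune_jsonl)
```

All three judgments land in the audit log, while only the confirmed output and the corrected one become training pairs.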

Security & Compliance

Built for regulated industries

SOC 2 Aligned

Controls in place, formal audit planned. Documentation available on request.

HIPAA Ready

BAA available for healthcare customers. PHI handling protocols in place.

GDPR Compliant

DPA available. Data residency options for EU customers.

Data Residency

Enterprise customers can specify where data lives and where queries execute.

Need to review our security posture? Request our security documentation →

Ready to build?

Explore the docs or talk to our engineering team about your use case.