Kosmos: Integrate OpenAI-Compatible LLM Providers
This article details a proposed enhancement to the Kosmos platform: support for multiple large language model (LLM) providers that are compatible with the OpenAI API. The change will increase Kosmos's versatility, accessibility, and user choice, and is based on community interest expressed in issue #1.
Background and Motivation
Currently, Kosmos is tightly integrated with Anthropic's Claude API, leading to several limitations. By supporting OpenAI-compatible providers, Kosmos aims to offer users greater cost flexibility, enhanced privacy options, provider independence, increased redundancy, access to specialized models, and the ability to create multi-model workflows.
Current State of Kosmos
Kosmos is currently tightly coupled to Anthropic's Claude API. Direct `anthropic` SDK dependencies are scattered across 28+ files. There is also Claude-specific prompt caching in `kosmos/core/claude_cache.py`, hard-coded imports such as `from kosmos.core.llm import ClaudeClient`, and configuration settings that exist only for Anthropic (`ANTHROPIC_API_KEY`, `CLAUDE_MODEL`).
Files Requiring Refactoring:
- Core (4 files): `llm.py`, `async_llm.py`, `claude_cache.py`, `config.py`
- Agents (4 files): `hypothesis_generator.py`, `research_director.py`, `data_analyst.py`, `literature_analyzer.py`
- Hypothesis tools (3 files): `refiner.py`, `prioritizer.py`, `testability.py`
- Utilities (3 files): `summarizer.py`, `code_generator.py`, `concept_extractor.py`
- Plus 14+ additional modules (28 total)
Key Benefits of Multi-Provider Support
Supporting OpenAI-compatible providers will unlock significant advantages:
- Cost Flexibility: Mix expensive and inexpensive models depending on the task, or even use free local models to optimize costs.
- Privacy Options: Run models locally for sensitive research, ensuring data privacy and compliance.
- Provider Independence: Easily switch between providers based on availability, pricing, or performance, avoiding vendor lock-in.
- Redundancy: Mitigate rate limits and service disruptions by having multiple providers available.
- Access to Specialized Models: Utilize code-specific, domain-specific, or fine-tuned models that better suit specific needs.
- Multi-Model Workflows: Compare results across different providers to improve accuracy and reliability.
Proposed Solution: A Phased Approach
To achieve these benefits, we propose implementing a provider-agnostic abstraction layer using a phased approach. This will minimize disruption and ensure a smooth transition.
Phase 1: Core Abstraction Layer
The initial phase focuses on creating a unified LLMProvider interface to abstract the underlying provider implementation. This involves defining a base class with methods for generating text, handling asynchronous requests, and managing structured data. This abstraction will be the foundation for supporting multiple LLM providers in Kosmos. The key is to ensure that all interactions with LLMs go through this layer, regardless of the specific provider being used.
```python
# kosmos/core/providers/base.py

class LLMProvider:
    """Unified interface for all LLM providers"""

    def generate(self, messages: list, model: str, temperature: float,
                 max_tokens: int, **kwargs) -> str:
        """Synchronous completion"""
        pass

    async def generate_async(self, messages: list, model: str, temperature: float,
                             max_tokens: int, **kwargs) -> str:
        """Async completion"""
        pass

    def generate_structured(self, messages: list, schema: dict, **kwargs) -> dict:
        """Generate structured JSON output"""
        pass

    def get_cost_estimate(self) -> dict:
        """Return usage and cost metrics"""
        pass
```
Implementation Steps:
- Create the `LLMProvider` base interface to define the standard methods for interacting with LLMs.
- Refactor the existing `ClaudeClient` into an `AnthropicProvider` to maintain backward compatibility and isolate Anthropic-specific logic.
- Implement an `OpenAIProvider` using the `openai` Python package to enable integration with OpenAI-compatible models (see the sketch after this list).
- Update all 28 files to import from the provider abstraction, ensuring all LLM interactions use the new interface.
- Add provider selection logic in `config.py` to allow users to choose their preferred provider.
- Update the documentation and `.env.example` to guide users on configuring different providers.
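As a rough sketch of the third step (not the final Kosmos API), a minimal `OpenAIProvider` could wrap the official `openai` v1 client as shown below. The `Config` field names (`openai_api_key`, `openai_base_url`, `openai_model`) and the module path are assumptions for illustration; error handling, retries, and cost tracking are omitted.

```python
# kosmos/core/providers/openai_provider.py (illustrative sketch; names are assumptions)
from openai import OpenAI

from kosmos.core.providers.base import LLMProvider


class OpenAIProvider(LLMProvider):
    def __init__(self, config):
        # base_url lets the same provider talk to Ollama, OpenRouter, LM Studio, etc.
        self.client = OpenAI(api_key=config.openai_api_key,
                             base_url=config.openai_base_url)
        self.default_model = config.openai_model

    def generate(self, messages: list, model: str = None, temperature: float = 0.7,
                 max_tokens: int = 4096, **kwargs) -> str:
        """Synchronous completion via the OpenAI-compatible chat completions endpoint."""
        response = self.client.chat.completions.create(
            model=model or self.default_model,
            messages=messages,  # e.g. [{"role": "user", "content": "..."}]
            temperature=temperature,
            max_tokens=max_tokens,
            **kwargs,
        )
        return response.choices[0].message.content

    # generate_async, generate_structured, and get_cost_estimate omitted for brevity.
```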
Phase 2: Configuration & Provider Registry
Phase 2 introduces a factory pattern for provider instantiation, driven by configuration settings. This allows users to easily switch between providers without modifying the code. The implementation involves reading configuration settings to determine which provider to instantiate and use throughout the Kosmos platform. This phase ensures that the selection of LLM providers is dynamic and easily configurable.
Example Configurations:
```
# Option 1: Use Claude (existing behavior - DEFAULT)
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...
CLAUDE_MODEL=claude-3-5-sonnet-20241022

# Option 2: Use OpenAI
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4-turbo
OPENAI_BASE_URL=https://api.openai.com/v1  # Optional

# Option 3: Use local Ollama
LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_MODEL=llama3.1:70b
OPENAI_API_KEY=ollama  # Dummy key for compatibility

# Option 4: Use OpenRouter (100+ models)
LLM_PROVIDER=openai
OPENAI_BASE_URL=https://openrouter.ai/api/v1
OPENAI_API_KEY=sk-or-...
OPENAI_MODEL=anthropic/claude-3.5-sonnet

# Option 5: Use LM Studio
LLM_PROVIDER=openai
OPENAI_BASE_URL=http://localhost:1234/v1
OPENAI_MODEL=local-model
OPENAI_API_KEY=lm-studio
```
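For illustration, the provider-selection settings above could surface in `config.py` roughly as follows. This is a sketch under assumed field names, not the final configuration schema.

```python
# kosmos/core/config.py (illustrative sketch; field names are assumptions)
import os


class Config:
    """Reads provider selection and credentials from the environment (.env)."""

    def __init__(self):
        # Which provider the factory should build; Claude remains the default.
        self.llm_provider = os.getenv("LLM_PROVIDER", "anthropic")
        # Anthropic settings (existing behavior)
        self.anthropic_api_key = os.getenv("ANTHROPIC_API_KEY", "")
        self.claude_model = os.getenv("CLAUDE_MODEL", "claude-3-5-sonnet-20241022")
        # OpenAI-compatible settings (OpenAI, OpenRouter, Ollama, LM Studio, ...)
        self.openai_api_key = os.getenv("OPENAI_API_KEY", "")
        self.openai_model = os.getenv("OPENAI_MODEL", "gpt-4-turbo")
        self.openai_base_url = os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1")
```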
Provider Factory Implementation:
```python
# kosmos/core/providers/factory.py
# Config and concrete-provider module paths are assumed here for illustration.
from kosmos.core.config import Config
from kosmos.core.providers.base import LLMProvider
from kosmos.core.providers.anthropic import AnthropicProvider
from kosmos.core.providers.openai import OpenAIProvider


def get_provider(config: Config) -> LLMProvider:
    provider_type = config.llm_provider.lower()
    if provider_type == "anthropic":
        return AnthropicProvider(config)
    elif provider_type == "openai":
        return OpenAIProvider(config)
    else:
        raise ValueError(f"Unknown provider: {provider_type}")
```
Cache Migration:
- Refactor `claude_cache.py` into `llm_cache.py` for provider-agnostic caching, ensuring compatibility with various LLM providers (a key-generation sketch follows this list).
- Maintain cache key compatibility for Claude to preserve the existing cache, minimizing data loss during the transition.
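One hedged approach to a provider-agnostic key is to hash everything that affects a completion: provider name, model, messages, and generation parameters. The helper below is hypothetical and is not the existing `claude_cache.py` key format; preserving old Claude cache entries would additionally require a legacy lookup path.

```python
# kosmos/core/llm_cache.py (illustrative sketch; the key scheme is an assumption)
import hashlib
import json


def cache_key(provider: str, model: str, messages: list, **params) -> str:
    """Deterministic cache key derived from everything that affects the completion."""
    payload = json.dumps(
        {"provider": provider, "model": model, "messages": messages, "params": params},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```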
Supported OpenAI-Compatible Providers
The implementation will support a wide range of OpenAI-compatible providers, including:
- Official: OpenAI, Azure OpenAI
- Aggregators: OpenRouter (100+ models), Together AI, Anyscale
- Local: Ollama, LM Studio, LocalAI, Jan, GPT4All
- Self-hosted: vLLM, Text Generation Inference (TGI), FastChat
- Specialized: Groq (ultra-fast), DeepInfra, Fireworks AI
Success Criteria
To ensure the successful implementation of multi-provider support, the following criteria must be met:
- [ ] All existing Claude functionality preserved (100% backward compatible).
- [ ] OpenAI provider passes all existing test suites, ensuring compatibility and reliability.
- [ ] At least one local model provider (Ollama) verified working to confirm local model support.
- [ ] Configuration-driven provider switching (zero code changes) for ease of use.
- [ ] Documentation updated with provider setup guides to assist users in configuring different providers.
- [ ] Cost tracking works across all providers, enabling accurate cost management.
- [ ] Provider-agnostic caching implemented for efficient resource utilization.
Non-Goals (Out of Scope)
The following features are valuable but will be addressed in separate issues after the core functionality is stable:
- Provider-specific advanced features (Claude artifacts, OpenAI function calling).
- Automatic task-based model routing (cheap models for summarization, expensive for reasoning).
- Fallback chains (auto-retry with alternative provider on failure).
- Automatic model capability detection.
- Provider performance benchmarking.
- Multi-provider ensemble/voting.
These advanced features can be addressed in future enhancements once the core abstraction layer is stable and well-tested.
Community Questions
Before proceeding with the implementation, we would like to gather feedback from the community on the following questions:
- Priority Providers: Which OpenAI-compatible providers are most valuable to you?
  - [ ] OpenAI official
  - [ ] OpenRouter (aggregates 100+ models)
  - [ ] Local models (Ollama, LM Studio)
  - [ ] Azure OpenAI
  - [ ] Together AI / Anyscale
  - [ ] Other:
- Migration Strategy: Should we maintain both the old (`ClaudeClient`) and new APIs temporarily for a smoother migration?
- Default Behavior: Should Claude remain the default provider (backward compatibility), or require explicit configuration?
Your input will help us prioritize and refine the implementation to best meet the needs of the Kosmos community.
Estimated Effort
The estimated effort for this enhancement is Medium-Large: approximately 2-3 weeks for comprehensive implementation.
- Phase 1: 1-1.5 weeks
- Phase 2: 0.5-1 week
- Testing & documentation: 0.5 week
References
- Original request: Issue #1 (comment by @WangRentu)
- OpenAI API specification: https://platform.openai.com/docs/api-reference
- OpenRouter documentation: https://openrouter.ai/docs
- Ollama OpenAI compatibility: https://ollama.com/blog/openai-compatibility