Configuring Your Inference Provider
Social Inference Engine supports four LLM providers: OpenAI, Anthropic, Ollama, and vLLM. All are configured via environment variables. The two-tier routing system applies regardless of provider.
Two-Tier Routing
The routing decision is deterministic: the tier is selected by the signal type, not by sampling or probability.
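The deterministic rule can be sketched as a pure lookup on the signal type. This is an illustration only: the signal-type names below are hypothetical placeholders, not the engine's real schema; the source only says that three critical signal types go to the frontier tier.

```python
# Hypothetical sketch of deterministic two-tier routing.
# Signal-type names are placeholders, not the engine's actual schema.
FRONTIER_SIGNAL_TYPES = {"critical_a", "critical_b", "critical_c"}

def select_tier(signal_type: str) -> str:
    """Route purely on signal type: no sampling, no probability."""
    return "frontier" if signal_type in FRONTIER_SIGNAL_TYPES else "non_frontier"
```

Because the mapping is a set-membership test, the same signal type always lands on the same tier across runs.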
OpenAI
Hosted

| Variable | Example | Required |
| --- | --- | --- |
| `OPENAI_API_KEY` | `sk-…` | required |
| `FRONTIER_MODEL` | `gpt-4o` | optional |
| `NON_FRONTIER_MODEL` | `gpt-4o-mini` | optional |
| `FINE_TUNED_MODEL_ID` | `ft:gpt-4o-mini:…` | optional |

Set `FRONTIER_MODEL=gpt-4o` for the three critical signal types. Set `FINE_TUNED_MODEL_ID` to a fine-tuned GPT-4o mini model ID to activate the non-frontier tier. If `FINE_TUNED_MODEL_ID` is not set, the system falls back to the base `NON_FRONTIER_MODEL`.
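The fallback behaviour described above can be sketched as a small resolver; this is a minimal illustration, not the engine's actual config loader, and the `gpt-4o-mini` default is an assumption taken from the table:

```python
import os

def resolve_non_frontier_model() -> str:
    """Prefer the fine-tuned model ID; otherwise fall back to the
    base NON_FRONTIER_MODEL (assumed default: gpt-4o-mini)."""
    fine_tuned = os.environ.get("FINE_TUNED_MODEL_ID")
    if fine_tuned:
        return fine_tuned
    return os.environ.get("NON_FRONTIER_MODEL", "gpt-4o-mini")
```

An empty or unset `FINE_TUNED_MODEL_ID` silently falls through to the base model, so a failed fine-tuning run never breaks routing.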
Anthropic
Hosted

| Variable | Example | Required |
| --- | --- | --- |
| `ANTHROPIC_API_KEY` | `sk-ant-…` | required |
| `ANTHROPIC_FRONTIER_MODEL` | `claude-3-5-sonnet-20241022` | optional |
| `ANTHROPIC_NON_FRONTIER_MODEL` | `claude-3-5-haiku-20241022` | optional |

Set `OPENAI_API_KEY=""` and `ANTHROPIC_API_KEY="sk-ant-…"` to route all inference to Anthropic. Two-tier routing is available: Sonnet serves the frontier tier and Haiku the non-frontier tier.
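An Anthropic-only configuration might look like the following `.env` fragment (the key value is a placeholder; model names are the defaults from the table above):

```shell
# .env: route all inference to Anthropic
OPENAI_API_KEY=""                                          # blank disables OpenAI routing
ANTHROPIC_API_KEY="sk-ant-…"                               # your real key here
ANTHROPIC_FRONTIER_MODEL="claude-3-5-sonnet-20241022"      # frontier tier
ANTHROPIC_NON_FRONTIER_MODEL="claude-3-5-haiku-20241022"   # non-frontier tier
```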
Ollama
Local · Zero egress

| Variable | Example | Required |
| --- | --- | --- |
| `LOCAL_LLM_URL` | `http://localhost:11434` | required |
| `LOCAL_LLM_MODEL` | `llama3.1:8b` | required |

With `LOCAL_LLM_URL` and `LOCAL_LLM_MODEL` set, all inference routes to Ollama and no observation text ever reaches an external network. Performance: 3–12 s per signal on Apple M-series hardware with llama3.1:8b. Run `ollama pull llama3.1:8b` before starting the API.
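A complete local setup, as a sketch (the `.env` values mirror the table above; the `ollama pull` step is the one named in the text):

```shell
# Fetch the model weights locally before starting the API
ollama pull llama3.1:8b

# .env: fully local inference, zero egress
LOCAL_LLM_URL="http://localhost:11434"
LOCAL_LLM_MODEL="llama3.1:8b"
```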
vLLM
Self-hosted

| Variable | Example | Required |
| --- | --- | --- |
| `VLLM_ENDPOINT` | `http://your-vllm-host:8000/v1` | required |
| `VLLM_MODEL` | `Meta-Llama-3.1-8B-Instruct` | optional |

vLLM exposes an OpenAI-compatible API. Set `VLLM_ENDPOINT` to the base URL of your vLLM server. Ideal for high-throughput self-hosted deployments on A100/H100 GPUs. Latency: 0.2–0.8 s per signal at full GPU utilisation.
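If you also operate the vLLM server yourself, one way to launch its OpenAI-compatible endpoint is vLLM's standard entrypoint; the host name below is a placeholder, and the Hugging Face model path is an assumption based on the model name in the table:

```shell
# On the GPU host: serve an OpenAI-compatible API on port 8000
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Meta-Llama-3.1-8B-Instruct \
    --port 8000

# .env on the engine side: point at that server's /v1 base URL
VLLM_ENDPOINT="http://your-vllm-host:8000/v1"
VLLM_MODEL="Meta-Llama-3.1-8B-Instruct"
```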
Fine-tuning the non-frontier tier
The training directory contains a full fine-tuning pipeline for GPT-4o mini. Running the fine-tuning pipeline produces a model ID that you set as FINE_TUNED_MODEL_ID.
```shell
# Prepare training data
python training/prepare_training_data.py

# Run fine-tuning job (requires OPENAI_API_KEY)
python training/fine_tune.py --base-model gpt-4o-mini

# The script prints: FINE_TUNED_MODEL_ID=ft:gpt-4o-mini:org:...
# Add that value to your .env
```

Full training guide →