Running Claude Code with a Local Model
Claude Code is a coding assistant that can work with your own inference server. The default setup points to Anthropic's cloud, which may not suit everyone.
Why use a local setup?
- Keeps your code on your own hardware.
- Works without an internet connection.
- Gives you full control over the model version.
Automation script
The following bash script prepares the environment, checks the server status, selects the loaded model, and starts Claude Code with the correct endpoint.
#!/usr/bin/env bash
# Load environment variables
export CLAUDE_CODE_ENDPOINT="http://localhost:8000/v1"
export CLAUDE_CODE_MODEL="opencode"
# Verify server is reachable
if ! curl -s -o /dev/null "$CLAUDE_CODE_ENDPOINT"; then
    echo "Inference server not reachable at $CLAUDE_CODE_ENDPOINT" >&2
    exit 1
fi
# Detect loaded model (example using a health endpoint that returns a "model: <name>" line)
# Note: grep -P requires GNU grep built with PCRE support
MODEL=$(curl -s "$CLAUDE_CODE_ENDPOINT/health" | grep -oP '(?<=model: ).*')
if [ -n "$MODEL" ]; then
    export CLAUDE_CODE_MODEL="$MODEL"
fi
# Launch Claude Code
claude-code --endpoint "$CLAUDE_CODE_ENDPOINT" --model "$CLAUDE_CODE_MODEL"
Save the script as run_claude_local.sh, make it executable (chmod +x run_claude_local.sh), and run it whenever you want to use the local setup.
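The model-detection step in the script relies on grep -oP, which needs GNU grep with PCRE support and is not available in the BSD grep that ships with macOS. The following is a minimal sketch of a more portable alternative using sed. It assumes the health endpoint returns a plain-text line of the form "model: <name>", which is what the script implies; your server's actual response may differ (for example, JSON). The function name extract_model is hypothetical and chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a health response on stdin and print the model name.
# Assumption: the response contains a plain-text line such as "model: my-model".
# Unlike the grep version, this stops at the first whitespace after the name
# and prints only the first match.
extract_model() {
    sed -n 's/^.*model: *\([^[:space:]]*\).*$/\1/p' | head -n 1
}

# Example usage inside the script (left commented out so the sketch has no side effects):
# MODEL=$(curl -s "$CLAUDE_CODE_ENDPOINT/health" | extract_model)
```

If no matching line is present, the function prints nothing, so the existing [ -n "$MODEL" ] check keeps the default model unchanged.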
Getting started
- Install an inference server that supports the desired model.
- Set the required environment variables (the script does this for you).
- Run the script and start coding.
This small helper removes repetitive steps and lets you focus on development.
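The second step in the list above can also be verified explicitly before launch, which helps if the script is edited later and an export is removed by accident. This is a minimal sketch, assuming the two variable names used by the script (CLAUDE_CODE_ENDPOINT and CLAUDE_CODE_MODEL). The function require_env is a hypothetical helper, not part of Claude Code.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return non-zero if any named variable is unset or empty.
require_env() {
    local missing=0 name value
    for name in "$@"; do
        # Indirect lookup via eval; callers pass variable names, not values.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing required variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage near the top of the script (commented out to avoid side effects):
# require_env CLAUDE_CODE_ENDPOINT CLAUDE_CODE_MODEL || exit 1
```

The function reports every missing variable rather than stopping at the first, so a single run shows everything that needs fixing.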