Test-drive ANCHOR with Azure OpenAI¶

This is the fastest safe path to running ANCHOR against a private Azure OpenAI endpoint, so your documents are structured inside your own Azure tenant and never touch public OpenAI. Every step is verifiable before you ingest anything sensitive.

What stays local, what leaves¶

With the Azure provider and the default local embedding model:

Never leaves the host: the raw PDFs (bronze/), per-page text and PNGs (silver/), the structured regions (gold/), and the embedding vectors.
Sent only to your Azure endpoint: rendered page images + text, for the gold region-extraction and polish stages.

If you pick a remote embedding model (text-embedding-3-small/large), page text is also sent to your Azure endpoint for embeddings. Keep the default bge-small (local) to avoid that.

Prerequisite: create and note your deployment(s)¶

ANCHOR calls Azure by deployment name, not model name. Before configuring it:

In the Azure AI Foundry portal, deploy a vision-capable chat model (for example gpt-4o). Copy the deployment name you chose. It may differ from the model name.
If (and only if) you want remote embeddings, also deploy text-embedding-3-small and copy that deployment name.
Note your resource endpoint: https://<resource>.openai.azure.com/.

The deployment name is what you give ANCHOR as the model. If it does not exist, the first ingest fails with a deployment-not-found error. anchor check --probe (below) catches that up front.

Gold extraction checklist¶

Before expecting anchor ingest or PDF upload to produce gold regions, check all five items:

provider = "azure" is set in anchor.toml.
openai_base_url points at https://<resource>.openai.azure.com/openai/v1/.
region_model is the Azure deployment name for a vision-capable chat model.
ANCHOR_OPENAI_API_KEY is set to the Azure resource key.
You run ANCHOR from the project folder, or set ANCHOR_CONFIG explicitly.

If the key is missing, ANCHOR still creates bronze and silver data, but no keyed vision region extractor is wired and the document will not get gold regions. If the endpoint or deployment name is wrong, the Azure call fails during ingest.

1. Install¶

uv tool install anchor-kb
anchor install claude-code      # register the MCP for the default env

2. Create the environment¶

anchor env create work \
  --provider azure \
  --base-url https://<resource>.openai.azure.com/openai/v1/ \
  --vision-model <vision-deployment-name> \
  --embed-model text-embedding-3-small

This creates env "work" and its default project. Give your vision deployment name as --vision-model. Drop --embed-model to keep the embedding default (bge-small, local) unless you deployed an embeddings model. anchor env create:

appends /openai/v1/ if you pass the bare resource URL,
offers to save your API key to a gitignored .env (never the env.toml),
prints exactly what to do next.

The endpoint ANCHOR uses is Azure's OpenAI-compatible v1 surface (https://<resource>.openai.azure.com/openai/v1/), with the key passed as the client's api_key. This is Microsoft's documented pattern for the v1 API.

3. Provide the key (kept out of the committed config)¶

If anchor env create did not capture it:

echo 'ANCHOR_OPENAI_API_KEY=<your-azure-key>' >> .env

A personal OPENAI_API_KEY in your shell is not the right credential for Azure. Use ANCHOR_OPENAI_API_KEY for Azure projects. If you already have a personal OPENAI_API_KEY in your shell, do not treat that as proof the Azure project is configured.

4. Verify before ingesting¶

anchor check --env work          # offline: data zone, endpoint shape, key present?
anchor check --env work --probe  # one tiny call: confirms the deployment + key work

--probe sends a one-token prompt (no document content) to confirm the chat deployment (and the embedding deployment, if remote) resolve and authenticate. anchor check exits non-zero when something would break, so you can trust a clean run before sending real documents.

5. Start a project and open the canvas¶

Run anchor init inside a working folder to start a project bound to the work environment, then ingest into it:

cd ~/work/pumps
anchor init --env work
anchor ingest path/to/datasheet.pdf
anchor serve                    # open the printed http://127.0.0.1:PORT

Verify the ingest:

anchor list
anchor gold-map <slug>

In anchor list, the document should show "has_gold": true and a non-zero region_count. anchor gold-map <slug> should print page regions with bboxes and crop paths.

If anchor serve reports a different port than expected, another server already holds the default. Open the URL it prints, not a remembered one.

6. Drive it from an agent¶

Open your agent (Claude Code, Cursor, and similar tools) in the project folder. The anchor-mcp server inherits that folder and resolves this project, so the agent reads and writes the same knowledge base. No per-project reinstall.

Troubleshooting¶

Symptom	Cause	Fix
`anchor list` shows `"has_gold": false`	no keyed vision region extraction ran	check `ANCHOR_OPENAI_API_KEY`, `openai_base_url`, and `region_model`; run `anchor check --env work --probe`
`DeploymentNotFound` / 404 on ingest	model name is not a real deployment	use the deployment name; re-run `anchor check --env work --probe`
401 / auth error	wrong or missing key	set `ANCHOR_OPENAI_API_KEY` (not `OPENAI_API_KEY`)
calls hit `api.openai.com`	endpoint not set	`anchor check --env work` shows the resolved endpoint; re-run `anchor env create`
endpoint 404 on every call	missing `/openai/v1/`	`anchor check --env work --fix`