Heratio Help Center article. Category: AI Services.
AI Quotas, Cost Tracking, Translation Memory, and Custom NER
Issue: #667 Phase 1; Category: AI Services / Administration; Audience: Administrators
What this is
Four operator-facing controls that sit on top of every gated AI service in Heratio:
- Quotas - cap daily and monthly call volume per tenant, per service.
- Cost dashboard - per-tenant per-service inference cost in USD, with a recent-100 call ledger.
- Translation memory - cached translations indexed by source hash. A cache hit skips the inference dispatch entirely.
- Custom NER entities - operator-curated gazetteer that runs as a pre-pass before the ML model.
A fifth surface, a face-detect driver status page, is included for parity but is currently bound to a null driver.
Where to find it
| Page | URL | Purpose |
|---|---|---|
| Quotas | /admin/ai/services/quotas | List + edit per-tenant per-service caps |
| Cost dashboard | /admin/ai/services/cost | Per-service cost in USD; recent-100 call ledger |
| Translation memory | /admin/ai/services/translation-memory | Browse cached translations; delete stale entries |
| Custom NER entities | /admin/ai/services/ner-entities | CRUD on the operator gazetteer |
| Face detect | /admin/ai/services/face-detect | Driver status + health probe |
All five pages require admin login.
Setting a quota
On /admin/ai/services/quotas:
- Pick a tenant ID (use 0 to set the global default that applies to every tenant without an explicit row).
- Pick a service: llm, ner, htr, donut, translate, spellcheck, or face_detect.
- Enter a daily limit and a monthly limit. 0 means unlimited.
- Pick a reset day of the month (1-28) for the monthly window. Day 1 = calendar month.
- Save. The next inference call against that tenant + service will be counted against the new limit.
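One plausible reading of the reset-day rule above, sketched in Python purely for illustration (Heratio's own implementation lives in the PHP services listed under Related and may differ). The function name `monthly_window_start` is hypothetical, and the rule that a date before the reset day belongs to the window opened in the previous month is an assumption. The 1-28 range means the chosen day exists in every month.

```python
from datetime import date


def monthly_window_start(today: date, reset_day: int) -> date:
    """Return the first day of the monthly quota window that contains `today`.

    Assumption: the window opens on `reset_day` and runs until the day
    before the next `reset_day`.
    """
    if not 1 <= reset_day <= 28:
        raise ValueError("reset_day must be between 1 and 28")
    if today.day >= reset_day:
        # The window opened earlier this month (or today).
        return today.replace(day=reset_day)
    # Before the reset day: the window opened in the previous month.
    if today.month == 1:
        return date(today.year - 1, 12, reset_day)
    return date(today.year, today.month - 1, reset_day)
```

With a reset day of 1, the window start is always the first of the current month, which matches "Day 1 = calendar month".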
When a limit is hit, the AI service throws QuotaExceededException instead of dispatching to the model. The user sees a clear error; no call is dropped silently.
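The gate described above can be sketched roughly as follows. This is an illustrative Python model, not the QuotaService code: the `Quota` type, the `quotas` dictionary, and the function names are hypothetical, and it assumes a call is rejected once the counter has reached the limit. Only the tenant-0 fallback, the "0 means unlimited" rule, and the exception name come from the article.

```python
from dataclasses import dataclass


class QuotaExceededException(Exception):
    """Stand-in for the exception named in the article."""


@dataclass(frozen=True)
class Quota:
    daily_limit: int    # 0 means unlimited
    monthly_limit: int  # 0 means unlimited


GLOBAL_TENANT = 0  # tenant ID 0 holds the global default row


def resolve_quota(quotas, tenant_id, service):
    """Prefer the tenant's explicit row, else fall back to the tenant-0 default."""
    quota = quotas.get((tenant_id, service))
    if quota is None:
        quota = quotas.get((GLOBAL_TENANT, service))
    return quota


def check_quota(quotas, tenant_id, service, calls_today, calls_this_month):
    """Raise QuotaExceededException if either window is already used up."""
    quota = resolve_quota(quotas, tenant_id, service)
    if quota is None:
        return  # assumption: no row at all means nothing to enforce
    if quota.daily_limit and calls_today >= quota.daily_limit:
        raise QuotaExceededException(f"daily limit reached for {service}")
    if quota.monthly_limit and calls_this_month >= quota.monthly_limit:
        raise QuotaExceededException(f"monthly limit reached for {service}")
```

In this model a tenant with its own row is judged only by that row, so setting its daily limit to 0 lifts the daily cap even if the global default has one.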
Reading the cost dashboard
On /admin/ai/services/cost:
- The three summary cards show total USD spent, total calls, and total tokens (in + out) over the selected window.
- The By service table breaks the totals down per service.
- The Recent calls table shows the 100 most recent dispatches with model, tokens, duration, cost, and request ID.
- The Model pricing reference table at the bottom is the source of truth for the per-call USD figure.
Filter by tenant ID and by a "since" timestamp using the form at the top.
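To make the relationship between the pricing table, the summary cards, and the By service table concrete, here is a small Python sketch. It assumes per-call cost is derived from input and output token counts multiplied by a per-model price; the model name, the prices, the per-million-token unit, and all function names are hypothetical and do not reflect Heratio's actual pricing data or CostService code.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical prices in USD per million tokens: (input, output).
PRICING = {"example-model": (Decimal("0.50"), Decimal("1.50"))}


@dataclass(frozen=True)
class Call:
    service: str
    model: str
    tokens_in: int
    tokens_out: int


def call_cost(call: Call) -> Decimal:
    """Per-call USD figure derived from the pricing table."""
    price_in, price_out = PRICING[call.model]
    return (call.tokens_in * price_in + call.tokens_out * price_out) / Decimal(1_000_000)


def summarize(calls):
    """Return (totals, by_service), mirroring the summary cards and the breakdown table."""
    by_service = defaultdict(lambda: {"calls": 0, "tokens": 0, "usd": Decimal("0")})
    for call in calls:
        row = by_service[call.service]
        row["calls"] += 1
        row["tokens"] += call.tokens_in + call.tokens_out
        row["usd"] += call_cost(call)
    totals = {
        "calls": sum(r["calls"] for r in by_service.values()),
        "tokens": sum(r["tokens"] for r in by_service.values()),
        "usd": sum((r["usd"] for r in by_service.values()), Decimal("0")),
    }
    return totals, dict(by_service)
```

Decimal is used here so that small per-call amounts add up without floating-point drift.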
Translation memory
Every translation dispatch is keyed by sha256(source_text + source_lang + target_lang). When the same source comes back, the cached target text is returned and hit_count is incremented. The TM page lets you:
- Filter by target language or substring.
- See provenance (machine, human, gateway, mzansilm), confidence, hit count, last used.
- Delete a stale or wrong entry, which forces a fresh dispatch the next time that source is translated.
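The lookup-then-dispatch flow can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the TranslationMemoryService code: the in-memory dictionary, the class and function names, UTF-8 encoding, and plain concatenation without separators are all assumptions layered on the sha256(source_text + source_lang + target_lang) key given above.

```python
import hashlib


def tm_key(source_text: str, source_lang: str, target_lang: str) -> str:
    """sha256 over the concatenated source text and language codes."""
    payload = source_text + source_lang + target_lang
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TranslationMemory:
    """In-memory stand-in for the translation memory table."""

    def __init__(self):
        self._entries = {}

    def lookup(self, source_text, source_lang, target_lang):
        entry = self._entries.get(tm_key(source_text, source_lang, target_lang))
        if entry is None:
            return None
        entry["hit_count"] += 1  # every cache hit is counted
        return entry["target_text"]

    def store(self, source_text, source_lang, target_lang, target_text):
        key = tm_key(source_text, source_lang, target_lang)
        self._entries[key] = {"target_text": target_text, "hit_count": 0}

    def delete(self, source_text, source_lang, target_lang):
        self._entries.pop(tm_key(source_text, source_lang, target_lang), None)

    def hit_count(self, source_text, source_lang, target_lang):
        entry = self._entries.get(tm_key(source_text, source_lang, target_lang))
        return 0 if entry is None else entry["hit_count"]


def translate(tm, dispatch, text, source_lang, target_lang):
    """Return the cached translation if present, else dispatch and cache the result."""
    cached = tm.lookup(text, source_lang, target_lang)
    if cached is not None:
        return cached  # cache hit: the inference dispatch is skipped
    result = dispatch(text, source_lang, target_lang)
    tm.store(text, source_lang, target_lang, result)
    return result
```

Deleting an entry in this model has the same effect the article describes: the next request for that source misses the cache and goes back to the model.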
Custom NER entities
NER's ML extractor is good at common labels but blind to domain-specific ones - project codenames, micro-locations, in-house acronyms. The gazetteer page lets you list each entity once and have NerService find it deterministically on every extraction.
- Type - the bucket the label belongs to. Use person, organization, place, date, or anything else (custom types land in a customs bucket).
- Label - the canonical name.
- Aliases - one per line. Each is matched case-insensitively as a substring.
- Target URI - optional link to an authority record (Wikidata, VIAF, Geonames, etc.).
- Active - off = no longer in the pre-pass.
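The fields above map naturally onto a deterministic pre-pass. The Python sketch below is illustrative only and is not NerGazetteerService: the `GazetteerEntry` type, the function name, and the shape of the returned hits are hypothetical, and matching the label itself in addition to its aliases is an assumption the article does not confirm.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GazetteerEntry:
    type: str
    label: str
    aliases: List[str] = field(default_factory=list)
    target_uri: Optional[str] = None
    active: bool = True


def gazetteer_prepass(text: str, entries: List[GazetteerEntry]) -> List[dict]:
    """Return one hit per active entry whose label or any alias occurs in `text`."""
    haystack = text.casefold()
    hits = []
    for entry in entries:
        if not entry.active:
            continue  # inactive entries are left out of the pre-pass
        # Assumption: the canonical label is matched alongside the aliases.
        for candidate in [entry.label, *entry.aliases]:
            needle = candidate.strip().casefold()
            if needle and needle in haystack:  # case-insensitive substring match
                hits.append({
                    "type": entry.type,
                    "label": entry.label,
                    "matched": candidate.strip(),
                    "target_uri": entry.target_uri,
                })
                break  # one hit per entry is enough
    return hits
```

Because matching is by substring, a very short alias can match inside a longer word, so longer and more distinctive aliases are the safer choice.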
What happens if a service is broken
Everything in this phase fails soft. If the database is unreachable or the schema is missing, the quota gate logs a warning and lets the call through; the cost ledger silently skips logging; the TM lookup returns null. Inference still works. The only exception that propagates is QuotaExceededException itself, which is the deliberate signal that a limit has been hit.
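The fail-soft behaviour can be pictured as a wrapper around each dispatch. The following Python sketch is a simplified model, not Heratio code: `guarded_dispatch` and its four callable parameters are hypothetical, and real storage failures would be narrower exception types than the bare `Exception` caught here.

```python
import logging

logger = logging.getLogger("ai_services_sketch")


class QuotaExceededException(Exception):
    """Stand-in for the exception named in the article."""


def guarded_dispatch(check_quota, tm_lookup, dispatch, log_cost):
    """Run one inference call with fail-soft quota, cache, and ledger steps."""
    try:
        check_quota()
    except QuotaExceededException:
        raise  # the one deliberate signal that must reach the caller
    except Exception as exc:
        # Database or schema trouble: warn and let the call through.
        logger.warning("quota check unavailable, allowing call: %s", exc)

    try:
        cached = tm_lookup()
    except Exception:
        cached = None  # a failed lookup behaves like a cache miss
    if cached is not None:
        return cached

    result = dispatch()

    try:
        log_cost(result)
    except Exception:
        pass  # the ledger write is skipped without surfacing an error
    return result
```

The ordering matters in this model: the quota check runs first so that a genuine limit stops the call before any cache or model work happens.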
Related
- Reference: /docs/reference/ai-services-phase-1-quotas-cost-tm-ner.md
- GitHub issue: ArchiveHeritageGroup/heratio#667
- Source: packages/ahg-ai-services/src/Services/{QuotaService,CostService,NerGazetteerService,TranslationMemoryService}.php