Trust & transparency

How Modeldex data works today

Modeldex combines structured catalog data, benchmark ingestion, endpoint telemetry, provider news feeds, editorial updates, and community input. This page explains where those layers come from, how often they refresh, and what the trust labels on model, provider, benchmark, and community surfaces actually mean.

Freshness is surface-specific

Modeldex does not use one global “last updated” meaning. Benchmark surfaces prefer the latest measured result date, provider news prefers feed and article recency, and model/provider facts use the catalog timestamps produced by sync jobs or editorial updates.

Source labels matter

Benchmarks distinguish verified vs self-reported results, and provider news distinguishes official RSS from Google News fallback when an official feed is missing or broken.

Community and editorial layers stay visible

Comments, ratings, collections, and admin-authored changelog/news entries are valuable context, but they should not be read as carrying the same trust level as benchmark ingestion or structured provider/model metadata.

Useful now, still productizing

The public API and trust surfaces are already live, but some deeper citation, correction, and transparency workflows are still on the roadmap.

Source layers and freshness cadence

Different parts of Modeldex move at different speeds. The goal is not to hide that complexity, but to make it legible enough that users can understand whether they are looking at a curated profile, an auto-synced benchmark/result layer, or live community commentary.

Heavy sync every 6 hours

Catalog facts: models, providers, capabilities, and pricing-adjacent metadata

Structured catalog + worker sync

Core model/provider fields are stored in the Modeldex catalog and refreshed by worker jobs. This covers provider/model metadata, discovery, capabilities, and many of the timestamps used by public detail pages.

Inputs

  • LiteLLM-derived pricing/catalog fetch reused by PricingSyncJob + ModelMetaSyncJob
  • Provider/model editorial fields curated in the catalog/admin surfaces
  • Model discovery from provider APIs plus OpenRouter coverage
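A worker sync over these inputs has to merge freshly fetched catalog fields without clobbering editorially curated values. The sketch below shows one way that merge could work; the `editorial_overrides` convention and field names are assumptions for illustration, not the actual PricingSyncJob/ModelMetaSyncJob implementation.

```python
def merge_catalog(stored: dict, fetched: dict) -> dict:
    """Merge synced fields into a stored record, letting editorial fields win."""
    overrides = set(stored.get("editorial_overrides", []))
    merged = dict(stored)
    for key, value in fetched.items():
        if key not in overrides:  # editorially curated fields are never overwritten
            merged[key] = value
    return merged

stored = {
    "context_window": 128_000,
    "description": "Curated description",
    "editorial_overrides": ["description"],
}
fetched = {"context_window": 200_000, "description": "Auto-synced description"}
merged = merge_catalog(stored, fetched)
assert merged["context_window"] == 200_000   # synced value accepted
assert merged["description"] == "Curated description"  # editorial value preserved
```

This kind of split is what lets auto-synced pricing/capability data and curated editorial fields coexist on the same detail page.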

Heavy sync every 6 hours

Benchmarks and result freshness

Mixed: verified + self-reported benchmark evidence

Benchmark pages and model benchmark tables prefer the latest measured result when available, then fall back to benchmark timestamps. Evidence mix is shown separately so third-party leaderboard coverage is not conflated with vendor-reported scores.

Inputs

  • Papers with Code ingestion
  • Aider Polyglot benchmark ingestion
  • Arena Hard benchmark ingestion
  • llm-stats benchmark ingestion for additional leaderboard families
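The "prefer latest measured result, keep the evidence mix separate" rule described above can be sketched like this. Result keys and label strings are illustrative assumptions, not the ingestion pipeline's real schema.

```python
def summarize_results(results: list[dict]) -> dict:
    """Pick the freshest measured result and tally evidence types separately."""
    dated = [r for r in results if r.get("measured_at")]
    # ISO-8601 date strings sort chronologically, so max() finds the latest.
    latest = max(dated, key=lambda r: r["measured_at"]) if dated else None
    mix = {"verified": 0, "self_reported": 0}
    for r in results:
        mix[r["evidence"]] += 1
    return {"latest": latest, "evidence_mix": mix}

results = [
    {"score": 71.2, "evidence": "verified", "measured_at": "2024-05-01"},
    {"score": 74.0, "evidence": "self_reported", "measured_at": "2024-06-01"},
]
summary = summarize_results(results)
assert summary["latest"]["score"] == 74.0
assert summary["evidence_mix"] == {"verified": 1, "self_reported": 1}
```

Note that the freshest result here is self-reported; surfacing the mix alongside the headline number is what keeps that distinction visible.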

Heavy sync every 6 hours

Endpoint telemetry and routing data

OpenRouter-derived infrastructure signal

Endpoint latency, TTFT, throughput, and tool/JSON support are derived from OpenRouter-facing telemetry and catalog data. These are strong operational hints, but they are not the same thing as a direct first-party provider SLA.

Inputs

  • OpenRouter model catalog coverage
  • OpenRouter endpoint stats and capability fields
  • Daily pricing snapshot history for longitudinal comparisons
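Operational hints like these are typically reductions over telemetry snapshots rather than single readings. A minimal sketch, assuming hypothetical snapshot fields (not OpenRouter's actual schema):

```python
import statistics

def endpoint_hints(snapshots: list[dict]) -> dict:
    """Reduce telemetry snapshots to median latency / TTFT / throughput hints."""
    return {
        "p50_latency_ms": statistics.median(s["latency_ms"] for s in snapshots),
        "p50_ttft_ms": statistics.median(s["ttft_ms"] for s in snapshots),
        "p50_throughput_tps": statistics.median(s["throughput_tps"] for s in snapshots),
    }

snapshots = [
    {"latency_ms": 820, "ttft_ms": 310, "throughput_tps": 44.0},
    {"latency_ms": 910, "ttft_ms": 280, "throughput_tps": 51.0},
    {"latency_ms": 760, "ttft_ms": 330, "throughput_tps": 47.5},
]
hints = endpoint_hints(snapshots)
assert hints["p50_latency_ms"] == 820
assert hints["p50_ttft_ms"] == 310
assert hints["p50_throughput_tps"] == 47.5
```

Medians over a snapshot window smooth out transient spikes, which is exactly why such figures are operational hints rather than an SLA.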

RSS news sync every hour; changelog is event-driven/editorial

Provider news, releases, and ecosystem updates

Official RSS when possible, Google News fallback when necessary

Provider news aims to stay fresh via RSS ingestion. Official feeds are preferred; Google News fallback is used where official feeds are missing or known-broken. Changelog entries are a separate editorial/operator layer tied to releases and product changes.

Inputs

  • Official provider RSS feeds where available
  • Google News fallback for providers without viable official feeds
  • Admin-authored changelog entries and imported article/news records
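The official-first, fallback-second selection rule can be sketched as below. The function, provider fields, and fallback URL shape are hypothetical; the real ingestion pipeline is not documented here.

```python
def pick_feed(provider: dict) -> dict:
    """Choose a news feed and record which path was used for source labeling."""
    official = provider.get("official_rss")
    if official and not provider.get("feed_broken", False):
        # Healthy first-party feed: strongest source label.
        return {"url": official, "source_label": "official_rss"}
    # Missing or known-broken official feed: fall back to a Google News query.
    query = provider["name"].replace(" ", "+")
    return {
        "url": f"https://news.google.com/rss/search?q={query}",
        "source_label": "google_news_fallback",
    }

assert pick_feed(
    {"name": "Acme AI", "official_rss": "https://acme.ai/feed.xml"}
)["source_label"] == "official_rss"
assert pick_feed(
    {"name": "Acme AI", "official_rss": None}
)["source_label"] == "google_news_fallback"
```

Recording the chosen path at ingestion time is what allows the "Official RSS" vs "Google News fallback" label to stay accurate downstream.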

Live user-generated data with admin moderation

Community signal: ratings, comments, follows, and collections

Community context, not canonical fact source

Community surfaces show how users react to models, providers, benchmarks, and MCP servers. These signals help with discovery and trust interpretation, but they should be read alongside structured facts and explicit source labels.

Inputs

  • User ratings and written reviews
  • Comments and reactions
  • Public collections and curator activity
  • Admin moderation and visibility controls

What the current labels mean

Verified benchmark evidence

Results sourced from third-party or public leaderboard systems. These are still not perfect ground truth, but they are intentionally separated from vendor self-reporting.

Self-reported benchmark evidence

Results drawn from provider blogs, papers, launch posts, or other disclosures. They are useful context, but should be compared more cautiously than independent leaderboard results.

Official RSS

Provider news pulled from a known first-party feed. This is the strongest current source label in the provider news layer.

Google News fallback

A recovery path for providers without a usable official feed. It keeps the surface fresh, but it is a weaker source signal than official RSS.
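Taken together, the four labels form two small ordered scales: one for benchmark evidence, one for provider news. A minimal sketch of that ordering, assuming hypothetical numeric ranks (the page only states relative strength within each layer):

```python
# Illustrative trust tiers; the rank numbers are an assumption, not product spec.
BENCHMARK_TRUST = {"verified": 2, "self_reported": 1}
NEWS_TRUST = {"official_rss": 2, "google_news_fallback": 1}

def stronger(label_a: str, label_b: str, scale: dict) -> str:
    """Return the stronger of two labels on the same trust scale."""
    return label_a if scale[label_a] >= scale[label_b] else label_b

assert stronger("verified", "self_reported", BENCHMARK_TRUST) == "verified"
assert stronger("google_news_fallback", "official_rss", NEWS_TRUST) == "official_rss"
```

Keeping the two scales separate matters: a "verified" benchmark score and an "official RSS" news item are strong within their own layers, but they are not comparable to each other.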