Organization: HAMI-IQ • Domain: https://llmprofiles.org • Repository: https://github.com/HaMi-IQ/llmprofiles.git
Turn structured data into operational, testable, AEO-ready content.
AEO Pattern (Answer-Engine-Ready in 5 steps)
- Choose a profile (e.g., FAQPage v1)
- Mark up the page (server-rendered JSON-LD)
- Assert the profile contract in CI (page.schema.json)
- Normalize extractor output in CI (output.schema.json)
- Publish discovery (/.well-known/llmprofiles.json) + training feed (training.jsonl)
flowchart LR
A[Page Content] --> B[JSON-LD Profile]
B -->|CI page.schema.json| C{Pass?}
C -- No --> D[Fail build]
C -- Yes --> E[Extractor]
E -->|CI output.schema.json| F{Pass?}
F -- No --> D
F -- Yes --> G[training.jsonl]
G --> H[Answer Engines / RAG]
H --> I[Better Answers]
Today's structured data landscape is fragmented and incomplete:
- Schema.org provides a giant vocabulary but no opinionated guidance
- Google's docs offer examples but no machine-enforceable validation
- Teams struggle with over/under-using fields and inconsistent implementations
- No bridge exists between SEO markup and LLM/RAG pipelines
- No standard for training data exports that match your on-page semantics
- Client-only JSON-LD and unstable IDs break answerability
We provide opinionated, testable profiles that bridge the gap between SEO and AI:
Instead of Schema.org's giant vocabulary, we ship constrained subsets per use case (e.g., FAQPage v1) with machine-enforceable validation.
page.schema.json - Validate your JSON-LD in CI before deployment
output.schema.json - Normalized data for extractors/RAG pipelines
training.jsonl - Publisher-owned export that mirrors your on-page semantics for RAG/fine-tuning.
Canonical, versioned IRIs (/faqpage/v1), immutability, CHANGELOG, and community PR checks. SemVer: PATCH = non-breaking schema clarifications; MINOR = additive fields; MAJOR = breaking.
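Because profile IRIs are versioned and immutable, consuming code can pin a major version and treat any other major IRI as a breaking change. A minimal sketch (an illustrative helper, not part of the repository):

```js
// Illustrative helper: PATCH and MINOR releases keep the same /v1/ IRI,
// so pinning the major-version prefix is enough; a /v2/ IRI signals a breaking change.
const EXPECTED_PROFILE = 'https://llmprofiles.org/faqpage/v1/';

function pinsExpectedMajor(jsonld) {
  return typeof jsonld.conformsTo === 'string' &&
    jsonld.conformsTo.startsWith(EXPECTED_PROFILE);
}

console.log(pinsExpectedMajor({ conformsTo: 'https://llmprofiles.org/faqpage/v1/index.jsonld' })); // true
console.log(pinsExpectedMajor({ conformsTo: 'https://llmprofiles.org/faqpage/v2/index.jsonld' })); // false
```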
/.well-known/llmprofiles.json - Let aggregators/partners auto-discover your profile + training feed.
Built-in stable anchors, language hints, and anti-patterns for better AI retrieval.
| Problem Today | What LLM Profiles Adds |
|---|---|
| Schema.org is huge; teams over/under-use fields | Opinionated profile per use case (FAQPage v1) |
| No way to test JSON-LD pre-deploy | page.schema.json (AJV-friendly) for CI gating |
| Markup ≠ what your LLM stack needs | output.schema.json normalizes data for RAG |
| No standard feed for LLMs | training.jsonl export shape (publisher-owned) |
| Docs are human; machines can't "govern" | Versioned IRIs + CI checks + SHACL in spec |
| Hard for partners to find your data | /.well-known/llmprofiles.json discovery |
- Stable IDs: each page and each Q/A has a persistent @id (don't recycle).
- Language hints: inLanguage (BCP-47, e.g., en, ur-PK).
- Server-rendered JSON-LD: markup present in initial HTML (no client-only).
- Disambiguation: prefer sameAs links to Wikipedia/Wikidata/official pages.
- Canonical Q/A: questions are concise; answers are plain-text first; no sales fluff.
- Evidence anchors: use isBasedOn/url pointing to page anchors for each answer.
- Freshness: include dateModified; training lines include version and source_url.
- Profile discovery: /.well-known/llmprofiles.json published and valid.
- CI gates green: page.schema.json and output.schema.json both pass in CI.
- Privacy pass: no secrets/PII in training.jsonl.
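A rough pre-flight check for several of the items above could look like the sketch below (illustrative only; the authoritative contract is page.schema.json, enforced in CI):

```js
// Illustrative pre-flight lint for a few AEO checklist items.
// The authoritative contract is faqpage/v1/page.schema.json (enforced in CI).
function lintFaqPage(doc) {
  const issues = [];
  if (!doc['@id']) issues.push('Page has no stable @id');
  if (!doc.inLanguage) issues.push('Missing inLanguage (BCP-47)');
  if (!doc.dateModified) issues.push('Missing dateModified');
  for (const q of doc.mainEntity || []) {
    if (!q['@id']) issues.push(`Question "${q.name}" has no persistent @id`);
    if (!q.acceptedAnswer || !q.acceptedAnswer['@id']) {
      issues.push(`Answer for "${q.name}" has no persistent @id`);
    }
  }
  return issues;
}

// Usage: pass the parsed JSON-LD object from your page; an empty array means the quick checks pass.
// lintFaqPage(JSON.parse(require('fs').readFileSync('your-page-markup.json', 'utf8')));
```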
| Profile | Status | Version | Description |
|---|---|---|---|
| FAQPage | ✅ Enhanced | v1.0.0 | FAQ pages with Q&A pairs, training data, and examples |
| QAPage | ✅ Enhanced | v1.0.0 | Single question threads with training data and examples |
| Article | ✅ Enhanced | v1.0.0 | Blog posts and articles with training data and examples |
| ProductOffer | ✅ Enhanced | v1.0.0 | Product listings with training data and examples |
| Event | ✅ Enhanced | v1.0.0 | Event information with training data and examples |
| Course | ✅ Enhanced | v1.0.0 | Educational courses with training data and examples |
| JobPosting | ✅ Enhanced | v1.0.0 | Job advertisements with training data and examples |
| LocalBusiness | ✅ Enhanced | v1.0.0 | Business listings with training data and examples |
| SoftwareApplication | ✅ Enhanced | v1.0.0 | Software products with training data and examples |
| Review | ✅ Enhanced | v1.0.0 | Product reviews with training data and examples |
| Profile | AEO Anchors | Discovery | Training Feed | CI Contract |
|---|---|---|---|---|
| FAQPage v1 | Q/A @id, inLanguage, sameAs | ✅ | ✅ | page.schema.json + output.schema.json |
| Article v1 | @id, headline, about, sameAs | ✅ | ✅ | ✅ |
| ProductOffer v1 | @id, sku, gtin, brand | ✅ | ✅ | ✅ |
| Event v1 | @id, startDate, location | ✅ | ✅ | ✅ |
| Course v1 | @id, coursePrerequisites | ✅ | ✅ | ✅ |
| JobPosting v1 | @id, title, hiringOrganization | ✅ | ✅ | ✅ |
| LocalBusiness v1 | @id, address, geo | ✅ | ✅ | ✅ |
| SoftwareApp v1 | @id, applicationCategory | ✅ | ✅ | ✅ |
| Review v1 | @id, reviewRating, itemReviewed | ✅ | ✅ | ✅ |
| QAPage v1 | @id, question, acceptedAnswer | ✅ | ✅ | ✅ |
# Browse all available profiles
curl https://llmprofiles.org/api/discovery.json
# Get a specific profile (e.g., FAQPage)
curl https://llmprofiles.org/faqpage/v1/index.jsonld

// Fetch the profile and schemas
const profile = await fetch('https://llmprofiles.org/faqpage/v1/index.jsonld').then(r => r.json());
const pageSchema = await fetch('https://llmprofiles.org/faqpage/v1/page.schema.json').then(r => r.json());
const outputSchema = await fetch('https://llmprofiles.org/faqpage/v1/output.schema.json').then(r => r.json());
// Use in your application (AEO-optimized)
const faqMarkup = {
"@context": "https://schema.org",
"@type": "FAQPage",
"@id": "https://example.com/help#faq",
"inLanguage": "en",
"conformsTo": "https://llmprofiles.org/faqpage/v1/index.jsonld",
"mainEntity": [
{
"@type": "Question",
"@id": "https://example.com/help#q-what-is-llmprofiles",
"name": "What is LLM Profiles?",
"acceptedAnswer": {
"@type": "Answer",
"@id": "https://example.com/help#a-what-is-llmprofiles",
"text": "Opinionated, testable structured data profiles for AI & SEO.",
"isBasedOn": "https://example.com/help#faq"
},
"sameAs": ["https://llmprofiles.org/faqpage/v1/index.jsonld"]
}
],
"dateModified": "2025-08-28"
};

# Validate your JSON-LD before deployment
node scripts/validate-ajv.js faqpage/v1/page.schema.json your-page-markup.json
# Validate extracted content for RAG
node scripts/validate-ajv.js faqpage/v1/output.schema.json your-extracted-data.json

# Get training data for LLM fine-tuning
curl https://llmprofiles.org/faqpage/v1/training.jsonl

What is /faqpage/v1/training.jsonl? It's a shape/spec, not our data. Publishers host their own training.jsonl with lines that mirror their on-page semantics, ready for RAG/fine-tuning.
Minimal line (example):
{"id":"faq#what-is-llmprofiles",
"lang":"en",
"url":"https://example.com/help#q-what-is-llmprofiles",
"version":"faqpage.v1",
"input":"What is LLM Profiles?",
"output":"Opinionated, testable structured data profiles for AI & SEO.",
"evidence":["https://example.com/help#faq"]}π§ Testing Tools:
- Google Rich Results Test: https://search.google.com/test/rich-results
- Schema.org Validator: https://validator.schema.org/
- JSON-LD Playground: https://json-ld.org/playground/
name: Validate LLM Profiles

on:
  pull_request:
  push:
    branches: [ main ]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20 }
      - run: npm i -D ajv ajv-formats ajv-cli
      - name: Lint JSON/JSON-LD
        run: node scripts/validate-json.js
      - name: Validate Page Markup (schema contract)
        run: |
          npx ajv validate \
            -s faqpage/v1/page.schema.json \
            -d examples/faqpage/minimal.page.jsonld
      - name: Validate Extracted Output (RAG contract)
        run: |
          npx ajv validate \
            -s faqpage/v1/output.schema.json \
            -d examples/faqpage/sample.output.json

Tip: add examples/faqpage/minimal.page.jsonld and examples/faqpage/sample.output.json to the repo so the CI is turnkey.
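The scripts/validate-ajv.js helper invoked in the local validation commands above is part of the repository; a minimal equivalent looks roughly like the sketch below (illustrative, not the actual script):

```js
// Minimal sketch of an AJV-based validator (the repo's scripts/validate-ajv.js may differ).
// Usage: node validate-ajv.js <schema.json> <data.json>
const fs = require('fs');
const Ajv = require('ajv');
const addFormats = require('ajv-formats');

const [schemaPath, dataPath] = process.argv.slice(2);
const ajv = new Ajv({ strict: false, allErrors: true });
addFormats(ajv);

const schema = JSON.parse(fs.readFileSync(schemaPath, 'utf8'));
const data = JSON.parse(fs.readFileSync(dataPath, 'utf8'));
const validate = ajv.compile(schema);

if (validate(data)) {
  console.log(`PASS: ${dataPath} conforms to ${schemaPath}`);
} else {
  console.error(ajv.errorsText(validate.errors, { separator: '\n' }));
  process.exit(1);
}
```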
{
"profiles": [
{
"name": "FAQPage",
"version": "v1",
"iri": "https://llmprofiles.org/faqpage/v1/",
"pageSchema": "https://llmprofiles.org/faqpage/v1/page.schema.json",
"outputSchema": "https://llmprofiles.org/faqpage/v1/output.schema.json",
"training": "https://example.com/ai/training/faq.v1.jsonl",
"examples": "https://example.com/ai/examples/faq"
}
]
}

Self-test:
curl -fsSL https://example.com/.well-known/llmprofiles.json | jq .

| Anti-pattern | Why it hurts answers | Fix |
|---|---|---|
| No stable @id for Q/A | LLMs can't anchor or dedupe | Mint persistent @id per Q and A |
| Client-only JSON-LD | Many bots never see it | Server-render the markup |
| Fluffy answers | Model drifts to marketing copy | Keep acceptedAnswer.text concise, factual |
| Missing inLanguage | Wrong language retrieval | Set inLanguage (BCP-47) |
| No disambiguation | Entity collisions | Add sameAs links |
| Training lines don't match page | Drift between SEO & AI | Generate training.jsonl from extracted output (see the sketch below) |
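As the last row suggests, training lines should be derived from the same extracted output that passes output.schema.json so the feed never drifts from the on-page markup. A hedged sketch follows: the extracted item fields (id, question, answer, url, evidence) are assumptions for illustration, while the output keys match the minimal training line shown earlier.

```js
// Illustrative only: derive training.jsonl lines from validated extractor output.
// The input field names (id, question, answer, url, evidence) are assumptions,
// not the spec; map from your profile's actual output.schema.json shape.
function toTrainingLines(items, { lang, version, sourceUrl }) {
  return items.map(item => JSON.stringify({
    id: item.id,
    lang,
    url: item.url,
    version,                        // e.g. "faqpage.v1"
    input: item.question,
    output: item.answer,
    evidence: item.evidence || [sourceUrl],
  })).join('\n');
}

// Example usage with hypothetical file names:
// const fs = require('fs');
// const items = JSON.parse(fs.readFileSync('your-extracted-data.json', 'utf8'));
// fs.writeFileSync('training.jsonl',
//   toTrainingLines(items, { lang: 'en', version: 'faqpage.v1', sourceUrl: 'https://example.com/help' }) + '\n');
```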
- Prevent deployment errors with CI/CD validation
- Standardize implementations across teams
- Improve rich results with opinionated guidance
- Track structured data quality over time
- Export training data that matches your markup
- Normalize content for RAG pipelines
- Bridge SEO and AI with dual schemas
- Optimize for answer engines (AEO)
- Machine-enforceable contracts instead of docs
- Versioned, immutable profiles for stability
- Discovery API for programmatic access
- Community governance with PR checks
- Own your training data with publisher exports
- Partner discovery via well-known endpoint
- Future-proof with versioned IRIs
- Operational structured data not just guidance
- SEO: paste the JSON-LD, keep IDs stable, review Anti-Patterns.
- DevOps: add the CI workflow and fail builds on schema violations.
- Data/ML: consume output.schema.json → generate training.jsonl.
- Partners: read /.well-known/llmprofiles.json for discovery.
The Profile Discovery API provides programmatic access to discover and explore profiles:
// Get all available profiles
const profiles = await fetch('https://llmprofiles.org/api/discovery.json');
const data = await profiles.json();
console.log('Available profiles:', data.profiles.map(p => p.name));
// Get specific profile
const faqProfile = await fetch('https://llmprofiles.org/api/profile-faqpage.json');
const profile = await faqProfile.json();
console.log('FAQPage capabilities:', profile.capabilities);
// Get capabilities summary
const capabilities = await fetch('https://llmprofiles.org/api/capabilities.json');
const summary = await capabilities.json();
console.log('Total profiles:', summary.summary.totalProfiles);

Available endpoints:
- GET /api/discovery.json - All profiles with metadata
- GET /api/capabilities.json - Profile capabilities summary
- GET /api/profile-{name}.json - Individual profile details
- GET /api/docs.json - API documentation
See API Documentation for complete details and integration examples.
GET https://llmprofiles.org/index.json
Returns the complete profile registry with all available profiles and their versions.
GET https://llmprofiles.org/{profile}/{version}
Returns the JSON-LD profile definition with:
- Context definitions
- SKOS metadata
- SHACL constraints
- Usage guidelines
GET https://llmprofiles.org/{profile}/{version}/output.schema.json
Returns the JSON Schema for validating extracted content.
GET https://llmprofiles.org/{profile}/{version}/page.schema.json (Enhanced profiles)
Returns the JSON Schema for validating on-page JSON-LD markup.
GET https://llmprofiles.org/{profile}/{version}/training.jsonl (Enhanced profiles)
Returns training data in JSONL format for LLM fine-tuning.
GET https://llmprofiles.org/{profile}/{version}/examples/{type}.jsonld (Enhanced profiles)
Returns implementation examples (minimal, rich, etc.).
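Putting the URL patterns above together, a consumer can pull a profile's artifacts in one call. A sketch (assumes Node 18+ for the global fetch; error handling omitted):

```js
// Sketch: fetch a profile's definition and schemas using the documented URL patterns.
const BASE = 'https://llmprofiles.org';

async function fetchProfileArtifacts(profile, version = 'v1') {
  const [definition, outputSchema, pageSchema] = await Promise.all([
    fetch(`${BASE}/${profile}/${version}`).then(r => r.json()),
    fetch(`${BASE}/${profile}/${version}/output.schema.json`).then(r => r.json()),
    fetch(`${BASE}/${profile}/${version}/page.schema.json`).then(r => r.json()),
  ]);
  return { definition, outputSchema, pageSchema };
}

// fetchProfileArtifacts('faqpage').then(a => console.log(Object.keys(a)));
```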
- Node.js 20+
- npm or yarn
# Clone the repository
git clone https://github.com/HaMi-IQ/llmprofiles.git
cd llmprofiles
# Install dependencies (for validation)
npm install -D ajv ajv-formats

# Validate all JSON files
node scripts/validate-json.js
# Validate specific schema
node -e "
const Ajv = require('ajv');
const addFormats = require('ajv-formats');
const fs = require('fs');
const ajv = new Ajv({strict: false, allErrors: true});
addFormats(ajv);
const schema = JSON.parse(fs.readFileSync('faqpage/v1/output.schema.json', 'utf8'));
ajv.compile(schema);
console.log('Schema validation passed');
"- Create profile directory:
mkdir -p {profile-name}/v1 - Add
index.jsonldwith profile definition - Add
output.schema.jsonfor validation - Update
index.jsonregistry - Update
CHANGELOG.md - Submit pull request
We welcome contributions! Please see our Contributing Guidelines and Code of Conduct.
- Fork the repository
- Create a feature branch: git checkout -b feature/new-profile
- Make your changes
- Run validation: npm run validate
- Submit a pull request
- Keep profiles ≤5KB in size
- Use concise SKOS definitions
- Include minimal SHACL constraints
- Bump versions on semantic changes
- Follow established naming conventions
- Code: LICENSE-CODE
- Content: LICENSE-CONTENT
- Website: https://llmprofiles.org
- Documentation: https://llmprofiles.org/docs
- Issues: https://github.com/HaMi-IQ/llmprofiles/issues
- Discussions: https://github.com/HaMi-IQ/llmprofiles/discussions
{profile}/
├── v1/
│   ├── index.jsonld        # Profile definition
│   └── output.schema.json  # Validation schema
└── README.md               # Profile documentation
- JSON-LD: Linked data serialization
- SKOS: Knowledge organization systems
- SHACL: Shape constraints and validation
- JSON Schema: Output validation
- Schema.org: Core vocabulary
- Complete all 10 planned profiles
- Create interactive documentation
- Add profile compatibility testing
- Implement profile discovery API
- Profile compliance test harness (good vs bad fixtures)
- Add community examples
- Profile marketplace features
- Issues: GitHub Issues
- Security: Security Policy
- Governance: Governance
Maintained by HAMI • Version: 1.0.0 • Last Updated: 2025-08-28