llmprofiles: Structured Data Profiles for AI & SEO

AEO-Ready Profile-Discovery CI-Contracts

Organization: HAMI-IQ • Domain: https://llmprofiles.org • Repository: https://github.com/HaMi-IQ/llmprofiles.git

Turn structured data into operational, testable, AEO-ready content.

🔷 AEO Pattern in 60 Seconds

AEO Pattern (Answer-Engine-Ready in 5 steps)

  1. Choose profile (e.g., FAQPage v1)
  2. Mark up page (server-rendered JSON-LD)
  3. Assert profile contract in CI (page.schema.json)
  4. Normalize extractor output in CI (output.schema.json)
  5. Publish discovery (/.well-known/llmprofiles.json) + training feed (training.jsonl)
flowchart LR
A[Page Content] --> B[JSON-LD Profile]
B -->|CI page.schema.json| C{Pass?}
C -- No --> D[Fail build]
C -- Yes --> E[Extractor]
E -->|CI output.schema.json| F{Pass?}
F -- No --> D
F -- Yes --> G[training.jsonl]
G --> H[Answer Engines / RAG]
H --> I[Better Answers]
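
The flow above can be sketched as a small gating function. This is a minimal sketch, not the project's implementation: `runGates`, `checkPage`, `extract`, and `checkOutput` are hypothetical names. In a real pipeline the two checks would be validators compiled from page.schema.json and output.schema.json.

```javascript
// Hypothetical sketch of the two CI gates; none of these names come from llmprofiles.
// checkPage / checkOutput stand in for validators compiled from
// page.schema.json and output.schema.json; each returns an array of error strings.
function runGates(page, { checkPage, extract, checkOutput }) {
  const pageErrors = checkPage(page);
  if (pageErrors.length > 0) {
    return { ok: false, stage: 'page', errors: pageErrors };
  }
  const output = extract(page);
  const outputErrors = checkOutput(output);
  if (outputErrors.length > 0) {
    return { ok: false, stage: 'output', errors: outputErrors };
  }
  return { ok: true, output };
}

// Toy stand-ins, only to show the control flow.
const gates = {
  checkPage: (p) => (p['@id'] ? [] : ['missing @id']),
  extract: (p) => ({ id: p['@id'], items: p.mainEntity || [] }),
  checkOutput: (o) => (o.items.length > 0 ? [] : ['no Q/A items extracted']),
};

const result = runGates({ '@id': 'https://example.com/help#faq', mainEntity: [] }, gates);
console.log(result.ok, result.stage);
```

A build script would exit non-zero whenever `ok` is false, so a failure at either gate stops the deploy before anything reaches training.jsonl.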

🎯 The Problem

Today's structured data landscape is fragmented and incomplete:

  • Schema.org provides a giant vocabulary but no opinionated guidance
  • Google's docs offer examples but no machine-enforceable validation
  • Teams struggle with over/under-using fields and inconsistent implementations
  • No bridge exists between SEO markup and LLM/RAG pipelines
  • No standard for training data exports that match your on-page semantics
  • Client-only JSON-LD and unstable IDs break answerability

🚀 What LLM Profiles Solves

We provide opinionated, testable profiles that bridge the gap between SEO and AI:

✅ Opinionated Profiles, Not Just Examples

Instead of Schema.org's giant vocabulary, we ship constrained subsets per use case (e.g., FAQPage v1) with machine-enforceable validation.

✅ Dual-Contract Design

  • page.schema.json - Validate your JSON-LD in CI before deployment
  • output.schema.json - Normalized data for extractors/RAG pipelines

✅ LLM-Ready Export Format

training.jsonl - Publisher-owned export that mirrors your on-page semantics for RAG/fine-tuning.

✅ Governance & Versioning

Canonical, versioned IRIs (/faqpage/v1), immutability, CHANGELOG, and community PR checks. SemVer: PATCH = non-breaking schema clarifications; MINOR = additive fields; MAJOR = breaking.
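
The SemVer rule above can be expressed as a small classifier. This is a sketch only: `classifyBump` and `iriFor` are hypothetical helpers, and the idea that the IRI carries just the major version is inferred from the /faqpage/v1 pattern rather than stated by the project.

```javascript
// Classify the bump between two SemVer strings such as "1.0.0" and "1.1.0".
// Sketch only: assumes plain MAJOR.MINOR.PATCH with no pre-release suffix.
function classifyBump(from, to) {
  const [fMaj, fMin, fPat] = from.split('.').map(Number);
  const [tMaj, tMin, tPat] = to.split('.').map(Number);
  if (tMaj !== fMaj) return 'major'; // breaking change
  if (tMin !== fMin) return 'minor'; // additive fields
  if (tPat !== fPat) return 'patch'; // non-breaking schema clarification
  return 'none';
}

// Assumption: the IRI path segment carries only the major version.
function iriFor(profile, version) {
  const major = version.split('.')[0];
  return `/${profile}/v${major}`;
}

console.log(classifyBump('1.0.0', '1.1.0'), iriFor('faqpage', '1.1.0'));
```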

✅ Discovery Convention

/.well-known/llmprofiles.json - Let aggregators/partners auto-discover your profile + training feed.

✅ Answer Engine Optimization (AEO)

Built-in stable anchors, language hints, and anti-patterns for better AI retrieval.

📊 Before vs. After

| Problem Today | What LLM Profiles Adds |
| --- | --- |
| Schema.org is huge; teams over/under-use fields | Opinionated profile per use case (FAQPage v1) |
| No way to test JSON-LD pre-deploy | page.schema.json (AJV-friendly) for CI gating |
| Markup ≠ what your LLM stack needs | output.schema.json normalizes data for RAG |
| No standard feed for LLMs | training.jsonl export shape (publisher-owned) |
| Docs are human; machines can't "govern" | Versioned IRIs + CI checks + SHACL in spec |
| Hard for partners to find your data | /.well-known/llmprofiles.json discovery |

✅ AEO-Ready Checklist (copy this into your PR template)

  • Stable IDs: each page and each Q/A has a persistent @id (don't recycle).
  • Language hints: inLanguage (BCP-47, e.g., en, ur-PK).
  • Server-rendered JSON-LD: markup present in initial HTML (no client-only).
  • Disambiguation: prefer sameAs links to Wikipedia/Wikidata/official pages.
  • Canonical Q/A: questions are concise; answers are plain-text first; no sales fluff.
  • Evidence anchors: use isBasedOn/url pointing to page anchors for each answer.
  • Freshness: include dateModified; training lines include version and source_url.
  • Profile discovery: /.well-known/llmprofiles.json published and valid.
  • CI gates green: page.schema.json and output.schema.json both pass in CI.
  • Privacy pass: no secrets/PII in training.jsonl.
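
Several checklist items can be checked mechanically before review. The following is a minimal sketch under stated assumptions: `lintFaqPage` and its messages are hypothetical, it covers only the ID, language, freshness, and evidence items, and the authoritative rules remain whatever page.schema.json enforces.

```javascript
// Sketch of a checklist linter for FAQPage JSON-LD. Function name and
// messages are illustrative; the authoritative rules live in page.schema.json.
function lintFaqPage(markup) {
  const problems = [];
  const seen = new Set();
  const requireId = (node, label) => {
    const id = node && node['@id'];
    if (!id) problems.push(`${label}: missing @id`);
    else if (seen.has(id)) problems.push(`${label}: duplicate @id ${id}`);
    else seen.add(id);
  };

  requireId(markup, 'page');
  if (!markup.inLanguage) problems.push('page: missing inLanguage');
  if (!markup.dateModified) problems.push('page: missing dateModified');

  (markup.mainEntity || []).forEach((q, i) => {
    requireId(q, `question[${i}]`);
    const a = q.acceptedAnswer;
    if (!a) {
      problems.push(`question[${i}]: missing acceptedAnswer`);
      return;
    }
    requireId(a, `answer[${i}]`);
    if (!a.isBasedOn && !a.url) problems.push(`answer[${i}]: missing evidence anchor`);
  });
  return problems;
}
```

An empty array means the checked items pass. Items such as "no sales fluff" or the privacy pass still need a human reviewer.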

📋 Available Profiles

| Profile | Status | Version | Description |
| --- | --- | --- | --- |
| FAQPage | ✅ Enhanced | v1.0.0 | FAQ pages with Q&A pairs, training data, and examples |
| QAPage | ✅ Enhanced | v1.0.0 | Single question threads with training data and examples |
| Article | ✅ Enhanced | v1.0.0 | Blog posts and articles with training data and examples |
| ProductOffer | ✅ Enhanced | v1.0.0 | Product listings with training data and examples |
| Event | ✅ Enhanced | v1.0.0 | Event information with training data and examples |
| Course | ✅ Enhanced | v1.0.0 | Educational courses with training data and examples |
| JobPosting | ✅ Enhanced | v1.0.0 | Job advertisements with training data and examples |
| LocalBusiness | ✅ Enhanced | v1.0.0 | Business listings with training data and examples |
| SoftwareApplication | ✅ Enhanced | v1.0.0 | Software products with training data and examples |
| Review | ✅ Enhanced | v1.0.0 | Product reviews with training data and examples |

🧩 Profiles Compatibility Table (AEO-focused)

Profile AEO Anchors Discovery Training Feed CI Contract
FAQPage v1 Q/A @id, inLanguage, sameAs βœ… βœ… page.schema.json + output.schema.json
Article v1 @id, headline, about, sameAs βœ… βœ… βœ…
ProductOffer v1 @id, sku, gtin, brand βœ… βœ… βœ…
Event v1 @id, startDate, location βœ… βœ… βœ…
Course v1 @id, coursePrerequisites βœ… βœ… βœ…
JobPosting v1 @id, title, hiringOrganization βœ… βœ… βœ…
LocalBusiness v1 @id, address, geo βœ… βœ… βœ…
SoftwareApp v1 @id, applicationCategory βœ… βœ… βœ…
Review v1 @id, reviewRating, itemReviewed βœ… βœ… βœ…
QAPage v1 @id, question, acceptedAnswer βœ… βœ… βœ…

🛠️ Quick Start

1. Choose Your Profile

# Browse all available profiles
curl https://llmprofiles.org/api/discovery.json

# Get a specific profile (e.g., FAQPage)
curl https://llmprofiles.org/faqpage/v1/index.jsonld

2. Implement & Validate

// Fetch the profile and schemas; fetch() resolves to a Response, so parse each body as JSON
const profile = await fetch('https://llmprofiles.org/faqpage/v1/index.jsonld').then((res) => res.json());
const pageSchema = await fetch('https://llmprofiles.org/faqpage/v1/page.schema.json').then((res) => res.json());
const outputSchema = await fetch('https://llmprofiles.org/faqpage/v1/output.schema.json').then((res) => res.json());

// Use in your application (AEO-optimized)
const faqMarkup = {
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "@id": "https://example.com/help#faq",
  "inLanguage": "en",
  "conformsTo": "https://llmprofiles.org/faqpage/v1/index.jsonld",
  "mainEntity": [
    {
      "@type": "Question",
      "@id": "https://example.com/help#q-what-is-llmprofiles",
      "name": "What is LLM Profiles?",
      "acceptedAnswer": {
        "@type": "Answer",
        "@id": "https://example.com/help#a-what-is-llmprofiles",
        "text": "Opinionated, testable structured data profiles for AI & SEO.",
        "isBasedOn": "https://example.com/help#faq"
      },
      "sameAs": ["https://llmprofiles.org/faqpage/v1/index.jsonld"]
    }
  ],
  "dateModified": "2025-08-28"
};
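
To get markup like this into the initial HTML (the "server-rendered" requirement), the object has to be serialized into a script tag on the server. A minimal sketch, assuming a hypothetical `renderJsonLd` helper that your templating layer would call; it is not part of llmprofiles.

```javascript
// Serialize markup into a <script type="application/ld+json"> tag for the
// initial HTML response. Escaping "<" as \u003c keeps a "</script>" inside any
// string value from closing the tag early; JSON parsers decode it back to "<".
function renderJsonLd(markup) {
  const json = JSON.stringify(markup).replace(/</g, '\\u003c');
  return `<script type="application/ld+json">${json}</script>`;
}
```

Calling `renderJsonLd(faqMarkup)` and placing the result in the page head or body keeps the JSON-LD visible to crawlers that do not execute JavaScript.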

3. Validate in CI/CD

# Validate your JSON-LD before deployment
node scripts/validate-ajv.js faqpage/v1/page.schema.json your-page-markup.json

# Validate extracted content for RAG
node scripts/validate-ajv.js faqpage/v1/output.schema.json your-extracted-data.json

4. Export Training Data

# Get training data for LLM fine-tuning
curl https://llmprofiles.org/faqpage/v1/training.jsonl

What is /faqpage/v1/training.jsonl? It's a shape/spec, not our data. Publishers host their own training.jsonl with lines that mirror their on-page semantics, ready for RAG/fine-tuning.
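
A publisher could generate such a file from normalized extractor output. The sketch below is an illustration only: `toTrainingJsonl` and the input item shape (`id`, `lang`, `url`, `question`, `answer`, `evidence`) are assumptions, while the output field names follow the minimal example line in this section. JSONL carries one record per line, so each object is written unwrapped.

```javascript
// Turn normalized Q/A items into JSONL: one JSON object per line.
// The item shape ({ id, lang, url, question, answer, evidence }) is an
// assumption for illustration; output field names follow the README's example line.
function toTrainingJsonl(items, version = 'faqpage.v1') {
  return items
    .map((item) => JSON.stringify({
      id: item.id,
      lang: item.lang,
      url: item.url,
      version,
      input: item.question,
      output: item.answer,
      evidence: item.evidence || [],
    }) + '\n')
    .join('');
}
```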

Minimal line (example):

{"id":"faq#what-is-llmprofiles",
 "lang":"en",
 "url":"https://example.com/help#q-what-is-llmprofiles",
 "version":"faqpage.v1",
 "input":"What is LLM Profiles?",
 "output":"Opinionated, testable structured data profiles for AI & SEO.",
 "evidence":["https://example.com/help#faq"]}

🔧 Testing Tools:

🧪 CI Gate (copy-paste ready)

.github/workflows/validate-llmprofiles.yml

name: Validate LLM Profiles
on:
  pull_request:
  push:
    branches: [ main ]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20 }
      # ajv-cli provides the `ajv` command used by the validation steps below
      - run: npm i -D ajv ajv-formats ajv-cli
      - name: Lint JSON/JSON-LD
        run: node scripts/validate-json.js
      - name: Validate Page Markup (schema contract)
        run: |
          npx ajv validate \
            -s faqpage/v1/page.schema.json \
            -d examples/faqpage/minimal.page.jsonld
      - name: Validate Extracted Output (RAG contract)
        run: |
          npx ajv validate \
            -s faqpage/v1/output.schema.json \
            -d examples/faqpage/sample.output.json

Tip: add examples/faqpage/minimal.page.jsonld and examples/faqpage/sample.output.json to the repo so the CI is turnkey.

🌐 Discovery Snippet (copy-paste ready)

/.well-known/llmprofiles.json

{
  "profiles": [
    {
      "name": "FAQPage",
      "version": "v1",
      "iri": "https://llmprofiles.org/faqpage/v1/",
      "pageSchema": "https://llmprofiles.org/faqpage/v1/page.schema.json",
      "outputSchema": "https://llmprofiles.org/faqpage/v1/output.schema.json",
      "training": "https://example.com/ai/training/faq.v1.jsonl",
      "examples": "https://example.com/ai/examples/faq"
    }
  ]
}

Self-test:

curl -fsSL https://example.com/.well-known/llmprofiles.json | jq .
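
On the consuming side, a partner would parse this document and look up the profile it cares about. A minimal sketch: `findProfile` is a hypothetical helper, and treating every key listed in `EXPECTED_KEYS` as required is an assumption of this example, not a documented rule.

```javascript
// Look up a profile entry in a parsed /.well-known/llmprofiles.json document
// and report which expected keys are missing. Key names follow the snippet
// above; treating them all as required is an assumption of this sketch.
const EXPECTED_KEYS = ['name', 'version', 'iri', 'pageSchema', 'outputSchema', 'training'];

function findProfile(doc, name, version) {
  const entries = Array.isArray(doc && doc.profiles) ? doc.profiles : [];
  const entry = entries.find((p) => p.name === name && p.version === version);
  if (!entry) return { found: false, missing: EXPECTED_KEYS };
  const missing = EXPECTED_KEYS.filter((k) => typeof entry[k] !== 'string' || entry[k] === '');
  return { found: true, entry, missing };
}
```

A caller would fetch the well-known URL, parse the body as JSON, and pass the result to `findProfile(doc, 'FAQPage', 'v1')`.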

🚫 AEO Anti-Patterns

Anti-pattern Why it hurts answers Fix
No stable @id for Q/A LLMs can't anchor or dedupe Mint persistent @id per Q and A
Client-only JSON-LD Many bots never see it Server-render the markup
Fluffy answers Model drifts to marketing copy Keep acceptedAnswer.text concise, factual
Missing inLanguage Wrong language retrieval Set inLanguage (BCP-47)
No disambiguation Entity collisions Add sameAs links
Training lines don't match page Drift between SEO & AI Generate training.jsonl from extracted output
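
The last anti-pattern (training lines drifting from the page) can be detected by comparing the two sources. A minimal sketch, assuming a hypothetical `findDrift` helper and assuming that each training line's `url` equals the Question's `@id`, as in the example line shown earlier in this README.

```javascript
// Compare on-page Q/A with parsed training lines and report mismatches.
// Matching by URL (question @id === line.url) is an assumption based on
// the README's example training line.
function findDrift(markup, trainingLines) {
  const byUrl = new Map(trainingLines.map((line) => [line.url, line]));
  const drift = [];
  for (const q of markup.mainEntity || []) {
    const line = byUrl.get(q['@id']);
    const answerText = (q.acceptedAnswer || {}).text;
    if (!line) {
      drift.push({ id: q['@id'], reason: 'no training line' });
    } else if (line.input !== q.name || line.output !== answerText) {
      drift.push({ id: q['@id'], reason: 'text differs' });
    }
  }
  return drift;
}
```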

🎯 Use Cases

For SEO Teams

  • Prevent deployment errors with CI/CD validation
  • Standardize implementations across teams
  • Improve rich results with opinionated guidance
  • Track structured data quality over time

For AI/ML Teams

  • Export training data that matches your markup
  • Normalize content for RAG pipelines
  • Bridge SEO and AI with dual schemas
  • Optimize for answer engines (AEO)

For Developers

  • Machine-enforceable contracts instead of docs
  • Versioned, immutable profiles for stability
  • Discovery API for programmatic access
  • Community governance with PR checks

For Publishers

  • Own your training data with publisher exports
  • Partner discovery via well-known endpoint
  • Future-proof with versioned IRIs
  • Operational structured data, not just guidance

🧭 Role-Based Adoption

  • SEO: paste the JSON-LD, keep IDs stable, review Anti-Patterns.
  • DevOps: add the CI workflow and fail builds on schema violations.
  • Data/ML: consume output.schema.json β†’ generate training.jsonl.
  • Partners: read /.well-known/llmprofiles.json for discovery.

🔌 Discovery API

The Profile Discovery API provides programmatic access to discover and explore profiles:

// Get all available profiles
const profiles = await fetch('https://llmprofiles.org/api/discovery.json');
const data = await profiles.json();
console.log('Available profiles:', data.profiles.map(p => p.name));

// Get specific profile
const faqProfile = await fetch('https://llmprofiles.org/api/profile-faqpage.json');
const profile = await faqProfile.json();
console.log('FAQPage capabilities:', profile.capabilities);

// Get capabilities summary
const capabilities = await fetch('https://llmprofiles.org/api/capabilities.json');
const summary = await capabilities.json();
console.log('Total profiles:', summary.summary.totalProfiles);

Available endpoints:

  • GET /api/discovery.json - All profiles with metadata
  • GET /api/capabilities.json - Profile capabilities summary
  • GET /api/profile-{name}.json - Individual profile details
  • GET /api/docs.json - API documentation

See API Documentation for complete details and integration examples.

📖 API Reference

Registry Endpoint

GET https://llmprofiles.org/index.json

Returns the complete profile registry with all available profiles and their versions.

Profile Endpoint

GET https://llmprofiles.org/{profile}/{version}

Returns the JSON-LD profile definition with:

  • Context definitions
  • SKOS metadata
  • SHACL constraints
  • Usage guidelines

Schema Endpoints

GET https://llmprofiles.org/{profile}/{version}/output.schema.json

Returns the JSON Schema for validating extracted content.

GET https://llmprofiles.org/{profile}/{version}/page.schema.json (Enhanced profiles)

Returns the JSON Schema for validating on-page JSON-LD markup.

GET https://llmprofiles.org/{profile}/{version}/training.jsonl (Enhanced profiles)

Returns training data in JSONL format for LLM fine-tuning.

GET https://llmprofiles.org/{profile}/{version}/examples/{type}.jsonld (Enhanced profiles)

Returns implementation examples (minimal, rich, etc.).
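
The URL patterns above are regular enough to build programmatically. A small sketch: `profileUrls` is a hypothetical helper, and only the path templates come from this reference.

```javascript
// Build the documented endpoint URLs for one profile version.
// Path templates come from the API reference above; the helper itself is illustrative.
function profileUrls(profile, version, base = 'https://llmprofiles.org') {
  const root = `${base}/${profile}/${version}`;
  return {
    profile: root,
    outputSchema: `${root}/output.schema.json`,
    pageSchema: `${root}/page.schema.json`,
    training: `${root}/training.jsonl`,
    example: (type) => `${root}/examples/${type}.jsonld`,
  };
}

console.log(profileUrls('faqpage', 'v1').pageSchema);
```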

🔧 Development

Prerequisites

  • Node.js 20+
  • npm or yarn

Setup

# Clone the repository
git clone https://github.com/HaMi-IQ/llmprofiles.git
cd llmprofiles

# Install dependencies (for validation)
npm install -D ajv ajv-formats

Validation

# Validate all JSON files
node scripts/validate-json.js

# Validate specific schema
node -e "
const Ajv = require('ajv');
const addFormats = require('ajv-formats');
const fs = require('fs');

const ajv = new Ajv({strict: false, allErrors: true});
addFormats(ajv);

const schema = JSON.parse(fs.readFileSync('faqpage/v1/output.schema.json', 'utf8'));
ajv.compile(schema);
console.log('Schema validation passed');
"

Adding New Profiles

  1. Create profile directory: mkdir -p {profile-name}/v1
  2. Add index.jsonld with profile definition
  3. Add output.schema.json for validation
  4. Update index.json registry
  5. Update CHANGELOG.md
  6. Submit pull request

🀝 Contributing

We welcome contributions! Please see our Contributing Guidelines and Code of Conduct.

Quick Start

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/new-profile
  3. Make your changes
  4. Run validation: npm run validate
  5. Submit a pull request

Profile Guidelines

  • Keep profiles ≤5KB in size
  • Use concise SKOS definitions
  • Include minimal SHACL constraints
  • Bump versions on semantic changes
  • Follow established naming conventions
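
The size guideline is easy to check before opening a pull request. A minimal sketch: `profileSizeReport` is a hypothetical helper, and reading "5KB" as 5 × 1024 bytes of UTF-8 text is an assumption.

```javascript
// Check a profile document against the size guideline.
// Interpreting "5KB" as 5 * 1024 bytes of UTF-8 is an assumption.
const MAX_PROFILE_BYTES = 5 * 1024;

function profileSizeReport(profileText) {
  const bytes = Buffer.byteLength(profileText, 'utf8');
  return { bytes, withinLimit: bytes <= MAX_PROFILE_BYTES };
}
```

In practice `profileText` would be the contents of a file such as faqpage/v1/index.jsonld read with `fs.readFileSync(path, 'utf8')`.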

📄 License

🔗 Links

🏗️ Architecture

Profile Structure

{profile}/
├── v1/
│   ├── index.jsonld          # Profile definition
│   └── output.schema.json    # Validation schema
└── README.md                 # Profile documentation

Standards Used

  • JSON-LD: Linked data serialization
  • SKOS: Knowledge organization systems
  • SHACL: Shape constraints and validation
  • JSON Schema: Output validation
  • Schema.org: Core vocabulary

🚀 Roadmap

  • Complete all 10 planned profiles
  • Create interactive documentation
  • Add profile compatibility testing
  • Implement profile discovery API
  • Profile compliance test harness (good vs bad fixtures)
  • Add community examples
  • Profile marketplace features

📞 Support

Maintained by HAMI • Version: 1.0.0 • Last Updated: 2025-08-28
