llms.txt is a web application for generating consolidated text files from websites, designed for Large Language Model training and inference. It produces:
- llms.txt: An index of site pages with AI-generated titles and descriptions.
- llms-full.txt: The full plain text content of all crawled pages.
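As a rough illustration of the two outputs described above, the sketch below assembles both files from a list of already-crawled pages. The `PageEntry` shape, the function names, and the exact line layout are assumptions made for this example; the project's actual types and output format may differ.

```typescript
// Hypothetical shape for one crawled page; the project's real types may differ.
interface PageEntry {
  url: string;
  title: string;       // AI-generated title
  description: string; // AI-generated description
  content: string;     // plain text content of the page
}

// Build an llms.txt-style index: one line per page with title, URL, and description.
function buildIndex(siteUrl: string, pages: PageEntry[]): string {
  const lines = pages.map((p) => `- [${p.title}](${p.url}): ${p.description}`);
  return [`# ${siteUrl} llms.txt`, "", ...lines].join("\n");
}

// Build an llms-full.txt-style file: the full text of every page, one section each.
function buildFull(siteUrl: string, pages: PageEntry[]): string {
  const sections = pages.map(
    (p) => `## ${p.title}\n${p.url}\n\n${p.content.trim()}`
  );
  return [`# ${siteUrl} llms-full.txt`, ...sections].join("\n\n");
}
```

Both functions are pure string builders, so they can be exercised without any crawling or API keys.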
The project uses Firecrawl for crawling/scraping and OpenAI for generating titles and descriptions.
- Next.js 15
- React 19
- TypeScript 5+
- TailwindCSS 4
- Shadcn/UI
- @mendable/firecrawl-js
- OpenAI via @ai-sdk/openai
- ai
- Node.js >= 20
- pnpm (recommended)
- Firecrawl: Get an API key from Firecrawl. Provide it in the UI (Settings) or in a .env file at the project root (FIRECRAWL_API_KEY).
- OpenAI: Set your key in .env (OPENAI_API_KEY) or export it in your shell.
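The lookup order for a key can be sketched as follows. This is a minimal example, not the project's code: the `resolveApiKey` helper is hypothetical, and giving the value entered in Settings priority over the environment variable is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Return the first non-empty key. Assumption: a value entered in the UI
// takes priority over the environment variable of the given name.
function resolveApiKey(
  uiValue: string | undefined,
  env: Env,
  envName: string
): string {
  const candidate = uiValue?.trim() || env[envName]?.trim();
  if (!candidate) {
    throw new Error(
      `Missing API key: set ${envName} in .env or enter it in Settings`
    );
  }
  return candidate;
}
```

On the server, a call might look like `resolveApiKey(keyFromSettings, process.env, "FIRECRAWL_API_KEY")`. For the OpenAI key, which is only read from the environment, `undefined` would be passed as the first argument.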
Example .env file:
FIRECRAWL_API_KEY=fc-...
OPENAI_API_KEY=sk-...
- Clone the repository and install dependencies:
pnpm install
- Add your API keys to the .env file at the project root (see previous section).
- Start the development server:
pnpm dev
The app will be accessible at http://localhost:3000.
- Enter the website URL to crawl in the input field.
- Make sure your API keys are configured (see Settings or .env).
- Start the generation and monitor progress.
- Download the generated files (llms.txt, llms-full.txt).
- Maximum number of URLs to crawl is configurable in the UI.
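One way such a limit could be applied is sketched below. The default of 10, the ceiling of 100, and both function names are hypothetical values chosen for illustration; the range the UI actually allows is not stated here.

```typescript
// Hypothetical bounds; the real UI may allow a different range.
const DEFAULT_MAX_URLS = 10;
const HARD_CAP = 100;

// Turn raw UI input into a safe integer limit.
function parseMaxUrls(raw: string): number {
  const n = Number.parseInt(raw, 10);
  if (Number.isNaN(n) || n < 1) return DEFAULT_MAX_URLS;
  return Math.min(n, HARD_CAP);
}

// Deduplicate discovered URLs (preserving order) and keep at most `max` of them.
function limitUrls(urls: string[], max: number): string[] {
  return Array.from(new Set(urls)).slice(0, max);
}
```

Validating the limit before crawling keeps a mistyped value from triggering an unexpectedly large number of API calls.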
© 2025 - Open source project under the MIT license.
