diff --git a/README.md b/README.md index ee5c46d..32ccf91 100644 --- a/README.md +++ b/README.md @@ -21,6 +21,3 @@ corpus. - `query-docs-feedback`: a Go project with type definitions that queries the MongoDB Docs Feedback received for any feedback related to code examples, and outputs the result to a report as `.csv`. -- `snooty-to-md-converter`: a Node.js project that ingests a single docs - project (via Snooty Data API or locally) and converts it to - Markdown. Created to convert the deprecated `docs-realm` project. \ No newline at end of file diff --git a/snooty-to-md-converter/.eslintrc.cjs b/snooty-to-md-converter/.eslintrc.cjs deleted file mode 100644 index 6eba6c7..0000000 --- a/snooty-to-md-converter/.eslintrc.cjs +++ /dev/null @@ -1,45 +0,0 @@ -module.exports = { - env: { - node: true, - }, - root: true, - parser: "@typescript-eslint/parser", - plugins: ["@typescript-eslint", "jest", "jsdoc", "prettier", "@stylistic/js"], - extends: [ - "eslint:recommended", - "prettier", - "plugin:@typescript-eslint/recommended", - ], - ignorePatterns: ["build/*"], - parserOptions: { - ecmaVersion: 2021, - sourceType: "module", - }, - rules: { - "prettier/prettier": "warn", - "arrow-body-style": "off", - "prefer-arrow-callback": "off", - "jsdoc/require-asterisk-prefix": ["error", "never"], - "@typescript-eslint/no-unused-vars": [ - "warn", - { - varsIgnorePattern: "^_", - argsIgnorePattern: "^_", - ignoreRestSiblings: true, - }, - ], - }, - overrides: [ - { - files: ["test/**/*.ts", "*.test.ts"], - env: { - "jest/globals": true, - }, - }, - ], - settings: { - jest: { - version: 29, - }, - }, -}; diff --git a/snooty-to-md-converter/.gitignore b/snooty-to-md-converter/.gitignore deleted file mode 100644 index 33ba5bc..0000000 --- a/snooty-to-md-converter/.gitignore +++ /dev/null @@ -1,17 +0,0 @@ -.DS_Store - -node_modules - -.env* -!.env.qa -!.env.example - -.secrets.* -!.secrets.example - -.vscode -build/ -.ipynb_checkpoints/ - -images/ -output/ \ No newline at end of file 
diff --git a/snooty-to-md-converter/.prettierrc.cjs b/snooty-to-md-converter/.prettierrc.cjs deleted file mode 100644 index 2d3316a..0000000 --- a/snooty-to-md-converter/.prettierrc.cjs +++ /dev/null @@ -1,3 +0,0 @@ -module.exports = { - embeddedLanguageFormatting: "off", -}; diff --git a/snooty-to-md-converter/LICENSE b/snooty-to-md-converter/LICENSE deleted file mode 100644 index 261eeb9..0000000 --- a/snooty-to-md-converter/LICENSE +++ /dev/null @@ -1,201 +0,0 @@ - Apache License - Version 2.0, January 2004 - http://www.apache.org/licenses/ - - TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION - - 1. Definitions. - - "License" shall mean the terms and conditions for use, reproduction, - and distribution as defined by Sections 1 through 9 of this document. - - "Licensor" shall mean the copyright owner or entity authorized by - the copyright owner that is granting the License. - - "Legal Entity" shall mean the union of the acting entity and all - other entities that control, are controlled by, or are under common - control with that entity. For the purposes of this definition, - "control" means (i) the power, direct or indirect, to cause the - direction or management of such entity, whether by contract or - otherwise, or (ii) ownership of fifty percent (50%) or more of the - outstanding shares, or (iii) beneficial ownership of such entity. - - "You" (or "Your") shall mean an individual or Legal Entity - exercising permissions granted by this License. - - "Source" form shall mean the preferred form for making modifications, - including but not limited to software source code, documentation - source, and configuration files. - - "Object" form shall mean any form resulting from mechanical - transformation or translation of a Source form, including but - not limited to compiled object code, generated documentation, - and conversions to other media types. 
- - "Work" shall mean the work of authorship, whether in Source or - Object form, made available under the License, as indicated by a - copyright notice that is included in or attached to the work - (an example is provided in the Appendix below). - - "Derivative Works" shall mean any work, whether in Source or Object - form, that is based on (or derived from) the Work and for which the - editorial revisions, annotations, elaborations, or other modifications - represent, as a whole, an original work of authorship. For the purposes - of this License, Derivative Works shall not include works that remain - separable from, or merely link (or bind by name) to the interfaces of, - the Work and Derivative Works thereof. - - "Contribution" shall mean any work of authorship, including - the original version of the Work and any modifications or additions - to that Work or Derivative Works thereof, that is intentionally - submitted to Licensor for inclusion in the Work by the copyright owner - or by an individual or Legal Entity authorized to submit on behalf of - the copyright owner. For the purposes of this definition, "submitted" - means any form of electronic, verbal, or written communication sent - to the Licensor or its representatives, including but not limited to - communication on electronic mailing lists, source code control systems, - and issue tracking systems that are managed by, or on behalf of, the - Licensor for the purpose of discussing and improving the Work, but - excluding communication that is conspicuously marked or otherwise - designated in writing by the copyright owner as "Not a Contribution." - - "Contributor" shall mean Licensor and any individual or Legal Entity - on behalf of whom a Contribution has been received by Licensor and - subsequently incorporated within the Work. - - 2. Grant of Copyright License. 
Subject to the terms and conditions of - this License, each Contributor hereby grants to You a perpetual, - worldwide, non-exclusive, no-charge, royalty-free, irrevocable - copyright license to reproduce, prepare Derivative Works of, - publicly display, publicly perform, sublicense, and distribute the - Work and such Derivative Works in Source or Object form. - - 3. Grant of Patent License. Subject to the terms and conditions of - this License, each Contributor hereby grants to You a perpetual, - worldwide, non-exclusive, no-charge, royalty-free, irrevocable - (except as stated in this section) patent license to make, have made, - use, offer to sell, sell, import, and otherwise transfer the Work, - where such license applies only to those patent claims licensable - by such Contributor that are necessarily infringed by their - Contribution(s) alone or by combination of their Contribution(s) - with the Work to which such Contribution(s) was submitted. If You - institute patent litigation against any entity (including a - cross-claim or counterclaim in a lawsuit) alleging that the Work - or a Contribution incorporated within the Work constitutes direct - or contributory patent infringement, then any patent licenses - granted to You under this License for that Work shall terminate - as of the date such litigation is filed. - - 4. Redistribution. 
You may reproduce and distribute copies of the - Work or Derivative Works thereof in any medium, with or without - modifications, and in Source or Object form, provided that You - meet the following conditions: - - (a) You must give any other recipients of the Work or - Derivative Works a copy of this License; and - - (b) You must cause any modified files to carry prominent notices - stating that You changed the files; and - - (c) You must retain, in the Source form of any Derivative Works - that You distribute, all copyright, patent, trademark, and - attribution notices from the Source form of the Work, - excluding those notices that do not pertain to any part of - the Derivative Works; and - - (d) If the Work includes a "NOTICE" text file as part of its - distribution, then any Derivative Works that You distribute must - include a readable copy of the attribution notices contained - within such NOTICE file, excluding those notices that do not - pertain to any part of the Derivative Works, in at least one - of the following places: within a NOTICE text file distributed - as part of the Derivative Works; within the Source form or - documentation, if provided along with the Derivative Works; or, - within a display generated by the Derivative Works, if and - wherever such third-party notices normally appear. The contents - of the NOTICE file are for informational purposes only and - do not modify the License. You may add Your own attribution - notices within Derivative Works that You distribute, alongside - or as an addendum to the NOTICE text from the Work, provided - that such additional attribution notices cannot be construed - as modifying the License. 
- - You may add Your own copyright statement to Your modifications and - may provide additional or different license terms and conditions - for use, reproduction, or distribution of Your modifications, or - for any such Derivative Works as a whole, provided Your use, - reproduction, and distribution of the Work otherwise complies with - the conditions stated in this License. - - 5. Submission of Contributions. Unless You explicitly state otherwise, - any Contribution intentionally submitted for inclusion in the Work - by You to the Licensor shall be under the terms and conditions of - this License, without any additional terms or conditions. - Notwithstanding the above, nothing herein shall supersede or modify - the terms of any separate license agreement you may have executed - with Licensor regarding such Contributions. - - 6. Trademarks. This License does not grant permission to use the trade - names, trademarks, service marks, or product names of the Licensor, - except as required for reasonable and customary use in describing the - origin of the Work and reproducing the content of the NOTICE file. - - 7. Disclaimer of Warranty. Unless required by applicable law or - agreed to in writing, Licensor provides the Work (and each - Contributor provides its Contributions) on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or - implied, including, without limitation, any warranties or conditions - of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A - PARTICULAR PURPOSE. You are solely responsible for determining the - appropriateness of using or redistributing the Work and assume any - risks associated with Your exercise of permissions under this License. - - 8. Limitation of Liability. 
In no event and under no legal theory, - whether in tort (including negligence), contract, or otherwise, - unless required by applicable law (such as deliberate and grossly - negligent acts) or agreed to in writing, shall any Contributor be - liable to You for damages, including any direct, indirect, special, - incidental, or consequential damages of any character arising as a - result of this License or out of the use or inability to use the - Work (including but not limited to damages for loss of goodwill, - work stoppage, computer failure or malfunction, or any and all - other commercial damages or losses), even if such Contributor - has been advised of the possibility of such damages. - - 9. Accepting Warranty or Additional Liability. While redistributing - the Work or Derivative Works thereof, You may choose to offer, - and charge a fee for, acceptance of support, warranty, indemnity, - or other liability obligations and/or rights consistent with this - License. However, in accepting such obligations, You may act only - on Your own behalf and on Your sole responsibility, not on behalf - of any other Contributor, and only if You agree to indemnify, - defend, and hold each Contributor harmless for any liability - incurred by, or claims asserted against, such Contributor by reason - of your accepting any such warranty or additional liability. - - END OF TERMS AND CONDITIONS - - APPENDIX: How to apply the Apache License to your work. - - To apply the Apache License to your work, attach the following - boilerplate notice, with the fields enclosed by brackets "[]" - replaced with your own identifying information. (Don't include - the brackets!) The text should be enclosed in the appropriate - comment syntax for the file format. We also recommend that a - file or class name and description of purpose be included on the - same "printed page" as the copyright notice for easier - identification within third-party archives. 
- - Copyright [yyyy] [name of copyright owner] - - Licensed under the Apache License, Version 2.0 (the "License"); - you may not use this file except in compliance with the License. - You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the License for the specific language governing permissions and - limitations under the License. diff --git a/snooty-to-md-converter/README.md b/snooty-to-md-converter/README.md deleted file mode 100644 index 0d358af..0000000 --- a/snooty-to-md-converter/README.md +++ /dev/null @@ -1,55 +0,0 @@ -# snooty-to-md-converter - -This project contains a converter for Snooty-based docs projects to Markdown. -This is intended to help migrate the Realm docs, specifically, from Snooty to Markdown. -This tool ingests the Snooty Data API for a single project (default: `realm`), -and outputs Markdown files, handling links, refs, substitutions, and includes (best-effort), -logging any issues with pointers to the source page. - -## Realm Docs Converter - -### Prerequisites -- Node.js >= 18 (for built-in fetch) -- npm >= 8 -- This project checked out locally - -### Install dependencies (from project root) -``` -npm ci -``` - -## Build the converter (from project root) -``` -npm run build -``` -This compiles TypeScript to `packages/realm-docs-converter/dist` and wires up the CLI bin (`dist/cli.js`). - -## Run the converter (Snooty Data API) -Default mode fetches pages from the Snooty Data API and converts them to Markdown. -``` -node packages/realm-docs-converter/dist/cli.js --project realm --out ./output --branch master --base-url https://snooty-data-api.mongodb.com -``` -- `--project`: Snooty project slug (default: `realm`). -- `--out`: Output directory (required). 
-- `--branch`: Branch to fetch (default: `master`). -- `--base-url`: Snooty Data API base URL (defaults to https://snooty-data-api.mongodb.com). - -The converter writes one `.md` file per page, mirroring the page path. A `conversion-warnings.log` file is written in the output directory if any unresolved includes/substitutions/refs occur. - -### Local directory fallback (optional) -For legacy/local conversion of a checked-out Snooty project directory containing `.txt/.rst` files, you can use: -``` -node packages/realm-docs-converter/dist/cli.js --local --out -``` - -## Shared images handling -- Many pages reference shared images with absolute paths like `/images/foo.png` (via `:figure:` directives in the source). -- The converter will copy shared images and rewrite references so the Markdown works offline: - - Local mode: if `/images` exists, it is copied to `/images`. - - API mode: if you set `REALM_DOCS_SHARED_IMAGES_DIR` to a local folder containing images, it is copied to `/images`. As a fallback, if a local `./images` directory exists where you run the CLI, that will be copied. - - When `/images` exists, the converter rewrites both Markdown image links like `![alt](/images/path.png)` and HTML `` to use relative paths from each page. - - If `/images` does not exist, absolute `/images/...` links are left as-is. - -## License - -This project is licensed under the [Apache 2.0 License](LICENSE). \ No newline at end of file diff --git a/snooty-to-md-converter/architecture.md b/snooty-to-md-converter/architecture.md deleted file mode 100644 index e69fb23..0000000 --- a/snooty-to-md-converter/architecture.md +++ /dev/null @@ -1,157 +0,0 @@ -# Architecture Overview - -This repository provides a toolchain to convert Snooty-based documentation projects to Markdown, with an emphasis on migrating the Realm docs. It is organized as a small npm workspace with a single package that exposes a CLI and a set of conversion utilities. 
- -- Workspace root: scripts, top-level README, licensing, and npm workspace wiring -- Package: packages/realm-docs-converter – the actual converter (TypeScript) - - CLI entrypoint: src/cli.ts - - Core conversion orchestrator: src/realm-docs-converter.ts - - Snooty Data API client: src/snooty-api.ts - - Snooty AST → Markdown renderer: src/ast-to-md.ts - - Fallback local RST utilities: src/converters/snooty.ts - -## High-level Flow - -There are two primary modes of operation: API mode (recommended) and Local mode (fallback). Both end with Markdown files written to an output directory, preserving the original page path structure. - -1. API Mode (default) - - CLI parses args and calls convertRealmDocsFromApi(). - - Snooty Data API is queried to obtain a list of pages and each page’s JSON AST. - - The AST is converted to Markdown via astToMarkdown(). - - Shared images may be copied to /images if provided locally. - - Image references beginning with /images/... are rewritten to be relative to each output file. - - Per-page warnings (e.g., unresolved substitutions/refs) are collected and written to conversion-warnings.log. - -2. Local Mode (legacy/fallback) - - CLI parses args --local --out and calls convertRealmDocs(). - - Local RST-like files (.txt/.rst) are recursively discovered under the input directory. - - Each file is read as text and processed by parseSnootyContent() to (best-effort) resolve includes, substitutions, and refs. - - The processed text is converted to Markdown by convertToMarkdown(). - - Shared images are copied from /images to /images when present, and image references are rewritten similarly to API mode. 
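The image rewriting step shared by both modes can be illustrated with a minimal sketch. This is not the deleted project's actual `rewriteImagePaths()` implementation: the function name `rewriteImageLinks`, the regular expression, and the sample paths are all hypothetical. It assumes POSIX-style output paths, handles only Markdown image syntax, and skips the check that the output images directory exists.

```typescript
import * as path from "node:path";

// Hypothetical helper for illustration only; the converter's real
// rewriteImagePaths() may differ in signature and behavior.
function rewriteImageLinks(
  markdown: string,
  outPath: string,
  outputRoot: string
): string {
  // Relative path from the page's directory to <outputRoot>/images.
  // path.posix is used so the result always contains forward slashes.
  const prefix = path.posix.relative(
    path.posix.dirname(outPath),
    path.posix.join(outputRoot, "images")
  );
  // Match only Markdown images whose URL starts with /images/,
  // e.g. ![alt](/images/path.png). HTML image tags are not handled here.
  return markdown.replace(
    /(!\[[^\]]*\]\()\/images\//g,
    (_match: string, open: string) => `${open}${prefix}/`
  );
}

// A page two folders below the output root (assumed example paths).
const nested = rewriteImageLinks(
  "![alt](/images/path.png)",
  "/out/sdk/node/quick-start.md",
  "/out"
);

// A page directly in the output root.
const topLevel = rewriteImageLinks("![a](/images/x.png)", "/out/index.md", "/out");

console.log(nested);
console.log(topLevel);
```

Computing the prefix per output file is what lets the Markdown resolve images offline regardless of how deeply a page is nested.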
- -## Repository Layout - -- package.json (root) - - Declares an npm workspace with packages/realm-docs-converter - - Scripts delegate to the package (build/start/dev) -- packages/realm-docs-converter/package.json - - name: realm-docs-converter - - bin: dist/cli.js (CLI entry) - - scripts: build via tsc - - engines: Node >= 18 - - deps: dotenv (for env), fs-extra (not heavily used; core logic mostly uses fs) -- packages/realm-docs-converter/tsconfig.json - - OutDir: dist, RootDir: src; CommonJS target ES2020 - -## Key Components - -1. CLI (src/cli.ts) - - Parses arguments: - - Default mode: API - - --project (default: realm) - - --out (required in API mode) - - --branch (default: master) - - --base-url (optional override for Snooty Data API) - - --local (switch to Local mode) with positional and --out - - Calls: - - convertRealmDocsFromApi({ project, outputDir, branch, baseUrl }) for API - - convertRealmDocs({ inputDir, outputDir, handleIncludes: true, handleSubstitutions: true, handleRefs: true }) for Local - - Logs total pages converted and basic usage help if args are missing. - -2. Conversion Orchestrator (src/realm-docs-converter.ts) - - convertRealmDocs(options: { inputDir, outputDir, handleIncludes, handleSubstitutions, handleRefs }) - - Removes and recreates the output directory on each run. - - Copies shared images from /images to /images when present. - - Recursively lists .txt/.rst files. - - For each file: - - Reads content, parses with parseSnootyContent() (includes/substitutions/refs), converts with convertToMarkdown(). - - Ensures the output directory exists, rewrites image paths, and writes a .md file mirroring the folder structure. - - convertRealmDocsFromApi(options: { project, outputDir, branch?, baseUrl? }) - - Removes and recreates the output directory on each run. 
- - Attempts to copy shared images into /images from: - - SHARED_IMAGES_DIR (if set and exists), or - - ./images next to the current working directory (fallback) - - Fetches pages and their ASTs via fetchSnootyProject(). - - Maintains a global substitutions map shared across pages. - - Converts each page’s AST to Markdown via astToMarkdown(), collects warnings with doc paths and writes them to conversion-warnings.log. - - Writes output .md files, normalizing paths and handling deletions indicated by the API. - - rewriteImagePaths(markdown, outPath, outputRoot) - - If /images exists, rewrites: - - Markdown image URLs: ![alt](/images/path.png) - - HTML - to be relative to the output file location. - - If images directory does not exist, leaves absolute /images/... references as-is. - -3. Snooty Data API Client (src/snooty-api.ts) - - fetchSnootyProject({ project, branch = 'master', baseUrl = 'https://snooty-data-api.mongodb.com' }) - - Tries multiple endpoints to obtain a pages index, then per-page ASTs. - - Normalizes results to an array of { path, ast } (or { path, deleted: true }). - - Handles a variety of index shapes (strings, objects, embedded asts) and skips non-page asset entries. - - Uses a liberal tryFetchAny() that supports JSON and NDJSON-like responses. - -4. AST → Markdown (src/ast-to-md.ts) - - astToMarkdown(root, { substitutions, onWarn, docPath }) - - Walks the Snooty AST and emits Markdown lines. - - Supported constructs (best-effort): - - Sections/titles → #..###### (adds anchors when ids/html_id are present internally) - - Paragraphs, inline formatting (emphasis, strong, inline code) - - Links: external via refuri; internal refs rendered as text labels - - Images → Markdown syntax - - Code/literal blocks (with language if provided) - - Lists (bulleted/ordered) - - Tables → GFM pipe tables - - Admonitions (note, tip, warning, etc.) 
→ blockquotes with a label - - Substitution references via substitutions map; unresolved ones trigger onWarn - - Emits warnings for unresolved substitutions/refs/includes with page path and optional position. - -5. Local RST Fallback (src/converters/snooty.ts) - - parseSnootyContent(text, { filePath, basePath, resolveIncludes, resolveSubstitutions, resolveRefs }) - - Best-effort handling of: - - .. include:: directives (project-root relative when starting with /) - - Substitution definitions and usages (.. |name| replace:: value, and |name|) - - :ref:`text