# @push.rocks/smartsitemap

> 🗺️ A comprehensive TypeScript sitemap library with a chainable builder API, supporting standard, news, image, video, and hreflang sitemaps with auto-splitting, streaming, validation, and RSS feed integration.

## Issue Reporting and Security

For reporting bugs, issues, or security vulnerabilities, please visit [community.foss.global/](https://community.foss.global/). This is the central community hub for all issue reporting. Developers who sign and comply with our contribution agreement and go through identification can also get a [code.foss.global/](https://code.foss.global/) account to submit Pull Requests directly.

## Install

```bash
pnpm install @push.rocks/smartsitemap
```

## ✨ Features

- 🔗 **Chainable Builder API**: Fluent, composable API where builder methods return `this`
- 📰 **News Sitemaps**: Google News-compatible with proper namespace handling
- 🖼️ **Image Sitemaps**: Full `image:image` extension support
- 🎬 **Video Sitemaps**: Full `video:video` extension with all fields
- 🌍 **hreflang / i18n**: `xhtml:link` alternate language annotations
- 📑 **Sitemap Index**: Automatic splitting at 50K URLs with index generation
- 🌊 **Streaming**: Node.js Readable stream for million-URL sitemaps
- ✅ **Validation**: URL validation, size limits, spec compliance checks
- 📊 **Statistics**: URL counts, image/video/news counts, size estimates
- 📑 **RSS/Atom Feed Import**: Convert feeds to sitemaps (unique feature!)
- 📄 **YAML Config**: Declarative sitemap definition from YAML
- 🗂️ **Multi-Format Output**: XML, TXT, JSON, gzipped buffer
- 🎨 **XSL Stylesheets**: Browser-viewable sitemaps
- 🔍 **Bidirectional Parsing**: Parse existing sitemaps back into structured data
- 💪 **Full TypeScript**: Complete type safety with exported interfaces

## Quick Start

```typescript
import { SmartSitemap } from '@push.rocks/smartsitemap';

// 3 lines to a valid sitemap 🚀
const xml = SmartSitemap.create()
  .addUrl('https://example.com/')
  .addUrl('https://example.com/about')
  .addUrl('https://example.com/blog')
  .toXml();
```

**Output:**

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
  <url>
    <loc>https://example.com/blog</loc>
  </url>
</urlset>
```

## Usage

### 🌐 Standard Sitemap with Full Control

```typescript
import { SmartSitemap } from '@push.rocks/smartsitemap';

const xml = SmartSitemap.create({ baseUrl: 'https://example.com' })
  .setDefaultChangeFreq('weekly')
  .setDefaultPriority(0.5)
  .setXslUrl('/sitemap.xsl')
  .add({
    loc: 'https://example.com/',
    changefreq: 'daily',
    priority: 1.0,
    lastmod: new Date(),
  })
  .add({
    loc: 'https://example.com/products',
    changefreq: 'daily',
    priority: 0.9,
    images: [
      { loc: 'https://example.com/img/product1.jpg', title: 'Product 1' },
    ],
  })
  .add({
    loc: 'https://example.com/blog/post-1',
    lastmod: '2025-01-15',
    alternates: [
      { hreflang: 'de', href: 'https://example.com/de/blog/post-1' },
      { hreflang: 'fr', href: 'https://example.com/fr/blog/post-1' },
    ],
  })
  .toXml();
```

### 🔗 Builder from a URL Array

```typescript
const builder = SmartSitemap.fromUrls([
  'https://example.com/',
  'https://example.com/about',
  'https://example.com/contact',
]);

const xml = builder
  .setDefaultChangeFreq('monthly')
  .toXml();
```

### 📰 News Sitemap

```typescript
const xml = SmartSitemap.createNews({
  publicationName: 'The Daily Tech',
  publicationLanguage: 'en',
})
  .addNewsUrl(
    'https://example.com/news/breaking-story',
    'Breaking: TypeScript 6.0 Released!',
    new Date(),
    ['typescript',
      'programming'],
  )
  .addNewsUrl(
    'https://example.com/news/another-story',
    'Node.js Gets Even Faster',
    new Date(),
  )
  .toXml();
```

### 📰 News Sitemap from RSS Feed

This is smartsitemap's standout feature: it builds a news sitemap directly from an RSS or Atom feed.

```typescript
// From a feed URL
const builder = SmartSitemap.createNews({
  publicationName: 'The Daily Tech',
  publicationLanguage: 'en',
});
await builder.importFromFeedUrl('https://thedailytech.com/rss/');
const xml = builder.toXml();

// Or as a one-liner with the static factory
const feedBuilder = await SmartSitemap.fromFeedUrl('https://example.com/rss/');
const feedXml = feedBuilder.toXml();
```

### 📰 News Sitemap from Articles

Works seamlessly with `@tsclass/tsclass` `IArticle` objects from your CMS:

```typescript
import type { content } from '@tsclass/tsclass';

const articles: content.IArticle[] = [/* from your CMS or database */];

const xml = SmartSitemap.fromArticles(articles, {
  publicationName: 'My Publication',
  publicationLanguage: 'en',
}).toXml();
```

### 🖼️ Image Sitemap

```typescript
const xml = SmartSitemap.create()
  .add({
    loc: 'https://example.com/gallery',
    images: [
      { loc: 'https://example.com/img/photo1.jpg', title: 'Sunset' },
      { loc: 'https://example.com/img/photo2.jpg', caption: 'Mountain view' },
    ],
  })
  .toXml();
```

### 🎬 Video Sitemap

```typescript
const xml = SmartSitemap.create()
  .add({
    loc: 'https://example.com/videos/tutorial',
    videos: [
      {
        thumbnailLoc: 'https://example.com/thumb.jpg',
        title: 'Getting Started with TypeScript',
        description: 'A comprehensive guide to TypeScript for beginners.',
        contentLoc: 'https://example.com/video.mp4',
        duration: 600,
        rating: 4.8,
        publicationDate: new Date(),
        tags: ['typescript', 'tutorial', 'programming'],
      },
    ],
  })
  .toXml();
```

### 🌍 hreflang / Internationalization

```typescript
const xml = SmartSitemap.create()
  .add({
    loc: 'https://example.com/page',
    alternates: [
      { hreflang: 'en', href: 'https://example.com/page' },
      { hreflang: 'de', href:
        'https://example.com/de/page' },
      { hreflang: 'fr', href: 'https://example.com/fr/page' },
      { hreflang: 'x-default', href: 'https://example.com/page' },
    ],
  })
  .toXml();
```

### 📑 Automatic Sitemap Index Splitting

When a builder holds more than `maxUrlsPerSitemap` URLs (50,000 by default), `toSitemapSet()` splits the output into several sitemaps plus a sitemap index:

```typescript
const builder = SmartSitemap.create({
  baseUrl: 'https://example.com',
  maxUrlsPerSitemap: 45000, // default is 50000
});

// Add hundreds of thousands of URLs
for (const page of allPages) {
  builder.addUrl(page.url, page.lastModified);
}

const set = builder.toSitemapSet();
// set.needsIndex === true
// set.indexXml → '...'
// set.sitemaps → [
//   { filename: 'sitemap-1.xml', xml: '...' },
//   { filename: 'sitemap-2.xml', xml: '...' },
//   { filename: 'sitemap-3.xml', xml: '...' },
// ]

// Or build an index manually
const index = SmartSitemap.createIndex()
  .addSitemap('https://example.com/sitemap-blog.xml')
  .addSitemap('https://example.com/sitemap-products.xml', new Date())
  .toXml();
```

### 🌊 Streaming for Large Sitemaps

For sitemaps with millions of URLs that can't fit in memory:

```typescript
import { createWriteStream } from 'fs';
import { createGzip } from 'zlib';
import { SitemapStream } from '@push.rocks/smartsitemap';

const stream = new SitemapStream();
const output = createWriteStream('/var/www/sitemap.xml.gz');
stream.pipe(createGzip()).pipe(output);

// Stream URLs from a database cursor
for await (const page of databaseCursor()) {
  stream.pushUrl({
    loc: page.url,
    lastmod: page.updatedAt,
    changefreq: 'weekly',
  });
}

stream.finish();
```

### 🔀 Merge, Dedupe, Filter & Sort

Combine multiple sitemap sources with collection operations:

```typescript
const blogSitemap = SmartSitemap.create()
  .setDefaultChangeFreq('weekly')
  .addFromArray(blogUrls);

const productSitemap = SmartSitemap.create()
  .setDefaultChangeFreq('daily')
  .addFromArray(productUrls);

const xml = SmartSitemap.create()
  .merge(blogSitemap)
  .merge(productSitemap)
  .dedupe()
  .filter(url =>
    !url.loc.includes('/deprecated/'))
  .sort((a, b) => a.loc.localeCompare(b.loc))
  .toXml();
```

### 📄 YAML Configuration

Define sitemaps declaratively:

```typescript
const yaml = `
baseUrl: https://example.com
defaults:
  priority: 0.5
urls:
  daily:
    - /
    - /blog
  weekly:
    - /docs
    - /tutorials
  monthly:
    - /about
    - /contact
  yearly:
    - /privacy
    - /terms
`;

const builder = await SmartSitemap.fromYaml(yaml);
const xml = builder.toXml();
```

### ✅ Validation

Catch errors before they reach search engines:

```typescript
const result = SmartSitemap.create()
  .addUrl('not-a-valid-url')
  .add({ loc: 'https://example.com/', priority: 1.5 }) // out of range
  .validate();

console.log(result.valid); // false
console.log(result.errors);
// [
//   { field: 'loc', message: 'Invalid URL: "not-a-valid-url"', url: 'not-a-valid-url' },
//   { field: 'priority', message: 'Priority must be between 0.0 and 1.0', url: 'https://example.com/' },
// ]
```

### 📊 Statistics

Get insight into your sitemap:

```typescript
const stats = SmartSitemap.create()
  .addUrl('https://example.com/')
  .add({ loc: 'https://example.com/gallery', images: [{ loc: '/img/1.jpg' }] })
  .stats();

console.log(stats);
// {
//   urlCount: 2,
//   imageCount: 1,
//   videoCount: 0,
//   newsCount: 0,
//   alternateCount: 0,
//   estimatedSizeBytes: 750,
//   needsIndex: false,
// }
```

### 🗂️ Multi-Format Output

```typescript
const builder = SmartSitemap.create()
  .addUrl('https://example.com/')
  .addUrl('https://example.com/about');

// XML (default)
const xml = builder.toXml();

// Plain text (one URL per line)
const txt = builder.toTxt();
// "https://example.com/\nhttps://example.com/about"

// JSON
const json = builder.toJson();

// Gzipped XML buffer (for serving compressed)
const gzipped = await builder.toGzipBuffer();
```

### 🔍 Parse Existing Sitemaps

Read and parse sitemaps back into structured data:

```typescript
// SitemapParser is assumed to be exported alongside SmartSitemap
import { SmartSitemap, SitemapParser } from '@push.rocks/smartsitemap';

// From URL
const parsed = await SmartSitemap.parseUrl('https://example.com/sitemap.xml');
console.log(parsed.type); // 'urlset' or 'sitemapindex'
console.log(parsed.urls); // ISitemapUrl[]

// From XML string
const result = await SmartSitemap.parse(sitemapXmlString);

// Parse and get a pre-populated builder for modification
const builder = await SitemapParser.toBuilder(existingSitemapXml);
const updatedXml = builder
  .addUrl('https://example.com/new-page')
  .filter(url => !url.loc.includes('/old/'))
  .toXml();

// Detect type without full parsing
SitemapParser.detectType('<urlset>...</urlset>');             // 'urlset'
SitemapParser.detectType('<sitemapindex>...</sitemapindex>'); // 'sitemapindex'
```

## 🏗️ Real-World Integration Examples

### Express.js / Hono / Fastify Server

```typescript
import { SmartSitemap } from '@push.rocks/smartsitemap';

// `app` is your Express-style application instance

// Serve dynamic sitemap
app.get('/sitemap.xml', async (req, res) => {
  const xml = SmartSitemap.create()
    .setDefaultChangeFreq('weekly')
    .addFromArray(await getUrlsFromDatabase())
    .toXml();

  res.header('Content-Type', 'application/xml');
  res.send(xml);
});

// Serve news sitemap from RSS
app.get('/news-sitemap.xml', async (req, res) => {
  const builder = SmartSitemap.createNews({ publicationName: 'My Site' });
  await builder.importFromFeedUrl('https://mysite.com/rss/');

  res.header('Content-Type', 'application/xml');
  res.send(builder.toXml());
});

// Auto-split with sitemap index
app.get('/sitemap-index.xml', async (req, res) => {
  const builder = SmartSitemap.create({ baseUrl: 'https://mysite.com' });
  builder.addFromArray(await getAllUrls()); // 200K+ URLs

  const set = builder.toSitemapSet();
  res.header('Content-Type', 'application/xml');
  res.send(set.indexXml ??
    set.sitemaps[0].xml);
});
```

### Static Site Generator

```typescript
import { SmartSitemap } from '@push.rocks/smartsitemap';
import { writeFileSync } from 'fs';

const xml = SmartSitemap.create()
  .setDefaultChangeFreq('weekly')
  .add({ loc: 'https://mysite.com/', changefreq: 'daily', priority: 1.0 })
  .add({ loc: 'https://mysite.com/about', changefreq: 'monthly' })
  .addFromArray(blogPostUrls)
  .dedupe()
  .toXml();

writeFileSync('./public/sitemap.xml', xml);
```

## API Reference

### SmartSitemap (Static Factories)

| Method | Returns | Description |
|--------|---------|-------------|
| `SmartSitemap.create(options?)` | `UrlsetBuilder` | Create a standard sitemap builder |
| `SmartSitemap.createNews(options)` | `NewsSitemapBuilder` | Create a news sitemap builder |
| `SmartSitemap.createIndex(options?)` | `SitemapIndexBuilder` | Create a sitemap index builder |
| `SmartSitemap.fromUrls(urls, options?)` | `UrlsetBuilder` | Builder from URL string array |
| `SmartSitemap.fromYaml(yaml)` | `Promise` | Builder from YAML config |
| `SmartSitemap.fromFeedUrl(url, options?)` | `Promise` | Builder from RSS/Atom feed URL |
| `SmartSitemap.fromFeedString(xml, options?)` | `Promise` | Builder from RSS/Atom feed string |
| `SmartSitemap.fromArticles(articles, options)` | `NewsSitemapBuilder` | Builder from IArticle array |
| `SmartSitemap.parse(xml)` | `Promise` | Parse sitemap XML string |
| `SmartSitemap.parseUrl(url)` | `Promise` | Fetch and parse sitemap |
| `SmartSitemap.validate(xml)` | `Promise` | Validate sitemap XML |

### UrlsetBuilder (Chainable)

| Method | Returns | Description |
|--------|---------|-------------|
| `.add(url)` | `this` | Add a URL with full `ISitemapUrl` options |
| `.addUrl(loc, lastmod?)` | `this` | Add by URL string |
| `.addUrls(urls)` | `this` | Add multiple `ISitemapUrl` objects |
| `.addFromArray(locs)` | `this` | Add from plain string array |
| `.merge(other)` | `this` | Merge in another builder's URLs |
| `.filter(predicate)` | `this` | Filter URLs in-place |
| `.map(transform)` | `this` | Transform URLs in-place |
| `.sort(compareFn?)` | `this` | Sort URLs (default: alphabetical) |
| `.dedupe()` | `this` | Remove duplicate URLs by loc |
| `.setDefaultChangeFreq(freq)` | `this` | Set default changefreq |
| `.setDefaultPriority(priority)` | `this` | Set default priority (0.0–1.0) |
| `.setXslUrl(url)` | `this` | Set XSL stylesheet URL |
| `.importFromFeedUrl(url, options?)` | `Promise` | Import from RSS/Atom feed URL |
| `.importFromFeedString(xml, options?)` | `Promise` | Import from RSS/Atom feed string |
| `.importFromYaml(yaml)` | `Promise` | Import from YAML config |
| `.importFromArticles(articles)` | `this` | Import from IArticle array |
| `.toXml()` | `string` | Export as sitemap XML |
| `.toTxt()` | `string` | Export as plain text |
| `.toJson()` | `string` | Export as JSON |
| `.toGzipBuffer()` | `Promise` | Export as gzipped XML |
| `.toSitemapSet()` | `ISitemapSet` | Auto-split with index |
| `.toStream()` | `SitemapStream` | Export as Node.js Readable stream |
| `.validate()` | `IValidationResult` | Validate against spec |
| `.stats()` | `ISitemapStats` | Get statistics |
| `.getUrls()` | `ISitemapUrl[]` | Get the raw URL array |
| `.count` | `number` | Get URL count |

### NewsSitemapBuilder (extends UrlsetBuilder)

| Method | Returns | Description |
|--------|---------|-------------|
| `.addNewsUrl(loc, title, date, keywords?)` | `this` | Add a news article with publication info |

### SitemapIndexBuilder

| Method | Returns | Description |
|--------|---------|-------------|
| `.add(entry)` | `this` | Add a sitemap index entry |
| `.addSitemap(loc, lastmod?)` | `this` | Add by URL string |
| `.addSitemaps(entries)` | `this` | Add multiple entries |
| `SitemapIndexBuilder.fromBuilder(builder, baseUrl)` | `{index, sitemaps[]}` | Auto-split a builder |
| `.toXml()` | `string` | Export as sitemap index XML |
| `.count` | `number` | Get entry count |

### SitemapStream

| Method | Description |
|--------|-------------|
| `.pushUrl(url)` | Push a URL entry to the stream |
| `.finish()` | Signal end of stream, writes closing tag |
| `.count` | Number of URLs written |

### Key Types

```typescript
interface ISitemapUrl {
  loc: string;                        // Required: absolute URL
  lastmod?: Date | string | number;   // Date, ISO string, or timestamp (ms)
  changefreq?: TChangeFreq;           // 'always'|'hourly'|'daily'|'weekly'|'monthly'|'yearly'|'never'
  priority?: number;                  // 0.0 to 1.0
  images?: ISitemapImage[];           // Image extension
  videos?: ISitemapVideo[];           // Video extension
  news?: ISitemapNews;                // News extension
  alternates?: ISitemapAlternate[];   // hreflang alternates
}

interface ISitemapOptions {
  baseUrl?: string;
  xslUrl?: string;
  defaultChangeFreq?: TChangeFreq;
  defaultPriority?: number;
  prettyPrint?: boolean;              // default: true
  maxUrlsPerSitemap?: number;         // default: 50000
  gzip?: boolean;
  validate?: boolean;                 // default: true
}

interface INewsSitemapOptions extends ISitemapOptions {
  publicationName: string;            // Required
  publicationLanguage?: string;       // default: 'en'
}
```

## License and Legal Information

This repository contains open-source code licensed under the MIT License. A copy of the license can be found in the [LICENSE](./LICENSE) file.

**Please note:** The MIT License does not grant permission to use the trade names, trademarks, service marks, or product names of the project, except as required for reasonable and customary use in describing the origin of the work and reproducing the content of the NOTICE file.

### Trademarks

This project is owned and maintained by Task Venture Capital GmbH. The names and logos associated with Task Venture Capital GmbH and any related products or services are trademarks of Task Venture Capital GmbH or third parties, and are not included within the scope of the MIT license granted herein.
Use of these trademarks must comply with Task Venture Capital GmbH's Trademark Guidelines or the guidelines of the respective third-party owners, and any usage must be approved in writing. Third-party trademarks used herein are the property of their respective owners and used only in a descriptive manner, e.g. for an implementation of an API or similar.

### Company Information

Task Venture Capital GmbH
Registered at District Court Bremen HRB 35230 HB, Germany

For any legal inquiries or further information, please contact us via email at hello@task.vc.

By using this repository, you acknowledge that you have read this section, agree to comply with its terms, and understand that the licensing of the code does not imply endorsement by Task Venture Capital GmbH of any derivative works.