# @push.rocks/smartrobots a module for making sense of robots.txt ## Install To install `@push.rocks/smartrobots`, you need to have Node.js installed on your system. Once Node.js is installed, you can install `@push.rocks/smartrobots` by running the following command in your terminal: ```bash npm install @push.rocks/smartrobots --save ``` This command will download and install `@push.rocks/smartrobots` and its dependencies into your project's `node_modules` directory. ## Usage This tutorial will guide you through utilizing `@push.rocks/smartrobots`, a TypeScript module designed for interpreting `robots.txt` files of websites. By working through various scenarios, you'll learn how to efficiently parse and work with the data provided by `robots.txt`. ### Prerequisites Before diving into the examples, ensure you are familiar with basic TypeScript syntax and concepts. You also need a basic understanding of what `robots.txt` is and its significance in web development. ### Basic Setup First, ensure you have imported `Smartrobots` from the `@push.rocks/smartrobots` package in your TypeScript file: ```typescript import { Smartrobots } from '@push.rocks/smartrobots'; ``` Instantiate the `Smartrobots` class to get started: ```typescript const mySmartrobots = new Smartrobots(); ``` ### Parsing `robots.txt` from a URL One common use case is to parse the `robots.txt` file directly from a website. The `Smartrobots` class provides an easy-to-use method to achieve this, as demonstrated below: ```typescript async function parseRobotsFromUrl() { const url = 'https://example.com/robots.txt'; // Replace with the URL to the desired robots.txt try { const parsedData = await mySmartrobots.parseRobotsTxtFromUrl(url); console.log('Parsed robots.txt data:', parsedData); } catch (error) { console.error('Error parsing robots.txt from URL:', error); } } parseRobotsFromUrl(); ``` This function asynchronously fetches the `robots.txt` file from the specified URL and logs the parsed content to the console. ### Parsing a `robots.txt` String If you already have the contents of a `robots.txt` file as a string, you can parse it directly using the `parseRobotsTxt` method. Here's how: ```typescript async function parseRobotsFromString(robotsTxtString: string) { try { const parsedData = await mySmartrobots.parseRobotsTxt(robotsTxtString); console.log('Parsed robots.txt data:', parsedData); } catch (error) { console.error('Error parsing robots.txt string:', error); } } // Example robots.txt string const robotsTxtString = ` User-agent: * Disallow: /secret-page Sitemap: https://example.com/sitemap.xml `; parseRobotsFromString(robotsTxtString); ``` This function takes a string representation of a `robots.txt` file, parses it, and logs the results. In the example string, there are directives for user-agents and a sitemap URL. ### Understanding the Parsed Data The parsed data from `robots.txt` is returned as an object. In its current implementation, `@push.rocks/smartrobots` focuses on extracting sitemap URLs. Here's a sample output from parsing the example `robots.txt` string: ```json { "sitemaps": ["https://example.com/sitemap.xml"] } ``` You can extend the parsing logic based on your requirements to handle more directives from `robots.txt`. ### Conclusion `@push.rocks/smartrobots` provides a straightforward and efficient way to interpret `robots.txt` files in TypeScript projects. Whether you're fetching and parsing `robots.txt` from a URL or working with its contents as a string, this module simplifies the process, allowing you to focus on utilizing the data rather than parsing intricacies. Remember, `robots.txt` files are publicly accessible and should be used responsibly following web standards and etiquette. For more advanced use cases, consider contributing to or extending the functionality of `@push.rocks/smartrobots` to cover a broader range of directives and scenarios. ## License and Legal Information This repository contains open-source code that is licensed under the MIT License. A copy of the MIT License can be found in the [license](license) file within this repository. **Please note:** The MIT License does not grant permission to use the trade names, trademarks, service marks, or product names of the project, except as required for reasonable and customary use in describing the origin of the work and reproducing the content of the NOTICE file. ### Trademarks This project is owned and maintained by Task Venture Capital GmbH. The names and logos associated with Task Venture Capital GmbH and any related products or services are trademarks of Task Venture Capital GmbH and are not included within the scope of the MIT license granted herein. Use of these trademarks must comply with Task Venture Capital GmbH's Trademark Guidelines, and any usage must be approved in writing by Task Venture Capital GmbH. ### Company Information Task Venture Capital GmbH Registered at District court Bremen HRB 35230 HB, Germany For any legal inquiries or if you require further information, please contact us via email at hello@task.vc. By using this repository, you acknowledge that you have read this section, agree to comply with its terms, and understand that the licensing of the code does not imply endorsement by Task Venture Capital GmbH of any derivative works.