feat: enhance CodeFeed with repo allowlist/denylist, optional timestamp filtering, and verbose logging

This commit is contained in:
2025-09-14 20:27:51 +00:00
parent 98f5c466a6
commit f21aa58c18
5 changed files with 214 additions and 138 deletions

View File

@@ -11,7 +11,7 @@
"author": "Task Venture Capital GmbH",
"license": "MIT",
"scripts": {
"test": "(tstest test/ --web)",
"test": "(tstest test/ --verbose)",
"build": "(tsbuild tsfolders --web --allowimplicitany)",
"buildDocs": "(tsdoc)"
},

220
readme.md
View File

@@ -1,143 +1,99 @@
```markdown
# @foss.global/codefeed
A module for creating feeds for code development.
Generate an activity feed from a Gitea instance. Scans orgs and repos, retrieves commits since a configurable timestamp, enriches with tags, optional npm publish detection, and CHANGELOG snippets.
## Install
To install the `@foss.global/codefeed` package, you can run the following npm command in your project directory:
```bash
pnpm add @foss.global/codefeed
# or
npm i @foss.global/codefeed
```
Requires Node.js 18+ (global fetch/Request/Response) and ESM.
## Quick Start
```ts
import { CodeFeed } from '@foss.global/codefeed';
// Fetch commits since one week ago (default), no caching
const feed = new CodeFeed('https://code.example.com', 'gitea_token');
const commits = await feed.fetchAllCommitsFromInstance();
console.log(commits);
```
### With options
```ts
const thirtyDays = 30 * 24 * 60 * 60 * 1000;
const since = new Date(Date.now() - thirtyDays).toISOString();
const feed = new CodeFeed('https://code.example.com', 'gitea_token', since, {
enableCache: true, // keep results in memory
cacheWindowMs: thirtyDays, // trim cache to this window
enableNpmCheck: true, // check npm for published versions
taggedOnly: false, // return all commits (or only tagged)
orgAllowlist: ['myorg'], // only scan these orgs
orgDenylist: ['archive'], // skip these orgs
repoAllowlist: ['myorg/app1', 'myorg/app2'], // only these repos
repoDenylist: ['myorg/old-repo'], // skip these repos
untilTimestamp: new Date().toISOString(), // optional upper bound
verbose: true, // print a short metrics summary
});
const commits = await feed.fetchAllCommitsFromInstance();
```
Each returned item follows this shape:
```ts
interface ICommitResult {
baseUrl: string;
org: string;
repo: string;
timestamp: string; // ISO date
hash: string; // commit SHA
commitMessage: string;
tagged: boolean; // commit is pointed to by a tag
publishedOnNpm: boolean; // only when npm check enabled and tag matches
prettyAgoTime: string; // human-readable diff
changelog: string | undefined; // snippet for matching tag version
}
```
## Features
- Pagination for orgs, repos, commits, and tags (no missing pages)
- Retries with exponential backoff for 429/5xx and network errors
- CHANGELOG discovery with case variants (`CHANGELOG.md`, `changelog.md`, `docs/CHANGELOG.md`)
- Tag-to-version mapping based on tag names (`vX.Y.Z``X.Y.Z`)
- Optional npm publish detection via `@org/repo` package versions
- In-memory caching with window trimming and stable sorting
- Allow/deny filters for orgs and repos, optional time upper bound
- One-line metrics summary when `verbose: true`
## Environment
- Gitea base URL and an optional token with read access
- Node.js 18+ (global fetch)
## Testing
The repo contains:
- An integration test using a `GITEA_TOKEN` from `.nogit/` via `@push.rocks/qenv`.
- A mocked pagination test that does not require network.
Run tests:
```bash
npm install @foss.global/codefeed
pnpm test
```
Ensure that you have a compatible version of Node.js installed and that your project is set up to support ECMAScript modules. The `@foss.global/codefeed` module uses ESM syntax.
For the integration test, ensure `GITEA_TOKEN` is provided (e.g., via `.nogit/` as used in `test/test.ts`).
## Usage
## Notes
The `@foss.global/codefeed` package is designed to help developers generate feeds for code developments, specifically targeting Gitea repositories. It fetches and processes commit data, changelogs, and repository activities for further analysis or visualization. Here, we'll delve into how you can utilize the different features of the `CodeFeed` class.
### Setting Up CodeFeed
To get started, import the `CodeFeed` class from the module:
```typescript
import { CodeFeed } from '@foss.global/codefeed';
```
Then, create an instance of `CodeFeed`. You'll need the base URL of your Gitea instance and optionally an API token if your repositories require authentication.
```typescript
// default: fetch commits since 7 days ago, no caching or npm checks, include all commits
const codeFeed = new CodeFeed(
'https://your-gitea-instance-url.com',
'your-api-token'
);
// with options: cache commits in-memory for 30 days, disable npm lookups, return only tagged commits
const thirtyDays = 30 * 24 * 60 * 60 * 1000;
const codeFeedStateful = new CodeFeed(
'https://your-gitea-instance-url.com',
'your-api-token',
undefined, // defaults to 7 days ago
{
enableCache: true,
cacheWindowMs: thirtyDays,
enableNpmCheck: false,
taggedOnly: true,
}
);
```
The constructor can also accept a `lastRunTimestamp` which indicates the last time a sync was performed. If not provided, it defaults to one week (7 days) prior to the current time.
### Fetching Commits
One of the core functionalities of CodeFeed is fetching commits from a Gitea instance. By calling `fetchAllCommitsFromInstance`, you can retrieve commits across multiple repositories:
```typescript
(async () => {
try {
const commits = await codeFeed.fetchAllCommitsFromInstance();
console.log(commits);
} catch (error) {
console.error('An error occurred while fetching commits:', error);
}
})();
```
This method scans all organizations and repositories, fetches all commits since the constructors `lastRunTimestamp` (default: one week ago), and enriches them with metadata like:
- Git tags (to detect releases)
- npm publication status (when enabled)
- parsed changelog entries (when available)
When `taggedOnly` is enabled, only commits marked as release tags are returned. When `enableCache` is enabled, previously fetched commits are kept in memory (up to `cacheWindowMs`), and only new commits are fetched on subsequent calls.
Each commit object in the resulting array conforms to the `ICommitResult` interface, containing details such as:
- `baseUrl`
- `org`
- `repo`
- `timestamp`
- `hash`
- `commitMessage`
- `tagged` (boolean)
- `publishedOnNpm` (boolean)
- `prettyAgoTime` (human-readable relative time)
- `changelog` (text from the `changelog.md` associated with a commit)
### Understanding the Data Fetch Process
#### Fetching Organizations
The `fetchAllOrganizations` method collects all organizations within the Gitea instance:
```typescript
const organizations = await codeFeed.fetchAllOrganizations();
console.log('Organizations:', organizations);
```
This method interacts with the Gitea API to pull organization names, aiding further requests that require organization context.
#### Fetching Repositories
Repositories under these organizations can be retrieved using `fetchAllRepositories`:
```typescript
const repositories = await codeFeed.fetchAllRepositories();
console.log('Repositories:', repositories);
```
Here, filtering by organization can help narrow down the scope further when dealing with large instances.
#### Fetching Tags and Commits
To handle repository-specific details, use:
- `fetchTags(owner: string, repo: string)`: Appropriately handles paginated tag data within a repository.
- `fetchRecentCommitsForRepo(owner: string, repo: string)`: Gathers commit data specific to the past 24 hours for a given repository.
```typescript
const tags = await codeFeed.fetchTags('orgName', 'repoName');
const recentCommits = await codeFeed.fetchRecentCommitsForRepo('orgName', 'repoName');
console.log('Tags:', tags);
console.log('Recent Commits:', recentCommits);
```
### Changelog Integration
Loading changelog content from a repository is integrated into the flow with `loadChangelogFromRepo`. This can be accessed when processing specific commits:
```typescript
await codeFeed.loadChangelogFromRepo('org', 'repo');
const changelog = codeFeed.getChangelogForVersion('1.0.0');
console.log('Changelog for version 1.0.0:', changelog);
```
### Conclusion
The `@foss.global/codefeed` module provides robust capabilities for extracting and managing feed data related to code developments in Gitea environments. Through systematic setup and leveraging API-driven methods, it becomes a valuable tool for developers aiming to keep track of software progress and changes efficiently. The integration hooks like changelog and npm verification further enrich its utility, offering consolidated insights into each commit's journey from codebase to published package.
Explore integrating these capabilities into your development workflows to enhance tracking, deployment pipelines, or analytics systems within your projects. Remember to always handle API tokens securely and adhere to best practices when managing access to repository resources. Stay updated on any changes or enhancements to this module for further feature exposures or bug fixes. Happy coding!
```
undefined
- When `taggedOnly` is enabled, the feed includes only commits associated with tags.
- `publishedOnNpm` is computed by matching the tag-derived version against the npm registry for `@org/repo`.
- For very large instances, consider using allowlists/denylists and enabling caching for incremental runs.

View File

@@ -0,0 +1,82 @@
import { expect, tap } from '@push.rocks/tapbundle';
import { CodeFeed } from '../ts/index.js';
// A subclass to mock fetchFunction for controlled pagination tests
class MockCodeFeed extends CodeFeed {
private data: Record<string, any>;
constructor() {
super('https://mock', undefined, '2024-01-01T00:00:00.000Z', {
enableCache: false,
enableNpmCheck: false,
taggedOnly: false,
verbose: false,
});
// Prepare mock datasets
const commit = (sha: string, date: string, message = 'chore: update') => ({
sha,
commit: { author: { date }, message },
});
const commitsPage1 = Array.from({ length: 50 }).map((_, i) =>
commit(`sha-${i}`, `2024-01-0${(i % 9) + 1}T00:00:00.000Z`)
);
const commitsPage2 = [commit('sha-50', '2024-01-10T00:00:00.000Z'), commit('sha-tagged', '2024-01-11T00:00:00.000Z')];
const tagsPage1 = [
{ name: 'v1.2.3', commit: { sha: 'sha-tagged' } },
];
const changelogContent = Buffer.from(
[
'# Changelog',
'',
'## 2024-01-11 - 1.2.3 - Release',
'* example change',
'',
].join('\n'),
'utf8'
).toString('base64');
this.data = {
'/api/v1/orgs?limit=50&page=1': [{ username: 'org1' }],
'/api/v1/orgs?limit=50&page=2': [],
'/api/v1/orgs/org1/repos?limit=50&page=1': [{ name: 'repo1' }],
'/api/v1/orgs/org1/repos?limit=50&page=2': [],
'/api/v1/repos/org1/repo1/commits?limit=1': [commit('probe', '2024-01-12T00:00:00.000Z')],
'/api/v1/repos/org1/repo1/commits?since=2024-01-01T00%3A00%3A00.000Z&limit=50&page=1': commitsPage1,
'/api/v1/repos/org1/repo1/commits?since=2024-01-01T00%3A00%3A00.000Z&limit=50&page=2': commitsPage2,
'/api/v1/repos/org1/repo1/commits?since=2024-01-01T00%3A00%3A00.000Z&limit=50&page=3': [],
'/api/v1/repos/org1/repo1/tags?limit=50&page=1': tagsPage1,
'/api/v1/repos/org1/repo1/tags?limit=50&page=2': [],
'/api/v1/repos/org1/repo1/contents/CHANGELOG.md': { content: changelogContent },
};
}
public async fetchFunction(urlArg: string, _optionsArg: RequestInit = {}): Promise<Response> {
const payload = this.data[urlArg];
if (payload === undefined) {
return new Response('Not found', { status: 404, statusText: 'Not Found' });
}
return new Response(JSON.stringify(payload), { status: 200, headers: { 'content-type': 'application/json' } });
}
}
let mockFeed: MockCodeFeed;
tap.test('mock: pagination and tag mapping', async () => {
mockFeed = new MockCodeFeed();
const results = await mockFeed.fetchAllCommitsFromInstance();
// ensure we received > 50 commits from two pages
expect(results).toBeArray();
expect(results.length).toBeGreaterThan(50);
// ensure tagged commit is present and has changelog attached when found
const tagged = results.find((r) => r.hash === 'sha-tagged');
expect(tagged).toBeTruthy();
expect(tagged!.tagged).toBeTrue();
// changelog is present for that version (via tag name)
expect(tagged!.changelog).toBeTypeofString();
});
tap.start();

View File

@@ -1,4 +1,4 @@
import { expect, expectAsync, tap } from '@push.rocks/tapbundle';
import { expect, tap } from '@push.rocks/tapbundle';
import * as codefeed from '../ts/index.js';
import * as qenv from '@push.rocks/qenv';
const testQenv = new qenv.Qenv('./', '.nogit/');

View File

@@ -20,6 +20,10 @@ export class CodeFeed {
// allow/deny filters
private orgAllowlist?: string[];
private orgDenylist?: string[];
private repoAllowlist?: string[]; // entries like "org/repo"
private repoDenylist?: string[]; // entries like "org/repo"
private untilTimestamp?: string; // optional upper bound on commit timestamps
private verbose?: boolean; // optional metrics logging
constructor(
baseUrl: string,
@@ -32,6 +36,10 @@ export class CodeFeed {
taggedOnly?: boolean;
orgAllowlist?: string[];
orgDenylist?: string[];
repoAllowlist?: string[];
repoDenylist?: string[];
untilTimestamp?: string;
verbose?: boolean;
}
) {
this.baseUrl = baseUrl;
@@ -45,6 +53,10 @@ export class CodeFeed {
this.enableTaggedOnly = options?.taggedOnly ?? false;
this.orgAllowlist = options?.orgAllowlist;
this.orgDenylist = options?.orgDenylist;
this.repoAllowlist = options?.repoAllowlist;
this.repoDenylist = options?.repoDenylist;
this.untilTimestamp = options?.untilTimestamp;
this.verbose = options?.verbose ?? false;
this.cache = [];
// npm registry instance for version lookups
this.npmRegistry = new plugins.smartnpm.NpmRegistry();
@@ -85,9 +97,18 @@ export class CodeFeed {
)
);
// flatten to [{ owner, name }]
const allRepos = orgs.flatMap((org, i) =>
let allRepos = orgs.flatMap((org, i) =>
repoLists[i].map((r) => ({ owner: org, name: r.name }))
);
// apply repo allow/deny filters using slug "org/repo"
if (this.repoAllowlist && this.repoAllowlist.length > 0) {
const allow = new Set(this.repoAllowlist.map((s) => s.toLowerCase()));
allRepos = allRepos.filter(({ owner, name }) => allow.has(`${owner}/${name}`.toLowerCase()));
}
if (this.repoDenylist && this.repoDenylist.length > 0) {
const deny = new Set(this.repoDenylist.map((s) => s.toLowerCase()));
allRepos = allRepos.filter(({ owner, name }) => !deny.has(`${owner}/${name}`.toLowerCase()));
}
// 3) probe latest commit per repo and fetch full list only if new commits exist
const commitJobs = allRepos.map(({ owner, name }) =>
@@ -127,12 +148,14 @@ export class CodeFeed {
// 4) build new commit entries with tagging, npm and changelog support
const newResults: plugins.interfaces.ICommitResult[] = [];
let reposWithNewCommits = 0;
for (const { owner, name, commits } of commitResults) {
// skip repos with no new commits
if (commits.length === 0) {
this.changelogContent = '';
continue;
}
reposWithNewCommits++;
// load changelog for this repo
await this.loadChangelogFromRepo(owner, name);
// fetch tags for this repo
@@ -178,6 +201,13 @@ export class CodeFeed {
changelogEntry = this.getChangelogForVersion(versionFromTag);
}
}
// optionally enforce an upper bound on commit timestamps
if (this.untilTimestamp) {
const ts = new Date(c.commit.author.date).getTime();
if (ts > new Date(this.untilTimestamp).getTime()) {
continue;
}
}
newResults.push({
baseUrl: this.baseUrl,
org: owner,
@@ -212,6 +242,11 @@ export class CodeFeed {
if (this.enableTaggedOnly) {
return this.cache.filter((c) => c.tagged === true);
}
if (this.verbose) {
console.log(
`[CodeFeed] orgs=${orgs.length} repos=${allRepos.length} reposWithNew=${reposWithNewCommits} commits=${this.cache.length} (cached)`
);
}
return this.cache;
}
// no caching: apply tagged-only filter if requested
@@ -223,10 +258,13 @@ export class CodeFeed {
return true;
});
unique.sort((a, b) => b.timestamp.localeCompare(a.timestamp));
if (this.enableTaggedOnly) {
return unique.filter((c) => c.tagged === true);
const result = this.enableTaggedOnly ? unique.filter((c) => c.tagged === true) : unique;
if (this.verbose) {
console.log(
`[CodeFeed] orgs=${orgs.length} repos=${allRepos.length} reposWithNew=${reposWithNewCommits} commits=${result.length}`
);
}
return unique;
return result;
}
/**