Hybrid \ On-prem

Infra BOM Milestone Gatherers

The Git Activity Gatherer connects to Git providers (GitHub, GitLab, Bitbucket, Azure Repos), discovers and/or iterates repositories, optionally clones repositories, extracts analytics/metadata, and exports results to object storage (e.g., S3) or SFTP. The service runs as a single containerized application on a Linux host.

This page covers the following for integrating with our platform:

  • Instructions for creating a dedicated on-prem machine, including SSH access for setup and ongoing maintenance.

  • Access to your Git provider and project management (PM) systems.

Dedicated On-prem machine Specs:

1. Hardware Requirements (by deployment size)

These figures are guidelines. Actual sizing depends on repo count, repo size/history depth, and concurrency.

CPU: 16 vCPU

MEMORY: 32 GB RAM

STORAGE: 1 TB+ SSD

Disk Breakdown

  • Application & images: ~10 GB

  • Cache/working clones: 100-800 GB (dominant, depends on repo sizes & concurrency)

  • Logs: 10-20 GB (rotate/retain as per policy)

  • Buffer/working headroom: +20-30% of the above
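The disk breakdown above can be turned into a quick sizing estimate. The following is a minimal sketch, not part of the product: the function name `estimate_disk_gb` and its parameters are hypothetical, and the default values are simply taken from the figures listed above.

```python
def estimate_disk_gb(cache_gb, logs_gb, app_gb=10.0, headroom=0.25):
    """Estimate total disk in GB: application + cache + logs, plus headroom.

    headroom is a fraction (0.20 to 0.30 per the breakdown above).
    All names and defaults here are illustrative assumptions.
    """
    if not 0.0 <= headroom <= 1.0:
        raise ValueError("headroom must be a fraction between 0 and 1")
    base = app_gb + cache_gb + logs_gb
    return base * (1.0 + headroom)


if __name__ == "__main__":
    # Low end: 100 GB cache, 10 GB logs, 20% headroom.
    low = estimate_disk_gb(cache_gb=100, logs_gb=10, headroom=0.20)
    # High end: 800 GB cache, 20 GB logs, 30% headroom.
    high = estimate_disk_gb(cache_gb=800, logs_gb=20, headroom=0.30)
    print(f"Estimated disk: {low:.0f} GB to {high:.0f} GB")
```

At the high end of each range the estimate lands slightly above 1 TB, which is consistent with the 1 TB+ SSD recommendation.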

2. Operating System Requirements (64‑bit only)

Supported

  • Ubuntu 20.04 LTS, 22.04 LTS

  • Debian 10/11 (or newer stable)

  • RHEL/Rocky 8+

  • CentOS Stream 8-9

  • Amazon Linux 2 / 2023

Required OS Features

  • systemd (for service management)

  • 64‑bit kernel (≥ 3.10)

  • Working DNS & NTP

  • SSD storage recommended (high I/O)
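Some of the OS features above can be checked before installation. The sketch below is an illustrative preflight script, not a tool shipped with the service. The helper names are hypothetical, and the set of machine strings treated as 64-bit is an assumption, since this page does not list supported CPU architectures.

```python
import os
import platform
import re

MIN_KERNEL = (3, 10)  # minimum kernel version stated in the requirements above

# Assumption: common 64-bit machine identifiers; extend as needed.
ARCH_64BIT = {"x86_64", "amd64", "aarch64", "arm64"}


def parse_kernel_release(release):
    """Extract (major, minor) from a string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_ok(release, minimum=MIN_KERNEL):
    """True if the kernel release is at least the required version."""
    return parse_kernel_release(release) >= minimum


def is_64bit(machine):
    """True if the machine string looks like a 64-bit architecture."""
    return machine.lower() in ARCH_64BIT


def systemd_running():
    """True if the host was booted with systemd.

    The sd_booted(3) man page documents the existence of this
    directory as the way to detect a systemd-booted system.
    """
    return os.path.isdir("/run/systemd/system")


if __name__ == "__main__":
    if platform.system() == "Linux":
        print("kernel >= 3.10:", kernel_ok(platform.release()))
        print("64-bit:", is_64bit(platform.machine()))
        print("systemd:", systemd_running())
    else:
        print("This check is intended for Linux hosts.")
```

DNS and NTP are not covered here because the right check depends on the resolvers and time sources your organization uses.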

3. Runtime Requirements (Docker only)

  • Docker Engine: 20.10+ (24.x recommended)

  • User permissions: the service user must be in the docker group or have sudo rights for Docker

  • Socket: /var/run/docker.sock accessible to the service user
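The runtime requirements above can be verified with a short script run as the service user. This is a sketch under assumptions: the helper names are hypothetical, and it relies on the standard `docker version --format '{{.Server.Version}}'` command being available on the PATH.

```python
import os
import re
import subprocess

MIN_DOCKER = (20, 10)  # minimum Docker Engine version stated above
DOCKER_SOCKET = "/var/run/docker.sock"


def parse_docker_version(text):
    """Extract (major, minor) from a version string such as '24.0.7'."""
    match = re.match(r"\s*(\d+)\.(\d+)", text)
    if not match:
        raise ValueError(f"unrecognized Docker version: {text!r}")
    return int(match.group(1)), int(match.group(2))


def docker_version_ok(text, minimum=MIN_DOCKER):
    """True if the version string meets the minimum Engine version."""
    return parse_docker_version(text) >= minimum


def socket_accessible(path=DOCKER_SOCKET):
    """True if the current user can read and write the Docker socket."""
    return os.path.exists(path) and os.access(path, os.R_OK | os.W_OK)


def server_version():
    """Return the Docker Engine version, or None if it cannot be queried."""
    try:
        result = subprocess.run(
            ["docker", "version", "--format", "{{.Server.Version}}"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = server_version()
    if version is None:
        print("Docker Engine not reachable (not installed or no permission)")
    else:
        print(f"Docker {version}, meets minimum: {docker_version_ok(version)}")
    print("Socket accessible:", socket_accessible())
```

If the service user relies on sudo rather than docker group membership, the socket check will report False for that user even though Docker is usable through sudo.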

4. Network & Firewall Requirements

The service is outbound‑only. No public inbound ports are required.

4.1 Outbound (Required)

  • 443/TCP — HTTPS to provider APIs and Git operations over HTTPS (Git providers, object storage, analytics APIs), pulling Docker images from registries.

  • 53/TCP+UDP — DNS resolution

4.2 Outbound (Optional)

  • 22/TCP — SFTP upload to private SFTP storage (if used)

  • 80/TCP — HTTP to on‑prem/legacy endpoints (if applicable)

  • 8080/TCP or 3128/TCP — Proxy egress (corporate environments)

  • 123/UDP — NTP for time sync (recommended)

4.3 Inbound (Optional)

  • 22/TCP — SSH admin (restrict to trusted IPs)

  • 8080/TCP — Local health endpoint (bind to 127.0.0.1 only; not internet‑exposed)

4.4 Destinations

  • Git providers: API + Git over HTTPS (FQDNs per organization policy)

  • Object storage: S3 or S3‑compatible endpoint over HTTPS

  • Container registry

  • Proxy: Corporate egress proxy where applicable

  • DNS & NTP: Organization‑approved resolvers/time sources
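Before deployment, it can help to confirm that the required outbound paths are open from the host. The sketch below attempts plain TCP connections and reports the result. The function names are hypothetical and the hostnames in `EXAMPLE_TARGETS` are placeholders only; substitute the Git provider, object storage, and registry endpoints approved by your organization. It tests direct connections, so it does not reflect reachability through a corporate egress proxy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout.

    DNS failures, refusals, and timeouts all raise OSError subclasses
    and are reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_targets(targets, timeout=3.0):
    """Return a dict mapping 'host:port' to a reachability boolean."""
    return {
        f"{host}:{port}": can_connect(host, port, timeout)
        for host, port in targets
    }


# Placeholder destinations (assumptions, not a list from this page).
EXAMPLE_TARGETS = [
    ("github.com", 443),        # Git provider API and Git over HTTPS
    ("s3.amazonaws.com", 443),  # object storage endpoint
]
```

Calling `check_targets(EXAMPLE_TARGETS)` from the host returns one entry per destination, so a blocked firewall rule shows up as a False value for that endpoint.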


Granting PM tool & Git Access

We need access to your project management (PM) tool and your Git provider to begin the integration. Please provide the following information for each system:

  1. URL (link) to your PM tool and to your Git provider.

  2. Username for each service.

  3. Password or access token for each service.

Note: We only require read-only permissions. If you have questions about which specific permissions to grant, please contact us.
