v3 effort (#158)
* SQL Database (#157)
* point zoekt to v3 branch
* bump zoekt version
* Add tenant ID concept into web app and backend (#160)
* hacked together an example of using zoekt grpc api
* provide tenant id to zoekt git indexer
* update zoekt version to point to multitenant branch
* pipe tenant id through header to zoekt
* remove incorrect submodule reference and settings typo
* update zoekt commit
* remove unused yarn script
* remove unused grpc client in web server
* remove unneeded deps and improve tenant id log
* pass tenant id when creating repo in db
* add mt yarn script
* add nocheckin comment to tenant id in v2 schema
---------
Co-authored-by: bkellam <bshizzle1234@gmail.com>
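As a rough illustration of the tenant-ID plumbing described in the commits above (the tenant ID is piped through a header to zoekt), a search call could look something like the sketch below. The header name, endpoint path, and request shape are assumptions for illustration only, not the actual implementation.

```typescript
// Sketch only: the header name ("X-Tenant-ID"), endpoint, and payload shape are assumed.
const ZOEKT_WEBSERVER_URL = process.env.ZOEKT_WEBSERVER_URL ?? "http://localhost:6070";

export async function zoektSearch(tenantId: number, query: string): Promise<unknown> {
    const response = await fetch(`${ZOEKT_WEBSERVER_URL}/api/search`, {
        method: "POST",
        headers: {
            "Content-Type": "application/json",
            // Hypothetical header carrying the tenant ID so zoekt can scope the search.
            "X-Tenant-ID": tenantId.toString(),
        },
        body: JSON.stringify({ q: query }),
    });
    if (!response.ok) {
        throw new Error(`zoekt search failed with status ${response.status}`);
    }
    return response.json();
}
```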
* bump zoekt version
* parallelize repo indexing (#163)
* hacked together an example of using zoekt grpc api
* provide tenant id to zoekt git indexer
* update zoekt version to point to multitenant branch
* pipe tenant id through header to zoekt
* remove incorrect submodule reference and settings typo
* update zoekt commit
* remove unused yarn script
* remove unused grpc client in web server
* remove unneeded deps and improve tenant id log
* pass tenant id when creating repo in db
* add mt yarn script
* add pol of bullmq into backend
* add better error handling and concurrency setting
* spin up redis instance in dockerfile
* cleanup transaction logic when adding repos to index queue
* add NEW index status fetch condition
* move bullmq deps to backend
---------
Co-authored-by: bkellam <bshizzle1234@gmail.com>
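A minimal sketch of the BullMQ-based indexing pipeline these commits describe (Redis-backed queue, configurable worker concurrency, basic failure handling). The queue and job names, the payload shape, and `indexRepository` are illustrative assumptions.

```typescript
import { Queue, Worker } from "bullmq";

// Sketch only: names, payload shape, and indexRepository() are assumptions.
const connection = { host: "localhost", port: 6379 };
const indexQueue = new Queue<{ repoId: number }>("repo-index", { connection });

export const enqueueRepoIndex = (repoId: number) => indexQueue.add("index", { repoId });

async function indexRepository(repoId: number): Promise<void> {
    // Placeholder for the actual clone + zoekt indexing work.
    console.log(`indexing repo ${repoId}`);
}

new Worker<{ repoId: number }>(
    "repo-index",
    async (job) => indexRepository(job.data.repoId),
    { connection, concurrency: 8 }, // concurrency is the setting the commits refer to
).on("failed", (job, err) => {
    console.error(`indexing job ${job?.id ?? "?"} failed: ${err.message}`);
});
```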
* Authentication (#164)
* Add Org table (#167)
* Move logout button & profile picture into settings dropdown (#172)
* Multi tenancy support in config syncer (#171)
* [wip] initial mt support in config syncer
* Move logout button & profile picture into settings dropdown (#172)
* update sync status properly and fix bug with multiple config in db case
* make config path required in single tenant mode
NOTE: deleting configs/repos is currently not supported in the multi-tenancy case. Support for this will be added in a future PR
---------
Co-authored-by: Brendan Kellam <bshizzle1234@gmail.com>
* add tenant mode support in docker container
* Organization switching & active org management (#173)
* updated syncedAt date after config sync
* Migrate to postgres (#174)
* spin up postgres in docker container
* get initial pol of postgres db working in docker image
* spin up postgres server in dev case
* updated syncedAt date after config sync
* remove unnecessary port expose in docker file
* Connection creation form (#175)
* fix issue with yarn dev startup
* init (#176)
* Add `@sourcebot/schemas` package (#177)
* Connection management (#178)
* add concept of secrets (#180)
* add @sourcebot/schemas package
* migrate things to use the schemas package
* Dockerfile support
* add secret table to schema
* Add concept of connection manager
* Rename Config->Connection
* Handle job failures
* Add join table between repo and connection
* nits
* create first version of crypto package
* add crypto package as deps to others
* forgot to add package changes
* add server action for adding and listing secrets, create test page for it
* add secrets page to nav menu
* add secret to config and support fetching it in backend
* reset secret form on successful submission
* add toast feedback for secrets form
* add instructions for adding encryption key to dev instructions
* add encryption key support in docker file
* add delete secret button
* fix nits from pr review
---------
Co-authored-by: bkellam <bshizzle1234@gmail.com>
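A minimal sketch of how the secrets added in this group might be encrypted at rest with the `SOURCEBOT_ENCRYPTION_KEY`, using Node's built-in crypto module. The AES-256-GCM choice and the iv:tag:ciphertext layout are assumptions; the actual @sourcebot/crypto package may differ.

```typescript
import crypto from "crypto";

// Sketch only: algorithm and storage format are assumptions.
const key = Buffer.from(process.env.SOURCEBOT_ENCRYPTION_KEY ?? "", "utf-8"); // expects 32 bytes

export function encryptSecret(plaintext: string): string {
    const iv = crypto.randomBytes(12);
    const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
    const ciphertext = Buffer.concat([cipher.update(plaintext, "utf-8"), cipher.final()]);
    // Persist the IV and auth tag alongside the ciphertext so it can be decrypted later.
    return [iv, cipher.getAuthTag(), ciphertext].map((b) => b.toString("hex")).join(":");
}

export function decryptSecret(payload: string): string {
    const [iv, tag, data] = payload.split(":").map((part) => Buffer.from(part, "hex"));
    const decipher = crypto.createDecipheriv("aes-256-gcm", key, iv);
    decipher.setAuthTag(tag);
    return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf-8");
}
```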
* bump zoekt version
* enforce tenancy on search and repo listing endpoints (#181)
* enforce tenancy on search and repo listing
* remove orgId from request schemas
* adds garbage collection for repos (#182)
* refactor repo indexing logic into RepoManager
* wip cleanup stale repos
* add rest of gc logic
* set status to indexing properly
* add initial logic for staging environment
* try to move encryption key env declaration in docker file to fix build issues
* switch encryption key to a build arg to see if that fixes build issues
* add deployment action for staging image
* try using mac github action runners instead
* switch to using arm64 runners on arm64 build
* change workflow names to fix trigger issue
* trigger staging actions to see if it works
* fix working directory typo and pray it doesn't push to prod
* checkout v3 when deploying staging
* try to change into the staging dir manually
* dummy commit to trigger v3 workflows to test
* update staging deploy script to match new version in main
* reference proper image:tag in staging fly config
* update staging fly config to point to ghcr
* Connection management (#183)
* add invite system and google oauth provider (#185)
* add settings page with members list
* add invite to schema and basic create form
* add invite table
* add basic invite link copy button
* add auth invite accept case
* add non auth logic
* add google oauth provider
* fix reference to header component in connections
* add google logo to google oauth
* fix web build errors
* bump staging resources
* change staging cpu to perf
* add side bar nav in settings page
* improve styling of members page
* wip adding stripe checkout button
* wip onboarding flow
* add stripe subscription id to org
* save stripe session id and add manage subscription button in settings
* properly block access to pages if user isn't in an org
* wip add paywall
* Domain support
* Domain support (#188)
* Update Makefile to include crypto package when doing a make clean
* Add default for AUTH_URL in attempt to fix build
* attempt 2
* fix attempt #3: Do not require an encryption key at build time
* Fix generate script race condition
* Attempt #4
* add back paywall and also add support for incrementing seat count on invite redemption
* prevent self invite
* action button styling in settings and toast on copy
* add ability to remove member from org
* move stripe product id to env var
* add await for blocking loop in backend
* add subscription info to billing page
* handle trial case in billing info page
* add trial duration indicator to nav bar
* check if domain starts or ends with dash
* remove unused no org component
* Generate AUTH_SECRET if not provided (#189)
* remove package lock file and fix prisma dep version
* revert dep version updates
* fix yarn.lock
* add auth and membership check to fetchSubscription
* properly handle invite redeem with no valid subscription case
* change back fetch subscription to not require org membership
* add back subscription check in invite redeem page
* Add stripe billing logic (#190)
* add side bar nav in settings page
* improve styling of members page
* wip adding stripe checkout button
* wip onboarding flow
* add stripe subscription id to org
* save stripe session id and add manage subscription button in settings
* properly block access to pages if user isn't in an org
* wip add paywall
* Domain support
* add back paywall and also add support for incrementing seat count on invite redemption
* prevent self invite
* action button styling in settings and toast on copy
* add ability to remove member from org
* move stripe product id to env var
* add await for blocking loop in backend
* add subscription info to billing page
* handle trial case in billing info page
* add trial duration indicator to nav bar
* check if domain starts or ends with dash
* remove unused no org component
* remove package lock file and fix prisma dep version
* revert dep version updates
* fix yarn.lock
* add auth and membership check to fetchSubscription
* properly handle invite redeem with no valid subscription case
* change back fetch subscription to not require org membership
* add back subscription check in invite redeem page
---------
Co-authored-by: bkellam <bshizzle1234@gmail.com>
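A sketch of the Stripe checkout flow these commits revolve around (seat quantity, trial handling, billing pages). The price lookup via `STRIPE_PRODUCT_ID`, the 14-day trial length, and the URLs are assumptions, not the actual implementation.

```typescript
import Stripe from "stripe";

// Sketch only: trial length, URLs, and price resolution are assumptions.
const stripe = new Stripe(process.env.STRIPE_SECRET_KEY ?? "");

export async function createCheckoutSession(orgDomain: string, seatCount: number) {
    const prices = await stripe.prices.list({
        product: process.env.STRIPE_PRODUCT_ID,
        limit: 1,
    });

    return stripe.checkout.sessions.create({
        mode: "subscription",
        line_items: [{ price: prices.data[0].id, quantity: seatCount }],
        subscription_data: { trial_period_days: 14 },
        success_url: `https://example.com/${orgDomain}/settings/billing?checkout=success`,
        cancel_url: `https://example.com/${orgDomain}/onboard`,
    });
}
```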
* fix nits
* remove providers check
* fix more nits
* change stripe init to be behind function
* fix publishable stripe key handling in docker container
* enforce owner perms (#191)
* add make owner logic, and owner perms for removal, invite, and manage subscription
* add change billing email card to billing settings
* enforce owner role in action level
* remove unused hover card component
* cleanup
* add back gitlab, gitea, and gerrit support (#184)
* add non github config definitions
* refactor github config compilation to separate file
* add gitlab config compilation
* Connection management (#183)
* wip gitlab repo sync support
* fix gitlab zoekt metadata
* add gitea support
* add gerrit support
* Connection management (#183)
* add gerrit config compilation
* Connection management (#183)
---------
Co-authored-by: Brendan Kellam <bshizzle1234@gmail.com>
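The GitLab config compilation added in this group could be sketched roughly as below, resolving a connection's listed projects into repo entries via @gitbeaker/rest; the config and result shapes are assumptions, not the actual schema.

```typescript
import { Gitlab } from "@gitbeaker/rest";

// Sketch only: config shape and returned fields are assumptions.
interface GitlabConnectionConfig {
    url?: string;
    token?: string;
    projects?: string[]; // e.g. ["group/project"]
}

export async function compileGitlabConfig(config: GitlabConnectionConfig) {
    const api = new Gitlab({
        host: config.url ?? "https://gitlab.com",
        token: config.token ?? "",
    });

    const projects = await Promise.all(
        (config.projects ?? []).map((path) => api.Projects.show(path)),
    );

    return projects.map((project) => ({
        name: project.path_with_namespace,
        cloneUrl: project.http_url_to_repo,
    }));
}
```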
* fix apos usage in redeem page
* change csrf cookie to secure not host
* Credentials provider (#192)
* email password functionality
* feedback
* cleanup org's repos and shards if it's inactive (#194)
* add stripe subscription status and webhook
* add inactive org repo cleanup logic
* mark reactivated org connections for sync
* connections qol improvements (#195)
* add client side polling to connections list
* properly fetch repo image url
* add client polling to connection management page, and add ability to sync failed connections
* Fix build with suspense boundary
* improved fix
* add retries for 429 issues (#196)
* add connection compile retry and hard repo limit
* add more retry checks
* cleanup unused change
* address feedback
* fix build errors and add index concurrency env var
* add config upsert timeout env var
* Membership settings rework (#198)
* Add refined members list
* futher progress on members settings polish
* Remove old components
* feedback
* Magic links (#199)
* wip on magic link support
* Switch to nodemailer / resend for transactional mail
* Further cleanup
* Add stylized email using react-email
* fix
* Fix build
* db performance improvements and job resilience (#200)
* replace upsert with separate create many and raw update many calls
* add bulk repo status update and queue addition with priority
* add support for managed redis
* add note for changing raw sql on schema change
* remove non secret token options
* fix token examples in schema
* add better visualization for connection/repo errors and warnings (#201)
* replace upsert with separate create many and raw update many calls
* add bulk repo status update and queue addition with priority
* add support for managed redis
* add note for changing raw sql on schema change
* add error package and use BackendException in connection manager
* handle connection failure display on web app
* add warning banner for not found orgs/repos/users
* add failure handling for gerrit
* add gitea notfound warning support
* add warning icon in connections list
* style nits
* add failed repo vis in connections list
* added retry failed repo index buttons
* move nav indicators to client with polling
* fix indicator flash issue and truncate large list results
* display error nav better
* truncate failed repo list in connection list item
* fix merge error
* fix merge bug
* add connection util file [wip]
* refactor notfound fetch logic and add missing error package to dockerfile
* move repeated logic to function and add zod schema for syncStatusMetadata
* add orgid unique constraint to repo
* revert repo compile update logic to upsert loop
* log upsert stats
* [temp] disable polling everywhere (#205)
* add health check endpoint
* Refined onboarding flow (#202)
* Redeem UX pass (#204)
* add log for health check
* fix new connection complete callback route
* add cpu split logic and only wait for postgres if we're going to connect to it
* Inline secret creation (#207)
* use docker scopes to try and improve caching
* Dummy change
* remove cpu split logic
* Add some instrumentation to web
* add posthog events on various user actions (#208)
* add page view event support
* add posthog events
* nit: remove unused import
* feedback
* fix merge error
* use staging posthog papik when building staging image
* fix other merge error and build warnings
* Add invite email (#209)
* wrap posthog provider in suspense to fix build error
* add grafana alloy config and setup (#210)
* add grafana alloy config and setup
* add basic repo prom metrics
* nits in dockerfile
* remove invalid characters when auto filling domain
* add login posthog events
* remove hard coded sourcebot.app references
* make repo garbage collection async (#211)
* add gc queue logic
* fix missing switch cases for gc status
* style org create form better with new staging domain
* change repo rm logic to be async
* simplify repo for inactive org query
* add grace period for garbage collecting repos
* make prom scrape interval 500ms
* fix typo in trial card
* onboarding tweaks
* rename some prom metrics and cleanup unused
* wipe existing repo if we've picked up a killed job to ensure good state
* Connections UX pass + query optimizations (#212)
* remove git & local schemas (#213)
* skip stripe checkout for trial + fix indexing in progress UI + additional schema validation (#214)
* add additional config validation
* wip bypass stripe checkout for trial
* fix stripe trial checkout bypass
* fix indexing in progress ui on home page
* add subscription checks, more schema validation, and fix issue with complete page
* don't display if no indexed repos
* fix skipping onboard complete check
* fix build error
* add back button in onboard connection creation flow
* Add back revision support (#215)
* fix build
* Fix bug with repository snapshot
* fix share links
* fix repo rm issue, 502 page, condition on test clock
* Make login and onboarding mobile friendly
* fix ordering of quick actions
* remove error msg dump on failed repo index job, and update indexedAt field
* Add mobile unsupported splash screen
* cherry pick fix for file links
* [Cherry Pick] Syntax reference guide (#169) (#216)
* Add .env to db gitignore
* fix case where we have repos but they're all failed for repo snapshot
* /settings/secrets page (#217)
* display domain properly in org create form
* Quick action tweaks (#218)
* revamp repo page (#220)
* wip repo table
* new repo page
* add indicator for when feedback is applied in repo page
* add repo button
* fetch connection data in one query
* fix styling
* fix (#219)
* remove / keyboard shortcut hint in search bar
* prevent switching to first page on data update and truncate long repo names in repo list
* General settings + cleanup (#221)
* General settings
* Add alert to org domain change
* First attempt at sending logs to grafana
* logs wip
* add alloy logs
* wip
* [temp] comment out loki for now
* update trial card content and add events for code host selection on onboard
* reduce scraping interval to 15s
* Add prometheus metric for pending repo indexing jobs
* switch magic link to invite code (#222)
* wip magic link codes
* pipe email to email provider properly
* remove magic link data cookie after sign in
* clean up unused imports
* don't remove cookie before we use it
* rm package-lock.json
* revert yarn files to v3 state
* switch email passing from cookie to search param
* add comment for settings dropdown auth update
* remove unused middleware file
* fix build error and warnings
* fix build error with useSearchParam not wrapped in suspense
* add sentry support to backend and webapp (#223)
* add sentry to web app
* set sentry environment from env var
* add sentry env replace logic in docker container
* wip add backend sentry
* add sentry to backend
* move DSN to env var
* remove test exception
* Fix root domain issue on onboarding
* add setup sentry cli step to github action
* login to sentry
* fix sentry login in action
* Update grafana loki endpoint
* switch source map publish to runtime in entrypoint
* catch and rethrow simplegit exceptions
* alloy nits
* fix alloy
* backend logging (#224)
* revert grafana loki config
* fix login ui nits
* fix quick actions
* fix typo in secret creation
* fix private repo clone issue for gitlab
* add repo index timeout logic
* add posthog identify call after registration
* various changes to add terms and security info (#225)
* add terms and security to footer
* add security card
* add demo card
* fix build error
* nit fix: center 'get in touch' on security card
* Dark theme improvements (#226)
* (fix) Fixed bug with gitlab and gitea not including hostname in the repoName
* Switch to using t3-env for env-var management (#230)
* Add missing env var
* fix build
* Centralize to using a single .env.development for development workflows (#231)
* Make billing optional (#232)
* Massage environment variables from strings to numbers (#234)
* Single tenancy & auth modes (#233)
* Add docs to this repo
* dummy change
* Declarative connection configuration (#235)
* fix build
* upgrade to next 14.2.25
* Improved database DX
* migrate to yarn v4
* Use origin from header for baseUrl of emails (instead of AUTH_URL). Also removed reference to hide scrollbars
* Remove SOURCEBOT_ENCRYPTION_KEY from build arg
* Fix issue with linking default user to org in single tenant + no-auth mode
* Fix fallback tokens (#242)
* add SECURITY_CARD_ENABLED flag
* Add repository weburl (#243)
* Random fixes and improvements (#244)
* add zoekt max wall time env var
* remove empty warning in docs
* fix reference in sh docs
* add connection manager upsert timeout env var
* Declarative connection cleanup + improvements (#245)
* change contact us footer in app to point to main contact form
* PostHog event pass (#246)
* fix typo
* Add sourcebot cloud environment prop to staging workflow
* Update generated files
* remove AUTH_URL since it unused and (likely) unnecessary
* Revert "remove AUTH_URL since it unused and (likely) unnecessary"
This reverts commit 1f4a5aed22.
* cleanup GitHub action releases (#252)
* remove alloy, change auth default to disabled, add settings page in me dropdown
* enforce connection management perms to owner (#253)
* enforce connection management perms to owner
* fix formatting
* more formatting
* naming nits
* fix var name error
* change empty repo set copy if auth is disabled
* add CONTRIBUTING.md file
* hide settings in dropdown when auth isn't enabled
* handle case where gerrit weburl is just gitiles path
* Docs overhaul (#251)
* remove nocheckin
* fix build error
* remove v3 trigger from deploy staging
* fix build errors round 2
* another error fix
---------
Co-authored-by: msukkari <michael.sukkarieh@mail.mcgill.ca>
.dockerignore (modified)
@@ -1,11 +1,15 @@
Dockerfile
.dockerignore
node_modules
npm-debug.log
README.md
.next
!.next/static
!.next/standalone
.git
.sourcebot
.env.local
packages/web/.next
!packages/web/.next/static
!packages/web/.next/standalone
**/node_modules
**/.env.local
**/.sentryclirc
**/.env.sentry-build-plugin
.yarn
!.yarn/releases
.env.development (new file, 81 lines)
# Prisma
DATABASE_URL="postgresql://postgres:postgres@localhost:5432/postgres"

# Zoekt
ZOEKT_WEBSERVER_URL="http://localhost:6070"
# SHARD_MAX_MATCH_COUNT=10000
# TOTAL_MAX_MATCH_COUNT=100000
# The command to use for generating ctags.
CTAGS_COMMAND=ctags
# logging, strict
SRC_TENANT_ENFORCEMENT_MODE=strict

# Auth.JS
# You can generate a new secret with:
# openssl rand -base64 33
# @see: https://authjs.dev/getting-started/deployment#auth_secret
AUTH_SECRET="00000000000000000000000000000000000000000000"
AUTH_URL="http://localhost:3000"
# AUTH_CREDENTIALS_LOGIN_ENABLED=true
# AUTH_GITHUB_CLIENT_ID=""
# AUTH_GITHUB_CLIENT_SECRET=""
# AUTH_GOOGLE_CLIENT_ID=""
# AUTH_GOOGLE_CLIENT_SECRET=""

# Email
# EMAIL_FROM_ADDRESS="" # The from address for transactional emails.
# SMTP_CONNECTION_URL="" # The SMTP connection URL for transactional emails.

# PostHog
# POSTHOG_PAPIK=""
# NEXT_PUBLIC_POSTHOG_PAPIK=""

# Sentry
# SENTRY_BACKEND_DSN=""
# NEXT_PUBLIC_SENTRY_WEBAPP_DSN=""
# SENTRY_ENVIRONMENT="dev"
# NEXT_PUBLIC_SENTRY_ENVIRONMENT="dev"
# SENTRY_AUTH_TOKEN=

# Logtail
# LOGTAIL_TOKEN=""
# LOGTAIL_HOST=""

# Redis
REDIS_URL="redis://localhost:6379"

# Stripe
# STRIPE_SECRET_KEY: z.string().optional(),
# STRIPE_PRODUCT_ID: z.string().optional(),
# STRIPE_WEBHOOK_SECRET: z.string().optional(),
# STRIPE_ENABLE_TEST_CLOCKS=false

# Misc

# Generated using:
# openssl rand -base64 24
SOURCEBOT_ENCRYPTION_KEY="00000000000000000000000000000000"

SOURCEBOT_LOG_LEVEL="debug" # valid values: info, debug, warn, error
SOURCEBOT_TELEMETRY_DISABLED=true # Disables telemetry collection

# Code-host fallback tokens
# FALLBACK_GITHUB_CLOUD_TOKEN=""
# FALLBACK_GITLAB_CLOUD_TOKEN=""
# FALLBACK_GITEA_CLOUD_TOKEN=""

# Controls the number of concurrent indexing jobs that can run at once
# INDEX_CONCURRENCY_MULTIPLE=

# Controls the polling interval for the web app
# NEXT_PUBLIC_POLLING_INTERVAL_MS=

# Controls the version of the web app
# NEXT_PUBLIC_SOURCEBOT_VERSION=

# CONFIG_MAX_REPOS_NO_TOKEN=
# NODE_ENV=
# SOURCEBOT_TENANCY_MODE=single

# NEXT_PUBLIC_SOURCEBOT_CLOUD_ENVIRONMENT=
.github/workflows/_gcp-deploy.yml (vendored, new file, 87 lines)
name: GCP Deploy

on:
  workflow_call:
    inputs:
      environment:
        required: true
        description: 'The environment to deploy to'
        type: string

jobs:
  gcp-deploy:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    env:
      IMAGE_PATH: us-west1-docker.pkg.dev/${{ secrets.GCP_PROJECT_ID }}/sourcebot/sourcebot-${{ vars.NEXT_PUBLIC_SOURCEBOT_CLOUD_ENVIRONMENT }}
    steps:
      - name: 'Checkout'
        uses: 'actions/checkout@v3'
        with:
          submodules: "true"

      # @see: https://github.com/google-github-actions/auth?tab=readme-ov-file#direct-wif
      - name: 'Google auth'
        id: 'auth'
        uses: 'google-github-actions/auth@v2'
        with:
          project_id: '${{ secrets.GCP_PROJECT_ID }}'
          workload_identity_provider: '${{ secrets.GCP_WIF_PROVIDER }}'

      - name: 'Set up Cloud SDK'
        uses: 'google-github-actions/setup-gcloud@v1'
        with:
          project_id: '${{ secrets.GCP_PROJECT_ID }}'

      - name: 'Docker auth'
        run: |-
          gcloud auth configure-docker us-west1-docker.pkg.dev

      - name: Configure SSH
        run: |
          mkdir -p ~/.ssh/
          echo "${{ secrets.GCP_SSH_PRIVATE_KEY }}" > ~/.ssh/private.key
          chmod 600 ~/.ssh/private.key
          echo "${{ secrets.GCP_SSH_KNOWN_HOSTS }}" >> ~/.ssh/known_hosts

      - name: Build Docker image
        id: build
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: |
            ${{ env.IMAGE_PATH }}:${{ github.sha }}
            ${{ env.IMAGE_PATH }}:latest
          build-args: |
            NEXT_PUBLIC_SOURCEBOT_VERSION=${{ github.ref_name }}
            NEXT_PUBLIC_POSTHOG_PAPIK=${{ vars.NEXT_PUBLIC_POSTHOG_PAPIK }}
            NEXT_PUBLIC_SOURCEBOT_CLOUD_ENVIRONMENT=${{ vars.NEXT_PUBLIC_SOURCEBOT_CLOUD_ENVIRONMENT }}
            NEXT_PUBLIC_SENTRY_ENVIRONMENT=${{ vars.NEXT_PUBLIC_SENTRY_ENVIRONMENT }}
            NEXT_PUBLIC_SENTRY_WEBAPP_DSN=${{ vars.NEXT_PUBLIC_SENTRY_WEBAPP_DSN }}
            NEXT_PUBLIC_SENTRY_BACKEND_DSN=${{ vars.NEXT_PUBLIC_SENTRY_BACKEND_DSN }}
            SENTRY_SMUAT=${{ secrets.SENTRY_SMUAT }}
            SENTRY_ORG=${{ vars.SENTRY_ORG }}
            SENTRY_WEBAPP_PROJECT=${{ vars.SENTRY_WEBAPP_PROJECT }}
            SENTRY_BACKEND_PROJECT=${{ vars.SENTRY_BACKEND_PROJECT }}

      - name: Deploy to GCP
        run: |
          ssh -i ~/.ssh/private.key ${{ secrets.GCP_USERNAME }}@${{ secrets.GCP_HOST }} << 'EOF'
            # First pull the new image
            docker pull ${{ env.IMAGE_PATH }}:${{ github.sha }}

            # Stop and remove any existing container
            docker stop -t 60 sourcebot || true
            docker rm sourcebot || true

            # Run the new container
            docker run -d \
              -p 80:3000 \
              --rm \
              --env-file .env \
              -v /mnt/data:/data \
              --name sourcebot \
              ${{ env.IMAGE_PATH }}:${{ github.sha }}
          EOF
.github/workflows/deploy-prod.yml (vendored, new file, 18 lines)
name: Deploy Prod

on:
  push:
    tags: ["v*.*.*"]
  workflow_dispatch:

jobs:
  deploy-prod:
    uses: ./.github/workflows/_gcp-deploy.yml
    secrets: inherit
    permissions:
      contents: 'read'
      # Requird for OIDC auth with GCP.
      # @see: https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/about-security-hardening-with-openid-connect#adding-permissions-settings
      id-token: 'write'
    with:
      environment: prod
.github/workflows/deploy-staging.yml (vendored, new file, 19 lines)
name: Deploy Staging

on:
  push:
    branches: [main]
    tags: ["v*.*.*"]
  workflow_dispatch:

jobs:
  deploy-staging:
    uses: ./.github/workflows/_gcp-deploy.yml
    secrets: inherit
    permissions:
      contents: 'read'
      # Requird for OIDC auth with GCP.
      # @see: https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/about-security-hardening-with-openid-connect#adding-permissions-settings
      id-token: 'write'
    with:
      environment: staging
.github/workflows/fly-deploy.yml (vendored, deleted, 31 lines)
# See https://fly.io/docs/app-guides/continuous-deployment-with-github-actions/

name: Fly Deploy
on:
  # Since the `fly.toml` specifies the `latest` tag, we trigger this
  # deployment whenever a new version is published to the container registry.
  # @see: ghcr-publish.yml
  workflow_run:
    workflows: ["Publish to ghcr"]
    types:
      - completed

jobs:
  deploy:
    name: Deploy app
    runs-on: ubuntu-latest
    environment: production

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          submodules: 'true'

      - name: Use flyctl
        uses: superfly/flyctl-actions/setup-flyctl@master

      - name: Deploy to fly.io
        run: flyctl deploy --local-only
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
.github/workflows/gcp-deploy-staging.yml (vendored, deleted, 38 lines)
name: GCP Deploy (staging)

on:
  workflow_run:
    workflows: ["Publish to ghcr (staging)"]
    types:
      - completed

jobs:
  deploy:
    name: Deploy staging app to GCP
    runs-on: ubuntu-latest

    steps:
      - name: Configure SSH
        run: |
          mkdir -p ~/.ssh/
          echo "${{ secrets.GCP_STAGING_SSH_PRIVATE_KEY }}" > ~/.ssh/private.key
          chmod 600 ~/.ssh/private.key
          echo "${{ secrets.GCP_STAGING_SSH_KNOWN_HOSTS }}" >> ~/.ssh/known_hosts

      - name: Deploy to GCP
        run: |
          ssh -i ~/.ssh/private.key ${{ secrets.GCP_STAGING_USERNAME }}@${{ secrets.GCP_STAGING_HOST }} << 'EOF'
            # Stop and remove any existing container
            docker stop -t 60 sourcebot-staging || true
            docker rm sourcebot-staging || true

            # Run new container
            docker run -d \
              -p 80:3000 \
              --rm \
              --pull always \
              --env-file .env.staging \
              -v /mnt/data:/data \
              --name sourcebot-staging \
              ghcr.io/sourcebot-dev/sourcebot:staging
          EOF
.github/workflows/ghcr-publish.yml (vendored, 7 changed lines)
@@ -15,6 +15,7 @@ env:
jobs:
  build:
    runs-on: ${{ matrix.runs-on}}
    environment: oss
    permissions:
      contents: read
      packages: write
@@ -30,8 +31,6 @@
        - platform: linux/arm64
          runs-on: ubuntu-24.04-arm

    steps:
      - name: Prepare
        run: |
@@ -79,8 +78,8 @@
          platforms: ${{ matrix.platform }}
          outputs: type=image,name=${{ env.REGISTRY_IMAGE }},push-by-digest=true,name-canonical=true,push=true,annotation.org.opencontainers.image.description=Blazingly fast code search
          build-args: |
            SOURCEBOT_VERSION=${{ github.ref_name }}
            POSTHOG_PAPIK=${{ secrets.POSTHOG_PAPIK }}
            NEXT_PUBLIC_SOURCEBOT_VERSION=${{ github.ref_name }}
            NEXT_PUBLIC_POSTHOG_PAPIK=${{ vars.NEXT_PUBLIC_POSTHOG_PAPIK }}

      - name: Export digest
        run: |
.github/workflows/staging-ghcr-public.yml (vendored, deleted, 134 lines)
name: Publish to ghcr (staging)

on:
  push:
    branches: ["v3"]

env:
  REGISTRY_IMAGE: ghcr.io/sourcebot-dev/sourcebot

jobs:
  build:
    runs-on: ${{ matrix.runs-on}}
    permissions:
      contents: read
      packages: write
      id-token: write
    strategy:
      matrix:
        platform: [linux/amd64, linux/arm64]
        include:
          - platform: linux/amd64
            runs-on: ubuntu-latest
          - platform: linux/arm64
            runs-on: ubuntu-24.04-arm

    steps:
      - name: Prepare
        run: |
          platform=${{ matrix.platform }}
          echo "PLATFORM_PAIR=${platform//\//-}" >> $GITHUB_ENV

      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          submodules: "true"

      - name: Extract Docker metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY_IMAGE }}
          tags: staging

      - name: Install cosign
        uses: sigstore/cosign-installer@v3.5.0
        with:
          cosign-release: "v2.2.4"

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to GitHub Packages Docker Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build Docker image
        id: build
        uses: docker/build-push-action@v6
        with:
          context: .
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          platforms: ${{ matrix.platform }}
          outputs: type=image,name=${{ env.REGISTRY_IMAGE }},push-by-digest=true,name-canonical=true,push=true
          build-args: |
            SOURCEBOT_VERSION=${{ github.ref_name }}
            POSTHOG_PAPIK=${{ secrets.POSTHOG_PAPIK }}
            SOURCEBOT_ENCRYPTION_KEY=${{ secrets.STAGING_SOURCEBOT_ENCRYPTION_KEY }}

      - name: Export digest
        run: |
          mkdir -p /tmp/digests
          digest="${{ steps.build.outputs.digest }}"
          touch "/tmp/digests/${digest#sha256:}"

      - name: Upload digest
        uses: actions/upload-artifact@v4
        with:
          name: digests-${{ env.PLATFORM_PAIR }}
          path: /tmp/digests/*
          if-no-files-found: error
          retention-days: 1

      - name: Sign the published Docker image
        env:
          TAGS: ${{ steps.meta.outputs.tags }}
          DIGEST: ${{ steps.build.outputs.digest }}
        run: echo "${TAGS}" | xargs -I {} cosign sign --yes {}@${DIGEST}

  merge:
    runs-on: ubuntu-latest
    permissions:
      packages: write
    needs:
      - build
    steps:
      - name: Download digests
        uses: actions/download-artifact@v4
        with:
          path: /tmp/digests
          pattern: digests-*
          merge-multiple: true

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Extract Docker metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY_IMAGE }}
          tags: staging

      - name: Login to GitHub Packages Docker Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Create manifest list and push
        working-directory: /tmp/digests
        run: |
          docker buildx imagetools create $(jq -cr '.tags | map("-t " + .) | join(" ")' <<< "$DOCKER_METADATA_OUTPUT_JSON") \
            $(printf '${{ env.REGISTRY_IMAGE }}@sha256:%s ' *)

      - name: Inspect image
        run: |
          docker buildx imagetools inspect ${{ env.REGISTRY_IMAGE }}:${{ steps.meta.outputs.version }}
.gitignore (vendored, 4 changed lines)
@@ -153,10 +153,10 @@ dist

# if you are NOT using Zero-installs, then:
# comment the following lines
!.yarn/cache
# !.yarn/cache

# and uncomment the following lines
# .pnp.*
.pnp.*

# End of https://www.toptal.com/developers/gitignore/api/yarn,node
.gitmodules (vendored, 2 changed lines)
@@ -1,3 +1,5 @@
[submodule "vendor/zoekt"]
	path = vendor/zoekt
	url = https://github.com/sourcebot-dev/zoekt
	# @todo : update this to main when we have a release
	branch=v3
.vscode/extensions.json (vendored, 3 changed lines)
@@ -1,6 +1,7 @@
{
    "recommendations": [
        "dbaeumer.vscode-eslint",
        "bradlc.vscode-tailwindcss"
        "bradlc.vscode-tailwindcss",
        "prisma.prisma"
    ]
}

.yarn/releases/yarn-4.7.0.cjs (vendored, executable, new file, 935 lines)

.yarnrc.yml (new file, 3 lines)
enableGlobalCache: false
nodeLinker: node-modules
yarnPath: .yarn/releases/yarn-4.7.0.cjs
CONTRIBUTING.md (new file, 40 lines)
## Build from source
>[!NOTE]
> Building from source is only required if you'd like to contribute. The recommended way to use Sourcebot is to use the [pre-built docker image](https://github.com/sourcebot-dev/sourcebot/pkgs/container/sourcebot).

1. Install <a href="https://go.dev/doc/install"><img src="https://go.dev/favicon.ico" width="16" height="16"> go</a>, <a href="https://nodejs.org/"><img src="https://nodejs.org/favicon.ico" width="16" height="16"> NodeJS</a>, [redis](https://redis.io/), and [postgres](https://www.postgresql.org/). Note that a NodeJS version of at least `21.1.0` is required.

2. Install [ctags](https://github.com/universal-ctags/ctags) (required by zoekt)
    ```sh
    // macOS:
    brew install universal-ctags

    // Linux:
    snap install universal-ctags
    ```

3. Clone the repository with submodules:
    ```sh
    git clone --recurse-submodules https://github.com/sourcebot-dev/sourcebot.git
    ```

4. Run `make` to build zoekt and install dependencies:
    ```sh
    cd sourcebot
    make
    ```
    The zoekt binaries and web dependencies are placed into `bin` and `node_modules` respectively.

5. Create a copy of `.env.development` and name it `.env.development.local`. Update the required environment variables.

6. If you're using a declarative configuration file (the default behavior if you didn't enable auth), create a configuration file and update the `CONFIG_PATH` environment variable in your `.env.development.local` file.

7. Start Sourcebot with the command:
    ```sh
    yarn dev
    ```
    A `.sourcebot` directory will be created and zoekt will begin to index the repositories found in the `config.json` file.

8. Start searching at `http://localhost:3000`.
Dockerfile (211 changed lines)
@@ -1,5 +1,26 @@
# ------ Global scope variables ------

# Set of global build arguments.
# These are considered "public" and will be baked into the image.
# The convention is to prefix these with `NEXT_PUBLIC_` so that
# they can be optionally be passed as client-side environment variables
# in the webapp.
# @see: https://docs.docker.com/build/building/variables/#scoping

ARG NEXT_PUBLIC_SOURCEBOT_VERSION
# PAPIK = Project API Key
# Note that this key does not need to be kept secret, so it's not
# necessary to use Docker build secrets here.
# @see: https://posthog.com/tutorials/api-capture-events#authenticating-with-the-project-api-key
ARG NEXT_PUBLIC_POSTHOG_PAPIK
ARG NEXT_PUBLIC_SENTRY_ENVIRONMENT
ARG NEXT_PUBLIC_SOURCEBOT_CLOUD_ENVIRONMENT
ARG NEXT_PUBLIC_SENTRY_WEBAPP_DSN
ARG NEXT_PUBLIC_SENTRY_BACKEND_DSN

FROM node:20-alpine3.19 AS node-alpine
FROM golang:1.23.4-alpine3.19 AS go-alpine
# ----------------------------------

# ------ Build Zoekt ------
FROM go-alpine AS zoekt-builder
@@ -9,102 +30,193 @@ COPY vendor/zoekt/go.mod vendor/zoekt/go.sum ./
RUN go mod download
COPY vendor/zoekt ./
RUN CGO_ENABLED=0 GOOS=linux go build -o /cmd/ ./cmd/...
# -------------------------

# ------ Build shared libraries ------
FROM node-alpine AS shared-libs-builder
WORKDIR /app

COPY package.json yarn.lock* .yarnrc.yml ./
COPY .yarn ./.yarn
COPY ./packages/db ./packages/db
COPY ./packages/schemas ./packages/schemas
COPY ./packages/crypto ./packages/crypto
COPY ./packages/error ./packages/error

RUN yarn workspace @sourcebot/db install
RUN yarn workspace @sourcebot/schemas install
RUN yarn workspace @sourcebot/crypto install
RUN yarn workspace @sourcebot/error install
# ------------------------------------

# ------ Build Web ------
FROM node-alpine AS web-builder
ENV SKIP_ENV_VALIDATION=1
# -----------
ARG NEXT_PUBLIC_SOURCEBOT_VERSION
ENV NEXT_PUBLIC_SOURCEBOT_VERSION=$NEXT_PUBLIC_SOURCEBOT_VERSION
ARG NEXT_PUBLIC_POSTHOG_PAPIK
ENV NEXT_PUBLIC_POSTHOG_PAPIK=$NEXT_PUBLIC_POSTHOG_PAPIK
ARG NEXT_PUBLIC_SENTRY_ENVIRONMENT
ENV NEXT_PUBLIC_SENTRY_ENVIRONMENT=$NEXT_PUBLIC_SENTRY_ENVIRONMENT
ARG NEXT_PUBLIC_SOURCEBOT_CLOUD_ENVIRONMENT
ENV NEXT_PUBLIC_SOURCEBOT_CLOUD_ENVIRONMENT=$NEXT_PUBLIC_SOURCEBOT_CLOUD_ENVIRONMENT
ARG NEXT_PUBLIC_SENTRY_WEBAPP_DSN
ENV NEXT_PUBLIC_SENTRY_WEBAPP_DSN=$NEXT_PUBLIC_SENTRY_WEBAPP_DSN

# To upload source maps to Sentry, we need to set the following build-time args.
# It's important that we don't set these for oss builds, otherwise the Sentry
# auth token will be exposed.
# @see : next.config.mjs
ARG SENTRY_ORG
ENV SENTRY_ORG=$SENTRY_ORG
ARG SENTRY_WEBAPP_PROJECT
ENV SENTRY_WEBAPP_PROJECT=$SENTRY_WEBAPP_PROJECT
# SMUAT = Source Map Upload Auth Token
ARG SENTRY_SMUAT
ENV SENTRY_SMUAT=$SENTRY_SMUAT
# -----------

RUN apk add --no-cache libc6-compat
WORKDIR /app

COPY package.json yarn.lock* ./
COPY package.json yarn.lock* .yarnrc.yml ./
COPY .yarn ./.yarn
COPY ./packages/web ./packages/web
COPY --from=shared-libs-builder /app/node_modules ./node_modules
COPY --from=shared-libs-builder /app/packages/db ./packages/db
COPY --from=shared-libs-builder /app/packages/schemas ./packages/schemas
COPY --from=shared-libs-builder /app/packages/crypto ./packages/crypto
COPY --from=shared-libs-builder /app/packages/error ./packages/error

# Fixes arm64 timeouts
RUN yarn config set registry https://registry.npmjs.org/
RUN yarn config set network-timeout 1200000
RUN yarn workspace @sourcebot/web install --frozen-lockfile
RUN yarn workspace @sourcebot/web install

ENV NEXT_TELEMETRY_DISABLED=1
# @see: https://phase.dev/blog/nextjs-public-runtime-variables/
ARG NEXT_PUBLIC_SOURCEBOT_TELEMETRY_DISABLED=BAKED_NEXT_PUBLIC_SOURCEBOT_TELEMETRY_DISABLED
ARG NEXT_PUBLIC_SOURCEBOT_VERSION=BAKED_NEXT_PUBLIC_SOURCEBOT_VERSION
ENV NEXT_PUBLIC_PUBLIC_SEARCH_DEMO=BAKED_NEXT_PUBLIC_PUBLIC_SEARCH_DEMO
ENV NEXT_PUBLIC_POSTHOG_PAPIK=BAKED_NEXT_PUBLIC_POSTHOG_PAPIK
# @note: leading "/" is required for the basePath property. @see: https://nextjs.org/docs/app/api-reference/next-config-js/basePath
ARG NEXT_PUBLIC_DOMAIN_SUB_PATH=/BAKED_NEXT_PUBLIC_DOMAIN_SUB_PATH
RUN yarn workspace @sourcebot/web build
ENV SKIP_ENV_VALIDATION=0
# ------------------------------

# ------ Build Backend ------
FROM node-alpine AS backend-builder
ENV SKIP_ENV_VALIDATION=1
# -----------
ARG NEXT_PUBLIC_SOURCEBOT_VERSION
ENV NEXT_PUBLIC_SOURCEBOT_VERSION=$NEXT_PUBLIC_SOURCEBOT_VERSION

# To upload source maps to Sentry, we need to set the following build-time args.
# It's important that we don't set these for oss builds, otherwise the Sentry
# auth token will be exposed.
ARG SENTRY_ORG
ENV SENTRY_ORG=$SENTRY_ORG
ARG SENTRY_BACKEND_PROJECT
ENV SENTRY_BACKEND_PROJECT=$SENTRY_BACKEND_PROJECT
# SMUAT = Source Map Upload Auth Token
ARG SENTRY_SMUAT
ENV SENTRY_SMUAT=$SENTRY_SMUAT
# -----------

WORKDIR /app

COPY package.json yarn.lock* ./
COPY package.json yarn.lock* .yarnrc.yml ./
COPY .yarn ./.yarn
COPY ./schemas ./schemas
COPY ./packages/backend ./packages/backend
RUN yarn workspace @sourcebot/backend install --frozen-lockfile
COPY --from=shared-libs-builder /app/node_modules ./node_modules
COPY --from=shared-libs-builder /app/packages/db ./packages/db
COPY --from=shared-libs-builder /app/packages/schemas ./packages/schemas
COPY --from=shared-libs-builder /app/packages/crypto ./packages/crypto
COPY --from=shared-libs-builder /app/packages/error ./packages/error
RUN yarn workspace @sourcebot/backend install
RUN yarn workspace @sourcebot/backend build

# Upload source maps to Sentry if we have the necessary build-time args.
RUN if [ -n "$SENTRY_SMUAT" ] && [ -n "$SENTRY_ORG" ] && [ -n "$SENTRY_BACKEND_PROJECT" ] && [ -n "$NEXT_PUBLIC_SOURCEBOT_VERSION" ]; then \
    apk add --no-cache curl; \
    curl -sL https://sentry.io/get-cli/ | sh; \
    sentry-cli login --auth-token $SENTRY_SMUAT; \
    sentry-cli sourcemaps inject --org $SENTRY_ORG --project $SENTRY_BACKEND_PROJECT --release $NEXT_PUBLIC_SOURCEBOT_VERSION ./packages/backend/dist; \
    sentry-cli sourcemaps upload --org $SENTRY_ORG --project $SENTRY_BACKEND_PROJECT --release $NEXT_PUBLIC_SOURCEBOT_VERSION ./packages/backend/dist; \
    fi

ENV SKIP_ENV_VALIDATION=0
# ------------------------------

# ------ Runner ------
FROM node-alpine AS runner
# -----------
ARG NEXT_PUBLIC_SOURCEBOT_VERSION
ENV NEXT_PUBLIC_SOURCEBOT_VERSION=$NEXT_PUBLIC_SOURCEBOT_VERSION
ARG NEXT_PUBLIC_POSTHOG_PAPIK
ENV NEXT_PUBLIC_POSTHOG_PAPIK=$NEXT_PUBLIC_POSTHOG_PAPIK
ARG NEXT_PUBLIC_SENTRY_ENVIRONMENT
ENV NEXT_PUBLIC_SENTRY_ENVIRONMENT=$NEXT_PUBLIC_SENTRY_ENVIRONMENT
ARG NEXT_PUBLIC_SENTRY_WEBAPP_DSN
ENV NEXT_PUBLIC_SENTRY_WEBAPP_DSN=$NEXT_PUBLIC_SENTRY_WEBAPP_DSN
ARG NEXT_PUBLIC_SENTRY_BACKEND_DSN
ENV NEXT_PUBLIC_SENTRY_BACKEND_DSN=$NEXT_PUBLIC_SENTRY_BACKEND_DSN
# -----------

RUN echo "Sourcebot Version: $NEXT_PUBLIC_SOURCEBOT_VERSION"

WORKDIR /app
ENV NODE_ENV=production
ENV NEXT_TELEMETRY_DISABLED=1
ENV DATA_DIR=/data
ENV CONFIG_PATH=$DATA_DIR/config.json
ENV DATA_CACHE_DIR=$DATA_DIR/.sourcebot

ARG SOURCEBOT_VERSION=unknown
ENV SOURCEBOT_VERSION=$SOURCEBOT_VERSION
RUN echo "Sourcebot Version: $SOURCEBOT_VERSION"

ARG PUBLIC_SEARCH_DEMO=false
ENV PUBLIC_SEARCH_DEMO=$PUBLIC_SEARCH_DEMO
RUN echo "Public Search Demo: $PUBLIC_SEARCH_DEMO"
ENV DATABASE_DATA_DIR=$DATA_CACHE_DIR/db
ENV REDIS_DATA_DIR=$DATA_CACHE_DIR/redis
ENV DATABASE_URL="postgresql://postgres@localhost:5432/sourcebot"
ENV REDIS_URL="redis://localhost:6379"
ENV SRC_TENANT_ENFORCEMENT_MODE=strict

# Valid values are: debug, info, warn, error
ENV SOURCEBOT_LOG_LEVEL=info

# Configures the sub-path of the domain to serve Sourcebot from.
# For example, if DOMAIN_SUB_PATH is set to "/sb", Sourcebot
# will serve from http(s)://example.com/sb
ENV DOMAIN_SUB_PATH=/

# PAPIK = Project API Key
# Note that this key does not need to be kept secret, so it's not
# necessary to use Docker build secrets here.
# @see: https://posthog.com/tutorials/api-capture-events#authenticating-with-the-project-api-key
ARG POSTHOG_PAPIK=
ENV POSTHOG_PAPIK=$POSTHOG_PAPIK

# Sourcebot collects anonymous usage data using [PostHog](https://posthog.com/). Uncomment this line to disable.
# ENV SOURCEBOT_TELEMETRY_DISABLED=1

# Configure dependencies
RUN apk add --no-cache git ca-certificates bind-tools tini jansson wget supervisor uuidgen curl perl jq
COPY package.json yarn.lock* .yarnrc.yml ./
COPY .yarn ./.yarn

# Configure zoekt
COPY vendor/zoekt/install-ctags-alpine.sh .
RUN ./install-ctags-alpine.sh && rm install-ctags-alpine.sh
RUN mkdir -p ${DATA_CACHE_DIR}
COPY --from=zoekt-builder \
    /cmd/zoekt-git-index \
    /cmd/zoekt-indexserver \
    /cmd/zoekt-mirror-github \
    /cmd/zoekt-mirror-gitiles \
    /cmd/zoekt-mirror-bitbucket-server \
    /cmd/zoekt-mirror-gitlab \
    /cmd/zoekt-mirror-gerrit \
    /cmd/zoekt-webserver \
    /cmd/zoekt-index \
    /usr/local/bin/
    /cmd/zoekt-git-index \
    /cmd/zoekt-indexserver \
    /cmd/zoekt-mirror-github \
    /cmd/zoekt-mirror-gitiles \
    /cmd/zoekt-mirror-bitbucket-server \
    /cmd/zoekt-mirror-gitlab \
    /cmd/zoekt-mirror-gerrit \
    /cmd/zoekt-webserver \
    /cmd/zoekt-index \
    /usr/local/bin/

# Configure the webapp
# Copy all of the things
COPY --from=web-builder /app/packages/web/public ./packages/web/public
COPY --from=web-builder /app/packages/web/.next/standalone ./
COPY --from=web-builder /app/packages/web/.next/static ./packages/web/.next/static

# Configure the backend
COPY --from=backend-builder /app/node_modules ./node_modules
COPY --from=backend-builder /app/packages/backend ./packages/backend

COPY --from=shared-libs-builder /app/node_modules ./node_modules
COPY --from=shared-libs-builder /app/packages/db ./packages/db
COPY --from=shared-libs-builder /app/packages/schemas ./packages/schemas
COPY --from=shared-libs-builder /app/packages/crypto ./packages/crypto
COPY --from=shared-libs-builder /app/packages/error ./packages/error

# Configure dependencies
RUN apk add --no-cache git ca-certificates bind-tools tini jansson wget supervisor uuidgen curl perl jq redis postgresql postgresql-contrib openssl util-linux unzip

# Configure the database
RUN mkdir -p /run/postgresql && \
    chown -R postgres:postgres /run/postgresql && \
    chmod 775 /run/postgresql

COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
COPY prefix-output.sh ./prefix-output.sh
RUN chmod +x ./prefix-output.sh
@@ -116,4 +228,5 @@ COPY default-config.json .
EXPOSE 3000
ENV PORT=3000
ENV HOSTNAME="0.0.0.0"
ENTRYPOINT ["/sbin/tini", "--", "./entrypoint.sh"]
ENTRYPOINT ["/sbin/tini", "--", "./entrypoint.sh"]
# ------------------------------
Makefile (17 changed lines)
@@ -1,9 +1,9 @@
CMDS := zoekt ui
CMDS := zoekt yarn

ALL: $(CMDS)

ui:
yarn:
	yarn install

zoekt:
@@ -20,6 +20,19 @@ clean:
	packages/web/.next \
	packages/backend/dist \
	packages/backend/node_modules \
	packages/db/node_modules \
	packages/db/dist \
	packages/schemas/node_modules \
	packages/schemas/dist \
	packages/crypto/node_modules \
	packages/crypto/dist \
	packages/error/node_modules \
	packages/error/dist \
	.sourcebot

soft-reset:
	rm -rf .sourcebot
	redis-cli FLUSHALL

.PHONY: bin
426
README.md
|
|
@ -5,12 +5,37 @@
|
|||
<img height="150" src=".github/images/logo_light.png">
|
||||
</picture>
|
||||
</div>
|
||||
<div align="center">
|
||||
<div>
|
||||
<h3>
|
||||
<a href="https://app.sourcebot.dev">
|
||||
<strong>Sourcebot Cloud</strong>
|
||||
</a> ·
|
||||
<a href="https://docs.sourcebot.dev/self-hosting/overview">
|
||||
<strong>Self Host</strong>
|
||||
</a> ·
|
||||
<a href="https://sourcebot.dev/search">
|
||||
<strong>Demo</strong>
|
||||
</a>
|
||||
</h3>
|
||||
</div>
|
||||
|
||||
<div>
|
||||
<a href="https://docs.sourcebot.dev/"><strong>Docs</strong></a> ·
|
||||
<a href="https://github.com/sourcebot-dev/sourcebot/issues"><strong>Report Bug</strong></a> ·
|
||||
<a href="https://github.com/sourcebot-dev/sourcebot/discussions/categories/ideas"><strong>Feature Request</strong></a> ·
|
||||
<a href="https://www.sourcebot.dev/changelog"><strong>Changelog</strong></a> ·
|
||||
<a href="https://www.sourcebot.dev/contact"><strong>Contact</strong></a> ·
|
||||
</div>
|
||||
<br/>
|
||||
<span>Sourcebot uses <a href="https://github.com/sourcebot-dev/sourcebot/discussions"><strong>Github Discussions</strong></a> for Support and Feature Requests.</span>
|
||||
<br/>
|
||||
<br/>
|
||||
<div>
|
||||
</div>
|
||||
</div>
|
||||
<p align="center">
|
||||
Blazingly fast code search 🏎️
|
||||
</p>
|
||||
<p align="center">
|
||||
<a href="https://sourcebot.dev/search"><img src="https://img.shields.io/badge/Try the Demo!-blue?logo=googlechrome&logoColor=orange"/></a>
|
||||
<a href="mailto:brendan@sourcebot.dev"><img src="https://img.shields.io/badge/Email%20Us-brightgreen" /></a>
|
||||
<a href="mailto:team@sourcebot.dev"><img src="https://img.shields.io/badge/Email%20Us-brightgreen" /></a>
|
||||
<a href="https://github.com/sourcebot-dev/sourcebot/blob/main/LICENSE"><img src="https://img.shields.io/github/license/sourcebot-dev/sourcebot"/></a>
|
||||
<a href="https://github.com/sourcebot-dev/sourcebot/actions/workflows/ghcr-publish.yml"><img src="https://img.shields.io/github/actions/workflow/status/sourcebot-dev/sourcebot/ghcr-publish.yml"/><a>
|
||||
<a href="https://github.com/sourcebot-dev/sourcebot/stargazers"><img src="https://img.shields.io/github/stars/sourcebot-dev/sourcebot" /></a>
|
||||
|
|
@ -23,384 +48,69 @@ Blazingly fast code search 🏎️
|
|||
|
||||
# About
|
||||
|
||||
Sourcebot is a fast code indexing and search tool for your codebases. It is built ontop of the [zoekt](https://github.com/sourcegraph/zoekt) indexer, originally authored by Han-Wen Nienhuys and now [maintained by Sourcegraph](https://sourcegraph.com/blog/sourcegraph-accepting-zoekt-maintainership).
|
||||
Sourcebot is the open source Sourcegraph alternative. Index all your repos and branches across multiple code hosts (GitHub, GitLab, Gitea, or Gerrit) and search through them using a blazingly fast interface.
|
||||
|
||||
https://github.com/user-attachments/assets/98d46192-5469-430f-ad9e-5c042adbb10d
|
||||
|
||||
|
||||
## Features
|
||||
- 💻 **One-command deployment**: Get started instantly using Docker on your own machine.
|
||||
- 🔍 **Multi-repo search**: Effortlessly index and search through multiple public and private repositories in GitHub, GitLab, Gitea, or Gerrit.
|
||||
- 🔍 **Multi-repo search**: Index and search through multiple public and private repositories and branches on GitHub, GitLab, Gitea, or Gerrit.
|
||||
- ⚡**Lightning fast performance**: Built on top of the powerful [Zoekt](https://github.com/sourcegraph/zoekt) search engine.
|
||||
- 📂 **Full file visualization**: Instantly view the entire file when selecting any search result.
|
||||
- 🎨 **Modern web app**: Enjoy a sleek interface with features like syntax highlighting, light/dark mode, and vim-style navigation
|
||||
- 📂 **Full file visualization**: Instantly view the entire file when selecting any search result.
|
||||
|
||||
You can try out our public hosted demo [here](https://sourcebot.dev/search)!
|
||||
|
||||
# Getting Started
|
||||
# Deply Sourcebot
|
||||
|
||||
Get started with a single docker command:
|
||||
Sourcebot can be deployed in seconds using our official docker image. Visit our [docs](https://docs.sourcebot.dev/self-hosting/overview) for more information.
|
||||
|
||||
```
|
||||
docker run -p 3000:3000 --rm --name sourcebot ghcr.io/sourcebot-dev/sourcebot:latest
|
||||
1. Create a config
|
||||
```json
|
||||
touch config.json
|
||||
echo '{
|
||||
"$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v3/index.json",
|
||||
"connections": {
|
||||
// Comments are supported
|
||||
"starter-connection": {
|
||||
"type": "github",
|
||||
"repos": [
|
||||
"sourcebot-dev/sourcebot"
|
||||
]
|
||||
}
|
||||
}
|
||||
}' > config.jsono
|
||||
```
|
||||
|
||||
Navigate to `localhost:3000` to start searching the Sourcebot repo. Want to search your own repos? Checkout how to [configure Sourcebot](#configuring-sourcebot).
|
||||
|
||||
2. Run the docker container
|
||||
```sh
|
||||
docker run -p 3000:3000 --pull=always --rm -v $(pwd):/data -e CONFIG_PATH=/data/config.json --name sourcebot ghcr.io/sourcebot-dev/sourcebot:latest
|
||||
```
|
||||
<details>
|
||||
<summary>What does this command do?</summary>
|
||||
|
||||
- Pull and run the Sourcebot docker image from [ghcr.io/sourcebot-dev/sourcebot:latest](https://github.com/sourcebot-dev/sourcebot/pkgs/container/sourcebot). Make sure you have [docker installed](https://docs.docker.com/get-started/get-docker/).
|
||||
- Read the repos listed in [default config](./default-config.json) and start indexing them.
|
||||
- Pull and run the Sourcebot docker image from [ghcr.io/sourcebot-dev/sourcebot:latest](https://github.com/sourcebot-dev/sourcebot/pkgs/container/sourcebot).
|
||||
- Mount the current directory (`-v $(pwd):/data`) to allow Sourcebot to persist the `.sourcebot` cache.
|
||||
- Clones sourcebot at `HEAD` into `.sourcebot/github/sourcebot-dev/sourcebot`.
|
||||
- Indexes sourcebot into a .zoekt index file in `.sourcebot/index/`.
|
||||
- Map port 3000 between your machine and the docker image.
|
||||
- Starts the web server on port 3000.
|
||||
</details>
|
||||
</br>
|
||||
3. Start searching at `http://localhost:3000`
|
||||
|
||||
## Configuring Sourcebot

</br>

Sourcebot supports indexing and searching through public and private repositories hosted on
<picture>
  <source media="(prefers-color-scheme: dark)" srcset=".github/images/github-favicon-inverted.png">
  <img src="https://github.com/favicon.ico" width="16" height="16" alt="GitHub icon">
</picture> GitHub, <img src="https://gitlab.com/favicon.ico" width="16" height="16" /> GitLab, <img src="https://gitea.com/favicon.ico" width="16" height="16"> Gitea, and <img src="https://gerrit-review.googlesource.com/favicon.ico" width="16" height="16"> Gerrit. This section will guide you through configuring the repositories that Sourcebot indexes. For more details on configuring Sourcebot to index your own repos, please refer to our [docs](https://docs.sourcebot.dev/self-hosting/overview).

1. Create a new folder on your machine that stores your configs and `.sourcebot` cache, and navigate into it:

```sh
mkdir sourcebot_workspace
cd sourcebot_workspace
```

> [!NOTE]
> Sourcebot collects [anonymous usage data](https://sourcebot.dev/search/search?query=captureEvent%5C(%20repo%3Asourcebot) by default to help us improve the product. No sensitive data is collected, but if you'd like to disable this you can do so by setting the `SOURCEBOT_TELEMETRY_DISABLED` environment
> variable to `1`. Please refer to our [telemetry docs](https://docs.sourcebot.dev/self-hosting/overview#telemetry) for more information.

2. Create a new config following the [configuration schema](./schemas/v2/index.json) to specify which repositories Sourcebot should index. For example, let's index llama.cpp:

```sh
touch my_config.json
echo '{
  "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v2/index.json",
  "repos": [
    {
      "type": "github",
      "repos": [
        "ggerganov/llama.cpp"
      ]
    }
  ]
}' > my_config.json
```

>[!NOTE]
> Sourcebot can also index all repos owned by an organization, user, group, etc., instead of listing them individually (see the sketch below). For examples, see the [configs](./configs) directory. For additional usage information, see the [configuration schema](./schemas/v2/index.json).

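For illustration, an org-wide config might look like the sketch below. The `orgs` and `users` field names are assumptions here (they mirror the connection docs); treat the [configuration schema](./schemas/v2/index.json) as the source of truth.

```jsonc
{
  "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v2/index.json",
  "repos": [
    {
      "type": "github",
      // Index every repo owned by this organization...
      "orgs": [
        "my-org"
      ],
      // ...and every repo owned by this user.
      "users": [
        "my-username"
      ]
    }
  ]
}
```
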
3. Run Sourcebot and point it to the new config you created with the `-e CONFIG_PATH` flag:

```sh
docker run -p 3000:3000 --rm --name sourcebot -v $(pwd):/data -e CONFIG_PATH=/data/my_config.json ghcr.io/sourcebot-dev/sourcebot:latest
```

<details>
<summary>What does this command do?</summary>

- Pulls and runs the Sourcebot docker image from [ghcr.io/sourcebot-dev/sourcebot:latest](https://github.com/sourcebot-dev/sourcebot/pkgs/container/sourcebot).
- Mounts the current directory (`-v $(pwd):/data`) to allow Sourcebot to persist the `.sourcebot` cache.
- Mirrors (clones) llama.cpp at `HEAD` into `.sourcebot/github/ggerganov/llama.cpp`.
- Indexes llama.cpp into a `.zoekt` index file in `.sourcebot/index/`.
- Maps port 3000 between your machine and the docker container.
- Starts the web server on port 3000.
</details>
<br>

You should see a `.sourcebot` folder in your current directory. This folder stores a cache of the repositories zoekt has indexed. The `HEAD` commit of each repository is re-indexed [every hour](./packages/backend/src/constants.ts). Indexing private repos? See [Providing an access token](#providing-an-access-token).

</br>

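For orientation, the cache layout looks roughly like this (a sketch based on the paths mentioned above; exact contents may vary between versions):

```
.sourcebot/
├── github/
│   └── ggerganov/
│       └── llama.cpp/   # mirrored clone of the repo
└── index/
    └── ...              # .zoekt index shards
```
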
## Providing an access token

This will depend on the code hosting platform you're using:

<div>
<details>
<summary>
<picture>
  <source media="(prefers-color-scheme: dark)" srcset=".github/images/github-favicon-inverted.png">
  <img src="https://github.com/favicon.ico" width="16" height="16" alt="GitHub icon">
</picture> GitHub
</summary>

In order to index private repositories, you'll need to generate a GitHub Personal Access Token (PAT). Create a new PAT [here](https://github.com/settings/tokens/new) and make sure you select the `repo` scope:

Next, update your configuration with the `token` field:
```json
{
  "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v2/index.json",
  "repos": [
    {
      "type": "github",
      "token": "ghp_mytoken",
      ...
    }
  ]
}
```

You can also pass tokens as environment variables:
```json
{
  "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v2/index.json",
  "repos": [
    {
      "type": "github",
      "token": {
        // note: this env var can be named anything. It
        // doesn't need to be `GITHUB_TOKEN`.
        "env": "GITHUB_TOKEN"
      },
      ...
    }
  ]
}
```

You'll need to pass this environment variable each time you run Sourcebot:

<pre>
docker run -e <b>GITHUB_TOKEN=ghp_mytoken</b> /* additional args */ ghcr.io/sourcebot-dev/sourcebot:latest
</pre>
</details>

<details>
<summary><img src="https://gitlab.com/favicon.ico" width="16" height="16" /> GitLab</summary>

Generate a GitLab Personal Access Token (PAT) [here](https://gitlab.com/-/user_settings/personal_access_tokens) and make sure you select the `read_api` scope:

Next, update your configuration with the `token` field:
```json
{
  "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v2/index.json",
  "repos": [
    {
      "type": "gitlab",
      "token": "glpat-mytoken",
      ...
    }
  ]
}
```

You can also pass tokens as environment variables:
```json
{
  "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v2/index.json",
  "repos": [
    {
      "type": "gitlab",
      "token": {
        // note: this env var can be named anything. It
        // doesn't need to be `GITLAB_TOKEN`.
        "env": "GITLAB_TOKEN"
      },
      ...
    }
  ]
}
```

You'll need to pass this environment variable each time you run Sourcebot:

<pre>
docker run -e <b>GITLAB_TOKEN=glpat-mytoken</b> /* additional args */ ghcr.io/sourcebot-dev/sourcebot:latest
</pre>

</details>

<details>
<summary><img src="https://gitea.com/favicon.ico" width="16" height="16"> Gitea</summary>

Generate a Gitea access token [here](http://gitea.com/user/settings/applications). At minimum, you'll need to select the `read:repository` scope, but `read:user` and `read:organization` are required for the `user` and `org` fields of your config file:

Next, update your configuration with the `token` field:
```json
{
  "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v2/index.json",
  "repos": [
    {
      "type": "gitea",
      "token": "my-secret-token",
      ...
    }
  ]
}
```

You can also pass tokens as environment variables:
```json
{
  "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v2/index.json",
  "repos": [
    {
      "type": "gitea",
      "token": {
        // note: this env var can be named anything. It
        // doesn't need to be `GITEA_TOKEN`.
        "env": "GITEA_TOKEN"
      },
      ...
    }
  ]
}
```

You'll need to pass this environment variable each time you run Sourcebot:

<pre>
docker run -e <b>GITEA_TOKEN=my-secret-token</b> /* additional args */ ghcr.io/sourcebot-dev/sourcebot:latest
</pre>

</details>

<details>
<summary><img src="https://gerrit-review.googlesource.com/favicon.ico" width="16" height="16"> Gerrit</summary>

Gerrit authentication is not currently supported.
</details>

</div>

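Putting the pieces together: a single run that loads your config and indexes private GitHub repos might look like this (a sketch that simply combines the volume, `-e CONFIG_PATH`, and `-e GITHUB_TOKEN` flags shown above):

```sh
docker run -p 3000:3000 --rm --name sourcebot \
  -v $(pwd):/data \
  -e CONFIG_PATH=/data/my_config.json \
  -e GITHUB_TOKEN=ghp_mytoken \
  ghcr.io/sourcebot-dev/sourcebot:latest
```
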
## Using a self-hosted GitLab / GitHub instance

If you're using a self-hosted GitLab or GitHub instance with a custom domain, you can specify the domain in your config file. See [configs/self-hosted.json](configs/self-hosted.json) for examples.

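As a minimal sketch (assuming your schema version exposes a `url` property, as the examples in [configs/self-hosted.json](configs/self-hosted.json) do), pointing a GitHub config at a custom domain might look like this:

```jsonc
{
  "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v2/index.json",
  "repos": [
    {
      "type": "github",
      // Hypothetical self-hosted GitHub Enterprise domain.
      "url": "https://github.example.com",
      "repos": [
        "my-org/my-repo"
      ]
    }
  ]
}
```
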
## Searching multiple branches

By default, Sourcebot will index the default branch. To configure Sourcebot to index multiple branches (or tags), the `revisions` field can be used:

```jsonc
{
  "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v2/index.json",
  "repos": [
    {
      "type": "github",
      "revisions": {
        // Index the `main` branch and any branches matching the `releases/*` glob pattern.
        "branches": [
          "main",
          "releases/*"
        ],
        // Index the `latest` tag and any tags matching the `v*.*.*` glob pattern.
        "tags": [
          "latest",
          "v*.*.*"
        ]
      },
      "repos": [
        "my_org/repo_a",
        "my_org/repo_b"
      ]
    }
  ]
}
```

For each repository (in this case, `repo_a` and `repo_b`), Sourcebot will index all branches and tags matching the `branches` and `tags` patterns provided. Any branches or tags that don't match the patterns will be ignored and not indexed.

To search on a specific revision, use the `revision` filter in the search bar:

<picture>
  <source media="(prefers-color-scheme: dark)" srcset=".github/images/revisions_filter_dark.png">
  <img style="max-width:700px;width:100%" src=".github/images/revisions_filter_light.png">
</picture>

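For example, a revision-scoped query might look like the line below. This is only a sketch: the `rev:` and `repo:` prefixes are assumptions on my part, so check the search bar's autocomplete for the exact filter syntax your version supports.

```
rev:releases/v1.2 repo:my_org/repo_a myFunction
```
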
## Searching a local directory

Local directories can be searched by using the `local` type in your config file:

```jsonc
{
  "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v2/index.json",
  "repos": [
    {
      "type": "local",
      "path": "/repos/my-repo",
      // re-index files when a change is detected
      "watch": true,
      "exclude": {
        // exclude paths from being indexed
        "paths": [
          "node_modules",
          "build"
        ]
      }
    }
  ]
}
```

You'll need to mount the directory as a volume when running Sourcebot:

<pre>
docker run <b>-v /path/to/my-repo:/repos/my-repo</b> /* additional args */ ghcr.io/sourcebot-dev/sourcebot:latest
</pre>

## Build from source

>[!NOTE]
> Building from source is only required if you'd like to contribute. If you'd just like to use Sourcebot, we recommend checking out our self-hosting [docs](https://docs.sourcebot.dev/self-hosting/overview) and using the [pre-built docker image](https://github.com/sourcebot-dev/sourcebot/pkgs/container/sourcebot). For contribution guidelines, check out the `CONTRIBUTING.md` file.

1. Install <a href="https://go.dev/doc/install"><img src="https://go.dev/favicon.ico" width="16" height="16"> go</a> and <a href="https://nodejs.org/"><img src="https://nodejs.org/favicon.ico" width="16" height="16"> NodeJS</a>. Note that a NodeJS version of at least `21.1.0` is required.

2. Install [ctags](https://github.com/universal-ctags/ctags) (required by zoekt)
```sh
# macOS:
brew install universal-ctags

# Linux:
snap install universal-ctags
```

3. Clone the repository with submodules:
```sh
git clone --recurse-submodules https://github.com/sourcebot-dev/sourcebot.git
```

4. Run `make` to build zoekt and install dependencies:
```sh
cd sourcebot
make
```

The zoekt binaries and web dependencies are placed into `bin` and `node_modules` respectively.

5. Create a `config.json` file at the repository root. See [Configuring Sourcebot](#configuring-sourcebot) for more information (a minimal example is shown after these steps).

6. Start Sourcebot with the command:
```sh
yarn dev
```

A `.sourcebot` directory will be created and zoekt will begin to index the repositories defined in `config.json`.

7. Start searching at `http://localhost:3000`.

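As a reference for step 5, here's a minimal `config.json` sketch that mirrors the llama.cpp example from [Configuring Sourcebot](#configuring-sourcebot); adjust the schema version and repo list to your needs:

```json
{
  "$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v2/index.json",
  "repos": [
    {
      "type": "github",
      "repos": [
        "ggerganov/llama.cpp"
      ]
    }
  ]
}
```
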
## Telemetry

By default, Sourcebot collects anonymized usage data through [PostHog](https://posthog.com/) to help us improve the performance and reliability of our tool. We do not collect or transmit [any information related to your codebase](https://sourcebot.dev/search/search?query=captureEvent%20repo%3Asourcebot%20case%3Ano). In addition, all events are [sanitized](./packages/web/src/app/posthogProvider.tsx) to ensure that no sensitive or identifying details leave your machine. The data we collect includes general usage statistics and metadata such as query performance (e.g., search duration, error rates) to monitor the application's health and functionality. This information helps us better understand how Sourcebot is used and where improvements can be made :)

If you'd like to disable all telemetry, you can do so by setting the environment variable `SOURCEBOT_TELEMETRY_DISABLED` to `1` in the docker run command:

<pre>
docker run -e <b>SOURCEBOT_TELEMETRY_DISABLED=1</b> /* additional args */ ghcr.io/sourcebot-dev/sourcebot:latest
</pre>

Or if you are [building locally](#build-from-source), create a `.env.local` file at the repository root with the following contents:
```sh
SOURCEBOT_TELEMETRY_DISABLED=1
NEXT_PUBLIC_SOURCEBOT_TELEMETRY_DISABLED=1
```

## Attributions

Sourcebot makes use of the following libraries:

- [@vscode/codicons](https://github.com/microsoft/vscode-codicons) under the [CC BY 4.0 License](https://github.com/microsoft/vscode-codicons/blob/main/LICENSE).

@@ -1,7 +1,6 @@
{
|
||||
"$schema": "./schemas/v2/index.json",
|
||||
"settings": {
|
||||
"autoDeleteStaleRepos": true,
|
||||
"reindexInterval": 86400000, // 24 hours
|
||||
"resyncInterval": 86400000 // 24 hours
|
||||
},
|
||||
@@ -17,15 +16,197 @@
}
|
||||
},
|
||||
"repos": [
|
||||
"pytorch/pytorch",
|
||||
"torvalds/linux",
|
||||
"pytorch/pytorch",
|
||||
"commaai/openpilot",
|
||||
"ggerganov/whisper.cpp",
|
||||
"ggerganov/llama.cpp",
|
||||
"codemirror/dev",
|
||||
"tailwindlabs/tailwindcss",
|
||||
"sourcebot-dev/sourcebot",
|
||||
"freeCodeCamp/freeCodeCamp",
|
||||
"EbookFoundation/free-programming-books",
|
||||
"sindresorhus/awesome",
|
||||
"public-apis/public-apis",
|
||||
"codecrafters-io/build-your-own-x",
|
||||
"jwasham/coding-interview-university",
|
||||
"kamranahmedse/developer-roadmap",
|
||||
"donnemartin/system-design-primer",
|
||||
"996icu/996.ICU",
|
||||
"facebook/react",
|
||||
"vinta/awesome-python",
|
||||
"vuejs/vue",
|
||||
"practical-tutorials/project-based-learning",
|
||||
"awesome-selfhosted/awesome-selfhosted",
|
||||
"TheAlgorithms/Python",
|
||||
"trekhleb/javascript-algorithms",
|
||||
"tensorflow/tensorflow",
|
||||
"getify/You-Dont-Know-JS",
|
||||
"CyC2018/CS-Notes",
|
||||
"ohmyzsh/ohmyzsh",
|
||||
"ossu/computer-science",
|
||||
"twbs/bootstrap",
|
||||
"Significant-Gravitas/AutoGPT",
|
||||
"flutter/flutter",
|
||||
"microsoft/vscode",
|
||||
"github/gitignore",
|
||||
"jackfrued/Python-100-Days",
|
||||
"jlevy/the-art-of-command-line",
|
||||
"trimstray/the-book-of-secret-knowledge",
|
||||
"Snailclimb/JavaGuide",
|
||||
"airbnb/javascript",
|
||||
"AUTOMATIC1111/stable-diffusion-webui",
|
||||
"huggingface/transformers",
|
||||
"avelino/awesome-go",
|
||||
"ytdl-org/youtube-dl",
|
||||
"vercel/next.js",
|
||||
"labuladong/fucking-algorithm",
|
||||
"golang/go",
|
||||
"Chalarangelo/30-seconds-of-code",
|
||||
"yangshun/tech-interview-handbook",
|
||||
"facebook/react-native",
|
||||
"electron/electron",
|
||||
"Genymobile/scrcpy",
|
||||
"f/awesome-chatgpt-prompts",
|
||||
"microsoft/PowerToys",
|
||||
"justjavac/free-programming-books-zh_CN",
|
||||
"kubernetes/kubernetes",
|
||||
"d3/d3",
|
||||
"nodejs/node",
|
||||
"massgravel/Microsoft-Activation-Scripts",
|
||||
"axios/axios",
|
||||
"mrdoob/three.js",
|
||||
"krahets/hello-algo",
|
||||
"facebook/create-react-app",
|
||||
"ollama/ollama",
|
||||
"microsoft/TypeScript",
|
||||
"goldbergyoni/nodebestpractices",
|
||||
"rust-lang/rust",
|
||||
"denoland/deno",
|
||||
"angular/angular",
|
||||
"ggerganov/llama.cpp",
|
||||
"langchain-ai/langchain",
|
||||
"microsoft/terminal",
|
||||
"521xueweihan/HelloGitHub",
|
||||
"mui/material-ui",
|
||||
"ant-design/ant-design",
|
||||
"yt-dlp/yt-dlp",
|
||||
"ryanmcdermott/clean-code-javascript",
|
||||
"godotengine/godot",
|
||||
"ripienaar/free-for-dev",
|
||||
"iluwatar/java-design-patterns",
|
||||
"puppeteer/puppeteer",
|
||||
"papers-we-love/papers-we-love",
|
||||
"PanJiaChen/vue-element-admin",
|
||||
"iptv-org/iptv",
|
||||
"fatedier/frp",
|
||||
"excalidraw/excalidraw",
|
||||
"tauri-apps/tauri",
|
||||
"Hack-with-Github/Awesome-Hacking",
|
||||
"nvbn/thefuck",
|
||||
"mtdvio/every-programmer-should-know",
|
||||
"storybookjs/storybook",
|
||||
"neovim/neovim",
|
||||
"microsoft/Web-Dev-For-Beginners",
|
||||
"django/django",
|
||||
"florinpop17/app-ideas",
|
||||
"animate-css/animate.css",
|
||||
"nvm-sh/nvm",
|
||||
"gothinkster/realworld",
|
||||
"bitcoin/bitcoin",
|
||||
"sveltejs/svelte",
|
||||
"opencv/opencv",
|
||||
"gin-gonic/gin",
|
||||
"laravel/laravel",
|
||||
"fastapi/fastapi",
|
||||
"macrozheng/mall",
|
||||
"jaywcjlove/awesome-mac",
|
||||
"tonsky/FiraCode",
|
||||
"ChatGPTNextWeb/ChatGPT-Next-Web",
|
||||
"rustdesk/rustdesk",
|
||||
"tensorflow/models",
|
||||
"doocs/advanced-java",
|
||||
"shadcn-ui/ui",
|
||||
"gohugoio/hugo",
|
||||
"MisterBooo/LeetCodeAnimation",
|
||||
"spring-projects/spring-boot",
|
||||
"supabase/supabase",
|
||||
"oven-sh/bun",
|
||||
"FortAwesome/Font-Awesome",
|
||||
"home-assistant/core",
|
||||
"typicode/json-server",
|
||||
"mermaid-js/mermaid",
|
||||
"openai/whisper",
|
||||
"netdata/netdata",
|
||||
"vuejs/awesome-vue",
|
||||
"DopplerHQ/awesome-interview-questions",
|
||||
"3b1b/manim",
|
||||
"2dust/v2rayN",
|
||||
"nomic-ai/gpt4all",
|
||||
"elastic/elasticsearch",
|
||||
"anuraghazra/github-readme-stats",
|
||||
"microsoft/ML-For-Beginners",
|
||||
"MunGell/awesome-for-beginners",
|
||||
"fighting41love/funNLP",
|
||||
"vitejs/vite",
|
||||
"thedaviddias/Front-End-Checklist",
|
||||
"coder/code-server",
|
||||
"moby/moby",
|
||||
"CompVis/stable-diffusion",
|
||||
"base-org/node",
|
||||
"nestjs/nest",
|
||||
"pallets/flask",
|
||||
"hakimel/reveal.js",
|
||||
"Anduin2017/HowToCook",
|
||||
"microsoft/playwright",
|
||||
"swiftlang/swift",
|
||||
"Developer-Y/cs-video-courses",
|
||||
"redis/redis",
|
||||
"bregman-arie/devops-exercises",
|
||||
"josephmisiti/awesome-machine-learning",
|
||||
"binary-husky/gpt_academic",
|
||||
"junegunn/fzf",
|
||||
"syncthing/syncthing",
|
||||
"hoppscotch/hoppscotch",
|
||||
"protocolbuffers/protobuf",
|
||||
"enaqx/awesome-react",
|
||||
"expressjs/express",
|
||||
"microsoft/generative-ai-for-beginners",
|
||||
"grafana/grafana",
|
||||
"abi/screenshot-to-code",
|
||||
"ByteByteGoHq/system-design-101",
|
||||
"chartjs/Chart.js",
|
||||
"webpack/webpack",
|
||||
"d2l-ai/d2l-zh",
|
||||
"sdmg15/Best-websites-a-programmer-should-visit",
|
||||
"strapi/strapi",
|
||||
"python/cpython",
|
||||
"leonardomso/33-js-concepts",
|
||||
"kdn251/interviews",
|
||||
"ventoy/Ventoy",
|
||||
"ansible/ansible",
|
||||
"apache/superset",
|
||||
"tesseract-ocr/tesseract",
|
||||
"lydiahallie/javascript-questions",
|
||||
"xtekky/gpt4free",
|
||||
"FuelLabs/sway",
|
||||
"twitter/the-algorithm",
|
||||
"keras-team/keras",
|
||||
"resume/resume.github.com",
|
||||
"swisskyrepo/PayloadsAllTheThings",
|
||||
"ocornut/imgui",
|
||||
"socketio/socket.io",
|
||||
"awesomedata/awesome-public-datasets",
|
||||
"louislam/uptime-kuma",
|
||||
"kelseyhightower/nocode",
|
||||
"sherlock-project/sherlock",
|
||||
"reduxjs/redux",
|
||||
"apache/echarts",
|
||||
"obsproject/obs-studio",
|
||||
"openai/openai-cookbook",
|
||||
"fffaraz/awesome-cpp",
|
||||
"scikit-learn/scikit-learn",
|
||||
"TheAlgorithms/Java",
|
||||
"atom/atom",
|
||||
"Eugeny/tabby",
|
||||
"lodash/lodash",
|
||||
"caddyserver/caddy",
|
||||
docs/.editorconfig (new file, 59 lines)
@@ -0,0 +1,59 @@
[*]
|
||||
cpp_indent_braces=false
|
||||
cpp_indent_multi_line_relative_to=innermost_parenthesis
|
||||
cpp_indent_within_parentheses=indent
|
||||
cpp_indent_preserve_within_parentheses=false
|
||||
cpp_indent_case_labels=false
|
||||
cpp_indent_case_contents=true
|
||||
cpp_indent_case_contents_when_block=false
|
||||
cpp_indent_lambda_braces_when_parameter=true
|
||||
cpp_indent_goto_labels=one_left
|
||||
cpp_indent_preprocessor=leftmost_column
|
||||
cpp_indent_access_specifiers=false
|
||||
cpp_indent_namespace_contents=true
|
||||
cpp_indent_preserve_comments=false
|
||||
cpp_new_line_before_open_brace_namespace=ignore
|
||||
cpp_new_line_before_open_brace_type=ignore
|
||||
cpp_new_line_before_open_brace_function=ignore
|
||||
cpp_new_line_before_open_brace_block=ignore
|
||||
cpp_new_line_before_open_brace_lambda=ignore
|
||||
cpp_new_line_scope_braces_on_separate_lines=false
|
||||
cpp_new_line_close_brace_same_line_empty_type=false
|
||||
cpp_new_line_close_brace_same_line_empty_function=false
|
||||
cpp_new_line_before_catch=true
|
||||
cpp_new_line_before_else=true
|
||||
cpp_new_line_before_while_in_do_while=false
|
||||
cpp_space_before_function_open_parenthesis=remove
|
||||
cpp_space_within_parameter_list_parentheses=false
|
||||
cpp_space_between_empty_parameter_list_parentheses=false
|
||||
cpp_space_after_keywords_in_control_flow_statements=true
|
||||
cpp_space_within_control_flow_statement_parentheses=false
|
||||
cpp_space_before_lambda_open_parenthesis=false
|
||||
cpp_space_within_cast_parentheses=false
|
||||
cpp_space_after_cast_close_parenthesis=false
|
||||
cpp_space_within_expression_parentheses=false
|
||||
cpp_space_before_block_open_brace=true
|
||||
cpp_space_between_empty_braces=false
|
||||
cpp_space_before_initializer_list_open_brace=false
|
||||
cpp_space_within_initializer_list_braces=true
|
||||
cpp_space_preserve_in_initializer_list=true
|
||||
cpp_space_before_open_square_bracket=false
|
||||
cpp_space_within_square_brackets=false
|
||||
cpp_space_before_empty_square_brackets=false
|
||||
cpp_space_between_empty_square_brackets=false
|
||||
cpp_space_group_square_brackets=true
|
||||
cpp_space_within_lambda_brackets=false
|
||||
cpp_space_between_empty_lambda_brackets=false
|
||||
cpp_space_before_comma=false
|
||||
cpp_space_after_comma=true
|
||||
cpp_space_remove_around_member_operators=true
|
||||
cpp_space_before_inheritance_colon=true
|
||||
cpp_space_before_constructor_colon=true
|
||||
cpp_space_remove_before_semicolon=true
|
||||
cpp_space_after_semicolon=false
|
||||
cpp_space_remove_around_unary_operator=true
|
||||
cpp_space_around_binary_operator=insert
|
||||
cpp_space_around_assignment_operator=insert
|
||||
cpp_space_pointer_reference_alignment=left
|
||||
cpp_space_around_ternary_operator=insert
|
||||
cpp_wrap_preserve_blocks=one_liners
|
||||
docs/README.md (new file, 32 lines)
@@ -0,0 +1,32 @@
# Mintlify Starter Kit
|
||||
|
||||
Click on `Use this template` to copy the Mintlify starter kit. The starter kit contains examples including
|
||||
|
||||
- Guide pages
|
||||
- Navigation
|
||||
- Customizations
|
||||
- API Reference pages
|
||||
- Use of popular components
|
||||
|
||||
### Development
|
||||
|
||||
Install the [Mintlify CLI](https://www.npmjs.com/package/mintlify) to preview the documentation changes locally. To install, use the following command
|
||||
|
||||
```
|
||||
npm i -g mintlify
|
||||
```
|
||||
|
||||
Run the following command at the root of your documentation (where docs.json is)
|
||||
|
||||
```
|
||||
mintlify dev
|
||||
```
|
||||
|
||||
### Publishing Changes
|
||||
|
||||
Install our Github App to auto propagate changes from your repo to your deployment. Changes will be deployed to production automatically after pushing to the default branch. Find the link to install on your dashboard.
|
||||
|
||||
#### Troubleshooting
|
||||
|
||||
- Mintlify dev isn't running - Run `mintlify install`; it will re-install dependencies.
|
||||
- Page loads as a 404 - Make sure you are running in a folder with `docs.json`
|
||||
docs/development.mdx (new file, 107 lines)
@@ -0,0 +1,107 @@
---
|
||||
title: 'Development'
|
||||
description: 'Preview changes locally to update your docs'
|
||||
---
|
||||
|
||||
<Info>
|
||||
**Prerequisite**: Please install Node.js (version 19 or higher) before proceeding. <br />
|
||||
Please upgrade to ```docs.json``` before proceeding and delete the legacy ```mint.json``` file.
|
||||
</Info>
|
||||
|
||||
Follow these steps to install and run Mintlify on your operating system:
|
||||
|
||||
**Step 1**: Install Mintlify:
|
||||
|
||||
<CodeGroup>
|
||||
|
||||
```bash npm
|
||||
npm i -g mintlify
|
||||
```
|
||||
|
||||
```bash yarn
|
||||
yarn global add mintlify
|
||||
```
|
||||
|
||||
</CodeGroup>
|
||||
|
||||
**Step 2**: Navigate to the docs directory (where the `docs.json` file is located) and execute the following command:
|
||||
|
||||
```bash
|
||||
mintlify dev
|
||||
```
|
||||
|
||||
A local preview of your documentation will be available at `http://localhost:3000`.
|
||||
|
||||
### Custom Ports
|
||||
|
||||
By default, Mintlify uses port 3000. You can customize the port Mintlify runs on by using the `--port` flag. To run Mintlify on port 3333, for instance, use this command:
|
||||
|
||||
```bash
|
||||
mintlify dev --port 3333
|
||||
```
|
||||
|
||||
If you attempt to run Mintlify on a port that's already in use, it will use the next available port:
|
||||
|
||||
```md
|
||||
Port 3000 is already in use. Trying 3001 instead.
|
||||
```
|
||||
|
||||
## Mintlify Versions
|
||||
|
||||
Please note that each CLI release is associated with a specific version of Mintlify. If your local website doesn't align with the production version, please update the CLI:
|
||||
|
||||
<CodeGroup>
|
||||
|
||||
```bash npm
|
||||
npm i -g mintlify@latest
|
||||
```
|
||||
|
||||
```bash yarn
|
||||
yarn global upgrade mintlify
|
||||
```
|
||||
|
||||
</CodeGroup>
|
||||
|
||||
## Validating Links
|
||||
|
||||
The CLI can assist with validating reference links made in your documentation. To identify any broken links, use the following command:
|
||||
|
||||
```bash
|
||||
mintlify broken-links
|
||||
```
|
||||
|
||||
## Deployment
|
||||
|
||||
<Tip>
|
||||
Unlimited editors available under the [Pro
|
||||
Plan](https://mintlify.com/pricing) and above.
|
||||
</Tip>
|
||||
|
||||
If the deployment is successful, you should see the following:
|
||||
|
||||
<Frame>
|
||||
<img src="/images/checks-passed.png" style={{ borderRadius: '0.5rem' }} />
|
||||
</Frame>
|
||||
|
||||
## Code Formatting
|
||||
|
||||
We suggest using extensions on your IDE to recognize and format MDX. If you're a VSCode user, consider the [MDX VSCode extension](https://marketplace.visualstudio.com/items?itemName=unifiedjs.vscode-mdx) for syntax highlighting, and [Prettier](https://marketplace.visualstudio.com/items?itemName=esbenp.prettier-vscode) for code formatting.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title='Error: Could not load the "sharp" module using the darwin-arm64 runtime'>
|
||||
|
||||
This may be due to an outdated version of node. Try the following:
|
||||
1. Remove the currently-installed version of mintlify: `npm remove -g mintlify`
|
||||
2. Upgrade to Node v19 or higher.
|
||||
3. Reinstall mintlify: `npm install -g mintlify`
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Issue: Encountering an unknown error">
|
||||
|
||||
Solution: Go to the root of your device and delete the \~/.mintlify folder. Afterwards, run `mintlify dev` again.
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
Curious about what changed in the CLI version? [Check out the CLI changelog.](https://www.npmjs.com/package/mintlify?activeTab=versions)
|
||||
docs/docs.json (new file, 123 lines)
@@ -0,0 +1,123 @@
{
|
||||
"$schema": "https://mintlify.com/docs.json",
|
||||
"theme": "mint",
|
||||
"name": "Sourcebot",
|
||||
"colors": {
|
||||
"primary": "#851EE7",
|
||||
"light": "#FFFFFF",
|
||||
"dark": "#851EE7"
|
||||
},
|
||||
"favicon": "/fav.svg",
|
||||
"styling": {
|
||||
"eyebrows": "section"
|
||||
},
|
||||
"navigation": {
|
||||
"anchors": [
|
||||
{
|
||||
"anchor": "Docs",
|
||||
"icon": "book-open",
|
||||
"groups": [
|
||||
{
|
||||
"group": "General",
|
||||
"pages": [
|
||||
"docs/overview",
|
||||
"docs/getting-started",
|
||||
"docs/getting-started-selfhost"
|
||||
]
|
||||
},
|
||||
{
|
||||
"group": "Connecting your code",
|
||||
"pages": [
|
||||
"docs/connections/overview",
|
||||
"docs/connections/github",
|
||||
"docs/connections/gitlab",
|
||||
"docs/connections/gitea",
|
||||
"docs/connections/gerrit",
|
||||
"docs/connections/request-new"
|
||||
]
|
||||
},
|
||||
{
|
||||
"group": "More",
|
||||
"pages": [
|
||||
"docs/more/roles-and-permissions"
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"anchor": "Self Hosting",
|
||||
"icon": "server",
|
||||
"groups": [
|
||||
{
|
||||
"group": "Getting Started",
|
||||
"pages": [
|
||||
"self-hosting/overview",
|
||||
"self-hosting/configuration"
|
||||
]
|
||||
},
|
||||
{
|
||||
"group": "More",
|
||||
"pages": [
|
||||
"self-hosting/more/authentication",
|
||||
"self-hosting/more/tenancy",
|
||||
"self-hosting/more/transactional-emails",
|
||||
"self-hosting/more/declarative-config"
|
||||
]
|
||||
},
|
||||
{
|
||||
"group": "Security",
|
||||
"pages": [
|
||||
]
|
||||
},
|
||||
{
|
||||
"group": "Upgrade",
|
||||
"pages": [
|
||||
"self-hosting/upgrade/v2-to-v3-guide"
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"anchor": "Changelog",
|
||||
"href": "https://sourcebot.dev/changelog",
|
||||
"icon": "list-check"
|
||||
},
|
||||
{
|
||||
"anchor": "Support",
|
||||
"href": "https://github.com/sourcebot-dev/sourcebot/discussions/categories/support",
|
||||
"icon": "life-ring"
|
||||
}
|
||||
]
|
||||
},
|
||||
"logo": {
|
||||
"light": "/logo/light.png",
|
||||
"dark": "/logo/dark.png"
|
||||
},
|
||||
"navbar": {
|
||||
"links": [
|
||||
{
|
||||
"label": "GitHub",
|
||||
"href": "https://github.com/sourcebot-dev/sourcebot"
|
||||
}
|
||||
],
|
||||
"primary": {
|
||||
"type": "button",
|
||||
"label": "Sourcebot Cloud",
|
||||
"href": "https://app.sourcebot.dev"
|
||||
}
|
||||
},
|
||||
"footer": {
|
||||
"socials": {
|
||||
"github": "https://github.com/sourcebot-dev/sourcebot"
|
||||
}
|
||||
},
|
||||
"integrations": {
|
||||
"posthog": {
|
||||
"apiKey": "phc_DBGufjG0rkj3OEhuTcZ04xfeZB6eDhO7dP8ZCnqH7K7"
|
||||
}
|
||||
},
|
||||
"appearance": {
|
||||
"default": "dark",
|
||||
"strict": true
|
||||
}
|
||||
}
|
||||
docs/docs/connections/gerrit.mdx (new file, 125 lines)
@@ -0,0 +1,125 @@
---
|
||||
title: Linking code from Gerrit
|
||||
sidebarTitle: Gerrit
|
||||
---
|
||||
|
||||
<Note>Authenticating with Gerrit is currently not supported. If you need this capability, please raise a [feature request](https://github.com/sourcebot-dev/sourcebot/discussions/categories/ideas).</Note>
|
||||
|
||||
Sourcebot can sync code from self-hosted Gerrit instances.
|
||||
|
||||
## Connecting to a Gerrit instance
|
||||
|
||||
To connect to a Gerrit instance, provide the `url` property to your config:
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "gerrit",
|
||||
"url": "https://gerrit.example.com"
|
||||
// .. rest of config ..
|
||||
}
|
||||
```
|
||||
|
||||
## Examples
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Sync projects by glob pattern">
|
||||
```json
|
||||
{
|
||||
"type": "gerrit",
|
||||
"url": "https://gerrit.example.com",
|
||||
// Sync all repos under project1 and project2/sub-project
|
||||
"projects": [
|
||||
"project1/**",
|
||||
"project2/sub-project/**"
|
||||
]
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
<Accordion title="Exclude repos from syncing">
|
||||
```json
|
||||
{
|
||||
"type": "gerrit",
|
||||
"url": "https://gerrit.example.com",
|
||||
// Sync all repos under project1 and project2/sub-project...
|
||||
"projects": [
|
||||
"project1/**",
|
||||
"project2/sub-project/**"
|
||||
],
|
||||
// ...except:
|
||||
"exclude": {
|
||||
// any project that matches these glob patterns
|
||||
"projects": [
|
||||
"project1/foo-project",
|
||||
"project2/sub-project/some-sub-folder/**"
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
## Schema reference
|
||||
|
||||
<Accordion title="Reference">
|
||||
[schemas/v3/gerrit.json](https://github.com/sourcebot-dev/sourcebot/blob/main/schemas/v3/gerrit.json)
|
||||
|
||||
```json
|
||||
{
|
||||
"$schema": "http://json-schema.org/draft-07/schema#",
|
||||
"type": "object",
|
||||
"title": "GerritConnectionConfig",
|
||||
"properties": {
|
||||
"type": {
|
||||
"const": "gerrit",
|
||||
"description": "Gerrit Configuration"
|
||||
},
|
||||
"url": {
|
||||
"type": "string",
|
||||
"format": "url",
|
||||
"description": "The URL of the Gerrit host.",
|
||||
"examples": [
|
||||
"https://gerrit.example.com"
|
||||
],
|
||||
"pattern": "^https?:\\/\\/[^\\s/$.?#].[^\\s]*$"
|
||||
},
|
||||
"projects": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"description": "List of specific projects to sync. If not specified, all projects will be synced. Glob patterns are supported",
|
||||
"examples": [
|
||||
[
|
||||
"project1/repo1",
|
||||
"project2/**"
|
||||
]
|
||||
]
|
||||
},
|
||||
"exclude": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"projects": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"project1/repo1",
|
||||
"project2/**"
|
||||
]
|
||||
],
|
||||
"description": "List of specific projects to exclude from syncing."
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"type",
|
||||
"url"
|
||||
],
|
||||
"additionalProperties": false
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
docs/docs/connections/gitea.mdx (new file, 308 lines)
@@ -0,0 +1,308 @@
---
|
||||
title: Linking code from Gitea
|
||||
sidebarTitle: Gitea
|
||||
---
|
||||
|
||||
Sourcebot can sync code from Gitea Cloud and self-hosted Gitea instances.
|
||||
|
||||
## Examples
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Sync individual repos">
|
||||
```json
|
||||
{
|
||||
"type": "gitea",
|
||||
"repos": [
|
||||
"sourcebot-dev/sourcebot",
|
||||
"getsentry/sentry",
|
||||
"torvalds/linux"
|
||||
]
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
<Accordion title="Sync all repos in a organization">
|
||||
```json
|
||||
{
|
||||
"type": "gitea",
|
||||
"orgs": [
|
||||
"sourcebot-dev",
|
||||
"getsentry",
|
||||
"vercel"
|
||||
]
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
<Accordion title="Sync all repos owned by a user">
|
||||
```json
|
||||
{
|
||||
"type": "gitea",
|
||||
"users": [
|
||||
"torvalds",
|
||||
"ggerganov"
|
||||
]
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
<Accordion title="Exclude repos from syncing">
|
||||
```json
|
||||
{
|
||||
"type": "gitea",
|
||||
// Include all repos in my-org...
|
||||
"orgs": [
|
||||
"my-org"
|
||||
],
|
||||
// ...except:
|
||||
"exclude": {
|
||||
// repos that are archived
|
||||
"archived": true,
|
||||
// repos that are forks
|
||||
"forks": true,
|
||||
// repos that match these glob patterns
|
||||
"repos": [
|
||||
"my-org/repo1",
|
||||
"my-org/repo2",
|
||||
"my-org/sub-org-1/**",
|
||||
"my-org/sub-org-*/**"
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
## Authenticating with Gitea
|
||||
|
||||
In order to index private repositories, you'll need to generate a Gitea access token [here](http://gitea.com/user/settings/applications). At minimum, you'll need to select the `read:repository` scope; `read:user` and `read:organization` are additionally required for the `users` and `orgs` fields of your config file:
|
||||
|
||||

|
||||
|
||||
Next, provide the access token via the `token` property, either as an environment variable or a secret:
|
||||
|
||||
<Tabs>
|
||||
<Tab title="Environment Variable">
|
||||
<Note>Environment variables are only supported in a [declarative config](/self-hosting/more/declarative-config) and cannot be used in the web UI.</Note>
|
||||
|
||||
1. Add the `token` property to your connection config:
|
||||
```json
|
||||
{
|
||||
"type": "gitea",
|
||||
"token": {
|
||||
// note: this env var can be named anything. It
|
||||
// doesn't need to be `GITEA_TOKEN`.
|
||||
"env": "GITEA_TOKEN"
|
||||
}
|
||||
// .. rest of config ..
|
||||
}
|
||||
```
|
||||
|
||||
2. Pass this environment variable each time you run Sourcebot:
|
||||
```bash
|
||||
docker run \
|
||||
-e GITEA_TOKEN=<PAT> \
|
||||
/* additional args */ \
|
||||
ghcr.io/sourcebot-dev/sourcebot:latest
|
||||
```
|
||||
</Tab>
|
||||
|
||||
<Tab title="Secret">
|
||||
<Note>Secrets are only supported when [authentication](/self-hosting/more/authentication) is enabled.</Note>
|
||||
|
||||
1. Navigate to **Secrets** in settings and create a new secret with your PAT:
|
||||
|
||||

|
||||
|
||||
2. Add the `token` property to your connection config:
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "gitea",
|
||||
"token": {
|
||||
"secret": "mysecret"
|
||||
}
|
||||
// .. rest of config ..
|
||||
}
|
||||
```
|
||||
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
## Connecting to a custom Gitea
|
||||
|
||||
To connect to a custom Gitea deployment, provide the `url` property to your config:
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "gitea",
|
||||
"url": "https://gitea.example.com"
|
||||
// .. rest of config ..
|
||||
}
|
||||
```
|
||||
|
||||
## Schema reference
|
||||
|
||||
<Accordion title="Reference">
|
||||
[schemas/v3/gitea.json](https://github.com/sourcebot-dev/sourcebot/blob/main/schemas/v3/gitea.json)
|
||||
|
||||
```json
|
||||
{
|
||||
"$schema": "http://json-schema.org/draft-07/schema#",
|
||||
"type": "object",
|
||||
"title": "GiteaConnectionConfig",
|
||||
"properties": {
|
||||
"type": {
|
||||
"const": "gitea",
|
||||
"description": "Gitea Configuration"
|
||||
},
|
||||
"token": {
|
||||
"description": "A Personal Access Token (PAT).",
|
||||
"examples": [
|
||||
{
|
||||
"secret": "SECRET_KEY"
|
||||
}
|
||||
],
|
||||
"anyOf": [
|
||||
{
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"secret": {
|
||||
"type": "string",
|
||||
"description": "The name of the secret that contains the token."
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"secret"
|
||||
],
|
||||
"additionalProperties": false
|
||||
},
|
||||
{
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"env": {
|
||||
"type": "string",
|
||||
"description": "The name of the environment variable that contains the token. Only supported in declarative connection configs."
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"env"
|
||||
],
|
||||
"additionalProperties": false
|
||||
}
|
||||
]
|
||||
},
|
||||
"url": {
|
||||
"type": "string",
|
||||
"format": "url",
|
||||
"default": "https://gitea.com",
|
||||
"description": "The URL of the Gitea host. Defaults to https://gitea.com",
|
||||
"examples": [
|
||||
"https://gitea.com",
|
||||
"https://gitea.example.com"
|
||||
],
|
||||
"pattern": "^https?:\\/\\/[^\\s/$.?#].[^\\s]*$"
|
||||
},
|
||||
"orgs": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"my-org-name"
|
||||
]
|
||||
],
|
||||
"description": "List of organizations to sync with. All repositories in the organization visible to the provided `token` (if any) will be synced, unless explicitly defined in the `exclude` property. If a `token` is provided, it must have the read:organization scope."
|
||||
},
|
||||
"repos": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"pattern": "^[\\w.-]+\\/[\\w.-]+$"
|
||||
},
|
||||
"description": "List of individual repositories to sync with. Expected to be formatted as '{orgName}/{repoName}' or '{userName}/{repoName}'."
|
||||
},
|
||||
"users": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"username-1",
|
||||
"username-2"
|
||||
]
|
||||
],
|
||||
"description": "List of users to sync with. All repositories that the user owns will be synced, unless explicitly defined in the `exclude` property. If a `token` is provided, it must have the read:user scope."
|
||||
},
|
||||
"exclude": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"forks": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Exclude forked repositories from syncing."
|
||||
},
|
||||
"archived": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Exclude archived repositories from syncing."
|
||||
},
|
||||
"repos": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"default": [],
|
||||
"description": "List of individual repositories to exclude from syncing. Glob patterns are supported."
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
},
|
||||
"revisions": {
|
||||
"type": "object",
|
||||
"description": "The revisions (branches, tags) that should be included when indexing. The default branch (HEAD) is always indexed. A maximum of 64 revisions can be indexed, with any additional revisions being ignored.",
|
||||
"properties": {
|
||||
"branches": {
|
||||
"type": "array",
|
||||
"description": "List of branches to include when indexing. For a given repo, only the branches that exist on the repo's remote *and* match at least one of the provided `branches` will be indexed. The default branch (HEAD) is always indexed. Glob patterns are supported. A maximum of 64 branches can be indexed, with any additional branches being ignored.",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"main",
|
||||
"release/*"
|
||||
],
|
||||
[
|
||||
"**"
|
||||
]
|
||||
],
|
||||
"default": []
|
||||
},
|
||||
"tags": {
|
||||
"type": "array",
|
||||
"description": "List of tags to include when indexing. For a given repo, only the tags that exist on the repo's remote *and* match at least one of the provided `tags` will be indexed. Glob patterns are supported. A maximum of 64 tags can be indexed, with any additional tags being ignored.",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"latest",
|
||||
"v2.*.*"
|
||||
],
|
||||
[
|
||||
"**"
|
||||
]
|
||||
],
|
||||
"default": []
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"type"
|
||||
],
|
||||
"additionalProperties": false
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
docs/docs/connections/github.mdx (new file, 391 lines)
@@ -0,0 +1,391 @@
---
|
||||
title: Linking code from GitHub
|
||||
sidebarTitle: GitHub
|
||||
---
|
||||
|
||||
Sourcebot can sync code from GitHub.com, GitHub Enterprise Server, and GitHub Enterprise Cloud.
|
||||
|
||||
## Examples
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Sync individual repos">
|
||||
```json
|
||||
{
|
||||
"type": "github",
|
||||
"repos": [
|
||||
"sourcebot-dev/sourcebot",
|
||||
"getsentry/sentry",
|
||||
"torvalds/linux"
|
||||
]
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
<Accordion title="Sync all repos in a organization">
|
||||
```json
|
||||
{
|
||||
"type": "github",
|
||||
"orgs": [
|
||||
"sourcebot-dev",
|
||||
"getsentry",
|
||||
"vercel"
|
||||
]
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
<Accordion title="Sync all repos owned by a user">
|
||||
```json
|
||||
{
|
||||
"type": "github",
|
||||
"users": [
|
||||
"torvalds",
|
||||
"ggerganov"
|
||||
]
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
<Accordion title="Filter repos by topic">
|
||||
```json
|
||||
{
|
||||
"type": "github",
|
||||
// Sync all repos in `my-org` that have a topic that...
|
||||
"orgs": [
|
||||
"my-org"
|
||||
],
|
||||
// ...match one of these glob patterns.
|
||||
"topics": [
|
||||
"test-*",
|
||||
"ci-*",
|
||||
"k8s"
|
||||
]
|
||||
}
|
||||
|
||||
```
|
||||
</Accordion>
|
||||
<Accordion title="Exclude repos from syncing">
|
||||
```json
|
||||
{
|
||||
"type": "github",
|
||||
// Include all repos in my-org...
|
||||
"orgs": [
|
||||
"my-org"
|
||||
],
|
||||
// ...except:
|
||||
"exclude": {
|
||||
// repos that are archived
|
||||
"archived": true,
|
||||
// repos that are forks
|
||||
"forks": true,
|
||||
// repos that match these glob patterns
|
||||
"repos": [
|
||||
"my-org/repo1",
|
||||
"my-org/repo2",
|
||||
"my-org/sub-org-1/**",
|
||||
"my-org/sub-org-*/**"
|
||||
],
|
||||
"size": {
|
||||
// repos that are less than 1MB (in bytes)...
|
||||
"min": 1048576,
|
||||
// or repos greater than 100MB (in bytes)
|
||||
"max": 104857600
|
||||
},
|
||||
// repos with topics that match these glob patterns
|
||||
"topics": [
|
||||
"test-*",
|
||||
"ci"
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
## Authenticating with GitHub
|
||||
|
||||
In order to index private repositories, you'll need to generate a GitHub Personal Access Token (PAT). Create a new PAT [here](https://github.com/settings/tokens/new) and make sure you select the `repo` scope:
|
||||
|
||||

|
||||
|
||||
Next, provide the PAT via the `token` property, either as an environment variable or a secret:
|
||||
|
||||
<Tabs>
|
||||
<Tab title="Environment Variable">
|
||||
<Note>Environment variables are only supported in a [declarative config](/self-hosting/more/declarative-config) and cannot be used in the web UI.</Note>
|
||||
|
||||
1. Add the `token` property to your connection config:
|
||||
```json
|
||||
{
|
||||
"type": "github",
|
||||
"token": {
|
||||
// note: this env var can be named anything. It
|
||||
// doesn't need to be `GITHUB_TOKEN`.
|
||||
"env": "GITHUB_TOKEN"
|
||||
}
|
||||
// .. rest of config ..
|
||||
}
|
||||
```
|
||||
|
||||
2. Pass this environment variable each time you run Sourcebot:
|
||||
```bash
|
||||
docker run \
|
||||
-e GITHUB_TOKEN=<PAT> \
|
||||
/* additional args */ \
|
||||
ghcr.io/sourcebot-dev/sourcebot:latest
|
||||
```
|
||||
</Tab>
|
||||
|
||||
<Tab title="Secret">
|
||||
<Note>Secrets are only supported when [authentication](/self-hosting/more/authentication) is enabled.</Note>
|
||||
|
||||
1. Navigate to **Secrets** in settings and create a new secret with your PAT:
|
||||
|
||||

|
||||
|
||||
2. Add the `token` property to your connection config:
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "github",
|
||||
"token": {
|
||||
"secret": "mysecret"
|
||||
}
|
||||
// .. rest of config ..
|
||||
}
|
||||
```
|
||||
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
## Connecting to a custom GitHub host
|
||||
|
||||
To connect to a GitHub host other than `github.com`, provide the `url` property to your config:
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "github",
|
||||
"url": "https://github.example.com"
|
||||
// .. rest of config ..
|
||||
}
|
||||
```
|
||||
|
||||
## Schema reference
|
||||
|
||||
<Accordion title="Reference">
|
||||
[schemas/v3/github.json](https://github.com/sourcebot-dev/sourcebot/blob/main/schemas/v3/github.json)
|
||||
|
||||
```json
|
||||
{
|
||||
"$schema": "http://json-schema.org/draft-07/schema#",
|
||||
"type": "object",
|
||||
"title": "GithubConnectionConfig",
|
||||
"properties": {
|
||||
"type": {
|
||||
"const": "github",
|
||||
"description": "GitHub Configuration"
|
||||
},
|
||||
"token": {
|
||||
"description": "A Personal Access Token (PAT).",
|
||||
"examples": [
|
||||
{
|
||||
"secret": "SECRET_KEY"
|
||||
}
|
||||
],
|
||||
"anyOf": [
|
||||
{
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"secret": {
|
||||
"type": "string",
|
||||
"description": "The name of the secret that contains the token."
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"secret"
|
||||
],
|
||||
"additionalProperties": false
|
||||
},
|
||||
{
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"env": {
|
||||
"type": "string",
|
||||
"description": "The name of the environment variable that contains the token. Only supported in declarative connection configs."
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"env"
|
||||
],
|
||||
"additionalProperties": false
|
||||
}
|
||||
]
|
||||
},
|
||||
"url": {
|
||||
"type": "string",
|
||||
"format": "url",
|
||||
"default": "https://github.com",
|
||||
"description": "The URL of the GitHub host. Defaults to https://github.com",
|
||||
"examples": [
|
||||
"https://github.com",
|
||||
"https://github.example.com"
|
||||
],
|
||||
"pattern": "^https?:\\/\\/[^\\s/$.?#].[^\\s]*$"
|
||||
},
|
||||
"users": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"pattern": "^[\\w.-]+$"
|
||||
},
|
||||
"default": [],
|
||||
"examples": [
|
||||
[
|
||||
"torvalds",
|
||||
"DHH"
|
||||
]
|
||||
],
|
||||
"description": "List of users to sync with. All repositories that the user owns will be synced, unless explicitly defined in the `exclude` property."
|
||||
},
|
||||
"orgs": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"pattern": "^[\\w.-]+$"
|
||||
},
|
||||
"default": [],
|
||||
"examples": [
|
||||
[
|
||||
"my-org-name"
|
||||
],
|
||||
[
|
||||
"sourcebot-dev",
|
||||
"commaai"
|
||||
]
|
||||
],
|
||||
"description": "List of organizations to sync with. All repositories in the organization visible to the provided `token` (if any) will be synced, unless explicitly defined in the `exclude` property."
|
||||
},
|
||||
"repos": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"pattern": "^[\\w.-]+\\/[\\w.-]+$"
|
||||
},
|
||||
"default": [],
|
||||
"description": "List of individual repositories to sync with. Expected to be formatted as '{orgName}/{repoName}' or '{userName}/{repoName}'."
|
||||
},
|
||||
"topics": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"minItems": 1,
|
||||
"default": [],
|
||||
"description": "List of repository topics to include when syncing. Only repositories that match at least one of the provided `topics` will be synced. If not specified, all repositories will be synced, unless explicitly defined in the `exclude` property. Glob patterns are supported.",
|
||||
"examples": [
|
||||
[
|
||||
"docs",
|
||||
"core"
|
||||
]
|
||||
]
|
||||
},
|
||||
"exclude": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"forks": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Exclude forked repositories from syncing."
|
||||
},
|
||||
"archived": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Exclude archived repositories from syncing."
|
||||
},
|
||||
"repos": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"default": [],
|
||||
"description": "List of individual repositories to exclude from syncing. Glob patterns are supported."
|
||||
},
|
||||
"topics": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"default": [],
|
||||
"description": "List of repository topics to exclude when syncing. Repositories that match one of the provided `topics` will be excluded from syncing. Glob patterns are supported.",
|
||||
"examples": [
|
||||
[
|
||||
"tests",
|
||||
"ci"
|
||||
]
|
||||
]
|
||||
},
|
||||
"size": {
|
||||
"type": "object",
|
||||
"description": "Exclude repositories based on their disk usage. Note: the disk usage is calculated by GitHub and may not reflect the actual disk usage when cloned.",
|
||||
"properties": {
|
||||
"min": {
|
||||
"type": "integer",
|
||||
"description": "Minimum repository size (in bytes) to sync (inclusive). Repositories less than this size will be excluded from syncing."
|
||||
},
|
||||
"max": {
|
||||
"type": "integer",
|
||||
"description": "Maximum repository size (in bytes) to sync (inclusive). Repositories greater than this size will be excluded from syncing."
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
},
|
||||
"revisions": {
|
||||
"type": "object",
|
||||
"description": "The revisions (branches, tags) that should be included when indexing. The default branch (HEAD) is always indexed. A maximum of 64 revisions can be indexed, with any additional revisions being ignored.",
|
||||
"properties": {
|
||||
"branches": {
|
||||
"type": "array",
|
||||
"description": "List of branches to include when indexing. For a given repo, only the branches that exist on the repo's remote *and* match at least one of the provided `branches` will be indexed. The default branch (HEAD) is always indexed. Glob patterns are supported. A maximum of 64 branches can be indexed, with any additional branches being ignored.",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"main",
|
||||
"release/*"
|
||||
],
|
||||
[
|
||||
"**"
|
||||
]
|
||||
],
|
||||
"default": []
|
||||
},
|
||||
"tags": {
|
||||
"type": "array",
|
||||
"description": "List of tags to include when indexing. For a given repo, only the tags that exist on the repo's remote *and* match at least one of the provided `tags` will be indexed. Glob patterns are supported. A maximum of 64 tags can be indexed, with any additional tags being ignored.",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"latest",
|
||||
"v2.*.*"
|
||||
],
|
||||
[
|
||||
"**"
|
||||
]
|
||||
],
|
||||
"default": []
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"type"
|
||||
],
|
||||
"additionalProperties": false
|
||||
}
|
||||
```
|
||||
|
||||
</Accordion>
|
||||
docs/docs/connections/gitlab.mdx (new file, 384 lines)
@@ -0,0 +1,384 @@
---
|
||||
title: Linking code from GitLab
|
||||
sidebarTitle: GitLab
|
||||
---
|
||||
|
||||
Sourcebot can sync code from GitLab.com, Self Managed (CE & EE), and Dedicated.
|
||||
|
||||
|
||||
## Examples
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Sync individual projects">
|
||||
```json
|
||||
{
|
||||
"type": "gitlab",
|
||||
"projects": [
|
||||
"my-group/foo",
|
||||
"my-group/subgroup/bar"
|
||||
]
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
<Accordion title="Sync all projects in a group/subgroup">
|
||||
```json
|
||||
{
|
||||
"type": "gitlab",
|
||||
"groups": [
|
||||
"my-group",
|
||||
"my-other-group/sub-group"
|
||||
]
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
<Accordion title="Sync all projects in a self managed instance">
|
||||
<Note>This option is ignored if `url` is unset. See [connecting to a custom gitlab host](/docs/connections/gitlab#connecting-to-a-custom-gitlab-host).</Note>
|
||||
```json
|
||||
{
|
||||
"type": "gitlab",
|
||||
"url": "https://gitlab.example.com",
|
||||
// Index all projects in this self-managed instance
|
||||
"all": true
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
<Accordion title="Sync all projects owned by a user">
|
||||
```json
|
||||
{
|
||||
"type": "gitlab",
|
||||
"users": [
|
||||
"user-1",
|
||||
"user-2"
|
||||
]
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
<Accordion title="Filter projects by topic">
|
||||
```json
|
||||
{
|
||||
"type": "gitlab",
|
||||
// Sync all projects in `my-group` that have a topic that...
|
||||
"groups": [
|
||||
"my-group"
|
||||
],
|
||||
// ...match one of these glob patterns.
|
||||
"topics": [
|
||||
"test-*",
|
||||
"ci-*",
|
||||
"k8s"
|
||||
]
|
||||
}
|
||||
|
||||
```
|
||||
</Accordion>
|
||||
<Accordion title="Exclude projects from syncing">
|
||||
```json
|
||||
{
|
||||
"type": "gitlab",
|
||||
// Include all projects in these groups...
|
||||
"groups": [
|
||||
"my-group",
|
||||
"my-other-group/sub-group"
|
||||
],
|
||||
// ...except:
|
||||
"exclude": {
|
||||
// projects that are archived
|
||||
"archived": true,
|
||||
// projects that are forks
|
||||
"forks": true,
|
||||
// projects that match these glob patterns
|
||||
"projects": [
|
||||
"my-group/foo/**",
|
||||
"my-group/bar/**",
|
||||
"my-other-group/sub-group/specific-project"
|
||||
],
|
||||
// repos with topics that match these glob patterns
|
||||
"topics": [
|
||||
"test-*",
|
||||
"ci"
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
|
||||
## Authenticating with GitLab
|
||||
|
||||
In order to index private projects, you'll need to generate a GitLab Personal Access Token (PAT). Create a new PAT [here](https://gitlab.com/-/user_settings/personal_access_tokens) and make sure you select the `read_api` scope:
|
||||
|
||||

|
||||
|
||||
Next, provide the PAT via the `token` property, either as an environment variable or a secret:
|
||||
|
||||
<Tabs>
|
||||
<Tab title="Environment Variable">
|
||||
<Note>Environment variables are only supported in a [declarative config](/self-hosting/more/declarative-config) and cannot be used in the web UI.</Note>
|
||||
|
||||
1. Add the `token` property to your connection config:
|
||||
```json
|
||||
{
|
||||
"type": "gitlab",
|
||||
"token": {
|
||||
// note: this env var can be named anything. It
|
||||
// doesn't need to be `GITLAB_TOKEN`.
|
||||
"env": "GITLAB_TOKEN"
|
||||
}
|
||||
// .. rest of config ..
|
||||
}
|
||||
```
|
||||
|
||||
2. Pass this environment variable each time you run Sourcebot:
|
||||
```bash
|
||||
docker run \
|
||||
-e GITLAB_TOKEN=<PAT> \
|
||||
/* additional args */ \
|
||||
ghcr.io/sourcebot-dev/sourcebot:latest
|
||||
```
|
||||
</Tab>
|
||||
|
||||
<Tab title="Secret">
|
||||
<Note>Secrets are only supported when [authentication](/self-hosting/more/authentication) is enabled.</Note>
|
||||
|
||||
1. Navigate to **Secrets** in settings and create a new secret with your PAT:
|
||||
|
||||

|
||||
|
||||
2. Add the `token` property to your connection config:
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "gitlab",
|
||||
"token": {
|
||||
"secret": "mysecret"
|
||||
}
|
||||
// .. rest of config ..
|
||||
}
|
||||
```
|
||||
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
## Connecting to a custom GitLab host
|
||||
|
||||
To connect to a GitLab host other than `gitlab.com`, provide the `url` property to your config:
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "gitlab",
|
||||
"url": "https://gitlab.example.com"
|
||||
// .. rest of config ..
|
||||
}
|
||||
```
|
||||
|
||||
## Schema reference
|
||||
|
||||
<Accordion title="Reference">
|
||||
[schemas/v3/gitlab.json](https://github.com/sourcebot-dev/sourcebot/blob/main/schemas/v3/gitlab.json)
|
||||
|
||||
```json
|
||||
{
|
||||
"$schema": "http://json-schema.org/draft-07/schema#",
|
||||
"type": "object",
|
||||
"title": "GitlabConnectionConfig",
|
||||
"properties": {
|
||||
"type": {
|
||||
"const": "gitlab",
|
||||
"description": "GitLab Configuration"
|
||||
},
|
||||
"token": {
|
||||
"description": "An authentication token.",
|
||||
"examples": [
|
||||
{
|
||||
"secret": "SECRET_KEY"
|
||||
}
|
||||
],
|
||||
"anyOf": [
|
||||
{
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"secret": {
|
||||
"type": "string",
|
||||
"description": "The name of the secret that contains the token."
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"secret"
|
||||
],
|
||||
"additionalProperties": false
|
||||
},
|
||||
{
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"env": {
|
||||
"type": "string",
|
||||
"description": "The name of the environment variable that contains the token. Only supported in declarative connection configs."
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"env"
|
||||
],
|
||||
"additionalProperties": false
|
||||
}
|
||||
]
|
||||
},
|
||||
"url": {
|
||||
"type": "string",
|
||||
"format": "url",
|
||||
"default": "https://gitlab.com",
|
||||
"description": "The URL of the GitLab host. Defaults to https://gitlab.com",
|
||||
"examples": [
|
||||
"https://gitlab.com",
|
||||
"https://gitlab.example.com"
|
||||
],
|
||||
"pattern": "^https?:\\/\\/[^\\s/$.?#].[^\\s]*$"
|
||||
},
|
||||
"all": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Sync all projects visible to the provided `token` (if any) in the GitLab instance. This option is ignored if `url` is either unset or set to https://gitlab.com ."
|
||||
},
|
||||
"users": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"description": "List of users to sync with. All projects owned by the user and visible to the provided `token` (if any) will be synced, unless explicitly defined in the `exclude` property."
|
||||
},
|
||||
"groups": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"my-group"
|
||||
],
|
||||
[
|
||||
"my-group/sub-group-a",
|
||||
"my-group/sub-group-b"
|
||||
]
|
||||
],
|
||||
"description": "List of groups to sync with. All projects in the group (and recursive subgroups) visible to the provided `token` (if any) will be synced, unless explicitly defined in the `exclude` property. Subgroups can be specified by providing the path to the subgroup (e.g. `my-group/sub-group-a`)."
|
||||
},
|
||||
"projects": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"my-group/my-project"
|
||||
],
|
||||
[
|
||||
"my-group/my-sub-group/my-project"
|
||||
]
|
||||
],
|
||||
"description": "List of individual projects to sync with. The project's namespace must be specified. See: https://docs.gitlab.com/ee/user/namespace/"
|
||||
},
|
||||
"topics": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"minItems": 1,
|
||||
"description": "List of project topics to include when syncing. Only projects that match at least one of the provided `topics` will be synced. If not specified, all projects will be synced, unless explicitly defined in the `exclude` property. Glob patterns are supported.",
|
||||
"examples": [
|
||||
[
|
||||
"docs",
|
||||
"core"
|
||||
]
|
||||
]
|
||||
},
|
||||
"exclude": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"forks": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Exclude forked projects from syncing."
|
||||
},
|
||||
"archived": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Exclude archived projects from syncing."
|
||||
},
|
||||
"projects": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"default": [],
|
||||
"examples": [
|
||||
[
|
||||
"my-group/my-project"
|
||||
]
|
||||
],
|
||||
"description": "List of projects to exclude from syncing. Glob patterns are supported. The project's namespace must be specified, see: https://docs.gitlab.com/ee/user/namespace/"
|
||||
},
|
||||
"topics": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"description": "List of project topics to exclude when syncing. Projects that match one of the provided `topics` will be excluded from syncing. Glob patterns are supported.",
|
||||
"examples": [
|
||||
[
|
||||
"tests",
|
||||
"ci"
|
||||
]
|
||||
]
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
},
|
||||
"revisions": {
|
||||
"type": "object",
|
||||
"description": "The revisions (branches, tags) that should be included when indexing. The default branch (HEAD) is always indexed. A maximum of 64 revisions can be indexed, with any additional revisions being ignored.",
|
||||
"properties": {
|
||||
"branches": {
|
||||
"type": "array",
|
||||
"description": "List of branches to include when indexing. For a given repo, only the branches that exist on the repo's remote *and* match at least one of the provided `branches` will be indexed. The default branch (HEAD) is always indexed. Glob patterns are supported. A maximum of 64 branches can be indexed, with any additional branches being ignored.",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"main",
|
||||
"release/*"
|
||||
],
|
||||
[
|
||||
"**"
|
||||
]
|
||||
],
|
||||
"default": []
|
||||
},
|
||||
"tags": {
|
||||
"type": "array",
|
||||
"description": "List of tags to include when indexing. For a given repo, only the tags that exist on the repo's remote *and* match at least one of the provided `tags` will be indexed. Glob patterns are supported. A maximum of 64 tags can be indexed, with any additional tags being ignored.",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"latest",
|
||||
"v2.*.*"
|
||||
],
|
||||
[
|
||||
"**"
|
||||
]
|
||||
],
|
||||
"default": []
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"type"
|
||||
],
|
||||
"additionalProperties": false
|
||||
}
|
||||
```
|
||||
</Accordion>
|
||||
33
docs/docs/connections/overview.mdx
Normal file
|
|
@ -0,0 +1,33 @@
|
|||
---
|
||||
title: Overview
|
||||
sidebarTitle: Overview
|
||||
---
|
||||
|
||||
To connect your code to Sourcebot, you create **connections**. A **connection** is a configuration object that describes how Sourcebot should fetch information from a supported code host.
|
||||
|
||||
There are two ways to define connections:
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Declarative configuration file">
|
||||
This is only supported when self-hosting, and is the default mechanism to define connections. Connections are defined in a [JSON file](/self-hosting/more/declarative-config)
|
||||
and the path to the file is provided through the `CONFIG_PATH` environment variable.
|
||||
</Accordion>
|
||||
<Accordion title="UI connection management">
|
||||
This is the only way to define connections when using Sourcebot Cloud, and can be configured when self-hosting by enabling [authentication](/self-hosting/more/authentication).
|
||||
|
||||
In this method, connections are defined and managed within the webapp:
|
||||
|
||||

|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
### Supported code hosts
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card horizontal title="GitHub" icon="github" href="/docs/connections/github" />
|
||||
<Card horizontal title="GitLab" icon="gitlab" href="/docs/connections/gitlab" />
|
||||
<Card horizontal title="Gitea" href="/docs/connections/gitea" />
|
||||
<Card horizontal title="Gerrit" href="/docs/connections/gerrit" />
|
||||
</CardGroup>
|
||||
|
||||
<Note>Missing your code host? [Submit a feature request on GitHub](https://github.com/sourcebot-dev/sourcebot/discussions/categories/ideas).</Note>
|
||||
7
docs/docs/connections/request-new.mdx
Normal file
|
|
@ -0,0 +1,7 @@
|
|||
---
|
||||
sidebarTitle: Request another host
|
||||
url: https://github.com/sourcebot-dev/sourcebot/discussions/categories/ideas
|
||||
title: Request another code host
|
||||
---
|
||||
|
||||
Is your code host not supported? Please open a [feature request](https://github.com/sourcebot-dev/sourcebot/discussions/categories/ideas).
|
||||
8
docs/docs/getting-started-selfhost.mdx
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
---
|
||||
sidebarTitle: Quick start guide (self-host)
|
||||
url: /self-hosting/overview
|
||||
---
|
||||
|
||||
{/*This page acts as a navigation link*/}
|
||||
|
||||
[Quick start guide (self-host)](/self-hosting/overview)
|
||||
55
docs/docs/getting-started.mdx
Normal file
|
|
@ -0,0 +1,55 @@
|
|||
---
|
||||
title: Cloud quick start guide
|
||||
sidebarTitle: Quick start guide (cloud)
|
||||
---
|
||||
|
||||
<Note>Looking for a self-hosted solution? Check out our [self-hosting docs](/self-hosting/overview).</Note>
|
||||
|
||||
This page will provide a quick walkthrough of how to get onboarded on Sourcebot, import your code, and start searching.
|
||||
|
||||
{/*@todo: record a quick start guide
|
||||
<iframe
|
||||
width="560"
|
||||
height="315"
|
||||
src="https://www.youtube.com/embed/4KzFe50RQkQ"
|
||||
title="YouTube video player"
|
||||
frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
|
||||
allowfullscreen
|
||||
></iframe>
|
||||
*/}
|
||||
|
||||
<Steps>
|
||||
<Step title="Register an account">
|
||||
Head over to [app.sourcebot.dev](https://app.sourcebot.dev) and create an account.
|
||||
</Step>
|
||||
|
||||
<Step title="Create an organization">
|
||||
After logging in, you'll be asked to create an organization. You'll invite your team members to this organization later so they can also use Sourcebot.
|
||||
|
||||

|
||||
</Step>
|
||||
|
||||
<Step title="Link your code host">
|
||||
After selecting a code host you want to connect to, you'll be presented with the connection creation page. This page has the following three inputs:
|
||||
- Connection name (required): The name of the connection within Sourcebot
|
||||
- Secret (optional): An [access token](/access-tokens/overview) that is used to fetch private repos
|
||||
- Configuration: The JSON configuration that defines which repos/orgs to fetch.
|
||||
|
||||
For a more detailed explanation of connections, check out the [Connections](/docs/connections/overview) page.
|
||||
|
||||
The example below shows a connection named `sourcebot-org` that fetches all of the repos for the `sourcebot-dev` GitHub organization, but excludes the `sourcebot-dev/zoekt` repo.
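For reference, a connection along those lines might look like the following sketch (the org name and excluded repo mirror the example above):

```json
{
    "type": "github",
    // Sync all repos in the `sourcebot-dev` organization...
    "orgs": [
        "sourcebot-dev"
    ],
    // ...except this one.
    "exclude": {
        "repos": [
            "sourcebot-dev/zoekt"
        ]
    }
}
```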
|
||||
|
||||
<Note>This page won't let you continue with an invalid connection schema. If you're hitting errors, make sure the input you're providing is valid JSON.</Note>
|
||||

|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
### Search
|
||||
|
||||
Once you create your organization's first connection successfully, you'll be redirected to your org's main search page. From here, you can use the search bar to search across all
|
||||
of the repos you've indexed.
|
||||
|
||||

|
||||
|
||||
Congrats, you've successfully set up Sourcebot! Read on to learn more about Sourcebot's capabilities. Check out the [Connections](/docs/connections/overview) page to learn how to control which repos Sourcebot fetches.
|
||||
13
docs/docs/more/roles-and-permissions.mdx
Normal file
|
|
@ -0,0 +1,13 @@
|
|||
---
|
||||
title: Roles and Permissions
|
||||
---
|
||||
|
||||
<Note>Looking to sync permissions with your identity provider? We're working on it - [reach out](https://www.sourcebot.dev/contact) to us to learn more.</Note>
|
||||
|
||||
If you're using Sourcebot Cloud, or are self-hosting with [authentication](/self-hosting/more/authentication) enabled, you may have multiple members in your organization. Each
|
||||
member has a role which defines their permissions:
|
||||
|
||||
| Role | Permission |
|
||||
| :--- | :--------- |
|
||||
| `Owner` | Each organization has a single `Owner`. This user has full access rights, including: connection management, organization management, and inviting new members. |
|
||||
| `Member` | Read-only access to the organization. A `Member` can search across the repos indexed by an organization's connections, but may not manage the organization or its connections. |
|
||||
22
docs/docs/overview.mdx
Normal file
|
|
@ -0,0 +1,22 @@
|
|||
---
|
||||
title: "Overview"
|
||||
---
|
||||
|
||||
import ConnectionCards from '/snippets/connection-cards.mdx';
|
||||
|
||||
Sourcebot is an **[open-source](https://github.com/sourcebot-dev/sourcebot) code search tool** that is purpose built to search multi-million line codebases in seconds. It integrates with [GitHub](/docs/connections/github), [GitLab](/docs/connections/gitlab), and [other platforms](/docs/connections).
|
||||
|
||||
## Getting Started
|
||||
|
||||
There are two ways to get started using Sourcebot:
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card horizontal title="Self-Host" icon="server" href="/self-hosting/overview">
|
||||
Deploy Sourcebot on your own infrastructure.
|
||||
</Card>
|
||||
<Card horizontal title="Sourcebot Cloud" icon="cloud" href="/docs/getting-started">
|
||||
Use Sourcebot on our managed infrastructure.
|
||||
</Card>
|
||||
</CardGroup>
|
||||
|
||||
We also have a [public demo](https://sourcebot.dev/search) if you'd like to try Sourcebot out before registering.
|
||||
9
docs/fav.svg
Normal file
|
|
@ -0,0 +1,9 @@
|
|||
<svg width="100" height="100" viewBox="0 0 100 100" fill="none" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
|
||||
<rect width="100" height="100" fill="url(#pattern0_64_7)"/>
|
||||
<defs>
|
||||
<pattern id="pattern0_64_7" patternContentUnits="objectBoundingBox" width="1" height="1">
|
||||
<use xlink:href="#image0_64_7" transform="scale(0.03125)"/>
|
||||
</pattern>
|
||||
<image id="image0_64_7" width="32" height="32" preserveAspectRatio="none" xlink:href="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAAXNSR0IArs4c6QAAAERlWElmTU0AKgAAAAgAAYdpAAQAAAABAAAAGgAAAAAAA6ABAAMAAAABAAEAAKACAAQAAAABAAAAIKADAAQAAAABAAAAIAAAAACshmLzAAADh0lEQVRYCe2VS0iUURTH7/m+sbI0KVGbsdHJRImwB7WIWkTWopX2JiIIKoSiRUFUotCQD6iIaOciMqIXtSisFgWRRRAtgh6gLSp1ypGyh6VU1sx3+59pzjfXZ0FCmznw45xz7/3u45xz76dUUpIRSEbgP0fAHmH98WgPxEmB7gMajLnQoBnnwj8IVoJUo+8t7PPgEOiR9lp/+CSR2ia+qbVWny1yQo6ybkQi6lgw7Htv9ottiQG9GTwCq4G5OFyVA/aAJ8AP/ijYWIZWVglOeMCToltqZnQWD/eRbKAQnY1AUvId9l3Ap34MRPJgnBbH1FpHv2iHbjPI1n1E4JP0k6YsO0IN4pvaE3c4jGJ/hb0API/3sdoLjsb9UuiZ4GXcjymy7NaqDu8KaQtmvUvzpEaaEIFl3OaQXhoMtE0Its/gw7kiixa6LUq9gG0uzl0nQAGQiHFhjirB7uy+urzwKQyKbYAUkqLUBDDsBj4as82GvQlw+EV+wtgpzt9rPUcpXhdJ0foJTu8WsMwhEbiAhop4I9fBOcC34Ra4B26CL2BE0UpPqfeHV/EATSpNa8LJ9daYr/QHm4jTPERkA83o2Q2OgHGApSjOLuh+cB3sBwNyDz8mKLQiLHzF9UmeDWoi266obMvhqzxEJKfcwXmeBxpANzCFH6a14BlYb3aYNqIQZXByJ9Guy1Q00lKfH16TaEtY5ga4tRXsANPAQrAPPAAi/D7wdfVKg6tJPawO5XqYqlCurW0VQPY5lRCa6jjqYt3016iJgSIb4OoUuGr4BPwo8dVbDMqAxHQS7FiuoUeU6jZfR2HIuwWLh3kQ7kCKIkvqzP2ON5AGvhksd3sTxjWY5k3JTnSNbG1QFNXacdOJm5A/eDQXIf9o2kEAsNQAPr37ksHmZzoTiHSIMZquz+tap5VT4o6xqIvtWn9nuUU0y9aRMxxuFvOlY/8zaAYcmdkgMcnv68gn6TF/Rii+Xly9p2jHa6Vt3IhcPD5+9l0hKkUmvUhmrDYQnVdSA8cx6Ko7UKkM2OVgIzAX5weJo9EDBggWS7dILWGQ8EVDF1d1eKrvRB09Xz4ksgpkA7g6sWu2HZrDLwUnY3thXAZ8Tbke/ixa/cA0bxylL2FDpVUdvmr+aJxHn5UflSZqlBQMnnAyGnxgIuDi6wR8+jGRw8Xd6U5/JLOy3ds+JhMmJ0lGIBmBf4nAL6uu/L35EfLvAAAAAElFTkSuQmCC"/>
|
||||
</defs>
|
||||
</svg>
|
||||
|
After Width: | Height: | Size: 1.9 KiB |
BIN
docs/images/architecture_diagram.png
Normal file
|
After Width: | Height: | Size: 905 KiB |
BIN
docs/images/connect_code_host.png
Normal file
|
After Width: | Height: | Size: 284 KiB |
BIN
docs/images/connection_create_secret.png
Normal file
|
After Width: | Height: | Size: 221 KiB |
BIN
docs/images/connection_nav.png
Normal file
|
After Width: | Height: | Size: 42 KiB |
BIN
docs/images/connection_page.png
Normal file
|
After Width: | Height: | Size: 225 KiB |
BIN
docs/images/create_connection_example.png
Normal file
|
After Width: | Height: | Size: 270 KiB |
BIN
docs/images/demo.mp4
Normal file
BIN
docs/images/gitea_pat_creation.png
Normal file
|
After Width: | Height: | Size: 188 KiB |
BIN
docs/images/github_connection.png
Normal file
|
After Width: | Height: | Size: 210 KiB |
BIN
docs/images/github_pat_scopes.png
Normal file
|
After Width: | Height: | Size: 82 KiB |
BIN
docs/images/gitlab_connection.png
Normal file
|
After Width: | Height: | Size: 206 KiB |
BIN
docs/images/gitlab_pat_scopes.png
Normal file
|
After Width: | Height: | Size: 109 KiB |
BIN
docs/images/login.png
Normal file
|
After Width: | Height: | Size: 137 KiB |
BIN
docs/images/login_redeem_code.png
Normal file
|
After Width: | Height: | Size: 95 KiB |
BIN
docs/images/onboard_complete.png
Normal file
|
After Width: | Height: | Size: 402 KiB |
BIN
docs/images/onboard_invite.png
Normal file
|
After Width: | Height: | Size: 215 KiB |
BIN
docs/images/org_create.png
Normal file
|
After Width: | Height: | Size: 183 KiB |
BIN
docs/images/org_switch.png
Normal file
|
After Width: | Height: | Size: 67 KiB |
BIN
docs/images/sb_interface.png
Normal file
|
After Width: | Height: | Size: 286 KiB |
BIN
docs/images/secret_dropdown.png
Normal file
|
After Width: | Height: | Size: 70 KiB |
BIN
docs/images/secrets_list.png
Normal file
|
After Width: | Height: | Size: 61 KiB |
BIN
docs/images/secrets_page.png
Normal file
|
After Width: | Height: | Size: 185 KiB |
BIN
docs/images/settings_nav.png
Normal file
|
After Width: | Height: | Size: 76 KiB |
BIN
docs/logo/dark.png
Normal file
|
After Width: | Height: | Size: 61 KiB |
21
docs/logo/dark.svg
Normal file
|
After Width: | Height: | Size: 12 KiB |
BIN
docs/logo/light.png
Normal file
|
After Width: | Height: | Size: 56 KiB |
21
docs/logo/light.svg
Normal file
|
After Width: | Height: | Size: 12 KiB |
59
docs/self-hosting/configuration.mdx
Normal file
|
|
@ -0,0 +1,59 @@
|
|||
---
|
||||
title: Configuration
|
||||
sidebarTitle: Configuration
|
||||
---
|
||||
|
||||
|
||||
## Environment Variables
|
||||
|
||||
Sourcebot accepts a variety of environment variables to fine-tune your deployment.
|
||||
|
||||
| Variable | Default | Description |
|
||||
| :------- | :------ | :---------- |
|
||||
| `SOURCEBOT_LOG_LEVEL` | `info` | The Sourcebot logging level. Valid values are `debug`, `info`, `warn`, `error`, in order of severity. |
|
||||
| `DATABASE_URL` | `postgresql://postgres@localhost:5432/sourcebot` | Connection string of your Postgres database. By default, a Postgres database is automatically provisioned at startup within the container. |
|
||||
| `REDIS_URL` | `redis://localhost:6379` | Connection string of your Redis instance. By default, a Redis database is automatically provisioned at startup within the container. |
|
||||
| `SOURCEBOT_ENCRYPTION_KEY` | - | Used to encrypt connection secrets. Generated using `openssl rand -base64 24`. Automatically generated at startup if no value is provided. |
|
||||
| `AUTH_SECRET` | - | Used to validate login session cookies. Generated using `openssl rand -base64 33`. Automatically generated at startup if no value is provided. |
|
||||
| `AUTH_URL` | - | URL of your Sourcebot deployment, e.g., `https://example.com` or `http://localhost:3000`. Required when `SOURCEBOT_AUTH_ENABLED` is `true`. |
|
||||
| `SOURCEBOT_TENANCY_MODE` | `single` | The tenancy configuration for Sourcebot. Valid values are `single` or `multi`. See [this doc](/self-hosting/more/tenancy) for more info. |
|
||||
| `SOURCEBOT_AUTH_ENABLED` | `false` | Enables/disables authentication in Sourcebot. If set to `false`, `SOURCEBOT_TENANCY_MODE` must be `single`. See [this doc](/self-hosting/more/authentication) for more info. |
|
||||
| `SOURCEBOT_TELEMETRY_DISABLED` | `false` | Enables/disables telemetry collection in Sourcebot. See [this doc](/self-hosting/security/telemetry) for more info. |
|
||||
| `DATA_DIR` | `/data` | The directory within the container to store all persistent data. Typically, this directory will be volume mapped such that data is persisted across container restarts (e.g., `docker run -v $(pwd):/data`) |
|
||||
| `DATA_CACHE_DIR` | `$DATA_DIR/.sourcebot` | The root data directory in which all data written to disk by Sourcebot will be located. |
|
||||
| `DATABASE_DATA_DIR` | `$DATA_CACHE_DIR/db` | The data directory for the default Postgres database. |
|
||||
| `REDIS_DATA_DIR` | `$DATA_CACHE_DIR/redis` | The data directory for the default Redis instance. |
|
||||
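For example, a few of these can be passed when starting the container (the values here are illustrative):

```bash
docker run \
    -e SOURCEBOT_LOG_LEVEL=debug \
    -e SOURCEBOT_ENCRYPTION_KEY=$(openssl rand -base64 24) \
    -e AUTH_SECRET=$(openssl rand -base64 33) \
    /* additional args */ \
    ghcr.io/sourcebot-dev/sourcebot:latest
```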
|
||||
|
||||
## Additional Features
|
||||
|
||||
There are additional features that can be enabled and configured via environment variables.
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card horizontal title="Authentication" icon="lock" href="/self-hosting/more/authentication" />
|
||||
<Card horizontal title="Tenancy" icon="users" href="/self-hosting/more/tenancy" />
|
||||
<Card horizontal title="Transactional Emails" icon="envelope" href="/self-hosting/more/transactional-emails" />
|
||||
<Card horizontal title="Declarative Configs" icon="page" href="/self-hosting/more/declarative-config" />
|
||||
</CardGroup>
|
||||
|
||||
## Health Check and Version Endpoints
|
||||
|
||||
Sourcebot includes a health check endpoint that indicates if the application is alive, returning `200 OK` if it is:
|
||||
|
||||
```sh
|
||||
curl http://localhost:3000/api/health
|
||||
```
|
||||
|
||||
It also includes a version endpoint to check the current version of the application:
|
||||
|
||||
```sh
|
||||
curl http://localhost:3000/api/version
|
||||
```
|
||||
|
||||
Sample response:
|
||||
|
||||
```json
|
||||
{
|
||||
"version": "v3.0.0"
|
||||
}
|
||||
```
|
||||
63
docs/self-hosting/more/authentication.mdx
Normal file
|
|
@ -0,0 +1,63 @@
|
|||
---
|
||||
title: Authentication
|
||||
sidebarTitle: Authentication
|
||||
---
|
||||
|
||||
<Note>SSO is currently not supported. If you'd like SSO, please reach out using our [contact form](https://www.sourcebot.dev/contact)</Note>
|
||||
<Warning>If you're switching from non-auth, delete the Sourcebot cache (the `.sourcebot` folder) before starting.</Warning>
|
||||
|
||||
Sourcebot has built-in authentication that gates access to your organization. OAuth, email codes, and email / password are supported. To enable authentication, set the `SOURCEBOT_AUTH_ENABLED` environment variable to `true`.
|
||||
When authentication is enabled:
|
||||
|
||||
- [Connection management](/docs/connections/overview) happens through the UI
|
||||
- Members must be invited to an organization to gain access
|
||||
- If you're in single-tenant mode, the first user to register will be made the owner of the default organization. Check out the [roles page](/docs/more/roles-and-permissions) for more info on the different roles and permissions.
|
||||
|
||||

|
||||
|
||||
|
||||
# Authentication Providers
|
||||
|
||||
<Warning>Make sure the `AUTH_URL` environment variable is [configured correctly](/self-hosting/configuration) when using Sourcebot in a deployed environment.</Warning>
|
||||
|
||||
To enable an authentication provider in Sourcebot, configure the required environment variables for the provider. Under the hood, Sourcebot uses Auth.js which supports [many providers](https://authjs.dev/getting-started/authentication/oauth). Submit a [feature request on GitHub](https://github.com/sourcebot-dev/sourcebot/discussions/categories/ideas) if you want us to add support for a specific provider.
|
||||
|
||||
|
||||
## Email / Password
|
||||
---
|
||||
Email / password authentication is enabled by default. It can be **disabled** by setting `AUTH_CREDENTIALS_LOGIN_ENABLED` to `false`.
|
||||
|
||||
## Email codes
|
||||
---
|
||||
Email codes are 6-digit codes sent to a provided email address. Email codes are enabled when transactional emails are configured using the following environment variables:
|
||||
|
||||
- `SMTP_CONNECTION_URL`
|
||||
- `EMAIL_FROM_ADDRESS`
|
||||
|
||||
|
||||
See [transactional emails](/self-hosting/more/transactional-emails) for more details.
|
||||
|
||||
## GitHub
|
||||
---
|
||||
|
||||
[Auth.js GitHub Provider Docs](https://authjs.dev/getting-started/providers/github)
|
||||
|
||||
**Required environment variables:**
|
||||
- `AUTH_GITHUB_CLIENT_ID`
|
||||
- `AUTH_GITHUB_CLIENT_SECRET`
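For example (the client ID and secret placeholders are the credentials from your GitHub OAuth app):

```bash
docker run \
    -e AUTH_GITHUB_CLIENT_ID=<client-id> \
    -e AUTH_GITHUB_CLIENT_SECRET=<client-secret> \
    /* additional args */ \
    ghcr.io/sourcebot-dev/sourcebot:latest
```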
|
||||
|
||||
## Google
|
||||
---
|
||||
|
||||
[Auth.js Google Provider Docs](https://next-auth.js.org/providers/google)
|
||||
|
||||
**Required environment variables:**
|
||||
- `AUTH_GOOGLE_CLIENT_ID`
|
||||
- `AUTH_GOOGLE_CLIENT_SECRET`
|
||||
|
||||
---
|
||||
|
||||
# Troubleshooting
|
||||
|
||||
- If you experience issues logging in, logging out, or accessing an organization you should have access to, try clearing your cookies & performing a full page refresh (`Cmd/Ctrl + Shift + R` on most browsers).
|
||||
- Still not working? Reach out to us on our [Discord](https://discord.com/invite/6Fhp27x7Pb) or [GitHub discussions](https://github.com/sourcebot-dev/sourcebot/discussions).
|
||||
624
docs/self-hosting/more/declarative-config.mdx
Normal file
|
|
@ -0,0 +1,624 @@
|
|||
---
|
||||
title: Configuring Sourcebot from a file (declarative config)
|
||||
sidebarTitle: Declarative config
|
||||
---
|
||||
|
||||
Some teams require Sourcebot to be configured via a file (where it can be stored in version control, run through CI/CD pipelines, etc.) instead of a web UI. For more information on configuring connections, see this [overview](/docs/connections/overview).
|
||||
|
||||
|
||||
| Variable | Description |
|
||||
| :------- | :---------- |
|
||||
| `CONFIG_PATH` | Path to declarative config. |
|
||||
|
||||
|
||||
```json
|
||||
{
|
||||
"$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/refs/heads/main/schemas/v3/index.json",
|
||||
"connections": {
|
||||
"connection-1": {
|
||||
"type": "github",
|
||||
"repos": [
|
||||
"sourcebot-dev/sourcebot"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
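For example, if the config above is saved as `config.json` in a directory that is volume mapped into the container, it can be referenced via `CONFIG_PATH` (a sketch mirroring the quick start command):

```bash
docker run \
    -v $(pwd):/data \
    -e CONFIG_PATH=/data/config.json \
    /* additional args */ \
    ghcr.io/sourcebot-dev/sourcebot:latest
```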
|
||||
|
||||
## Schema reference
|
||||
|
||||
<Accordion title="Reference">
|
||||
[schemas/v3/index.json](https://github.com/sourcebot-dev/sourcebot/blob/main/schemas/v3/index.json)
|
||||
|
||||
```json
|
||||
{
|
||||
"$schema": "http://json-schema.org/draft-07/schema#",
|
||||
"type": "object",
|
||||
"title": "SourcebotConfig",
|
||||
"definitions": {
|
||||
"Settings": {
|
||||
"type": "object",
|
||||
"description": "Defines the globabl settings for Sourcebot.",
|
||||
"properties": {
|
||||
"maxFileSize": {
|
||||
"type": "number",
|
||||
"description": "The maximum size of a file (in bytes) to be indexed. Files that exceed this maximum will not be indexed. Defaults to 2MB.",
|
||||
"minimum": 1
|
||||
},
|
||||
"maxTrigramCount": {
|
||||
"type": "number",
|
||||
"description": "The maximum number of trigrams per document. Files that exceed this maximum will not be indexed. Default to 20000.",
|
||||
"minimum": 1
|
||||
},
|
||||
"reindexIntervalMs": {
|
||||
"type": "number",
|
||||
"description": "The interval (in milliseconds) at which the indexer should re-index all repositories. Defaults to 1 hour.",
|
||||
"minimum": 1
|
||||
},
|
||||
"resyncConnectionPollingIntervalMs": {
|
||||
"type": "number",
|
||||
"description": "The polling rate (in milliseconds) at which the db should be checked for connections that need to be re-synced. Defaults to 1 second.",
|
||||
"minimum": 1
|
||||
},
|
||||
"reindexRepoPollingIntervalMs": {
|
||||
"type": "number",
|
||||
"description": "The polling rate (in milliseconds) at which the db should be checked for repos that should be re-indexed. Defaults to 1 second.",
|
||||
"minimum": 1
|
||||
},
|
||||
"maxConnectionSyncJobConcurrency": {
|
||||
"type": "number",
|
||||
"description": "The number of connection sync jobs to run concurrently. Defaults to 8.",
|
||||
"minimum": 1
|
||||
},
|
||||
"maxRepoIndexingJobConcurrency": {
|
||||
"type": "number",
|
||||
"description": "The number of repo indexing jobs to run concurrently. Defaults to 8.",
|
||||
"minimum": 1
|
||||
},
|
||||
"maxRepoGarbageCollectionJobConcurrency": {
|
||||
"type": "number",
|
||||
"description": "The number of repo GC jobs to run concurrently. Defaults to 8.",
|
||||
"minimum": 1
|
||||
},
|
||||
"repoGarbageCollectionGracePeriodMs": {
|
||||
"type": "number",
|
||||
"description": "The grace period (in milliseconds) for garbage collection. Used to prevent deleting shards while they're being loaded. Defaults to 10 seconds.",
|
||||
"minimum": 1
|
||||
},
|
||||
"repoIndexTimeoutMs": {
|
||||
"type": "number",
|
||||
"description": "The timeout (in milliseconds) for a repo indexing to timeout. Defaults to 2 hours.",
|
||||
"minimum": 1
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
}
|
||||
},
|
||||
"properties": {
|
||||
"$schema": {
|
||||
"type": "string"
|
||||
},
|
||||
"settings": {
|
||||
"$ref": "#/definitions/Settings"
|
||||
},
|
||||
"connections": {
|
||||
"type": "object",
|
||||
"description": "Defines a collection of connections from varying code hosts that Sourcebot should sync with. This is only available in single-tenancy mode.",
|
||||
"patternProperties": {
|
||||
"^[a-zA-Z0-9_-]+$": {
|
||||
"$schema": "http://json-schema.org/draft-07/schema#",
|
||||
"title": "ConnectionConfig",
|
||||
"oneOf": [
|
||||
{
|
||||
"$schema": "http://json-schema.org/draft-07/schema#",
|
||||
"type": "object",
|
||||
"title": "GithubConnectionConfig",
|
||||
"properties": {
|
||||
"type": {
|
||||
"const": "github",
|
||||
"description": "GitHub Configuration"
|
||||
},
|
||||
"token": {
|
||||
"description": "A Personal Access Token (PAT).",
|
||||
"examples": [
|
||||
{
|
||||
"secret": "SECRET_KEY"
|
||||
}
|
||||
],
|
||||
"anyOf": [
|
||||
{
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"secret": {
|
||||
"type": "string",
|
||||
"description": "The name of the secret that contains the token."
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"secret"
|
||||
],
|
||||
"additionalProperties": false
|
||||
},
|
||||
{
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"env": {
|
||||
"type": "string",
|
||||
"description": "The name of the environment variable that contains the token. Only supported in declarative connection configs."
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"env"
|
||||
],
|
||||
"additionalProperties": false
|
||||
}
|
||||
]
|
||||
},
|
||||
"url": {
|
||||
"type": "string",
|
||||
"format": "url",
|
||||
"default": "https://github.com",
|
||||
"description": "The URL of the GitHub host. Defaults to https://github.com",
|
||||
"examples": [
|
||||
"https://github.com",
|
||||
"https://github.example.com"
|
||||
],
|
||||
"pattern": "^https?:\\/\\/[^\\s/$.?#].[^\\s]*$"
|
||||
},
|
||||
"users": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"pattern": "^[\\w.-]+$"
|
||||
},
|
||||
"default": [],
|
||||
"examples": [
|
||||
[
|
||||
"torvalds",
|
||||
"DHH"
|
||||
]
|
||||
],
|
||||
"description": "List of users to sync with. All repositories that the user owns will be synced, unless explicitly defined in the `exclude` property."
|
||||
},
|
||||
"orgs": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"pattern": "^[\\w.-]+$"
|
||||
},
|
||||
"default": [],
|
||||
"examples": [
|
||||
[
|
||||
"my-org-name"
|
||||
],
|
||||
[
|
||||
"sourcebot-dev",
|
||||
"commaai"
|
||||
]
|
||||
],
|
||||
"description": "List of organizations to sync with. All repositories in the organization visible to the provided `token` (if any) will be synced, unless explicitly defined in the `exclude` property."
|
||||
},
|
||||
"repos": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"pattern": "^[\\w.-]+\\/[\\w.-]+$"
|
||||
},
|
||||
"default": [],
|
||||
"description": "List of individual repositories to sync with. Expected to be formatted as '{orgName}/{repoName}' or '{userName}/{repoName}'."
|
||||
},
|
||||
"topics": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"minItems": 1,
|
||||
"default": [],
|
||||
"description": "List of repository topics to include when syncing. Only repositories that match at least one of the provided `topics` will be synced. If not specified, all repositories will be synced, unless explicitly defined in the `exclude` property. Glob patterns are supported.",
|
||||
"examples": [
|
||||
[
|
||||
"docs",
|
||||
"core"
|
||||
]
|
||||
]
|
||||
},
|
||||
"exclude": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"forks": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Exclude forked repositories from syncing."
|
||||
},
|
||||
"archived": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Exclude archived repositories from syncing."
|
||||
},
|
||||
"repos": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"default": [],
|
||||
"description": "List of individual repositories to exclude from syncing. Glob patterns are supported."
|
||||
},
|
||||
"topics": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"default": [],
|
||||
"description": "List of repository topics to exclude when syncing. Repositories that match one of the provided `topics` will be excluded from syncing. Glob patterns are supported.",
|
||||
"examples": [
|
||||
[
|
||||
"tests",
|
||||
"ci"
|
||||
]
|
||||
]
|
||||
},
|
||||
"size": {
|
||||
"type": "object",
|
||||
"description": "Exclude repositories based on their disk usage. Note: the disk usage is calculated by GitHub and may not reflect the actual disk usage when cloned.",
|
||||
"properties": {
|
||||
"min": {
|
||||
"type": "integer",
|
||||
"description": "Minimum repository size (in bytes) to sync (inclusive). Repositories less than this size will be excluded from syncing."
|
||||
},
|
||||
"max": {
|
||||
"type": "integer",
|
||||
"description": "Maximum repository size (in bytes) to sync (inclusive). Repositories greater than this size will be excluded from syncing."
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
},
|
||||
"revisions": {
|
||||
"type": "object",
|
||||
"description": "The revisions (branches, tags) that should be included when indexing. The default branch (HEAD) is always indexed. A maximum of 64 revisions can be indexed, with any additional revisions being ignored.",
|
||||
"properties": {
|
||||
"branches": {
|
||||
"type": "array",
|
||||
"description": "List of branches to include when indexing. For a given repo, only the branches that exist on the repo's remote *and* match at least one of the provided `branches` will be indexed. The default branch (HEAD) is always indexed. Glob patterns are supported. A maximum of 64 branches can be indexed, with any additional branches being ignored.",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"main",
|
||||
"release/*"
|
||||
],
|
||||
[
|
||||
"**"
|
||||
]
|
||||
],
|
||||
"default": []
|
||||
},
|
||||
"tags": {
|
||||
"type": "array",
|
||||
"description": "List of tags to include when indexing. For a given repo, only the tags that exist on the repo's remote *and* match at least one of the provided `tags` will be indexed. Glob patterns are supported. A maximum of 64 tags can be indexed, with any additional tags being ignored.",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"latest",
|
||||
"v2.*.*"
|
||||
],
|
||||
[
|
||||
"**"
|
||||
]
|
||||
],
|
||||
"default": []
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"type"
|
||||
],
|
||||
"additionalProperties": false
|
||||
},
|
||||
{
|
||||
"$schema": "http://json-schema.org/draft-07/schema#",
|
||||
"type": "object",
|
||||
"title": "GitlabConnectionConfig",
|
||||
"properties": {
|
||||
"type": {
|
||||
"const": "gitlab",
|
||||
"description": "GitLab Configuration"
|
||||
},
|
||||
"token": {
|
||||
"$ref": "#/properties/connections/patternProperties/%5E%5Ba-zA-Z0-9_-%5D%2B%24/oneOf/0/properties/token",
|
||||
"description": "An authentication token.",
|
||||
"examples": [
|
||||
{
|
||||
"secret": "SECRET_KEY"
|
||||
}
|
||||
]
|
||||
},
|
||||
"url": {
|
||||
"type": "string",
|
||||
"format": "url",
|
||||
"default": "https://gitlab.com",
|
||||
"description": "The URL of the GitLab host. Defaults to https://gitlab.com",
|
||||
"examples": [
|
||||
"https://gitlab.com",
|
||||
"https://gitlab.example.com"
|
||||
],
|
||||
"pattern": "^https?:\\/\\/[^\\s/$.?#].[^\\s]*$"
|
||||
},
|
||||
"all": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Sync all projects visible to the provided `token` (if any) in the GitLab instance. This option is ignored if `url` is either unset or set to https://gitlab.com ."
|
||||
},
|
||||
"users": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"description": "List of users to sync with. All projects owned by the user and visible to the provided `token` (if any) will be synced, unless explicitly defined in the `exclude` property."
|
||||
},
|
||||
"groups": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"my-group"
|
||||
],
|
||||
[
|
||||
"my-group/sub-group-a",
|
||||
"my-group/sub-group-b"
|
||||
]
|
||||
],
|
||||
"description": "List of groups to sync with. All projects in the group (and recursive subgroups) visible to the provided `token` (if any) will be synced, unless explicitly defined in the `exclude` property. Subgroups can be specified by providing the path to the subgroup (e.g. `my-group/sub-group-a`)."
|
||||
},
|
||||
"projects": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"my-group/my-project"
|
||||
],
|
||||
[
|
||||
"my-group/my-sub-group/my-project"
|
||||
]
|
||||
],
|
||||
"description": "List of individual projects to sync with. The project's namespace must be specified. See: https://docs.gitlab.com/ee/user/namespace/"
|
||||
},
|
||||
"topics": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"minItems": 1,
|
||||
"description": "List of project topics to include when syncing. Only projects that match at least one of the provided `topics` will be synced. If not specified, all projects will be synced, unless explicitly defined in the `exclude` property. Glob patterns are supported.",
|
||||
"examples": [
|
||||
[
|
||||
"docs",
|
||||
"core"
|
||||
]
|
||||
]
|
||||
},
|
||||
"exclude": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"forks": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Exclude forked projects from syncing."
|
||||
},
|
||||
"archived": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Exclude archived projects from syncing."
|
||||
},
|
||||
"projects": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"default": [],
|
||||
"examples": [
|
||||
[
|
||||
"my-group/my-project"
|
||||
]
|
||||
],
|
||||
"description": "List of projects to exclude from syncing. Glob patterns are supported. The project's namespace must be specified, see: https://docs.gitlab.com/ee/user/namespace/"
|
||||
},
|
||||
"topics": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"description": "List of project topics to exclude when syncing. Projects that match one of the provided `topics` will be excluded from syncing. Glob patterns are supported.",
|
||||
"examples": [
|
||||
[
|
||||
"tests",
|
||||
"ci"
|
||||
]
|
||||
]
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
},
|
||||
"revisions": {
|
||||
"$ref": "#/properties/connections/patternProperties/%5E%5Ba-zA-Z0-9_-%5D%2B%24/oneOf/0/properties/revisions"
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"type"
|
||||
],
|
||||
"additionalProperties": false
|
||||
},
|
||||
{
|
||||
"$schema": "http://json-schema.org/draft-07/schema#",
|
||||
"type": "object",
|
||||
"title": "GiteaConnectionConfig",
|
||||
"properties": {
|
||||
"type": {
|
||||
"const": "gitea",
|
||||
"description": "Gitea Configuration"
|
||||
},
|
||||
"token": {
|
||||
"$ref": "#/properties/connections/patternProperties/%5E%5Ba-zA-Z0-9_-%5D%2B%24/oneOf/0/properties/token",
|
||||
"description": "A Personal Access Token (PAT).",
|
||||
"examples": [
|
||||
{
|
||||
"secret": "SECRET_KEY"
|
||||
}
|
||||
]
|
||||
},
|
||||
"url": {
|
||||
"type": "string",
|
||||
"format": "url",
|
||||
"default": "https://gitea.com",
|
||||
"description": "The URL of the Gitea host. Defaults to https://gitea.com",
|
||||
"examples": [
|
||||
"https://gitea.com",
|
||||
"https://gitea.example.com"
|
||||
],
|
||||
"pattern": "^https?:\\/\\/[^\\s/$.?#].[^\\s]*$"
|
||||
},
|
||||
"orgs": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"my-org-name"
|
||||
]
|
||||
],
|
||||
"description": "List of organizations to sync with. All repositories in the organization visible to the provided `token` (if any) will be synced, unless explicitly defined in the `exclude` property. If a `token` is provided, it must have the read:organization scope."
|
||||
},
|
||||
"repos": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"pattern": "^[\\w.-]+\\/[\\w.-]+$"
|
||||
},
|
||||
"description": "List of individual repositories to sync with. Expected to be formatted as '{orgName}/{repoName}' or '{userName}/{repoName}'."
|
||||
},
|
||||
"users": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"username-1",
|
||||
"username-2"
|
||||
]
|
||||
],
|
||||
"description": "List of users to sync with. All repositories that the user owns will be synced, unless explicitly defined in the `exclude` property. If a `token` is provided, it must have the read:user scope."
|
||||
},
|
||||
"exclude": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"forks": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Exclude forked repositories from syncing."
|
||||
},
|
||||
"archived": {
|
||||
"type": "boolean",
|
||||
"default": false,
|
||||
"description": "Exclude archived repositories from syncing."
|
||||
},
|
||||
"repos": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"default": [],
|
||||
"description": "List of individual repositories to exclude from syncing. Glob patterns are supported."
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
},
|
||||
"revisions": {
|
||||
"$ref": "#/properties/connections/patternProperties/%5E%5Ba-zA-Z0-9_-%5D%2B%24/oneOf/0/properties/revisions"
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"type"
|
||||
],
|
||||
"additionalProperties": false
|
||||
},
|
||||
{
|
||||
"$schema": "http://json-schema.org/draft-07/schema#",
|
||||
"type": "object",
|
||||
"title": "GerritConnectionConfig",
|
||||
"properties": {
|
||||
"type": {
|
||||
"const": "gerrit",
|
||||
"description": "Gerrit Configuration"
|
||||
},
|
||||
"url": {
|
||||
"type": "string",
|
||||
"format": "url",
|
||||
"description": "The URL of the Gerrit host.",
|
||||
"examples": [
|
||||
"https://gerrit.example.com"
|
||||
],
|
||||
"pattern": "^https?:\\/\\/[^\\s/$.?#].[^\\s]*$"
|
||||
},
|
||||
"projects": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"description": "List of specific projects to sync. If not specified, all projects will be synced. Glob patterns are supported",
|
||||
"examples": [
|
||||
[
|
||||
"project1/repo1",
|
||||
"project2/**"
|
||||
]
|
||||
]
|
||||
},
|
||||
"exclude": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"projects": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"examples": [
|
||||
[
|
||||
"project1/repo1",
|
||||
"project2/**"
|
||||
]
|
||||
],
|
||||
"description": "List of specific projects to exclude from syncing."
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
}
|
||||
},
|
||||
"required": [
|
||||
"type",
|
||||
"url"
|
||||
],
|
||||
"additionalProperties": false
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
}
|
||||
```
|
||||
|
||||
</Accordion>
|
||||
27
docs/self-hosting/more/tenancy.mdx
Normal file
|
|
@ -0,0 +1,27 @@
|
|||
---
|
||||
title: Multi Tenancy Mode
|
||||
sidebarTitle: Multi tenancy
|
||||
---
|
||||
|
||||
<Warning>If you're switching from single-tenant mode, delete the Sourcebot cache (the `.sourcebot` folder) before starting.</Warning>
|
||||
<Warning>[Authentication](/self-hosting/more/authentication) must be enabled to use multi tenancy mode.</Warning>
|
||||
Multi tenancy allows your Sourcebot deployment to have **multiple organizations**, each with their own set of members and repos. To enable multi tenancy mode, define an environment variable
|
||||
named `SOURCEBOT_TENANCY_MODE` and set its value to `multi`. When multi tenancy mode is enabled:
|
||||
|
||||
- Any members or repos that are configured in an organization are isolated to that organization
|
||||
- Members must be invited to an organization to gain access
|
||||
- Members may be a part of multiple organizations and switch through them in the UI
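For example, a minimal sketch of enabling multi tenancy when running the container:

```bash
docker run \
    -e SOURCEBOT_AUTH_ENABLED=true \
    -e SOURCEBOT_TENANCY_MODE=multi \
    /* additional args */ \
    ghcr.io/sourcebot-dev/sourcebot:latest
```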
|
||||
|
||||
|
||||
### Organization creation form
|
||||
|
||||
When you sign in for the first time (assuming you didn't go through an invite), you'll be presented with the organization creation form. The member who creates
|
||||
the organization will be the Owner.
|
||||
|
||||

|
||||
|
||||
### Switching between organizations
|
||||
|
||||
To switch between organizations, press the drop down on the top left of the navigation menu. This also provides an option to create a new organization:
|
||||
|
||||

|
||||
14
docs/self-hosting/more/transactional-emails.mdx
Normal file
|
|
@ -0,0 +1,14 @@
|
|||
---
|
||||
title: Transactional Email
|
||||
sidebarTitle: Transactional email
|
||||
---
|
||||
|
||||
To enable transactional emails in your deployment, set the following environment variables. We recommend using [Resend](https://resend.com/), but you can use any provider. Setting these enables you to:
|
||||
|
||||
- Send emails when new members are invited
|
||||
- Log into the Sourcebot deployment using [email codes](/self-hosting/more/authentication#email-codes)
|
||||
|
||||
| Variable | Description |
|
||||
| :------- | :---------- |
|
||||
| `SMTP_CONNECTION_URL` | SMTP server connection. |
|
||||
| `EMAIL_FROM_ADDRESS` | The sender's email address. |
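For example, a sketch of setting both variables when running the container (the SMTP connection string is a placeholder for your provider's credentials):

```bash
docker run \
    -e SMTP_CONNECTION_URL="smtps://username:password@smtp.example.com:465" \
    -e EMAIL_FROM_ADDRESS="sourcebot@example.com" \
    /* additional args */ \
    ghcr.io/sourcebot-dev/sourcebot:latest
```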
|
||||
126
docs/self-hosting/overview.mdx
Normal file
|
|
@ -0,0 +1,126 @@
|
|||
---
|
||||
title: Self-host Sourcebot
|
||||
sidebarTitle: Overview
|
||||
---
|
||||
|
||||
<Note>Want a managed solution? Check out [Sourcebot Cloud](/docs/getting-started).</Note>
|
||||
|
||||
Sourcebot is open source and can be self-hosted using our official [Docker image](https://github.com/sourcebot-dev/sourcebot/pkgs/container/sourcebot).
|
||||
|
||||
## Quick Start Guide
|
||||
|
||||
{/*@todo: record a self-hosting quick start guide
|
||||
<iframe
|
||||
width="560"
|
||||
height="315"
|
||||
src="https://www.youtube.com/embed/4KzFe50RQkQ"
|
||||
title="YouTube video player"
|
||||
frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
|
||||
allowfullscreen
|
||||
></iframe>
|
||||
*/}
|
||||
|
||||
<Steps>
|
||||
<Step title="Create a config">
|
||||
By default, Sourcebot requires a configuration file with a list of [code host connections](/docs/connections/overview) that specify what repositories should be **synced** (cloned and indexed). To get started, run the following command to create a starter `config.json`:
|
||||
|
||||
```bash
|
||||
touch config.json
|
||||
echo '{
|
||||
"$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v3/index.json",
|
||||
"connections": {
|
||||
// Comments are supported
|
||||
"starter-connection": {
|
||||
"type": "github",
|
||||
"repos": [
|
||||
"sourcebot-dev/sourcebot"
|
||||
]
|
||||
}
|
||||
}
|
||||
}' > config.json
|
||||
```
|
||||
|
||||
This config creates a single GitHub connection named `starter-connection` that specifies [Sourcebot](https://github.com/sourcebot-dev/sourcebot) as a repo to sync.
|
||||
</Step>
|
||||
|
||||
<Step title="Launch your instance">
|
||||
Sourcebot is packaged as a [single Docker image](https://github.com/sourcebot-dev/sourcebot/pkgs/container/sourcebot). In the same directory as `config.json`, run the following command to start your instance:
|
||||
|
||||
```bash
|
||||
docker run \
|
||||
-p 3000:3000 \
|
||||
--pull=always \
|
||||
--rm \
|
||||
-v $(pwd):/data \
|
||||
-e CONFIG_PATH=/data/config.json \
|
||||
--name sourcebot \
|
||||
ghcr.io/sourcebot-dev/sourcebot:latest
|
||||
```
|
||||
|
||||
Navigate to `localhost:3000` to start searching the Sourcebot repo.
|
||||
|
||||
<Accordion title="Details">
|
||||
**This command**:
|
||||
- pulls the latest version of the `sourcebot` docker image.
|
||||
- mounts the working directory to `/data` in the container to allow Sourcebot to persist data across restarts, and to access the `config.json`. In your local directory, you should see a `.sourcebot` folder created that contains all persistent data.
|
||||
- runs any pending database migrations.
|
||||
- starts up all services, including the webserver exposed on port 3000.
|
||||
- reads `config.json` and starts syncing.
|
||||
</Accordion>
|
||||
|
||||
<Warning>Hit an issue? Please let us know on [GitHub discussions](https://github.com/sourcebot-dev/sourcebot/discussions/categories/support) or by [emailing us](mailto:team@sourcebot.dev).</Warning>
|
||||
</Step>
|
||||
|
||||
<Step title="Link your code">
|
||||
Sourcebot supports indexing public & private code on the following code hosts:
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card horizontal title="GitHub" href="/docs/connections/github" />
|
||||
<Card horizontal title="GitLab" href="/docs/connections/gitlab" />
|
||||
<Card horizontal title="Gitea" href="/docs/connections/gitea" />
|
||||
<Card horizontal title="Gerrit" href="/docs/connections/gerrit" />
|
||||
</CardGroup>
|
||||
|
||||
<Note>Missing your code host? [Submit a feature request on GitHub](https://github.com/sourcebot-dev/sourcebot/discussions/categories/ideas).</Note>
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
## Architecture
|
||||
|
||||
Sourcebot is shipped as a single docker container that runs a collection of services using [supervisord](https://supervisord.org/):
|
||||
|
||||

|
||||
|
||||
{/*TODO: outline the different services, how Sourcebot communicates with code hosts, and the different*/}
|
||||
|
||||
Sourcebot consists of the following components:
|
||||
- **Web Server** : main Next.js web application serving the Sourcebot UI.
|
||||
- **Backend Worker** : Node.js process that incrementally syncs with code hosts (e.g., GitHub, GitLab etc.) and asynchronously indexes configured repositories.
|
||||
- **Zoekt** : the [open-source](https://github.com/sourcegraph/zoekt), trigram indexing code search engine that powers Sourcebot under the hood.
|
||||
- **Postgres** : transactional database for storing business-logic data.
|
||||
- **Redis Job Queue** : fast in-memory store. Used with [BullMQ](https://docs.bullmq.io/) for queuing asynchronous work.
|
||||
- **`.sourcebot/` cache** : file-system cache where persistent data is written.
|
||||
|
||||
You can use managed Redis / Postgres services that run outside of the Sourcebot container by providing the `REDIS_URL` and `DATABASE_URL` environment variables, respectively. See the [configuration](/self-hosting/configuration) page for more configuration options.
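For example, a sketch of pointing Sourcebot at external services (the hostnames and credentials are placeholders):

```bash
docker run \
    -e DATABASE_URL="postgresql://user:password@my-postgres-host:5432/sourcebot" \
    -e REDIS_URL="redis://my-redis-host:6379" \
    /* additional args */ \
    ghcr.io/sourcebot-dev/sourcebot:latest
```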
|
||||
|
||||
## Scalability
|
||||
|
||||
One of our design philosophies for Sourcebot is to keep our infrastructure [radically simple](https://www.radicalsimpli.city/) while balancing scalability concerns. Depending on the number of repositories you have indexed and the instance you are running Sourcebot on, you may experience slow search times or other performance degradations. Our recommendation is to vertically scale your instance by increasing the number of CPU cores and memory.
|
||||
|
||||
Sourcebot does not support horizontal scaling at this time, but it is on our roadmap. If this is something your team would be interested in, please contact us at [team@sourcebot.dev](mailto:team@sourcebot.dev).
|
||||
|
||||
|
||||
## Telemetry
|
||||
By default, Sourcebot collects anonymized usage data through [PostHog](https://posthog.com/) to help us improve the performance and reliability of our tool. We don't collect or transmit <a href="https://sourcebot.dev/search/search?query=captureEvent%5C(%20repo%3Asourcebot">any information related to your codebase</a>. In addition, all events are [sanitized](https://github.com/sourcebot-dev/sourcebot/blob/HEAD/packages/web/src/app/posthogProvider.tsx) to ensure that no sensitive details (e.g., IP address, query info) leave your machine.
|
||||
|
||||
The data we collect includes general usage statistics and metadata such as query performance (e.g., search duration, error rates) to monitor the application's health and functionality. This information helps us better understand how Sourcebot is used and where improvements can be made.
|
||||
|
||||
If you'd like to disable all telemetry, you can do so by setting the environment variable `SOURCEBOT_TELEMETRY_DISABLED` to `true`:
|
||||
|
||||
```bash
|
||||
docker run \
|
||||
-e SOURCEBOT_TELEMETRY_DISABLED=true \
|
||||
/* additional args */ \
|
||||
ghcr.io/sourcebot-dev/sourcebot:latest
|
||||
```
|
||||
93
docs/self-hosting/upgrade/v2-to-v3-guide.mdx
Normal file
|
|
@ -0,0 +1,93 @@
|
|||
---
|
||||
title: V2 to V3 Guide
|
||||
sidebarTitle: V2 to V3 guide
|
||||
---
|
||||
|
||||
This guide will walk you through upgrading your Sourcebot deployment from v2 to v3.
|
||||
|
||||
<Warning>
|
||||
Please note that the following features are no longer supported in v3:
|
||||
- Local file indexing
|
||||
- Raw remote `.git` repo indexing (i.e. not through a supported code host)
|
||||
|
||||
If your deployment is dependent on these features, please [reach out](https://github.com/sourcebot-dev/sourcebot/discussions).
|
||||
</Warning>
|
||||
|
||||
<Warning>This migration will require you to reindex all your repos</Warning>
|
||||
|
||||
<Steps>
|
||||
<Step title="Spin down deployment">
|
||||
</Step>
|
||||
<Step title="Delete Sourcebot cache (.sourcebot directory)">
|
||||
</Step>
|
||||
<Step title="Migrate your configuration file to the v3 schema">
|
||||
The main change between the v3 and v2 schemas is how the data is structured. In v2, you defined a `repos` array which contained unnamed config objects:
|
||||
|
||||
```json
|
||||
{
|
||||
"$schema": "./schemas/v2/index.json",
|
||||
"repos": [
|
||||
{
|
||||
"type": "github",
|
||||
"repos": [
|
||||
"sourcebot-dev/sourcebot"
|
||||
]
|
||||
},
|
||||
{
|
||||
"type": "gitlab":
|
||||
"groups": [
|
||||
"wireshark"
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
In v3, you define a `connections` map which contains named `connection` objects:
|
||||
```json
|
||||
{
|
||||
"$schema": "./schemas/v3/index.json",
|
||||
"connections": {
|
||||
"sourcebot-connection": {
|
||||
"type": "github",
|
||||
"repos": [
|
||||
"sourcebot-dev/sourcebot"
|
||||
]
|
||||
},
|
||||
"wireshark-connection": {
|
||||
"type": "gitlab":
|
||||
"groups": [
|
||||
"wireshark
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The schema of the connections defined here is the same as the "repos" you defined in the v2 schema. Some helpful notes:
|
||||
|
||||
- The name of the connection (`sourcebot-connection` and `wireshark-connection` above) is only used to identify the connection in Sourcebot. It can be any string that contains letters, digits, hyphens, or underscores
|
||||
- A connection is associated with one and only one code host platform, and this must be specified in the connections `type` field
|
||||
- Make sure you update the `$schema` field to point to the v3 schema
|
||||
- The `settings` object doesn't need to be changed. We've added new settings params (check out the v3 schema for more details)
|
||||
</Step>
|
||||
<Step title="Start your Sourcebot deployment">
|
||||
When you start up your Sourcebot deployment, it will create a fresh cache and begin indexing against your new v3 configuration file.
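
For reference, launching the container again looks something like the following (a sketch; the port, volume mount, and config filename are assumptions and should match whatever your existing deployment uses):

```bash
docker run \
    -p 3000:3000 \
    -v $(pwd):/data \
    -e CONFIG_PATH=/data/config.json \
    ghcr.io/sourcebot-dev/sourcebot:latest
```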

If there are issues with your configuration file, an error will be printed to the console. After updating your configuration file, restart your Sourcebot deployment to pick up the new changes.
</Step>
<Step title="You're done!">
Congrats, you've successfully migrated to v3! Please let us know what you think of the new features by reaching out on our [Discord](https://discord.gg/6Fhp27x7Pb) or [GitHub discussions](https://github.com/sourcebot-dev/sourcebot/discussions/categories/support).
</Step>
</Steps>

## Troubleshooting

Some things to check:

- Make sure you update the `$schema` field in the configuration file to point to the v3 schema.
- Make sure you have a name for each `connection`, and that the name only contains letters, digits, hyphens, or underscores.
- Make sure each `connection` has a `type` field with a valid value (`gitlab`, `github`, `gitea`, `gerrit`).

Having trouble migrating from v2 to v3? Reach out to us on [Discord](https://discord.gg/6Fhp27x7Pb) or [GitHub discussions](https://github.com/sourcebot-dev/sourcebot/discussions/categories/support) and we'll try our best to help.

docs/snippets/connection-cards.mdx (new file, 4 lines)
@@ -0,0 +1,4 @@
<CardGroup cols={2}>
    <Card title="GitHub" icon="github" href="/docs/connections/github"></Card>
    <Card title="GitLab" icon="gitlab" href="/docs/connections/gitlab"></Card>
</CardGroup>

entrypoint.sh (194 lines changed)
@@ -1,16 +1,26 @@
#!/bin/sh
set -e

echo -e "\e[34m[Info] Sourcebot version: $SOURCEBOT_VERSION\e[0m"
echo -e "\e[34m[Info] Sourcebot version: $NEXT_PUBLIC_SOURCEBOT_VERSION\e[0m"

# If we don't have a PostHog key, then we need to disable telemetry.
if [ -z "$POSTHOG_PAPIK" ]; then
    echo -e "\e[33m[Warning] POSTHOG_PAPIK was not set. Setting SOURCEBOT_TELEMETRY_DISABLED.\e[0m"
    export SOURCEBOT_TELEMETRY_DISABLED=1
if [ -z "$NEXT_PUBLIC_POSTHOG_PAPIK" ]; then
    echo -e "\e[33m[Warning] NEXT_PUBLIC_POSTHOG_PAPIK was not set. Setting SOURCEBOT_TELEMETRY_DISABLED.\e[0m"
    export SOURCEBOT_TELEMETRY_DISABLED=true
fi

if [ -n "$SOURCEBOT_TELEMETRY_DISABLED" ]; then
    # Validate that SOURCEBOT_TELEMETRY_DISABLED is either "true" or "false"
    if [ "$SOURCEBOT_TELEMETRY_DISABLED" != "true" ] && [ "$SOURCEBOT_TELEMETRY_DISABLED" != "false" ]; then
        echo -e "\e[31m[Error] SOURCEBOT_TELEMETRY_DISABLED must be either 'true' or 'false'. Got '$SOURCEBOT_TELEMETRY_DISABLED'\e[0m"
        exit 1
    fi
else
    export SOURCEBOT_TELEMETRY_DISABLED=false
fi

# Issue an info message about telemetry
if [ ! -z "$SOURCEBOT_TELEMETRY_DISABLED" ]; then
if [ "$SOURCEBOT_TELEMETRY_DISABLED" = "true" ]; then
    echo -e "\e[34m[Info] Disabling telemetry since SOURCEBOT_TELEMETRY_DISABLED was set.\e[0m"
fi

@@ -19,9 +29,59 @@ if [ ! -d "$DATA_CACHE_DIR" ]; then
    mkdir -p "$DATA_CACHE_DIR"
fi

# Check if DATABASE_DATA_DIR exists, if not initialize it
if [ ! -d "$DATABASE_DATA_DIR" ]; then
    echo -e "\e[34m[Info] Initializing database at $DATABASE_DATA_DIR...\e[0m"
    mkdir -p $DATABASE_DATA_DIR && chown -R postgres:postgres "$DATABASE_DATA_DIR"
    su postgres -c "initdb -D $DATABASE_DATA_DIR"
fi

# Create the redis data directory if it doesn't exist
if [ ! -d "$REDIS_DATA_DIR" ]; then
    mkdir -p $REDIS_DATA_DIR
fi

if [ -z "$SOURCEBOT_ENCRYPTION_KEY" ]; then
    echo -e "\e[33m[Warning] SOURCEBOT_ENCRYPTION_KEY is not set.\e[0m"

    if [ -f "$DATA_CACHE_DIR/.secret" ]; then
        echo -e "\e[34m[Info] Loading environment variables from $DATA_CACHE_DIR/.secret\e[0m"
    else
        echo -e "\e[34m[Info] Generating a new encryption key...\e[0m"
        SOURCEBOT_ENCRYPTION_KEY=$(openssl rand -base64 24)
        echo "SOURCEBOT_ENCRYPTION_KEY=\"$SOURCEBOT_ENCRYPTION_KEY\"" >> "$DATA_CACHE_DIR/.secret"
    fi

    set -a
    . "$DATA_CACHE_DIR/.secret"
    set +a
fi

# @see: https://authjs.dev/getting-started/deployment#auth_secret
if [ -z "$AUTH_SECRET" ]; then
    echo -e "\e[33m[Warning] AUTH_SECRET is not set.\e[0m"

    if [ -f "$DATA_CACHE_DIR/.authjs-secret" ]; then
        echo -e "\e[34m[Info] Loading environment variables from $DATA_CACHE_DIR/.authjs-secret\e[0m"
    else
        echo -e "\e[34m[Info] Generating a new auth secret...\e[0m"
        AUTH_SECRET=$(openssl rand -base64 33)
        echo "AUTH_SECRET=\"$AUTH_SECRET\"" >> "$DATA_CACHE_DIR/.authjs-secret"
    fi

    set -a
    . "$DATA_CACHE_DIR/.authjs-secret"
    set +a
fi

if [ -z "$AUTH_URL" ]; then
    echo -e "\e[33m[Warning] AUTH_URL is not set.\e[0m"
    export AUTH_URL="http://localhost:3000"
fi

# In order to detect if this is the first run, we create a `.installed` file in
# the cache directory.
FIRST_RUN_FILE="$DATA_CACHE_DIR/.installedv2"
FIRST_RUN_FILE="$DATA_CACHE_DIR/.installedv3"

if [ ! -f "$FIRST_RUN_FILE" ]; then
    touch "$FIRST_RUN_FILE"

@@ -29,13 +89,13 @@ if [ ! -f "$FIRST_RUN_FILE" ]; then

    # If this is our first run, send an `install` event to PostHog
    # (if telemetry is enabled)
    if [ -z "$SOURCEBOT_TELEMETRY_DISABLED" ]; then
    if [ "$SOURCEBOT_TELEMETRY_DISABLED" = "false" ]; then
        if ! ( curl -L --output /dev/null --silent --fail --header "Content-Type: application/json" -d '{
            "api_key": "'"$POSTHOG_PAPIK"'",
            "api_key": "'"$NEXT_PUBLIC_POSTHOG_PAPIK"'",
            "event": "install",
            "distinct_id": "'"$SOURCEBOT_INSTALL_ID"'",
            "properties": {
                "sourcebot_version": "'"$SOURCEBOT_VERSION"'"
                "sourcebot_version": "'"$NEXT_PUBLIC_SOURCEBOT_VERSION"'"
            }
        }' https://us.i.posthog.com/capture/ ) then
            echo -e "\e[33m[Warning] Failed to send install event.\e[0m"

@@ -46,17 +106,17 @@ else
    PREVIOUS_VERSION=$(cat "$FIRST_RUN_FILE" | jq -r '.version')

    # If the version has changed, we assume an upgrade has occurred.
    if [ "$PREVIOUS_VERSION" != "$SOURCEBOT_VERSION" ]; then
        echo -e "\e[34m[Info] Upgraded from version $PREVIOUS_VERSION to $SOURCEBOT_VERSION\e[0m"
    if [ "$PREVIOUS_VERSION" != "$NEXT_PUBLIC_SOURCEBOT_VERSION" ]; then
        echo -e "\e[34m[Info] Upgraded from version $PREVIOUS_VERSION to $NEXT_PUBLIC_SOURCEBOT_VERSION\e[0m"

        if [ -z "$SOURCEBOT_TELEMETRY_DISABLED" ]; then
        if [ "$SOURCEBOT_TELEMETRY_DISABLED" = "false" ]; then
            if ! ( curl -L --output /dev/null --silent --fail --header "Content-Type: application/json" -d '{
                "api_key": "'"$POSTHOG_PAPIK"'",
                "api_key": "'"$NEXT_PUBLIC_POSTHOG_PAPIK"'",
                "event": "upgrade",
                "distinct_id": "'"$SOURCEBOT_INSTALL_ID"'",
                "properties": {
                    "from_version": "'"$PREVIOUS_VERSION"'",
                    "to_version": "'"$SOURCEBOT_VERSION"'"
                    "to_version": "'"$NEXT_PUBLIC_SOURCEBOT_VERSION"'"
                }
            }' https://us.i.posthog.com/capture/ ) then
                echo -e "\e[33m[Warning] Failed to send upgrade event.\e[0m"

@@ -65,94 +125,34 @@ else
    fi
fi

echo "{\"version\": \"$SOURCEBOT_VERSION\", \"install_id\": \"$SOURCEBOT_INSTALL_ID\"}" > "$FIRST_RUN_FILE"
echo "{\"version\": \"$NEXT_PUBLIC_SOURCEBOT_VERSION\", \"install_id\": \"$SOURCEBOT_INSTALL_ID\"}" > "$FIRST_RUN_FILE"

# Fallback to sample config if a config does not exist
if echo "$CONFIG_PATH" | grep -qE '^https?://'; then
    if ! curl --output /dev/null --silent --head --fail "$CONFIG_PATH"; then
        echo -e "\e[33m[Warning] Remote config file at '$CONFIG_PATH' not found. Falling back on sample config.\e[0m"
        CONFIG_PATH="./default-config.json"

# Start the database and wait for it to be ready before starting any other service
if [ "$DATABASE_URL" = "postgresql://postgres@localhost:5432/sourcebot" ]; then
    su postgres -c "postgres -D $DATABASE_DATA_DIR" &
    until pg_isready -h localhost -p 5432 -U postgres; do
        echo -e "\e[34m[Info] Waiting for the database to be ready...\e[0m"
        sleep 1
    done

    # Check if the database already exists, and create it if it doesn't
    EXISTING_DB=$(psql -U postgres -tAc "SELECT 1 FROM pg_database WHERE datname = 'sourcebot'")

    if [ "$EXISTING_DB" = "1" ]; then
        echo "Database 'sourcebot' already exists; skipping creation."
    else
        echo "Creating database 'sourcebot'..."
        psql -U postgres -c "CREATE DATABASE \"sourcebot\""
    fi
elif [ ! -f "$CONFIG_PATH" ]; then
    echo -e "\e[33m[Warning] Config file at '$CONFIG_PATH' not found. Falling back on sample config.\e[0m"
    CONFIG_PATH="./default-config.json"
fi

echo -e "\e[34m[Info] Using config file at: '$CONFIG_PATH'.\e[0m"

# Update NextJs public env variables w/o requiring a rebuild.
# @see: https://phase.dev/blog/nextjs-public-runtime-variables/
{
    # Infer NEXT_PUBLIC_SOURCEBOT_TELEMETRY_DISABLED if it is not set
    if [ -z "$NEXT_PUBLIC_SOURCEBOT_TELEMETRY_DISABLED" ] && [ ! -z "$SOURCEBOT_TELEMETRY_DISABLED" ]; then
        export NEXT_PUBLIC_SOURCEBOT_TELEMETRY_DISABLED="$SOURCEBOT_TELEMETRY_DISABLED"
    fi

    # Infer NEXT_PUBLIC_SOURCEBOT_VERSION if it is not set
    if [ -z "$NEXT_PUBLIC_SOURCEBOT_VERSION" ] && [ ! -z "$SOURCEBOT_VERSION" ]; then
        export NEXT_PUBLIC_SOURCEBOT_VERSION="$SOURCEBOT_VERSION"
    fi

    # Infer NEXT_PUBLIC_PUBLIC_SEARCH_DEMO if it is not set
    if [ -z "$NEXT_PUBLIC_PUBLIC_SEARCH_DEMO" ] && [ ! -z "$PUBLIC_SEARCH_DEMO" ]; then
        export NEXT_PUBLIC_PUBLIC_SEARCH_DEMO="$PUBLIC_SEARCH_DEMO"
    fi

    # Always infer NEXT_PUBLIC_POSTHOG_PAPIK
    export NEXT_PUBLIC_POSTHOG_PAPIK="$POSTHOG_PAPIK"

    # Iterate over all .js files in .next & public, making substitutions for the `BAKED_` sentinel values
    # with their actual desired runtime value.
    find /app/packages/web/public /app/packages/web/.next -type f -name "*.js" |
    while read file; do
        sed -i "s|BAKED_NEXT_PUBLIC_SOURCEBOT_TELEMETRY_DISABLED|${NEXT_PUBLIC_SOURCEBOT_TELEMETRY_DISABLED}|g" "$file"
        sed -i "s|BAKED_NEXT_PUBLIC_SOURCEBOT_VERSION|${NEXT_PUBLIC_SOURCEBOT_VERSION}|g" "$file"
        sed -i "s|BAKED_NEXT_PUBLIC_POSTHOG_PAPIK|${NEXT_PUBLIC_POSTHOG_PAPIK}|g" "$file"
        sed -i "s|BAKED_NEXT_PUBLIC_PUBLIC_SEARCH_DEMO|${NEXT_PUBLIC_PUBLIC_SEARCH_DEMO}|g" "$file"
    done
}

# Update specifically NEXT_PUBLIC_DOMAIN_SUB_PATH w/o requiring a rebuild.
# Ultimately, the DOMAIN_SUB_PATH sets the `basePath` param in the next.config.mjs.
# Similar to above, we pass in a `BAKED_` sentinel value into next.config.mjs at build
# time. Unlike above, the `basePath` configuration is set in files other than just javascript
# code (e.g., manifest files, css files, etc.), so this section has subtle differences.
#
# @see: https://nextjs.org/docs/app/api-reference/next-config-js/basePath
# @see: https://phase.dev/blog/nextjs-public-runtime-variables/
{
    if [ ! -z "$DOMAIN_SUB_PATH" ]; then
        # If the sub-path is "/", this creates problems with certain replacements. For example:
        # /BAKED_NEXT_PUBLIC_DOMAIN_SUB_PATH/_next/image -> //_next/image (notice the double slash...)
        # To get around this, we default to an empty sub-path, which is the default when no sub-path is defined.
        if [ "$DOMAIN_SUB_PATH" = "/" ]; then
            DOMAIN_SUB_PATH=""

        # Otherwise, we need to ensure that the sub-path starts with a slash, since this is a requirement
        # for the basePath property. For example, assume DOMAIN_SUB_PATH=/bot, then:
        # /BAKED_NEXT_PUBLIC_DOMAIN_SUB_PATH/_next/image -> /bot/_next/image
        elif [[ ! "$DOMAIN_SUB_PATH" =~ ^/ ]]; then
            DOMAIN_SUB_PATH="/$DOMAIN_SUB_PATH"
        fi
    fi

    if [ ! -z "$DOMAIN_SUB_PATH" ]; then
        echo -e "\e[34m[Info] DOMAIN_SUB_PATH was set to "$DOMAIN_SUB_PATH". Overriding default path.\e[0m"
    fi

    # Always set NEXT_PUBLIC_DOMAIN_SUB_PATH to DOMAIN_SUB_PATH (even if it is empty!!)
    export NEXT_PUBLIC_DOMAIN_SUB_PATH="$DOMAIN_SUB_PATH"

    # Iterate over _all_ files in the web directory, making substitutions for the `BAKED_` sentinel values
    # with their actual desired runtime value.
    find /app/packages/web -type f |
    while read file; do
        # @note: the leading "/" is required here as it is included at build time. See Dockerfile.
        sed -i "s|/BAKED_NEXT_PUBLIC_DOMAIN_SUB_PATH|${NEXT_PUBLIC_DOMAIN_SUB_PATH}|g" "$file"
    done
}

# Run a database migration
echo -e "\e[34m[Info] Running database migration...\e[0m"
yarn workspace @sourcebot/db prisma:migrate:prod

# Create the log directory
mkdir -p /var/log/sourcebot

# Run supervisord
exec supervisord -c /etc/supervisor/conf.d/supervisord.conf

grafana.alloy (new file, 31 lines)
@@ -0,0 +1,31 @@
prometheus.scrape "local_app" {
    targets = [
        {
            __address__ = "localhost:6070",
        },
        {
            __address__ = "localhost:3060",
        },
    ]

    metrics_path = "/metrics"
    scrape_timeout = "500ms"
    scrape_interval = "15s"

    job_name = sys.env("GRAFANA_ENVIRONMENT")

    forward_to = [
        prometheus.remote_write.grafana_cloud.receiver,
    ]
}

prometheus.remote_write "grafana_cloud" {
    endpoint {
        url = sys.env("GRAFANA_ENDPOINT")

        basic_auth {
            username = sys.env("GRAFANA_PROM_USERNAME")
            password = sys.env("GRAFANA_PROM_PASSWORD")
        }
    }
}

package.json (19 lines changed)
@@ -4,14 +4,21 @@
    "packages/*"
  ],
  "scripts": {
    "build": "yarn workspaces run build",
    "build": "cross-env SKIP_ENV_VALIDATION=1 yarn workspaces run build",
    "test": "yarn workspaces run test",
    "dev": "npm-run-all --print-label --parallel dev:zoekt dev:backend dev:web",
    "dev:zoekt": "export PATH=\"$PWD/bin:$PATH\" && zoekt-webserver -index .sourcebot/index -rpc",
    "dev:backend": "yarn workspace @sourcebot/backend dev:watch",
    "dev:web": "yarn workspace @sourcebot/web dev"
    "dev": "yarn dev:prisma:migrate:dev && npm-run-all --print-label --parallel dev:zoekt dev:backend dev:web",
    "with-env": "cross-env PATH=\"$PWD/bin:$PATH\" dotenv -e .env.development -c --",
    "dev:zoekt": "yarn with-env zoekt-webserver -index .sourcebot/index -rpc",
    "dev:backend": "yarn with-env yarn workspace @sourcebot/backend dev:watch",
    "dev:web": "yarn with-env yarn workspace @sourcebot/web dev",
    "dev:prisma:migrate:dev": "yarn with-env yarn workspace @sourcebot/db prisma:migrate:dev",
    "dev:prisma:studio": "yarn with-env yarn workspace @sourcebot/db prisma:studio",
    "dev:prisma:migrate:reset": "yarn with-env yarn workspace @sourcebot/db prisma:migrate:reset"
  },
  "devDependencies": {
    "cross-env": "^7.0.3",
    "dotenv-cli": "^8.0.0",
    "npm-run-all": "^4.1.5"
  }
  },
  "packageManager": "yarn@4.7.0"
}

(deleted file)
@@ -1 +0,0 @@
POSTHOG_HOST=https://us.i.posthog.com

packages/backend/.gitignore (vendored, 3 lines changed)
@@ -1,3 +1,4 @@
dist/
!.env
.sentryclirc
# Sentry Config File
.sentryclirc

@@ -5,16 +5,16 @@
"main": "index.js",
|
||||
"type": "module",
|
||||
"scripts": {
|
||||
"dev:watch": "yarn generate:types && tsc-watch --preserveWatchOutput --onSuccess \"yarn dev --configPath ../../config.json --cacheDir ../../.sourcebot\"",
|
||||
"dev": "export PATH=\"$PWD/../../bin:$PATH\" && export CTAGS_COMMAND=ctags && node ./dist/index.js",
|
||||
"build": "yarn generate:types && tsc",
|
||||
"generate:types": "tsx tools/generateTypes.ts",
|
||||
"test": "vitest --config ./vitest.config.ts"
|
||||
"dev:watch": "tsc-watch --preserveWatchOutput --onSuccess \"yarn dev --cacheDir ../../.sourcebot\"",
|
||||
"dev": "node ./dist/index.js",
|
||||
"build": "tsc",
|
||||
"test": "cross-env SKIP_ENV_VALIDATION=1 vitest --config ./vitest.config.ts"
|
||||
},
|
||||
"devDependencies": {
|
||||
"@types/argparse": "^2.0.16",
|
||||
"@types/micromatch": "^4.0.9",
|
||||
"@types/node": "^22.7.5",
|
||||
"cross-env": "^7.0.3",
|
||||
"json-schema-to-typescript": "^15.0.2",
|
||||
"tsc-watch": "^6.2.0",
|
||||
"tsx": "^4.19.1",
|
||||
|
|
@ -23,17 +23,34 @@
|
|||
},
|
||||
"dependencies": {
|
||||
"@gitbeaker/rest": "^40.5.1",
|
||||
"@logtail/node": "^0.5.2",
|
||||
"@logtail/winston": "^0.5.2",
|
||||
"@octokit/rest": "^21.0.2",
|
||||
"@sentry/cli": "^2.42.2",
|
||||
"@sentry/node": "^9.3.0",
|
||||
"@sentry/profiling-node": "^9.3.0",
|
||||
"@sourcebot/crypto": "workspace:*",
|
||||
"@sourcebot/db": "workspace:*",
|
||||
"@sourcebot/error": "workspace:*",
|
||||
"@sourcebot/schemas": "workspace:*",
|
||||
"@t3-oss/env-core": "^0.12.0",
|
||||
"@types/express": "^5.0.0",
|
||||
"ajv": "^8.17.1",
|
||||
"argparse": "^2.0.1",
|
||||
"bullmq": "^5.34.10",
|
||||
"cross-fetch": "^4.0.0",
|
||||
"dotenv": "^16.4.5",
|
||||
"express": "^4.21.2",
|
||||
"gitea-js": "^1.22.0",
|
||||
"glob": "^11.0.0",
|
||||
"ioredis": "^5.4.2",
|
||||
"lowdb": "^7.0.1",
|
||||
"micromatch": "^4.0.8",
|
||||
"posthog-node": "^4.2.1",
|
||||
"prom-client": "^15.1.3",
|
||||
"simple-git": "^3.27.0",
|
||||
"strip-json-comments": "^5.0.1",
|
||||
"winston": "^3.15.0"
|
||||
"winston": "^3.15.0",
|
||||
"zod": "^3.24.2"
|
||||
}
|
||||
}
|
||||

packages/backend/src/connectionManager.ts (new file, 318 lines)
@@ -0,0 +1,318 @@
import { Connection, ConnectionSyncStatus, PrismaClient, Prisma } from "@sourcebot/db";
|
||||
import { Job, Queue, Worker } from 'bullmq';
|
||||
import { Settings } from "./types.js";
|
||||
import { ConnectionConfig } from "@sourcebot/schemas/v3/connection.type";
|
||||
import { createLogger } from "./logger.js";
|
||||
import { Redis } from 'ioredis';
|
||||
import { RepoData, compileGithubConfig, compileGitlabConfig, compileGiteaConfig, compileGerritConfig } from "./repoCompileUtils.js";
|
||||
import { BackendError, BackendException } from "@sourcebot/error";
|
||||
import { captureEvent } from "./posthog.js";
|
||||
import { env } from "./env.js";
|
||||
import * as Sentry from "@sentry/node";
|
||||
|
||||
interface IConnectionManager {
|
||||
scheduleConnectionSync: (connection: Connection) => Promise<void>;
|
||||
registerPollingCallback: () => void;
|
||||
dispose: () => void;
|
||||
}
|
||||
|
||||
const QUEUE_NAME = 'connectionSyncQueue';
|
||||
|
||||
type JobPayload = {
|
||||
connectionId: number,
|
||||
orgId: number,
|
||||
config: ConnectionConfig,
|
||||
};
|
||||
|
||||
type JobResult = {
|
||||
repoCount: number,
|
||||
}
|
||||
|
||||
export class ConnectionManager implements IConnectionManager {
|
||||
private worker: Worker;
|
||||
private queue: Queue<JobPayload>;
|
||||
private logger = createLogger('ConnectionManager');
|
||||
|
||||
constructor(
|
||||
private db: PrismaClient,
|
||||
private settings: Settings,
|
||||
redis: Redis,
|
||||
) {
|
||||
this.queue = new Queue<JobPayload>(QUEUE_NAME, {
|
||||
connection: redis,
|
||||
});
|
||||
this.worker = new Worker(QUEUE_NAME, this.runSyncJob.bind(this), {
|
||||
connection: redis,
|
||||
concurrency: this.settings.maxConnectionSyncJobConcurrency,
|
||||
});
|
||||
this.worker.on('completed', this.onSyncJobCompleted.bind(this));
|
||||
this.worker.on('failed', this.onSyncJobFailed.bind(this));
|
||||
}
|
||||
|
||||
public async scheduleConnectionSync(connection: Connection) {
|
||||
await this.db.$transaction(async (tx) => {
|
||||
await tx.connection.update({
|
||||
where: { id: connection.id },
|
||||
data: { syncStatus: ConnectionSyncStatus.IN_SYNC_QUEUE },
|
||||
});
|
||||
|
||||
const connectionConfig = connection.config as unknown as ConnectionConfig;
|
||||
|
||||
await this.queue.add('connectionSyncJob', {
|
||||
connectionId: connection.id,
|
||||
orgId: connection.orgId,
|
||||
config: connectionConfig,
|
||||
});
|
||||
this.logger.info(`Added job to queue for connection ${connection.id}`);
|
||||
}).catch((err: unknown) => {
|
||||
this.logger.error(`Failed to add job to queue for connection ${connection.id}: ${err}`);
|
||||
});
|
||||
}
|
||||
|
||||
public async registerPollingCallback() {
|
||||
setInterval(async () => {
|
||||
const connections = await this.db.connection.findMany({
|
||||
where: {
|
||||
syncStatus: ConnectionSyncStatus.SYNC_NEEDED,
|
||||
}
|
||||
});
|
||||
for (const connection of connections) {
|
||||
await this.scheduleConnectionSync(connection);
|
||||
}
|
||||
}, this.settings.resyncConnectionPollingIntervalMs);
|
||||
}
|
||||
|
||||
private async runSyncJob(job: Job<JobPayload>): Promise<JobResult> {
|
||||
const { config, orgId } = job.data;
|
||||
// @note: We aren't actually doing anything with this atm.
|
||||
const abortController = new AbortController();
|
||||
|
||||
const connection = await this.db.connection.findUnique({
|
||||
where: {
|
||||
id: job.data.connectionId,
|
||||
},
|
||||
});
|
||||
|
||||
if (!connection) {
|
||||
const e = new BackendException(BackendError.CONNECTION_SYNC_CONNECTION_NOT_FOUND, {
|
||||
message: `Connection ${job.data.connectionId} not found`,
|
||||
});
|
||||
Sentry.captureException(e);
|
||||
throw e;
|
||||
}
|
||||
|
||||
// Reset the syncStatusMetadata to an empty object at the start of the sync job
|
||||
await this.db.connection.update({
|
||||
where: {
|
||||
id: job.data.connectionId,
|
||||
},
|
||||
data: {
|
||||
syncStatus: ConnectionSyncStatus.SYNCING,
|
||||
syncStatusMetadata: {}
|
||||
}
|
||||
})
|
||||
|
||||
|
||||
let result: {
|
||||
repoData: RepoData[],
|
||||
notFound: {
|
||||
users: string[],
|
||||
orgs: string[],
|
||||
repos: string[],
|
||||
}
|
||||
} = {
|
||||
repoData: [],
|
||||
notFound: {
|
||||
users: [],
|
||||
orgs: [],
|
||||
repos: [],
|
||||
}
|
||||
};
|
||||
|
||||
try {
|
||||
result = await (async () => {
|
||||
switch (config.type) {
|
||||
case 'github': {
|
||||
return await compileGithubConfig(config, job.data.connectionId, orgId, this.db, abortController);
|
||||
}
|
||||
case 'gitlab': {
|
||||
return await compileGitlabConfig(config, job.data.connectionId, orgId, this.db);
|
||||
}
|
||||
case 'gitea': {
|
||||
return await compileGiteaConfig(config, job.data.connectionId, orgId, this.db);
|
||||
}
|
||||
case 'gerrit': {
|
||||
return await compileGerritConfig(config, job.data.connectionId, orgId);
|
||||
}
|
||||
}
|
||||
})();
|
||||
} catch (err) {
|
||||
this.logger.error(`Failed to compile repo data for connection ${job.data.connectionId}: ${err}`);
|
||||
Sentry.captureException(err);
|
||||
|
||||
if (err instanceof BackendException) {
|
||||
throw err;
|
||||
} else {
|
||||
throw new BackendException(BackendError.CONNECTION_SYNC_SYSTEM_ERROR, {
|
||||
message: `Failed to compile repo data for connection ${job.data.connectionId}`,
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
let { repoData, notFound } = result;
|
||||
|
||||
// Push the information regarding not found users, orgs, and repos to the connection's syncStatusMetadata. Note that
|
||||
// this won't be overwritten even if the connection job fails
|
||||
await this.db.connection.update({
|
||||
where: {
|
||||
id: job.data.connectionId,
|
||||
},
|
||||
data: {
|
||||
syncStatusMetadata: { notFound }
|
||||
}
|
||||
});
|
||||
|
||||
// Filter out any duplicates by external_id and external_codeHostUrl.
|
||||
repoData = repoData.filter((repo, index, self) => {
|
||||
return index === self.findIndex(r =>
|
||||
r.external_id === repo.external_id &&
|
||||
r.external_codeHostUrl === repo.external_codeHostUrl
|
||||
);
|
||||
})
|
||||
|
||||
// @note: to handle orphaned Repos we delete all RepoToConnection records for this connection,
|
||||
// and then recreate them when we upsert the repos. For example, if a repo is no-longer
|
||||
// captured by the connection's config (e.g., it was deleted, marked archived, etc.), it won't
|
||||
// appear in the repoData array above, and so the RepoToConnection record won't be re-created.
|
||||
// Repos that have no RepoToConnection records are considered orphaned and can be deleted.
|
||||
await this.db.$transaction(async (tx) => {
|
||||
const deleteStart = performance.now();
|
||||
await tx.connection.update({
|
||||
where: {
|
||||
id: job.data.connectionId,
|
||||
},
|
||||
data: {
|
||||
repos: {
|
||||
deleteMany: {}
|
||||
}
|
||||
}
|
||||
});
|
||||
const deleteDuration = performance.now() - deleteStart;
|
||||
this.logger.info(`Deleted all RepoToConnection records for connection ${job.data.connectionId} in ${deleteDuration}ms`);
|
||||
|
||||
const totalUpsertStart = performance.now();
|
||||
for (const repo of repoData) {
|
||||
const upsertStart = performance.now();
|
||||
await tx.repo.upsert({
|
||||
where: {
|
||||
external_id_external_codeHostUrl_orgId: {
|
||||
external_id: repo.external_id,
|
||||
external_codeHostUrl: repo.external_codeHostUrl,
|
||||
orgId: orgId,
|
||||
}
|
||||
},
|
||||
update: repo,
|
||||
create: repo,
|
||||
})
|
||||
const upsertDuration = performance.now() - upsertStart;
|
||||
this.logger.info(`Upserted repo ${repo.external_id} in ${upsertDuration}ms`);
|
||||
}
|
||||
const totalUpsertDuration = performance.now() - totalUpsertStart;
|
||||
this.logger.info(`Upserted ${repoData.length} repos in ${totalUpsertDuration}ms`);
|
||||
}, { timeout: env.CONNECTION_MANAGER_UPSERT_TIMEOUT_MS });
|
||||
|
||||
return {
|
||||
repoCount: repoData.length,
|
||||
};
|
||||
}
|
||||
|
||||
|
||||
private async onSyncJobCompleted(job: Job<JobPayload>, result: JobResult) {
|
||||
this.logger.info(`Connection sync job ${job.id} completed`);
|
||||
const { connectionId } = job.data;
|
||||
|
||||
let syncStatusMetadata: Record<string, unknown> = (await this.db.connection.findUnique({
|
||||
where: { id: connectionId },
|
||||
select: { syncStatusMetadata: true }
|
||||
}))?.syncStatusMetadata as Record<string, unknown> ?? {};
|
||||
const { notFound } = syncStatusMetadata as { notFound: {
|
||||
users: string[],
|
||||
orgs: string[],
|
||||
repos: string[],
|
||||
}};
|
||||
|
||||
await this.db.connection.update({
|
||||
where: {
|
||||
id: connectionId,
|
||||
},
|
||||
data: {
|
||||
syncStatus:
|
||||
notFound.users.length > 0 ||
|
||||
notFound.orgs.length > 0 ||
|
||||
notFound.repos.length > 0 ? ConnectionSyncStatus.SYNCED_WITH_WARNINGS : ConnectionSyncStatus.SYNCED,
|
||||
syncedAt: new Date()
|
||||
}
|
||||
})
|
||||
|
||||
captureEvent('backend_connection_sync_job_completed', {
|
||||
connectionId: connectionId,
|
||||
repoCount: result.repoCount,
|
||||
});
|
||||
}
|
||||
|
||||
private async onSyncJobFailed(job: Job<JobPayload> | undefined, err: unknown) {
|
||||
this.logger.info(`Connection sync job failed with error: ${err}`);
|
||||
Sentry.captureException(err, {
|
||||
tags: {
|
||||
connectionid: job?.data.connectionId,
|
||||
jobId: job?.id,
|
||||
queue: QUEUE_NAME,
|
||||
}
|
||||
});
|
||||
|
||||
if (job) {
|
||||
const { connectionId } = job.data;
|
||||
|
||||
captureEvent('backend_connection_sync_job_failed', {
|
||||
connectionId: connectionId,
|
||||
error: err instanceof BackendException ? err.code : 'UNKNOWN',
|
||||
});
|
||||
|
||||
// We may have pushed some metadata during the execution of the job, so we make sure to not overwrite the metadata here
|
||||
let syncStatusMetadata: Record<string, unknown> = (await this.db.connection.findUnique({
|
||||
where: { id: connectionId },
|
||||
select: { syncStatusMetadata: true }
|
||||
}))?.syncStatusMetadata as Record<string, unknown> ?? {};
|
||||
|
||||
if (err instanceof BackendException) {
|
||||
syncStatusMetadata = {
|
||||
...syncStatusMetadata,
|
||||
error: err.code,
|
||||
...err.metadata,
|
||||
}
|
||||
} else {
|
||||
syncStatusMetadata = {
|
||||
...syncStatusMetadata,
|
||||
error: 'UNKNOWN',
|
||||
}
|
||||
}
|
||||
|
||||
await this.db.connection.update({
|
||||
where: {
|
||||
id: connectionId,
|
||||
},
|
||||
data: {
|
||||
syncStatus: ConnectionSyncStatus.FAILED,
|
||||
syncedAt: new Date(),
|
||||
syncStatusMetadata: syncStatusMetadata as Prisma.InputJsonValue,
|
||||
}
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
public dispose() {
|
||||
this.worker.close();
|
||||
this.queue.close();
|
||||
}
|
||||
}
|
||||
|
||||

packages/backend/src/connectionUtils.ts (new file, 47 lines)
@@ -0,0 +1,47 @@
import * as Sentry from "@sentry/node";

type ValidResult<T> = {
    type: 'valid';
    data: T[];
};

type NotFoundResult = {
    type: 'notFound';
    value: string;
};

type CustomResult<T> = ValidResult<T> | NotFoundResult;

export function processPromiseResults<T>(
    results: PromiseSettledResult<CustomResult<T>>[],
): {
    validItems: T[];
    notFoundItems: string[];
} {
    const validItems: T[] = [];
    const notFoundItems: string[] = [];

    results.forEach(result => {
        if (result.status === 'fulfilled') {
            const value = result.value;
            if (value.type === 'valid') {
                validItems.push(...value.data);
            } else {
                notFoundItems.push(value.value);
            }
        }
    });

    return {
        validItems,
        notFoundItems,
    };
}

export function throwIfAnyFailed<T>(results: PromiseSettledResult<T>[]) {
    const failedResult = results.find(result => result.status === 'rejected');
    if (failedResult) {
        Sentry.captureException(failedResult.reason);
        throw failedResult.reason;
    }
}

@@ -6,7 +6,12 @@ import { Settings } from "./types.js";
export const DEFAULT_SETTINGS: Settings = {
    maxFileSize: 2 * 1024 * 1024, // 2MB in bytes
    maxTrigramCount: 20000,
    autoDeleteStaleRepos: true,
    reindexInterval: 1000 * 60 * 60, // 1 hour in milliseconds
    resyncInterval: 1000 * 60 * 60 * 24, // 1 day in milliseconds
    reindexIntervalMs: 1000 * 60 * 60, // 1 hour
    resyncConnectionPollingIntervalMs: 1000 * 1, // 1 second
    reindexRepoPollingIntervalMs: 1000 * 1, // 1 second
    maxConnectionSyncJobConcurrency: 8,
    maxRepoIndexingJobConcurrency: 8,
    maxRepoGarbageCollectionJobConcurrency: 8,
    repoGarbageCollectionGracePeriodMs: 10 * 1000, // 10 seconds
    repoIndexTimeoutMs: 1000 * 60 * 60 * 2, // 2 hours
}
|
|
@ -1,125 +0,0 @@
|
|||
import { expect, test } from 'vitest';
|
||||
import { DEFAULT_DB_DATA, migration_addDeleteStaleRepos, migration_addMaxFileSize, migration_addReindexInterval, migration_addResyncInterval, migration_addSettings, Schema } from './db';
|
||||
import { DEFAULT_SETTINGS } from './constants';
|
||||
import { DeepPartial } from './types';
|
||||
import { Low } from 'lowdb';
|
||||
|
||||
class InMemoryAdapter<T> {
|
||||
private data: T;
|
||||
async read() {
|
||||
return this.data;
|
||||
}
|
||||
async write(data: T) {
|
||||
this.data = data;
|
||||
}
|
||||
}
|
||||
|
||||
export const createMockDB = (defaultData: Schema = DEFAULT_DB_DATA) => {
|
||||
const db = new Low(new InMemoryAdapter<Schema>(), defaultData);
|
||||
return db;
|
||||
}
|
||||
|
||||
test('migration_addSettings adds the `settings` field with defaults if it does not exist', () => {
|
||||
const schema: DeepPartial<Schema> = {};
|
||||
|
||||
const migratedSchema = migration_addSettings(schema as Schema);
|
||||
expect(migratedSchema).toStrictEqual({
|
||||
settings: DEFAULT_SETTINGS,
|
||||
});
|
||||
});
|
||||
|
||||
test('migration_addMaxFileSize adds the `maxFileSize` field with the default value if it does not exist', () => {
|
||||
const schema: DeepPartial<Schema> = {
|
||||
settings: {},
|
||||
}
|
||||
|
||||
const migratedSchema = migration_addMaxFileSize(schema as Schema);
|
||||
expect(migratedSchema).toStrictEqual({
|
||||
settings: {
|
||||
maxFileSize: DEFAULT_SETTINGS.maxFileSize,
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
test('migration_addMaxFileSize will throw if `settings` is not defined', () => {
|
||||
const schema: DeepPartial<Schema> = {};
|
||||
expect(() => migration_addMaxFileSize(schema as Schema)).toThrow();
|
||||
});
|
||||
|
||||
test('migration_addDeleteStaleRepos adds the `autoDeleteStaleRepos` field with the default value if it does not exist', () => {
|
||||
const schema: DeepPartial<Schema> = {
|
||||
settings: {
|
||||
maxFileSize: DEFAULT_SETTINGS.maxFileSize,
|
||||
},
|
||||
}
|
||||
|
||||
const migratedSchema = migration_addDeleteStaleRepos(schema as Schema);
|
||||
expect(migratedSchema).toStrictEqual({
|
||||
settings: {
|
||||
maxFileSize: DEFAULT_SETTINGS.maxFileSize,
|
||||
autoDeleteStaleRepos: DEFAULT_SETTINGS.autoDeleteStaleRepos,
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
test('migration_addReindexInterval adds the `reindexInterval` field with the default value if it does not exist', () => {
|
||||
const schema: DeepPartial<Schema> = {
|
||||
settings: {
|
||||
maxFileSize: DEFAULT_SETTINGS.maxFileSize,
|
||||
autoDeleteStaleRepos: DEFAULT_SETTINGS.autoDeleteStaleRepos,
|
||||
},
|
||||
}
|
||||
|
||||
const migratedSchema = migration_addReindexInterval(schema as Schema);
|
||||
expect(migratedSchema).toStrictEqual({
|
||||
settings: {
|
||||
maxFileSize: DEFAULT_SETTINGS.maxFileSize,
|
||||
autoDeleteStaleRepos: DEFAULT_SETTINGS.autoDeleteStaleRepos,
|
||||
reindexInterval: DEFAULT_SETTINGS.reindexInterval,
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
test('migration_addReindexInterval preserves existing reindexInterval value if already set', () => {
|
||||
const customInterval = 60;
|
||||
const schema: DeepPartial<Schema> = {
|
||||
settings: {
|
||||
maxFileSize: DEFAULT_SETTINGS.maxFileSize,
|
||||
reindexInterval: customInterval,
|
||||
},
|
||||
}
|
||||
|
||||
const migratedSchema = migration_addReindexInterval(schema as Schema);
|
||||
expect(migratedSchema.settings.reindexInterval).toBe(customInterval);
|
||||
});
|
||||
|
||||
test('migration_addResyncInterval adds the `resyncInterval` field with the default value if it does not exist', () => {
|
||||
const schema: DeepPartial<Schema> = {
|
||||
settings: {
|
||||
maxFileSize: DEFAULT_SETTINGS.maxFileSize,
|
||||
autoDeleteStaleRepos: DEFAULT_SETTINGS.autoDeleteStaleRepos,
|
||||
},
|
||||
}
|
||||
|
||||
const migratedSchema = migration_addResyncInterval(schema as Schema);
|
||||
expect(migratedSchema).toStrictEqual({
|
||||
settings: {
|
||||
maxFileSize: DEFAULT_SETTINGS.maxFileSize,
|
||||
autoDeleteStaleRepos: DEFAULT_SETTINGS.autoDeleteStaleRepos,
|
||||
resyncInterval: DEFAULT_SETTINGS.resyncInterval,
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
test('migration_addResyncInterval preserves existing resyncInterval value if already set', () => {
|
||||
const customInterval = 120;
|
||||
const schema: DeepPartial<Schema> = {
|
||||
settings: {
|
||||
maxFileSize: DEFAULT_SETTINGS.maxFileSize,
|
||||
resyncInterval: customInterval,
|
||||
},
|
||||
}
|
||||
|
||||
const migratedSchema = migration_addResyncInterval(schema as Schema);
|
||||
expect(migratedSchema.settings.resyncInterval).toBe(customInterval);
|
||||
});
|
||||
|
|
@ -1,123 +0,0 @@
|
|||
import { JSONFilePreset } from "lowdb/node";
|
||||
import { type Low } from "lowdb";
|
||||
import { AppContext, Repository, Settings } from "./types.js";
|
||||
import { DEFAULT_SETTINGS } from "./constants.js";
|
||||
import { createLogger } from "./logger.js";
|
||||
|
||||
const logger = createLogger('db');
|
||||
|
||||
export type Schema = {
|
||||
settings: Settings,
|
||||
repos: {
|
||||
[key: string]: Repository;
|
||||
}
|
||||
}
|
||||
|
||||
export const DEFAULT_DB_DATA: Schema = {
|
||||
repos: {},
|
||||
settings: DEFAULT_SETTINGS,
|
||||
}
|
||||
|
||||
export type Database = Low<Schema>;
|
||||
|
||||
export const loadDB = async (ctx: AppContext): Promise<Database> => {
|
||||
const db = await JSONFilePreset<Schema>(`${ctx.cachePath}/db.json`, DEFAULT_DB_DATA);
|
||||
|
||||
await applyMigrations(db);
|
||||
|
||||
return db;
|
||||
}
|
||||
|
||||
export const updateRepository = async (repoId: string, data: Repository, db: Database) => {
|
||||
db.data.repos[repoId] = {
|
||||
...db.data.repos[repoId],
|
||||
...data,
|
||||
}
|
||||
await db.write();
|
||||
}
|
||||
|
||||
export const updateSettings = async (settings: Settings, db: Database) => {
|
||||
db.data.settings = settings;
|
||||
await db.write();
|
||||
}
|
||||
|
||||
export const createRepository = async (repo: Repository, db: Database) => {
|
||||
db.data.repos[repo.id] = repo;
|
||||
await db.write();
|
||||
}
|
||||
|
||||
export const applyMigrations = async (db: Database) => {
|
||||
const log = (name: string) => {
|
||||
logger.info(`Applying migration '${name}'`);
|
||||
}
|
||||
|
||||
await db.update((schema) => {
|
||||
// @NOTE: please ensure new migrations are added after older ones!
|
||||
schema = migration_addSettings(schema, log);
|
||||
schema = migration_addMaxFileSize(schema, log);
|
||||
schema = migration_addDeleteStaleRepos(schema, log);
|
||||
schema = migration_addReindexInterval(schema, log);
|
||||
schema = migration_addResyncInterval(schema, log);
|
||||
return schema;
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* @see: https://github.com/sourcebot-dev/sourcebot/pull/118
|
||||
*/
|
||||
export const migration_addSettings = (schema: Schema, log?: (name: string) => void) => {
|
||||
if (!schema.settings) {
|
||||
log?.("addSettings");
|
||||
schema.settings = DEFAULT_SETTINGS;
|
||||
}
|
||||
|
||||
return schema;
|
||||
}
|
||||
|
||||
/**
|
||||
* @see: https://github.com/sourcebot-dev/sourcebot/pull/118
|
||||
*/
|
||||
export const migration_addMaxFileSize = (schema: Schema, log?: (name: string) => void) => {
|
||||
if (!schema.settings.maxFileSize) {
|
||||
log?.("addMaxFileSize");
|
||||
schema.settings.maxFileSize = DEFAULT_SETTINGS.maxFileSize;
|
||||
}
|
||||
|
||||
return schema;
|
||||
}
|
||||
|
||||
/**
|
||||
* @see: https://github.com/sourcebot-dev/sourcebot/pull/128
|
||||
*/
|
||||
export const migration_addDeleteStaleRepos = (schema: Schema, log?: (name: string) => void) => {
|
||||
if (schema.settings.autoDeleteStaleRepos === undefined) {
|
||||
log?.("addDeleteStaleRepos");
|
||||
schema.settings.autoDeleteStaleRepos = DEFAULT_SETTINGS.autoDeleteStaleRepos;
|
||||
}
|
||||
|
||||
return schema;
|
||||
}
|
||||
|
||||
/**
|
||||
* @see: https://github.com/sourcebot-dev/sourcebot/pull/134
|
||||
*/
|
||||
export const migration_addReindexInterval = (schema: Schema, log?: (name: string) => void) => {
|
||||
if (schema.settings.reindexInterval === undefined) {
|
||||
log?.("addReindexInterval");
|
||||
schema.settings.reindexInterval = DEFAULT_SETTINGS.reindexInterval;
|
||||
}
|
||||
|
||||
return schema;
|
||||
}
|
||||
|
||||
/**
|
||||
* @see: https://github.com/sourcebot-dev/sourcebot/pull/134
|
||||
*/
|
||||
export const migration_addResyncInterval = (schema: Schema, log?: (name: string) => void) => {
|
||||
if (schema.settings.resyncInterval === undefined) {
|
||||
log?.("addResyncInterval");
|
||||
schema.settings.resyncInterval = DEFAULT_SETTINGS.resyncInterval;
|
||||
}
|
||||
|
||||
return schema;
|
||||
}
|
||||

packages/backend/src/env.ts (new file, 52 lines)
@@ -0,0 +1,52 @@
import { createEnv } from "@t3-oss/env-core";
import { z } from "zod";
import dotenv from 'dotenv';

// Booleans are specified as 'true' or 'false' strings.
const booleanSchema = z.enum(["true", "false"]);

// Numbers are treated as strings in .env files.
// coerce helps us convert them to numbers.
// @see: https://zod.dev/?id=coercion-for-primitives
const numberSchema = z.coerce.number();

dotenv.config({
    path: './.env',
});

dotenv.config({
    path: './.env.local',
    override: true
});

export const env = createEnv({
    server: {
        SOURCEBOT_ENCRYPTION_KEY: z.string(),
        SOURCEBOT_LOG_LEVEL: z.enum(["info", "debug", "warn", "error"]).default("info"),
        SOURCEBOT_TELEMETRY_DISABLED: booleanSchema.default("false"),
        SOURCEBOT_INSTALL_ID: z.string().default("unknown"),
        NEXT_PUBLIC_SOURCEBOT_VERSION: z.string().default("unknown"),

        NEXT_PUBLIC_POSTHOG_PAPIK: z.string().optional(),

        FALLBACK_GITHUB_CLOUD_TOKEN: z.string().optional(),
        FALLBACK_GITLAB_CLOUD_TOKEN: z.string().optional(),
        FALLBACK_GITEA_CLOUD_TOKEN: z.string().optional(),

        REDIS_URL: z.string().url().default("redis://localhost:6379"),

        NEXT_PUBLIC_SENTRY_BACKEND_DSN: z.string().optional(),
        NEXT_PUBLIC_SENTRY_ENVIRONMENT: z.string().optional(),

        LOGTAIL_TOKEN: z.string().optional(),
        LOGTAIL_HOST: z.string().url().optional(),

        DATABASE_URL: z.string().url().default("postgresql://postgres:postgres@localhost:5432/postgres"),
        CONFIG_PATH: z.string().optional(),

        CONNECTION_MANAGER_UPSERT_TIMEOUT_MS: numberSchema.default(10000),
    },
    runtimeEnv: process.env,
    emptyStringAsUndefined: true,
    skipValidation: process.env.SKIP_ENV_VALIDATION === "1",
});

(deleted file)
@@ -1,23 +0,0 @@
import dotenv from 'dotenv';

export const getEnv = (env: string | undefined, defaultValue?: string) => {
    return env ?? defaultValue;
}

export const getEnvBoolean = (env: string | undefined, defaultValue: boolean) => {
    if (!env) {
        return defaultValue;
    }
    return env === 'true' || env === '1';
}

dotenv.config({
    path: './.env',
});

export const SOURCEBOT_LOG_LEVEL = getEnv(process.env.SOURCEBOT_LOG_LEVEL, 'info')!;
export const SOURCEBOT_TELEMETRY_DISABLED = getEnvBoolean(process.env.SOURCEBOT_TELEMETRY_DISABLED, false)!;
export const SOURCEBOT_INSTALL_ID = getEnv(process.env.SOURCEBOT_INSTALL_ID, 'unknown')!;
export const SOURCEBOT_VERSION = getEnv(process.env.SOURCEBOT_VERSION, 'unknown')!;
export const POSTHOG_PAPIK = getEnv(process.env.POSTHOG_PAPIK);
export const POSTHOG_HOST = getEnv(process.env.POSTHOG_HOST);
|
|
@ -1,9 +1,11 @@
|
|||
import fetch from 'cross-fetch';
|
||||
import { GerritConfig } from './schemas/v2.js';
|
||||
import { AppContext, GitRepository } from './types.js';
|
||||
import { GerritConfig } from "@sourcebot/schemas/v2/index.type"
|
||||
import { createLogger } from './logger.js';
|
||||
import path from 'path';
|
||||
import { measure, marshalBool, excludeReposByName, includeReposByName } from './utils.js';
|
||||
import micromatch from "micromatch";
|
||||
import { measure, fetchWithRetry } from './utils.js';
|
||||
import { BackendError } from '@sourcebot/error';
|
||||
import { BackendException } from '@sourcebot/error';
|
||||
import * as Sentry from "@sentry/node";
|
||||
|
||||
// https://gerrit-review.googlesource.com/Documentation/rest-api.html
|
||||
interface GerritProjects {
|
||||
|
|
@ -16,6 +18,13 @@ interface GerritProjectInfo {
|
|||
web_links?: GerritWebLink[];
|
||||
}
|
||||
|
||||
interface GerritProject {
|
||||
name: string;
|
||||
id: string;
|
||||
state?: string;
|
||||
web_links?: GerritWebLink[];
|
||||
}
|
||||
|
||||
interface GerritWebLink {
|
||||
name: string;
|
||||
url: string;
|
||||
|
|
@ -23,86 +32,55 @@ interface GerritWebLink {
|
|||
|
||||
const logger = createLogger('Gerrit');
|
||||
|
||||
export const getGerritReposFromConfig = async (config: GerritConfig, ctx: AppContext): Promise<GitRepository[]> => {
|
||||
|
||||
export const getGerritReposFromConfig = async (config: GerritConfig): Promise<GerritProject[]> => {
|
||||
const url = config.url.endsWith('/') ? config.url : `${config.url}/`;
|
||||
const hostname = new URL(config.url).hostname;
|
||||
|
||||
const { durationMs, data: projects } = await measure(async () => {
|
||||
let { durationMs, data: projects } = await measure(async () => {
|
||||
try {
|
||||
return fetchAllProjects(url)
|
||||
const fetchFn = () => fetchAllProjects(url);
|
||||
return fetchWithRetry(fetchFn, `projects from ${url}`, logger);
|
||||
} catch (err) {
|
||||
Sentry.captureException(err);
|
||||
if (err instanceof BackendException) {
|
||||
throw err;
|
||||
}
|
||||
|
||||
logger.error(`Failed to fetch projects from ${url}`, err);
|
||||
return null;
|
||||
}
|
||||
});
|
||||
|
||||
if (!projects) {
|
||||
return [];
|
||||
const e = new Error(`Failed to fetch projects from ${url}`);
|
||||
Sentry.captureException(e);
|
||||
throw e;
|
||||
}
|
||||
|
||||
// exclude "All-Projects" and "All-Users" projects
|
||||
delete projects['All-Projects'];
|
||||
delete projects['All-Users'];
|
||||
delete projects['All-Avatars']
|
||||
delete projects['All-Archived-Projects']
|
||||
|
||||
logger.debug(`Fetched ${Object.keys(projects).length} projects in ${durationMs}ms.`);
|
||||
|
||||
let repos: GitRepository[] = Object.keys(projects).map((projectName) => {
|
||||
const project = projects[projectName];
|
||||
let webUrl = "https://www.gerritcodereview.com/";
|
||||
// Gerrit projects can have multiple web links; use the first one
|
||||
if (project.web_links) {
|
||||
const webLink = project.web_links[0];
|
||||
if (webLink) {
|
||||
webUrl = webLink.url;
|
||||
}
|
||||
}
|
||||
const repoId = `${hostname}/${projectName}`;
|
||||
const repoPath = path.resolve(path.join(ctx.reposPath, `${repoId}.git`));
|
||||
|
||||
const cloneUrl = `${url}${encodeURIComponent(projectName)}`;
|
||||
|
||||
return {
|
||||
vcs: 'git',
|
||||
codeHost: 'gerrit',
|
||||
name: projectName,
|
||||
id: repoId,
|
||||
cloneUrl: cloneUrl,
|
||||
path: repoPath,
|
||||
isStale: false, // Gerrit projects are typically not stale
|
||||
isFork: false, // Gerrit doesn't have forks in the same way as GitHub
|
||||
isArchived: false,
|
||||
gitConfigMetadata: {
|
||||
// Gerrit uses Gitiles for web UI. This can sometimes be "browse" type in zoekt
|
||||
'zoekt.web-url-type': 'gitiles',
|
||||
'zoekt.web-url': webUrl,
|
||||
'zoekt.name': repoId,
|
||||
'zoekt.archived': marshalBool(false),
|
||||
'zoekt.fork': marshalBool(false),
|
||||
'zoekt.public': marshalBool(true), // Assuming projects are public; adjust as needed
|
||||
},
|
||||
branches: [],
|
||||
tags: []
|
||||
} satisfies GitRepository;
|
||||
});
|
||||
|
||||
const excludedProjects = ['All-Projects', 'All-Users', 'All-Avatars', 'All-Archived-Projects'];
|
||||
projects = projects.filter(project => !excludedProjects.includes(project.name));
|
||||
|
||||
// include repos by glob if specified in config
|
||||
if (config.projects) {
|
||||
repos = includeReposByName(repos, config.projects);
|
||||
projects = projects.filter((project) => {
|
||||
return micromatch.isMatch(project.name, config.projects!);
|
||||
});
|
||||
}
|
||||
|
||||
|
||||
if (config.exclude && config.exclude.projects) {
|
||||
repos = excludeReposByName(repos, config.exclude.projects);
|
||||
projects = projects.filter((project) => {
|
||||
return !micromatch.isMatch(project.name, config.exclude!.projects!);
|
||||
});
|
||||
}
|
||||
|
||||
return repos;
|
||||
logger.debug(`Fetched ${Object.keys(projects).length} projects in ${durationMs}ms.`);
|
||||
return projects;
|
||||
};
|
||||
|
||||
const fetchAllProjects = async (url: string): Promise<GerritProjects> => {
|
||||
const fetchAllProjects = async (url: string): Promise<GerritProject[]> => {
|
||||
const projectsEndpoint = `${url}projects/`;
|
||||
let allProjects: GerritProjects = {};
|
||||
let allProjects: GerritProject[] = [];
|
||||
let start = 0; // Start offset for pagination
|
||||
let hasMoreProjects = true;
|
||||
|
||||
|
|
@ -110,17 +88,43 @@ const fetchAllProjects = async (url: string): Promise<GerritProjects> => {
|
|||
const endpointWithParams = `${projectsEndpoint}?S=${start}`;
|
||||
logger.debug(`Fetching projects from Gerrit at ${endpointWithParams}`);
|
||||
|
||||
const response = await fetch(endpointWithParams);
|
||||
if (!response.ok) {
|
||||
throw new Error(`Failed to fetch projects from Gerrit: ${response.statusText}`);
|
||||
let response: Response;
|
||||
try {
|
||||
response = await fetch(endpointWithParams);
|
||||
if (!response.ok) {
|
||||
console.log(`Failed to fetch projects from Gerrit at ${endpointWithParams} with status ${response.status}`);
|
||||
const e = new BackendException(BackendError.CONNECTION_SYNC_FAILED_TO_FETCH_GERRIT_PROJECTS, {
|
||||
status: response.status,
|
||||
});
|
||||
Sentry.captureException(e);
|
||||
throw e;
|
||||
}
|
||||
} catch (err) {
|
||||
Sentry.captureException(err);
|
||||
if (err instanceof BackendException) {
|
||||
throw err;
|
||||
}
|
||||
|
||||
const status = (err as any).code;
|
||||
console.log(`Failed to fetch projects from Gerrit at ${endpointWithParams} with status ${status}`);
|
||||
throw new BackendException(BackendError.CONNECTION_SYNC_FAILED_TO_FETCH_GERRIT_PROJECTS, {
|
||||
status: status,
|
||||
});
|
||||
}
|
||||
|
||||
const text = await response.text();
|
||||
const jsonText = text.replace(")]}'\n", ''); // Remove XSSI protection prefix
|
||||
const data: GerritProjects = JSON.parse(jsonText);
|
||||
|
||||
// Merge the current batch of projects with allProjects
|
||||
Object.assign(allProjects, data);
|
||||
// Add fetched projects to allProjects
|
||||
for (const [projectName, projectInfo] of Object.entries(data)) {
|
||||
allProjects.push({
|
||||
name: projectName,
|
||||
id: projectInfo.id,
|
||||
state: projectInfo.state,
|
||||
web_links: projectInfo.web_links
|
||||
})
|
||||
}
|
||||
|
||||
// Check if there are more projects to fetch
|
||||
hasMoreProjects = Object.values(data).some(
|
||||
|
|
|
|||
|
|
@ -1,130 +1,66 @@
|
|||
import { GitRepository, AppContext } from './types.js';
|
||||
import { simpleGit, SimpleGitProgressEvent } from 'simple-git';
|
||||
import { existsSync } from 'fs';
|
||||
import { createLogger } from './logger.js';
|
||||
import { GitConfig } from './schemas/v2.js';
|
||||
import path from 'path';
|
||||
|
||||
const logger = createLogger('git');
|
||||
|
||||
export const cloneRepository = async (repo: GitRepository, onProgress?: (event: SimpleGitProgressEvent) => void) => {
|
||||
if (existsSync(repo.path)) {
|
||||
logger.warn(`${repo.id} already exists. Skipping clone.`)
|
||||
return;
|
||||
}
|
||||
|
||||
export const cloneRepository = async (cloneURL: string, path: string, gitConfig?: Record<string, string>, onProgress?: (event: SimpleGitProgressEvent) => void) => {
|
||||
const git = simpleGit({
|
||||
progress: onProgress,
|
||||
});
|
||||
|
||||
const gitConfig = Object.entries(repo.gitConfigMetadata ?? {}).flatMap(
|
||||
const configParams = Object.entries(gitConfig ?? {}).flatMap(
|
||||
([key, value]) => ['--config', `${key}=${value}`]
|
||||
);
|
||||
|
||||
await git.clone(
|
||||
repo.cloneUrl,
|
||||
repo.path,
|
||||
[
|
||||
"--bare",
|
||||
...gitConfig
|
||||
]
|
||||
);
|
||||
try {
|
||||
await git.clone(
|
||||
cloneURL,
|
||||
path,
|
||||
[
|
||||
"--bare",
|
||||
...configParams
|
||||
]
|
||||
);
|
||||
|
||||
await git.cwd({
|
||||
path: repo.path,
|
||||
}).addConfig("remote.origin.fetch", "+refs/heads/*:refs/heads/*");
|
||||
await git.cwd({
|
||||
path,
|
||||
}).addConfig("remote.origin.fetch", "+refs/heads/*:refs/heads/*");
|
||||
} catch (error) {
|
||||
throw new Error(`Failed to clone repository`);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
export const fetchRepository = async (repo: GitRepository, onProgress?: (event: SimpleGitProgressEvent) => void) => {
|
||||
export const fetchRepository = async (path: string, onProgress?: (event: SimpleGitProgressEvent) => void) => {
|
||||
const git = simpleGit({
|
||||
progress: onProgress,
|
||||
});
|
||||
|
||||
await git.cwd({
|
||||
path: repo.path,
|
||||
}).fetch(
|
||||
"origin",
|
||||
[
|
||||
"--prune",
|
||||
"--progress"
|
||||
]
|
||||
);
|
||||
}
|
||||
|
||||
const isValidGitRepo = async (url: string): Promise<boolean> => {
|
||||
const git = simpleGit();
|
||||
try {
|
||||
await git.listRemote([url]);
|
||||
return true;
|
||||
await git.cwd({
|
||||
path: path,
|
||||
}).fetch(
|
||||
"origin",
|
||||
[
|
||||
"--prune",
|
||||
"--progress"
|
||||
]
|
||||
);
|
||||
} catch (error) {
|
||||
logger.debug(`Error checking if ${url} is a valid git repo: ${error}`);
|
||||
return false;
|
||||
throw new Error(`Failed to fetch repository ${path}`);
|
||||
}
|
||||
}
|
||||
|
||||
const stripProtocolAndGitSuffix = (url: string): string => {
|
||||
return url.replace(/^[a-zA-Z]+:\/\//, '').replace(/\.git$/, '');
|
||||
export const getBranches = async (path: string) => {
|
||||
const git = simpleGit();
|
||||
const branches = await git.cwd({
|
||||
path,
|
||||
}).branch();
|
||||
|
||||
return branches.all;
|
||||
}
|
||||
|
||||
const getRepoNameFromUrl = (url: string): string => {
|
||||
const strippedUrl = stripProtocolAndGitSuffix(url);
|
||||
return strippedUrl.split('/').slice(-2).join('/');
|
||||
}
|
||||
|
||||
export const getGitRepoFromConfig = async (config: GitConfig, ctx: AppContext) => {
|
||||
const repoValid = await isValidGitRepo(config.url);
|
||||
if (!repoValid) {
|
||||
logger.error(`Git repo provided in config with url ${config.url} is not valid`);
|
||||
return null;
|
||||
}
|
||||
|
||||
const cloneUrl = config.url;
|
||||
const repoId = stripProtocolAndGitSuffix(cloneUrl);
|
||||
const repoName = getRepoNameFromUrl(config.url);
|
||||
const repoPath = path.resolve(path.join(ctx.reposPath, `${repoId}.git`));
|
||||
const repo: GitRepository = {
|
||||
vcs: 'git',
|
||||
id: repoId,
|
||||
name: repoName,
|
||||
path: repoPath,
|
||||
isStale: false,
|
||||
cloneUrl: cloneUrl,
|
||||
branches: [],
|
||||
tags: [],
|
||||
}
|
||||
|
||||
if (config.revisions) {
|
||||
if (config.revisions.branches) {
|
||||
const branchGlobs = config.revisions.branches;
|
||||
const git = simpleGit();
|
||||
const branchList = await git.listRemote(['--heads', cloneUrl]);
|
||||
const branches = branchList
|
||||
.split('\n')
|
||||
.map(line => line.split('\t')[1])
|
||||
.filter(Boolean)
|
||||
.map(branch => branch.replace('refs/heads/', ''));
|
||||
|
||||
repo.branches = branches.filter(branch =>
|
||||
branchGlobs.some(glob => new RegExp(glob).test(branch))
|
||||
);
|
||||
}
|
||||
|
||||
if (config.revisions.tags) {
|
||||
const tagGlobs = config.revisions.tags;
|
||||
const git = simpleGit();
|
||||
const tagList = await git.listRemote(['--tags', cloneUrl]);
|
||||
const tags = tagList
|
||||
.split('\n')
|
||||
.map(line => line.split('\t')[1])
|
||||
.filter(Boolean)
|
||||
.map(tag => tag.replace('refs/tags/', ''));
|
||||
|
||||
repo.tags = tags.filter(tag =>
|
||||
tagGlobs.some(glob => new RegExp(glob).test(tag))
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
return repo;
|
||||
export const getTags = async (path: string) => {
|
||||
const git = simpleGit();
|
||||
const tags = await git.cwd({
|
||||
path,
|
||||
}).tags();
|
||||
return tags.all;
|
||||
}
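For reference, a minimal sketch of how the path-based helpers above (fetchRepository, getBranches, getTags) could be composed by a caller. The refreshRepository wrapper and repoPath argument are illustrative, not part of this diff:

import { SimpleGitProgressEvent } from 'simple-git';
import { fetchRepository, getBranches, getTags } from './git.js';

// Hypothetical caller: fetch the latest refs for an already-cloned repo,
// then list the branches and tags that should be indexed.
const refreshRepository = async (repoPath: string) => {
    await fetchRepository(repoPath, (event: SimpleGitProgressEvent) => {
        console.debug(`git fetch ${event.stage}: ${event.progress}%`);
    });

    const [branches, tags] = await Promise.all([
        getBranches(repoPath),
        getTags(repoPath),
    ]);

    return { branches, tags };
};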
|
||||
|
|
@@ -1,160 +1,132 @@
|
|||
import { Api, giteaApi, HttpResponse, Repository as GiteaRepository } from 'gitea-js';
|
||||
import { GiteaConfig } from './schemas/v2.js';
|
||||
import { excludeArchivedRepos, excludeForkedRepos, excludeReposByName, getTokenFromConfig, marshalBool, measure } from './utils.js';
|
||||
import { AppContext, GitRepository } from './types.js';
|
||||
import { GiteaConnectionConfig } from '@sourcebot/schemas/v3/gitea.type';
|
||||
import { getTokenFromConfig, measure } from './utils.js';
|
||||
import fetch from 'cross-fetch';
|
||||
import { createLogger } from './logger.js';
|
||||
import path from 'path';
|
||||
import micromatch from 'micromatch';
|
||||
import { PrismaClient } from '@sourcebot/db';
|
||||
import { processPromiseResults, throwIfAnyFailed } from './connectionUtils.js';
|
||||
import * as Sentry from "@sentry/node";
|
||||
import { env } from './env.js';
|
||||
|
||||
const logger = createLogger('Gitea');
|
||||
const GITEA_CLOUD_HOSTNAME = "gitea.com";
|
||||
|
||||
export const getGiteaReposFromConfig = async (config: GiteaConfig, ctx: AppContext) => {
|
||||
const token = config.token ? getTokenFromConfig(config.token, ctx) : undefined;
|
||||
export const getGiteaReposFromConfig = async (config: GiteaConnectionConfig, orgId: number, db: PrismaClient) => {
|
||||
const hostname = config.url ?
|
||||
new URL(config.url).hostname :
|
||||
GITEA_CLOUD_HOSTNAME;
|
||||
|
||||
const token = config.token ?
|
||||
await getTokenFromConfig(config.token, orgId, db, logger) :
|
||||
hostname === GITEA_CLOUD_HOSTNAME ?
|
||||
env.FALLBACK_GITEA_CLOUD_TOKEN :
|
||||
undefined;
|
||||
|
||||
const api = giteaApi(config.url ?? 'https://gitea.com', {
|
||||
token,
|
||||
token: token,
|
||||
customFetch: fetch,
|
||||
});
|
||||
|
||||
let allRepos: GiteaRepository[] = [];
|
||||
let notFound: {
|
||||
users: string[],
|
||||
orgs: string[],
|
||||
repos: string[],
|
||||
} = {
|
||||
users: [],
|
||||
orgs: [],
|
||||
repos: [],
|
||||
};
|
||||
|
||||
if (config.orgs) {
|
||||
const _repos = await getReposForOrgs(config.orgs, api);
|
||||
allRepos = allRepos.concat(_repos);
|
||||
const { validRepos, notFoundOrgs } = await getReposForOrgs(config.orgs, api);
|
||||
allRepos = allRepos.concat(validRepos);
|
||||
notFound.orgs = notFoundOrgs;
|
||||
}
|
||||
|
||||
if (config.repos) {
|
||||
const _repos = await getRepos(config.repos, api);
|
||||
allRepos = allRepos.concat(_repos);
|
||||
const { validRepos, notFoundRepos } = await getRepos(config.repos, api);
|
||||
allRepos = allRepos.concat(validRepos);
|
||||
notFound.repos = notFoundRepos;
|
||||
}
|
||||
|
||||
if (config.users) {
|
||||
const _repos = await getReposOwnedByUsers(config.users, api);
|
||||
allRepos = allRepos.concat(_repos);
|
||||
const { validRepos, notFoundUsers } = await getReposOwnedByUsers(config.users, api);
|
||||
allRepos = allRepos.concat(validRepos);
|
||||
notFound.users = notFoundUsers;
|
||||
}
|
||||
|
||||
allRepos = allRepos.filter(repo => repo.full_name !== undefined);
|
||||
allRepos = allRepos.filter(repo => {
|
||||
if (repo.full_name === undefined) {
|
||||
logger.warn(`Repository with undefined full_name found: orgId=${orgId}, repoId=${repo.id}`);
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
});
|
||||
|
||||
let repos: GitRepository[] = allRepos
|
||||
.map((repo) => {
|
||||
const hostname = config.url ? new URL(config.url).hostname : 'gitea.com';
|
||||
const repoId = `${hostname}/${repo.full_name!}`;
|
||||
const repoPath = path.resolve(path.join(ctx.reposPath, `${repoId}.git`));
|
||||
let repos = allRepos
|
||||
.filter((repo) => {
|
||||
const isExcluded = shouldExcludeRepo({
|
||||
repo,
|
||||
exclude: config.exclude,
|
||||
});
|
||||
|
||||
const cloneUrl = new URL(repo.clone_url!);
|
||||
if (token) {
|
||||
cloneUrl.username = token;
|
||||
}
|
||||
|
||||
return {
|
||||
vcs: 'git',
|
||||
codeHost: 'gitea',
|
||||
name: repo.full_name!,
|
||||
id: repoId,
|
||||
cloneUrl: cloneUrl.toString(),
|
||||
path: repoPath,
|
||||
isStale: false,
|
||||
isFork: repo.fork!,
|
||||
isArchived: !!repo.archived,
|
||||
gitConfigMetadata: {
|
||||
'zoekt.web-url-type': 'gitea',
|
||||
'zoekt.web-url': repo.html_url!,
|
||||
'zoekt.name': repoId,
|
||||
'zoekt.archived': marshalBool(repo.archived),
|
||||
'zoekt.fork': marshalBool(repo.fork!),
|
||||
'zoekt.public': marshalBool(repo.internal === false && repo.private === false),
|
||||
},
|
||||
branches: [],
|
||||
tags: []
|
||||
} satisfies GitRepository;
|
||||
return !isExcluded;
|
||||
});
|
||||
|
||||
if (config.exclude) {
|
||||
if (!!config.exclude.forks) {
|
||||
repos = excludeForkedRepos(repos, logger);
|
||||
}
|
||||
|
||||
if (!!config.exclude.archived) {
|
||||
repos = excludeArchivedRepos(repos, logger);
|
||||
}
|
||||
|
||||
if (config.exclude.repos) {
|
||||
repos = excludeReposByName(repos, config.exclude.repos, logger);
|
||||
}
|
||||
}
|
||||
|
||||
logger.debug(`Found ${repos.length} total repositories.`);
|
||||
return {
|
||||
validRepos: repos,
|
||||
notFound,
|
||||
};
|
||||
}
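A hedged example of how the reworked getGiteaReposFromConfig could be invoked; the syncGiteaConnection wrapper and its arguments are placeholders for illustration, not code from the repository:

import { PrismaClient } from '@sourcebot/db';
import { GiteaConnectionConfig } from '@sourcebot/schemas/v3/gitea.type';
import { getGiteaReposFromConfig } from './gitea.js';

const syncGiteaConnection = async (config: GiteaConnectionConfig, orgId: number, db: PrismaClient) => {
    // Returns the repos that resolved, plus any orgs/users/repos that came back 404.
    const { validRepos, notFound } = await getGiteaReposFromConfig(config, orgId, db);
    console.log(`Resolved ${validRepos.length} Gitea repos; not found:`, notFound);
    return validRepos;
};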
|
||||
|
||||
if (config.revisions) {
|
||||
if (config.revisions.branches) {
|
||||
const branchGlobs = config.revisions.branches;
|
||||
repos = await Promise.all(
|
||||
repos.map(async (repo) => {
|
||||
const [owner, name] = repo.name.split('/');
|
||||
let branches = (await getBranchesForRepo(owner, name, api)).map(branch => branch.name!);
|
||||
branches = micromatch.match(branches, branchGlobs);
|
||||
|
||||
return {
|
||||
...repo,
|
||||
branches,
|
||||
};
|
||||
})
|
||||
)
|
||||
}
|
||||
|
||||
if (config.revisions.tags) {
|
||||
const tagGlobs = config.revisions.tags;
|
||||
repos = await Promise.all(
|
||||
repos.map(async (repo) => {
|
||||
const [owner, name] = repo.name.split('/');
|
||||
let tags = (await getTagsForRepo(owner, name, api)).map(tag => tag.name!);
|
||||
tags = micromatch.match(tags, tagGlobs);
|
||||
|
||||
return {
|
||||
...repo,
|
||||
tags,
|
||||
};
|
||||
})
|
||||
)
|
||||
}
|
||||
const shouldExcludeRepo = ({
|
||||
repo,
|
||||
exclude
|
||||
} : {
|
||||
repo: GiteaRepository,
|
||||
exclude?: {
|
||||
forks?: boolean,
|
||||
archived?: boolean,
|
||||
repos?: string[],
|
||||
}
|
||||
}) => {
|
||||
let reason = '';
|
||||
const repoName = repo.full_name!;
|
||||
|
||||
const shouldExclude = (() => {
|
||||
if (!!exclude?.forks && repo.fork) {
|
||||
reason = `\`exclude.forks\` is true`;
|
||||
return true;
|
||||
}
|
||||
|
||||
return repos;
|
||||
}
|
||||
if (!!exclude?.archived && !!repo.archived) {
|
||||
reason = `\`exclude.archived\` is true`;
|
||||
return true;
|
||||
}
|
||||
|
||||
const getTagsForRepo = async <T>(owner: string, repo: string, api: Api<T>) => {
|
||||
try {
|
||||
logger.debug(`Fetching tags for repo ${owner}/${repo}...`);
|
||||
const { durationMs, data: tags } = await measure(() =>
|
||||
paginate((page) => api.repos.repoListTags(owner, repo, {
|
||||
page
|
||||
}))
|
||||
);
|
||||
logger.debug(`Found ${tags.length} tags in repo ${owner}/${repo} in ${durationMs}ms.`);
|
||||
return tags;
|
||||
} catch (e) {
|
||||
logger.error(`Failed to fetch tags for repo ${owner}/${repo}.`, e);
|
||||
return [];
|
||||
}
|
||||
}
|
||||
if (exclude?.repos) {
|
||||
if (micromatch.isMatch(repoName, exclude.repos)) {
|
||||
reason = `\`exclude.repos\` contains ${repoName}`;
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
const getBranchesForRepo = async <T>(owner: string, repo: string, api: Api<T>) => {
|
||||
try {
|
||||
logger.debug(`Fetching branches for repo ${owner}/${repo}...`);
|
||||
const { durationMs, data: branches } = await measure(() =>
|
||||
paginate((page) => api.repos.repoListBranches(owner, repo, {
|
||||
page
|
||||
}))
|
||||
);
|
||||
logger.debug(`Found ${branches.length} branches in repo ${owner}/${repo} in ${durationMs}ms.`);
|
||||
return branches;
|
||||
} catch (e) {
|
||||
logger.error(`Failed to fetch branches for repo ${owner}/${repo}.`, e);
|
||||
return [];
|
||||
return false;
|
||||
})();
|
||||
|
||||
if (shouldExclude) {
|
||||
logger.debug(`Excluding repo ${repoName}. Reason: ${reason}`);
|
||||
}
|
||||
|
||||
return shouldExclude;
|
||||
}
|
||||
|
||||
const getReposOwnedByUsers = async <T>(users: string[], api: Api<T>) => {
|
||||
const repos = (await Promise.all(users.map(async (user) => {
|
||||
const results = await Promise.allSettled(users.map(async (user) => {
|
||||
try {
|
||||
logger.debug(`Fetching repos for user ${user}...`);
|
||||
|
||||
|
|
@@ -165,18 +137,35 @@ const getReposOwnedByUsers = async <T>(users: string[], api: Api<T>) => {
|
|||
);
|
||||
|
||||
logger.debug(`Found ${data.length} repos owned by user ${user} in ${durationMs}ms.`);
|
||||
return data;
|
||||
} catch (e) {
|
||||
logger.error(`Failed to fetch repos for user ${user}.`, e);
|
||||
return [];
|
||||
}
|
||||
}))).flat();
|
||||
return {
|
||||
type: 'valid' as const,
|
||||
data
|
||||
};
|
||||
} catch (e: any) {
|
||||
Sentry.captureException(e);
|
||||
|
||||
return repos;
|
||||
if (e?.status === 404) {
|
||||
logger.error(`User ${user} not found or no access`);
|
||||
return {
|
||||
type: 'notFound' as const,
|
||||
value: user
|
||||
};
|
||||
}
|
||||
throw e;
|
||||
}
|
||||
}));
|
||||
|
||||
throwIfAnyFailed(results);
|
||||
const { validItems: validRepos, notFoundItems: notFoundUsers } = processPromiseResults<GiteaRepository>(results);
|
||||
|
||||
return {
|
||||
validRepos,
|
||||
notFoundUsers,
|
||||
};
|
||||
}
|
||||
|
||||
const getReposForOrgs = async <T>(orgs: string[], api: Api<T>) => {
|
||||
return (await Promise.all(orgs.map(async (org) => {
|
||||
const results = await Promise.allSettled(orgs.map(async (org) => {
|
||||
try {
|
||||
logger.debug(`Fetching repos for org ${org}...`);
|
||||
|
||||
|
|
@@ -188,16 +177,35 @@ const getReposForOrgs = async <T>(orgs: string[], api: Api<T>) => {
|
|||
);
|
||||
|
||||
logger.debug(`Found ${data.length} repos for org ${org} in ${durationMs}ms.`);
|
||||
return data;
|
||||
} catch (e) {
|
||||
logger.error(`Failed to fetch repos for org ${org}.`, e);
|
||||
return [];
|
||||
return {
|
||||
type: 'valid' as const,
|
||||
data
|
||||
};
|
||||
} catch (e: any) {
|
||||
Sentry.captureException(e);
|
||||
|
||||
if (e?.status === 404) {
|
||||
logger.error(`Organization ${org} not found or no access`);
|
||||
return {
|
||||
type: 'notFound' as const,
|
||||
value: org
|
||||
};
|
||||
}
|
||||
throw e;
|
||||
}
|
||||
}))).flat();
|
||||
}));
|
||||
|
||||
throwIfAnyFailed(results);
|
||||
const { validItems: validRepos, notFoundItems: notFoundOrgs } = processPromiseResults<GiteaRepository>(results);
|
||||
|
||||
return {
|
||||
validRepos,
|
||||
notFoundOrgs,
|
||||
};
|
||||
}
|
||||
|
||||
const getRepos = async <T>(repos: string[], api: Api<T>) => {
|
||||
return (await Promise.all(repos.map(async (repo) => {
|
||||
const results = await Promise.allSettled(repos.map(async (repo) => {
|
||||
try {
|
||||
logger.debug(`Fetching repository info for ${repo}...`);
|
||||
|
||||
|
|
@@ -207,13 +215,31 @@ const getRepos = async <T>(repos: string[], api: Api<T>) => {
|
|||
);
|
||||
|
||||
logger.debug(`Found repo ${repo} in ${durationMs}ms.`);
|
||||
return {
|
||||
type: 'valid' as const,
|
||||
data: [response.data]
|
||||
};
|
||||
} catch (e: any) {
|
||||
Sentry.captureException(e);
|
||||
|
||||
return [response.data];
|
||||
} catch (e) {
|
||||
logger.error(`Failed to fetch repository info for ${repo}.`, e);
|
||||
return [];
|
||||
if (e?.status === 404) {
|
||||
logger.error(`Repository ${repo} not found or no access`);
|
||||
return {
|
||||
type: 'notFound' as const,
|
||||
value: repo
|
||||
};
|
||||
}
|
||||
throw e;
|
||||
}
|
||||
}))).flat();
|
||||
}));
|
||||
|
||||
throwIfAnyFailed(results);
|
||||
const { validItems: validRepos, notFoundItems: notFoundRepos } = processPromiseResults<GiteaRepository>(results);
|
||||
|
||||
return {
|
||||
validRepos,
|
||||
notFoundRepos,
|
||||
};
|
||||
}
|
||||
|
||||
// @see : https://docs.gitea.com/development/api-usage#pagination
|
||||
|
|
@@ -224,7 +250,9 @@ const paginate = async <T>(request: (page: number) => Promise<HttpResponse<T[],
|
|||
|
||||
const totalCountString = result.headers.get('x-total-count');
|
||||
if (!totalCountString) {
|
||||
throw new Error("Header 'x-total-count' not found");
|
||||
const e = new Error("Header 'x-total-count' not found");
|
||||
Sentry.captureException(e);
|
||||
throw e;
|
||||
}
|
||||
const totalCount = parseInt(totalCountString);
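The hunk above only shows the tail of the Gitea paginate helper. As a rough sketch (the exact shape is assumed, not taken from the repository), a pagination loop driven by the x-total-count header could look like this:

import { HttpResponse } from 'gitea-js';

const paginate = async <T>(request: (page: number) => Promise<HttpResponse<T[], unknown>>) => {
    const results: T[] = [];
    let page = 1;

    while (true) {
        const result = await request(page);
        results.push(...result.data);

        // Gitea reports the total number of items in this header.
        const totalCountString = result.headers.get('x-total-count');
        if (!totalCountString) {
            throw new Error("Header 'x-total-count' not found");
        }
        const totalCount = parseInt(totalCountString);

        if (results.length >= totalCount) {
            break;
        }
        page++;
    }

    return results;
};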
packages/backend/src/github.test.ts (new file, 206 lines)
@@ -0,0 +1,206 @@
import { expect, test } from 'vitest';
import { OctokitRepository, shouldExcludeRepo } from './github';

test('shouldExcludeRepo returns true when clone_url is undefined', () => {
    const repo = { full_name: 'test/repo' } as OctokitRepository;

    expect(shouldExcludeRepo({
        repo,
    })).toBe(true);
});

test('shouldExcludeRepo returns false when the repo is not excluded.', () => {
    const repo = {
        full_name: 'test/repo',
        clone_url: 'https://github.com/test/repo.git',
    } as OctokitRepository;

    expect(shouldExcludeRepo({
        repo,
    })).toBe(false);
});

test('shouldExcludeRepo handles forked repos correctly', () => {
    const repo = {
        full_name: 'test/forked-repo',
        clone_url: 'https://github.com/test/forked-repo.git',
        fork: true,
    } as OctokitRepository;

    expect(shouldExcludeRepo({ repo })).toBe(false);
    expect(shouldExcludeRepo({ repo, exclude: { forks: true } })).toBe(true);
    expect(shouldExcludeRepo({ repo, exclude: { forks: false } })).toBe(false);
});

test('shouldExcludeRepo handles archived repos correctly', () => {
    const repo = {
        full_name: 'test/archived-repo',
        clone_url: 'https://github.com/test/archived-repo.git',
        archived: true,
    } as OctokitRepository;

    expect(shouldExcludeRepo({ repo })).toBe(false);
    expect(shouldExcludeRepo({ repo, exclude: { archived: true } })).toBe(true);
    expect(shouldExcludeRepo({ repo, exclude: { archived: false } })).toBe(false);
});

test('shouldExcludeRepo handles include.topics correctly', () => {
    const repo = {
        full_name: 'test/repo',
        clone_url: 'https://github.com/test/repo.git',
        topics: [
            'test-topic',
            'another-topic'
        ] as string[],
    } as OctokitRepository;

    expect(shouldExcludeRepo({
        repo,
        include: {}
    })).toBe(false);
    expect(shouldExcludeRepo({
        repo,
        include: {
            topics: [],
        }
    })).toBe(true);
    expect(shouldExcludeRepo({
        repo,
        include: {
            topics: ['a-topic-that-does-not-exist'],
        }
    })).toBe(true);
    expect(shouldExcludeRepo({
        repo,
        include: {
            topics: ['test-topic'],
        }
    })).toBe(false);
    expect(shouldExcludeRepo({
        repo,
        include: {
            topics: ['test-*'],
        }
    })).toBe(false);
    expect(shouldExcludeRepo({
        repo,
        include: {
            topics: ['TEST-tOpIC'],
        }
    })).toBe(false);
});

test('shouldExcludeRepo handles exclude.topics correctly', () => {
    const repo = {
        full_name: 'test/repo',
        clone_url: 'https://github.com/test/repo.git',
        topics: [
            'test-topic',
            'another-topic'
        ],
    } as OctokitRepository;

    expect(shouldExcludeRepo({
        repo,
        exclude: {}
    })).toBe(false);
    expect(shouldExcludeRepo({
        repo,
        exclude: {
            topics: [],
        }
    })).toBe(false);
    expect(shouldExcludeRepo({
        repo,
        exclude: {
            topics: ['a-topic-that-does-not-exist'],
        }
    })).toBe(false);
    expect(shouldExcludeRepo({
        repo,
        exclude: {
            topics: ['test-topic'],
        }
    })).toBe(true);
    expect(shouldExcludeRepo({
        repo,
        exclude: {
            topics: ['test-*'],
        }
    })).toBe(true);
    expect(shouldExcludeRepo({
        repo,
        exclude: {
            topics: ['TEST-tOpIC'],
        }
    })).toBe(true);
});


test('shouldExcludeRepo handles exclude.size correctly', () => {
    const repo = {
        full_name: 'test/repo',
        clone_url: 'https://github.com/test/repo.git',
        size: 6, // 6KB
    } as OctokitRepository;

    expect(shouldExcludeRepo({
        repo,
        exclude: {
            size: {
                min: 10 * 1000, // 10KB
            }
        }
    })).toBe(true);

    expect(shouldExcludeRepo({
        repo,
        exclude: {
            size: {
                max: 2 * 1000, // 2KB
            }
        }
    })).toBe(true);

    expect(shouldExcludeRepo({
        repo,
        exclude: {
            size: {
                min: 5 * 1000, // 5KB
                max: 10 * 1000, // 10KB
            }
        }
    })).toBe(false);
});

test('shouldExcludeRepo handles exclude.repos correctly', () => {
    const repo = {
        full_name: 'test/example-repo',
        clone_url: 'https://github.com/test/example-repo.git',
    } as OctokitRepository;

    expect(shouldExcludeRepo({
        repo,
        exclude: {
            repos: []
        }
    })).toBe(false);
    expect(shouldExcludeRepo({
        repo,
        exclude: {
            repos: ['test/example-repo']
        }
    })).toBe(true);
    expect(shouldExcludeRepo({
        repo,
        exclude: {
            repos: ['test/*']
        }
    })).toBe(true);
    expect(shouldExcludeRepo({
        repo,
        exclude: {
            repos: ['repo-does-not-exist']
        }
    })).toBe(false);
});
@@ -1,14 +1,18 @@
|
|||
import { Octokit } from "@octokit/rest";
|
||||
import { GitHubConfig } from "./schemas/v2.js";
|
||||
import { GithubConnectionConfig } from "@sourcebot/schemas/v3/github.type";
|
||||
import { createLogger } from "./logger.js";
|
||||
import { AppContext, GitRepository } from "./types.js";
|
||||
import path from 'path';
|
||||
import { excludeArchivedRepos, excludeForkedRepos, excludeReposByName, excludeReposByTopic, getTokenFromConfig, includeReposByTopic, marshalBool, measure } from "./utils.js";
|
||||
import { getTokenFromConfig, measure, fetchWithRetry } from "./utils.js";
|
||||
import micromatch from "micromatch";
|
||||
import { PrismaClient } from "@sourcebot/db";
|
||||
import { BackendException, BackendError } from "@sourcebot/error";
|
||||
import { processPromiseResults, throwIfAnyFailed } from "./connectionUtils.js";
|
||||
import * as Sentry from "@sentry/node";
|
||||
import { env } from "./env.js";
|
||||
|
||||
const logger = createLogger("GitHub");
|
||||
const GITHUB_CLOUD_HOSTNAME = "github.com";
|
||||
|
||||
type OctokitRepository = {
|
||||
export type OctokitRepository = {
|
||||
name: string,
|
||||
id: number,
|
||||
full_name: string,
|
||||
|
|
@@ -22,11 +26,30 @@ type OctokitRepository = {
|
|||
forks_count?: number,
|
||||
archived?: boolean,
|
||||
topics?: string[],
|
||||
// @note: this is expressed in kilobytes.
|
||||
size?: number,
|
||||
owner: {
|
||||
avatar_url: string,
|
||||
}
|
||||
}
|
||||
|
||||
export const getGitHubReposFromConfig = async (config: GitHubConfig, signal: AbortSignal, ctx: AppContext) => {
|
||||
const token = config.token ? getTokenFromConfig(config.token, ctx) : undefined;
|
||||
const isHttpError = (error: unknown, status: number): boolean => {
|
||||
return error !== null
|
||||
&& typeof error === 'object'
|
||||
&& 'status' in error
|
||||
&& error.status === status;
|
||||
}
|
||||
|
||||
export const getGitHubReposFromConfig = async (config: GithubConnectionConfig, orgId: number, db: PrismaClient, signal: AbortSignal) => {
|
||||
const hostname = config.url ?
|
||||
new URL(config.url).hostname :
|
||||
GITHUB_CLOUD_HOSTNAME;
|
||||
|
||||
const token = config.token ?
|
||||
await getTokenFromConfig(config.token, orgId, db, logger) :
|
||||
hostname === GITHUB_CLOUD_HOSTNAME ?
|
||||
env.FALLBACK_GITHUB_CLOUD_TOKEN :
|
||||
undefined;
|
||||
|
||||
const octokit = new Octokit({
|
||||
auth: token,
|
||||
|
|
@@ -35,295 +58,317 @@ export const getGitHubReposFromConfig = async (config: GitHubConfig, signal: Abo
|
|||
} : {}),
|
||||
});
|
||||
|
||||
if (token) {
|
||||
try {
|
||||
await octokit.rest.users.getAuthenticated();
|
||||
} catch (error) {
|
||||
Sentry.captureException(error);
|
||||
|
||||
if (isHttpError(error, 401)) {
|
||||
const e = new BackendException(BackendError.CONNECTION_SYNC_INVALID_TOKEN, {
|
||||
...(config.token && 'secret' in config.token ? {
|
||||
secretKey: config.token.secret,
|
||||
} : {}),
|
||||
});
|
||||
Sentry.captureException(e);
|
||||
throw e;
|
||||
}
|
||||
|
||||
const e = new BackendException(BackendError.CONNECTION_SYNC_SYSTEM_ERROR, {
|
||||
message: `Failed to authenticate with GitHub`,
|
||||
});
|
||||
Sentry.captureException(e);
|
||||
throw e;
|
||||
}
|
||||
}
|
||||
|
||||
let allRepos: OctokitRepository[] = [];
|
||||
let notFound: {
|
||||
users: string[],
|
||||
orgs: string[],
|
||||
repos: string[],
|
||||
} = {
|
||||
users: [],
|
||||
orgs: [],
|
||||
repos: [],
|
||||
};
|
||||
|
||||
if (config.orgs) {
|
||||
const _repos = await getReposForOrgs(config.orgs, octokit, signal);
|
||||
allRepos = allRepos.concat(_repos);
|
||||
const { validRepos, notFoundOrgs } = await getReposForOrgs(config.orgs, octokit, signal);
|
||||
allRepos = allRepos.concat(validRepos);
|
||||
notFound.orgs = notFoundOrgs;
|
||||
}
|
||||
|
||||
if (config.repos) {
|
||||
const _repos = await getRepos(config.repos, octokit, signal);
|
||||
allRepos = allRepos.concat(_repos);
|
||||
const { validRepos, notFoundRepos } = await getRepos(config.repos, octokit, signal);
|
||||
allRepos = allRepos.concat(validRepos);
|
||||
notFound.repos = notFoundRepos;
|
||||
}
|
||||
|
||||
if (config.users) {
|
||||
const isAuthenticated = config.token !== undefined;
|
||||
const _repos = await getReposOwnedByUsers(config.users, isAuthenticated, octokit, signal);
|
||||
allRepos = allRepos.concat(_repos);
|
||||
const { validRepos, notFoundUsers } = await getReposOwnedByUsers(config.users, isAuthenticated, octokit, signal);
|
||||
allRepos = allRepos.concat(validRepos);
|
||||
notFound.users = notFoundUsers;
|
||||
}
|
||||
|
||||
// Marshall results to our type
|
||||
let repos: GitRepository[] = allRepos
|
||||
let repos = allRepos
|
||||
.filter((repo) => {
|
||||
if (!repo.clone_url) {
|
||||
logger.warn(`Repository ${repo.name} missing property 'clone_url'. Excluding.`)
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
})
|
||||
.map((repo) => {
|
||||
const hostname = config.url ? new URL(config.url).hostname : 'github.com';
|
||||
const repoId = `${hostname}/${repo.full_name}`;
|
||||
const repoPath = path.resolve(path.join(ctx.reposPath, `${repoId}.git`));
|
||||
|
||||
const cloneUrl = new URL(repo.clone_url!);
|
||||
if (token) {
|
||||
cloneUrl.username = token;
|
||||
}
|
||||
|
||||
return {
|
||||
vcs: 'git',
|
||||
codeHost: 'github',
|
||||
name: repo.full_name,
|
||||
id: repoId,
|
||||
cloneUrl: cloneUrl.toString(),
|
||||
path: repoPath,
|
||||
isStale: false,
|
||||
isFork: repo.fork,
|
||||
isArchived: !!repo.archived,
|
||||
topics: repo.topics ?? [],
|
||||
gitConfigMetadata: {
|
||||
'zoekt.web-url-type': 'github',
|
||||
'zoekt.web-url': repo.html_url,
|
||||
'zoekt.name': repoId,
|
||||
'zoekt.github-stars': (repo.stargazers_count ?? 0).toString(),
|
||||
'zoekt.github-watchers': (repo.watchers_count ?? 0).toString(),
|
||||
'zoekt.github-subscribers': (repo.subscribers_count ?? 0).toString(),
|
||||
'zoekt.github-forks': (repo.forks_count ?? 0).toString(),
|
||||
'zoekt.archived': marshalBool(repo.archived),
|
||||
'zoekt.fork': marshalBool(repo.fork),
|
||||
'zoekt.public': marshalBool(repo.private === false)
|
||||
const isExcluded = shouldExcludeRepo({
|
||||
repo,
|
||||
include: {
|
||||
topics: config.topics,
|
||||
},
|
||||
sizeInBytes: repo.size ? repo.size * 1000 : undefined,
|
||||
branches: [],
|
||||
tags: [],
|
||||
} satisfies GitRepository;
|
||||
exclude: config.exclude,
|
||||
});
|
||||
|
||||
return !isExcluded;
|
||||
});
|
||||
|
||||
if (config.topics) {
|
||||
const topics = config.topics.map(topic => topic.toLowerCase());
|
||||
repos = includeReposByTopic(repos, topics, logger);
|
||||
}
|
||||
|
||||
if (config.exclude) {
|
||||
if (!!config.exclude.forks) {
|
||||
repos = excludeForkedRepos(repos, logger);
|
||||
}
|
||||
|
||||
if (!!config.exclude.archived) {
|
||||
repos = excludeArchivedRepos(repos, logger);
|
||||
}
|
||||
|
||||
if (config.exclude.repos) {
|
||||
repos = excludeReposByName(repos, config.exclude.repos, logger);
|
||||
}
|
||||
|
||||
if (config.exclude.topics) {
|
||||
const topics = config.exclude.topics.map(topic => topic.toLowerCase());
|
||||
repos = excludeReposByTopic(repos, topics, logger);
|
||||
}
|
||||
|
||||
if (config.exclude.size) {
|
||||
const min = config.exclude.size.min;
|
||||
const max = config.exclude.size.max;
|
||||
if (min) {
|
||||
repos = repos.filter((repo) => {
|
||||
// If we don't have a size, we can't filter by size.
|
||||
if (!repo.sizeInBytes) {
|
||||
return true;
|
||||
}
|
||||
|
||||
if (repo.sizeInBytes < min) {
|
||||
logger.debug(`Excluding repo ${repo.name}. Reason: repo is less than \`exclude.size.min\`=${min} bytes.`);
|
||||
return false;
|
||||
}
|
||||
|
||||
return true;
|
||||
});
|
||||
}
|
||||
|
||||
if (max) {
|
||||
repos = repos.filter((repo) => {
|
||||
// If we don't have a size, we can't filter by size.
|
||||
if (!repo.sizeInBytes) {
|
||||
return true;
|
||||
}
|
||||
|
||||
if (repo.sizeInBytes > max) {
|
||||
logger.debug(`Excluding repo ${repo.name}. Reason: repo is greater than \`exclude.size.max\`=${max} bytes.`);
|
||||
return false;
|
||||
}
|
||||
|
||||
return true;
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
logger.debug(`Found ${repos.length} total repositories.`);
|
||||
|
||||
if (config.revisions) {
|
||||
if (config.revisions.branches) {
|
||||
const branchGlobs = config.revisions.branches;
|
||||
repos = await Promise.all(
|
||||
repos.map(async (repo) => {
|
||||
const [owner, name] = repo.name.split('/');
|
||||
let branches = (await getBranchesForRepo(owner, name, octokit, signal)).map(branch => branch.name);
|
||||
branches = micromatch.match(branches, branchGlobs);
|
||||
return {
|
||||
validRepos: repos,
|
||||
notFound,
|
||||
};
|
||||
}
|
||||
|
||||
return {
|
||||
...repo,
|
||||
branches,
|
||||
};
|
||||
})
|
||||
)
|
||||
export const shouldExcludeRepo = ({
|
||||
repo,
|
||||
include,
|
||||
exclude
|
||||
} : {
|
||||
repo: OctokitRepository,
|
||||
include?: {
|
||||
topics?: GithubConnectionConfig['topics']
|
||||
},
|
||||
exclude?: GithubConnectionConfig['exclude']
|
||||
}) => {
|
||||
let reason = '';
|
||||
const repoName = repo.full_name;
|
||||
|
||||
const shouldExclude = (() => {
|
||||
if (!repo.clone_url) {
|
||||
reason = 'clone_url is undefined';
|
||||
return true;
|
||||
}
|
||||
|
||||
if (config.revisions.tags) {
|
||||
const tagGlobs = config.revisions.tags;
|
||||
repos = await Promise.all(
|
||||
repos.map(async (repo) => {
|
||||
const [owner, name] = repo.name.split('/');
|
||||
let tags = (await getTagsForRepo(owner, name, octokit, signal)).map(tag => tag.name);
|
||||
tags = micromatch.match(tags, tagGlobs);
|
||||
|
||||
return {
|
||||
...repo,
|
||||
tags,
|
||||
};
|
||||
})
|
||||
)
|
||||
if (!!exclude?.forks && repo.fork) {
|
||||
reason = `\`exclude.forks\` is true`;
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
return repos;
|
||||
}
|
||||
|
||||
const getTagsForRepo = async (owner: string, repo: string, octokit: Octokit, signal: AbortSignal) => {
|
||||
try {
|
||||
logger.debug(`Fetching tags for repo ${owner}/${repo}...`);
|
||||
const { durationMs, data: tags } = await measure(() => octokit.paginate(octokit.repos.listTags, {
|
||||
owner,
|
||||
repo,
|
||||
per_page: 100,
|
||||
request: {
|
||||
signal
|
||||
|
||||
if (!!exclude?.archived && !!repo.archived) {
|
||||
reason = `\`exclude.archived\` is true`;
|
||||
return true;
|
||||
}
|
||||
|
||||
if (exclude?.repos) {
|
||||
if (micromatch.isMatch(repoName, exclude.repos)) {
|
||||
reason = `\`exclude.repos\` contains ${repoName}`;
|
||||
return true;
|
||||
}
|
||||
}));
|
||||
|
||||
logger.debug(`Found ${tags.length} tags for repo ${owner}/${repo} in ${durationMs}ms`);
|
||||
return tags;
|
||||
} catch (e) {
|
||||
logger.debug(`Error fetching tags for repo ${owner}/${repo}: ${e}`);
|
||||
return [];
|
||||
}
|
||||
}
|
||||
|
||||
const getBranchesForRepo = async (owner: string, repo: string, octokit: Octokit, signal: AbortSignal) => {
|
||||
try {
|
||||
logger.debug(`Fetching branches for repo ${owner}/${repo}...`);
|
||||
const { durationMs, data: branches } = await measure(() => octokit.paginate(octokit.repos.listBranches, {
|
||||
owner,
|
||||
repo,
|
||||
per_page: 100,
|
||||
request: {
|
||||
signal
|
||||
}
|
||||
|
||||
if (exclude?.topics) {
|
||||
const configTopics = exclude.topics.map(topic => topic.toLowerCase());
|
||||
const repoTopics = repo.topics ?? [];
|
||||
|
||||
const matchingTopics = repoTopics.filter((topic) => micromatch.isMatch(topic, configTopics));
|
||||
if (matchingTopics.length > 0) {
|
||||
reason = `\`exclude.topics\` matches the following topics: ${matchingTopics.join(', ')}`;
|
||||
return true;
|
||||
}
|
||||
}));
|
||||
logger.debug(`Found ${branches.length} branches for repo ${owner}/${repo} in ${durationMs}ms`);
|
||||
return branches;
|
||||
} catch (e) {
|
||||
logger.debug(`Error fetching branches for repo ${owner}/${repo}: ${e}`);
|
||||
return [];
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (include?.topics) {
|
||||
const configTopics = include.topics.map(topic => topic.toLowerCase());
|
||||
const repoTopics = repo.topics ?? [];
|
||||
|
||||
const matchingTopics = repoTopics.filter((topic) => micromatch.isMatch(topic, configTopics));
|
||||
if (matchingTopics.length === 0) {
|
||||
reason = `\`include.topics\` does not match any of the following topics: ${configTopics.join(', ')}`;
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
const repoSizeInBytes = repo.size ? repo.size * 1000 : undefined;
|
||||
if (exclude?.size && repoSizeInBytes) {
|
||||
const min = exclude.size.min;
|
||||
const max = exclude.size.max;
|
||||
|
||||
if (min && repoSizeInBytes < min) {
|
||||
reason = `repo is less than \`exclude.size.min\`=${min} bytes.`;
|
||||
return true;
|
||||
}
|
||||
|
||||
if (max && repoSizeInBytes > max) {
|
||||
reason = `repo is greater than \`exclude.size.max\`=${max} bytes.`;
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
return false;
|
||||
})();
|
||||
|
||||
if (shouldExclude) {
|
||||
logger.debug(`Excluding repo ${repoName}. Reason: ${reason}`);
|
||||
return true;
|
||||
}
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
const getReposOwnedByUsers = async (users: string[], isAuthenticated: boolean, octokit: Octokit, signal: AbortSignal) => {
|
||||
const repos = (await Promise.all(users.map(async (user) => {
|
||||
const results = await Promise.allSettled(users.map(async (user) => {
|
||||
try {
|
||||
logger.debug(`Fetching repository info for user ${user}...`);
|
||||
|
||||
const { durationMs, data } = await measure(async () => {
|
||||
if (isAuthenticated) {
|
||||
return octokit.paginate(octokit.repos.listForAuthenticatedUser, {
|
||||
username: user,
|
||||
visibility: 'all',
|
||||
affiliation: 'owner',
|
||||
per_page: 100,
|
||||
request: {
|
||||
signal,
|
||||
},
|
||||
});
|
||||
} else {
|
||||
return octokit.paginate(octokit.repos.listForUser, {
|
||||
username: user,
|
||||
per_page: 100,
|
||||
request: {
|
||||
signal,
|
||||
},
|
||||
});
|
||||
}
|
||||
const fetchFn = async () => {
|
||||
if (isAuthenticated) {
|
||||
return octokit.paginate(octokit.repos.listForAuthenticatedUser, {
|
||||
username: user,
|
||||
visibility: 'all',
|
||||
affiliation: 'owner',
|
||||
per_page: 100,
|
||||
request: {
|
||||
signal,
|
||||
},
|
||||
});
|
||||
} else {
|
||||
return octokit.paginate(octokit.repos.listForUser, {
|
||||
username: user,
|
||||
per_page: 100,
|
||||
request: {
|
||||
signal,
|
||||
},
|
||||
});
|
||||
}
|
||||
};
|
||||
|
||||
return fetchWithRetry(fetchFn, `user ${user}`, logger);
|
||||
});
|
||||
|
||||
logger.debug(`Found ${data.length} owned by user ${user} in ${durationMs}ms.`);
|
||||
return data;
|
||||
} catch (e) {
|
||||
logger.error(`Failed to fetch repository info for user ${user}.`, e);
|
||||
return [];
|
||||
}
|
||||
}))).flat();
|
||||
return {
|
||||
type: 'valid' as const,
|
||||
data
|
||||
};
|
||||
} catch (error) {
|
||||
Sentry.captureException(error);
|
||||
logger.error(`Failed to fetch repositories for user ${user}.`, error);
|
||||
|
||||
return repos;
|
||||
if (isHttpError(error, 404)) {
|
||||
logger.error(`User ${user} not found or no access`);
|
||||
return {
|
||||
type: 'notFound' as const,
|
||||
value: user
|
||||
};
|
||||
}
|
||||
throw error;
|
||||
}
|
||||
}));
|
||||
|
||||
throwIfAnyFailed(results);
|
||||
const { validItems: validRepos, notFoundItems: notFoundUsers } = processPromiseResults<OctokitRepository>(results);
|
||||
|
||||
return {
|
||||
validRepos,
|
||||
notFoundUsers,
|
||||
};
|
||||
}
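fetchWithRetry is imported from ./utils.js but its body is not part of this diff. A minimal sketch of what such a helper might look like, with the signature inferred from the call sites above and the backoff policy assumed:

import { Logger } from 'winston';

export const fetchWithRetry = async <T>(
    fetchFn: () => Promise<T>,
    identifier: string,
    logger: Logger,
    maxAttempts: number = 3,
): Promise<T> => {
    for (let attempt = 1; ; attempt++) {
        try {
            return await fetchFn();
        } catch (e) {
            if (attempt >= maxAttempts) {
                throw e;
            }
            // Simple exponential backoff between attempts.
            const delayMs = 1000 * 2 ** (attempt - 1);
            logger.warn(`Fetch for ${identifier} failed (attempt ${attempt}/${maxAttempts}); retrying in ${delayMs}ms`);
            await new Promise((resolve) => setTimeout(resolve, delayMs));
        }
    }
};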
|
||||
|
||||
const getReposForOrgs = async (orgs: string[], octokit: Octokit, signal: AbortSignal) => {
|
||||
const repos = (await Promise.all(orgs.map(async (org) => {
|
||||
const results = await Promise.allSettled(orgs.map(async (org) => {
|
||||
try {
|
||||
logger.debug(`Fetching repository info for org ${org}...`);
|
||||
logger.info(`Fetching repository info for org ${org}...`);
|
||||
|
||||
const { durationMs, data } = await measure(() => octokit.paginate(octokit.repos.listForOrg, {
|
||||
org: org,
|
||||
per_page: 100,
|
||||
request: {
|
||||
signal
|
||||
}
|
||||
}));
|
||||
const { durationMs, data } = await measure(async () => {
|
||||
const fetchFn = () => octokit.paginate(octokit.repos.listForOrg, {
|
||||
org: org,
|
||||
per_page: 100,
|
||||
request: {
|
||||
signal
|
||||
}
|
||||
});
|
||||
|
||||
logger.debug(`Found ${data.length} in org ${org} in ${durationMs}ms.`);
|
||||
return data;
|
||||
} catch (e) {
|
||||
logger.error(`Failed to fetch repository info for org ${org}.`, e);
|
||||
return [];
|
||||
return fetchWithRetry(fetchFn, `org ${org}`, logger);
|
||||
});
|
||||
|
||||
logger.info(`Found ${data.length} in org ${org} in ${durationMs}ms.`);
|
||||
return {
|
||||
type: 'valid' as const,
|
||||
data
|
||||
};
|
||||
} catch (error) {
|
||||
Sentry.captureException(error);
|
||||
logger.error(`Failed to fetch repositories for org ${org}.`, error);
|
||||
|
||||
if (isHttpError(error, 404)) {
|
||||
logger.error(`Organization ${org} not found or no access`);
|
||||
return {
|
||||
type: 'notFound' as const,
|
||||
value: org
|
||||
};
|
||||
}
|
||||
throw error;
|
||||
}
|
||||
}))).flat();
|
||||
}));
|
||||
|
||||
return repos;
|
||||
throwIfAnyFailed(results);
|
||||
const { validItems: validRepos, notFoundItems: notFoundOrgs } = processPromiseResults<OctokitRepository>(results);
|
||||
|
||||
return {
|
||||
validRepos,
|
||||
notFoundOrgs,
|
||||
};
|
||||
}
|
||||
|
||||
const getRepos = async (repoList: string[], octokit: Octokit, signal: AbortSignal) => {
|
||||
const repos = (await Promise.all(repoList.map(async (repo) => {
|
||||
const results = await Promise.allSettled(repoList.map(async (repo) => {
|
||||
try {
|
||||
logger.debug(`Fetching repository info for ${repo}...`);
|
||||
|
||||
const [owner, repoName] = repo.split('/');
|
||||
const { durationMs, data: result } = await measure(() => octokit.repos.get({
|
||||
owner,
|
||||
repo: repoName,
|
||||
request: {
|
||||
signal
|
||||
}
|
||||
}));
|
||||
logger.info(`Fetching repository info for ${repo}...`);
|
||||
|
||||
logger.debug(`Found info for repository ${repo} in ${durationMs}ms`);
|
||||
const { durationMs, data: result } = await measure(async () => {
|
||||
const fetchFn = () => octokit.repos.get({
|
||||
owner,
|
||||
repo: repoName,
|
||||
request: {
|
||||
signal
|
||||
}
|
||||
});
|
||||
|
||||
return [result.data];
|
||||
} catch (e) {
|
||||
logger.error(`Failed to fetch repository info for ${repo}.`, e);
|
||||
return [];
|
||||
return fetchWithRetry(fetchFn, repo, logger);
|
||||
});
|
||||
|
||||
logger.info(`Found info for repository ${repo} in ${durationMs}ms`);
|
||||
return {
|
||||
type: 'valid' as const,
|
||||
data: [result.data]
|
||||
};
|
||||
|
||||
} catch (error) {
|
||||
Sentry.captureException(error);
|
||||
logger.error(`Failed to fetch repository ${repo}.`, error);
|
||||
|
||||
if (isHttpError(error, 404)) {
|
||||
logger.error(`Repository ${repo} not found or no access`);
|
||||
return {
|
||||
type: 'notFound' as const,
|
||||
value: repo
|
||||
};
|
||||
}
|
||||
throw error;
|
||||
}
|
||||
}))).flat();
|
||||
}));
|
||||
|
||||
return repos;
|
||||
throwIfAnyFailed(results);
|
||||
const { validItems: validRepos, notFoundItems: notFoundRepos } = processPromiseResults<OctokitRepository>(results);
|
||||
|
||||
return {
|
||||
validRepos,
|
||||
notFoundRepos,
|
||||
};
|
||||
}
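throwIfAnyFailed and processPromiseResults come from ./connectionUtils.js, which is not shown in this diff. A sketch that is consistent with how they are used above (the actual implementation may differ):

type PromiseResult<T> =
    | { type: 'valid', data: T[] }
    | { type: 'notFound', value: string };

export const throwIfAnyFailed = <T>(results: PromiseSettledResult<T>[]) => {
    // Surface the first hard failure (anything other than a handled 404).
    const failure = results.find((result): result is PromiseRejectedResult => result.status === 'rejected');
    if (failure) {
        throw failure.reason;
    }
};

export const processPromiseResults = <T>(results: PromiseSettledResult<PromiseResult<T>>[]) => {
    const validItems: T[] = [];
    const notFoundItems: string[] = [];

    for (const result of results) {
        if (result.status !== 'fulfilled') {
            continue;
        }
        if (result.value.type === 'valid') {
            validItems.push(...result.value.data);
        } else {
            notFoundItems.push(result.value.value);
        }
    }

    return { validItems, notFoundItems };
};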
packages/backend/src/gitlab.test.ts (new file, 43 lines)
@@ -0,0 +1,43 @@
import { expect, test } from 'vitest';
import { shouldExcludeProject } from './gitlab';
import { ProjectSchema } from '@gitbeaker/rest';


test('shouldExcludeProject returns false when the project is not excluded.', () => {
    const project = {
        path_with_namespace: 'test/project',
    } as ProjectSchema;

    expect(shouldExcludeProject({
        project,
    })).toBe(false);
});

test('shouldExcludeProject returns true when the project is excluded by exclude.archived.', () => {
    const project = {
        path_with_namespace: 'test/project',
        archived: true,
    } as ProjectSchema;

    expect(shouldExcludeProject({
        project,
        exclude: {
            archived: true,
        }
    })).toBe(true)
});

test('shouldExcludeProject returns true when the project is excluded by exclude.forks.', () => {
    const project = {
        path_with_namespace: 'test/project',
        forked_from_project: {}
    } as unknown as ProjectSchema;

    expect(shouldExcludeProject({
        project,
        exclude: {
            forks: true,
        }
    })).toBe(true)
});
@@ -1,208 +1,260 @@
|
|||
import { Gitlab, ProjectSchema } from "@gitbeaker/rest";
|
||||
import { GitLabConfig } from "./schemas/v2.js";
|
||||
import { excludeArchivedRepos, excludeForkedRepos, excludeReposByName, excludeReposByTopic, getTokenFromConfig, includeReposByTopic, marshalBool, measure } from "./utils.js";
|
||||
import { createLogger } from "./logger.js";
|
||||
import { AppContext, GitRepository } from "./types.js";
|
||||
import path from 'path';
|
||||
import micromatch from "micromatch";
|
||||
import { createLogger } from "./logger.js";
|
||||
import { GitlabConnectionConfig } from "@sourcebot/schemas/v3/gitlab.type"
|
||||
import { getTokenFromConfig, measure, fetchWithRetry } from "./utils.js";
|
||||
import { PrismaClient } from "@sourcebot/db";
|
||||
import { processPromiseResults, throwIfAnyFailed } from "./connectionUtils.js";
|
||||
import * as Sentry from "@sentry/node";
|
||||
import { env } from "./env.js";
|
||||
|
||||
const logger = createLogger("GitLab");
|
||||
const GITLAB_CLOUD_HOSTNAME = "gitlab.com";
|
||||
export const GITLAB_CLOUD_HOSTNAME = "gitlab.com";
|
||||
|
||||
export const getGitLabReposFromConfig = async (config: GitLabConfig, ctx: AppContext) => {
|
||||
const token = config.token ? getTokenFromConfig(config.token, ctx) : undefined;
|
||||
export const getGitLabReposFromConfig = async (config: GitlabConnectionConfig, orgId: number, db: PrismaClient) => {
|
||||
const hostname = config.url ?
|
||||
new URL(config.url).hostname :
|
||||
GITLAB_CLOUD_HOSTNAME;
|
||||
|
||||
const token = config.token ?
|
||||
await getTokenFromConfig(config.token, orgId, db, logger) :
|
||||
hostname === GITLAB_CLOUD_HOSTNAME ?
|
||||
env.FALLBACK_GITLAB_CLOUD_TOKEN :
|
||||
undefined;
|
||||
|
||||
const api = new Gitlab({
|
||||
...(config.token ? {
|
||||
...(token ? {
|
||||
token,
|
||||
} : {}),
|
||||
...(config.url ? {
|
||||
host: config.url,
|
||||
} : {}),
|
||||
});
|
||||
const hostname = config.url ? new URL(config.url).hostname : GITLAB_CLOUD_HOSTNAME;
|
||||
|
||||
|
||||
let allProjects: ProjectSchema[] = [];
|
||||
let allRepos: ProjectSchema[] = [];
|
||||
let notFound: {
|
||||
orgs: string[],
|
||||
users: string[],
|
||||
repos: string[],
|
||||
} = {
|
||||
orgs: [],
|
||||
users: [],
|
||||
repos: [],
|
||||
};
|
||||
|
||||
if (config.all === true) {
|
||||
if (hostname !== GITLAB_CLOUD_HOSTNAME) {
|
||||
try {
|
||||
logger.debug(`Fetching all projects visible in ${config.url}...`);
|
||||
const { durationMs, data: _projects } = await measure(() => api.Projects.all({
|
||||
perPage: 100,
|
||||
}));
|
||||
const { durationMs, data: _projects } = await measure(async () => {
|
||||
const fetchFn = () => api.Projects.all({
|
||||
perPage: 100,
|
||||
});
|
||||
return fetchWithRetry(fetchFn, `all projects in ${config.url}`, logger);
|
||||
});
|
||||
logger.debug(`Found ${_projects.length} projects in ${durationMs}ms.`);
|
||||
allProjects = allProjects.concat(_projects);
|
||||
allRepos = allRepos.concat(_projects);
|
||||
} catch (e) {
|
||||
Sentry.captureException(e);
|
||||
logger.error(`Failed to fetch all projects visible in ${config.url}.`, e);
|
||||
throw e;
|
||||
}
|
||||
} else {
|
||||
logger.warn(`Ignoring option all:true in ${ctx.configPath} : host is ${GITLAB_CLOUD_HOSTNAME}`);
|
||||
logger.warn(`Ignoring option all:true in config : host is ${GITLAB_CLOUD_HOSTNAME}`);
|
||||
}
|
||||
}
|
||||
|
||||
if (config.groups) {
|
||||
const _projects = (await Promise.all(config.groups.map(async (group) => {
|
||||
const results = await Promise.allSettled(config.groups.map(async (group) => {
|
||||
try {
|
||||
logger.debug(`Fetching project info for group ${group}...`);
|
||||
const { durationMs, data } = await measure(() => api.Groups.allProjects(group, {
|
||||
perPage: 100,
|
||||
includeSubgroups: true
|
||||
}));
|
||||
const { durationMs, data } = await measure(async () => {
|
||||
const fetchFn = () => api.Groups.allProjects(group, {
|
||||
perPage: 100,
|
||||
includeSubgroups: true
|
||||
});
|
||||
return fetchWithRetry(fetchFn, `group ${group}`, logger);
|
||||
});
|
||||
logger.debug(`Found ${data.length} projects in group ${group} in ${durationMs}ms.`);
|
||||
return data;
|
||||
} catch (e) {
|
||||
logger.error(`Failed to fetch project info for group ${group}.`, e);
|
||||
return [];
|
||||
}
|
||||
}))).flat();
|
||||
return {
|
||||
type: 'valid' as const,
|
||||
data
|
||||
};
|
||||
} catch (e: any) {
|
||||
Sentry.captureException(e);
|
||||
logger.error(`Failed to fetch projects for group ${group}.`, e);
|
||||
|
||||
allProjects = allProjects.concat(_projects);
|
||||
const status = e?.cause?.response?.status;
|
||||
if (status === 404) {
|
||||
logger.error(`Group ${group} not found or no access`);
|
||||
return {
|
||||
type: 'notFound' as const,
|
||||
value: group
|
||||
};
|
||||
}
|
||||
throw e;
|
||||
}
|
||||
}));
|
||||
|
||||
throwIfAnyFailed(results);
|
||||
const { validItems: validRepos, notFoundItems: notFoundOrgs } = processPromiseResults(results);
|
||||
allRepos = allRepos.concat(validRepos);
|
||||
notFound.orgs = notFoundOrgs;
|
||||
}
|
||||
|
||||
if (config.users) {
|
||||
const _projects = (await Promise.all(config.users.map(async (user) => {
|
||||
const results = await Promise.allSettled(config.users.map(async (user) => {
|
||||
try {
|
||||
logger.debug(`Fetching project info for user ${user}...`);
|
||||
const { durationMs, data } = await measure(() => api.Users.allProjects(user, {
|
||||
perPage: 100,
|
||||
}));
|
||||
const { durationMs, data } = await measure(async () => {
|
||||
const fetchFn = () => api.Users.allProjects(user, {
|
||||
perPage: 100,
|
||||
});
|
||||
return fetchWithRetry(fetchFn, `user ${user}`, logger);
|
||||
});
|
||||
logger.debug(`Found ${data.length} projects owned by user ${user} in ${durationMs}ms.`);
|
||||
return data;
|
||||
} catch (e) {
|
||||
logger.error(`Failed to fetch project info for user ${user}.`, e);
|
||||
return [];
|
||||
}
|
||||
}))).flat();
|
||||
return {
|
||||
type: 'valid' as const,
|
||||
data
|
||||
};
|
||||
} catch (e: any) {
|
||||
Sentry.captureException(e);
|
||||
logger.error(`Failed to fetch projects for user ${user}.`, e);
|
||||
|
||||
allProjects = allProjects.concat(_projects);
|
||||
const status = e?.cause?.response?.status;
|
||||
if (status === 404) {
|
||||
logger.error(`User ${user} not found or no access`);
|
||||
return {
|
||||
type: 'notFound' as const,
|
||||
value: user
|
||||
};
|
||||
}
|
||||
throw e;
|
||||
}
|
||||
}));
|
||||
|
||||
throwIfAnyFailed(results);
|
||||
const { validItems: validRepos, notFoundItems: notFoundUsers } = processPromiseResults(results);
|
||||
allRepos = allRepos.concat(validRepos);
|
||||
notFound.users = notFoundUsers;
|
||||
}
|
||||
|
||||
if (config.projects) {
|
||||
const _projects = (await Promise.all(config.projects.map(async (project) => {
|
||||
const results = await Promise.allSettled(config.projects.map(async (project) => {
|
||||
try {
|
||||
logger.debug(`Fetching project info for project ${project}...`);
|
||||
const { durationMs, data } = await measure(() => api.Projects.show(project));
|
||||
const { durationMs, data } = await measure(async () => {
|
||||
const fetchFn = () => api.Projects.show(project);
|
||||
return fetchWithRetry(fetchFn, `project ${project}`, logger);
|
||||
});
|
||||
logger.debug(`Found project ${project} in ${durationMs}ms.`);
|
||||
return [data];
|
||||
} catch (e) {
|
||||
logger.error(`Failed to fetch project info for project ${project}.`, e);
|
||||
return [];
|
||||
}
|
||||
}))).flat();
|
||||
return {
|
||||
type: 'valid' as const,
|
||||
data: [data]
|
||||
};
|
||||
} catch (e: any) {
|
||||
Sentry.captureException(e);
|
||||
logger.error(`Failed to fetch project ${project}.`, e);
|
||||
|
||||
allProjects = allProjects.concat(_projects);
|
||||
const status = e?.cause?.response?.status;
|
||||
|
||||
if (status === 404) {
|
||||
logger.error(`Project ${project} not found or no access`);
|
||||
return {
|
||||
type: 'notFound' as const,
|
||||
value: project
|
||||
};
|
||||
}
|
||||
throw e;
|
||||
}
|
||||
}));
|
||||
|
||||
throwIfAnyFailed(results);
|
||||
const { validItems: validRepos, notFoundItems: notFoundRepos } = processPromiseResults(results);
|
||||
allRepos = allRepos.concat(validRepos);
|
||||
notFound.repos = notFoundRepos;
|
||||
}
|
||||
|
||||
let repos: GitRepository[] = allProjects
|
||||
.map((project) => {
|
||||
const repoId = `${hostname}/${project.path_with_namespace}`;
|
||||
const repoPath = path.resolve(path.join(ctx.reposPath, `${repoId}.git`))
|
||||
const isFork = project.forked_from_project !== undefined;
|
||||
|
||||
const cloneUrl = new URL(project.http_url_to_repo);
|
||||
if (token) {
|
||||
cloneUrl.username = 'oauth2';
|
||||
cloneUrl.password = token;
|
||||
}
|
||||
|
||||
return {
|
||||
vcs: 'git',
|
||||
codeHost: 'gitlab',
|
||||
name: project.path_with_namespace,
|
||||
id: repoId,
|
||||
cloneUrl: cloneUrl.toString(),
|
||||
path: repoPath,
|
||||
isStale: false,
|
||||
isFork,
|
||||
isArchived: project.archived,
|
||||
topics: project.topics ?? [],
|
||||
gitConfigMetadata: {
|
||||
'zoekt.web-url-type': 'gitlab',
|
||||
'zoekt.web-url': project.web_url,
|
||||
'zoekt.name': repoId,
|
||||
'zoekt.gitlab-stars': project.star_count?.toString() ?? '0',
|
||||
'zoekt.gitlab-forks': project.forks_count?.toString() ?? '0',
|
||||
'zoekt.archived': marshalBool(project.archived),
|
||||
'zoekt.fork': marshalBool(isFork),
|
||||
'zoekt.public': marshalBool(project.visibility === 'public'),
|
||||
let repos = allRepos
|
||||
.filter((project) => {
|
||||
const isExcluded = shouldExcludeProject({
|
||||
project,
|
||||
include: {
|
||||
topics: config.topics,
|
||||
},
|
||||
branches: [],
|
||||
tags: [],
|
||||
} satisfies GitRepository;
|
||||
exclude: config.exclude
|
||||
});
|
||||
|
||||
return !isExcluded;
|
||||
});
|
||||
|
||||
if (config.topics) {
|
||||
const topics = config.topics.map(topic => topic.toLowerCase());
|
||||
repos = includeReposByTopic(repos, topics, logger);
|
||||
}
|
||||
|
||||
if (config.exclude) {
|
||||
if (!!config.exclude.forks) {
|
||||
repos = excludeForkedRepos(repos, logger);
|
||||
}
|
||||
|
||||
if (!!config.exclude.archived) {
|
||||
repos = excludeArchivedRepos(repos, logger);
|
||||
}
|
||||
|
||||
if (config.exclude.projects) {
|
||||
repos = excludeReposByName(repos, config.exclude.projects, logger);
|
||||
}
|
||||
|
||||
if (config.exclude.topics) {
|
||||
const topics = config.exclude.topics.map(topic => topic.toLowerCase());
|
||||
repos = excludeReposByTopic(repos, topics, logger);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
logger.debug(`Found ${repos.length} total repositories.`);
|
||||
|
||||
if (config.revisions) {
|
||||
if (config.revisions.branches) {
|
||||
const branchGlobs = config.revisions.branches;
|
||||
repos = await Promise.all(repos.map(async (repo) => {
|
||||
try {
|
||||
logger.debug(`Fetching branches for repo ${repo.name}...`);
|
||||
let { durationMs, data } = await measure(() => api.Branches.all(repo.name));
|
||||
logger.debug(`Found ${data.length} branches in repo ${repo.name} in ${durationMs}ms.`);
|
||||
return {
|
||||
validRepos: repos,
|
||||
notFound,
|
||||
};
|
||||
}
|
||||
|
||||
let branches = data.map((branch) => branch.name);
|
||||
branches = micromatch.match(branches, branchGlobs);
|
||||
export const shouldExcludeProject = ({
|
||||
project,
|
||||
include,
|
||||
exclude,
|
||||
}: {
|
||||
project: ProjectSchema,
|
||||
include?: {
|
||||
topics?: GitlabConnectionConfig['topics'],
|
||||
},
|
||||
exclude?: GitlabConnectionConfig['exclude'],
|
||||
}) => {
|
||||
const projectName = project.path_with_namespace;
|
||||
let reason = '';
|
||||
|
||||
return {
|
||||
...repo,
|
||||
branches,
|
||||
};
|
||||
} catch (e) {
|
||||
logger.error(`Failed to fetch branches for repo ${repo.name}.`, e);
|
||||
return repo;
|
||||
}
|
||||
}));
|
||||
const shouldExclude = (() => {
|
||||
if (!!exclude?.archived && project.archived) {
|
||||
reason = `\`exclude.archived\` is true`;
|
||||
return true;
|
||||
}
|
||||
|
||||
if (config.revisions.tags) {
|
||||
const tagGlobs = config.revisions.tags;
|
||||
repos = await Promise.all(repos.map(async (repo) => {
|
||||
try {
|
||||
logger.debug(`Fetching tags for repo ${repo.name}...`);
|
||||
let { durationMs, data } = await measure(() => api.Tags.all(repo.name));
|
||||
logger.debug(`Found ${data.length} tags in repo ${repo.name} in ${durationMs}ms.`);
|
||||
|
||||
let tags = data.map((tag) => tag.name);
|
||||
tags = micromatch.match(tags, tagGlobs);
|
||||
|
||||
return {
|
||||
...repo,
|
||||
tags,
|
||||
};
|
||||
} catch (e) {
|
||||
logger.error(`Failed to fetch tags for repo ${repo.name}.`, e);
|
||||
return repo;
|
||||
}
|
||||
}));
|
||||
if (!!exclude?.forks && project.forked_from_project !== undefined) {
|
||||
reason = `\`exclude.forks\` is true`;
|
||||
return true;
|
||||
}
|
||||
|
||||
if (exclude?.projects) {
|
||||
if (micromatch.isMatch(projectName, exclude.projects)) {
|
||||
reason = `\`exclude.projects\` contains ${projectName}`;
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
if (include?.topics) {
|
||||
const configTopics = include.topics.map(topic => topic.toLowerCase());
|
||||
const projectTopics = project.topics ?? [];
|
||||
|
||||
const matchingTopics = projectTopics.filter((topic) => micromatch.isMatch(topic, configTopics));
|
||||
if (matchingTopics.length === 0) {
|
||||
reason = `\`include.topics\` does not match any of the following topics: ${configTopics.join(', ')}`;
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
if (exclude?.topics) {
|
||||
const configTopics = exclude.topics.map(topic => topic.toLowerCase());
|
||||
const projectTopics = project.topics ?? [];
|
||||
|
||||
const matchingTopics = projectTopics.filter((topic) => micromatch.isMatch(topic, configTopics));
|
||||
if (matchingTopics.length > 0) {
|
||||
reason = `\`exclude.topics\` matches the following topics: ${matchingTopics.join(', ')}`;
|
||||
return true;
|
||||
}
|
||||
}
|
||||
})();
|
||||
|
||||
if (shouldExclude) {
|
||||
logger.debug(`Excluding project ${projectName}. Reason: ${reason}`);
|
||||
return true;
|
||||
}
|
||||
|
||||
return repos;
|
||||
}
|
||||
return false;
|
||||
}
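For illustration only, a call that mirrors the include/exclude shape shouldExcludeProject expects; the project value and glob patterns below are made up for the example:

import { ProjectSchema } from '@gitbeaker/rest';

const project = {
    path_with_namespace: 'acme/legacy-api',
    archived: true,
} as ProjectSchema;

// Excluded here because `exclude.archived` is set and the project is archived.
const excluded = shouldExcludeProject({
    project,
    include: { topics: ['backend-*'] },
    exclude: { archived: true, projects: ['acme/legacy-*'] },
});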
|
||||
|
|
@@ -1,10 +1,40 @@
|
|||
import "./instrument.js";
|
||||
|
||||
import * as Sentry from "@sentry/node";
|
||||
import { ArgumentParser } from "argparse";
|
||||
import { existsSync } from 'fs';
|
||||
import { mkdir } from 'fs/promises';
|
||||
import path from 'path';
|
||||
import { isRemotePath } from "./utils.js";
|
||||
import { AppContext } from "./types.js";
|
||||
import { main } from "./main.js"
|
||||
import { PrismaClient } from "@sourcebot/db";
|
||||
|
||||
// Register handler for normal exit
|
||||
process.on('exit', (code) => {
|
||||
console.log(`Process is exiting with code: ${code}`);
|
||||
});
|
||||
|
||||
// Register handlers for abnormal terminations
|
||||
process.on('SIGINT', () => {
|
||||
console.log('Process interrupted (SIGINT)');
|
||||
process.exit(130);
|
||||
});
|
||||
|
||||
process.on('SIGTERM', () => {
|
||||
console.log('Process terminated (SIGTERM)');
|
||||
process.exit(143);
|
||||
});
|
||||
|
||||
// Register handlers for uncaught exceptions and unhandled rejections
|
||||
process.on('uncaughtException', (err) => {
|
||||
console.log(`Uncaught exception: ${err.message}`);
|
||||
process.exit(1);
|
||||
});
|
||||
|
||||
process.on('unhandledRejection', (reason, promise) => {
|
||||
console.log(`Unhandled rejection at: ${promise}, reason: ${reason}`);
|
||||
process.exit(1);
|
||||
});
|
||||
|
||||
|
||||
const parser = new ArgumentParser({
|
||||
|
|
@@ -12,26 +42,15 @@ const parser = new ArgumentParser({
|
|||
});
|
||||
|
||||
type Arguments = {
|
||||
configPath: string;
|
||||
cacheDir: string;
|
||||
}
|
||||
|
||||
parser.add_argument("--configPath", {
|
||||
help: "Path to config file",
|
||||
required: true,
|
||||
});
|
||||
|
||||
parser.add_argument("--cacheDir", {
|
||||
help: "Path to .sourcebot cache directory",
|
||||
required: true,
|
||||
});
|
||||
const args = parser.parse_args() as Arguments;
|
||||
|
||||
if (!isRemotePath(args.configPath) && !existsSync(args.configPath)) {
|
||||
console.error(`Config file ${args.configPath} does not exist`);
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
const cacheDir = args.cacheDir;
|
||||
const reposPath = path.join(cacheDir, 'repos');
|
||||
const indexPath = path.join(cacheDir, 'index');
|
||||
|
|
@@ -47,9 +66,21 @@ const context: AppContext = {
|
|||
indexPath,
|
||||
reposPath,
|
||||
cachePath: cacheDir,
|
||||
configPath: args.configPath,
|
||||
}
|
||||
|
||||
main(context).finally(() => {
|
||||
console.log("Shutting down...");
|
||||
});
|
||||
const prisma = new PrismaClient();
|
||||
|
||||
main(prisma, context)
|
||||
.then(async () => {
|
||||
await prisma.$disconnect();
|
||||
})
|
||||
.catch(async (e) => {
|
||||
console.error(e);
|
||||
Sentry.captureException(e);
|
||||
|
||||
await prisma.$disconnect();
|
||||
process.exit(1);
|
||||
})
|
||||
.finally(() => {
|
||||
console.log("Shutting down...");
|
||||
});
|
||||
|
|
|
|||
packages/backend/src/instrument.ts (new file, 12 lines)
@@ -0,0 +1,12 @@
import * as Sentry from "@sentry/node";
import { env } from "./env.js";

if (!!env.NEXT_PUBLIC_SENTRY_BACKEND_DSN && !!env.NEXT_PUBLIC_SENTRY_ENVIRONMENT) {
    Sentry.init({
        dsn: env.NEXT_PUBLIC_SENTRY_BACKEND_DSN,
        release: env.NEXT_PUBLIC_SOURCEBOT_VERSION,
        environment: env.NEXT_PUBLIC_SENTRY_ENVIRONMENT,
    });
} else {
    console.debug("Sentry was not initialized");
}
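Splitting the Sentry setup into its own instrument.ts module lets it run before anything else is loaded; the index.ts hunk earlier in this diff relies on that by placing a bare side-effect import at the very top of the file:

// First lines of packages/backend/src/index.ts (see the index.ts hunk above):
import "./instrument.js";
// ...the rest of the app is only imported afterwards, e.g.:
import { main } from "./main.js";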
@@ -1,71 +0,0 @@
|
|||
import { existsSync, FSWatcher, statSync, watch } from "fs";
|
||||
import { createLogger } from "./logger.js";
|
||||
import { LocalConfig } from "./schemas/v2.js";
|
||||
import { AppContext, LocalRepository } from "./types.js";
|
||||
import { resolvePathRelativeToConfig } from "./utils.js";
|
||||
import path from "path";
|
||||
|
||||
const logger = createLogger('local');
|
||||
const fileWatchers = new Map<string, FSWatcher>();
|
||||
const abortControllers = new Map<string, AbortController>();
|
||||
|
||||
|
||||
export const getLocalRepoFromConfig = (config: LocalConfig, ctx: AppContext) => {
|
||||
const repoPath = resolvePathRelativeToConfig(config.path, ctx.configPath);
|
||||
logger.debug(`Resolved path '${config.path}' to '${repoPath}'`);
|
||||
|
||||
if (!existsSync(repoPath)) {
|
||||
throw new Error(`The local repository path '${repoPath}' referenced in ${ctx.configPath} does not exist`);
|
||||
}
|
||||
|
||||
const stat = statSync(repoPath);
|
||||
if (!stat.isDirectory()) {
|
||||
throw new Error(`The local repository path '${repoPath}' referenced in ${ctx.configPath} is not a directory`);
|
||||
}
|
||||
|
||||
const repo: LocalRepository = {
|
||||
vcs: 'local',
|
||||
name: path.basename(repoPath),
|
||||
id: repoPath,
|
||||
path: repoPath,
|
||||
isStale: false,
|
||||
excludedPaths: config.exclude?.paths ?? [],
|
||||
watch: config.watch ?? true,
|
||||
}
|
||||
|
||||
return repo;
|
||||
}
|
||||
|
||||
export const initLocalRepoFileWatchers = (repos: LocalRepository[], onUpdate: (repo: LocalRepository, ac: AbortSignal) => Promise<void>) => {
|
||||
// Close all existing watchers
|
||||
fileWatchers.forEach((watcher) => {
|
||||
watcher.close();
|
||||
});
|
||||
|
||||
repos
|
||||
.filter(repo => !repo.isStale && repo.watch)
|
||||
.forEach((repo) => {
|
||||
logger.info(`Watching local repository ${repo.id} for changes...`);
|
||||
const watcher = watch(repo.path, async () => {
|
||||
const existingController = abortControllers.get(repo.id);
|
||||
if (existingController) {
|
||||
existingController.abort();
|
||||
}
|
||||
|
||||
const controller = new AbortController();
|
||||
abortControllers.set(repo.id, controller);
|
||||
|
||||
try {
|
||||
await onUpdate(repo, controller.signal);
|
||||
} catch (err: any) {
|
||||
if (err.name !== 'AbortError') {
|
||||
logger.error(`Error while watching local repository ${repo.id} for changes:`);
|
||||
console.log(err);
|
||||
} else {
|
||||
logger.debug(`Aborting watch for local repository ${repo.id} due to abort signal`);
|
||||
}
|
||||
}
|
||||
});
|
||||
fileWatchers.set(repo.id, watcher);
|
||||
});
|
||||
}
|
||||
|
|
@@ -1,11 +1,14 @@
import winston, { format } from 'winston';
import { SOURCEBOT_LOG_LEVEL } from './environment.js';
import { Logtail } from '@logtail/node';
import { LogtailTransport } from '@logtail/winston';
import { env } from './env.js';

const { combine, colorize, timestamp, prettyPrint, errors, printf, label: labelFn } = format;

const createLogger = (label: string) => {
    return winston.createLogger({
        level: SOURCEBOT_LOG_LEVEL,
        level: env.SOURCEBOT_LOG_LEVEL,
        format: combine(
            errors({ stack: true }),
            timestamp(),

@@ -28,6 +31,13 @@ const createLogger = (label: string) => {
                }),
            ),
        }),
        ...(env.LOGTAIL_TOKEN && env.LOGTAIL_HOST ? [
            new LogtailTransport(
                new Logtail(env.LOGTAIL_TOKEN, {
                    endpoint: env.LOGTAIL_HOST,
                })
            )
        ] : []),
    ]
    });
}
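A minimal usage sketch of the logger factory, assuming createLogger is exported from this module as it is consumed elsewhere in the backend:

const logger = createLogger('example');
logger.info('hello');    // always written to the console transport
logger.debug('details'); // only emitted when env.SOURCEBOT_LOG_LEVEL allows it
// When both LOGTAIL_TOKEN and LOGTAIL_HOST are set, the same records are also shipped to Logtail.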
@@ -1,206 +0,0 @@
|
|||
import { expect, test, vi } from 'vitest';
|
||||
import { deleteStaleRepository, isAllRepoReindexingRequired, isRepoReindexingRequired } from './main';
|
||||
import { AppContext, GitRepository, LocalRepository, Repository, Settings } from './types';
|
||||
import { DEFAULT_DB_DATA } from './db';
|
||||
import { createMockDB } from './db.test';
|
||||
import { rm } from 'fs/promises';
|
||||
import path from 'path';
|
||||
import { glob } from 'glob';
|
||||
|
||||
vi.mock('fs/promises', () => ({
|
||||
rm: vi.fn(),
|
||||
}));
|
||||
|
||||
vi.mock('glob', () => ({
|
||||
glob: vi.fn().mockReturnValue(['fake_index.zoekt']),
|
||||
}));
|
||||
|
||||
vi.mock('fs', () => ({
|
||||
existsSync: vi.fn().mockReturnValue(true),
|
||||
}));
|
||||
|
||||
const createMockContext = (rootPath: string = '/app') => {
|
||||
return {
|
||||
configPath: path.join(rootPath, 'config.json'),
|
||||
cachePath: path.join(rootPath, '.sourcebot'),
|
||||
indexPath: path.join(rootPath, '.sourcebot/index'),
|
||||
reposPath: path.join(rootPath, '.sourcebot/repos'),
|
||||
} satisfies AppContext;
|
||||
}
|
||||
|
||||
|
||||
test('isRepoReindexingRequired should return false when no changes are made', () => {
|
||||
const previous: Repository = {
|
||||
vcs: 'git',
|
||||
name: 'test',
|
||||
id: 'test',
|
||||
path: '',
|
||||
cloneUrl: '',
|
||||
isStale: false,
|
||||
branches: ['main'],
|
||||
tags: ['v1.0'],
|
||||
};
|
||||
const current = previous;
|
||||
|
||||
expect(isRepoReindexingRequired(previous, current)).toBe(false);
|
||||
})
|
||||
|
||||
test('isRepoReindexingRequired should return true when git branches change', () => {
|
||||
const previous: Repository = {
|
||||
vcs: 'git',
|
||||
name: 'test',
|
||||
id: 'test',
|
||||
path: '',
|
||||
cloneUrl: '',
|
||||
isStale: false,
|
||||
branches: ['main'],
|
||||
tags: ['v1.0'],
|
||||
};
|
||||
|
||||
const current: Repository = {
|
||||
...previous,
|
||||
branches: ['main', 'feature']
|
||||
};
|
||||
|
||||
expect(isRepoReindexingRequired(previous, current)).toBe(true);
|
||||
});
|
||||
|
||||
test('isRepoReindexingRequired should return true when git tags change', () => {
|
||||
const previous: Repository = {
|
||||
vcs: 'git',
|
||||
name: 'test',
|
||||
id: 'test',
|
||||
path: '',
|
||||
cloneUrl: '',
|
||||
isStale: false,
|
||||
branches: ['main'],
|
||||
tags: ['v1.0'],
|
||||
};
|
||||
|
||||
const current: Repository = {
|
||||
...previous,
|
||||
tags: ['v1.0', 'v2.0']
|
||||
};
|
||||
|
||||
expect(isRepoReindexingRequired(previous, current)).toBe(true);
|
||||
});
|
||||
|
||||
test('isRepoReindexingRequired should return true when local excludedPaths change', () => {
|
||||
const previous: Repository = {
|
||||
vcs: 'local',
|
||||
name: 'test',
|
||||
id: 'test',
|
||||
path: '/',
|
||||
isStale: false,
|
||||
excludedPaths: ['node_modules'],
|
||||
watch: false,
|
||||
};
|
||||
|
||||
const current: Repository = {
|
||||
...previous,
|
||||
excludedPaths: ['node_modules', 'dist']
|
||||
};
|
||||
|
||||
expect(isRepoReindexingRequired(previous, current)).toBe(true);
|
||||
});
|
||||
|
||||
test('isAllRepoReindexingRequired should return false when fileLimitSize has not changed', () => {
|
||||
const previous: Settings = {
|
||||
maxFileSize: 1000,
|
||||
autoDeleteStaleRepos: true,
|
||||
}
|
||||
const current: Settings = {
|
||||
...previous,
|
||||
}
|
||||
expect(isAllRepoReindexingRequired(previous, current)).toBe(false);
|
||||
});
|
||||
|
||||
test('isAllRepoReindexingRequired should return true when fileLimitSize has changed', () => {
|
||||
const previous: Settings = {
|
||||
maxFileSize: 1000,
|
||||
autoDeleteStaleRepos: true,
|
||||
}
|
||||
const current: Settings = {
|
||||
...previous,
|
||||
maxFileSize: 2000,
|
||||
}
|
||||
expect(isAllRepoReindexingRequired(previous, current)).toBe(true);
|
||||
});
|
||||
|
||||
test('isAllRepoReindexingRequired should return false when autoDeleteStaleRepos has changed', () => {
|
||||
const previous: Settings = {
|
||||
maxFileSize: 1000,
|
||||
autoDeleteStaleRepos: true,
|
||||
}
|
||||
const current: Settings = {
|
||||
...previous,
|
||||
autoDeleteStaleRepos: false,
|
||||
}
|
||||
expect(isAllRepoReindexingRequired(previous, current)).toBe(false);
|
||||
});
|
||||
|
||||
test('deleteStaleRepository can delete a git repository', async () => {
|
||||
const ctx = createMockContext();
|
||||
|
||||
const repo: GitRepository = {
|
||||
id: 'github.com/sourcebot-dev/sourcebot',
|
||||
vcs: 'git',
|
||||
name: 'sourcebot',
|
||||
cloneUrl: 'https://github.com/sourcebot-dev/sourcebot',
|
||||
path: `${ctx.reposPath}/github.com/sourcebot-dev/sourcebot`,
|
||||
branches: ['main'],
|
||||
tags: [''],
|
||||
isStale: true,
|
||||
}
|
||||
|
||||
const db = createMockDB({
|
||||
...DEFAULT_DB_DATA,
|
||||
repos: {
|
||||
'github.com/sourcebot-dev/sourcebot': repo,
|
||||
}
|
||||
});
|
||||
|
||||
|
||||
await deleteStaleRepository(repo, db, ctx);
|
||||
|
||||
expect(db.data.repos['github.com/sourcebot-dev/sourcebot']).toBeUndefined();
|
||||
expect(rm).toHaveBeenCalledWith(`${ctx.reposPath}/github.com/sourcebot-dev/sourcebot`, {
|
||||
recursive: true,
|
||||
});
|
||||
expect(glob).toHaveBeenCalledWith(`github.com%2Fsourcebot-dev%2Fsourcebot*.zoekt`, {
|
||||
cwd: ctx.indexPath,
|
||||
absolute: true
|
||||
});
|
||||
expect(rm).toHaveBeenCalledWith(`fake_index.zoekt`);
|
||||
});
|
||||
|
||||
test('deleteStaleRepository can delete a local repository', async () => {
|
||||
const ctx = createMockContext();
|
||||
|
||||
const repo: LocalRepository = {
|
||||
vcs: 'local',
|
||||
name: 'UnrealEngine',
|
||||
id: '/path/to/UnrealEngine',
|
||||
path: '/path/to/UnrealEngine',
|
||||
watch: false,
|
||||
excludedPaths: [],
|
||||
isStale: true,
|
||||
}
|
||||
|
||||
const db = createMockDB({
|
||||
...DEFAULT_DB_DATA,
|
||||
repos: {
|
||||
'/path/to/UnrealEngine': repo,
|
||||
}
|
||||
});
|
||||
|
||||
await deleteStaleRepository(repo, db, ctx);
|
||||
|
||||
expect(db.data.repos['/path/to/UnrealEngine']).toBeUndefined();
|
||||
expect(rm).not.toHaveBeenCalledWith('/path/to/UnrealEngine');
|
||||
expect(glob).toHaveBeenCalledWith(`UnrealEngine*.zoekt`, {
|
||||
cwd: ctx.indexPath,
|
||||
absolute: true
|
||||
});
|
||||
expect(rm).toHaveBeenCalledWith('fake_index.zoekt');
|
||||
});
|
||||
|
|
@@ -1,414 +1,72 @@
|
|||
import { readFile, rm } from 'fs/promises';
|
||||
import { existsSync, watch } from 'fs';
|
||||
import { SourcebotConfigurationSchema } from "./schemas/v2.js";
|
||||
import { getGitHubReposFromConfig } from "./github.js";
|
||||
import { getGitLabReposFromConfig } from "./gitlab.js";
|
||||
import { getGiteaReposFromConfig } from "./gitea.js";
|
||||
import { getGerritReposFromConfig } from "./gerrit.js";
|
||||
import { AppContext, LocalRepository, GitRepository, Repository, Settings } from "./types.js";
|
||||
import { cloneRepository, fetchRepository, getGitRepoFromConfig } from "./git.js";
|
||||
import { PrismaClient } from '@sourcebot/db';
|
||||
import { createLogger } from "./logger.js";
|
||||
import { createRepository, Database, loadDB, updateRepository, updateSettings } from './db.js';
|
||||
import { arraysEqualShallow, isRemotePath, measure } from "./utils.js";
|
||||
import { DEFAULT_SETTINGS } from "./constants.js";
|
||||
import { AppContext } from "./types.js";
|
||||
import { DEFAULT_SETTINGS } from './constants.js';
|
||||
import { Redis } from 'ioredis';
|
||||
import { ConnectionManager } from './connectionManager.js';
|
||||
import { RepoManager } from './repoManager.js';
|
||||
import { env } from './env.js';
|
||||
import { PromClient } from './promClient.js';
|
||||
import { isRemotePath } from './utils.js';
|
||||
import { readFile } from 'fs/promises';
|
||||
import stripJsonComments from 'strip-json-comments';
|
||||
import { indexGitRepository, indexLocalRepository } from "./zoekt.js";
|
||||
import { getLocalRepoFromConfig, initLocalRepoFileWatchers } from "./local.js";
|
||||
import { captureEvent } from "./posthog.js";
|
||||
import { glob } from 'glob';
|
||||
import path from 'path';
|
||||
import { SourcebotConfig } from '@sourcebot/schemas/v3/index.type';
|
||||
import { indexSchema } from '@sourcebot/schemas/v3/index.schema';
|
||||
import { Ajv } from "ajv";
|
||||
|
||||
const logger = createLogger('main');
|
||||
const ajv = new Ajv({
|
||||
validateFormats: false,
|
||||
});
|
||||
|
||||
const syncGitRepository = async (repo: GitRepository, settings: Settings, ctx: AppContext) => {
|
||||
let fetchDuration_s: number | undefined = undefined;
|
||||
let cloneDuration_s: number | undefined = undefined;
|
||||
|
||||
if (existsSync(repo.path)) {
|
||||
logger.info(`Fetching ${repo.id}...`);
|
||||
|
||||
const { durationMs } = await measure(() => fetchRepository(repo, ({ method, stage , progress}) => {
|
||||
logger.info(`git.${method} ${stage} stage ${progress}% complete for ${repo.id}`)
|
||||
}));
|
||||
fetchDuration_s = durationMs / 1000;
|
||||
|
||||
process.stdout.write('\n');
|
||||
logger.info(`Fetched ${repo.id} in ${fetchDuration_s}s`);
|
||||
|
||||
} else {
|
||||
logger.info(`Cloning ${repo.id}...`);
|
||||
|
||||
const { durationMs } = await measure(() => cloneRepository(repo, ({ method, stage, progress }) => {
|
||||
logger.info(`git.${method} ${stage} stage ${progress}% complete for ${repo.id}`)
|
||||
}));
|
||||
cloneDuration_s = durationMs / 1000;
|
||||
|
||||
process.stdout.write('\n');
|
||||
logger.info(`Cloned ${repo.id} in ${cloneDuration_s}s`);
|
||||
const getSettings = async (configPath?: string) => {
|
||||
if (!configPath) {
|
||||
return DEFAULT_SETTINGS;
|
||||
}
|
||||
|
||||
logger.info(`Indexing ${repo.id}...`);
|
||||
const { durationMs } = await measure(() => indexGitRepository(repo, settings, ctx));
|
||||
const indexDuration_s = durationMs / 1000;
|
||||
logger.info(`Indexed ${repo.id} in ${indexDuration_s}s`);
|
||||
|
||||
return {
|
||||
fetchDuration_s,
|
||||
cloneDuration_s,
|
||||
indexDuration_s,
|
||||
}
|
||||
}
|
||||
|
||||
const syncLocalRepository = async (repo: LocalRepository, settings: Settings, ctx: AppContext, signal?: AbortSignal) => {
|
||||
logger.info(`Indexing ${repo.id}...`);
|
||||
const { durationMs } = await measure(() => indexLocalRepository(repo, settings, ctx, signal));
|
||||
const indexDuration_s = durationMs / 1000;
|
||||
logger.info(`Indexed ${repo.id} in ${indexDuration_s}s`);
|
||||
return {
|
||||
indexDuration_s,
|
||||
}
|
||||
}
|
||||
|
||||
export const deleteStaleRepository = async (repo: Repository, db: Database, ctx: AppContext) => {
|
||||
logger.info(`Deleting stale repository ${repo.id}:`);
|
||||
|
||||
// Delete the checked out git repository (if applicable)
|
||||
if (repo.vcs === "git" && existsSync(repo.path)) {
|
||||
logger.info(`\tDeleting git directory ${repo.path}...`);
|
||||
await rm(repo.path, {
|
||||
recursive: true,
|
||||
});
|
||||
}
|
||||
|
||||
// Delete all .zoekt index files
|
||||
{
|
||||
// .zoekt index files are named with the repository name,
|
||||
// index version, and shard number. Some examples:
|
||||
//
|
||||
// git repos:
|
||||
// github.com%2Fsourcebot-dev%2Fsourcebot_v16.00000.zoekt
|
||||
// gitlab.com%2Fmy-org%2Fmy-project.00000.zoekt
|
||||
//
|
||||
// local repos:
|
||||
// UnrealEngine_v16.00000.zoekt
|
||||
// UnrealEngine_v16.00001.zoekt
|
||||
// ...
|
||||
// UnrealEngine_v16.00016.zoekt
|
||||
//
|
||||
// Notice that local repos are named with the repository basename and
|
||||
// git repos are named with the query-encoded repository name. Form a
|
||||
// glob pattern with the correct prefix & suffix to match the correct
|
||||
// index file(s) for the repository.
|
||||
//
|
||||
// @see : https://github.com/sourcegraph/zoekt/blob/c03b77fbf18b76904c0e061f10f46597eedd7b14/build/builder.go#L348
|
||||
const indexFilesGlobPattern = (() => {
|
||||
switch (repo.vcs) {
|
||||
case 'git':
|
||||
return `${encodeURIComponent(repo.id)}*.zoekt`;
|
||||
case 'local':
|
||||
return `${path.basename(repo.path)}*.zoekt`;
|
||||
}
|
||||
})();
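// For example (illustrative values, following the naming scheme described above):
//   a git repo with id 'github.com/sourcebot-dev/sourcebot' yields the pattern 'github.com%2Fsourcebot-dev%2Fsourcebot*.zoekt'
//   a local repo checked out at '/data/UnrealEngine' yields the pattern 'UnrealEngine*.zoekt'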
|
||||
|
||||
const indexFiles = await glob(indexFilesGlobPattern, {
|
||||
cwd: ctx.indexPath,
|
||||
absolute: true
|
||||
});
|
||||
|
||||
await Promise.all(indexFiles.map((file) => {
|
||||
if (!existsSync(file)) {
|
||||
return;
|
||||
}
|
||||
|
||||
logger.info(`\tDeleting index file ${file}...`);
|
||||
return rm(file);
|
||||
}));
|
||||
}
|
||||
|
||||
// Delete db entry
|
||||
logger.info(`\tDeleting db entry...`);
|
||||
await db.update(({ repos }) => {
|
||||
delete repos[repo.id];
|
||||
});
|
||||
|
||||
logger.info(`Deleted stale repository ${repo.id}`);
|
||||
|
||||
captureEvent('repo_deleted', {
|
||||
vcs: repo.vcs,
|
||||
codeHost: repo.codeHost,
|
||||
})
|
||||
}
|
||||
|
||||
/**
|
||||
* Certain configuration changes (e.g., a branch is added) require
|
||||
* a reindexing of the repository.
|
||||
*/
|
||||
export const isRepoReindexingRequired = (previous: Repository, current: Repository) => {
|
||||
/**
|
||||
Checks if any of the `revisions` properties have changed.
|
||||
*/
|
||||
const isRevisionsChanged = () => {
|
||||
if (previous.vcs !== 'git' || current.vcs !== 'git') {
|
||||
return false;
|
||||
}
|
||||
|
||||
return (
|
||||
!arraysEqualShallow(previous.branches, current.branches) ||
|
||||
!arraysEqualShallow(previous.tags, current.tags)
|
||||
);
|
||||
}
|
||||
|
||||
/**
|
||||
* Check if the `exclude.paths` property has changed.
|
||||
*/
|
||||
const isExcludePathsChanged = () => {
|
||||
if (previous.vcs !== 'local' || current.vcs !== 'local') {
|
||||
return false;
|
||||
}
|
||||
|
||||
return !arraysEqualShallow(previous.excludedPaths, current.excludedPaths);
|
||||
}
|
||||
|
||||
return (
|
||||
isRevisionsChanged() ||
|
||||
isExcludePathsChanged()
|
||||
)
|
||||
}
|
||||
|
||||
/**
|
||||
* Certain settings changes (e.g., the file limit size is changed) require
|
||||
* a reindexing of _all_ repositories.
|
||||
*/
|
||||
export const isAllRepoReindexingRequired = (previous: Settings, current: Settings) => {
|
||||
return (
|
||||
previous?.maxFileSize !== current?.maxFileSize
|
||||
)
|
||||
}
|
||||
|
||||
const syncConfig = async (configPath: string, db: Database, signal: AbortSignal, ctx: AppContext) => {
|
||||
const configContent = await (async () => {
|
||||
if (isRemotePath(configPath)) {
|
||||
const response = await fetch(configPath, {
|
||||
signal,
|
||||
});
|
||||
const response = await fetch(configPath);
|
||||
if (!response.ok) {
|
||||
throw new Error(`Failed to fetch config file ${configPath}: ${response.statusText}`);
|
||||
}
|
||||
return response.text();
|
||||
} else {
|
||||
return readFile(configPath, {
|
||||
encoding: 'utf-8',
|
||||
signal,
|
||||
});
|
||||
return readFile(configPath, { encoding: 'utf-8' });
|
||||
}
|
||||
})();
|
||||
|
||||
// @todo: we should validate the configuration file's structure here.
|
||||
const config = JSON.parse(stripJsonComments(configContent)) as SourcebotConfigurationSchema;
|
||||
|
||||
// Update the settings
|
||||
const updatedSettings: Settings = {
|
||||
maxFileSize: config.settings?.maxFileSize ?? DEFAULT_SETTINGS.maxFileSize,
|
||||
maxTrigramCount: config.settings?.maxTrigramCount ?? DEFAULT_SETTINGS.maxTrigramCount,
|
||||
autoDeleteStaleRepos: config.settings?.autoDeleteStaleRepos ?? DEFAULT_SETTINGS.autoDeleteStaleRepos,
|
||||
reindexInterval: config.settings?.reindexInterval ?? DEFAULT_SETTINGS.reindexInterval,
|
||||
resyncInterval: config.settings?.resyncInterval ?? DEFAULT_SETTINGS.resyncInterval,
|
||||
}
|
||||
const _isAllRepoReindexingRequired = isAllRepoReindexingRequired(db.data.settings, updatedSettings);
|
||||
await updateSettings(updatedSettings, db);
|
||||
|
||||
// Fetch all repositories from the config file
|
||||
let configRepos: Repository[] = [];
|
||||
for (const repoConfig of config.repos ?? []) {
|
||||
switch (repoConfig.type) {
|
||||
case 'github': {
|
||||
const gitHubRepos = await getGitHubReposFromConfig(repoConfig, signal, ctx);
|
||||
configRepos.push(...gitHubRepos);
|
||||
break;
|
||||
}
|
||||
case 'gitlab': {
|
||||
const gitLabRepos = await getGitLabReposFromConfig(repoConfig, ctx);
|
||||
configRepos.push(...gitLabRepos);
|
||||
break;
|
||||
}
|
||||
case 'gitea': {
|
||||
const giteaRepos = await getGiteaReposFromConfig(repoConfig, ctx);
|
||||
configRepos.push(...giteaRepos);
|
||||
break;
|
||||
}
|
||||
case 'gerrit': {
|
||||
const gerritRepos = await getGerritReposFromConfig(repoConfig, ctx);
|
||||
configRepos.push(...gerritRepos);
|
||||
break;
|
||||
}
|
||||
case 'local': {
|
||||
const repo = getLocalRepoFromConfig(repoConfig, ctx);
|
||||
configRepos.push(repo);
|
||||
break;
|
||||
}
|
||||
case 'git': {
|
||||
const gitRepo = await getGitRepoFromConfig(repoConfig, ctx);
|
||||
gitRepo && configRepos.push(gitRepo);
|
||||
break;
|
||||
}
|
||||
}
|
||||
const config = JSON.parse(stripJsonComments(configContent)) as SourcebotConfig;
|
||||
const isValidConfig = ajv.validate(indexSchema, config);
|
||||
if (!isValidConfig) {
|
||||
throw new Error(`Config file '${configPath}' is invalid: ${ajv.errorsText(ajv.errors)}`);
|
||||
}
|
||||
|
||||
// De-duplicate on id
|
||||
configRepos.sort((a, b) => {
|
||||
return a.id.localeCompare(b.id);
|
||||
});
|
||||
configRepos = configRepos.filter((item, index, self) => {
|
||||
if (index === 0) return true;
|
||||
if (item.id === self[index - 1].id) {
|
||||
logger.debug(`Duplicate repository ${item.id} found in config file.`);
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
});
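// For example (illustrative): ids ['a', 'b', 'b', 'c'] are sorted and collapse to ['a', 'b', 'c'],
// with the duplicate 'b' logged at debug level.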
|
||||
|
||||
logger.info(`Discovered ${configRepos.length} unique repositories from config.`);
|
||||
|
||||
// Merge the repositories into the database
|
||||
for (const newRepo of configRepos) {
|
||||
if (newRepo.id in db.data.repos) {
|
||||
const existingRepo = db.data.repos[newRepo.id];
|
||||
const isReindexingRequired = _isAllRepoReindexingRequired || isRepoReindexingRequired(existingRepo, newRepo);
|
||||
if (isReindexingRequired) {
|
||||
logger.info(`Marking ${newRepo.id} for reindexing due to configuration change.`);
|
||||
}
|
||||
await updateRepository(existingRepo.id, {
|
||||
...newRepo,
|
||||
...(isReindexingRequired ? {
|
||||
lastIndexedDate: undefined,
|
||||
}: {})
|
||||
}, db);
|
||||
} else {
|
||||
await createRepository(newRepo, db);
|
||||
|
||||
captureEvent("repo_created", {
|
||||
vcs: newRepo.vcs,
|
||||
codeHost: newRepo.codeHost,
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
// Find repositories that are in the database, but not in the configuration file
|
||||
{
|
||||
const a = configRepos.map(repo => repo.id);
|
||||
const b = Object.keys(db.data.repos);
|
||||
const diff = b.filter(x => !a.includes(x));
|
||||
|
||||
for (const id of diff) {
|
||||
await db.update(({ repos }) => {
|
||||
const repo = repos[id];
|
||||
if (repo.isStale) {
|
||||
return;
|
||||
}
|
||||
|
||||
logger.warn(`Repository ${id} is no longer listed in the configuration file or was not found. Marking as stale.`);
|
||||
repo.isStale = true;
|
||||
});
|
||||
}
|
||||
return {
|
||||
...DEFAULT_SETTINGS,
|
||||
...config.settings,
|
||||
}
|
||||
}
|
||||
|
||||
export const main = async (context: AppContext) => {
|
||||
const db = await loadDB(context);
|
||||
|
||||
let abortController = new AbortController();
|
||||
let isSyncing = false;
|
||||
const _syncConfig = async () => {
|
||||
if (isSyncing) {
|
||||
abortController.abort();
|
||||
abortController = new AbortController();
|
||||
}
|
||||
export const main = async (db: PrismaClient, context: AppContext) => {
|
||||
const redis = new Redis(env.REDIS_URL, {
|
||||
maxRetriesPerRequest: null
|
||||
});
|
||||
redis.ping().then(() => {
|
||||
logger.info('Connected to redis');
|
||||
}).catch((err: unknown) => {
|
||||
logger.error('Failed to connect to redis');
|
||||
console.error(err);
|
||||
process.exit(1);
|
||||
});
|
||||
|
||||
logger.info(`Syncing configuration file ${context.configPath} ...`);
|
||||
isSyncing = true;
|
||||
const settings = await getSettings(env.CONFIG_PATH);
|
||||
|
||||
try {
|
||||
const { durationMs } = await measure(() => syncConfig(context.configPath, db, abortController.signal, context))
|
||||
logger.info(`Synced configuration file ${context.configPath} in ${durationMs / 1000}s`);
|
||||
isSyncing = false;
|
||||
} catch (err: any) {
|
||||
if (err.name === "AbortError") {
|
||||
// @note: If we're aborting, we don't want to set isSyncing to false
|
||||
// since it implies another sync is in progress.
|
||||
} else {
|
||||
isSyncing = false;
|
||||
logger.error(`Failed to sync configuration file ${context.configPath} with error:`);
|
||||
console.log(err);
|
||||
}
|
||||
}
|
||||
const promClient = new PromClient();
|
||||
|
||||
const localRepos = Object.values(db.data.repos).filter(repo => repo.vcs === 'local');
|
||||
initLocalRepoFileWatchers(localRepos, async (repo, signal) => {
|
||||
logger.info(`Change detected to local repository ${repo.id}. Re-syncing...`);
|
||||
await syncLocalRepository(repo, db.data.settings, context, signal);
|
||||
await db.update(({ repos }) => repos[repo.id].lastIndexedDate = new Date().toUTCString());
|
||||
});
|
||||
}
|
||||
const connectionManager = new ConnectionManager(db, settings, redis);
|
||||
connectionManager.registerPollingCallback();
|
||||
|
||||
// Re-sync on file changes if the config file is local
|
||||
if (!isRemotePath(context.configPath)) {
|
||||
watch(context.configPath, () => {
|
||||
logger.info(`Config file ${context.configPath} changed. Re-syncing...`);
|
||||
_syncConfig();
|
||||
});
|
||||
}
|
||||
|
||||
// Re-sync at a fixed interval
|
||||
setInterval(() => {
|
||||
_syncConfig();
|
||||
}, db.data.settings.resyncInterval);
|
||||
|
||||
// Sync immediately on startup
|
||||
await _syncConfig();
|
||||
|
||||
while (true) {
|
||||
const repos = db.data.repos;
|
||||
|
||||
for (const [_, repo] of Object.entries(repos)) {
|
||||
const lastIndexed = repo.lastIndexedDate ? new Date(repo.lastIndexedDate) : new Date(0);
|
||||
|
||||
if (repo.isStale) {
|
||||
if (db.data.settings.autoDeleteStaleRepos) {
|
||||
await deleteStaleRepository(repo, db, context);
|
||||
} else {
|
||||
// skip deletion...
|
||||
}
|
||||
continue;
|
||||
}
|
||||
|
||||
if (lastIndexed.getTime() > (Date.now() - db.data.settings.reindexInterval)) {
|
||||
continue;
|
||||
}
|
||||
|
||||
try {
|
||||
let indexDuration_s: number | undefined;
|
||||
let fetchDuration_s: number | undefined;
|
||||
let cloneDuration_s: number | undefined;
|
||||
|
||||
if (repo.vcs === 'git') {
|
||||
const stats = await syncGitRepository(repo, db.data.settings, context);
|
||||
indexDuration_s = stats.indexDuration_s;
|
||||
fetchDuration_s = stats.fetchDuration_s;
|
||||
cloneDuration_s = stats.cloneDuration_s;
|
||||
} else if (repo.vcs === 'local') {
|
||||
const stats = await syncLocalRepository(repo, db.data.settings, context);
|
||||
indexDuration_s = stats.indexDuration_s;
|
||||
}
|
||||
} catch (err: any) {
|
||||
// @todo : better error handling here..
|
||||
logger.error(err);
|
||||
continue;
|
||||
}
|
||||
|
||||
await db.update(({ repos }) => repos[repo.id].lastIndexedDate = new Date().toUTCString());
|
||||
}
|
||||
|
||||
await new Promise(resolve => setTimeout(resolve, 1000));
|
||||
|
||||
}
|
||||
const repoManager = new RepoManager(db, settings, redis, promClient, context);
|
||||
await repoManager.blockingPollLoop();
|
||||
}
|
||||
|
|
|
|||
|
|
@@ -1,29 +1,29 @@
import { PostHog } from 'posthog-node';
import { PosthogEvent, PosthogEventMap } from './posthogEvents.js';
import { POSTHOG_HOST, POSTHOG_PAPIK, SOURCEBOT_INSTALL_ID, SOURCEBOT_TELEMETRY_DISABLED, SOURCEBOT_VERSION } from './environment.js';
import { env } from './env.js';

let posthog: PostHog | undefined = undefined;

if (POSTHOG_PAPIK) {
if (env.NEXT_PUBLIC_POSTHOG_PAPIK) {
    posthog = new PostHog(
        POSTHOG_PAPIK,
        env.NEXT_PUBLIC_POSTHOG_PAPIK,
        {
            host: POSTHOG_HOST,
            host: "https://us.i.posthog.com",
        }
    );
}

export function captureEvent<E extends PosthogEvent>(event: E, properties: PosthogEventMap[E]) {
    if (SOURCEBOT_TELEMETRY_DISABLED) {
    if (env.SOURCEBOT_TELEMETRY_DISABLED === 'true') {
        return;
    }

    posthog?.capture({
        distinctId: SOURCEBOT_INSTALL_ID,
        distinctId: env.SOURCEBOT_INSTALL_ID,
        event: event,
        properties: {
            ...properties,
            sourcebot_version: SOURCEBOT_VERSION,
            sourcebot_version: env.NEXT_PUBLIC_SOURCEBOT_VERSION,
        },
    });
}
@@ -5,17 +5,24 @@ export type PosthogEventMap = {
        vcs: string;
        codeHost?: string;
    },
    repo_synced: {
        vcs: string;
        codeHost?: string;
        fetchDuration_s?: number;
        cloneDuration_s?: number;
        indexDuration_s?: number;
    },
    repo_deleted: {
        vcs: string;
        codeHost?: string;
    }
    },
    //////////////////////////////////////////////////////////////////
    backend_connection_sync_job_failed: {
        connectionId: number,
        error: string,
    },
    backend_connection_sync_job_completed: {
        connectionId: number,
        repoCount: number,
    },
    backend_revisions_truncated: {
        repoId: number,
        revisionCount: number,
    },
    //////////////////////////////////////////////////////////////////
}

export type PosthogEvent = keyof PosthogEventMap;
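A short usage sketch: the event name passed to captureEvent (defined in posthog.ts above) narrows the allowed properties through PosthogEventMap:

captureEvent('backend_connection_sync_job_completed', {
    connectionId: 1,
    repoCount: 42,
});

// Type error: 'error' is required for backend_connection_sync_job_failed.
// captureEvent('backend_connection_sync_job_failed', { connectionId: 1 });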
packages/backend/src/promClient.ts (new file, 106 lines)
@@ -0,0 +1,106 @@
|
|||
import express, { Request, Response } from 'express';
|
||||
import client, { Registry, Counter, Gauge } from 'prom-client';
|
||||
|
||||
export class PromClient {
|
||||
private registry: Registry;
|
||||
private app: express.Application;
|
||||
public activeRepoIndexingJobs: Gauge<string>;
|
||||
public pendingRepoIndexingJobs: Gauge<string>;
|
||||
public repoIndexingReattemptsTotal: Counter<string>;
|
||||
public repoIndexingFailTotal: Counter<string>;
|
||||
public repoIndexingSuccessTotal: Counter<string>;
|
||||
|
||||
public activeRepoGarbageCollectionJobs: Gauge<string>;
|
||||
public repoGarbageCollectionErrorTotal: Counter<string>;
|
||||
public repoGarbageCollectionFailTotal: Counter<string>;
|
||||
public repoGarbageCollectionSuccessTotal: Counter<string>;
|
||||
|
||||
public readonly PORT = 3060;
|
||||
|
||||
constructor() {
|
||||
this.registry = new Registry();
|
||||
|
||||
this.activeRepoIndexingJobs = new Gauge({
|
||||
name: 'active_repo_indexing_jobs',
|
||||
help: 'The number of repo indexing jobs in progress',
|
||||
labelNames: ['repo'],
|
||||
});
|
||||
this.registry.registerMetric(this.activeRepoIndexingJobs);
|
||||
|
||||
this.pendingRepoIndexingJobs = new Gauge({
|
||||
name: 'pending_repo_indexing_jobs',
|
||||
help: 'The number of repo indexing jobs waiting in queue',
|
||||
labelNames: ['repo'],
|
||||
});
|
||||
this.registry.registerMetric(this.pendingRepoIndexingJobs);
|
||||
|
||||
this.repoIndexingReattemptsTotal = new Counter({
|
||||
name: 'repo_indexing_reattempts',
|
||||
help: 'The number of repo indexing reattempts',
|
||||
labelNames: ['repo'],
|
||||
});
|
||||
this.registry.registerMetric(this.repoIndexingReattemptsTotal);
|
||||
|
||||
this.repoIndexingFailTotal = new Counter({
|
||||
name: 'repo_indexing_fails',
|
||||
help: 'The number of repo indexing fails',
|
||||
labelNames: ['repo'],
|
||||
});
|
||||
this.registry.registerMetric(this.repoIndexingFailTotal);
|
||||
|
||||
this.repoIndexingSuccessTotal = new Counter({
|
||||
name: 'repo_indexing_successes',
|
||||
help: 'The number of repo indexing successes',
|
||||
labelNames: ['repo'],
|
||||
});
|
||||
this.registry.registerMetric(this.repoIndexingSuccessTotal);
|
||||
|
||||
this.activeRepoGarbageCollectionJobs = new Gauge({
|
||||
name: 'active_repo_garbage_collection_jobs',
|
||||
help: 'The number of repo garbage collection jobs in progress',
|
||||
labelNames: ['repo'],
|
||||
});
|
||||
this.registry.registerMetric(this.activeRepoGarbageCollectionJobs);
|
||||
|
||||
this.repoGarbageCollectionErrorTotal = new Counter({
|
||||
name: 'repo_garbage_collection_errors',
|
||||
help: 'The number of repo garbage collection errors',
|
||||
labelNames: ['repo'],
|
||||
});
|
||||
this.registry.registerMetric(this.repoGarbageCollectionErrorTotal);
|
||||
|
||||
this.repoGarbageCollectionFailTotal = new Counter({
|
||||
name: 'repo_garbage_collection_fails',
|
||||
help: 'The number of repo garbage collection fails',
|
||||
labelNames: ['repo'],
|
||||
});
|
||||
this.registry.registerMetric(this.repoGarbageCollectionFailTotal);
|
||||
|
||||
this.repoGarbageCollectionSuccessTotal = new Counter({
|
||||
name: 'repo_garbage_collection_successes',
|
||||
help: 'The number of repo garbage collection successes',
|
||||
labelNames: ['repo'],
|
||||
});
|
||||
this.registry.registerMetric(this.repoGarbageCollectionSuccessTotal);
|
||||
|
||||
client.collectDefaultMetrics({
|
||||
register: this.registry,
|
||||
});
|
||||
|
||||
this.app = express();
|
||||
this.app.get('/metrics', async (req: Request, res: Response) => {
|
||||
res.set('Content-Type', this.registry.contentType);
|
||||
|
||||
const metrics = await this.registry.metrics();
|
||||
res.end(metrics);
|
||||
});
|
||||
|
||||
this.app.listen(this.PORT, () => {
|
||||
console.log(`Prometheus metrics server is running on port ${this.PORT}`);
|
||||
});
|
||||
}
|
||||
|
||||
getRegistry(): Registry {
|
||||
return this.registry;
|
||||
}
|
||||
}
|
||||
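A usage sketch (assumed consumer code; the real call sites are in the RepoManager later in this diff): counters and gauges are updated around each job and scraped from the /metrics endpoint on port 3060:

const promClient = new PromClient();
promClient.pendingRepoIndexingJobs.inc({ repo: '42' }); // repo queued
promClient.pendingRepoIndexingJobs.dec({ repo: '42' }); // repo picked up by a worker
promClient.activeRepoIndexingJobs.inc();
promClient.repoIndexingSuccessTotal.inc();
// curl http://localhost:3060/metrics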
packages/backend/src/repoCompileUtils.ts (new file, 279 lines)
@@ -0,0 +1,279 @@
|
|||
import { GithubConnectionConfig } from '@sourcebot/schemas/v3/github.type';
|
||||
import { getGitHubReposFromConfig } from "./github.js";
|
||||
import { getGitLabReposFromConfig } from "./gitlab.js";
|
||||
import { getGiteaReposFromConfig } from "./gitea.js";
|
||||
import { getGerritReposFromConfig } from "./gerrit.js";
|
||||
import { Prisma, PrismaClient } from '@sourcebot/db';
|
||||
import { WithRequired } from "./types.js"
|
||||
import { marshalBool } from "./utils.js";
|
||||
import { GerritConnectionConfig, GiteaConnectionConfig, GitlabConnectionConfig } from '@sourcebot/schemas/v3/connection.type';
|
||||
import { RepoMetadata } from './types.js';
|
||||
|
||||
export type RepoData = WithRequired<Prisma.RepoCreateInput, 'connections'>;
|
||||
|
||||
export const compileGithubConfig = async (
|
||||
config: GithubConnectionConfig,
|
||||
connectionId: number,
|
||||
orgId: number,
|
||||
db: PrismaClient,
|
||||
abortController: AbortController): Promise<{
|
||||
repoData: RepoData[],
|
||||
notFound: {
|
||||
users: string[],
|
||||
orgs: string[],
|
||||
repos: string[],
|
||||
}
|
||||
}> => {
|
||||
const gitHubReposResult = await getGitHubReposFromConfig(config, orgId, db, abortController.signal);
|
||||
const gitHubRepos = gitHubReposResult.validRepos;
|
||||
const notFound = gitHubReposResult.notFound;
|
||||
|
||||
const hostUrl = config.url ?? 'https://github.com';
|
||||
const hostname = new URL(hostUrl).hostname;
|
||||
|
||||
const repos = gitHubRepos.map((repo) => {
|
||||
const repoName = `${hostname}/${repo.full_name}`;
|
||||
const cloneUrl = new URL(repo.clone_url!);
|
||||
|
||||
const record: RepoData = {
|
||||
external_id: repo.id.toString(),
|
||||
external_codeHostType: 'github',
|
||||
external_codeHostUrl: hostUrl,
|
||||
cloneUrl: cloneUrl.toString(),
|
||||
webUrl: repo.html_url,
|
||||
name: repoName,
|
||||
imageUrl: repo.owner.avatar_url,
|
||||
isFork: repo.fork,
|
||||
isArchived: !!repo.archived,
|
||||
org: {
|
||||
connect: {
|
||||
id: orgId,
|
||||
},
|
||||
},
|
||||
connections: {
|
||||
create: {
|
||||
connectionId: connectionId,
|
||||
}
|
||||
},
|
||||
metadata: {
|
||||
gitConfig: {
|
||||
'zoekt.web-url-type': 'github',
|
||||
'zoekt.web-url': repo.html_url,
|
||||
'zoekt.name': repoName,
|
||||
'zoekt.github-stars': (repo.stargazers_count ?? 0).toString(),
|
||||
'zoekt.github-watchers': (repo.watchers_count ?? 0).toString(),
|
||||
'zoekt.github-subscribers': (repo.subscribers_count ?? 0).toString(),
|
||||
'zoekt.github-forks': (repo.forks_count ?? 0).toString(),
|
||||
'zoekt.archived': marshalBool(repo.archived),
|
||||
'zoekt.fork': marshalBool(repo.fork),
|
||||
'zoekt.public': marshalBool(repo.private === false),
|
||||
},
|
||||
branches: config.revisions?.branches ?? undefined,
|
||||
tags: config.revisions?.tags ?? undefined,
|
||||
} satisfies RepoMetadata,
|
||||
};
|
||||
|
||||
return record;
|
||||
})
|
||||
|
||||
return {
|
||||
repoData: repos,
|
||||
notFound,
|
||||
};
|
||||
}
|
||||
|
||||
export const compileGitlabConfig = async (
|
||||
config: GitlabConnectionConfig,
|
||||
connectionId: number,
|
||||
orgId: number,
|
||||
db: PrismaClient) => {
|
||||
|
||||
const gitlabReposResult = await getGitLabReposFromConfig(config, orgId, db);
|
||||
const gitlabRepos = gitlabReposResult.validRepos;
|
||||
const notFound = gitlabReposResult.notFound;
|
||||
|
||||
const hostUrl = config.url ?? 'https://gitlab.com';
|
||||
const hostname = new URL(hostUrl).hostname;
|
||||
|
||||
const repos = gitlabRepos.map((project) => {
|
||||
const projectUrl = `${hostUrl}/${project.path_with_namespace}`;
|
||||
const cloneUrl = new URL(project.http_url_to_repo);
|
||||
const isFork = project.forked_from_project !== undefined;
|
||||
const repoName = `${hostname}/${project.path_with_namespace}`;
|
||||
|
||||
const record: RepoData = {
|
||||
external_id: project.id.toString(),
|
||||
external_codeHostType: 'gitlab',
|
||||
external_codeHostUrl: hostUrl,
|
||||
cloneUrl: cloneUrl.toString(),
|
||||
webUrl: projectUrl,
|
||||
name: repoName,
|
||||
imageUrl: project.avatar_url,
|
||||
isFork: isFork,
|
||||
isArchived: !!project.archived,
|
||||
org: {
|
||||
connect: {
|
||||
id: orgId,
|
||||
},
|
||||
},
|
||||
connections: {
|
||||
create: {
|
||||
connectionId: connectionId,
|
||||
}
|
||||
},
|
||||
metadata: {
|
||||
gitConfig: {
|
||||
'zoekt.web-url-type': 'gitlab',
|
||||
'zoekt.web-url': projectUrl,
|
||||
'zoekt.name': repoName,
|
||||
'zoekt.gitlab-stars': (project.stargazers_count ?? 0).toString(),
|
||||
'zoekt.gitlab-forks': (project.forks_count ?? 0).toString(),
|
||||
'zoekt.archived': marshalBool(project.archived),
|
||||
'zoekt.fork': marshalBool(isFork),
|
||||
'zoekt.public': marshalBool(project.private === false)
|
||||
},
|
||||
branches: config.revisions?.branches ?? undefined,
|
||||
tags: config.revisions?.tags ?? undefined,
|
||||
} satisfies RepoMetadata,
|
||||
};
|
||||
|
||||
return record;
|
||||
})
|
||||
|
||||
return {
|
||||
repoData: repos,
|
||||
notFound,
|
||||
};
|
||||
}
|
||||
|
||||
export const compileGiteaConfig = async (
|
||||
config: GiteaConnectionConfig,
|
||||
connectionId: number,
|
||||
orgId: number,
|
||||
db: PrismaClient) => {
|
||||
|
||||
const giteaReposResult = await getGiteaReposFromConfig(config, orgId, db);
|
||||
const giteaRepos = giteaReposResult.validRepos;
|
||||
const notFound = giteaReposResult.notFound;
|
||||
|
||||
const hostUrl = config.url ?? 'https://gitea.com';
|
||||
const hostname = new URL(hostUrl).hostname;
|
||||
|
||||
const repos = giteaRepos.map((repo) => {
|
||||
const cloneUrl = new URL(repo.clone_url!);
|
||||
const repoName = `${hostname}/${repo.full_name!}`;
|
||||
|
||||
const record: RepoData = {
|
||||
external_id: repo.id!.toString(),
|
||||
external_codeHostType: 'gitea',
|
||||
external_codeHostUrl: hostUrl,
|
||||
cloneUrl: cloneUrl.toString(),
|
||||
webUrl: repo.html_url,
|
||||
name: repoName,
|
||||
imageUrl: repo.owner?.avatar_url,
|
||||
isFork: repo.fork!,
|
||||
isArchived: !!repo.archived,
|
||||
org: {
|
||||
connect: {
|
||||
id: orgId,
|
||||
},
|
||||
},
|
||||
connections: {
|
||||
create: {
|
||||
connectionId: connectionId,
|
||||
}
|
||||
},
|
||||
metadata: {
|
||||
gitConfig: {
|
||||
'zoekt.web-url-type': 'gitea',
|
||||
'zoekt.web-url': repo.html_url!,
|
||||
'zoekt.name': repoName,
|
||||
'zoekt.archived': marshalBool(repo.archived),
|
||||
'zoekt.fork': marshalBool(repo.fork!),
|
||||
'zoekt.public': marshalBool(repo.internal === false && repo.private === false),
|
||||
},
|
||||
branches: config.revisions?.branches ?? undefined,
|
||||
tags: config.revisions?.tags ?? undefined,
|
||||
} satisfies RepoMetadata,
|
||||
};
|
||||
|
||||
return record;
|
||||
})
|
||||
|
||||
return {
|
||||
repoData: repos,
|
||||
notFound,
|
||||
};
|
||||
}
|
||||
|
||||
export const compileGerritConfig = async (
|
||||
config: GerritConnectionConfig,
|
||||
connectionId: number,
|
||||
orgId: number) => {
|
||||
|
||||
const gerritRepos = await getGerritReposFromConfig(config);
|
||||
const hostUrl = (config.url ?? 'https://gerritcodereview.com').replace(/\/$/, ''); // Remove trailing slash
|
||||
const hostname = new URL(hostUrl).hostname;
|
||||
|
||||
const repos = gerritRepos.map((project) => {
|
||||
const repoId = `${hostname}/${project.name}`;
|
||||
const cloneUrl = new URL(`${config.url}/${encodeURIComponent(project.name)}`);
|
||||
|
||||
let webUrl = "https://www.gerritcodereview.com/";
|
||||
// Gerrit projects can have multiple web links; use the first one
|
||||
if (project.web_links) {
|
||||
const webLink = project.web_links[0];
|
||||
if (webLink) {
|
||||
webUrl = webLink.url;
|
||||
}
|
||||
}
|
||||
|
||||
// Handle case where webUrl is just a gitiles path
|
||||
// https://github.com/GerritCodeReview/plugins_gitiles/blob/5ee7f57/src/main/java/com/googlesource/gerrit/plugins/gitiles/GitilesWeblinks.java#L50
|
||||
if (webUrl.startsWith('/plugins/gitiles/')) {
|
||||
webUrl = `${hostUrl}${webUrl}`;
|
||||
}
|
||||
|
||||
const record: RepoData = {
|
||||
external_id: project.id.toString(),
|
||||
external_codeHostType: 'gerrit',
|
||||
external_codeHostUrl: hostUrl,
|
||||
cloneUrl: cloneUrl.toString(),
|
||||
webUrl: webUrl,
|
||||
name: project.name,
|
||||
isFork: false,
|
||||
isArchived: false,
|
||||
org: {
|
||||
connect: {
|
||||
id: orgId,
|
||||
},
|
||||
},
|
||||
connections: {
|
||||
create: {
|
||||
connectionId: connectionId,
|
||||
}
|
||||
},
|
||||
metadata: {
|
||||
gitConfig: {
|
||||
'zoekt.web-url-type': 'gitiles',
|
||||
'zoekt.web-url': webUrl,
|
||||
'zoekt.name': repoId,
|
||||
'zoekt.archived': marshalBool(false),
|
||||
'zoekt.fork': marshalBool(false),
|
||||
'zoekt.public': marshalBool(true),
|
||||
},
|
||||
} satisfies RepoMetadata,
|
||||
};
|
||||
|
||||
return record;
|
||||
})
|
||||
|
||||
return {
|
||||
repoData: repos,
|
||||
notFound: {
|
||||
users: [],
|
||||
orgs: [],
|
||||
repos: [],
|
||||
}
|
||||
};
|
||||
}
|
||||
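A hypothetical dispatch sketch showing how the compile functions above fit together. The real call sites live in connectionManager.ts, which is not part of this excerpt, and the ConnectionConfig union type with a `type` discriminator is assumed:

const compileConnectionConfig = async (config: ConnectionConfig, connectionId: number, orgId: number, db: PrismaClient, abortController: AbortController) => {
    switch (config.type) {
        case 'github':
            return compileGithubConfig(config, connectionId, orgId, db, abortController);
        case 'gitlab':
            return compileGitlabConfig(config, connectionId, orgId, db);
        case 'gitea':
            return compileGiteaConfig(config, connectionId, orgId, db);
        case 'gerrit':
            return compileGerritConfig(config, connectionId, orgId);
    }
};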
packages/backend/src/repoManager.ts (new file, 546 lines)
@@ -0,0 +1,546 @@
|
|||
import { Job, Queue, Worker } from 'bullmq';
|
||||
import { Redis } from 'ioredis';
|
||||
import { createLogger } from "./logger.js";
|
||||
import { Connection, PrismaClient, Repo, RepoToConnection, RepoIndexingStatus, StripeSubscriptionStatus } from "@sourcebot/db";
|
||||
import { GithubConnectionConfig, GitlabConnectionConfig, GiteaConnectionConfig } from '@sourcebot/schemas/v3/connection.type';
|
||||
import { AppContext, Settings, RepoMetadata } from "./types.js";
|
||||
import { getRepoPath, getTokenFromConfig, measure, getShardPrefix } from "./utils.js";
|
||||
import { cloneRepository, fetchRepository } from "./git.js";
|
||||
import { existsSync, readdirSync, promises } from 'fs';
|
||||
import { indexGitRepository } from "./zoekt.js";
|
||||
import { PromClient } from './promClient.js';
|
||||
import * as Sentry from "@sentry/node";
|
||||
|
||||
interface IRepoManager {
|
||||
blockingPollLoop: () => void;
|
||||
dispose: () => void;
|
||||
}
|
||||
|
||||
const REPO_INDEXING_QUEUE = 'repoIndexingQueue';
|
||||
const REPO_GC_QUEUE = 'repoGarbageCollectionQueue';
|
||||
|
||||
type RepoWithConnections = Repo & { connections: (RepoToConnection & { connection: Connection })[] };
|
||||
type RepoIndexingPayload = {
|
||||
repo: RepoWithConnections,
|
||||
}
|
||||
|
||||
type RepoGarbageCollectionPayload = {
|
||||
repo: Repo,
|
||||
}
|
||||
|
||||
export class RepoManager implements IRepoManager {
|
||||
private indexWorker: Worker;
|
||||
private indexQueue: Queue<RepoIndexingPayload>;
|
||||
private gcWorker: Worker;
|
||||
private gcQueue: Queue<RepoGarbageCollectionPayload>;
|
||||
private logger = createLogger('RepoManager');
|
||||
|
||||
constructor(
|
||||
private db: PrismaClient,
|
||||
private settings: Settings,
|
||||
redis: Redis,
|
||||
private promClient: PromClient,
|
||||
private ctx: AppContext,
|
||||
) {
|
||||
// Repo indexing
|
||||
this.indexQueue = new Queue<RepoIndexingPayload>(REPO_INDEXING_QUEUE, {
|
||||
connection: redis,
|
||||
});
|
||||
this.indexWorker = new Worker(REPO_INDEXING_QUEUE, this.runIndexJob.bind(this), {
|
||||
connection: redis,
|
||||
concurrency: this.settings.maxRepoIndexingJobConcurrency,
|
||||
});
|
||||
this.indexWorker.on('completed', this.onIndexJobCompleted.bind(this));
|
||||
this.indexWorker.on('failed', this.onIndexJobFailed.bind(this));
|
||||
|
||||
// Garbage collection
|
||||
this.gcQueue = new Queue<RepoGarbageCollectionPayload>(REPO_GC_QUEUE, {
|
||||
connection: redis,
|
||||
});
|
||||
this.gcWorker = new Worker(REPO_GC_QUEUE, this.runGarbageCollectionJob.bind(this), {
|
||||
connection: redis,
|
||||
concurrency: this.settings.maxRepoGarbageCollectionJobConcurrency,
|
||||
});
|
||||
this.gcWorker.on('completed', this.onGarbageCollectionJobCompleted.bind(this));
|
||||
this.gcWorker.on('failed', this.onGarbageCollectionJobFailed.bind(this));
|
||||
}
|
||||
|
||||
public async blockingPollLoop() {
|
||||
while (true) {
|
||||
await this.fetchAndScheduleRepoIndexing();
|
||||
await this.fetchAndScheduleRepoGarbageCollection();
|
||||
await this.fetchAndScheduleRepoTimeouts();
|
||||
|
||||
await new Promise(resolve => setTimeout(resolve, this.settings.reindexRepoPollingIntervalMs));
|
||||
}
|
||||
}
|
||||
|
||||
///////////////////////////
|
||||
// Repo indexing
|
||||
///////////////////////////
|
||||
|
||||
private async scheduleRepoIndexingBulk(repos: RepoWithConnections[]) {
|
||||
await this.db.$transaction(async (tx) => {
|
||||
await tx.repo.updateMany({
|
||||
where: { id: { in: repos.map(repo => repo.id) } },
|
||||
data: { repoIndexingStatus: RepoIndexingStatus.IN_INDEX_QUEUE }
|
||||
});
|
||||
|
||||
const reposByOrg = repos.reduce<Record<number, RepoWithConnections[]>>((acc, repo) => {
|
||||
if (!acc[repo.orgId]) {
|
||||
acc[repo.orgId] = [];
|
||||
}
|
||||
acc[repo.orgId].push(repo);
|
||||
return acc;
|
||||
}, {});
|
||||
|
||||
for (const orgId in reposByOrg) {
|
||||
const orgRepos = reposByOrg[orgId];
|
||||
// Set priority based on number of repos (more repos = lower priority)
|
||||
// This helps prevent large orgs from overwhelming the indexQueue
|
||||
const priority = Math.min(Math.ceil(orgRepos.length / 10), 2097152);
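// e.g. (illustrative): 10 repos -> priority 1, 100 repos -> priority 10, 50,000 repos -> priority 5,000.
// The cap of 2,097,152 (2^21) matches BullMQ's maximum priority value; in BullMQ, lower numbers are processed first.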
|
||||
|
||||
await this.indexQueue.addBulk(orgRepos.map(repo => ({
|
||||
name: 'repoIndexJob',
|
||||
data: { repo },
|
||||
opts: {
|
||||
priority: priority
|
||||
}
|
||||
})));
|
||||
|
||||
// Increment pending jobs counter for each repo added
|
||||
orgRepos.forEach(repo => {
|
||||
this.promClient.pendingRepoIndexingJobs.inc({ repo: repo.id.toString() });
|
||||
});
|
||||
|
||||
this.logger.info(`Added ${orgRepos.length} jobs to indexQueue for org ${orgId} with priority ${priority}`);
|
||||
}
|
||||
|
||||
|
||||
}).catch((err: unknown) => {
|
||||
this.logger.error(`Failed to add jobs to indexQueue for repos ${repos.map(repo => repo.id).join(', ')}: ${err}`);
|
||||
});
|
||||
}
|
||||
|
||||
|
||||
private async fetchAndScheduleRepoIndexing() {
|
||||
const thresholdDate = new Date(Date.now() - this.settings.reindexIntervalMs);
|
||||
const repos = await this.db.repo.findMany({
|
||||
where: {
|
||||
OR: [
|
||||
// "NEW" is really a misnomer here - it just means that the repo needs to be indexed
|
||||
// immediately. In most cases, this will be because the repo was just created and
|
||||
// is indeed "new". However, it could also be that a "retry" was requested on a failed
|
||||
// index. So, we don't want to block on the indexedAt timestamp here.
|
||||
{
|
||||
repoIndexingStatus: RepoIndexingStatus.NEW,
|
||||
},
|
||||
// When the repo has already been indexed, we only want to reindex if the reindexing
|
||||
// interval has elapsed (or if the date isn't set for some reason).
|
||||
{
|
||||
AND: [
|
||||
{ repoIndexingStatus: RepoIndexingStatus.INDEXED },
|
||||
{ OR: [
|
||||
{ indexedAt: null },
|
||||
{ indexedAt: { lt: thresholdDate } },
|
||||
]}
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
include: {
|
||||
connections: {
|
||||
include: {
|
||||
connection: true
|
||||
}
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
if (repos.length > 0) {
|
||||
await this.scheduleRepoIndexingBulk(repos);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
// TODO: do this better? ex: try using the tokens from all the connections
|
||||
// We can no longer use repo.cloneUrl directly since it doesn't contain the token for security reasons. As a result, we need to
|
||||
// fetch the token here using the connections from the repo. Multiple connections could be referencing this repo, and each
|
||||
// may have its own token. This method just picks the first connection that has a token (if one exists) and uses that. This
|
||||
// may technically cause syncing to fail if that connection's token just so happens to not have access to the repo it's referencing.
|
||||
private async getTokenForRepo(repo: RepoWithConnections, db: PrismaClient) {
|
||||
const repoConnections = repo.connections;
|
||||
if (repoConnections.length === 0) {
|
||||
this.logger.error(`Repo ${repo.id} has no connections`);
|
||||
return;
|
||||
}
|
||||
|
||||
|
||||
let token: string | undefined;
|
||||
for (const repoConnection of repoConnections) {
|
||||
const connection = repoConnection.connection;
|
||||
if (connection.connectionType !== 'github' && connection.connectionType !== 'gitlab' && connection.connectionType !== 'gitea') {
|
||||
continue;
|
||||
}
|
||||
|
||||
const config = connection.config as unknown as GithubConnectionConfig | GitlabConnectionConfig | GiteaConnectionConfig;
|
||||
if (config.token) {
|
||||
token = await getTokenFromConfig(config.token, connection.orgId, db, this.logger);
|
||||
if (token) {
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return token;
|
||||
}
|
||||
|
||||
private async syncGitRepository(repo: RepoWithConnections, repoAlreadyInIndexingState: boolean) {
|
||||
let fetchDuration_s: number | undefined = undefined;
|
||||
let cloneDuration_s: number | undefined = undefined;
|
||||
|
||||
const repoPath = getRepoPath(repo, this.ctx);
|
||||
const metadata = repo.metadata as RepoMetadata;
|
||||
|
||||
|
||||
// If the repo was already in the indexing state, this job was likely killed and picked up again. As a result,
|
||||
// to ensure the repo state is valid, we delete the repo if it exists so we get a fresh clone
|
||||
if (repoAlreadyInIndexingState && existsSync(repoPath)) {
|
||||
this.logger.info(`Deleting repo directory ${repoPath} during sync because it was already in the indexing state`);
|
||||
await promises.rm(repoPath, { recursive: true, force: true });
|
||||
}
|
||||
|
||||
if (existsSync(repoPath)) {
|
||||
this.logger.info(`Fetching ${repo.id}...`);
|
||||
|
||||
const { durationMs } = await measure(() => fetchRepository(repoPath, ({ method, stage, progress }) => {
|
||||
this.logger.debug(`git.${method} ${stage} stage ${progress}% complete for ${repo.id}`)
|
||||
}));
|
||||
fetchDuration_s = durationMs / 1000;
|
||||
|
||||
process.stdout.write('\n');
|
||||
this.logger.info(`Fetched ${repo.name} in ${fetchDuration_s}s`);
|
||||
|
||||
} else {
|
||||
this.logger.info(`Cloning ${repo.id}...`);
|
||||
|
||||
const token = await this.getTokenForRepo(repo, this.db);
|
||||
const cloneUrl = new URL(repo.cloneUrl);
|
||||
if (token) {
|
||||
switch (repo.external_codeHostType) {
|
||||
case 'gitlab':
|
||||
cloneUrl.username = 'oauth2';
|
||||
cloneUrl.password = token;
|
||||
break;
|
||||
case 'gitea':
|
||||
case 'github':
|
||||
default:
|
||||
cloneUrl.username = token;
|
||||
break;
|
||||
}
|
||||
}
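// e.g. (illustrative): with a token, https://gitlab.com/org/repo.git becomes
// https://oauth2:<token>@gitlab.com/org/repo.git, while GitHub and Gitea place the token
// in the username portion: https://<token>@github.com/org/repo.git.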
|
||||
|
||||
const { durationMs } = await measure(() => cloneRepository(cloneUrl.toString(), repoPath, metadata.gitConfig, ({ method, stage, progress }) => {
|
||||
this.logger.debug(`git.${method} ${stage} stage ${progress}% complete for ${repo.id}`)
|
||||
}));
|
||||
cloneDuration_s = durationMs / 1000;
|
||||
|
||||
process.stdout.write('\n');
|
||||
this.logger.info(`Cloned ${repo.id} in ${cloneDuration_s}s`);
|
||||
}
|
||||
|
||||
this.logger.info(`Indexing ${repo.id}...`);
|
||||
const { durationMs } = await measure(() => indexGitRepository(repo, this.settings, this.ctx));
|
||||
const indexDuration_s = durationMs / 1000;
|
||||
this.logger.info(`Indexed ${repo.id} in ${indexDuration_s}s`);
|
||||
|
||||
return {
|
||||
fetchDuration_s,
|
||||
cloneDuration_s,
|
||||
indexDuration_s,
|
||||
}
|
||||
}
|
||||
|
||||
private async runIndexJob(job: Job<RepoIndexingPayload>) {
|
||||
this.logger.info(`Running index job (id: ${job.id}) for repo ${job.data.repo.id}`);
|
||||
const repo = job.data.repo as RepoWithConnections;
|
||||
|
||||
// We have to use the existing repo object to get the repoIndexingStatus because the repo object
|
||||
// inside the job is unchanged from when it was added to the queue.
|
||||
const existingRepo = await this.db.repo.findUnique({
|
||||
where: {
|
||||
id: repo.id,
|
||||
},
|
||||
});
|
||||
if (!existingRepo) {
|
||||
this.logger.error(`Repo ${repo.id} not found`);
|
||||
const e = new Error(`Repo ${repo.id} not found`);
|
||||
Sentry.captureException(e);
|
||||
throw e;
|
||||
}
|
||||
const repoAlreadyInIndexingState = existingRepo.repoIndexingStatus === RepoIndexingStatus.INDEXING;
|
||||
|
||||
|
||||
await this.db.repo.update({
|
||||
where: {
|
||||
id: repo.id,
|
||||
},
|
||||
data: {
|
||||
repoIndexingStatus: RepoIndexingStatus.INDEXING,
|
||||
}
|
||||
});
|
||||
this.promClient.activeRepoIndexingJobs.inc();
|
||||
this.promClient.pendingRepoIndexingJobs.dec({ repo: repo.id.toString() });
|
||||
|
||||
let indexDuration_s: number | undefined;
|
||||
let fetchDuration_s: number | undefined;
|
||||
let cloneDuration_s: number | undefined;
|
||||
|
||||
let stats;
|
||||
let attempts = 0;
|
||||
const maxAttempts = 3;
|
||||
|
||||
while (attempts < maxAttempts) {
|
||||
try {
|
||||
stats = await this.syncGitRepository(repo, repoAlreadyInIndexingState);
|
||||
break;
|
||||
} catch (error) {
|
||||
Sentry.captureException(error);
|
||||
|
||||
attempts++;
|
||||
this.promClient.repoIndexingReattemptsTotal.inc();
|
||||
if (attempts === maxAttempts) {
|
||||
this.logger.error(`Failed to sync repository ${repo.id} after ${maxAttempts} attempts. Error: ${error}`);
|
||||
throw error;
|
||||
}
|
||||
|
||||
const sleepDuration = 5000 * Math.pow(2, attempts - 1);
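// Backoff schedule (illustrative): 5s after the first failure, 10s after the second;
// the third failure exhausts maxAttempts and rethrows, failing the job.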
|
||||
this.logger.error(`Failed to sync repository ${repo.id}, attempt ${attempts}/${maxAttempts}. Sleeping for ${sleepDuration / 1000}s... Error: ${error}`);
|
||||
await new Promise(resolve => setTimeout(resolve, sleepDuration));
|
||||
}
|
||||
}
|
||||
|
||||
indexDuration_s = stats!.indexDuration_s;
|
||||
fetchDuration_s = stats!.fetchDuration_s;
|
||||
cloneDuration_s = stats!.cloneDuration_s;
|
||||
}
|
||||
|
||||
private async onIndexJobCompleted(job: Job<RepoIndexingPayload>) {
|
||||
this.logger.info(`Repo index job ${job.id} completed`);
|
||||
this.promClient.activeRepoIndexingJobs.dec();
|
||||
this.promClient.repoIndexingSuccessTotal.inc();
|
||||
|
||||
await this.db.repo.update({
|
||||
where: {
|
||||
id: job.data.repo.id,
|
||||
},
|
||||
data: {
|
||||
indexedAt: new Date(),
|
||||
repoIndexingStatus: RepoIndexingStatus.INDEXED,
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
private async onIndexJobFailed(job: Job<RepoIndexingPayload> | undefined, err: unknown) {
|
||||
this.logger.info(`Repo index job failed (id: ${job?.id ?? 'unknown'}) with error: ${err}`);
|
||||
Sentry.captureException(err, {
|
||||
tags: {
|
||||
repoId: job?.data.repo.id,
|
||||
jobId: job?.id,
|
||||
queue: REPO_INDEXING_QUEUE,
|
||||
}
|
||||
});
|
||||
|
||||
if (job) {
|
||||
this.promClient.activeRepoIndexingJobs.dec();
|
||||
this.promClient.repoIndexingFailTotal.inc();
|
||||
|
||||
await this.db.repo.update({
|
||||
where: {
|
||||
id: job.data.repo.id,
|
||||
},
|
||||
data: {
|
||||
repoIndexingStatus: RepoIndexingStatus.FAILED,
|
||||
indexedAt: new Date(),
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
///////////////////////////
|
||||
// Repo garbage collection
|
||||
///////////////////////////
|
||||
|
||||
    private async scheduleRepoGarbageCollectionBulk(repos: Repo[]) {
        await this.db.$transaction(async (tx) => {
            await tx.repo.updateMany({
                where: { id: { in: repos.map(repo => repo.id) } },
                data: { repoIndexingStatus: RepoIndexingStatus.IN_GC_QUEUE }
            });

            await this.gcQueue.addBulk(repos.map(repo => ({
                name: 'repoGarbageCollectionJob',
                data: { repo },
            })));

            this.logger.info(`Added ${repos.length} jobs to gcQueue`);
        });
    }

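    // Finds repos eligible for garbage collection: repos with no connections, and repos belonging
    // to orgs whose Stripe subscription is INACTIVE and was last updated more than seven days ago.
    // Repos indexed more recently than the grace period are excluded.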
    private async fetchAndScheduleRepoGarbageCollection() {
        ////////////////////////////////////
        // Get repos with no connections
        ////////////////////////////////////

        const thresholdDate = new Date(Date.now() - this.settings.repoGarbageCollectionGracePeriodMs);
        const reposWithNoConnections = await this.db.repo.findMany({
            where: {
                repoIndexingStatus: {
                    in: [
                        RepoIndexingStatus.INDEXED, // we don't include NEW repos here because they'll be picked up by the index queue (potential race condition)
                        RepoIndexingStatus.FAILED,
                    ]
                },
                connections: {
                    none: {}
                },
                OR: [
                    { indexedAt: null },
                    { indexedAt: { lt: thresholdDate } }
                ]
            },
        });
        if (reposWithNoConnections.length > 0) {
            this.logger.info(`Garbage collecting ${reposWithNoConnections.length} repos with no connections: ${reposWithNoConnections.map(repo => repo.id).join(', ')}`);
        }

        ////////////////////////////////////
        // Get inactive org repos
        ////////////////////////////////////
        const sevenDaysAgo = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000);
        const inactiveOrgRepos = await this.db.repo.findMany({
            where: {
                org: {
                    stripeSubscriptionStatus: StripeSubscriptionStatus.INACTIVE,
                    stripeLastUpdatedAt: {
                        lt: sevenDaysAgo
                    }
                },
                OR: [
                    { indexedAt: null },
                    { indexedAt: { lt: thresholdDate } }
                ]
            }
        });

        if (inactiveOrgRepos.length > 0) {
            this.logger.info(`Garbage collecting ${inactiveOrgRepos.length} inactive org repos: ${inactiveOrgRepos.map(repo => repo.id).join(', ')}`);
        }

        const reposToDelete = [...reposWithNoConnections, ...inactiveOrgRepos];
        if (reposToDelete.length > 0) {
            await this.scheduleRepoGarbageCollectionBulk(reposToDelete);
        }
    }

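    // Garbage collection job processor: marks the repo as GARBAGE_COLLECTING, then removes its
    // cloned directory and any index shards matching the repo's shard prefix from disk.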
    private async runGarbageCollectionJob(job: Job<RepoGarbageCollectionPayload>) {
        this.logger.info(`Running garbage collection job (id: ${job.id}) for repo ${job.data.repo.id}`);
        this.promClient.activeRepoGarbageCollectionJobs.inc();

        const repo = job.data.repo as Repo;
        await this.db.repo.update({
            where: {
                id: repo.id
            },
            data: {
                repoIndexingStatus: RepoIndexingStatus.GARBAGE_COLLECTING
            }
        });

        // delete cloned repo
        const repoPath = getRepoPath(repo, this.ctx);
        if (existsSync(repoPath)) {
            this.logger.info(`Deleting repo directory ${repoPath}`);
            await promises.rm(repoPath, { recursive: true, force: true });
        }

        // delete shards
        const shardPrefix = getShardPrefix(repo.orgId, repo.id);
        const files = readdirSync(this.ctx.indexPath).filter(file => file.startsWith(shardPrefix));
        for (const file of files) {
            const filePath = `${this.ctx.indexPath}/${file}`;
            this.logger.info(`Deleting shard file ${filePath}`);
            await promises.rm(filePath, { force: true });
        }
    }

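    // Called when a garbage collection job completes: updates the job gauges and deletes the repo
    // row from the database.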
    private async onGarbageCollectionJobCompleted(job: Job<RepoGarbageCollectionPayload>) {
        this.logger.info(`Garbage collection job ${job.id} completed`);
        this.promClient.activeRepoGarbageCollectionJobs.dec();
        this.promClient.repoGarbageCollectionSuccessTotal.inc();

        await this.db.repo.delete({
            where: {
                id: job.data.repo.id
            }
        });
    }

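    // Called when a garbage collection job fails: reports the error to Sentry and, if the job is
    // known, updates the job gauges and marks the repo as GARBAGE_COLLECTION_FAILED.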
    private async onGarbageCollectionJobFailed(job: Job<RepoGarbageCollectionPayload> | undefined, err: unknown) {
        this.logger.info(`Garbage collection job failed (id: ${job?.id ?? 'unknown'}) with error: ${err}`);
        Sentry.captureException(err, {
            tags: {
                repoId: job?.data.repo.id,
                jobId: job?.id,
                queue: REPO_GC_QUEUE,
            }
        });

        if (job) {
            this.promClient.activeRepoGarbageCollectionJobs.dec();
            this.promClient.repoGarbageCollectionFailTotal.inc();

            await this.db.repo.update({
                where: {
                    id: job.data.repo.id
                },
                data: {
                    repoIndexingStatus: RepoIndexingStatus.GARBAGE_COLLECTION_FAILED
                }
            });
        }
    }

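    // Finds repos stuck in the INDEXING state for longer than repoIndexTimeoutMs and times them out.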
    private async fetchAndScheduleRepoTimeouts() {
        const repos = await this.db.repo.findMany({
            where: {
                repoIndexingStatus: RepoIndexingStatus.INDEXING,
                updatedAt: {
                    lt: new Date(Date.now() - this.settings.repoIndexTimeoutMs)
                }
            }
        });

        if (repos.length > 0) {
            this.logger.info(`Scheduling ${repos.length} repo timeouts`);
            await this.scheduleRepoTimeoutsBulk(repos);
        }
    }

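    // Bulk-marks the given timed-out repos as FAILED in a single transaction.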
    private async scheduleRepoTimeoutsBulk(repos: Repo[]) {
        await this.db.$transaction(async (tx) => {
            await tx.repo.updateMany({
                where: { id: { in: repos.map(repo => repo.id) } },
                data: { repoIndexingStatus: RepoIndexingStatus.FAILED }
            });
        });
    }

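    // Closes the BullMQ workers and queues.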
    public async dispose() {
        await this.indexWorker.close();
        await this.indexQueue.close();
        await this.gcQueue.close();
        await this.gcWorker.close();
    }
}
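
// Illustrative sketch (not part of the original source): roughly how the handlers above are
// presumably wired to BullMQ in the RepoManager constructor, which falls outside this excerpt.
// The Redis connection variable, the concurrency setting, and the `runIndexJob` processor name
// are assumptions; only REPO_INDEXING_QUEUE, REPO_GC_QUEUE, and the handler methods appear above.
//
//     this.indexQueue = new Queue<RepoIndexingPayload>(REPO_INDEXING_QUEUE, { connection: redis });
//     this.indexWorker = new Worker<RepoIndexingPayload>(
//         REPO_INDEXING_QUEUE,
//         (job) => this.runIndexJob(job),                      // assumed processor name
//         { connection: redis, concurrency: 8 },               // assumed concurrency
//     );
//     this.indexWorker.on('completed', (job) => this.onIndexJobCompleted(job));
//     this.indexWorker.on('failed', (job, err) => this.onIndexJobFailed(job, err));
//
//     this.gcQueue = new Queue<RepoGarbageCollectionPayload>(REPO_GC_QUEUE, { connection: redis });
//     this.gcWorker = new Worker<RepoGarbageCollectionPayload>(
//         REPO_GC_QUEUE,
//         (job) => this.runGarbageCollectionJob(job),
//         { connection: redis },
//     );
//     this.gcWorker.on('completed', (job) => this.onGarbageCollectionJobCompleted(job));
//     this.gcWorker.on('failed', (job, err) => this.onGarbageCollectionJobFailed(job, err));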