mirror of
https://github.com/sourcebot-dev/sourcebot.git
synced 2025-12-12 12:25:22 +00:00
Sourcebot V4 introduces authentication, performance improvements and code navigation. Checkout the [migration guide](https://docs.sourcebot.dev/self-hosting/upgrade/v3-to-v4-guide) for information on upgrading your instance to v4. ### Changed - [**Breaking Change**] Authentication is now required by default. Notes: - When setting up your instance, email / password login will be the default authentication provider. - The first user that logs into the instance is given the `owner` role. ([docs](https://docs.sourcebot.dev/docs/more/roles-and-permissions)). - Subsequent users can request to join the instance. The `owner` can approve / deny requests to join the instance via `Settings` > `Members` > `Pending Requests`. - If a user is approved to join the instance, they are given the `member` role. - Additional login providers, including email links and SSO, can be configured with additional environment variables. ([docs](https://docs.sourcebot.dev/self-hosting/configuration/authentication)). - Clicking on a search result now takes you to the `/browse` view. Files can still be previewed by clicking the "Preview" button or holding `Cmd` / `Ctrl` when clicking on a search result. [#315](https://github.com/sourcebot-dev/sourcebot/pull/315) ### Added - [Sourcebot EE] Added search-based code navigation, allowing you to jump between symbol definition and references when viewing source files. [Read the documentation](https://docs.sourcebot.dev/docs/search/code-navigation). [#315](https://github.com/sourcebot-dev/sourcebot/pull/315) - Added collapsible filter panel. [#315](https://github.com/sourcebot-dev/sourcebot/pull/315) ### Fixed - Improved scroll performance for large numbers of search results. [#315](https://github.com/sourcebot-dev/sourcebot/pull/315)
139 lines
No EOL
7.3 KiB
Text
139 lines
No EOL
7.3 KiB
Text
---
|
|
title: Self-host Sourcebot
|
|
sidebarTitle: Overview
|
|
---
|
|
|
|
import SupportedPlatforms from '/snippets/platform-support.mdx'
|
|
|
|
<Note>Want a managed solution? Checkout [Sourcebot Cloud](/docs/getting-started).</Note>
|
|
|
|
Sourcebot is open source and can be self-hosted using our official [Docker image](https://github.com/sourcebot-dev/sourcebot/pkgs/container/sourcebot).
|
|
|
|
## Quick Start Guide
|
|
|
|
{/*@todo: record a self-hosting quick start guide
|
|
<iframe
|
|
width="560"
|
|
height="315"
|
|
src="https://www.youtube.com/embed/4KzFe50RQkQ"
|
|
title="YouTube video player"
|
|
frameborder="0"
|
|
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
|
|
allowfullscreen
|
|
></iframe>
|
|
*/}
|
|
|
|
<Steps>
|
|
<Step title="Create a config">
|
|
By default, Sourcebot requires a configuration file with a list of [code host connections](/docs/connections/overview) that specify what repositories should be **synced** (cloned and indexed). To get started, run the following command to create a starter `config.json`:
|
|
|
|
```bash
|
|
touch config.json
|
|
echo '{
|
|
"$schema": "https://raw.githubusercontent.com/sourcebot-dev/sourcebot/main/schemas/v3/index.json",
|
|
"connections": {
|
|
// Comments are supported
|
|
"starter-connection": {
|
|
"type": "github",
|
|
"repos": [
|
|
"sourcebot-dev/sourcebot"
|
|
]
|
|
}
|
|
}
|
|
}' > config.json
|
|
```
|
|
|
|
This config creates a single GitHub connection named `starter-connection` that specifies [Sourcebot](https://github.com/sourcebot-dev/sourcebot) as a repo to sync.
|
|
</Step>
|
|
|
|
<Step title="Launch your instance">
|
|
<Warning>If you're deploying Sourcebot behind a domain, you must set the [AUTH_URL](/self-hosting/configuration/environment-variables) environment variable</Warning>
|
|
Sourcebot is packaged as a [single Docker image](https://github.com/sourcebot-dev/sourcebot/pkgs/container/sourcebot). In the same directory as `config.json`, run the following command to start your instance:
|
|
|
|
``` bash
|
|
docker run \
|
|
-p 3000:3000 \
|
|
--pull=always \
|
|
--rm \
|
|
-v $(pwd):/data \
|
|
-e CONFIG_PATH=/data/config.json \
|
|
--name sourcebot \
|
|
ghcr.io/sourcebot-dev/sourcebot:latest
|
|
```
|
|
|
|
Navigate to `localhost:3000` to start searching the Sourcebot repo.
|
|
|
|
<Accordion title="Details">
|
|
**This command**:
|
|
- pulls the latest version of the `sourcebot` docker image.
|
|
- mounts the working directory to `/data` in the container to allow Sourcebot to persist data across restarts, and to access the `config.json`. In your local directory, you should see a `.sourcebot` folder created that contains all persistent data.
|
|
- runs any pending database migrations.
|
|
- starts up all services, including the webserver exposed on port 3000.
|
|
- reads `config.json` and starts syncing.
|
|
</Accordion>
|
|
|
|
<Note>Hit an issue? Please let us know on [GitHub discussions](https://github.com/sourcebot-dev/sourcebot/discussions/categories/support) or by [emailing us](mailto:team@sourcebot.dev).</Note>
|
|
</Step>
|
|
|
|
<Step title="Create the owner account">
|
|
Sourcebot has built-in authentication which gates your instance. The first account which is registered on a fresh Sourcebot deployment is made owner.
|
|
|
|
Registration is performed using basic credentials which are stored encrypted within your deployment. To setup more authentication providers
|
|
check out the [auth docs](/self-hosting/configuration/authentication)
|
|
|
|
<img width="600" height="500" style={{ borderRadius: '0.5rem' }} src="/images/login_basic.png" />
|
|
</Step>
|
|
|
|
<Step title="Link your code">
|
|
Sourcebot supports indexing public & private code on the following code hosts:
|
|
|
|
<SupportedPlatforms />
|
|
|
|
<Note>Missing your code host? [Submit a feature request on GitHub](https://github.com/sourcebot-dev/sourcebot/discussions/categories/ideas).</Note>
|
|
</Step>
|
|
</Steps>
|
|
|
|
## Architecture
|
|
|
|
Sourcebot is shipped as a single docker container that runs a collection of services using [supervisord](https://supervisord.org/):
|
|
|
|

|
|
|
|
{/*TODO: outline the different services, how Sourcebot communicates with code hosts, and the different*/}
|
|
|
|
Sourcebot consists of the following components:
|
|
- **Web Server** : main Next.js web application serving the Sourcebot UI.
|
|
- **Backend Worker** : Node.js process that incrementally syncs with code hosts (e.g., GitHub, GitLab etc.) and asynchronously indexes configured repositories.
|
|
- **Zoekt** : the [open-source](https://github.com/sourcegraph/zoekt), trigram indexing code search engine that powers Sourcebot under the hood.
|
|
- **Postgres** : transactional database for storing business-logic data.
|
|
- **Redis Job Queue** : fast in-memory store. Used with [BullMQ](https://docs.bullmq.io/) for queuing asynchronous work.
|
|
- **`.sourcebot/` cache** : file-system cache where persistent data is written.
|
|
|
|
You can use managed Redis / Postgres services that run outside of the Sourcebot container by providing the `REDIS_URL` and `DATABASE_URL` environment variables, respectively. See the [configuration](/self-hosting/configuration) for more configuration options.
|
|
|
|
## Scalability
|
|
|
|
One of our design philosophies for Sourcebot is to keep our infrastructure [radically simple](https://www.radicalsimpli.city/) while balancing scalability concerns. Depending on the number of repositories you have indexed and the instance you are running Sourcebot on, you may experience slow search times or other performance degradations. Our recommendation is to vertically scale your instance by increasing the number of CPU cores and memory.
|
|
|
|
Sourcebot does not support horizontal scaling at this time, but it is on our roadmap. If this is something your team would be interested in, please contact us at [team@sourcebot.dev](mailto:team@sourcebot.dev).
|
|
|
|
|
|
## Telemetry
|
|
By default, Sourcebot collects anonymized usage data through [PostHog](https://posthog.com/) to help us improve the performance and reliability of our tool. We don't collect or transmit <a href="https://demo.sourcebot.dev/~/search?query=captureEvent%5C(%20repo%3Asourcebot">any information related to your codebase</a>. In addition, all events are [sanitized](https://github.com/sourcebot-dev/sourcebot/blob/HEAD/packages/web/src/app/posthogProvider.tsx) to ensure that no sensitive details (ex. ip address, query info) leave your machine.
|
|
|
|
The data we collect includes general usage statistics and metadata such as query performance (e.g., search duration, error rates) to monitor the application's health and functionality. This information helps us better understand how Sourcebot is used and where improvements can be made.
|
|
|
|
If you'd like to disable all telemetry, you can do so by setting the environment variable `SOURCEBOT_TELEMETRY_DISABLED` to `true`:
|
|
|
|
```bash
|
|
docker run \
|
|
-e SOURCEBOT_TELEMETRY_DISABLED=true \
|
|
/* additional args */ \
|
|
ghcr.io/sourcebot-dev/sourcebot:latest
|
|
```
|
|
|
|
If you disabled telemetry correctly, you'll see the following log when starting Sourcebot:
|
|
|
|
```sh
|
|
Disabling telemetry since SOURCEBOT_TELEMETRY_DISABLED was set.
|
|
``` |