sourcebot/docs/docs/configuration/config-file.mdx
2025-11-04 21:22:31 -08:00

---
title: Config File
sidebarTitle: Config file
---
import ConfigSchema from '/snippets/schemas/v3/index.schema.mdx'
import EnvironmentOverridesSchema from '/snippets/schemas/v3/environmentOverrides.schema.mdx'
When self-hosting Sourcebot, you **must** provide a config file. To do so, place the config file in a volume that's mounted to Sourcebot, and set the `CONFIG_PATH` environment variable to the path of that file inside the container. For example:
```bash icon="terminal" Passing in a CONFIG_PATH to Sourcebot
# Add any other `docker run` options before the image name.
docker run \
  -v "$(pwd)/config.json:/data/config.json" \
  -e CONFIG_PATH=/data/config.json \
  ghcr.io/sourcebot-dev/sourcebot:latest
```
The config file tells Sourcebot which repos to index, what language models to use, and various other settings as defined in the [schema](#config-file-schema).
# Config File Schema
The config file you provide Sourcebot must follow the [schema](https://github.com/sourcebot-dev/sourcebot/blob/main/schemas/v3/index.json). This schema consists of the following properties:
- [Connections](/docs/connections/overview) (`connections`): Defines a set of connections that tell Sourcebot which repos to index and from where
- [Language Models](/docs/configuration/language-model-providers) (`models`): Defines a set of language model providers for use with [Ask Sourcebot](/docs/features/ask)
- [Settings](#settings) (`settings`): Additional settings to tweak your Sourcebot deployment
- [Search Contexts](/docs/features/search/search-contexts) (`contexts`): Groupings of repos that you can search against
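As a rough sketch of how these properties sit together, a config file skeleton might look like the following. The container shapes (objects for `connections`, `settings`, and `contexts`; an array for `models`) are assumptions here, so check them against the schema linked above before relying on them.

```jsonc
{
    // Assumed: connections keyed by a name you choose.
    "connections": {},
    // Assumed: a list of language model provider entries.
    "models": [],
    // Additional settings that tweak the deployment.
    "settings": {},
    // Assumed: search contexts keyed by name.
    "contexts": {}
}
```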
# Config File Syncing
Sourcebot syncs the config file on startup, and automatically whenever a change is detected.
# Settings
The following settings can be provided in your config file to modify Sourcebot's behavior:
| Setting | Type | Default | Minimum | Description / Notes |
|-------------------------------------------------|---------|------------|---------|----------------------------------------------------------------------------------------|
| `maxFileSize` | number | 2 MB | 1 | Maximum size (bytes) of a file to index. Files exceeding this are skipped. |
| `maxTrigramCount` | number | 20000 | 1 | Maximum trigrams per document. Larger files are skipped. |
| `reindexIntervalMs` | number | 1 hour | 1 | Interval at which all repositories are reindexed. |
| `resyncConnectionIntervalMs` | number | 24 hours | 1 | Interval for checking connections that need resyncing. |
| `resyncConnectionPollingIntervalMs` | number | 1 second | 1 | DB polling rate for connections that need resyncing. |
| `reindexRepoPollingIntervalMs` | number | 1 second | 1 | DB polling rate for repos that should be reindexed. |
| `maxConnectionSyncJobConcurrency` | number | 8 | 1 | Concurrent connection-sync jobs. |
| `maxRepoIndexingJobConcurrency` | number | 8 | 1 | Concurrent repo-indexing jobs. |
| `maxRepoGarbageCollectionJobConcurrency` | number | 8 | 1 | Concurrent repo-garbage-collection jobs. |
| `repoGarbageCollectionGracePeriodMs` | number | 10 seconds | 1 | Grace period to avoid deleting shards while loading. |
| `repoIndexTimeoutMs` | number | 2 hours | 1 | Timeout for a single repo-indexing run. |
| `enablePublicAccess` **(deprecated)** | boolean | false | — | Use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead. |
| `experiment_repoDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 | Interval at which the repo permission syncer should run. |
| `experiment_userDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 | Interval at which the user permission syncer should run. |
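For illustration, a `settings` block overriding a few of these values might look like the sketch below. The specific numbers are arbitrary examples rather than recommendations, and the sketch assumes sizes are given in bytes and `*Ms` settings in milliseconds, as the table and setting names suggest.

```jsonc
{
    "settings": {
        // 1 MiB = 1024 * 1024 bytes
        "maxFileSize": 1048576,
        // 30 minutes = 30 * 60 * 1000 ms
        "reindexIntervalMs": 1800000,
        // Run at most 4 repo-indexing jobs at once
        "maxRepoIndexingJobConcurrency": 4
    }
}
```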
# Tokens
Tokens are used to securely pass secrets to Sourcebot in a config file. They are used in various places, including connections, language model providers, auth providers, etc. Tokens can be passed as either environment variables or Google Cloud secrets:
<AccordionGroup>
<Accordion title="Environment Variables">
```json
{
    "token": {
        "env": "TOKEN_NAME"
    }
}
```
</Accordion>
<Accordion title="Google Cloud Secrets">
```json
{
    "token": {
        "googleCloudSecret": "projects/<project-id>/secrets/<secret-name>/versions/<version-id>"
    }
}
```
</Accordion>
</AccordionGroup>
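As a hedged sketch of where a token might appear, the snippet below places an environment-variable token inside a connection. The connection name `my-github`, the `type` and `repos` fields, and the variable name `GITHUB_TOKEN` are illustrative assumptions; see the connections documentation for the fields each connection type actually supports.

```jsonc
{
    "connections": {
        // Hypothetical connection name and fields, for illustration only.
        "my-github": {
            "type": "github",
            "repos": ["sourcebot-dev/sourcebot"],
            // The secret is read from the GITHUB_TOKEN environment variable.
            "token": {
                "env": "GITHUB_TOKEN"
            }
        }
    }
}
```

The variable itself would still need to be made available to the Sourcebot container, for example with an additional `-e` flag on `docker run`.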
# Overriding environment variables from the config
You can override or set environment variables from the config file by using the `environmentOverrides` property. Overrides can be of type `string`, `number`, `boolean`, or a [token](/docs/configuration/config-file#tokens). Tokens are useful when you want to configure an environment variable using a Google Cloud Secret or other supported secret management service.
<AccordionGroup>
<Accordion title="Token">
```jsonc
{
    "environmentOverrides": {
        "DATABASE_URL": {
            "type": "token",
            "value": {
                "googleCloudSecret": "projects/<id>/secrets/postgres-connection-string/versions/latest"
            }
        },
        "REDIS_URL": {
            "type": "token",
            "value": {
                "googleCloudSecret": "projects/<id>/secrets/redis-connection-string/versions/latest"
            }
        }
    }
}
```
</Accordion>
<Accordion title="String">
```jsonc
{
    "environmentOverrides": {
        "EMAIL_FROM_ADDRESS": {
            "type": "string",
            "value": "hello@sourcebot.dev"
        }
    }
}
```
</Accordion>
<Accordion title="Number">
```jsonc
{
    "environmentOverrides": {
        "SOURCEBOT_CHAT_MODEL_TEMPERATURE": {
            "type": "number",
            "value": 0.5
        }
    }
}
```
</Accordion>
<Accordion title="Boolean">
```jsonc
{
    "environmentOverrides": {
        "SOURCEBOT_TELEMETRY_DISABLED": {
            "type": "boolean",
            "value": false
        }
    }
}
```
</Accordion>
</AccordionGroup>
**Note:** Overrides are **not** set as system environment variables. Instead, they are resolved at runtime on startup and stored in memory.
<Accordion title="Schema reference">
[schemas/v3/environmentOverrides.json](https://github.com/sourcebot-dev/sourcebot/blob/main/schemas/v3/environmentOverrides.json)
<EnvironmentOverridesSchema />
</Accordion>