---
title: Config File
sidebarTitle: Config file
---

import ConfigSchema from '/snippets/schemas/v3/index.schema.mdx'
import EnvironmentOverridesSchema from '/snippets/schemas/v3/environmentOverrides.schema.mdx'

When self-hosting Sourcebot, you **must** provide it with a config file. To do so, place the config file in a volume that's mounted to Sourcebot, and set the `CONFIG_PATH` environment variable to the path of that file. For example:

```bash icon="terminal" Passing in a CONFIG_PATH to Sourcebot
docker run \
    -v $(pwd)/config.json:/data/config.json \
    -e CONFIG_PATH=/data/config.json \
    ... \ # other options
    ghcr.io/sourcebot-dev/sourcebot:latest
```

The config file tells Sourcebot which repos to index, which language models to use, and various other settings as defined in the [schema](#config-file-schema).

# Config File Schema

The config file you provide to Sourcebot must follow the [schema](https://github.com/sourcebot-dev/sourcebot/blob/main/schemas/v3/index.json). This schema consists of the following properties:

- [Connections](/docs/connections/overview) (`connections`): Defines a set of connections that tell Sourcebot which repos to index and from where
- [Language Models](/docs/configuration/language-model-providers) (`models`): Defines a set of language model providers for use with [Ask Sourcebot](/docs/features/ask)
- [Settings](#settings) (`settings`): Additional settings to tweak your Sourcebot deployment
- [Search Contexts](/docs/features/search/search-contexts) (`contexts`): Groupings of repos that you can search against

# Config File Syncing

Sourcebot syncs the config file on startup, and again automatically whenever it detects a change to the file.
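
As a rough sketch, a small config file that combines the `connections` and `settings` properties might look like the following. The connection name `example-github` and its `type` and `repos` fields are assumptions for illustration (the exact connection shape is covered in the connections docs), and the values are arbitrary. The two `settings` keys are ones documented on this page.

```jsonc
{
    // Assumed connection shape; check the connections docs for the exact fields.
    "connections": {
        "example-github": {
            "type": "github",
            "repos": ["sourcebot-dev/sourcebot"]
        }
    },
    "settings": {
        // Skip files larger than 1 MiB (the value is in bytes).
        "maxFileSize": 1048576,
        // Re-index all repositories every 30 minutes (the value is in milliseconds).
        "reindexIntervalMs": 1800000
    }
}
```

The file above would be the one mounted into the container and referenced by `CONFIG_PATH`.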

# Settings

The following are settings that can be provided in your config file to modify Sourcebot's behavior:

| Setting | Type | Default | Minimum | Description / Notes |
|-------------------------------------------------|---------|------------|---------|----------------------------------------------------------------------------------------|
| `maxFileSize` | number | 2 MB | 1 | Maximum size (bytes) of a file to index. Files exceeding this are skipped. |
| `maxTrigramCount` | number | 20 000 | 1 | Maximum trigrams per document. Larger files are skipped. |
| `reindexIntervalMs` | number | 1 hour | 1 | Interval at which all repositories are re‑indexed. |
| `resyncConnectionIntervalMs` | number | 24 hours | 1 | Interval for checking connections that need re‑syncing. |
| `resyncConnectionPollingIntervalMs` | number | 1 second | 1 | DB polling rate for connections that need re‑syncing. |
| `reindexRepoPollingIntervalMs` | number | 1 second | 1 | DB polling rate for repos that should be re‑indexed. |
| `maxConnectionSyncJobConcurrency` | number | 8 | 1 | Concurrent connection‑sync jobs. |
| `maxRepoIndexingJobConcurrency` | number | 8 | 1 | Concurrent repo‑indexing jobs. |
| `maxRepoGarbageCollectionJobConcurrency` | number | 8 | 1 | Concurrent repo‑garbage‑collection jobs. |
| `repoGarbageCollectionGracePeriodMs` | number | 10 seconds | 1 | Grace period to avoid deleting shards while loading. |
| `repoIndexTimeoutMs` | number | 2 hours | 1 | Timeout for a single repo‑indexing run. |
| `enablePublicAccess` **(deprecated)** | boolean | false | — | Use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead. |
| `experiment_repoDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 | Interval at which the repo permission syncer should run. |
| `experiment_userDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 | Interval at which the user permission syncer should run. |

# Tokens

Tokens are used to securely pass secrets to Sourcebot in a config file.
They are used in various places, including connections, language model providers, and auth providers. Tokens can be passed as either environment variables or Google Cloud secrets:

```json
{
    "token": {
        "env": "TOKEN_NAME"
    }
}
```

```json
{
    "token": {
        "googleCloudSecret": "projects//secrets//versions/"
    }
}
```

# Overriding environment variables from the config

You can set or override environment variables from the config file by using the `environmentOverrides` property. Overrides can be of type `string`, `number`, `boolean`, or a [token](/docs/configuration/config-file#tokens). Tokens are useful when you want to configure an environment variable using a Google Cloud Secret or another supported secret management service.

```jsonc
{
    "environmentOverrides": {
        "DATABASE_URL": {
            "type": "token",
            "value": {
                "googleCloudSecret": "projects//secrets/postgres-connection-string/versions/latest"
            }
        },
        "REDIS_URL": {
            "type": "token",
            "value": {
                "googleCloudSecret": "projects//secrets/redis-connection-string/versions/latest"
            }
        }
    }
}
```

```jsonc
{
    "environmentOverrides": {
        "EMAIL_FROM_ADDRESS": {
            "type": "string",
            "value": "hello@sourcebot.dev"
        }
    }
}
```

```jsonc
{
    "environmentOverrides": {
        "SOURCEBOT_CHAT_MODEL_TEMPERATURE": {
            "type": "number",
            "value": 0.5
        }
    }
}
```

```jsonc
{
    "environmentOverrides": {
        "SOURCEBOT_TELEMETRY_DISABLED": {
            "type": "boolean",
            "value": false
        }
    }
}
```

**Note:** Overrides are **not** set as system environment variables. Instead, they are resolved at runtime on startup and stored in memory.

Schema reference: [schemas/v3/environmentOverrides.json](https://github.com/sourcebot-dev/sourcebot/blob/main/schemas/v3/environmentOverrides.json)