Sourcebot is a self-hosted tool that helps you understand your codebase.

Blazingly fast code search 🏎️

About

Sourcebot is a fast code indexing and search tool for your codebases. It is built on top of the zoekt indexer, originally authored by Han-Wen Nienhuys and now maintained by Sourcegraph.

Demo video

Getting Started

Using Docker

  1. Install Docker

  2. Create a config.json file and list the repositories you want to index. The JSON schema index.json defines the structure of the config file and the available options. For example, to have Sourcebot index its own code, we could use the following config found in sample-config.json:

    {
        "$schema": "https://raw.githubusercontent.com/TaqlaAI/sourcebot/main/schemas/index.json",
        "Configs": [
            {
                "Type": "github",
                "GitHubOrg": "TaqlaAI",
                "Name": "^sourcebot$"
            }
        ]
    }
    

Sourcebot also supports indexing GitLab & BitBucket. Check out the index.json for a full list of available options.

  3. Create a Personal Access Token (PAT) to authenticate with your code host(s):

    GitHub

    Generate a GitHub Personal Access Token (PAT) here. If you are indexing public repositories only, you can select the public_repo scope; otherwise, you will need the repo scope.

    GitLab

    TODO

    BitBucket

    TODO

  4. Launch the latest image from the ghcr registry:

    GitHub
    docker run -p 3000:3000 --rm --name sourcebot -v $(pwd):/data -e GITHUB_TOKEN=<token> ghcr.io/taqlaai/sourcebot:main
    
    GitLab
    docker run -p 3000:3000 --rm --name sourcebot -v $(pwd):/data -e GITLAB_TOKEN=<token> ghcr.io/taqlaai/sourcebot:main
    
    BitBucket

    TODO

    Two things should happen: (1) a .sourcebot directory will be created containing the mirror repositories and indexes, and (2) you will see output similar to:

    INFO spawned: 'node-server' with pid 10
    INFO spawned: 'zoekt-indexserver' with pid 11
    INFO spawned: 'zoekt-webserver' with pid 12
    run [zoekt-mirror-github -dest /data/.sourcebot/repos -delete -org <org>]
    ...
    INFO success: node-server entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    INFO success: zoekt-indexserver entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    INFO success: zoekt-webserver entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    

    zoekt will now index your repositories (at HEAD). By default, it will re-index existing repositories every hour, and discover new repositories every 24 hours.

  5. Go to http://localhost:3000. Once an index has been created, you can start searching.
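
Before launching the container, it can help to confirm that the inputs the steps above rely on are in place. The following is a minimal sketch, not part of Sourcebot: `sourcebot_preflight` is a hypothetical helper name, and it assumes config.json sits in the directory you mount at /data (the current directory in the commands above).

```shell
# Preflight sketch: check the inputs the docker run command depends on.
# Assumption: config.json lives in the directory mounted at /data.
sourcebot_preflight() {
    dir="${1:-$PWD}"
    if [ ! -f "$dir/config.json" ]; then
        echo "missing $dir/config.json" >&2
        return 1
    fi
    if [ -z "${GITHUB_TOKEN:-}" ]; then
        echo "GITHUB_TOKEN is not set" >&2
        return 1
    fi
    return 0
}

# Usage (not executed here):
#   export GITHUB_TOKEN=<token>
#   sourcebot_preflight && docker run -p 3000:3000 --rm --name sourcebot \
#       -v "$(pwd):/data" -e GITHUB_TOKEN ghcr.io/taqlaai/sourcebot:main
```

In the usage comment, `-e GITHUB_TOKEN` without a value asks Docker to pass the variable through from the host environment, which keeps the token itself off the command line.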

Building Sourcebot

  1. Install Go and Node.js

  2. Clone the repository with submodules:

    git clone --recurse-submodules https://github.com/TaqlaAI/sourcebot.git
    
  3. Run make to build zoekt and install dependencies:

    cd sourcebot
    make
    

The zoekt binaries and web dependencies are placed into bin and node_modules respectively.

  4. Create a config.json file and list the repositories you want to index. The JSON schema index.json defines the structure of the config file and the available options. For example, to have Sourcebot index its own code, we could use the following config found in sample-config.json:

    {
        "$schema": "https://raw.githubusercontent.com/TaqlaAI/sourcebot/main/schemas/index.json",
        "Configs": [
            {
                "Type": "github",
                "GitHubOrg": "TaqlaAI",
                "Name": "^sourcebot$"
            }
        ]
    }
    
  5. Create a Personal Access Token (PAT) to authenticate with a code host:

    GitHub

    Generate a GitHub Personal Access Token (PAT) here. If you are indexing public repositories only, you can select the public_repo scope; otherwise, you will need the repo scope.

    Create a text file named .github-token in your home directory and paste the token in it. The file should look like:

    ghp_...
    

    zoekt will read this file to authenticate with GitHub.

    GitLab TODO
    BitBucket TODO
  6. Start Sourcebot with the command:

    yarn dev
    

    A .sourcebot directory will be created and zoekt will begin to index the repositories listed in config.json.

  7. Go to http://localhost:3000. Once an index has been created, you can start searching.
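
For the .github-token step above, one way to create the file from a shell is sketched below. `write_github_token` is a hypothetical helper name, and restricting the file to owner-only permissions is a precaution added here, not something the steps above require.

```shell
# Sketch: store a GitHub PAT in ~/.github-token, readable only by you.
# write_github_token is a hypothetical helper, not part of Sourcebot.
write_github_token() {
    token="$1"
    target="${2:-$HOME}/.github-token"
    # Subshell so the tighter umask does not leak into the caller.
    (
        umask 077
        printf '%s\n' "$token" > "$target"
    )
    # umask only affects newly created files, so also fix an existing one.
    chmod 600 "$target"
}

# Usage (not executed here):
#   write_github_token "ghp_..."
```

The optional second argument overrides the directory, which is mainly useful for trying the helper out without touching your real home directory.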

Disabling Telemetry

By default, Sourcebot collects anonymous usage data using PostHog. You can disable this by setting the environment variable SOURCEBOT_TELEMETRY_DISABLED to 1 in the docker run command:

docker run -e SOURCEBOT_TELEMETRY_DISABLED=1 ...stuff... ghcr.io/taqlaai/sourcebot:main

Or, if you are building locally, add the following to your .env file:

SOURCEBOT_TELEMETRY_DISABLED=1
NEXT_PUBLIC_SOURCEBOT_TELEMETRY_DISABLED=1
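
If you prefer to script the .env change, the sketch below appends the two variables only when they are not already present. `disable_telemetry` is a hypothetical helper name; it does not rewrite an existing line that sets either variable to a different value, so check the file by hand in that case.

```shell
# Sketch: add the telemetry opt-out variables to a .env file, skipping any
# line that is already there. disable_telemetry is a hypothetical helper.
disable_telemetry() {
    env_file="${1:-.env}"
    touch "$env_file"
    # If the file ends without a newline, add one so lines do not get joined.
    if [ -n "$(tail -c1 "$env_file")" ]; then
        printf '\n' >> "$env_file"
    fi
    for line in \
        "SOURCEBOT_TELEMETRY_DISABLED=1" \
        "NEXT_PUBLIC_SOURCEBOT_TELEMETRY_DISABLED=1"
    do
        # -x matches whole lines only, -F treats the pattern literally.
        if ! grep -qxF "$line" "$env_file"; then
            printf '%s\n' "$line" >> "$env_file"
        fi
    done
}

# Usage (not executed here), from the repository root:
#   disable_telemetry .env
```

Running the helper twice leaves the file unchanged the second time, so it is safe to include in a setup script.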