Sourcebot is a self-hosted tool that helps you understand your codebase.

Blazingly fast code search 🏎️

About

Sourcebot is a fast code indexing and search tool for your codebases. It is built on top of the zoekt indexer, originally authored by Han-Wen Nienhuys and now maintained by Sourcegraph.

Demo video

Features

  • 💻 One-command deployment: Get started instantly using Docker on your own machine.
  • 🔍 Multi-repo search: Effortlessly index and search through multiple public and private repositories (GitHub, GitLab, BitBucket).
  • ⚡ Lightning-fast performance: Built on top of the powerful Zoekt search engine.
  • 📂 Full file visualization: Instantly view the entire file when selecting any search result.
  • 🎨 Modern web application: Enjoy a sleek interface with features like syntax highlighting, light/dark mode, and vim-style navigation.

You can try out our public hosted demo here!

Getting Started

Get started with a single docker command:

docker run -p 3000:3000 --rm --name sourcebot ghcr.io/taqlaai/sourcebot:main

Navigate to localhost:3000 to start searching the Sourcebot repo. Want to search your own repos? Check out how to configure Sourcebot.

What does this command do?
  • Pull and run the Sourcebot docker image from ghcr.io/taqlaai/sourcebot:main. You'll need to make sure you have docker installed to do this.
  • Sourcebot will index itself to prepare for your search request.
  • Map port 3000 on your machine to port 3000 in the container (-p 3000:3000).
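
If port 3000 is already in use on your machine, only the host side of the mapping needs to change. The sketch below prints the command for a chosen host port. The function name sourcebot_cmd is hypothetical, and the sketch assumes the container keeps listening on port 3000, as in the command above.

```shell
# Hypothetical helper: print the docker command for a given host port.
# The container side of the mapping stays at 3000, as in the command above.
sourcebot_cmd() {
    host_port="${1:-3000}"
    echo "docker run -p ${host_port}:3000 --rm --name sourcebot ghcr.io/taqlaai/sourcebot:main"
}

# Example: print the command that would serve Sourcebot on localhost:8080.
sourcebot_cmd 8080
```

Run the printed command to start Sourcebot, then navigate to the host port you chose.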

Configuring Sourcebot

Sourcebot supports indexing and searching through public and private repositories hosted on GitHub, GitLab, and BitBucket. This section will guide you through configuring the repositories that Sourcebot indexes.

  1. Create a new folder on your machine to store your configs and the .sourcebot cache, and navigate into it:

    mkdir sourcebot_workspace
    cd sourcebot_workspace

  2. Create a new config following the configuration schema to specify which repositories Sourcebot should index. For example, to index Sourcebot itself:

    touch my_config.json
    echo '{
        "$schema": "https://raw.githubusercontent.com/TaqlaAI/sourcebot/main/schemas/index.json",
        "Configs": [
            {
                "Type": "github",
                "GitHubOrg": "TaqlaAI",
                "Name": "sourcebot"
            }
        ]
    }' > my_config.json

  3. Run Sourcebot and point it to the new config you created:

    docker run -p 3000:3000 --rm --name sourcebot -e CONFIG_PATH=./my_config.json -v $(pwd):/data ghcr.io/taqlaai/sourcebot:main

This command will also mount the current directory (-v $(pwd):/data) to allow Sourcebot to persist the .sourcebot cache.
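
If you create configs for several repositories, a heredoc avoids the quoting pitfalls of passing multi-line JSON on the command line. The helper below is a minimal sketch: the name write_config and its arguments are hypothetical, and it only fills in the three fields shown in the example config above.

```shell
# Hypothetical helper: write a one-repository GitHub config.
# Usage: write_config <org> <repo> [output-file]
write_config() {
    org="$1"
    repo="$2"
    out="${3:-my_config.json}"
    # \$schema is escaped so the shell writes a literal "$schema" key.
    cat > "$out" <<EOF
{
    "\$schema": "https://raw.githubusercontent.com/TaqlaAI/sourcebot/main/schemas/index.json",
    "Configs": [
        {
            "Type": "github",
            "GitHubOrg": "${org}",
            "Name": "${repo}"
        }
    ]
}
EOF
}
```

For example, write_config TaqlaAI sourcebot produces the same my_config.json as the step above.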

(Optional) Provide an access token to index private repositories

In order to allow Sourcebot to index your private repositories, you must provide it with an access token.

GitHub

Generate a GitHub Personal Access Token (PAT) here and make sure you select the repo scope.

You'll need to pass this PAT each time you run Sourcebot by setting the GITHUB_TOKEN environment variable:

docker run -p 3000:3000 --rm --name sourcebot -e GITHUB_TOKEN=[your-github-token] -v $(pwd):/data ghcr.io/taqlaai/sourcebot:main
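
To keep the token out of the command line, you can export GITHUB_TOKEN in your shell and let docker copy it into the container: passing -e with a variable name and no value forwards the variable from the calling environment. The sketch below wraps the command above. Both function names are hypothetical.

```shell
# Hypothetical helper: fail early when GITHUB_TOKEN is missing or empty.
require_github_token() {
    if [ -z "${GITHUB_TOKEN:-}" ]; then
        echo "GITHUB_TOKEN is not set" >&2
        return 1
    fi
}

# Hypothetical wrapper around the command above. "-e GITHUB_TOKEN" with no
# value tells docker to copy the variable from the calling environment, so
# the token itself does not appear in the command line.
run_sourcebot_with_token() {
    require_github_token || return 1
    docker run -p 3000:3000 --rm --name sourcebot \
        -e GITHUB_TOKEN -v "$(pwd)":/data ghcr.io/taqlaai/sourcebot:main
}
```

With the token exported (export GITHUB_TOKEN=[your-github-token]), calling run_sourcebot_with_token starts Sourcebot as before.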

GitLab

TODO

BitBucket

TODO

Build from source

Note

You don't need to build Sourcebot in order to use it! If you'd just like to use Sourcebot, please read how to configure Sourcebot.

If you'd like to make changes to Sourcebot you'll need to build from source:

  1. Install Go and Node.js. Node.js version 21.1.0 or later is required.

  2. Install ctags (required by zoekt-indexserver):

    Mac: brew install universal-ctags
    Ubuntu: apt-get install universal-ctags

  3. Clone the repository with submodules:

    git clone --recurse-submodules https://github.com/TaqlaAI/sourcebot.git
    
  4. Run make to build zoekt and install dependencies:

    cd sourcebot
    make
    

The zoekt binaries and web dependencies are placed into bin and node_modules respectively.
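
If make fails, it can help to confirm the prerequisites from the first two steps. The following is a rough sketch rather than part of the build: the function names are hypothetical, and it assumes the tools are on your PATH as go, node, and ctags.

```shell
# Hypothetical helper: succeed when dotted version $1 is >= version $2.
# Compares up to three numeric fields (major.minor.patch).
version_ge() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        split(have, h, "."); split(want, w, ".")
        for (i = 1; i <= 3; i++) {
            if ((h[i] + 0) > (w[i] + 0)) exit 0
            if ((h[i] + 0) < (w[i] + 0)) exit 1
        }
        exit 0
    }'
}

# Hypothetical helper: report missing tools and an outdated Node.js.
# Assumes the binaries are named go, node, and ctags.
check_prereqs() {
    status=0
    for tool in go node ctags; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            status=1
        fi
    done
    if command -v node >/dev/null 2>&1; then
        node_version="$(node --version)"   # printed with a leading "v"
        if ! version_ge "${node_version#v}" 21.1.0; then
            echo "Node.js ${node_version} is older than 21.1.0" >&2
            status=1
        fi
    fi
    return "$status"
}
```

Calling check_prereqs prints one line per problem it finds and returns a non-zero status if anything is missing.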

  1. Create a config.json file and list the repositories you want to index. The JSON schema in index.json defines the structure of the config file and the available options. For example, to index Sourcebot's own code, you could use the following config, found in sample-config.json:

    {
        "$schema": "https://raw.githubusercontent.com/TaqlaAI/sourcebot/main/schemas/index.json",
        "Configs": [
            {
                "Type": "github",
                "GitHubOrg": "TaqlaAI",
                "Name": "sourcebot"
            }
        ]
    }
    
  2. Create a Personal Access Token (PAT) to authenticate with a code host:

    GitHub: Generate a GitHub Personal Access Token (PAT) at https://github.com/settings/tokens/new. If you are indexing only public repositories, you can select the public_repo scope; otherwise you will need the repo scope.

    Create a text file named .github-token in your home directory and paste the token in it. The file should look like:

    ghp_...
    

    zoekt will read this file to authenticate with GitHub.

    GitLab TODO
    BitBucket TODO
  3. Start Sourcebot with the command:

    yarn dev
    

    A .sourcebot directory will be created and zoekt will begin to index the repositories listed in config.json.

  4. Go to http://localhost:3000 - once an index has been created, you can start searching.
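
Because the .github-token file from the token step holds a credential, you may want to create it so that only your user can read it. This is an optional sketch, not something the build requires. The function name save_github_token is hypothetical.

```shell
# Hypothetical helper: store a token in a file readable only by you.
# Usage: save_github_token <token> [file]   (file defaults to ~/.github-token)
save_github_token() {
    token="$1"
    file="${2:-$HOME/.github-token}"
    (
        umask 077                          # a new file is created as rw-------
        printf '%s\n' "$token" > "$file"
    )
    chmod 600 "$file"                      # also tighten a pre-existing file
}
```

For example, save_github_token ghp_... writes the token to .github-token in your home directory.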

Telemetry

By default, Sourcebot collects anonymized usage data through PostHog to help us improve the performance and reliability of our tool. We do not collect or transmit any information related to your codebase. In addition, all events are sanitized to ensure that no sensitive or identifying details leave your machine. The data we collect includes general usage statistics and metadata such as query performance (e.g., search duration, error rates) to monitor the application's health and functionality. This information helps us better understand how Sourcebot is used and where improvements can be made :)

If you'd like to disable all telemetry, you can do so by setting the environment variable SOURCEBOT_TELEMETRY_DISABLED to 1 in the docker run command:

docker run -e SOURCEBOT_TELEMETRY_DISABLED=1 /* additional args */ ghcr.io/taqlaai/sourcebot:main

Or if you are building locally, add the following to your .env file:

SOURCEBOT_TELEMETRY_DISABLED=1
NEXT_PUBLIC_SOURCEBOT_TELEMETRY_DISABLED=1
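
If you script your local setup, the two lines can be appended without creating duplicates on repeated runs. The helper below is a small sketch. The name disable_telemetry is hypothetical, and it leaves any other content of the file untouched.

```shell
# Hypothetical helper: append the telemetry opt-out lines to an env file,
# skipping any line that is already present.
disable_telemetry() {
    env_file="${1:-.env}"
    touch "$env_file"
    # Make sure the file ends with a newline before appending.
    if [ -s "$env_file" ] && [ -n "$(tail -c 1 "$env_file")" ]; then
        printf '\n' >> "$env_file"
    fi
    for line in \
        "SOURCEBOT_TELEMETRY_DISABLED=1" \
        "NEXT_PUBLIC_SOURCEBOT_TELEMETRY_DISABLED=1"
    do
        grep -qxF "$line" "$env_file" || printf '%s\n' "$line" >> "$env_file"
    done
}
```

Run disable_telemetry from the repository root to update .env, or pass a different path as the first argument.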