feat(ee): GitLab permission syncing (#585)

Brendan Kellam 2025-10-30 11:08:10 -07:00 committed by GitHub
parent 384aa9ebe6
commit 4899c9fbc7
10 changed files with 211 additions and 49 deletions


@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased] ## [Unreleased]
### Added
- [Experimental][Sourcebot EE] Added GitLab permission syncing. [#585](https://github.com/sourcebot-dev/sourcebot/pull/585)
### Fixed ### Fixed
- [ask sb] Fixed issue where reasoning tokens would appear in `text` content for openai compatible models. [#582](https://github.com/sourcebot-dev/sourcebot/pull/582) - [ask sb] Fixed issue where reasoning tokens would appear in `text` content for openai compatible models. [#582](https://github.com/sourcebot-dev/sourcebot/pull/582)
- Fixed issue with GitHub app token tracking and refreshing. [#583](https://github.com/sourcebot-dev/sourcebot/pull/583) - Fixed issue with GitHub app token tracking and refreshing. [#583](https://github.com/sourcebot-dev/sourcebot/pull/583)


@ -52,6 +52,14 @@ Optional environment variables:
[Auth.js GitLab Provider Docs](https://authjs.dev/getting-started/providers/gitlab) [Auth.js GitLab Provider Docs](https://authjs.dev/getting-started/providers/gitlab)
Authentication using GitLab is supported via an [OAuth 2.0 app](https://docs.gitlab.com/integration/oauth_provider/#create-an-instance-wide-application) installed on the GitLab instance. Follow the instructions in the [GitLab docs](https://docs.gitlab.com/integration/oauth_provider/) to create an app. The callback URL should be configured to `<sourcebot_deployment_url>/api/auth/callback/gitlab`, and the following scopes need to be set:
| Scope | Required | Notes |
|------------|----------|----------------------------------------------------------------------------------------------------|
| read_user | Yes | Allows Sourcebot to read basic user information required for authentication. |
| read_api | Conditional | Required **only** when [permission syncing](/docs/features/permission-syncing) is enabled. Enables Sourcebot to list all repositories and projects for the authenticated user. |
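The scope table above can be sketched as a small helper that builds the space-delimited `scope` value sent to GitLab. This is a minimal illustration: `buildGitLabScopes` and its `permissionSyncEnabled` parameter are hypothetical names, not part of Sourcebot's API.

```typescript
// Hypothetical helper (not part of Sourcebot): builds the space-delimited
// OAuth scope string described by the scope table.
const buildGitLabScopes = (permissionSyncEnabled: boolean): string => {
    // `read_user` is always required for authentication.
    const scopes = ["read_user"];

    // `read_api` is only requested when permission syncing is enabled.
    if (permissionSyncEnabled) {
        scopes.push("read_api");
    }

    return scopes.join(" ");
};

console.log(buildGitLabScopes(false));
console.log(buildGitLabScopes(true));
```

Requesting `read_api` only when it is needed keeps the OAuth consent screen as narrow as possible for deployments that do not use permission syncing.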
**Required environment variables:** **Required environment variables:**
- `AUTH_EE_GITLAB_CLIENT_ID` - `AUTH_EE_GITLAB_CLIENT_ID`
- `AUTH_EE_GITLAB_CLIENT_SECRET` - `AUTH_EE_GITLAB_CLIENT_SECRET`


@ -35,7 +35,7 @@ We are actively working on supporting more code hosts. If you'd like to see a sp
| Platform | Permission syncing | | Platform | Permission syncing |
|:----------|------------------------------| |:----------|------------------------------|
| [GitHub (GHEC & GHEC Server)](/docs/features/permission-syncing#github) | ✅ | | [GitHub (GHEC & GHEC Server)](/docs/features/permission-syncing#github) | ✅ |
| GitLab | 🛑 | | [GitLab (Self-managed & Cloud)](/docs/features/permission-syncing#gitlab) | ✅ |
| Bitbucket Cloud | 🛑 | | Bitbucket Cloud | 🛑 |
| Bitbucket Data Center | 🛑 | | Bitbucket Data Center | 🛑 |
| Gitea | 🛑 | | Gitea | 🛑 |
@ -59,6 +59,18 @@ Permission syncing works with **GitHub.com**, **GitHub Enterprise Cloud**, and *
- A GitHub OAuth provider must be configured to (1) correlate a Sourcebot user with a GitHub user, and (2) to list repositories that the user has access to for [User driven syncing](/docs/features/permission-syncing#how-it-works). - A GitHub OAuth provider must be configured to (1) correlate a Sourcebot user with a GitHub user, and (2) to list repositories that the user has access to for [User driven syncing](/docs/features/permission-syncing#how-it-works).
- OAuth tokens must assume the `repo` scope in order to use the [List repositories for the authenticated user API](https://docs.github.com/en/rest/repos/repos?apiVersion=2022-11-28#list-repositories-for-the-authenticated-user) during [User driven syncing](/docs/features/permission-syncing#how-it-works). Sourcebot **will only** use this token for **reads**. - OAuth tokens must assume the `repo` scope in order to use the [List repositories for the authenticated user API](https://docs.github.com/en/rest/repos/repos?apiVersion=2022-11-28#list-repositories-for-the-authenticated-user) during [User driven syncing](/docs/features/permission-syncing#how-it-works). Sourcebot **will only** use this token for **reads**.
## GitLab
Prerequisite: [Add GitLab as an OAuth provider](/docs/configuration/auth/providers#gitlab).
Permission syncing works with **GitLab Self-managed** and **GitLab Cloud**. Users who hold the **Guest** role or above in a group or project will have their access synced to Sourcebot. Both direct and indirect memberships are synced. For more details, see the [GitLab docs](https://docs.gitlab.com/user/project/members/#membership-types).
**Notes:**
- A GitLab OAuth provider must be configured to (1) correlate a Sourcebot user with a GitLab user, and (2) to list repositories that the user has access to for [User driven syncing](/docs/features/permission-syncing#how-it-works).
- OAuth tokens require the `read_api` scope in order to use the [List projects for the authenticated user API](https://docs.gitlab.com/ee/api/projects.html#list-all-projects) during [User driven syncing](/docs/features/permission-syncing#how-it-works).
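As a rough sketch of the second note, the projects returned for a user can be matched against indexed repositories by their external ID. The types and the `resolveAccessibleRepoIds` function below are hypothetical and simplified. The actual change filters by visibility through the API call rather than locally, and reads repositories from the database.

```typescript
// Hypothetical shapes for illustration; the real GitLab and Sourcebot
// records carry many more fields.
type GitLabProject = { id: number; visibility: "private" | "internal" | "public" };
type IndexedRepo = { id: number; external_codeHostType: string; external_id: string };

// Returns the IDs of indexed GitLab repos that the user can access.
// Public projects are skipped because no mapping is needed for them.
const resolveAccessibleRepoIds = (
    projects: GitLabProject[],
    repos: IndexedRepo[],
): Set<number> => {
    const projectIds = new Set(
        projects
            .filter(project => project.visibility !== "public")
            .map(project => project.id.toString()),
    );

    return new Set(
        repos
            .filter(repo => repo.external_codeHostType === "gitlab" && projectIds.has(repo.external_id))
            .map(repo => repo.id),
    );
};
```

GitLab project IDs are numeric, so they are converted to strings before being compared with the stored `external_id`.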
# How it works # How it works
Permission syncing works by periodically syncing ACLs from the code host(s) to Sourcebot to build an internal mapping between Users and Repositories. This mapping is hydrated in two directions: Permission syncing works by periodically syncing ACLs from the code host(s) to Sourcebot to build an internal mapping between Users and Repositories. This mapping is hydrated in two directions:


@ -5,6 +5,7 @@ export const SINGLE_TENANT_ORG_ID = 1;
export const PERMISSION_SYNC_SUPPORTED_CODE_HOST_TYPES = [ export const PERMISSION_SYNC_SUPPORTED_CODE_HOST_TYPES = [
'github', 'github',
'gitlab',
]; ];
export const REPOS_CACHE_DIR = path.join(env.DATA_CACHE_DIR, 'repos'); export const REPOS_CACHE_DIR = path.join(env.DATA_CACHE_DIR, 'repos');


@ -7,6 +7,7 @@ import { Redis } from 'ioredis';
import { PERMISSION_SYNC_SUPPORTED_CODE_HOST_TYPES } from "../constants.js"; import { PERMISSION_SYNC_SUPPORTED_CODE_HOST_TYPES } from "../constants.js";
import { env } from "../env.js"; import { env } from "../env.js";
import { createOctokitFromToken, getRepoCollaborators, GITHUB_CLOUD_HOSTNAME } from "../github.js"; import { createOctokitFromToken, getRepoCollaborators, GITHUB_CLOUD_HOSTNAME } from "../github.js";
import { createGitLabFromPersonalAccessToken, getProjectMembers } from "../gitlab.js";
import { Settings } from "../types.js"; import { Settings } from "../types.js";
import { getAuthCredentialsForRepo } from "../utils.js"; import { getAuthCredentialsForRepo } from "../utils.js";
@ -16,7 +17,9 @@ type RepoPermissionSyncJob = {
const QUEUE_NAME = 'repoPermissionSyncQueue'; const QUEUE_NAME = 'repoPermissionSyncQueue';
const logger = createLogger('repo-permission-syncer'); const LOG_TAG = 'repo-permission-syncer';
const logger = createLogger(LOG_TAG);
const createJobLogger = (jobId: string) => createLogger(`${LOG_TAG}:job:${jobId}`);
export class RepoPermissionSyncer { export class RepoPermissionSyncer {
private queue: Queue<RepoPermissionSyncJob>; private queue: Queue<RepoPermissionSyncJob>;
@ -109,28 +112,31 @@ export class RepoPermissionSyncer {
} }
     private async schedulePermissionSync(repos: Repo[]) {
-        await this.db.$transaction(async (tx) => {
-            const jobs = await tx.repoPermissionSyncJob.createManyAndReturn({
-                data: repos.map(repo => ({
-                    repoId: repo.id,
-                })),
-            });
-
-            await this.queue.addBulk(jobs.map((job) => ({
-                name: 'repoPermissionSyncJob',
-                data: {
-                    jobId: job.id,
-                },
-                opts: {
-                    removeOnComplete: env.REDIS_REMOVE_ON_COMPLETE,
-                    removeOnFail: env.REDIS_REMOVE_ON_FAIL,
-                }
-            })))
-        });
+        // @note: we don't perform this in a transaction because
+        // we want to avoid the situation where a job is created and run
+        // prior to the transaction being committed.
+        const jobs = await this.db.repoPermissionSyncJob.createManyAndReturn({
+            data: repos.map(repo => ({
+                repoId: repo.id,
+            })),
+        });
+
+        await this.queue.addBulk(jobs.map((job) => ({
+            name: 'repoPermissionSyncJob',
+            data: {
+                jobId: job.id,
+            },
+            opts: {
+                removeOnComplete: env.REDIS_REMOVE_ON_COMPLETE,
+                removeOnFail: env.REDIS_REMOVE_ON_FAIL,
+            }
+        })))
     }
private async runJob(job: Job<RepoPermissionSyncJob>) { private async runJob(job: Job<RepoPermissionSyncJob>) {
const id = job.data.jobId; const id = job.data.jobId;
const logger = createJobLogger(id);
const { repo } = await this.db.repoPermissionSyncJob.update({ const { repo } = await this.db.repoPermissionSyncJob.update({
where: { where: {
id, id,
@ -194,6 +200,33 @@ export class RepoPermissionSyncer {
}, },
}); });
return accounts.map(account => account.userId);
} else if (repo.external_codeHostType === 'gitlab') {
const api = await createGitLabFromPersonalAccessToken({
token: credentials.token,
url: credentials.hostUrl,
});
const projectId = repo.external_id;
if (!projectId) {
throw new Error(`Repo ${id} does not have an external_id`);
}
const members = await getProjectMembers(projectId, api);
const gitlabUserIds = members.map(member => member.id.toString());
const accounts = await this.db.account.findMany({
where: {
provider: 'gitlab',
providerAccountId: {
in: gitlabUserIds,
}
},
select: {
userId: true,
},
});
return accounts.map(account => account.userId); return accounts.map(account => account.userId);
} }
@ -221,6 +254,8 @@ export class RepoPermissionSyncer {
} }
private async onJobCompleted(job: Job<RepoPermissionSyncJob>) { private async onJobCompleted(job: Job<RepoPermissionSyncJob>) {
const logger = createJobLogger(job.data.jobId);
const { repo } = await this.db.repoPermissionSyncJob.update({ const { repo } = await this.db.repoPermissionSyncJob.update({
where: { where: {
id: job.data.jobId, id: job.data.jobId,
@ -243,6 +278,8 @@ export class RepoPermissionSyncer {
} }
private async onJobFailed(job: Job<RepoPermissionSyncJob> | undefined, err: Error) { private async onJobFailed(job: Job<RepoPermissionSyncJob> | undefined, err: Error) {
const logger = createJobLogger(job?.data.jobId ?? 'unknown');
Sentry.captureException(err, { Sentry.captureException(err, {
tags: { tags: {
jobId: job?.data.jobId, jobId: job?.data.jobId,


@ -6,10 +6,13 @@ import { Redis } from "ioredis";
import { PERMISSION_SYNC_SUPPORTED_CODE_HOST_TYPES } from "../constants.js"; import { PERMISSION_SYNC_SUPPORTED_CODE_HOST_TYPES } from "../constants.js";
import { env } from "../env.js"; import { env } from "../env.js";
import { createOctokitFromToken, getReposForAuthenticatedUser } from "../github.js"; import { createOctokitFromToken, getReposForAuthenticatedUser } from "../github.js";
import { createGitLabFromOAuthToken, getProjectsForAuthenticatedUser } from "../gitlab.js";
import { hasEntitlement } from "@sourcebot/shared"; import { hasEntitlement } from "@sourcebot/shared";
import { Settings } from "../types.js"; import { Settings } from "../types.js";
const logger = createLogger('user-permission-syncer'); const LOG_TAG = 'user-permission-syncer';
const logger = createLogger(LOG_TAG);
const createJobLogger = (jobId: string) => createLogger(`${LOG_TAG}:job:${jobId}`);
const QUEUE_NAME = 'userPermissionSyncQueue'; const QUEUE_NAME = 'userPermissionSyncQueue';
@ -110,28 +113,31 @@ export class UserPermissionSyncer {
} }
     private async schedulePermissionSync(users: User[]) {
-        await this.db.$transaction(async (tx) => {
-            const jobs = await tx.userPermissionSyncJob.createManyAndReturn({
-                data: users.map(user => ({
-                    userId: user.id,
-                })),
-            });
-
-            await this.queue.addBulk(jobs.map((job) => ({
-                name: 'userPermissionSyncJob',
-                data: {
-                    jobId: job.id,
-                },
-                opts: {
-                    removeOnComplete: env.REDIS_REMOVE_ON_COMPLETE,
-                    removeOnFail: env.REDIS_REMOVE_ON_FAIL,
-                }
-            })))
-        });
+        // @note: we don't perform this in a transaction because
+        // we want to avoid the situation where a job is created and run
+        // prior to the transaction being committed.
+        const jobs = await this.db.userPermissionSyncJob.createManyAndReturn({
+            data: users.map(user => ({
+                userId: user.id,
+            })),
+        });
+
+        await this.queue.addBulk(jobs.map((job) => ({
+            name: 'userPermissionSyncJob',
+            data: {
+                jobId: job.id,
+            },
+            opts: {
+                removeOnComplete: env.REDIS_REMOVE_ON_COMPLETE,
+                removeOnFail: env.REDIS_REMOVE_ON_FAIL,
+            }
+        })))
     }
private async runJob(job: Job<UserPermissionSyncJob>) { private async runJob(job: Job<UserPermissionSyncJob>) {
const id = job.data.jobId; const id = job.data.jobId;
const logger = createJobLogger(id);
const { user } = await this.db.userPermissionSyncJob.update({ const { user } = await this.db.userPermissionSyncJob.update({
where: { where: {
id, id,
@ -183,6 +189,37 @@ export class UserPermissionSyncer {
} }
}); });
repos.forEach(repo => aggregatedRepoIds.add(repo.id));
} else if (account.provider === 'gitlab') {
if (!account.access_token) {
throw new Error(`User '${user.email}' does not have a GitLab OAuth access token associated with their GitLab account.`);
}
const api = await createGitLabFromOAuthToken({
oauthToken: account.access_token,
url: env.AUTH_EE_GITLAB_BASE_URL,
});
// @note: we only care about the private and internal repos since we don't need to build a mapping
// for public repos.
// @see: packages/web/src/prisma.ts
const privateGitLabProjects = await getProjectsForAuthenticatedUser('private', api);
const internalGitLabProjects = await getProjectsForAuthenticatedUser('internal', api);
const gitLabProjectIds = [
...privateGitLabProjects,
...internalGitLabProjects,
].map(project => project.id.toString());
const repos = await this.db.repo.findMany({
where: {
external_codeHostType: 'gitlab',
external_id: {
in: gitLabProjectIds,
}
}
});
repos.forEach(repo => aggregatedRepoIds.add(repo.id)); repos.forEach(repo => aggregatedRepoIds.add(repo.id));
} }
} }
@ -212,6 +249,8 @@ export class UserPermissionSyncer {
} }
private async onJobCompleted(job: Job<UserPermissionSyncJob>) { private async onJobCompleted(job: Job<UserPermissionSyncJob>) {
const logger = createJobLogger(job.data.jobId);
const { user } = await this.db.userPermissionSyncJob.update({ const { user } = await this.db.userPermissionSyncJob.update({
where: { where: {
id: job.data.jobId, id: job.data.jobId,
@ -234,6 +273,8 @@ export class UserPermissionSyncer {
} }
private async onJobFailed(job: Job<UserPermissionSyncJob> | undefined, err: Error) { private async onJobFailed(job: Job<UserPermissionSyncJob> | undefined, err: Error) {
const logger = createJobLogger(job?.data.jobId ?? 'unknown');
Sentry.captureException(err, { Sentry.captureException(err, {
tags: { tags: {
jobId: job?.data.jobId, jobId: job?.data.jobId,
@ -260,7 +301,7 @@ export class UserPermissionSyncer {
logger.error(errorMessage(user.email ?? user.id)); logger.error(errorMessage(user.email ?? user.id));
} else { } else {
logger.error(errorMessage('unknown user (id not found)')); logger.error(errorMessage('unknown job (id not found)'));
} }
} }
} }
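The `@note` in `schedulePermissionSync` describes an ordering concern: a worker may pick up a job as soon as it is enqueued, so the job row must already be visible by then. A minimal in-memory sketch of that ordering follows. `InMemoryJobStore`, `InMemoryQueue`, and `scheduleSync` are hypothetical stand-ins and do not reflect the Prisma or BullMQ APIs.

```typescript
// Hypothetical in-memory stand-ins, used only to illustrate the ordering.
type JobRow = { id: string; userId: string };

class InMemoryJobStore {
    rows = new Map<string, JobRow>();

    // Persists one job row per user and returns the created rows.
    async createMany(userIds: string[]): Promise<JobRow[]> {
        return userIds.map(userId => {
            const row = { id: `job-${this.rows.size + 1}`, userId };
            this.rows.set(row.id, row);
            return row;
        });
    }
}

class InMemoryQueue {
    handler: (jobId: string) => void;

    constructor(handler: (jobId: string) => void) {
        this.handler = handler;
    }

    // Simulates a worker that runs each job the moment it is enqueued.
    async addBulk(jobs: { data: { jobId: string } }[]): Promise<void> {
        jobs.forEach(job => this.handler(job.data.jobId));
    }
}

// Rows are written first and enqueued second, so a worker that starts
// immediately can always find the row for its job.
const scheduleSync = async (
    store: InMemoryJobStore,
    queue: InMemoryQueue,
    userIds: string[],
): Promise<JobRow[]> => {
    const rows = await store.createMany(userIds);
    await queue.addBulk(rows.map(row => ({ data: { jobId: row.id } })));
    return rows;
};
```

If the enqueue happened inside an uncommitted transaction instead, the simulated worker could run before the row became visible, which is the situation the note aims to avoid.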


@ -56,6 +56,7 @@ export const env = createEnv({
EXPERIMENT_EE_PERMISSION_SYNC_ENABLED: booleanSchema.default('false'), EXPERIMENT_EE_PERMISSION_SYNC_ENABLED: booleanSchema.default('false'),
AUTH_EE_GITHUB_BASE_URL: z.string().optional(), AUTH_EE_GITHUB_BASE_URL: z.string().optional(),
AUTH_EE_GITLAB_BASE_URL: z.string().default("https://gitlab.com"),
}, },
runtimeEnv: process.env, runtimeEnv: process.env,
emptyStringAsUndefined: true, emptyStringAsUndefined: true,


@ -12,6 +12,28 @@ import { getTokenFromConfig } from "@sourcebot/crypto";
const logger = createLogger('gitlab'); const logger = createLogger('gitlab');
export const GITLAB_CLOUD_HOSTNAME = "gitlab.com"; export const GITLAB_CLOUD_HOSTNAME = "gitlab.com";
export const createGitLabFromPersonalAccessToken = async ({ token, url }: { token?: string, url?: string }) => {
const isGitLabCloud = url ? new URL(url).hostname === GITLAB_CLOUD_HOSTNAME : false;
return new Gitlab({
token,
...(isGitLabCloud ? {} : {
host: url,
}),
queryTimeout: env.GITLAB_CLIENT_QUERY_TIMEOUT_SECONDS * 1000,
});
}
export const createGitLabFromOAuthToken = async ({ oauthToken, url }: { oauthToken?: string, url?: string }) => {
const isGitLabCloud = url ? new URL(url).hostname === GITLAB_CLOUD_HOSTNAME : false;
return new Gitlab({
oauthToken,
...(isGitLabCloud ? {} : {
host: url,
}),
queryTimeout: env.GITLAB_CLIENT_QUERY_TIMEOUT_SECONDS * 1000,
});
}
export const getGitLabReposFromConfig = async (config: GitlabConnectionConfig, orgId: number, db: PrismaClient) => { export const getGitLabReposFromConfig = async (config: GitlabConnectionConfig, orgId: number, db: PrismaClient) => {
const hostname = config.url ? const hostname = config.url ?
new URL(config.url).hostname : new URL(config.url).hostname :
@ -23,14 +45,9 @@ export const getGitLabReposFromConfig = async (config: GitlabConnectionConfig, o
env.FALLBACK_GITLAB_CLOUD_TOKEN : env.FALLBACK_GITLAB_CLOUD_TOKEN :
undefined; undefined;
const api = new Gitlab({ const api = await createGitLabFromPersonalAccessToken({
...(token ? { token,
token, url: config.url,
} : {}),
...(config.url ? {
host: config.url,
} : {}),
queryTimeout: env.GITLAB_CLIENT_QUERY_TIMEOUT_SECONDS * 1000,
}); });
let allRepos: ProjectSchema[] = []; let allRepos: ProjectSchema[] = [];
@ -262,3 +279,37 @@ export const shouldExcludeProject = ({
return false; return false;
} }
export const getProjectMembers = async (projectId: string, api: InstanceType<typeof Gitlab>) => {
try {
const fetchFn = () => api.ProjectMembers.all(projectId, {
perPage: 100,
includeInherited: true,
});
const members = await fetchWithRetry(fetchFn, `project ${projectId}`, logger);
return members as Array<{ id: number }>;
} catch (error) {
Sentry.captureException(error);
logger.error(`Failed to fetch members for project ${projectId}.`, error);
throw error;
}
}
export const getProjectsForAuthenticatedUser = async (visibility: 'private' | 'internal' | 'public' | 'all' = 'all', api: InstanceType<typeof Gitlab>) => {
try {
const fetchFn = () => api.Projects.all({
membership: true,
...(visibility !== 'all' ? {
visibility,
} : {}),
perPage: 100,
});
const response = await fetchWithRetry(fetchFn, `authenticated user`, logger);
return response;
} catch (error) {
Sentry.captureException(error);
logger.error(`Failed to fetch projects for authenticated user.`, error);
throw error;
}
}
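Both client factories in this file share the same host-selection rule: a custom `host` is passed only when the URL does not point at GitLab Cloud. That rule can be sketched in isolation as below. `resolveGitLabHost` is a hypothetical helper, and the sketch assumes the client falls back to gitlab.com when no host is given.

```typescript
const GITLAB_CLOUD_HOSTNAME = "gitlab.com";

// Hypothetical helper: returns the host to pass to the GitLab client, or
// undefined when the client's default (assumed to be gitlab.com) applies.
const resolveGitLabHost = (url?: string): string | undefined => {
    if (!url) {
        return undefined;
    }

    const isGitLabCloud = new URL(url).hostname === GITLAB_CLOUD_HOSTNAME;
    return isGitLabCloud ? undefined : url;
};

console.log(resolveGitLabHost("https://gitlab.example.com"));
```

Comparing parsed hostnames rather than raw strings means a trailing slash or path on the configured URL does not change the result.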


@ -121,7 +121,6 @@ export const compileGitlabConfig = async (
const projectUrl = `${hostUrl}/${project.path_with_namespace}`; const projectUrl = `${hostUrl}/${project.path_with_namespace}`;
const cloneUrl = new URL(project.http_url_to_repo); const cloneUrl = new URL(project.http_url_to_repo);
const isFork = project.forked_from_project !== undefined; const isFork = project.forked_from_project !== undefined;
// @todo: we will need to double check whether 'internal' should also be considered public or not.
const isPublic = project.visibility === 'public'; const isPublic = project.visibility === 'public';
const repoDisplayName = project.path_with_namespace; const repoDisplayName = project.path_with_namespace;
const repoName = path.join(repoNameRoot, repoDisplayName); const repoName = path.join(repoNameRoot, repoDisplayName);


@ -51,7 +51,16 @@ export const getSSOProviders = (): Provider[] => {
authorization: { authorization: {
url: `${env.AUTH_EE_GITLAB_BASE_URL}/oauth/authorize`, url: `${env.AUTH_EE_GITLAB_BASE_URL}/oauth/authorize`,
params: { params: {
scope: "read_user", scope: [
"read_user",
// Permission syncing requires the `read_api` scope in order to fetch projects
// for the authenticated user and project members.
// @see: https://docs.gitlab.com/ee/api/projects.html#list-all-projects
...(env.EXPERIMENT_EE_PERMISSION_SYNC_ENABLED === 'true' && hasEntitlement('permission-syncing') ?
['read_api'] :
[]
),
].join(' '),
}, },
}, },
token: { token: {