Commit graph

466 commits

Author SHA1 Message Date
Timothy Jaeryang Baek
01a5b97415 refac/fix: do not process xlsx files with azure doc intelligence 2025-09-29 23:05:24 -05:00
Timothy Jaeryang Baek
e7fa86aa26 chore: format 2025-09-29 00:58:21 -05:00
Tim Jaeryang Baek
2d94b8e905
Merge pull request #17837 from Classic298/milvus-multitenancy
feat: Implement Milvus multitenancy // breaking: set milvus multitenancy as standard option (just like Qdrant already is)
2025-09-29 00:29:35 -05:00
Timothy Jaeryang Baek
118549caf3 enh/fix: filter content metadata 2025-09-28 20:17:27 -05:00
Classic298
b1e63639cd
ADD FAT WARNING - QDRANT 2025-09-28 21:17:07 +02:00
Classic298
0e99c43495
ADD FAT WARNING 2025-09-28 21:16:02 +02:00
Classic298
01d4a8ab7a
Update factory.py 2025-09-28 11:06:29 +02:00
Classic298
8dc43f9e3a
Create milvus_multitenancy.py 2025-09-28 11:05:15 +02:00
Tim Jaeryang Baek
f8a3ed2d18
Merge pull request #17770 from Classic298/feat-milvus-diskann-support
feat: Add DISKANN index type support for Milvus
2025-09-26 14:23:53 -05:00
Classic298
9e3d5407ae
Merge branch 'open-webui:main' into feat-milvus-diskann-support 2025-09-26 10:43:01 +02:00
Classic298
b550d78905
Merge branch 'open-webui:main' into fix-milvus-limit-error 2025-09-26 10:42:53 +02:00
google-labs-jules[bot]
123dbf152e feat: Add DISKANN index type support for Milvus
This commit introduces support for the DISKANN index type in the Milvus vector database integration.

Changes include:
- Added `MILVUS_DISKANN_MAX_DEGREE` and `MILVUS_DISKANN_SEARCH_LIST_SIZE` configuration variables.
- Updated the Milvus client to recognize and configure the DISKANN index type during collection creation.
2025-09-26 06:54:06 +00:00
google-labs-jules[bot]
e7ccaf6e78 Fix: milvus error because the limit is set to None by default
The pymilvus library expects -1 for unlimited queries, but the code was passing None, which caused a TypeError. This commit changes the default value of the limit parameter in the query method from None to -1. It also updates the call site in the get method to pass -1 instead of None and updates the type hint and a comment to reflect this change.
2025-09-26 06:39:54 +00:00
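The limit fix described in e7ccaf6e78 amounts to replacing a `None` default with the `-1` sentinel. A minimal sketch, assuming a hypothetical helper `normalize_limit` (not the project's actual code) and taking the commit message's statement that `-1` means "unlimited" at face value:

```python
from typing import Optional

UNLIMITED = -1  # sentinel for "no limit", per the commit message above


def normalize_limit(limit: Optional[int] = UNLIMITED) -> int:
    """Map a missing limit to the -1 sentinel so that None is never passed on."""
    return UNLIMITED if limit is None else limit
```

Normalizing at the boundary keeps callers that still pass `None` working while the downstream call only ever sees an integer.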
Timothy Jaeryang Baek
7f411dd5cc feat/enh: perplexity search support 2025-09-25 14:02:46 -05:00
Timothy Jaeryang Baek
fe65fe0b97 refac: ollama cloud web search count support 2025-09-24 15:58:56 -05:00
Timothy Jaeryang Baek
e06489d92b enh: search_ollama_cloud 2025-09-24 15:19:05 -05:00
Timothy Jaeryang Baek
6e4a2f18e1 refac 2025-09-21 00:14:43 -04:00
Timothy Jaeryang Baek
a51f0c30ec refac/fix: knowledge permission 2025-09-15 11:40:31 -05:00
Timothy Jaeryang Baek
e61e7434a0 refac 2025-09-14 10:46:49 +02:00
Timothy Jaeryang Baek
1ef8204359 refac 2025-09-14 10:45:52 +02:00
Timothy Jaeryang Baek
58d7ca35e3 refac 2025-09-14 10:27:07 +02:00
Timothy Jaeryang Baek
aa8ab349ed feat: ref chat 2025-09-14 10:26:46 +02:00
Timothy Jaeryang Baek
210197fd43 refac/fix: web/youtube file attachment handling 2025-09-13 00:02:48 +04:00
Timothy Jaeryang Baek
2185fc61c0 refac 2025-09-11 21:29:56 +04:00
Timothy Jaeryang Baek
485392fe63 chore: format 2025-09-09 18:19:31 +04:00
Tim Jaeryang Baek
71fd483fba
Merge pull request #17276 from Elettrotecnica/extend-docling-configuration
feat: Extend docling configuration options
2025-09-09 18:04:30 +04:00
Timothy Jaeryang Baek
0214c1e66c refac 2025-09-09 16:48:59 +04:00
Timothy Jaeryang Baek
5f0d262c59 fix: yt embed 2025-09-09 16:00:42 +04:00
Antonio Pisano
daa2a036f8 Extend docling configuration options to include:
* do_ocr
* force_ocr
* pdf_backend
* table_mode
* pipeline

as per https://github.com/docling-project/docling-serve/blob/main/docs/usage.md

See https://github.com/open-webui/open-webui/issues/17148
2025-09-08 18:51:33 +02:00
Timothy Jaeryang Baek
4f2e426fc7 refac 2025-09-01 14:27:20 +04:00
Timothy Jaeryang Baek
609a6a3721 refac 2025-09-01 14:22:02 +04:00
Timothy Jaeryang Baek
85153afda8 refac 2025-09-01 14:21:17 +04:00
Timothy Jaeryang Baek
487979859a fix: web/youtube attachments 2025-09-01 01:22:50 +04:00
Timothy Jaeryang Baek
ac0243e8b7 refac 2025-09-01 00:57:13 +04:00
Tim Jaeryang Baek
719d115d49
Merge pull request #17049 from rgaricano/dev-FIX_lex-sem
FIX: Hybrid Search
2025-09-01 00:00:25 +04:00
Tim Jaeryang Baek
4e7b0ea4b4
Merge pull request #17013 from athoik/fix-17000
fix: handle unicode filenames in external document loader
2025-08-31 23:58:52 +04:00
Timothy Jaeryang Baek
c2b4976c82 enh: PGVECTOR_CREATE_EXTENSION env var 2025-08-31 23:58:18 +04:00
_00_
647e38f701
Revert bypass hybrid search when BM25_weight=0
Revert PR https://github.com/open-webui/open-webui/commit/74b1c801
2025-08-30 10:45:35 +02:00
Athanasios Oikonomou
d735b036fe fix: handle unicode filenames in external document loader
Files with special characters in their names (e.g., ü.pdf) caused issues since HTTP headers only allow Latin-1 characters.
This change URL-encodes `X-Filename` before adding it to request headers, preventing failures when uploading or processing such files.

Fixes: #17000
2025-08-28 22:19:50 +03:00
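The header encoding described in d735b036fe can be illustrated with the standard library. This is a sketch only: `build_filename_header` is a hypothetical name, and the use of `urllib.parse.quote` is an assumption about how the URL-encoding might be done, not a copy of the loader's code.

```python
from urllib.parse import quote, unquote


def build_filename_header(filename: str) -> dict:
    """Return request headers with the filename percent-encoded to plain ASCII."""
    return {"X-Filename": quote(filename)}


headers = build_filename_header("ü.pdf")
print(headers["X-Filename"])  # %C3%BC.pdf

# The receiving side can recover the original name with unquote().
decoded = unquote(headers["X-Filename"])
```

Percent-encoding leaves only ASCII characters in the header value, so the request no longer depends on how a given HTTP client encodes non-ASCII header text.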
Timothy Jaeryang Baek
2bb6063dcb refac/fix: marker 2025-08-28 03:03:31 +04:00
Timothy Jaeryang Baek
23a9731899 refac/fix: hybrid search 2025-08-26 15:04:46 +04:00
Tim Jaeryang Baek
4267e22d4a
Merge pull request #16826 from selenecodes/feat/azure-document-intelligence-azure-entra-auth
feat: Authenticate Azure Document Intelligence using DefaultAzureCredential
2025-08-26 14:32:04 +04:00
_00_
093af754e7
FIX: Playwright Timeout (ms) interpreted as seconds
Addresses https://github.com/open-webui/open-webui/issues/16801

The frontend sets the Playwright timeout in milliseconds, but the backend treated it as seconds and converted the playwright_timeout value, which has to stay in ms.

As originally posted by @rawbby in [#16801](https://github.com/open-webui/open-webui/issues/16801#issuecomment-3216782565):

> I personally think milliseconds are a reasonable choice for the timeout. Maybe the conversion should be fixed, not the label.
> This would further not break existing configurations from users that rely on their current config.
2025-08-23 14:15:00 +02:00
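A minimal sketch of the unit mismatch in 093af754e7, with hypothetical function names. It reads the commit message as saying that the backend multiplied an already-millisecond value by 1000; Playwright's own timeout arguments are expressed in milliseconds, so under that reading the configured value can be passed through unchanged.

```python
def timeout_before_fix(configured: int) -> int:
    """Buggy reading: treat the setting as seconds and convert it to ms."""
    return configured * 1000


def timeout_after_fix(configured_ms: int) -> int:
    """Fixed reading: the setting is already in ms, so pass it through."""
    return configured_ms


configured = 10_000  # 10 s, entered in the UI as milliseconds
print(timeout_before_fix(configured))  # 10000000
print(timeout_after_fix(configured))   # 10000
```

Fixing the conversion rather than the label keeps existing configurations valid, which is the point made in the quoted comment.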
Selene Blok
5051bfe7ab feat(document retrieval): Authenticate Azure Document Intelligence using DefaultAzureCredential if API key is not provided 2025-08-22 16:15:43 +02:00
Timothy Jaeryang Baek
fbff4e19de fix: reranking 2025-08-22 16:47:05 +04:00
Timothy Jaeryang Baek
60b8cfb9fa refac 2025-08-21 21:48:21 +04:00
Timothy Jaeryang Baek
02479425a5 refac 2025-08-21 12:51:41 +04:00
Timothy Jaeryang Baek
1a15a62b73 chore: format 2025-08-21 04:47:28 +04:00
Tim Jaeryang Baek
7452b87877
Merge pull request #16741 from 0xThresh/s3vector-support
fix: batch S3 vectors in groups of 500 to comply with API limitations
2025-08-20 13:25:42 +04:00
James W.
45d9a720b9
Merge branch 'open-webui:main' into s3vector-support 2025-08-19 22:06:16 -06:00
0xThresh.eth
7fcc545672 fix: batch S3 vectors in groups of 500 to comply with API limitations 2025-08-19 22:05:47 -06:00
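The batching in 7fcc545672 follows a common chunking pattern. A minimal sketch, assuming a hypothetical helper `batched` and using plain integers as stand-ins for vector records; the 500-item cap is the figure given in the commit message:

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")
MAX_BATCH = 500  # per-request cap named in the commit message


def batched(items: Sequence[T], size: int = MAX_BATCH) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


vectors = list(range(1200))  # stand-in for vector records
print([len(b) for b in batched(vectors)])  # [500, 500, 200]
```

Each yielded batch would then be sent in its own request, so no single call exceeds the limit.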
Timothy Jaeryang Baek
f97f21bf3a refac/fix: rename WEB_SEARCH_CONCURRENT_REQUESTS to WEB_LOADER_CONCURRENT_REQUESTS 2025-08-18 20:06:36 +04:00
Tim Jaeryang Baek
0b59aa940e
Merge pull request #16606 from Rain6435/fix/azure-postgresql-pgvector-permissions
fix: resolve Azure PostgreSQL pgvector extension permission issue
2025-08-15 00:59:04 +04:00
Rain6435
a1e62ab422 fix: Formatting 2025-08-14 01:50:57 -04:00
Rain6435
1a42e96a3b fix: resolve Azure PostgreSQL pgvector extension permission issue
Replace direct CREATE EXTENSION commands with conditional checks to avoid
  permission errors on Azure PostgreSQL Flexible Server where only
  azure_pg_admin members can create extensions.

  - Check pg_extension table before attempting to create vector extension
  - Apply same fix to pgcrypto extension for consistency
  - Allows following least privilege principle for database users

  Fixes #12453
2025-08-14 01:45:02 -04:00
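The conditional check described in 1a42e96a3b can be sketched against any DB-API style cursor. Everything here is illustrative: `ensure_extension` and `ALLOWED_EXTENSIONS` are hypothetical names, and the `%s` placeholder style (as used by psycopg2) is an assumption.

```python
ALLOWED_EXTENSIONS = {"vector", "pgcrypto"}  # identifiers cannot be bound as parameters


def ensure_extension(cursor, name: str) -> bool:
    """Create the extension only if pg_extension does not already list it.

    Returns True when CREATE EXTENSION was issued, False when it was skipped.
    """
    if name not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unexpected extension name: {name}")
    cursor.execute("SELECT 1 FROM pg_extension WHERE extname = %s", (name,))
    if cursor.fetchone() is not None:
        return False  # already installed; no elevated privilege needed
    cursor.execute(f"CREATE EXTENSION IF NOT EXISTS {name}")
    return True
```

When an administrator has already installed the extension, a low-privilege database user only ever runs the `SELECT`, which is what avoids the permission error.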
Timothy Jaeryang Baek
ad98d4300b refac/fix: milvus query logic 2025-08-14 03:18:38 +04:00
expruc
74b1c80132 disable collection retrieval and bm_25 calculation if bm_25 weight is 0 or less 2025-08-12 15:53:39 +03:00
Timothy Jaeryang Baek
890691319f fix: s3vector import issue 2025-08-11 16:23:08 +04:00
Timothy Jaeryang Baek
21094ca88b fix: pinecone insert issue 2025-08-11 16:22:58 +04:00
Timothy Jaeryang Baek
77189664c2 chore: format 2025-08-09 23:57:35 +04:00
Tim Jaeryang Baek
53425ffadb
Merge pull request #16419 from expruc/feat/qdrant_improvements
feat: qdrant client improvements
2025-08-09 23:52:12 +04:00
expruc
8af9ad3f30 updated query function with scroll too 2025-08-09 22:04:41 +03:00
expruc
88abd01b87 qdrant client improvements 2025-08-09 21:12:30 +03:00
Jan Kessler
3a9601c053
use .rollback() after read-only transaction on pgvector to avoid infinitely idle transactions (and errors in certain scenarios) 2025-08-09 20:09:45 +02:00
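The rollback-after-read pattern from 3a9601c053 can be sketched as a context manager. This is an assumption about shape, not the project's code: `read_only` is a hypothetical name, and `session` stands for any object exposing `rollback()`, such as a SQLAlchemy `Session`.

```python
from contextlib import contextmanager


@contextmanager
def read_only(session):
    """Yield the session for reads, then always roll back to end the transaction."""
    try:
        yield session
    finally:
        session.rollback()  # nothing to commit; closes the open transaction
```

Because the rollback sits in `finally`, the transaction is ended whether the read succeeds or raises, so the connection does not linger in an idle-in-transaction state.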
Tim Jaeryang Baek
17084f629c
Merge pull request #16385 from gaby/2025-08-08-13-38-31
feat: Propagate upstream OpenAI router errors
2025-08-09 00:58:14 +04:00
Tim Jaeryang Baek
8714df17dd
Merge pull request #16381 from psy42a/patch-1
fix: failure to bind metadata variable on insert for PGVECTOR_PGCRYPTO feature returning syntax error
2025-08-09 00:26:30 +04:00
Juan Calderon-Perez
7619f449c8 Format code base 2025-08-08 10:10:32 -04:00
Juan Calderon-Perez
d2f2d42e09 Format python code 2025-08-08 10:09:31 -04:00
Timothy Jaeryang Baek
8b489cb31f refac: s3 vector 2025-08-08 12:24:47 +04:00
Tim Jaeryang Baek
70eb83b701
Merge pull request #16185 from hiwylee/vector-search-branch
feat: oracle 23ai Vector search for new supported vector db
2025-08-06 14:36:14 +04:00
psy42a
f3b0f7d358
Fix syntax error where the previous use of :metadata::text doesn't bind the variable at all in some SQLAlchemy/Postgres versions
2025-08-05 23:27:50 +10:00
Timothy Jaeryang Baek
e8696c63fe refac 2025-08-04 15:23:43 +04:00
Tim Jaeryang Baek
5db60ca34f
Merge pull request #15903 from Hisma/marker-api-update
feat: Add configurable API URL (for self-hosting) and additional_config parameter for Datalab Marker API
2025-08-04 15:21:03 +04:00
Timothy Jaeryang Baek
7aeca7dee2 refac 2025-08-04 15:12:39 +04:00
hiwylee
bd215a1b96
Merge branch 'dev' into vector-search-branch 2025-08-01 04:23:38 +09:00
hiwylee
0e640dd71e resolve conflict 2025-08-01 02:58:51 +09:00
Timothy Jaeryang Baek
6a17ba5b7a refac: metadata handling in vectordb 2025-07-31 17:45:06 +04:00
Tim Jaeryang Baek
dcade8cdf8
Merge pull request #15785 from bekzod/patch-1
BREAKING CHANGE: Update docling endpoint
2025-07-24 21:09:13 +04:00
Tim Jaeryang Baek
bd18bf5c83
Merge pull request #15951 from 0xThresh/s3vector-support
feat: Add S3 Vector Buckets Support for Knowledge
2025-07-23 12:02:20 +04:00
0xThresh.eth
860f3b3cab chore: run formatting 2025-07-22 22:46:00 -06:00
0xThresh.eth
8dcf668448 chore: final cleanup 2025-07-22 22:37:57 -06:00
0xThresh.eth
d463a29ba1 feat: S3 vector support tested 2025-07-22 21:36:35 -06:00
Hisma
21337a2fd8 ci fix 2025-07-22 22:07:14 -04:00
Hisma
a99e20cc3d add format_lines 2025-07-22 21:06:29 -04:00
Hisma
f31cc07a9d feat: update marker api 2025-07-22 20:49:28 -04:00
Timothy Jaeryang Baek
8bc7d85eac refac 2025-07-22 17:17:26 +04:00
Timothy Jaeryang Baek
bf3c807047 refac 2025-07-22 11:38:47 +04:00
0xThresh.eth
f6ee1965cb merge main 2025-07-21 18:06:17 -06:00
0xThresh.eth
5c59c50e2d more progress on s3 vector 2025-07-20 16:48:23 -06:00
bekzod
4bc054a347
Update docling endpoint 2025-07-16 20:40:13 +05:00
0xThresh.eth
d9f2b6b14e feat: add starter config for s3 vector 2025-07-15 21:20:54 -06:00
Timothy Jaeryang Baek
500e6e64fe refac 2025-07-15 21:57:24 +04:00
Timothy Jaeryang Baek
92c9068369 refac 2025-07-14 17:50:03 +04:00
Timothy Jaeryang Baek
18bd83413b refac 2025-07-14 14:05:06 +04:00
Timothy Jaeryang Baek
0013f5c1fc refac/enh: forward user info header to reranker 2025-07-14 13:59:10 +04:00
Timothy Jaeryang Baek
b4f04ff3a7 enh/refac: pgvector pool support 2025-07-14 12:18:44 +04:00
Tim Jaeryang Baek
9b84a8e443
Merge pull request #15632 from athoik/quote
fix: don't over quote forwarded headers
2025-07-12 00:24:29 +04:00
Timothy Jaeryang Baek
77c1905609 refac 2025-07-11 12:35:42 +04:00
Timothy Jaeryang Baek
033d07ee23 refac: file handling 2025-07-11 12:29:17 +04:00
Timothy Jaeryang Baek
3b9d86de0b refac 2025-07-11 12:00:21 +04:00