Commit graph

402 commits

Author SHA1 Message Date
Tim Jaeryang Baek
17084f629c
Merge pull request #16385 from gaby/2025-08-08-13-38-31
feat: Propagate upstream OpenAI router errors
2025-08-09 00:58:14 +04:00
Tim Jaeryang Baek
8714df17dd
Merge pull request #16381 from psy42a/patch-1
fix: failure to bind metadata variable on insert for PGVECTOR_PGCRYPTO feature returning syntax error
2025-08-09 00:26:30 +04:00
Juan Calderon-Perez
7619f449c8 Format code base 2025-08-08 10:10:32 -04:00
Juan Calderon-Perez
d2f2d42e09 Format python code 2025-08-08 10:09:31 -04:00
Timothy Jaeryang Baek
8b489cb31f refac: s3 vector 2025-08-08 12:24:47 +04:00
Tim Jaeryang Baek
70eb83b701
Merge pull request #16185 from hiwylee/vector-search-branch
feat: oracle 23ai Vector search for new supported vector db
2025-08-06 14:36:14 +04:00
psy42a
f3b0f7d358
Fix syntax error where the previous use of :metadata::text in some SQLAlchemy/Postgres versions doesn't bind at all
Fix syntax error where the previous use of :metadata::text in some SQLAlchemy/Postgres versions doesn't bind the variable at all
2025-08-05 23:27:50 +10:00
Timothy Jaeryang Baek
e8696c63fe refac 2025-08-04 15:23:43 +04:00
Tim Jaeryang Baek
5db60ca34f
Merge pull request #15903 from Hisma/marker-api-update
feat: Add configurable API URL (for self-hosting) and additional_config parameter for Datalab Marker API
2025-08-04 15:21:03 +04:00
Timothy Jaeryang Baek
7aeca7dee2 refac 2025-08-04 15:12:39 +04:00
hiwylee
bd215a1b96
Merge branch 'dev' into vector-search-branch 2025-08-01 04:23:38 +09:00
hiwylee
0e640dd71e resolve conflict 2025-08-01 02:58:51 +09:00
Timothy Jaeryang Baek
6a17ba5b7a refac: metadata handling in vectordb 2025-07-31 17:45:06 +04:00
Tim Jaeryang Baek
dcade8cdf8
Merge pull request #15785 from bekzod/patch-1
BREAKING CHANGE: Update docling endpoint
2025-07-24 21:09:13 +04:00
Tim Jaeryang Baek
bd18bf5c83
Merge pull request #15951 from 0xThresh/s3vector-support
feat: Add S3 Vector Buckets Support for Knowledge
2025-07-23 12:02:20 +04:00
0xThresh.eth
860f3b3cab chore: run formatting 2025-07-22 22:46:00 -06:00
0xThresh.eth
8dcf668448 chore: final cleanup 2025-07-22 22:37:57 -06:00
0xThresh.eth
d463a29ba1 feat: S3 vector support tested 2025-07-22 21:36:35 -06:00
Hisma
21337a2fd8 ci fix 2025-07-22 22:07:14 -04:00
Hisma
a99e20cc3d add format_lines 2025-07-22 21:06:29 -04:00
Hisma
f31cc07a9d feat: update marker api 2025-07-22 20:49:28 -04:00
Timothy Jaeryang Baek
8bc7d85eac refac 2025-07-22 17:17:26 +04:00
Timothy Jaeryang Baek
bf3c807047 refac 2025-07-22 11:38:47 +04:00
0xThresh.eth
f6ee1965cb merge main 2025-07-21 18:06:17 -06:00
0xThresh.eth
5c59c50e2d more progress on s3 vector 2025-07-20 16:48:23 -06:00
bekzod
4bc054a347
Update docling endpoint 2025-07-16 20:40:13 +05:00
0xThresh.eth
d9f2b6b14e feat: add starter config for s3 vector 2025-07-15 21:20:54 -06:00
Timothy Jaeryang Baek
500e6e64fe refac 2025-07-15 21:57:24 +04:00
Timothy Jaeryang Baek
92c9068369 refac 2025-07-14 17:50:03 +04:00
Timothy Jaeryang Baek
18bd83413b refac 2025-07-14 14:05:06 +04:00
Timothy Jaeryang Baek
0013f5c1fc refac/enh: forward user info header to reranker 2025-07-14 13:59:10 +04:00
Timothy Jaeryang Baek
b4f04ff3a7 enh/refac: pgvector pool support 2025-07-14 12:18:44 +04:00
Tim Jaeryang Baek
9b84a8e443
Merge pull request #15632 from athoik/quote
fix: don't over quote forwarded headers
2025-07-12 00:24:29 +04:00
Timothy Jaeryang Baek
77c1905609 refac 2025-07-11 12:35:42 +04:00
Timothy Jaeryang Baek
033d07ee23 refac: file handling 2025-07-11 12:29:17 +04:00
Timothy Jaeryang Baek
3b9d86de0b refac 2025-07-11 12:00:21 +04:00
Athanasios Oikonomou
96758176cc fix: don't over quote forwarded headers
The fix introduced in #15035 over-quotes headers.

E.g. emails are shown as user%40example.com instead of user@example.com
E.g. names are shown as First%20Last instead of First Last

We also spend time quoting ids and roles when it is not required.

Keep quoting only on the user name, which initially had the problem, based on the discussion
https://github.com/open-webui/open-webui/discussions/14391

Also add space to the safe characters, in order to remove %20 from names.
2025-07-10 22:08:28 +03:00
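
The quoting behaviour described in the commit above can be illustrated with the standard library. This is a minimal sketch, not the project's actual code: the function name `build_user_headers` and the `X-User-*` header names are hypothetical, and only `urllib.parse.quote` with its `safe` parameter is a real API. It percent-encodes the user name (so non-ASCII characters survive in a header) while leaving spaces alone, and passes the other fields through unquoted.

```python
from urllib.parse import quote


def build_user_headers(name: str, user_id: str, email: str, role: str) -> dict:
    """Build forwarded user-info headers (header names are hypothetical)."""
    return {
        # Only the name is quoted; safe=" " keeps spaces instead of %20.
        "X-User-Name": quote(name, safe=" "),
        # Ids, emails and roles are forwarded as-is, so "@" stays "@".
        "X-User-Id": user_id,
        "X-User-Email": email,
        "X-User-Role": role,
    }


headers = build_user_headers("First Last", "42", "user@example.com", "admin")
print(headers["X-User-Name"])   # First Last
print(headers["X-User-Email"])  # user@example.com
```

Quoting every field with the default `safe` value is what would turn `user@example.com` into `user%40example.com`, which is the symptom the commit message reports.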
Wonyong Lee
46e0992a83 json_serialize returning varchar2(2096) 2025-07-10 12:12:43 +00:00
Timothy Jaeryang Baek
8d84b4c2a4 enh/refac: temp chat file upload behaviour
client-side content extraction
2025-07-09 22:59:37 +04:00
Timothy Jaeryang Baek
b3c4bc6041 enh: allow full context mode for collections 2025-07-09 01:29:49 +04:00
Timothy Jaeryang Baek
d5f9bbc7a7 enh: reference note in chat 2025-07-09 01:17:25 +04:00
Tim Jaeryang Baek
a748f19ac2
Merge pull request #15548 from expruc/fix/docling_ignore_html
fix: text/html files being detected as text when loaded with docling/tika
2025-07-08 13:16:01 +04:00
Oracle Public Cloud User
e0afd7f496 final: vector-search-feature 2025-07-07 17:25:16 +00:00
Oracle Public Cloud User
12ebdbae81 refactor oracle23ai.py 2025-07-07 16:21:34 +00:00
Oracle Public Cloud User
25e241ae41 added new feature : oracle23ai vector search 2025-07-07 12:13:05 +00:00
Timothy Jaeryang Baek
3e15c8ab69 refac 2025-07-07 15:56:05 +04:00
Oracle Public Cloud User
b56dbb26be alpha2 2025-07-07 08:52:58 +00:00
Oracle Public Cloud User
3e2fd074bb oracle 23ai vector search 2025-07-07 05:58:02 +00:00
expruc
453a2bd9b5 fixed issue where text/html files being detected as text when loaded 2025-07-06 20:10:26 +03:00
Anush008
17debaa6de
chore: Raise if QDRANT_URI is not set
Signed-off-by: Anush008 <anushshetty90@gmail.com>
2025-07-04 13:17:46 +05:30
Anush008
c8a49d373a
refactor: Removed more swallows
Signed-off-by: Anush008 <anushshetty90@gmail.com>
2025-07-04 12:38:22 +05:30
Anush008
0ac57a088f
refactor: More implementation improvements
Signed-off-by: Anush008 <anushshetty90@gmail.com>
2025-07-04 12:33:54 +05:30
Anush008
7c734d3fea
Merge remote-tracking branch 'origin/dev' into Anush008/main
Signed-off-by: Anush008 <anushshetty90@gmail.com>
2025-07-04 12:22:08 +05:30
Tim Jaeryang Baek
600344f2e8
Merge pull request #15510 from kopero2000/bug/oauth_logout_fix
fix/oauth logout fix
2025-07-04 10:30:02 +04:00
Bela Vizi
9623ef4360 add trust env to clientsession 2025-07-02 17:59:56 +02:00
guenhter
5c2e0e4beb feat: add qdrant indices for metadata fields
All field names which are part of a query should
have an index for performance reasons. This is
even enforced on some qdrant clusters, like those
on qdrant.io, where queries using an unindexed field
fail with an error.
2025-06-29 15:30:55 +02:00
Timothy Jaeryang Baek
1b064a6c85 chore: format 2025-06-28 15:21:20 +04:00
guenhter
a66206f44f feat: support better qdrant collection isolation
The prefix string for qdrant collections is now
configurable, which means the same qdrant cluster
can host multiple open webui instances while
keeping the collections of the different owui
instances separate.
2025-06-26 13:52:26 +02:00
Timothy Jaeryang Baek
1f123eb100 refac 2025-06-25 12:20:08 +04:00
Tim Jaeryang Baek
d60c800d66
Merge pull request #15276 from zhangtyzzz/update_brave_search
[fix] Update brave.py to use the correct field
2025-06-25 11:04:06 +04:00
Anush008
05bee5663d
Merge remote-tracking branch 'origin/dev' 2025-06-25 12:04:23 +05:30
zhangtyzzz
ac5567f78d
Update brave.py to use the correct field
fixing issues caused by incorrect field names.
2025-06-25 09:11:58 +08:00
Anush008
5dba298c1e
refactor: Updated Qdrant multi-tenancy implementation
Signed-off-by: Anush008 <anushshetty90@gmail.com>
2025-06-24 14:12:44 +05:30
Doris Lam
74ae9ab897 fix opensearch race condition, use keyword search instead of full text search in filter query 2025-06-23 18:43:33 -07:00
Timothy Jaeryang Baek
81b8267e85 feat: odt file parse support 2025-06-19 18:39:00 +04:00
priten
f7920df870 Fix non-ascii error issue on ENABLE_FORWARD_USER_INFO_HEADERS 2025-06-16 12:33:11 -05:00
Timothy Jaeryang Baek
7753f57d42 chore: format 2025-06-16 13:48:50 +04:00
Tim Jaeryang Baek
c5b48ec551
Merge pull request #14992 from sreesdas/dev
Fix: Added support for multiple pages in external document loader
2025-06-16 11:01:33 +04:00
sree
62bfe73964 Fix: Added support for multiple pages in external document loader, added filename in api request header 2025-06-15 19:59:05 +05:30
Vaclav Cerny
4bbc32efa6 fix: serialize picture description parameters to JSON in DoclingLoader 2025-06-11 20:00:25 +02:00
Timothy Jaeryang Baek
d430fe9551 refac 2025-06-10 11:30:54 +04:00
Timothy Jaeryang Baek
7f488b3754 feat: experimental pgvector pgcrypto support 2025-06-09 18:14:33 +04:00
Timothy Jaeryang Baek
7f75acff96 chore: format 2025-06-08 22:08:25 +04:00
Timothy Jaeryang Baek
0cd400f5ee refac: docling picture describe params 2025-06-08 20:02:14 +04:00
Tim Jaeryang Baek
6bf393a480
Merge pull request #14787 from vaclcer/vaclavs-custom-docling
feat: Customize Docling's "Describe Pictures" feature
2025-06-08 19:02:36 +04:00
Tim Jaeryang Baek
50d9a2ac58
Merge pull request #14781 from lucyknada/patch-2
fix: fix #14752 and add manual transcription retrieval
2025-06-08 18:40:28 +04:00
Vaclav Cerny
99f05561f8 Add configuration options for picture description modes and update related components 2025-06-08 16:30:26 +02:00
lucy
b0965a8184
fixes #14752 and adds manual transcription option 2025-06-08 14:26:24 +02:00
Timothy Jaeryang Baek
5e35aab292 chore: format 2025-06-05 01:12:28 +04:00
Tim Jaeryang Baek
7c4f261aa2
Merge pull request #14616 from Davixk/feat/new-perplexity-options
feat: add Perplexity AI model and search context usage configuration options
2025-06-05 00:28:00 +04:00
Vaclav Cerny
9772c18b20 fix(loader): remove deprecated picture description configuration 2025-06-04 17:21:44 +02:00
Vaclav Cerny
c71236ba07 feat(loader): enhance picture description prompt for improved detail and clarity 2025-06-04 14:25:31 +02:00
Vaclav Cerny
c4278f4784 fix description vs classification mismatch 2025-06-04 14:13:00 +02:00
Vaclav Cerny
8644e81a1c feat(loader): add picture description configuration for DoclingLoader 2025-06-04 12:34:39 +02:00
Timothy Jaeryang Baek
4d364e2967 refac: remove msg from known type 2025-06-03 16:27:28 +04:00
Dave
77b357c73b fix: update label for search context usage to clarify its purpose 2025-06-03 00:27:07 +02:00
Dave
96e9bfe0e5 feat: add Perplexity model and search context usage configuration options 2025-06-03 00:19:08 +02:00
Tim Jaeryang Baek
3c32d2cada
Merge pull request #14539 from PVBLIC-F/refac/mistral
perf mistral.py Enhance for Overall Speed and Efficiency
2025-06-02 23:52:59 +04:00
PVBLIC Foundation
cf3635ba25
Update mistral.py
1. Intelligent Error Handling
   - Added _is_retryable_error() method to distinguish retryable vs non-retryable errors
   - Prevents unnecessary retries on client errors (4xx) that won't succeed
   - Caps retry delay at 30 seconds to prevent excessive waiting
2. Optimized Timeout Configuration
   - Upload: Capped at 2 minutes (was using full 5-minute timeout)
   - URL requests: 30 seconds (should be fast)
   - OCR processing: Full timeout (can take time)
   - Cleanup: 30 seconds (should be quick)
3. Enhanced Connection Pool
   - Increased connection limits: 20 total, 10 per host
   - Longer DNS cache TTL (10 minutes vs 5 minutes)
   - Increased keepalive timeout (60s vs 30s)
   - Added async DNS resolver for better performance
   - Granular timeout controls (connect, read, total)
4. Concurrency Control for Batch Processing
   - Added semaphore-based concurrency control (default: 5 concurrent)
   - Prevents API overwhelming while maintaining throughput
   - Configurable concurrency limit per workload
5. Memory Efficient Result Processing
   - Early exit for empty content validation
   - Better error metadata for debugging
   - Added content length tracking
   - Streamlined page processing logic
6. General Performance Improvements
   - Better error logging with truncated responses
   - Optimized metadata creation
   - Improved debug logging efficiency
2025-05-30 20:06:29 -07:00
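
The retry and concurrency points in the commit above (items 1 and 4) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the contents of mistral.py: the function names, the exponential backoff schedule, and the choice to treat HTTP 429 as retryable are all assumptions. Only the 30-second delay cap, the "no retries on 4xx" rule, and the default of 5 concurrent tasks come from the commit message.

```python
import asyncio
from typing import Awaitable, Callable, Optional


def is_retryable_error(status_code: Optional[int]) -> bool:
    """Return True if a failed request is worth retrying (hypothetical helper)."""
    if status_code is None:
        # No HTTP status at all: assume a network-level failure.
        return True
    # Assumption: 429 (rate limited) is retried; other 4xx errors are not.
    return status_code == 429 or status_code >= 500


def retry_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff (assumed schedule), capped at 30 seconds."""
    return min(cap, base * (2 ** attempt))


async def process_batch(
    items: list,
    worker: Callable[[object], Awaitable[object]],
    limit: int = 5,
) -> list:
    """Run worker over items with at most `limit` tasks in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        # Each task waits here until one of the `limit` slots is free.
        async with semaphore:
            return await worker(item)

    # gather preserves the input order of the results.
    return await asyncio.gather(*(guarded(item) for item in items))
```

A caller would pass its own coroutine as `worker`, for example one that uploads a single document, and choose `limit` per workload, as the commit message suggests.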
PVBLIC Foundation
66bde32623
Update pinecone.py 2025-05-30 18:47:23 -07:00
PVBLIC Foundation
4ecf2a8685
Update pinecone.py
May 2025 Latest Pinecone Best Practices
2025-05-30 09:33:57 -07:00
Timothy Jaeryang Baek
9306ae5972 refac 2025-05-30 01:19:56 +04:00
Timothy Jaeryang Baek
e1e2c096e2 refac: PLEASE follow existing convention 2025-05-30 00:34:18 +04:00
Tim Jaeryang Baek
ff353578db
Merge pull request #14370 from daw/feat/add-azure-openai-embeddings-option
feat:Add Azure OpenAI embedding support
2025-05-30 00:18:55 +04:00
Timothy Jaeryang Baek
7dc7d5c028 refac: PLEASE FOLLOW EXISTING CONVENTION 2025-05-29 03:47:02 +04:00
Timothy Jaeryang Baek
551597b9cc chore: format 2025-05-29 02:36:33 +04:00
Hisma
e12a79c0e2 fix: handle json output format correctly 2025-05-27 01:12:03 -04:00
Hisma
a9405cc101 feat: Marker api content extraction support 2025-05-27 00:44:07 -04:00
Timothy Jaeryang Baek
da75d0ca1e chore: format 2025-05-24 02:13:54 +04:00
Tim Jaeryang Baek
e663b90a9f
Merge pull request #14069 from Ithanil/bm25_weight
feat: Configurable weight for BM25Retriever during hybrid search
2025-05-24 01:13:03 +04:00