Tim Jaeryang Baek
4267e22d4a
Merge pull request #16826 from selenecodes/feat/azure-document-intelligence-azure-entra-auth
...
feat: Authenticate Azure Document Intelligence using DefaultAzureCredential
2025-08-26 14:32:04 +04:00
_00_
093af754e7
FIX: Playwright Timeout (ms) interpreted as seconds
...
Fix for Playwright Timeout (ms) interpreted as seconds.
To address https://github.com/open-webui/open-webui/issues/16801
In the frontend, the Playwright timeout is set in milliseconds, but the backend interprets it as seconds and applies a time conversion to the playwright_timeout variable (which has to be in ms).
As originally posted by @rawbby in [#16801](https://github.com/open-webui/open-webui/issues/16801#issuecomment-3216782565):
> I personally think milliseconds are a reasonable choice for the timeout. Maybe the conversion should be fixed, not the label.
> This would further not break existing configurations from users that rely on their current config.
>
2025-08-23 14:15:00 +02:00
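The unit mismatch described in 093af754e7 can be illustrated with a minimal sketch. It assumes the unwanted conversion was a multiplication by 1000 (seconds to milliseconds); the function names are hypothetical and are not taken from the project's code.

```python
# Hypothetical names; a sketch of the unit mismatch, not the project's code.

def buggy_timeout_ms(configured: int) -> int:
    """Treats the configured value as seconds and converts it to milliseconds."""
    return configured * 1000


def fixed_timeout_ms(configured: int) -> int:
    """Treats the configured value as milliseconds, matching the UI label."""
    return configured


configured = 10_000  # the user means 10 seconds, entered as milliseconds
print(buggy_timeout_ms(configured))  # 10000000
print(fixed_timeout_ms(configured))  # 10000
```

Keeping milliseconds as the single unit, as suggested in the quoted comment, leaves existing user configurations valid.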
Selene Blok
5051bfe7ab
feat(document retrieval): Authenticate Azure Document Intelligence using DefaultAzureCredential if API key is not provided
2025-08-22 16:15:43 +02:00
Timothy Jaeryang Baek
fbff4e19de
fix: reranking
2025-08-22 16:47:05 +04:00
Timothy Jaeryang Baek
60b8cfb9fa
refac
2025-08-21 21:48:21 +04:00
Timothy Jaeryang Baek
02479425a5
refac
2025-08-21 12:51:41 +04:00
Timothy Jaeryang Baek
1a15a62b73
chore: format
2025-08-21 04:47:28 +04:00
Tim Jaeryang Baek
7452b87877
Merge pull request #16741 from 0xThresh/s3vector-support
...
fix: batch S3 vectors in groups of 500 to comply with API limitations
2025-08-20 13:25:42 +04:00
James W.
45d9a720b9
Merge branch 'open-webui:main' into s3vector-support
2025-08-19 22:06:16 -06:00
0xThresh.eth
7fcc545672
fix: batch S3 vectors in groups of 500 to comply with API limitations
2025-08-19 22:05:47 -06:00
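The batching described in 7fcc545672 can be sketched as follows. This is a generic chunking helper under the assumption that the vectors are held in a list; `batched` and the record shape are hypothetical, and only the limit of 500 comes from the commit message.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_BATCH_SIZE = 500  # limit named in the commit message


def batched(items: Sequence[T], size: int = MAX_BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


vectors = [{"key": f"vec-{i}"} for i in range(1201)]  # made-up records
for batch in batched(vectors):
    # A real implementation would send `batch` to the vector store here.
    print(len(batch))  # 500, 500, 201
```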
Timothy Jaeryang Baek
f97f21bf3a
refac/fix: rename WEB_SEARCH_CONCURRENT_REQUESTS to WEB_LOADER_CONCURRENT_REQUESTS
2025-08-18 20:06:36 +04:00
Tim Jaeryang Baek
0b59aa940e
Merge pull request #16606 from Rain6435/fix/azure-postgresql-pgvector-permissions
...
fix: resolve Azure PostgreSQL pgvector extension permission issue
2025-08-15 00:59:04 +04:00
Rain6435
a1e62ab422
fix: Formatting
2025-08-14 01:50:57 -04:00
Rain6435
1a42e96a3b
fix: resolve Azure PostgreSQL pgvector extension permission issue
...
Replace direct CREATE EXTENSION commands with conditional checks to avoid
permission errors on Azure PostgreSQL Flexible Server where only
azure_pg_admin members can create extensions.
- Check pg_extension table before attempting to create vector extension
- Apply same fix to pgcrypto extension for consistency
- Allows following least privilege principle for database users
Fixes #12453
2025-08-14 01:45:02 -04:00
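The conditional check described in 1a42e96a3b might look roughly like the sketch below. It assumes a DB-API style cursor with `%s` placeholders; `ensure_extension` and `FakeCursor` are hypothetical names, and the stand-in cursor exists only so the example runs without a PostgreSQL server.

```python
ALLOWED_EXTENSIONS = {"vector", "pgcrypto"}  # names mentioned in the commit


def ensure_extension(cursor, name: str) -> bool:
    """Create the extension only if pg_extension does not already list it.

    Returns True if CREATE EXTENSION was issued, False if it was skipped.
    """
    if name not in ALLOWED_EXTENSIONS:
        # The name is interpolated into SQL below, so restrict it to known values.
        raise ValueError(f"unexpected extension name: {name!r}")
    cursor.execute("SELECT 1 FROM pg_extension WHERE extname = %s", (name,))
    if cursor.fetchone() is not None:
        return False  # already installed, so no elevated privilege is needed
    cursor.execute(f"CREATE EXTENSION IF NOT EXISTS {name}")
    return True


class FakeCursor:
    """Stand-in for a real database cursor so the sketch runs without PostgreSQL."""

    def __init__(self, installed):
        self.installed = set(installed)
        self.statements = []
        self._row = None

    def execute(self, sql, params=()):
        self.statements.append(sql)
        if sql.startswith("SELECT"):
            self._row = (1,) if params[0] in self.installed else None
        elif sql.startswith("CREATE EXTENSION"):
            self.installed.add(sql.split()[-1])

    def fetchone(self):
        return self._row


cursor = FakeCursor(installed={"vector"})
print(ensure_extension(cursor, "vector"))    # False
print(ensure_extension(cursor, "pgcrypto"))  # True
```

With this pattern, a database user without extension privileges can still start up as long as an administrator has installed the extensions beforehand.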
Timothy Jaeryang Baek
ad98d4300b
refac/fix: milvus query logic
2025-08-14 03:18:38 +04:00
expruc
74b1c80132
disable collection retrieval and BM25 calculation if the BM25 weight is 0 or less
2025-08-12 15:53:39 +03:00
Timothy Jaeryang Baek
890691319f
fix: s3vector import issue
2025-08-11 16:23:08 +04:00
Timothy Jaeryang Baek
21094ca88b
fix: pinecone insert issue
2025-08-11 16:22:58 +04:00
Timothy Jaeryang Baek
77189664c2
chore: format
2025-08-09 23:57:35 +04:00
Tim Jaeryang Baek
53425ffadb
Merge pull request #16419 from expruc/feat/qdrant_improvements
...
feat: qdrant client improvements
2025-08-09 23:52:12 +04:00
expruc
8af9ad3f30
updated query function with scroll too
2025-08-09 22:04:41 +03:00
expruc
88abd01b87
qdrant client improvements
2025-08-09 21:12:30 +03:00
Jan Kessler
3a9601c053
use .rollback() after read-only transaction on pgvector to avoid infinitely idle transactions (and errors in certain scenarios)
2025-08-09 20:09:45 +02:00
Tim Jaeryang Baek
17084f629c
Merge pull request #16385 from gaby/2025-08-08-13-38-31
...
feat: Propagate upstream OpenAI router errors
2025-08-09 00:58:14 +04:00
Tim Jaeryang Baek
8714df17dd
Merge pull request #16381 from psy42a/patch-1
...
fix: failure to bind metadata variable on insert for PGVECTOR_PGCRYPTO feature returning syntax error
2025-08-09 00:26:30 +04:00
Juan Calderon-Perez
7619f449c8
Format code base
2025-08-08 10:10:32 -04:00
Juan Calderon-Perez
d2f2d42e09
Format python code
2025-08-08 10:09:31 -04:00
Timothy Jaeryang Baek
8b489cb31f
refac: s3 vector
2025-08-08 12:24:47 +04:00
Tim Jaeryang Baek
70eb83b701
Merge pull request #16185 from hiwylee/vector-search-branch
...
feat: oracle 23ai Vector search for new supported vector db
2025-08-06 14:36:14 +04:00
psy42a
f3b0f7d358
Fix syntax error where the previous use of :metadata::text in some SQLAlchemy/Postgres versions doesn't bind at all
...
Fix syntax error where the previous use of :metadata::text in some SQLAlchemy/Postgres versions doesn't bind the variable at all
2025-08-05 23:27:50 +10:00
Timothy Jaeryang Baek
e8696c63fe
refac
2025-08-04 15:23:43 +04:00
Tim Jaeryang Baek
5db60ca34f
Merge pull request #15903 from Hisma/marker-api-update
...
feat: Add configurable API URL (for self-hosting) and additional_config parameter for Datalab Marker API
2025-08-04 15:21:03 +04:00
Timothy Jaeryang Baek
7aeca7dee2
refac
2025-08-04 15:12:39 +04:00
hiwylee
bd215a1b96
Merge branch 'dev' into vector-search-branch
2025-08-01 04:23:38 +09:00
hiwylee
0e640dd71e
resolve conflict
2025-08-01 02:58:51 +09:00
Timothy Jaeryang Baek
6a17ba5b7a
refac: metadata handling in vectordb
2025-07-31 17:45:06 +04:00
Tim Jaeryang Baek
dcade8cdf8
Merge pull request #15785 from bekzod/patch-1
...
BREAKING CHANGE: Update docling endpoint
2025-07-24 21:09:13 +04:00
Tim Jaeryang Baek
bd18bf5c83
Merge pull request #15951 from 0xThresh/s3vector-support
...
feat: Add S3 Vector Buckets Support for Knowledge
2025-07-23 12:02:20 +04:00
0xThresh.eth
860f3b3cab
chore: run formatting
2025-07-22 22:46:00 -06:00
0xThresh.eth
8dcf668448
chore: final cleanup
2025-07-22 22:37:57 -06:00
0xThresh.eth
d463a29ba1
feat: S3 vector support tested
2025-07-22 21:36:35 -06:00
Hisma
21337a2fd8
ci fix
2025-07-22 22:07:14 -04:00
Hisma
a99e20cc3d
add format_lines
2025-07-22 21:06:29 -04:00
Hisma
f31cc07a9d
feat: update marker api
2025-07-22 20:49:28 -04:00
Timothy Jaeryang Baek
8bc7d85eac
refac
2025-07-22 17:17:26 +04:00
Timothy Jaeryang Baek
bf3c807047
refac
2025-07-22 11:38:47 +04:00
0xThresh.eth
f6ee1965cb
merge main
2025-07-21 18:06:17 -06:00
0xThresh.eth
5c59c50e2d
more progress on s3 vector
2025-07-20 16:48:23 -06:00
bekzod
4bc054a347
Update docling endpoint
2025-07-16 20:40:13 +05:00
0xThresh.eth
d9f2b6b14e
feat: add starter config for s3 vector
2025-07-15 21:20:54 -06:00
Timothy Jaeryang Baek
500e6e64fe
refac
2025-07-15 21:57:24 +04:00
Timothy Jaeryang Baek
92c9068369
refac
2025-07-14 17:50:03 +04:00
Timothy Jaeryang Baek
18bd83413b
refac
2025-07-14 14:05:06 +04:00
Timothy Jaeryang Baek
0013f5c1fc
refac/enh: forward user info header to reranker
2025-07-14 13:59:10 +04:00
Timothy Jaeryang Baek
b4f04ff3a7
enh/refac: pgvector pool support
2025-07-14 12:18:44 +04:00
Tim Jaeryang Baek
9b84a8e443
Merge pull request #15632 from athoik/quote
...
fix: don't over quote forwarded headers
2025-07-12 00:24:29 +04:00
Timothy Jaeryang Baek
77c1905609
refac
2025-07-11 12:35:42 +04:00
Timothy Jaeryang Baek
033d07ee23
refac: file handling
2025-07-11 12:29:17 +04:00
Timothy Jaeryang Baek
3b9d86de0b
refac
2025-07-11 12:00:21 +04:00
Athanasios Oikonomou
96758176cc
fix: don't over quote forwarded headers
...
The fix introduced in #15035 over-quotes headers.
E.g. emails are shown as user%40example.com instead of user@example.com.
E.g. names are shown as First%20Last instead of First Last.
We are also spending time quoting IDs and roles when that is not required.
Keep quoting only for the user name, which initially had the problem according to the discussion at
https://github.com/open-webui/open-webui/discussions/14391
Also add space to the safe characters, in order to remove %20 from names.
2025-07-10 22:08:28 +03:00
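The quoting behaviour described in 96758176cc can be reproduced with the standard library. This is a sketch that assumes the headers are built with `urllib.parse.quote`; `quote_user_name` is a hypothetical helper, not the project's function.

```python
from urllib.parse import quote


def quote_user_name(name: str) -> str:
    """Percent-encode a user name for a header, keeping spaces readable."""
    return quote(name, safe=" ")


print(quote("user@example.com"))        # user%40example.com (over-quoted)
print(quote("First Last"))              # First%20Last (over-quoted)
print(quote_user_name("First Last"))    # First Last
print(quote_user_name("José Álvarez"))  # Jos%C3%A9 %C3%81lvarez
```

Non-ASCII characters are still encoded, so the header value stays ASCII-safe while plain names remain readable.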
Wonyong Lee
46e0992a83
json_serialize returning varchar2(2096)
2025-07-10 12:12:43 +00:00
Timothy Jaeryang Baek
8d84b4c2a4
enh/refac: temp chat file upload behaviour
...
client-side content extraction
2025-07-09 22:59:37 +04:00
Timothy Jaeryang Baek
b3c4bc6041
enh: allow full context mode for collections
2025-07-09 01:29:49 +04:00
Timothy Jaeryang Baek
d5f9bbc7a7
enh: reference note in chat
2025-07-09 01:17:25 +04:00
Tim Jaeryang Baek
a748f19ac2
Merge pull request #15548 from expruc/fix/docling_ignore_html
...
fix: text/html files being detected as text when loaded with docling/tika
2025-07-08 13:16:01 +04:00
Oracle Public Cloud User
e0afd7f496
final: vector-search-feature
2025-07-07 17:25:16 +00:00
Oracle Public Cloud User
12ebdbae81
refactor oracle23ai.py
2025-07-07 16:21:34 +00:00
Oracle Public Cloud User
25e241ae41
added new feature : oracle23ai vector search
2025-07-07 12:13:05 +00:00
Timothy Jaeryang Baek
3e15c8ab69
refac
2025-07-07 15:56:05 +04:00
Oracle Public Cloud User
b56dbb26be
alpha2
2025-07-07 08:52:58 +00:00
Oracle Public Cloud User
3e2fd074bb
oracle 23ai vector search
2025-07-07 05:58:02 +00:00
expruc
453a2bd9b5
fixed issue where text/html files were being detected as text when loaded
2025-07-06 20:10:26 +03:00
Anush008
17debaa6de
chore: Raise if QDRANT_URI is not set
...
Signed-off-by: Anush008 <anushshetty90@gmail.com>
2025-07-04 13:17:46 +05:30
Anush008
c8a49d373a
refactor: Removed more swallows
...
Signed-off-by: Anush008 <anushshetty90@gmail.com>
2025-07-04 12:38:22 +05:30
Anush008
0ac57a088f
refactor: More implementation improvements
...
Signed-off-by: Anush008 <anushshetty90@gmail.com>
2025-07-04 12:33:54 +05:30
Anush008
7c734d3fea
Merge remote-tracking branch 'origin/dev' into Anush008/main
...
Signed-off-by: Anush008 <anushshetty90@gmail.com>
2025-07-04 12:22:08 +05:30
Tim Jaeryang Baek
600344f2e8
Merge pull request #15510 from kopero2000/bug/oauth_logout_fix
...
fix/oauth logout fix
2025-07-04 10:30:02 +04:00
Bela Vizi
9623ef4360
add trust env to clientsession
2025-07-02 17:59:56 +02:00
guenhter
5c2e0e4beb
feat: add qdrant indices for metadata fields
...
All field names that are part of a query should
have an index for performance reasons. This is
even enforced on some Qdrant clusters, like those
on qdrant.io, where queries using an unindexed column
fail with an error.
2025-06-29 15:30:55 +02:00
Timothy Jaeryang Baek
1b064a6c85
chore: format
2025-06-28 15:21:20 +04:00
guenhter
a66206f44f
feat: support better qdrant collection isolation
...
The prefix string for Qdrant collections is now
configurable, which means the same Qdrant cluster
can host multiple Open WebUI instances while
keeping the collections of the different
instances separate.
2025-06-26 13:52:26 +02:00
Timothy Jaeryang Baek
1f123eb100
refac
2025-06-25 12:20:08 +04:00
Tim Jaeryang Baek
d60c800d66
Merge pull request #15276 from zhangtyzzz/update_brave_search
...
[fix] Update brave.py to use the correct field
2025-06-25 11:04:06 +04:00
Anush008
05bee5663d
Merge remote-tracking branch 'origin/dev'
2025-06-25 12:04:23 +05:30
zhangtyzzz
ac5567f78d
Update brave.py to use the correct field
...
fixing issues caused by incorrect field names.
2025-06-25 09:11:58 +08:00
Anush008
5dba298c1e
refactor: Updated Qdrant multi-tenancy implementation
...
Signed-off-by: Anush008 <anushshetty90@gmail.com>
2025-06-24 14:12:44 +05:30
Doris Lam
74ae9ab897
fix opensearch race condition, use keyword search instead of full text search in filter query
2025-06-23 18:43:33 -07:00
Timothy Jaeryang Baek
81b8267e85
feat: odt file parse support
2025-06-19 18:39:00 +04:00
priten
f7920df870
Fix non-ascii error issue on ENABLE_FORWARD_USER_INFO_HEADERS
2025-06-16 12:33:11 -05:00
Timothy Jaeryang Baek
7753f57d42
chore: format
2025-06-16 13:48:50 +04:00
Tim Jaeryang Baek
c5b48ec551
Merge pull request #14992 from sreesdas/dev
...
Fix: Added support for multiple pages in external document loader
2025-06-16 11:01:33 +04:00
sree
62bfe73964
Fix: Added support for multiple pages in external document loader, added filename in api request header
2025-06-15 19:59:05 +05:30
Vaclav Cerny
4bbc32efa6
fix: serialize picture description parameters to JSON in DoclingLoader
2025-06-11 20:00:25 +02:00
Timothy Jaeryang Baek
d430fe9551
refac
2025-06-10 11:30:54 +04:00
Timothy Jaeryang Baek
7f488b3754
feat: experimental pgvector pgcrypto support
2025-06-09 18:14:33 +04:00
Timothy Jaeryang Baek
7f75acff96
chore: format
2025-06-08 22:08:25 +04:00
Timothy Jaeryang Baek
0cd400f5ee
refac: docling picture describe params
2025-06-08 20:02:14 +04:00
Tim Jaeryang Baek
6bf393a480
Merge pull request #14787 from vaclcer/vaclavs-custom-docling
...
feat: Customize Docling's "Describe Pictures" feature
2025-06-08 19:02:36 +04:00
Tim Jaeryang Baek
50d9a2ac58
Merge pull request #14781 from lucyknada/patch-2
...
fix: fix #14752 and add manual transcription retrieval
2025-06-08 18:40:28 +04:00
Vaclav Cerny
99f05561f8
Add configuration options for picture description modes and update related components
2025-06-08 16:30:26 +02:00
lucy
b0965a8184
fixes #14752 and adds manual transcription option
2025-06-08 14:26:24 +02:00
Timothy Jaeryang Baek
5e35aab292
chore: format
2025-06-05 01:12:28 +04:00
Tim Jaeryang Baek
7c4f261aa2
Merge pull request #14616 from Davixk/feat/new-perplexity-options
...
feat: add Perplexity AI model and search context usage configuration options
2025-06-05 00:28:00 +04:00
Vaclav Cerny
9772c18b20
fix(loader): remove deprecated picture description configuration
2025-06-04 17:21:44 +02:00
Vaclav Cerny
c71236ba07
feat(loader): enhance picture description prompt for improved detail and clarity
2025-06-04 14:25:31 +02:00
Vaclav Cerny
c4278f4784
fix description vs classification mismatch
2025-06-04 14:13:00 +02:00
Vaclav Cerny
8644e81a1c
feat(loader): add picture description configuration for DoclingLoader
2025-06-04 12:34:39 +02:00
Timothy Jaeryang Baek
4d364e2967
refac: remove msg from known type
2025-06-03 16:27:28 +04:00
Dave
77b357c73b
fix: update label for search context usage to clarify its purpose
2025-06-03 00:27:07 +02:00
Dave
96e9bfe0e5
feat: add Perplexity model and search context usage configuration options
2025-06-03 00:19:08 +02:00
Tim Jaeryang Baek
3c32d2cada
Merge pull request #14539 from PVBLIC-F/refac/mistral
...
perf mistral.py Enhance for Overall Speed and Efficiency
2025-06-02 23:52:59 +04:00
PVBLIC Foundation
cf3635ba25
Update mistral.py
...
1. Intelligent Error Handling
- Added _is_retryable_error() method to distinguish retryable vs non-retryable errors
- Prevents unnecessary retries on client errors (4xx) that won't succeed
- Caps retry delay at 30 seconds to prevent excessive waiting
2. Optimized Timeout Configuration
- Upload: Capped at 2 minutes (was using full 5-minute timeout)
- URL requests: 30 seconds (should be fast)
- OCR processing: Full timeout (can take time)
- Cleanup: 30 seconds (should be quick)
3. Enhanced Connection Pool
- Increased connection limits: 20 total, 10 per host
- Longer DNS cache TTL (10 minutes vs 5 minutes)
- Increased keepalive timeout (60s vs 30s)
- Added async DNS resolver for better performance
- Granular timeout controls (connect, read, total)
4. Concurrency Control for Batch Processing
- Added semaphore-based concurrency control (default: 5 concurrent)
- Prevents API overwhelming while maintaining throughput
- Configurable concurrency limit per workload
5. Memory Efficient Result Processing
- Early exit for empty content validation
- Better error metadata for debugging
- Added content length tracking
- Streamlined page processing logic
6. General Performance Improvements
- Better error logging with truncated responses
- Optimized metadata creation
- Improved debug logging efficiency
2025-05-30 20:06:29 -07:00
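The retry and concurrency ideas listed in cf3635ba25 can be sketched with the standard library alone. The helper names below are hypothetical and are not the contents of mistral.py; the rule that only 5xx responses are retried and the exponential backoff base are assumptions, while the 30-second cap and the default limit of 5 come from the commit message.

```python
import asyncio

MAX_RETRY_DELAY = 30.0  # seconds; cap named in the commit message


def is_retryable_status(status: int) -> bool:
    """Assumption: server errors (5xx) are retryable, client errors (4xx) are not."""
    return 500 <= status < 600


def retry_delay(attempt: int, base: float = 1.0) -> float:
    """Exponential backoff (base * 2**attempt), capped at MAX_RETRY_DELAY."""
    return min(base * (2 ** attempt), MAX_RETRY_DELAY)


async def process_all(items, worker, limit: int = 5):
    """Run `worker` over `items` with at most `limit` calls in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(item) for item in items))


async def demo_worker(n: int) -> int:
    """Placeholder for a real request; yields control once and doubles the input."""
    await asyncio.sleep(0)
    return n * 2


if __name__ == "__main__":
    print(retry_delay(0), retry_delay(10))                  # 1.0 30.0
    print(asyncio.run(process_all(range(4), demo_worker)))  # [0, 2, 4, 6]
```

A semaphore bounds the number of in-flight requests without serialising them, which matches the stated goal of protecting the API while keeping throughput.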
PVBLIC Foundation
66bde32623
Update pinecone.py
2025-05-30 18:47:23 -07:00
PVBLIC Foundation
4ecf2a8685
Update pinecone.py
...
May 2025 Latest Pinecone Best Practices
2025-05-30 09:33:57 -07:00
Timothy Jaeryang Baek
9306ae5972
refac
2025-05-30 01:19:56 +04:00
Timothy Jaeryang Baek
e1e2c096e2
refac: PLEASE follow existing convention
2025-05-30 00:34:18 +04:00
Tim Jaeryang Baek
ff353578db
Merge pull request #14370 from daw/feat/add-azure-openai-embeddings-option
...
feat: Add Azure OpenAI embedding support
2025-05-30 00:18:55 +04:00
Timothy Jaeryang Baek
7dc7d5c028
refac: PLEASE FOLLOW EXISTING CONVENTION
2025-05-29 03:47:02 +04:00
Timothy Jaeryang Baek
551597b9cc
chore: format
2025-05-29 02:36:33 +04:00
Hisma
e12a79c0e2
fix: handle json output format correctly
2025-05-27 01:12:03 -04:00
Hisma
a9405cc101
feat: Marker api content extraction support
2025-05-27 00:44:07 -04:00
Timothy Jaeryang Baek
da75d0ca1e
chore: format
2025-05-24 02:13:54 +04:00
Tim Jaeryang Baek
e663b90a9f
Merge pull request #14069 from Ithanil/bm25_weight
...
feat: Configurable weight for BM25Retriever during hybrid search
2025-05-24 01:13:03 +04:00
Timothy Jaeryang Baek
8b5e89eada
chore: format
2025-05-24 00:43:38 +04:00
Jan Kessler
e70dd33233
rename BM25_WEIGHT -> HYBRID_BM25_WEIGHT
2025-05-23 22:06:44 +02:00
Tim Jaeryang Baek
c8f1bdf928
Merge pull request #14245 from PVBLIC-F/dev
...
perf Update mistral.py
2025-05-23 21:57:16 +04:00
PVBLIC Foundation
bf193dfb5d
Update mistral.py
2025-05-23 10:00:19 -07:00
Timothy Jaeryang Baek
aac25eac9e
refac: reranker
...
Co-Authored-By: Tornike Gurgenidze <togurg14@freeuni.edu.ge>
2025-05-23 01:29:48 +04:00
Tim Jaeryang Baek
da4aa5f08b
Merge pull request #14152 from U8F69/fix_user_auth
...
fix(auth): correctly use password hash when duplicate email records exist
2025-05-22 14:58:10 +04:00
U8F69
dd6124a84f
fix(auth): fix invalid password use in auth
2025-05-22 11:03:43 +08:00
PVBLIC Foundation
86e24bb4aa
Update pinecone.py
...
I've improved the pinecone.py file as follows:
- Updated from the deprecated PineconeGRPC client to the newer Pinecone client
- Modified the client initialization code to match the new API requirements
- Added better response handling with getattr() to safely access attributes from response objects
- Removed the streaming_upsert method, which is not available in the newer client
- Added safer attribute access with fallbacks throughout the code
- Updated the close method to reflect that the newer client doesn't need explicit closing
These changes ensure the code is compatible with the latest Pinecone Python SDK and will be more robust against future changes. The key improvement is migrating away from the deprecated gRPC client, which will eventually stop working.
2025-05-21 15:28:42 -07:00
Tim Jaeryang Baek
d3c7628092
Merge pull request #14059 from sreesdas/main
...
fix: resolve issue where external document loader was not invoked
2025-05-20 17:43:06 +04:00
Tim Jaeryang Baek
fac5884d8c
Merge pull request #14073 from tth37/fix_default_web_loader_verify_ssl
...
fix: Default web loader fails silently when `verify_ssl=False`
2025-05-20 17:24:22 +04:00
tth37
78befd5a2f
fix: Default web loader fails when verify_ssl=False
2025-05-20 19:44:18 +08:00
Jan Kessler
308d8ac04a
make bm25_weight a regular parameter of query_doc.. / get_sources_from_files functions
2025-05-20 11:46:32 +02:00
Jan Kessler
b5ddaf6417
make weight for bm25 retriever in hybrid search ui-configurable
2025-05-20 10:39:31 +02:00
sree
f408b08965
minor bug fix for external document loader not working
2025-05-20 11:10:23 +05:30
Derek Wischusen
42be1f956a
Add Azure OpenAI embedding support
2025-05-19 22:58:04 -04:00
Marcelo Mendoza
d6ad96affb
fix: use get method for title and snippet in search results
2025-05-19 17:24:47 +02:00
Timothy Jaeryang Baek
6692fb2181
chore: format
2025-05-17 01:00:37 +04:00
Kiet Trinh
418ac1a8da
refac: Rename Qdrant multi-tenancy variable for improved clarity and consistency
2025-05-15 09:09:24 +00:00
Kiet Trinh
485bd7666c
fix: Update Qdrant multi-tenancy variable name for consistency in configuration
2025-05-15 08:02:58 +00:00
LoiTra
184d8dfd7e
feat: Implement Qdrant multi-tenancy support with collection management and tenant isolation
2025-05-15 11:28:06 +07:00
Timothy Jaeryang Baek
b143c71da2
refac: AIOHTTP_CLIENT_SESSION_SSL
2025-05-14 23:33:52 +04:00
Timothy Jaeryang Baek
42382b5167
fix
2025-05-14 22:46:01 +04:00
Timothy Jaeryang Baek
8732b64b6b
feat: external document loader support
2025-05-14 22:28:40 +04:00
Timothy Jaeryang Baek
de70d0cb64
feat: docling do picture description support
2025-05-14 21:26:49 +04:00
hwzhuhao
6f869ded43
feat: Add vector type and vector factory class for vector database integration
2025-05-14 21:30:50 +08:00
Timothy Jaeryang Baek
6b5f99bf66
fix: external reranker
2025-05-10 19:33:34 +04:00
Timothy Jaeryang Baek
c61790b355
chore: format
2025-05-10 19:00:01 +04:00