Commit graph

107 commits

Author SHA1 Message Date
Timothy Jaeryang Baek
7e6b8a9a71 refac/fix: docling auth 2025-12-06 08:06:42 -05:00
Henne
a7e614ca4c
feat: Adds document intelligence model configuration (#19692)
* Adds document intelligence model configuration

Enables the configuration of the Document Intelligence model to be used by the RAG pipeline.

This allows users to specify the model they want to use for document processing, providing flexibility and control over the extraction process.

* Added Titel to Document Intelligence Model Config

Added Titel to Document Intelligence Model Config
2025-12-02 14:41:09 -05:00
Timothy Jaeryang Baek
9c19d0abd4 refac/breaking: docling params
Some checks are pending
Deploy to HuggingFace Spaces / check-secret (push) Waiting to run
Deploy to HuggingFace Spaces / deploy (push) Blocked by required conditions
Create and publish Docker images with specific build args / build-main-image (linux/amd64, ubuntu-latest) (push) Waiting to run
Create and publish Docker images with specific build args / build-main-image (linux/arm64, ubuntu-24.04-arm) (push) Waiting to run
Create and publish Docker images with specific build args / build-cuda-image (linux/amd64, ubuntu-latest) (push) Waiting to run
Create and publish Docker images with specific build args / build-cuda-image (linux/arm64, ubuntu-24.04-arm) (push) Waiting to run
Create and publish Docker images with specific build args / build-cuda126-image (linux/amd64, ubuntu-latest) (push) Waiting to run
Create and publish Docker images with specific build args / build-cuda126-image (linux/arm64, ubuntu-24.04-arm) (push) Waiting to run
Create and publish Docker images with specific build args / build-ollama-image (linux/amd64, ubuntu-latest) (push) Waiting to run
Create and publish Docker images with specific build args / build-ollama-image (linux/arm64, ubuntu-24.04-arm) (push) Waiting to run
Create and publish Docker images with specific build args / build-slim-image (linux/amd64, ubuntu-latest) (push) Waiting to run
Create and publish Docker images with specific build args / build-slim-image (linux/arm64, ubuntu-24.04-arm) (push) Waiting to run
Create and publish Docker images with specific build args / merge-main-images (push) Blocked by required conditions
Create and publish Docker images with specific build args / merge-cuda-images (push) Blocked by required conditions
Create and publish Docker images with specific build args / merge-cuda126-images (push) Blocked by required conditions
Create and publish Docker images with specific build args / merge-ollama-images (push) Blocked by required conditions
Create and publish Docker images with specific build args / merge-slim-images (push) Blocked by required conditions
Python CI / Format Backend (push) Waiting to run
Frontend Build / Format & Build Frontend (push) Waiting to run
Frontend Build / Frontend Unit Tests (push) Waiting to run
2025-11-24 16:01:13 -05:00
Timothy Jaeryang Baek
4673e120c4 refac/fix: mineru params
breaking change
2025-11-11 00:30:11 -05:00
Timothy Jaeryang Baek
e2b9942648 feat: Optionally forward user headers to external document loader
Co-Authored-By: Classic298 <27028174+Classic298@users.noreply.github.com>
2025-11-06 00:05:46 -05:00
Timothy Jaeryang Baek
415b93c7c3 enh: configurable mistral ocr base url 2025-11-05 23:25:51 -05:00
Timothy Jaeryang Baek
d14329b285 refac 2025-11-02 18:47:41 -05:00
Timothy Jaeryang Baek
d98c539d89 refac 2025-11-01 18:48:11 -04:00
Tim Baek
c787070dc9
Merge pull request #18765 from mkhludnev/patch-2
Some checks are pending
Deploy to HuggingFace Spaces / check-secret (push) Waiting to run
Deploy to HuggingFace Spaces / deploy (push) Blocked by required conditions
Create and publish Docker images with specific build args / merge-ollama-images (push) Blocked by required conditions
Create and publish Docker images with specific build args / merge-slim-images (push) Blocked by required conditions
Create and publish Docker images with specific build args / build-main-image (linux/amd64, ubuntu-latest) (push) Waiting to run
Create and publish Docker images with specific build args / build-main-image (linux/arm64, ubuntu-24.04-arm) (push) Waiting to run
Create and publish Docker images with specific build args / build-cuda-image (linux/amd64, ubuntu-latest) (push) Waiting to run
Create and publish Docker images with specific build args / merge-cuda126-images (push) Blocked by required conditions
Create and publish Docker images with specific build args / build-cuda-image (linux/arm64, ubuntu-24.04-arm) (push) Waiting to run
Create and publish Docker images with specific build args / build-cuda126-image (linux/amd64, ubuntu-latest) (push) Waiting to run
Create and publish Docker images with specific build args / build-cuda126-image (linux/arm64, ubuntu-24.04-arm) (push) Waiting to run
Create and publish Docker images with specific build args / build-ollama-image (linux/amd64, ubuntu-latest) (push) Waiting to run
Create and publish Docker images with specific build args / build-ollama-image (linux/arm64, ubuntu-24.04-arm) (push) Waiting to run
Create and publish Docker images with specific build args / build-slim-image (linux/amd64, ubuntu-latest) (push) Waiting to run
Create and publish Docker images with specific build args / build-slim-image (linux/arm64, ubuntu-24.04-arm) (push) Waiting to run
Create and publish Docker images with specific build args / merge-main-images (push) Blocked by required conditions
Create and publish Docker images with specific build args / merge-cuda-images (push) Blocked by required conditions
Python CI / Format Backend (push) Waiting to run
fix: Don't missguide Tika with mime-type
2025-10-31 14:18:43 -07:00
Mikhail Khludnev
24aeec9120
Don't missguide Tika with mime-type
Fix #18683 
* Tika is smart enough to detect content type. 
* Windows browsers just misguides Tika providing `application/ms-word` for .rtf files.
2025-10-31 23:12:23 +03:00
Timothy Jaeryang Baek
a70bc52c34 chore: format 2025-10-26 19:33:39 -07:00
Ivan Ostanin
7d29991fa5
fix: pass youtube_proxy as a GenericProxyConfig type object 2025-10-19 02:01:27 +02:00
palazski
288b323df8 feat: use MINERU_PARAMS json field for mineru settings 2025-10-15 22:59:59 +03:00
palazski
40e9d9c330 feat: add mineru as document parser support with both local and managed api 2025-10-13 21:09:52 +03:00
Tim Jaeryang Baek
8d7d79d54b
0.6.33 (#18118)
* feat: improve ollama model management experience

This commit introduces several improvements to the Ollama model management modal:

- Adds a cancel button to the model pulling operation, using the existing 'x' button pattern.
- Adds a cancel button to the "Update All" models operation, allowing the user to cancel the update for the currently processing model.
- Cleans up toast notifications when updating all models. A single toast is now shown at the beginning and a summary toast at the end, preventing notification spam.
- Refactors the `ManageOllama.svelte` component to support these new cancellation features.
- Adds tooltips to all buttons in the modal to improve clarity.
- Disables buttons when their corresponding input fields are empty to prevent accidental clicks.

* fix

* i18n: improve Chinese translation

* fix: handle non‑UTF8 chars in third‑party responses without error

* German translation of new strings in i18n

* log web search queries only with level 'debug' instead of 'info'

* Tool calls now only include text and dont inlcude other content like image b64

* fix onedrive

* fix: discovery url

* fix: default permissions not being loaded

* fix: ai hallucination

* fix: non rich text input copy

* refac: rm print statements

* refac: disable direct models from model editors

* refac/fix: do not process xlsx files with azure doc intelligence

* Update pull_request_template.md

* Update generated image translation in DE-de

* added missing danish translations

* feat(onedrive): Enable search and "My Organization" pivot

* style(onedrive): Formatting fix

* feat: Implement toggling for vertical and horizontal flow layouts

This commit introduces the necessary logic and UI controls to allow users to switch the Flow component layout between vertical and horizontal orientations.

*   **`Flow.svelte` Refactoring:**
    *   Updates logic for calculating level offsets and node positions to consistently respect the current flow orientation.
    *   Adds a control panel using `<Controls>` and `<SwitchButton>` components.
    *   Provides user interface elements to easily switch the flow layout between horizontal and vertical orientations.

* build(deps): bump pydantic from 2.11.7 to 2.11.9 in /backend

Bumps [pydantic](https://github.com/pydantic/pydantic) from 2.11.7 to 2.11.9.
- [Release notes](https://github.com/pydantic/pydantic/releases)
- [Changelog](https://github.com/pydantic/pydantic/blob/v2.11.9/HISTORY.md)
- [Commits](https://github.com/pydantic/pydantic/compare/v2.11.7...v2.11.9)

---
updated-dependencies:
- dependency-name: pydantic
  dependency-version: 2.11.9
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump black from 25.1.0 to 25.9.0 in /backend

Bumps [black](https://github.com/psf/black) from 25.1.0 to 25.9.0.
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/compare/25.1.0...25.9.0)

---
updated-dependencies:
- dependency-name: black
  dependency-version: 25.9.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump markdown from 3.8.2 to 3.9 in /backend

Bumps [markdown](https://github.com/Python-Markdown/markdown) from 3.8.2 to 3.9.
- [Release notes](https://github.com/Python-Markdown/markdown/releases)
- [Changelog](https://github.com/Python-Markdown/markdown/blob/master/docs/changelog.md)
- [Commits](https://github.com/Python-Markdown/markdown/compare/3.8.2...3.9.0)

---
updated-dependencies:
- dependency-name: markdown
  dependency-version: '3.9'
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump chromadb from 1.0.20 to 1.1.0 in /backend

Bumps [chromadb](https://github.com/chroma-core/chroma) from 1.0.20 to 1.1.0.
- [Release notes](https://github.com/chroma-core/chroma/releases)
- [Changelog](https://github.com/chroma-core/chroma/blob/main/RELEASE_PROCESS.md)
- [Commits](https://github.com/chroma-core/chroma/compare/1.0.20...1.1.0)

---
updated-dependencies:
- dependency-name: chromadb
  dependency-version: 1.1.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump opentelemetry-api from 1.36.0 to 1.37.0

Bumps [opentelemetry-api](https://github.com/open-telemetry/opentelemetry-python) from 1.36.0 to 1.37.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-python/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-python/compare/v1.36.0...v1.37.0)

---
updated-dependencies:
- dependency-name: opentelemetry-api
  dependency-version: 1.37.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* refac: ollama embed form data

* fix: non rich text handling

* fix: oauth client registration

* refac

* chore: dep bump

* chore: fastapi bump

* chore/refac: bump bcrypt and remove passlib

* Improving Korean Translation

* refac

* Improving Korean Translation

* feat: PWA share_target implementation

Co-Authored-By: gjveld <19951982+gjveld@users.noreply.github.com>

* refac: message input mobile detection behaviour

* feat: model_ids per folder

* Update translation.json (pt-BR)

inclusion of new translations of items that have been added

* refac

* refac

* refac

* refac

* refac/fix: temp chat

* refac

* refac: stop task

* refac/fix: azure audio escape

* refac: external tool validation

* refac/enh: start.sh additional args support

* refac

* refac: styling

* refac/fix: direct connection floating action buttons

* refac/fix: system prompt duplication

* refac/enh: openai tts additional params support

* refac

* feat: load data in parallel to accelerate page loading speed

* i18n: improve Chinese translation

* refac

* refac: model selector

* UPD: i18n es-ES Translation v0.6.33

UPD: i18n es-ES Translation v0.6.33

Updated new strings.

* refac

* improved query pref by querying only relevant columns

* refac/enh: docling params

* refac

* refac: openai additional headers support

* refac

* FEAT: Add Vega Char Visualizer Renderer

### FEAT: Add Vega Char Visualizer Renderer

Feature required in https://github.com/open-webui/open-webui/discussions/18022

Added npm vega lib to package.json
Added function for visualization renderer to src/libs/utils/index.ts
Added logic to src/lib/components/chat/Messages/CodeBlock.svelte

The treatment is similar as for mermaid diagrams.

Reference: https://vega.github.io/vega/

* refac

* chore

* refac

* FEAT: Add Vega-Lite Char Visualizer Renderer

### FEAT: Add Vega Char Visualizer Renderer

Add suport for Vega-Lite Specifications.
Vega-Lite is a "compiled" version of Vega Char Visualizer.
For be rendered with Vega it have to be compiled.
This PR add the check and compile if necessary, is a complement of recent Vega Renderer Feature added.

* refac

* refac/fix: switch

* enh/refac: url input handling

* refac

* refac: styling

* UPD: Add Validators & Error Toast for Mermaid & Vega diagrams

### UPD: Feat:  Add Validators & Error Toast for Mermaid & Vega diagrams

Description:
As many time the diagrams generated or entered have syntax errors the diagrams are not rendered due to that errors, but as there isn't any notification is difficult to know what happend.

This PR add validator and toast notification when error on Mermaid and Vega/Vega-Lite diagrams, helping the user to fix its.

* removed redundant knowledge API call

* Fix Code Format

* refac: model workspace view

* refac

* refac: knowledge

* refac: prompts

* refac: tools

* refac

* feat: attach folder

* refac: make tencentcloud-sdk-python optional

* refac/fix: oauth

* enh: ENABLE_OAUTH_EMAIL_FALLBACK

* refac/fix: folders

* Update requirements.txt

* Update pyproject.toml

* UPD: Add Validators & Error Toast for Mermaid & Vega diagrams

### UPD: Feat:  Add Validators & Error Toast for Mermaid & Vega diagrams

Description:
As many time the diagrams generated or entered have syntax errors the diagrams are not rendered due to that errors, but as there isn't any notification is difficult to know what happend.

This PR add validator and toast notification when error on Mermaid and Vega/Vega-Lite diagrams, helping the user to fix its.

Note:
Another possibility of integrating this Graph Visualizer is through its svelte component: https://github.com/vega/svelte-vega/tree/main/packages/svelte-vega

* Removed unused toast import & Code Format

* refac

* refac: external tool server view

* refac

* refac: overview

* refac: styling

* refac

* Update bug_report.yaml

* refac

* refac

* refac

* refac

* refac: oauth client fallback

* Fixed: Cannot handle batch sizes > 1 if no padding token is defined

Fixes Cannot handle batch sizes > 1 if no padding token is defined

For reranker models that do not have this defined in their config by using the eos_token_id if present as pad_token_id.

* refac: fallback to reasoning content

* fix(i18n): corrected typo in Spanish translation for "Reasoning Tags"

Typo fixed in Spanish translation file at line 1240 of `open-webui/src/lib/i18n/locales/es-ES/translation.json`:

- Incorrect: "Eriquetas de Razonamiento"
- Correct:   "Etiquetas de Razonamiento"

This improves clarity and consistency in the UI.

* refac/fix: ENABLE_STAR_SESSIONS_MIDDLEWARE

* refac/fix: redirect

* refac

* refac

* refac

* refac: web search error handling

* refac: source parsing

* refac: functions

* refac

* refac/enh: note pdf export

* refac/fix: mcp oauth2.1

* chore: format

* chore: Changelog (#17995)

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* refac

* chore: dep bump

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: silentoplayz <jacwoo21@outlook.com>
Co-authored-by: Shirasawa <764798966@qq.com>
Co-authored-by: Jan Kessler <jakessle@uni-mainz.de>
Co-authored-by: Jacob Leksan <jacob.leksan@expedient.com>
Co-authored-by: Classic298 <27028174+Classic298@users.noreply.github.com>
Co-authored-by: sinejespersen <sinejespersen@protonmail.com>
Co-authored-by: Selene Blok <selene.blok@rws.nl>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Cyp <cypher9715@naver.com>
Co-authored-by: gjveld <19951982+gjveld@users.noreply.github.com>
Co-authored-by: joaoback <156559121+joaoback@users.noreply.github.com>
Co-authored-by: _00_ <131402327+rgaricano@users.noreply.github.com>
Co-authored-by: expruc <eygabi01@gmail.com>
Co-authored-by: YetheSamartaka <55753928+YetheSamartaka@users.noreply.github.com>
Co-authored-by: Akutangulo <akutangulo@gmail.com>
2025-10-07 16:20:27 -05:00
Timothy Jaeryang Baek
485392fe63 chore: format 2025-09-09 18:19:31 +04:00
Tim Jaeryang Baek
71fd483fba
Merge pull request #17276 from Elettrotecnica/extend-docling-configuration
feat: Extend docling configuration options
2025-09-09 18:04:30 +04:00
Timothy Jaeryang Baek
5f0d262c59 fix: yt embed 2025-09-09 16:00:42 +04:00
Antonio Pisano
daa2a036f8 Extend docling configuration options to include:
* do_ocr
* force_ocr
* pdf_backend
* table_mode
* pipeline

as per https://github.com/docling-project/docling-serve/blob/main/docs/usage.md

See https://github.com/open-webui/open-webui/issues/17148
2025-09-08 18:51:33 +02:00
Athanasios Oikonomou
d735b036fe fix: handle unicode filenames in external document loader
Files with special characters in their names (e.g., ü.pdf) caused issues since HTTP headers only allow Latin-1 characters.
This change URL-encodes `X-Filename` before adding it to request headers, preventing failures when uploading or processing such files.

Fixes: #17000
2025-08-28 22:19:50 +03:00
Timothy Jaeryang Baek
2bb6063dcb refac/fix: marker 2025-08-28 03:03:31 +04:00
Selene Blok
5051bfe7ab feat(document retrieval): Authenticate Azure Document Intelligence using AzureDefaultCredential if API key is not provided 2025-08-22 16:15:43 +02:00
Timothy Jaeryang Baek
e8696c63fe refac 2025-08-04 15:23:43 +04:00
Tim Jaeryang Baek
5db60ca34f
Merge pull request #15903 from Hisma/marker-api-update
feat: Add configurable API URL (for self-hosting) and additional_config parameter for Datalab Marker API
2025-08-04 15:21:03 +04:00
Hisma
21337a2fd8 ci fix 2025-07-22 22:07:14 -04:00
Hisma
a99e20cc3d add format_lines 2025-07-22 21:06:29 -04:00
Hisma
f31cc07a9d feat: update marker api 2025-07-22 20:49:28 -04:00
bekzod
4bc054a347
Update docling endpoint 2025-07-16 20:40:13 +05:00
expruc
453a2bd9b5 fixed issue where text/html files being detected as text when loaded 2025-07-06 20:10:26 +03:00
Tim Jaeryang Baek
600344f2e8
Merge pull request #15510 from kopero2000/bug/oauth_logout_fix
fix/oauth logout fix
2025-07-04 10:30:02 +04:00
Bela Vizi
9623ef4360 add trust env to clientsession 2025-07-02 17:59:56 +02:00
Timothy Jaeryang Baek
81b8267e85 feat: odt file parse support 2025-06-19 18:39:00 +04:00
Timothy Jaeryang Baek
7753f57d42 chore: format 2025-06-16 13:48:50 +04:00
Tim Jaeryang Baek
c5b48ec551
Merge pull request #14992 from sreesdas/dev
Fix: Added support for multiple pages in external document loader
2025-06-16 11:01:33 +04:00
sree
62bfe73964 Fix: Added support for multiple pages in external document loader, added filename in api request header 2025-06-15 19:59:05 +05:30
Vaclav Cerny
4bbc32efa6 fix: serialize picture description parameters to JSON in DoclingLoader 2025-06-11 20:00:25 +02:00
Timothy Jaeryang Baek
7f75acff96 chore: format 2025-06-08 22:08:25 +04:00
Timothy Jaeryang Baek
0cd400f5ee refac: docling picture describe params 2025-06-08 20:02:14 +04:00
Tim Jaeryang Baek
6bf393a480
Merge pull request #14787 from vaclcer/vaclavs-custom-docling
feat: Customize Docling's "Describe Pictures" feature
2025-06-08 19:02:36 +04:00
Tim Jaeryang Baek
50d9a2ac58
Merge pull request #14781 from lucyknada/patch-2
fix: fix #14752 and add manual transcription retrieval
2025-06-08 18:40:28 +04:00
Vaclav Cerny
99f05561f8 Add configuration options for picture description modes and update related components 2025-06-08 16:30:26 +02:00
lucy
b0965a8184
fixes #14752 and adds manual transcription option 2025-06-08 14:26:24 +02:00
Timothy Jaeryang Baek
5e35aab292 chore: format 2025-06-05 01:12:28 +04:00
Vaclav Cerny
9772c18b20 fix(loader): remove deprecated picture description configuration 2025-06-04 17:21:44 +02:00
Vaclav Cerny
c71236ba07 feat(loader): enhance picture description prompt for improved detail and clarity 2025-06-04 14:25:31 +02:00
Vaclav Cerny
c4278f4784 fix description vs classification mismatch 2025-06-04 14:13:00 +02:00
Vaclav Cerny
8644e81a1c feat(loader): add picture description configuration for DoclingLoader 2025-06-04 12:34:39 +02:00
Timothy Jaeryang Baek
4d364e2967 refac: remove msg from known type 2025-06-03 16:27:28 +04:00
PVBLIC Foundation
cf3635ba25
Update mistral.py
1. Intelligent Error Handling
Added _is_retryable_error() method to distinguish retryable vs non-retryable errors
Prevents unnecessary retries on client errors (4xx) that won't succeed
Caps retry delay at 30 seconds to prevent excessive waiting
2. Optimized Timeout Configuration
Upload: Capped at 2 minutes (was using full 5-minute timeout)
URL requests: 30 seconds (should be fast)
OCR processing: Full timeout (can take time)
Cleanup: 30 seconds (should be quick)
3. Enhanced Connection Pool
Increased connection limits: 20 total, 10 per host
Longer DNS cache TTL (10 minutes vs 5 minutes)
Increased keepalive timeout (60s vs 30s)
Added async DNS resolver for better performance
Granular timeout controls (connect, read, total)
4. Concurrency Control for Batch Processing
Added semaphore-based concurrency control (default: 5 concurrent)
Prevents API overwhelming while maintaining throughput
Configurable concurrency limit per workload
5. Memory Efficient Result Processing
Early exit for empty content validation
Better error metadata for debugging
Added content length tracking
Streamlined page processing logic
6. General Performance Improvements
Better error logging with truncated responses
Optimized metadata creation
Improved debug logging efficiency
2025-05-30 20:06:29 -07:00
Timothy Jaeryang Baek
7dc7d5c028 refac: PLEASE FOLLOW EXISTING CONVENTION 2025-05-29 03:47:02 +04:00