diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md
index 7aa2fec16b..d0f38c2334 100644
--- a/.github/ISSUE_TEMPLATE/bug_report.md
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -8,9 +8,19 @@ assignees: ''
# Bug Report
-**Important: Before submitting a bug report, please check whether a similar issue or feature request has already been posted in the Issues or Discussions section. It's likely we're already tracking it. In case of uncertainty, initiate a discussion post first. This helps us all to efficiently focus on improving the project.**
+## Important Notes
-**Let's collaborate respectfully. If you bring negativity, please understand our capacity to engage may be limited. If you're open to learning and communicating constructively, we're more than happy to assist you. Remember, Open WebUI is a volunteer-driven project maintained by a single maintainer, supported by our amazing contributors who also manage full-time jobs. We respect your time; please respect ours. If you have an issue, We highly encourage you to submit a pull request or to fork the project. We actively work to prevent contributor burnout to preserve the quality and continuity of Open WebUI.**
+- **Before submitting a bug report**: Please check the Issues or Discussions section to see if a similar issue or feature request has already been posted. It's likely we're already tracking it! If you’re unsure, start a discussion post first. This will help us efficiently focus on improving the project.
+
+- **Collaborate respectfully**: We value a constructive attitude, so please be mindful of your communication. If negativity is part of your approach, our capacity to engage may be limited. We’re here to help if you’re open to learning and communicating positively. Remember, Open WebUI is a volunteer-driven project managed by a single maintainer and supported by contributors who also have full-time jobs. We appreciate your time and ask that you respect ours.
+
+- **Contributing**: If you encounter an issue, we highly encourage you to submit a pull request or fork the project. We actively work to prevent contributor burnout to maintain the quality and continuity of Open WebUI.
+
+- **Bug reproducibility**: If a bug cannot be reproduced with a `:main` or `:dev` Docker setup, or with a pip install on Python 3.11, it may require additional help from the community. In such cases, we will move it to the "issues" Discussions section due to our limited resources, and we encourage the community to assist. Remember, this doesn’t mean the issue doesn’t exist; it means we need your help!
+
+Note: Please remove the notes above when submitting your post. Thank you for your understanding and support!
+
+---
## Installation Method
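A minimal sketch of the reproducibility check described in the notes above, assuming the standard `docker run` invocation from the project README:

```bash
# Verify the bug still occurs on the bleeding-edge :dev image before filing
docker run -d -p 3000:8080 -v open-webui:/app/backend/data \
  --name open-webui-dev ghcr.io/open-webui/open-webui:dev
```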
diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md
index 27abd35b4a..5d6e9d708d 100644
--- a/.github/ISSUE_TEMPLATE/feature_request.md
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -8,9 +8,19 @@ assignees: ''
# Feature Request
-**Important: Before submitting a feature request, please check whether a similar issue or feature request has already been posted in the Issues or Discussions section. It's likely we're already tracking it. In case of uncertainty, initiate a discussion post first. This helps us all to efficiently focus on improving the project.**
+## Important Notes
-**Let's collaborate respectfully. If you bring negativity, please understand our capacity to engage may be limited. If you're open to learning and communicating constructively, we're more than happy to assist you. Remember, Open WebUI is a volunteer-driven project maintained by a single maintainer, supported by our amazing contributors who also manage full-time jobs. We respect your time; please respect ours. If you have an issue, We highly encourage you to submit a pull request or to fork the project. We actively work to prevent contributor burnout to preserve the quality and continuity of Open WebUI.**
+- **Before submitting a report**: Please check the Issues or Discussions section to see if a similar issue or feature request has already been posted. It's likely we're already tracking it! If you’re unsure, start a discussion post first. This will help us efficiently focus on improving the project.
+
+- **Collaborate respectfully**: We value a constructive attitude, so please be mindful of your communication. If negativity is part of your approach, our capacity to engage may be limited. We’re here to help if you’re open to learning and communicating positively. Remember, Open WebUI is a volunteer-driven project managed by a single maintainer and supported by contributors who also have full-time jobs. We appreciate your time and ask that you respect ours.
+
+- **Contributing**: If you encounter an issue, we highly encourage you to submit a pull request or fork the project. We actively work to prevent contributor burnout to maintain the quality and continuity of Open WebUI.
+
+- **Bug reproducibility**: If a bug cannot be reproduced with a `:main` or `:dev` Docker setup, or with a pip install on Python 3.11, it may require additional help from the community. In such cases, we will move it to the "issues" Discussions section due to our limited resources, and we encourage the community to assist. Remember, this doesn’t mean the issue doesn’t exist; it means we need your help!
+
+Note: Please remove the notes above when submitting your post. Thank you for your understanding and support!
+
+---
**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
diff --git a/CHANGELOG.md b/CHANGELOG.md
index f7ee69b3a7..5ca694e65c 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,73 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [0.3.35] - 2024-10-26
+
+### Added
+
+- **📁 Robust File Handling**: Enhanced file input handling for chat. If the content extraction fails or is empty, users will now receive a clear warning, preventing silent failures and ensuring you always know what's happening with your uploads.
+- **🌍 New Language Support**: Introduced Hungarian translations and updated French translations, expanding the platform's language accessibility for a more global user base.
+
+### Fixed
+
+- **📚 Knowledge Base Loading Issue**: Resolved a critical bug where the Knowledge Base was not loading, ensuring smooth access to your stored documents and improving information retrieval in RAG-enhanced workflows.
+- **🛠️ Tool Parameters Issue**: Fixed an error where tools were not functioning correctly when required parameters were missing, ensuring reliable tool performance and more efficient task completions.
+- **🔗 Merged Response Loss in Multi-Model Chats**: Addressed an issue where responses in multi-model chat workflows were being deleted after follow-up queries, improving consistency and ensuring smoother interactions across models.
+
+## [0.3.34] - 2024-10-26
+
+### Added
+
+- **🔧 Feedback Export Enhancements**: Feedback history data can now be exported to JSON, allowing for seamless integration in RLHF processing and further analysis.
+- **🗂️ Embedding Model Lazy Loading**: Search functionality for leaderboard reranking is now more efficient, as embedding models are lazy-loaded only when needed, optimizing performance.
+- **🎨 Rich Text Input Toggle**: Users can now switch back to legacy textarea input for chat if they prefer simpler text input, though rich text is still the default until deprecation.
+- **🛠️ Improved Tool Calling Mechanism**: Enhanced method for parsing and calling tools, improving the reliability and robustness of tool function calls.
+- **🌐 Globalization Enhancements**: Updates to internationalization (i18n) support, further refining multi-language compatibility and accuracy.
+
+### Fixed
+
+- **🖥️ Folder Rename Fix for Firefox**: Addressed a persistent issue where users could not rename folders by pressing enter in Firefox, now ensuring seamless folder management across browsers.
+- **🔠 Tiktoken Model Text Splitter Issue**: Resolved an issue where the tiktoken text splitter wasn’t working in Docker installations, restoring full functionality for token-based text splitting.
+- **💼 S3 File Upload Issue**: Fixed a problem affecting S3 file uploads, ensuring smooth operations for those who store files on cloud storage.
+- **🔒 Strict-Transport-Security Crash**: Resolved a crash when setting the Strict-Transport-Security (HSTS) header, improving stability and security enhancements.
+- **🚫 OIDC Boolean Access Fix**: Addressed an issue with boolean values not being accessed correctly during OIDC logins, ensuring login reliability.
+- **⚙️ Rich Text Paste Behavior**: Refined paste behavior in rich text input to make it smoother and more intuitive when pasting various content types.
+- **🔨 Model Exclusion for Arena Fix**: Corrected the filter function that was not properly excluding models from the arena, improving model management.
+- **🏷️ "Tags Generation Prompt" Fix**: Addressed an issue preventing custom "tags generation prompts" from registering properly, ensuring custom prompt work seamlessly.
+
+## [0.3.33] - 2024-10-24
+
+### Added
+
+- **🏆 Evaluation Leaderboard**: Easily track your performance through a new leaderboard system where your ratings contribute to a real-time ranking based on the Elo system. Sibling responses (regenerations, multi-model chats) are required for your ratings to count in the leaderboard. Additionally, you can opt in to share your feedback history and be part of the community-wide leaderboard. Expect further improvements as we refine the algorithm; help us build the best community leaderboard!
+- **⚔️ Arena Model Evaluation**: Enable blind A/B testing of models directly from Admin Settings > Evaluation for a true side-by-side comparison. Ideal for pinpointing the best model for your needs.
+- **🎯 Topic-Based Leaderboard**: Discover more accurate rankings with experimental topic-based reranking, which adjusts leaderboard standings based on tag similarity in feedback. Get more relevant insights based on specific topics!
+- **📁 Folders Support for Chats**: Organize your chats better by grouping them into folders. Drag and drop chats between folders and export them seamlessly for easy sharing or analysis.
+- **📤 Easy Chat Import via Drag & Drop**: Save time by simply dragging and dropping chat exports (JSON) directly onto the sidebar to import them into your workspace—streamlined, efficient, and intuitive!
+- **📚 Enhanced Knowledge Collection**: Now, you can reference individual files from a knowledge collection, ideal for more precise Retrieval-Augmented Generation (RAG) queries and document analysis.
+- **🏷️ Enhanced Tagging System**: Tags now take up less space! Utilize the new 'tag:' query system to manage, search, and organize your conversations more effectively without cluttering the interface.
+- **🧠 Auto-Tagging for Chats**: Your conversations are now automatically tagged for improved organization, mirroring the efficiency of auto-generated titles.
+- **🔍 Backend Chat Query System**: Chat filtering has become more efficient, now handled through the backend instead of your browser, improving search performance and accuracy.
+- **🎮 Revamped Playground**: Experience a refreshed and optimized Playground for smoother testing, tweaks, and experimentation of your models and tools.
+- **🧩 Token-Based Text Splitter**: Introducing token-based text splitting (tiktoken), giving you more precise control over how text is processed. Previously, only character-based splitting was available.
+- **🔢 Ollama Batch Embeddings**: Leverage new batch embedding support for improved efficiency and performance with Ollama embedding models.
+- **🔍 Enhanced Add Text Content Modal**: Enjoy a cleaner, more intuitive workflow for adding and curating knowledge content with an upgraded input modal from our Knowledge workspace.
+- **🖋️ Rich Text Input for Chats**: Make your chat inputs more dynamic with support for rich text formatting. Your conversations just got a lot more polished and professional.
+- **⚡ Faster Whisper Model Configurability**: Customize your local Faster Whisper model directly from the WebUI.
+- **☁️ Experimental S3 Support**: Enable stateless WebUI instances with S3 support, greatly enhancing scalability and balancing heavy workloads.
+- **🔕 Disable Update Toast**: Now you can streamline your workspace even further—choose to disable update notifications for a more focused experience.
+- **🌟 RAG Citation Relevance Percentage**: Easily assess citation accuracy with the addition of relevance percentages in RAG results.
+- **⚙️ Mermaid Copy Button**: Mermaid diagrams now come with a handy copy button, simplifying the extraction and use of diagram contents directly in your workflow.
+- **🎨 UI Redesign**: Major interface redesign that will make navigation smoother, keep your focus where it matters, and ensure a modern look.
+
+### Fixed
+
+- **🎙️ Voice Note Mic Stopping Issue**: Fixed the issue where the microphone stayed active after ending a voice note recording, ensuring your audio workflow runs smoothly.
+
+### Removed
+
+- **👋 Goodbye Sidebar Tags**: Sidebar tag clutter is gone. We’ve shifted tag buttons to more effective query-based tag filtering for a sleeker, more agile interface.
+
## [0.3.32] - 2024-10-06
### Added
diff --git a/Dockerfile b/Dockerfile
index 2e898dc890..274e23dbfc 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,6 +1,6 @@
# syntax=docker/dockerfile:1
# Initialize device type args
-# use build args in the docker build commmand with --build-arg="BUILDARG=true"
+# use build args in the docker build command with --build-arg="BUILDARG=true"
ARG USE_CUDA=false
ARG USE_OLLAMA=false
# Tested with cu117 for CUDA 11 and cu121 for CUDA 12 (default)
@@ -11,6 +11,10 @@ ARG USE_CUDA_VER=cu121
# IMPORTANT: If you change the embedding model (sentence-transformers/all-MiniLM-L6-v2) and vice versa, you aren't able to use RAG Chat with your previous documents loaded in the WebUI! You need to re-embed them.
ARG USE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
ARG USE_RERANKING_MODEL=""
+
+# Tiktoken encoding name; models to use can be found at https://huggingface.co/models?library=tiktoken
+ARG USE_TIKTOKEN_ENCODING_NAME="cl100k_base"
+
ARG BUILD_HASH=dev-build
# Override at your own risk - non-root configurations are untested
ARG UID=0
@@ -72,6 +76,10 @@ ENV RAG_EMBEDDING_MODEL="$USE_EMBEDDING_MODEL_DOCKER" \
RAG_RERANKING_MODEL="$USE_RERANKING_MODEL_DOCKER" \
SENTENCE_TRANSFORMERS_HOME="/app/backend/data/cache/embedding/models"
+## Tiktoken model settings ##
+ENV TIKTOKEN_ENCODING_NAME="cl100k_base" \
+ TIKTOKEN_CACHE_DIR="/app/backend/data/cache/tiktoken"
+
## Hugging Face download cache ##
ENV HF_HOME="/app/backend/data/cache/embedding/models"
@@ -131,11 +139,13 @@ RUN pip3 install uv && \
uv pip install --system -r requirements.txt --no-cache-dir && \
python -c "import os; from sentence_transformers import SentenceTransformer; SentenceTransformer(os.environ['RAG_EMBEDDING_MODEL'], device='cpu')" && \
python -c "import os; from faster_whisper import WhisperModel; WhisperModel(os.environ['WHISPER_MODEL'], device='cpu', compute_type='int8', download_root=os.environ['WHISPER_MODEL_DIR'])"; \
+ python -c "import os; import tiktoken; tiktoken.get_encoding(os.environ['TIKTOKEN_ENCODING_NAME'])"; \
else \
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu --no-cache-dir && \
uv pip install --system -r requirements.txt --no-cache-dir && \
python -c "import os; from sentence_transformers import SentenceTransformer; SentenceTransformer(os.environ['RAG_EMBEDDING_MODEL'], device='cpu')" && \
python -c "import os; from faster_whisper import WhisperModel; WhisperModel(os.environ['WHISPER_MODEL'], device='cpu', compute_type='int8', download_root=os.environ['WHISPER_MODEL_DIR'])"; \
+ python -c "import os; import tiktoken; tiktoken.get_encoding(os.environ['TIKTOKEN_ENCODING_NAME'])"; \
fi; \
chown -R $UID:$GID /app/backend/data/
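Since the backend reads `TIKTOKEN_ENCODING_NAME` from the environment, the baked-in default can also be overridden at container start. A minimal sketch, assuming the standard run invocation; encodings other than the pre-cached `cl100k_base` are fetched into `TIKTOKEN_CACHE_DIR` on first use:

```bash
# Example: use the o200k_base encoding for token-based chunking instead of the default
docker run -d -p 3000:8080 -v open-webui:/app/backend/data \
  -e TIKTOKEN_ENCODING_NAME="o200k_base" \
  --name open-webui ghcr.io/open-webui/open-webui:main
```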
diff --git a/TROUBLESHOOTING.md b/TROUBLESHOOTING.md
index 9bf242381c..83251a3a91 100644
--- a/TROUBLESHOOTING.md
+++ b/TROUBLESHOOTING.md
@@ -18,7 +18,7 @@ If you're experiencing connection issues, it’s often due to the WebUI docker c
docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
```
-### Error on Slow Reponses for Ollama
+### Error on Slow Responses for Ollama
Open WebUI has a default timeout of 5 minutes for Ollama to finish generating the response. If needed, this can be adjusted via the environment variable AIOHTTP_CLIENT_TIMEOUT, which sets the timeout in seconds.
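For example, a sketch that raises the timeout to ten minutes, reusing the host-network invocation above:

```bash
# Allow Ollama up to 600 seconds to finish generating a response
docker run -d --network=host -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
  -e AIOHTTP_CLIENT_TIMEOUT=600 \
  --name open-webui --restart always ghcr.io/open-webui/open-webui:main
```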
diff --git a/backend/open_webui/apps/audio/main.py b/backend/open_webui/apps/audio/main.py
index 6398b9ee1a..b138f82de9 100644
--- a/backend/open_webui/apps/audio/main.py
+++ b/backend/open_webui/apps/audio/main.py
@@ -63,6 +63,9 @@ app.state.config.STT_OPENAI_API_KEY = AUDIO_STT_OPENAI_API_KEY
app.state.config.STT_ENGINE = AUDIO_STT_ENGINE
app.state.config.STT_MODEL = AUDIO_STT_MODEL
+app.state.config.WHISPER_MODEL = WHISPER_MODEL
+app.state.faster_whisper_model = None
+
app.state.config.TTS_OPENAI_API_BASE_URL = AUDIO_TTS_OPENAI_API_BASE_URL
app.state.config.TTS_OPENAI_API_KEY = AUDIO_TTS_OPENAI_API_KEY
app.state.config.TTS_ENGINE = AUDIO_TTS_ENGINE
@@ -82,6 +85,31 @@ SPEECH_CACHE_DIR = Path(CACHE_DIR).joinpath("./audio/speech/")
SPEECH_CACHE_DIR.mkdir(parents=True, exist_ok=True)
+def set_faster_whisper_model(model: str, auto_update: bool = False):
+ if model and app.state.config.STT_ENGINE == "":
+ from faster_whisper import WhisperModel
+
+ faster_whisper_kwargs = {
+ "model_size_or_path": model,
+ "device": whisper_device_type,
+ "compute_type": "int8",
+ "download_root": WHISPER_MODEL_DIR,
+ "local_files_only": not auto_update,
+ }
+
+ try:
+ app.state.faster_whisper_model = WhisperModel(**faster_whisper_kwargs)
+ except Exception:
+ log.warning(
+ "WhisperModel initialization failed, attempting download with local_files_only=False"
+ )
+ faster_whisper_kwargs["local_files_only"] = False
+ app.state.faster_whisper_model = WhisperModel(**faster_whisper_kwargs)
+
+ else:
+ app.state.faster_whisper_model = None
+
+
class TTSConfigForm(BaseModel):
OPENAI_API_BASE_URL: str
OPENAI_API_KEY: str
@@ -99,6 +127,7 @@ class STTConfigForm(BaseModel):
OPENAI_API_KEY: str
ENGINE: str
MODEL: str
+ WHISPER_MODEL: str
class AudioConfigUpdateForm(BaseModel):
@@ -152,6 +181,7 @@ async def get_audio_config(user=Depends(get_admin_user)):
"OPENAI_API_KEY": app.state.config.STT_OPENAI_API_KEY,
"ENGINE": app.state.config.STT_ENGINE,
"MODEL": app.state.config.STT_MODEL,
+ "WHISPER_MODEL": app.state.config.WHISPER_MODEL,
},
}
@@ -176,6 +206,8 @@ async def update_audio_config(
app.state.config.STT_OPENAI_API_KEY = form_data.stt.OPENAI_API_KEY
app.state.config.STT_ENGINE = form_data.stt.ENGINE
app.state.config.STT_MODEL = form_data.stt.MODEL
+ app.state.config.WHISPER_MODEL = form_data.stt.WHISPER_MODEL
+ set_faster_whisper_model(form_data.stt.WHISPER_MODEL, WHISPER_MODEL_AUTO_UPDATE)
return {
"tts": {
@@ -194,6 +226,7 @@ async def update_audio_config(
"OPENAI_API_KEY": app.state.config.STT_OPENAI_API_KEY,
"ENGINE": app.state.config.STT_ENGINE,
"MODEL": app.state.config.STT_MODEL,
+ "WHISPER_MODEL": app.state.config.WHISPER_MODEL,
},
}
@@ -367,27 +400,10 @@ def transcribe(file_path):
id = filename.split(".")[0]
if app.state.config.STT_ENGINE == "":
- from faster_whisper import WhisperModel
-
- whisper_kwargs = {
- "model_size_or_path": WHISPER_MODEL,
- "device": whisper_device_type,
- "compute_type": "int8",
- "download_root": WHISPER_MODEL_DIR,
- "local_files_only": not WHISPER_MODEL_AUTO_UPDATE,
- }
-
- log.debug(f"whisper_kwargs: {whisper_kwargs}")
-
- try:
- model = WhisperModel(**whisper_kwargs)
- except Exception:
- log.warning(
- "WhisperModel initialization failed, attempting download with local_files_only=False"
- )
- whisper_kwargs["local_files_only"] = False
- model = WhisperModel(**whisper_kwargs)
+ if app.state.faster_whisper_model is None:
+ set_faster_whisper_model(app.state.config.WHISPER_MODEL)
+ model = app.state.faster_whisper_model
segments, info = model.transcribe(file_path, beam_size=5)
log.info(
"Detected language '%s' with probability %f"
@@ -395,7 +411,6 @@ def transcribe(file_path):
)
transcript = "".join([segment.text for segment in list(segments)])
-
data = {"text": transcript.strip()}
# save the transcript to a json file
@@ -403,7 +418,7 @@ def transcribe(file_path):
with open(transcript_file, "w") as f:
json.dump(data, f)
- print(data)
+ log.debug(data)
return data
elif app.state.config.STT_ENGINE == "openai":
if is_mp4_audio(file_path):
@@ -417,7 +432,7 @@ def transcribe(file_path):
files = {"file": (filename, open(file_path, "rb"))}
data = {"model": app.state.config.STT_MODEL}
- print(files, data)
+    log.debug(f"files: {files}, data: {data}")
r = None
try:
@@ -507,7 +522,8 @@ def transcription(
else:
data = transcribe(file_path)
- return data
+ file_path = file_path.split("/")[-1]
+ return {**data, "filename": file_path}
except Exception as e:
log.exception(e)
raise HTTPException(
diff --git a/backend/open_webui/apps/images/utils/comfyui.py b/backend/open_webui/apps/images/utils/comfyui.py
index 0a3e3a1d9b..4c421d7c52 100644
--- a/backend/open_webui/apps/images/utils/comfyui.py
+++ b/backend/open_webui/apps/images/utils/comfyui.py
@@ -125,22 +125,34 @@ async def comfyui_generate_image(
workflow[node_id]["inputs"][node.key] = model
elif node.type == "prompt":
for node_id in node.node_ids:
- workflow[node_id]["inputs"]["text"] = payload.prompt
+ workflow[node_id]["inputs"][
+ node.key if node.key else "text"
+ ] = payload.prompt
elif node.type == "negative_prompt":
for node_id in node.node_ids:
- workflow[node_id]["inputs"]["text"] = payload.negative_prompt
+ workflow[node_id]["inputs"][
+ node.key if node.key else "text"
+ ] = payload.negative_prompt
elif node.type == "width":
for node_id in node.node_ids:
- workflow[node_id]["inputs"]["width"] = payload.width
+ workflow[node_id]["inputs"][
+ node.key if node.key else "width"
+ ] = payload.width
elif node.type == "height":
for node_id in node.node_ids:
- workflow[node_id]["inputs"]["height"] = payload.height
+ workflow[node_id]["inputs"][
+ node.key if node.key else "height"
+ ] = payload.height
elif node.type == "n":
for node_id in node.node_ids:
- workflow[node_id]["inputs"]["batch_size"] = payload.n
+ workflow[node_id]["inputs"][
+ node.key if node.key else "batch_size"
+ ] = payload.n
elif node.type == "steps":
for node_id in node.node_ids:
- workflow[node_id]["inputs"]["steps"] = payload.steps
+ workflow[node_id]["inputs"][
+ node.key if node.key else "steps"
+ ] = payload.steps
elif node.type == "seed":
seed = (
payload.seed
diff --git a/backend/open_webui/apps/ollama/main.py b/backend/open_webui/apps/ollama/main.py
index eed74064f2..2056a4c036 100644
--- a/backend/open_webui/apps/ollama/main.py
+++ b/backend/open_webui/apps/ollama/main.py
@@ -547,7 +547,7 @@ class GenerateEmbeddingsForm(BaseModel):
class GenerateEmbedForm(BaseModel):
model: str
- input: list[str]
+ input: list[str] | str
truncate: Optional[bool] = None
options: Optional[dict] = None
keep_alive: Optional[Union[int, str]] = None
@@ -692,7 +692,7 @@ class GenerateCompletionForm(BaseModel):
options: Optional[dict] = None
system: Optional[str] = None
template: Optional[str] = None
- context: Optional[str] = None
+ context: Optional[list[int]] = None
stream: Optional[bool] = True
raw: Optional[bool] = None
keep_alive: Optional[Union[int, str]] = None
@@ -739,7 +739,7 @@ class GenerateChatCompletionForm(BaseModel):
format: Optional[str] = None
options: Optional[dict] = None
template: Optional[str] = None
- stream: Optional[bool] = None
+ stream: Optional[bool] = True
keep_alive: Optional[Union[int, str]] = None
@@ -761,6 +761,7 @@ async def generate_chat_completion(
form_data: GenerateChatCompletionForm,
url_idx: Optional[int] = None,
user=Depends(get_verified_user),
+ bypass_filter: Optional[bool] = False,
):
payload = {**form_data.model_dump(exclude_none=True)}
log.debug(f"generate_chat_completion() - 1.payload = {payload}")
@@ -769,7 +770,7 @@ async def generate_chat_completion(
model_id = form_data.model
- if app.state.config.ENABLE_MODEL_FILTER:
+ if not bypass_filter and app.state.config.ENABLE_MODEL_FILTER:
if user.role == "user" and model_id not in app.state.config.MODEL_FILTER_LIST:
raise HTTPException(
status_code=403,
diff --git a/backend/open_webui/apps/openai/main.py b/backend/open_webui/apps/openai/main.py
index 70cefb29ca..3647977cad 100644
--- a/backend/open_webui/apps/openai/main.py
+++ b/backend/open_webui/apps/openai/main.py
@@ -18,7 +18,10 @@ from open_webui.config import (
OPENAI_API_KEYS,
AppConfig,
)
-from open_webui.env import AIOHTTP_CLIENT_TIMEOUT
+from open_webui.env import (
+ AIOHTTP_CLIENT_TIMEOUT,
+ AIOHTTP_CLIENT_TIMEOUT_OPENAI_MODEL_LIST,
+)
from open_webui.constants import ERROR_MESSAGES
from open_webui.env import SRC_LOG_LEVELS
@@ -179,7 +182,7 @@ async def speech(request: Request, user=Depends(get_verified_user)):
async def fetch_url(url, key):
- timeout = aiohttp.ClientTimeout(total=3)
+ timeout = aiohttp.ClientTimeout(total=AIOHTTP_CLIENT_TIMEOUT_OPENAI_MODEL_LIST)
try:
headers = {"Authorization": f"Bearer {key}"}
async with aiohttp.ClientSession(timeout=timeout, trust_env=True) as session:
@@ -237,9 +240,7 @@ def merge_models_lists(model_lists):
def is_openai_api_disabled():
- api_keys = app.state.config.OPENAI_API_KEYS
- no_keys = len(api_keys) == 1 and api_keys[0] == ""
- return no_keys or not app.state.config.ENABLE_OPENAI_API
+ return not app.state.config.ENABLE_OPENAI_API
async def get_all_models_raw() -> list:
diff --git a/backend/open_webui/apps/retrieval/main.py b/backend/open_webui/apps/retrieval/main.py
index c80b2011d9..e67d1df230 100644
--- a/backend/open_webui/apps/retrieval/main.py
+++ b/backend/open_webui/apps/retrieval/main.py
@@ -14,7 +14,11 @@ from typing import Iterator, Optional, Sequence, Union
from fastapi import Depends, FastAPI, File, Form, HTTPException, UploadFile, status
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
+import tiktoken
+
+from open_webui.storage.provider import Storage
+from open_webui.apps.webui.models.knowledge import Knowledges
from open_webui.apps.retrieval.vector.connector import VECTOR_DB_CLIENT
# Document loaders
@@ -47,6 +51,8 @@ from open_webui.apps.retrieval.utils import (
from open_webui.apps.webui.models.files import Files
from open_webui.config import (
BRAVE_SEARCH_API_KEY,
+ TIKTOKEN_ENCODING_NAME,
+ RAG_TEXT_SPLITTER,
CHUNK_OVERLAP,
CHUNK_SIZE,
CONTENT_EXTRACTION_ENGINE,
@@ -102,7 +108,7 @@ from open_webui.utils.misc import (
)
from open_webui.utils.utils import get_admin_user, get_verified_user
-from langchain.text_splitter import RecursiveCharacterTextSplitter
+from langchain.text_splitter import RecursiveCharacterTextSplitter, TokenTextSplitter
from langchain_community.document_loaders import (
YoutubeLoader,
)
@@ -129,6 +135,9 @@ app.state.config.ENABLE_RAG_WEB_LOADER_SSL_VERIFICATION = (
app.state.config.CONTENT_EXTRACTION_ENGINE = CONTENT_EXTRACTION_ENGINE
app.state.config.TIKA_SERVER_URL = TIKA_SERVER_URL
+app.state.config.TEXT_SPLITTER = RAG_TEXT_SPLITTER
+app.state.config.TIKTOKEN_ENCODING_NAME = TIKTOKEN_ENCODING_NAME
+
app.state.config.CHUNK_SIZE = CHUNK_SIZE
app.state.config.CHUNK_OVERLAP = CHUNK_OVERLAP
@@ -171,9 +180,9 @@ def update_embedding_model(
auto_update: bool = False,
):
if embedding_model and app.state.config.RAG_EMBEDDING_ENGINE == "":
- import sentence_transformers
+ from sentence_transformers import SentenceTransformer
- app.state.sentence_transformer_ef = sentence_transformers.SentenceTransformer(
+ app.state.sentence_transformer_ef = SentenceTransformer(
get_model_path(embedding_model, auto_update),
device=DEVICE_TYPE,
trust_remote_code=RAG_EMBEDDING_MODEL_TRUST_REMOTE_CODE,
@@ -384,18 +393,19 @@ async def get_rag_config(user=Depends(get_admin_user)):
return {
"status": True,
"pdf_extract_images": app.state.config.PDF_EXTRACT_IMAGES,
- "file": {
- "max_size": app.state.config.FILE_MAX_SIZE,
- "max_count": app.state.config.FILE_MAX_COUNT,
- },
"content_extraction": {
"engine": app.state.config.CONTENT_EXTRACTION_ENGINE,
"tika_server_url": app.state.config.TIKA_SERVER_URL,
},
"chunk": {
+ "text_splitter": app.state.config.TEXT_SPLITTER,
"chunk_size": app.state.config.CHUNK_SIZE,
"chunk_overlap": app.state.config.CHUNK_OVERLAP,
},
+ "file": {
+ "max_size": app.state.config.FILE_MAX_SIZE,
+ "max_count": app.state.config.FILE_MAX_COUNT,
+ },
"youtube": {
"language": app.state.config.YOUTUBE_LOADER_LANGUAGE,
"translation": app.state.YOUTUBE_LOADER_TRANSLATION,
@@ -434,6 +444,7 @@ class ContentExtractionConfig(BaseModel):
class ChunkParamUpdateForm(BaseModel):
+ text_splitter: Optional[str] = None
chunk_size: int
chunk_overlap: int
@@ -493,6 +504,7 @@ async def update_rag_config(form_data: ConfigUpdateForm, user=Depends(get_admin_
app.state.config.TIKA_SERVER_URL = form_data.content_extraction.tika_server_url
if form_data.chunk is not None:
+ app.state.config.TEXT_SPLITTER = form_data.chunk.text_splitter
app.state.config.CHUNK_SIZE = form_data.chunk.chunk_size
app.state.config.CHUNK_OVERLAP = form_data.chunk.chunk_overlap
@@ -539,6 +551,7 @@ async def update_rag_config(form_data: ConfigUpdateForm, user=Depends(get_admin_
"tika_server_url": app.state.config.TIKA_SERVER_URL,
},
"chunk": {
+ "text_splitter": app.state.config.TEXT_SPLITTER,
"chunk_size": app.state.config.CHUNK_SIZE,
"chunk_overlap": app.state.config.CHUNK_OVERLAP,
},
@@ -599,11 +612,10 @@ class QuerySettingsForm(BaseModel):
async def update_query_settings(
form_data: QuerySettingsForm, user=Depends(get_admin_user)
):
- app.state.config.RAG_TEMPLATE = (
- form_data.template if form_data.template != "" else DEFAULT_RAG_TEMPLATE
- )
+ app.state.config.RAG_TEMPLATE = form_data.template
app.state.config.TOP_K = form_data.k if form_data.k else 4
app.state.config.RELEVANCE_THRESHOLD = form_data.r if form_data.r else 0.0
+
app.state.config.ENABLE_RAG_HYBRID_SEARCH = (
form_data.hybrid if form_data.hybrid else False
)
@@ -648,18 +660,46 @@ def save_docs_to_vector_db(
raise ValueError(ERROR_MESSAGES.DUPLICATE_CONTENT)
if split:
- text_splitter = RecursiveCharacterTextSplitter(
- chunk_size=app.state.config.CHUNK_SIZE,
- chunk_overlap=app.state.config.CHUNK_OVERLAP,
- add_start_index=True,
- )
+ if app.state.config.TEXT_SPLITTER in ["", "character"]:
+ text_splitter = RecursiveCharacterTextSplitter(
+ chunk_size=app.state.config.CHUNK_SIZE,
+ chunk_overlap=app.state.config.CHUNK_OVERLAP,
+ add_start_index=True,
+ )
+ elif app.state.config.TEXT_SPLITTER == "token":
+ log.info(
+ f"Using token text splitter: {app.state.config.TIKTOKEN_ENCODING_NAME}"
+ )
+
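+            # Ensure the configured encoding is available; tiktoken fetches it on first use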
+ tiktoken.get_encoding(str(app.state.config.TIKTOKEN_ENCODING_NAME))
+ text_splitter = TokenTextSplitter(
+ encoding_name=str(app.state.config.TIKTOKEN_ENCODING_NAME),
+ chunk_size=app.state.config.CHUNK_SIZE,
+ chunk_overlap=app.state.config.CHUNK_OVERLAP,
+ add_start_index=True,
+ )
+ else:
+ raise ValueError(ERROR_MESSAGES.DEFAULT("Invalid text splitter"))
+
docs = text_splitter.split_documents(docs)
if len(docs) == 0:
raise ValueError(ERROR_MESSAGES.EMPTY_CONTENT)
texts = [doc.page_content for doc in docs]
- metadatas = [{**doc.metadata, **(metadata if metadata else {})} for doc in docs]
+ metadatas = [
+ {
+ **doc.metadata,
+ **(metadata if metadata else {}),
+ "embedding_config": json.dumps(
+ {
+ "engine": app.state.config.RAG_EMBEDDING_ENGINE,
+ "model": app.state.config.RAG_EMBEDDING_MODEL,
+ }
+ ),
+ }
+ for doc in docs
+ ]
# ChromaDB does not like datetime formats
# for meta-data so convert them to string.
@@ -675,8 +715,10 @@ def save_docs_to_vector_db(
if overwrite:
VECTOR_DB_CLIENT.delete_collection(collection_name=collection_name)
log.info(f"deleting existing collection {collection_name}")
-
- if add is False:
+ elif add is False:
+ log.info(
+ f"collection {collection_name} already exists, overwrite is False and add is False"
+ )
return True
log.info(f"adding to collection {collection_name}")
@@ -788,15 +830,14 @@ def process_file(
else:
# Process the file and save the content
# Usage: /files/
-
- file_path = file.meta.get("path", None)
+ file_path = file.path
if file_path:
+ file_path = Storage.get_file(file_path)
loader = Loader(
engine=app.state.config.CONTENT_EXTRACTION_ENGINE,
TIKA_SERVER_URL=app.state.config.TIKA_SERVER_URL,
PDF_EXTRACT_IMAGES=app.state.config.PDF_EXTRACT_IMAGES,
)
-
docs = loader.load(
file.filename, file.meta.get("content_type"), file_path
)
@@ -812,7 +853,6 @@ def process_file(
},
)
]
-
text_content = " ".join([doc.page_content for doc in docs])
log.debug(f"text_content: {text_content}")
@@ -1255,6 +1295,7 @@ def delete_entries_from_collection(form_data: DeleteForm, user=Depends(get_admin
@app.post("/reset/db")
def reset_vector_db(user=Depends(get_admin_user)):
VECTOR_DB_CLIENT.reset()
+ Knowledges.delete_all_knowledge()
@app.post("/reset/uploads")
@@ -1277,28 +1318,6 @@ def reset_upload_dir(user=Depends(get_admin_user)) -> bool:
print(f"The directory {folder} does not exist")
except Exception as e:
print(f"Failed to process the directory {folder}. Reason: {e}")
-
- return True
-
-
-@app.post("/reset")
-def reset(user=Depends(get_admin_user)) -> bool:
- folder = f"{UPLOAD_DIR}"
- for filename in os.listdir(folder):
- file_path = os.path.join(folder, filename)
- try:
- if os.path.isfile(file_path) or os.path.islink(file_path):
- os.unlink(file_path)
- elif os.path.isdir(file_path):
- shutil.rmtree(file_path)
- except Exception as e:
- log.error("Failed to delete %s. Reason: %s" % (file_path, e))
-
- try:
- VECTOR_DB_CLIENT.reset()
- except Exception as e:
- log.exception(e)
-
return True
diff --git a/backend/open_webui/apps/retrieval/utils.py b/backend/open_webui/apps/retrieval/utils.py
index 992d467882..153bd804ff 100644
--- a/backend/open_webui/apps/retrieval/utils.py
+++ b/backend/open_webui/apps/retrieval/utils.py
@@ -19,6 +19,7 @@ from open_webui.apps.retrieval.vector.connector import VECTOR_DB_CLIENT
from open_webui.utils.misc import get_last_user_message
from open_webui.env import SRC_LOG_LEVELS
+from open_webui.config import DEFAULT_RAG_TEMPLATE
log = logging.getLogger(__name__)
@@ -239,8 +240,13 @@ def query_collection_with_hybrid_search(
def rag_template(template: str, context: str, query: str):
- count = template.count("[context]")
- assert "[context]" in template, "RAG template does not contain '[context]'"
+ if template == "":
+ template = DEFAULT_RAG_TEMPLATE
+
+ if "[context]" not in template and "{{CONTEXT}}" not in template:
+ log.debug(
+ "WARNING: The RAG template does not contain the '[context]' or '{{CONTEXT}}' placeholder."
+ )
if "
*/ +.markdown-section p + ul { + margin-top: 0; +} + +/* Remove bottom margin of
if it is followed by a
not followed by