This commit fixes an issue where Retrieval-Augmented Generation (RAG)
queries were still generated even when every attached file was set to
'full context' mode. Generating them was wasted work, because the full
content of the files was already available to the model.
The `chat_completion_files_handler` in `backend/open_webui/utils/middleware.py`
has been updated to:
- Check if all attached files have the `context: 'full'` property.
- Skip the `generate_queries` step if all files are in full context mode.
- Pass a `full_context=True` flag to the `get_sources_from_items`
function to ensure it fetches the entire document content instead of
performing a vector search.
With this change, RAG queries are generated only when at least one
attached file is not in full context mode, which avoids a redundant
query-generation step.
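
The handler logic described above can be sketched roughly as follows.
This is a minimal illustration, not the actual code in `middleware.py`:
`all_files_full_context` and `plan_retrieval` are hypothetical helper
names, the `generate_queries` argument is a stand-in callable for the
real query-generation step, and treating an empty file list as "not
full context" is an assumption.

```python
from typing import Callable


def all_files_full_context(files: list[dict]) -> bool:
    """Return True only if there is at least one file and every file
    carries context == 'full' (hypothetical helper)."""
    # all() is True for an empty iterable, so require a non-empty list.
    return bool(files) and all(f.get("context") == "full" for f in files)


def plan_retrieval(files: list[dict], generate_queries: Callable[[], list[str]]) -> dict:
    """Decide whether queries are needed and which flag to pass on.

    `generate_queries` is a stand-in for the real query-generation step.
    """
    full_context = all_files_full_context(files)
    # Skip query generation entirely when every file is in full context mode.
    queries = [] if full_context else generate_queries()
    # `full_context` would then be forwarded to the source-fetching function.
    return {"queries": queries, "full_context": full_context}
```

If a single file lacks `context: 'full'`, the sketch falls back to
generating queries, which matches the "all files" condition above.
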
- Fix file handle memory leak in download_file_stream by properly closing and reopening files
- Add requests.Session context manager for proper HTTP connection cleanup
- Remove unnecessary file.seek(0) after file reopening
- Add timeout to prevent hanging connections
This prevents memory accumulation during large file downloads and ensures
proper resource cleanup in all scenarios.
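
The cleanup pattern described above might look roughly like the sketch
below. It is not the actual `download_file_stream` implementation:
`download_to_path` and `read_back` are hypothetical names, the
`session_factory` parameter exists only so the sketch can be exercised
without network access, and the timeout and chunk size values are
assumptions.

```python
def download_to_path(url, dest, session_factory=None, timeout=(10, 60), chunk_size=8192):
    """Stream `url` into the file at `dest`; return the bytes written."""
    if session_factory is None:
        import requests  # third-party; imported lazily for offline testing

        session_factory = requests.Session

    written = 0
    # Session and response are both context managers, so the HTTP
    # connection is released even if writing fails part-way through.
    with session_factory() as session:
        # timeout=(connect, read) keeps a stalled server from hanging us.
        with session.get(url, stream=True, timeout=timeout) as response:
            response.raise_for_status()
            # The write handle is closed as soon as this block exits.
            with open(dest, "wb") as out:
                for chunk in response.iter_content(chunk_size=chunk_size):
                    if chunk:  # skip empty keep-alive chunks
                        written += out.write(chunk)
    return written


def read_back(dest):
    """Reopen the downloaded file for reading."""
    # A freshly opened handle starts at offset 0, so no seek(0) is needed.
    with open(dest, "rb") as fh:
        return fh.read()
```

Writing through one handle, closing it, and reopening for reading
means no handle outlives its `with` block.
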
Signed-off-by: Sihyeon Jang <sihyeon.jang@navercorp.com>
- Replace inefficient memory-based filtering with database-level filtering
- Add proper access control conditions to SQL query
- Reduce memory usage by filtering at database level instead of loading all notes
- Maintain access control validation with post-filtering for complex cases
This change significantly improves performance for users with many notes
by reducing both the number of database queries and memory usage.
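
The combination of database-level filtering and post-filtering could be
sketched as below. This uses the standard-library `sqlite3` module and a
made-up `note` table, not the project's real models: the column names,
the JSON shape of `access_control`, the rule that a NULL
`access_control` means "readable by everyone", and the function name
`readable_notes` are all assumptions, and group-based access is left out.

```python
import json
import sqlite3


def readable_notes(conn: sqlite3.Connection, user_id: str) -> list[str]:
    """Return ids of notes `user_id` may read (hypothetical schema)."""
    # Database-level filter: fetch only rows the user owns, rows with no
    # access rule, or rows whose rule mentions the user id at all.
    rows = conn.execute(
        """
        SELECT id, user_id, access_control
        FROM note
        WHERE user_id = :uid
           OR access_control IS NULL
           OR access_control LIKE :needle
        ORDER BY id
        """,
        {"uid": user_id, "needle": f'%"{user_id}"%'},
    ).fetchall()

    allowed = []
    for note_id, owner_id, access_control in rows:
        # Post-filter: the LIKE match is coarse (the id may appear only
        # under "write"), so re-check the read list in Python.
        if owner_id != user_id and access_control is not None:
            rule = json.loads(access_control)
            if user_id not in rule.get("read", {}).get("user_ids", []):
                continue
        allowed.append(note_id)
    return allowed
```

Only candidate rows leave the database, and the Python loop re-checks
the cases a SQL pattern match cannot decide.
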
Signed-off-by: Sihyeon Jang <sihyeon.jang@navercorp.com>