Commit graph

144 commits

Author SHA1 Message Date
Tim Jaeryang Baek
ff353578db
Merge pull request #14370 from daw/feat/add-azure-openai-embeddings-option
feat: Add Azure OpenAI embedding support
2025-05-30 00:18:55 +04:00
Timothy Jaeryang Baek
cb4299eb98 refac 2025-05-29 02:33:40 +04:00
Hisma
19bb3589ee fix: add Datalab Marker API to Content Extraction Engine Dropdown 2025-05-27 02:24:53 -04:00
Hisma
a9405cc101 feat: Marker api content extraction support 2025-05-27 00:44:07 -04:00
Timothy Jaeryang Baek
51ab02f3af chore: format 2025-05-24 02:13:46 +04:00
Tim Jaeryang Baek
e663b90a9f
Merge pull request #14069 from Ithanil/bm25_weight
feat: Configurable weight for BM25Retriever during hybrid search
2025-05-24 01:13:03 +04:00
Jan Kessler
e70dd33233
rename BM25_WEIGHT -> HYBRID_BM25_WEIGHT 2025-05-23 22:06:44 +02:00
Timothy Jaeryang Baek
82716f3789 refac 2025-05-20 19:39:18 +04:00
Jan Kessler
b5ddaf6417
make weight for bm25 retriever in hybrid search ui-configurable 2025-05-20 10:39:31 +02:00
Derek Wischusen
42be1f956a Add Azure OpenAI embedding support 2025-05-19 22:58:04 -04:00
Timothy Jaeryang Baek
8f4104fb7a refac 2025-05-19 00:13:03 +04:00
Timothy Jaeryang Baek
2bd7db12a2 enh: ALLOWED_FILE_EXTENSIONS ui 2025-05-16 21:05:52 +04:00
Jesper Kristensen
84e0605835
Cleaning up usage of console log in front end 2025-05-15 21:53:07 +02:00
Timothy Jaeryang Baek
8732b64b6b feat: external document loader support 2025-05-14 22:28:40 +04:00
Timothy Jaeryang Baek
de70d0cb64 feat: docling do picture description support 2025-05-14 21:26:49 +04:00
Timothy Jaeryang Baek
a515a5df1a refac 2025-05-10 18:38:30 +04:00
Timothy Jaeryang Baek
ba72d4625f refac 2025-05-10 18:36:45 +04:00
Timothy Jaeryang Baek
3dc34c2402 feat: external reranker settings ui 2025-05-10 18:33:52 +04:00
Timothy Jaeryang Baek
be912f1529 refac 2025-05-10 18:29:04 +04:00
Timothy Jaeryang Baek
aefd5d9557 chore: format 2025-05-03 23:48:12 +04:00
Athanasios Oikonomou
437804a2f8 fix: update validation logic for Docling OCR engine and language requirements
Both Docling OCR Engine and Language(s) must be provided or both left empty.
2025-05-03 08:12:58 +03:00
Athanasios Oikonomou
4801430ad2 fix: correct condition for Docling OCR engine and language validation
Both must have value or both must be empty.
2025-05-03 08:02:00 +03:00
Athanasios Oikonomou
657162e96d feat(ocr): add support for Docling OCR engine and language configuration
This commit adds support for configuring the OCR engine and language(s) for Docling.
Configuration can be set via the environment variables `DOCLING_OCR_ENGINE` and `DOCLING_OCR_LANG`, or through the UI.

Fixes #13133
2025-05-03 00:32:06 +03:00
Timothy Jaeryang Baek
48a23ce3fe refac: web/rag config 2025-04-12 16:33:36 -07:00
hurxxxx
7c828015d3 fix: ReindexKnowledgeFilesConfirmDialog 2025-04-08 00:53:11 +09:00
hurxxxx
4e545d432b feat: add new admin func - reindex knowledge files 2025-04-08 00:44:10 +09:00
Patrick Wachter
1ac6879268
Add Mistral OCR integration and configuration support 2025-04-01 14:24:33 +02:00
Timothy Jaeryang Baek
737f41dd2e refac 2025-03-28 13:18:44 -07:00
Timothy Jaeryang Baek
402d32ccfd refac 2025-03-28 13:17:43 -07:00
Timothy Jaeryang Baek
0413c747a9 refac: hide hybrid option with full context mode 2025-03-28 13:16:56 -07:00
Timothy Jaeryang Baek
4a79320253 chore: format 2025-03-27 01:40:28 -07:00
Timothy Jaeryang Baek
9d834a8e90
Merge branch 'dev' into k_reranker 2025-03-26 20:50:31 -07:00
Timothy Jaeryang Baek
3186aeac08 chore: format 2025-03-18 06:39:37 -07:00
Fabio Polito
0aa42615f9 Merge remote-tracking branch 'upstream/dev' into docling_context_extraction_engine
merge upstream
2025-03-08 18:52:51 +00:00
orenzhang
72ea6dd9f1
refactor(lint): code lint 2025-03-07 19:59:09 +08:00
orenzhang
92fb1109b6
i18n(common): add i18n translation 2025-03-06 20:16:34 +08:00
Marko Henning
41a4cf7106 Added new k_reranker parameter 2025-03-06 10:47:57 +01:00
Fabio Polito
2982893d0d fix: format fixes 2025-03-06 00:39:00 +00:00
Fabio Polito
9aa407dbd2 feat: merge with main 2025-03-05 22:04:34 +00:00
Timothy Jaeryang Baek
57010901e6 enh: bypass embedding and retrieval 2025-02-26 15:42:19 -08:00
Timothy Jaeryang Baek
1c2e36f1b7 refac 2025-02-26 13:59:08 -08:00
Timothy Jaeryang Baek
fa91d83ac3 refac: documents settings ui 2025-02-26 13:48:56 -08:00
Timothy Jaeryang Baek
9f27d7710b chore: format 2025-02-25 01:46:08 -08:00
hurxxxx
4cc3102758 feat: onedrive file picker integration 2025-02-25 01:47:07 +09:00
Timothy Jaeryang Baek
ab1b910d80
Merge pull request #10486 from Micca/feature/document_intelligence_support
Feat: Adding Support for Azure AI Document Intelligence for Content Extraction (Revised)
2025-02-21 10:56:18 -08:00
Timothy Jaeryang Baek
81715f6553 enh: RAG full context mode 2025-02-18 21:14:58 -08:00
Timothy Jaeryang Baek
e3fa48b6ce chore: tailwind v4 migration 2025-02-15 19:27:25 -08:00
Fabio Polito
2419ef06a0 feat: docling support for document preprocessing 2025-02-14 12:08:03 +00:00
Mazurek Michal
35f3824932 feat: Implement Document Intelligence as Content Extraction Engine 2025-02-07 13:44:47 +01:00
Timothy Jaeryang Baek
a863f98c53 refac: toast error 2025-01-20 22:41:32 -08:00
Timothy Jaeryang Baek
f8269de947 fix 2024-12-24 20:10:52 -07:00
Timothy Jaeryang Baek
50f36a5262 refac: styling 2024-12-19 20:56:16 -08:00
Timothy Jaeryang Baek
0f6d302760 refac 2024-12-18 18:04:56 -08:00
Taylor Wilsdon
1120f4d09a npm run format 2024-12-18 13:32:46 -05:00
Taylor Wilsdon
0dc75363aa Add configurable Google Drive toggle in the Documents admin section along with necessary config scaffolding 2024-12-18 13:25:57 -05:00
Taylor Wilsdon (aider)
5c149c3aa2 style: Align Google Drive switch to the right side of text 2024-12-18 13:24:13 -05:00
Taylor Wilsdon
d43ca803ca feat: Add Google Drive integration toggle to document settings 2024-12-18 13:24:11 -05:00
Timothy Jaeryang Baek
20321e5271 refac: ollama setting for rag 2024-11-18 14:19:56 -08:00
Timothy Jaeryang Baek
227cca35e8 enh: knowledge access control 2024-11-16 16:51:55 -08:00
Timothy Jaeryang Baek
f9412f72f1 refac: styling 2024-11-16 01:54:40 -08:00
Timothy Jaeryang Baek
4eb8b1450c refac 2024-11-15 22:09:06 -08:00
Timothy J. Baek
47e377967e refac: styling 2024-10-21 00:05:27 -07:00
Timothy J. Baek
4b357a7b62 refac: styling 2024-10-20 18:49:30 -07:00
Timothy J. Baek
e8c629a2e2 refac: styling 2024-10-19 23:17:47 -07:00
Timothy J. Baek
eef9045dcc chore: format 2024-10-15 09:22:03 -07:00
Timothy J. Baek
586e005f0f enh: token text splitter support 2024-10-13 04:24:13 -07:00
Timothy J. Baek
5ffd216fca refac 2024-10-13 03:02:02 -07:00
Peter De-Ath
885b9f1ece refactor: Update GenerateEmbeddingsForm to support batch processing
refactor: Update embedding batch size handling in RAG configuration

refactor: add query_doc query caching

refactor: update logging statements in generate_chat_completion function

change embedding_batch_size to Optional
2024-10-08 00:04:35 +01:00
Timothy J. Baek
79c005a041 refac: deprecate docs_dir 2024-10-04 18:22:55 -07:00
Timothy J. Baek
a6c797d4c2 refac: process docs dir 2024-10-04 17:22:00 -07:00
Timothy J. Baek
08969ecf89 refac: rename projects -> knowledge 2024-10-01 22:45:04 -07:00
Timothy J. Baek
c5eb0a9732 refac: documents -> projects 2024-10-01 17:35:35 -07:00
Timothy J. Baek
af57a2c153 refac 2024-09-28 02:23:09 +02:00
Timothy J. Baek
1b349016ff refac 2024-09-28 01:36:35 +02:00
Timothy J. Baek
cb9e76c7f9 refac: default rag template 2024-09-16 12:01:04 +02:00
Timothy J. Baek
6a21a77ee9 refac 2024-08-27 17:05:24 +02:00
Timothy J. Baek
ef28330c1a refac: do NOT change default behaviour in a PR 2024-08-27 15:56:47 +02:00
Timothy J. Baek
09cba5b87a refac: rm sub standard code 2024-08-27 15:51:40 +02:00
Clivia
b6da4baa97 💄 Limit the size and number of uploaded files
2024-08-26 23:36:13 +08:00
Timothy J. Baek
7ef5aa520c chore: format 2024-08-13 11:12:35 +01:00
Timothy J. Baek
77b2d2dbee refac: styling 2024-07-27 23:34:29 +01:00
Timothy J. Baek
a23146ebd1 refac: styling 2024-07-08 20:10:00 -07:00
Timothy J. Baek
b62d2a9b28 refac 2024-07-01 17:15:10 -07:00
Timothy J. Baek
a392865615 refac 2024-07-01 17:11:09 -07:00
Nicko van Someren
7aa35a3757 Added HTML and TypeScript UI components to support configuration of text extraction engine.
Updated RAG /config and /config/update endpoints to support UI updates.

Fixed .dockerignore to prevent Python venv from being copied into Docker image.
2024-07-01 12:10:59 -06:00
Jun Siang Cheah
f8f6943128 refac: use new SensitiveInput component 2024-06-25 20:15:29 +08:00
Jun Siang Cheah
d5b91fb084 feat: hide all API keys by default in admin settings 2024-06-25 19:53:22 +08:00
Timothy J. Baek
d5a1030000 refac: uploads delete 2024-06-18 15:20:04 -07:00
Timothy J. Baek
91cec11500 refac 2024-06-15 04:02:20 -06:00
Timothy J. Baek
529fcaa5c9 fix: document query save 2024-06-12 11:07:04 -07:00
Timothy J. Baek
fa9835a7ad refac: styling 2024-06-09 12:08:16 -07:00
Timothy J. Baek
9053bfdadf refac: styling 2024-06-09 02:41:52 -07:00
Timothy J. Baek
4a7d3a076c fix: styling 2024-06-09 02:29:56 -07:00
Timothy J. Baek
8198807fc9 refac: document settings > admin settings 2024-06-09 02:28:52 -07:00
Renamed from src/lib/components/documents/Settings/General.svelte