Tim Jaeryang Baek
5db60ca34f
Merge pull request #15903 from Hisma/marker-api-update
feat: Add configurable API URL (for self-hosting) and additional_config parameter for Datalab Marker API
2025-08-04 15:21:03 +04:00
Timothy Jaeryang Baek
6a17ba5b7a
refac: metadata handling in vectordb
2025-07-31 17:45:06 +04:00
Hisma
a99e20cc3d
add format_lines
2025-07-22 21:06:29 -04:00
Hisma
f31cc07a9d
feat: update marker api
2025-07-22 20:49:28 -04:00
Azure Wang
9aff166f83
fix: keep reranker_model config from being removed by web search config
2025-07-16 23:51:23 +08:00
Timothy Jaeryang Baek
abe280f0a3
refac/fix: reranking function
2025-07-16 13:56:02 +04:00
Timothy Jaeryang Baek
18bd83413b
refac
2025-07-14 14:05:06 +04:00
Timothy Jaeryang Baek
0013f5c1fc
refac/enh: forward user info header to reranker
2025-07-14 13:59:10 +04:00
Timothy Jaeryang Baek
87847ab31a
chore: format
2025-07-13 00:15:16 +04:00
Tim Jaeryang Baek
e3b8f700e4
Merge pull request #14264 from diwakar-s-maurya/patch-6
feat: add langchain markdown document splitter
2025-07-08 15:55:20 +04:00
Tim Jaeryang Baek
2bad7eaa07
Merge pull request #15277 from hankewyczz/bug/restore-exa-search
fix Restore exa
2025-06-25 11:04:48 +04:00
Zachar Hankewycz
45d7726ee0
Restore exa
2025-06-24 21:24:53 -04:00
zhangtyzzz
5f60b30320
add missed exa
2025-06-19 13:52:58 +08:00
Timothy Jaeryang Baek
6c54ca552a
feat: global image compression
2025-06-16 16:52:57 +04:00
Timothy Jaeryang Baek
f3cae94028
fix: bypass webloader
Co-Authored-By: WilliamGates <3852641+williamgateszhao@users.noreply.github.com>
2025-06-16 16:17:52 +04:00
Timothy Jaeryang Baek
0cd400f5ee
refac: docling picture describe params
2025-06-08 20:02:14 +04:00
Vaclav Cerny
99f05561f8
Add configuration options for picture description modes and update related components
2025-06-08 16:30:26 +02:00
Diwakar Singh Maurya
871efb4ad9
feat: add langchain markdown document splitter
2025-06-07 06:02:53 +00:00
Dave
96e9bfe0e5
feat: add Perplexity model and search context usage configuration options
2025-06-03 00:19:08 +02:00
Timothy Jaeryang Baek
e1e2c096e2
refac: PLEASE follow existing convention
2025-05-30 00:34:18 +04:00
Tim Jaeryang Baek
ff353578db
Merge pull request #14370 from daw/feat/add-azure-openai-embeddings-option
feat:Add Azure OpenAI embedding support
2025-05-30 00:18:55 +04:00
Tim Jaeryang Baek
042c37ea34
Merge pull request #14311 from Hisma/marker-api-content-extraction
feat: Marker api content extraction support
2025-05-29 02:21:13 +04:00
Timothy Jaeryang Baek
4461122a0e
fix: /api/v1/retrieval/query/collection endpoint
2025-05-28 18:45:47 +04:00
Hisma
a9405cc101
feat: Marker api content extraction support
2025-05-27 00:44:07 -04:00
Tim Jaeryang Baek
e663b90a9f
Merge pull request #14069 from Ithanil/bm25_weight
feat: Configurable weight for BM25Retriever during hybrid search
2025-05-24 01:13:03 +04:00
Jan Kessler
e70dd33233
rename BM25_WEIGHT -> HYBRID_BM25_WEIGHT
2025-05-23 22:06:44 +02:00
Timothy Jaeryang Baek
2eca6f6414
feat: bypass web loader in web search
Co-Authored-By: Perry Li <peiyaoli@mail.nankai.edu.cn>
Co-Authored-By: WilliamGates <3852641+williamgateszhao@users.noreply.github.com>
2025-05-23 02:30:35 +04:00
Jan Kessler
308d8ac04a
make bm25_weight a regular parameter of query_doc.. / get_sources_from_files functions
2025-05-20 11:46:32 +02:00
Jan Kessler
b5ddaf6417
make weight for bm25 retriever in hybrid search ui-configurable
2025-05-20 10:39:31 +02:00
Derek Wischusen
42be1f956a
Add Azure OpenAI embedding support
2025-05-19 22:58:04 -04:00
Timothy Jaeryang Baek
2bd7db12a2
enh: ALLOWED_FILE_EXTENSIONS ui
2025-05-16 21:05:52 +04:00
Timothy Jaeryang Baek
8732b64b6b
feat: external document loader support
2025-05-14 22:28:40 +04:00
Timothy Jaeryang Baek
de70d0cb64
feat: docling do picture description support
2025-05-14 21:26:49 +04:00
hwzhuhao
6f869ded43
feat:Add vector type and vector factory class for vector database integration
2025-05-14 21:30:50 +08:00
Timothy Jaeryang Baek
6f635d8b7d
refac
2025-05-10 19:16:09 +04:00
Timothy Jaeryang Baek
be912f1529
refac
2025-05-10 18:29:04 +04:00
Timothy Jaeryang Baek
d5fd3b3600
feat: external reranker
Co-Authored-By: Brendan Campbell <20541191+bcambs09@users.noreply.github.com>
2025-05-10 18:25:20 +04:00
Timothy Jaeryang Baek
34ec10a78c
refac: web search performance
Co-Authored-By: Mabeck <64421281+mmabeck@users.noreply.github.com>
2025-05-10 17:54:41 +04:00
tth37
c95a65a4bd
fix: Duplicate web search urls
2025-05-09 20:06:35 +08:00
Timothy Jaeryang Baek
b50dcb1862
refac: remove duplicate urls
2025-05-07 22:25:18 +04:00
Athanasios Oikonomou
657162e96d
feat(ocr): add support for Docling OCR engine and language configuration
This commit adds support for configuring the OCR engine and language(s) for Docling.
Configuration can be set via the environment variables `DOCLING_OCR_ENGINE` and `DOCLING_OCR_LANG`, or through the UI.
Fixes #13133
2025-05-03 00:32:06 +03:00
Tim Jaeryang Baek
e87f2669fa
Merge pull request #13191 from tth37/feat_firecrawl_search_engine
feat: Add Firecrawl search engine
2025-04-29 08:38:28 -07:00
Tim Jaeryang Baek
7b863465a9
Merge pull request #13311 from stephen304/yacy-support
feat: Yacy search support
2025-04-29 08:35:10 -07:00
Stephen Smith
240d91d38d
Add yacy config for user/pass, automatically add yacy json api path
2025-04-26 22:28:30 -04:00
Stephen Smith
0f73b96616
first pass at yacy support copied from searxng
2025-04-26 14:07:13 -04:00
tth37
92dbeb1939
feat: Add Firecrawl search engine
2025-04-24 14:57:28 +08:00
Timothy Jaeryang Baek
732d7aee70
enh: sentence transformers env vars
Co-Authored-By: DrZoidberg09 <96449693+drzoidberg09@users.noreply.github.com>
2025-04-24 01:55:18 +09:00
Timothy Jaeryang Baek
09874ab83d
fix: FireCrawlLoader
2025-04-24 01:40:34 +09:00
Timothy Jaeryang Baek
43efff0fe6
refac
2025-04-22 23:22:50 +09:00
Tim Jaeryang Baek
87844a8042
Merge pull request #12822 from tth37/feat_external_search_loader
feat: Support for Self-Hosted/External Web Search/Loader Engines
2025-04-18 23:51:27 -07:00