51 commits

Author SHA1 Message Date
_00_
0613563644
FIX: STT default whisper transcription language

Fix the transcription language used by the default Whisper engine: when WHISPER_LANGUAGE is set as an environment variable, use it even if a language is detected in the file's metadata.
If a language is set through an environment variable for transcriptions, it is assumed to be the preferred one and should be used for that purpose.

It would be advisable to make this variable configurable in the UI.
2025-07-22 16:47:06 +02:00
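The precedence described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `pick_transcription_language` and its parameters are hypothetical, and only the `WHISPER_LANGUAGE` variable name comes from the message.

```python
import os


def pick_transcription_language(metadata_language=None, environ=os.environ):
    """Return the language to pass to Whisper, or None to auto-detect.

    Hypothetical helper: WHISPER_LANGUAGE, when set and non-empty, wins
    over any language found in the file's metadata.
    """
    env_language = environ.get("WHISPER_LANGUAGE", "").strip()
    if env_language:
        return env_language
    # Fall back to the metadata language; None means "let Whisper detect".
    return metadata_language or None
```

Passing `environ` explicitly keeps the sketch easy to test without touching the real process environment.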
Timothy Jaeryang Baek
19f1286cc7 fix: emoji call 2025-07-16 15:38:48 +04:00
Athanasios Oikonomou
96758176cc fix: don't over quote forwarded headers
The fix introduced in #15035 over-quotes headers.

E.g. emails are shown as user%40example.com instead of user@example.com
E.g. names are shown as First%20Last instead of First Last

We are also spending time quoting ids and roles when that is not required.

Keep quoting only on the user name, which was the original source of the problem according to the discussion
https://github.com/open-webui/open-webui/discussions/14391

Also add space to the safe characters, in order to remove %20 from names.
2025-07-10 22:08:28 +03:00
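A small sketch of the quoting behaviour described in this commit message, using the standard library's `urllib.parse.quote`. The function name `build_forward_headers` and the header names are illustrative assumptions, not the names used in the project.

```python
from urllib.parse import quote


def build_forward_headers(name, user_id, email, role):
    """Build forwarded user headers (hypothetical header names).

    Only the user name is percent-encoded, with space kept as a safe
    character so that "First Last" is not turned into "First%20Last".
    Ids, emails and roles are passed through unquoted.
    """
    return {
        "X-User-Name": quote(name, safe=" "),
        "X-User-Id": user_id,
        "X-User-Email": email,
        "X-User-Role": role,
    }
```

Quoting the name still turns non-ASCII characters into percent-encoded UTF-8 bytes, which is what keeps the header value ASCII-only.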
Timothy Jaeryang Baek
0a1f9966ef refac: audio error handling 2025-07-06 14:20:38 +04:00
Timothy Jaeryang Baek
6186bbf337 refac/fix: stt supported type 2025-06-18 14:01:14 +04:00
priten
f7920df870 Fix non-ascii error issue on ENABLE_FORWARD_USER_INFO_HEADERS 2025-06-16 12:33:11 -05:00
Timothy Jaeryang Baek
72df23ed79 refac 2025-06-16 17:24:55 +04:00
Timothy Jaeryang Baek
7a1afa9c66 feat: custom stt content type
Co-Authored-By: Bryan Berns <berns@uwalumni.com>
2025-06-16 16:13:40 +04:00
Timothy Jaeryang Baek
8258dfb5af enh: enable deepgram smart_format 2025-06-16 12:34:01 +04:00
Timothy Jaeryang Baek
036ce12dd9 doc: changelog 2025-05-30 01:14:38 +04:00
Romain Dauby
b12a493fe5 fix: only trust codec_name for audio conversion
Some files have a .wav extension but use a codec that is incompatible with OpenAI
2025-05-29 16:57:23 -04:00
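The idea of this commit, deciding on conversion from the codec rather than the file extension, might look like the sketch below. The `ACCEPTED_CODECS` set and the `needs_conversion` helper are hypothetical; the commit message does not list which codecs are accepted.

```python
# Hypothetical allow-list: the real set of accepted codecs is not given
# in the commit message, so these names are placeholders.
ACCEPTED_CODECS = {"mp3", "pcm_s16le"}


def needs_conversion(file_name, codec_name):
    """Decide whether an audio file should be converted before upload.

    The file name (and its extension) is deliberately ignored: a file
    called "clip.wav" can still contain a codec that is not accepted.
    """
    return codec_name not in ACCEPTED_CODECS
```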
Timothy Jaeryang Baek
baaa285534 feat: user stt language 2025-05-24 00:36:30 +04:00
Timothy Jaeryang Baek
73e64fe7fb refac: audio upload handling 2025-05-19 02:52:48 +04:00
Timothy Jaeryang Baek
b280f828b0 enh: very long audio transcription 2025-05-17 02:51:28 +04:00
Timothy Jaeryang Baek
b143c71da2 refac: AIOHTTP_CLIENT_SESSION_SSL 2025-05-14 23:33:52 +04:00
Timothy Jaeryang Baek
549989e9ec refac 2025-05-10 19:04:40 +04:00
Timothy Jaeryang Baek
827326e1a2 refac: audio transcription issue 2025-05-08 22:57:48 +04:00
Timothy Jaeryang Baek
bfa5550cc3 refac: openai already supports webm audio 2025-05-08 22:44:32 +04:00
Tim Jaeryang Baek
2a4dfc02a2
Merge pull request #13540 from NoMoreFood/dev
feat: Azure TTS Allow Base URL
2025-05-07 00:49:57 +04:00
Bryan Berns
5aabe21cbe Add Custom Azure TTS URL 2025-05-05 22:08:48 -04:00
Timothy Jaeryang Baek
7b36466c1c refac: audio transcribe supported filetype 2025-05-05 23:42:56 +04:00
Timothy Jaeryang Baek
4cfb99248d chore: format 2025-05-03 23:48:24 +04:00
Tim Jaeryang Baek
7b014e44ee
Merge pull request #13376 from Thaniel94/add-whisper-language-constraint
feat: Added WHISPER_LANGUAGE env variable
2025-05-02 03:08:00 -07:00
nathaniel
ef7acfbf3d WHISPER_LANGUAGE no longer a "PersistentConfig" variable (Was not appropriate with how WHISPER_LANGUAGE is currently configured). 2025-05-01 21:33:57 +01:00
Bryan Berns
6c8a9d000e Azure STT Allow Base URL & Max Speaker Setting 2025-04-30 08:51:01 -04:00
nathaniel
1efa708f83 Added WHISPER_LANGUAGE env variable. If set to a language's ISO 639-1 code, constrains Whisper's STT to that language. Detects language as normal if unset 2025-04-27 05:58:06 +01:00
Timothy Jaeryang Baek
e7332fd6fe refac 2025-04-13 23:39:38 -07:00
Tom
24367d459b Enable vad_filter to improve quality of transcription in faster-whisper model. 2025-04-13 13:03:57 +01:00
Timothy Jaeryang Baek
bde89fd29e refac: audio 2025-04-12 18:40:09 -07:00
Timothy Jaeryang Baek
91a455a284 chore: format 2025-04-12 16:35:11 -07:00
Tim Jaeryang Baek
36ac81b229
Merge pull request #12727 from decent-engineer-decent-datascientist/main
feat: add Azure AI Speech STT provider
2025-04-10 16:50:40 -07:00
priten
9a55257c5b feat: add Azure AI Speech STT provider
- Add Azure STT configuration variables for API key, region and locales
- Implement Azure STT transcription endpoint with 200MB file size limit
- Update Audio settings UI to include Azure STT configuration fields
- Handle Azure API responses and error cases consistently
2025-04-10 15:38:59 -05:00
Timothy Jaeryang Baek
05aa9c6d9c refac 2025-04-10 12:27:11 -07:00
Thomas Rehn
4731e0d0e3 fix: convert webm to wav for OpenAI transcription endpoint 2025-04-10 09:00:51 +02:00
Thomas Rehn
d99a883867 fix: convert ogg to wav for OpenAI transcription endpoint 2025-04-08 15:04:04 +02:00
Hermógenes Oliveira
e936d7b53d fix: audio api endpoint filetype check
RFC2046 allows the Content-Type field to have additional parameters
after the main type/subtype information (Section 1).

Following RFC4281, many applications put codec information inside
parameters in the Content-Type. This is especially common for formats
that support many codecs, such as Ogg (RFC5334, Section 4).

The `/api/audio/transcriptions` endpoint is currently rejecting files
that contain parameters in the Content-Type field with a bad request
error.

This commit changes the current check in order to accept any
Content-Type field that begins with a supported type/subtype as listed
in the `supported_filetypes` tuple.

Since Content-Type here is provided by the user, I believe this check
is meant to prevent honest mistakes, like posting a PDF to an audio
processing endpoint, not as a security measure against possibly
malicious use. Therefore, I think it's OK not to validate the rest of
the field.
2025-03-08 18:03:30 -03:00
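The relaxed check described in this commit message can be sketched with `str.startswith`, which accepts a tuple of prefixes. The tuple name `supported_filetypes` comes from the message; its contents and the helper name `is_supported_audio` are assumptions made for illustration.

```python
# Example values only; the actual contents of the tuple are not shown
# in the commit message.
supported_filetypes = ("audio/mpeg", "audio/wav", "audio/ogg", "audio/x-m4a")


def is_supported_audio(content_type):
    """Accept any Content-Type that begins with a supported type/subtype.

    Parameters after the type/subtype (e.g. "; codecs=opus") are not
    validated, matching the intent of catching honest mistakes only.
    """
    return (content_type or "").startswith(supported_filetypes)
```

With this check, `audio/ogg; codecs=opus` is accepted while `application/pdf` is still rejected.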
tidely
b15814c42f chore: remove unnecessary Path conversions
Remove unnecessary `pathlib.Path` conversions. (CACHE_DIR and DATA_DIR)

Use the `/` path-joining operator so that platform-specific path separators are used (Windows: \\, Unix: /)
2025-03-04 19:53:52 +02:00
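As a brief illustration of the `/` joining operator mentioned in this commit message, the sketch below uses the pure path classes from `pathlib` so that both separator styles can be shown on any platform. The helper name `speech_cache_dir` and the directory names are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def speech_cache_dir(cache_dir):
    """Join sub-directories onto an existing path object with `/`.

    No str() or Path() round-trip is needed: the operator returns a path
    of the same flavour and picks the right separator when rendered.
    """
    return cache_dir / "audio" / "speech"


posix_dir = speech_cache_dir(PurePosixPath("/data/cache"))
windows_dir = speech_cache_dir(PureWindowsPath("C:/data/cache"))
```

In application code the same expression works on a regular `pathlib.Path`, which resolves to the flavour of the running platform.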
Timothy Jaeryang Baek
3be5e3129b
Merge pull request #10752 from NovoNordisk-OpenSource/yvedeng/standardize-logging
refactor: replace print statements with logging
2025-02-25 10:53:02 -08:00
Yifang Deng
0e5d5ecb81
refactor: replace print statements with logging for better error tracking 2025-02-25 15:53:55 +01:00
Timothy Jaeryang Baek
613a087387 refac 2025-02-21 10:55:03 -08:00
Synergyst
f789ad59a9
Update audio.py
Removed original code that was commented out
2025-02-21 04:47:46 -06:00
Coleton M
cdf620e6ee Update audio.py to fetch custom URL voices and models 2025-02-21 04:41:45 -06:00
Timothy Jaeryang Baek
eeb00a5ca2 chore: format 2025-02-20 01:01:29 -08:00
Liu Yue
90d9cdacfa
fix: respect proxy and timeout settings in audio-related aiohttp requests 2025-02-20 14:55:45 +08:00
Timothy Jaeryang Baek
60095598ec chore: format 2025-02-09 22:20:47 -08:00
Tristan Morris
5df474abb9 Add support for Deepgram STT 2025-02-02 08:12:13 -06:00
Timothy Jaeryang Baek
8b6d03e430 fix: elevenlabs audio 2024-12-26 12:54:31 -08:00
Timothy Jaeryang Baek
70de5cf7b8 fix: audio 2024-12-19 16:18:54 -08:00
Timothy Jaeryang Baek
87d695caad Update audio.py 2024-12-11 04:47:35 -08:00
Timothy Jaeryang Baek
df0cdd9f3c wip 2024-12-11 04:37:47 -08:00