FIX: STT default whisper transcription language
Fix the transcription language used by the default whisper engine: if the WHISPER_LANGUAGE environment variable is set, use it, even when a language is detected in the file's metadata.
A language set through an environment variable for transcriptions is an explicit choice, so it should take precedence over the detected one.
It would also be advisable to make this variable configurable in the UI.
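The precedence described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper name `pick_transcription_language` and its `env` parameter are hypothetical, and only the `WHISPER_LANGUAGE` variable name comes from this change.

```python
import os
from typing import Mapping, Optional


def pick_transcription_language(
    detected: Optional[str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the language that should be passed to the transcriber.

    WHISPER_LANGUAGE wins whenever it is set to a non-empty value.
    Otherwise fall back to the language detected from the file's
    metadata, which may be None (leave detection to the engine).
    """
    # Hypothetical helper: `env` is injectable only to keep it testable.
    env = os.environ if env is None else env
    configured = (env.get("WHISPER_LANGUAGE") or "").strip()
    return configured or detected
```

With `WHISPER_LANGUAGE=es` set, a file whose metadata says `en` would still be transcribed as `es`; with the variable unset or blank, the detected value is used unchanged.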
The fix introduced in #15035 over-quotes headers.
E.g. emails are shown as user%40example.com instead of user@example.com.
E.g. names are shown as First%20Last instead of First Last.
We are also spending time quoting IDs and roles when it is not required.
Keep quoting only on the user name, which is where the problem originally appeared according to the discussion:
https://github.com/open-webui/open-webui/discussions/14391
Also add the space to the safe characters, in order to remove %20 from names.
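A minimal sketch of the intended quoting behaviour, using `urllib.parse.quote` from the standard library. The function name `build_user_headers` and the header names are assumptions made for illustration and may not match the code touched by this change.

```python
from urllib.parse import quote


def build_user_headers(name: str, user_id: str, email: str, role: str) -> dict:
    """Build forwarded user headers, quoting only the user name.

    Hypothetical helper; header names are assumed for illustration.
    """
    return {
        # Percent-encode non-ASCII characters in the name, but keep
        # spaces readable by listing " " as a safe character.
        "X-OpenWebUI-User-Name": quote(name, safe=" "),
        # IDs, emails and roles are passed through unquoted.
        "X-OpenWebUI-User-Id": user_id,
        "X-OpenWebUI-User-Email": email,
        "X-OpenWebUI-User-Role": role,
    }
```

Under these assumptions, `First Last` stays `First Last`, `user@example.com` is no longer turned into `user%40example.com`, and a name such as `José` is still encoded as `Jos%C3%A9`.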
- Add Azure STT configuration variables for API key, region and locales
- Implement Azure STT transcription endpoint with 200MB file size limit
- Update Audio settings UI to include Azure STT configuration fields
- Handle Azure API responses and error cases consistently
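As an illustration of the file size limit mentioned in the list above, a size check could look like the sketch below. The names `MAX_FILE_SIZE`, `FileTooLargeError` and `check_upload_size` are hypothetical, and reading "200MB" as 200 MiB is an assumption.

```python
# Assumption: "200MB" is interpreted as 200 MiB.
MAX_FILE_SIZE = 200 * 1024 * 1024


class FileTooLargeError(ValueError):
    """Raised when an upload exceeds the configured size limit."""


def check_upload_size(data: bytes, limit: int = MAX_FILE_SIZE) -> int:
    """Return the payload size in bytes, or raise if it exceeds `limit`."""
    size = len(data)
    if size > limit:
        raise FileTooLargeError(
            f"File size {size} bytes exceeds the limit of {limit} bytes"
        )
    return size
```

Rejecting the upload before any request is sent avoids a round trip to the remote service for a file it would refuse anyway.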
RFC2046 allows the Content-Type field to have additional parameters
after the main type/subtype information (Section 1).
Following RFC4281, many applications put codec information inside
parameters in the Content-Type. This is especially common for formats
that support many codecs, such as Ogg (RFC5334, Section 4).
The `/api/audio/transcriptions` endpoint is currently rejecting files
that contain parameters in the Content-Type field with a bad request
error.
This commit changes the current check in order to accept any
Content-Type field that begins with a supported type/subtype as listed
in the `supported_filetypes` tuple.
Since Content-Type here is provided by the user, I believe this check
is meant to prevent honest mistakes, like posting a PDF to an audio
processing endpoint, not as a security measure against possibly
malicious use. Therefore, I think it's OK not to validate the rest of
the field.
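The relaxed check can be sketched as a prefix match. The function name `is_supported_content_type` and the contents of the tuple below are assumptions for illustration; only the `supported_filetypes` name comes from the code being changed.

```python
from typing import Optional, Tuple

# Assumed example values; the real tuple lives in the endpoint's code.
supported_filetypes: Tuple[str, ...] = (
    "audio/mpeg",
    "audio/wav",
    "audio/ogg",
    "audio/x-m4a",
)


def is_supported_content_type(
    content_type: Optional[str],
    supported: Tuple[str, ...] = supported_filetypes,
) -> bool:
    """Accept any Content-Type that begins with a supported type/subtype.

    Parameters after the type/subtype (e.g. "; codecs=opus") are not
    validated, since the check only guards against honest mistakes.
    """
    # str.startswith accepts a tuple of candidate prefixes.
    return (content_type or "").strip().lower().startswith(supported)
```

With this check, `audio/ogg; codecs=opus` is accepted, while `application/pdf` is still rejected.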
Remove unnecessary `pathlib.Path` conversions. (CACHE_DIR and DATA_DIR)
Use the `/` path-joining operator so that the platform-specific path separator is used (Windows: \, Unix: /)
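A short sketch of the `/` operator in `pathlib`. The directory names and the `speech_cache_dir` helper are hypothetical; the pure path classes are used only so that both separators can be shown on any platform.

```python
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def speech_cache_dir(cache_dir: PurePath) -> PurePath:
    """Join sub-directories with `/` instead of string concatenation.

    `cache_dir` is already a path object, so no extra Path(...)
    conversion is needed before joining.
    """
    # Hypothetical sub-directory names, for illustration only.
    return cache_dir / "audio" / "speech"


# The same expression renders with each platform's separator.
posix_result = str(speech_cache_dir(PurePosixPath("data/cache")))
windows_result = str(speech_cache_dir(PureWindowsPath("data") / "cache"))
```

Here `posix_result` is `data/cache/audio/speech`, and `windows_result` is the same path joined with backslashes.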