mirror of https://github.com/tauuu/silero-ha-http-tts.git synced 2026-03-31 07:19:18 +00:00


Python 96.6%
Shell 2.3%
Makefile 1.1%


taurone 6c81c812a3 Update README.md		2025-02-13 13:31:35 +05:00
ha_config	more docs	2022-11-06 12:11:13 +00:00
.gitignore	Initial commit	2022-10-31 10:58:23 +03:00
_normalization.py	Add files via upload	2025-01-31 13:03:15 +05:00
CONTRIBUTING.md	Create CONTRIBUTING.md	2022-10-31 16:06:37 +03:00
dockerfile	change ffmpeg to sox, add echo effect, add text normalization (only Celsius degrees in Russian)	2022-11-05 18:40:59 +00:00
download_models.py	Update download_models.py	2025-02-10 09:13:24 +05:00
LICENSE	Initial commit	2022-10-31 10:58:23 +03:00
Makefile	Just docker and empty server so far	2022-10-31 09:19:11 +00:00
normalization.py	change ffmpeg to sox, add echo effect, add text normalization (only Celsius degrees in Russian)	2022-11-05 18:40:59 +00:00
README.md	Update README.md	2025-02-13 13:31:35 +05:00
requirements.txt	Update requirements.txt	2025-01-31 13:05:07 +05:00
requirements_last.txt	change ffmpeg to sox, add echo effect, add text normalization (only Celsius degrees in Russian)	2022-11-05 18:40:59 +00:00
server.py	Update server.py	2025-01-31 13:14:21 +05:00

README.md

Added: digits and the v4 model. Removed: the echo.

git clone https://github.com/tauuu/silero-ha-http-tts.git
cd silero-ha-http-tts

make && make run





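# Stop and remove the existing container
docker stop tts_silero
docker rm tts_silero

# Locate leftover wav files, then delete the container's static directory.
# The overlay2 ID below comes from one particular host; use the path that find reports on yours.
find / -type f -name '*.wav'
rm -rf /var/lib/docker/overlay2/61505e08466083a321de2288361c92d256963159ec6a03858150248dff7a526b/diff/usr/app/static/

# Start the container in the background, published on port 9898
docker run -d --restart unless-stopped -p 9898:80 --name tts_silero silero

# List the contents of the static directory inside the container
docker exec -it tts_silero ls static

# Show all containers, including stopped ones
docker ps -a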

silero-ha-http-tts

I made this project for myself, to give my smart home speech synthesis that works autonomously, without cloud providers. What works so far:

Speech synthesis itself, using the silero models
Silero cannot (why?) normalize text, so I had to write my own normalization using Natasha & pymorphy. For now only temperature is normalized (number to text)
The Docker container exposes two endpoints: one for the built-in Home Assistant TTS, the other for getting the audio file directly.
Audio playback to the SLS gateway. The implementation is a bit clumsy, but it works
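
To illustrate the number-to-text step mentioned above, here is a minimal pure-Python sketch. It is not the code from normalization.py and does not use Natasha or pymorphy; the names `number_to_words` and `normalize_numbers` are hypothetical. It assumes integers from -99 to 99 and always emits the nominative form, so it does not inflect the number to agree with the surrounding words, which is the part a morphology library would handle.

```python
import re

ONES = ["ноль", "один", "два", "три", "четыре",
        "пять", "шесть", "семь", "восемь", "девять"]
TEENS = ["десять", "одиннадцать", "двенадцать", "тринадцать", "четырнадцать",
         "пятнадцать", "шестнадцать", "семнадцать", "восемнадцать", "девятнадцать"]
TENS = ["", "", "двадцать", "тридцать", "сорок",
        "пятьдесят", "шестьдесят", "семьдесят", "восемьдесят", "девяносто"]


def number_to_words(n: int) -> str:
    """Spell an integer in the range -99..99 in Russian (nominative case)."""
    if abs(n) > 99:
        raise ValueError("this sketch only covers -99..99")
    prefix = "минус " if n < 0 else ""
    n = abs(n)
    if n < 10:
        words = ONES[n]
    elif n < 20:
        words = TEENS[n - 10]
    else:
        words = TENS[n // 10]
        if n % 10:
            words += " " + ONES[n % 10]
    return prefix + words


# Standalone integers only: skip digits glued to letters and decimal fractions.
_INTEGER = re.compile(r"(?<![\w.,])-?\d+(?!\w|[.,]\d)")


def normalize_numbers(text: str) -> str:
    """Replace small standalone integers with words; leave everything else as is."""
    def replace(match: re.Match) -> str:
        value = int(match.group())
        return number_to_words(value) if abs(value) <= 99 else match.group()

    return _INTEGER.sub(replace, text)
```

For example, `normalize_numbers("В комнате 12 градусов")` gives `"В комнате двенадцать градусов"`. Numbers outside the supported range and decimals such as `12.5` pass through unchanged.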

What is still to do, or what is missing:

I have not figured out media-source in Home Assistant: if an automation needs to end up with a working URL, I don't know how to do that
The speaker voice is hardcoded
The echo effect is hardcoded
The synthesis quality is hardcoded
The container image is built locally and is not published to DockerHub
The pauses at the start and end are hardcoded (the SLS gateway hisses at start and finish)
Synthesis works on x64 architecture. On a Raspberry Pi it crashes; I have not investigated yet (probably silero does not work there)
Synthesized files stay in the container and are never cleaned up (except by restarting the container)
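
The last item, files piling up in the container, could be handled by a periodic cleanup. The following is only a sketch under assumptions: the function name `remove_stale_wavs` is hypothetical, and it assumes the generated files are `*.wav` files sitting directly in one directory (such as the `static` directory used in the commands above). It is not part of server.py.

```python
import time
from pathlib import Path


def remove_stale_wavs(directory, max_age_seconds, now=None):
    """Delete *.wav files older than max_age_seconds; return the removed names."""
    now = time.time() if now is None else now
    removed = []
    for path in Path(directory).glob("*.wav"):
        # Age is measured from the last modification time of the file.
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Calling `remove_stale_wavs("static", 3600)` from a timer or before each synthesis request would keep only files from the last hour. Files with other extensions are left alone.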

If you know how to make this project better, send a PR, don't be shy!

todo

docker packaging
make initial server
silero package working
TTS code itself
caching of TTS results in local filesystem
ML models cached in docker
Normalize text
server implements HA TTS API (MaryTTS)
make docs how to run it all with HA
respect voice, language parameters
make DockerHub image
parse humidity, dates, just numbers in normalization
make cache cleanups
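
The "caching of TTS results in local filesystem" item can be pictured roughly as follows. This is a sketch, not the code in server.py: `cache_path`, `get_or_synthesize` and the `synthesize` callback are hypothetical names, and keying the cache on a SHA-256 hash of voice plus text is an assumption made for illustration.

```python
import hashlib
from pathlib import Path


def cache_path(cache_dir, text, voice):
    """Map a (voice, text) pair to a stable wav file name inside cache_dir."""
    key = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
    return Path(cache_dir) / f"{key}.wav"


def get_or_synthesize(cache_dir, text, voice, synthesize):
    """Return the cached wav path, calling synthesize(text, voice) only on a miss.

    synthesize is a placeholder for the real TTS call and must return wav bytes.
    """
    path = cache_path(cache_dir, text, voice)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(synthesize(text, voice))
    return path
```

With this layout, repeated phrases are synthesized once and served from disk afterwards, and a different voice produces a different file.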

How to run

git clone https://github.com/Gromina/silero-ha-http-tts.git
cd silero-ha-http-tts

make && make run # shortcut for the following two lines
# docker build -t silero .
# docker run -p 9898:80 --rm --name tts_silero silero

Endpoints

POST /process - MaryTTS-format endpoint that returns a wav file. Used when the server is set up as the HA TTS service

POST /tts - returns the URL of the generated wav file
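
For calling the endpoints from Python rather than curl, a minimal standard-library sketch is shown here. The `INPUT_TEXT` form field is taken from the curl test in this README; the helper names `build_request` and `request_tts_url` are hypothetical, and whether `/process` needs additional MaryTTS fields is not stated here, so the sketch only sends `INPUT_TEXT`.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_request(base_url, endpoint, text):
    """Build a form-encoded POST request carrying the text in INPUT_TEXT."""
    body = urlencode({"INPUT_TEXT": text}).encode("ascii")
    return Request(
        f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def request_tts_url(base_url, text, timeout=30):
    """POST to /tts and return the raw response body as text."""
    with urlopen(build_request(base_url, "/tts", text), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

`request_tts_url("http://127.0.0.1:9898", "Привет")` sends the same request as the curl test. The body is returned unparsed because the exact response format is not documented here.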

I took some ideas from

https://github.com/elia-morrison/silero_docker

HA config to work as TTS

tts:
  - platform: marytts
    host: localhost
    port: 9898
    codec: WAVE_FILE
    voice: xenia
    language: ru

HA config to use with SLS gateway

Put the script from ./ha_config into your HA config folder

Define a shell command in the HA config:

shell_command:
  tts_to_sls: './tts_to_sls.sh "{{ tts_address }}" "{{ sls_address }}" "{{ text }}"'

Add the following automation (fill in the correct IP addresses for the TTS service and the SLS gateway)

- id: '1667401859473'
  alias: Speak status
  description: ''
  trigger: []
  condition: []
  action:
  - service: shell_command.tts_to_sls
    data_template:
      tts_address: http://192.111.11.111:9898
      sls_address: http://192.111.11.190
      text: Привет, шеф! В комнате 12
        градусов. В то же время на улице -23
        градусов
  mode: single

curl test

curl -X POST -H "Content-Type: application/x-www-form-urlencoded" -d "INPUT_TEXT=Привет" http://127.1:9898/tts
