Alphabet updated

Commit 8ba99638 authored May 07, 2019 by cristian Quezada

Showing 2 changed files with 32 additions and 1 deletion

README.md +3 -1

model_es/alfabeto.txt +29 -0

--- a/README.md
+++ b/README.md
@@ -53,7 +53,8 @@ matplotlib
 * Have a file with the transcriptions of the clips
 * Use vocabulary_generator.ipynb to create 3 files: train.csv, dev.csv, test.csv
 * The 3 files must have the columns: wav_filename, wav_filesize, transcript
-* As an example, use spanish-single-speaker-speech-dataset
+
+* As an example, use _spanish-single-speaker-speech-dataset_: download it, skip the steps for generating lm.binary and the trie, and just verify that the paths to the audio clips are correct.

 ## Generate the language model

@@ -88,3 +89,4 @@ Finally, train:
 ```
 python3 DeepSpeech.py --train_files data/train_es.csv --dev_files data/dev_es.csv --test_files data/test_es.csv --alphabet_config_path model_es/alfabeto.txt --lm_binary_path model_es/lm_es.binary --lm_trie_path model_es/trie_es --checkpoint_dir checkpoints --export_dir model_export
 ```
+Check the alphabet and add more characters if any are missing.
\ No newline at end of file
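
Review note: the README change asks the reader to verify that the paths in the CSV files point at the audio clips. A minimal sketch of that check follows, assuming the three-column header listed in the README and paths that resolve from the current working directory. `check_csv` is a hypothetical helper written for illustration; it is not part of DeepSpeech or of this repository.

```python
import csv
import os

# Column order as listed in the README (wav_filename,wav_filesize,transcript).
EXPECTED_COLUMNS = ["wav_filename", "wav_filesize", "transcript"]


def check_csv(csv_path):
    """Return a list of human-readable problems found in one CSV file."""
    problems = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        if reader.fieldnames != EXPECTED_COLUMNS:
            problems.append(f"unexpected header: {reader.fieldnames}")
            return problems
        # Row 1 is the header, so data rows are counted from 2.
        for row_number, row in enumerate(reader, start=2):
            wav = row["wav_filename"] or ""
            size = (row["wav_filesize"] or "").strip()
            if not os.path.isfile(wav):
                problems.append(f"row {row_number}: missing file {wav}")
            elif str(os.path.getsize(wav)) != size:
                problems.append(f"row {row_number}: size mismatch for {wav}")
    return problems
```

Calling `check_csv("data/train_es.csv")` before training returns an empty list when every clip is found and its recorded size matches the file on disk.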
--- a/model_es/alfabeto.txt
+++ b/model_es/alfabeto.txt
@@ -30,4 +30,33 @@ w
 x
 y
 z
+!
+'
+,
+-
+.
+:
+;
+?
+¡
+¿
+Á
+Å
+É
+Í
+Ó
+Ú
+á
+æ
+è
+é
+ë
+í
+î
+ñ
+ó
+ö
+ú
+ü
+—
 # The last (non-comment) line needs to end with a newline.
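
Review note: the README now says to check the alphabet and add any missing characters. That check can be sketched as below, under stated assumptions: the alphabet file holds one label per line, lines starting with `#` are comments, `\#` stands for a literal `#`, and a line holding a single space is itself a label. `load_alphabet` and `missing_characters` are hypothetical helper names used only for illustration.

```python
import csv


def load_alphabet(path):
    """Read the alphabet file into a set of labels, skipping comment lines."""
    labels = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            # Strip only the newline: a line holding a single space is a label.
            line = line.rstrip("\n")
            if line.startswith("#"):
                continue  # comment line
            if line.startswith("\\#"):
                line = "#"  # assumed escape for a literal '#'
            if line:
                labels.add(line)
    return labels


def missing_characters(csv_path, alphabet):
    """Return the characters used in the transcript column but absent from the alphabet."""
    missing = set()
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            missing.update(set(row["transcript"] or "") - alphabet)
    return missing
```

Running `missing_characters("data/train_es.csv", load_alphabet("model_es/alfabeto.txt"))` on each of the three CSV files lists the characters that would still need a line in `alfabeto.txt`; an empty set means the alphabet covers the transcripts.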