Training your own model
=======================

You can train your own version of the models using the included
``pincelate.train`` module. Run ``python -m pincelate.train --help`` for a list
of options::

    --model-prefix MODEL_PREFIX
                          prefix for saved models (directories must already
                          exist!)
    --verbose             show keras progress bars (default to one line per
                          epoch)
    --random-state RANDOM_STATE
                          random state for train/test split
    --epochs EPOCHS       number of epochs to train
    --batch-size BATCH_SIZE
                          batch size
    --src {orth,phon}     source sequences
    --target {orth,phon}  target sequences
    --unidirectional      unidirectional rnn (default is bidirectional)
    --enc-rnn-units ENC_RNN_UNITS
                          units in encoder RNN
    --dec-rnn-units DEC_RNN_UNITS
                          units in decoder RNN
    --enc-rnn-dropout ENC_RNN_DROPOUT
                          recurrent dropout in encoder RNN
    --dec-rnn-dropout DEC_RNN_DROPOUT
                          recurrent dropout in decoder RNN
    --optimizer {adam,rmsprop}
                          optimizer (rmsprop or adam)
    --lr LR               learning rate for optimizer
    --decay DECAY         learning rate decay for optimizer
    --clipvalue CLIPVALUE
                          clip value for optimizer

A serialized model consists of a number of files, including the pickled
hyperparameters and network weights. The ``--model-prefix`` option sets the
path and first few characters for these files. For example, an option written
like so::

    --model-prefix=my-models/phon2orth

... will direct the module to save files with names like
``my-models/phon2orth-obj.pickle``, ``my-models/phon2orth-training.h5``,
``my-models/phon2orth-infer-encoder.h5``, etc.

Pincelate needs both an orthography-to-phoneme model and a
phoneme-to-orthography model to operate; these are trained separately. You can
set the data for the encoder and decoder using the ``--src`` and ``--target``
options.
For example, to train an orthography-to-phoneme model with 64 hidden units in
both the encoder and decoder::

    python -m pincelate.train --model-prefix=test-models/orth2phon --src=orth \
        --target=phon --enc-rnn-units=64 --dec-rnn-units=64

Training and test data from the CMU Pronouncing Dictionary is loaded and
prepared in ``pincelate.cmudictdata``.
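
Because the two models are trained separately and the target directory must
already exist, it can be convenient to drive both runs from one small script.
The following is only a sketch, not part of Pincelate: the helper names
``build_train_command`` and ``train_both`` are hypothetical, and the
``orth2phon`` / ``phon2orth`` prefixes simply follow the examples above. It
uses only options listed in the ``--help`` output.

```python
import subprocess
import sys
from pathlib import Path


def build_train_command(prefix, src, target, units=64):
    """Return the argv list for one ``python -m pincelate.train`` run.

    Hypothetical helper: it only assembles options shown in ``--help``.
    """
    return [
        sys.executable, "-m", "pincelate.train",
        f"--model-prefix={prefix}",
        f"--src={src}",
        f"--target={target}",
        f"--enc-rnn-units={units}",
        f"--dec-rnn-units={units}",
    ]


def train_both(model_dir="test-models", units=64, dry_run=True):
    """Build, and optionally run, the training commands for both directions."""
    commands = [
        # orthography -> phonemes
        build_train_command(f"{model_dir}/orth2phon", "orth", "phon", units),
        # phonemes -> orthography
        build_train_command(f"{model_dir}/phon2orth", "phon", "orth", units),
    ]
    if not dry_run:
        # The help text notes that directories must already exist,
        # so create the output directory before training starts.
        Path(model_dir).mkdir(parents=True, exist_ok=True)
        for cmd in commands:
            # check=True raises CalledProcessError if training exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: print the commands without training anything.
    for cmd in train_both(dry_run=True):
        print(" ".join(cmd))
```

With ``dry_run=True`` the script only prints the two command lines, which is a
cheap way to check the options before starting a long training run. Pass
``dry_run=False`` to create the directory and train both models in sequence.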