Pincelate class

class pincelate.Pincelate(model_path_prefixes=())

Loads and provides an API for sequence-to-sequence models for spelling and sounding out.

closest(vec)

Finds the closest Arpabet phoneme for the given feature vector.
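As a rough illustration of what a nearest-phoneme lookup involves, the sketch below compares a feature vector against a small hand-written table and returns the closest entry. The table, the helper names (to_vector, closest_phoneme), and the use of Euclidean distance are assumptions made for illustration; the feature names come from the examples on this page, but this is not the library's own implementation.

```python
import math

# Hypothetical six-feature vocabulary; the pretrained model uses 32 features.
FEATURES = ["blb", "stp", "vcd", "hgh", "fnt", "vwl"]

# Hypothetical phoneme table, for illustration only.
PHONEMES = {
    "P": {"blb", "stp"},
    "B": {"blb", "stp", "vcd"},
    "IY": {"hgh", "fnt", "vwl"},
}


def to_vector(feats):
    """Multi-hot encode a set of feature names against FEATURES."""
    return [1.0 if name in feats else 0.0 for name in FEATURES]


def closest_phoneme(vec):
    """Return the phoneme whose feature vector is nearest to vec.

    Euclidean distance is an assumption made for this sketch.
    """
    return min(PHONEMES, key=lambda ph: math.dist(vec, to_vector(PHONEMES[ph])))


# A noisy vector that mostly looks like a voiced bilabial stop.
noisy = [0.9, 0.8, 0.7, 0.1, 0.0, 0.2]
print(closest_phoneme(noisy))  # B
```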

featureidx(feat)

Index of a feature in the phoneme feature vocabulary.

Examples

>>> from pincelate import Pincelate
>>> pin = Pincelate()
>>> pug = pin.phonemefeatures("pug")
>>> pug[1][pin.featureidx('vcd')] = 1
>>> pin.spellfeatures(pug)
'bug'

manipulate(s, letters=None, features=None, temperature=0.25)

Manipulate a round-trip respelling of a string.

This method ‘re-spells’ words by first translating them to phonetic features and then back to orthography. The provided values attenuate or emphasize the probability of the given letters and phonetic features at each step of the spelling and sounding-out process. Specifically, the decoded probability of the given item (letter or feature) is raised to the power of np.exp(n), where n is the provided value. A value of 0 has no effect; negative values increase the probability, and positive values decrease it. (A good range to try is -10 to +10.)
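The exponent rule can be tried out on a toy distribution. The sketch below raises selected probabilities to the power exp(n) and then renormalizes so the values sum to 1 again. The function name reweight, the toy probabilities, and the renormalization step are assumptions made for illustration; they are not taken from the library's source.

```python
import math


def reweight(probs, exponents):
    """Raise each listed item's probability to exp(n), then renormalize.

    probs:     dict mapping item -> probability (values in (0, 1])
    exponents: dict mapping item -> n; unlisted items use n = 0 (no change)
    """
    scaled = {
        item: p ** math.exp(exponents.get(item, 0.0))
        for item, p in probs.items()
    }
    total = sum(scaled.values())
    return {item: value / total for item, value in scaled.items()}


# Toy letter distribution, invented for this example.
probs = {"e": 0.6, "i": 0.3, "a": 0.1}

suppressed = reweight(probs, {"e": 2})   # positive n: 'e' becomes less likely
boosted = reweight(probs, {"a": -2})     # negative n: 'a' becomes more likely

print(suppressed["e"] < probs["e"])  # True
print(boosted["a"] > probs["a"])     # True
```

Because a probability is at most 1, raising it to a power greater than 1 (positive n) shrinks it, while a power between 0 and 1 (negative n) enlarges it.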

Parameters:
  • s (str) – String to be re-spelled. Should contain only ASCII lowercase characters.
  • letters (dict) – Dictionary mapping letters to exponent values
  • features (dict) – Dictionary mapping phonetic features to exponent values
  • temperature (float) – Temperature for softmax sampling. Larger values will yield more unusual results.
Returns:

A re-spelling of the provided word

Return type:

str
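To give a feel for the temperature parameter, here is a minimal sketch of temperature-scaled sampling over a toy distribution. It uses the common formulation in which each probability is raised to the power 1/T and the result is renormalized; whether the library applies temperature in exactly this way is an assumption, and the names apply_temperature and sample_index are hypothetical.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale strictly positive probabilities as p ** (1 / T), renormalized."""
    logits = [math.log(p) / temperature for p in probs]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def sample_index(probs, temperature, rng=random):
    """Draw one index according to the temperature-adjusted distribution."""
    adjusted = apply_temperature(probs, temperature)
    return rng.choices(range(len(probs)), weights=adjusted, k=1)[0]


probs = [0.7, 0.2, 0.1]
cold = apply_temperature(probs, 0.25)  # sharper: the top choice dominates
hot = apply_temperature(probs, 1.5)    # flatter: unlikely choices gain mass

print(cold[0] > probs[0])  # True
print(hot[0] < probs[0])   # True
```

Low temperatures concentrate probability on the most likely option, which is why the default of 0.25 tends to give conventional spellings, while values above 1 flatten the distribution.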

Examples

Respell without using particular letters:

>>> from pincelate import Pincelate
>>> pin = Pincelate()
>>> pin.manipulate("cheese", letters={'e': 10})
"chi's"

Respell emphasizing certain phonetic features:

>>> from pincelate import Pincelate
>>> pin = Pincelate()
>>> pin.manipulate("nod", features={'alv': 10, 'blb': -10})
'mob'

Produce a less plausible spelling:

>>> from pincelate import Pincelate
>>> pin = Pincelate()
>>> [pin.manipulate("alphabet", temperature=1.5) for i in range(5)]
['alphabet', 'alphabey', 'alphibete', 'alfabet', 'alphabette']

phonemefeatures(s)

Produces an array of phoneme feature probabilities for a string.

This function operates like soundout, except it omits the nearest-neighbor phoneme lookup at the end, returning the raw phoneme feature probabilities instead.

You can “spell” this array of phoneme features using the spellfeatures() method.

Parameters: s (str) – The string to sound out. Should contain only lowercase ASCII characters.
Returns: A numpy array of shape (n, m), where n is the number of predicted phonemes (including begin/end tokens) and m is the number of phoneme features in the training data (32 for the included pretrained model).
Return type: numpy.array

Examples

>>> from pincelate import Pincelate
>>> pin = Pincelate()
>>> feats = pin.phonemefeatures("hello")
>>> feats.shape
(6, 32)

phonemestate(s)

Calculates the hidden state of the spelling model’s decoder for a string.

This hidden state can be used for various purposes, including as a representation of the way the string sounds (for the purpose of similarity searches).

You can decode a spelling from a state returned from this method with the spellstate() method.

Parameters: s (str) – The string to sound out. Should contain only lowercase ASCII characters.
Returns: Array of shape (n,), where n is the number of dimensions in the spelling model’s hidden state (256 for the included pretrained model).
Return type: numpy.array
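A similarity search over hidden states amounts to ranking candidate words by their distance from a query word's state. The sketch below shows the idea with made-up three-dimensional vectors standing in for the 256-dimensional arrays this method returns; the helper name nearest_words and the vector values are hypothetical, and Euclidean distance is only one reasonable choice of metric.

```python
import math


def nearest_words(query, states, k=2):
    """Return the k words whose state vectors are closest to the query's."""
    target = states[query]
    others = [word for word in states if word != query]
    return sorted(others, key=lambda w: math.dist(target, states[w]))[:k]


# Made-up stand-ins for pin.phonemestate(word); real states have 256 dimensions.
states = {
    "bug": [0.9, 0.1, 0.3],
    "rug": [0.8, 0.2, 0.3],
    "zap": [0.1, 0.9, 0.7],
}

print(nearest_words("bug", states, k=1))  # ['rug']
```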

Examples

The following shows how to calculate and compare the distances between the sounds of two pairs of words.

>>> from pincelate import Pincelate
>>> from numpy.linalg import norm
>>> pin = Pincelate()
>>> bug2rug = norm(pin.phonemestate("bug") - pin.phonemestate("rug"))
>>> bug2zap = norm(pin.phonemestate("bug") - pin.phonemestate("zap"))
>>> bug2rug < bug2zap
True

soundout(s)

‘Sounds out’ the string, returning a list of Arpabet phonemes.

Parameters: s (str) – The string to sound out. Should contain only lowercase ASCII characters.
Returns: List of Arpabet phonemes
Return type: list

Examples

>>> from pincelate import Pincelate
>>> pin = Pincelate()
>>> pin.soundout("hello")
['HH', 'EH1', 'L', 'OW0']

spell(phones, temperature=0.25)

Produces a plausible spelling for a list of Arpabet phonemes.

Parameters:
  • phones (list) – A list of Arpabet phonemes. Vowels may optionally have stress numbers appended to the end (e.g., you can provide EH, EH0, EH1, or EH2).
  • temperature (float) – Temperature for softmax sampling. Larger values will yield more unusual results.
Returns:

A spelling of the provided phonemes

Return type:

str
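Since stress numbers are optional, phoneme lists from different sources can be normalized before being compared or passed along. The helper below is a hypothetical convenience function, not part of the library; it simply removes trailing stress digits from each Arpabet symbol.

```python
def strip_stress(phones):
    """Remove trailing stress digits (0, 1, 2) from Arpabet phonemes."""
    return [phone.rstrip("012") for phone in phones]


print(strip_stress(["HH", "EH1", "L", "OW0"]))  # ['HH', 'EH', 'L', 'OW']
```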

Examples

>>> from pincelate import Pincelate
>>> pin = Pincelate()
>>> pin.spell(['HH', 'EH1', 'L', 'OW0'])
'hello'

spellfeatures(vec, temperature=0.25)

Produces a plausible spelling of an array of phoneme features.

Arrays of phoneme features are returned from the phonemefeatures() and vectorizefeatures() methods.

Parameters:
  • vec (numpy.array) – A numpy array of shape (n, m), where n is the number of phonemes in the word (including begin/end tokens) and m is the number of phoneme features in the training data (32 for the included pretrained model)
  • temperature (float) – Temperature for softmax sampling. Larger values will yield more unusual results.
Returns:

A spelling of the provided phoneme features

Return type:

str

Examples

>>> from pincelate import Pincelate
>>> pin = Pincelate()
>>> bee = pin.vectorizefeatures([
...     ['beg'], ['blb', 'stp', 'vcd'], ['hgh', 'fnt', 'vwl'], ['end']
... ])
>>> pin.spellfeatures(bee)
'bee'

spellstate(state, temperature=0.25)

Produces a plausible spelling from the spelling model’s hidden state.

Parameters:
  • state (numpy.array) – Array of shape (n,), where n is the number of dimensions in the spelling model’s hidden state (256 for the included pretrained model)
  • temperature (float) – Temperature for softmax sampling. Larger values will yield more unusual results.
Returns:

A spelling of the provided phoneme state

Return type:

str
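Because a state is a plain vector of shape (n,), two states can be blended element-wise before decoding. The sketch below shows linear interpolation on short toy vectors; the function name blend and its weight parameter are hypothetical, and averaging two states with (a + b) / 2 corresponds to weight=0.5.

```python
def blend(state_a, state_b, weight=0.5):
    """Linearly interpolate: (1 - weight) * a + weight * b, element-wise."""
    if len(state_a) != len(state_b):
        raise ValueError("states must have the same number of dimensions")
    return [(1.0 - weight) * a + weight * b for a, b in zip(state_a, state_b)]


# Toy three-dimensional vectors; real states have 256 dimensions.
a = [0.0, 2.0, -1.0]
b = [1.0, 0.0, 1.0]

print(blend(a, b))       # [0.5, 1.0, 0.0]
print(blend(a, b, 0.0))  # [0.0, 2.0, -1.0]
```

Weights closer to 0 keep the result near the first state, and weights closer to 1 move it toward the second.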

Examples

>>> from pincelate import Pincelate
>>> pin = Pincelate()
>>> ai = (pin.phonemestate("artificial"),
...      pin.phonemestate("intelligence"))
>>> pin.spellstate((ai[0] + ai[1]) / 2)
'intelifical'

vectorizefeatures(arr)

Vectorizes a list of lists of phoneme features.

Helpful if you want to author phoneme features “by hand,” instead of (e.g.) using phonemefeatures to infer them from spelling.

Parameters: arr (list of lists) – List of lists of phoneme features (see pincelate.featurephone for a list)
Returns: A numpy array of shape (n, m), where n is the number of phonemes and m is the number of phoneme features.
Return type: numpy.array
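Conceptually, vectorizing turns each list of feature names into one row with a 1 at the index of every named feature. The sketch below illustrates this with a small hypothetical vocabulary; the ordering of VOCAB, the helper names, and the use of plain Python lists instead of a numpy array are assumptions made for illustration, and the real vocabulary has more entries.

```python
# Hypothetical vocabulary order; the real feature vocabulary differs.
VOCAB = ["beg", "end", "blb", "stp", "vcd", "hgh", "fnt", "vwl"]


def feature_index(feat):
    """Index of a feature name in the (hypothetical) vocabulary."""
    return VOCAB.index(feat)


def vectorize(rows):
    """Multi-hot encode a list of lists of feature names."""
    matrix = []
    for feats in rows:
        row = [0.0] * len(VOCAB)
        for feat in feats:
            row[feature_index(feat)] = 1.0
        matrix.append(row)
    return matrix


bee = vectorize([["beg"], ["blb", "stp", "vcd"], ["hgh", "fnt", "vwl"], ["end"]])
print(len(bee), len(bee[0]))  # 4 8
```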

Examples

>>> from pincelate import Pincelate
>>> pin = Pincelate()
>>> bee = pin.vectorizefeatures([
...     ['beg'], ['blb', 'stp', 'vcd'], ['hgh', 'fnt', 'vwl'], ['end']
... ])
>>> pin.spellfeatures(bee)
'bee'