"trained" on _Frankenstein_. The idea of fitting a Gaussian mixture model to the word vectors is to make it possible to predict words that are similar in meaning but aren't necessarily in the original text. I'm not sure whether the variability in the generated text is due to (a) not having enough data to really get the mixtures right, (b) not being able to control the degree of randomness in the sample function, or (c) the possibility that you shouldn't really do this kind of thing with word vectors at all.
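
One way to poke at possibility (b) is to add a temperature knob to the sampling step. The sketch below is not the code used for this project: it is a minimal, stdlib-only illustration. It assumes a diagonal-covariance mixture, and the toy 2-D vectors, the hand-written `MIXTURE`, and the `temperature` parameter are all hypothetical stand-ins. It scales each component's standard deviation by `temperature` before drawing a vector, then maps the draw back to the nearest vocabulary word by cosine similarity.

```python
import math
import random

# Toy 2-D "word vectors" (hypothetical values, not from a real embedding).
VOCAB = {
    "monster": (0.9, 0.1),
    "creature": (0.8, 0.2),
    "fiend": (0.85, 0.05),
    "glacier": (0.1, 0.9),
    "mountain": (0.2, 0.8),
}

# Hand-written two-component diagonal mixture standing in for a fitted GMM.
# Each entry is (weight, mean, per-dimension standard deviation).
MIXTURE = [
    (0.6, (0.85, 0.12), (0.05, 0.06)),
    (0.4, (0.12, 0.88), (0.05, 0.05)),
]


def sample_vector(mixture, temperature=1.0, rng=random):
    """Pick a component by weight, then draw from it.

    temperature scales the standard deviation: 0 returns the component
    mean exactly, values above 1 spread the draws out further.
    """
    weights = [w for w, _, _ in mixture]
    _, mean, std = rng.choices(mixture, weights=weights, k=1)[0]
    return tuple(rng.gauss(m, s * temperature) for m, s in zip(mean, std))


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_word(vec, vocab=VOCAB):
    """Return the vocabulary word whose vector is most similar to vec."""
    return max(vocab, key=lambda w: cosine(vec, vocab[w]))


def sample_word(mixture=MIXTURE, temperature=1.0, rng=random):
    """Sample a vector from the mixture and snap it to the nearest word."""
    return nearest_word(sample_vector(mixture, temperature, rng))
```

Calling `sample_word(temperature=0.0)` always snaps to the word nearest a component mean, while larger temperatures let draws wander toward neighbouring words. Comparing output at a few temperatures would at least help separate (b) from (a) and (c).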