so I've been working on a computer program to compose poems in the style of Zukofsky's _80 Flowers_, a collection (literal anthology!) of constrained poems written about individual flower varieties. eventually this is going to be a corpus-driven machine-learning thing but just now as a sort of "baseline" I made a generator that just arranges the top forty keywords from the wikipedia page corresponding to each flower in Zukofsky's collection, and the results are... surprising?
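the baseline is roughly this shape — a toy sketch, not my actual script (the stopword list is a stand-in, and I'm assuming plain article text as input; the 8×5 grid matches the five-words-per-line, eight-line constraint of the originals):

```python
import random
import re
from collections import Counter

# tiny stand-in stopword list; a real run would use a fuller one
STOPWORDS = {"the", "a", "an", "of", "and", "in", "is", "to", "it", "as",
             "for", "with", "on", "by", "or", "are", "from", "that", "this"}

def top_keywords(text, n=40):
    """Return the n most frequent non-stopword tokens in the article text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [w for w, _ in counts.most_common(n)]

def grid_poem(keywords, seed=None):
    """Shuffle forty keywords into eight five-word lines, 80 Flowers-style."""
    rng = random.Random(seed)
    words = keywords[:40]
    rng.shuffle(words)
    return "\n".join(" ".join(words[i:i + 5])
                     for i in range(0, len(words), 5))
```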
my generator on the left, zukofsky on the right (obviously). I say "surprising" because even though I've been studying and admiring these poems for the better part of a year at this point I still sorta had this idea that the poems were *essentially* just random relevant words arranged in a grid. comparing the poems to, like, actually random elements in a grid really shows the craft and attention and unusual cohesion of the original
... which isn't to say that I don't kinda *like* my simplest-possible implementation? it's at least doing the work of juxtaposing obvious and non-obvious words that are relevant to the topic and eschewing conventional syntax. so I do feel justified in this approach and like I'm on the right track. here's another...
("prickles / points ditto itself" and "centifolia cemetery / striped stipules" feel especially true to the original, if only by chance in this instance)
output from a convolutional neural network trying to "condense" wikipedia articles about each of zukofsky's 80 Flowers into the text of the poems themselves. the first is from a word-level model, attempting to produce one of the poems in the validation set; the second is from a character-level model, trying to produce one of the poems in the training set. the word-level one looks "coherent" but it's really just reproducing words in similar frequencies from the targets
in both cases there's so little data (just 80 items, since there are just 80 poems...) that the model pretty much instantly overfits and basically just learns the poems verbatim. I think I'm going to go back to the word model and try using pre-trained embeddings, then investigate data augmentation? (but allison, you're saying, CNN is very inappropriate for this task, use LSTM, bleah, and yes I know but I have Something I'm Trying To Show about zukofsky's style of composition in these poems)
@aparrish _it's gone_ sentient AAAAAAA
@aparrish me texting sober vs me texting drunk
@aparrish I *just read* something about applying CNNs to small, domain-specific sets like this. Their approach was to use a GAN pair to learn the style of the source set and generate additional plausible data points, and then train a traditional CNN/DNN on the stretched data.
@falkreon do you have a link? it would actually be really helpful right now to see/read details about someone else's process
@aparrish Whoops, it wasn't a paper, it was a talk by Monty Barlow. Still, found it: https://www.youtube.com/watch?v=7EfhicNoAbM
@aparrish I have a very dumb question about overfitting... do you ever deliberately add ‘noise’ to training data? or like, other writing in either the same style or by the same author but not both?
this question is inspired by the way one needs to “backcross to wild type” when optimizing for a specific polygenic trait in a breeding population (bc any single sampling won’t get all the possible contributions)
@Clausti I can't claim to be an expert but doesn't what you describe fall under the category of data augmentation? and stuff like adding dropout in neural networks, etc. conceptually for me the problem with data augmentation is that then you're sort of building an idea about how the data works into your own analysis, which seems... weird.
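for what it's worth, the cheapest text-side version of that kind of noising I know is word-level dropout plus adjacent swaps — a toy sketch (hypothetical helper names, not what's in my actual pipeline), just to show how 80 poems could be stretched into more training items while keeping the line structure:

```python
import random

def augment(poem, drop_p=0.1, swap_p=0.1, rng=None):
    """Return a noisy copy of a poem: randomly drop some words and
    swap some adjacent pairs, preserving the line structure."""
    rng = rng or random.Random()
    lines = []
    for line in poem.split("\n"):
        words = line.split()
        # word dropout: delete each word with probability drop_p
        words = [w for w in words if rng.random() >= drop_p]
        # adjacent swaps: transpose neighbours with probability swap_p
        for i in range(len(words) - 1):
            if rng.random() < swap_p:
                words[i], words[i + 1] = words[i + 1], words[i]
        lines.append(" ".join(words))
    return "\n".join(lines)

def expand(corpus, copies=10, seed=0):
    """Stretch a corpus of poems into len(corpus) * (copies + 1) items."""
    rng = random.Random(seed)
    out = list(corpus)
    for poem in corpus:
        out.extend(augment(poem, rng=rng) for _ in range(copies))
    return out
```

whether noised copies like this count as "more data" or just as baking my assumptions into the corpus is exactly the thing I'm uneasy about above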
@aparrish ah, I said it was a dumb question bc I maybe don’t have enough background to ask a good one. I’m not familiar w standard practices of data augmentation
I think what I was trying to ask is whether, in the face of a small data set prone to overfit, there are maybe other populations of data/sources that contain some but not all of the characteristics you’re training for, and whether doing multiple rounds of training, w a clean set then an “expanded” set, could mitigate overfitting
@aparrish my apologies if that question continues to be nonsense!
@Clausti I've never done that personally but it sounds very much in line with other data augmentation techniques I've seen