One of the signs you're Not Well is that putting a 'reading list' section on your blog starts with building a rotating-proxy Amazon scraper and an NLP-powered metadata scrubbing engine that fixes common problems: Edition names jammed onto the end of Publisher names, publication dates appended to Imprint names, subtitles that are actually author lists, etc.

But, like… what am I supposed to do? Let BAD DATA just STAY BAD?

*wild-eyed, twitchy stare*

Structured book metadata is less of a contract and more of a puzzle game, in which you try to figure out whether 'PENGUIN' appearing in the Publisher, Format, and Edition fields of a book about penguins is a sign that Penguin/Random House published it, that the metadata is duplicated, or that someone’s enthusiastic 5-year-old got ahold of the keyboard on release day.
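To make the cleanup problems above concrete, here is a minimal sketch of two of them: trailing edition names and dates stuck onto a publisher string, and the same value showing up in several fields. It is not the scrubbing engine described in the thread. The regex patterns, the function names `clean_publisher` and `repeated_values`, and the field names in the sample record are all assumptions made for illustration, and a real scrubber would need far more patterns than this.

```python
import re

# Hypothetical patterns for illustration only; real-world data needs many more.
EDITION_RE = re.compile(
    r"[\s,;(\-]*\b(?:\d+(?:st|nd|rd|th)|first|second|third|revised|illustrated|reprint)"
    r"\s+edition\b[\s)]*$",
    re.IGNORECASE,
)
DATE_RE = re.compile(
    r"[\s,;(\-]*\b(?:(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*\.?\s+"
    r"(?:\d{1,2},?\s+)?)?(?:19|20)\d{2}\b[\s)]*$",
    re.IGNORECASE,
)


def clean_publisher(raw: str) -> str:
    """Strip trailing edition names and dates until nothing more comes off."""
    value = raw.strip()
    while True:
        stripped = EDITION_RE.sub("", DATE_RE.sub("", value)).strip()
        if stripped == value:
            return value
        value = stripped


def repeated_values(record: dict[str, str]) -> dict[str, list[str]]:
    """Map each value that appears in more than one field to those fields.

    Comparison ignores case and surrounding whitespace. This only flags
    suspects; deciding what the repetition means still takes a human.
    """
    seen: dict[str, list[str]] = {}
    for field, value in record.items():
        key = value.strip().casefold()
        if key:
            seen.setdefault(key, []).append(field)
    return {key: fields for key, fields in seen.items() if len(fields) > 1}


if __name__ == "__main__":
    print(clean_publisher("Penguin Books; 2nd edition (March 3, 2015)"))
    print(repeated_values({
        "publisher": "PENGUIN",
        "format": "Penguin",
        "edition": "PENGUIN ",
        "title": "Penguins of the World",
    }))
```

The heuristics are deliberately conservative: they only touch the end of the string, so a publisher whose name legitimately ends in a year would be mangled, which is one reason the output is better treated as a suggestion than as a fix.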

Mark Llobrera

@eaton It’s such a headache with my own reading log, and I view any API-pulled data as a mere starting point that needs human intervention

@markllobrera 100%. I've got things in ... DECENT shape, though not so good that I’d be comfortable doing (say) auto-generation of author-name index pages.

@eaton I settled on a system where the API pulls result in a Markdown file with the book metadata in front matter, and that can get cleaned up
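A workflow like the one described above could be sketched roughly as follows. This is not Mark's actual tooling: the function name `to_front_matter`, the metadata keys, and the `needs_review` flag are hypothetical, and the sketch assumes YAML-style front matter delimited by `---` lines. Values are written with `json.dumps`, which produces scalars and lists that YAML parsers generally accept, so no third-party YAML library is needed.

```python
import json


def to_front_matter(book: dict, body: str = "") -> str:
    """Render book metadata as a Markdown document with front matter.

    Each value is serialized as JSON, which is also valid YAML flow syntax,
    so the result can be cleaned up by hand in any text editor afterwards.
    """
    lines = ["---"]
    for key, value in book.items():
        lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + body


if __name__ == "__main__":
    # Hypothetical record, standing in for whatever an API pull returns.
    record = {
        "title": "Dune",
        "authors": ["Frank Herbert"],
        "needs_review": True,
    }
    print(to_front_matter(record, "Notes go here.\n"))
```

Keeping the raw API result in a plain text file means the human cleanup step is just an edit and a save, and a flag like the assumed `needs_review` key gives a way to track which entries have been checked.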

@markllobrera @eaton i’ve had a book blog for 16 years and i just...keyboard that shit in

@aworkinglibrary @eaton Fair! (The only reason I ended up hooking into APIs was because I wanted to build a few CLI tools for coding practice, rather than a true workflow need)

@eaton The dataviz potential here [fans self]

@eaton Resisting the siren call to add another personal project to the list