this has to be one of the weirdest data sets I've ever seen; the result of asking people to pretend to be automated car assistants in order to train better automated car assistants nlp.stanford.edu/blog/a-new-mu

@aparrish Ha, they're getting some diversity in tone.
"The forecast does not state that it will be humid in Compton tomorrow."
"No, it's not gonna be warm in Camarillo over the next 2 days"

@aparrish ah! Memories! I helped generate such a dataset 15 years ago. The infamous "Wizard of Oz" method. Because real data is impossible to acquire unless you already have a running system. Sorry, that was probably obvious...

@Tryphon yeah the weird thing about this data set is that (as i understand it?) it's asking people not to adopt the role of a helpful person but specifically to adopt the role *of the car*, reifying the abstract behavior of the automated agent instead of treating the appearance of automation as something undesirable. like having a dataset for text-to-speech where people are encouraged to read the text in such a way that they sound like a computer

@aparrish yeah, very contrived for sure. As the goal is to collect data to train a speech recognition or NLP system, it has to be as close as possible to what would be said in an actual human-computer interaction, hence the rigid computer side of the conversation. But as I understand it, only the human side of the conversation is of interest.
