As an average citizen, I don't have oceans of data to play with. Excited to find out about weak supervision as a means of getting larger labeled datasets.

@elisehuard there are some other tricks, like using GANs to generate artificial labelled data, or online learning.

But as an average person, isn't there a large number of public labelled datasets out there already? Or is this a non-public data source?

IMHO the next big leap will be when we get better computer-human interaction leading to effective cooperation and teaching.

@pvaneynd there are a large number of open data sources, but when we start getting into deep learning you need a _lot_ of data to get reasonable results.
The example I have is a dataset of a few thousand data points that was painstakingly labeled by a few academics (and made available under certain conditions), which is not enough for deep learning.

@elisehuard yeah for NLP there isn't a lot of labelled data out there.
Most researchers seem to want to run away when I tell them we don't work with images, only text and time series. Most of it unlabelled...
If I had time I would try to solve things like twitter.com/DynamicWebPaige/st which look like a lot of fun 😃
