As an average citizen, I don't have oceans of data to play with. Excited to find out about weak supervision as a means of getting larger labeled datasets.
@elisehuard there are some other tricks, like using GANs to generate artificial labelled data, or online learning.
But as an average person, isn't there a large number of public labelled datasets out there already? Or is this a non-public data source?
IMHO the next big leap will be when we get better computer-human interaction leading to effective cooperation and teaching.
@pvaneynd there are a large number of open data sources, but when we start getting into deep learning you need a _lot_ of data to get reasonable results.
The example I have is a dataset of a few thousand data points that was painstakingly labeled by a few academics (and made available under certain conditions), which is not enough to use deep learning.
@elisehuard yeah for NLP there isn't a lot of labelled data out there.
Most researchers seem to want to run away when I tell them we don't work with images, only text and time series. Most of them unlabeled...
If I had time I would try to solve things like https://twitter.com/DynamicWebPaige/status/1040105383191171072?s=19 which look like a lot of fun 😃