Collecting useful user data without compromising our privacy: how carefully controlled lying could be the way forward.
Good article. Unfortunately, I've become the untrusting old curmudgeon who thinks all of the data collection out there is being sold or used for profit, and therefore fights to give no company any data at all.
If they all followed your model, what a different and wonderful world this would be.
@DistroJunkie thank you! What I'd like to see happen is for the conversation to change: if it's "we won't give you anything!" versus "they'll hate us anyway so we might as well collect everything and sell it", then it's so polarised that we'll never get anything done; an arms race is more expensive than a brain race. If there's something in the middle, where we can give them data they find useful but not be compromised, then everyone wins.
It's worse than that here in the States. My employer has a wellness program administered by a contracting health outfit. Seemed harmless. That health outfit had a contractor that stored and maintained all of their data. By participating in the wellness program we gave much of our personal data up. The contractor to the contractor had a data breach and now many of us are seeing nefarious efforts to steal our identity. Had no idea that this could happen by participating.
@sil I like the idea of lying (you can decide if that's true or not). I'm curious if we need to let the user choose whether they lie or not. In theory this would likely be as statistically valid, but would give users the feeling of being in control of the data.
"We're going to send this data about your computer to Canonical:
[Nope] [Adjust values] [Sure]"
@ted the flaw there is that, as the piece outlines, you need to tune the amount of lying to balance "get accurate info in aggregate" against "protect user privacy". If you ask people that question, they won't know how to answer it -- why should I adjust? what's good about it? -- and you'll have no idea how much lying happened, so you won't know how accurate your data is at all, which makes it a lot less useful.
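(For anyone curious what "tuning the amount of lying" looks like in practice: this is essentially the classic randomized-response technique. If the software lies with a *known* probability, you can invert that noise and recover the aggregate; if each user picks their own lying rate, the inversion step is impossible. A minimal sketch, with made-up function names and a fixed truth-telling probability chosen for illustration:)

```python
import random

def randomized_response(truth: bool, p_truth: float = 0.75) -> bool:
    """Report the true answer with probability p_truth, the opposite otherwise."""
    return truth if random.random() < p_truth else not truth

def estimate_true_rate(reports, p_truth: float = 0.75) -> float:
    """Invert the known noise.

    observed = p_truth * true_rate + (1 - p_truth) * (1 - true_rate),
    so solving for true_rate gives the de-biased estimate below.
    """
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth)) / (2 * p_truth - 1)

random.seed(0)
# Simulate 100,000 users, 30% of whom truthfully answer "yes".
true_answers = [random.random() < 0.3 for _ in range(100_000)]
reports = [randomized_response(a) for a in true_answers]

observed_rate = sum(reports) / len(reports)   # inflated toward 0.5 by the lying
estimated_rate = estimate_true_rate(reports)  # close to the real 30%
```

No individual report is trustworthy (any single "yes" has a 25% chance of being a lie, which is the privacy guarantee), but the aggregate is recoverable precisely *because* the lying rate is fixed and known -- which is why letting each user adjust it breaks the maths.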
@sil I would make sure to get a designer involved for better text; mine was just to illustrate the point. Certainly knowing the nature of the noise added to the system would give you better results, but I don't think you have to know it. For instance, it'd be hard to know precisely how many people lie about committing a crime. Your error bars increase, but if things are that close, it's not actionable data anyway.
@ted perhaps so, yeah, and someone with more data science chops than I could doubtless quantify that so it's OK. My objection to asking is really that it makes people care about a thing that they shouldn't have to care about. It's like popping up a dialog to ask whether the kernel should defragment your memory. I don't know or care; decide for me!
@sil I agree, but I think when it comes to data privacy people are surprisingly sophisticated. Certainly most would use Nope/Sure in my example, but I think being given the option to lie makes it fit into human notions of trust.
We have the computing power today to make computers more human, instead of what we've done in the past where we taught humans how to work with computers.