Privacy and Data Ownership Needs to be at the Core of our Technology

steve.penrod · April 18, 2018, 2:30pm

Originally published at: http://mycroft.ai/blog/privacy-and-data-ownership-needs-to-be-at-the-core-of-our-technology/

We need to talk. Not just you and me, society as a whole. The foundation of free society is eroding and we have only a brief time to shore it up. We need to talk about data privacy and balancing it with technological advancement.

Emerging danger

Technology constantly introduces new features, new techniques and opens up new possibilities. Most of the time this is good – things get easier, faster, and cheaper. Tasks which were difficult and prohibitively expensive a few years earlier are suddenly one click away as fun Snapchat filters. These changes can be hard to keep track of but aren’t really dangerous if you don’t keep up. Life won't dramatically change if you miss out on a few generations of the latest gimmicks and graphic design trends. But something pervasively and unprecedentedly important has happened in technology over the last decade. Society is now walking at the edge of a privacy precipice from which we might never be able to recover if we aren't very careful with the next few steps.

Why are things this way?

I think it is important to look at what has led to this state of things. Beginning around 2000, those of us in computer science started exploring things like "data mining", "big data", and most recently "machine learning". Like most technology, there was no moral intent behind these techniques – they were just methodologies that emerged as interesting approaches to tough problems. What really makes these technologies unique from a societal perspective is not in the code. It is the required input to these systems. They need information … lots and lots of information. Anyone who is well versed in machine learning can confirm that one of the early considerations when planning to use this type of “AI” is identifying a source of massive amounts of the right kind of data for that problem.

Unquenchable thirst for data

The impressive results from data analysis have turned data into the new gold rush. For many companies, finding data has become the most imperative task, bar none. This has led to services offering every enticement they can think of to obtain this precious commodity. Facebook isn't alone but is the king of this. In exchange for your checking "I accept" on their Terms of Service, they offer unlimited usage, boundless storage, and ever-growing functionality. They offer convenience, power and entertainment. All for a few simple permissions on your cell phone.

Quest for excellence

Once Facebook had access to it all, they splashed in the pools of data they amassed. Much like a child visiting the ocean for the first time, they played with it to see what kinds of castles they could build. Matching algorithms, suggestions, predictions. Finding new ways to leverage this data was the sure way to corporate recognition and business success. And they excelled at it. Facebook is better at it than anyone. NOBODY knows *you* better than they do.

After you have it all, what comes next?

Looking for more led to partners willing to push the boundaries. Individuals and companies were driven by the same quest for more knowledge extracted from this data.

Facebook’s claim has always been that it’s a matching service–allowing a company to detail the type of customers it wants to target, and then placing the ads for the companies. The companies and marketing firms were never meant to have first-hand access to profiles.

But sharing this trove with like-minded academic researchers surely seemed logical. Simple. Easy.

This led to Aleksandr Kogan, the researcher behind the personality quiz “thisisyourdigitallife”. This app asked its users for access to some personal information and to their networks, which gave Kogan, under the pretense of academic use, a baseline profile of every friend they had. 270,000 consenting users turned into 50 million profiles.

Which led to Cambridge Analytica. Which led to…

Ethics can be a slippery thing. It was likely easy to justify each of these steps. Users loved what they were getting, right? They clicked “I agree”. This is a fair trade… right?

Fundamental principles

Fundamental principles are being tested for the first time. Fiction has explored the concept of "big brother watching you" for decades. But the reality is that it's been impossible to watch *everyone* in a free society. Until now. With freedom comes an implied expectation of privacy. Can freedom exist without protection of privacy? I don’t think so. We *need* the ability to explore aspects of ourselves without fear of exploitation. Society benefits from the individuals in the society being themselves. Thus, protecting privacy is as important as freedom itself.

The phrase “peeping into the bedroom window” makes people uncomfortable. Certain things are held sacred; and the norms, rules and laws of society are supposed to protect its members when they are the most vulnerable.

It’s arguable that Kogan was not technically peeping in the window in this situation. Through Kogan’s app, he was invited into the bedroom, then broke a promise of confidentiality by sharing what he saw (and recorded) by providing it to Cambridge Analytica. Facebook has noted that this was not a security breach or a hack, but a breach of their terms of use for that data.

This new usage of data analytics has created hidden paths and invitations that threaten the privacy of the members of our society every bit as much as a peeping Tom at the bedroom window. Even more so since you don’t need anything as obvious as a ladder to gain this knowledge.

End of a great experiment?

So, should we call for an end to Big Data? Is machine learning a dark art that should be banned ... too powerful to be used?

I don’t think so, for several reasons. This box has been opened. These techniques work and will be used. Banning research doesn’t stop it, it just pushes it underground.

Besides, there is HUGE societal benefit from the results of some machine learning. We need to explore how to learn more from it, not less. We just need to do it responsibly.

We maintain that we don’t need to collect every shred of available data on our community and their interactions with Mycroft to create a fantastic user experience.

But we do need some data, so we’re using an opt-in mechanism to allow our community members to choose if any of their information is stored. We’re building Open Datasets - anonymized information and recordings from our opted-in community members, for use by anyone.

We think seriously about how to collect, store, and use data responsibly and transparently.

What else could we do?

Utopia

Modern technologies will help in this challenge. The same way we were able to use pervasive computing and networking to harvest and analyze data in unprecedented ways, we can also use these to create personal protections in ways that were previously impossible.

With tools like encryption and the blockchain I believe we can not only create privacy protections as policy but also guarantee it. I’ll be expanding on these ideas publicly soon.

For now, I’d like to hear if I’m alone in my thinking. Is this a real danger? Are we vulnerable? Does anyone care?

Join us on the Forums to share your thoughts.

pcwii · April 18, 2018, 5:24pm

Thanks for this great insight. I am happy to say that the day Facebook announced they were going public was the day I deleted my account. Something didn’t sit well with me about these online companies turning a profit on what is essentially my intellectual property, everything that defines me as a person. Unfortunately, for me as for many others, Facebook is not the only online account we subscribe to, some willingly and some unwillingly. It sometimes feels like an uphill battle, a struggle between the benefits and convenience of online services and the loss of all anonymity.

jdub4237 · April 19, 2018, 3:47pm

I agree that we are at a precipice with data and privacy and we need to think very clearly on what is acceptable for society moving forward. At the advent of social media, being a CTO already and having a good understanding of what is possible with technology, I was very careful to only contribute to these platforms what I wasn’t afraid to lose. As I have told many people about free technology, apps, internet services or social media platforms, nothing is for free. When you agree to be a part of that experiment you are giving up your rights to whatever you contribute. Regardless of what Facebook says, your data is not yours. If it has been shared outside of your brain, you own less of it than when you conceived the thought. This is the nature of humanity and it existed before computers. I think we need to be guarded, but it would help if there were bumper rails for those who are not as technology savvy as some of us. Opt-in is great, but if what you are opting into doesn’t have your best interest at heart, it doesn’t matter. Facebook has opt-ins and this was still possible. We need to be more careful, and I am up for conversations in that regard for certain.

Sn0wbl1nd · April 19, 2018, 4:53pm

In a fundamental way, big data is blurring what could once have been considered the private and public spheres. Graph analysis reveals much more about you than just what is contained in the images and likes you share. I liked the blog post - very well considered. I recently closed my Facebook account and explained my reasons here: https://wp.me/p5DunY-2Y

My preferred future is pods that contain data for a single entity alone, with a standardized communication protocol to share data and messaging. Such a pod would also contain software and run in a container/instance/etc. In the Mycroft universe this could become equipped with a voice interface and handle my messaging according to my settings while I am off-line. A personal assistant if you will. When you delete your pod, so goes all it contains.

How would online companies make money? Not by selling the right to interact with others online, or create walled gardens. They would have to negotiate for access to your data, or provide services (such as voice capability) to individual pods.

If you think on how this is any different from having your own website, or storage cloud, or computing instance… not very much, and completely different. The difference is in how you present and shape the experience, lower the threshold. Much like fb has.

gblaze53 · April 19, 2018, 6:28pm

Thanks Steve. As someone who works in the INFOSEC field at IBM, I find people are often too unconcerned about their information. They often feel either that it isn’t important or that no one would be interested, even with all evidence to the contrary. I certainly hope more people see this and start taking their privacy as seriously as the food that they eat.

festro · April 19, 2018, 10:23pm

I agree entirely. I believe we are just as much to blame as the companies; people need to be better educated in tradeoffs. Take cellphones, for example: people want slim, feature-packed microcomputers in their pocket, but complain when the battery has to be made smaller to fit their mega cameras or dedicated audio boards. The problem I’m sure many of us have observed is that a lot of people nowadays put ideology before practicality and reality, then look confused when it falls through. The same thing happened with social media (Facebook in this case). We openly placed our lives, personal information and things we normally wouldn’t share publicly on an online platform that has a bad habit of dropping the ball, yet here we are with our mouths gaping: “How did this happen?” We need to start reflecting on ourselves and our habits, asking ourselves whether we need to tell the world that the whole family is going on vacation, and then hope that someone in our circles isn’t a burglar looking for an easy target. I personally have given up on online privacy. It’s far too easy to get dragged in, far too convenient to live with and far too difficult to avoid or get out.

Flavius · April 19, 2018, 11:55pm

“Power tends to corrupt and absolute power corrupts absolutely.” Most of you will have heard that quote taken from Lord Acton’s letter to the Anglican Archbishop Creighton. What is less commonly known is the line that followed which indicated that as a result: “Great men are almost always bad men.” Don’t be taken in by vague promises of putative “safeguards” - by its very nature - great power will lend itself to abuse. Be careful of TOS agreements that are meant to obscure the facts rather than clarify them. A 2016 study in the US clearly demonstrated that participants would blithely agree to allow data sharing with NSA and employers along with the surrender of their first born as payment in return for access to a bogus social networking site called “NameDrop” that the student participants believed was real.
https://arstechnica.com/tech-policy/2016/07/nobody-reads-tos-agreements-even-ones-that-demand-first-born-as-payment/
Give some thought as to how high a price is too high in relation to what a “free” service offers to give you. If you don’t pay for the service - you are the service. How cheaply will you sell yourself? Is the genie out of the bottle? Absolutely. The question is: will we allow it to run amok? Demand accountability while we still have a say in the matter.
Right now we still have a choice - so please do support Open Source programs like Mycroft which are willing to put it out there for discussion of the issues and the consequences and give you the opportunity to opt-in rather than making you jump through hoops in order to opt out. Choose the innovators who treat you with respect - it may take Mycroft a bit longer to get us where we want our tech to be - when compared to companies with the resources of an Amazon, Microsoft, Google etc. but the wait is so very much worth it. It is essential that we safeguard what is left of our privacy - we have given up too much already whether in pursuit of illusory safety from the boogeymen or mere convenience. My thanks to Steve Penrod for not shying away from the topic and to all who respond with their comments and ideas for caring enough about the matter.

eric-mycroft · May 2, 2018, 7:26pm

Heard on NPR coming back from work that Cambridge Analytica is shutting down.

Article mentions customers leaving and a slew of lawsuits. Also notes its former parent SCL Group will close.

https://www.forbes.com/sites/geneely/2018/05/02/awash-in-facebook-data-scandal-cambridge-analytica-shuts-down/#1efe6bc331d4

iteco · May 3, 2018, 8:37pm

@steve.penrod good post. We care.

One comment on the opt-in collection of sound samples, and on my understanding that you don’t get enough samples… I would not like to give my voice to an open dataset, but I would be fine giving it to you (Mycroft) if you would promise not to pass it on. (At least as long as Joshua is the CEO and has the majority of the stock.) I would even dare to propose that you could make this (allow only Mycroft to use it and not share it with anybody else) the default option, and allow users to change it to give nothing, or to include it in the open dataset.

Ron777 · October 19, 2018, 4:41pm

Are you looking at methods to scrub opted-in voice donations that would prevent any chance of identifying the source, either alone or combined with other data? For example:

remove metadata (name, IDs, IP address, etc.)
remove potentially identifying parts of speech such as names, addresses, etc.
modify the audio qualities to prevent reverse engineering for speaker recognition
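
As a rough illustration of the first item only, here is a minimal Python sketch of dropping directly identifying metadata from a donated sample's record. The field names and the record layout are assumptions made up for this example, not Mycroft's actual schema, and removing metadata alone does not address the second and third items (identifying words in the speech, or the voice itself).

```python
# Hypothetical set of directly identifying fields; not Mycroft's real schema.
IDENTIFYING_FIELDS = {"name", "user_id", "device_id", "ip_address", "email"}


def scrub_metadata(record: dict) -> dict:
    """Return a copy of the record without the directly identifying fields."""
    return {key: value for key, value in record.items()
            if key not in IDENTIFYING_FIELDS}


# Made-up sample record for illustration.
sample = {
    "name": "Ron",
    "ip_address": "203.0.113.7",
    "utterance": "what time is it",
    "sample_rate_hz": 16000,
}

cleaned = scrub_metadata(sample)
print(cleaned)
```

The function returns a new dictionary rather than editing the record in place, so the original stays untouched until the scrubbed copy has been checked.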

To do this would be a huge step to enable people to participate.

Ron

ldrscke · October 29, 2018, 4:54pm

Is an interesting read regarding this topic