Every age has its valuable commodities: salt, ivory, silk, spices, opium or oil. In the 21st century the most valuable commodity to possess is data. Data allows us to understand how our processes work and even to predict what is going to happen in the (near) future. This sounds great! We can make our processes more efficient, reduce waste and waiting time, detect failures before they cause trouble, discover important correlations between seemingly unrelated events, and the list goes on. Data is one of the most powerful things the world has ever seen.
But what about the data itself? Salt, silk and oil are commodities that you can simply buy, and once you have bought them they belong only to you. If you were to buy a machine and fit it with some sensors, it seems very reasonable to think that the data from those sensors belongs to you, and that what you do with the data is also up to you.
Generally speaking, you are interested in optimizing the total process. This means building a predictive model based on all the data and balancing the cost of false positives against that of false negatives. For example, you would rather replace a part in an assembly line a few weeks too early (which costs a little money) than be a day too late, causing your whole factory to shut down (which costs a lot more money). For the individual component that was still functioning fine but was replaced prematurely, this is not entirely fair, but since it’s only a piece of metal no one really cares.
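To make that trade-off concrete, here is a minimal sketch of the reasoning, not a real maintenance model. It assumes we already have some estimate of the probability that a part fails before the next inspection; the function names and the cost figures are made up for illustration.

```python
def should_replace(p_fail, cost_early, cost_failure):
    """Decide whether to replace a part now.

    p_fail       -- estimated probability the part fails before the next check (assumed given)
    cost_early   -- cost of replacing a part that was still fine (the false positive)
    cost_failure -- cost of an unplanned shutdown (the false negative)
    """
    # Replace as soon as the expected cost of waiting exceeds the cost of acting early.
    return p_fail * cost_failure > cost_early


def break_even_probability(cost_early, cost_failure):
    """Failure probability above which early replacement is the cheaper choice."""
    return cost_early / cost_failure


if __name__ == "__main__":
    # Hypothetical numbers: a spare part costs 1,000, a shutdown costs 100,000.
    print(break_even_probability(1_000, 100_000))
    print(should_replace(0.05, 1_000, 100_000))
    print(should_replace(0.005, 1_000, 100_000))
```

The more expensive a failure is compared to an early replacement, the lower the break-even probability becomes, and the more perfectly good parts get replaced "just in case".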
Now let’s turn our attention to the field where big data is really taking off: ‘Living Data’. Here the objects under scrutiny are human beings, you and me, as we navigate through the digital world. Extensive profiles are being generated which try to describe us as accurately as possible, and many decisions are made by models that analyze these profiles. But who owns this data and what are they allowed to do with it? Can anybody just put up a sensor (pixel), trade your personal information and make generalized predictive models? What if I am arrested prematurely based on my suspicious profile attributes while I’m actually innocent? Or what happens when I am unable to get health insurance because the model says the risk is too high for someone with my features? In statistics it’s very customary to make generalizations and discriminate against minorities (which are euphemistically called outliers), because only the total result counts. But what if the outliers are actual people?
At ORTEC for Communications we capture a lot of data, including data generated by humans, and use it to optimize your reading experience and make smart personalized recommendations. But there are two concepts that are very important to us: transparency and impartiality. Transparency means making clear which information you are storing and how you are using it in your product. The user should always remain the owner of their data.
Impartiality means not discriminating against people based on their digital passport. The imgZine recommender engine gives you content depending on what you do, not who you are. By analyzing your reading behavior, the algorithm finds good suggestions that are completely adjusted to your personal tastes. And because the user data can be stored anonymously, you will not be haunted by a profile you have no control over.
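As an illustration of the idea, and explicitly not the actual imgZine engine, the sketch below recommends articles purely from reading behavior and stores that behavior under a keyed hash instead of the real user ID. All names, the secret key and the tag-counting score are assumptions made for this example. Note that a keyed hash is strictly speaking pseudonymization; real anonymity takes more care than this.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret; in practice it would be kept outside the code.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonym(user_id):
    """Map a user ID to a stable token that does not reveal the ID itself."""
    return hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def record_read(store, user_id, tags):
    """Store what was read (article tags), keyed by pseudonym, not by identity."""
    store.setdefault(pseudonym(user_id), Counter()).update(tags)


def recommend(store, user_id, candidates, top_n=3):
    """Rank candidate articles by how often the user has read their tags."""
    profile = store.get(pseudonym(user_id), Counter())
    scored = [(sum(profile[tag] for tag in tags), title)
              for title, tags in candidates.items()]
    # Drop articles with no overlap; sort by score, then title for a stable order.
    scored = sorted((s, t) for s, t in scored if s > 0)
    scored.sort(key=lambda pair: -pair[0])
    return [title for _, title in scored[:top_n]]


if __name__ == "__main__":
    store = {}
    record_read(store, "alice@example.com", ["logistics", "data"])
    record_read(store, "alice@example.com", ["data"])
    articles = {"A": ["data"], "B": ["sports"], "C": ["logistics"]}
    print(recommend(store, "alice@example.com", articles))
```

Nothing in the store says who the reader is, only what was read, which is all the ranking needs.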
Not storing or analyzing data is definitely no longer an option in this digital era. But being transparent about what you’re doing, while respecting the rights of the people behind the data points, is extremely important. In the end, this will help keep your users happy and make your product successful.