Sharing en masse with AI


Although research on artificial intelligence has been conducted for many decades, it is only in the last few years that we can speak of artificial intelligence in everyday life. The turning point was the year 2006.

At least two developments coincided around that year.

First, processing capacity had been growing exponentially year after year. This made it possible to put into practice the theoretical idea of multilayer neural networks, which had been developed and published in the 1980s but was shelved at the time because there was not enough computing power.

Second, large libraries of samples collected through crowdsourcing on the world wide web became available to researchers. Artificial intelligence is not just algorithms designed by the human hand; it is algorithms that machines develop on their own by comparing samples. The emergence of the internet and cloud computing facilities enabled the collaborative work of millions of people, something we describe as the collaborative commons. It could be tapped as a ready-made library of millions or even billions of samples from which AI could draw comparisons and derive its own algorithms. The experience gained from this early work opened the door to further R&D.


Putting aside definitions of artificial intelligence, what we have to distinguish here are algorithms written by a human hand and fed into computers, and algorithms written by computers on their own.

A simple example is robots in a 3.0 factory versus robots in a 4.0 factory. Already in the twentieth century, in 3.0 factories, robots replaced workers at assembly lines. Those robots were programmed: the algorithms were designed and fed into computers by human programmers, and it was those algorithms that steered the robots or robotic arms. The moves at an assembly line were predictable and could be described precisely, with a yardstick: 250 millimeters forward, five turns right, 250 millimeters forward.
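Such a hand-written industry 3.0 routine could be sketched like this (a minimal illustration only; the function and move names are hypothetical, not any real robot controller API):

```python
# A hand-written "industry 3.0" work cycle: every move is fixed in
# advance by the human programmer, nothing is learned by the machine.
# The move names and units here are invented for illustration.

def assembly_cycle():
    """Return the fixed sequence of moves one work cycle consists of."""
    moves = []
    moves.append(("forward_mm", 250))    # 250 millimeters forward
    for _ in range(5):
        moves.append(("turn_right", 1))  # five turns right
    moves.append(("forward_mm", 250))    # 250 millimeters forward again
    return moves

print(assembly_cycle())
```

The point of the sketch is that every step is spelled out in advance; the robot never deviates from this list, which is exactly why it only works when the environment is fully predictable.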

Today, in industry 4.0, we no longer talk about algorithms invented by human brains and written by human hands. We are talking about artificial intelligence based on deep learning schemes, where machines find solutions on their own, many of them solutions a human brain could not have grasped. Nobody feeds a 4.0 factory algorithm into the computer. Instead, a robot pilot, a human, may use a joystick or an exoskeleton, or sensors attached to his own arms, to move the robotic arm over and over again, making the same moves in a changing environment, for example picking up different objects that lie randomly on the floor. Expressing this in an algorithm would be difficult and time-consuming for a human programmer. So the human pilot makes the moves, and the computer describes those moves by writing its own algorithms, each time adding new functions. In simple words: a human operates a joystick; the computer describes the moves with an algorithm. The roles are reversed compared with the industry 3.0 world.


AI and robots increase the efficiency of processes in industry and in other spheres of our lives. It is a stated fact that the limits of AI and robotics technology keep moving forward. The main obstacle is the unpredictability of processes; mastering it will remain a human domain for a long time yet. Still, more and more processes can be digitized. It is also a stated fact that AI and robotics can profoundly change labor markets. It is by far not only a problem of structural unemployment, like replacing the blue-collar workers of a car factory with robots. It is also about a new culture of retraining and upskilling throughout one's entire life. We already know there is no longer such a thing as a job for life. Algorithms are today competing with white-collar workers, and robots are competing with and replacing blue-collar workers. Until a few years ago, it was said that programming was a profession with a great future. Today we know that the basic skills of a programmer are not enough; you need to know more than just programming to stay in the job market. Yet it was the programmers who made today's developments possible. They created something that evolved into a competitor, able to develop its own programming, not always understandable to a human.

In the meantime, we have reached a point where everybody connected to the world wide web is involved in the process of AI training.


If you show a dog to a little child and say, 'this is a dog', the child will easily recognize a dog for the rest of his or her life. That is due to the fantastic abilities of the human brain. How? Nobody knows; it just happens. It gets more complicated with machines and computers. There is no straightforward algorithm you can feed into a computer so that it recognizes a dog in a picture. Computers needed a vast number of dog photos to derive a recognition pattern. Did Google hire photographers to take as many dog images as possible? No, we could not be more wrong here.

Those who helped out were actually social media users. For years, millions of internet users have uploaded, among other things, images of dogs to the world wide web under a dog hashtag. If you read the Google terms of use carefully, you will find that you agreed to let Google use your content to improve its services. If you examine the wording closely, you will see that you agreed that your content might be used, among other things, for AI training. To train artificial intelligence, Google uses crowdsourcing. We use most of its services for free, but in return we deliver data and other valuable resources that Google can use to develop its services further.

So those millions upon millions of pictures were fed into computers. By comparing them with each other, detail by detail, pixel by pixel, computers learned the distinctive features of a dog. After examining many images and building up algorithms, a machine knows how to recognize a dog in a picture, and how to recognize that something is not a dog. The process is called deep learning and is based on neural networks. Neural networks are said to simulate processes in the human brain. They are, however, not some sophisticated biotechnology; artificial neural networks are just layers of multiple functions with inputs and outputs. Just programming and computation. The human contribution consists of delivering a sufficient number of samples, including know-how samples, feeding them into the computers, and giving feedback.


An artificial neural network is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like a synapse in a biological brain, can transmit a signal to other neurons. An artificial neuron that receives a signal processes it and can signal the neurons connected to it. The artificial neurons are typically organized into multiple layers. Neurons of one layer connect only to neurons of the immediately preceding and immediately following layers. The layer that receives external data is the input layer; the layer that produces the final result is the output layer. In between them, there are zero or more layers. Many of those in-between layers are hidden, meaning that a human AI trainer cannot see into them. The hidden layers are AI's secrets. It is in this context that we use the term deep learning. So humans reveal themselves en masse to AI, while AI keeps its secrets well.
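A minimal sketch of such a layered network might look like this in Python (the layer sizes, weights, and activation function here are arbitrary illustrative assumptions, not any particular production network):

```python
import math

# A toy fully connected network: input layer -> one hidden layer ->
# output layer. Each "neuron" is just a weighted sum of its inputs
# passed through an activation function. Weights are arbitrary.

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One layer: each output neuron sums all inputs times its weights."""
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def forward(inputs):
    # two input values -> two hidden neurons -> one output neuron
    hidden = layer(inputs,
                   weights=[[0.5, -0.2], [0.3, 0.8]],
                   biases=[0.0, 0.1])
    output = layer(hidden, weights=[[1.0, -1.0]], biases=[0.0])
    return output[0]

print(forward([1.0, 0.5]))  # a single number between 0 and 1
```

Note how little "brain" there is here: layers of functions with inputs and outputs, exactly as described above. Training consists of adjusting the weight numbers until the output is useful.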

Incoming impulses (inputs) are passed in a domino effect from one neuron, or better, node, to the next. It is not a one-time process. Each time new input is added, the artificial network adapts to the new experience. If necessary, it rewrites, updates, or extends itself, again and again, with each and every new experience. It takes thousands upon thousands of data sets (inputs) to train a machine. It is like with a human: by seeing more and more x-rays labeled with cancer, a young doctor gathers experience. With time and more x-rays examined, the doctor becomes more experienced and makes diagnoses with more confidence. The same happens with artificial neural networks. However, a computer's capacity to remember all cases is better than a human's. What is more, the computer may spot details a human brain could not track. In this way, an artificial neural network may outperform even the most experienced doctors. But before the algorithms set by the artificial neural network can be trusted to recognize an unlabeled x-ray, the accuracy ratio must be checked and approved by experienced doctors.
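The idea that every new labeled sample nudges the network a little can be sketched with a perceptron-style update rule, one of the simplest learning algorithms there is (a deliberately tiny stand-in for real deep-learning training; the toy data and learning rate are invented for illustration):

```python
# Incremental learning in miniature: each labeled sample adjusts the
# weights slightly, like a doctor gaining experience x-ray by x-ray.
# The data below is a toy stand-in for thousands of labeled samples.

def predict(weights, x):
    """Classify: 1 if the weighted sum of features is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0

def update(weights, x, label, lr=0.1):
    """Nudge the weights only when the current prediction is wrong."""
    error = label - predict(weights, x)
    return [w + lr * error * xi for w, xi in zip(weights, x)]

weights = [0.0, 0.0, 0.0]           # start knowing nothing
samples = [([1, 2.0, 1.0], 1),      # bias input, two features, label
           ([1, -1.5, 0.5], 0),
           ([1, 3.0, 2.0], 1),
           ([1, -2.0, -1.0], 0)] * 20  # "more and more x-rays"

for x, label in samples:
    weights = update(weights, x, label)

print([predict(weights, x) for x, _ in samples[:4]])  # → [1, 0, 1, 0]
```

Deep networks use a more sophisticated update (backpropagation through many layers), but the principle is the same: wrong answers move the weights, and the experience accumulates with every sample.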

Here we come to the idea of reinforcement learning. It is not only that the machine recognizes patterns on its own. The artificial intelligence trainer assesses the outcomes, telling the machine whether each outcome is correct. If it is not, the machine gets feedback from the human trainer and learns never to make the same mistake again: the neural network's algorithms are adjusted so that the mistake is not repeated.

This technique is even more useful for processes that are not about a simple zero-versus-one input. An example of reinforcement learning is feeding a computer more than 1,000 movies to teach a machine to converse with a human. The computer scans and learns all the dialogues from all the films. Then the AI trainer strikes up a dialogue with the computer. The computer has thousands upon thousands of dialogues to choose from, but the trainer starts a conversation of his or her own, one that does not repeat any dialogue from any of those films. If the computer responds well, meaning the response is logical or simply continues the trainer's conversation thread, the computer is rewarded with a high score and knows the answer was all right. If the response is false or wrong, the trainer's score is low; the computer will not repeat that kind of response, and the algorithms in the neural network are adapted correspondingly.
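The reward loop described above can be sketched like this (a toy illustration only; the candidate responses and the scoring rule are invented, not how any real dialogue system works):

```python
import random

# A toy reward loop: the machine keeps a score for each candidate
# response; the human trainer's feedback raises a good response's
# score and zeroes a bad one, so the mistake is never repeated.
# All response texts here are invented for illustration.

scores = {
    "Nice to meet you too.": 1.0,
    "I like that movie as well.": 1.0,
    "Purple monkey dishwasher.": 1.0,   # a nonsensical reply
}

def choose(scores):
    """Pick any response that has not been ruled out by a zero score."""
    candidates = [r for r, s in scores.items() if s > 0]
    return random.choice(candidates)

def feedback(scores, response, good):
    if good:
        scores[response] += 1.0   # high score: keep this kind of answer
    else:
        scores[response] = 0.0    # low score: never repeat this response

feedback(scores, "Purple monkey dishwasher.", good=False)
feedback(scores, "Nice to meet you too.", good=True)
print(choose(scores))  # never the zero-scored nonsense reply
```

A real system adjusts millions of network weights rather than a handful of scores, but the mechanism is the same: the trainer's score, not a hand-written rule, is what shapes the machine's future behavior.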


Crowdsourcing on the internet and social media is just one way to collect the samples needed to train artificial intelligence. It is public knowledge that our hashtagged photos are used to train AI to recognize objects, emotions, animals, birds, and so on. Everybody who accepted the Google terms of service is involved. What else Google uses is probably known only to a select group of the company's employees.

Another form of crowdsourced sample collection is crowdsourcing limited to people who have been provided with appropriate software or hardware by the trainers. It is a kind of private crowdsourcing. And it is often not about some work output; it is about skills and know-how.

An example is companies working on autonomous cars. Today, Tesla cars are not fully autonomous, but they are already able to drive autonomously under very specific conditions. They are equipped with hardware whose task is not only to drive the car autonomously but, above all, to collect information about the various traffic situations the car finds itself in. It is Tesla's users who collect the samples, i.e., data about various traffic situations on the road, and transfer them in large quantities to Tesla headquarters, where employees process them and feed them into computers as training samples. Mercedes-Benz installs similar sensors and software in its new trucks. These trucks do not run autonomously, although on some routes they would be capable of it; their main task is to collect data on what happens on the roads. Based on this data, Mercedes-Benz is developing its autonomous truck technology. So in fact, every driver of a connected car is involved in developing autonomous vehicle technology, by showing the machines how to react to what happens on the road. What they share with the machines is their driving skill.

Another example is software used to correct texts, typos, and grammar errors. Suppose we are connected to the internet and use such software. Each time we accept or reject a suggested amendment, we create feedback that is automatically sent to the software's manufacturer, who then applies it in a training program. So it is the actual users, not the software producer, who train the AI. This kind of feedback is another example of reinforcement learning. And here, what we share with AI is our linguistic skill.
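The accept/reject feedback such software might collect can be sketched as simple counted events (all field names and example data here are hypothetical, invented for illustration):

```python
from collections import Counter

# A hypothetical sketch of the feedback a grammar checker could send
# home: one event per accepted or rejected suggestion. Tallied across
# millions of users, these counts become a training signal.

events = [
    {"original": "teh", "suggestion": "the", "accepted": True},
    {"original": "teh", "suggestion": "the", "accepted": True},
    {"original": "colour", "suggestion": "color", "accepted": False},
]

def tally(events):
    """Count accepts and rejects per suggestion - the raw training signal."""
    counts = Counter()
    for e in events:
        counts[(e["original"], e["suggestion"], e["accepted"])] += 1
    return counts

counts = tally(events)
print(counts[("teh", "the", True)])  # → 2
```

Each individual click is worthless on its own; it is the aggregation over millions of users that makes the signal strong enough to train on.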

Artificial intelligence can also be trained under laboratory conditions. One of the coolest examples I have recently seen on the internet comes from South Korea, where a few young people from a start-up were training a robot to clean up a messy room. That cannot be done by crowdsourcing. At least, I think so.


Artificial intelligence is machines trained by humans. The training process is called machine learning, or, when it is more complicated and involves machines working on their own, with some stages of learning hidden from the human teachers, deep learning. The process is based on artificial neural networks: layers of functions imitating complex processes in the human brain. For most of the process, it is self-training by software exposed to vast amounts of big data, samples, or experiences in order to perform various tasks. It is a supervised process in which machines use human trainers' inputs, such as resources or know-how, and get feedback (reinforcement learning). Cloud libraries of samples and experiences, gathered by hardware designed to collect them, are used in the process. The samples are either delivered by crowdsourcing or produced under controlled lab conditions.

The fact is that AI is trained on the collective know-how and work output of everybody connected to the world wide web. We are sharing en masse …

Photo by Laura Musikanski and Pixabay on Pexels