DATA TRANSFORMATION AND DATA MANAGEMENT
Cap 1: Introduction to information theory and data management
Data, information and knowledge
Data are symbols that represent a reality that can be made up of people, objects,
events or activities. Raw data has no meaning in itself; however, it can assume a
meaning if a descriptor is connected to it.
Data, when it has meaning, i.e. when it is linked to a descriptor and is aimed at
decision making, provides information. With this term we define everything that
reduces the state of uncertainty about a given situation.
The information is produced through the processing (or elaboration) of data that is
the execution on them of simple operations or complex operations. We can
therefore affirm that data are transformed into information through the process of
elaboration.
When we deal with many data examples, or when the patterns are too complex to inspect
directly, the patterns we look for should be:
- Insight [intuizione] = the pattern should give us relevant, non-obvious information
about the problem.
- Actionable [processabile] = the insight we get should be something we can use
in some way.
Information has potential value; this value derives from the usefulness of the
information, which makes it economically significant.
We have different characteristics that give value to information such as:
- Completeness: it must contain all important data.
- Flexibility: the information must be able to support different decision-making
processes.
- Destination: the information must reach the person who uses it and who
requested it.
- Simplicity: the information must be as simple as possible, avoiding
redundancy.
The information intensity matrix
The information intensity matrix allows us to understand how this different value of
information arises.
On the horizontal axis is the information content of the product or service. On the
vertical axis is the information intensity of the process, i.e. the information
needed for the production process to be carried out correctly.
Starting from quadrant (1), we have products with an almost zero information
content while, on the contrary, the information intensity of the process value
chain is very high. In quadrant (2) we find services with a high information content
that use a large amount of data. In quadrant (3) we find fashion goods, whose
value chain does not need a large amount of information, while commercial
activities and advertising campaigns are information intensive. In quadrant (4) we find
the extractive industry and construction, which have a product with a low information
content and a production process with an equally low need for information.
In the practice of production and markets, we have witnessed [assistito] the
phenomenon of so-called commoditization, where goods and services lose the
distinctive and unique characteristics that determine their value.
In this way the competitive advantage is lost and profitability is eroded, with
possible effects also on the survival of companies. Servitization strategies are
therefore very important: they allow companies to maintain or increase the perceived
value of their products and, in turn, their profitability.
Decision levels: Anthony’s pyramid
We have 3 levels of decision-making.
We start from the strategic decision level, the top management. Through their
decisions and their strategies, they can ensure that the company has a competitive
advantage and can become market leader.
Going down, we have the tactical decision level, whose decisions have an impact in
the short and medium term.
Finally, we have the operational decision level which is made up of actions that are
carried out to make an organization work.
Big data
Big data are huge information datasets, together with the capability to store,
interpret and re-use the data in ways that add value.
The terms big data and big data analytics are used to describe both the data sets
and the analytical techniques applied to them. The main characteristics of big data
were typically described using the “V” words (originally three, later extended to five):
1- Volume: refers to the scale of data. Every day, about 2.5 quintillion (2.5 × 10^18)
bytes of data are created. Cloud and virtualization help in the process of
storing and accessing the data.
2- Velocity: refers to big data being transmitted and made available in (near) real
time, with new data arriving every minute.
3- Variety: refers to the fact that data is available on new and diverse types of
variables and comes with less structure.
4- Veracity: refers to the certainty of data: it is estimated that poor data quality
costs the US economy $3.1 trillion every year, and 1 out of 3 business leaders
does not trust the information they use to make decisions.
5- Value: refers to the ability to use big data to obtain added value information
and insights derived from superior analytics and data handling.
Extracting value out of big data requires significant innovations.
Big data and open data paradigm
The open data paradigm is becoming more and more relevant. Open data reduces
friction in transactions, and markets operate better in low-friction environments. It
makes it possible to use existing resources better and to create new products and
services.
The added value is described through the data value chain, composed of:
1- Data providers.
2- Intermediaries: data aggregators and data enablers.
3- Providers of products and services to consumers: developers and data users.
4- End users: direct users and other.
Cap 2: A brief history of computing and data science
Digital era: John von Neumann architecture
In 1945 John von Neumann proposed his famous scheme called the von Neumann
architecture.
The First Draft describes a design architecture for an electronic digital computer
with these components:
- A processing unit that contains an arithmetic logic unit and processor registers
- A control unit that contains an instruction register and program counter
- Memory that stores data and instructions
- External mass storage
- Input and output mechanisms
This conceptual scheme has introduced significant innovations. In particular we
have:
- The separation between data and instructions
- The existence of several instructions that form a program
This architecture, unlike the first computing machines, allows (re)programming,
avoiding the often arduous procedure of rewiring and rebuilding the system.
Transistors technologies and programming languages
At the beginning of the 1950s transistor technology was introduced. This phase saw the
introduction of the so-called user-oriented programming languages, which are no
longer exclusively machine oriented. They represented a fundamental step in the
evolution of computing: first-generation computers were programmed by entering
bit strings (0s and 1s), making this operation very complex. Now, instead, the
programmer writes instructions as expressions in a language (relatively) close to
the one used in everyday life.
Moore’s law and future of computing
In 1965 Gordon Moore formulated a prediction: the capacity of computer chips would
double every year. This statement was made on the basis of the analysis of
computer chips in the previous years. Ten years later his prediction was confirmed,
and Moore then refined it by saying that the capacity would double every two years.
At present, the forecast has stabilized at an average of one doubling every 18 months.
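One way to state the 18-month version as a simple formula (a worked example, not part
of the original notes): capacity(t) ≈ capacity(0) · 2^(t/18), with t in months. Over
6 years (72 months) this gives a factor of 2^(72/18) = 2^4 = 16.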
Nano technologies and quantum computing
The experimentation phase of nano technologies and quantum computing is coming
to an end. The original purpose was to build electronic components on an atomic
scale.
Quantum computing represents a revolution in terms of speed of calculation; it
exploits quantum-mechanical interactions and is based on the concept of the quantum
bit (qubit), which can be represented by an atom in one of two different states that
can be considered as 0 and 1. Qubits have a bizarre property, superposition: they can
exist simultaneously as 0 and 1, and therefore a quantum computer can assume
multiple states simultaneously.
The second property is entanglement, which allows a mysterious connection between
qubits within the quantum computer.
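In standard quantum notation (not part of the original notes), a single qubit state can
be written as |ψ⟩ = α|0⟩ + β|1⟩ with |α|² + |β|² = 1, so the qubit is “both” 0 and 1
until it is measured; an entangled pair of qubits can be in the state (|00⟩ + |11⟩)/√2,
which cannot be split into two independent single-qubit states.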
Cap 3: Data transmission networks
ISO/OSI model
The dialogue between computers is not simple and immediate. The ISO (International
Organization for Standardization) OSI (Open Systems Interconnection) model was
introduced in 1978 to provide standards for the connection of open systems and a
reference model that can be used to compare different network architectures.
The ISO-OSI model has 7 layers:
1- Physical layer: it deals with the transmission of raw bits on a communication
channel; it concerns mechanical and electrical aspects.
2- Data link level: the purpose of this level is to make the raw physical transmission
medium appear to the upper level as a reliable link, thanks to encapsulation and
decapsulation.
3- Network level: the purpose of this level is to control the operation of the
communication. Its main tasks are routing, data conversion and so on.
4- Transport level: the purpose of this level is to accept data from the top layer,
break it into packets, pass it to the network layer and make sure that it
reaches the peer entity (receiver).
5- Session level: this layer is very important because of the token management
that authorizes the two parties to start the transmission.
6- Presentation layer: it is interested in the syntax and semantics of the
information to be transferred.
7- Application level: it provides a variety of protocols that are necessary to offer
the various services to the users.
Layers 1, 2 and 3 are the media layers, related to the physical components (hardware).
Layers 4, 5, 6 and 7 are the host layers, related to the software.
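The encapsulation/decapsulation idea can be illustrated with a minimal, purely
illustrative Python sketch (only the layer names come from the list above; the
"headers" are made up):

    # Purely illustrative sketch: each layer wraps the data from the layer above
    # with its own (made-up) header on the way down, and removes it on the way up.
    LAYERS = ["application", "presentation", "session", "transport",
              "network", "data link", "physical"]

    def encapsulate(payload: str) -> str:
        for layer in LAYERS:                      # going down the stack
            payload = f"[{layer}-header]{payload}"
        return payload

    def decapsulate(frame: str) -> str:
        for layer in reversed(LAYERS):            # going up the stack at the receiver
            frame = frame.removeprefix(f"[{layer}-header]")
        return frame

    assert decapsulate(encapsulate("hello")) == "hello"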
TCP/IP model
The TCP-IP model has 4 levels:
1- Host-to-network: this layer explains the ability of the host to send IP packets
over the network.
2- Internet level: its role is to inject packets into any network. IP (internet
protocol) has 3 duties: addressing packets, routing and congestion control.
Each machine connected to the internet is uniquely identified by a 32-bit IP
address that identifies two things (see the address sketch after this list):
- Network number: the number assigned to the IP network on which the
computer is located.
- Host number: the number assigned to the computer on that network. The
combination is unique: two identical IP addresses cannot exist.
3- Transport level: it is designed to allow conversation between source and
destination host (end-to-end). TCP (transmission control protocol) governs
the operations. It fragments the incoming streams into separate messages
that are passed to the internet layer.
4- Application layer: in the TCP-IP architecture there are no session and
presentation layers because the application layer contains all the high-level
protocols used by real applications: FTP (file transfer protocol), SMTP (simple
mail transfer protocol), HTTP (hypertext transfer protocol), etc.
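As a minimal sketch of the network/host split (the address and mask below are made-up
examples), Python's standard ipaddress module can separate the two parts:

    import ipaddress

    # Hypothetical address 192.168.10.42 on a network with a /24 mask.
    iface = ipaddress.ip_interface("192.168.10.42/24")
    network_number = iface.network.network_address        # 192.168.10.0
    host_number = int(iface.ip) - int(network_number)     # 42
    print(network_number, host_number)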
TCP-IP is an architecture: it uses well-defined protocols, it is the architecture used
by the internet, it is considered more reliable in practice and it follows a horizontal
approach.
ISO-OSI is a reference model: its transport layer guarantees the delivery of packets
and it follows a vertical approach.
Transmission media
In order to transmit data, a physical medium is required. We have:
- Twisted pair: it consists of a pair of copper conductors twisted around each
other in helical form.
- Coaxial cable: it offers better insulation and allows higher transmission speeds
over long distances.
- Optical fiber: it exploits the deviation (refraction) that a beam of light
undergoes when it crosses the boundary between two different materials.
- Wireless: electromagnetic waves that travel at the speed of light.
Switched telephone system
The telephone system plays a central role in remote communication. The
telephone network is organized in a multi-level hierarchical structure; we have
switchboards instead of human operators. The switchboards are fundamental because
they switch the incoming lines of the local loop (the nearby switching station) onto
different outgoing lines based on the users’ connection requests.
We have different kinds of connections (trunks) such as coaxial cables, optical fibers
etc.
Communication between computers
In order to communicate, computers must be connected to each other through
switching. We have 2 types of switching:
- Circuit switching: when a call is made, the system establishes a dedicated
physical connection between the caller and the called party. A disadvantage is
the delay needed to set up the connection. An advantage is that, once
established, this system does not present further congestion problems.
- Packet switching: the data to be sent is broken into packets, which are sent
independently and then reordered at the destination. A disadvantage is
that congestion can arise at any time. An advantage is that there is no
delay for connection setup.
Networks can be classified by their extension:
1- Personal area network (PAN): it is used for communication between personal
devices (interpersonal communication).
2- Local area network (LAN): LANs are usually owned by an organization and have an
extension of up to a few km.
The way in which the nodes of a network are interconnected is called the topology.
We have three topologies:
- Bus topology: here all computers are directly connected to a linear
transmission medium.
Advantages: fewer cables, minimum costs, easy to expand.
Disadvantages: performance decreases when the number of nodes grows; slower
than the other topologies.
- Ring topology: here all the computers are connected so as to form a ring.
In order to have fluid communication we need a token that has an arbitration
role.
Advantages: low installation cost, good performance even with high traffic.
Disadvantages: adding or removing a node affects the entire network; only one
computer at a time can transmit (the one holding the token).
- Star topology: here all computers are connected to a central node (hub). The
packets sent from one station to another are repeated by the hub towards all
stations; this allows all the stations to see any packet sent on the network.
Advantages: easy to expand, easy detection of errors.
Disadvantages: if the central hub fails, the whole network fails.
Within packet-switched networks we also distinguish:
- Connection-oriented networks: the paths that the packets follow are
predetermined and are always the same, based on a virtual channel.
Here we find frame relay: an efficient data transmission technique used to
send digital information quickly to one or more destinations.
- Connectionless switching networks: here the paths that the packets follow are
not predetermined. In TCP-IP networks, device A is connected to B, so there
is a connection at the transport level with high control over the transport
of the packets.
3- Metropolitan area network (MAN): it has an urban extension and is generally
public (e.g. Telecom).
4- Wide area network (WAN): it extends over a nation, a continent or the entire
planet. It is made up of different components, such as a set of computers (hosts)
and a communication subnet that connects the end systems to each other. The
subnet is made up of two elements:
- Transmission lines: trunks.
- Switching elements: when data arrives on an incoming line, these elements
choose an outgoing line to which to route it.
Cap 4: Cloud computing
Cloud computing can be defined as Internet-based computing, whereby shared
resources, software, and information are provided to computers. Cloud computing
is a style of computing in which dynamically scalable and often virtualized resources
are provided as a service over the Internet.
Cloud services are typically delivered as software as a service (SaaS), made available
in a pay-as-you-go manner.
The idea behind cloud computing is that users always want to reduce and simplify
their work while keeping a good quality of service; in a few words, users care only
about the service.
The characteristics of cloud computing are:
1- Scalability & elasticity: scalability refers to the property of a system to handle
growing amounts of work. Elasticity instead refers to the ability to adapt
resources to the workload without drastic changes in the system. We have:
- Dynamic provisioning: here computing resources are provisioned on demand by the
server administrator, network administrator or any other enabled user.
- Multi-tenant design: a principle that allows a single piece of software running
on a server to serve multiple clients (tenants).
2- Availability & reliability: availability refers to the probability that a system is
operational in a given time. Reliability refers to the ability of a system to
perform its required functions for a specific period of time.
Here we talk about quality of service (QoS) and the service level agreement (SLA),
which refer to the set of technologies for managing network traffic in a
cost-effective manner to enhance the user experience. Quality of service is used
in customer care evaluation and in technological evaluation.
Very important is the fault tolerance system that allows the system to
operate even if there are events of failure. In the fault tolerance system, we
have:
- SPOF (single point of failure): a part of the system which, if it fails, stops
the entire system.
- FDI (fault detection and isolation): it identifies a fault and isolates it in
order to protect the system.
Reliability refers also to resilience, the ability of the system to return to its
original state after encountering troubles. Important for this reason is the
backup.
Cloud security is also important: it refers to the set of technologies that
protect the data and the infrastructure associated with cloud computing.
3- Manageability & interoperability: manageability refers to the administration of
the cloud computing system as a whole. Interoperability refers to the ability to
work with other products and systems without problems or errors.
4- Performance & optimization: cloud computing offers a highly performing and
highly optimized environment. Here we find parallel computing, which allows many
machines to work simultaneously, and load balancing, which distributes the
workload across two or more computers, etc.
5- Accessibility & portability: accessibility refers to the degree to which a
product or service is accessible to as many people as possible. Portability refers
to the ability to access services using any device, anywhere.
Here we have uniform access, which refers to how users can access cloud services,
and the thin client, a computer program that depends on some other computer to
fulfil its tasks.
Benefits from cloud
For market and enterprises, the benefits are:
1- Reduce initial investment
2- Reduce capital expenditure
3- Improve industrial specialization
4- Improve resource utilization
For end user and individual the benefits are:
1- Reduce local computing power
2- Reduce local storage power
3- Variety of end devices
The four service models
Infrastructure as a service (IaaS): here the consumer does not manage or control
the underlying cloud infrastructure but has control over the operating systems.
Container as a service (CaaS): it allows users to deploy and manage applications
using container-based abstraction techniques.
Platform as a service (PaaS): it allows the consumer to manage the deployed
applications and possibly the application-hosting environment.
The main differences between CaaS and PaaS are:
CaaS requires more design work, offers greater freedom and produces functional
applications regardless of the platform used.
PaaS is faster and less demanding in terms of management, but does not give the
same ability to control the flow.
Software as a service (SaaS): it offers the consumer the possibility to use the
provider’s applications running on the cloud infrastructure. It provides web-based
applications, business applications, scientific applications, etc.
Types of clouds
We have different types of clouds:
1- Public cloud: the cloud infrastructure is made available to the general public
or large industry group; it is owned by an organization that sells cloud
computing.
2- Private cloud: the cloud infrastructure operates solely for an organization.
3- Community cloud: the cloud infrastructure is shared by several organizations
and supports a specific community that has shared concerns.
4- Hybrid cloud: the cloud infrastructure is a composition of two or more clouds
that remain unique entities but are bound together.
Cap 5: Cybersecurity
The general goal of cybersecurity is to identify the highest level of
authentication compatible with the available resources. Authentication refers to the
process that determines whether someone or something is who or what it claims to be.
First of all, an authentication mechanism must be able to resist external attacks,
must not take too much time to recognize the identity of the user, and must be
simple, accurate and easy to use.
We have different types of authentications such as:
1- Authentication password: it is authentication by knowledge. The weakness
of this type of authentication is that the username and password could be
discovered. For this reason, a long password with special characters is
fundamental.
We have the user/password paradigm, which is used in unilateral authentication
and requires the client to authenticate to the server by sending a personal
identifier (username) and the password associated with it. Here, the transmitted
packets could be intercepted by special programs. We also have:
- Personal identification number (PIN): it is a string of characters used as a
password to gain access to a system resource. It is safer than the
password.
2- Authentication token: it is authentication by possession. Here, first of all
the client sends the server its personal identifier; then a random number R is
generated and sent to the client, and finally the client has to send back to the
server the number received to confirm the authentication. Here we have (see the
one-time-password sketch after this list):
- One-time password (OTP): it is a password that can be used only once. The
password is generated at the moment through a dedicated hardware
device. It is resistant to replay attacks.
- OTP via token: it is a hardware device or a software module that provides
the user with the OTP to enter at each authentication attempt.
- Smart card and USB token: devices that need a physical connection
with the PC to operate and generate a one-time password. They are always
reliable.
- OTP via SMS: it is a mechanism that generates a one-time password that is
sent to the end user via SMS (short message service).
3- Authentication biometrics: it is authentication by features. It is an
automatic process by which the identity is determined through the analysis
and recognition of physiological and behavioural characteristics. The
identification is obtained by making a series of comparisons with the data
stored in the record archive. When the matching algorithm generates a score
above a threshold, a match is reported.
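A minimal sketch of how an OTP can be generated from a shared secret and a counter, in
the spirit of the HMAC-based OTP standard (the secret and counter values below are
made up; this is an illustration, not the exact mechanism described in the notes):

    import hashlib, hmac, struct

    def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
        msg = struct.pack(">Q", counter)                     # 8-byte counter
        digest = hmac.new(secret, msg, hashlib.sha1).digest()
        offset = digest[-1] & 0x0F                           # dynamic truncation
        code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
        return str(code % 10 ** digits).zfill(digits)

    # Token and server share the secret and the counter, so both compute the same code.
    print(hotp(b"shared-secret", counter=1))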
Cap 6: Digital ledger technologies
Blockchain
Blockchain is the result of combining already existing technologies and
cryptographic techniques with the new concept of chaining data together.
Blockchain safely stores data in a system that is accessible to everyone but whose
data are readable only by those who have authorization. Blockchain is defined as a
specific form of DLT (distributed ledger technology). All user transactions are
recorded in a chain of blocks of information.
Blockchain is peer-to-peer (P2P) and makes transactions reliable, measurable and
transparent. P2P consists of a system of computers directly connected to each other
via internet without passing through a central server. Moreover, P2P is considered
more secure since it does not have a single point of attack.
Blockchain could be used if we need a shared database, the parties do not trust each
other, encryption is needed etc.
The algorithm on which a blockchain runs is called the consensus algorithm; it defines
the way the network comes to an agreement on new transactions entered into the
system. Consensus allows all the nodes of the network to share exactly the same
data, preventing its manipulation.
Bitcoin and public blockchains use a proof-of-work (PoW) consensus algorithm,
which is a cryptographic protocol that validates a transaction when the solution of a
mathematical computation is found by a node. The protocol defends the system
from attackers who might want to reverse transactions and rewrite the chained blocks.
In PoW the nodes are in competition to solve the computational work, as in the sketch
below.
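A minimal, purely illustrative proof-of-work sketch (not Bitcoin's actual
implementation): a node searches for a nonce such that the block hash starts with a
given number of zero hex digits, which is expensive to find but cheap for everyone
else to verify.

    import hashlib

    def mine(block_data: str, difficulty: int = 4):
        nonce = 0
        while True:
            digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
            if digest.startswith("0" * difficulty):
                return nonce, digest          # proof found
            nonce += 1

    nonce, digest = mine("previous-hash + new transactions")
    print(nonce, digest)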
Also important is proof-of-stake (PoS), a consensus algorithm that overcomes the
problem of having many nodes working on the same mathematical computation by
adopting a system where the work of validating a transaction is assigned to one of
the miners. The choice is not random: the miner that owns the largest amount of the
platform’s coin will perform the work.
Blockchains can be of several types:
1- Permissionless: public blockchains are open to anyone and have an open-source
code that is maintained by the community and runs through a native coin
(bitcoin is the most popular).
2- Permissioned: private blockchains are open only to those who have the right to
register, and their size tends to be smaller than that of public blockchains.
They are faster and more flexible.
3- Semi-private blockchain: it is composed of two parts; the private part is
controlled by a group of individuals, while the public part is open to
participation by anyone.
4- Consortium blockchain: it joins an established structure and shares information
instead of starting from zero. It helps organizations find solutions together
and save time and development costs.
Smart contracts
A smart contract is a program that performs predefined actions when certain
conditions are met. Smart contracts are executed in the same language as the
transactions and make the execution of the contract known to all nodes of the ledger.
Their ability to be shaped according to any need is favouring their diffusion and
numerous applications.
A smart contract could make the post-trade process extremely efficient by reducing
errors and by ensuring that the financial agreement is settled when the transaction is
executed.
It could also make the back-office process more efficient with regard to transaction
confirmation and execution via DLT (distributed ledger technology). In this case the
smart contract automatically manages and starts the payment.
Public-key cryptography
Public-key cryptography, or asymmetric cryptography, allows multiple individuals to
communicate over an unsecured network without having to transmit a shared encryption
key.
In a blockchain, asymmetric cryptography generates a pair of keys for each individual:
one is public while the other is private and must be kept secret. The public key is
used by the sender to encrypt the message and, once the message arrives, the receiver
uses their private key to decrypt it.
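A minimal sketch of this encrypt-with-public / decrypt-with-private flow, using the
third-party Python cryptography package (an assumption for illustration; the notes do
not prescribe a specific algorithm or library):

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    # The receiver generates the key pair and publishes only the public key.
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()

    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)
    ciphertext = public_key.encrypt(b"confidential message", oaep)   # sender side
    plaintext = private_key.decrypt(ciphertext, oaep)                # receiver side
    assert plaintext == b"confidential message"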
Blockchain applications
Blockchain can have different areas of application:
- Supply chain management (SCM): it simplifies the traceability of products.
By using blockchain you can have a digital database in which transactions
and movements of goods are updated in real time. The benefits are process
optimization, the elimination of problems related to trust between actors,
and the renewal of business models.
- Property rights: blockchain technology can minimize the cost of property
rights management. Companies could use blockchain to register the
various rights and obligations arising from ownership.
- Finance: when technological innovation embraces finance we talk about
FinTech. Bitcoin was created as a system to transfer peer-to-peer
electronic money and to offer itself as an alternative currency. The
objective of blockchain for bitcoin is to reduce transaction costs by making
the parties interact directly with each other (P2P).
Cap 7: Immersive technologies
Immersive technologies create distinct experiences by merging physical world with
digital or simulated reality. Immersive technologies can have different forms:
1- Augmented reality (AR): it combines real-world experience with computer
generated content. It relies on processors, display, sensors and input devices
to create different types of augmented reality such as: marker-based AR,
markerless AR etc.
2- Virtual reality (VR): it is a computer-generated scenario that simulates a real-
world experience by providing a sense of immersion. We have different types
of VR: non-immersive VR, semi-immersive VR and fully-immersive VR.
3- 360° videos: it allows users to live an immersive experience.
4- Mixed reality: it is an experience that merges the user’s real-world
environment and the digital content created.
5- Extended reality: it is a technology that brings the virtual and physical worlds
together by using AR, VR and MR.
Immersive technologies offer different business opportunities: cost savings, improved
performance, new value, customer engagement, etc.
Immersive technologies can have several applications:
- Marketing and advertising
- Entertainment
- E-commerce and shopping tools
- Manufacturing etc.
Cap 8: Artificial intelligence
Artificial intelligence is the ability of a digital computer to perform tasks commonly
associated with intelligent beings.
Its advantages are: more powerful computers, new and improved interfaces, the ability
to solve new problems, and the conversion of information into knowledge.
Its downsides are: increased costs, few experienced programmers, and difficult
software development.
Basically, we have 3 types of AI:
1- Artificial narrow intelligence (ANI) which has a narrow range of abilities (Siri,
Google).
2- Artificial general intelligence (AGI) which is on a par with human abilities. It is
a machine with general intelligence that mimics human intelligence and
behaviour, with the ability to learn and solve problems.
3- Artificial super intelligence (ASI) where the machine becomes self-aware and
surpasses the capacity of human intelligence and ability.
Artificial intelligence has different applications:
- The autonomous planning of activities and operations
- Autonomous control
- Robotics and artificial vision
- Elaboration of natural language
Weak and strong artificial intelligence
AI is able to perform some typical human functions: acting humanly, thinking
humanly, acting rationally and thinking rationally.
These considerations allow artificial intelligence to be classified into two groups:
- Weak artificial intelligence: it identifies technological systems capable of
simulating some of man’s cognitive functions without achieving real
intellectual capacity. Here there is no need to fully understand man’s
cognitive processes.
- Strong artificial intelligence: the ability to develop one’s own intelligence
without emulating thought processes or cognitive abilities, autonomously
developing a “wise system”.
Turing test
Alan Turing is considered the father of artificial intelligence, and the Turing test
is very important. It is a game involving three people: a man (A), a woman (B), and a
third individual (C). The third individual (C) has to guess who is the man and who is
the woman by asking a series of questions. (A) will have to try to mislead (C), while
(B) will have to help (C) solve the question. The answers have to be typed.
Turing then assumes that person (A) is replaced by a machine. If (C) does not notice
anything after this substitution, then (A) should be considered as intelligent as a
human being.
Expert system
An expert system is a computer system emulating the decision-making of a human
expert. Expert systems are designed to solve complex problems by reasoning through
knowledge, which is mainly represented as “if-then” rules rather than as
conventional procedural code.
The knowledge is stored in a knowledge base (database) and can be represented using a
variety of methods; the most common is the production rule method.
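A minimal sketch of production rules in Python (the rules and facts below are invented
examples, just to show the "if-then" style of knowledge representation):

    # Facts about the current case and a small set of if-then production rules.
    facts = {"temperature": 39.2, "cough": True}

    rules = [
        (lambda f: f["temperature"] > 38 and f["cough"], "suspect flu"),
        (lambda f: f["temperature"] <= 38, "no fever detected"),
    ]

    # The inference step fires every rule whose condition matches the facts.
    conclusions = [action for condition, action in rules if condition(facts)]
    print(conclusions)   # ['suspect flu']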
Machine learning methods
Machine learning is a tool for forecasting purposes; it is able to create new models
that incorporate relationships.
Machine learning methods use a wide range of analytic tools, which can be classified
into:
1- Supervised machine learning: it involves the construction of a statistical
model for the prediction or estimation of a result.
2- Unsupervised machine learning: here a data set is analyzed without a
dependent variable to be estimated; the data is studied to highlight patterns
and structures in the information set.
Machine learning includes many different analytical methods (see the sketch after
this list):
- Regression: the forecasting of a continuous quantitative dependent variable.
- Classification (supervised learning): the prediction of a qualitative
dependent variable which takes values in a class. These problems can be
solved, for example, through a decision tree that answers a sequence of
ordered yes/no questions.
- Clustering (unsupervised learning): here only input variables are observed,
while a corresponding dependent variable is missing.
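A minimal scikit-learn sketch on toy data (the numbers are made up) contrasting the
three families: supervised regression, supervised classification and unsupervised
clustering.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LinearRegression
    from sklearn.tree import DecisionTreeClassifier

    X = np.array([[1.0], [2.0], [3.0], [4.0]])

    reg = LinearRegression().fit(X, [2.1, 3.9, 6.2, 8.1])      # continuous target
    clf = DecisionTreeClassifier().fit(X, [0, 0, 1, 1])        # class labels
    clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)  # no target at all

    print(reg.predict([[5.0]]), clf.predict([[2.5]]), clusters)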
Complexity and overfitting
The complexity of a model may be caused by the presence of too many parameters
compared to the number of observations. Excessively complex model can lea to
overfitting, describing random errors. In machine learning overfitting is particularly
present is non-linear models.
To manage the adaption of a model there are several ways:
Boosting is done by assigning an overweight to ensure that the model trains
more intensively. Bagging here, a model is run hundreds or thousands
of times to improve predictive performance. Random forest is a model
consisting of further different models based on decision trees.
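A minimal sketch of a random forest on synthetic data (dataset and parameters chosen
only for illustration): many decision trees trained on resampled data are averaged,
which usually reduces the overfitting of a single deep tree.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=300, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    forest = RandomForestClassifier(n_estimators=200, random_state=0)
    forest.fit(X_train, y_train)
    print(forest.score(X_test, y_test))   # out-of-sample accuracy of the ensemble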
Neural network and deep learning
Deep learning can be based on both supervised and unsupervised non-linear
methods.
Many artificial intelligence techniques are inspired by biological mechanisms; the
artificial neural network (ANN) is inspired by the biological structure of the brain
and consists of a large number of interconnected neurons that cooperate with each
other to produce an output. In an ANN, the neurons in the input layer take in the
information regarding the probability of failure; the neurons in the hidden layer
extract the data from the input vector, estimate the risk and generate the values of
the output vector.
Related techniques are known as the fuzzy adaptive network (FAN) and the adaptive
neuro-fuzzy inference system (ANFIS). ANFIS uses a learning structure with the aim of
mapping inputs, such as the probability and the consequences of failure, into an
output result.
In deep learning, several layers of algorithms mimic neurons and so give rise to the
so-called (deep) artificial neural networks. This makes it possible to feed deep
learning with any kind of data, even qualitative and unstructured data, and leads the
system to perform relevant analyses.
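A minimal numpy sketch of one forward pass through a network with a single hidden
layer (the input values and random weights are made up): the hidden layer transforms
the input vector and the output neuron produces, for example, an estimated probability
of failure.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x = np.array([0.5, 0.2, 0.9])           # input vector (e.g. risk indicators)
    W_hidden = np.random.randn(4, 3) * 0.1  # weights of 4 hidden neurons
    W_output = np.random.randn(1, 4) * 0.1  # weights of 1 output neuron

    hidden = sigmoid(W_hidden @ x)          # hidden layer activations
    output = sigmoid(W_output @ hidden)     # output, e.g. estimated failure probability
    print(output)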
Robotic process automation (RPA)
Robotic process automation is carried out through software that imitates human
activities by executing structured processes based on defined rules. It is one of the
technologies that allow companies to implement a digital transformation strategy
through the use of software robots that automate repetitive activities.
Cap 9: Internet of things, cyber-physical systems and smart production
The term internet of things (IoT) was coined by Kevin Ashton and identifies a
network of things connected to the internet that includes connected devices and
physical IoT resources. It makes it possible to connect everything, to receive and
transmit data, to prepare it for analysis, etc.
The main areas that absorb investment in the internet of things are: production
processes, transport, smart buildings, smart home automation, etc.
IoT represents the interconnection of end points that can be uniquely addressed
and identified with an IP address. An IP address is a unique address that identifies a
device on the internet or a local network.
The characteristics of internet of things are:
- Connectivity
- Things (physical)
- Data
- Communication
- Intelligence
- Ecosystem: the sharing of knowledge and experience is important here
Internet of Things has different forms:
1- Internet of everything (IoE): it is a term coined by Cisco. IoE can be seen as an
implementation of IoT. At the core of the internet of everything is an “intelligent”
network capable of listening, learning and responding in order to offer new services
based on security, simplicity and reliability.
2- Internet of robotic things (IoRT): it is a set of devices that can monitor events
occurring in the “external” world, combine data from various sources and act by
manipulating objects and devices in the real world.
3- Consumer internet of things (CIoT): examples are smart watches, connected
applications and smart homes. Consumer internet of things applications aim at more
“intelligent”, secure devices that give customers a good reason to buy. CIoT is
focused on creating, stimulating and engaging customer-oriented experiences.
4- Industrial internet of things (IIoT): refers to the use of sensors and information
systems installed in machines and products that are capable of enabling their
traceability and remote control. Its focus is more on the benefits of the
applications. The target markets are telecommunications companies, transport, etc.
IIoT provides more efficiency, effectiveness, higher quality and a cost reduction.
The industrial internet of things generates a large amount of data that can
enhance the capacity of machines. Important is the cyber-physical system
(CPS), which allows a continuous interaction between things, data, people and
services. It is important for:
- Data production
- Data analysis
- Maintenance and reconfiguration of processes
The internet of things has different fields of application:
1- Innovation in safety and process development: it makes it possible to obtain in
real time fundamental data for the management of the entire lifecycle of a product.
Moreover, it allows problems to be solved without creating blocks and slowdowns.
2- Innovation in production process control: quality control aims to facilitate the
implementation of a reliable production process that generates products in
line with market requirements. Importantly, with IoT you can analyze
production in real time and make rapid changes.
3- Stock management: it is possible to develop a replenishment system that
allows you to manage your inventory with less waste and less space
occupation.
4- Remote monitoring: it allows managers to remotely monitor and diagnose
machines quickly, identifying and solving a problem before it compromises the
machine.
5- Innovation in distribution
6- Innovation in sales
7- Innovation in maintenance
Smart factory and cyber-physical systems
Smart factory refers to a fully connected and flexible system capable of learning and
adapting to needs. In other words, a smart factory is capable of obtaining and
exploiting context information to assist both people and machinery in performing
their tasks. Its main component is the cyber-physical system (CPS). A CPS processes
and collects data, spreading them on the network in real time.
A main role is also given to communication: thanks to the speed of the
exchanged data, the different subjects are able to communicate at any time and in any
condition, providing the possibility to transform data into value-added information.
Radio-frequency identification (RFID) is a typical example of a CPS because it
integrates computational capacity, control and communication.
Thanks to CPS systems it is possible to improve production systems at 3 levels:
- Smart factories: they make the company more autonomous, improving its
control and its production, decreasing the use of energy, producing less waste,
etc.
- Virtual factories: they provide better supply chain management, create more
value through the integration of processes, and give more transparency and a
lower CO2 impact.
- Digital factories: they create a digital model before the product is actually
made, generate less waste, reduce errors and are more efficient.
Digital factories must be integrated:
- Vertically, with flexible production systems
- Horizontally, focusing on design, development and production
- End-to-end, by integrating the entire product lifecycle
IoT technology: cloud, fog and edge computing
The use of cloud, fog and edge computing limits the huge quantity of data moved by
IoT devices; in this way they allow the access and storing of the final information,
hence reducing the response time.
Fog computing is a system level architecture that extends the computing, network
and storage capacity of the cloud.
Cap 10: Digital transformation and smart city development
Cities have been increasingly turning to technology to meet challenges and to find
solutions to environmental problems. The development of smart cities is directly
related to the promotion of sustainability.
The term smart city identifies cities that use data and technology to create
efficiency, economic development, and an improvement in the quality of life and
inclusiveness for citizens.
Giffinger and Gudrun have elaborated 6 characteristics of smart cities:
1- Smart economy
2- Smart people
3- Smart governance
4- Smart mobility
5- Smart living
6- Smart environment
Another group of scholars, Nam and Pardo, has highlighted 3 main dimensions:
technology, people and community.
Smart cities are strictly connected to sustainable development which is the
development that meets the needs of the present without compromising the ability
of future generations to meet their own needs.
Technology is at the basis of city life; nowadays we have different developments
such as:
- Telecommunication networks: they are a fundamental part of the smart city
infrastructure. Smart cities require vast and reliable information and
communication technology systems in order to successfully manage all the
urban services.
Smart cities must constantly analyze information on citizens’ mobility and the state
of public transport. Smart cities need to collect statistical data on the behavioural
patterns of their citizens and explore traffic patterns that can be used in the
construction of new transportation infrastructure.
Prosumers are defined as consumers who also produce energy for their own
consumption.
Digital fabrication is defined as the design and production process that combines 3D
modelling and computer-aided design, integrated through data management.
The purpose is to virtually design, model, simulate, evaluate and optimize products
and processes before any modification is carried out on a physical system.
It is composed of 3 levels:
1- Smart factories
2- Virtual factory
3- Digital factory
Cap 11: Smart transport
Intelligent transport systems (ITS) focus on digital technologies providing
intelligence placed at the roadside or in vehicles; cooperative intelligent
transport systems (C-ITS), instead, focus on the communication between those systems.
Communication, or cooperation, between vehicles and infrastructure is essential for
the safe integration and operation of automated vehicles in the future transport
system.
Cooperative systems allow wireless communication between vehicles,
infrastructure and other users. This involves two-way communication: V2V (vehicle
to vehicle), V2I (vehicle to infrastructure, which uses 5G), V2U (vehicle to user)
and I2U (infrastructure to user). The goal of cooperative systems is to minimize
traffic congestion while avoiding stoppages of the system.
The performance of cooperative systems differs between two types of communication:
1- Short-range communication: here the technology needs ultra-low latency
and ultra-high reliability in order to avoid crashes.
2- Long-range communication: in order to better manage traffic flow, a high data
rate and high scalability are required.
Dedicated short-range communication (DSRC)
Dedicated short-range communication and 5G are two widely used candidate
technologies for connected vehicle applications.
Dedicated short-range communication is a wireless communication technology that
works without any cellular infrastructure; it enables highly secure, low-latency,
high-speed short-to-medium-range wireless communication between vehicles and the
infrastructure.
In terms of collision avoidance, an important metric is the packet loss rate (PLR),
which is calculated as the ratio of lost packets to the total number of packets sent
(see the sketch below).
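A minimal sketch of the PLR formula (the numbers are made-up examples):

    def packet_loss_rate(lost: int, sent: int) -> float:
        # Fraction of the transmitted packets that never arrived.
        return lost / sent

    print(packet_loss_rate(lost=12, sent=1000))   # 0.012 -> 1.2% of packets lost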