A Look Back: The Research History of Neural Networks
The concept of neural networks dates back to the early 1940s with pioneering work
by Warren McCulloch and Walter Pitts. However, limitations in computing power and
theoretical understanding hindered significant progress for decades.
Here are some key milestones in Neural Network research:
• 1950s: The Perceptron, a simple neural network model, was introduced.
However, limitations were discovered that hindered its ability to solve complex
problems.
• 1960s-1980s: A period of decline due to theoretical limitations and lack of
computational resources.
• 1980s onwards: A resurgence fueled by advancements in computing power
and the development of new algorithms like backpropagation, which allowed
for efficient training of more complex networks.
• Today: Deep Learning, a subfield of neural networks using many layers, has
revolutionized AI, achieving state-of-the-art performance in various
domains like speech recognition, computer vision, and natural language
processing.
Understanding these historical developments is crucial because they shaped
the current state of neural networks. This history also highlights the importance of
continuous research and technological advancements in driving AI progress.
Model of an Artificial Neuron
Let's consider the example of determining whether to buy a smartphone based on its
features:
1. Inputs: When you're deciding whether to buy a smartphone, you consider
various features like camera quality, battery life, storage capacity, and price.
These are the inputs to your decision-making process.
2. Weights: Now, not all features are equally important to you. For example, you
might prioritize camera quality and battery life over storage capacity. So, you
assign weights to each feature based on their importance to you. Let's say you
assign a higher weight to camera quality and battery life and a lower weight
to storage capacity.
3. Summation Function: After assigning weights, you sum up all the weighted
inputs. Let's say you're considering a smartphone with a camera score of 8/10
(weighted by 0.7), battery life of 2 days (weighted by 0.8), storage capacity of
64GB (weighted by 0.5), and a price of $600 (weighted by -0.6, as higher prices
are less desirable). The weighted sum would be
0.7×8 + 0.8×2 + 0.5×64 − 0.6×600 = 5.6 + 1.6 + 32 − 360 = −320.8. This is
your total weighted sum.
4. Activation Function: Now, you set a threshold for what features are good
enough for you to buy the smartphone. Let's say your threshold is 20. If the
total weighted sum exceeds 20, you decide to buy the smartphone; otherwise,
you don't.
5. Output: Based on whether the total weighted sum crosses the threshold or
not, you make your decision. If it crosses the threshold, you decide to buy the
smartphone. Otherwise, you don't.
So, in this example, the artificial neuron (your decision-making process) takes in
inputs like camera quality, battery life, storage capacity, and price, weighs them
based on their importance, adds them up, compares the total to a threshold, and
then decides whether to output "buy the smartphone" or "don't buy the
smartphone."
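The five steps above can be written as a few lines of Python. The numbers are the ones from the example; the function name is just illustrative (and note that, with these raw unnormalized inputs, the price term dominates and the sum comes out negative):

```python
# A minimal sketch of the smartphone-buying neuron described above.
def artificial_neuron(inputs, weights, threshold):
    """Weighted sum of the inputs followed by a step activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    decision = "buy" if weighted_sum > threshold else "don't buy"
    return decision, weighted_sum

# camera score, battery life (days), storage (GB), price ($)
features = [8, 2, 64, 600]
weights = [0.7, 0.8, 0.5, -0.6]   # negative weight: higher price is worse

decision, total = artificial_neuron(features, weights, threshold=20)
print(round(total, 1), decision)  # -320.8 don't buy
```

Because −320.8 is below the threshold of 20, the neuron outputs "don't buy" for this phone.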
Acting Under Uncertainty
Let's dive into the topic of "Acting under
Uncertainty" within the realm of artificial intelligence.
When we talk about "uncertainty" in AI, we're referring to
situations where we don't have complete information or
where outcomes are not entirely predictable. This is common
in real-world scenarios because the world is inherently
uncertain.
Now, when an AI system needs to make decisions or take
actions in such uncertain environments, it needs to be smart
about it. It's like when you're playing a game and you don't
know what move your opponent will make next. You have to
think about all the possible moves they might make and then
decide on your next move based on that uncertainty.
In AI, we use different techniques to handle this uncertainty.
One important technique is called probabilistic reasoning.
This involves assigning probabilities to different outcomes
based on the available information. So instead of saying, "I
know exactly what will happen next," the AI might say,
"There's a 70% chance that this will happen and a 30%
chance that that will happen."
Another key concept is decision theory. This is about figuring
out the best course of action to take given the uncertainties.
It's like weighing the potential risks and rewards of different
actions and choosing the one that seems most promising.
To put it simply, "acting under uncertainty" in AI is all about
making smart decisions when you don't have all the answers.
It's about being flexible, adaptive, and able to handle
whatever the world throws at you.
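The two ideas above, probabilistic reasoning and decision theory, combine into a simple rule: pick the action with the highest expected utility. Here is a toy sketch; the probabilities and utility values are made up for illustration:

```python
# Decision theory sketch: choose the action whose probability-weighted
# (expected) utility is highest. Numbers are illustrative only.
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)

actions = {
    # (P(rain)=0.7, utility if rain), (P(no rain)=0.3, utility if dry)
    "take umbrella": [(0.7, 8), (0.3, 6)],
    "leave umbrella": [(0.7, 2), (0.3, 10)],
}

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # take umbrella
```

Taking the umbrella wins because its expected utility (0.7×8 + 0.3×6 = 7.4) beats leaving it behind (0.7×2 + 0.3×10 = 4.4).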
Unit 2
Alright, class, today we'll be diving into the fascinating world of heuristic functions in
AI. Imagine you're lost in a maze, and you need to find the exit. Exhaustively
checking every path would be slow and inefficient. That's where heuristics come in –
they're like experienced guides that help you prioritize which paths to explore first.
What is a Heuristic Function?
A heuristic function, often simply called a heuristic, is an educated guess that
estimates the cost of reaching the goal state from any given state within a problem.
Think of it as a rule of thumb that helps AI algorithms make informed decisions
during the search process.
Here's a diagram to illustrate:
Current State (A)
|
V
State B ----- State C (Goal)
\ /
\ /
State D
The heuristic function, denoted by h(n), takes a state (like A, B, C, or D) as input and
outputs an estimated cost to get from that state to the goal (State C in this case).
While not always perfect, a good heuristic significantly reduces the search space by
prioritizing states closer (according to the estimate) to the goal.
How are Heuristics Calculated?
The way we calculate heuristics depends on the specific problem. Here are some
common approaches:
• Distance-based heuristics: In maze problems, we might use the Manhattan
distance (sum of the absolute differences in coordinates) between the current
state and the goal.
• Misplaced tile heuristic: For an 8-puzzle, this heuristic counts the number of
tiles out of place compared to the goal state.
• Domain-specific knowledge: For chess, a heuristic might evaluate the
material advantage (number of pieces) or the king's safety.
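The first two heuristics in the list are simple enough to implement directly. A small sketch (coordinates and tile layouts are illustrative):

```python
# Illustrative implementations of two common heuristics.
def manhattan_distance(state, goal):
    """Sum of absolute coordinate differences, e.g. for maze/grid problems."""
    return abs(state[0] - goal[0]) + abs(state[1] - goal[1])

def misplaced_tiles(state, goal):
    """Number of tiles out of place for the 8-puzzle (0 marks the blank)."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

print(manhattan_distance((1, 2), (4, 6)))  # 7

goal  = (1, 2, 3, 4, 5, 6, 7, 8, 0)
state = (1, 2, 3, 4, 5, 6, 0, 7, 8)
print(misplaced_tiles(state, goal))        # 2
```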
Why Use Heuristics?
There are two main reasons why heuristics are crucial in AI:
• Efficiency: By prioritizing states closer to the goal, heuristics significantly
reduce the number of states explored, leading to faster solutions.
• Intractability: Many real-world problems have enormous search spaces,
making it impossible to explore them all. Heuristics make these problems
tractable by guiding the search towards promising areas.
Hill Climbing Algorithm: Real-Life Use Cases
1. Route Optimization: In navigation systems, hill climbing can be used
to iteratively improve route efficiency by making local adjustments
based on current traffic conditions or road closures.
2. Machine Learning: Hill climbing algorithms are used in some
optimization techniques within machine learning, such as feature
selection or parameter tuning in models like neural networks.
3. Network Routing: In telecommunications, hill climbing can help
optimize data packet routing by dynamically adjusting routes based on
network congestion or failures.
4. Game Playing: In certain types of game-playing AI, hill climbing can be
employed to make local decisions, such as determining the next move
in a chess game based on immediate board evaluation.
5. Financial Optimization: In financial markets, hill climbing can be
applied to optimize investment portfolios by iteratively adjusting asset
allocations based on local market conditions.
6. Resource Allocation: Hill climbing algorithms can be used in resource
allocation problems, such as scheduling tasks in a manufacturing
environment or assigning resources in project management.
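All of these use cases share the same core loop: look at nearby alternatives, move to the best one, and stop when no neighbour improves. A minimal one-dimensional sketch, where score() is an illustrative stand-in for a real objective (route cost, portfolio return, board evaluation, and so on):

```python
# A minimal hill-climbing sketch over a one-dimensional search space.
def hill_climb(score, x, step=0.1, max_iters=1000):
    for _ in range(max_iters):
        # Evaluate the current point and its two neighbours, keep the best.
        best = max([x - step, x, x + step], key=score)
        if best == x:        # no neighbour improves: local optimum reached
            return x
        x = best
    return x

# Example objective with a single peak at x = 3.
peak = hill_climb(lambda x: -(x - 3) ** 2, x=0.0)
print(round(peak, 1))  # 3.0
```

Because the algorithm only ever moves to a better neighbour, it can get stuck on a local optimum; in practice this is often mitigated with random restarts.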
Models Used in Uncertainty
In many real-life situations, we need to make decisions without having all the information or
without knowing exactly what will happen in the future. This is called dealing with
uncertainty. To handle this, we use various models that help us make informed guesses and
decisions. Here are some of the most common models used to deal with uncertainty:
1. Probability Models
2. Bayesian Networks
3. Decision Trees
4. Monte Carlo Simulation
Let's go through each of these with a real-life example.
1. Probability Models
Example: Weather Forecasting
When meteorologists predict the weather, they use probability models. They look at historical
data and current weather conditions to estimate the likelihood of rain, sunshine, or snow. For
example, if they say there is a 70% chance of rain tomorrow, they are using a probability
model based on various data inputs.
2. Bayesian Networks
Example: Medical Diagnosis
Doctors often use Bayesian networks to diagnose diseases. A Bayesian network is a type of
statistical model that represents a set of variables and their conditional dependencies. For
example, if a patient has a cough, fever, and sore throat, a Bayesian network can help
estimate the probability of different diseases (like the flu or strep throat) based on the
presence of these symptoms and known relationships between symptoms and diseases.
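The core calculation behind a Bayesian network is Bayes' rule. A toy version of the diagnosis example with a single symptom; the prior and conditional probabilities here are invented for illustration, not medical data:

```python
# Toy Bayes' rule calculation: P(flu | fever) from made-up numbers.
p_flu = 0.10                  # prior: 10% of patients have the flu
p_fever_given_flu = 0.90
p_fever_given_no_flu = 0.20

# P(fever) by the law of total probability
p_fever = (p_fever_given_flu * p_flu
           + p_fever_given_no_flu * (1 - p_flu))

# Bayes' rule: P(flu | fever) = P(fever | flu) * P(flu) / P(fever)
p_flu_given_fever = p_fever_given_flu * p_flu / p_fever
print(round(p_flu_given_fever, 3))  # 0.333
```

Observing the fever raises the probability of flu from the 10% prior to about 33%; a full Bayesian network chains many such updates across several symptoms at once.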
3. Decision Trees
Example: Business Decision Making
Companies use decision trees to make business decisions. A decision tree is a model that uses
a tree-like graph of decisions and their possible consequences. For example, a company
might use a decision tree to decide whether to launch a new product. The tree would include
branches for different scenarios, like high demand, moderate demand, or low demand, and
would help the company assess the potential outcomes and risks of each scenario.
4. Monte Carlo Simulation
Example: Financial Planning
Financial planners use Monte Carlo simulations to predict the future value of investments.
This method involves running many simulations to model the uncertainty of different
variables, like stock prices or interest rates. For example, to plan for retirement, a financial
planner might use a Monte Carlo simulation to estimate how different investment strategies
could perform over time, considering the unpredictable nature of the stock market.
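The retirement example can be sketched in a few lines. The assumed return distribution (mean 7%, standard deviation 15% per year) is an illustrative stand-in, not market data:

```python
import random

# Monte Carlo sketch: simulate many possible 30-year investment paths
# and look at the spread of final portfolio values.
def simulate(start=10_000, years=30, n_runs=5_000, seed=42):
    random.seed(seed)                 # fixed seed for reproducibility
    finals = []
    for _ in range(n_runs):
        value = start
        for _ in range(years):
            # Draw one random yearly return from the assumed distribution.
            value *= 1 + random.gauss(0.07, 0.15)
        finals.append(value)
    finals.sort()
    # Median outcome and a pessimistic 10th-percentile outcome.
    return finals[n_runs // 2], finals[n_runs // 10]

median, pessimistic = simulate()
print(f"median ≈ {median:,.0f}, 10th percentile ≈ {pessimistic:,.0f}")
```

The gap between the median and the 10th percentile is exactly the kind of uncertainty a single "average return" calculation would hide.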
What is Inference Using Full Disjoint Distribution?
Inference refers to the process of drawing conclusions or making predictions based on data
or evidence. When we talk about full disjoint distribution, we are referring to a situation
where different events or outcomes are completely separate from each other; they do not
overlap and are mutually exclusive.
Key Points:
1. Full Disjoint Distribution: A scenario where the outcomes or events are mutually
exclusive. This means that if one event happens, the others cannot happen at the same
time.
2. Inference: Using the given information or data to draw conclusions or make
predictions.
Why Do We Use Full Disjoint Distribution?
Using full disjoint distribution simplifies the process of making predictions or drawing
conclusions because:
1. Clarity: Since the events do not overlap, it is easier to calculate probabilities and
make inferences.
2. Simplicity: It simplifies complex problems by breaking them down into simpler, non-
overlapping parts.
3. Precision: Ensures that we are accounting for all possible outcomes without any
ambiguity.
Real-Life Usage
Example: Quality Control in Manufacturing
Imagine you are working in a factory that produces light bulbs. You want to ensure that the
bulbs are of high quality. Here’s how you might use inference with full disjoint distribution:
1. Define Events: You categorize the light bulbs into three distinct (disjoint) categories
based on quality:
o A: High quality
o B: Medium quality
o C: Low quality
2. Data Collection: You test a sample of light bulbs and find that:
o 60% of the bulbs are high quality (A)
o 30% are medium quality (B)
o 10% are low quality (C)
3. Inference: If a new light bulb is produced, you can infer the likelihood of its quality:
o There is a 60% chance it will be high quality.
o There is a 30% chance it will be medium quality.
o There is a 10% chance it will be low quality.
Understanding Decision Trees
A Decision Tree is a type of machine learning algorithm used for both classification and
regression tasks. Think of it like a flowchart that helps make decisions by breaking down a
complex problem into simpler parts.
How Decision Trees Work
1. Root Node: This is the starting point of the tree, representing the entire dataset. From
here, the data is split based on specific features.
2. Decision Nodes: These are points where the data is further divided based on the value
of a feature. Each decision node represents a test on an attribute (e.g., "Is the person's
age over 50?").
3. Leaf Nodes: These are the end points of the tree, representing the final decision or
outcome. In a classification tree, a leaf node would represent a class label (e.g., "Yes"
or "No"). In a regression tree, it represents a continuous value.
4. Branches: These are the connections between nodes, showing the outcomes of the
tests on the attributes.
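The four node types map directly onto nested if/else statements. A tiny hand-built classification tree (the feature names, thresholds, and class labels are invented for illustration):

```python
# A hand-written decision tree: root, decision nodes, branches, leaves.
def predict(person):
    # Root node: test on an attribute, as in the example above.
    if person["age"] > 50:
        # Decision node on the "age > 50" branch.
        if person["exercises"]:
            return "low risk"     # leaf node: class label
        return "high risk"        # leaf node
    return "low risk"             # leaf node

print(predict({"age": 60, "exercises": False}))  # high risk
print(predict({"age": 30, "exercises": False}))  # low risk
```

A learning algorithm builds exactly this kind of structure automatically, choosing which attribute to test at each node.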
Implementation Aspects of Decision Trees
1. Splitting Criteria: The core of building a decision tree involves deciding how to split
the data at each node. Common criteria include:
o Gini Impurity: Measures how mixed the class labels are within a node.
Lower impurity means purer, better splits.
o Entropy: Based on information gain, it calculates the homogeneity of the
sample.
o Mean Squared Error: Used for regression tasks, it measures the average of
the squares of the errors.
2. Pruning: This involves removing parts of the tree that do not provide additional
power in predicting target variables. Pruning helps in reducing overfitting, which
occurs when a model learns the training data too well and performs poorly on new
data.
3. Max Depth: Limiting the maximum depth of the tree helps in controlling overfitting
by restricting the complexity of the model.
4. Minimum Samples for Split: This parameter sets the minimum number of samples
required to split an internal node.
5. Handling Missing Values: Some implementations of decision trees can handle
missing values by either ignoring them or imputing them based on other features.
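To make the splitting criterion concrete, here is a small sketch of Gini impurity and how a candidate split is scored by the weighted impurity of its children (the example labels are illustrative):

```python
# Gini impurity: 1 - sum of squared class proportions.
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

# A pure node has impurity 0; a 50/50 node has the 2-class maximum of 0.5.
print(gini(["yes", "yes", "yes"]))       # 0.0
print(gini(["yes", "no", "yes", "no"]))  # 0.5

# A candidate split is scored by the weighted impurity of its children;
# the tree builder picks the split that minimizes this value.
left, right = ["yes", "yes"], ["no", "no", "yes"]
n = len(left) + len(right)
weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
print(round(weighted, 3))  # 0.267
```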