Basic Statistics and Probability Assignment

Name: Subhranshu Pati
Semester: 2nd Sem
Roll Number: 2414500297
Course Code & Name: DCA1206 - Basic Statistics and Probability
Program: Bachelor of Computer Application

Set-1

Q1. (a) Define statistics and discuss its scope across different fields with examples.

Ans: -
Statistics is the science of collecting, organizing, analyzing, interpreting, and presenting data. It supports sound decision-making by revealing patterns and trends in quantitative data.

Statistics is applied across many fields. In business, it supports decision-making through the analysis of sales and customer behaviour. In medicine, it is used to test new drugs in clinical trials. In education, it helps assess learning outcomes and student performance. In computer science it is vital, since data analysis, machine learning, and artificial intelligence all rely on statistical methods. Government agencies use statistics on population, employment rates, and economic growth to frame public policy.

Overall, statistics is a powerful tool for problem-solving and evidence-based decision-making across disciplines, which makes it indispensable in today's information-driven world.

(b) Explain the process and importance of data classification, including the distinction
between attributes and variables.

Ans: -

Data classification is the process of sorting or grouping data into categories based on shared characteristics. It simplifies large datasets so that they become easier and more meaningful to analyse. The process involves identifying the type of data, grouping similar items together, and labelling the groups so that the data can be understood more clearly.

Classification is important because it makes comparison possible, visualization easier, and statistical analysis more precise. It also supports decision-making and the identification of trends and patterns.

In statistics, attributes are qualitative characteristics that cannot be measured numerically, such as gender, colour, or nationality; they are recorded as categories. Variables, in contrast, are numerically measurable, such as height, weight, and age, and can be further divided into discrete and continuous variables.

Understanding this distinction is essential for choosing appropriate statistical methods and interpreting data correctly.
(c) Describe the concept and advantages of frequency distribution in summarizing large
datasets.
Ans: -

A frequency distribution is a statistical technique for organizing and summarizing large volumes of data by showing how many times each value, or range of values, occurs. The data are divided into categories or bins (called classes), and the number of observations falling into each class is counted. This count is called the frequency.

A frequency distribution can be presented in tabular form or as a bar graph, histogram, or pie chart, making the data easier to understand and interpret.

The principal merit of a frequency distribution is that it simplifies the analysis of complex data, allowing conclusions to be drawn quickly. It helps in spotting patterns such as concentrations of values, trends, and outliers, and it makes it easier to compare different datasets or calculate measures of spread within the same dataset.

Moreover, frequency distribution forms the basis of further statistical analysis, such as measures of central tendency (mean, median, mode) and measures of dispersion (range, variance, standard deviation). It is particularly useful in fields such as business, education, and healthcare, where large datasets are common.
By transforming raw data into a systematic form, a frequency distribution supports realistic conclusions and data-driven decisions.
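
A minimal sketch of this idea, assuming plain Python (standard library only) and hypothetical exam scores, builds a frequency distribution with class intervals of width 10:

from collections import Counter

# Hypothetical exam scores
scores = [42, 55, 61, 48, 72, 55, 68, 80, 61, 55, 47, 73, 66, 59, 61]

# Assign each score to a class interval of width 10 and count occurrences
class_width = 10
frequencies = Counter((score // class_width) * class_width for score in scores)

# Print the frequency distribution table, one class per line
for lower in sorted(frequencies):
    print(f"{lower}-{lower + class_width - 1}: {frequencies[lower]}")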

Q2. (a) What is central tendency? Explain its purpose and the characteristics of a good
measure.

Ans: -

Central tendency is a statistical measure that identifies a single value representing the centre, or average, of a dataset. The standard measures are the mean, median, and mode. Its purpose is to provide one summary figure that describes the whole dataset and simplifies comparison and analysis.

A good measure of central tendency should meet the following criteria:

• Clearly defined and simple to understand
• Based on all the values in the dataset
• Not unduly influenced by extreme values (robust)
• Capable of further algebraic treatment
• Stable and consistent across different samples
(b) Illustrate the calculation of median and mode for grouped and ungrouped data with
suitable examples.

Ans:-

For ungrouped data, the median is the middle value when data is arranged in order.
Example:
Data: 4, 6, 8, 10, 12 → Median = 8 (middle value)
Mode is the value that occurs most frequently. Example:
Data: 3, 5, 7, 7, 9 → Mode = 7

For grouped data, formulas are used.

Median formula:

Median = L + ((N/2 − CF) / f) × h

where L = lower boundary of the median class, N = total frequency, CF = cumulative frequency before the median class, f = frequency of the median class, and h = class width.

Mode formula:

Mode = L + ((f1 − f0) / (2f1 − f0 − f2)) × h

where f1 = frequency of the modal class, f0 = frequency of the class before it, and f2 = frequency of the class after it.

These methods help summarize large data clearly.
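
A brief sketch of both cases in Python (standard library only; the grouped-data figures below are hypothetical):

from statistics import median, mode

# Ungrouped data: middle value and most frequent value
print(median([4, 6, 8, 10, 12]))  # 8
print(mode([3, 5, 7, 7, 9]))      # 7

def grouped_median(L, N, CF, f, h):
    """Median = L + ((N/2 - CF) / f) * h for grouped data."""
    return L + ((N / 2 - CF) / f) * h

# Hypothetical median class 20-30: lower boundary 20, total frequency 50,
# cumulative frequency before the class 18, class frequency 12, width 10
print(grouped_median(L=20, N=50, CF=18, f=12, h=10))  # about 25.83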

(c) Differentiate between arithmetic mean and weighted mean, and discuss their
applications.

Ans: -

The arithmetic mean is the simple average of a set of values, calculated by dividing the
sum of all values by the number of observations. It gives equal importance to every value.
Arithmetic Mean = Σx / n
The weighted mean, on the other hand, assigns different weights to values based on their importance or frequency. It is calculated as:

Weighted Mean = Σ(w × x) / Σw

The arithmetic mean is appropriate when all observations are equally important, such as averaging exam marks or daily temperatures. The weighted mean is used when observations differ in importance, for example in computing a grade point average, a price index, or a portfolio return.
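
A short sketch in plain Python, using hypothetical marks weighted by course credits:

marks = [80, 60, 90]
credits = [4, 2, 3]  # hypothetical course credits used as weights

arithmetic_mean = sum(marks) / len(marks)
weighted_mean = sum(w * x for w, x in zip(credits, marks)) / sum(credits)

print(round(arithmetic_mean, 2))  # 76.67: every mark counts equally
print(round(weighted_mean, 2))    # 78.89: higher-credit courses count more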
Q3. (a) What is dispersion? Discuss its significance and the difference between absolute
and relative measures.

Ans:-

Dispersion describes how data values are spread around a central value such as the mean or median. It helps in understanding the spread, uniformity, and distribution pattern of a dataset.

Significance: Dispersion is needed to judge how reliable an average is. It indicates whether the data points cluster tightly around the centre or scatter widely, which makes it possible to compare different datasets and to understand trends, risk, and consistency.

Absolute measures (such as the range, mean deviation, variance, and standard deviation) are expressed in the same units as the data.
Relative measures (such as the coefficient of variation) are unitless; they are expressed as ratios or percentages, which makes comparisons possible between datasets of differing scales or units.
(b) Explain how to calculate range, variance, and standard deviation using an example
each.

Ans:-

Range is the difference between the highest and lowest values.

Example: Data = 5, 10, 15 → Range = 15 - 5 = 10

Variance measures the average squared deviation from the mean.

Example: Data = 4, 8, 12 → Mean = 8

Variance = [(4−8)² + (8−8)² + (12−8)²] / 3

= [16 + 0 + 16] / 3 = 10.67

Standard Deviation is the square root of variance.

Using the above example:

Standard Deviation = √10.67 ≈ 3.27

These measures help understand how much the data varies from the average, indicating
consistency or spread.
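
These steps can be reproduced with Python's statistics module; pvariance and pstdev use the population formulas (division by n), matching the division by 3 in the example above:

from statistics import pvariance, pstdev

data = [4, 8, 12]

value_range = max(data) - min(data)  # 12 - 4 = 8
variance = pvariance(data)           # [(4-8)^2 + (8-8)^2 + (12-8)^2] / 3
std_dev = pstdev(data)               # square root of the variance

print(value_range, variance, round(std_dev, 2))  # 8 10.67 3.27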

(c) Describe one practical scenario each where standard deviation and relative variance
are important in decision-making.
Ans:-

Standard deviation is widely used in finance and investment. For example, an investor
comparing two stocks may use standard deviation to understand risk. If Stock A has an
average return of 10% with a standard deviation of 2%, and Stock B also has a 10%
return but with a standard deviation of 8%, Stock A is more stable. A lower standard
deviation means returns are more consistent, helping investors make safer decisions based
on risk tolerance.

Relative variance (often measured as the coefficient of variation, CV) is important when comparing datasets with different units or scales. For example, in business operations, a company may compare the variability in delivery times between two suppliers.

Supplier A: Mean = 4 days, SD = 1 day → CV = (1/4) × 100 = 25%
Supplier B: Mean = 10 days, SD = 2 days → CV = (2/10) × 100 = 20%

Although Supplier B has a higher standard deviation, it has lower relative variability,
making it more reliable over time.
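
A minimal sketch of this comparison in Python (the helper function is illustrative):

def coefficient_of_variation(mean, std_dev):
    """CV = (SD / mean) * 100, a unitless percentage."""
    return std_dev / mean * 100

print(coefficient_of_variation(4, 1))   # Supplier A: 25.0
print(coefficient_of_variation(10, 2))  # Supplier B: 20.0, relatively more consistent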

These measures support better decision-making by evaluating consistency, performance, and risk in real-world scenarios.

Set-2

Q4. (a) Define the classical, empirical, and subjective approaches to probability. Explain
their significance in analyzing uncertainty.

Ans:-

The classical approach to probability assumes that every outcome of an experiment is equally likely.

Example: P(getting heads on a coin toss) = 1/2.

The empirical (or frequentist) approach is based on observed data or repeated trials; probability is calculated from the relative frequency of past occurrences.

Example: if it rains on 30 out of 100 days, the probability of rain = 30/100 = 0.3.

The subjective approach relies on personal judgment or expert experience rather than on data.

Example: a physician who believes a patient has a 70% chance of recovery.

Together, these approaches make it possible to analyse uncertainty in different situations, from theoretical models to practical data and expert forecasts, supporting better risk evaluation and judgment.
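
A small simulation sketch in Python shows how the empirical estimate approaches the classical value as trials accumulate (the seed is arbitrary, for reproducibility):

import random

random.seed(42)  # reproducible illustration

classical_p = 1 / 2  # classical: heads and tails equally likely

# Empirical: estimate from 10,000 simulated coin tosses
tosses = 10_000
heads = sum(random.choice("HT") == "H" for _ in range(tosses))
empirical_p = heads / tosses

print(classical_p, empirical_p)  # the empirical value is close to 0.5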

(b) What is a sample space? Distinguish between finite, infinite, discrete, and continuous
sample spaces with examples.
Ans:- A sample space is the set of all possible outcomes of a random experiment. It is denoted by S.

Finite sample space: has a finite number of outcomes.

Example: flipping a coin → S = {Heads, Tails}

Infinite sample space: has infinitely many outcomes.

Example: the number of coin flips needed until heads appears → S = {1, 2, 3, ...}

Discrete sample space: has countable elements.

Example: rolling a die → S = {1, 2, 3, 4, 5, 6}

Continuous sample space: has uncountably many outcomes, typically measurements.

Example: the height of people → S is the set of all real numbers in a range (e.g., 150 cm to 200 cm)

Clearly defining the sample space is what allows probabilities to be assigned consistently.

(c) Define an event and discuss its role in probability theory, using a real-life situation.

Ans:-

An event in probability theory is a subset of the sample space, that is, a set of outcomes satisfying a given condition. It may consist of a single outcome or several outcomes, and it is usually denoted by a capital letter such as A, B, or C.

Events play a central role in probability because they are the occurrences whose probabilities are actually calculated. Common types include simple events (a single outcome), compound events (multiple outcomes), mutually exclusive events, and independent events.

Real-life example:
Consider drawing a red card from a standard pack of 52 playing cards.

Sample space (S): the 52 cards.

Event (A): drawing a red card → 26 outcomes (13 hearts and 13 diamonds).

P(A) = 26 / 52 = 0.5

Here, drawing a red card is the event, and its probability can then be used to guide a decision in a game or simulation.

Defining events makes it possible to structure probability problems and compute the likelihood of specific outcomes, which is critical in domains such as insurance, finance, and machine learning.
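
The same calculation can be sketched in Python by enumerating the deck (the rank and suit names are illustrative):

from itertools import product

# Build a 52-card deck as (rank, suit) pairs
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

# Event A: the card drawn is red (hearts or diamonds)
red_cards = [card for card in deck if card[1] in ("hearts", "diamonds")]

print(len(red_cards) / len(deck))  # 26/52 = 0.5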

Q5.(a) Differentiate deterministic, non-deterministic, and hybrid experiments with examples. How are these experiments used in probability analysis?

Ans:-

A deterministic experiment is reproducible: it always gives the same outcome when carried out under the same conditions.

Example: combining two chemicals in a fixed ratio always produces the same reaction.

A non-deterministic (random) experiment has unpredictable outcomes.

Example: flipping a coin, where the outcome may be either heads or tails.

A hybrid experiment combines deterministic and non-deterministic parts.

Example: a machine always processes an item in the same way (deterministic), but defects occur probabilistically (non-deterministic).

In probability analysis, deterministic experiments require no probability at all, since their results are certain. Probability-based analysis is applied to non-deterministic and hybrid experiments to forecast outcomes, handle uncertainty, and support decision-making in disciplines such as AI, operations research, and risk management.
(b) Explain the concept of expected value (EV). How is EV used in evaluating decision-
making scenarios? Provide an example.

Ans:-

Expected Value (EV) is the probability-weighted average of all possible outcomes of a random variable. It represents the long-run mean if the experiment were repeated many times.

Formula:

EV = Σ(Pi × Xi)

where Pi is the probability of outcome Xi.

Use in decision-making: EV helps compare options based on their potential outcomes and
probabilities, guiding rational choices.

Example: A game offers ₹500 with a 0.2 chance and ₹0 with a 0.8 chance.
EV = (0.2 × 500) + (0.8 × 0) = ₹100

If the entry fee is ₹90, playing is profitable; if it’s ₹120, it’s a loss.
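
A minimal sketch of this evaluation in Python (amounts in rupees, as in the example):

def expected_value(outcomes):
    """EV = sum of probability * payoff over all (p, x) pairs."""
    return sum(p * x for p, x in outcomes)

# The game above: win 500 with probability 0.2, otherwise 0
ev = expected_value([(0.2, 500), (0.8, 0)])

print(ev)        # 100.0
print(ev - 90)   # +10.0: profitable at an entry fee of 90
print(ev - 120)  # -20.0: a loss at an entry fee of 120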

(c) What are equally likely and exhaustive events? Illustrate how they influence probability
calculations.

Ans:-

Equally likely events are events that have the same chance of occurring.

Example: in a coin toss, heads and tails each have a probability of 0.5.

Exhaustive events together include every possible outcome of an experiment: they fill the sample space, so one of them must occur.

Example: when rolling a die, the events {1, 2, 3, 4, 5, 6} are exhaustive, because only these six numbers can appear.

These concepts are important in probability calculations. When outcomes are equally likely
and exhaustive, the probability of an event is calculated using:
P(E) = (Number of favorable outcomes) / (Total number of outcomes)
Example: Probability of getting an even number when rolling a die:

Favorable outcomes = {2, 4, 6} → 3 outcomes

Total outcomes = 6
P(E) = 3/6 = 0.5
Understanding these terms ensures accurate and fair analysis of random experiments in
games, statistics, and real-life decisions.
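
In Python, this counting of favorable versus total outcomes is a short sketch:

die = [1, 2, 3, 4, 5, 6]  # equally likely, exhaustive outcomes
evens = [x for x in die if x % 2 == 0]
print(len(evens) / len(die))  # 3/6 = 0.5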

Q6.(a) Explain the addition rule of probability. Differentiate mutually exclusive and non-
mutually exclusive events with examples.

Ans:-

The addition rule of probability is used to find the probability that at least one of two events occurs.

• For mutually exclusive events (events that cannot occur together):

P(A ∪ B) = P(A) + P(B)

Example: Drawing a red card or a black card from a deck – both cannot happen at
once.
• For non-mutually exclusive events (events that can occur together):

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Example: Drawing a red card or a king – one card can be both red and a king.

This rule helps in calculating accurate probabilities when dealing with combined
events.
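
Applying the second form to the red-card-or-king example (a standard deck has 26 red cards, 4 kings, and 2 red kings):

P(red or king) = 26/52 + 4/52 − 2/52 = 28/52 ≈ 0.54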

(b) Describe the multiplication rule of probability with examples of independent and
dependent events.
Ans:-

The multiplication rule of probability is used to find the probability of two events
both occurring.

• For independent events (one does not affect the other):

P(A ∩ B) = P(A) × P(B)

Example: Tossing a coin and rolling a die.
P(Heads and 4) = 1/2 × 1/6 = 1/12

• For dependent events (one affects the outcome of the other):

P(A ∩ B) = P(A) × P(B|A)

Example: Drawing two cards without replacement from a deck.
P(First red and second red) = 26/52 × 25/51 = 325/1326
This rule is essential in analyzing compound events in probability.
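
Both cases can be checked exactly with Python's fractions module:

from fractions import Fraction

# Independent: coin toss and die roll
print(Fraction(1, 2) * Fraction(1, 6))      # 1/12

# Dependent: two red cards without replacement
print(Fraction(26, 52) * Fraction(25, 51))  # 25/102 (325/1326 in lowest terms)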
(c) How are Bayes’ Theorem and conditional probability applied in practical decision-making
problems?

Ans:-

Conditional probability is the probability that an event occurs given that another event has already occurred. It is useful in situations where the outcome depends on prior knowledge.

Bayes' Theorem is a formula for revising the probability of an event in the light of new evidence. It takes the form:

P(A|B) = [P(B|A) × P(A)] / P(B)

These ideas are applied in practical decision-making across many fields (a numeric sketch follows at the end of this answer):

• Medical diagnosis: physicians use Bayes' Theorem to compute the probability that a patient has a disease given a positive test result, taking into account the accuracy of the test and the prevalence of the disease.

• Spam detection: email filters use conditional probability to estimate the likelihood that a message is spam based on the words and patterns it contains.

• Business risk analysis: estimating the probability of financial loss based on market signals.

• Machine learning: algorithms apply Bayesian reasoning in classifiers (such as Naive Bayes) to make predictions under uncertainty.

By combining new evidence with existing knowledge, Bayes' Theorem and conditional probability make decisions more precise and data-driven.
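
As a sketch of the medical-diagnosis case, using hypothetical figures (1% prevalence, a test with 95% sensitivity and a 5% false-positive rate):

def bayes(p_a, p_b_given_a, p_b_given_not_a):
    """P(A|B) = P(B|A) * P(A) / P(B), with P(B) from the law of total probability."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b

# P(disease | positive test) with the hypothetical prevalence and test accuracy
p = bayes(p_a=0.01, p_b_given_a=0.95, p_b_given_not_a=0.05)
print(round(p, 3))  # 0.161: still fairly low, despite the positive result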
