Last Year Question Paper - Big Data - (BCS 061)

The document outlines a series of examination questions related to Big Data for B.Tech students across multiple academic years. It includes sections on theoretical concepts, practical applications, and specific technologies such as Hadoop, HDFS, MapReduce, and NoSQL databases. The exam format consists of multiple sections, with varying question types and marks distribution, aimed at assessing students' understanding of Big Data principles and practices.

Printed Pages : 1 Roll No.

NIT067
B.TECH.
THEORY EXAMINATION (SEM–VI) 2016-17
BIG DATA
Time : 3 Hours Max. Marks : 100
Note: Be precise in your answer. In case of a numerical problem, assume data wherever not provided.

SECTION – A
1. Explain the following: 10 x 2 = 20
(a) List the characteristics of big data.
(b) How to calculate risk in marketing?
(c) Why would you use inferential statistics in big data?
(d) What do you mean by sharding?
(e) State the usage of Hadoop pipes.
(f) Compare master-slave and peer-to-peer architecture in NoSQL.
(g) What is the purpose of bloom filter?
(h) Compare the classic Map Reduce with YARN.
(i) Mention the usage of Grunt.
(j) How are Date and Time data types used in Hive?
(k) Why is Hive preferred over Pig Latin?

SECTION – B
2. Attempt any five of the following questions: 5 x 10 = 50
(a) Relate crowd sourcing and big data. Justify the relationship with an example.
(b) Write down the aggregate data model in detail with an example.
(c) Differentiate between "scale up" and "scale out". Explain with an example how Hadoop uses
the scale-out feature to improve performance.
(d) Discuss in detail about the basic building blocks of Hadoop with a neat sketch.
(e) Explain in detail about Map-reduce Workflows.
(f) Provide overview of HBase data model.
(g) Enumerate the rules followed in data modeling in Cassandra. How are relationships
handled in Cassandra?
(h) Write down the Hive queries for natural join and outer join. Give examples.

SECTION – C
Attempt any two of the following questions: 2 x 15 = 30
3 (i) Explain with a neat sketch the processing of a job in Hadoop.
(ii) List the various operational modes of Hadoop cluster configuration and explain
in detail how to configure/install Hadoop in local/standalone mode.
4 (i) Consider the student data file (st.txt), with data in the following format: Name,
District, age, gender.
• Write a PIG script to display the names of all female students.
• Write a PIG script to find the number of students from XXXX District.
• Write a PIG script to display the district-wise count of all male students.
(ii) Explain the operators supported by Pig with respect to data access, transformations and
debugging operations.
5 (i) Discuss the different ways of constructing version stamps. What are their pros
and cons?
(ii) Write in detail about the three dimensions of big data.
Printed Pages:2 Sub Code: NIT-067
Paper Id: 113627 Roll No.
B. TECH.
(SEM VI) THEORY EXAMINATION 2017-18
BIG DATA
Time: 3 Hours Total Marks: 100
Note: Attempt all Sections. If any data is missing, choose suitably.

SECTION A

1. Attempt all questions in brief. 2 x 10 = 20


a. What is big data? Why do we need to analyze big data?
b. Define “Data Locality Optimization”.
c. List the tools related to Hadoop.
d. State the purpose of Hadoop Pipes.
e. What is MapReduce?
f. Write the difference between operational and analytical systems.
g. Explain Hadoop distributed file system.
h. Write down any four industry examples for Big Data.
i. List the entities of YARN.
j. What is Hadoop architecture?
SECTION B

2. Attempt any three of the following: 10 x 3 = 30


a. Why is crowdsourcing analytics needed? Explain.
b. Illustrate how cloud and big data are related to each other.
c. Discuss the design of Hadoop Distributed File System (HDFS) in detail.
d. Discuss the queries involved in Hive data definition.
e. Write in detail about the HBase data model and the Pig data model.
SECTION C
3. Attempt any one part of the following: 10 x 1 = 10
(a) How does Hadoop system analyze data? Explain your answer with example.
(b) Explain Cassandra data model.
4. Attempt any one part of the following: 10 x 1 = 10
(a) Explain the Anatomy of MapReduce job run.
(b) Discuss the different types and formats of Map Reduce with examples.
5. Attempt any one part of the following: 10 x 1 = 10
(a) With the help of a Data Model explain aggregations and relations.
(b) Write a brief note on composing map-reduce calculation.
6. Attempt any one part of the following: 10 x 1 = 10
(a) Explain master-slave and peer-to-peer replication in detail.
(b) Discuss about the three dimensions of Big Data.
7. Attempt any one part of the following: 10 x 1 = 10
(a) Describe graph databases and schemaless databases.
(b) Elaborate on graph mapping schemas. What do you mean by lower bounds
on replication rate?
Printed Page: 1 of 1
Subject Code: KCS061
Roll No:

BTECH
(SEM VI) THEORY EXAMINATION 2021-22
BIG DATA
Time: 3 Hours Total Marks: 100
Note: Attempt all Sections. If you require any missing data, then choose suitably.

SECTION A
1. Attempt all questions in brief. 2*10 = 20
Qno Questions CO
(a) List any five Big Data platforms. 1
(b) Write any two industry examples for Big Data. 1
(c) What is the role of Sort & Shuffle in Map-Reduce? 2
(d) Give the full form of HDFS. 2
(e) What is the block size of HDFS? 3
(f) Name the two types of nodes in Hadoop. 3
(g) Compare and contrast NoSQL and relational databases. 4
(h) Does MongoDB support ACID properties? Justify your answer. 4
(i) Describe schema. 5
(j) Discuss the different types of data that can be handled with HIVE. 5
SECTION B
2. Attempt any three of the following: 10*3 = 30
Qno Questions CO
(a) Detail the three dimensions of Big Data. 1
(b) Illustrate the architecture of Map-Reduce. 2
(c) Examine how a client reads and writes data in HDFS. 3
(d) With the help of a suitable example, explain how CRUD operations are performed in MongoDB. 4
(e) Differentiate between Map-Reduce, PIG and HIVE. 5

SECTION C
3. Attempt any one part of the following: 10*1 = 10
(a) Discuss in detail the different forms of Big Data. 1
(b) Elaborate on the various components of Big Data architecture. 1

4. Attempt any one part of the following: 10*1 = 10
(a) Explain the detailed architecture of Map-Reduce. 2
(b) Differentiate between "scale up" and "scale out". Explain with an example how Hadoop uses the scale-out feature to improve performance. 2

5. Attempt any one part of the following: 10*1 = 10
(a) Demonstrate the design of HDFS and its concepts in detail. 3
(b) Write the benefits and challenges of HDFS. 3


6. Attempt any one part of the following: 10*1 = 10
(a) Classify and detail the different types of NoSQL databases. 4
(b) Summarize the role of indexing in MongoDB using an example. 4

7. Attempt any one part of the following: 10*1 = 10


(a) Explore various execution models of PIG. 5
(b) Design and explain the detailed architecture of HIVE. 5

QP22EP1_290 | 15-06-2022 13:28:34 | 117.55.242.131


Printed Pages: 2 Sub Code:KCS-061
Paper Id: 236658 Roll No.

B.TECH
(SEM VI) THEORY EXAMINATION 2022-23
BIG DATA
Time: 3 Hours Total Marks: 100
Note: Attempt all Sections. If any data is missing, choose suitably.

SECTION A

1. Attempt all questions in brief. 2 x 10 = 20


(a) List any five big data platforms.
(b) Discuss the importance of Hadoop technology in big data analytics.
(c) Explain three benefits of MapReduce.
(d) Define heartbeat in HDFS.
(e) Define data replication in Hadoop distributed file system.
(f) Differentiate between Flume and Sqoop.
(g) Compare and contrast NoSQL and relational databases.
(h) Explain briefly about the schedulers.
(i) Differentiate between Pig and MapReduce.
(j) Discuss the metastore in HIVE in brief.

SECTION B

2. Attempt any three of the following: 10x3=30

(a) Explain the Hadoop ecosystem in detail.
(b) Discuss master-slave and peer-to-peer replication in detail.
(c) Examine the process of reading and writing data in HDFS by a client.
(d) Explain, with examples, how CRUD operations are performed in MongoDB.
(e) Draw and explain the detailed architecture of HIVE.

SECTION C

3. Attempt any one part of the following: 10x1=10


(a) Detail analysis vs. reporting while introducing Big Data.
(b) Elaborate on the various components of Big Data architecture.

4. Attempt any one part of the following: 10x1=10

(a) Discuss the detailed architecture of Map-Reduce.
(b) Discuss the detailed architecture of YARN along with its components.

5. Attempt any one part of the following: 10x1=10


(a) Demonstrate the design of HDFS and concept in detail.
(b) Discuss in brief the cluster specification. Describe how to set up a
Hadoop cluster.

QP23EP1_290 | 21-06-2023 08:54:41 | 125.20.113.226


6. Attempt any one part of the following: 10x1=10
(a) State the features of Apache Spark and also explain three ways in which Spark can be
built with Hadoop components.
(b) State the differences between Java and Scala. Also explain various features of Scala.

7. Attempt any one part of the following: 10x1=10


(a) Explain the architecture of HIVE. Also explain data flow in HIVE.
(b) Compare and Contrast
(i) Apache Pig vs Map-Reduce
(ii) Pig vs SQL
(iii) Pig vs HIVE



Printed Page: 1 of 2
Subject Code: KCS061
Roll No:

BTECH
(SEM VI) THEORY EXAMINATION 2023-24
BIG DATA
TIME: 3 HRS M.MARKS: 100

Note: Attempt all Sections. If any data is missing, choose suitably.

SECTION A
1. Attempt all questions in brief. 2 x 10 = 20
Q no. Question Marks CO
a. What are the different types of digital data commonly encountered in Big Data applications? Provide examples of structured, semi-structured, and unstructured data. 02 1
b. What constitutes a Big Data platform? 02 1
c. What is Hadoop Streaming? 02 2
d. Discuss the data formats commonly used in Hadoop environments. 02 2
e. Describe the concepts of file sizes, block sizes, and block abstraction in HDFS. 02 3
f. What are the benefits and challenges of using HDFS for distributed storage and processing? 02 3
g. What are the characteristics and use cases for schedulers such as Fair Scheduler and Capacity Scheduler? 02 4
h. What is YARN? 02 4
i. What is Apache Pig? 02 5
j. Describe the Grunt shell in Apache Pig. 02 5

SECTION B

2. Attempt any three of the following: 3 x 10 = 30


a. Distinguish between data analysis and reporting in the context of Big Data. How does advanced analytics go beyond traditional reporting to uncover hidden patterns, trends, and correlations in data? 10 1
b. Explain Apache Hadoop and its role in big data processing. What are the core components of the Apache Hadoop ecosystem, and how do they work together to enable distributed data storage and processing? 10 2
c. Explain the core concepts of HDFS, including NameNode, DataNode, and the file system namespace. How do these components work together to manage data storage and replication in Hadoop clusters? 10 3
d. Define NoSQL databases. What are the key characteristics and benefits of NoSQL databases compared to traditional relational databases? 10 4
e. Provide an overview of Apache Hive architecture and its components. How does Hive translate SQL-like queries into MapReduce jobs for data processing in Hadoop? 10 5

SECTION C

3. Attempt any one part of the following: 1 x 10 = 10


a. Describe the "5 Vs" of Big Data. What does each of these terms represent in the context of Big Data, and why are they essential considerations for data management and analysis? 10 1

QP24EP1_290 | 24-Jun-2024 9:12:34 AM | 117.55.242.132

b. Provide examples of real-world applications where Big Data analytics have been instrumental. How do industries such as healthcare, finance, e-commerce, and transportation leverage Big Data to gain insights and create value? 10 1

4. Attempt any one part of the following: 1 x 10 = 10


a. Describe the Hadoop Distributed File System (HDFS). How does HDFS manage the storage and replication of data across a distributed cluster of machines? 10 2
b. Discuss the process of developing a MapReduce application. What are the key steps involved in writing, testing, and deploying a MapReduce program? 10 2

5. Attempt any one part of the following: 1 x 10 = 10


a. Explain how HDFS stores, reads, and writes files. Describe the sequence of operations involved in storing a file in HDFS, retrieving data from HDFS, and writing data to HDFS. 10 3
b. Describe the considerations for deploying Hadoop in a cloud environment. What are the advantages and challenges of running Hadoop clusters on cloud platforms like Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP)? 10 3

6. Attempt any one part of the following: 1 x 10 = 10


a. Explain the operations for creating, updating, and deleting documents in MongoDB. What are the MongoDB CRUD operations, and how are they used to manipulate data in collections? 10 4
b. Discuss Resilient Distributed Datasets (RDDs) in Spark. What are RDDs, and how do they enable fault-tolerant and distributed data processing in Spark applications? 10 4

7. Attempt any one part of the following: 1 x 10 = 10


a. Introduce the concepts of HBase and its role in the Hadoop ecosystem. How does HBase differ from traditional relational databases, and what advantages does it offer for storing and accessing large-scale data? 10 5
b. Discuss the HiveQL language used in Apache Hive. How does HiveQL support SQL-like syntax for defining tables, querying data, and performing data manipulation operations? 10 5

Printed Page: 1 of 2
Subject Code: KOE097
Paper ID : 250138 RollNo:
BTECH
(SEM VIII) THEORY EXAMINATION 2024-25
BIG DATA
TIME: 3 HRS M.MARKS: 100

Note: Attempt all Sections. In case of any missing data; choose suitably.

SECTION A
1. Attempt all questions in brief. 2 x 10 = 20
Q No. Question
a What is Big Data?
b Explain characteristics of big data.
c Explain the four V's of big data.
d Discuss application of big data.
e What is big data analytics?
f Discuss challenges of big data.
g Differentiate between structured and non-structured data.
h Differentiate between HDFS and HBase.
i What is Zookeeper? List its benefits.
j Differentiate between Apache Pig and Map Reduce.

SECTION B
2. Attempt any three of the following: 10 x 3 = 30
Q No. Question
a. Explain how big data processing is different from distributed processing.
b. Discuss Hadoop YARN in detail with failures in classic Map Reduce.
c. Explain the Map Reduce framework in detail.
d. What are the name node and data node in Hadoop architecture?
e. Differentiate between NoSQL and SQL.
SECTION C
3. Attempt any one part of the following: 10 x 1 = 10
Q No. Question
a What is map reduce? Explain the working of the various phases of map reduce with an appropriate example and diagram.
b Explain the working of Hive with proper steps and a diagram.

4. Attempt any one part of the following: 10x 1 = 10


Q No. Question
a Explain "Map Phase" and "Combiner Phase" in Map Reduce.
b What is data serialization? Make a note on how the type of data affects data serialization.

QP25EP1 143 | 19-May-2025 9:06:19 AM | 182.71.247.82

5. Attempt any one part of the following: 10 x1= 10


Q No. Question
a What is Hadoop Ecosystem? Explain various components of Hadoop Ecosystem.
b Explain Spark components in detail. Also list features of Spark.

6. Attempt any one part of the following: 10 x 1= 10


Q No. Question
a What is a NoSQL database? List the differences between NoSQL and relational databases. Explain in brief various types of NoSQL databases in practice.
b Define HDFS. Discuss HDFS architecture and HDFS commands in brief.

7. Attempt any one part of the following: 10 x1= 10


Q No. Question
a What do you mean by HiveQL data definition language? Explain any three HiveQL data definition language statements with their syntax and example.
b Explain job scheduling in Map Reduce. How is it done in case of
(i) fair scheduling (ii) capacity scheduling