End Sem Paper

Reg No.

B.Tech. DEGREE EXAMINATION, JULY 2024
Seventh Semester

18CSE333J - BIG DATA TOOLS AND TECHNIQUES FOR BLOCKCHAIN

(For the candidates admitted from the academic year 2021 - 2022)

Note:
(i) Part - A should be answered in the OMR sheet within the first 40 minutes, and the OMR sheet should be handed over to the hall invigilator at the end of the 40th minute.
(ii) Part - B and Part - C should be answered in the answer booklet.

Time: 3 hours
Max. Marks: 100
PART - A (20 x 1= 20 Marks) Marks BL CO PO
Answer ALL Questions
1. What is Big Data?
A) A small amount of structured data
B) Large volumes of structured and unstructured data
C) Only structured data
D) Data that fits into traditional databases
2. Hadoop was inspired by which of the following papers?
A) Oracle White Paper
B) Microsoft Azure Paper
C) Google's MapReduce and Google File System
D) IBM Watson White Paper
3. Which Unix tool is commonly used for searching through large text files?
A) cat
B) grep
C) echo
D) ls
4. Hadoop Streaming allows the use of which programming languages to write MapReduce jobs?
A) Only Java
B) Java and Python
C) Any language that can read from standard input and write to standard output
D) Java and C++
5. Which of the following is a primary feature of HDFS?
A) It is designed for interactive queries.
B) It is optimized for reading large files.
C) It does not replicate data.
D) It is designed for low-latency access to small files.
6. Which command is used to list all files in a Hadoop directory?
A) hadoop fs -ls
B) hadoop fs -mkdir
C) hadoop fs -put
D) hadoop fs -rm
7. In HDFS data flow, what is the primary role of a DataNode?
A) To manage the file system namespace
B) To store and retrieve blocks
C) To coordinate between clients and data storage
D) To replicate data across nodes
8. What is serialization in Hadoop?
A) The process of converting a data structure into a byte stream
B) The process of compressing data
C) The process of distributing data across nodes
D) The process of querying data
9. What is the primary purpose of the Shuffle phase in MapReduce?
A) Sorting the intermediate key-value pairs
B) Merging mapper outputs
C) Sending data from mappers to reducers
D) Initializing job parameters
10. In MapReduce, job scheduling is primarily concerned with
A) Allocating map and reduce slots on nodes
B) Defining the input-output format for the job
C) Determining the order of mapper tasks
D) Allocating memory for tasks
11. Which of the following is NOT a potential cause of job failures in MapReduce?
A) Task Tracker failure
B) Network congestion during Shuffle
C) Incorrect reducer logic
D) Input data format errors
12. Which MapReduce format is suitable for handling large datasets where each input file is processed independently?
A) TextInputFormat
B) SequenceFileInputFormat
C) KeyValueTextInputFormat
D) MultipleInputsFormat
13. Which execution mode in Apache Pig is typically used for processing large datasets on a Hadoop cluster?
A) Local mode
B) MapReduce mode
C) Tez mode
D) Spark mode
14. Which of the following statements about Pig Latin is true?
A) It is a procedural language for defining data flows.
B) It supports ACID transactions.
C) It uses SQL-like syntax for querying data.
D) It requires compilation before execution.

15. Which component of Hive provides metadata management and storage for Hive tables?
A) Hive Shell
B) Hive Metastore
C) Hive Server
D) Hive CLI
16. Which statement accurately describes Hive tables compared to tables in traditional databases?
A) Hive tables are stored in memory for faster access.
B) Hive tables support ACID transactions.
C) Hive tables are schema-less and flexible compared to traditional databases.
D) Hive tables cannot be queried using SQL-like syntax.
17. What is a key characteristic of supervised learning?
A) Requires labeled data for training
B) Learns from rewards and punishments
C) Does not require training data
D) Does not use statistical methods
18. Which of the following is NOT a type of machine learning?
A) Supervised Learning
B) Unsupervised Learning
C) Reinforcement Learning
D) Collaborative Filtering
19. HBase is preferred over traditional RDBMS in scenarios requiring
A) ACID transactions
B) Schema flexibility and scalability
C) Simple data storage
D) Structured query language (SQL) support

20. In the context of Big Data Analytics, what is a distinguishing feature of Hadoop
compared to traditional data processing systems?
A) Real-time data processing capabilities
B) Centralized storage of all data
C) Ability to handle unstructured data
D) Distributed processing across clusters
PART - B (5 x 4 = 20 Marks) Marks BL CO PO
Answer ANY FIVE Questions
21. Consider a healthcare organization that needs to manage a variety of data types
including patient records, medical images, and social media feedback on healthcare
services. Examine how different types of digital data can be managed and utilized
effectively in such an organization.
22. Explain the evolution of Hadoop, highlighting key milestones in its development. How
did Hadoop address the limitations of traditional data processing systems?
23. A company needs to store and process several petabytes of log data generated by web
servers. Examine how HDFS can handle this requirement and mention its key
features that support large-scale data storage and processing.
24. Defend the choice of Avro for handling schema evolution in Hadoop, and explain how it
facilitates data serialization and deserialization in Hadoop.
25. Compare and contrast different input formats available in MapReduce, highlighting their
features and use cases.
26. Explain how to use Apache Pig to analyze a large dataset containing logs from multiple
servers in a distributed computing environment effectively.
27. Compare HBase with traditional relational databases (RDBMS) like MySQL in terms of
architecture, use cases, and scalability. Discuss when each database system is more
suitable for Big Data analytics applications.
PART - C (5 x 12 = 60 Marks) Marks BL CO PO
Answer ALL the Questions
28. a. Imagine Dharanee is a data scientist at an e-commerce company that needs to analyze
customer reviews written in various programming languages like Python and Ruby.
Demonstrate how Hadoop Streaming can be used to process these reviews, and describe
a complete workflow from data ingestion to result generation. Include the benefits of
using Hadoop Streaming in this scenario.
(OR)
b. Discuss the characteristics and challenges of Big Data. How does Hadoop address these
challenges to provide a robust solution for Big Data processing? Provide examples to
support your answers.

29. a. Discuss the different commands provided by the Hadoop File System (HDFS) command
line interface (CLI) for file and directory management. Include examples for operations
like creating files, deleting files, and modifying file permissions.
(OR)
b. Compare and contrast the functionalities of Flume and Sqoop for data ingestion into
Hadoop. Provide specific scenarios where each tool would be most appropriate to use.

30. a. Imagine Lakshman is leading a team tasked with developing a large-scale data
processing application for analysing petabytes of data collected from sensors in a
distributed computing environment. He has the option to choose between
implementing the application using the MapReduce programming model or traditional
parallel computing models like MPI (Message Passing Interface) or HPC (High
Performance Computing). Compare the advantages and disadvantages of selecting
MapReduce over traditional parallel computing models.
(OR)
b. Discuss the anatomy of a MapReduce job execution, detailing each phase and its
significance in processing large-scale data.

31. a. Discuss the anatomy of Apache Pig and its execution modes. How do these modes
impact data processing in different environments?
(OR)
b. An organization is planning to migrate its existing data warehouse to a Hadoop-based
solution using Apache Hive. Explain how it would design and implement Hive tables,
manage metadata, and optimize queries for efficient data retrieval and analysis.

32. a. Discuss the evolution of machine learning and its applications in data analytics using R.
Explain the concepts of supervised and unsupervised learning, providing examples of
each. Compare their strengths, weaknesses, and real-world applications.
(OR)

b. Imagine a team tasked with implementing a Big Data analytics solution using HBase for
a social media platform. Discuss the architecture design, data modeling approach, and
integration strategies with existing systems. Compare the advantages of using HBase
over traditional RDBMS for handling large-scale social media data. Support the
discussion with relevant examples and considerations.
