Assignment-2

EVEN Semester Session 2024-2025


BIG DATA
(BCDS-601)

Max Marks: 10                                          Due Date: 25-04-2025


Note: 1. Mention your Name, Roll Number, Branch, Section, and subject code.
      2. Only hand-written answers will be accepted.

1. Explain how HDFS stores, reads, and writes files. Describe the sequence of operations
   involved in storing a file in HDFS, retrieving data from HDFS, and writing data to
   HDFS. (An illustrative read/write sketch using the Hadoop FileSystem API is given after
   the question list.)
2. Describe the considerations for deploying Hadoop in a cloud environment. What are the
advantages and challenges of running Hadoop clusters on cloud platforms like Amazon
Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP)?
3. Discuss the process of developing a MapReduce application. What are the key steps
   involved in writing, testing, and deploying a MapReduce program? (A minimal WordCount
   sketch is given after the question list.)
4. Describe the Hadoop Distributed File System (HDFS). How does HDFS manage the
storage and replication of data across a distributed cluster of machines?
5. Provide examples of real-world applications where Big Data analytics have been
   instrumental. How do industries such as healthcare, finance, e-commerce, and
   transportation leverage Big Data to gain insights and create value?
6. Explain the core concepts of HDFS, including NameNode, DataNode, and the file
system namespace. How do these components work together to manage data storage and
replication in Hadoop clusters?
7. Explain Apache Hadoop and its role in Big Data processing. What are the core
   components of the Apache Hadoop ecosystem, and how do they work together to enable
   distributed data storage and processing?
8. What are the benefits and challenges of using HDFS for distributed storage and
   processing?
9. Describe the concepts of file sizes, block sizes, and block abstraction in HDFS.
10. What are the different types of digital data commonly encountered in Big Data
    applications? Provide examples of structured, semi-structured, and unstructured data.
11. Explain the working of the following phases of MapReduce with one common example:
    (i) Map Phase, (ii) Combiner Phase, (iii) Shuffle and Sort. (A plain-Java trace of
    these phases is given after the question list.)
12. Define heartbeat in HDFS.
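
Illustrative sketch for Question 1: a minimal example of writing a file to HDFS and
reading it back through the Hadoop FileSystem API. The NameNode URI and the file path
are hypothetical placeholders, not values prescribed by the assignment.

// Write a small file to HDFS and read it back using the FileSystem API.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class HdfsReadWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode address; a real cluster address would go here.
        conf.set("fs.defaultFS", "hdfs://namenode:9000");
        FileSystem fs = FileSystem.get(conf);

        Path path = new Path("/user/demo/sample.txt");

        // Write: the client asks the NameNode for target DataNodes, then streams
        // the data, which is replicated along a pipeline of DataNodes.
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("hello hdfs\n".getBytes(StandardCharsets.UTF_8));
        }

        // Read: the client fetches block locations from the NameNode and reads
        // each block directly from a DataNode.
        try (FSDataInputStream in = fs.open(path);
             BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }

        fs.close();
    }
}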
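
Illustrative sketch for Question 3 (the phase names also appear in Question 11): a
minimal WordCount program following the structure of the standard Hadoop MapReduce
tutorial example. The job name and the command-line argument layout (input path,
output path) are assumptions for illustration; the jar would be submitted to a
cluster with "hadoop jar".

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map phase: emit (word, 1) for every token in the input line.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce phase (also reused as the combiner): sum the counts per word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    // Driver: configure and submit the job.
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);   // combiner aggregates map output locally
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}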
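
Illustrative trace for Question 11: a plain-Java simulation (no Hadoop dependencies)
of how the Map, Combiner, and Shuffle-and-Sort phases transform a tiny input before
the Reduce phase. The input splits and words are made up purely for illustration.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PhaseTrace {
    public static void main(String[] args) {
        String[] split1 = {"cat dog", "cat"};   // lines processed by mapper 1
        String[] split2 = {"dog dog cat"};      // lines processed by mapper 2

        // Map phase: each mapper emits (word, 1) pairs for its own split.
        List<String[]> map1 = mapPhase(split1);
        List<String[]> map2 = mapPhase(split2);

        // Combiner phase: local aggregation of each mapper's own output
        // (mapper 1's (cat,1),(cat,1) becomes (cat,2)) to reduce shuffle traffic.
        Map<String, Integer> combined1 = aggregate(map1);
        Map<String, Integer> combined2 = aggregate(map2);

        // Shuffle and sort: partial counts from all mappers are grouped by key,
        // and keys are presented to the reducer in sorted order.
        TreeMap<String, List<Integer>> shuffled = new TreeMap<>();
        for (Map<String, Integer> partial : List.of(combined1, combined2)) {
            for (Map.Entry<String, Integer> e : partial.entrySet()) {
                shuffled.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
            }
        }

        // Reduce phase: sum the grouped partial counts per word.
        for (Map.Entry<String, List<Integer>> e : shuffled.entrySet()) {
            int total = e.getValue().stream().mapToInt(Integer::intValue).sum();
            System.out.println(e.getKey() + "\t" + total);   // prints: cat 3, dog 3
        }
    }

    private static List<String[]> mapPhase(String[] lines) {
        List<String[]> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.split("\\s+")) {
                pairs.add(new String[]{word, "1"});
            }
        }
        return pairs;
    }

    private static Map<String, Integer> aggregate(List<String[]> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] kv : pairs) {
            counts.merge(kv[0], Integer.parseInt(kv[1]), Integer::sum);
        }
        return counts;
    }
}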
