Why do we use HDFS for applications having large data sets and not when there are a lot of small files?

HDFS is better suited to a large amount of data stored in a single file than to the same amount of data spread across many small files. The NameNode keeps the metadata for every file and every block in memory, and it is an expensive, high-performance system, so it is not prudent to fill that space with the extra metadata that many small files generate. When a large amount of data sits in a single file, its metadata occupies far less space on the NameNode. Hence, for optimized performance, HDFS favours large data sets over many small files.
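The difference in metadata can be made concrete with a rough back-of-the-envelope estimate. The sketch below is only illustrative: it assumes the commonly quoted rule of thumb of about 150 bytes of NameNode memory per namespace object (one object per file, one per block) and a 128 MiB block size. Both figures are assumptions, not measurements from a real cluster, and the function names are hypothetical.

```python
# Rough estimate of NameNode metadata for one large file vs. many small files.
# ASSUMPTIONS (not taken from the original answer):
#   - about 150 bytes of NameNode memory per namespace object (rule of thumb)
#   - one object per file plus one object per block
#   - 128 MiB block size
BYTES_PER_OBJECT = 150
BLOCK_SIZE = 128 * 1024**2


def namenode_objects(file_sizes, block_size=BLOCK_SIZE):
    """Count namespace objects: one per file plus one per block it occupies."""
    objects = 0
    for size in file_sizes:
        blocks = -(-size // block_size)  # ceiling division using integers only
        objects += 1 + blocks
    return objects


def estimated_metadata_bytes(file_sizes, block_size=BLOCK_SIZE):
    """Estimated NameNode memory, in bytes, under the assumptions above."""
    return namenode_objects(file_sizes, block_size) * BYTES_PER_OBJECT


GIB = 1024**3
MIB = 1024**2

one_large_file = [10 * GIB]          # a single 10 GiB file
many_small_files = [1 * MIB] * 10240  # 10,240 files of 1 MiB (also 10 GiB)

if __name__ == "__main__":
    for label, files in [("one large file", one_large_file),
                         ("many small files", many_small_files)]:
        print(label, namenode_objects(files), "objects,",
              estimated_metadata_bytes(files), "bytes")
```

Under these assumptions the same 10 GiB of data needs 81 namespace objects as a single file but 20,480 objects as 1 MiB files, roughly 250 times more NameNode metadata for the same amount of data.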