Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I have a question regarding Hadoop HDFS block replication. Suppose a block is written to a DataNode and the DFS has a replication factor of 3. How long does it take for the NameNode to replicate this block to other DataNodes? Is it instantaneous? If not, suppose the disk on this DataNode fails unrecoverably right after the block is written: does that mean the block is lost forever? Also, how often does the NameNode check for missing or corrupt blocks?

question from:https://stackoverflow.com/questions/65946576/how-often-are-blocks-on-hdfs-replicated


1 Answer

You may want to review this article, which has a good description of HDFS writes. Replication should be effectively immediate, although the exact delay depends on how busy the cluster is:

https://data-flair.training/blogs/hdfs-data-write-operation/

What happens if a DataNode fails while writing a file in HDFS?

If a DataNode fails while data is being written to it, the following actions take place, all of which are transparent to the client writing the data:

  1. The pipeline gets closed, and the packets in the ack queue are added to the front of the data queue, so that DataNodes downstream from the failed node do not miss any packets.
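
The requeue step in item 1 can be illustrated with a small sketch. This is a simplified model, not the actual HDFS client code: the function name `requeue_unacked` and the two deques standing in for the data queue and ack queue are hypothetical, and packets are represented by plain sequence numbers.

```python
from collections import deque


def requeue_unacked(data_queue: deque, ack_queue: deque) -> None:
    """Move every unacknowledged packet back to the front of the data queue.

    Hypothetical helper for illustration only. The original send order is
    preserved, so the oldest unacknowledged packet is resent first.
    """
    # Take packets from the newest end of the ack queue and push each one
    # onto the front of the data queue; the oldest packet ends up first.
    while ack_queue:
        data_queue.appendleft(ack_queue.pop())


# Packets 1-3 were sent but not yet acknowledged; 4-5 are still waiting.
data_queue = deque([4, 5])
ack_queue = deque([1, 2, 3])

requeue_unacked(data_queue, ack_queue)
print(list(data_queue))  # [1, 2, 3, 4, 5]
```

After the requeue, the ack queue is empty and the data queue holds every packet that has not been confirmed, so a rebuilt pipeline can resend them in order.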
