But how does the little yellow fellow know what is slow and when to schedule? When running large jobs, it looks like Hadoop runs some speculative tasks anyway, and many people get the idea that a fixed, magic number of speculative tasks is launched for each job. That would be far too expensive when the number of mappers or reducers is large, so Hadoop goes the semi-smart way.
The current algorithm works the following way:
A speculative task is kicked off for mappers and reducers whose completion rate is below a certain percentage of the completion rate of the majority of running tasks. For example, if you have 100 mappers, 90 of which are at 80% completion and 10 at 20%, then Hadoop will start 10 additional tasks for the slower ones.
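The 100-mapper example can be reproduced with a small Python sketch. This is not Hadoop's scheduler code: the function name, the use of the median as a stand-in for "the majority of running tasks", and the 0.5 threshold are all assumptions made for illustration, and it compares completion fractions rather than measured rates.

```python
from statistics import median


def pick_speculative_candidates(progress, threshold=0.5):
    """Return indices of tasks that lag far behind the typical task.

    progress  -- completion fraction (0.0 to 1.0) of each running task
    threshold -- hypothetical cutoff: a task is "slow" when its progress
                 is below threshold * median progress (assumed value)
    """
    if not progress:
        return []
    typical = median(progress)  # assumption: median approximates "the majority"
    return [i for i, p in enumerate(progress) if p < threshold * typical]


# 90 mappers at 80% completion, 10 mappers at 20%
progress = [0.8] * 90 + [0.2] * 10
slow = pick_speculative_candidates(progress)
print(len(slow))  # → 10
```

With these numbers the median is 0.8, so the cutoff is 0.4 and only the ten tasks at 20% are selected.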
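In versions of Hadoop after 0.20.2 there are 3 new fields in the JobConf:
- mapreduce.job.speculative.speculativecap
- mapreduce.job.speculative.slowtaskthreshold
- mapreduce.job.speculative.slownodethreshold
A speculative task is launched only while the number of speculative tasks already launched is below speculativecap.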
In older versions of Hadoop these threshold values are fixed and cannot be modified.
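One way to experiment with the three properties is to pass them as `-D key=value` options on the command line. The Python helper below only builds that argument list. The helper name and the values are hypothetical placeholders, not recommendations, and the options are only picked up if the job driver parses generic options (for example through `ToolRunner`).

```python
# Property names as listed above.
SPECULATIVE_KEYS = (
    "mapreduce.job.speculative.speculativecap",
    "mapreduce.job.speculative.slowtaskthreshold",
    "mapreduce.job.speculative.slownodethreshold",
)


def speculative_options(speculativecap, slowtaskthreshold, slownodethreshold):
    """Build a list of -D options for the three speculation properties."""
    values = (speculativecap, slowtaskthreshold, slownodethreshold)
    args = []
    for key, value in zip(SPECULATIVE_KEYS, values):
        args += ["-D", f"{key}={value}"]
    return args


# Illustrative values only (assumed, not tuned for any cluster).
opts = speculative_options(0.1, 1.0, 1.0)
print(" ".join(opts))
```

The resulting list can be appended to a `hadoop jar` invocation, after the main class name and before the job's own arguments.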