Updated: 2021-01-12 15:00  Source: 乐鱼电竞

Problem analysis
This question mainly tests how familiar the student is with MapReduce.
ºËÐĴ𰸽²½â
£¨1£©reduce side join
reduce side joinÊÇÒ»ÖÖ×î¼òµ¥µÄjoin·½Ê½£¬ÆäÖ÷Ҫ˼ÏëÈçÏ£º
ÔÚmap½×¶Î£¬mapº¯Êýͬʱ¶ÁÈ¡Á½¸öÎļþFile1ºÍFile2£¬ÎªÁËÇø·ÖÁ½ÖÖÀ´Ô´µÄkey/valueÊý¾Ý¶Ô£¬¶ÔÿÌõÊý¾Ý´òÒ»¸ö±êÇ© £¨tag£©£¬±ÈÈ磺tag=0±íʾÀ´×ÔÎļþFile1£¬tag=2±íʾÀ´×ÔÎļþFile2¡£¼´£ºmap½×¶ÎµÄÖ÷ÒªÈÎÎñÊǶԲ»Í¬ÎļþÖеÄÊý¾Ý´ò±êÇ©¡£
ÔÚreduce½×¶Î£¬reduceº¯Êý»ñÈ¡keyÏàͬµÄÀ´×ÔFile1ºÍFile2ÎļþµÄvalue list£¬ È»ºó¶ÔÓÚͬһ¸ökey£¬¶ÔFile1ºÍFile2ÖеÄÊý¾Ý½øÐÐjoin£¨µÑ¿¨¶û³Ë»ý£©¡£¼´£ºreduce½×¶Î½øÐÐʵ¼ÊµÄÁ¬½Ó²Ù×÷¡£
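The two phases above can be sketched in plain Python, without Hadoop. This is only an illustrative simulation: the function names (`map_phase`, `shuffle`, `reduce_phase`, `reduce_side_join`) and the sample records are assumptions made for the example, and the in-memory `shuffle` stands in for the grouping that the framework performs between map and reduce.

```python
from collections import defaultdict
from itertools import product

def map_phase(records, tag):
    """Emit (key, (tag, value)) for every record of one input file."""
    for key, value in records:
        yield key, (tag, value)

def shuffle(pairs):
    """Group tagged values by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, tagged in pairs:
        groups[key].append(tagged)
    return groups

def reduce_phase(key, tagged_values):
    """Split the value list by tag and emit the Cartesian product for this key."""
    from_file1 = [v for tag, v in tagged_values if tag == 0]
    from_file2 = [v for tag, v in tagged_values if tag == 2]
    for left, right in product(from_file1, from_file2):
        yield key, left, right

def reduce_side_join(file1, file2):
    # tag=0 marks File1 records, tag=2 marks File2 records, as in the text.
    pairs = list(map_phase(file1, 0)) + list(map_phase(file2, 2))
    result = []
    for key, tagged in sorted(shuffle(pairs).items()):
        result.extend(reduce_phase(key, tagged))
    return result

# Hypothetical sample data: (key, value) records.
file1 = [(1, "alice"), (2, "bob"), (3, "carol")]
file2 = [(1, "book"), (1, "pen"), (2, "lamp"), (4, "desk")]
joined = reduce_side_join(file1, file2)
print(joined)
```

Keys 3 and 4 appear in only one file, so they produce no output rows, yet their records still pass through the shuffle. That wasted transfer is the cost the other two join strategies try to avoid.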
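(2) Map side join
Reduce side join exists because the map phase cannot see all the fields needed for the join: the records for a given key may be spread across different map tasks. Reduce side join is very inefficient because the shuffle phase has to transfer a large amount of data.
Map side join is an optimization for the following scenario: of the two tables to be joined, one is very large and the other is small enough to fit in memory. In that case the small table can be replicated so that every map task holds a copy in memory (for example in a hash table), and only the large table is scanned: for each key/value record in the large table, look up the key in the hash table and, if a matching record exists, output the joined result.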
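As a rough illustration of the lookup described above, the following plain-Python sketch builds a hash table from the small table and scans the large table split by split. The names `build_hash_table` and `map_side_join`, the sample data, and the idea of modelling each map task as one list of records are assumptions for the example, not Hadoop APIs.

```python
def build_hash_table(small_table):
    """Load the small table into memory, keyed by the join key."""
    table = {}
    for key, value in small_table:
        table.setdefault(key, []).append(value)
    return table

def map_side_join(big_table_split, hash_table):
    """Scan one split of the large table and join against the in-memory copy."""
    for key, value in big_table_split:
        # Records whose key is missing from the hash table are simply dropped.
        for small_value in hash_table.get(key, []):
            yield key, small_value, value

# Hypothetical sample data; each inner list plays the role of one map task's input.
small = [(1, "alice"), (2, "bob")]
big_splits = [[(1, "book"), (4, "desk")], [(2, "lamp"), (1, "pen")]]

hash_table = build_hash_table(small)  # every map task would hold its own copy
map_joined = [row for split in big_splits
              for row in map_side_join(split, hash_table)]
print(map_joined)
```

No shuffle or reduce step is involved: each split is joined independently, which is why this approach only works when the small table fits in every task's memory.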
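(3) SemiJoin
SemiJoin, also called a semi-join, is a technique borrowed from distributed databases. Its motivation is that in a reduce side join the volume of data transferred between machines is very large, which becomes a bottleneck for the join. If the records that will not take part in the join can be filtered out on the map side, a great deal of network IO can be saved.
The implementation is simple: pick the small table, say File1, extract the keys that take part in the join, and save them to a file File3. File3 is usually small enough to fit in memory. In the map phase, use DistributedCache to copy File3 to each TaskTracker, then filter out the records in File2 whose keys do not appear in File3. The remaining reduce phase work is the same as in reduce side join.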
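The filtering step can be sketched in plain Python as follows. This is a simulation under stated assumptions: `extract_join_keys`, `semi_join_map`, and `semi_join` are hypothetical names, a Python set stands in for File3, and the count of emitted pairs is used as a stand-in for shuffle traffic.

```python
from collections import defaultdict

def extract_join_keys(file1):
    """Build File3: the set of join keys present in the small table File1."""
    return {key for key, _ in file1}

def semi_join_map(file1, file2, file3_keys):
    """Tag records as in reduce side join, dropping File2 records not in File3."""
    for key, value in file1:
        yield key, (0, value)
    for key, value in file2:
        if key in file3_keys:  # map-side filter that saves network IO
            yield key, (2, value)

def semi_join(file1, file2):
    file3_keys = extract_join_keys(file1)
    shuffled = list(semi_join_map(file1, file2, file3_keys))
    # Reduce phase: same as reduce side join, a per-key Cartesian product.
    groups = defaultdict(lambda: ([], []))
    for key, (tag, value) in shuffled:
        groups[key][0 if tag == 0 else 1].append(value)
    rows = [(key, left, right)
            for key in sorted(groups)
            for left in groups[key][0]
            for right in groups[key][1]]
    return rows, len(shuffled)

# Hypothetical sample data: (key, value) records.
file1 = [(1, "alice"), (2, "bob")]
file2 = [(1, "book"), (4, "desk"), (5, "chair"), (2, "lamp")]
rows, shuffled_count = semi_join(file1, file2)
print(rows, shuffled_count)
```

With this sample data, four tagged pairs reach the shuffle instead of the six that an unfiltered reduce side join would send, because the File2 records with keys 4 and 5 are dropped in the map phase.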
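Further discussion
A map side join loads one of the datasets into a Map collection during the mapper's setup step, using a copy distributed through the cache, which is why it involves DistributedCache. Because the data is held in memory, the cached dataset must be small; otherwise the approach does not apply, so this use case is relatively uncommon.
A reduce side join feeds all the datasets to be joined into the map phase as input, tags the records in the map logic, and merges them in the reduce phase. It requires a custom data type (for example, one that carries the tag together with the record).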