An In-Depth Yet Accessible Guide to the Python Crawler Framework Scrapy (Part 3)

Updated: 2017-11-15 17:06. Source: 乐鱼播客

Next, we look at how to crawl some data that is harder to get: comments.

1. Define the data you want to scrape in the Item:

movie_name works like a key in a dictionary, and the scraped data works like the corresponding value. The Item is then used in the class that inherits from BaseSpider:

µÚÒ»ÐоÍÊÇÉÏÃæÄǸöͼÖеÄTutorialItemÕâ¸öÀ࣬ºì¿òȦ³öÀ´µÄ¾ÍÊÇÉÏͼÖеÄmovie_nameÖС£

2. Next, edit the Spider.py file in the spiders directory:

The fields it fills in match the item defined above.

3. Edit the pipelines.py file. Through it, the content stored in TutorialItem can be written to a database or to a file.
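A pipeline is just a class with a process_item method, so writing items to a file can be sketched as below. This is one possible version under stated assumptions, not the code from the original article: the class name TutorialPipeline and the output file name items.jl are assumptions, and the format chosen here is one JSON object per line.

```python
import json


class TutorialPipeline:
    """Write every scraped item to a file, one JSON object per line."""

    def __init__(self, path="items.jl"):  # output file name is an assumption
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Called once when the spider starts.
        self.file = open(self.path, "w", encoding="utf-8")

    def close_spider(self, spider):
        # Called once when the spider finishes.
        self.file.close()

    def process_item(self, item, spider):
        # dict(item) works for scrapy.Item objects as well as plain dicts.
        line = json.dumps(dict(item), ensure_ascii=False)
        self.file.write(line + "\n")
        return item  # hand the item on to any later pipeline
```

For Scrapy to call it, the class also has to be listed in the ITEM_PIPELINES setting in settings.py, for example under the path tutorial.pipelines.TutorialPipeline if the project is named tutorial.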

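A note on the json module's functions: dump and dumps generate JSON from Python objects, while load and loads parse JSON into Python data types. The only difference between dump and dumps is that dump writes the JSON to a file-like object, whereas dumps returns it as a string. Likewise, load parses JSON from a file-like object and loads parses it from a string.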
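The four functions can be compared side by side in a short sketch. The sample data is made up, and io.StringIO stands in for an open file so the example runs without touching the disk.

```python
import io
import json

data = {"movie_name": "Example Movie"}  # made-up sample data

# dumps / loads work with strings.
text = json.dumps(data)       # Python object -> JSON string
from_text = json.loads(text)  # JSON string -> Python object

# dump / load work with file-like objects.
buffer = io.StringIO()        # stands in for an open file
json.dump(data, buffer)       # writes the JSON text into the file-like object
buffer.seek(0)                # rewind before reading it back
from_file = json.load(buffer) # reads JSON from the file-like object
```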

4. After these three steps the crawler is ready to run; nothing more is needed. In a command prompt, change to the tutorial directory and enter scrapy crawl douban to start crawling.

Next, a brief introduction to some of the basics.

5. The start_requests method:

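Normally we simply put the links of the pages we want to crawl into start_urls. But if there are many links to crawl and they follow a regular pattern, we need to override this method. Its default implementation just reads each link from start_urls and uses make_requests_from_url to generate a Request.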

This means that inside start_requests we can build the links according to our own pattern, as our needs dictate, rather than listing every one of them in start_urls by hand.

6. The parse method:

Once the requests have been generated, Scrapy processes each Request for us and obtains the response from the site at the requested URL; parse is then used to process the content of that response. We override parse in our subclass, and parse_item is a method we define ourselves to handle the responses obtained from requests to the newly discovered links.

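7. In this function body, from the Response returned for the requests issued by start_requests (GET requests by default), we obtain a collection of URLs named item_urls, then iterate over the collection and request each URL. Next, look at the source code of Request.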

Copyright of this article belongs to the 乐鱼播客 AI + Python Academy. Reprints are welcome; please credit the author and the source. Thank you!
Author: 乐鱼播客 AI + Python Academy
First published at: http://python.itcast.cn/