Updated: 2018-01-23 16:31  Source: 乐鱼播客
If you aspire to be a data scientist, you should stay curious: always exploring, learning, and asking questions. Online tutorials and videos can help you take the first step, but the best way to prepare for a real data science role is to become fluent in the tools that are already used in production.
I asked our own data science experts and compiled the seven Python tools they think every data scientist should know. The Galvanize Data Science and GalvanizeU programs have students spend a great deal of time immersed in these technologies. The deep understanding of the tools that you gain from that time will give you a real advantage when you look for your first job. Here they are:
IPython
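IPython is a command shell for interactive computing in multiple programming languages. It was originally developed for Python and offers enhanced introspection, rich media, additional shell syntax, tab completion, and a rich history. IPython provides the following features:
Powerful interactive shells (terminal and Qt-based)
A browser-based notebook with support for code, plain text, mathematical expressions, inline plots, and other rich media
Support for interactive data visualization and GUI toolkits
Flexible, embeddable interpreters that can be loaded into your own projects
Easy-to-use, high-performance tools for parallel computing
Contributed by Nir Kaldero, Director of Data Analysis and Galvanize expert.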
GraphLab Create
GraphLab Create is a Python library, backed by a C++ engine, for quickly building large-scale, high-performance data products.
Here are some of the features of GraphLab Create:
Analyze terabyte-scale data at interactive speeds on your own computer.
Analyze tabular data, graphs, text, and images on a single platform.
State-of-the-art machine learning algorithms, including deep learning, boosted trees, and factorization machines.
Run the same code on your laptop or on a distributed system, using a Hadoop Yarn or EC2 cluster.
Focus on tasks or on machine learning with the flexible API.
Easily deploy data products in the cloud using Predictive Services.
Visualize data for exploration and production monitoring.
Contributed by Benjamin Skrainka, data scientist at Galvanize.
Pandas
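pandas is open-source, BSD-licensed software that provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Python has long had a strong reputation for data munging and preparation, but it has been weaker for data analysis and modeling. pandas fills this gap, letting you carry out your entire data workflow in Python without switching to a more domain-specific language such as R.
Combined with the excellent IPython toolkit and other libraries, the environment for doing data analysis in Python excels in performance, productivity, and compatibility. pandas does not implement significant modeling functionality beyond linear and panel regression; for that, look to the statsmodels statistical modeling package and the scikit-learn library. More work is still needed to make Python a first-class statistical modeling environment, but we are well on our way.
Contributed by Nir Kaldero, Galvanize expert and data scientist.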
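The data preparation and analysis workflow described above can be sketched in a few lines. This is only a minimal illustration: the `sales` table, its column names, and its values are invented for the example and do not come from the article.

```python
import pandas as pd

# Hypothetical sales records; one "units" value is missing on purpose.
sales = pd.DataFrame({
    "city": ["Seattle", "Denver", "Seattle", "Denver", "Austin"],
    "units": [10, 4, 6, None, 8],
    "price": [2.5, 3.0, 2.5, 3.0, 4.0],
})

# Data preparation: treat the missing unit count as zero.
sales["units"] = sales["units"].fillna(0)

# Analysis: derive revenue per row, then total it per city, largest first.
sales["revenue"] = sales["units"] * sales["price"]
revenue_by_city = (
    sales.groupby("city")["revenue"].sum().sort_values(ascending=False)
)

print(revenue_by_city)
```

Cleaning, deriving a column, and aggregating all happen on the same `DataFrame`, which is the sense in which the whole workflow stays in Python.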
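PuLP
Linear programming is a type of optimization in which an objective function is maximized (or minimized) subject to a set of constraints. PuLP is a linear programming modeler written in Python. It can generate LP files and call highly optimized solvers such as GLPK, COIN CLP/CBC, CPLEX, and GUROBI to solve these linear problems.
Contributed by Isaac Laughlin, data scientist at Galvanize.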
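As a rough sketch of what modeling a linear program with PuLP looks like, the toy problem below maximizes a profit function under three constraints. The problem name, variable names, coefficients, and constraint labels are all made up for illustration, and the example assumes the CBC solver that ships with PuLP is available.

```python
from pulp import LpMaximize, LpProblem, LpStatus, LpVariable, PULP_CBC_CMD, value

# Hypothetical toy problem: choose production quantities x and y.
prob = LpProblem("toy_production_plan", LpMaximize)
x = LpVariable("x", lowBound=0)
y = LpVariable("y", lowBound=0)

# The first expression added to the problem is the objective function.
prob += 3 * x + 2 * y, "profit"

# Subsequent inequalities are the constraints.
prob += x + y <= 4, "labour"
prob += x + 3 * y <= 6, "material"
prob += x <= 3, "capacity"

# Hand the model to the CBC solver, suppressing its log output.
prob.solve(PULP_CBC_CMD(msg=False))

status = LpStatus[prob.status]
best_profit = value(prob.objective)
print(status, value(x), value(y), best_profit)
```

If a different solver is installed, its command class can be passed to `solve` in place of `PULP_CBC_CMD`; `prob.writeLP(...)` writes the model out as an LP file.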
Matplotlib
matplotlib is a Python 2D plotting library that produces publication-quality figures in a variety of hardcopy formats and in interactive environments across platforms. matplotlib can be used in Python scripts, the Python and IPython shells (ala MATLAB® or Mathematica®), web application servers, and six GUI toolkits.
matplotlib tries to make easy things easy and hard things possible. With just a few lines of code you can generate plots, histograms, power spectra, bar charts, errorcharts, scatterplots, and more.
For simple plotting, pyplot provides a MATLAB-like interface, particularly when combined with IPython. Power users have full control of line styles, font properties, axes properties, and so on, either through an object-oriented interface or through a set of functions familiar to MATLAB users.
Contributed by Mike Tamir, Chief Science Officer at Galvanize.
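To make the "few lines of code" claim concrete, here is a small sketch that uses pyplot to create a figure and the object-oriented interface to customize it. The data, labels, and styling choices are arbitrary, and the non-interactive `Agg` backend is selected only so the example also runs without a display.

```python
import io
import math

import matplotlib
matplotlib.use("Agg")  # assumption: headless rendering, no GUI toolkit needed
import matplotlib.pyplot as plt

# Sample data: one sine curve.
xs = [i * 0.1 for i in range(100)]
ys = [math.sin(x) for x in xs]

# pyplot creates the figure; the Axes objects give fine-grained control.
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

ax_line.plot(xs, ys, linestyle="--", color="tab:blue", label="sin(x)")
ax_line.set_xlabel("x")
ax_line.set_ylabel("sin(x)")
ax_line.legend()

ax_hist.hist(ys, bins=10)
ax_hist.set_title("Distribution of sin(x)")

fig.tight_layout()

# Render to an in-memory PNG; pass a file name instead to write to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

In an interactive IPython session the backend line can be dropped and `plt.show()` used to display the figure.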
Scikit-Learn
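Scikit-Learn is a simple and efficient tool for data mining and data analysis. What makes it especially valuable is that it is accessible to everybody and reusable in many contexts. It is built on NumPy, SciPy, and matplotlib. Scikit-Learn is open source under the BSD license and can also be used commercially. It offers the following capabilities:
Classification - identifying which category an object belongs to
Regression - predicting a continuous-valued attribute associated with an object
Clustering - automatically grouping similar objects into sets
Dimensionality Reduction - reducing the number of random variables to consider
Model Selection - comparing, validating, and choosing parameters and models
Preprocessing - feature extraction and normalization
Contributed by Isaac Laughlin, data science instructor at Galvanize.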
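Several of the capabilities listed above (preprocessing, classification, and a simple form of model validation) can be combined in one short sketch. The choice of the bundled iris dataset, logistic regression, and the specific split parameters are assumptions made for illustration, not recommendations from the article.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small labelled dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples to validate the fitted model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing (feature scaling) followed by classification.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

# Fraction of held-out samples whose class is predicted correctly.
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because every estimator shares the same `fit` / `predict` / `score` interface, swapping in a different classifier usually means changing only the last step of the pipeline.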
Spark
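A Spark application consists of a driver program that runs the user's main function and executes various parallel operations on a cluster. The main abstraction Spark provides is the resilient distributed dataset (RDD): a collection of elements partitioned across the nodes of the cluster that can be operated on in parallel. RDDs are created by starting with a file in the Hadoop file system (or any other Hadoop-supported file system), or with an existing Scala collection in the driver program, and transforming it. Users may also ask Spark to persist an RDD in memory so that it can be reused efficiently across parallel operations. Finally, RDDs automatically recover from node failures.
The second abstraction in Spark is shared variables that can be used in parallel operations. By default, when Spark runs a function in parallel as a set of tasks on different nodes, it ships a copy of each variable used in the function to each task. Sometimes a variable needs to be shared across tasks, or between tasks and the driver program. Spark supports two types of shared variables: broadcast variables, which can be used to cache a value in memory on all nodes, and accumulators, which are variables that can only be added to, such as counters and sums.
Contributed by Benjamin Skrainka, data scientist at Galvanize.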
If you would like to dig deeper into data science, enter our data science giveaway for tickets to data conferences such as PyData Seattle and the Data Science Summit, or for discounts on Python resources such as Effective Python and Data Science from Scratch. [Article sourced from the internet]