1. Documentation
    Chinese translation of the official Spark machine learning library (MLlib) guide
    
    https://blog.csdn.net/liulingyuan6/article/details/53582300
   
    Spark tutorial from the Xiamen University Database Lab
    
    http://mocom.xmu.edu.cn/article/show/5858ab782b2730e00d70fa08/0/1
   
Concepts:
    WOE (Weight of Evidence)
    
    https://blog.csdn.net/shenxiaoming77/article/details/78771698
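
    As a rough illustration of the WOE concept, the sketch below computes a per-bin Weight of Evidence for a feature that has already been binned, against a binary label. It assumes the convention WOE = ln(share of events in the bin / share of non-events in the bin); some references use the inverse ratio, which only flips the sign. The function name `weight_of_evidence` and the `smoothing` parameter are hypothetical and do not come from the linked article.

```python
import math
from collections import Counter


def weight_of_evidence(bins, labels, smoothing=0.5):
    """Return {bin: WOE} for a binned feature and binary labels (1 = event, 0 = non-event).

    Assumed convention: WOE = ln(event share / non-event share).
    `smoothing` is added to every count so that empty cells do not produce log(0).
    """
    events = Counter()
    non_events = Counter()
    for b, y in zip(bins, labels):
        if y == 1:
            events[b] += 1
        else:
            non_events[b] += 1

    all_bins = set(events) | set(non_events)
    # Totals include the smoothing mass so the shares still sum to 1.
    total_events = sum(events.values()) + smoothing * len(all_bins)
    total_non_events = sum(non_events.values()) + smoothing * len(all_bins)

    woe = {}
    for b in all_bins:
        event_share = (events[b] + smoothing) / total_events
        non_event_share = (non_events[b] + smoothing) / total_non_events
        woe[b] = math.log(event_share / non_event_share)
    return woe


# Toy data: bin "a" is dominated by events, bin "b" by non-events.
sample_bins = ["a", "a", "a", "a", "b", "b", "b", "b"]
sample_labels = [1, 1, 1, 0, 1, 0, 0, 0]
sample_woe = weight_of_evidence(sample_bins, sample_labels, smoothing=0)
```

    With this convention a positive WOE marks a bin where events are over-represented, and a negative WOE marks the opposite.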
   
2. Hands-on
    /spark-submit
    --master yarn --deploy-mode cluster --queue tempo-queue --name MINE-688dc
    --files
    --executor-memory 2g --executor-cores 2 --driver-memory 2g --driver-cores 2 --num-executors 2
    --class com.meritdata.tempo.mine.server.executor.Executor
    spark-internal hdfs://*/688dc.xml
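
    The same submission can be assembled programmatically, which makes the resource settings easier to vary. The sketch below only builds and prints the argument list and does not launch anything. It assumes `spark-submit` is on the PATH, and the helper `build_spark_submit_command` is a hypothetical name. The `--files` option is left out because its value is not given above, and the HDFS path is copied as written.

```python
import shlex


def build_spark_submit_command(options, main_class, app_resource, app_args):
    """Assemble a spark-submit argument list from a dict of long options."""
    command = ["spark-submit"]  # assumption: spark-submit is on the PATH
    for flag, value in options.items():
        command += [f"--{flag}", str(value)]
    command += ["--class", main_class, app_resource, *app_args]
    return command


# Option values taken from the command in these notes.
options = {
    "master": "yarn",
    "deploy-mode": "cluster",
    "queue": "tempo-queue",
    "name": "MINE-688dc",
    "executor-memory": "2g",
    "executor-cores": 2,
    "driver-memory": "2g",
    "driver-cores": 2,
    "num-executors": 2,
}

command = build_spark_submit_command(
    options,
    main_class="com.meritdata.tempo.mine.server.executor.Executor",
    app_resource="spark-internal",
    app_args=["hdfs://*/688dc.xml"],  # path kept exactly as in the notes
)

# Print a shell-quoted form for inspection; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```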
   
3. Optimization
    pip3 list

    Package       Version
    ------------- ----------
    APScheduler   3.5.3
    asn1crypto    0.24.0
    bcrypt        3.1.4
    certifi       2018.10.15
    cffi          1.11.5
    chardet       3.0.4
    Click         7.0
    cryptography  2.3.1
    dnspython     1.15.0
    fire          0.1.3
    idna          2.7
    IPy           0.83
    pexpect       4.6.0
    pip           19.0.3
    ply           3.11
    prettytable   0.7.2
    psutil        5.4.8
    ptyprocess    0.6.0
    pyasn1        0.4.4
    pycparser     2.19
    pycryptodomex 3.8.1
    PyNaCl        1.3.0
    pysmi         0.3.4
    pysnmp        4.4.6
    pytz          2018.7
    requests      2.20.0
    setuptools    39.0.1
    six           1.11.0
    tzlocal       1.5.1
    urllib3       1.24.1
   
 
