| 说明: | |
| 在ssbc爬虫的基础上修复,现在可以7*24爬取的爬虫,修改了爬取策略,只入库音乐、电影、电子书。python实现的磁力搜索网站,代码比较烂,请轻喷! | |
| 搜索排行榜、浏览排行榜、DMCA投诉的功能未完成(其实是不想做) | |
| 和ssbc相比,没使用sphinx进行索引,而是用redis缓存访问页面,使用jieba分词,比sphinx的中文分词效果好。 | |
| 模板在templates目录,模板引擎是jinja2(非常易读),编写自己的专属模板非常方便,中文版文档 http://docs.jinkan.org/docs/jinja2/ 。 | |
| 后台可以直接搜索、删除DMCA投诉的关键字,管理首页推荐关键字、用户搜索记录、查看每天爬取的资源数量、管理后台用户。 | |
| 修改数据库密码后请修改manage.py里面的mysql+pymysql://root:后面的内容和simdht_work.py里面的DB_PASS | |
| 实验环境:centos7 python2.7 | |
| 理论上支持python3,以及Ubuntu,Debian等系统 | |
| 安装: | |
| 文件上传到主机上 | |
| tar zxvf zsky.tar.gz | |
| systemctl stop firewalld.service | |
| systemctl disable firewalld.service | |
| systemctl stop iptables.service | |
| systemctl disable iptables.service | |
| setenforce 0 | |
| sed -i s/"SELINUX=enforcing"/"SELINUX=disabled"/g /etc/sysconfig/selinux | |
| #关闭selinux | |
| cat << EOF > /etc/sysctl.conf | |
| net.ipv4.tcp_syn_retries = 1 | |
| net.ipv4.tcp_synack_retries = 1 | |
| net.ipv4.tcp_keepalive_time = 600 | |
| net.ipv4.tcp_keepalive_probes = 3 | |
| net.ipv4.tcp_keepalive_intvl =15 | |
| net.ipv4.tcp_retries2 = 5 | |
| net.ipv4.tcp_fin_timeout = 2 | |
| net.ipv4.tcp_max_tw_buckets = 36000 | |
| net.ipv4.tcp_tw_recycle = 1 | |
| net.ipv4.tcp_tw_reuse = 1 | |
| net.ipv4.tcp_max_orphans = 32768 | |
| net.ipv4.tcp_syncookies = 1 | |
| net.ipv4.tcp_max_syn_backlog = 16384 | |
| net.ipv4.tcp_wmem = 8192 131072 16777216 | |
| net.ipv4.tcp_rmem = 32768 131072 16777216 | |
| net.ipv4.tcp_mem = 786432 1048576 1572864 | |
| net.ipv4.ip_local_port_range = 1024 65000 | |
| net.ipv4.ip_conntrack_max = 65536 | |
| net.ipv4.netfilter.ip_conntrack_max=65536 | |
| net.ipv4.netfilter.ip_conntrack_tcp_timeout_established=180 | |
| net.core.somaxconn = 16384 | |
| net.core.netdev_max_backlog = 16384 | |
| EOF | |
| /sbin/sysctl -p /etc/sysctl.conf | |
| /sbin/sysctl -w net.ipv4.route.flush=1 | |
| echo ulimit -HSn 65536 >> /etc/rc.local | |
| echo ulimit -HSn 65536 >>/root/.bash_profile | |
| ulimit -HSn 65536 | |
| #优化内核参数,优化打开文件数 | |
| cd zsky | |
| yum -y install wget gcc gcc-c++ python-devel mariadb mariadb-devel mariadb-server | |
| yum -y install epel-release python-pip redis | |
| pip install -r requirements.txt | |
| systemctl start mariadb.service | |
| systemctl enable mariadb.service | |
| systemctl start redis.service | |
| systemctl enable redis.service | |
| mysql -uroot -e"create database zsky default character set utf8mb4;" | |
| #utf8mb4比utf8更优秀 | |
| mysql -uroot -e"set global interactive_timeout=31536000;set global wait_timeout=31536000;set global max_allowed_packet = 64*1024*1024;set global max_connections = 10000;" | |
| #建立并优化mysql数据库参数 | |
| python manage.py init_db | |
| #建表 | |
| python manage.py create_user | |
| #按照提示输入用户名、密码、邮箱 | |
| nohup gunicorn -k gevent --access-logfile zsky.log --error-logfile zsky_err.log manage:app -b 0.0.0.0:80 --reload>/dev/zero 2>&1& | |
| #开启网站访问,访问日志是当前目录下zsky.log,错误日志是当前目录下zsky_err.log | |
| #如果不想要日志 就运行下面这条命令 | |
| #nohup gunicorn -k gevent manage:app -b 0.0.0.0:80 --reload>/dev/zero 2>&1& | |
| nohup python simdht_worker.py 2>&1& | |
| #开启爬虫并写日志,如果爬虫有问题请提交日志文件nohup.out给我 | |
| 现在应该能访问http://IP 了,解析域名即可完成部署 | |
| 后台地址http://IP/admin | |
| #开机自启动 | |
| chmod +x /etc/rc.d/rc.local | |
| echo "systemctl start mariadb.service" >> /etc/rc.d/rc.local | |
| echo "systemctl start redis.service" >> /etc/rc.d/rc.local | |
| echo "cd /root/zsky" >> /etc/rc.d/rc.local | |
| echo "nohup python simdht_worker.py >/dev/zero 2>&1&" >> /etc/rc.d/rc.local | |
| echo "nohup gunicorn -k gevent manage:app -b 0.0.0.0:80 --reload>/dev/zero 2>&1&" >> /etc/rc.d/rc.local from https://github.com/psdshow/zsky/blob/master/%E8%AF%B4%E6%98%8E.txt https://github.com/psdshow/zsky/ |
No comments:
Post a Comment