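Masks are being snapped up, and every major e-commerce platform is sold out. This post outlines how to use a Python crawler to keep polling JD.com (京东) for whether a product is in stock, and to send a restock notification email to a given mailbox automatically when it arrives.
First, look at a JD listing for a certain brand of disposable medical masks that is out of stock (to find an out-of-stock listing I had to set the delivery region to Taiwan):
To find the API that reports whether the product is currently in stock, re-select the delivery region and capture the network traffic:
This reveals the target API: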
- https://c0.3.cn/stock?skuId=100006979473&area=32_2768_53502_54370&venderId=1000096602&buyNum=1&choseSuitSkuIds=&cat=9192,12190,1517&extraParam={%22originid%22:%221%22}&fqsp=0&pdpin=&pduid=1597738524154116867159&ch=1&callback=jQuery9707334
-
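The query string of the captured request can be taken apart and rebuilt in code. The following is a minimal sketch, assuming the endpoint accepts a subset of the captured parameters; the helper names `build_stock_url` and `sku_from_url` are hypothetical, and which parameters JD actually requires has not been verified here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

STOCK_ENDPOINT = "https://c0.3.cn/stock"


def build_stock_url(sku_id, area, vender_id, cat, callback="jQuery1"):
    """Assemble a stock query URL from the parameters seen in the capture.

    Assumption: the remaining captured parameters (pduid, pdpin, fqsp, ...)
    are optional. If the endpoint rejects the request, copy them from your
    own capture as well.
    """
    params = {
        "skuId": sku_id,
        "area": area,
        "venderId": vender_id,
        "cat": cat,
        "buyNum": 1,
        "extraParam": '{"originid":"1"}',
        "ch": 1,
        "callback": callback,
    }
    # safe="," keeps the category list readable, as in the captured URL
    return STOCK_ENDPOINT + "?" + urlencode(params, safe=",")


def sku_from_url(url):
    """Read the skuId parameter back out of a stock query URL."""
    return parse_qs(urlsplit(url).query)["skuId"][0]
```

For example, `sku_from_url(build_stock_url("100006979473", "32_2768_53502_54370", "1000096602", "9192,12190,1517"))` gives back the SKU id, which is also the number used in the product page address `https://item.jd.com/<skuId>.html`.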
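Next, capture the stock query requests for the following products to use in this crawler exercise:
The three products above correspond to these URLs: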
- 1. JD 东造 masks (out of stock): https://c0.3.cn/stock?skuId=100006979473&area=32_2768_53502_54370&venderId=1000096602&buyNum=1&choseSuitSkuIds=&cat=9192,12190,1517&extraParam={%22originid%22:%221%22}&fqsp=0&pdpin=&pduid=1597738524154116867159&ch=1&callback=jQuery9707334
- 2. 东方红 masks (in stock): https://c0.3.cn/stock?skuId=100012198150&area=32_2768_53502_54370&venderId=1000304541&cat=14065,14099,14103&buyNum=1&choseSuitSkuIds=&extraParam={%22originid%22:%221%22}&ch=1&fqsp=0&pduid=1597738524154116867159&pdpin=&detailedAdd=null&callback=jQuery3905354
- 3. A book on intranet attack and defense (out of stock): https://c0.3.cn/stock?skuId=12639103&cat=1713,3287,3801&venderId=0&area=32_2768_53502_54370&buyNum=1&choseSuitSkuIds=&extraParam={%22originid%22:%221%22}&ch=1&fqsp=0&pduid=1597738524154116867159&pdpin=&coord=&detailedAdd=&callback=jQuery5705850
-
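Each of these requests carries a `callback` parameter, which suggests the reply is JSONP, that is, JSON wrapped in a function call. The sketch below unwraps such a reply before looking for the out-of-stock marker, so that an error page or a truncated reply raises an exception instead of being mistaken for "in stock". The wrapper format is an assumption, and the field name `stateName` in the sample is purely hypothetical; inspect a real response before relying on any specific field.

```python
import json
import re

# Matches callbackName( ...payload... ) with an optional trailing semicolon
_JSONP_RE = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def unwrap_jsonp(text):
    """Strip the callback wrapper and parse the JSON payload inside it."""
    match = _JSONP_RE.match(text)
    if match is None:
        raise ValueError("response is not in callback(...) form")
    return json.loads(match.group(1))


def is_out_of_stock(payload, marker="无货"):
    """Search the whole parsed payload for the marker, so no field name is assumed."""
    return marker in json.dumps(payload, ensure_ascii=False)


# Hypothetical sample reply; the real field layout may differ
sample = 'jQuery9707334({"stock": {"stateName": "无货"}})'
payload = unwrap_jsonp(sample)
print(is_out_of_stock(payload))
```

The stock decision is still a substring check, as in the script in this post, but it only runs once the reply has parsed as well-formed JSON.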
Here is the complete code first:
- '''
- JD restock email notification
- '''
- import requests
- import time
-
- # Mailbox that receives the in-stock notification
- mail = '130XXXXXXXX@163.com'
- # Stock query URLs of the products to watch
- url = [
-     'https://c0.3.cn/stock?skuId=100006979473&area=32_2768_53502_54370&venderId=1000096602&buyNum=1&choseSuitSkuIds=&cat=9192,12190,1517&extraParam={%22originid%22:%221%22}&fqsp=0&pdpin=&pduid=1597738524154116867159&ch=1&callback=jQuery9707334',
-     'https://c0.3.cn/stock?skuId=100012198150&area=32_2768_53502_54370&venderId=1000304541&cat=14065,14099,14103&buyNum=1&choseSuitSkuIds=&extraParam={%22originid%22:%221%22}&ch=1&fqsp=0&pduid=1597738524154116867159&pdpin=&detailedAdd=null&callback=jQuery3905354',
-     'https://c0.3.cn/stock?skuId=12639103&cat=1713,3287,3801&venderId=0&area=32_2768_53502_54370&buyNum=1&choseSuitSkuIds=&extraParam={%22originid%22:%221%22}&ch=1&fqsp=0&pduid=1597738524154116867159&pdpin=&coord=&detailedAdd=&callback=jQuery5705850'
- ]
-
- def sendMail(url):
-     import smtplib
-     # MIMEText builds the message body; Header builds the header fields
-     from email.mime.text import MIMEText
-     from email.header import Header
-     # Sender details: sending address and QQ Mail authorization code
-     from_addr = '142XXXXXXXX@qq.com'
-     password = 'mfXXXXXXXXXXXXXXXXX'
-     # Recipient address
-     to_addr = mail
-     # Outgoing mail server
-     smtp_server = 'smtp.qq.com'
-     # Message body: content, subtype ('plain' means plain text), encoding
-     msg = MIMEText(url + ' masks are in stock!', 'plain', 'utf-8')
-     # Header fields
-     msg['From'] = Header(from_addr)
-     msg['To'] = Header(to_addr)
-     msg['Subject'] = Header('Masks are in stock!', 'utf-8')
-     # Open an SSL-encrypted connection. Passing host and port connects
-     # immediately, so no separate connect() call is needed.
-     server = smtplib.SMTP_SSL(smtp_server, 465)
-     # Log in to the sending mailbox
-     server.login(from_addr, password)
-     # Send the message
-     server.sendmail(from_addr, to_addr, msg.as_string())
-     # Close the connection
-     server.quit()
-
-
- flag = 0
- # One session is reused for every poll
- session = requests.Session()
- session.headers.update({
-     "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36",
-     "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3",
-     "Connection": "keep-alive"
- })
- while True:
-     try:
-         print('Poll #' + str(flag) + ' ' + time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))
-         flag += 1
-         for i in url:
-             # Product page URL, built from the skuId in the query URL
-             skuidUrl = 'https://item.jd.com/' + i.split('skuId=')[1].split('&')[0] + '.html'
-             # The timeout keeps one stalled request from hanging the loop
-             response = session.get(i, timeout=10)
-             # '无货' ("out of stock") is the marker text in the API response
-             if '无货' in response.text:
-                 print('Out of stock: ' + skuidUrl)
-             else:
-                 print('In stock! In stock! In stock!: ' + skuidUrl)
-                 sendMail(skuidUrl)
-
-         time.sleep(5)
-     except Exception as e:
-         import traceback
-         print(traceback.format_exc())
-         print('Error')
-         time.sleep(10)
-
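The recipient mailbox above is my NetEase 163 mailbox, and the sender is a QQ Mail account. The sending password is an authorization code, not the QQ login password. Both the code and the sending feature itself are obtained by enabling the feature shown below under "Settings > Account" in QQ Mail: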
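Since the authorization code works like a password, it may be safer to keep it out of the script. The following is a minimal sketch, assuming the credentials are supplied through environment variables; the variable names `JD_NOTIFY_FROM`, `JD_NOTIFY_AUTH_CODE` and `JD_NOTIFY_TO` and the helper names are hypothetical. It also uses the standard library's `EmailMessage` in place of `MIMEText` and `Header`, which is a substitution and not what the script in this post does.

```python
import os
import smtplib
from email.message import EmailMessage


def load_mail_config(env=os.environ):
    """Read sender, authorization code and recipient from the environment.

    The variable names are illustrative only. A missing variable raises
    KeyError, so a misconfigured run fails before any polling starts.
    """
    return {
        "from_addr": env["JD_NOTIFY_FROM"],
        "password": env["JD_NOTIFY_AUTH_CODE"],
        "to_addr": env["JD_NOTIFY_TO"],
    }


def build_message(from_addr, to_addr, item_url):
    """Build a plain-text notification; EmailMessage handles header encoding."""
    msg = EmailMessage()
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg["Subject"] = "Masks are in stock!"
    msg.set_content(item_url + " masks are in stock!")
    return msg


def send(msg, password, host="smtp.qq.com", port=465):
    """Send over implicit SSL; the with-block closes the connection."""
    with smtplib.SMTP_SSL(host, port) as server:
        server.login(msg["From"], password)
        server.send_message(msg)
```

A call would then look like `cfg = load_mail_config()` followed by `send(build_message(cfg["from_addr"], cfg["to_addr"], item_url), cfg["password"])`.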
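Run the script above in PyCharm, as shown in the figure below:
Checking the NetEase 163 mailbox at this point, the restock reminder email has arrived: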
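As written, the script sends a new email on every poll for as long as a product stays in stock, which is once every five seconds. One way to avoid flooding the mailbox is to notify only when a product changes from out of stock to in stock. This is a minimal sketch of that idea; `should_notify` and the state dictionary are hypothetical additions, not part of the original script.

```python
def should_notify(last_state, sku, in_stock):
    """Return True only on the transition from out of stock to in stock.

    last_state maps SKU id to the stock status seen on the previous poll.
    A SKU that has never been seen is treated as previously out of stock.
    """
    was_in_stock = last_state.get(sku, False)
    last_state[sku] = in_stock
    return in_stock and not was_in_stock


# Simulated poll results for one SKU across five polls
state = {}
for seen in [False, True, True, False, True]:
    if should_notify(state, "100012198150", seen):
        print("send mail")
```

In the polling loop, the `sendMail(skuidUrl)` call would then be wrapped in `if should_notify(state, sku, True):`, with the out-of-stock branch calling `should_notify(state, sku, False)` to record the status.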
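In real use, the script can run continuously on a VPS. But here is the question: is crawling JD illegal, and could it land you in jail? My personal feeling is that it is the same as with the 12306 site: if I only buy for my own use I should be fine, whereas scalpers may not be. Finally, do not set the polling frequency too high, or JD's anti-crawling measures may block you.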
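To keep the request rate modest, the fixed sleep can be replaced by a delay that grows after consecutive failures and carries a little random jitter. The sketch below is one possible way to compute it; the function name `next_delay` and every default value are assumptions chosen for illustration, not limits published by JD.

```python
import random


def next_delay(base, failures, cap=300.0, jitter=0.2, rng=random):
    """Seconds to wait before the next poll.

    base     : normal polling interval in seconds
    failures : number of consecutive failed polls (0 when healthy)
    cap      : upper bound on the interval before jitter is applied
    jitter   : fraction of the delay used as a random +/- spread
    """
    # Double the interval for each consecutive failure, up to the cap
    delay = min(cap, base * (2 ** failures))
    spread = delay * jitter
    return delay + rng.uniform(-spread, spread)


# Healthy polling with a 5 second base waits between 4 and 6 seconds
print(next_delay(5, 0))
```

In the polling loop, the failure counter would be reset to zero after a successful round and incremented in the `except` branch, with `time.sleep(next_delay(5, failures))` in place of the fixed sleeps.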