Python proxy server setup and enabling debug logging (Python beginner notes, part 4)

Date: 08-23

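Python proxy

A curated list of proxy servers: http://www.xicidaili.com/

Pick proxies that were verified recently, since they are more likely to work; proxies whose last verification was long ago may have expired.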

"""
Proxy server list: http://www.xicidaili.com/
61.135.217.7, port 80
If an exception is raised while crawling, check whether the proxy IP has expired.
"""
import urllib.request


def use_proxy(proxy_addr, url):
    """Fetch the given URL through a proxy server and return the decoded page."""
    # Map both schemes to the proxy; with only an 'http' entry, https:// URLs bypass it
    proxy = urllib.request.ProxyHandler({'http': proxy_addr, 'https': proxy_addr})
    opener = urllib.request.build_opener(proxy, urllib.request.HTTPHandler)  # build a custom opener
    urllib.request.install_opener(opener)  # install it as the global default opener
    data = urllib.request.urlopen(url).read().decode('utf-8')
    return data


proxy_add = "61.135.217.7:80"
data = use_proxy(proxy_add, "https://gsh.cdsy.xyz")
print(data)
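
Since a free proxy can expire at any time, it can help to try several candidates in turn. The following is a minimal sketch, not part of the original notes: the names `make_proxy_opener`, `fetch_with_fallback` and the `opener_factory` parameter are hypothetical, and it assumes each proxy accepts both http and https traffic. It also avoids `install_opener`, so the global default opener is left untouched.

```python
import urllib.error
import urllib.request


def make_proxy_opener(proxy_addr):
    """Build an opener that routes http and https requests through proxy_addr."""
    handler = urllib.request.ProxyHandler({"http": proxy_addr, "https": proxy_addr})
    return urllib.request.build_opener(handler)


def fetch_with_fallback(url, proxy_addrs, timeout=5, opener_factory=make_proxy_opener):
    """Try each proxy in order; return (proxy_addr, body_bytes) for the first success."""
    last_error = None
    for addr in proxy_addrs:
        opener = opener_factory(addr)
        try:
            with opener.open(url, timeout=timeout) as resp:
                return addr, resp.read()
        except OSError as exc:  # URLError, timeouts and socket errors all derive from OSError
            last_error = exc
    raise RuntimeError(f"no working proxy among {proxy_addrs!r}") from last_error
```

A call such as `fetch_with_fallback("https://gsh.cdsy.xyz", ["61.135.217.7:80"])` would return the proxy that worked together with the raw page bytes, or raise `RuntimeError` if every candidate failed.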

Enabling debug logging

Print the debug log while the request runs, so you can debug as you go:

import urllib.request

# debuglevel=1 makes the handlers print the request and response headers
httphd = urllib.request.HTTPHandler(debuglevel=1)
httpshd = urllib.request.HTTPSHandler(debuglevel=1)

opener = urllib.request.build_opener(httphd, httpshd)
urllib.request.install_opener(opener)
data = urllib.request.urlopen("https://gsh.cdsy.xyz")

Output:

D:\工具\pythonTools\CatchTest1101\venv\Scripts\python.exe D:/工具/pythonTools/CatchTest1101/venv/test/test110210Log.py
send: b'GET /qq_36411874 HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: blog.csdn.net\r\nUser-Agent: Python-urllib/3.7\r\nConnection: close\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: openresty
header: Date: Fri, 02 Nov 2018 08:48:28 GMT
header: Content-Type: text/html; charset=UTF-8
header: Transfer-Encoding: chunked
header: Connection: close
header: Vary: Accept-Encoding
header: Set-Cookie: uuid_tt_dd=10_9925713700-1541148508444-906338; Expires=Thu, 01 Jan 2025 00:00:00 GMT; Path=/; Domain=.csdn.net;
header: Set-Cookie: uuid_tt_dd=10_9925713700-1541148508444-906338; Expires=Thu, 01 Jan 2025 00:00:00 GMT; Path=/; Domain=.csdn.net;
header: Vary: Accept-Encoding
header: Strict-Transport-Security: max-age= 31536000

Process finished with exit code 0
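
The debug lines shown above are written to standard output, so they can also be captured as text, for example to save them to a file. The following is a minimal sketch under that assumption; the function name `fetch_with_debug_log` is hypothetical, and the opener is used directly rather than installed globally.

```python
import contextlib
import io
import urllib.request


def fetch_with_debug_log(url, timeout=10):
    """Fetch url and return (body_bytes, debug_text) without touching the global opener."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPHandler(debuglevel=1),
        urllib.request.HTTPSHandler(debuglevel=1),
    )
    buffer = io.StringIO()
    # http.client emits its debug lines with print(), so redirecting stdout captures them
    with contextlib.redirect_stdout(buffer):
        with opener.open(url, timeout=timeout) as resp:
            body = resp.read()
    return body, buffer.getvalue()
```

The returned `debug_text` holds the same `send:`, `reply:` and `header:` lines that would otherwise appear in the console.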

Catching URLError exceptions

URLError is the parent class of HTTPError.

First, the full version.

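# URLError
import urllib.request
import urllib.error

try:
    file = urllib.request.urlopen("https://gsh.cdsy.xyz")
    data = file.read()
    print(data)
    print(file.getcode())
except urllib.error.HTTPError as e:
    # The server replied with an error status, so both code and reason are available
    print(e.code)
    print(e.reason)
except urllib.error.URLError as e:
    # When the URL does not exist there is no e.code, only e.reason.
    # URLError is the parent class of HTTPError, so it must be caught second.
    print(e.reason)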
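
The order of the two except clauses matters: because HTTPError is a subclass of URLError, a URLError clause placed first would also catch every HTTPError. The following small sketch illustrates this without any network access; the helper `which_clause` is hypothetical and exists only for the demonstration.

```python
import urllib.error

# HTTPError subclasses URLError, so an `except URLError` clause also matches HTTPError.
print(issubclass(urllib.error.HTTPError, urllib.error.URLError))  # True


def which_clause(exc):
    """Report which except clause handles exc when HTTPError is listed first."""
    try:
        raise exc
    except urllib.error.HTTPError as e:
        return f"HTTPError {e.code}: {e.reason}"
    except urllib.error.URLError as e:
        return f"URLError: {e.reason}"
```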

Combined version: check whether the exception has code and reason attributes before printing them.

import urllib.request
import urllib.error

try:
    file = urllib.request.urlopen("https://gsh.cdsy.xyz")
    data = file.read()
    print(file.getcode())
except urllib.error.URLError as e:  # also catches HTTPError, its subclass
    if hasattr(e, "code"):
        print(e.code)
    if hasattr(e, "reason"):
        print(e.reason)

Causes of URLError: the server cannot be reached, the remote URL does not exist, there is no network connection, or an HTTPError was raised.
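
These causes can often be told apart by inspecting the exception, since `e.reason` is either a string or the underlying exception object. The following is a rough sketch; the function name `describe_url_error` and the wording of the messages are hypothetical, and the mapping is not exhaustive.

```python
import socket
import urllib.error


def describe_url_error(exc):
    """Map a URLError (or HTTPError) to a short, human-readable cause."""
    if isinstance(exc, urllib.error.HTTPError):
        return f"server answered with HTTP status {exc.code}"
    reason = exc.reason  # either a string or the underlying exception instance
    if isinstance(reason, socket.gaierror):
        return "host name could not be resolved (bad URL or no network)"
    if isinstance(reason, socket.timeout):
        return "connection timed out"
    if isinstance(reason, ConnectionRefusedError):
        return "server refused the connection"
    return f"other failure: {reason}"
```

It would typically be called inside an `except urllib.error.URLError as e:` block, as `print(describe_url_error(e))`.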
