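Python: Sending HTTP Requests Through a Specific Network Interface
Requirement: a machine has several network interfaces. When accessing a given URL, how do you make the request go out through one particular interface?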
$ curl --interface eth0 www.baidu.com  # curl's --interface option selects the network interface
Reading the source of urllib.py and tracing the call chain gives open_http -> httplib.HTTP -> httplib.HTTP._connection_class = HTTPConnection.
HTTPConnection accepts a source_address argument when it is created.
HTTPConnection.connect then calls HTTPConnection._create_connection, which is socket.create_connection.
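For reference, the same source_address hook is exposed on Python 3's http.client.HTTPConnection. The sketch below is not from the original notes: it assumes Python 3, and the names fetch_via, EchoClientIP and demo are hypothetical. A throwaway local server stands in for the remote site so the example runs without external network access.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoClientIP(BaseHTTPRequestHandler):
    """Test handler: replies with the IP address the request came from."""

    def do_GET(self):
        body = self.client_address[0].encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_via(source_ip, host, port, path="/"):
    """GET http://host:port/path with the socket bound to (source_ip, 0)."""
    conn = http.client.HTTPConnection(
        host, port, timeout=5, source_address=(source_ip, 0)
    )
    try:
        conn.request("GET", path)
        return conn.getresponse().read().decode("ascii")
    finally:
        conn.close()


def demo():
    """Start a local server, fetch from it via 127.0.0.1, return the body."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoClientIP)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_via("127.0.0.1", "127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

On a real multi-interface host, source_ip would be the address assigned to the chosen interface, for example the en0 address from the ifconfig listing.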
# First, look at the local network interfaces
$ ifconfig
lo0: flags=8049 mtu 16384
    options=3
    inet6 ::1 prefixlen 128
    inet 127.0.0.1 netmask 0xff000000
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
    nd6 options=1
en0: flags=8863 mtu 1500
    ether c8:e0:eb:17:3a:73
    inet6 fe80::cae0:ebff:fe17:3a73%en0 prefixlen 64 scopeid 0x4
    inet 192.168.20.2 netmask 0xffffff00 broadcast 192.168.20.255
    nd6 options=1
    media: autoselect
    status: active
en1: flags=8863 mtu 1500
    options=4
    ether 0c:5b:8f:27:9a:64
    inet6 fe80::e5b:8fff:fe27:9a64%en8 prefixlen 64 scopeid 0xa
    inet 192.168.8.100 netmask 0xffffff00 broadcast 192.168.8.255
    nd6 options=1
    media: autoselect (100baseTX )
    status: active
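There are two physical interfaces, en0 and en1, and both can reach the public Internet. lo0 is the local loopback interface.
To test, modify socket.py directly.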
def create_connection(address, timeout=_GLOBAL_DEFAULT_TIMEOUT,
                      source_address=None):
    """If *source_address* is set it must be a tuple of (host, port)
    for the socket to bind as a source address before making the connection.
    An host of '' or port 0 tells the OS to use the default.

    If source_address is set, it must be passed as a (host, port) tuple;
    the default is ("", 0).
    """
    host, port = address
    err = None
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
        af, socktype, proto, canonname, sa = res
        sock = None
        try:
            sock = socket(af, socktype, proto)
            # Uncomment exactly one of these lines per test run:
            # sock.bind(("192.168.20.2", 0))   # en0
            # sock.bind(("192.168.8.100", 0))  # en1
            # sock.bind(("127.0.0.1", 0))      # lo0
            if timeout is not _GLOBAL_DEFAULT_TIMEOUT:
                sock.settimeout(timeout)
            if source_address:
                # Wrap the tuple so that %s formats it as a single value
                print "socket bind source_address: %s" % (source_address,)
                sock.bind(source_address)
            sock.connect(sa)
            return sock

        except error as _:
            err = _
            if sock is not None:
                sock.close()

    if err is not None:
        raise err
    else:
        raise error("getaddrinfo returns an empty list")
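Following the docstring, run the test three times, each time binding to the IP address of a different interface, with the port set to 0.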
# Test en0
$ python -c 'import urllib as u; print u.urlopen("http://ip.haschek.at").read()'
.148.245.16
# Test en1
$ python -c 'import urllib as u; print u.urlopen("http://ip.haschek.at").read()'
.94.115.227
# Test lo0
$ python -c 'import urllib as u; print u.urlopen("http://ip.haschek.at").read()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 87, in urlopen
    return opener.open(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 213, in open
    return getattr(self, name)(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 350, in open_http
    h.endheaders(data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 578, in create_connection
    raise err
IOError: [Errno socket error] [Errno 49] Can't assign requested address
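The test passes for en0 and en1: on a machine with several interfaces, binding the socket to the IP of one interface when it is created is enough, as long as the port is set to 0. The lo0 failure is expected, since a socket bound to the loopback address cannot reach a public host. If the port is not 0, the second request raises an exception because the port is already in use: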
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 87, in urlopen
    return opener.open(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 213, in open
    return getattr(self, name)(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 350, in open_http
    h.endheaders(data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 577, in create_connection
    raise err
IOError: [Errno socket error] [Errno 48] Address already in use
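The role of port 0 can be seen in isolation with two plain sockets. This is a minimal sketch, assuming Python 3 and the loopback address. The helper names bind_ephemeral, try_rebind and demo are hypothetical, and the clash shown here is with a port that is still bound, which is only one of the ways the "Address already in use" error above can arise.

```python
import errno
import socket


def bind_ephemeral(ip="127.0.0.1"):
    """Bind a TCP socket to (ip, 0) and return it; the OS picks the port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((ip, 0))
    return sock


def try_rebind(ip, port):
    """Try to bind a second socket to an address that is already taken.

    Returns the errno of the failure, or None if the bind succeeded.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((ip, port))
        return None
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


def demo():
    """Return (port_a, port_b, errno seen when reusing port_a)."""
    first = bind_ephemeral()
    second = bind_ephemeral()
    try:
        port_a = first.getsockname()[1]
        port_b = second.getsockname()[1]
        clash = try_rebind("127.0.0.1", port_a)
        return port_a, port_b, clash
    finally:
        first.close()
        second.close()
```

With port 0 each socket receives its own free port, so repeated requests from the same source IP do not collide. A fixed port can only be held by one such socket at a time.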
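In a real project there is no need to edit socket.py: it is enough to make sure the source_address parameter of socket.create_connection is set to (IP, 0) for the chosen interface.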
# test-interface_urllib.py
import socket
import urllib, urllib2

_create_socket = socket.create_connection

SOURCE_ADDRESS = ("127.0.0.1", 0)
# SOURCE_ADDRESS = ("172.28.153.121", 0)
# SOURCE_ADDRESS = ("172.16.30.41", 0)


def create_connection(*args, **kwargs):
    in_args = False
    if len(args) >= 3:
        # source_address was passed positionally as the third argument
        args = list(args)
        args[2] = SOURCE_ADDRESS
        args = tuple(args)
        in_args = True
    if not in_args:
        kwargs["source_address"] = SOURCE_ADDRESS
    print "args", args
    print "kwargs", str(kwargs)
    return _create_socket(*args, **kwargs)


# Replace the module-level function so urllib picks up the wrapper
socket.create_connection = create_connection
print urllib.urlopen("http://ip.haschek.at").read()
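Testing confirms that data is now sent through the specified interface, and the reported IP address matches the one assigned to that interface.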
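The same monkey patch can be scoped so that it is undone afterwards. The following is a sketch rather than the original author's code: it assumes Python 3, where source_address is still the third positional parameter of socket.create_connection, and the name bound_to is hypothetical.

```python
import contextlib
import socket


@contextlib.contextmanager
def bound_to(source_ip):
    """Temporarily force socket.create_connection to bind to (source_ip, 0)."""
    original = socket.create_connection

    def patched(address, *args, **kwargs):
        # Positional arguments after address are (timeout, source_address).
        # Drop a positional source_address and pass ours as a keyword.
        if len(args) >= 2:
            args = args[:1]
        kwargs["source_address"] = (source_ip, 0)
        return original(address, *args, **kwargs)

    socket.create_connection = patched
    try:
        yield
    finally:
        # Always restore the real function, even if the body raised
        socket.create_connection = original
```

A call such as urllib.request.urlopen("http://ip.haschek.at") placed inside a "with bound_to("192.168.20.2"):" block would then leave through en0, while code outside the block keeps the default behaviour. The patch only affects connections created while the block is active.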
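Next question: crawlers often use requests, so does this work with requests too? Testing shows that requests does not go through the create_connection of Python's built-in socket module.
So how does requests create its socket connection? The way to find out from the source is the same as for urllib, so the details are omitted here.
I am on Python 2.7, and the trail leads to urllib3's own connection helper: the module urllib3.util.connection, which urllib3.connection imports under the name connection.
The patch is much the same as for urllib.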
import urllib3.connection

_create_socket = urllib3.connection.connection.create_connection
# pass
urllib3.connection.connection.create_connection = create_connection
# pass
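After running this, an exception may be raised: requests.exceptions.ConnectionError: Max retries exceeded with .. Invalid argument
The exception does not occur every time and seems to depend on the IP range. The original diagnosis was too many levels of redirect recursion. Removing socket_options from kwargs works around it. With 127.0.0.1 the exception always occurs.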
import socket
import urllib
import urllib2
import urllib3.connection
import requests as req

_default_create_socket = socket.create_connection
_urllib3_create_socket = urllib3.connection.connection.create_connection

SOURCE_ADDRESS = ("127.0.0.1", 0)
# SOURCE_ADDRESS = ("172.28.153.121", 0)
# SOURCE_ADDRESS = ("172.16.30.41", 0)


def default_create_connection(*args, **kwargs):
    # socket.create_connection does not accept socket_options, so drop it
    kwargs.pop("socket_options", None)
    in_args = False
    if len(args) >= 3:
        args = list(args)
        args[2] = SOURCE_ADDRESS
        args = tuple(args)
        in_args = True
    if not in_args:
        kwargs["source_address"] = SOURCE_ADDRESS
    print "args", args
    print "kwargs", str(kwargs)
    return _default_create_socket(*args, **kwargs)


def urllib3_create_connection(*args, **kwargs):
    in_args = False
    if len(args) >= 3:
        args = list(args)
        args[2] = SOURCE_ADDRESS
        in_args = True
        args = tuple(args)
    if not in_args:
        kwargs["source_address"] = SOURCE_ADDRESS
    print "args", args
    print "kwargs", str(kwargs)
    return _urllib3_create_socket(*args, **kwargs)


socket.create_connection = default_create_connection
# The urllib3 wrapper occasionally fails, so use the wrapper around the
# default socket.create_connection instead
# urllib3.connection.connection.create_connection = urllib3_create_connection
urllib3.connection.connection.create_connection = default_create_connection

print "*** test requests: " + req.get("http://ip.haschek.at").content
print "*** test urllib: " + urllib.urlopen("http://ip.haschek.at").read()
print "*** test urllib2: " + urllib2.urlopen("http://ip.haschek.at").read()
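Note: patching through the name urllib3.utils.connection did not seem to take effect.
One further refinement is to look up the IP automatically from the interface name.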
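import random
import subprocess


def get_all_net_devices():
    sub = subprocess.Popen("ls /sys/class/net", shell=True, stdout=subprocess.PIPE)
    out, _ = sub.communicate()
    net_devices = out.strip().splitlines()
    # e.g. ['eth0', 'eth1', 'lo']
    # Simple filter on interface names; adjust it to your needs
    net_devices = [i for i in net_devices if "ppp" in i]
    return net_devices


ALL_DEVICES = get_all_net_devices()


def get_local_ip(device_name):
    # Query the named interface and take the address from its "inet" line.
    # Assumes ifconfig prints "inet <ip> ..."; older Linux builds print
    # "inet addr:<ip>", which would need a different awk expression.
    sub = subprocess.Popen(
        "/sbin/ifconfig %s | grep 'inet ' | awk '{print $2}'" % device_name,
        shell=True, stdout=subprocess.PIPE)
    out, _ = sub.communicate()
    return out.strip()


def random_local_ip():
    return get_local_ip(random.choice(ALL_DEVICES))

# code...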
Then replace SOURCE_ADDRESS in args[2] = SOURCE_ADDRESS and kwargs["source_address"] = SOURCE_ADDRESS with (random_local_ip(), 0) or (get_local_ip("eth0"), 0). The value must remain an (IP, port) tuple.
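The lookup can also be done by parsing ifconfig output in Python instead of a grep/awk pipeline. The sketch below assumes Python 3 and the "name: flags=..." layout shown in the ifconfig listing earlier. The function name parse_ipv4_by_interface and the SAMPLE text are hypothetical, and other ifconfig layouts would need different patterns.

```python
import re

# Start of an interface block, e.g. "en0: flags=8863 mtu 1500"
_IFACE_RE = re.compile(r"^([A-Za-z0-9_.-]+):\s")
# Indented IPv4 line, e.g. "    inet 192.168.20.2 netmask 0xffffff00"
_INET_RE = re.compile(r"^\s+inet\s+(\d+\.\d+\.\d+\.\d+)")

SAMPLE = "\n".join([
    "lo0: flags=8049 mtu 16384",
    "\tinet6 ::1 prefixlen 128",
    "\tinet 127.0.0.1 netmask 0xff000000",
    "en0: flags=8863 mtu 1500",
    "\tether c8:e0:eb:17:3a:73",
    "\tinet 192.168.20.2 netmask 0xffffff00 broadcast 192.168.20.255",
    "\tmedia: autoselect",
])


def parse_ipv4_by_interface(ifconfig_output):
    """Return {interface_name: first IPv4 address} from ifconfig text."""
    addresses = {}
    current = None
    for line in ifconfig_output.splitlines():
        header = _IFACE_RE.match(line)
        if header:
            current = header.group(1)
            continue
        inet = _INET_RE.match(line)
        if inet and current is not None:
            # Keep only the first IPv4 address seen for each interface
            addresses.setdefault(current, inet.group(1))
    return addresses
```

With the real output of ifconfig passed in, the source address for a given interface would be (addresses["en0"], 0). Interfaces without an IPv4 address are simply absent from the result.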
As for what this can be used for, that is left to your imagination.