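I. Simulating login on a site that requires an account and password
I have already tried a few operations on sites that do not require login. This time I use Python on a site that does, simulating the login with cookies.
Our school's academic affairs system uses a CAPTCHA, which makes it somewhat harder, so I picked an easier target: Saikr (賽氪), https://www.saikr.com
I used the F12 developer tools built into Firefox: open the site, enter the account and password, and log in, as shown in the screenshot.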
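The tools capture many POST and GET requests; the first POST request is the one that submits our account and password.
Open the Params tab of that POST request and the submitted parameters appear under the form data: name is the account name, pass is the encrypted password, and remember indicates whether to remember the password (0 means do not remember).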
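As a minimal sketch of how those form fields end up in a POST request with urllib: the credentials below are hypothetical placeholders, and build_login_request is a helper name introduced only for this illustration. The request is built but never sent.

```python
from urllib import parse, request

# Hypothetical credentials; "pass" must hold the encrypted value seen in the browser.
form = {"name": "user@example.com", "pass": "abc123", "remember": 0}


def build_login_request(url: str, fields: dict) -> request.Request:
    """URL-encode the form fields and wrap them in a Request object."""
    body = parse.urlencode(fields).encode("utf-8")
    return request.Request(url=url, data=body)


req = build_login_request("https://www.saikr.com/login", form)
print(req.data)          # b'name=user%40example.com&pass=abc123&remember=0'
print(req.get_method())  # POST
```

Passing data to Request is what turns it into a POST; without data the same object would be sent as a GET.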
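Next, look at the headers (the Headers tab of the request).
We add these request headers to the headers of our own POST request in order to simulate the login.
The Cookie header is required. Without it the server returns an error whose message says, roughly, "access timed out, please retry; if this keeps appearing, contact QQ: 1409765583":
{"code":403,"message":"訪問超時,請重試,多次出現此提示請聯系QQ:1409765583","data":[]}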
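A small sketch of how a script might detect this rejection instead of printing the raw body. It assumes only what the response above shows (a JSON object with code and message fields); login_rejected is a hypothetical helper name, and the sample bodies are shortened stand-ins.

```python
import json


def login_rejected(body: str) -> bool:
    """Return True when the JSON body carries the 403 code shown above."""
    payload = json.loads(body)
    return payload.get("code") == 403


# Shortened stand-ins for real response bodies
sample_error = '{"code":403,"message":"timeout","data":[]}'
sample_other = '{"code":200,"message":"","data":[]}'
print(login_rejected(sample_error))  # True
print(login_rejected(sample_other))  # False
```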
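With the headers in place we can create an opener that carries cookies: the first request to the login URL stores the cookies issued after login, and the same opener is then used to visit other sections of the site and view information that is only visible after logging in.
For example, after logging in at https://www.saikr.com/login I visited the "My Competitions" section at https://www.saikr.com/u/5598522
The code is as follows: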
- import urllib.parse
- from http import cookiejar
- from urllib import request
- login_url = "https://www.saikr.com/login"
- # Form fields copied from the captured POST request
- postdata = {
-     "name": "your account",
-     "pass": "your password (encrypted)",
- }
- header = {
- "Accept":"application/json, text/javascript, */*; q=0.01",
- "Accept-Language":"zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2",
- "Connection":"keep-alive",
- "Host":"www.saikr.com",
- "Referer":"https://www.saikr.com/login",
- "Cookie":"your cookie",
- "Content-Type":"application/x-www-form-urlencoded; charset=UTF-8",
- "TE":"Trailers","X-Requested-With":"XMLHttpRequest"
- }
- postdata = urllib.parse.urlencode(postdata).encode('utf8')
- # Create a CookieJar instance to store the cookies
- cookie = cookiejar.CookieJar()
- # Use HTTPCookieProcessor from urllib.request to create the cookie handler
- cookie_support = request.HTTPCookieProcessor(cookie)
- # Build an opener from the cookie handler
- opener = request.build_opener(cookie_support)
- # Create the Request objects
- my_url = "https://www.saikr.com/u/5598522"
- req1 = request.Request(url=login_url, data=postdata, headers=header)  # POST request
- req2 = request.Request(url=my_url)  # GET request; the opener supplies the login cookies, so no Cookie header is needed
- response1 = opener.open(req1)
- response2 = opener.open(req2)
- print(response1.read().decode('utf8'))
- print(response2.read().decode('utf8'))
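To see the cookie-carrying opener at work without touching the real site, here is a hedged, self-contained sketch that runs against a throwaway local server. DemoHandler, the sessionid cookie, and the /login and /me paths are all invented for the illustration and say nothing about how Saikr itself behaves.

```python
import threading
from http import cookiejar
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import error, parse, request


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a site with a login form."""

    def do_POST(self):
        # Read the form body, then issue a session cookie.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        self.send_response(200)
        self.send_header("Set-Cookie", "sessionid=demo123; Path=/")
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")

    def do_GET(self):
        # Only requests that carry the session cookie may see the page.
        logged_in = "sessionid=demo123" in self.headers.get("Cookie", "")
        body = b"private page" if logged_in else b"please log in"
        self.send_response(200 if logged_in else 403)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    no_proxy = request.ProxyHandler({})  # talk to the local server directly
    try:
        jar = cookiejar.CookieJar()
        opener = request.build_opener(no_proxy, request.HTTPCookieProcessor(jar))
        form = parse.urlencode({"name": "user", "pass": "secret"}).encode("utf-8")
        # The login response sets the cookie, which the jar stores.
        opener.open(request.Request(base + "/login", data=form)).read()
        with_cookie = opener.open(base + "/me").read().decode("utf-8")
        # An opener without a cookie jar is refused on the same page.
        try:
            request.build_opener(no_proxy).open(base + "/me")
            without_cookie = 200
        except error.HTTPError as exc:
            without_cookie = exc.code
        return with_cookie, without_cookie, [c.name for c in jar]
    finally:
        server.shutdown()
        server.server_close()


print(run_demo())  # ('private page', 403, ['sessionid'])
```

The second request succeeds only because it goes through the same opener, whose CookieJar replays the cookie set by the login response.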
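That wraps up this part.
PS: there was one small hiccup. When I added
"Accept-Encoding": "gzip, deflate, br"
to the headers, the final print(response1.read().decode('utf8')) raised:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
Cause: the request headers set 'Accept-Encoding': 'gzip, deflate', so the server sends a compressed body. urllib does not decompress it automatically, and the leading bytes 0x1f 0x8b are the gzip magic number, not valid UTF-8 text.
Reference: https://www.cnblogs.com/chyu/p/4558782.html
Fix: remove Accept-Encoding and everything works again.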
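If the header has to stay, an alternative is to decompress the body before decoding it. The sketch below handles only gzip, using the standard library; decode_body is a hypothetical helper, and it assumes the request advertises "gzip" alone, since "br" (Brotli) is not supported by the standard library.

```python
import gzip


def decode_body(raw: bytes, content_encoding=None) -> str:
    """Decompress a gzip-encoded body if needed, then decode it as UTF-8."""
    if (content_encoding or "").lower() == "gzip":
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")


# Simulate a gzip-compressed response body
compressed = gzip.compress('{"code":200}'.encode("utf-8"))
print(compressed[:2])                     # b'\x1f\x8b', the bytes behind the error above
print(decode_body(compressed, "gzip"))    # {"code":200}
print(decode_body(b'{"code":200}', None))  # {"code":200}
```

With a real urllib response this could be called as decode_body(response1.read(), response1.headers.get("Content-Encoding")).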
II. Summary of common ways to simulate logging in to a site
1. Making requests with the request module of the urllib library
- import urllib.parse
- from http import cookiejar
- from urllib import request
- # Placeholders: replace url, headers and data with real values
- url = "https://www.saikr.com/login"
- headers = {"Referer": "https://www.saikr.com/login", "Cookie": "your cookie"}
- data = {"name": "your account", "pass": "your password (encrypted)"}
- # ---------------- GET request ----------------
- # Without headers
- response = request.urlopen(url)
- page_source = response.read().decode('utf-8')
- # With headers: urllib.request.urlopen() does not accept a headers argument,
- # so build a urllib.request.Request object to set the request headers
- req = request.Request(url=url, headers=headers)
- response = request.urlopen(req)
- page_source = response.read().decode('utf-8')
- # ---------------- POST request ----------------
- postdata = urllib.parse.urlencode(data).encode('utf-8')  # the form must be URL-encoded and converted to bytes
- req1 = request.Request(url=url, data=postdata, headers=headers)
- response = request.urlopen(req1)
- page_source = response.read().decode('utf-8')
- # ---------------- Use cookies to visit other sections ----------------
- # Create a CookieJar instance to store the cookies
- cookie = cookiejar.CookieJar()
- # Use HTTPCookieProcessor from urllib.request to create the cookie handler
- cookie_support = request.HTTPCookieProcessor(cookie)
- # Build an opener from the cookie handler
- opener = request.build_opener(cookie_support)
- # Optionally install the opener globally so that urlopen() uses it;
- # otherwise call opener.open() only where it is needed
- # request.install_opener(opener)
- # Create the Request object for the page behind the login
- my_url = "https://www.saikr.com/u/5598522"
- req2 = request.Request(url=my_url)
- response1 = opener.open(req1)  # log in; the cookies are stored in the jar
- response2 = opener.open(req2)  # or simply: response2 = opener.open(my_url)
- print(response1.read().decode('utf8'))
- print(response2.read().decode('utf8'))
2. Using the get and post functions of the requests library
- import json
- import urllib.parse
- import requests
- # ---------------- GET request ----------------
- # Method 1: build the query string by hand
- url = "https://www.saikr.com/"
- params = {'key1': 'value1', 'key2': 'value2'}
- real_url = url + "?" + urllib.parse.urlencode(params)
- # real_url == "https://www.saikr.com/?key1=value1&key2=value2"
- response = requests.get(real_url)
- # Method 2: let requests build the query string
- response = requests.get(url, params=params)
- print(response.text)  # <class 'str'>
- print(response.content)  # <class 'bytes'>
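To check that the two GET methods produce the same URL, one option is to prepare the request without sending it. This is only a sketch; the key1 and key2 parameters are the same dummy values used above and have no meaning on the real site.

```python
import requests

url = "https://www.saikr.com/"
params = {"key1": "value1", "key2": "value2"}

# Build the request locally; nothing is sent over the network.
prepared = requests.Request("GET", url, params=params).prepare()
print(prepared.url)  # https://www.saikr.com/?key1=value1&key2=value2
```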
- # ---------------- POST request ----------------
- login_url = "https://www.saikr.com/login"
- postdata ={
- "name": "1324802616@qq.com","pass": "my password",
- }
- header = {
- "Accept":"application/json, text/javascript, */*; q=0.01",
- "Accept-Language":"zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2",
- "Connection":"keep-alive",
- "Host":"www.saikr.com",
- "Referer":"https://www.saikr.com/login",
- "Cookie":"mycookie",
- "Content-Type":"application/x-www-form-urlencoded; charset=UTF-8",
- "TE":"Trailers","X-Requested-With":"XMLHttpRequest"
- }
- # requests URL-encodes a dict passed as data, so no manual encoding is needed
- # login_postdata = urllib.parse.urlencode(postdata).encode('utf8')
- response = requests.post(url=login_url, data=postdata, headers=header)  # <class 'requests.models.Response'>
- # Any of the following three can be used to read the result
- json1 = response.json()  # <class 'dict'>
- json2 = json.loads(response.text)  # <class 'dict'>
- json_str = response.content.decode('utf-8')  # <class 'str'>
- # ---------------- Use a session to keep the login and visit other sections ----------------
- login_url = "https://www.saikr.com/login"
- postdata ={
- "name": "1324802616@qq.com","pass": "my password",
- }
- header = {
- "Accept":"application/json, text/javascript, */*; q=0.01",
- "Connection":"keep-alive",
- "Referer":"https://www.saikr.com/login",
- "Cookie":"mycookie",
- }
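- session = requests.Session()
- # Log in; the session keeps the cookies returned by the server
- response = session.post(url=login_url, data=postdata, headers=header)
- my_url = "https://www.saikr.com/u/5598522"
- response1 = session.get(url=my_url, headers=header)
- print(response1.text)  # use response1.json() only if the page returns JSON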
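The reason the session approach works is that a Session object keeps a cookie jar and attaches its contents to every later request. A minimal offline sketch of that behaviour, where the sessionid cookie is a hypothetical stand-in for whatever a successful login would set:

```python
import requests

session = requests.Session()
# Hypothetical cookie, standing in for what a successful login would set.
session.cookies.set("sessionid", "demo123")

# Prepare (but do not send) a request to see what the session would transmit.
prepared = session.prepare_request(
    requests.Request("GET", "https://www.saikr.com/u/5598522")
)
print(prepared.headers["Cookie"])  # sessionid=demo123
```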