@ebony0319 I wrote a rough version. You need to replace YOUR_COOKIE in it with your own cookie:
##################################
import re
import time

import requests

headers = {
    'Cookie': 'YOUR_COOKIE',  # replace with your own cookie string
}
base_url = 'http://weibo.com/p/1001018008644010000000000/checkin?page=%s'

def append_to_file(text, filename):
    # Append mode keeps earlier pages; the data/ directory must already exist.
    with open(filename, 'a', encoding='utf-8') as f:
        f.write(text + '\n')

def get_page(page):
    r = requests.get(base_url % page, headers=headers)
    print(r.text)
    # The regex expects backslash-escaped quotes and slashes (\" and \/) in the response.
    return '\n'.join(re.findall('<strong usercard=\\\\"[^"]+" >([^<]+)<\\\\/strong>', r.text))

sleep_interval = 5
for p in range(1, 35612):
    nicks = get_page(p)
    retry_count = 0
    # An empty result is treated as throttling: wait a little longer each time, then retry.
    while len(nicks) == 0:
        retry_count += 1
        time.sleep(retry_count * sleep_interval)
        nicks = get_page(p)
    append_to_file(nicks, 'data/nicks.txt')
##################################
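If you want to check what the regex in get_page actually picks out, here is a small sketch. The sample string is made up by me (not real weibo output) and just assumes the page returns markup with backslash-escaped quotes and slashes, which is what the pattern expects. `extract_nicks` and `NICK_RE` are names I introduced for the example.

```python
import re

# Same pattern as in get_page: it expects backslash-escaped quotes and slashes.
NICK_RE = re.compile('<strong usercard=\\\\"[^"]+" >([^<]+)<\\\\/strong>')


def extract_nicks(text):
    """Return every nickname captured by the pattern, in page order."""
    return NICK_RE.findall(text)


# Hypothetical fragment for illustration only, not real weibo output.
sample = (r'<strong usercard=\"id=1\" >alice<\/strong>'
          r'<strong usercard=\"id=2\" >bob<\/strong>')
print(extract_nicks(sample))
```

If the real response is not escaped this way, the pattern matches nothing and the script will just keep retrying, so it is worth checking against one saved page first.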
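Weibo throttles requests, and I couldn't be bothered to work out how to get around it, so I went with the simplest sleep + retry. Just leave it running and let it take its time.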
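The sleep + retry part can also be pulled out into a small helper, roughly like the sketch below. The linear wait (5s, 10s, 15s, ...) is the same as in the script; `fetch_with_retry`, the `max_retries` cap and the injectable `sleep` argument are my own additions for illustration, so that a page that is really empty cannot loop forever and the logic can be tried without waiting.

```python
import time


def fetch_with_retry(fetch, page, sleep_interval=5, max_retries=5, sleep=time.sleep):
    """Call fetch(page) until it returns something non-empty or retries run out.

    max_retries is an assumption added here; the original loop retries forever.
    """
    result = fetch(page)
    retry_count = 0
    while not result and retry_count < max_retries:
        retry_count += 1
        # Wait a little longer after each empty response.
        sleep(retry_count * sleep_interval)
        result = fetch(page)
    return result


if __name__ == '__main__':
    # Fake fetcher: empty twice (as if throttled), then data.
    responses = iter(['', '', 'alice\nbob'])
    waits = []
    print(fetch_with_retry(lambda p: next(responses), 1, sleep=waits.append))
    print(waits)
```

In the real script you would pass get_page as `fetch` and keep the default `time.sleep`. If the helper gives up, it returns the empty result, so the caller can decide whether to skip the page or stop.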
If you need me to run it for you, just reply.