python - Problem parsing a web page with BeautifulSoup - PHP中文网 Q&A

天蓬老师 2017-04-17 13:08:28

[Python discussion group]

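Calling soup = BeautifulSoup(urlopen(url).read()) fails to parse the page:
soup.findAll('') returns no nodes at all.
But if I run html = urlopen(url).read(), print html, copy the console output into a variable content, and then call soup = BeautifulSoup(content), the page parses successfully. Why?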
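
To reproduce the two paths described above without a network connection, the same markup can be parsed once as raw bytes and once as an already-decoded string. This is only a minimal sketch, assuming Python 3 with bs4 installed and the built-in html.parser. The sample markup and the GBK encoding are made-up stand-ins, not the actual page from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for urlopen(url).read(): raw bytes in an assumed GBK encoding.
raw = '<html><body><p>球员</p><p>Do you know?</p></body></html>'.encode('gbk')

# Path 1: pass the raw bytes, naming the parser and the encoding explicitly
# so BeautifulSoup does not have to guess either one.
soup_bytes = BeautifulSoup(raw, 'html.parser', from_encoding='gbk')

# Path 2: decode first, then parse the text. This is roughly what happens
# when printed output is pasted back into a string variable.
soup_text = BeautifulSoup(raw.decode('gbk'), 'html.parser')


def tag_count(soup):
    """Count every tag in the parsed tree."""
    return len(soup.find_all(True))


counts = (tag_count(soup_bytes), tag_count(soup_text))
print(counts)
```

If the two counts differ on a real page, the encoding guess or the default parser is a likely place to look. Naming both explicitly removes that variable from the comparison.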

All replies (1)

高洛峰 2017-04-17 13:10:28 #1

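# Python 2 code: urllib.urlopen and the print statement do not exist in Python 3.
from bs4 import BeautifulSoup
import urllib

url = 'http://soccerdata.sports.qq.com/playerSearch.aspx?lega=epl&pn=9'
# Parse the raw bytes returned by the server directly.
soup = BeautifulSoup(urllib.urlopen(url).read())
print len(soup.findAll())     # all tags
print len(soup.findAll(''))   # tags matched by an empty name
print len(soup.findAll('p'))  # <p> tags only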

content = '<p><h1>This is my homepage.</h1><p>Do you know?</p></p>'
soup2 = BeautifulSoup(content)
print len(soup2.findAll())
print len(soup2.findAll(''))
print len(soup2.findAll('p'))

Output: