代码如下:
from creepy import Crawler
from BeautifulSoup import BeautifulSoup
import urllib2
import json
class MyCrawler(Crawler):
def process_document(self, doc):
if doc.status == 200:
print '[%d] %s' % (doc.status, doc.url)
try:
soup = BeautifulSoup(doc.text.decode('gb18030').encode('utf-8'))
except Exception as e:
print e
soup = BeautifulSoup(doc.text)
print soup.find(id="product-intro").div.h1.text
url_id=urllib2.unquote(doc.url).decode('utf8').split('/')[-1].split('.')[0]
f = urllib2.urlopen('http://p.3.cn/prices/get?skuid=J_'+url_id,timeout=5)
price=json.loads(f.read())
f.close()
print price[0]['p']
else:
pass
crawler = MyCrawler()
crawler.set_follow_mode(Crawler.F_SAME_HOST)
crawler.set_concurrency_level(16)
crawler.add_url_filter('\.(jpg|jpeg|gif|png|js|css|swf)$')
crawler.crawl('http://item.jd.com/982040.html')
京东app是一款移动购物软件,具有商品搜索/浏览、评论查阅、商品购买、在线支付/货到付款、订单查询、物流跟踪、晒单/评价、返修退换货等功能,为您打造简单、快乐的生活体验。有需要的小伙伴快来保存下载体验吧!
C++高性能并发应用_C++如何开发性能关键应用
Java AI集成Deep Java Library_Java怎么集成AI模型部署
Golang后端API开发_Golang如何高效开发后端和API
Python异步并发改进_Python异步编程有哪些新改进
C++系统编程内存管理_C++系统编程怎么与Rust竞争内存安全
Java GraalVM原生镜像构建_Java怎么用GraalVM构建高效原生镜像
Python FastAPI异步API开发_Python怎么用FastAPI构建异步API
C++现代C++20/23/26特性_现代C++有哪些新标准特性如modules和coroutines
Copyright 2014-2025 https://www.php.cn/ All Rights Reserved | php.cn | 湘ICP备2023035733号