2.3.2 Code Development
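Overriding the make_request method is usually done together with overriding the make_seeds method. If your seeds are large, this approach is recommended. Its advantages are: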
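- Repeated code can be factored out into one place
- The seed format stays compact, which saves memory
- Initialization is decoupled from the seeds -> request flow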
from bricks import Request
from bricks.spider import air


class MyTest(air.Spider):
    def make_seeds(self, context: air.Context, **kwargs):
        # Yield compact seeds: only the values that vary between requests
        for i in range(1):
            yield {
                "page": 1
            }

    def make_request(self, context: air.Context) -> Request:
        # The seeds defined above are put on the task queue, later taken off it,
        # and placed into the context object
        seeds = context.seeds
        return Request(
            url="https://fx1.service.kugou.com/mfanxing-home/h5/cdn/room/index/list_v2",
            params={
                "page": seeds["page"],
                "cid": 6000
            },
            headers={
                "User-Agent": "Mozilla/5.0 (Linux; Android 10; Redmi K30 Pro) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Mobile Safari/537.36",
                "Content-Type": "application/json;charset=UTF-8",
            },
        )


if __name__ == '__main__':
    spider = MyTest()
    spider.run()
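
The memory benefit of compact seeds can be illustrated without the framework. The following is a minimal sketch in plain Python, not the bricks API: SketchRequest, COMMON_HEADERS, and serialized_size are hypothetical names introduced only for this illustration, and it assumes that queued seeds are serialized (JSON is used here as a stand-in). It compares the size of seeds that carry only the page number with the size of the fully expanded requests.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, Iterator


# Hypothetical stand-in for a request object; this is NOT bricks.Request.
@dataclass
class SketchRequest:
    url: str
    params: Dict[str, object] = field(default_factory=dict)
    headers: Dict[str, str] = field(default_factory=dict)


BASE_URL = "https://fx1.service.kugou.com/mfanxing-home/h5/cdn/room/index/list_v2"
# Shared header values live in one place instead of inside every seed.
COMMON_HEADERS = {"Content-Type": "application/json;charset=UTF-8"}


def make_seeds(pages: int) -> Iterator[dict]:
    """Yield compact seeds that hold only the value that varies per request."""
    for page in range(1, pages + 1):
        yield {"page": page}


def make_request(seeds: dict) -> SketchRequest:
    """Expand one compact seed into a full request description."""
    return SketchRequest(
        url=BASE_URL,
        params={"page": seeds["page"], "cid": 6000},
        headers=dict(COMMON_HEADERS),
    )


def serialized_size(obj: dict) -> int:
    """Length of the JSON text, used as a rough proxy for queue memory."""
    return len(json.dumps(obj))


seeds = list(make_seeds(3))
requests = [make_request(s) for s in seeds]

compact_size = sum(serialized_size(s) for s in seeds)
full_size = sum(serialized_size(asdict(r)) for r in requests)
print(f"compact seeds: {compact_size} chars, full requests: {full_size} chars")
```

Because the URL, the fixed cid parameter, and the headers are added only when a seed is expanded, each queued item stays small, and a change to the shared values touches one function rather than every seed.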