requests - Scrapy Bench/Benchmark command errors
You will need to install the cffi Python package, but first you need the FFI library itself, which on Ubuntu means libffi-dev and libffi:
sudo aptitude install libffi-dev libffi
sudo pip install cffi
You will also need to install libssl-dev, because it is used by the cryptography Python package.
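A quick way to confirm that those dependencies are actually usable from Python (a minimal check of my own, not part of the original answer) is:

try:
    import cffi
    import cryptography
    print("cffi", cffi.__version__)
    print("cryptography", cryptography.__version__)
except ImportError as exc:
    # If the C extensions failed to build (e.g. libffi-dev/libssl-dev missing
    # at install time), the import typically fails here.
    print("missing dependency:", exc)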
After that, you should reinstall Scrapy with: sudo pip install scrapy --upgrade
If that does not solve the problem, install the latest version of Scrapy from the GitHub tarball:
https://github.com/scrapy/scrapy/tarball/master
It worked for me...
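To double-check which Scrapy version ended up installed after the upgrade, and that pyOpenSSL (the "ssl" optional feature in the bench log) is importable, a quick sketch of my own, not part of the original answer:

import scrapy
import OpenSSL  # pyOpenSSL, backing Scrapy's "ssl" optional feature
print("Scrapy", scrapy.__version__)
print("pyOpenSSL", OpenSSL.__version__)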
I installed Scrapy 0.22.2 and was able to run the DirBot example code without problems. However, when I run the bench command I get some errors and exceptions. Is there an underlying problem with port 8998 not accepting connections?
C:\>scrapy bench
Traceback (most recent call last):
  File "C:\Python27\lib\runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "C:\Python27\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "C:\Python27\lib\site-packages\scrapy-0.22.2-py2.7.egg\scrapy\tests\mockserver.py", line 198, in <module>
    os.path.join(os.path.dirname(__file__), 'keys/cert.pem'),
  File "C:\Python27\lib\site-packages\twisted\internet\ssl.py", line 70, in __init__
    self.cacheContext()
  File "C:\Python27\lib\site-packages\twisted\internet\ssl.py", line 79, in cacheContext
    ctx.use_certificate_file(self.certificateFileName)
OpenSSL.SSL.Error: [('system library', 'fopen', 'No such process'), ('BIO routines', 'FILE_CTRL', 'system lib'), ('SSL routines', 'SSL_CTX_use_certificate_file', 'system lib')]
2014-04-07 14:30:39-0500 [scrapy] INFO: Scrapy 0.22.2 started (bot: scrapybot)
2014-04-07 14:30:39-0500 [scrapy] INFO: Optional features available: ssl, http11
2014-04-07 14:30:39-0500 [scrapy] INFO: Overridden settings: {'CLOSESPIDER_TIMEOUT': 10, 'LOG_LEVEL': 'INFO', 'LOGSTATS_INTERVAL': 1}
2014-04-07 14:30:40-0500 [scrapy] INFO: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2014-04-07 14:30:42-0500 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2014-04-07 14:30:42-0500 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2014-04-07 14:30:42-0500 [scrapy] INFO: Enabled item pipelines:
2014-04-07 14:30:42-0500 [follow] INFO: Spider opened
2014-04-07 14:30:42-0500 [follow] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2014-04-07 14:30:43-0500 [follow] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2014-04-07 14:30:44-0500 [follow] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2014-04-07 14:30:45-0500 [follow] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2014-04-07 14:30:45-0500 [follow] ERROR: Error downloading <GET http://localhost:8998/follow?total=100000&order=rand&maxlatency=0.0&show=20>: Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
2014-04-07 14:30:45-0500 [follow] INFO: Closing spider (finished)
2014-04-07 14:30:45-0500 [follow] INFO: Dumping Scrapy stats:
    {'downloader/exception_count': 3,
     'downloader/exception_type_count/twisted.internet.error.ConnectionRefusedError': 3,
     'downloader/request_bytes': 783,
     'downloader/request_count': 3,
     'downloader/request_method_count/GET': 3,
     'finish_reason': 'finished',
     'finish_time': datetime.datetime(2014, 4, 7, 19, 30, 45, 575000),
     'log_count/ERROR': 1,
     'log_count/INFO': 10,
     'scheduler/dequeued': 3,
     'scheduler/dequeued/memory': 3,
     'scheduler/enqueued': 3,
     'scheduler/enqueued/memory': 3,
     'start_time': datetime.datetime(2014, 4, 7, 19, 30, 42, 439000)}
2014-04-07 14:30:45-0500 [follow] INFO: Spider closed (finished)
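The two failures above appear to be connected: scrapy bench runs scrapy/tests/mockserver.py as the local benchmark target, and the traceback shows that process dying while loading keys/cert.pem, so nothing is left listening on port 8998 and the spider's requests to localhost:8998 are refused. A simple way to confirm that the port is just closed (a diagnostic sketch of my own, not part of the original post) is:

import socket

# connect_ex returns 0 only if something is accepting connections on the port.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(2)
result = sock.connect_ex(("127.0.0.1", 8998))
sock.close()
print("port 8998 open" if result == 0 else "port 8998 closed (mock server not running)")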