I have an asynchronous parser that sends its requests through the Tor network:
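import random
import traceback
import aiohttp
from aiohttp_socks import SocksConnector
from bs4 import BeautifulSoup as BS4
proxy_auth = str(random.randint(10000, 0x7fffffff)) + ':' + 'foobar'  # random SOCKS username
connector = SocksConnector.from_url('socks5://{}@localhost:9050'.format(proxy_auth))
results = []
async with aiohttp.ClientSession(connector=connector) as session:
    for _ in range(5):  # up to five attempts
        try:
            async with session.get(url, allow_redirects=True, timeout=4) as response:
                soup = BS4(await response.text(), 'html.parser')
                results.append(result)  # result is built from soup (omitted)
        except Exception:
            traceback.print_exc()  # log the error and try again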
While it runs, it raises a lot of errors:
Traceback (most recent call last):
  File "parser.py", line 149, in get_matches_info
    async with session.get(meta['url'], allow_redirects=True, timeout=7) as response:
  File "C:\Python37\lib\site-packages\aiohttp\client.py", line 1005, in __aenter__
    self._resp = await self._coro
  File "C:\Python37\lib\site-packages\aiohttp\client.py", line 575, in _request
    break
  File "C:\Python37\lib\site-packages\aiohttp\helpers.py", line 585, in __exit__
    raise asyncio.TimeoutError from None
concurrent.futures._base.TimeoutError

Traceback (most recent call last):
  File "parser.py", line 149, in get_matches_info
    async with session.get(meta['url'], allow_redirects=True, timeout=7) as response:
  File "C:\Python37\lib\site-packages\aiohttp\client.py", line 1005, in __aenter__
    self._resp = await self._coro
  File "C:\Python37\lib\site-packages\aiohttp\client.py", line 476, in _request
    timeout=real_timeout
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 522, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 854, in _create_connection
    req, traces, timeout)
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 992, in _create_direct_connection
    raise last_exc
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 974, in _create_direct_connection
    req=req, client_error=client_error)
  File "C:\Python37\lib\site-packages\aiohttp_socks\connector.py", line 53, in _wrap_create_connection
    protocol_factory, None, None, sock=sock.socket, **kwargs)
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 931, in _wrap_create_connection
    raise client_error(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host www.example.com:443 ssl:None [None]

Traceback (most recent call last):
  File "parser.py", line 149, in get_matches_info
    async with session.get(meta['url'], allow_redirects=True, timeout=7) as response:
  File "C:\Python37\lib\site-packages\aiohttp\client.py", line 1005, in __aenter__
    self._resp = await self._coro
  File "C:\Python37\lib\site-packages\aiohttp\client.py", line 497, in _request
    await resp.start(conn)
  File "C:\Python37\lib\site-packages\aiohttp\client_reqrep.py", line 844, in start
    message, payload = await self._protocol.read() # type: ignore # noqa
  File "C:\Python37\lib\site-packages\aiohttp\streams.py", line 588, in read
    await self._waiter
aiohttp.client_exceptions.ServerDisconnectedError: None
Because of this, some web pages never get parsed. Moreover, after the script has been running for a while, another error starts to occur:
Traceback (most recent call last):
  File "parser.py", line 149, in get_matches_info
    async with session.get(meta['url'], allow_redirects=True, timeout=7) as response:
  File "C:\Python37\lib\site-packages\aiohttp\client.py", line 1005, in __aenter__
    self._resp = await self._coro
  File "C:\Python37\lib\site-packages\aiohttp\client.py", line 476, in _request
    timeout=real_timeout
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 522, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 854, in _create_connection
    req, traces, timeout)
  File "C:\Python37\lib\site-packages\aiohttp\connector.py", line 974, in _create_direct_connection
    req=req, client_error=client_error)
  File "C:\Python37\lib\site-packages\aiohttp_socks\connector.py", line 50, in _wrap_create_connection
    await sock.connect((host, port))
  File "C:\Python37\lib\site-packages\aiohttp_socks\proto.py", line 138, in connect
    (self._socks_host, self._socks_port, e.strerror)) from e
aiohttp_socks.errors.SocksConnectionError: [Errno 10061] Can not connect to proxy localhost:9050 [Connect call failed ('127.0.0.1', 9050)]
At that point I can no longer connect to the Tor network, and the Tor console shows the following:
Jul 21 14:21:38.000 [notice] We'd like to launch a circuit to handle a connection, but we already have 32 general-purpose client circuits pending. Waiting until some finish.
Jul 21 14:23:10.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60s after 18 timeouts and 124 buildtimes.
Jul 21 14:27:21.000 [notice] Tried for 121 seconds to get a connection to [scrubbed]:443. Giving up.
Jul 21 14:27:21.000 [notice] Tried for 121 seconds to get a connection to [scrubbed]:443. Giving up. (waiting for circuit)
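The first notice above says Tor already had 32 client circuits pending, which suggests that more requests were in flight than Tor could build circuits for. One possible mitigation, sketched here purely under that assumption, is to cap the number of concurrent requests with `asyncio.Semaphore` and to back off between retries. The names `fetch_with_limit`, `crawl`, `fetch` and `MAX_IN_FLIGHT` are hypothetical and not part of the parser; the cap of 8 is a guess, not a measured value.

```python
import asyncio

MAX_IN_FLIGHT = 8  # assumed cap, chosen to stay below the 32 pending circuits in the log


async def fetch_with_limit(fetch, url, semaphore, attempts=5, base_delay=1.0):
    """Run the hypothetical coroutine fetch(url) under a shared semaphore.

    Retries on any exception with exponential backoff and re-raises the
    last error once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            async with semaphore:  # blocks while `limit` fetches are already running
                return await fetch(url)
        except Exception:
            if attempt == attempts - 1:
                raise
            # Sleep after releasing the semaphore so other tasks can proceed.
            await asyncio.sleep(base_delay * 2 ** attempt)


async def crawl(fetch, urls, limit=MAX_IN_FLIGHT):
    """Fetch all urls concurrently, never more than `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)
    tasks = [fetch_with_limit(fetch, u, semaphore) for u in urls]
    # return_exceptions=True returns failures as values instead of raising the first one
    return await asyncio.gather(*tasks, return_exceptions=True)
```

Here `fetch` would be a small coroutine that wraps the `session.get(...)` call and returns the page text. Whether this removes the timeouts depends on the Tor configuration and on the target site, so the limit and delays would need tuning.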
What could be causing these errors, and how should I deal with them? Is it feasible at all to use Tor from multiple threads, or is that a bad idea?