I wrote a simple script to fetch the HTML of multiple websites. I didn't have any issue with it until yesterday, when it suddenly started throwing the exception below.
Traceback (most recent call last):
  File "crowling.py", line 45, in <module>
    result = requests.get(url)
  File "/Users/gen/.pyenv/versions/3.7.1/lib/python3.7/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/Users/gen/.pyenv/versions/3.7.1/lib/python3.7/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Users/gen/.pyenv/versions/3.7.1/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/gen/.pyenv/versions/3.7.1/lib/python3.7/site-packages/requests/sessions.py", line 685, in send
    r.content
  File "/Users/gen/.pyenv/versions/3.7.1/lib/python3.7/site-packages/requests/models.py", line 829, in content
    self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
  File "/Users/gen/.pyenv/versions/3.7.1/lib/python3.7/site-packages/requests/models.py", line 754, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ("Connection broken: ConnectionResetError(54, 'Connection reset by peer')", ConnectionResetError(54, 'Connection reset by peer'))
The main part of the script is this:
import requests

c = 0
#urls is the list of urls as strings
for url in urls:
    result = requests.get(url)
    c += 1
    with open('htmls/p{}.html'.format(c), 'w', encoding='UTF-8') as f:
        f.write(result.text)
The list urls is generated by other code of mine, and I have checked that the URLs are correct. Also, the timing of the exception is not consistent: sometimes the script stops around the 20th page, and sometimes it gets to the 80th before stopping. Since the exception appeared suddenly without any change to the code, I am guessing it is caused by the Internet connection. Still, I want the script to run stably. What are the possible causes of this error?
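For reference, this is the kind of retry wrapper I am considering putting around requests.get, in case the reset is just a transient network problem. The function name fetch_with_retries, the retry count, the delay, and the timeout are placeholders I made up for this sketch; they are not part of my current script:

import time
import requests
from requests.exceptions import ChunkedEncodingError, ConnectionError

def fetch_with_retries(url, retries=3, delay=5):
    # retry a few times before giving up, in case the
    # connection reset is only a temporary network problem
    for attempt in range(retries):
        try:
            return requests.get(url, timeout=30)
        except (ChunkedEncodingError, ConnectionError):
            if attempt == retries - 1:
                raise
            time.sleep(delay)

for c, url in enumerate(urls, start=1):
    result = fetch_with_retries(url)
    with open('htmls/p{}.html'.format(c), 'w', encoding='UTF-8') as f:
        f.write(result.text)

Would something like this be enough, or is there a more fundamental cause I should address first?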