Socket.py timeout

I am working on a skill that used to work as intended but no longer does (it probably broke after an update). The logic for that portion of the code works on my machine outside of Mycroft, so I suspect something is different in the virtual environment.

The skill calls an API which pulls data from mlb.com, but the socket times out roughly 90% of the time. I can’t find a pattern to the successes and failures, but the ratio seems to sit around 10/90. That portion of the API, and urllib/http, are unfamiliar to me, so I was hoping somebody with more experience would recognize this.

Source code is available here. There is an open issue on that repository.

Error is the following:

17:26:29.948 - mycroft.skills.core:wrapper:607 - ERROR - An error occurred while processing a request in Score Skill
Traceback (most recent call last):
  File "/home/dj/mycroft-core/mycroft/skills/core.py", line 598, in wrapper
    handler(message)
  File "/opt/mycroft/skills/skill-score/__init__.py", line 103, in handle_live_score_intent
    self.get_result()
  File "/opt/mycroft/skills/skill-score/__init__.py", line 76, in get_result
    self.get_game()
  File "/opt/mycroft/skills/skill-score/__init__.py", line 58, in get_game
    self.game = mlbgame.day(self.year, self.month, self.day, home=self.team, away=self.team)
  File "/home/dj/mycroft-core/.venv/lib/python3.6/site-packages/mlbgame/__init__.py", line 151, in day
    data = mlbgame.game.scoreboard(year, month, day, home=home, away=away)
  File "/home/dj/mycroft-core/.venv/lib/python3.6/site-packages/mlbgame/game.py", line 19, in scoreboard
    data = mlbgame.data.get_scoreboard(year, month, day)
  File "/home/dj/mycroft-core/.venv/lib/python3.6/site-packages/mlbgame/data.py", line 49, in get_scoreboard
    data = urlopen(BASE_URL.format(year, month, day) + 'scoreboard.xml')
  File "/home/dj/miniconda3/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/home/dj/miniconda3/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/home/dj/miniconda3/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/home/dj/miniconda3/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/home/dj/miniconda3/lib/python3.6/urllib/request.py", line 1346, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/home/dj/miniconda3/lib/python3.6/urllib/request.py", line 1321, in do_open
    r = h.getresponse()
  File "/home/dj/miniconda3/lib/python3.6/http/client.py", line 1331, in getresponse
    response.begin()
  File "/home/dj/miniconda3/lib/python3.6/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/home/dj/miniconda3/lib/python3.6/http/client.py", line 258, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/home/dj/miniconda3/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out

Hi @permanentlytemporary, thanks for posting this. I took a look at your Python code and I can’t see any obvious errors in it. My guess is that the mlb library opens a socket to get the data, but does not set a timeout or raise an exception when the timeout is reached.

Is there any documentation on mlb? I’ll take a quick look.

TL;DR I don’t think this is your Python, I think this is mlb.

Thanks @KathyReid!

I agree that this is an mlbgame/urllib problem :smiley:

The api - mlbgame lives here: https://github.com/panzarino/mlbgame

And there is documentation for the module in question here: http://panz.io/mlbgame/data.m.html

With no timeout specified it should fall back to the global default. I am confused as to why this call succeeds outside of Mycroft, and even within Mycroft’s virtual environment, but not when Mycroft is actually running.
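
A quick way to check what that global default actually is inside a given process is sketched below. It only uses the standard `socket` module; the helper name `new_socket_timeout` is made up for illustration. Running the same lines from a plain interpreter and from inside the running skill should show whether the two environments differ.

```python
import socket


def new_socket_timeout():
    """Return the timeout a freshly created socket starts with (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        return sock.gettimeout()
    finally:
        sock.close()


# None means "no timeout, block until data arrives".
# Any other value is a number of seconds that applies to every new socket
# in this process, including the ones urlopen() creates when it is called
# without an explicit timeout argument.
print("process-wide default:", socket.getdefaulttimeout())
print("new socket starts with:", new_socket_timeout())
```

If the two environments print different values, some code loaded in the process has called `socket.setdefaulttimeout()`.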

Without being fluent in Python websockets, my guess would be that there is some sort of socket timeout issue occurring. @forslund do you have any ideas here on what we should be checking for so that we don’t go down a wrong path?

The main issue, as you’ve pointed out, seems to be a timeout reading data from the site. This may be a change in the site’s API or some other network-related issue. I’d first see if I can read the data manually.

Start up Python and run the failing mlbgame.data.get_scoreboard(year, month, day) with the intended date and see if that works. If it doesn’t, try to work out the BASE_URL and open the intended link in a browser.

If it does work, try to provide an invalid date and see if the error occurs.

@forslund I can confirm that I get valid http responses when running manually. It nets me this page, which is exactly what is intended: http://gd2.mlb.com/components/game/mlb/year_2018/month_06/day_04/scoreboard.xml
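
For anyone repeating this manual check without going through mlbgame, here is a minimal sketch using only the standard library. The `BASE_URL` pattern is an assumption inferred from the link above, not copied from mlbgame's source, and `scoreboard_url` and `fetch_scoreboard` are hypothetical helper names. Passing `timeout` explicitly means the request does not depend on whatever process-wide default is in effect.

```python
import urllib.request

# Assumed pattern, inferred from the scoreboard link quoted in this thread.
BASE_URL = "http://gd2.mlb.com/components/game/mlb/year_{0}/month_{1:02d}/day_{2:02d}/"


def scoreboard_url(year, month, day):
    """Build the scoreboard URL for a date (hypothetical helper)."""
    return BASE_URL.format(year, month, day) + "scoreboard.xml"


def fetch_scoreboard(year, month, day, timeout=30):
    """Download the raw scoreboard XML with an explicit per-request timeout."""
    url = scoreboard_url(year, month, day)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


# Example (needs network access, so it is left commented out):
# xml_bytes = fetch_scoreboard(2018, 6, 4)
print(scoreboard_url(2018, 6, 4))
```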

mlbgame has no issue making socket requests manually. I copy-pasted the skill class out, built a simple wrapper to implement it in a terminal and it throws no errors.

Did you try to pass an invalid date to the method?

Yup, invalid dates (zeros, negatives, months > 12, days > 31, etc.) return a string containing the path to a default XML file with an ‘empty’ XML scoreboard object.

Ok, this is really weird. I can repeat the issue exactly as you say it is. We must somehow change the timeout on sockets and the page is too slow or something.

In the Mycroft venv the call works fine; only when it is run from within a Mycroft process do I see the timeout error. Will research further.

Ok did a quick test with your skill and added

        import socket
        socket.setdefaulttimeout(100)

before mlbgame.day which fixes the issue for me…
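
Since `socket.setdefaulttimeout()` changes the setting for the whole process, a slightly more contained variant of the same workaround is to raise the default only around the slow call and restore it afterwards. The sketch below is one way to do that; `default_socket_timeout` is a hypothetical helper, not part of Mycroft or mlbgame, and because the setting is still process-wide while the block runs, other threads opening sockets at that moment would see it too.

```python
import socket
from contextlib import contextmanager


@contextmanager
def default_socket_timeout(seconds):
    """Temporarily change the process-wide default socket timeout."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        yield
    finally:
        # Put back whatever was configured before, even if the call failed.
        socket.setdefaulttimeout(previous)


# Intended use inside the skill (assumes mlbgame is installed):
# with default_socket_timeout(100):
#     self.game = mlbgame.day(self.year, self.month, self.day,
#                             home=self.team, away=self.team)
```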

I wonder if the default has changed between Python versions?

Thanks a ton - it looks good over here as well!

A short update:
I found the culprit that was reducing the timeout to 3 seconds and have submitted a PR fixing it: https://github.com/MycroftAI/mycroft-core/pull/1627
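
For readers who want to see the mechanism in isolation, the following is a rough, self-contained reproduction and not the code from the PR. It assumes a local listener that accepts the TCP connection but never answers, standing in for a slow server, and a short process-wide default timeout standing in for the 3 seconds mentioned above. The function name `read_times_out` is made up for illustration.

```python
import socket
import urllib.request


def read_times_out(default_timeout):
    """Return True if urlopen() hits socket.timeout under the given default."""
    # Listener that completes the TCP handshake but never sends a response.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    # Ignore any proxy settings from the environment for this local request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

    previous = socket.getdefaulttimeout()
    # This is the kind of call another module in the same process might make.
    socket.setdefaulttimeout(default_timeout)
    try:
        # No timeout argument, so the process-wide default applies.
        opener.open("http://127.0.0.1:%d/scoreboard.xml" % port)
    except socket.timeout:
        return True
    finally:
        socket.setdefaulttimeout(previous)
        listener.close()
    return False


print(read_times_out(0.2))
```

The request code itself never mentions a timeout, yet it fails as soon as anything else in the process lowers the default, which matches the behaviour seen only inside the running Mycroft process.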
