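Problem Description
Teams connector is syncing data using Graph API: https://github.com/elastic/connectors/blob/main/connectors/sources/microsoft_teams.py#L249.
Graph API has strict throttling policies, so in most cases there is no point in implementing high concurrency when loading data from it. The Teams connector nevertheless issues concurrent requests to Graph API through a MemQueue: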
class MicrosoftTeamsDataSource(BaseDataSource):
    def __init__(self, configuration):
        self.queue = MemQueue(maxmemsize=QUEUE_MEM_SIZE, refresh_timeout=120)
        ...

    async def get_docs(self):
        ...
        async for tabs in self.client.get_user_chat_tabs(chat_id=chat["id"]):
            for tab in tabs:
                await self.queue.put(
                    (
                        self.formatter.format_doc(
                            item=tab,
                            document_type=self.schema.chat_tabs,
                            document={
                                "type": UserEndpointName.TABS.value,
                                "url": tab.get("configuration", {}).get("websiteUrl"),
                                "_timestamp": chat["lastUpdatedDateTime"],
                                "members": members,
                            },
                        ),
                        None,
                    )
                )
        ...
which:
Makes the code more complex
Makes errors much less readable, because they are handled inside the async queue rather than in regular async methods
Does not even add any parallelism: as the code snippet shows, only the result of the formatter call is put into the queue, so the data is still loaded in the regular, sequential way
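The last point can be illustrated with a small, self-contained sketch. It is not the connector's code: `asyncio.Queue` stands in for `MemQueue`, and `fetch_tabs`, `produce` and the `"FINISHED"` sentinel are hypothetical names chosen for the example. Each fetch is awaited to completion before its result is put on the queue, so the recorded events never overlap.

```python
import asyncio


async def fetch_tabs(chat_id, events):
    # Hypothetical stand-in for a Graph API call; records when it starts and ends.
    events.append(f"start {chat_id}")
    await asyncio.sleep(0.01)
    events.append(f"end {chat_id}")
    return [f"tab-of-{chat_id}"]


async def produce(queue, events):
    for chat_id in ("a", "b", "c"):
        # The fetch is fully awaited before anything is queued,
        # so the next fetch cannot start until this one has finished.
        tabs = await fetch_tabs(chat_id, events)
        for tab in tabs:
            await queue.put(({"_id": tab}, None))
    await queue.put("FINISHED")  # hypothetical end-of-stream sentinel


async def run_demo():
    events = []
    queue = asyncio.Queue()  # assumption: plain asyncio.Queue instead of MemQueue
    producer = asyncio.create_task(produce(queue, events))
    docs = []
    while True:
        item = await queue.get()
        if item == "FINISHED":
            break
        docs.append(item[0])
    await producer
    return events, docs


events, docs = asyncio.run(run_demo())
print(events)
```

Under these assumptions the events come out strictly as start/end pairs, which is the same ordering a plain loop without a queue would produce.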
Proposed Solution
Remove the MemQueue usage from the Microsoft Teams connector.
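A minimal sketch of what this could look like, assuming `get_docs` can simply be an async generator that yields `(document, None)` tuples directly. `FakeTeamsClient` and `format_tab` are hypothetical stand-ins for the connector's client and formatter, simplified for illustration, and the real method would be a member of the data source class.

```python
import asyncio


class FakeTeamsClient:
    """Hypothetical stand-in for the connector's Graph API client."""

    async def get_user_chat_tabs(self, chat_id):
        # Yields one page (a list of tabs), as a paginated API call might.
        yield [
            {"id": f"{chat_id}-tab-1", "configuration": {"websiteUrl": "https://example.com"}},
            {"id": f"{chat_id}-tab-2"},  # tab without any configuration
        ]


def format_tab(tab, chat):
    # Simplified stand-in for self.formatter.format_doc(...).
    return {
        "_id": tab["id"],
        "url": tab.get("configuration", {}).get("websiteUrl"),
        "_timestamp": chat["lastUpdatedDateTime"],
    }


async def get_docs(client, chats):
    for chat in chats:
        async for tabs in client.get_user_chat_tabs(chat_id=chat["id"]):
            for tab in tabs:
                # Yield directly instead of queue.put(): an exception raised here
                # propagates to whoever iterates get_docs().
                yield format_tab(tab, chat), None


async def collect():
    chats = [{"id": "chat-1", "lastUpdatedDateTime": "2024-01-01T00:00:00Z"}]
    return [doc async for doc, _ in get_docs(FakeTeamsClient(), chats)]


docs = asyncio.run(collect())
print(docs)
```

With no queue in between, a failure in the formatter or the client is raised in the same call stack as the loop that triggered it.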
Alternatives
None
Additional Context
This issue was created after a user reported problems and the log statements from their connector turned out to be unreadable:
[FMWK][15:53:04][ERROR] Exception found for task Task-471: 'NoneType' object has no attribute 'get'
NoneType: None
Investigation suggests that this happens because the queue is potentially masking the exceptions.
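One plausible way to get log output of this shape is sketched below. This is an assumption for illustration, not something confirmed from the framework's source. When a task's exception is logged from a done callback with `exc_info=True`, no exception is being handled at that moment, so the standard `logging` module prints `NoneType: None` in place of a traceback. Passing the exception object itself keeps the traceback. The names `format_tab`, `on_done_masked` and `on_done_full` are hypothetical.

```python
import asyncio
import io
import logging

logger = logging.getLogger("queue_demo")
logger.setLevel(logging.ERROR)
logger.propagate = False


async def format_tab(tab):
    # "configuration" is present but None, so the second .get() raises AttributeError.
    return tab.get("configuration").get("websiteUrl")


def on_done_masked(task):
    # Runs outside any `except` block, so exc_info=True finds no active exception.
    if task.exception():
        logger.error(
            f"Exception found for task {task.get_name()}: {task.exception()}",
            exc_info=True,
        )


def on_done_full(task):
    exc = task.exception()
    if exc:
        # Passing the exception instance lets logging render its real traceback.
        logger.error(f"Exception found for task {task.get_name()}", exc_info=exc)


async def run_with(callback):
    task = asyncio.create_task(format_tab({"configuration": None}))
    task.add_done_callback(callback)
    await asyncio.wait([task])
    await asyncio.sleep(0)  # give the done callback a chance to run


def capture(callback):
    # Collect everything the logger emits while one failing task runs.
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        asyncio.run(run_with(callback))
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


masked_log = capture(on_done_masked)
full_log = capture(on_done_full)
print(masked_log)
print(full_log)
```

In the first capture the error message is followed by `NoneType: None`, as in the user's log. In the second the traceback points at the failing line inside `format_tab`.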