PYTHON-2390 - Retryable reads use the same implicit session#2544
PYTHON-2390 - Retryable reads use the same implicit session#2544NoahStapp merged 17 commits intomongodb:masterfrom
Conversation
test/test_retryable_reads.py
Outdated
| retryReads=True, | ||
| ) | ||
|
|
||
| set_fail_point(client, fail_command) |
There was a problem hiding this comment.
Let's use self.fail_point() here.
test/test_retryable_reads.py
Outdated
|
|
||
| set_fail_point(client, fail_command) | ||
|
|
||
| client.t.t.estimated_document_count() |
There was a problem hiding this comment.
Can we extend this test to cover a few other operations as well?
test/test_retryable_reads.py
Outdated
| if event.command_name == "count" | ||
| ] | ||
| self.assertEqual(len(lsids), 2) | ||
| self.assertEqual(lsids[0], lsids[1]) |
There was a problem hiding this comment.
Should we fix PYTHON-2391 first? Otherwise this test doesn't prove the fix works correctly since first.command is the same dict instance as second.command.
There was a problem hiding this comment.
Yes, we should merge the PR for PYTHON-2391 (#2545) first to verify this works correctly.
|
Some additional context: Our current code uses
|
|
|
||
| @leave_alive.setter | ||
| def leave_alive(self, value: bool) -> None: | ||
| self._leave_alive = value |
There was a problem hiding this comment.
None of these new apis should be public since implicit sessions are only used internally by pymongo.
| with self._client._tmp_session(self._session) as s: | ||
| if s: | ||
| s.leave_alive = True | ||
| return self._run_aggregation_cmd(session=s) |
There was a problem hiding this comment.
Wouldn't leaving the cursor alive here leak a session every time the change stream cursor is closed?
There was a problem hiding this comment.
Good catch. This shouldn't have leave_alive set.
| return self._implicit | ||
|
|
||
| @property | ||
| def attached_to_cursor(self) -> bool: |
There was a problem hiding this comment.
Do we need two attributes to track ownership?
There was a problem hiding this comment.
Can you be more specific? As in we need a second attribute to track a different axis of ownership?
There was a problem hiding this comment.
Eh I was thinking tmp_session would work via the existing implicit=True/False attribute + a new "owner" attribute but your attached_to_cursor+leave_alive implementation seems simpler.
|
Looks like there's one test failure to fix: __________________________ TestSession.test_database ___________________________
self = <test.asynchronous.test_session.TestSession testMethod=test_database>
async def asyncTearDown(self):
monitoring._SENSITIVE_COMMANDS.update(self.sensitive_commands)
await self.client.drop_database("pymongo_test")
used_lsids = self.initial_lsids.copy()
for event in self.session_checker_listener.started_events:
if "lsid" in event.command:
used_lsids.add(event.command["lsid"]["id"])
current_lsids = {s["id"] for s in session_ids(self.client)}
> self.assertLessEqual(used_lsids, current_lsids)
E AssertionError: {Binary(b'\x92\xf1~\xfa\\\xbdI\xed\x96m\xba\x12\xdd\xba\x92\xbf', 4), Binary(b'\xbd\x86\xc1\xbc\xd35L\xb6\x80"\xbc\xb1D\x0f:a', 4)} not less than or equal to {Binary(b'\xbd\x86\xc1\xbc\xd35L\xb6\x80"\xbc\xb1D\x0f:a', 4)}
test/asynchronous/test_session.py:116: AssertionError |
| @property | ||
| def _is_implicit(self) -> bool: | ||
| """Whether this session was implicitly created by the driver.""" | ||
| return self._implicit |
There was a problem hiding this comment.
This is personal preference but do we really need these @property helpers? Usually we just access the private attribute directly, eg:
if session._implicit:...
if session._attached_to_cursor:...
This way there's less indirection and boilerplate code.
There was a problem hiding this comment.
For internal attributes it makes more sense to not have @property, agreed.
But only on PyPy 🫠 |
The failure was caused by PyPy garbage collecting differently than CPython: CPython GC'd the I've fixed this by explicitly closing the cursor rather than relying on garbage collection. Should we make doing so a standard pattern in the codebase? |
|
Good find. The bug sounds a little off. We should always be ending the session once the cursor is fully iterated. And |
Currently, we only end sessions associated with cursors in GC or on explicit closure. The fix here solves the issue of PyPy's GC behaving differently than expected. Do we want cursor-associated sessions to be either closed or untagged as soon as the cursor is no longer |
|
I believe this PR has introduced that case as a regression. Cursor.close() is always called after receiving the final batch which will return the session: mongo-python-driver/pymongo/synchronous/command_cursor.py Lines 290 to 291 in bbb6f88 |
ShaneHarvey
left a comment
There was a problem hiding this comment.
One comment otherwise LGTM
pymongo/asynchronous/client_bulk.py
Outdated
| session._start_retryable_write() | ||
| self.started_retryable_write = True | ||
| session._apply_to(cmd, retryable, ReadPreference.PRIMARY, conn) | ||
| session._leave_alive = True |
There was a problem hiding this comment.
Would it make more sense to move this to _process_results_cursor where we actually create a cursor? It seems out of place here.
ShaneHarvey
left a comment
There was a problem hiding this comment.
LGTM! Glad to see this bug finally fixed after 6 years.
No description provided.