
difflib.GestaltSequenceMatcher implementation #144132

@dg-pb

Description


Feature or enhancement

Proposal:

Increase the quality of results by making it possible to run Longest Common Subsequence matching with autojunk=False with little to no degradation in performance.

The implemented GestaltSequenceMatcher gives, by default, results equivalent to SequenceMatcher(autojunk=False).

| case | sizes | Myers | difflib(aj=1) | difflib(aj=0) | New(aj=0) | New+balancing=2/3 |
|---|---|---|---|---|---|---|
| [L] diff_key | (15, 15) | 8 | 5 (-3) | 5 (-3) | 5 (-3) | 8 (0) |
| [L] diff_mid1 | (28, 27) | 11 | 11 (0) | 11 (0) | 11 (0) | 11 (0) |
| [L] diff_mid2 | (128, 128) | 73 | 72 (-1) | 72 (-1) | 72 (-1) | 72 (-1) |
| [L] diff_mid3 | (131, 131) | 63 | 50 (-13) | 50 (-13) | 50 (-13) | 57 (-6) |
| [L] difflib.py | (2041, 2334) | 1883 | 1878 (-5) | 1879 (-4) | 1879 (-4) | 1879 (-4) |
| [L] linux_sched.c | (11373, 11832) | 10844 | 10793 (-51) | 10837 (-7) | 10837 (-7) | 10844 (0) |
| [L] react.js | (3555, 3337) | 2631 | 2568 (-63) | 2581 (-50) | 2581 (-50) | 2581 (-50) |
| [L] three.js | (36282, 36793) | 30688 | 25877 (-4811) | 30588 (-100) | 30588 (-100) | 30656 (-32) |
| [L] webpack.js | (2327, 3429) | 929 | 772 (-157) | 899 (-30) | 899 (-30) | 916 (-13) |
| Total | | 47130 | 42026 (-5104) | 46922 (-208) | 46922 (-208) | 47024 (-106) |
| Avg % diff | | 0% | 10.58% | 7.25% | 7.25% | 1.61% |
| Total % diff | | 0% | 10.83% | 0.44% | 0.44% | 0.22% |
| Runtime (s) | | 32.451 | 0.694 | 41.517 | 0.292 | 0.362 |

The table above shows the number of individual match points, which is not a definitive metric. A more definitive metric would be the number of alignment points sacrificed for the sake of quality.

However, it can be safely assumed that autojunk=False / the new matcher produces higher-quality diffs by not resorting to approximations. Conveniently, it also increases the number of match points.
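To illustrate what is being counted, the match points for a pair of sequences can be read off SequenceMatcher directly. A sketch with made-up token sequences, not the benchmark inputs above:

```python
import difflib

a = "private Thread currentThread;".split()
b = "private volatile Thread currentThread;".split()

sm = difflib.SequenceMatcher(None, a, b, autojunk=False)
# Sum of matching-block sizes = total number of matched elements
# (the final block is a zero-length sentinel, so it adds nothing).
match_points = sum(size for _, _, size in sm.get_matching_blocks())
print(match_points)  # 3
```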

A new optional balancing argument can be turned on to reduce the chance of greedily committing to unbalanced matches. It produces more concise diffs with more lines matched, while retaining the block-oriented nature.

For visual inspection, the outputs of the different algorithms for a tailored case can be seen HERE!

Legacy

The O(n) solution is not that complex: roughly 100 lines of Python.

I hacked in an initial implementation and ran context_diff on difflib.py itself (i.e. on the change in question).

```python
import difflib, time

with open('/Users/Edu/lib/src/cx_amb/src/cx/cli/snd/diff/difflib.py') as f:
    s1 = f.read().splitlines(keepends=True)

with open('/Users/Edu/lib/src/cx_amb/src/cx/cli/snd/diff/difflib_new.py') as f:
    s2 = f.read().splitlines(keepends=True)

t0 = time.perf_counter()
list(difflib.context_diff(s1, s2, fromfile='before.py', tofile='after.py'))
print(f"{(time.perf_counter() - t0) * 1000:.1f} ms")
# 2.4 ms (Current)
# 2.9 ms (New)
```

So about 20% slower here (2.9 ms vs 2.4 ms).
Also, this is a fairly rough plug-in - there is a chance that the gap can be bridged.
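For anyone wanting to reproduce a comparable measurement without the local files above, here is a self-contained timing sketch on synthetic input (stdlib only; the file paths and the difflib_new module referenced earlier are the author's local setup and are not used here):

```python
import difflib
import random
import time

# Build two similar line sequences with a block of insertions.
random.seed(0)
lines = [f"line {random.randrange(50)}\n" for _ in range(500)]
s1 = lines
s2 = lines[:200] + ["inserted line\n"] * 10 + lines[250:]

t0 = time.perf_counter()
diff = list(difflib.context_diff(s1, s2, fromfile='before.py', tofile='after.py'))
elapsed = time.perf_counter() - t0

print(f"{len(diff)} diff lines in {elapsed * 1000:.1f} ms")
```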

An O(n) solution should avoid some of the user dissatisfaction.

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

I made a couple of posts in an unrelated thread. Performance tables there show that this is faster even for input sizes of 10.
https://discuss.python.org/t/include-time-and-space-complexity-in-documentation/105683/32

Linked PRs

Metadata


    Labels

    performance: Performance or resource usage
    stdlib: Standard Library Python modules in the Lib/ directory
    type-feature: A feature request or enhancement
