You've got the first case right. In the second case, only one 'a' from 'aabc' matches, so M = 1. In the third example, both 'a' characters match, so M = 2.
[P.S.: you're referring to the ancient Python 2.4 source code. The current source code is at hg.python.org.]
Answer from Fred Foo on Stack Overflow
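To see where M comes from, here is a minimal sketch that sums the sizes of the matching blocks and compares 2*M/T with ratio(). The two strings are hypothetical inputs picked for illustration; they are not the ones from the original question.

```python
from difflib import SequenceMatcher

# Hypothetical inputs chosen for illustration, not taken from the question.
a, b = "abcd", "bcde"

sm = SequenceMatcher(None, a, b)

# M is the total size of all matching blocks; the last block returned is a
# zero-length sentinel, so it adds nothing to the sum.
M = sum(block.size for block in sm.get_matching_blocks())

# T is the combined length of both sequences.
T = len(a) + len(b)

print(M, T, 2.0 * M / T, sm.ratio())
```

Here the only matching block is 'bcd', so M = 3 and T = 8.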
never too late...
from difflib import SequenceMatcher

texto1 = 'BRASILIA~DISTRITO FEDERAL, DF'
texto2 = 'BRASILIA-DISTRITO FEDERAL, '
tamanho_texto1 = len(texto1)
tamanho_texto2 = len(texto2)
tamanho_tot = tamanho_texto1 + tamanho_texto2
tot = 0
# Walk the shorter string and count its characters that occur anywhere in the other one.
if tamanho_texto1 <= tamanho_texto2:
    for x in range(len(texto1)):
        y = texto1[x]
        if y in texto2:
            tot += 1
else:
    for x in range(len(texto2)):
        y = texto2[x]
        if y in texto1:
            tot += 1
# ratio() is 2*M/T, where M is the number of matches and T the total length of both strings.
print('sequenceM = ',SequenceMatcher(None, texto1, texto2).ratio())
print('Total calculado = ',2*tot/tamanho_tot)
sequenceM = 0.9285714285714286
Total calculado = 0.9285714285714286
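One caveat, sketched here under my own assumptions: counting character membership ignores order, whereas SequenceMatcher counts only characters inside matching blocks, so the two numbers agree for the strings above but need not agree in general. The helper `membership_ratio` and the test strings are hypothetical and only loosely mirror the loop above.

```python
from difflib import SequenceMatcher


def membership_ratio(s1, s2):
    """Hypothetical helper: 2 * (chars of the shorter string found in the longer) / total length."""
    shorter, longer = sorted((s1, s2), key=len)
    hits = sum(1 for ch in shorter if ch in longer)
    return 2.0 * hits / (len(s1) + len(s2))


# Same characters, reversed order (illustrative inputs).
x, y = "abc", "cba"

naive = membership_ratio(x, y)                # every character is "found"
actual = SequenceMatcher(None, x, y).ratio()  # only one character lines up in order

print(naive, actual)
```

For these inputs the membership count reports a perfect score while ratio() does not, so the shortcut is best treated as a coincidence check rather than a replacement.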
You forgot the first parameter to SequenceMatcher.
>>> import difflib
>>>
>>> a='abcd'
>>> b='ab123'
>>> seq=difflib.SequenceMatcher(None, a,b)
>>> d=seq.ratio()*100
>>> print(d)
44.44444444444444
http://docs.python.org/library/difflib.html
From the docs:
The SequenceMatcher class has this constructor:
class difflib.SequenceMatcher(isjunk=None, a='', b='', autojunk=True)
The problem in your code is that by doing
seq=difflib.SequenceMatcher(a,b)
you are passing a as value for isjunk and b as value for a, leaving the default '' value for b. This results in a ratio of 0.0.
One way to overcome this (already mentioned by Lennart) is to explicitly pass None as the first argument, so that a and b are assigned to the correct parameters.
However, I wanted to mention another solution I just found: it leaves the isjunk argument alone and uses the set_seqs() method to specify the sequences.
>>> import difflib
>>> a = 'abcd'
>>> b = 'ab123'
>>> seq = difflib.SequenceMatcher()
>>> seq.set_seqs(a.lower(), b.lower())
>>> d = seq.ratio()*100
>>> print(d)
44.44444444444444
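The argument mix-up described above can be reproduced directly, and passing the sequences by keyword is a third way to avoid it, given the constructor signature quoted from the docs. This is a small sketch using the same two strings:

```python
import difflib

a, b = "abcd", "ab123"

# Positional mix-up: a lands in isjunk, b becomes the first sequence,
# and the second sequence keeps its default ''.
wrong = difflib.SequenceMatcher(a, b).ratio()

# Keyword arguments bind the sequences explicitly and leave isjunk as None.
right = difflib.SequenceMatcher(a=a, b=b).ratio()

print(wrong, right)
```

The first call compares 'ab123' against an empty string, which is why it yields 0.0.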