lib/utils.py
author fabien
Fri, 10 Oct 2003 11:38:20 -0400
branchxbelweb
changeset 41 7b7cd0e74ee3
parent 33 db91081e5a78
permissions -rw-r--r--
[svn r1585] Restrict keywords count to selection.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
17
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
     1
#!/usr/bin/python
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
     2
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
     3
def unique(s):
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
     4
    """Return a list of the elements in s, but without duplicates.
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
     5
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
     6
    For example, unique([1,2,3,1,2,3]) is some permutation of [1,2,3],
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
     7
    unique("abcabc") some permutation of ["a", "b", "c"], and
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
     8
    unique(([1, 2], [2, 3], [1, 2])) some permutation of
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
     9
    [[2, 3], [1, 2]].
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    10
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    11
    For best speed, all sequence elements should be hashable.  Then
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    12
    unique() will usually work in linear time.
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    13
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    14
    If not possible, the sequence elements should enjoy a total
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    15
    ordering, and if list(s).sort() doesn't raise TypeError it's
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    16
    assumed that they do enjoy a total ordering.  Then unique() will
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    17
    usually work in O(N*log2(N)) time.
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    18
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    19
    If that's not possible either, the sequence elements must support
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    20
    equality-testing.  Then unique() will usually work in quadratic
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    21
    time.
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    22
    """
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    23
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    24
    n = len(s)
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    25
    if n == 0:
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    26
        return []
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    27
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    28
    # Try using a dict first, as that's the fastest and will usually
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    29
    # work.  If it doesn't work, it will usually fail quickly, so it
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    30
    # usually doesn't cost much to *try* it.  It requires that all the
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    31
    # sequence elements be hashable, and support equality comparison.
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    32
    u = {}
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    33
    try:
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    34
        for x in s:
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    35
            u[x] = 1
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    36
    except TypeError:
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    37
        del u  # move on to the next method
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    38
    else:
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    39
        return u.keys()
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    40
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    41
    # We can't hash all the elements.  Second fastest is to sort,
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    42
    # which brings the equal elements together; then duplicates are
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    43
    # easy to weed out in a single pass.
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    44
    # NOTE:  Python's list.sort() was designed to be efficient in the
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    45
    # presence of many duplicate elements.  This isn't true of all
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    46
    # sort functions in all languages or libraries, so this approach
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    47
    # is more effective in Python than it may be elsewhere.
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    48
    try:
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    49
        t = list(s)
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    50
        t.sort()
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    51
    except TypeError:
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    52
        del t  # move on to the next method
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    53
    else:
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    54
        assert n > 0
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    55
        last = t[0]
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    56
        lasti = i = 1
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    57
        while i < n:
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    58
            if t[i] != last:
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    59
                t[lasti] = last = t[i]
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    60
                lasti += 1
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    61
            i += 1
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    62
        return t[:lasti]
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    63
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    64
    # Brute force is all that's left.
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    65
    u = []
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    66
    for x in s:
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    67
        if x not in u:
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    68
            u.append(x)
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    69
    return u
14bec94bbe89 [svn r1551] Add delete bookmarks functionality as well as many reusability issues.
fabien
parents:
diff changeset
    70
33
db91081e5a78 [svn r1577] Various change to the new interface.
fabien
parents: 17
diff changeset
    71