update-hib.py
author Fabien Ninoles <fabien@tzone.org>
Thu, 28 Aug 2014 07:31:43 -0400
changeset 14 f7112a0f9df7
parent 13 7567c5e4db45
child 15 053eabfead09
permissions -rwxr-xr-x
Add pausing in case of errors.
#!/usr/bin/python3
#
# Update HIB - Scraper for the HumbleBundle library page.
# Copyright (C) 2012, Fabien Ninoles <- fabien - AT - tzone . org ->
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

import bs4
from pprint import pprint
from itertools import chain, groupby
import logging
import operator

class Download:
    # Maps raw page tokens to normalized architecture/package attributes.
    subst = { "arc32"         : ("x86",),
              "arc64"         : ("x64",),
              "i386.deb"      : ("x86","deb"),
              "x86_64.deb"    : ("x64", "deb"),
              "i686.rpm"      : ("x86", "rpm"),
              ".i386.rpm"     : ("x86", "rpm"),
              "x86_64.rpm"    : ("x64", "rpm"),
              ".x86_64.rpm"   : ("x64", "rpm"),
              "i386.tar.gz"   : ("x86", "tgz"),
              "x86_64.tar.gz" : ("x64", "tgz"),
              ".tar.gz"       : ("tgz",),
              ".deb"          : ("deb",),
              ".rpm"          : ("rpm",),
              "32-bit"        : ("x86",),
              "64-bit"        : ("x64",),
              "(HD)"          : ("HD",),
              "(MP3)"         : ("MP3",),
              }
    def __init__(self, dltype, soup):
        self.dltype = dltype
        # The id combines the element's CSS classes with the button label words.
        ids = [attr for attr in soup["class"] if attr != "download"]
        button = soup.find(class_="flexbtn")
        desc = button.span.string
        ids.extend(desc.split(" "))
        self.id = " ".join(ids)
        def cleanup(attr):
            attr = attr.strip()
            if attr not in ("Download","small",""):
                for s in self.subst.get(attr,(attr,)):
                    yield s
        self.attrs = set(chain.from_iterable(cleanup(attr) for attr in ids))
        urls = button.a.attrs
        logging.debug("URLS are %r", urls)
        # Not every download has a torrent link; None marks its absence.
        self.torrent = urls.get("data-bt")
        self.web = urls["data-web"]
        details = soup.find(class_="dldetails").find(class_="dlsize")
        size = details.find(class_="mbs")
        md5 = details.find(class_="dlmd5")
        date = details.find(class_="dldate")
        self.size = size.string if size else "Unknown"
        self.md5 = md5.string if md5 else "Unknown"
        self.date = date.string if date else "Unknown"
    def format(self, prefix=""):
        res = prefix + '<download type="' + self.dltype + '" id="' + self.id + '">\n'
        res += prefix + "  <web>" + self.web + "</web>\n"
        # self.torrent may be None, which cannot be concatenated to a str.
        res += prefix + "  <torrent>" + (self.torrent or "") + "</torrent>\n"
        res += prefix + "  <size>" + self.size + "</size>\n"
        res += prefix + "  <md5>" + self.md5 + "</md5>\n"
        res += prefix + "  <date>" + self.date + "</date>\n"
        res += prefix + "</download>"
        return res
    def __repr__(self):
        return self.format()
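
The token normalization done by `cleanup` inside the `Download` constructor can be tried out without any HTML. The following is a minimal standalone sketch, not part of the script: `SUBST` is a shortened copy of `Download.subst`, and the names `IGNORED` and `normalize` are hypothetical helpers introduced only for this illustration.

```python
from itertools import chain

# Shortened, illustrative copy of the Download.subst table.
SUBST = {
    "i386.deb": ("x86", "deb"),
    "x86_64.deb": ("x64", "deb"),
    ".tar.gz": ("tgz",),
    "32-bit": ("x86",),
    "64-bit": ("x64",),
}

# Tokens that say nothing about the download variant (hypothetical name).
IGNORED = ("Download", "small", "")


def cleanup(token, subst=SUBST):
    """Yield the normalized attribute(s) for one raw token."""
    token = token.strip()
    if token not in IGNORED:
        # Unknown tokens pass through unchanged as a one-element tuple.
        yield from subst.get(token, (token,))


def normalize(tokens, subst=SUBST):
    """Collapse CSS class names and button label words into an attribute set."""
    return set(chain.from_iterable(cleanup(t, subst) for t in tokens))


# Raw tokens as they might appear for a 64-bit Debian package button.
print(sorted(normalize(["linux", "Download", "x86_64.deb", " small "])))
# prints ['deb', 'linux', 'x64']
```

Because the result is a set, a token such as `x86_64.deb` contributes both an architecture and a package format, and later code can test membership (`"deb" in attrs`) without caring where an attribute came from.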

class Downloads:
    def __init__(self, soup):
        self.id = [class_ for class_ in soup["class"] if class_ not in ("downloads","js-platform")][0]
        self.elements = []
        self.others = []
        self.addchilds(soup)
    def addchilds(self, soup):
        logging.debug("Parsing soup for downloads %s", self.id)
        for child in soup.children:
            # Skip text nodes and comments; only tags carry downloads.
            if not isinstance(child, bs4.element.Tag):
                continue
            classes = child.get("class", [])
            if any(attr in ("arc-toggle", "downloads") for attr in classes):
                # Wrapper element: recurse into it.
                self.addchilds(child)
            elif "download-buttons" in classes:
                for subchild in child.children:
                    if not isinstance(subchild, bs4.element.Tag):
                        continue
                    btn = subchild.find(class_="flexbtn")
                    if not btn:
                        continue
                    desc = btn.span.string
                    if desc == "Stream":
                        logging.info("Ignoring Stream URLs for %s", self.id)
                    else:
                        self.elements.append(Download(self.id, subchild))
            elif any(attr in ("clearfix", "label") for attr in classes):
                pass
            else:
                self.others.append(child)
    def __iter__(self):
        return iter(self.elements)
    def format(self, prefix = ""):
        res = prefix + '<downloads id="' + self.id + '">\n'
        if self.elements:
            for el in self.elements:
                res += el.format(prefix + "  ") + "\n"
        if self.others:
            res += prefix + "  <others>\n"
            for o in self.others:
                # Entries are raw bs4 tags, which have no format() method.
                res += prefix + "    " + str(o) + "\n"
            res += prefix + "  </others>\n"
        res += prefix + "</downloads>"
        return res
    def __repr__(self):
        return self.format()
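
The `format` methods above build XML-like text by plain string concatenation, so a URL containing `&` or an id containing `"` would end up unescaped in the output. The following is a minimal sketch of one way to guard against that with the standard library, assuming well-formed output is wanted; `format_download` is a hypothetical free function written for this illustration, not the script's own method, and the URL is a placeholder.

```python
from xml.sax.saxutils import escape, quoteattr


def format_download(dltype, ident, web, torrent=None, prefix=""):
    """Render one <download> element with escaped text and attribute values."""
    lines = [
        # quoteattr() escapes the value and adds the surrounding quotes.
        prefix + "<download type=%s id=%s>" % (quoteattr(dltype), quoteattr(ident)),
        # escape() replaces &, < and > in element text.
        prefix + "  <web>" + escape(web) + "</web>",
        # A missing torrent link becomes an empty element.
        prefix + "  <torrent>" + escape(torrent or "") + "</torrent>",
        prefix + "</download>",
    ]
    return "\n".join(lines)


# Placeholder URL with a query string, only to show the escaping.
print(format_download("linux", "x64 deb", "http://example.invalid/f?a=1&b=2"))
```

Whether escaping is worthwhile depends on how the output is consumed: for debug dumps read by a person the raw text may be preferable, while anything fed to an XML parser needs it.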

class Game:
    def __init__(self, soup):
        self.title = "unknown"
        self.downloads = []
        self.others = []
        for child in soup.children:
            # Skip text nodes and comments.
            if not isinstance(child, bs4.element.Tag):
                continue
            classes = child.get("class", [])
            if "gameinfo" in classes:
                # The title is usually wrapped in a link, but not always.
                divTitle = child.find(class_="title")
                if divTitle.a:
                    divTitle = divTitle.a
                self.title = divTitle.string.strip()
            elif "downloads" in classes:
                logging.debug("Collecting downloadables for %s", self.title)
                self.downloads.append(Downloads(child))
            elif any(attr in ("icn", "clearfix") for attr in classes):
                pass
            else:
                self.others.append(child)
    def __repr__(self):
        res  = "<game>\n"
        res += "  <title>" + self.title + "</title>\n"
        if self.downloads:
            res += "  <downloads>\n"
            for dl in self.downloads:
                res += dl.format("    ") + "\n"
            res += "  </downloads>\n"
        if self.others:
            res += "  <others>\n"
            for o in self.others:
                # Entries are raw bs4 tags, which have no format() method.
                res += "    " + str(o) + "\n"
            res += "  </others>\n"
        res += "</game>"
        return res

def parseGamesFromSoup(soup):
    for row in soup.find_all(class_="row"):
        yield Game(row)

def parseGamesFromFile(filename):
    # Parse the whole page inside the with block so the file gets closed.
    with open(filename) as page:
        soup = bs4.BeautifulSoup(page)
    yield from parseGamesFromSoup(soup)
0
1e76c59aa3a6 Initial version: parse tidy file and select a suitable download url.
Fabien Ninoles <fabien@tzone.org>
parents:
diff changeset
   171
8
98065a298da0 Update to handle audio type as "standard" (always download).
Fabien Ninoles <fabien@tzone.org>
parents: 7
diff changeset
   172
class FileSelector:
1
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   173
    def scoreDownload(self, dl):
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   174
        if dl.dltype == "audio":
8
98065a298da0 Update to handle audio type as "standard" (always download).
Fabien Ninoles <fabien@tzone.org>
parents: 7
diff changeset
   175
            if not dl.attrs: # Empty set, so we simply take it.
98065a298da0 Update to handle audio type as "standard" (always download).
Fabien Ninoles <fabien@tzone.org>
parents: 7
diff changeset
   176
                return 1
1
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   177
            if "FLAC" in dl.attrs:
4
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   178
                return 1
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   179
            if "OGG" in dl.attrs:
8
98065a298da0 Update to handle audio type as "standard" (always download).
Fabien Ninoles <fabien@tzone.org>
parents: 7
diff changeset
   180
                return 1
1
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   181
            if "MP3" in dl.attrs:
8
98065a298da0 Update to handle audio type as "standard" (always download).
Fabien Ninoles <fabien@tzone.org>
parents: 7
diff changeset
   182
                return 1
1
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   183
            if "website" in dl.attrs:
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   184
                return -1
10
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   185
            if "AAC" in dl.attrs:
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   186
                return 1
8
98065a298da0 Update to handle audio type as "standard" (always download).
Fabien Ninoles <fabien@tzone.org>
parents: 7
diff changeset
   187
            raise Exception("Unknown audio type: %r" % (dl.attrs,))
1
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   188
        if dl.dltype in ("mac","windows"):
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   189
            return -1
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   190
        if dl.dltype == "linux":
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   191
            score = 1
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   192
            if "x64" in dl.attrs:
4
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   193
                score += 2
1
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   194
            if "deb" in dl.attrs:
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   195
                score += 1
8
98065a298da0 Update to handle audio type as "standard" (always download).
Fabien Ninoles <fabien@tzone.org>
parents: 7
diff changeset
   196
            if "Stream" in dl.attrs:
98065a298da0 Update to handle audio type as "standard" (always download).
Fabien Ninoles <fabien@tzone.org>
parents: 7
diff changeset
   197
                score -= 1
1
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   198
            return score
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   199
        if dl.dltype == "android":
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   200
            return -1
5
b6a3b0987bfc Add ebook support.
Fabien Ninoles <fabien@tzone.org>
parents: 4
diff changeset
   201
        if dl.dltype == "ebook":
b6a3b0987bfc Add ebook support.
Fabien Ninoles <fabien@tzone.org>
parents: 4
diff changeset
   202
            if "MOBI" in dl.attrs:
b6a3b0987bfc Add ebook support.
Fabien Ninoles <fabien@tzone.org>
parents: 4
diff changeset
   203
                return -1
b6a3b0987bfc Add ebook support.
Fabien Ninoles <fabien@tzone.org>
parents: 4
diff changeset
   204
            return 1
1
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   205
        raise Exception("Unknown dls type: %r" % (dl,))
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   206
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   207
    def chooseDownloads(self, dls):
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   208
        return sorted(((self.scoreDownload(dl),dl) for dl in dls), key=lambda x: x[0], reverse=True)
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   209
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   210
    def __call__(self, dls):
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   211
        return self.chooseDownloads(dls)
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   212
4
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   213
def selectHighestScore(scores):
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   214
    if scores:
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   215
        get_first = operator.itemgetter(0)
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   216
        score, dls = next(groupby(sorted(scores, key = get_first, reverse=True), get_first))
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   217
        if score > 0:
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   218
            return [dl for s, dl in dls]
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   219
        else:
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   220
            return []
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   221
    logging.debug("Empty scores list: %r", scores)
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   222
    return []
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   223
10
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   224
class tee:
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   225
    def __init__(self, main, *other):
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   226
        self.main = main
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   227
        self.other = other
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   228
    def write(self, s):
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   229
        self.main.write(s)
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   230
        for o in self.other:
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   231
            o.write(s)
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   232
13
7567c5e4db45 Add a "cache-dir" option to the script.
Fabien Ninoles <fabien@tzone.org>
parents: 12
diff changeset
   233
def main(fn, cachedir):
2
3675dd7daf59 Take filename from command line arguments, add copyright and better readme text.
Fabien Ninoles <fabien@tzone.org>
parents: 1
diff changeset
   234
    selector = FileSelector()
3675dd7daf59 Take filename from command line arguments, add copyright and better readme text.
Fabien Ninoles <fabien@tzone.org>
parents: 1
diff changeset
   235
    downloads = []
10
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   236
    import sys
11
dc1b075c538a Update for new js-platform tag.
Fabien Ninoles <fabien@tzone.org>
parents: 10
diff changeset
   237
    import os
dc1b075c538a Update for new js-platform tag.
Fabien Ninoles <fabien@tzone.org>
parents: 10
diff changeset
   238
    import urllib.parse
10
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   239
    with open("torrents.log", "w") as l:
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   240
        for game in parseGamesFromFile(fn):
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   241
            logging.info("Parsing game %s (%d downloads)", game.title, len(game.downloads))
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   242
            for dls in game.downloads:
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   243
                scores = list(selector(dls))
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   244
                choosen = selectHighestScore(scores)
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   245
                for score, dl in scores:
11
dc1b075c538a Update for new js-platform tag.
Fabien Ninoles <fabien@tzone.org>
parents: 10
diff changeset
   246
                    print("[%s] %2d | %-30s | %-15s | %-30s | %-15s | %s <%s>" % (
10
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   247
                            "*" if dl in choosen else " ",
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   248
                            score,
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   249
                            game.title,
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   250
                            dls.id,
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   251
                            dl.date,
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   252
                            ", ".join(sorted(dl.attrs)),
11
dc1b075c538a Update for new js-platform tag.
Fabien Ninoles <fabien@tzone.org>
parents: 10
diff changeset
   253
                            os.path.basename(urllib.parse.urlsplit(dl.torrent).path),
10
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   254
                            dl.torrent),
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   255
                          file=l)
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   256
                    if dl in choosen:
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   257
                        downloads.append(dl)
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   258
                if not scores:
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   259
                    print("No download for %s" % (dls.id,), file=l)
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   260
                print("-" * 80, file=l)
1
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   261
2
3675dd7daf59 Take filename from command line arguments, add copyright and better readme text.
Fabien Ninoles <fabien@tzone.org>
parents: 1
diff changeset
   262
    import urllib.request
6
0c6d2ed2cd7c Add download script production for http only links.
Fabien Ninoles <fabien@tzone.org>
parents: 5
diff changeset
   263
    urlfile = open('http-download.sh','w')
2
3675dd7daf59 Take filename from command line arguments, add copyright and better readme text.
Fabien Ninoles <fabien@tzone.org>
parents: 1
diff changeset
   264
    opener = urllib.request.build_opener()
13
7567c5e4db45 Add a "cache-dir" option to the script.
Fabien Ninoles <fabien@tzone.org>
parents: 12
diff changeset
   265
    cache = set(os.listdir(cachedir))
4
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   266
    for dl in downloads:
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   267
        if dl.torrent:
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   268
            try:
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   269
                fn = os.path.basename(urllib.parse.urlsplit(dl.torrent).path)
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   270
                if os.path.exists(fn):
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   271
                    logging.info("Skipping existing torrent %s", fn)
13
7567c5e4db45 Add a "cache-dir" option to the script.
Fabien Ninoles <fabien@tzone.org>
parents: 12
diff changeset
   272
                elif fn in cache:
7567c5e4db45 Add a "cache-dir" option to the script.
Fabien Ninoles <fabien@tzone.org>
parents: 12
diff changeset
   273
                    logging.info("Hard-linking %s as %s from cache", dl.torrent, fn)
7567c5e4db45 Add a "cache-dir" option to the script.
Fabien Ninoles <fabien@tzone.org>
parents: 12
diff changeset
   274
                    os.link(os.path.join(cachedir, fn), fn)
4
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   275
                else:
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   276
                    logging.info("Saving %s as %s", dl.torrent, fn)
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   277
                    with opener.open(dl.torrent) as u:
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   278
                        with open(fn,"wb") as f:
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   279
                            f.write(u.read())
10
d7e256c9aec9 Add torrents.log.
Fabien Ninoles <fabien@tzone.org>
parents: 9
diff changeset
   280
                    logging.info("%s saved.", os.path.realpath(fn))
4
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   281
            except Exception:
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   282
                logging.exception("Error with download %r", dl)
14
f7112a0f9df7 Add pausing in case of errors.
Fabien Ninoles <fabien@tzone.org>
parents: 13
diff changeset
   283
                input("Press enter to continue...")
4
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   284
        else:
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   285
            logging.info("No torrent, url is %s", dl.web)
6
0c6d2ed2cd7c Add download script production for http only links.
Fabien Ninoles <fabien@tzone.org>
parents: 5
diff changeset
   286
            fn = os.path.basename(urllib.parse.urlsplit(dl.web).path)
8
98065a298da0 Update to handle audio type as "standard" (always download).
Fabien Ninoles <fabien@tzone.org>
parents: 7
diff changeset
   287
            urlfile.write("wget --progress=bar -c -O \"%s\" \"%s\"\n" % (fn, dl.web))
    urlfile.close()  # flush the generated download script to disk
98065a298da0 Update to handle audio type as "standard" (always download).
Fabien Ninoles <fabien@tzone.org>
parents: 7
diff changeset
   288
1
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   289
fb1ab147b2dd Add downloading of torrent files.
Fabien Ninoles <fabien@tzone.org>
parents: 0
diff changeset
   290
2
3675dd7daf59 Take filename from command line arguments, add copyright and better readme text.
Fabien Ninoles <fabien@tzone.org>
parents: 1
diff changeset
   291
if __name__ == '__main__':
3675dd7daf59 Take filename from command line arguments, add copyright and better readme text.
Fabien Ninoles <fabien@tzone.org>
parents: 1
diff changeset
   292
    import sys
4
e102d2bb7a9e Update to download multiple version.
Fabien Ninoles <fabien@tzone.org>
parents: 2
diff changeset
   293
    logging.getLogger().setLevel(logging.INFO)
13
7567c5e4db45 Add a "cache-dir" option to the script.
Fabien Ninoles <fabien@tzone.org>
parents: 12
diff changeset
   294
    main(sys.argv[1], sys.argv[2])
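
The selection logic in the listing (score every download, then keep all downloads tied for the highest positive score) can be exercised on its own. The following is a minimal sketch, not part of the script: `FakeDownload`, `score_linux` and `select_highest_score` are hypothetical stand-ins, and the scoring copies only the linux branch of `FileSelector.scoreDownload`.

```python
import operator
from collections import namedtuple
from itertools import groupby

# Hypothetical stand-in for the script's download objects; only the two
# attributes used by the scoring rules are modelled.
FakeDownload = namedtuple("FakeDownload", ["dltype", "attrs"])


def score_linux(dl):
    """Mirror the linux branch of FileSelector.scoreDownload."""
    score = 1
    if "x64" in dl.attrs:
        score += 2
    if "deb" in dl.attrs:
        score += 1
    if "Stream" in dl.attrs:
        score -= 1
    return score


def select_highest_score(scores):
    """Return every download sharing the best score, if that score is > 0."""
    if not scores:
        return []
    get_first = operator.itemgetter(0)
    # Sort by score only, best first; groupby then yields the top group first.
    ranked = sorted(scores, key=get_first, reverse=True)
    best, group = next(groupby(ranked, get_first))
    return [dl for _, dl in group] if best > 0 else []


downloads = [
    FakeDownload("linux", frozenset({"x64", "deb"})),   # 1 + 2 + 1 = 4
    FakeDownload("linux", frozenset({"x64", "tgz"})),   # 1 + 2 = 3
    FakeDownload("linux", frozenset({"i386", "deb"})),  # 1 + 1 = 2
]
scores = [(score_linux(dl), dl) for dl in downloads]
chosen = select_highest_score(scores)
print([sorted(dl.attrs) for dl in chosen])
```

Because the sort key is the score alone, the download objects themselves are never compared, and ties are all returned together, which is how the script ends up fetching several versions of the same item.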