FIX: Updated search to better account for versioning, FIX/IMP: Improved KAT torrent support, FIX:(#509) github issue link, FIX: Manual Run Post-Processing should be working again, FIX: FileChecking inclusions added, as well as more parsing (Issue Titles currently not supported)

evilhero 2013-08-11 01:31:41 -04:00
parent 176de5b92a
commit 37c6decd07
7 changed files with 153 additions and 90 deletions


@ -1,6 +1,8 @@
Mylar is an automated Comic Book (cbr/cbz) downloader program heavily-based on the Headphones template and logic (which is also based on Sick-Beard).
Yes, it does work for the most part but it is the pure definition of an 'Alpha Release'.
Yes, it does work, yes there are still bugs, and for that reason I still consider it the definition of an 'Alpha Release'.
This application requires a version of the 2.7.x Python branch for the best results. 3.x is not supported.
To start it, type in 'python Mylar.py' from within the root of the mylar directory. Adding a --help option to the command will give a list of available options.
@ -10,12 +12,14 @@ Here are some helpful hints hopefully:
- Add a comic (series) using the Search button, or using the Pullist.
- Make sure you specify Comic Location as well as your SABnzbd settings in the Configuration!
(Mylar auto-creates the Comic Series directories under the Comic Location. The directory is displayed on the Comic Detail page).
- If you make any Configuration changes, shutdown Mylar and restart it or else errors will occur - this is a current bug.
- If you make any Configuration changes, shutdown Mylar and restart it or else errors will occur - this is an outstanding bug.
- You need to specify a search-provider in order to get the downloads to send to SABnzbd. If you don't have either listed, choose Experimental!
- In the Configuration section, if you enable 'Automatically Mark Upcoming Issues as Wanted' it will mark any NEW comic from the pullist that is on your 'watchlist' as wanted.
- There are times when adding a comic it will fail with an 'Error', submit a bug and it will be checked out (usually an easy fix).
- For the most up-to-date build, use the Development build. Master doesn't get updated as frequently (> month), and Development is usually fairly stable.
The Mylar Forums are now online: http://forum.mylarcomics.com (it's new - don't be scared to post)
The 2 features below are extremely new and users may encounter some problems until bugs are squashed.
Please submit issues via git for any outstanding problems that need attention.
Post-Processing
@ -29,10 +33,10 @@ Post-Processing
Renaming
- You can now specify Folder / File Formats.
- Folder Format - if left blank, it will default to just using the default Comic Directory (and creating subdirectories beneath in the format of ComicName-Year)
- Folder Format - if left blank, it will default to just using the default Comic Directory [ and creating subdirectories beneath in the format of ComicName-(Year) ]
You can do multi-levels as well - so you could do $Publisher/$Series/$Year to have it setup like DC Comics/Batman/2011 (as an example)
- File Format - if left blank, Mylar will use the original file and not rename at all. This includes replacing spaces, and zero suppression (both renaming features).
- Folder Format IS used on every Add Series / Refresh Series request. Enabling Renaming has no bearing on this, so make sure if you're not using the default, that it's what you want.
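As a rough illustration of the Folder Format behaviour described in the Renaming notes, the token expansion could be sketched as below. This is not Mylar's actual code: `expand_folder_format`, its arguments, and the token dictionary are hypothetical names used only for the example.

```python
import os

def expand_folder_format(folder_format, values, comic_location):
    """Hypothetical helper: build a series directory from a Folder Format string.

    values maps tokens such as "$Publisher", "$Series" and "$Year" to strings.
    """
    if not folder_format:
        # Blank format: fall back to ComicName-(Year) under the Comic Location.
        default_dir = "%s-(%s)" % (values["$Series"], values["$Year"])
        return os.path.join(comic_location, default_dir)
    path = folder_format
    # Replace longer tokens first so one token cannot clobber part of another.
    for token in sorted(values, key=len, reverse=True):
        path = path.replace(token, str(values[token]))
    # Each '/' in the format starts a new directory level.
    return os.path.join(comic_location, *path.split("/"))
```

Called with `"$Publisher/$Series/$Year"` and the values DC Comics, Batman and 2011, this gives the multi-level layout from the README example under the configured Comic Location.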
Please help make it better, by sending in your bug reports / enhancement requests or just say what's working for you.
@ -52,3 +56,6 @@ The Pull page ...
The Config screen ...
![preview thumb](http://i.imgur.com/nQjIN.png)
reddit nick is evil-hero


@ -74,7 +74,7 @@
</div>
<div class="row">
<label><strong>Issues</strong></label>
<div><a href="https://github.com/evilhero/mylar/">https://github.com/evilhero/mylar/issues/</a></div>
<div><a href="https://github.com/evilhero/mylar/issues">https://github.com/evilhero/mylar/issues/</a></div>
</div>
<div class="row">
<label><strong>Internet Relay Chat</strong></label>


@ -196,10 +196,11 @@ class PostProcessor(object):
else:
fn = 0
fccnt = int(watchmatch['comiccount'])
if len(watchmatch) == 1: continue
while (fn < fccnt):
try:
tmpfc = watchmatch['comiclist'][fn]
except IndexError:
except (IndexError, KeyError):
break
temploc= tmpfc['JusttheDigits'].replace('_', ' ')
temploc = re.sub('[\#\']', '', temploc)
@ -422,9 +423,6 @@ class PostProcessor(object):
return self.log
comicid = issuenzb['ComicID']
issuenumOG = issuenzb['Issue_Number']
if self.nzb_name == 'Manual Run':
#loop through the hits here.
if len(manual_list) == 0:
@ -437,6 +435,8 @@ class PostProcessor(object):
issuenumOG = ml['IssueNumber']
self.Process_next(comicid,issueid,issuenumOG,ml)
else:
comicid = issuenzb['ComicID']
issuenumOG = issuenzb['Issue_Number']
return self.Process_next(comicid,issueid,issuenumOG)
def Process_next(self,comicid,issueid,issuenumOG,ml=None):
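The `comiclist` lookup in the first hunk guards against both a missing index and a missing key. In Python 2, writing two exception names separated by a comma catches only the first and binds the exception instance to the second name, so catching both needs a parenthesised tuple. A minimal sketch, with `safe_lookup` as a hypothetical helper name:

```python
def safe_lookup(watchmatch, fn):
    """Return watchmatch['comiclist'][fn], or None if the key or index is missing."""
    try:
        return watchmatch['comiclist'][fn]
    except (IndexError, KeyError):
        # The tuple form catches either exception type (valid in Python 2 and 3).
        return None
```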


@ -27,7 +27,9 @@ def file2comicmatch(watchmatch):
#print ("match: " + str(watchmatch))
pass
def listFiles(dir,watchcomic,AlternateSearch=None):
def listFiles(dir,watchcomic,AlternateSearch=None,manual=None):
manual = "yes"
# use AlternateSearch to check for filenames that follow that naming pattern
# ie. Star Trek TNG Doctor Who Assimilation won't get hits as the
# checker looks for Star Trek TNG Doctor Who Assimilation2 (according to CV)
@ -67,25 +69,22 @@ def listFiles(dir,watchcomic,AlternateSearch=None):
#subname = os.path.join(basedir, item)
subname = item
#versioning - remove it
subsplit = subname.split('_')
subsplit = subname.replace('_', ' ').split()
volrem = None
for subit in subsplit:
#print ("subit:" + str(subit))
if 'v' in str(subit).lower():
#print ("possible versioning detected.")
if subit[0].lower() == 'v':
vfull = 0
if subit[1:].isdigit():
#if in format v1, v2009 etc...
if len(subit) > 3:
# if it's greater than 3 in length, then the format is Vyyyy
vfull = 1 # add on 1 character length to account for extra space
#print (subit + " - assuming versioning. Removing from initial search pattern.")
subname = re.sub(str(subit), '', subname)
subname = re.sub(subit, '', subname)
volrem = subit
if subit.lower()[:3] == 'vol':
elif subit.lower()[:3] == 'vol':
#if in format vol.2013 etc
#because the '.' in Vol. gets removed, let's loop thru again after the Vol hit to remove it entirely
#print ("volume detected as version #:" + str(subit))
logger.fdebug("volume indicator detected as version #:" + str(subit))
subname = re.sub(subit, '', subname)
volrem = subit
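The version-stripping step in this hunk can be sketched in isolation as follows. `strip_version_tokens` is a hypothetical name, and the stricter `vol` pattern (only `vol`, `vol.` or `vol` followed by digits) is an assumption of this sketch, chosen so that a title such as "Voltron" is not treated as a volume marker; the hunk itself only checks the first three characters.

```python
import re

# Assumption: a volume token is "vol", optionally followed by "." and digits.
_VOL_TOKEN = re.compile(r'^vol\.?\d*$', re.IGNORECASE)

def strip_version_tokens(name):
    """Split on underscores/whitespace and drop v1 / v2009 / Vol.2013 style tokens.

    Returns (cleaned_name, last_removed_token_or_None).
    """
    kept = []
    removed = None
    for token in name.replace('_', ' ').split():
        if token[0].lower() == 'v' and token[1:].isdigit():
            removed = token      # v1, v2, v2009 ...
        elif _VOL_TOKEN.match(token):
            removed = token      # Vol, Vol., Vol.2013 ...
        else:
            kept.append(token)
    return ' '.join(kept), removed
```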
@ -115,7 +114,7 @@ def listFiles(dir,watchcomic,AlternateSearch=None):
logger.fdebug("- appears in series title.")
i+=1
if detneg == "no":
subname = re.sub(str(nono), '', subname)
subname = re.sub(str(nono), ' ', subname)
nonocount = nonocount + subcnt
#logger.fdebug(str(nono) + " detected " + str(subcnt) + " times.")
# segment '.' having a . by itself will denote the entire string which we don't want
@ -150,7 +149,7 @@ def listFiles(dir,watchcomic,AlternateSearch=None):
subname = re.sub(str(nono), ' ', subname)
nonocount = nonocount + subcnt + blspc
#subname = re.sub('[\_\#\,\/\:\;\.\-\!\$\%\+\'\?\@]',' ', subname)
modwatchcomic = re.sub('[\_\#\,\/\:\;\.\-\!\$\%\'\?\@]', '', u_watchcomic)
modwatchcomic = re.sub('[\_\#\,\/\:\;\.\-\!\$\%\'\?\@]', ' ', u_watchcomic)
detectand = False
detectthe = False
modwatchcomic = re.sub('\&', ' and ', modwatchcomic)
@ -292,6 +291,63 @@ def listFiles(dir,watchcomic,AlternateSearch=None):
logger.fdebug("final justthedigits [" + justthedigits + "]")
if manual == "yes":
#this is needed for Manual Run to determine matches
#without this Batman will match on Batman Incorporated, and Batman and Robin, etc..
logger.fdebug("modwatchcomic = " + modwatchcomic.lower())
logger.fdebug("subname = " + subname.lower())
#tmpitem = item[:jtd_len]
# if it's an alphanumeric with a space, rejoin, so we can remove it cleanly just below this.
substring_removal = None
poss_alpha = subname.split(' ')[-1:]
logger.fdebug("poss_alpha: " + str(poss_alpha))
logger.fdebug("lenalpha: " + str(len(''.join(poss_alpha))))
for issexcept in issue_exceptions:
if issexcept.lower() in str(poss_alpha).lower() and len(''.join(poss_alpha)) <= len(issexcept):
#get the last 2 words so that we can remove them cleanly
substring_removal = ' '.join(subname.split(' ')[-2:])
substring_join = ''.join(subname.split(' ')[-2:])
logger.fdebug("substring_removal: " + str(substring_removal))
logger.fdebug("substring_join: "+ str(substring_join))
break
if substring_removal is not None:
sub_removed = subname.replace('_', ' ').replace(substring_removal, substring_join)
else:
sub_removed = subname.replace('_', ' ')
logger.fdebug("sub_removed: " + str(sub_removed))
split_sub = sub_removed.rsplit(' ',1)[0].split(' ') #removes last word (assuming it's the issue#)
split_mod = modwatchcomic.replace('_', ' ').split() #batman
logger.fdebug("split_sub: " + str(split_sub))
logger.fdebug("split_mod: " + str(split_mod))
x = len(split_sub)-1
scnt = 0
if x > len(split_mod)-1:
logger.fdebug("number of words don't match...aborting.")
else:
while ( x > -1 ):
print str(split_sub[x]) + " comparing to " + str(split_mod[x])
if str(split_sub[x]).lower() == str(split_mod[x]).lower():
scnt+=1
logger.fdebug("word match exact. " + str(scnt) + "/" + str(len(split_mod)))
x-=1
wordcnt = int(scnt)
logger.fdebug("scnt:" + str(scnt))
totalcnt = int(len(split_mod))
logger.fdebug("split_mod length:" + str(totalcnt))
try:
spercent = (wordcnt * 100) / totalcnt  # multiply first: Python 2 integer division would otherwise give 0 for any partial match
except ZeroDivisionError:
spercent = 0
logger.fdebug("we got " + str(spercent) + " percent.")
if int(spercent) >= 80:
logger.fdebug("this should be considered an exact match.")
else:
logger.fdebug("failure - not an exact match.")
continue
comiclist.append({
'ComicFilename': item,
@ -305,6 +361,7 @@ def listFiles(dir,watchcomic,AlternateSearch=None):
#print ("directory found - ignoring")
logger.fdebug("you have a total of " + str(comiccnt) + " " + watchcomic + " comics")
watchmatch['comiccount'] = comiccnt
print watchmatch
return watchmatch
def validateAndCreateDirectory(dir, create=False):
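The Manual Run matching added in this file compares the words of the filename (minus the trailing issue number) against the words of the watched series and accepts the file at 80 percent or more. A simplified sketch of that idea, using float arithmetic so partial matches are not rounded down to zero; the function names are hypothetical and the alphanumeric-issue handling from the hunk is left out:

```python
def word_match_percent(file_words, series_words):
    """Percentage of series words matched position-by-position by the filename words."""
    if not series_words or len(file_words) > len(series_words):
        # More words in the filename than in the series title: treat as a different series.
        return 0.0
    matches = sum(1 for a, b in zip(file_words, series_words)
                  if a.lower() == b.lower())
    return 100.0 * matches / len(series_words)

def is_exact_match(subname, series, threshold=80):
    """Drop the last word (assumed to be the issue number) and compare the rest."""
    file_words = subname.replace('_', ' ').split()[:-1]
    return word_match_percent(file_words, series.split()) >= threshold
```

With this check, "Batman 012" matches the series "Batman", while "Batman Incorporated 012" does not.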


@ -120,11 +120,11 @@ def torrents(pickfeed=None,seriesname=None,issue=None):
'site': 'KAT',
'length': tmpsz['length']
})
print ("Site: KAT")
print ("Title: " + str(feedme.entries[i].title))
print ("Link: " + str(tmpsz['url']))
print ("pubdate: " + str(feedme.entries[i].updated))
print ("size: " + str(tmpsz['length']))
#print ("Site: KAT")
#print ("Title: " + str(feedme.entries[i].title))
#print ("Link: " + str(tmpsz['url']))
#print ("pubdate: " + str(feedme.entries[i].updated))
#print ("size: " + str(tmpsz['length']))
elif pickfeed == "1" or pickfeed == "4":
# tmpsz = feedme.entries[i].enclosures[0]
@ -330,9 +330,11 @@ def rssdbupdate(feeddata,i,type):
def torrentdbsearch(seriesname,issue,comicid=None):
myDB = db.DBConnection()
seriesname_alt = None
#print "seriesname:" + str(seriesname)
if comicid is None or comicid == 'None':
pass
else:
#print ("ComicID: " + str(comicid))
snm = myDB.action("SELECT * FROM comics WHERE comicid=?", [comicid]).fetchone()
if snm is None:
logger.fdebug("Invalid ComicID of " + str(comicid) + ". Aborting search.")
@ -342,33 +344,46 @@ def torrentdbsearch(seriesname,issue,comicid=None):
seriesname_alt = snm['AlternateSearch']
tsearch_seriesname = re.sub('[\'\!\@\#\$\%\:\;\/\\=\?\.\s]', '%',seriesname)
tsearch_seriesname = re.sub('[\'\!\@\#\$\%\:\-\;\/\\=\?\.\s]', '%',seriesname)
formatrem_seriesname = re.sub('[\'\!\@\#\$\%\:\;\/\\=\?\.]', '',seriesname)
if formatrem_seriesname[:1] == ' ': formatrem_seriesname = formatrem_seriesname[1:]
tsearch = tsearch_seriesname + "%"
#print tsearch
AS_Alt = []
tresults = []
if mylar.ENABLE_CBT:
tresults = myDB.action("SELECT * FROM rssdb WHERE Title like ? AND Site='comicBT'", [tsearch])
tresults = myDB.action("SELECT * FROM rssdb WHERE Title like ? AND Site='comicBT'", [tsearch]).fetchall()
if mylar.ENABLE_KAT:
tresults += myDB.action("SELECT * FROM rssdb WHERE Title like ? AND Site='KAT'", [tsearch])
if tresults is None:
logger.fdebug("torrent search returned no results for " + seriesname)
if seriesname_alt is None:
tresults += myDB.action("SELECT * FROM rssdb WHERE Title like ? AND Site='KAT'", [tsearch]).fetchall()
#print "seriesname_alt:" + str(seriesname_alt)
if seriesname_alt is None or seriesname_alt == 'None':
if not tresults:
logger.fdebug("no Alternate name given. Aborting search.")
return "no results"
else:
chkthealt = seriesname_alt.split('##')
if chkthealt == 0:
AS_Alternate = AlternateSearch
for calt in chkthealt:
AS_Alternate = re.sub('##','',calt)
if mylar.ENABLE_CBT:
tresults += myDB.action("SELECT * FROM rssdb WHERE Title like ? AND Site='comicBT'", [AS_Alternate])
if mylar.ENABLE_KAT:
tresults += myDB.action("SELECT * FROM rssdb WHERE Title like ? AND Site='KAT'", [AS_Alternate])
if tresults is None:
logger.fdebug("torrent alternate name search returned no results.")
return "no results"
else:
chkthealt = seriesname_alt.split('##')
if chkthealt == 0:
AS_Alternate = seriesname_alt
AS_Alt.append(seriesname_alt)
for calt in chkthealt:
AS_Alter = re.sub('##','',calt)
u_altsearchcomic = AS_Alter.encode('ascii', 'ignore').strip()
AS_Alternate = re.sub('[\_\#\,\/\:\;\.\-\!\$\%\+\'\?\@]', '%', u_altsearchcomic)
AS_Alt.append(AS_Alternate)
AS_Alternate += '%'
if mylar.ENABLE_CBT:
#print "AS_Alternate:" + str(AS_Alternate)
tresults += myDB.action("SELECT * FROM rssdb WHERE Title like ? AND Site='comicBT'", [AS_Alternate]).fetchall()
if mylar.ENABLE_KAT:
tresults += myDB.action("SELECT * FROM rssdb WHERE Title like ? AND Site='KAT'", [AS_Alternate]).fetchall()
if not tresults:
logger.fdebug("torrent search returned no results for " + seriesname)
return "no results"
extensions = ('cbr', 'cbz')
tortheinfo = []
torinfo = {}
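The search in this hunk turns the series name into a SQL `LIKE` pattern and collects rows per enabled site with `fetchall()`. A self-contained sketch using the standard `sqlite3` module instead of Mylar's `db.DBConnection` wrapper; `like_pattern` and `search_titles` are hypothetical names, and the `rssdb` columns are reduced to the two the queries above rely on:

```python
import re
import sqlite3

def like_pattern(seriesname):
    """Replace punctuation and whitespace with '%' wildcards and add a trailing '%'."""
    return re.sub(r"['!@#$%:\-;/\\=?.\s]", '%', seriesname) + '%'

def search_titles(conn, seriesname, sites):
    """Return all rssdb rows whose Title matches the pattern, for each given site."""
    pattern = like_pattern(seriesname)
    rows = []
    for site in sites:
        cur = conn.execute(
            "SELECT Title, Site FROM rssdb WHERE Title LIKE ? AND Site=?",
            (pattern, site))
        rows += cur.fetchall()   # fetchall() returns a list, possibly empty
    return rows
```

Because `fetchall()` always returns a list, an empty result is detected with `if not rows` rather than a comparison against `None`.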
@ -386,7 +401,7 @@ def torrentdbsearch(seriesname,issue,comicid=None):
formatrem_torsplit = re.sub('\s+', ' ', formatrem_torsplit)
#print (str(len(formatrem_torsplit)) + " - formatrem_torsplit : " + formatrem_torsplit.lower())
#print (str(len(formatrem_seriesname)) + " - formatrem_seriesname :" + formatrem_seriesname.lower())
if formatrem_seriesname.lower() in formatrem_torsplit.lower():
if formatrem_seriesname.lower() in formatrem_torsplit.lower() or any(x.lower() in formatrem_torsplit.lower() for x in AS_Alt):
logger.fdebug("matched to : " + tor['Title'])
logger.fdebug("matched on series title: " + seriesname)
titleend = formatrem_torsplit[len(formatrem_seriesname):]
@ -405,7 +420,7 @@ def torrentdbsearch(seriesname,issue,comicid=None):
# #print("issue # detected : " + str(issue))
# elif helpers.issuedigits(issue.rstrip()) == helpers.issuedigits(sp.rstrip()):
# logger.fdebug("Issue matched for : " + str(issue))
#the title on CBT has a mish-mash of crap...ignore everything after cbz/cbr to clean it
#the title on CBT has a mish-mash of crap...ignore everything after cbz/cbr to clean it
ctitle = tor['Title'].find('cbr')
if ctitle == 0:
ctitle = tor['Title'].find('cbz')
@ -413,7 +428,7 @@ def torrentdbsearch(seriesname,issue,comicid=None):
logger.fdebug("cannot determine title properly - ignoring for now.")
continue
cttitle = tor['Title'][:ctitle]
# #print("change title to : " + str(cttitle))
#print("change title to : " + str(cttitle))
# if extra == '':
tortheinfo.append({
'title': cttitle, #tor['Title'],


@ -469,32 +469,37 @@ def NZB_SEARCH(ComicName, IssueNumber, ComicYear, SeriesYear, nzbprov, nzbpr, Is
comsearch = comsrc + "%20" + str(isssearch) + "%20" + str(filetype)
issdig = ''
mod_isssearch = str(issdig) + str(isssearch)
#--- this is basically for RSS Feeds ---
if RSS == "yes":
if nzbprov == 'ComicBT':
if nzbprov == 'ComicBT' or nzbprov == 'KAT':
cmname = re.sub("%20", " ", str(comsrc))
logger.fdebug("Sending request to [ComicBT] RSS for " + str(cmname) + " : " + str(isssearch))
bb = rsscheck.torrentdbsearch(cmname,isssearch,ComicID)
logger.fdebug("Sending request to [" + str(nzbprov) + "] RSS for " + str(cmname) + " : " + str(mod_isssearch))
bb = rsscheck.torrentdbsearch(cmname,mod_isssearch,ComicID,nzbprov)
rss = "yes"
if bb is not None: logger.fdebug("bb results: " + str(bb))
else:
cmname = re.sub("%20", " ", str(comsrc))
logger.fdebug("Sending request to RSS for " + str(cmname) + " : " + str(isssearch))
bb = rsscheck.nzbdbsearch(cmname,isssearch,ComicID)
logger.fdebug("Sending request to RSS for " + str(cmname) + " : " + str(mod_isssearch))
bb = rsscheck.nzbdbsearch(cmname,mod_isssearch,ComicID)
rss = "yes"
if bb is not None: logger.fdebug("bb results: " + str(bb))
#this is the API calls
else:
#CBT is redundant now - just getting it ready for when it's not redundant :)
#CBT is redundant now since only RSS works
# - just getting it ready for when it's not redundant :)
if nzbprov == 'ComicBT':
cmname = re.sub("%20", " ", str(comsrc))
logger.fdebug("Sending request to [ComicBT] RSS for " + str(cmname) + " : " + str(isssearch))
bb = rsscheck.torrentdbsearch(cmname,isssearch,ComicID)
logger.fdebug("Sending request to [ComicBT] RSS for " + str(cmname) + " : " + str(mod_isssearch))
bb = rsscheck.torrentdbsearch(cmname,mod_isssearch,ComicID)
rss = "yes"
if bb is not None: logger.fdebug("results: " + str(bb))
elif nzbprov == 'KAT':
cmname = re.sub("%20", " ", str(comsrc))
logger.fdebug("Sending request to [KAT] for " + str(cmname) + " : " + str(isssearch))
bb = rsscheck.torrents(pickfeed='2',seriesname=cmname,issue=isssearch)
logger.fdebug("Sending request to [KAT] for " + str(cmname) + " : " + str(mod_isssearch))
bb = rsscheck.torrents(pickfeed='2',seriesname=cmname,issue=mod_isssearch)
rss = "no"
if bb is not None: logger.fdebug("results: " + str(bb))
elif nzbprov != 'experimental':
@ -907,8 +912,10 @@ def NZB_SEARCH(ComicName, IssueNumber, ComicYear, SeriesYear, nzbprov, nzbpr, Is
logger.fdebug("vers4vol: " + str(vers4vol))
if vers4year != "no" or vers4vol != "no":
if ComicVersion:# is not "None" and ComicVersion is not None:
if ComicVersion: #is not "None" and ComicVersion is not None:
D_ComicVersion = re.sub("[^0-9]", "", ComicVersion)
if D_ComicVersion == '':
D_ComicVersion = 0
else:
D_ComicVersion = 0
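The guard added in this hunk covers a version string that contains no digits at all (for example the literal string "None"), which would otherwise leave an empty string behind. A minimal sketch of the same idea; `version_digits` is a hypothetical helper, and returning an `int` in every case is a choice made for this sketch rather than what the hunk does:

```python
import re

def version_digits(comic_version):
    """Extract the numeric part of a version label such as 'v2' or 'Vol. 2013'.

    Returns 0 when the label is empty, None, or contains no digits.
    """
    if not comic_version:
        return 0
    digits = re.sub("[^0-9]", "", comic_version)
    return int(digits) if digits else 0
```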
@ -1022,7 +1029,7 @@ def NZB_SEARCH(ComicName, IssueNumber, ComicYear, SeriesYear, nzbprov, nzbpr, Is
logger.fdebug("nzb name to be used for post-processing is : " + str(nzbname))
sent_to = "your Blackhole Directory"
#end blackhole
elif nzbprov == 'ComicBT':
elif nzbprov == 'ComicBT' or nzbprov == 'KAT':
logger.fdebug("sending .torrent to watchdir.")
logger.fdebug("ComicName:" + ComicName)
logger.fdebug("link:" + entry['link'])


@ -629,33 +629,10 @@ def forceRescan(ComicID,archive=None):
controlValueDict = {"IssueID": iss_id}
#if Archived, increase the 'Have' count.
if archive:
issStatus = "Archived"
# if haveissue == "no" and issuedupe == "no":
# isslocation = "None"
# if old_status == "Skipped":
# if mylar.AUTOWANT_ALL:
# issStatus = "Wanted"
# else:
# issStatus = "Skipped"
# elif old_status == "Archived":
# havefiles+=1
# issStatus = "Archived"
# elif old_status == "Downloaded":
# issStatus = "Archived"
# havefiles+=1
# elif old_status == "Wanted":
# issStatus = "Wanted"
# elif old_status == "Ignored":
# issStatus = "Ignored"
# elif old_status == "Snatched": #this is needed for torrents, or else it'll keep on queuing..
# issStatus = "Snatched"
# else:
# issStatus = "Skipped"
#
# newValueDict = {"Status": issStatus }
#if archive:
# issStatus = "Archived"
elif haveissue == "yes":
if haveissue == "yes":
issStatus = "Downloaded"
newValueDict = {"Location": isslocation,
"ComicSize": issSize,
@ -664,10 +641,10 @@ def forceRescan(ComicID,archive=None):
issID_to_ignore.append(str(iss_id))
if 'annual' in temploc.lower():
myDB.upsert("annuals", newValueDict, controlValueDict)
else:
myDB.upsert("issues", newValueDict, controlValueDict)
if 'annual' in temploc.lower():
myDB.upsert("annuals", newValueDict, controlValueDict)
else:
myDB.upsert("issues", newValueDict, controlValueDict)
fn+=1
logger.fdebug("IssueID's to ignore: " + str(issID_to_ignore))