Table of Contents
- General
- Schema
- Format
- Header
- Caps
- Settings
- Login
- Example for a simple POST login
- Example for a very complex form-based login (real world form logins won't need most of the options)
- Example for a typical COOKIE definition
- Search
- Search HTML
- Providing the category field with a default value
- Search JSON
- Search Row Selectors
- Search XML
- Download
- Template Engine
- re_replace
- if ... else ... end
- if or/and ... else ... end
- if eq/ne ... else ... end
- join
- range
- range (with indexing)
- Variable substitution
- Variables
- Config Variables
- Special Variables
- Search Query Variables
- Download Variables
- Filters
Here you'll find details on the yaml indexer definition format. It's very incomplete at this time (work in progress).
General
Using definition files, it's possible to support most trackers without having to write native C# code. All you need is a little knowledge of HTML and CSS selectors. To add support for a new tracker, just create a corresponding definition file in one of the definition folders (see the Jackett log output during startup for the exact folder locations).
The best way to get started is to look at existing definition files (https://github.com/Jackett/Jackett/tree/master/src/Jackett.Common/Definitions). If you know a tracker which is similar to the one you want to add and is already supported, just use its definition as a base for your new definition file.
Many sites have a "Powered by" logo in the footer of their pages, and we try to tag our yaml indexers with a comment at the bottom to make finding similar engines a little easier. If you find a matching engine then you can use that indexer as a base for your new site, which will save you a lot of time and effort.
In general, using Cardigann definitions is the preferred way of adding new indexers. There's just one exception: Gazelle based trackers can be easily added by inheriting the GazelleTracker C# class.
Schema
To help you conform to YAML coding standards (and maintain parity with the v11 Prowlarr yaml indexers), you can validate your Jackett yaml indexer using the npm ajv tool.
Installation
The following npm packages are required: ajv-cli-servarr and ajv-formats.
These can be installed globally on your system with npm install -g ajv-cli-servarr ajv-formats
Usage
ajv test -d "src\Jackett.Common\Definitions\<indexer>.yml" -s "src\Jackett.Common\Definitions\schema.json" --valid --all-errors -c ajv-formats --spec=draft2019
where <indexer> supports masking with an asterisk, for example hd* to scan all indexers beginning with hd.
Credit
The Prowlarr team
Format
Jackett's Cardigann is quite fussy about indentation. Ensure you maintain the 2 space indentation per level as shown in the examples here. Getting it wrong could lead to errors during a run, or perhaps worse, a silent ignore of the clause altogether!
Text following a # (hash) is a comment, and the comments in the examples below do not have to be included in your code.
Header
Each definition must start with a header like this:
---
# [REQUIRED] Internal name of the indexer, must be unique. Usually it's the name of the
# web site, in lower case, stripped of any special characters and spaces
id: thepiratebay
# [OPTIONAL] This is an administrative function which should not be used by the end user.
# It is used to maintain backward compatibility when renaming the id of an indexer
# (the id is used in the torznab/download/search urls and in the indexer configuration file)
replaces:
  - tpb-original
# [REQUIRED] Display name (The full name of the tracker)
name: The Pirate Bay
# [REQUIRED] displayed in the tooltip on the add-indexer page and in the config panel
description: "Pirate Bay (TPB) is the galaxy’s most resilient Public BitTorrent site"
# [REQUIRED] Language code of the main language used on the tracker
# See http://www.lingoes.net/en/translator/langcode.htm
# Usually you set this to the value from the site's <html lang="en_US"> tag.
language: en-US
# [REQUIRED] Indexer type:
#   public (no registration required)
#   semi-private (registration required, but always open)
#   private (registration required. Invite/application needed)
type: public
# [REQUIRED] Website encoding used by the tracker
# Usually you get this from the site's HTML <meta charset="utf-8"> tag.
encoding: UTF-8
# [OPTIONAL] Can be true or false (default is false)
# Enable/Disable automatic update of the URL in case of a redirect to a different domain
followredirect: false
# [OPTIONAL] Can be true or false (default is true)
# Enable/Disable the pre-testing of the .torrent files when attempting a download (indexers
# that support fallback downloading need true). Some web sites do not allow performing two
# GET requests for the same .torrent in sequence so setting this to false will avoid an error.
testlinktorrent: false
# [OPTIONAL] The number of seconds in between requests to a site
# Mainly used for sites that limit the number of requests per period with a temporary block.
requestDelay: 2.5
# [REQUIRED] List of known domains
# (the first one is the default, must end with /)
links:
  - https://thepiratebay.org/
  - https://thepiratesbay.pw/
  - https://tproxy.pro/
# [OPTIONAL] List of old domains which no longer work
# If one of these URLs is configured it will be automatically replaced with the default one
legacylinks:
  - https://thepiratebay.sw/
# [OPTIONAL] If the tracker uses untrusted HTTPS certificates (self-signed, expired, etc)
# you can specify a list of SHA-1 Fingerprint (thumbprint) hashes which should be accepted
# as valid anyway. This shouldn't be needed in most cases.
certificates:
  - D40789207A75EA36B02E255BF7162C8DF9637751 # Expired 24 June 2020
Caps
Next, you have to specify the capabilities of the indexer.
# Capabilities of the indexer:
#   Mapping between the tracker categories and the Newznab categories.
#   - id: [REQUIRED] The tracker specific category ID.
#         Can be a string too.
#   - cat: [REQUIRED] The corresponding newznab predefined category.
#          See this list for valid options:
#          https://github.com/Jackett/Jackett/wiki/Jackett-Categories
#   - desc: [OPTIONAL] The tracker category name.
#           If provided it will be used for a 1:1 mapping between
#           tracker and newznab categories.
#   - default: [OPTIONAL] default flag, can be true or false (default is false)
#              Specify if this category should be used as default (if the search query doesn't
#              contain any categories).
caps:
  categorymappings:
    - {id: 101, cat: Audio, desc: "Music", default: false}
    - {id: 201, cat: Movies, desc: "Movies", default: true}
    - {id: 299, cat: Movies/Other, desc: "Video Other", default: true}
    - {id: 302, cat: PC/Mac, desc: "Mac", default: false}
    - {id: 901, cat: XXX, desc: "Porn SD", default: false}
    - {id: 902, cat: XXX, desc: "Porn HD", default: false}
  # Specify one or more torznab search modes and attributes that are supported by the indexer.
  # Implementation note: Jackett doesn't care very much about this, but you should still
  # specify the correct modes, as most apps calling Jackett via the Torznab API depend on them.
  # The q attribute is the absolute minimum default, and you should only add the others if the
  # tracker supports searching with them, especially imdbid, tvdbid, tmdbid, rid (TVRage),
  # tvmaze, traktid, doubanid, album, artist, label, track, author, title, publisher, year & genre.
  modes:
    search: [q]
    tv-search: [q, season, ep, imdbid, tvdbid, rid, tvmaze, traktid, doubanid, year, genre]
    movie-search: [q, imdbid, tmdbid, traktid, doubanid, year, genre]
    music-search: [q, album, artist, label, track, year, genre]
    book-search: [q, author, title, publisher, year, genre]
A list of the categories that Jackett can define is available at https://github.com/Jackett/Jackett/wiki/Jackett-Categories
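For comparison, here is a minimal sketch of a caps block for a tracker whose search form only accepts keywords. The category ids and descriptions are hypothetical; only the q attribute (plus season and ep for TV) is advertised, following the advice above to list only what the tracker really supports.

```yaml
caps:
  categorymappings:
    # hypothetical tracker category ids
    - {id: 1, cat: Movies, desc: "Movies", default: true}
    - {id: 2, cat: TV, desc: "TV", default: true}
  modes:
    # keyword search only; no imdbid/tvdbid etc. because this
    # (hypothetical) site cannot search by them
    search: [q]
    tv-search: [q, season, ep]
    movie-search: [q]
```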
Settings
Optionally you can specify which config options should be available for the indexer. If the settings block is not specified, the defaults are used (username and password). Some examples:
settings:
  # internal variable name
  - name: username
    # input type
    type: text
    # Display name
    label: Username
  - name: password
    # the input box will mask the content from view replacing characters with asterisks
    type: password
    label: Password
  - name: pin
    type: text
    label: Pin
  - name: itorrents-links
    type: checkbox
    label: Add download links via itorrents.org
    # [OPTIONAL] set the following to true if you want the checkbox to be ticked.
    # The checkbox is un-ticked by default
    default: false
  - name: info
    type: info
    label: ITorrents Note
    default: Without the itorrents option only magnet links will be provided.
  - name: category-id
    type: select
    label: Category
    default: 0_0
    options:
      0_0: "All categories"
      1_0: Movies
      1_1: Movies/HD
  - name: quality
    type: multi-select
    label: Select one or more quality
    options:
      480p: 480p
      720p: 720p
      1080p: 1080p
      2160p: 2160p
      4K: 4K
    defaults:
      - 1080p
      - 720p
  # this special type generates an info box in the indexer config that gives details on the sites' category 8000 dependence
  - name: info_category_8000
    type: info_category_8000
  # this special type generates an info box in the indexer config that gives instructions on how to fetch a cookie
  - name: info_cookie
    type: info_cookie
  # this special type generates an info box in the indexer config to warn that the flaresolverr app may be required
  - name: info_flaresolverr
    type: info_flaresolverr
  # this special type generates an info box in the indexer config that gives instructions on how to fetch a useragent
  - name: info_useragent
    type: info_useragent
If it's a public tracker and no config settings are needed then set settings: [] to disable all options.
Login
If the tracker requires a login, you have to include a login block. First, you have to pick one of the following login methods:
- post: The input values are transmitted as an HTTP POST request. This will work for many trackers which require only static login information (username, password, ...).
- get: Same as post but HTTP GET is used.
- form: The input values are transmitted as an HTTP POST request, but instead of sending them directly, the specified path is retrieved first and the corresponding HTML form is extracted. This allows login to most trackers which require dynamic login information (e.g. CAPTCHAs or CSRF tokens). For Google reCAPTCHAs no special configuration is required; they're detected automatically. If the tracker is using "simplecaptcha" (messages like "click on the Bug" and "Click on the X"), it's solved automatically.
- cookie: The cookies provided via the cookie setting will be used.
After sending the actual login request, the resulting HTML document is checked for error messages (error section). If one of the specified selectors matches, the login is considered failed and the matching text is returned as the error message.
After checking for error messages, a login test is performed (test section). The specified path will be requested. If a redirect is returned, the login is considered failed. Optionally, it's possible to specify a selector which must match for a successful login. Typically, the path is set to the same path as the torrent search path. Most trackers will redirect users to the login page if a login is required. If a tracker just shows the login form (no redirect), you'll have to specify a selector too.
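The get method has no dedicated example on this page, so here is a minimal sketch of one, combined with a selector-based login test for a site that shows the login form instead of redirecting. The paths and the logout selector are hypothetical and must be taken from the real site.

```yaml
login:
  # same as a post login, but the inputs are sent as an HTTP GET request
  method: get
  # hypothetical login path
  path: login.php
  inputs:
    username: "{{ .Config.username }}"
    password: "{{ .Config.password }}"
  test:
    # hypothetical path, typically the same as the search path
    path: browse.php
    # hypothetical selector that only matches when logged in; needed because
    # this (assumed) site does not redirect anonymous users
    selector: a[href="logout.php"]
```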
Example for a simple POST login
login:
  # use simple post login
  method: post
  # target of the POST request
  path: takelogin.php
  # list of POST parameters
  inputs:
    # using configured username and password from the prior settings block
    username: "{{ .Config.username }}"
    password: "{{ .Config.password }}"
    # example of a fixed parameter
    keeplogged: 1
  # [OPTIONAL] error message handling
  # (a selector starting with # must be quoted, otherwise YAML treats it as a comment)
  error:
    - selector: "#errormessage > span.warning"
  # [OPTIONAL] using a simple redirect based login detection
  test:
    path: browse.php
Example for a very complex form-based login (real world form logins won't need most of the options)
login:
  # Using a form-based login
  method: form
  # Location of the document containing the form
  path: login.php
  # location of the following POST request.
  # Only needed if it's not the same as the action specified in the form element.
  submitpath: takelogin.php
  # Selector for the HTML form element (default: form)
  form: form[action="takelogin.php"]
  captcha:
    # image based captcha (can be image or text)
    type: image
    # selector for the captcha HTML element
    selector: img[alt="Security code"]
    # name of the target form element (captcha value)
    input: code
  inputs:
    # use configured username and password from the settings
    username: "{{ .Config.username }}"
    password: "{{ .Config.password }}"
    # example of a fixed parameter
    keeplogged: 1
  # [OPTIONAL] Only needed in case of dynamic input element names (very rare)
  # If it's set to true the keys/names from the 'inputs' section will be
  # interpreted as CSS selectors
  # example: https://github.com/Jackett/Jackett/blob/master/src/Jackett.Common/Definitions/spiritofrevolution.yml
  selectors: false
  # [OPTIONAL] Only needed in very limited cases.
  # Can be used to include values based on a result of a selector
  # (e.g. if a CSRF token is hidden in JavaScript).
  selectorinputs:
    # name of the required key-name, for example: securitytoken
    securitytoken:
      # selector for the value
      selector: "script:contains(\"stKey: \")"
      # [OPTIONAL] further filters for the value
      filters:
        - name: regexp
          args: "stKey: \"(.+?)\","
  # additional arguments for the URL
  # Sent as part of the query string, not in the POST body
  getselectorinputs:
    c:
      selector: "script:contains(\"login.php\")"
      filters:
        - name: regexp
          args: "login.php\\?c=(.*?)&rhash="
    rhash:
      text: 123
  # multiple selectors for errors
  error:
    - selector: tbody:has(td.colhead > span:contains("Error"))
    - selector: tbody:has(td.colhead > span:contains("failed"))
    # example of a complex error message handler
    # e.g. to deal with javascript based error messages
    # if this selector matches => login failed
    - selector: body[onLoad^="makeAlert('"]
      # [OPTIONAL] selector to change the error message which will be displayed.
      message:
        selector: body[onLoad^="makeAlert('"]
        attribute: onLoad
        filters:
          - name: replace
            args: ["makeAlert('Error' , '", ""]
          - name: replace
            args: ["');", ""]
  test:
    path: browse.php
    # [OPTIONAL] check this selector (must match)
    selector: a#logout
If the form or post method does not work for the web site, you can resort to the cookie method, which uses the session cookie when accessing the web site's pages.
Example for a typical COOKIE definition
settings:
  - name: cookie
    type: text
    label: Cookie
  - name: info_cookie
    type: info_cookie
login:
  method: cookie
  inputs:
    cookie: "{{ .Config.cookie }}"
  test:
    path: index.php
    selector: a[href="logout.php"]
Search
The search block contains all the information on how to search and how to extract the necessary information from the various trackers. There are three search methods available, based on the response type from the web site: Search HTML (the default), Search JSON and Search XML.
Search HTML
It's possible to do some optional pre-processing of the search keywords first using the keywordsfilters list (e.g. to remove short search words or replace special characters with wildcards). After that, the search URLs are constructed based on the provided paths and inputs. All resulting paths will be requested. Each result is checked for error messages based on the error selector list. After that, the rows are extracted based on the selector, etc. provided in the rows block. Finally, each row is parsed based on the fields list.
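Before the full reference, a compact sketch may help to see how those steps map onto the keys of a search block. Every path, parameter name, selector and category id here is hypothetical and must be replaced with values from the real site; only the required fields are shown.

```yaml
search:
  # 1. paths and inputs are combined into the search URL(s)
  paths:
    - path: browse.php              # hypothetical search page
  inputs:
    search: "{{ .Keywords }}"       # hypothetical parameter name
  # 2. optional pre-processing; the result ends up in .Keywords
  keywordsfilters:
    - name: re_replace
      args: ["[^a-zA-Z0-9]+", "*"]
  # 3. each response is checked for error messages
  error:
    - selector: div.error
  # 4. rows are extracted from the response
  rows:
    selector: table.torrents > tbody > tr
  # 5. each row is parsed into fields
  fields:
    category:
      text: 1                       # hypothetical tracker category id
    title:
      selector: a[href^="details.php?id="]
    download:
      selector: a[href^="download.php?torrent="]
      attribute: href
    date:
      text: now                     # assuming the site shows no publish date
    size:
      selector: td:nth-child(5)
    seeders:
      selector: td:nth-child(6)
```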
Example of a complex search block explaining all available options:
search:
  # list of paths which should be searched
  # For most trackers just a single path is needed. But some trackers use
  # different pages for e.g. porn or scene and non-scene releases.
  paths:
    - path: torrents.php
      # [OPTIONAL] HTTP method (get or post) (default is get)
      method: post
      # [OPTIONAL] Enable/Disable following of search redirects
      # can be true or false (default is false)
      # If you enable this make sure you specify a selector in the login/test section
      followredirect: false
      # [OPTIONAL] list of tracker categories
      # If specified, the path is only used if at least one category from the list is included in
      # the search categories list. A "!" as the first entry negates the matching logic (the path
      # is used if any category other than the specified ones is in the search categories list)
      categories: ["!", 901, 902]
      # [OPTIONAL] list of (extra) arguments which should be added for this path
      inputs:
        scene: 0
      # [OPTIONAL] boolean option to disable input inheritance from the search level inputs list.
      # If set to true the "inputs" from the search level list will be used as the base for the path specific inputs
      # Default is true
      inheritinputs: true
    - path: torrents.php
      # don't use this path if we're only searching for porn
      categories: ["!", 901, 902]
      inputs:
        scene: 1
    - path: xxx.php
      # only use it if we're searching for porn
      categories: [901, 902]
  # [OPTIONAL] If a key resolves to a value that is empty then Cardigann will not use that key/value pair in its
  # query to the site. In the event that the site requires a key without a value then use this override.
  # The default is false.
  allowEmptyInputs: true
  # list of HTTP arguments which are used by all paths
  inputs:
    # Generate the category[] arguments list
    # The $raw input is special, the result will be included in the HTTP arguments list
    # without further escaping (only variables are escaped).
    $raw: "{{ range .Categories }}category[]={{.}}&{{end}}"
    # If an IMDB ID has been specified use it. Otherwise use the search keywords.
    search: "{{ if .Query.IMDBID }}{{ .Query.IMDBID }}{{ else }}{{ .Keywords }}{{ end }}"
    imdb_search: "{{ if .Query.IMDBID }}yes{{ else }}{{ end }}"
    searchin: title
    incldead: 1
  # [OPTIONAL] extra headers which should be included in search requests
  headers:
    x-requested-with: ["XMLHttpRequest"]
  # [OPTIONAL] list of filters which will be applied to the search string.
  # The result is available in the .Keywords variable
  keywordsfilters:
    - name: re_replace # remove words <= 3 characters and surrounding special characters
      args: ["(?:^|\\s)[_\\+\\/\\.\\-\\(\\)]*[\\S]{0,3}[_\\+\\/\\.\\-\\(\\)]*(?:\\s|$)", " "]
    - name: re_replace # replace special characters with "*" (wildcard)
      args: ["[^a-zA-Z0-9]+", "*"]
  # [OPTIONAL] list of selectors to check for errors on the search result page
  # (same syntax as in the login block)
  error:
    - selector: div.error
  # [OPTIONAL] list of filters to apply to the search result before doing further HTML parsing
  preprocessingfilters:
    - name: jsonjoinarray
      args: ["$.result", ""]
    - name: prepend
      args: "<table>"
    - name: append
      args: "</table>"
  rows:
    selector: table#sortabletable > tbody > tr:has(a[href*="/details.php?id="])
    # [OPTIONAL] list of row filters
    filters:
      # The andmatch filter will make sure that only torrents which contain all words from the search string are
      # returned. This is helpful if the tracker returns a lot of unrelated search results.
      - name: andmatch
        # [OPTIONAL] argument, the maximum length of the search string which should be compared. Specify this if the
        # tracker cuts off the torrent name after a certain number of characters.
        args: 66
      # [OPTIONAL] dump the HTML of each row to the log (for debugging purposes)
      - name: strdump
    # [OPTIONAL] selector for rows containing dates.
    # Use this if the torrent result rows don't contain a publish date but a previous row contains the date.
    # The indexer will go back and parse the first sibling element matching the selector as date for that torrent.
    dateheaders:
      selector: ":has(td.colhead[title]:contains(\"Torrents from\") > b)"
      filters:
        - name: dateparse
          args: "ddd dd MMM"
    # [OPTIONAL] row merging. Use this if the tracker uses multiple row elements for each torrent
    # (e.g. hidden tooltip or collapsed rows) The specified number of elements from the rows selector result will be
    # merged into the previous element. In this example (1) two rows will be merged together.
    after: 1
  # [REQUIRED] list of attributes which are extracted for each row
  fields:
    # [REQUIRED] tracker category id (id field from caps/categorymappings)
    # if the site does not provide one in its results then use category Other.
    category:
      selector: a[href^="browse.php?cat="]
      attribute: href
      filters:
        # extract the "cat" parameter from the query string
        - name: querystring
          args: cat
    # [ALTERNATIVE] if the site does not provide a category id for results,
    # but it does provide the category name we use for descriptions, use categorydesc instead of category
    categorydesc:
      selector: div.kat_cat_pic
    # [REQUIRED] the title of the torrent
    title:
      selector: a[href^="details.php?id="]
    # [OPTIONAL] link to the site's details page for the torrent
    # If not available from the response then it's usual to use the .Config.sitelink as a default.
    details:
      selector: a[href^="details.php?id="]
      attribute: href
    # [REQUIRED] download link for the torrent file. See the download block documentation for special handling if needed.
    # If a download link is not available you should provide a magnet URI, or if neither is available an infohash.
    download:
      selector: a[href^="download.php?torrent="]
      attribute: href
    # [ALTERNATIVE] magnet link
    magnet:
      selector: a[href^="magnet:"]
      attribute: href
    # [ALTERNATIVE] Loads the infohash, and for Public or Semi-Private Indexers auto-generates a magnet URI
    # When neither the .torrent link nor a magnet URI is available, use the infohash statement to auto-generate a
    # magnet URI from an infohash. The magnet's &dn= will be loaded from the .Result.title, and a set of ten of
    # the currently most useful trackers will be added for the &tr= sequence.
    # Note that for Private Indexers the auto-generation is disabled.
    infohash:
      selector: a[href^="index.php?page=torrent-details&id="][title]
      attribute: href
      filters:
        - name: querystring
          args: id
    # [OPTIONAL] link to a poster image (cover, banner, etc.)
    # This will show up (on the Jackett dashboard search page) as a tooltip when you hover over the title
    # If the selector does not match it is ignored.
    poster:
      selector: a[href^="details.php?id="]
      attribute: onmouseover
      filters:
        - name: regexp
          args: src=\\'(.+?)\\'
        # replace dummy image with empty string
        - name: replace
          args: ["./pic/noposter.jpg", ""]
    # [OPTIONAL] id for imdb.com if e.g. a link is returned then the number is extracted automatically
    # An alias named imdb is also valid here.
    # If the selector does not match it is ignored.
    imdbid:
      selector: a[href*="imdb.com/title/tt"]
      attribute: href
    # [OPTIONAL] id for tvrage.com if a link is returned then the number is extracted automatically
    # If the selector does not match it is ignored.
    rageid:
      selector: a[href*="tvrage.com/"]
      attribute: href
    # [OPTIONAL] id for themoviedb.org if a link is returned then the number is extracted automatically
    # If the selector does not match it is ignored.
    tmdbid:
      selector: a[href*="themoviedb.org/movie/"], a[href*="themoviedb.org/tv/"]
      attribute: href
    # [OPTIONAL] id for tvmaze.com if a link is returned then the number is extracted automatically
    # If the selector does not match it is ignored.
    tvmazeid:
      selector: a[href*="tvmaze.com/shows/"]
      attribute: href
    # [OPTIONAL] id for thetvdb.com if a link is returned then the number is extracted automatically
    # If the selector does not match it is ignored.
    tvdbid:
      selector: a[href*="thetvdb.com/"]
      attribute: href
    # [OPTIONAL] id for trakt.tv if a link is returned then the number is extracted automatically
    # If the selector does not match it is ignored.
    traktid:
      selector: a[href*="trakt.tv/movies/"], a[href*="trakt.tv/shows/"]
      attribute: href
    # [OPTIONAL] id for movie.douban.com if a link is returned then the number is extracted automatically
    # If the selector does not match it is ignored.
    doubanid:
      selector: a[href*="movie.douban.com/subject/"]
      attribute: href
    # [REQUIRED] publish date (if the site does not provide a date for all results, then a default of "now" should be used)
    # if the site can only provide a rows: dateheaders: selector then you can omit the date field.
    date:
      selector: td:nth-child(4) > span[title]
      attribute: title
      filters:
        # append the timezone used by the tracker
        - name: append
          args: " +08:00"
        - name: dateparse
          args: "yyyy-MM-dd HH:mm:ss zzz"
    # [REQUIRED] size of the torrent (units are handled automatically). if the site does not provide a size for all
    # results, then provide a default of "512 MB". If the site occasionally has a missing size then "0 B" is usual.
    # Side note: For Sites using European numbering schemes (1,024.4MB or 1.024,4MB etc.) there is no need to remove
    # commas or extra dots as these are automatically dealt with.
    size:
      selector: td:nth-child(7)
    # [OPTIONAL] number of files
    files:
      selector: td:nth-child(4)
    # [OPTIONAL] number of completed downloads
    grabs:
      selector: td:nth-child(8)
      filters:
        - name: regexp
          # get the first number from the result
          args: (\d+)
    # [REQUIRED] number of seeders (if the site does not provide seeders for all results,
    # then provide a default of "1").
    seeders:
      selector: td:nth-child(9)
    # [OPTIONAL] number of leechers (if the site does not provide leechers for all results,
    # then provide a default of "1").
    leechers:
      selector: td:nth-child(10)
    # [OPTIONAL] genre. A list of one or more genre categories.
    # You should aim to load genre with a comma delimited list, for example: "Action, Drama, Thriller"
    # and use filters to massage the list into the requisite layout if required.
    # If the selector does not match it is ignored.
    genre:
      selector: div i
      filters:
        - name: regexp
          args: "\\((.+?)\\)"
    # [OPTIONAL] Factor for the download volume. In most cases it should be set to "1"
    # Set to "0" if a torrent is freeleech, "0.5" if only 50% is counted, "0.75" if only 75% is counted.
    # if a site states that the download is 75% free then the DLVF is 0.25 (only 25% is counted).
    downloadvolumefactor:
      case:
        img.pro_free: 0
        img.pro_neutral: 0
        img.pro_50pctdown: 0.5
        # default to 1
        "*": 1
    # [OPTIONAL] Factor for the upload volume, in most cases it should be set to "1"
    # Set it to "0" for a torrent that is a neutral leech (upload is not counted), set to "2" for a double upload
    uploadvolumefactor:
      case:
        img.pro_neutral: 0
        img.pro_2up: 2
        # default to 1
        "*": 1
    # [OPTIONAL] minimum ratio the torrent client must seed to avoid Hit & Run penalties
    minimumratio:
      text: 1.0
    # [OPTIONAL] minimum number of seconds the client must seed to release the MR requirement
    minimumseedtime:
      # 1 day (as seconds = 24 x 60 x 60)
      text: 86400
    # [OPTIONAL] description (any other available/relevant information)
    # This will show up (on the Jackett dashboard search page) as info on a tooltip when you hover over the title
    # If the selector does not match it is ignored.
    description:
      selector: td:nth-child(2)
      # remove a and img elements to get rid of spurious text
      remove: a, img
Each field starts with the HTML row as value.
If the text keyword is specified, the key's value is used. This can be used for fixed values (e.g. minimumratio and minimumseedtime).
If a fixed text value is not specified then the presence of the selector keyword is checked. If it's found then it's applied to the row. This allows you to extract more specific details such as the title or download link using CSS selectors.
After that, the selector specified in the remove keyword is applied. With this, it's possible to remove unwanted elements (see the description example above). Any removed elements are removed for good; they won't be available to following fields. Because of that, you should put fields using the remove keyword at the end of the list.
Now it's possible to set the value based on the existence of elements using the case keyword. If the corresponding selector matches, the field value is set to the specified case value. Processing ends after the first case selector matches. This is commonly used for downloadvolumefactor and uploadvolumefactor.
Finally, the resulting value will be processed by the template engine and filter engine (see below).
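The keywords described above can be seen side by side in the following sketch. It is a trimmed illustration rather than a complete fields list: the selectors are hypothetical, and the img.pro_75pctfree class is an invented example of a "75% free" marker (so only 25% of the download is counted).

```yaml
fields:
  # text: a fixed value, no selector is evaluated
  minimumratio:
    text: 1.0
  # selector (+ attribute): narrows the row down to one element
  details:
    selector: a[href^="details.php?id="]
    attribute: href
  # case: the first matching selector decides the value
  downloadvolumefactor:
    case:
      img.pro_free: 0
      # hypothetical class: 75% free means a factor of 0.25
      img.pro_75pctfree: 0.25
      # default to 1
      "*": 1
  # remove: strips elements from the row for good, so this field comes last
  description:
    selector: td:nth-child(2)
    remove: a, img
    # filters are applied last, to the value produced by the steps above
    filters:
      - name: replace
        args: ["NEW!", ""]
```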
Providing the category field with a default value
In the event that a field might not be reliably present in the site results, you can use the default statement, as shown in these examples:
    category:
      selector: a[href^="browse.php?cat="]
      attribute: href
      optional: true
      default: 38
      filters:
        - name: querystring
          args: cat
    title_default:
      # this title may be abbreviated
      selector: a[href^="details.php?id="]
    title:
      # this title if present is full length
      selector: a[title][href^="details.php?id="]
      attribute: title
      optional: true
      default: "{{ .Result.title_default }}"
    seeders:
      # seeders may be missing
      selector: a[href$="toseeders=1"]
      optional: true
      default: 0
Note that the use of the noappend modifier is deprecated for the category field.
So if you have an old category block like
    category:
      selector: td:nth-child(1)
      optional: true
      filters:
        - name: replace
          args: ["---", 4]
    category|noappend:
      selector: a[href^="browse.php?cat="]
      attribute: href
      optional: true
      filters:
        - name: querystring
          args: cat
then you will see warnings in your log, and you should convert to
    category_default:
      selector: td:nth-child(1)
      optional: true
      filters:
        - name: replace
          args: ["---", 4]
    category:
      selector: a[href^="browse.php?cat="]
      attribute: href
      optional: true
      default: "{{ .Result.category_default }}"
      filters:
        - name: querystring
          args: cat
as at some point support for category|noappend will be removed.
Search JSON
Example of a complex search block explaining all available options:
search:
# [OPTIONAL] extra headers which should be included in search requests
headers:
x-milkie-auth: ["{{ .Config.apikey }}"]
paths:
# [REQUIRED] If the API has different paths for some queries, you can use conditionals to define them
- path: "{{ if .Keywords }}api/v2/torrent/search{{ else }}api/torrent/latest{{ end }}"
# [OPTIONAL] The default is to send the query as a http get, the other choice is http post
# You can use conditionals to select the method for different path requirements
method: "{{ if .Keywords }}post{{ else }}get{{ end }}"
# [REQUIRED] The response block is necessary to define parsing of a JSON response
response:
# [REQUIRED] "json" indicates that a JSON response is expected
type: json
# [OPTIONAL] In the event that a server does not return an empty JSON object or a Count set to 0
# in response to a query-no-found state, you can code the exception here.
# If the string you provide is contained in the response, or the server returns an empty response
# and you coded an empty string here, then this will return the traditional "Found 0 releases" instead
# of the default "Exception (indexer): Object reference not set to an instance of an object." error.
noResultsMessage: "nothing found message from server"
inputs:
# Specify whichever query parameters the API is prepared to accept as valid. Some examples below.
query_term: "{{ if .Query.IMDBID }}{{ .Query.IMDBID }}{{ else }}{{ re_replace .Keywords \"[']\" \"\" }}{{ end }}"
limit: 50
sort: date_added
rows:
# [REQUIRED] This is the where you define how to find the row sets that contain the torrent fields
# You can use the $ symbol to refer to the root object.
selector: data.movies
# [OPTIONAL] If the torrents are in separate subset
attribute: torrents
# [OPTIONAL] When the attribute is missing, this option allows you to suppress the error and return a no-results-found
missingAttributeEqualsNoResults: true
# [OPTIONAL] If there are multiple torrents per title
multiple: true
# [OPTIONAL] If the response contains a field that indicates the number of hits returned,
# then you define that field in the count block selector, so that if the response had a
# count of 0 if would indicate a results not found condition.
# If the response uses an empty set [] to signify a no results found state, then don't use the count block.
count:
# [REQUIRED] IF you have defined the Count block then you need to provide the field that has the count.
# You can use the $ symbol to refer to a root object field, for example: $[0].id
selector: data.movie_count
# [REQUIRED] list of attributes which are extracted for each row
fields:
# All the regular filters are available as described elsewhere in the Wiki. I've included some examples.
#
# If you have not defined an attribute in the rows block above, then all the fields are extracted
# from the rows set.
# If you have defined an attribute in the rows block above, then a prefix of .. means that this field
# is extracted directly from the rows set, and without a .. prefix you are indicating that the field is
# to be extracted from the attribute subset.
#
# Any fields below that do not have either [OPTIONAL] or [REQUIRED] are working fields.
# You give them a name and use them to extract additional data from the row sets, which you can use in
# conditionals for setting strings for other fields, or as direct values for concatenating into strings.
#
# [REQUIRED] tracker category id (id field from caps/categorymappings)
# While not required, it is usual to return a category for Torznab apps to use,
# so if the site does not provide one in its results, then use category Other.
category:
selector: category
# [ALTERNATIVE] if the site does not provide a category id for results,
# but it does provide the category name we use for descriptions, use categorydesc instead of category
categorydesc:
selector: category
year:
selector: ..year
_quality:
selector: quality
_type:
selector: type
# [REQUIRED] the title of the torrent
title:
selector: ..title
# [OPTIONAL] any filters as described elsewhere in the Wiki
filters:
- name: replace
args: [":", ""]
- name: replace
args: [" ", "."]
- name: append
args: ".{{ .Result.year }}.{{ .Result._quality }}.{{ if eq .Result._type \"web\" }}WEBRip{{ else }}BRRip{{ end }}-YTS"
_id:
selector: id
# [OPTIONAL] link to the site's details page for the torrent
# If not available from the response then it's usual to use the .Config.sitelink as a default.
details:
text: "{{ .Config.sitelink }}browse/{{ .Result._id }}"
apikey:
text: "{{ .Config.apikey }}"
filters:
- name: urlencode
# [REQUIRED] download link for the torrent file.
# if a download link is not available you should provide a magnet URI, or if neither is available an infohash.
download:
text: "{{ .Config.sitelink }}api/v1/torrents/{{ .Result._id }}/torrent?key={{ .Result.apikey }}"
# [ALTERNATIVE] magnet link
magnet:
selector: magnet_uri
# [ALTERNATIVE] Loads the infohash, and for Public and Semi-Private Indexers auto-generates a magnet URI
# When neither the .torrent link or a magnet URI is available, use the infohash statement to auto-generate a
# magnet URI from an infohash. The magnet's &dn= will be loaded from the .Result.title, and a set of ten
# currently most useful trackers will be added for the &tr= sequence.
# Note that the auto-generation is disabled for Private Indexers.
infohash:
selector: hash
# [OPTIONAL] link to a poster image (cover, banner, etc.)
# This will show up (on the Jackett dashboard search page) as a tooltip when you hover over the title
# If the selector does not match it is ignored.
poster:
selector: ..large_cover_image
# [OPTIONAL] description (any other available/relevant information)
# This will show up (on the Jackett dashboard search page) as info on a tooltip when you hover over the title
# If the selector does not match it is ignored.
description:
text: "{{ .Result.year }} - {{ .Result._quality }} - {{ .Result._type }}"
# [OPTIONAL] id for imdb.com if e.g. a link is returned then the number is extracted automatically
# An alias named imdb is also valid here.
# If the selector does not match it is ignored.
imdbid:
selector: ..imdb_id
# [OPTIONAL] id for tvrage.com if a link is returned then the number is extracted automatically
# If the selector does not match it is ignored.
rageid:
selector: ..rage_id
# [OPTIONAL] id for themoviedb.org if a link is returned then the number is extracted automatically
# If the selector does not match it is ignored.
tmdbid:
selector: ..tmdb_id
# [OPTIONAL] id for thetvdb.com if a link is returned then the number is extracted automatically
# If the selector does not match it is ignored.
tvdbid:
selector: ..tvdb_id
# [OPTIONAL] id for tvmaze.com if a link is returned then the number is extracted automatically
# If the selector does not match it is ignored.
tvmazeid:
selector: ..tvmaze_id
# [OPTIONAL] id for trakt.tv if a link is returned then the number is extracted automatically
# If the selector does not match it is ignored.
traktid:
selector: ..trakt_id
# [OPTIONAL] id for movie.douban.com if a link is returned then the number is extracted automatically
# If the selector does not match it is ignored.
doubanid:
selector: ..douban_id
# [REQUIRED] publish date (if the site does not provide a date for all results, then "now" is preferred)
date:
selector: ..date_uploaded_unix
# [REQUIRED] size of the torrent (units are handled automatically). if the site does not provide a size for all
# results, then provide a default of "512 MB". If the site occasionally has a missing size then "0 B" is usual.
# Side note: For Sites using European numbering schemes (1,024.4MB or 1.024,4MB etc.) there is no need to remove
# commas or extra dots as these are automatically dealt with.
size:
selector: size_bytes
# [OPTIONAL] number of files
files:
selector: num_file
# [OPTIONAL] number of completed downloads
grabs:
selector: completed
# [REQUIRED] number of seeders (if the site does not provide seeders for all results,
# then provide a default of "1").
seeders:
selector: seeds
# [OPTIONAL] number of leechers (if the site does not provide leechers for all results,
# then provide a default of "1").
leechers:
selector: peers
# [OPTIONAL] genre. A list of one or more genre categories.
# You should aim to load genre with a comma delimited list, for example: "Action, Drama, Thriller"
# and use filters to massage the list into the requisite layout if required.
# If the selector does not match it is ignored.
genre:
selector: genres
# [OPTIONAL] Factor for the download volume. In most cases it should be set to "1"
# Set to "0" if a torrent is freeleech, "0.5" if only 50% is counted, "0.75" if only 75% is counted.
# if a site states that the download is 75% free then the DLVF is 0.25 (only 25% is counted).
downloadvolumefactor:
selector: freeleech
# in this example the freeleech provided by the API is 0=false, 1=true
# so we use a case block to provide the expected DLVF values
case:
0: 1 # not free
1: 0 # freeleech
# [OPTIONAL] Factor for the upload volume, in most cases it should be set to "1"
# Set it to "0" for a torrent that is a neutral leech (upload is not counted), set to "2" for a double upload
uploadvolumefactor:
selector: double_upload
# in this example the double_upload provided by the API is 0=false, 1=true
# so we use a case block to provide the expected ULVF values
case:
0: 1 # normal
1: 2 # double
# [OPTIONAL] minimum ratio the torrent client must seed to avoid Hit & Run penalties
minimumratio:
text: 0.4
# [OPTIONAL] minimum number of seconds the client must seed to release the MR requirement
minimumseedtime:
# 7 days (as seconds = 7 x 24 x 60 x 60)
text: 604800
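The `..` prefix rule described in the fields comments can be modelled with a few lines of Python. This is only a sketch of the behaviour described above, not Jackett's C# implementation; the sample response and the `extract_rows` helper are invented for illustration.

```python
# Hypothetical API response shaped like the example definition above.
response = {
    "data": {
        "movie_count": 1,
        "movies": [
            {
                "title": "Some Movie",
                "year": 2019,
                "torrents": [
                    {"quality": "720p", "type": "web"},
                    {"quality": "1080p", "type": "bluray"},
                ],
            }
        ],
    }
}


def extract_rows(resp, fields):
    """Return one result per torrent, resolving '..' selectors on the parent row."""
    results = []
    for row in resp["data"]["movies"]:        # rows.selector: data.movies
        for sub in row.get("torrents", []):   # rows.attribute: torrents
            result = {}
            for name, selector in fields.items():
                if selector.startswith(".."):
                    # '..' prefix: read the field from the row set itself
                    result[name] = row.get(selector[2:])
                else:
                    # no prefix: read the field from the attribute subset
                    result[name] = sub.get(selector)
            results.append(result)
    return results


fields = {"title": "..title", "year": "..year", "_quality": "quality", "_type": "type"}
rows = extract_rows(response, fields)
for r in rows:
    print(r)
```

Each torrent in the subset becomes its own result, while `title` and `year` are repeated from the parent movie, which is what `multiple: true` with an `attribute` is meant to achieve.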
Search Row Selectors
The :has(), :not() and :contains() pseudo-selectors are supported by the rows selector and the fields selectors.
rows:
selector: data:has(attributes.size):has(attributes.name:contains(1080)):has(attributes.poster:contains(.jpg)):not(attributes.fake_att):not(attributes.uploader:contains(DarkSwan2001))
fields:
title_dts:
selector: name:contains(DTS)
optional: true
filters:
- name: re_replace
args: ["DTS", "DTSSS"]
title_notdts:
selector: name:not(:contains(DTS))
optional: true
title:
text: "{{ if .Result.title_dts }}{{ .Result.title_dts }}{{ else }}{{ .Result.title_notdts }}{{ end }}"
filters:
- name: re_replace
args: ["\\[", " "]
Search XML
This is similar to the JSON method except you code type xml:
response:
# [REQUIRED] indicates that an XML response is expected
type: xml
Download
The download block is needed in the following cases:
- The torrent download link can't be extracted from the search results (e.g. if it's only available from the details page of the torrent)
- The download request must be done via HTTP POST instead of GET
- You have to access another page first before downloading the file (e.g. you have to click on the "Thank you" button first).
Note: Some trackers just omit the download link from the search results, but it still can be easily generated from the available information (e.g. use the details link and replace "details.php" with "download.php"). In this case the download block isn't needed.
Example of the download block explaining all options:
download:
# [OPTIONAL] use HTTP POST instead of GET to download the torrent file (default is get)
method: post
# [OPTIONAL] HTTP request which needs to be done before downloading the file
before:
# request target
path: thanks.php
# send via HTTP POST
method: post
# [OPTIONAL] if the before link requires a query separator other than the default "&" then use this
queryseparator: ";"
# list of HTTP arguments which will be included
inputs:
# extract the "id" parameter from the search result download URL query string
infohash: "{{ .DownloadUri.Query.id }}"
thanks: 1
selectors:
# [OPTIONAL] If a list of selectors is defined, the search result download URL will be retrieved and parsed as HTML.
# The first selector is then applied to get the actual download URL.
- selector: a[href^="download.php?id="]
attribute: href
# [OPTIONAL] Can be true or false (default is false)
# Set to true if you want the selector to come from the page generated by the previous BEFORE block.
# The default causes the selector to come from the page of the link in the search download block.
usebeforeresponse: false
# [OPTIONAL] a list of filters which should be applied to the result of this selector
filters:
- name: querystring
args: url
- name: urldecode
# [OPTIONAL] As many other selectors as you need, to be used as a fallback for when the prior selector fails to download.
- selector: a[href^="magnet:?xt="]
attribute: href
# [OPTIONAL] a list of filters which should be applied to the result of this selector
filters:
- name: toupper
Download Block Infohash Example
download:
# [OPTIONAL] HTTP request which needs to be done before downloading the file
before:
path: get_srv_details.php
inputs:
action: 2
id: "{{ .DownloadUri.Query.id }}"
# [OPTIONAL] If you only have a magnet hash then this method will allow you to automatically generate a magnet URI
# For use with Public or Semi-Private Indexers.
# Note that this option is not suitable for Private sites which may require ONLY the use of their own tracker and
# have DHT DISABLED and no other PUBLIC trackers on the magnet.
infohash:
# [OPTIONAL] Can be true or false (default is false)
# Set to true if you want the infohash and title to come from the page generated by the previous BEFORE block.
# The default causes the infohash and title to come from the page of the link in the search download block.
usebeforeresponse: true
# [REQUIRED] Use this selector to provide the file hash for the &xt parameter of the magnet URI
hash:
# [REQUIRED] the selector to use to find the file hash
selector: a[href^="magnet:?xt="]
attribute: href
# [OPTIONAL] a list of filters which should be applied to the result of this selector
filters:
- name: regexp
args: ([A-Fa-f0-9]{40})
# [REQUIRED] Use this selector to provide the title for the &dn parameter of the magnet URI
title:
# [REQUIRED] The selector used to find the title
selector: meta[property="og:title"]
attribute: content
# [OPTIONAL] a list of filters which should be applied to the result of this selector
filters:
- name: trim
- name: validfilename
Download Block "before" Pathselector Example
download:
# Use this method if you need to do an HTTP GET using a href in the details page in order to make a download link available
before:
# thankyou link: ./viewtopic.php?f=52&p=65417&thanks=65417&to_id=54&from_id=3950
pathselector:
selector: ul.post-buttons li:nth-last-child(1) a
attribute: href
selectors:
- selector: a[href^="magnet:?xt="]
attribute: href
Template Engine
The template engine is very basic, and supports the following statements.
re_replace
A simple regex replace operation.
Syntax: {{ re_replace .Variable "regex-term" "replace-term"}}
Example:
# Replace any non alphanumeric character in the keywords with the wildcard character
"{{ re_replace .Keywords \"[^a-zA-Z0-9]+\" \"*\" }}"
if ... else ... end
A basic if/else condition. Only boolean true (non-empty)/false (empty) operations on variables are supported.
Syntax: {{ if .Variable }}on true result{{ else }}on false result{{ end }}
Example:
search:
paths:
# when .Keywords contains a value
# then search.php is used for the path
# otherwise latest.php is used
- path: "{{ if .Keywords }}search.php{{ else }}latest.php{{ end }}"
if or/and ... else ... end
The implementation is based on: go hdr functions. These are not true logical OR and AND operators in that they operate on variables that contain a value or are empty. Note that the use of round brackets is entirely optional.
Example of: if or ... else ... end
search:
paths:
# when any of the 3 vars in brackets has a value
# then set the path to search
# otherwise set it to music
- path: "{{ if or (.Query.Album) (.Query.Artist) (.Keywords) }}search{{ else }}music{{ end }}"
inputs:
# when either/both of the two query vars in the brackets have a value
# then load whichever vars have a value to the string
# and when neither var has a value then load the value from .Keywords to the string
q: "{{ if or (.Query.Album) (.Query.Artist) }}{{ or (.Query.Album) (.Query.Artist) }}{{ else }}{{ .Keywords }}{{ end }}"
Example of: if and ... else ... end
title:
# when both the vars in brackets have a value
# then load the value from title_polish to the string
# when only one of the vars has a value, or both vars have no value
# then load the value from title_phase1 to the string
text: "{{ if and (.Config.lang) (.Result.is_polish) }}{{ .Result.title_polish }}{{ else }}{{ .Result.title_phase1 }}{{ end }}"
if eq/ne ... else ... end
The implementation is based on: go hdr functions. This is a string comparison only. Supports the use of both variables and strings.
Example of: if eq ... else ... end
size:
# when the variable .Result._cat contains the string "series"
# then the text string will be set to "512 MB"
# otherwise it will be set to "2 GB"
text: "{{ if eq .Result._cat \"series\" }}512 MB{{ else }}2 GB{{ end }}"
Nesting is supported.
size:
# when the variable .Result._cat contains any of the strings "movie", "movie_etc", "movie_eng"
# then the text string will be set to "2 GB"
# otherwise it will be set to "512 MB"
text: "{{ if or (eq .Result._cat \"movie\") (or (eq .Result._cat \"movie_etc\") (eq .Result._cat \"movie_eng\")) }}2 GB{{ else }}512 MB{{ end }}"
The special variables .True and .False are available. .True contains "True" (which represents a non-empty variable) and .False contains null (which represents an empty variable).
join
A simple loop over a list variable building a concatenated string with items joined by a delimiter.
Syntax: {{ join .Variable "<delimiter>"}}
Example:
# build a query string by concatenating all the categories with a comma
# input: [101,201,301]
"{{join .Categories \",\"}}"
# output: "101,201,301"
range
A simple loop over a list variable building a concatenated string.
Syntax: {{ range .Variable }}<prefix>{{.}}<suffix>{{end}}
Example:
# build a query string argument list for the selected categories
# input: [101,201,301]
"{{ range .Categories }}&cat{{.}}=1{{end}}"
# output: "&cat101=1&cat201=1&cat301=1"
range (with indexing)
If a parameter requires indexing then use the following to generate an incremental index starting with zero.
Syntax: {{ range $i, $e := .Variable }}<prefix[{{$i}}]>{{.}}<suffix>{{end}}
Example:
# build a query string argument list for the selected categories with indexing
# input: [101,201,301]
"{{ range $i, $e := .Categories }}&categories[{{$i}}]={{.}}{{end}}"
# output: "&categories[0]=101&categories[1]=201&categories[2]=301"
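For comparison, the three list helpers (join, range, and range with indexing) produce the same strings as the following Python expressions. This is an analogy only, using the sample category list from the examples above; the variable names are illustrative and not part of the template engine.

```python
categories = [101, 201, 301]

# {{ join .Categories "," }}
joined = ",".join(str(c) for c in categories)

# {{ range .Categories }}&cat{{.}}=1{{end}}
ranged = "".join(f"&cat{c}=1" for c in categories)

# {{ range $i, $e := .Categories }}&categories[{{$i}}]={{.}}{{end}}
indexed = "".join(f"&categories[{i}]={c}" for i, c in enumerate(categories))

print(joined)   # 101,201,301
print(ranged)   # &cat101=1&cat201=1&cat301=1
print(indexed)  # &categories[0]=101&categories[1]=201&categories[2]=301
```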
Variable substitution
The basic variable substitution operation.
Syntax: {{ .Variable }}
Variables
TODO: more explanation
Config Variables
Note that these are always available. Generated based on the settings section
.Config.$Name # for example .Config.username , .Config.password , .Config.sitelink
Special Variables
Note that these are always available.
.True contains "True" (which represents a non-empty variable)
.False contains null (which represents an empty variable)
.Today.Year contains "2024" (or whatever the current year is)
Search Query Variables
Note that these are only available during search queries.
.Query.Type # search, movie, tvsearch, book, music
.Query.Q
.Query.Series # not supported (Cardigann compatibility)
.Query.Ep # from t=tvsearch
.Query.Season # from t=tvsearch
.Query.Movie # not supported (Cardigann compatibility)
.Query.Year # from t=tvsearch or t=movie or t=music or t=book
.Query.Limit
.Query.Offset
.Query.Extended
.Query.Categories
.Query.APIKey
.Query.TVDBID # from t=tvsearch
.Query.TVRageID # from t=tvsearch
.Query.IMDBID # e.g. tt12345678 from t=tvsearch or t=movie
.Query.IMDBIDShort # e.g. 12345678
.Query.TMDBID # from t=tvsearch or t=movie
.Query.TVMazeID # from t=tvsearch
.Query.TraktID # from t=tvsearch or t=movie
.Query.DoubanID # from t=tvsearch or t=movie
.Query.Genre # from t=tvsearch or t=movie or t=music or t=book
.Query.Album # from t=music
.Query.Artist # from t=music
.Query.Label # from t=music
.Query.Track # from t=music
.Query.Episode # EpisodeSearchString, such as S00E00 or S00 or yyyy.MM.dd from t=tvsearch
.Query.Author # from t=book
.Query.Title # from t=book
.Query.Publisher # from t=book
.Categories # MappedCategories
.Query.Keywords # original keywords
.Keywords # keywords after applying the keywordsfilters
The following are boolean-like variables in that they either return the string "True" or are null. They can be used in if-else-end statements.
.Query.IsBookSearch # t=book
.Query.IsDoubanQuery # from t=tvsearch or t=movie
.Query.IsGenreQuery # from t=tvsearch or t=movie or t=music or t=book
.Query.IsIdSearch # Episode.IsNotNullOrWhiteSpace() || Season > 0 || IsImdbQuery || IsTvdbQuery || IsTVRageQuery || IsTraktQuery || IsTvmazeQuery || IsTmdbQuery || IsDoubanQuery || Album.IsNotNullOrWhiteSpace() || Artist.IsNotNullOrWhiteSpace() || Label.IsNotNullOrWhiteSpace() || Genre.IsNotNullOrWhiteSpace() || Track.IsNotNullOrWhiteSpace() || Author.IsNotNullOrWhiteSpace() || Title.IsNotNullOrWhiteSpace() || Publisher.IsNotNullOrWhiteSpace() || Year.HasValue
.Query.IsImdbQuery # from t=tvsearch or t=movie
.Query.IsMovieSearch # t=movie
.Query.IsMusicSearch # t=music
.Query.IsRssSearch # SearchTerm.IsNullOrWhiteSpace() && !IsIdSearch
.Query.IsSearch # t=search
.Query.IsTVRageQuery # from t=tvsearch
.Query.IsTVSearch # t=tvsearch
.Query.IsTmdbQuery # from t=tvsearch or t=movie
.Query.IsTraktQuery # from t=tvsearch or t=movie
.Query.IsTvdbQuery # from t=tvsearch
.Query.IsTvmazeQuery # from t=tvsearch
Note: There are several variables that are not supported and are provided by Cardigann for compatibility with the Torznab specifications. These variables will always return null.
All field results are also available to the fields that follow them via the .Result.$FieldName variables.
For example:
fields:
title:
selector: h3 a
_subcat:
selector: div.box ul li:first-child
year:
selector: div.box ul li:contains("Year:")
_quality:
selector: div.box ul li:contains("Quality:")
description:
text: "{{ .Result._subcat }} {{ .Result.year }} {{ .Result._quality }}"
Temporary variables used to help build release results should contain an underscore in their variable names, such as title_phase1 or _quality.
Download Variables
Based on the download search field result the following variables are available:
.DownloadUri.AbsoluteUri example: https://domain.to/torrent/1234567/A-Torrent-Name-1080p/
.DownloadUri.AbsolutePath example: /torrent/1234567/A-Torrent-Name-1080p/
.DownloadUri.Scheme example: https
.DownloadUri.Host example: domain.to
.DownloadUri.Port example: 443
.DownloadUri.PathAndQuery example: /torrent/1234567/A-Torrent-Name-1080p/
.DownloadUri.Query example: see below
For each query string argument of the URI a corresponding .DownloadUri.Query.$Key variable is generated.
For example, a URI like https://amigos-share.club/torrents-details.php?id=37346&hit=yes would generate the following two variables: .DownloadUri.Query.id with the value 37346 and .DownloadUri.Query.hit with the value yes.
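The decomposition of a download URI into these variables can be reproduced with Python's standard `urllib.parse` module. This is a sketch for illustration only; the `download_uri` dictionary is a stand-in for the template variables and is not how Jackett stores them.

```python
from urllib.parse import parse_qs, urlsplit

uri = "https://amigos-share.club/torrents-details.php?id=37346&hit=yes"
parts = urlsplit(uri)

# parse_qs returns a list per key; keep the first value of each argument.
query = {key: values[0] for key, values in parse_qs(parts.query).items()}

download_uri = {
    "Scheme": parts.scheme,        # .DownloadUri.Scheme
    "Host": parts.hostname,        # .DownloadUri.Host
    "AbsolutePath": parts.path,    # .DownloadUri.AbsolutePath
    "Query": query,                # .DownloadUri.Query.$Key
}

print(download_uri["Query"]["id"])   # 37346
print(download_uri["Query"]["hit"])  # yes
```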
Filters
querystring
Extract values from URL arguments.
Example:
# extract the category ID from a category link
selector: a[href^="browse.php?cat="]
attribute: href
filters:
# input: browse.php?cat=123
- name: querystring
args: cat
# result: 123
prepend
Extends a string by adding additional characters to the beginning of its current value. The single parameter in the argument is the string to be prefixed.
Example:
# prefix InfoHash with a magnet URI header
selector: span > a
attribute: href
filters:
# input: B21F2A6DB07A8F4F76E2C5E15D28235D356B8D41
- name: prepend
args: "magnet:?xt=urn:btih:"
# result: magnet:?xt=urn:btih:B21F2A6DB07A8F4F76E2C5E15D28235D356B8D41
append
Extends a string by appending additional characters to the end. The single parameter in the argument is the string to be appended.
Example:
# add a tracker to complete the magnet URI
selector: span > a
attribute: href
filters:
# input: magnet:?xt=urn:btih:B21F2A6DB07A8F4F76E2C5E15D28235D356B8D41&dn=I.Am.A.Magnet
- name: append
args: "&tr=udp://tracker.coppersurfer.tk:6969"
# result: magnet:?xt=urn:btih:B21F2A6DB07A8F4F76E2C5E15D28235D356B8D41&dn=I.Am.A.Magnet&tr=udp://tracker.coppersurfer.tk:6969
tolower
Converts a string to lowercase letters. Does not require any parameters.
Example:
# make the title lowercase
selector: dt a
filters:
# input: MY MOVIE TITLE 1080P
- name: tolower
# result: my movie title 1080p
toupper
Converts a string to uppercase letters. Does not require any parameters.
Example:
# make the title uppercase
selector: dt a
filters:
# input: my movie title 1080p
- name: toupper
# result: MY MOVIE TITLE 1080P
replace
If the pattern string is matched, then the pattern is replaced by a replacement string. The first parameter in the argument is the pattern string, and the second is the replacement string.
Example:
# fix the date field when it contains Y-day
selector: td:nth-child(2)
filters:
# input: Y-day 12:27
- name: replace
args: ["Y-day", "yesterday"]
# result: yesterday 12:27
split
Divides a string into an array of substrings, and returns the selected substring. The first parameter in the argument is the single character pattern used to split the string, and the second parameter is the array element number of the wanted substring, counting from zero for the first element.
Example:
# extract the category id
selector: td[class^="coll-1"] a[href^="sub/"]
attribute: href
filters:
# input: sub/45/0
- name: split
args: ["/", 1]
# result: 45
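The zero-based indexing is the same as in most programming languages. As a rough Python equivalent of the example above (an analogy, not Jackett's code):

```python
href = "sub/45/0"

# split on "/" gives ["sub", "45", "0"]; element 1 is the category id
category_id = href.split("/")[1]
print(category_id)  # 45
```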
trim
Removes all leading and trailing occurrences of a set of specified characters. Used without an argument removes all leading and trailing white-space characters. If a set of characters are supplied in an argument, then those will be removed from all leading and trailing occurrences.
Example:
# fetch the title
selector: td:nth-child(2) a
attribute: title
filters:
# input: "  This Is My Title  " (with leading and trailing white-space)
- name: trim
# result: This Is My Title
# extract the title
selector: td:nth-child(2) a
attribute: title
filters:
# input: xxxThis Is My Titlexxx
- name: trim
args: "x"
# result: This Is My Title
regexp
Perform pattern-matching and "search-and-replace" functions on a string using a Regular Expression.
Example:
# extract the uploaded date and time
selector: td:nth-child(2) font.detDesc
filters:
# input: Uploaded 09-14 02:31, Size 282.88 MiB, ULed by
- name: regexp
args: "Uploaded (.+?),"
# result: 09-14 02:31
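The example suggests that the filter returns the content of the first capture group. Under that assumption, the same extraction in Python looks like the sketch below. Jackett uses the .NET regex engine, so more exotic patterns may behave differently.

```python
import re

raw = "Uploaded 09-14 02:31, Size 282.88 MiB, ULed by"

# The lazy group (.+?) stops at the first comma after "Uploaded ".
match = re.search(r"Uploaded (.+?),", raw)
uploaded = match.group(1) if match else ""
print(uploaded)  # 09-14 02:31
```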
re_replace
Similar to replace, but the parameters in the argument are Regular Expressions.
Example:
# normalize to SXXEYY format
selector: td:nth-child(2) a.tab
attribute: href
filters:
# input: 12x45
- name: re_replace
args: ["(\\d{2})x(\\d{2})", "S$1E$2"]
# result: S12E45
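For readers who want to try a pattern outside Jackett, the example above corresponds roughly to Python's `re.sub`. Note one syntax difference: the definition uses .NET-style `$1` and `$2` group references, whereas Python writes them as `\1` and `\2`. This is an analogy only, not Jackett's implementation.

```python
import re

# Two digits, an "x", two digits -> S<season>E<episode>
normalized = re.sub(r"(\d{2})x(\d{2})", r"S\1E\2", "12x45")
print(normalized)  # S12E45
```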
validate
Given a list of words, delimited by any one of , /.)(;[]"|: this filter will return a comma-delimited list of only the words that are in the args. Useful for removing non-genre types from an open tag list.
Note: to preserve a double word (for example Science Fiction or Sci-Fi & Fantasy) replace the spaces with underscores. These will be auto-restored in results.
Example:
# remove any tags that are not standard genre types
selector: div.tags
filters:
- name: re_replace
args: ["(?i)(Science Fiction)", "Science_Fiction"]
# input: crime, x264, 1080p, (music), pack, comedy, Science Fiction, dd5.1, Hip/Hop
- name: validate
args: "Action, Adventure, Crime, Comedy, Science_Fiction, War"
# result: crime, comedy, science fiction
dateparse
Converts a date/time string into a DateTime object ("ddd, dd MMM yyyy HH:mm:ss z"). The string to be processed is the value produced by the selector, and the argument is the format to use for the conversion. For a full breakdown of the format specifiers see https://learn.microsoft.com/en-us/dotnet/standard/base-types/custom-date-and-time-format-strings
Here are the more common format specifiers used by Jackett
format specifier | description | example |
---|---|---|
yyyy | The year as a four-digit number. | 2009-06-15T13:45:30.6175 -> 2009 |
yy | The year, from 00 to 99. | 2009-06-15T13:45:30.6175 -> 09 |
MMMM | The full name of the month. | 2009-06-15T13:45:30.6175 -> June |
MMM | The abbreviated name of the month. | 2009-06-15T13:45:30.6175 -> Jun |
MM | The month, from 01 through 12. | 2009-06-15T13:45:30.6175 -> 06 |
M | The month, from 1 through 12. | 2009-06-15T13:45:30.6175 -> 6 |
dddd | The full name of the day of the week. | 2009-06-15T13:45:30.6175 -> Monday |
ddd | The abbreviated name of the day of the week. | 2009-06-15T13:45:30.6175 -> Mon |
dd | The day of the month, from 01 through 31. | 2009-06-15T13:45:30.6175 -> 15 |
d | The day of the month, from 1 through 31. | 2009-06-15T13:45:30.6175 -> 15 |
HH | The hour, using a 24-hour clock from 00 to 23. | 2009-06-15T13:45:30.6175 -> 13 |
H | The hour, using a 24-hour clock from 0 to 23. | 2009-06-15T13:45:30.6175 -> 13 |
hh | The hour, using a 12-hour clock from 01 to 12. | 2009-06-15T13:45:30.6175 -> 01 |
h | The hour, using a 12-hour clock from 1 to 12. | 2009-06-15T13:45:30.6175 -> 1 |
mm | The minute, from 00 through 59. | 2009-06-15T13:45:30.6175 -> 45 |
m | The minute, from 0 through 59. | 2009-06-15T13:45:30.6175 -> 45 |
ss | The second, from 00 through 59. | 2009-06-15T13:45:30.6175 -> 30 |
s | The second, from 0 through 59. | 2009-06-15T13:45:30.6175 -> 30 |
ffff | The ten thousandths of a second in a date and time value. | 2009-06-15T13:45:30.6175 -> 6175 |
fff | The milliseconds in a date and time value. | 2009-06-15T13:45:30.6175 -> 617 |
ff | The hundredths of a second in a date and time value. | 2009-06-15T13:45:30.6175 -> 61 |
f | The tenths of a second in a date and time value. | 2009-06-15T13:45:30.6175 -> 6 |
tt | The AM/PM designator. | 2009-06-15T13:45:30.6175 -> PM |
zzz | Hours and minutes offset from UTC. | 2009-06-15T13:45:30-07:00 -> -07:00 |
zz | Hours offset from UTC, with a leading zero for a single-digit value. | 2009-06-15T13:45:30-07:00 -> -07 |
Example:
# get the DateTime
selector: td.torrent_table_dateAdded
filters:
# input: 2017-09-18 19:17:24 +00:00
- name: dateparse
args: "yyyy-MM-dd HH:mm:ss zzz"
# result: Mon, 18 Sep 2017 19:17:24 GMT
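A quick way to check a format against a sample date is to parse it in another language. The Python sketch below parses the same input; its numeric month corresponds to the `MM` specifier in the table above and to `%m` in Python. The mapping between .NET specifiers and `strptime` directives is approximate, and the snippet assumes Python 3.7 or later, where `%z` accepts an offset containing a colon.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

raw = "2017-09-18 19:17:24 +00:00"

# Rough mapping: yyyy -> %Y, MM -> %m, dd -> %d, HH -> %H, mm -> %M, ss -> %S, zzz -> %z
parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S %z")

# Normalise to UTC and render in the "ddd, dd MMM yyyy HH:mm:ss" layout.
formatted = format_datetime(parsed.astimezone(timezone.utc), usegmt=True)
print(formatted)  # Mon, 18 Sep 2017 19:17:24 GMT
```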
timeparse
Alias for dateparse
timeago
Converts a time-ago string into a DateTime object ("ddd, dd MMM yyyy HH:mm:ss z"). Does not require an argument. Timeago can handle a time-ago string such as:
now
2 hours and 1 day
4 years ago
1 week
5 months
9hr,12m,39s
8 days 3 hours 12 minutes 10 seconds
Example:
# get the DateTime (assuming the current time is Mon, 18 Sep 2017 19:17:24 GMT)
selector: td.torrent_table_dateAdded
filters:
# input: 2 hours and 1 day
- name: timeago
# result: Sun, 17 Sep 2017 17:17:24 GMT
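The arithmetic behind the example can be sketched in Python as follows. This is a simplified model written for this page, not Jackett's parser: the `time_ago` helper and its unit table are hypothetical, and months and years are deliberately left out because they need calendar arithmetic.

```python
import re
from datetime import datetime, timedelta, timezone

# Seconds per unit, matched by prefix ("hr", "hours" and "h" all start with "h").
UNIT_PREFIXES = (("s", 1), ("m", 60), ("h", 3600), ("d", 86400), ("w", 604800))


def time_ago(text, now):
    """Subtract every '<number> <unit>' pair found in text from now."""
    delta = timedelta()
    for amount, unit in re.findall(r"(\d+)\s*([a-z]+)", text.lower()):
        if unit.startswith(("mo", "y")):
            # Months and years are not handled in this sketch.
            raise ValueError(f"unit not handled in this sketch: {unit}")
        for prefix, seconds in UNIT_PREFIXES:
            if unit.startswith(prefix):
                delta += timedelta(seconds=int(amount) * seconds)
                break
        else:
            raise ValueError(f"unknown unit: {unit}")
    return now - delta


now = datetime(2017, 9, 18, 19, 17, 24, tzinfo=timezone.utc)
print(time_ago("2 hours and 1 day", now))  # 2017-09-17 17:17:24+00:00
```

Words that are not preceded by a number, such as "and" or "ago", are ignored, and an input with no number at all, such as "now", returns the reference time unchanged.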
reltime
Alias for timeago
fuzzytime
Converts a fuzzy-time string into a DateTime object ("ddd, dd MMM yyyy HH:mm:ss z"). By default fuzzytime renders a USA_Date. But if you supply an argument containing "UK" then it will return a UK_Date. Fuzzytime can handle a fuzzy-time string such as:
now
4 years ago (or any other timeago values)
Today
Yesterday
Tomorrow
1505788002 (a UNIX time-stamp Tue Sep 19 02:26:42 2017 UTC)
01-31 (dates without a year value)
1 Jan
Wednesday at 15:30
Example:
# get the DateTime (assuming the current time is Mon, 18 Sep 2017 19:17:24 GMT)
selector: td.torrent_table_dateAdded
filters:
# input: Yesterday
- name: fuzzytime
# result: Sun, 17 Sep 2017 19:17:24 GMT
htmldecode
Converts a string that has been HTML-encoded for HTTP transmission into a decoded string.
Example:
# decode the HTML
selector: td:nth-child(2) a
attribute: href
filters:
- name: querystring
args: f
# input: Anne+Rice%26%23039%3Bs+Mayfair+Witches+S01E01+1080p+WEB-DL+DD%2B+5.1+H.264-GGEZ
- name: htmldecode
# result: Anne Rice's Mayfair Witches S01E01 1080p WEB-DL DD+ 5.1 H.264-GGEZ
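The sample value in this example carries two layers of encoding: URL encoding (`+` and `%XX` sequences) and, underneath it, an HTML numeric character reference for the apostrophe. The Python sketch below peels them off one at a time, assuming the URL layer is removed first; which Jackett filter removes which layer is not stated on this page, so treat the split as an assumption.

```python
import html
from urllib.parse import unquote_plus

raw = "Anne+Rice%26%23039%3Bs+Mayfair+Witches+S01E01+1080p+WEB-DL+DD%2B+5.1+H.264-GGEZ"

# Layer 1: URL decoding turns "+" into spaces and "%XX" into characters.
url_decoded = unquote_plus(raw)

# Layer 2: HTML decoding turns the character reference into an apostrophe.
html_decoded = html.unescape(url_decoded)
print(html_decoded)
```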
urldecode
Converts a string that has been encoded for transmission in a URL into a decoded string.
Example:
# decode the url
selector: td:nth-child(2) a.tab
attribute: href
filters:
# input: https://zooqle.com/search?q=preacher+s01e10
- name: urldecode
# result: https://zooqle.com/search?q=preacher s01e10
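The result shown above matches what Python's `unquote_plus` produces for the same input, which can be handy for checking an expected value outside Jackett. This is a comparison only and says nothing about how the filter is implemented.

```python
from urllib.parse import unquote_plus

# "+" becomes a space; any "%XX" sequences would be decoded as well.
decoded = unquote_plus("https://zooqle.com/search?q=preacher+s01e10")
print(decoded)  # https://zooqle.com/search?q=preacher s01e10
```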
urlencode
Encodes a URL string.
Example:
# encode the url
magfile:
text: "{{ .Result.title }}"
filters:
# input: https://zooqle.com/search?q=preacher s01e10
- name: urlencode
# result: https://zooqle.com/search?q=preacher+s01e10
validfilename
Ensures that a string comprises only characters that are valid for use in filenames.
Example:
# get the filename
text: "{{ .Result.title }}"
filters:
# input: aFile?Name>With<Invalid*Symbols
- name: validfilename
# result: aFileNameWithInvalidSymbols
diacritics
Replace diacritics characters with their base character.
Example:
# replace any diacritics
keywordsfilters:
# input: ŠĐĆŽšđčćž
- name: diacritics
args: replace
# result: SĐCZsđccz
jsonjoinarray
Parse the input string as JSON, apply a JSONPath expression and join the resulting array using the specified separator.
args: [JSONPathExpression, separator]
Example:
# extract HTML code from a JSON response
preprocessingfilters:
- name: jsonjoinarray
args: ["$.result", ""]
hexdump
Dump the HTML of each row to the log in HEX format (for debugging purposes). You will need to have Enhanced Logging enabled to view the results.
Example:
date:
selector: div[class="resultdivbotton"] div[class="resulttime"] div[class="resultdivbottontime"]
filters:
# input: Tue, 19 Sep 2017 21:21:52 +12
- name: hexdump
# result in the log: MM-dd hh:mm:ss Debug CardigannIndexer (trackername): strdump: T(54)u(75)e(65),(2C) (20)1(31)9(39) (20)S(53)e(65)p(70) (20)2(32)0(30)1(31)7(37) (20)2(32)1(31):(3A)2(32)1(31):(3A)5(35)2(32) (20)+(2B)1(31)2(32)
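The dump format pairs each character with its code in upper-case hexadecimal, which makes invisible or look-alike characters easy to spot. A small Python sketch that reproduces the layout shown in the log line above (the `hex_dump` helper is written for this page and is not Jackett code):

```python
def hex_dump(text):
    """Render each character followed by its hex code in brackets."""
    return "".join(f"{ch}({ord(ch):02X})" for ch in text)


print(hex_dump("Tue, 19"))  # T(54)u(75)e(65),(2C) (20)1(31)9(39)
```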
strdump
Dump the HTML of each row or field to the log (for debugging purposes). You will need to have Enhanced Logging enabled to view the results. If you are using strdump to debug multiple field selectors, you can use the Optional args so that you can uniquely tag the results in the enhanced log.
Example:
selector: div[class="resultdivbotton"] div[id^="hideinfohash"]
filters:
# input: dbbde2fc0c299c1d1aa43280b57dafc3fbf0bd39
- name: strdump
# result in the log: MM-dd hh:mm:ss Debug CardigannIndexer (trackername): strdump: dbbde2fc0c299c1d1aa43280b57dafc3fbf0bd39
fields:
title:
selector: a[href*="?p=torrents&pid=10&action=details"]
filters:
# input: Star Trek 1080p
- name: strdump
args: title
# result in the log: MM-dd hh:mm:ss Debug CardigannIndexer (trackername): strdump(title): Star Trek 1080p
download:
selector: a[href^="details.php/?id="]
attribute: href
filters:
# input: http://tracker.btnext.com/details.php/?id=123456
- name: strdump
args: dl_href_in
# result in the log: MM-dd hh:mm:ss Debug CardigannIndexer (trackername): strdump(dl_href_in): http://tracker.btnext.com/details.php/?id=123456
- name: replace
args: ["/details.php/", "/download.php/"]
- name: strdump
args: dl_href_out
# result in the log: MM-dd hh:mm:ss Debug CardigannIndexer (trackername): strdump(dl_href_out): http://tracker.btnext.com/download.php/?id=123456
Proposed changes
- Add support for a more powerful and cross platform template engine (JavaScript?)
- Add support for multi row parsing as described in https://github.com/cardigann/cardigann/pull/336#issuecomment-277645749