Commit Graph

82 Commits

Author SHA1 Message Date
Philipp Hagemeister 81c2f20b53 [youtube] Correct invalid JSON (Fixes #2353) 2014-02-09 17:56:10 +01:00
dst c1206423c4 Fix extraction of og content in single quotes 2014-01-31 03:57:33 +07:00
Jaime Marquínez Ferrándiz 0c708f11cb [bloomberg] Fix ooyala url extraction
Added a helper method to InfoExtractor for searching the ‘twitter:player’ meta property.
Now the OoyalaIE also recognizes the ‘ec’ parameter in the url as the embed code.
2014-01-29 18:03:32 +01:00
Philipp Hagemeister 7e8caf30c0 Throw an error if no video formats are found 2014-01-27 07:31:54 +01:00
Philipp Hagemeister db1f388878 [huffpost] Add support 2014-01-27 05:47:38 +01:00
Jaime Marquínez Ferrándiz 944d65c762 [extractor/common] Encode the url when calculating the md5 with the `--write-pages` option
This doesn't cause any problem in Python 2.*, but on Python 3 the `md5` function only accepts bytes.
2014-01-25 15:32:56 +01:00
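The Python 3 behaviour described in this commit can be illustrated with a minimal sketch. It is not the extractor's code; the helper name `url_md5` and the choice of UTF-8 are assumptions made for the example.

```python
import hashlib


def url_md5(url):
    """Return the hex MD5 digest of a URL given as text (hypothetical helper)."""
    # On Python 3, hashlib.md5() raises TypeError for str input,
    # so the text is encoded to bytes first (UTF-8 is assumed here).
    return hashlib.md5(url.encode('utf-8')).hexdigest()
```

On Python 3, passing the unencoded string straight to `hashlib.md5` raises a `TypeError`, which is the failure the commit avoids.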
Philipp Hagemeister 1394ce65b4 [youtube] Add new formats (Fixes #2221) 2014-01-23 23:54:06 +01:00
Philipp Hagemeister 50317b111d Merge branch 'youtube-dash-manifest'
Conflicts:
	youtube_dl/extractor/youtube.py
2014-01-22 19:58:31 +01:00
Philipp Hagemeister 9d4288b2d4 [extractor/common] Clarify when we do and do not generate the filename 2014-01-21 01:41:13 +01:00
Philipp Hagemeister b60016e831 Deal with implicitly UTF-16 decoded webpages
These webpages don't specify an encoding and rely on the BOM
2014-01-21 01:39:40 +01:00
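One way to handle pages that rely only on a byte order mark might look like the sketch below. The function name `decode_page` and the UTF-8 fallback are assumptions for illustration, not the code from this commit.

```python
import codecs


def decode_page(raw, default='utf-8'):
    """Decode raw page bytes, honouring a UTF-16 BOM if one is present."""
    # A page that declares no encoding may still start with a UTF-16 BOM.
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The 'utf-16' codec reads the BOM to pick the byte order and drops it.
        return raw.decode('utf-16')
    # Assumed fallback for this sketch: decode leniently with the default.
    return raw.decode(default, 'replace')
```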
Philipp Hagemeister dd27fd1739 [youtube] Download DASH manifest
If given, download and parse the DASH manifest file, in order to get ultra-HQ formats.
Fixes #2166
2014-01-19 05:47:20 +01:00
Philipp Hagemeister 3ec05685f7 [extractor/common] Limit --write-pages filename to 200 chars
This avoids problems with very long URLs.
2014-01-17 14:47:47 +01:00
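A length cap of this kind could be sketched as follows. Only the 200-character limit comes from the commit message; the helper name `dump_filename` and the idea of appending a hash so that truncated names stay distinct are assumptions of this example.

```python
import hashlib

MAX_LEN = 200  # limit named in the commit message


def dump_filename(url, max_len=MAX_LEN):
    """Return a name for a dumped page that is at most max_len characters."""
    if len(url) <= max_len:
        return url
    # Assumption: keep long URLs distinguishable by appending a digest
    # of the full URL after truncating the readable prefix.
    suffix = '___' + hashlib.md5(url.encode('utf-8')).hexdigest()
    return url[:max_len - len(suffix)] + suffix
```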
Philipp Hagemeister 9933b57430 [pornhub] Use centralized sorting 2014-01-07 10:25:34 +01:00
Philipp Hagemeister 3d3538e422 [khanacademy] Add support (Fixes #2066) 2014-01-07 09:35:34 +01:00
Philipp Hagemeister 5d73273f6f [orf] Use new extraction method (Fixes #2057) 2014-01-06 17:15:27 +01:00
Philipp Hagemeister 9887c9b2d6 [jpopsuki] Simplify 2014-01-03 12:51:37 +01:00
Philipp Hagemeister 08d13955dd [wistia] Prefer original video format above all others
We could also set up a formula which would weigh filesize/bitrate and vcodec/acodec (say, 1GB h264 < 3 GB MPEG2 < 2 GB h264), but that would get really messy real soon.
2014-01-01 20:23:49 +01:00
Philipp Hagemeister 5d4f3985be Document that format_id field should be present 2013-12-26 21:19:00 +01:00
Philipp Hagemeister 7217e148fb [yahoo] Use centralized sorting, and add tbr field 2013-12-25 15:18:40 +01:00
Philipp Hagemeister c7deaa4c74 [zdf] Use centralized sorting 2013-12-24 23:32:04 +01:00
Philipp Hagemeister e6812ac99d [spiegel] Use centralized sorting 2013-12-24 12:40:23 +01:00
Philipp Hagemeister 4bcc7bd1f2 Add temporary _sort_formats helper function 2013-12-24 12:31:42 +01:00
Philipp Hagemeister f49d89ee04 Add a resolution field and improve general --list-formats output 2013-12-24 11:56:02 +01:00
Philipp Hagemeister f45f96f8f8 [myvideo] Use RTMP instead of RTMPT (Fixes #2032) 2013-12-23 15:57:43 +01:00
Philipp Hagemeister 1538eff6d8 [bliptv] Remove support for direct downloads
This is now handled by the generic IE
2013-12-23 15:49:21 +01:00
Philipp Hagemeister aa94a6d315 [aparat] Add support (Fixes #2012) 2013-12-20 17:05:39 +01:00
Jaime Marquínez Ferrándiz c0d0b01f0e [generic] Detect ooyala videos (fixes #2013) 2013-12-19 20:32:12 +01:00
Philipp Hagemeister 46374a56b2 [youtube] Do not warn for videos with allow_rating=0
This fixes #1982
Test video: http://www.youtube.com/watch?v=gi2uH3YxohU
2013-12-17 02:49:56 +01:00
Itay Brandes 87a28127d2 _search_regex's "isatty" call fails with Py2exe's
_search_regex calls the sys.stderr.isatty() function for unix systems.

Py2exe uses a custom Stderr() stream which doesn't have an `isatty()`
function, leading to its crash.

This is easily fixed by checking that it's a unix system first.
2013-12-16 21:50:26 +01:00
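The ordering issue described in this commit can be sketched like this. The function name `stderr_supports_color` is hypothetical, and the extra `getattr` guard is an addition of this sketch rather than part of the original fix.

```python
import os
import sys


def stderr_supports_color(stream=None):
    """Return True only on non-Windows systems whose stream is a terminal."""
    stream = sys.stderr if stream is None else stream
    # Check the platform first, so isatty() is never reached on Windows,
    # where a replacement stream object may not provide that method.
    if os.name == 'nt':
        return False
    # Extra guard (assumption of this sketch): tolerate streams
    # that have no isatty() method on any platform.
    isatty = getattr(stream, 'isatty', None)
    return bool(isatty and isatty())
```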
Philipp Hagemeister d67b0b1596 Reorder info_dict documentation 2013-12-16 14:13:40 +01:00
Philipp Hagemeister c0ba0f4859 Document duration field 2013-12-16 04:09:43 +01:00
Philipp Hagemeister e2b38da931 [mtv] Fixup incorrectly encoded XML documents 2013-12-10 12:45:22 +01:00
Philipp Hagemeister 7cc3570e53 Add fatal=False parameter to _download_* functions.
This allows us to simplify the calls in the youtube extractor even further.
2013-12-09 01:49:03 +01:00
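The general shape of a `fatal` switch on a download helper might look like the sketch below. The names `download_page`, `DownloadError` and the injected `fetch` callable are all hypothetical, and returning `None` on a non-fatal failure is an assumption rather than the documented behaviour of the `_download_*` functions.

```python
import sys


class DownloadError(Exception):
    """Stand-in error type for this sketch."""


def download_page(fetch, url, fatal=True):
    """Call fetch(url); on failure either raise or warn, depending on fatal."""
    try:
        return fetch(url)
    except OSError as err:
        if fatal:
            raise DownloadError('Unable to download %s: %s' % (url, err))
        # Non-fatal mode: report the problem and let the caller continue.
        sys.stderr.write('WARNING: unable to download %s: %s\n' % (url, err))
        return None
```

With such a switch, a caller that can live without optional data passes `fatal=False` and checks the result, instead of wrapping every call in its own `try`/`except`.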
Philipp Hagemeister 19e3dfc9f8 [9gag] Like/dislike count (#1895) 2013-12-05 18:29:07 +01:00
Philipp Hagemeister aaebed13a8 [smotri] Simplify 2013-12-02 17:08:17 +01:00
Philipp Hagemeister 2a275ab007 [zdf] Use _download_xml 2013-11-28 05:47:50 +01:00
Philipp Hagemeister 79d09f47c2 Merge branch 'opener-to-ydl' 2013-11-25 03:30:37 +01:00
Philipp Hagemeister c059bdd432 Remove quality_name field and improve zdf extractor 2013-11-25 03:28:55 +01:00
Philipp Hagemeister 02dbf93f0e [zdf/common] Use API in ZDF extractor.
This also comes with a lot of extra format fields
Fixes #1518
2013-11-25 03:13:22 +01:00
Philipp Hagemeister e03db0a077 Merge branch 'master' into opener-to-ydl 2013-11-24 15:18:44 +01:00
Jaime Marquínez Ferrándiz 267ed0c5d3 [collegehumor] Encode the xml before calling xml.etree.ElementTree.fromstring (fixes #1822)
Uses a new helper method in InfoExtractor: _download_xml
2013-11-24 14:59:19 +01:00
Philipp Hagemeister 7012b23c94 Match --download-archive during playlist processing (Fixes #1745) 2013-11-22 22:46:46 +01:00
Philipp Hagemeister dca0872056 Move the opener to the YoutubeDL object.
This is the first step towards being able to just import youtube_dl and start using it.
Apart from removing global state, this would fix problems like #1805.
2013-11-22 19:57:52 +01:00
Philipp Hagemeister 5904088811 Add support for tou.tv (Fixes #1792) 2013-11-20 06:13:19 +01:00
Philipp Hagemeister 91c7271aab Add automatic generation of format note based on bitrate and codecs 2013-11-16 01:08:43 +01:00
Jaime Marquínez Ferrándiz 78fb87b283 Don't accept '>' inside the content attribute in OpenGraph regexes 2013-11-15 12:54:13 +01:00
Jaime Marquínez Ferrándiz ab2d524780 Improve the OpenGraph regex
* Do not accept '>' between the property and content attributes.
* Recognize the properties if the content attribute is before the property attribute using two regexes (fixes the extraction of the description for SlideshareIE).
2013-11-15 12:24:54 +01:00
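The two-regex approach described in this commit can be sketched as follows. This is an illustration under assumptions, not the project's implementation: the function name `og_search` is hypothetical, and the patterns also accept single-quoted values.

```python
import re


def og_search(prop, html):
    """Return the content of the og:<prop> meta tag, or None if absent."""
    # Neither pattern lets a value or the gap between attributes contain '>'.
    content_re = r'content=(?:"([^>"]+?)"|\'([^>\']+?)\')'
    property_re = r'property=[\'"]og:%s[\'"]' % re.escape(prop)
    patterns = [
        # property attribute first, content attribute second
        r'<meta[^>]+?%s[^>]+?%s' % (property_re, content_re),
        # content attribute first, property attribute second
        r'<meta[^>]+?%s[^>]+?%s' % (content_re, property_re),
    ]
    for pattern in patterns:
        match = re.search(pattern, html)
        if match:
            # Exactly one of the two quote alternatives captured the value.
            return next(g for g in match.groups() if g is not None)
    return None
```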
Philipp Hagemeister eb0a839866 [common] Simplify og_search_property 2013-11-12 10:36:23 +01:00
Marcin Cieślak a8eeb0597b Fix AssertionError when og property not found
On tvp.pl some webpages contain OpenGraph
metadata and some don't.

If og property is not found, _og_search_description
fails with

WARNING: unable to extract OpenGraph description; please report this issue on http://yt-dl.org/bug
Traceback (most recent call last):
  File "/usr/home/saper/bin/youtube-dl", line 18, in <module>
    youtube_dl.main()
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 766, in main
    _real_main(argv)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 719, in _real_main
    retcode = ydl.download(all_urls)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 715, in download
    videos = self.extract_info(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 348, in extract_info
    ie_result = ie.extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 125, in extract
    return self._real_extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/tvp.py", line 56, in _real_extract
    info['description'] = self._og_search_description(webpage)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 331, in _og_search_description
    return self._og_search_property('description', html, fatal=False, **kargs)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 325, in _og_search_property
    return unescapeHTML(escaped)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/utils.py", line 494, in unescapeHTML
    assert type(s) == type(u'')
AssertionError

The patch allows me to use:

  try:
    info['description'] = self._og_search_description(webpage)
    info['thumbnail'] = self._og_search_thumbnail(webpage)
  except RegexNotFoundError:
    pass
2013-11-05 23:19:29 +01:00
Jaime Marquínez Ferrándiz 9103bbc5cd Add the 'webpage_url' field to info_dict
The URL of the video page; it must allow the result to be reproduced.
It's automatically set by YoutubeDL if it's missing.
2013-11-03 12:11:13 +01:00