
Extract video from .swf using Python

I've written code that generates links to videos such as the one below. Once I have a link, I try to download the video like this:

import urllib.request
import os

url = 'http://www.videodetective.net/flash/players/?customerid=300120&playerid=351&publishedid=319113&playlistid=0&videokbrate=750&sub=RTO&pversion=5.2%22%20width=%22670%22%20height=%22360%22'
with urllib.request.urlopen(url) as response:
    data = response.read()
outpath = os.path.join(os.getcwd(), 'video.mp4')
with open(outpath, 'wb') as videofile:
    videofile.write(data)

All I get is a 58 kB file in that directory that can't be played. Could someone point me in the right direction?

nindalf asked Dec 04 '22 05:12



1 Answer

Your code isn't downloading the encoded video file. It downloads the Flash application (a compressed SWF file, in CWS format) that is used to play the video. That application runs in the browser and loads and plays the video dynamically, so you need some reverse engineering to figure out the actual video source. The following is my attempt at it:
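A quick way to confirm this is to look at the first few bytes of what was downloaded. The following is only a sketch: the function name is made up for illustration, and it checks just three well-known signatures (CWS and FWS for SWF files, and the ftyp box that normally starts an MP4 file).

```python
def sniff_container(data: bytes) -> str:
    """Guess the file type from its leading magic bytes (illustrative helper)."""
    if data[:3] == b"CWS":
        return "zlib-compressed SWF"
    if data[:3] == b"FWS":
        return "uncompressed SWF"
    # MP4 files normally begin with a box whose type field (bytes 4..8) is 'ftyp'.
    if data[4:8] == b"ftyp":
        return "MP4"
    return "unknown"
```

Calling sniff_container on the bytes returned by urlopen(url).read() in the question should report a compressed SWF rather than an MP4.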

Decompressing the SWF file

First, save the 58K file you mentioned on your hard disk under the name test.swf (or similar). You can then decompress it with the small Perl script cws2fws:

perl cws2fws test.swf

This results in a new file named test.fws.swf in the same directory.
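If you would rather stay in Python, the same decompression can be sketched with the standard zlib module. This is not the cws2fws script itself, just a minimal equivalent based on the SWF header layout: a 3-byte signature, a 1-byte version, a 4-byte length, and then a zlib stream when the signature is CWS. The function names are my own.

```python
import zlib
from pathlib import Path


def cws_to_fws(data: bytes) -> bytes:
    """Turn a zlib-compressed SWF (CWS) into an uncompressed one (FWS)."""
    if data[:3] != b"CWS":
        raise ValueError("not a zlib-compressed SWF (missing CWS signature)")
    # Bytes 3..8 (version and file length) are stored uncompressed and are
    # kept as they are; everything after them is a single zlib stream.
    return b"FWS" + data[3:8] + zlib.decompress(data[8:])


def decompress_file(src: str, dst: str) -> None:
    """Read a CWS file from src and write the FWS version to dst."""
    Path(dst).write_bytes(cws_to_fws(Path(src).read_bytes()))
```

With the file names used above, that would be decompress_file("test.swf", "test.fws.swf").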

Searching for the configuration URL in the FWS file

I used a simple

strings test.fws.swf | grep http
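A rough Python stand-in for that pipeline, in case strings and grep are not available. It only approximates strings (runs of at least four printable ASCII characters), and the function name and defaults are illustrative.

```python
import re


def find_strings(data: bytes, needle: bytes = b"http", min_len: int = 4) -> list[str]:
    """Return printable ASCII runs of at least min_len bytes that contain needle."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [
        match.group().decode("ascii")
        for match in pattern.finditer(data)
        if needle in match.group()
    ]
```

You would call it on the decompressed file contents, for example find_strings(open("test.fws.swf", "rb").read()).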

Which yields:

...
cookieOhttp://www.videodetective.net/flash/players/flashconfiguration.aspx?customerid=
...

Interesting. Let's append our known customerid, playerid and publishedid arguments to this URL:

http://www.videodetective.net/flash/players/flashconfiguration.aspx?customerid=300120&playerid=351&publishedid=319113
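The three arguments can also be taken straight from the player URL in the question. A small sketch with urllib.parse, assuming the endpoint accepts exactly these parameters as it did when I tried it by hand (the helper name is my own):

```python
from urllib.parse import parse_qs, urlencode, urlsplit

CONFIG_ENDPOINT = "http://www.videodetective.net/flash/players/flashconfiguration.aspx"


def config_url_from_player_url(player_url: str) -> str:
    """Build the configuration URL from the query string of a player URL."""
    params = parse_qs(urlsplit(player_url).query)
    # Keep only the three arguments used above; parse_qs returns lists of values.
    wanted = {key: params[key][0] for key in ("customerid", "playerid", "publishedid")}
    return CONFIG_ENDPOINT + "?" + urlencode(wanted)
```

A player URL that lacks one of the three arguments raises a KeyError, which seems preferable to silently building a wrong URL.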

If we open that in a browser, we can see the player configuration XML, which in turn points us to

http://www.videodetective.net/flash/players/playlist.aspx?videokbrate=450&version=4.6&customerid=300120&fmt=3&publishedid=&sub=

Now if we open that, we can finally see the source URL:

http://cdn.videodetective.net/svideo/mp4/450/6993/293732.mp4?c=300120&r=450&s=293732&d=153&sub=&ref=&fmt=4&e=20111228220329&h=03e5d78201ff0d2f7df9a
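Both lookups above come down to pulling a URL out of a response body. I did that by eye in the browser; in a script, a crude regular expression may be enough. This sketch deliberately does not assume any particular XML element or attribute names, since I have not documented the schema here, and the sample XML in the comment is made up.

```python
import re
from html import unescape

URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_urls(text: str) -> list[str]:
    """Return every http(s) URL found in text, with XML entities such as &amp; decoded."""
    return [unescape(url) for url in URL_PATTERN.findall(text)]


# Hypothetical input, not the real configuration format:
# extract_urls('<config><playlist url="http://example.com/p.aspx?a=1&amp;b=2"/></config>')
```

You would then filter the result for the playlist.aspx or .mp4 entry you are after.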

Now we can download this H.264 video file, and we are finished.
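For the download itself, the approach from the question works once it is pointed at the real source URL. A slightly more careful sketch that streams to disk instead of holding the whole video in memory (the function name and the timeout value are arbitrary choices of mine):

```python
import os
import shutil
import urllib.request


def download(url: str, outpath: str, timeout: float = 30.0) -> int:
    """Stream the resource at url into outpath and return the number of bytes written."""
    with urllib.request.urlopen(url, timeout=timeout) as response, open(outpath, "wb") as out:
        shutil.copyfileobj(response, out)
    return os.path.getsize(outpath)
```

Note that the source URL carries e and h parameters that look like an expiry timestamp and a hash. That is a guess on my part, but if it is right, the link has to be looked up fresh shortly before each download.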

Automating the whole process in a Python script

This is an entirely different task (left as an exercise to the reader).

Niklas B. answered Dec 30 '22 14:12
