Question

I am scraping a forum website by content type (article, webinar, video), and I wondered whether Ruby can somehow extract the length of each video. The corresponding HTML part of the web page looks like this:

<div align="center"><script type="text/javascript" src="http://somedomain.com/wp-content/themes/thesis/custom/swfobject.js"></script>

<div id="player">This text will be replaced</div>

<script type="text/javascript">
var so = new SWFObject('http://somedomain.com/forum/yota/audio_player/player.swf','mpl','640','500','9');
so.addParam('allowscriptaccess','always');
so.addParam('allowfullscreen','true');
so.addParam('flashvars','&amp;file=http://somedomain.net/flv/ezinearticles/ezinearticles.flv&amp;dock=false');
so.write('player');
</script></div>

Interestingly, I can access the FLV file directly, so I am happy to download all the videos and then extract the length somehow. The majority of the files are FLV; some are WMV.

Please note that I have the forum webmaster's permission to do the scraping.


Solution

ffmpeg-ruby looks like it does what you want.

OTHER TIPS

I don't know about Ruby, but you can always invoke an external program, such as ffmpeg or mplayer, with the file as an argument and parse the output. Take a look at:

http://www.linuxquestions.org/questions/linux-software-2/getting-file-information-with-ffmpeg-601817/
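
For what it's worth, the "invoke ffmpeg and parse the output" idea could be sketched in Ruby roughly like this. It assumes ffmpeg is installed and on the PATH, and that it prints a line such as "Duration: 00:03:25.52, ..." to stderr for the input file. The helper names parse_duration and video_duration are made up for this example, not part of any library.

```ruby
require 'open3'

# Parse the "Duration: HH:MM:SS.xx" line that ffmpeg prints for an input file.
# Returns the length in seconds as a Float, or nil when no duration is found
# (for example when ffmpeg reports "Duration: N/A").
def parse_duration(ffmpeg_output)
  match = ffmpeg_output.match(/Duration:\s*(\d+):(\d{2}):(\d{2}(?:\.\d+)?)/)
  return nil unless match

  hours, minutes, seconds = match.captures
  hours.to_i * 3600 + minutes.to_i * 60 + seconds.to_f
end

# Hypothetical helper: runs `ffmpeg -i PATH` and parses what it writes to stderr.
# ffmpeg exits with an error here because no output file is given, but it
# still prints the input file information first, so the exit status is ignored.
def video_duration(path)
  _stdout, stderr, _status = Open3.capture3('ffmpeg', '-i', path)
  parse_duration(stderr)
end
```

Calling video_duration('ezinearticles.flv') on a downloaded file should then give the length in seconds, or nil if ffmpeg could not determine it. The same approach should work for the WMV files, provided the installed ffmpeg build can read them.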

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow