Are you trying to download video fragments and recombine streaming video files into an mp4? Find the source code and examples on GitHub: https://github.com/nathanielkam/mint-csv-import
Disclaimer:
Do not download videos you do not have the rights or license to.
What is Bash?
Bash is a popular Unix shell and command language. Bash was the default shell for macOS up until Catalina, when it was replaced with zsh. Bash continues to be one of the most popular shells for Linux workstations and server distributions.
Using bash, we can read a list of commands or inputs from a file, run programs from the command line, and examine their standard output to coordinate other tasks.
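As a small illustration of that idea, the sketch below reads inputs from a file line by line, runs a program on each one, and branches on the program's output. The count_chars helper and the temporary input file are made up for this example and are not part of the download script.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print how many characters a string has.
count_chars() {
  printf '%s' "$1" | wc -c | tr -d ' '
}

# Create a throwaway input file with one item per line.
input_file=$(mktemp)
printf '%s\n' "alpha" "be" "gamma" > "$input_file"

# Read the file line by line and act on each program's standard output.
while IFS= read -r line; do
  length=$(count_chars "$line")
  if [ "$length" -gt 3 ]; then
    echo "$line is long ($length characters)"
  else
    echo "$line is short ($length characters)"
  fi
done < "$input_file"

rm -f "$input_file"
```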
Why I Wrote This Bash Script to Download Video Fragments
I originally wrote this script for 2 reasons:
- The first was that we couldn’t find the original versions of some of our old marketing webinars (from before we moved everything to S3 cloud storage).
- The second was that it seemed like a good learning exercise for bash scripting.
What is a .ts Video File?
The .ts video extension stands for transport stream (MPEG-2 Transport Stream). You may also see it as .mp2t, .m2ts, .tsv or .tsa. The format was originally designed for digital broadcasting and is also used on Blu-ray discs (as .m2ts), but today you are most likely to find it in internet video streams. A transport stream is a container, and the video and audio inside it can usually be repackaged or converted into other common formats like mp4 fairly easily.
Does the File Format Need to Be .ts to Download Video Fragments?
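We will be using FFmpeg, specifically its concat demuxer, to do the concatenation (combining) of the video fragments. That means the fragments do not strictly have to be .ts files, as long as they all use the same codecs. From the FFmpeg documentation: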
The demuxer is more flexible – it requires the same codecs, but different container formats can be used; and it can be used with any container formats, while the protocol only works with a select few containers.
FFMPEG documentation as of 05/08/20 – https://trac.ffmpeg.org/wiki/Concatenate#samecodec
What Does This Script Do?
The bash script we’ve made available here on GitHub is going to do the following:
- Download all pieces of a video stream. For example, if the video fragments are 0001.ts to 0999.ts it will download all of them into a folder you define.
- Generate a list, or “manifest”, of .ts files in the folder you define. This will be used to run the concatenate process.
- Concatenate all video files in the manifest into one .mp4 file.
What Prerequisites Do I Need for This?
In order to run this script you will need bash (zsh also understands the loop syntax the script uses). Linux and macOS come with a suitable shell built in: on macOS, Terminal runs bash by default on versions before Catalina and zsh from Catalina onward, and bash is still installed either way. If you are on a recent version of Windows 10 you can also get bash through the Windows Subsystem for Linux.
Next, you will need a program called curl. curl is an HTTP client, which basically means it downloads and uploads web stuff. Finally, you’ll need ffmpeg to do the concatenation of video files.
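Before going further, it can help to confirm that both programs are actually installed. The snippet below is a minimal sketch of such a check; missing_commands is a helper name invented for this example and is not part of the script on GitHub.

```bash
#!/usr/bin/env bash
# Print the name of every command in the argument list that is not installed.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
    fi
  done
}

# curl and ffmpeg are the two programs the download script relies on.
missing=$(missing_commands curl ffmpeg)
if [ -n "$missing" ]; then
  echo "Please install the following before running the script:"
  echo "$missing"
else
  echo "curl and ffmpeg are both available."
fi
```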
How Do You Configure and Run the Script to Download Video Fragments?
- You must be able to download pieces of the video you want for this code to work, so check that you can do that in your browser
- You now need to get the first and last video fragment URL
  - Open your browser’s developer tools and go to the Network tab
  - Start streaming the video and look for .ts, .mpt, or .mp2 files
  - You will see that as the video plays, new fragments are downloaded in sequence
  - When the video first starts, find the fragment number (it is almost always 0 or 1 with some set of leading zeroes)
    - ex: For start fragment http://test.com/movie_0001.ts the first fragment is 1 (do not write leading zeroes)
  - Go to the end of the video and you will see the HIGHEST fragment number (write this number down)
    - ex: For end fragment http://test.com/movie_1000.ts the highest fragment is 1000
  - You now also know the URL prefix the video pieces have (write this prefix down)
    - ex: For fragment http://test.com/movie_1000.ts the prefix is http://test.com/movie_ (no 1000)
- Get the curl command to download the video fragment
  - Right click on the video fragment in your Network tab and click “Copy -> Copy as cURL”
  - In download.sh, replace the “XYZ” portions of the curl command with the REAL values from the command you copied
- Pick a name for your output folder and video
  - ex: if you are downloading http://test.com/movie_1000.ts you might want to name your output movie
- You can now construct your download command
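The bookkeeping in the steps above (prefix, fragment number, number of digits) can also be done by the shell. The sketch below assumes the fragment URL ends in digits followed by an extension, as in the test.com examples, and has no query string after it. split_fragment_url is a name invented for this illustration.

```bash
#!/usr/bin/env bash
# Split a fragment URL into prefix, number, digit width, and extension.
# Assumes the URL ends in <digits>.<extension>, e.g. .../movie_0001.ts
split_fragment_url() {
  local url=$1
  local re='^(.*[^0-9])([0-9]+)\.([A-Za-z0-9]+)$'
  if [[ $url =~ $re ]]; then
    local prefix=${BASH_REMATCH[1]}
    local digits=${BASH_REMATCH[2]}
    local ext=${BASH_REMATCH[3]}
    # 10# forces base 10 so that leading zeroes are not read as octal.
    echo "prefix=$prefix number=$((10#$digits)) width=${#digits} ext=$ext"
  else
    echo "Could not find a fragment number in: $url" >&2
    return 1
  fi
}

split_fragment_url "http://test.com/movie_0001.ts"
split_fragment_url "http://test.com/movie_1000.ts"
```

The width value is the number of digits to use in the printf format later on (for example %04d for movie_0001.ts).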
Explaining the Bash Script
The first thing to review is the set of parameters the script expects in order to run its commands.
url=$1
# Debug Text
echo "Downloading: $url"
start=$2
# Debug Text
echo "Starting at Part: $start"
stop=$3
# Debug Text
echo "Ending at Part: $stop"
output=$4
# Debug Text
echo "Output to Folder: $output"
In bash, the arguments passed to a script are available as positional parameters: $0 is the name of the script, and $1 through $n are the arguments in the order they were given.
Ex: for bash command.sh OPTION1, $0 is command.sh and $1 is OPTION1
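Because the script trusts whatever it is given, a small validation step can catch mistakes early. The following is only a sketch of how that might look; check_args is a hypothetical function that the original script does not contain.

```bash
#!/usr/bin/env bash
# Validate the four positional parameters that download.sh expects.
# Prints a message to stderr and returns 1 if something is off.
check_args() {
  if [ "$#" -ne 4 ]; then
    echo "Usage: download.sh URL_PREFIX START STOP OUTPUT_NAME" >&2
    return 1
  fi
  local start=$2 stop=$3
  # START and STOP must be whole numbers written without leading zeroes.
  local re='^(0|[1-9][0-9]*)$'
  if ! [[ $start =~ $re && $stop =~ $re ]]; then
    echo "START and STOP must be numbers without leading zeroes" >&2
    return 1
  fi
  if [ "$start" -gt "$stop" ]; then
    echo "START must not be greater than STOP" >&2
    return 1
  fi
}

# Inside a script you would call: check_args "$@" || exit 1
check_args "http://test.com/movie_" 1 1000 movie && echo "arguments look fine"
```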
Next, the script is going to make an output directory to store the fragments in:
# Make a directory for the files you are downloading, important so you only combine files for this video
# Location is relative to where download.sh is
mkdir -p "$output"
Now that we have a place to store the fragments, we’ll want to download them to the output folder:
# Start on the piece defined in the start parameter and end on the piece defined in the stop parameter
for (( i=$start; i <= $stop; i++ ))
do
  # Pause for 2 seconds to prevent rate limiting (you can change this to whatever works best for your site)
  sleep 2
  # Convert piece i to 5 digit format with leading zeroes. If you need more leading zeroes you change %05d to %06d or higher
  # if you need no leading zeroes you can just remove printf "%05d" and just use i
  id=$(printf "%05d" "$i")
  # Build the link of the current piece to download
  link="${url}${id}.ts"
  # Download the file to the defined output folder as a .ts file
  # If you need to download as a different format like mp2 you can change the .ts to .mp2
  # Replace the XYZ portions with your REAL ones
  curl "$link" -H 'authority: XYZ' -H 'origin: XYZ' -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome Safari/537.36' -H 'accept: */*' -H 'sec-fetch-site: cross-site' -H 'sec-fetch-mode: cors' -H 'referer: XYZ' -H 'accept-encoding: gzip, deflate, br' -H 'accept-language: en-US,en;q=0.9' --compressed > "$output/${id}.ts"
done
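A variation on this loop is sketched below under a few assumptions. It adds a dry-run mode for checking the zero padding before contacting a site, and it uses curl's --fail option so that an HTTP error stops the loop instead of being saved as a fragment. The names fragment_url and download_fragments and the DRY_RUN variable are invented for this example, and the request headers are left out for brevity, so you would still need to add your own -H options.

```bash
#!/usr/bin/env bash
# Build the URL for fragment number $2, zero-padded to $3 digits.
fragment_url() {
  local prefix=$1 index=$2 width=$3
  printf "%s%0${width}d.ts\n" "$prefix" "$index"
}

# Download fragments START..STOP into a folder. With DRY_RUN=1 the URLs are
# only printed and nothing is created or requested.
download_fragments() {
  local prefix=$1 start=$2 stop=$3 output=$4 width=${5:-5}
  local i link id
  for (( i = start; i <= stop; i++ )); do
    link=$(fragment_url "$prefix" "$i" "$width")
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "would download $link"
      continue
    fi
    mkdir -p "$output"
    id=$(printf "%0${width}d" "$i")
    # --fail makes curl exit with an error on HTTP 4xx/5xx responses.
    if ! curl --fail --silent --show-error -o "$output/$id.ts" "$link"; then
      echo "Stopping: fragment $i could not be downloaded" >&2
      return 1
    fi
    # Pause between requests to reduce the chance of rate limiting.
    sleep 2
  done
}

# Dry run: print the three URLs that would be requested, using 4-digit padding.
DRY_RUN=1 download_fragments "http://test.com/movie_" 1 3 movie 4
```

Passing the padding width as a parameter means a site that uses movie_0001.ts and one that uses movie_00001.ts can share the same function.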
Now that we have all our fragments we’ll need a manifest of files to combine into our final video file:
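# Change working directory to the defined output (stop if the folder is missing)
cd "$output" || exit 1
# Write all .ts files to a combine_manifest.txt file so we can combine them later with ffmpeg
# The file is overwritten on each run so that rerunning the script does not duplicate entries
for f in ./*.ts; do echo "file '$f'"; done > combine_manifest.txt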
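One way to package the manifest step as a reusable function is sketched below. It relies on the fragment names being zero-padded so that the shell's alphabetical glob order matches playback order, and it overwrites the manifest rather than appending to it. write_manifest is a name made up for this example, and the demo uses empty placeholder files rather than real video.

```bash
#!/usr/bin/env bash
# Write an ffmpeg concat manifest listing every .ts file in a folder.
# The manifest is overwritten each time the function runs.
write_manifest() {
  local folder=$1 f
  (
    cd "$folder" || exit 1
    for f in ./*.ts; do
      # Skip the literal pattern when the folder has no .ts files.
      [ -e "$f" ] || continue
      echo "file '$f'"
    done > combine_manifest.txt
  )
}

# Demo with empty placeholder files in a temporary folder.
demo=$(mktemp -d)
touch "$demo/00002.ts" "$demo/00001.ts" "$demo/00010.ts"
write_manifest "$demo"
cat "$demo/combine_manifest.txt"
rm -rf "$demo"
```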
Last but not least, we’ll want to actually run the concatenation of all the files in the manifest:
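# Use ffmpeg to concatenate all video pieces in the manifest together into one output file
# Note: we are now inside the output folder, so ../../ places the .mp4 one level above the folder the script was run from
ffmpeg -f concat -safe 0 -i combine_manifest.txt -c copy "../../$output.mp4"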
How to Run the Download Video Fragments Script
In your shell, run the script with bash (the script's for (( ... )) loop is a bash feature that a plain sh may not support):
bash download.sh https://website.domain/video_prefix_ 1 999999999999 OUTPUT_NAME_NO_EXTENSION
- Replace https with http if your site does not support https
- Replace website.domain with the actual domain you are pulling from
- Replace “video_prefix_” with the part of the URI path that comes before the fragment number.
  - Ex: If you are downloading from “https://website.domain/video_prefix_0001.ts” your prefix is “video_prefix_”
- Replace the second parameter “1” with the FIRST fragment number
- Replace the third parameter “999999999999” with the actual LAST fragment number for your video
- Replace the OUTPUT_NAME_NO_EXTENSION with the video name you want
Frequently Asked Questions
If your fragments use an extension other than .ts, do a search and replace in your text editor for “.ts” and replace it with whatever extension you need support for.
If your fragment numbers use a different number of digits, change the numbering format on the line that sets id with printf "%05d" (for example, use "%04d" for four digits).
If you are getting a permissions error, this means you are not authorized to download the video. If the video is working in your browser, then your browser is sending something that grants access to the video. This is likely the cookie or token info in your request headers. You can ADD these headers to the curl command to access protected content.
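To keep a growing list of headers readable, one option is to hold them in a bash array and expand them into -H options. This is a sketch rather than the script's actual approach: the header names are common ones, the XYZ values are placeholders to be replaced with what your browser's copied curl command shows, and build_curl_args is a name invented for this example.

```bash
#!/usr/bin/env bash
# Headers copied from the browser's "copy as curl" output.
# The values below are placeholders, not real credentials.
headers=(
  "referer: XYZ"
  "cookie: session=XYZ"
  "authorization: Bearer XYZ"
)

# Fill the global array curl_args with base options plus one -H per header.
build_curl_args() {
  local h
  curl_args=(--fail --silent --show-error --compressed)
  for h in "$@"; do
    curl_args+=(-H "$h")
  done
}

build_curl_args "${headers[@]}"
# Print the command instead of running it, so nothing is requested here.
printf '%q ' curl "${curl_args[@]}" "http://test.com/movie_00001.ts"
echo
```

In a real run you would replace the printf line with curl "${curl_args[@]}" "$link" > "$output/${id}.ts" inside the download loop.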
Still Having Issues?
If you are still struggling to download video fragments, go ahead and send me an overview of your issue and I’ll try to get this article updated to cover your question.