sed -e '/WEBVTT/d' -e '/-->/d' some.vtt | awk '!seen[$0]++' | awk 1 ORS=' ' > some.txt
This is what I use to get flowing plain text from a VTT subtitle file. Afterwards I just need to add paragraph breaks manually.
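1st one (sed '/WEBVTT/d') deletes the WEBVTT header line at the top of the file (strictly, any line containing WEBVTT)
2nd one (sed '/-->/d') deletes the timecode lines, i.e. every line containing -->
3rd one (awk '!seen[$0]++') removes duplicate lines. Blank lines go with them, except the first one, which survives and ends up as a leading space in the output. Note that a line of dialogue that legitimately repeats later is dropped too.
4th one (awk 1 ORS=' ') replaces each newline with a space for flowing text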
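To see the four steps in action, here is a small sketch on a made-up file. The name sample.vtt and its contents are hypothetical, written only for this illustration; real subtitle files may carry extra cue settings or tags that this does not cover.

```shell
# Hypothetical sample file, invented for illustration only.
cat > sample.vtt <<'EOF'
WEBVTT

00:00:00.000 --> 00:00:02.000
hello there

00:00:02.000 --> 00:00:04.000
hello there
general kenobi
EOF

# Same pipeline as above: drop the header and timecode lines,
# drop repeated lines, then join what is left with spaces.
sed -e '/WEBVTT/d' -e '/-->/d' sample.vtt | awk '!seen[$0]++' | awk 1 ORS=' ' > sample.txt

# Brackets make the leading and trailing spaces visible.
printf '[%s]\n' "$(cat sample.txt)"
# prints: [ hello there general kenobi ]
```

The leading space is the one blank line that survives the duplicate filter, and the trailing space comes from ORS being applied after the last line as well.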
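To convert every .vtt file in the current directory, writing each result to a .txt file with the same base name:
for f in *.vtt; do sed -e '/WEBVTT/d' -e '/-->/d' "$f" | awk '!seen[$0]++' | awk 1 ORS=' ' > "${f%.*}.txt"; done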