Sometimes whisper.cpp, or even plain whisper, hallucinates or repeats lines when it gets stuck in a loop. For this particular audiobook, the way I fixed it was to re-encode all the audiobook chapters to individual files. I then used LosslessCut to remove the unwanted portion of the affected chapter, which consisted of 5 or so seconds of silence followed by what sounded like a manual stop and restart, with a foreign-language reading of a title. I merged the two remaining pieces back into one chapter, re-encoded the entire audiobook, and the transcription was fixed.
VSCode has an extension, I think it's called Highlight Duplicates, that highlights dupes, but you still need to browse through the file visually. Other extensions print a count of the dupes but don't give any line numbers. So finding a script online that detects duplicates and shows the line numbers is much harder than I thought, especially when the repeated lines come one after another as in subtitles, where the text would be on every 3rd line.
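For what it's worth, here is a minimal Python sketch of the kind of script I was looking for. It assumes the transcript is an SRT- or VTT-style file where each cue has a number line, a timing line containing `-->`, and a text line. The names `is_text_line` and `find_repeats` are made up for illustration and are not part of whisper.cpp or any extension. It only catches text that repeats in back-to-back cues, which is what a stuck loop usually looks like.

```python
import re
import sys

# Timing lines in SRT/VTT contain an arrow, e.g. "00:00:01,000 --> 00:00:04,000".
TIMESTAMP = re.compile(r"-->")


def is_text_line(line):
    """True for subtitle text; False for blanks, cue numbers, and timing lines.

    Assumption: a line made only of digits is a cue number, so a subtitle
    whose whole text is a number (e.g. "42") is skipped as well.
    """
    stripped = line.strip()
    if not stripped or stripped.isdigit():
        return False
    return TIMESTAMP.search(stripped) is None


def find_repeats(lines, min_run=2):
    """Return (text, [line numbers]) for each run of identical consecutive text lines."""
    runs = []
    prev_text = None
    current = []  # 1-based line numbers of the run being tracked
    for number, line in enumerate(lines, start=1):
        if not is_text_line(line):
            continue
        text = line.strip()
        if text == prev_text:
            current.append(number)
        else:
            if len(current) >= min_run:
                runs.append((prev_text, current))
            prev_text = text
            current = [number]
    # Flush the final run, if it is long enough.
    if len(current) >= min_run:
        runs.append((prev_text, current))
    return runs


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as handle:
        for text, numbers in find_repeats(handle.read().splitlines()):
            print(f"{len(numbers)}x at lines {numbers}: {text}")
```

Saved under a name of your choosing and run with the subtitle file as its only argument, it prints each repeated line with the line numbers where it occurs. Raising `min_run` to 3 or more cuts down on noise from legitimately repeated short lines.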
