lots of random punctuation. I stripped most of that out. Also, the negative, greater-than, and less-than signs need to be transcribed as words if they’re spoken. Same with degrees centigrade; or, if the speaker just says "three point three C", the degree mark can be dropped. The same goes for the times.
If it’s a spoken word, it needs to be written out as the actual word and not as anything else. I think you’re going to need to sit down and manually fix a lot of this stuff, unfortunately. My biggest chunk of time training a voice went into correctly tagging all the clips I would use. There were 10k of those, and they’d already been run through DeepSpeech (or Google Cloud STT), and even after that about half STILL needed some manual editing to be right. I’ve listened to every clip I used by now; there are about 15k more I could potentially do, but I have no desire to sit through them at this point.
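If you want a first automated pass before the manual review, here is a rough sketch of the symbol-to-word cleanup described above. Everything in it is my own illustration, not something from a particular TTS toolkit: the function names (`int_to_words`, `number_to_words`, `normalize_transcript`), the `spoken_degrees` flag, and the assumption that times like 10:30 are read as "ten thirty" are all hypothetical. Match them to what your speaker actually says in the clips.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def int_to_words(n: int) -> str:
    """Spell out a non-negative integer (digit by digit above 999,999)."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        words = f"{ONES[hundreds]} hundred"
        return words if rest == 0 else f"{words} {int_to_words(rest)}"
    if n < 1_000_000:
        thousands, rest = divmod(n, 1000)
        words = f"{int_to_words(thousands)} thousand"
        return words if rest == 0 else f"{words} {int_to_words(rest)}"
    return " ".join(ONES[int(d)] for d in str(n))


def number_to_words(token: str) -> str:
    """Spell out a numeric token such as '3.3' or '-4'."""
    sign = ""
    if token.startswith("-"):
        sign, token = "negative ", token[1:]
    whole, _, frac = token.partition(".")
    words = int_to_words(int(whole))
    if frac:
        # Decimals are read digit by digit: 3.25 -> "three point two five"
        words += " point " + " ".join(ONES[int(d)] for d in frac)
    return sign + words


def _time_to_words(match: re.Match) -> str:
    # Assumption: the speaker reads 10:30 as "ten thirty", 9:05 as "nine oh five".
    hour, minute = int(match.group(1)), int(match.group(2))
    if minute == 0:
        return f"{int_to_words(hour)} o'clock"
    if minute < 10:
        return f"{int_to_words(hour)} oh {ONES[minute]}"
    return f"{int_to_words(hour)} {int_to_words(minute)}"


NUMBER = r"(?<![\w.])-?\d+(?:\.\d+)?"


def normalize_transcript(text: str, spoken_degrees: bool = True) -> str:
    """Rewrite times, temperatures, comparison signs and numbers as words."""
    text = re.sub(r"\b(\d{1,2}):(\d{2})\b", _time_to_words, text)
    # If the speaker only says "C", pass spoken_degrees=False to drop the mark.
    unit = " degrees centigrade" if spoken_degrees else " C"
    text = re.sub(rf"({NUMBER})\s*°\s*C\b",
                  lambda m: number_to_words(m.group(1)) + unit, text)
    text = text.replace("<", " less than ").replace(">", " greater than ")
    text = re.sub(NUMBER, lambda m: number_to_words(m.group(0)), text)
    return re.sub(r"\s+", " ", text).strip()


print(normalize_transcript("-3.3°C at 10:30, wind < 5"))
```

This only gets you a candidate transcript. You still have to listen to each clip and confirm the words match the audio, since a script can't know whether the speaker said "minus" or "negative", or "half past ten" instead of "ten thirty".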