You will definitely want a GPU for training, even if you use cloud hosts, and you will be running a local TTS engine with your model, which in all likelihood will need a GPU as well.
Record as much of her as you can, of course. Then you’ll want to explore Audacity and use it to clean up your clips and match them to each other as much as possible.
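If you end up with a lot of clips, one part of the "matching" can be scripted instead of done by hand. Here is a rough sketch, assuming that matching mostly means bringing every clip to the same peak level and that your clips are exported as 16-bit PCM WAV. The function names and the 0.9 target are my own placeholders, not anything from Audacity, and this is not a substitute for its noise cleanup.

```python
import array
import sys
import wave


def peak_normalize(samples, target_peak=0.9, full_scale=32767):
    """Scale 16-bit samples so the loudest one lands at target_peak of full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    gain = target_peak * full_scale / peak
    # Clamp to the 16-bit range in case rounding pushes a value over.
    return [max(-32768, min(32767, round(s * gain))) for s in samples]


def normalize_wav(src_path, dst_path, target_peak=0.9):
    """Read a 16-bit PCM WAV, peak-normalize it, and write a new file."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samples = array.array("h", src.readframes(params.nframes))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    out = array.array("h", peak_normalize(samples, target_peak))
    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(out.tobytes())
```

You would call something like `normalize_wav("clip01.wav", "clip01_norm.wav")` for each clip (those file names are just examples). Peak level is only a crude stand-in for how loud a clip sounds, so still check the results by ear.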
One way to go would be to get her to tell you stories, and just record those. You could just tell her you’re looking to capture her memories for the family.