Hi Alistair, I just re-read my previous posts; the tone was unacceptably off. My apologies.
Moving on: the key things to gather today are obviously which routers are in use, but more importantly the DSL stats. From those we can understand how much bandwidth you have to play with and, above all, how fragile the connections are and whether they are prone to erratic dropping. And obviously a sketch of the network topology and general design.
Putting any internal network design issues aside for now, I guess the question with respect to the WAN is whether it is dropping, or whether it is saturating and thus stalling all traffic. I have dealt with a few of these recently: rural companies on ADSL that had moved to 365 and other cloud services. In those cases it was saturation of the uplink, mainly caused by network drives syncing back to their 365 SharePoint instance.
To diagnose those I put a Raspberry Pi on each network and ran a script that, every x minutes, would do an mtr to the ISP's upstream gateway and log it. From this I was able to see whether it was the local network's gateway that went AWOL (indicating an internal network issue) or whether the WAN dropped; and if I saw a dramatic increase in latency, I could reasonably assume saturation. Just a thought.
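A minimal sketch of that kind of logger, in Python, might look like the following. It is an illustration only: the function names, the log file name, the 5-minute interval, the 30 ms baseline and the 5x latency threshold are all placeholder assumptions, and it assumes mtr is installed on the Pi and prints its usual text report columns (Loss%, Snt, Last, Avg, ...).

```python
import re
import subprocess
import time
from datetime import datetime

# Matches one hop row of the `mtr --report` text output, for example:
#   1.|-- 192.168.1.1    0.0%    10    0.4   0.5   0.3   0.9   0.2
# Captured groups: hop number, host, loss percentage, average latency (ms).
HOP_ROW = re.compile(
    r"^\s*(\d+)\.\|--\s+(\S+)\s+([\d.]+)%\s+\d+\s+[\d.]+\s+([\d.]+)"
)


def parse_report(text):
    """Turn mtr's text report into a list of per-hop dicts."""
    hops = []
    for line in text.splitlines():
        m = HOP_ROW.match(line)
        if m:
            hops.append({
                "hop": int(m.group(1)),
                "host": m.group(2),
                "loss": float(m.group(3)),
                "avg": float(m.group(4)),
            })
    return hops


def classify(hops, baseline_ms=30.0, factor=5.0):
    """Rough verdict; baseline_ms and factor are assumed values to tune per site."""
    if not hops:
        return "no-data"
    if hops[0]["loss"] >= 100.0:
        # First hop is the local gateway: points at an internal network issue.
        return "local-gateway-down"
    if hops[-1]["loss"] >= 100.0:
        # Local gateway answers but the far end does not: WAN likely dropped.
        return "wan-down"
    if hops[-1]["avg"] > baseline_ms * factor:
        # Link is up but latency has jumped: consistent with saturation.
        return "possible-saturation"
    return "ok"


def run_mtr(target, cycles=10):
    """Run mtr once in report mode (numeric output) and return its text."""
    result = subprocess.run(
        ["mtr", "-n", "--report", "--report-cycles", str(cycles), target],
        capture_output=True, text=True, timeout=120,
    )
    return result.stdout


def log_forever(target, interval_s=300, logfile="mtr.log"):
    """Append a timestamped verdict plus the raw report every interval_s seconds."""
    while True:
        report = run_mtr(target)
        verdict = classify(parse_report(report))
        with open(logfile, "a") as fh:
            fh.write(f"{datetime.now().isoformat()} {verdict}\n{report}\n")
        time.sleep(interval_s)


# log_forever("192.0.2.1")  # placeholder address; use the ISP's upstream gateway
```

Keeping the raw report next to the verdict means the thresholds can be revisited later without re-collecting data. On some systems mtr needs elevated privileges to send its probes.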
Cyril