I analyzed dumps from various customers for a couple of years. The following is only my personal perspective on your questions; I hope it helps.
Should we write the large dump before the small one? I don't think the order matters for typical issues such as crashes and hangs. The crash site or the deadlock is in the dump whether it was captured first or second.
Should we even be writing two dumps? I would suggest writing at least one full dump. A small dump is very convenient for getting an initial impression of the problem, but it is very limited, especially when your application crashes. So you could ask the customer to email you the small dump for first-round triage and, if that does not lead you to the root cause, then ask for the full dump. Technically you can strip a small dump out of a full dump; however, you may not want your customer to do that sort of work for you. So this depends on how you interact with your customers.
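To make the small/full distinction concrete, here is a minimal sketch of a helper process that writes either kind of dump for a target process through MiniDumpWriteDump(). It is written in Python with ctypes only to keep it short. The names dump_flags and write_dump are hypothetical, and the exact set of extra flags for the full dump is my own assumption, not a recommendation from dbghelp.

```python
import ctypes

# MINIDUMP_TYPE values as defined in dbghelp.h.
MiniDumpNormal = 0x00000000
MiniDumpWithFullMemory = 0x00000002
MiniDumpWithHandleData = 0x00000004
MiniDumpWithThreadInfo = 0x00001000

# Process access rights from winnt.h.
PROCESS_QUERY_INFORMATION = 0x0400
PROCESS_VM_READ = 0x0010
PROCESS_DUP_HANDLE = 0x0040


def dump_flags(full: bool) -> int:
    """Return MINIDUMP_TYPE flags for a small (triage) or a full dump."""
    if full:
        # Assumed flag set for the full dump; adjust to what you need.
        return MiniDumpWithFullMemory | MiniDumpWithHandleData | MiniDumpWithThreadInfo
    return MiniDumpNormal


def write_dump(pid: int, path: str, full: bool) -> None:
    """Write a dump of process `pid` to `path`. Windows only."""
    import msvcrt  # Windows-only module, imported lazily on purpose

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    dbghelp = ctypes.WinDLL("dbghelp", use_last_error=True)

    # Declare signatures so 64-bit handles are not truncated.
    kernel32.OpenProcess.restype = ctypes.c_void_p
    kernel32.OpenProcess.argtypes = [ctypes.c_uint32, ctypes.c_int, ctypes.c_uint32]
    kernel32.CloseHandle.argtypes = [ctypes.c_void_p]
    dbghelp.MiniDumpWriteDump.restype = ctypes.c_int
    dbghelp.MiniDumpWriteDump.argtypes = [
        ctypes.c_void_p,  # hProcess
        ctypes.c_uint32,  # ProcessId
        ctypes.c_void_p,  # hFile
        ctypes.c_uint32,  # DumpType
        ctypes.c_void_p,  # ExceptionParam (unused here)
        ctypes.c_void_p,  # UserStreamParam (unused here)
        ctypes.c_void_p,  # CallbackParam (unused here)
    ]

    access = PROCESS_QUERY_INFORMATION | PROCESS_VM_READ | PROCESS_DUP_HANDLE
    process = kernel32.OpenProcess(access, False, pid)
    if not process:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        with open(path, "wb") as f:
            # Convert the C runtime file descriptor into a Win32 HANDLE.
            file_handle = msvcrt.get_osfhandle(f.fileno())
            ok = dbghelp.MiniDumpWriteDump(
                process, pid, file_handle, dump_flags(full), None, None, None
            )
            if not ok:
                raise ctypes.WinError(ctypes.get_last_error())
    finally:
        kernel32.CloseHandle(process)
```

With this, write_dump(pid, "small.dmp", full=False) produces the file the customer can email for triage, and write_dump(pid, "full.dmp", full=True) produces the one you ask for later. The sketch passes no exception information, so for a crash you would still need to supply the ExceptionParam argument from your crash handler.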
Could we somehow have the system first suspend the threads of the process and then write two dumps that are completely "synchronous"?
Technically this is doable. It is relatively easy to do out-of-process: a single NtSuspendProcess() call suspends all threads of the target, but it has to be called from another process. If you prefer to do it in-process, you have to enumerate all threads and call SuspendThread() on every thread other than the current one, which is how MiniDumpWriteDump() works. However, I don't think synchronous versus asynchronous capture affects the accuracy of the dump.
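The out-of-process variant could look roughly like the sketch below, again in Python with ctypes and meant to run in a separate helper process. NtSuspendProcess() and its counterpart NtResumeProcess() are undocumented ntdll exports, so treat their use as an assumption that may not hold on every Windows version. The names suspended and nt_success are hypothetical.

```python
import ctypes
from contextlib import contextmanager

PROCESS_SUSPEND_RESUME = 0x0800  # process access right from winnt.h


def nt_success(status: int) -> bool:
    """NTSTATUS is a signed 32-bit value; non-negative means success."""
    return ctypes.c_int32(status).value >= 0


@contextmanager
def suspended(pid: int):
    """Keep all threads of another process suspended inside the block.

    Windows only. Must not be used on the current process, because the
    calling thread would be suspended as well.
    """
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    ntdll = ctypes.WinDLL("ntdll")

    # Declare signatures so 64-bit handles are not truncated.
    kernel32.OpenProcess.restype = ctypes.c_void_p
    kernel32.OpenProcess.argtypes = [ctypes.c_uint32, ctypes.c_int, ctypes.c_uint32]
    kernel32.CloseHandle.argtypes = [ctypes.c_void_p]
    ntdll.NtSuspendProcess.restype = ctypes.c_int32
    ntdll.NtSuspendProcess.argtypes = [ctypes.c_void_p]
    ntdll.NtResumeProcess.restype = ctypes.c_int32
    ntdll.NtResumeProcess.argtypes = [ctypes.c_void_p]

    process = kernel32.OpenProcess(PROCESS_SUSPEND_RESUME, False, pid)
    if not process:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        status = ntdll.NtSuspendProcess(process)
        if not nt_success(status):
            raise OSError(f"NtSuspendProcess failed: 0x{status & 0xFFFFFFFF:08X}")
        try:
            yield  # the target stays frozen while the caller's block runs
        finally:
            ntdll.NtResumeProcess(process)
    finally:
        kernel32.CloseHandle(process)
```

Any dump-writing calls made inside a "with suspended(pid):" block see the same frozen thread state, so a small dump and a full dump written there describe the same moment in time. The process is resumed even if writing a dump raises an exception.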