For most slow PyTorch runs, the first question isn't "show me every trace event." It is simply: where do I even start, and where did the step time go?

I have been thinking about what a compact end-of-run summary would look like: lightweight enough to run on every job, not just dedicated profiling runs. Here's one example of what that output could look like:

Curious how others are solving this today. What would make something like this useful? What is missing?
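
To make the idea a bit more concrete, here is a minimal sketch of the kind of bookkeeping such a summary could be built on. It is only an illustration under assumptions: the names `StepSummary`, `phase`, `end_step`, and `report`, and the phase labels, are all hypothetical, and it uses plain wall-clock timers with `time.sleep` standing in for real work. On a GPU, CUDA kernels launch asynchronously, so host-side timers like these would need `torch.cuda.synchronize()` or CUDA events to attribute time correctly, and that has its own overhead.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StepSummary:
    """Accumulates wall-clock time per named phase across training steps.

    Hypothetical helper for illustration; not part of PyTorch.
    """

    def __init__(self):
        self.totals = defaultdict(float)  # phase name -> total seconds
        self.steps = 0

    @contextmanager
    def phase(self, name):
        # Time the enclosed block and add it to the phase's running total.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start

    def end_step(self):
        self.steps += 1

    def report(self):
        # One header line, then phases sorted by total time, largest first.
        total = sum(self.totals.values())
        lines = [f"steps: {self.steps}  timed total: {total:.3f}s"]
        for name, secs in sorted(self.totals.items(), key=lambda kv: -kv[1]):
            share = 100.0 * secs / total if total > 0 else 0.0
            lines.append(f"{name:<12}{secs:>9.3f}s {share:>6.1f}%")
        return "\n".join(lines)


if __name__ == "__main__":
    summary = StepSummary()
    for _ in range(3):
        with summary.phase("data"):
            time.sleep(0.02)   # stand-in for waiting on the dataloader
        with summary.phase("forward"):
            time.sleep(0.01)   # stand-in for the forward pass
        with summary.phase("backward"):
            time.sleep(0.015)  # stand-in for the backward pass
        with summary.phase("optimizer"):
            time.sleep(0.005)  # stand-in for the optimizer step
        summary.end_step()
    print(summary.report())
```

The point of keeping it this small is that the per-step cost is a couple of `perf_counter` calls per phase, which is what would make it plausible to leave on for every job rather than only for dedicated profiling runs.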