Simulation Performance Metrics

https://stackoverflow.com/questions/17684806

03-06-2022
|

Вопрос

This is a semi-broad question, but it's one that I feel on some level is answerable or at least approachable.

I've spent the last month or so making a fairly extensive simulation. In order to protect the interests of my employer, I won't state specifically what it does... but an analogy of what it does may be explained by... a high school dance.

A girl or boy enters the dance floor, and based on the selection of free dance partners, an optimal choice is made. After a period of time, two dancers finish dancing and are now free for a new partnership.

I've been making partner selection algorithms designed to maximize average match outcome while not sacrificing wait time for a partner too much.

I want a way to gauge / compare versions of my algorithms in order to make a selection of the optimal algorithm for any situation. This is difficult however since the inputs of my simulation are extremely large matrices of input parameters (2-5 per dancer), and the simulation takes several minutes to run (a fact that makes it difficult to test a large number of simulation inputs). I have a few output metrics, but linking them to the large number of inputs is extremely hard. I'm also interested in finding which algorithms completely fail under certain input conditions...

Any pro tips / online resources which might help me in defining input constraints / output variables which might give clarity on an optimal algorithm?

Решение

I might not understand what you exactly want. But here is my suggestion. Let me know if my solution is inaccurate/irrelevant and I will edit/delete accordingly.

Assume you have a certain metric (say compatibility of the pairs or waiting time). If you just have the average or total number for this metric over all the users, it is kind of useless. Instead you might want to find the distribution of of this metric over all users. If nothing, you should always keep track of the variance. Once you have the distribution, you can calculate a probability that particular algorithm A is better than B for a certain metric.

If you do not have the distribution of the metric within an experiment, you can always run multiple experiments, and the number of experiments you need to run depends on the variance of the metric and difference between two algorithms.

Лицензировано под: CC-BY-SA с атрибуция

Не связан с StackOverflow