Question

While trying to understand complexity, I ran into an example of going through records organized in the following way:

data = [
  {"name": "category1", "files": [{"name": "file1"}, {"name": "file2"}]},
  {"name": "category2", "files": [{"name": "file3"}]}
]

The task requires going through all file records, which is straightforward:

for category in data:
  for file in category["files"]:
    pass

It seems like the complexity of this algorithm is O(n * m), where n is the length of data and m is the maximum length of the files array over all records. But is O(n * m) the only correct answer?

Even though there are two for-loops, it still looks like iterating over a global array of file records that happens to be organized in a nested way. Is it legitimate to compare it with iteration over a different structure, like this:

data = [
  ('category1', 'file1'),
  ('category1', 'file2'),
  ('category2', 'file3'),
]
for category, file in data:
  pass

...where the complexity is obviously O(n), with n being the total number of records?


Solution

A better bound on the complexity is $O(k)$, where $k$ is the total number of files.
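To see this concretely, one can count how many times the inner loop body runs for the nested structure from the question; the count equals the total number of files, not n * m (this counting sketch is illustrative, not from the original post):

```python
# Nested layout from the question.
data = [
    {"name": "category1", "files": [{"name": "file1"}, {"name": "file2"}]},
    {"name": "category2", "files": [{"name": "file3"}]},
]

# Count one unit of work per inner-loop iteration.
iterations = 0
for category in data:
    for file in category["files"]:
        iterations += 1

# The iteration count equals k, the total number of files.
total_files = sum(len(category["files"]) for category in data)
print(iterations, total_files)  # prints: 3 3
```

The outer loop adds only O(n) overhead for visiting the categories themselves, so the overall bound is O(n + k), which is O(k) whenever every category holds at least one file.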

A similar situation is encountered in algorithms for sparse matrices. Sparse matrices are stored as a list of all non-zero entries. The running time is often analyzed in terms of the number of non-zero entries rather than the dimensions of the matrix. The reason is that matrices encountered in practice are often sparse (or can be sparsified), and we are interested in algorithms taking advantage of this.
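As a sketch of the sparse-matrix analogy, a matrix stored in coordinate form as a list of (row, column, value) triples can be summed in time proportional to the number of stored entries, independent of the matrix dimensions (the format and values here are illustrative assumptions, not from the original post):

```python
# A 1000 x 1000 matrix with only three non-zero entries,
# stored as (row, column, value) triples.
nonzeros = [(0, 5, 2.0), (17, 3, -1.5), (999, 999, 4.0)]

# Summing the entries touches each stored triple exactly once:
# O(nnz) work, not O(rows * cols) as a dense scan would need.
total = sum(value for _, _, value in nonzeros)
print(total)  # prints: 4.5
```

Just as with the nested file records, the natural parameter for the running time is the number of items actually stored, not the size of the enclosing grid.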

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange