Histogram based on buckets

https://stackoverflow.com/questions/21670037

09-10-2022

Question

My data looks like this (space separated):

What I would like to see is a histogram of A's that gives 90% of all B's.

Here is what I've tried. The dual-list dataset

dataset = [[],[]]

consists of the contents of A in dataset[0] and contents of B in dataset[1].

I created the variable num_buckets so that I can push the data into the buckets list based on rank.

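def parse_dataset(dataset):
    sums = [[],[],[],[],[]]
    # Total of each column after the first (for the two-column dataset, the sum of B).
    for s in range(1, len(dataset)):
        sums[s] = sum(dataset[s])
    for a in range(0, len(dataset)):
        # Dense rank of every value in column a (0 = smallest distinct value).
        rankdict = {v: k for k, v in enumerate(sorted(set(dataset[a])))}
        ranked = [rankdict[b] for b in dataset[a]]
        # Pair each rank with the matching A value and sort by rank.
        sorted_ranks = sorted(zip(ranked, dataset[0]))
        max_rank = max(ranked)
        min_rank = min(ranked)
        # Integer division, so the result can be passed to range().
        num_buckets = (max_rank - min_rank) // 9
        buckets = [[] for q in range(num_buckets)]
        for z in range(0, len(sorted_ranks)):
            if min_rank <= sorted_ranks[z][0] < 9:
                buckets[0].append(sorted_ranks[z])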

                ....

Please let me know if any important information is missing.

Solution

A good plotting option is the matplotlib package. Note that it depends on the numpy package. Its documentation includes many examples of how to draw different kinds of graphs. If you want more help parsing your data, comment here and I'll see what I can do.
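As a rough sketch of how the parsing and plotting could fit together, the code below reads two space-separated columns, keeps the A values of the rows that together reach 90% of the total of B, and draws them with matplotlib's `plt.hist`. The question's sample data is not shown, so several things here are assumptions rather than the asker's actual setup: the two-column numeric layout, the reading of "90% of all B's" as "rows taken in order of decreasing B until their running total reaches 90% of the sum of B", and the helper names `parse_columns`, `select_top_share`, and `plot_histogram`, which are all hypothetical.

```python
def parse_columns(text):
    """Split space-separated "A B" lines into two lists of floats.

    Assumption: each data line holds exactly two numeric fields.
    """
    a_values, b_values = [], []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        try:
            a, b = float(parts[0]), float(parts[1])
        except ValueError:
            continue  # skip non-numeric lines such as a header
        a_values.append(a)
        b_values.append(b)
    return a_values, b_values


def select_top_share(a_values, b_values, share=0.9):
    """Return the A values of the rows that, taken in order of decreasing B,
    together reach `share` of the total of B (one possible reading of the question)."""
    target = share * sum(b_values)
    selected = []
    running = 0.0
    # Largest B first, so the fewest rows are needed to reach the target.
    rows = sorted(zip(a_values, b_values), key=lambda row: row[1], reverse=True)
    for a, b in rows:
        if running >= target:
            break
        selected.append(a)
        running += b
    return selected


def plot_histogram(a_selected, num_bins=10):
    """Draw a histogram of the selected A values."""
    # Imported here so that parsing and selection work without matplotlib installed.
    import matplotlib.pyplot as plt

    plt.hist(a_selected, bins=num_bins)
    plt.xlabel("A")
    plt.ylabel("count")
    plt.show()
```

With the file contents read into a string `text`, calling `a, b = parse_columns(text)` followed by `plot_histogram(select_top_share(a, b))` would show the plot. Letting `plt.hist` choose the bin edges avoids building the buckets list by hand; `num_bins` can be adjusted if a different bucket count is wanted.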

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow