Question

I need to apply a classification algorithm after clustering. After clustering, I found which id belongs to which cluster. I clustered the data into 2 clusters.

Now I need to collect the data for those ids, but I don't know how to retrieve all the information using them.

I use Jupyter Notebook, and the main data has no attribute named id; the ids were assigned automatically when I loaded the data from the main data file.
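To illustrate where such ids usually come from: when pandas loads a file without an explicit index column, it gives each row a default integer index starting at 0. The sketch below assumes the data was loaded with pandas; the CSV content and the column names `height` and `weight` are made up for illustration and are not the real data.

```python
import io
import pandas as pd

# Hypothetical stand-in for the main data file (no "id" column in it).
csv_text = "height,weight\n1.7,65\n1.8,80\n1.6,55\n"
train = pd.read_csv(io.StringIO(csv_text))

# pandas assigns a default integer index 0, 1, 2, ... on load;
# these are the "ids" shown on the left of the table in the notebook.
print(train.index)

# Look up a single row by its index label.
row = train.loc[1]

# If an explicit id column is wanted, the index can be turned into one.
train_with_id = train.reset_index().rename(columns={"index": "id"})
print(train_with_id)
```

If this assumption holds, the position of a row in the data passed to the clustering step and its index label in the DataFrame are the same number.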

This is my main data:

[image: main data]

Here is my code to find which data points belong to which cluster.

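import numpy as np

# `labels` is assumed to be the array of cluster labels from the clustering step
x = 0.10  # fraction of each cluster to sample
i = 0     # cluster to inspect

# (1) indices of the points from X that belong to cluster i
C_i = np.where(labels == i)[0].tolist()
n_i = len(C_i)  # number of points in cluster i

# (2) indices of the points from X to be sampled from cluster i
# (np.random.choice samples with replacement by default, so ids can repeat)
sample_i = np.random.choice(C_i, int(x * n_i))
print(i, sample_i)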
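The same steps could be repeated for every cluster instead of only `i = 0`. This is a minimal sketch, assuming `labels` is a NumPy array with one cluster label per row; the label values here are invented, and `replace=False` is an added choice so that no id is drawn twice.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical cluster labels for 30 points (18 in cluster 0, 12 in cluster 1).
labels = np.array([0, 1, 0, 0, 1, 1, 0, 1, 0, 0] * 3)
x = 0.10  # fraction of each cluster to sample

samples = {}
for i in np.unique(labels):
    # (1) indices of the points that belong to cluster i
    C_i = np.where(labels == i)[0]
    # (2) sample a fraction x of those indices, without repeats
    k = int(x * len(C_i))
    samples[int(i)] = rng.choice(C_i, size=k, replace=False)

for i, sample_i in samples.items():
    print(i, sample_i)
```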

After clustering, I get these ids:

[image: ids found after clustering]

New addition:

Suppose the DataFrame I loaded is named train. Using train.loc[26], I get the row for that specific id.

[image: output of train.loc[26]]

But I need to collect all of these rows into a new DataFrame with the same structure as train.
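One possible way to do this, sketched under assumptions: since the ids come from `np.where`, they are row positions, so `train.iloc[...]` with the whole array of ids should return all matching rows at once as a DataFrame with the same columns as `train`. The data, the column names `f1` and `f2`, and the sampled ids below are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real `train` DataFrame (10 rows).
train = pd.DataFrame({
    "f1": np.arange(10) * 1.5,
    "f2": np.arange(10) % 3,
})

# Hypothetical cluster labels, one per row of train, in the same order.
labels = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

# Hypothetical ids sampled from cluster 0 (row positions from np.where).
sample_i = np.array([2, 6, 8])

# Select many rows at once by position; the result keeps train's columns.
sampled = train.iloc[sample_i]
print(sampled)

# All rows of one cluster, using a boolean mask instead of a list of ids.
cluster_0 = train[labels == 0]

# Keep the cluster label next to the data, e.g. as a target for a classifier.
train_labeled = train.assign(cluster=labels)
```

With a default integer index, `train.loc[sample_i]` would give the same rows; `iloc` is used here because it keeps working if the index is ever changed or filtered.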

No correct solution

Licensed under: CC-BY-SA with attribution