Question

I'm trying to manipulate some data in Biolabs Orange, using the built-in Python Script widget and the information in the Biolabs Orange tutorial on scripting.

However, I'm struggling to take the results of SMOTE and put them into a format that Orange accepts.

This is the code I'm using in the Python Script widget:

# Get libraries
import Orange
import numpy as np
from Orange.data import Domain, Table
from imblearn.over_sampling import SMOTE

#in_data = Orange.data.Table('WORKING_temp.csv')

df = in_data.copy()

# set variables for SMOTE
sm = SMOTE(random_state=42)

# get table of data (X) and class variables (y)
X, y = df.X, df.Y

# resample data and classes
X_res, y_res = sm.fit_sample(X, y)


df.X = X_res
df.Y = y_res

temp = Orange.data.Table(df.X, df.Y)
temp.domain = df.domain

out_data = Orange.data.Table(temp)

The result is a ValueError, which I think comes from changing the length of the data table and class variable while the original index keeps its old length:

"ValueError: could not broadcast input array from shape (3724,10) into shape (3724)"
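One direction that might avoid the in-place assignment is to build a brand new table from the original domain and the resampled arrays, rather than writing longer arrays back into the existing table. The following is only a minimal sketch under several assumptions: that Orange 3's Table.from_numpy(domain, X, Y) is available, that the installed imbalanced-learn exposes fit_resample (older releases named it fit_sample), that the domain has a single class variable and no meta attributes, and that all features are continuous. The helper names check_resampled_shapes and resample_table are hypothetical and exist only for illustration.

```python
import numpy as np


def check_resampled_shapes(X_res, y_res, n_attributes):
    """Validate resampled arrays before handing them to Orange (hypothetical helper)."""
    X_res = np.asarray(X_res, dtype=float)
    y_res = np.asarray(y_res, dtype=float)
    # X must be 2-D with one column per attribute in the domain.
    if X_res.ndim != 2 or X_res.shape[1] != n_attributes:
        raise ValueError(f"X has shape {X_res.shape}, expected (n, {n_attributes})")
    # y must be 1-D with one entry per row of X.
    if y_res.ndim != 1 or y_res.shape[0] != X_res.shape[0]:
        raise ValueError(f"y has shape {y_res.shape}, expected ({X_res.shape[0]},)")
    return X_res, y_res


def resample_table(in_data, random_state=42):
    """Return a new Orange table with SMOTE-resampled rows (hypothetical helper)."""
    # Imported lazily so the shape check above can be used without Orange installed.
    from Orange.data import Table
    from imblearn.over_sampling import SMOTE

    sm = SMOTE(random_state=random_state)
    # Assumption: fit_resample exists; older imbalanced-learn releases used fit_sample.
    X_res, y_res = sm.fit_resample(in_data.X, in_data.Y)
    X_res, y_res = check_resampled_shapes(
        X_res, y_res, len(in_data.domain.attributes)
    )
    # Build a fresh table from the original domain instead of resizing in_data.
    # Assumption: the domain has no meta attributes, so none are passed here.
    return Table.from_numpy(in_data.domain, X_res, y_res)


# In the Python Script widget this might be used as:
# out_data = resample_table(in_data)
```

The shape check is there because the error message mentions a 2-D array of shape (3724,10) being written where a 1-D array of length 3724 was expected, so it seems worth confirming that X and y have the expected dimensions before constructing the table.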

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange