Question

I have the same question about WEKA as this person:

Hi all:

I find WEKA's behavior on this really strange.

I have prepared a CSV file that has lots of missing values. A missing value in this file is simply nothing at all between a pair of commas, e.g. ,random_value1,,random_value2. In this example there is a pair of commas with nothing between them, not even a space, which should indicate a missing value in the data.

The weird thing is that when I read this CSV into WEKA, WEKA assigns a question mark, i.e. '?', to every missing value. This is exactly how WEKA displays it.

Then, when I run a test analysis, WEKA treats these '?' as if they were some sort of useful information. They are just missing values; could WEKA please just skip over them?

This problem wastes a lot of time. The analysis results read like: if missing then value missing, missing associates with missing, missing correlates with missing.

Can WEKA read a missing value as a missing value, not as some sort of question mark? Or can I tell WEKA to treat all '?' as missing values?

Thanks guys

He solved his problem using this solution:

I found a way to tell WEKA about the missing values. Just use the find-and-replace function of an ASCII editor to replace all '?' with ?.


But I don't know how to download an ASCII editor and use it. Can anyone tell me how?


Solution

I suggest using Notepad2 or Notepad++ on Windows. Both are plain-text editors with a find-and-replace function.
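If you would rather not install an editor, the same find-and-replace could be scripted. Below is a minimal Python sketch of that idea, not anything WEKA itself provides. It assumes the quoted token in your file is literally '?' (with single quotes), and the function names are made up for illustration. Note that it replaces every occurrence of '?', so check first that the token does not appear inside real data values.

```python
from pathlib import Path


def unquote_question_marks(text: str) -> str:
    # Literal find-and-replace: the quoted token '?' becomes a bare ?
    return text.replace("'?'", "?")


def fix_file(src: str, dst: str) -> None:
    # Read the whole file, apply the replacement, and write the result
    # to a new file so the original stays untouched.
    Path(dst).write_text(unquote_question_marks(Path(src).read_text()))


if __name__ == "__main__":
    # Small in-memory sample instead of a real file
    sample = "5.1,'?',3.5\n'?',2.0,'?'\n"
    print(unquote_question_marks(sample), end="")
```

To use it on a real file, you could call fix_file with your own input and output paths and then load the output file into WEKA.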

OTHER TIPS

You don't have to do anything about the missing values yourself. Different algorithms handle missing values differently, so don't worry: each one will deal with them in its own way.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow