Question

I have a text file called "test.txt" which contains data in libsvm format. Data in this file is represented as follows:

165475 0:246870 1124384:2 342593:7 1141651:1 297582:1 1186846:1 17725:1 656602:1 
463304:1 766612:1 573309:1 290046:1 748198:1 216665:1 950594:2 909004:1 29008:1      
105623:1 5018:5 806027:1 1125729:1 757846:1 1023921:2 612980:1 120767:1 51340:1 
108172:5 674420:2

where the first term is the label and the remaining terms are features and their weights (written as feature:weight). The file is very large, with every label having many features and weights.

I am using scikit-learn in an IPython notebook and want to load this data into the notebook to start processing it.

Can someone tell me how to do that? Thanks in advance.

Solution

Use load_svmlight_file from sklearn.datasets.
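
A minimal sketch of how that call might look. The two-row sample and the temporary file are made up for illustration and stand in for your "test.txt". Passing zero_based=True is an assumption that feature indices start at 0 (your sample contains a 0: entry); leave it at the default "auto" if you are unsure. As far as I know, recent scikit-learn versions also expect the feature indices within a row to be in ascending order, so the sample below keeps them sorted.

```python
import os
import tempfile

from sklearn.datasets import load_svmlight_file

# Hypothetical two-row sample in libsvm format: "<label> <index>:<value> ..."
sample = "1 3:1 10:2 25:7\n0 1:5 10:1\n"

# Write the sample to a temporary file; in practice, point at "test.txt" instead.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write(sample)
    path = f.name

# X is a scipy.sparse CSR matrix of feature values, y is a NumPy array of labels.
# zero_based=True is an assumption about the index convention of the file.
X, y = load_svmlight_file(path, zero_based=True)
os.remove(path)

print(X.shape)  # (number of rows, highest feature index + 1)
print(y)        # one label per row
```

Because X is sparse, only the non-zero entries are stored, which matters for a file with feature indices in the millions. Most scikit-learn estimators accept the sparse matrix directly, so there is usually no need to convert it to a dense array.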

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow