Question

I am using PSPP (not SPSS, since I can't get that running on my Ubuntu machine) to run k-means clustering on a set of ~100k records. What I really need is more detailed output than just how many records are in each cluster. I need the cluster variable saved, i.e.

row 1 => cluster 1

row 2 => cluster 4

row 3 => cluster 1

etc...

Essentially I need the extra field that saves the resulting cluster affinity of each record. My current syntax is:

QUICK CLUSTER  cat1 cat2 cat3 cat4 cat5 cat6 cat7 cat8 cat9 cat10 cat11 cat12
/CRITERIA=CLUSTERS(12) MXITER(100000000).

SPSS and PSPP share a lot of the same syntax so if there is an option in SPSS it might work here too.


Solution

SPSS Statistics should run on Ubuntu. Either way, the Statistics QUICK CLUSTER command has a subcommand

/SAVE CLUSTER

that should do what you want. You can optionally specify a variable name in parentheses after CLUSTER.
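As a rough sketch, here is the command from the question with that subcommand added. It assumes SPSS Statistics syntax; cluster_id is a hypothetical variable name chosen for illustration, and the parenthesized name can be left out to accept the default.

```spss
* cluster_id is a hypothetical name for the new membership variable.
QUICK CLUSTER cat1 cat2 cat3 cat4 cat5 cat6 cat7 cat8 cat9 cat10 cat11 cat12
  /CRITERIA=CLUSTERS(12) MXITER(100000000)
  /SAVE CLUSTER(cluster_id).
```

After this runs, each record in the active dataset should carry its assigned cluster number in the new variable.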

Other tips

PSPP does not handle the /SAVE CLUSTER subcommand. Try it! The syntax PSPP documents is:

QUICK CLUSTER var_list
      [/CRITERIA=CLUSTERS(k) [MXITER(max_iter)] CONVERGE(epsilon) [NOINITIAL]]
      [/MISSING={EXCLUDE,INCLUDE} {LISTWISE, PAIRWISE}]
      [/PRINT={INITIAL} {CLUSTER}]

See the PSPP manual on the GNU website.

I know you're looking for something in PSPP, but your best bet is probably to save the output as an OpenDocument file, open your data file as a .csv in a spreadsheet, then copy in the cluster memberships (assuming you added /PRINT=CLUSTER to your command).
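For that workaround, the command from the question would need the /PRINT=CLUSTER subcommand, which appears in the PSPP syntax summary quoted earlier. A minimal sketch, assuming your PSPP version accepts it as documented:

```spss
* Print the cluster membership of every case in the output.
QUICK CLUSTER cat1 cat2 cat3 cat4 cat5 cat6 cat7 cat8 cat9 cat10 cat11 cat12
  /CRITERIA=CLUSTERS(12) MXITER(100000000)
  /PRINT=CLUSTER.
```

With ~100k records the membership listing will be long, and the copy step only works if the listing is in the same row order as the data file, so check a few rows by hand.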

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow