Question

I'm using the R package TraMineR to compute and analyze state sequences. I would like to obtain a sequence frequency plot using the seqfplot command. However, instead of setting the number of most frequent sequences to plot with

seqfplot(mydata.seq, tlim=1:20)

it would be useful to set the share of the sample that the most frequent sequences should cover, for example 50%. I tried this

seqfplot(mydata.seq, trep = 0.5)

but, unlike seqrep.grp and seqrep, the seqfplot command does not support the trep option. Should I create a new function to do that?

Thank you.

Solution

You are right: trep is an argument of the TraMineR seqrep function, which looks for representative sequences covering at least a share trep of all sequences.

If you specifically want the most frequent sequence patterns whose cumulated percent frequencies add up to, say, 50%, then you have to compute the selection filter yourself. Here is how you can do that using the biofam data.

library(TraMineR)
data(biofam)
bf.seq <- seqdef(biofam[,10:25])

## First retrieve the "Percent" column of the frequency table provided
## as the "freq" attribute of the object returned by the seqtab function.

bf.freq <- seqtab(bf.seq, tlim=nrow(bf.seq))
bf.tab <- attr(bf.freq,"freq")
bf.perct <- bf.tab[,"Percent"]

## Compute the cumulated percentages
bf.cumsum <- cumsum(bf.perct)

## Now we can use the cumulated percentage to select
## the wanted patterns
bf.freq50 <- bf.freq[bf.cumsum <= 50,]

## Count the patterns whose cumulated percentage stays at or below 50
## (the outer parentheses print the value), then plot them
(nfreq <- sum(bf.cumsum <= 50))
seqfplot(bf.seq, tlim=1:nfreq)
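
Note that the filter above keeps the patterns whose cumulated percentage stays at or below 50, so the selection may cover slightly less than half of the sample. If you would rather have the smallest set that actually reaches the threshold, a small helper along these lines could do it. This is only a sketch: n_to_reach is a hypothetical name, not a TraMineR function, and the percentages in the example are made up for illustration.

```r
## Hypothetical helper (not part of TraMineR): number of top-ranked
## patterns needed for the cumulated percentage to reach `target`.
## `pct` is assumed to be sorted by decreasing frequency, as the
## "Percent" column of the seqtab frequency table is.
n_to_reach <- function(pct, target = 50) {
  stopifnot(is.numeric(pct), length(pct) > 0, target > 0)
  cum <- cumsum(pct)
  ## If the target is never reached, fall back to all patterns
  if (target > cum[length(cum)]) return(length(pct))
  ## Otherwise take the first rank at which the target is reached
  which(cum >= target)[1]
}

## Toy percentages (made-up values): cumulated sums are 30, 45, 55, 60
pct <- c(30, 15, 10, 5)
n_below <- sum(cumsum(pct) <= 50)  # patterns staying at or below 50%
n_reach <- n_to_reach(pct, 50)     # patterns needed to reach 50%
print(c(below = n_below, reach = n_reach))
```

With the objects from the answer, the call would presumably be seqfplot(bf.seq, tlim=1:n_to_reach(bf.perct, 50)). The helper always returns at least 1, which also avoids the 1:0 index you would get from the filter when the single most frequent pattern already exceeds the threshold.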

Hope this helps.

Licensed under: CC-BY-SA with attribution