Question

Problem description: I want to look at each term in a text together with its context window of, say, 3 words to the left and 3 to the right. The base case has the form w-3 w-2 w-1 term w+1 w+2 w+3. I want to slide this window over the text and record the context words of each term, so every word is treated as the term once and becomes a context word as the window moves on. The difficulty is at the edges: when the term is the 1st word in a line there are no context words on the left (t w+1 w+2 w+3), when it is the 2nd word there is only one, and so on. I am looking for hints on implementing this flexible sliding window in Python without writing out each possible situation separately.

To recap:

Example of input:

["w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8", "w9", "w10"]

Output:

t1 w2 w3 w4

w1 t2 w3 w4 w5

w1 w2 t3 w4 w5 w6

w1 w2 w3 t4 w5 w6 w7

__ w2 w3 w4 t5 w6 w7 w8

__ __ etc.

My current plan is to implement this with a separate condition for each line in the output.


Solution

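If you want a sliding window of n words, use a double-ended queue (collections.deque) with maxlen=n as a buffer. Once the deque is full, each append drops the oldest item from the other end, so the buffer always holds the n most recent items.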

This should illustrate the concept:

from collections import deque

mystr = "StackOverflow"
# A deque with maxlen=5 discards its oldest item once a sixth is appended.
window = deque(maxlen=5)
for char in mystr:
    window.append(char)
    print("".join(window))

Output:

S
St
Sta
Stac
Stack
tackO
ackOv
ckOve
kOver
Overf
verfl
erflo
rflow
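
The deque above keeps the most recent n items, so the newest item always sits at the end of the buffer rather than in the middle. For the centered window in the question, one possible sketch, assuming the tokens are already in a list, uses plain slicing instead of a deque. The function name context_windows and its left/right parameters are made up for illustration:

```python
def context_windows(words, left=3, right=3):
    """Yield (term, left_context, right_context) for every word in `words`.

    Hypothetical helper: `left` and `right` are the maximum number of
    context words to take on each side of the term.
    """
    for i, term in enumerate(words):
        # max(0, ...) stops a negative start index from wrapping to the end.
        left_ctx = words[max(0, i - left):i]
        # Slicing past the end of a list just returns fewer items.
        right_ctx = words[i + 1:i + 1 + right]
        yield term, left_ctx, right_ctx


if __name__ == "__main__":
    words = ["w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8", "w9", "w10"]
    for term, left_ctx, right_ctx in context_windows(words):
        # Mark the current term with brackets so it stands out.
        print(" ".join(left_ctx + ["[" + term + "]"] + right_ctx))
```

With the example input, the first two printed lines are "[w1] w2 w3 w4" and "w1 [w2] w3 w4 w5". Because slices are clipped at the list boundaries, the short windows at the start and end need no separate conditions.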
Licensed under: CC-BY-SA with attribution