Question

I'm using Haystack to index fields that contain HTML. Indexing is new to me, so I would like to confirm that what I'm doing makes sense.

Consider the following model:

from django.db import models


class Document(models.Model):
    name = models.CharField(max_length=200)
    date = models.DateField()

    text = models.TextField()  # contains HTML
    summary = models.TextField()  # contains HTML

    def compose_summary(self):
        # Searches for tags in the HTML and substitutes other tags (for rendering).
        ...

    def visible_summary(self):
        # Returns summary without HTML.
        ...

    def visible_text(self):
        # Returns text without HTML.
        ...

I want to create a SearchIndex out of this model. In particular, I would like to render each search result as:

<h4><a href="{{ law.get_absolute_url }}">{{ law.name }}</a></h4>
<div class="age">{{law.date}}</div>
<p>{{ law.compose_summary|safe }}</p>

and I would like to perform the search over all the fields.

The way I'm doing it now is:

from haystack import indexes


class DocumentIndex(indexes.SearchIndex, indexes.Indexable):
    id = indexes.IntegerField(model_attr='id', indexed=False)  # for get_absolute_url
    name = indexes.CharField(model_attr='name', indexed=False)
    date = indexes.DateField(model_attr='date')

    text = indexes.CharField(document=True, use_template=True)
    # summary holds HTML text, so it is stored as a CharField, not a DateField.
    summary = indexes.CharField(model_attr='summary', indexed=False)

    def get_model(self):
        return Document

    def compose_summary(self):
        # Copy of Document.compose_summary().
        ...

    def get_absolute_url(self):
        # Copy of Document.get_absolute_url().
        ...

# document_text.txt
{{ object.name }}
{{ object.visible_summary }}
{{ object.visible_text }}

However, I'm not sure this is the right way. I'm repeating code in at least two methods (compose_summary and get_absolute_url), and I think the content of summary is stored three times: once in the database (Document.summary), once in DocumentIndex.summary, and once inside DocumentIndex.text. Can someone please tell me whether this makes sense at all?

Solution

To address your two concerns:

  • There is no need to repeat the code in DocumentIndex. Each search result gives you the related Document object, e.g. {{ law.object.get_absolute_url }}, so keep the code in Document.

  • Yes, summary will be stored three times: once in the database and twice in the index. There is nothing wrong with that. Note that DocumentIndex.text is populated from your template, so it contains not only summary but everything you want indexed.
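
With the helper methods kept on Document, the HTML stripping behind visible_summary and visible_text only has to exist in one place. The following is a minimal sketch of such a helper, assuming Python's standard html.parser is good enough for your markup; the names strip_html and _TextExtractor are hypothetical and do not come from the original code.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text between tags.
        self.chunks.append(data)


def strip_html(html):
    """Return the text of `html` with tags removed and whitespace collapsed.

    Text nodes are joined with a space so that adjacent block elements
    (e.g. two <p> tags) do not fuse into a single word in the index.
    The trade-off is that a tag inside a word also introduces a space.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flushes any text still buffered by the parser
    return " ".join(" ".join(parser.chunks).split())
```

Document.visible_summary could then simply return strip_html(self.summary), and the document_text.txt template keeps calling {{ object.visible_summary }} unchanged. Django's own django.utils.html.strip_tags is an alternative if you prefer not to maintain a helper.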

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow