Question

I am planning to create a simple search engine in Python (Python 3). After going through the documentation for SQLite FTS3/FTS4, I chose it to store the documents, since full-text searches are fast. I already have a set of web pages, with their text extracted and saved in text files.
Hence I planned to create the FTS4 table the following way:

import sqlite3

conn = sqlite3.connect('/home/xyz/exampledb.db')
c = conn.cursor()
c.execute("CREATE VIRTUAL TABLE mypages USING fts4(docid, name, content)")


Then I would iterate over the text files, read each one into a string, and insert that string into the FTS table along with the name and docid (an integer from 1 to n, where n is the total number of documents).
But the following statement in the SQLite documentation has me confused, and I am not sure my code above will work:
A virtual table is an interface to an external storage or computation engine that appears to be a table but does not actually store information in the database file.
So where will the information be stored? If it were a regular SQLite table, I would first create a database file and create the table in that file. If I had to use the same database on another machine, I would simply copy the file to that machine. I might have missed something in the documentation, but I want to be clear on how the information will be stored before I implement it.


Solution

That statement from the documentation is somewhat misleading; the virtual table itself does not store data in the database, but the engine that implements the virtual table might choose to use other tables to store the data.

What happens for FTS is explained in section 9.1 of the documentation:

For each FTS virtual table in a database, three to five real (non-virtual) tables are created to store the underlying data. These real tables are called "shadow tables". The real tables are named "%_content", "%_segdir", "%_segments", "%_stat", and "%_docsize", where "%" is replaced by the name of the FTS virtual table.
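
To see this in practice, here is a minimal sketch, assuming your SQLite build has FTS4 compiled in. The temporary path, the sample row, and the search term are made up for illustration. It omits the explicit docid column, since every FTS table already has an implicit integer key reachable as "rowid" or "docid". It lists the shadow tables from sqlite_master, then reopens the file to check that the data is still searchable, which is what you would rely on when copying the file to another machine.

```python
import os
import sqlite3
import tempfile

# Hypothetical path and sample data, used only for this illustration.
db_path = os.path.join(tempfile.mkdtemp(), "exampledb.db")

conn = sqlite3.connect(db_path)
# No declared docid column: the implicit key can still be set on insert.
conn.execute("CREATE VIRTUAL TABLE mypages USING fts4(name, content)")
conn.execute(
    "INSERT INTO mypages(docid, name, content) VALUES (?, ?, ?)",
    (1, "page1.txt", "sqlite stores full text indexes in shadow tables"),
)
conn.commit()

# Shadow tables are ordinary tables, so they are listed in sqlite_master.
shadow_tables = sorted(
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
    if name.startswith("mypages_")
)
print(shadow_tables)
conn.close()

# Reopen the same file, as if it had been copied elsewhere.
conn = sqlite3.connect(db_path)
matches = conn.execute(
    "SELECT docid, name FROM mypages WHERE mypages MATCH ?", ("shadow",)
).fetchall()
print(matches)
conn.close()
```

The printed list should include names such as mypages_content, mypages_segdir, and mypages_segments. They live in the same database file as everything else, so copying that one file carries the indexed documents with it.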

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow