It turns out that, because of the very complex DB structure, a lot of views have also been created in the database. View_CMS_Tree_Joined
is the view (or one of the views) I am looking for.
A quick reference to some of the columns that I will need:
[ClassName]
[ClassDisplayName]
[DocumentPageTitle]
[DocumentPageKeyWords]
[DocumentPageDescription]
[DocumentContent]
[DocumentType]
[NodeAliasPath]
[DocumentUrlPath]
[DocumentExtensions]
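As a rough sketch, the columns listed above could be pulled out of the view with a plain SELECT. The snippet only builds the query text. It assumes SQL Server style bracket quoting (as used in the list above), and `build_select` is a hypothetical helper name, not something Kentico provides.

```python
# Columns taken from the quick reference above; the view name is the one
# mentioned in the post. Everything else here is illustrative.
COLUMNS = [
    "ClassName",
    "ClassDisplayName",
    "DocumentPageTitle",
    "DocumentPageKeyWords",
    "DocumentPageDescription",
    "DocumentContent",
    "DocumentType",
    "NodeAliasPath",
    "DocumentUrlPath",
    "DocumentExtensions",
]


def build_select(view="View_CMS_Tree_Joined", columns=COLUMNS):
    """Return a SELECT statement for the given columns, bracket-quoted."""
    column_list = ",\n    ".join(f"[{name}]" for name in columns)
    return f"SELECT\n    {column_list}\nFROM [{view}]"


if __name__ == "__main__":
    # Print the query so it can be pasted into a SQL client.
    print(build_select())
```

The resulting string can be run through whatever SQL client or driver is at hand; no connection details are assumed here.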
These columns (and some others) seem to be enough to parse some data out. One pain point will be the actual page content, since it is stored in the DB in a rather interesting way:
<content>
<webpart id="editabletext;821223e7-e515-4a0b-92c1-30726c724889"><![CDATA[<p>SOME TEXT HERE</p>]]></webpart>
<webpart id="editableimage;27a57931-f182-4ae9-b41d-1af0790d5286"><![CDATA[<image><property name="imagepath">~/asdasd/media/asdasd/images/2013/4.gif</property></image>]]></webpart>
<!-- EVEN MORE STUFF LIKE THAT-->
</content>
So every single tag is enclosed in CDATA and webpart tags. I can live with the CDATA, but why non-standard tags? Anyway, I will manage to painfully parse it out.
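Here is a minimal sketch of how that parsing might look with Python's standard library. It assumes the stored content is well-formed XML with every webpart element closed, and that the part of the `id` before the semicolon names the kind of editable region while the rest is a GUID (my reading of the sample, not something I have confirmed). The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The sample from above, with each webpart element closed.
SAMPLE = """<content>
<webpart id="editabletext;821223e7-e515-4a0b-92c1-30726c724889"><![CDATA[<p>SOME TEXT HERE</p>]]></webpart>
<webpart id="editableimage;27a57931-f182-4ae9-b41d-1af0790d5286"><![CDATA[<image><property name="imagepath">~/asdasd/media/asdasd/images/2013/4.gif</property></image>]]></webpart>
</content>"""


def parse_document_content(xml_text):
    """Split a DocumentContent value into its webparts.

    The parser hands back the CDATA payload as plain element text,
    so no special CDATA handling is needed.
    """
    root = ET.fromstring(xml_text)
    parts = []
    for webpart in root.iter("webpart"):
        # Assumption: the id has the form "<kind>;<guid>".
        kind, _, guid = webpart.get("id", "").partition(";")
        parts.append({"kind": kind, "guid": guid, "body": webpart.text or ""})
    return parts


def image_path(body):
    """Return the imagepath property of an editableimage payload, or None."""
    image = ET.fromstring(body)
    prop = image.find("property[@name='imagepath']")
    return prop.text if prop is not None else None


if __name__ == "__main__":
    for part in parse_document_content(SAMPLE):
        if part["kind"] == "editableimage":
            print(part["kind"], image_path(part["body"]))
        else:
            print(part["kind"], part["body"])
```

The text payloads are HTML fragments and may not be valid XML on their own, so the sketch leaves them as raw strings and only parses the image payload a second time.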
Some additional information about the structure of the database can be found in this document. The database reference also proved to be a nice resource.
Many thanks to the guys over at the Kentico Dev Forums; their remarks on this can be found here.