XML parser selector skips mixed content in migration
Question
I am trying to migrate and map the source fields of an external RSS/XML file into Drupal.
I have built a custom module, and in its config/install folder I have this YAML file, modules/custom/import_rss/config/install/migrate_plus.migration.xml_articles.yml:
id: xml_articles
label: 'Import articles'
status: true
source:
  plugin: url
  data_fetcher_plugin: http
  urls: 'https://myrss.xml'
  data_parser_plugin: simple_xml
  item_selector: /rss/channel/item
  fields:
    -
      name: guid
      label: GUID
      selector: guid
    -
      name: title
      label: Title
      selector: title
    -
      name: pub_date
      label: 'Publication date'
      selector: pubDate
    -
      name: link
      label: 'Origin link'
      selector: link
    -
      name: summary
      label: Summary
      selector: description
  ids:
    guid:
      type: string
destination:
  plugin: 'entity:node'
process:
  title:
    plugin: get
    source: title
  field_remote_url: link
  body: summary
  created:
    plugin: format_date
    from_format: 'D, d M Y H:i:s O'
    to_format: 'U'
    source: pub_date
  status:
    plugin: default_value
    default_value: 1
  type:
    plugin: default_value
    default_value: my_article
After running drush mim xml_articles, all the items are imported, but the body field of the resulting Drupal nodes is empty.
If I look at the source XML file, the <description> tag looks like this:
<description><div class="field field-name-body field-type-text-with-summary field-label-hidden text-content text-secondary"><div class="field-items"><div class="field-item even"><p>You might have noticed some changes on Drupalize.Me lately. We've just wrapped up a huge content archiving project and I'd like to share what we did and how this will help us move forward with a library of tutorials for Drupal learners that we're committed to keeping up-to-date.</p>
</div></div></div></description>
In the XML source, the <description> element contains three nested <div> tags: <description><div><div><div>Text</div></div></div></description>
I tried description/* as the selector, but that does not work either.
How do I get this working?
Solution
Change the selector of the summary field from description to description/div/div/div/* so that it reaches the elements inside the innermost <div>.
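A likely explanation, offered as an assumption rather than something confirmed in the question: the <description> element has no text of its own, only child elements, and casting a SimpleXML element to a string yields only its direct text. The selector therefore has to point at the element that actually holds the text, here the <p> inside the third <div>. As a minimal sketch, only the summary entry under source: fields: in the migration file above would change:

```yaml
source:
  fields:
    -
      name: summary
      label: Summary
      # Walk through the three nested <div> wrappers and select the
      # elements inside the innermost one (the <p> in the sample item).
      selector: description/div/div/div/*
```

After editing a file in config/install, the stored configuration normally has to be re-imported (for example by reinstalling the module) before running drush mim xml_articles again. If the innermost <div> holds several paragraphs, this selector may return several values instead of a single string, so the body mapping might need adjusting in that case.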
Licensed under: CC-BY-SA with attribution