Question

I have a database that is outside my control, and I am pulling data from it using Power BI for reporting. The database is a list of pages on a website, where each page is a row and the columns hold metadata about the page (created by, and so on). However, not all of the metadata is stored in distinct columns: there are "Primary" and "Secondary" columns that each store multiple key value pairs. For example:

PageName | CreatedBy | Primary | Secondary 
page1    | Joe       | [owner:frank,topic:meals] | [topic:drinks]
page2    | Dale      | [owner:joe, topic:drinks, topic:meals] | [topic:appetizers]

The real metadata is more complex than this, with several other keys that can occur in either the Primary or the Secondary column. The root problem is: how can I extract these key value pairs using Power BI so that my final table has one column per key, holding all of that key's values across Primary and Secondary as a list? Something similar to this:

PageName | CreatedBy | Owner | Topic
page1    | Joe       | frank | meals, drinks
page2    | Dale      | joe   | drinks, meals, appetizers

Solution

This can be achieved using a few lines of Power Query:

  1. Remove the outer brackets ("[" and "]") using Table.ReplaceValue
  2. Merge the [Primary] and [Secondary] columns into a new [KeyValuePairs] column using Table.AddColumn and Text.Combine
  3. Split [KeyValuePairs] into rows on the pair delimiter (",") using Table.ExpandListColumn, Table.TransformColumns, and Splitter.SplitTextByDelimiter
  4. Split [KeyValuePairs] into [Key] and [Value] columns on the key value delimiter (":") using Table.SplitColumn
  5. Group by [PageName], [CreatedBy], and [Key], concatenating the [Value] column into [KeyValues] using Table.Group and Text.Combine
  6. Pivot the table on [Key], taking the values from [KeyValues], using Table.Pivot

Here is the Power Query I constructed using this method:

let
    Source = #table({"PageName", "CreatedBy", "Primary", "Secondary"}, {{"page1", "Joe", "[owner:frank,topic:meals]", "[topic:drinks]"}, {"page2", "Dale", "[owner:joe,topic:drinks,topic:meals]", "[topic:appetizers]"}}),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"PageName", type text}, {"CreatedBy", type text}, {"Primary", type text}, {"Secondary", type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type","[","",Replacer.ReplaceText,{"Primary", "Secondary"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","]","",Replacer.ReplaceText,{"Primary", "Secondary"}),
    #"Inserted Merged Column" = Table.AddColumn(#"Replaced Value1", "KeyValuePairs", each Text.Combine({[Primary], [Secondary]}, ","), type text),
    #"Removed Columns1" = Table.RemoveColumns(#"Inserted Merged Column",{"Primary", "Secondary"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Removed Columns1", {{"KeyValuePairs", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "KeyValuePairs"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"KeyValuePairs", type text}}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type1", "KeyValuePairs", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), {"Key", "Value"}),
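    // Trim whitespace so that " topic" (as in the question's "owner:joe, topic:drinks") groups with "topic"
    #"Changed Type2" = Table.TransformColumns(Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Key", type text}, {"Value", type text}}),{{"Key", Text.Trim, type text}, {"Value", Text.Trim, type text}}),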
    #"Grouped Rows" = Table.Group(#"Changed Type2", {"PageName", "CreatedBy", "Key"}, {{"KeyValues", each Text.Combine(_[Value], ", "), type text}}),
    #"Pivoted Column" = Table.Pivot(#"Grouped Rows", List.Distinct(#"Grouped Rows"[Key]), "Key", "KeyValues")
in
    #"Pivoted Column"
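
As a variation on the same idea, the parsing can be pulled into a small reusable function. This is only a sketch under some assumptions: ParsePairs and the step names are hypothetical names chosen for illustration, every pair is assumed to look like key:value, and neither keys nor values are assumed to contain "," or ":". It also tolerates spaces after the commas, as in the question's page2 row.

```powerquery
let
    // Hypothetical helper: turns "[k:v, k:v]" into a list of [Key, Value] records
    ParsePairs = (cell as nullable text) as list =>
        let
            Raw = if cell = null then "" else cell,
            // Strip the outer brackets and any surrounding spaces
            Inner = Text.Trim(Raw, {"[", "]", " "}),
            // One item per pair, ignoring empty items such as those from "[]"
            Items = List.Select(Text.Split(Inner, ","), each Text.Trim(_) <> "")
        in
            List.Transform(Items, each [
                Key = Text.Trim(Text.BeforeDelimiter(_, ":")),
                Value = Text.Trim(Text.AfterDelimiter(_, ":"))
            ]),
    // Sample data copied from the question, including the spaces in page2
    Source = #table({"PageName", "CreatedBy", "Primary", "Secondary"}, {{"page1", "Joe", "[owner:frank,topic:meals]", "[topic:drinks]"}, {"page2", "Dale", "[owner:joe, topic:drinks, topic:meals]", "[topic:appetizers]"}}),
    // Parse both columns and concatenate the resulting lists
    AddPairs = Table.AddColumn(Source, "Pair", each ParsePairs([Primary]) & ParsePairs([Secondary])),
    RemovedRaw = Table.RemoveColumns(AddPairs, {"Primary", "Secondary"}),
    // One row per pair, then one column per record field
    ExpandedRows = Table.ExpandListColumn(RemovedRaw, "Pair"),
    ExpandedFields = Table.ExpandRecordColumn(ExpandedRows, "Pair", {"Key", "Value"}),
    // Pages without any pairs would yield a null Key; drop those rows before pivoting
    WithKeys = Table.SelectRows(ExpandedFields, each [Key] <> null),
    Grouped = Table.Group(WithKeys, {"PageName", "CreatedBy", "Key"}, {{"KeyValues", each Text.Combine([Value], ", "), type text}}),
    Pivoted = Table.Pivot(Grouped, List.Distinct(Grouped[Key]), "Key", "KeyValues")
in
    Pivoted
```

With the sample rows, this should give one row per page with an owner column and a topic column, where the topic values from Primary and Secondary are joined by ", ". Note that a page with no key value pairs at all is dropped by the WithKeys step; if such pages need to be kept, they would have to be merged back in afterwards.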
Licensed under: CC-BY-SA with attribution