Question

Flex tables are one of the new features in Vertica 7.0.

Can anyone tell me how a flex table converts unstructured data into structured data?

Thanks in advance!

No correct solution

OTHER TIPS

Well, flex tables are a new feature in Vertica 7.0. This feature creates a different kind of table designed especially for loading and querying unstructured data, also called semi-structured data in HP Vertica. The syntax to create a flex table is:

create flex table unstruc_data();

The new unstruc_data table has two columns, __identity__ and __raw__. The __raw__ column holds the content of the semi-structured data and has type LONG VARBINARY; the __identity__ column serves as the row id.
Flex tables come with a set of helper functions:

  • COMPUTE_FLEXTABLE_KEYS
  • BUILD_FLEXTABLE_VIEW
  • COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW
  • MATERIALIZE_FLEXTABLE_COLUMNS
  • RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW

I am not going to go through explaining all of them, as I think you should go and study them. For more details on the new Vertica features, see Vertica 7.0 New Stuff.
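To make the two-column layout a bit more concrete, here is a rough Python model of it. This is only a sketch: the names flex_rows, load_json_line and lookup are made up for illustration, and plain JSON bytes stand in for whatever binary map format Vertica actually keeps in the raw column.

```python
import json

# Hypothetical stand-in for a flex table: each row is (identity, raw blob).
# Assumption: JSON bytes replace Vertica's real internal map format.
flex_rows = []

def load_json_line(line):
    """Store one JSON document as an opaque blob; no schema is declared."""
    identity = len(flex_rows) + 1
    flex_rows.append((identity, line.encode("utf-8")))

def lookup(raw, key):
    """Decode the blob at query time and return the value for key, or None."""
    return json.loads(raw.decode("utf-8")).get(key)

load_json_line('{"Name":"Rahul","Age":30}')
load_json_line('{"Name":"Sid","Age":22,"Gender":"M"}')

names = [lookup(raw, "Name") for _, raw in flex_rows]
genders = [lookup(raw, "Gender") for _, raw in flex_rows]
print(names)
print(genders)
```

The point of the model is that loading never looks at the keys; structure is only imposed when a value is looked up.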

Consider a scenario where a JSON document is passed to you by a client and you need to store it in a Vertica DB.

Without a flex table, you have to:
1) Know the structure of the JSON.
2) Create a table in the Vertica DB.
3) Extract each column value from the JSON document.
4) Insert the values into the table.

On top of this, if a new key is added to the JSON, there is an additional task on the Vertica DB side to alter the table, and another on the processing logic to pick up the new key/value pair.
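The pain of the fixed-schema route can be sketched in a few lines of Python. This is an illustration only: COLUMNS and to_row are hypothetical names, and the error simply marks the point where someone would have to alter the table and update the loader.

```python
import json

# Assumption: the table was created with exactly these two columns.
COLUMNS = ["Name", "Age"]

def to_row(document):
    """Map a JSON document onto the fixed columns; reject unknown keys."""
    data = json.loads(document)
    unknown = sorted(set(data) - set(COLUMNS))
    if unknown:
        # In a real pipeline this is where ALTER TABLE and a code change are needed.
        raise ValueError(f"table must be altered first, new keys: {unknown}")
    return tuple(data.get(col) for col in COLUMNS)

print(to_row('{"Name":"Rahul","Age":30}'))
```

A document that carries an extra key, such as "Gender", makes to_row raise until both the column list and the table are changed.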

With a flex table this becomes much simpler. Here is a detailed walkthrough:

1) Take the JSON below, saved as EE.txt
    {"Name":"Rahul","Age":30}
2) Create a flex table EMP_Test
    dbadmin=> create flex table EMP_Test();
    CREATE TABLE
3) Load the data into the flex table 
    dbadmin=> copy EMP_Test from '/home/dbadmin/EE.txt' parser fjsonparser();
     Rows Loaded 
    -------------
               1
    (1 row)

4) To find out which keys are present in your JSON, refresh the keys table using the command below
    dbadmin=> select compute_flextable_keys('EMP_Test');
                  compute_flextable_keys              
    --------------------------------------------------
     Please see public.EMP_Test_keys for updated keys
    (1 row)
    dbadmin=> select * FRom EMP_Test_keys;
     key_name | frequency | data_type_guess 
    ----------+-----------+-----------------
     Age      |         1 | varchar(20)
     Name     |         1 | varchar(20)
    (2 rows)


5) Build the view for the flex table using the command below. You can then query the view for data
    dbadmin=> select build_flextable_view('EMP_Test');
                    build_flextable_view                 
    -----------------------------------------------------
     The view public.EMP_Test_view is ready for querying
    (1 row)

    dbadmin=> select * From EMP_Test_View
    dbadmin-> ;
     age | name  
    -----+-------
     30  | Rahul
    (1 row)

6) Now suppose your JSON structure changes and an additional key 'Gender' is added (saved here as EE1.txt).
        {"Name":"Sid","Age":22,"Gender":"M"}

7) You can load the data directly into the table EMP_Test
    dbadmin=> copy EMP_Test from '/home/dbadmin/EE1.txt' parser fjsonparser();
     Rows Loaded 
    -------------
               1
    (1 row)
8) Recompute the keys and rebuild the view using the commands below
    dbadmin=> select compute_flextable_keys('EMP_Test');
                  compute_flextable_keys              
    --------------------------------------------------
     Please see public.EMP_Test_keys for updated keys
    (1 row)

    dbadmin=> select build_flextable_view('EMP_Test');
                    build_flextable_view                 
    -----------------------------------------------------
     The view public.EMP_Test_view is ready for querying
    (1 row)

9) You can see the new data and the new keys using the commands below.
    dbadmin=> select * From EMP_Test_keys;
     key_name | frequency | data_type_guess 
    ----------+-----------+-----------------
     Age      |         2 | varchar(20)
     Name     |         2 | varchar(20)
     Gender   |         1 | varchar(20)
    (3 rows)

    dbadmin=> select * From EMP_test_view;
     age | name  | gender 
    -----+-------+--------
     30  | Rahul | 
     22  | Sid   | M
    (2 rows)

This is how a flex table converts unstructured (semi-structured) data to structured data.
Flex tables make it very easy to integrate any data service with a Vertica DB.
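Conceptually, the keys table and the view from the steps above behave roughly like the Python sketch below. It is a model under stated assumptions, not Vertica's implementation: compute_keys and build_view are hypothetical helpers, JSON bytes stand in for the stored raw data, and the data_type_guess column is not modelled at all.

```python
import json
from collections import Counter

# Assumption: the two documents loaded in the walkthrough, kept as raw blobs.
raw_rows = [
    b'{"Name":"Rahul","Age":30}',
    b'{"Name":"Sid","Age":22,"Gender":"M"}',
]

def compute_keys(rows):
    """Count how many rows contain each key (the 'frequency' column)."""
    freq = Counter()
    for raw in rows:
        freq.update(json.loads(raw).keys())
    return freq

def build_view(rows, keys):
    """Project every row onto the known keys; missing keys become None."""
    return [{k: json.loads(raw).get(k) for k in keys} for raw in rows]

keys = compute_keys(raw_rows)
view = build_view(raw_rows, list(keys))
print(dict(keys))
for row in view:
    print(row)
```

Loading a document with a new key changes nothing until the keys are recomputed, which matches the need to rerun both helper functions in step 8.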

All unstructured data is saved to the raw data field.

It's a BLOB.

Accessing an unstructured field is slow, because the value has to be extracted from the BLOB every time.
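A small Python sketch of why that is, and of what promoting a key to a real column buys you. Everything here is illustrative: extract and the stats counter are made-up names, and JSON decoding stands in for the BLOB extraction. As far as I know, the MATERIALIZE_FLEXTABLE_COLUMNS helper listed earlier in this thread exists for exactly this purpose.

```python
import json

# Assumption: JSON bytes stand in for the raw BLOB of each row.
raw_rows = [
    b'{"Name":"Rahul","Age":30}',
    b'{"Name":"Sid","Age":22,"Gender":"M"}',
]
stats = {"decodes": 0}

def extract(raw, key):
    """Decode one blob and pull out a single key, counting the work done."""
    stats["decodes"] += 1
    return json.loads(raw).get(key)

# Unmaterialized key: every query has to decode every blob again.
for _ in range(3):
    ages = [extract(raw, "Age") for raw in raw_rows]
virtual_cost = stats["decodes"]

# Materialized key: decode once, then later queries read plain values.
stats["decodes"] = 0
age_column = [extract(raw, "Age") for raw in raw_rows]
for _ in range(3):
    ages = list(age_column)
materialized_cost = stats["decodes"]

print(virtual_cost, materialized_cost)
```

The extraction cost grows with the number of queries in the first case and is paid once per row in the second.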

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow