
OK, so I have been at this problem for the last few days, and I must admit - I am stuck.

I am trying to build a web application where users can upload documents or send inbound e-mails.

Each document or e-mail is added to a stream, so a stream holds all the docs/mails.

Here is what I want to allow my users to do:

  1. Users should be able to add "fields" to a stream. A field is essentially just a dynamic variable that users can set. For example, if a user uploads a large text file, they can create the fields below:

id | name
 1 | Order number
 2 | Tracking reference

  2. Users can then add parsing rules to each field. A parsing rule can consist of multiple methods. For example:

    • remove_empty_lines
    • text_replace
    • regex_text_replace

Users can apply as many parsing rules to the text string as they want. Every time a parsing rule has been applied, the updated text string is stored in parsing_rule_results.
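
The rule chain described above could be sketched roughly like this in plain PHP. It is only an illustration: the option keys (search, replace, pattern), the apply_parsing_rules() helper, and the shape of the rule arrays are assumptions, not part of my actual application. The returned array holds the text after each step, which is what would end up in parsing_rule_results.

```php
<?php

// Hypothetical rule handlers: each takes the current text plus the rule's
// options and returns the transformed text.
function remove_empty_lines(string $text, array $options = []): string
{
    $lines = preg_split('/\R/', $text);
    $kept = array_filter($lines, fn (string $line): bool => trim($line) !== '');

    return implode("\n", $kept);
}

function text_replace(string $text, array $options): string
{
    return str_replace($options['search'], $options['replace'], $text);
}

function regex_text_replace(string $text, array $options): string
{
    $result = preg_replace($options['pattern'], $options['replace'], $text);
    if ($result === null) {
        throw new RuntimeException('Invalid pattern: ' . $options['pattern']);
    }

    return $result;
}

// Applies the rules in order and records the text after every step.
// $rules is a list of ['method' => string, 'options' => array] (assumed shape).
function apply_parsing_rules(string $text, array $rules): array
{
    $allowed = ['remove_empty_lines', 'text_replace', 'regex_text_replace'];
    $results = [];

    foreach ($rules as $rule) {
        // Only whitelisted method names are ever called.
        if (!in_array($rule['method'], $allowed, true)) {
            throw new InvalidArgumentException('Unknown method: ' . $rule['method']);
        }
        $text = $rule['method']($text, $rule['options'] ?? []);
        $results[] = $text;
    }

    return $results;
}

$steps = apply_parsing_rules("Order: 5000251\n\n\nRef: X", [
    ['method' => 'remove_empty_lines'],
    ['method' => 'text_replace', 'options' => ['search' => 'Order: ', 'replace' => '']],
    ['method' => 'regex_text_replace', 'options' => ['pattern' => '/\nRef:.*$/s', 'replace' => '']],
]);
```

The last element of $steps is the final value for the field; the earlier elements are the intermediate strings.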

OK, so the above describes the workflow that a user can set up for each document/email, and it is bound per stream.

Now, what I want to achieve is the following:

  1. For each new document/email that is added to my application, I need to check which fields are defined for the specific stream.
  2. For each field defined, I need to run the string through each parsing_rule.
  3. Once the string has been run through each parsing rule, the end result should be dynamically saved for that specific document/email. Something like:
id | document_id | email_id | field | data
 1 | 5           | null     | 3     | 5000251
 2 | 5           | null     | 4     | AJIWO4124124J
 3 | 6           | null     | 3     | 92841
 4 | 6           | null     | 4     | KKLJPEPQ9102
 5 | null        | 2        | 3     | E-Order2000
 6 | null        | 2        | 4     | OOCLTCU8291LK

Which can be translated into:

Stream: 1

Document ID # 5:

  1. order_number = "5000251"
  2. tracking_number = "AJIWO4124124J"

Document ID # 6:

  1. order_number = "92841"
  2. tracking_number = "KKLJPEPQ9102"

Email ID # 2:

  1. order_number = "E-Order2000"
  2. tracking_number = "OOCLTCU8291LK"
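
To make the translation above concrete, here is a small sketch that groups such result rows by their owner. The rows are plain arrays, and the mapping of field ids 3 and 4 to order_number and tracking_number, as well as the group_field_results() helper, are assumptions for illustration only.

```php
<?php

// Assumed lookup from field id to field name (hypothetical).
$fieldNames = [3 => 'order_number', 4 => 'tracking_number'];

// Result rows: exactly one of document_id / email_id is set per row.
$rows = [
    ['document_id' => 5,    'email_id' => null, 'field' => 3, 'data' => '5000251'],
    ['document_id' => 5,    'email_id' => null, 'field' => 4, 'data' => 'AJIWO4124124J'],
    ['document_id' => 6,    'email_id' => null, 'field' => 3, 'data' => '92841'],
    ['document_id' => 6,    'email_id' => null, 'field' => 4, 'data' => 'KKLJPEPQ9102'],
    ['document_id' => null, 'email_id' => 2,    'field' => 3, 'data' => 'E-Order2000'],
    ['document_id' => null, 'email_id' => 2,    'field' => 4, 'data' => 'OOCLTCU8291LK'],
];

// Groups the rows by owner (document or email) and keys the values by field name.
function group_field_results(array $rows, array $fieldNames): array
{
    $grouped = [];

    foreach ($rows as $row) {
        $owner = $row['document_id'] !== null
            ? 'Document ID # ' . $row['document_id']
            : 'Email ID # ' . $row['email_id'];

        $grouped[$owner][$fieldNames[$row['field']]] = $row['data'];
    }

    return $grouped;
}

$grouped = group_field_results($rows, $fieldNames);
```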

Below is the beginning of my database design, but without "field results".

[Image: my database setup]

This is the code I have so far (models):


    use Illuminate\Database\Eloquent\Model;

    class Stream extends Model
    {
        // A stream can have many documents.
        public function documents()
        {
            return $this->hasMany(Document::class);
        }

        // A stream can have many e-mails.
        public function emails()
        {
            return $this->hasMany(Email::class);
        }

        // A stream can have many fields.
        public function fields()
        {
            return $this->hasMany(Field::class);
        }

        // A stream has fields, which in turn have parsing rules.
        public function parsingRules()
        {
            return $this->hasManyThrough(ParsingRule::class, Field::class);
        }
    }

    class Document extends Model
    {
        // A document belongs to a stream.
        public function stream()
        {
            return $this->belongsTo(Stream::class);
        }

        // A document has the fields of its stream (matched on stream_id).
        public function fields()
        {
            return $this->hasMany(Field::class, 'stream_id', 'stream_id');
        }
    }

    class Email extends Model
    {
        // An email belongs to a stream.
        public function stream()
        {
            return $this->belongsTo(Stream::class);
        }

        // An email has the fields of its stream (matched on stream_id).
        public function fields()
        {
            return $this->hasMany(Field::class, 'stream_id', 'stream_id');
        }
    }

    class Field extends Model
    {
        // A field belongs to a stream.
        public function stream()
        {
            return $this->belongsTo(Stream::class);
        }

        // A field can have many parsing rules.
        public function parsingRules()
        {
            return $this->hasMany(ParsingRule::class);
        }
    }

    class ParsingRule extends Model
    {
        // A parsing rule belongs to a field.
        public function field()
        {
            return $this->belongsTo(Field::class);
        }
    }

    // Class name assumed from the parsing_rule_results table.
    class ParsingRuleResult extends Model
    {
        // A parsing rule result belongs to a field rule.
        public function fieldrule()
        {
            return $this->belongsTo(FieldRule::class);
        }

        // A parsing rule result belongs to a document.
        public function document()
        {
            return $this->belongsTo(Document::class);
        }
    }

I believe my real problem lies in my understanding of relationships and in how I should dynamically apply the above logic.

Let's imagine I have a class that will be called whenever a new document/email is added:


    public function parse(Stream $stream, Document $document)
    {
        $text = $document->text;
        $fields = $stream->fields()->get();

        // 1. Get all parsing rules for each field.

        // 2. Parse $text using each parsing rule.

        // 3. Save the end result of $text, specific to the document/email and field.
    }


As you can see, I can fetch the stream details as well as the fields for the specific stream.

However, how can I:

  1. Make it accept both Email and Document, depending on what is being added? (The above only accepts a Document.)
  2. Run through each field, pass the text through each of its parsing rules, and save the end result so that it is specific to each document/email?

I hope the above is somewhat clear. This post got a lot longer than first expected.

Solution

To let the parse() function handle both Document and Email, you can create an interface, e.g. TextContainer, that both Document and Email implement. Something like:

interface TextContainer
{
    public function getText(): string;
    public function setText(string $text);
    public function save();
}

class ApplyParsingRules
{
    public function parse(Stream $stream, TextContainer $container)
    {
        $text = $container->getText();
        $fields = $stream->fields()->get();

        // 1. Get all parsing rules for each field.

        // 2. Parse $text using each parsing rule.

        // 3. Save the end result of $text, specific to the document/email and field.
    }
}
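
As a rough sketch of what implementing the interface might look like, the snippet below uses plain PHP classes as stand-ins for the two Eloquent models so that it runs on its own. The text and body property names, the saved flag, and the normalize_whitespace() helper are assumptions for illustration; a real Eloquent model already provides save(), so there only getText() and setText() would need to be written.

```php
<?php

// Repeated here so the snippet is self-contained.
interface TextContainer
{
    public function getText(): string;
    public function setText(string $text);
    public function save();
}

// Stand-in for the Document model (assumed to keep its content in $text).
class Document implements TextContainer
{
    public $text;
    public $saved = false;

    public function __construct(string $text)
    {
        $this->text = $text;
    }

    public function getText(): string
    {
        return $this->text;
    }

    public function setText(string $text)
    {
        $this->text = $text;
    }

    // Only records the call; a real model would persist to the database.
    public function save()
    {
        $this->saved = true;
    }
}

// Stand-in for the Email model (assumed to keep its content in $body).
class Email implements TextContainer
{
    public $body;
    public $saved = false;

    public function __construct(string $body)
    {
        $this->body = $body;
    }

    public function getText(): string
    {
        return $this->body;
    }

    public function setText(string $text)
    {
        $this->body = $text;
    }

    public function save()
    {
        $this->saved = true;
    }
}

// Hypothetical consumer: it depends only on the interface, not on the model type.
function normalize_whitespace(TextContainer $container): void
{
    $container->setText(trim(preg_replace('/[ \t]+/', ' ', $container->getText())));
    $container->save();
}

$document = new Document("  Order   5000251 ");
$email = new Email("Tracking\tOOCLTCU8291LK");
normalize_whitespace($document);
normalize_whitespace($email);
```

Because the function is typed against TextContainer, the same call works for both objects.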

I would actually prefer that parse() return the result and let the caller decide what to do with it, i.e.:

class ApplyParsingRules
{
    public function parse(Stream $stream, TextContainer $container): string
    {
        $text = $container->getText();
        $fields = $stream->fields()->get();

        // 1. Get all parsing rules for each field.

        // 2. Parse $text using each parsing rule.

        return $text;
    }
}
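
To show how steps 1 and 2 might be filled in, here is a self-contained sketch that does not depend on Laravel. It departs from the snippet above in two ways, both of which are assumptions on my part: the rules are passed in as an array of callables keyed by field name instead of being loaded from the stream, and the method returns one value per field rather than a single string. The regular expressions are made-up examples.

```php
<?php

// Reduced to the one method this sketch needs.
interface TextContainer
{
    public function getText(): string;
}

class ApplyParsingRules
{
    /**
     * @param array $rulesByField field name => ordered list of callables (string): string
     * @return array field name => parsed value
     */
    public function parse(array $rulesByField, TextContainer $container): array
    {
        $results = [];

        foreach ($rulesByField as $field => $rules) {
            // Every field starts from the original, untouched text.
            $text = $container->getText();

            // Apply the field's rules in order, feeding each result into the next rule.
            foreach ($rules as $rule) {
                $text = $rule($text);
            }

            $results[$field] = $text;
        }

        return $results;
    }
}

// Stand-in for an Email model that implements the interface.
$email = new class implements TextContainer {
    public function getText(): string
    {
        return "Order: E-Order2000\nTracking: OOCLTCU8291LK";
    }
};

// Hypothetical rules: each keeps only the token that follows a label.
$rulesByField = [
    'order_number' => [
        fn (string $t): string => preg_replace('/^.*Order: (\S+).*$/s', '$1', $t),
    ],
    'tracking_number' => [
        fn (string $t): string => preg_replace('/^.*Tracking: (\S+).*$/s', '$1', $t),
    ],
];

$parsed = (new ApplyParsingRules())->parse($rulesByField, $email);
```

The caller can then store each entry of $parsed as one result row, setting either document_id or email_id depending on which kind of container was passed in.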
Licensed under: CC-BY-SA with attribution