Question

OK, here is an example of the text I got:

"data": [
  {
     "post_id": "164902600239452_10202071734744222",
     "actor_id": 164902600239452,
     "target_id": null,
     "likes": {
        "href": "https://www.facebook.com/browse/likes/?id=10202071734744222",
        "count": 2,
        "sample": [
       678063648,
       100000551340876,
       100000805495404,
       100000905843684,
        ],
        "friends": [

        ],
        "user_likes": false,
        "can_like": true
     },
     "comments": {
        "can_remove": false,
        "can_post": true,
        "count": 0,
        "comment_list": [

        ]
     },
     "message": "Down to the FINAL 3 SEATS for It Factor LIVE 2013... WHO will snag them before we close registration on October 15th???\n\nLearn more now at http://www.ItFactorLIVE.com/"
  }, ]

I want to match only the numbers inside the square brackets that follow "sample":

            "sample": [
       678063648,
       100000551340876,
       100000805495404,
       100000905843684,
        ],

so that I end up with this:

           678063648
           100000551340876
           100000805495404
           100000905843684

Could somebody please help me with a regex that does this?


Solution

I looked at the solution @hwnd suggested and at the "real" data you linked, and came up with the following:

\d+(?=,*\s+(?:\d|\]))

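You can see at http://regex101.com/r/pL3gW2 that this matches every run of digits inside the square brackets of the sample. The lookahead requires each run of digits to be followed by optional commas, then whitespace, then either another digit or a closing ].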

The key difference from @hwnd's solution is the * added after the comma, which makes the comma after the digits optional. This lets the expression match the last number before the closing ]; without it, the match skipped that last number.
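To make the effect of the optional comma concrete, here is a minimal Python sketch. The text is a trimmed copy of the "likes" object from the question. The variable names are only illustrative, and the "strict" pattern is simply the expression above with the * removed, which is an assumption about the earlier suggestion based on the description here, not a quote of it.

```python
import re

# Trimmed copy of the "likes" object from the question.
text = '''
"likes": {
   "href": "https://www.facebook.com/browse/likes/?id=10202071734744222",
   "count": 2,
   "sample": [
  678063648,
  100000551340876,
  100000805495404,
  100000905843684,
   ],
   "friends": [

   ],
   "user_likes": false
}
'''

# Pattern from this answer: digits followed by optional commas,
# whitespace, and then another digit or a closing bracket.
pattern = re.compile(r"\d+(?=,*\s+(?:\d|\]))")

# Same pattern with a mandatory comma (assumed form of the earlier suggestion).
strict = re.compile(r"\d+(?=,\s+(?:\d|\]))")

# Variant of the text in which the last number has no trailing comma.
no_trailing_comma = text.replace("100000905843684,", "100000905843684")

print(pattern.findall(text))               # all four IDs
print(pattern.findall(no_trailing_comma))  # still all four IDs
print(strict.findall(no_trailing_comma))   # the last ID is missing
```

Note that neither pattern checks for the "sample" key itself. They only look at what follows each run of digits, so digits elsewhere in the text that happen to be followed by whitespace and another digit or a ] would match as well.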

It has been said before, but powerful JSON parsers are available for almost every language and platform. If the data is valid JSON, consider using one instead of a regex.
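As a rough illustration of the parser route, here is a sketch using Python's standard json module. It assumes the real response is well-formed JSON: wrapped in an outer object and without the trailing commas that appear in the pasted text, which a strict parser would reject. The raw string below is a hypothetical cleaned-up excerpt, not the actual response.

```python
import json

# Hypothetical well-formed excerpt of the response (assumption: the real
# data has an outer object and no trailing commas).
raw = '''
{
  "data": [
    {
      "post_id": "164902600239452_10202071734744222",
      "likes": {
        "count": 2,
        "sample": [678063648, 100000551340876, 100000805495404, 100000905843684]
      }
    }
  ]
}
'''

response = json.loads(raw)

# Collect the "sample" IDs of every post; .get() tolerates posts
# that have no "likes" object or no "sample" list.
sample_ids = [
    user_id
    for post in response["data"]
    for user_id in post.get("likes", {}).get("sample", [])
]

print(sample_ids)
```

This reads the IDs from the "sample" key directly, so numbers in other fields such as "message" or "href" cannot be picked up by accident.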

OTHER TIPS

See if this pattern works for you:

(\d+)(?=(?:(?!\[).)*\])
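A short Python sketch of how this pattern behaves. It assumes the pattern is meant to run in single-line (DOTALL) mode so that . can cross line breaks, which is not stated above. The test strings are trimmed from the question, with a shortened "message" field made up for illustration.

```python
import re

# Assumption: DOTALL, so "." also matches newlines inside the array.
pattern = re.compile(r"(\d+)(?=(?:(?!\[).)*\])", re.DOTALL)

array_only = '''"sample": [
  678063648,
  100000551340876,
  100000805495404,
  100000905843684,
],'''

# Each number is followed by a "]" with no "[" in between, so all four match.
print(pattern.findall(array_only))

# Without DOTALL, "." stops at the line break and nothing matches here.
print(re.findall(r"(\d+)(?=(?:(?!\[).)*\])", array_only))

# Shortened, made-up "message" field followed by the closing "]" of "data".
with_message = array_only + '\n"message": "FINAL 3 SEATS"\n}, ]'

# The 3 in the message is also followed by a "]" with no "[" in between,
# so it is matched along with the four IDs.
print(pattern.findall(with_message))
```

In other words, the pattern keys on the next ] rather than on the "sample" array, so it is worth checking the result against the full text before relying on it.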

Licensed under: CC-BY-SA with attribution