Question

I am trying to write a program to decode view state given a URL. I know similar programs exist, but this is more of an exercise than a project. Given the HTML source of a page, how can I get the value of the view state form element? I started by doing this:

def get_viewstate(html):
        i = html.index('id="__VIEWSTATE" value="')
        somedata = html[i+len('id="__VIEWSTATE" value="'):]

But I couldn't figure out an efficient way to only retrieve the value of the element up to the end tag. What is the most efficient way to retrieve the value of this form element?

Solution

Using lxml with a CSS selector (cssselect() needs the separate cssselect package installed):

import lxml.html

root = lxml.html.fromstring(html)
matched = root.cssselect('#__VIEWSTATE')
if matched:
    value = matched[0].get('value')

Using BeautifulSoup:

from bs4 import BeautifulSoup

# Name the parser explicitly; omitting it makes bs4 guess one and emit a warning
soup = BeautifulSoup(html, 'html.parser')
matched = soup.select('#__VIEWSTATE')
if matched:
    value = matched[0].get('value')
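
If neither library is available, the standard library's html.parser can probably do the same job. The following is only a minimal sketch: the class name ViewStateParser and this version of get_viewstate are made up for illustration, and it assumes the page has a single element with id="__VIEWSTATE" whose value attribute holds the view state.

```python
from html.parser import HTMLParser


class ViewStateParser(HTMLParser):
    """Remember the value attribute of the first element with id="__VIEWSTATE"."""

    def __init__(self):
        super().__init__()
        self.value = None

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs with lowercased names.
        attributes = dict(attrs)
        if self.value is None and attributes.get("id") == "__VIEWSTATE":
            self.value = attributes.get("value")


def get_viewstate(html):
    # Returns the view state string, or None if no matching element was found.
    parser = ViewStateParser()
    parser.feed(html)
    return parser.value
```

Calling get_viewstate(html) on the page source returns the raw attribute value, and None when the element is missing, so the caller can check for that case before trying to decode anything.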
Licensed under: CC-BY-SA with attribution