Question

I want to prevent users from submitting duplicate URLs to the database.

Right now my approach is:

db.url.URL.requires=[IS_URL(error_message='URL Error'),
                     IS_NOT_IN_DB(db, 'url.URL',error_message='Duplicate URL')]

This handles URLs both with and without "http". For example, if www.123.com is already in the database, a user cannot submit http://www.123.com. But this approach does NOT handle "https", i.e. the user can still submit https://www.123.com.

Is there any way to prevent such duplication?

I am thinking of stripping the "http"/"https" prefix, if any, from URLs before calling SQLFORM().process(), so that every URL in the database is stored without a scheme. But I don't know how to edit user input before calling SQLFORM().process().

Right now my code is

url_form=SQLFORM(db.url).process()

Any ideas?

Thank you!


Solution

You could add a custom validator that strips a leading http:// or https:// before the other validators run:

import re
db.url.URL.requires = [lambda url: (re.sub(r'^https?://', '', url), None),
                       IS_URL(error_message='URL Error'),
                       IS_NOT_IN_DB(db, 'url.URL', error_message='Duplicate URL')]

Note that the custom validator returns a tuple containing the altered URL and None (the None indicates that there is no error). The altered URL is then passed to the remaining two validators.
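
The (value, error) contract described above can be tried outside web2py. The following is a minimal sketch in plain Python. The names strip_scheme, reject_empty, and run_validators are hypothetical and used only for illustration, and run_validators merely imitates applying a list of validators in order. It is not web2py's own implementation.

```python
import re

def strip_scheme(url):
    """Validator-style callable: returns (new_value, error)."""
    # Remove a leading http:// or https:// (case-insensitive), if present.
    return (re.sub(r'^https?://', '', url, flags=re.IGNORECASE), None)

def reject_empty(value):
    """A second validator-style callable, used here to show chaining."""
    if not value:
        return (value, 'URL Error')
    return (value, None)

def run_validators(value, validators):
    """Hypothetical helper: pass each validator's output to the next one,
    stopping at the first error."""
    for validator in validators:
        value, error = validator(value)
        if error is not None:
            return (value, error)
    return (value, None)

print(run_validators('https://www.123.com', [strip_scheme, reject_empty]))
# ('www.123.com', None)
print(run_validators('http://', [strip_scheme, reject_empty]))
# ('', 'URL Error')
```

Because each validator receives the value returned by the previous one, the order of the list matters: the stripping step has to come before the duplicate check.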

Also note that, by default, IS_URL will prepend "http://" to any URL that lacks a scheme (which will be all URLs in this case, since the first validator strips the scheme). To suppress that behavior, you can use IS_URL(prepend_scheme=None).

Other tips

You can create a custom validator that checks the database for both the http and https versions of a URL. A custom validator also lets you normalize the URL, for example by lowercasing the hostname and removing the query string (?a=b). If you plan to do that, be sure to check out urlparse (urllib.parse in Python 3).
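
As a rough illustration of that kind of normalization, here is a minimal sketch using Python 3's urllib.parse. The function name normalize_url is hypothetical, and dropping the query string and fragment is an assumption about what should count as "the same URL", so adjust it to your own rules.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_url(url):
    """Return the URL with a lowercase scheme and host and without
    its query string or fragment."""
    parts = urlsplit(url)
    # urlsplit already lowercases the scheme; the netloc is left as entered.
    # Lowercasing the whole netloc also lowercases any user:password part,
    # which is acceptable for this sketch.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, '', ''))

print(normalize_url('HTTPS://WWW.123.com/Page?a=b#top'))
# https://www.123.com/Page
```

Note that a URL entered without a scheme (such as www.123.com) is parsed as a bare path, so its hostname is not lowercased here. Running IS_URL first, which prepends a scheme by default, would avoid that case.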

The following code is untested, but may provide you with enough code to create your own solution.

class scheme_independant_url_is_not_in_db:
    def __init__(self, db, error_message='Duplicate URL'):
        self.db = db
        self.error_message = error_message

    def __call__(self, value):
        # reject the URL exactly as entered if it is already stored
        url_validator = IS_NOT_IN_DB(self.db, 'url.URL')
        _, error = url_validator(value)
        if error:
            return (value, self.error_message)
        # build the same URL with the opposite scheme
        lowered = value.lower()
        if lowered.startswith('http:'):
            opposite_scheme_value = 'https:' + value[5:]
        elif lowered.startswith('https:'):
            opposite_scheme_value = 'http:' + value[6:]
        else:
            # no http/https scheme, so there is nothing more to compare
            return (value, None)
        # reject the URL if its opposite-scheme twin is already stored
        _, error = url_validator(opposite_scheme_value)
        if error:
            return (value, self.error_message)
        # return the original URL, preserving the original scheme
        return (value, None)
... 

db.url.URL.requires = [IS_URL(error_message='URL Error'),
                       scheme_independant_url_is_not_in_db(db)]
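
Since the validator above is untested, it may help to check the opposite-scheme logic on its own first. The following is a minimal sketch in plain Python that uses a set as a stand-in for the url.URL column. The names scheme_variants and is_duplicate are hypothetical and are not part of web2py.

```python
def scheme_variants(url):
    """Return the URL plus its http/https twin, if it has such a scheme."""
    lowered = url.lower()
    if lowered.startswith('http://'):
        return [url, 'https://' + url[len('http://'):]]
    if lowered.startswith('https://'):
        return [url, 'http://' + url[len('https://'):]]
    # No http/https scheme: there is no twin to check.
    return [url]

def is_duplicate(url, stored_urls):
    """True if the URL or its opposite-scheme twin is already stored."""
    return any(variant in stored_urls for variant in scheme_variants(url))

stored = {'http://www.123.com'}  # stand-in for the url.URL column
print(is_duplicate('https://www.123.com', stored))  # True
print(is_duplicate('http://www.456.com', stored))   # False
```

The comparison here is an exact string match, so it only catches duplicates that differ in scheme alone.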
License: CC-BY-SA with attribution
Not affiliated with StackOverflow