Question

I'm trying to build an URL-alias app which allows the user create aliases for existing url in his website.

I'm trying to do this via middleware, where the request.META['PATH_INFO'] is checked against database records of aliases:

try:
    src: request.META['PATH_INFO']
    alias = Alias.objects.get(src=src)
    view = get_view_for_this_path(request)

    return view(request) 
except Alias.DoesNotExist:
   pass

return None

However, for this to work correctly it is of life-importance that (at least) the PATH_INFO is changed to the destination path.

Now there are some snippets which allow the developer to create testing request objects (http://djangosnippets.org/snippets/963/, http://djangosnippets.org/snippets/2231/), but these state that they are intended for testing purposes.

Of course, it could be that these snippets are fit for usage in a live enviroment, but my knowledge about Django request processing is too undeveloped to assess this.

Was it helpful?

Solution

Instead of the approach you're taking, have you considered the Redirects app?

It won't invisibly alias the path /foo/ to return the view bar(), but it will redirect /foo/ to /bar/

OTHER TIPS

(posted as answer because comments do not seem to support linebreaks or other markup)

Thank for the advice, I have the same feeling regarding modifying request attributes. There must be a reason that the Django manual states that they should be considered read only.

I came up with this middleware:

def process_request(self, request):
    try:
        obj = A.objects.get(src=request.path_info.rstrip('/')) #The alias record.
        view, args, kwargs = resolve_to_func(obj.dst + '/') #Modified http://djangosnippets.org/snippets/2262/
        request.path = request.path.replace(request.path_info, obj.dst)
        request.path_info = obj.dst
        request.META['PATH_INFO'] = obj.dst
        request.META['ROUTED_FROM'] = obj.src
        request.is_routed = True

        return view(request, *args, **kwargs)
    except A.DoesNotExist: #No alias for this path
        request.is_routed = False
    except TypeError: #View does not exist.
        pass

    return None

But, considering the objections against modifying the requests' attributes, wouldn't it be a better solution to just skip that part, and only add the is_routed and ROUTED_TO (instead of routed from) parts?

Code that relies on the original path could then use that key from META.

Doing this using URLConfs is not possible, because this aliasing is aimed at enabling the end-user to configure his own URLs, with the assumption that the end-user has no access to the codebase or does not know how to write his own URLConf.

Though it would be possible to write a function that converts a user-readable-editable file (XML for example) to valid Django urls, it feels that using database records allows a more dynamic generation of aliases (other objects defining their own aliases).

Sorry to necro-post, but I just found this thread while searching for answers. My solution seems simpler. Maybe a) I'm depending on newer django features or b) I'm missing a pitfall.

I encountered this because there is a bot named "Mediapartners-Google" which is asking for pages with url parameters still encoded as from a naive scrape (or double-encoded depending on how you look at it.) i.e. I have 404s in my log from it that look like:

1.2.3.4 - - [12/Nov/2012:21:23:11 -0800] "GET /article/my-slug-name%3Fpage%3D2 HTTP/1.1" 1209 404 "-" "Mediapartners-Google

Normally I'd just ignore a broken bot, but this one I want to appease because it ought to better target our ads (It's google adsense's bot) resulting in better revenue - if it can see our content. Rumor is it doesn't follow redirects so I wanted to find a solution similar to the original Q. I do not want regular clients accessing pages by these broken urls, so I detect the user-agent. Other applications probably won't do that.

I agree a redirect would normally be the right answer.

My (complete?) solution:

from django.http import QueryDict
from django.core.urlresolvers import NoReverseMatch, resolve

class MediapartnersPatch(object):
    def process_request(self, request):
        # short-circuit asap
        if request.META['HTTP_USER_AGENT'] != 'Mediapartners-Google':
            return None

        idx = request.path.find('?')
        if idx == -1:
            return None

        oldpath = request.path
        newpath = oldpath[0:idx]
        try:
            url = resolve(newpath)
        except NoReverseMatch:
            return None

        request.path = newpath
        request.GET = QueryDict(oldpath[idx+1:])
        response = url.func(request, *url.args, **url.kwargs)
        response['Link'] = '<%s>; rel="canonical"' % (oldpath,)
        return response
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top