Question

I'm writing a program that works like a web crawler for online forums. For each forum I crawl, I need to do the same thing:

  1. login
  2. find the boards
  3. find the posts
  4. find the permalink to the post
  5. find username of the person who made the post
  6. etc.

Now while this same logic needs to occur for each forum, the implementation for each forum is different. For example, the inputs of the login form differ per forum: one forum may have a field named "username", another a field named "user". Some of these steps may have default implementations. For example, the default implementation of login is to do nothing (because some forums can be crawled without logging in).

What I've done is create a function named crawl-forum that contains all these steps, but the implementations are abstract and defined elsewhere. My question is: what's the best way to get crawl-forum to use these implementations?

What I've tried

1) Config map

Here's what I've tried so far. I added a new argument to the crawl-forum function called configs. It's a map data structure that looks like this:

{ :login login-function
  :find-boards find-boards-function
  ...
}

The code that calls crawl-forum is responsible for populating that map. What I don't like about this is that configs has to be passed throughout the crawl-forum code, which adds a new parameter everywhere. Also, I've got some lame, ad-hoc code for handling default implementations.

2) Multimethods

I talked about this on IRC and someone gave me the idea that I should use multimethods instead, since it's really polymorphic behavior. They look like this:

(defn get-site-key [& args] (first args))
(defmulti login get-site-key)
(defmethod login :default [site-key cookie-store] nil)

Then the client code has to define its own multimethods on the outside:

(defmethod login :forum-1 [site-key cookie-store] (do-something cookie-store))

What I don't like about this is that, just like the config, I have to pass the site-key into the crawl-forum function, and that site-key still has to be passed around everywhere inside. Also, every defmethod receives its own site-key back as a parameter, but none of them will ever use it. It's simply a necessary argument to dispatch on. It's really hard for me to find a thorough multimethod tutorial though, so if there's a smarter way to do this, let me know.

Is there a 3rd option that's even better? Is there a better way to be using multimethods? Let me know, thanks.

Solution

I would go with option 1. If passing the map around bothers you, you can always use a dynamic var. For the defaults, I would suggest using merge:

(def defaults { ... })
(def site-specific (merge defaults { ...}))
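
To make the two suggestions above concrete, here is a minimal sketch that combines merge for the defaults with a dynamic var so the map doesn't have to be threaded through every call. The step functions, the *site* var, and the :user value are all made up for illustration; they are not from the question.

```clojure
;; Hypothetical default steps; names and return values are illustrative only.
(def defaults
  {:login       (fn [cookie-store] nil)   ; default: no login needed
   :find-boards (fn [page] [])})          ; default: no boards found

;; A site overrides only the steps that differ from the defaults.
(def forum-1
  (merge defaults
         {:login (fn [cookie-store]
                   (assoc cookie-store :user "forum-1-user"))}))

;; Dynamic var holding the config of the crawl currently in progress.
(def ^:dynamic *site* defaults)

;; Steps read the current config instead of taking it as a parameter.
(defn login [cookie-store]
  ((:login *site*) cookie-store))

(defn crawl-forum [site cookie-store]
  (binding [*site* site]
    (login cookie-store)))
```

One caveat with this approach: dynamic bindings are thread-local and only active inside the binding form, so a lazy sequence that is realized after crawl-forum returns will see the root value of *site* rather than the bound one.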

Other tips

You can use protocols. Like multimethods, they support polymorphic behavior, but they dispatch only on the type of the first argument, which makes them faster and less flexible. In addition, when you define a protocol, Clojure generates a corresponding Java interface.
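
As a rough sketch of how that might look for the crawler (the Forum protocol, the two record types, and the "me" value are hypothetical names chosen for illustration), default implementations can be shared through a plain map passed to extend:

```clojure
;; Hypothetical protocol covering two of the crawl steps.
(defprotocol Forum
  (login [forum cookie-store])
  (find-boards [forum page]))

;; Shared defaults as a map of keyword -> function, the shape `extend` expects.
(def default-forum-impl
  {:login       (fn [forum cookie-store] nil)   ; default: no login needed
   :find-boards (fn [forum page] [])})

(defrecord OpenForum [])          ; a forum that needs no login
(defrecord Forum1 [user-field])   ; a forum with its own login field name

;; OpenForum takes every default as-is.
(extend OpenForum Forum default-forum-impl)

;; Forum1 overrides only login and keeps the remaining defaults.
(extend Forum1 Forum
  (merge default-forum-impl
         {:login (fn [forum cookie-store]
                   (assoc cookie-store (:user-field forum) "me"))}))
```

With this shape, crawl-forum would receive a single forum value such as (->Forum1 "user") and call (login forum cookie-store) on it, so the dispatch argument also carries the site-specific data instead of being an unused key.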

However, I am partial to a simple implementation approach. Maintain a map from forum URLs to maps of your functions, e.g.

(def config 
  {"http://forum1.com" {:login login-function1
                          :find-boards find-boards-function1 ... }
   "http://forum2.com" {:login login-function2 
                          :find-boards find-boards-function2 ... }
    ;; etc
   "http://forumN.com" {:login login-functionN
                          :find-boards find-boards-functionN ... }})

and elsewhere

(crawl-forum (get config forum-url))
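
For illustration only, one way crawl-forum could consume such a map is to destructure it and supply defaults with :or. This is a sketch under assumptions: the step signatures, the default behaviors, and the returned map are invented here, and the names deliberately redefine crawl-forum and config in a self-contained form.

```clojure
;; Sketch: missing steps fall back to the :or defaults.
(defn crawl-forum
  [{:keys [login find-boards]
    :or   {login       (fn [cookie-store] cookie-store) ; assumed default: no-op login
           find-boards (fn [session] [])}}]             ; assumed default: no boards
  (let [session (login {})]
    {:session session
     :boards  (find-boards session)}))

;; Hypothetical config with a single forum that only customizes find-boards.
(def config
  {"http://forum1.com" {:find-boards (fn [session] ["general" "off-topic"])}})

(crawl-forum (get config "http://forum1.com"))
```

Because destructuring nil behaves like destructuring an empty map, a URL that is missing from config falls back to every default rather than throwing.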

I don't think there is anything wrong with this simple approach in a small application. Interfaces and polymorphism are for large projects with team members separated by distance and time.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow