Question

We have a proxy that takes messages from a JMS queue and sends them to an FTP folder. We have now discovered that sending to the FTP server is very slow when the target directory already contains a lot of files (with around 2000 files in the directory, sending already takes several seconds).

Here is the code of our proxy (it gets plain-text messages from a JMS queue and writes them to FTP):

<?xml version="1.0" encoding="UTF-8"?>
<proxy xmlns="http://ws.apache.org/ns/synapse" name="myProxy" statistics="disable" trace="disable" transports="jms">
<parameter name="transport.jms.Destination">myQueue</parameter>
<parameter name="transport.jms.ConnectionFactory">myQueueConnectionFactory</parameter>
<parameter name="transport.jms.DestinationType">queue</parameter>
<parameter name="transport.jms.ContentType">
    <rules>
        <jmsProperty>contentType</jmsProperty>
        <default>text/plain</default>
    </rules>
</parameter>
<target faultSequence="rollbackSequence">
    <inSequence>
        <log level="custom">
            <property name="STATUS" value="myProxy called"/>
        </log>
        <property name="ClientApiNonBlocking" scope="axis2" action="remove"/>
        <property name="OUT_ONLY" value="true"/>
        <property name="transport.vfs.ReplyFileName" expression="fn:concat(get-property('SYSTEM_DATE','yyyyMMddHHmmss_SSS'), '_result.txt')" scope="transport"/>

        <send>
            <endpoint key="myFTPendpoint"/>
        </send>
    </inSequence>
</target>
</proxy>

And the FTP endpoint looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<endpoint xmlns="http://ws.apache.org/ns/synapse" name="myFTPendpoint">
    <address uri="vfs:ftp://USER:PASSWORD@SERVER.com/path/toSomewhere?vfs.passive=true"/>
</endpoint>

My analysis for now:

  1. It is only slow when using FTP with VFS. When using the local file system - it is fast.
  2. The files are tiny - so it's not the upload time
  3. The network is fast
  4. !Speed depends on the number of files already in the directory on the FTP!

Possible solutions?

  • Fix the speed problem itself. Can the directory listing be disabled?
  • Workaround: create new folders for the output (so that no single folder gets too full)

Has anyone else run into the same issue? And how can FTP writes to large directories be sped up? Thanks for any help.

Solution

I believe that, regardless of whether you request an explicit directory listing, there will always be an implicit directory listing to determine whether the write will overwrite an existing file or create a new one.

This leaves you with the other workaround.

You should create new folders at the output. Implement a hashing scheme for the folder names so that no single folder fills up too much. For example, instead of file1234.ext consider file/1/2/3/4.ext.
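
As a rough illustration of such a scheme, here is a minimal sketch in Python. It is not part of the proxy configuration above: the function name `sharded_path` and the choice of SHA-256 with two levels of two hex characters are assumptions made for the example. Instead of splitting the file name itself, it derives the folder names from a hash of the name, which spreads files evenly even when the names share a common prefix such as a timestamp.

```python
import hashlib
import posixpath


def sharded_path(filename: str, levels: int = 2, width: int = 2) -> str:
    """Return a relative path that spreads files over hashed subfolders.

    Hypothetical helper: each folder level is named after `width` hex
    characters of the SHA-256 digest of the file name, so one level with
    width 2 has at most 256 subfolders.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    # FTP paths use forward slashes, so join with posixpath, not os.path.
    return posixpath.join(*parts, filename)
```

The mapping is deterministic, so whatever reads the files later can recompute the folder from the file name alone. With the defaults, 2000 files end up spread over up to 65536 leaf folders; a single level is probably enough at that volume.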

OTHER TIPS

Generally, if you have performance issues you should benchmark.

Try performing the same action from a command line FTP client and see where the slow point is. Running each of the commands one by one will allow you to see which exact step(s) perform differently when putting to a folder with many files vs an empty folder.
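
If you would rather script that comparison than type the commands by hand, something like the following sketch may help. It uses Python's standard `ftplib`; the function name `time_upload_steps` and the choice of steps (change directory, list, store) are my assumptions about what is worth measuring, not something taken from the VFS transport.

```python
import io
import time
from ftplib import FTP


def time_upload_steps(ftp: FTP, directory: str, filename: str,
                      payload: bytes = b"test") -> dict:
    """Time each FTP step separately and return seconds per step."""
    timings = {}

    start = time.perf_counter()
    ftp.cwd(directory)
    timings["cwd"] = time.perf_counter() - start

    start = time.perf_counter()
    # Name listing of the current directory; some servers answer with an
    # error instead of an empty list when the directory has no files.
    ftp.nlst()
    timings["list"] = time.perf_counter() - start

    start = time.perf_counter()
    ftp.storbinary("STOR " + filename, io.BytesIO(payload))
    timings["store"] = time.perf_counter() - start

    return timings
```

Call it with a logged-in `ftplib.FTP` connection (passive mode via `ftp.set_pasv(True)`, to match the endpoint in the question), once against the full directory and once against an empty one, and compare the numbers. If only the "list" step grows with the number of files, that points at the listing rather than the upload itself.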

You should also consider that the performance issue may not be with FTP at all. Just because that's the channel where you see the issue doesn't mean (purely as an example) that the OS itself isn't slow when handling large folders (as NT used to be). FTP is where the symptom shows up; that doesn't mean it is the cause.

To test this, I would log in to the server directly and open the folder that contains the files.

Finally, if none of those give you any clues, I'd probably try doing the same thing on a different end-point to see if there's a persistent problem.

You will always have issues on FTP with that number of files. This is a common problem and is not related to JMS. To confirm that, use an FTP client such as FileZilla and try to list the directory that contains the 2000 files.

Licensed under: CC-BY-SA with attribution