Question

Whenever I specify a resource using the Spring Data Hadoop namespace, my application throws an IOException when loading the specified file. The file definitely exists and is in a valid format.

Spring Data Hadoop XML config:

Stack trace on startup:

Caused by: java.lang.RuntimeException: java.io.IOException: Stream closed
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1231)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1103)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1037)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:415)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:860)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1380)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
    at com.mendeley.swets.config.HdfsConfig.fileSystem(HdfsConfig.java:28)
    at com.mendeley.swets.config.HdfsConfig$$EnhancerByCGLIB$$38b1feb7.CGLIB$fileSystem$0(<generated>)
    at com.mendeley.swets.config.HdfsConfig$$EnhancerByCGLIB$$38b1feb7$$FastClassByCGLIB$$3c3c119d.invoke(<generated>)
    at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
    at org.springframework.context.annotation.ConfigurationClassEnhancer$BeanMethodInterceptor.intercept(ConfigurationClassEnhancer.java:280)
    at com.mendeley.swets.config.HdfsConfig$$EnhancerByCGLIB$$38b1feb7.fileSystem(<generated>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:149)
    ... 41 more
Caused by: java.io.IOException: Stream closed
    at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:145)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:189)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(XMLEntityManager.java:2932)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:704)
    at com.sun.org.apache.xerces.internal.impl.XMLVersionDetector.determineDocVersion(XMLVersionDetector.java:186)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:772)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:235)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1162)
    ... 61 more
Solution

This has been fixed in trunk and will be available in the next milestone. See the Spring forum post [1] for more information.

[1] http://forum.springsource.org/showthread.php?123777-IOException-when-using-lt-hadoop-configuration-resources-quot

OTHER TIPS

Chris is actually right. I ran into a similar problem (IOException: Stream closed), and it is caused by reading from a stale stream. I am guessing, Deejay, that you are using something along these lines to read a custom resource from your classpath:

<hdp:configuration resources="classpath:/custom-site.xml"/>

and then obtaining a FileSystem with FileSystem.get(conf).

After spending some time with a debugger, it looks like the problem is caused by the interaction between Spring's ConfigurationFactoryBean and Apache Hadoop's Configuration objects. If you look at the source code for Spring Hadoop on GitHub, Spring Hadoop is essentially a combination of Spring settings with the Apache Hadoop API underneath.

An input stream is opened in Spring to parse the custom resource, and it is closed after the resource has been read. FileSystem.get subsequently causes the same resource to be reloaded from that stream, which is already closed, so the second read fails with IOException: Stream closed.
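
The failure mode described above can be reproduced with plain JDK classes, without Spring or Hadoop. This is only a minimal sketch of the mechanism: the class and method names are made up for illustration, and the in-memory XML stands in for custom-site.xml.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Spring Hadoop or Apache Hadoop.
public class StaleStreamDemo {

    // Reads the stream to the end and then closes it, as a parser typically does.
    static int readAllAndClose(InputStream in) throws IOException {
        try {
            int count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        } finally {
            in.close();
        }
    }

    // Loads the same stream object twice and returns the message of the
    // IOException raised by the second load, or null if no exception occurred.
    static String secondReadError() {
        byte[] xml = "<configuration/>".getBytes(StandardCharsets.UTF_8);
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(xml));
        try {
            readAllAndClose(in); // first load succeeds and closes the stream
            readAllAndClose(in); // second load reads from the closed stream
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("Second read failed with: " + secondReadError());
    }
}
```

The second call fails in BufferedInputStream with the same "Stream closed" message that appears at the bottom of the stack trace in the question.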

A workaround, similar to the examples on GitHub, is to use Spring properties and SpEL (Spring Expression Language) to substitute the configuration values you need into the necessary fields. The other option is probably to write your own ConfigurationFactoryBean that creates a new Configuration instance using the existing one as its parent and adds the resources as URLs.
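
Why adding resources as URLs should help can be sketched with plain JDK classes: a URL can be opened again on every load, so each load gets a fresh stream. The class and method names below are hypothetical and the temporary file stands in for custom-site.xml. In Hadoop, the corresponding calls would be the Configuration(Configuration) copy constructor and Configuration.addResource(URL), but this sketch does not use Hadoop itself and has not been checked against Spring Hadoop.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of Spring Hadoop or Apache Hadoop.
public class ReopenableResourceDemo {

    // Opens a fresh stream from the URL, counts its bytes, and closes the stream.
    static int load(URL resource) throws IOException {
        try (InputStream in = resource.openStream()) {
            int count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        }
    }

    // Loads the same URL twice; returns true if both loads read the same
    // non-empty content.
    static boolean loadsTwice() throws IOException {
        Path file = Files.createTempFile("custom-site", ".xml");
        try {
            Files.write(file, "<configuration/>".getBytes(StandardCharsets.UTF_8));
            URL url = file.toUri().toURL();
            int first = load(url);  // first load opens and closes its own stream
            int second = load(url); // second load opens a new stream
            return first > 0 && first == second;
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Both loads succeeded: " + loadsTwice());
    }
}
```

Unlike a shared InputStream, the URL carries no read position or open/closed state between loads, which is why a reload triggered later (for example by FileSystem.get) would not hit a closed stream.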

Hope this somewhat helps.

Licensed under: CC-BY-SA with attribution