Question

This might be a dumb question, since I have not yet fully understood how Cognos BI works. I also tried posting this on Stack Exchange, in vain (an error with the tags).

My question is: can Apache Hadoop be used to make Cognos BI work faster? Or does Cognos already do the same thing Hadoop does (MapReduce functionality)?

The place where I have started working uses the Cognos BI suite on top of Sybase IQ (the content store), with Apache Tomcat as the web server. Sometimes Cognos takes a very long time (and almost dies) generating reports when the data set is large.

So can Apache Hadoop help Cognos perform better by fitting somewhere between Cognos and Sybase? Or is report optimization the only way out in this case?

Thanks Guys.


Solution

Hadoop as a platform is not aimed at ad-hoc queries or analytic reports.
Cognos is an IBM product, and it can only query IBM's own Hadoop distribution, InfoSphere BigInsights.
Over BigInsights, Cognos issues queries through Hive, which are eventually translated into MapReduce jobs.

You say you are using Sybase IQ (this is not the content store; it is the reporting database your queries run against).
Although I don't know much about Sybase IQ, I work heavily with Vertica, which is also a columnar DB.
To get good performance, you have to tune everything possible:

  • Cognos Framework model
  • Cognos reports
  • Sybase DB tuning and structure. Hadoop can certainly help by preparing data at the correct level of granularity and by precalculating any calculations you need.
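That last point can be done with a Hadoop Streaming job. Below is a minimal local sketch of the mapper/reducer logic; the input layout (`date,store_id,sku,amount` CSV) and the group-by key are invented for illustration, not taken from the poster's schema. In a real cluster, the mapper and reducer would run as separate processes with Hadoop sorting the intermediate keys between them.

```python
import sys
from itertools import groupby

# Mapper: emit "key\tvalue" lines for each input record.
# Assumed (hypothetical) input layout: date,store_id,sku,amount
def map_line(line):
    date, store, sku, amount = line.strip().split(",")
    return f"{date},{store}\t{amount}"

# Reducer: sum the amounts for each (date, store) key.
# Requires its input to be sorted by key, as Hadoop guarantees.
def reduce_lines(sorted_lines):
    for key, group in groupby(sorted_lines, key=lambda l: l.split("\t")[0]):
        total = sum(float(l.split("\t")[1]) for l in group)
        yield f"{key},{total}"

if __name__ == "__main__":
    # Combined locally for testing; Hadoop Streaming runs these separately.
    mapped = sorted(map_line(l) for l in sys.stdin if l.strip())
    for row in reduce_lines(mapped):
        print(row)
```

The resulting day/store-level rows are far fewer than the raw transactions, so the reporting database only has to scan pre-aggregated data.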

OTHER TIPS

Simply put, Hadoop is a distributed platform for manipulating large data sets. It has fault tolerance built in, which makes it appealing to organizations where downtime can impact business processes. Cognos is a business intelligence tool that allows users to explore and report on data. So there appears to be a logical fit.

Hadoop, however, does not (yet) lend itself to ad-hoc querying, as the other poster has commented. There is a Hadoop project that promises just that: Hive. Developers have released ODBC connectors for Hive (which presents a data-warehouse view of your Hadoop data and can be queried with an SQL-like language called HiveQL). Since Cognos can extract data from an ODBC data source, it stands to reason that Cognos can extract data from Hadoop through Hive.
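To give a feel for what goes over that ODBC connection, here is a small sketch that builds the kind of HiveQL a report might issue. The table and column names are made up for illustration; Hive compiles a statement like this into MapReduce jobs behind the scenes.

```python
# Hypothetical HiveQL builder: the table and columns are invented examples,
# not part of Cognos or Hive itself.
def daily_sales_query(table="sales_raw"):
    # HiveQL is deliberately SQL-like, so a BI tool speaking ODBC/SQL
    # can generate it without knowing anything about MapReduce.
    return (
        f"SELECT sale_date, store_id, SUM(amount) AS total_amount "
        f"FROM {table} "
        f"GROUP BY sale_date, store_id"
    )

if __name__ == "__main__":
    print(daily_sales_query())
```

Keep in mind that each such query becomes a batch MapReduce job, so latency is measured in minutes rather than seconds.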

The other approach to using Hadoop in your Cognos environment is to transfer data using text files such as CSV. Hadoop can generate a data file that can then be imported into Cognos. This is the approach I currently use.
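A minimal sketch of that hand-off step, assuming the reducer output lines from a job look like `date,store,total` (an invented layout): add a header row and write a CSV that Cognos can pick up as a data source.

```python
import csv
import io

# Convert raw "date,store,total" result lines (hypothetical layout) into
# CSV text with a header row, ready to be imported into Cognos.
def rows_to_csv(rows, header=("sale_date", "store_id", "total_amount")):
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    for row in rows:
        writer.writerow(row.split(","))
    return buf.getvalue()

if __name__ == "__main__":
    print(rows_to_csv(["2013-01-01,s1,15.0", "2013-01-02,s1,3.0"]))
```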

Yet I have not answered the "why" of using Hadoop. The two applications I have used Hadoop for are inventory forecasting and cash flow/budgeting. If you are trying to perform routine forecasts for hundreds of thousands of SKUs, Hadoop is a wonderful tool. If you are trying to perform a Monte Carlo simulation over a thousand budget items, Hadoop is wonderful. Just import data from your data warehouse, run your Hadoop jobs, and import the resulting CSV files into Cognos. Voila!
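As a toy version of the budgeting case: give each budget item a triangular cost distribution and simulate the total many times. The (low, likely, high) figures below are invented; at the scale of thousands of items times many trials, this is exactly the kind of job you would farm out to Hadoop and re-import into Cognos as CSV.

```python
import random

# Toy Monte Carlo budget simulation. Each item is a (low, likely, high)
# triple (made-up numbers) feeding a triangular distribution; we estimate
# the distribution of the total budget across all items.
def simulate_totals(items, trials=10_000, seed=42):
    rng = random.Random(seed)  # fixed seed for reproducibility
    totals = []
    for _ in range(trials):
        # random.triangular takes (low, high, mode)
        totals.append(sum(rng.triangular(lo, hi, mode)
                          for lo, mode, hi in items))
    return totals

if __name__ == "__main__":
    items = [(90, 100, 130), (40, 50, 75)]  # (low, likely, high) per item
    totals = simulate_totals(items)
    print(sum(totals) / len(totals))  # mean simulated total budget
```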

Take care, though: Hadoop is not a panacea. Sometimes old-fashioned SQL and your programming language of choice are just as good, or better. Hadoop comes with a learning curve and resource demands. I learned by downloading the Hortonworks Sandbox; it is a preconfigured virtual machine that runs in VMware, VirtualBox, etc., so you do not have to install or configure anything!

Licensed under: CC-BY-SA with attribution