DataStax Enterprise 4.0.2, Cassandra 2.0.6, and OpsCenter 4.1.2 show "Error creating cluster: Timeout while adding cluster."

StackOverflow https://stackoverflow.com/questions/23629933

  •  21-07-2023

Question

I have set up a 5-node cluster on Amazon EC2 across multiple regions. The OpsCenter instance is separate from the cluster nodes. When I try to add the existing cluster through the OpsCenter web UI, it shows "Error creating cluster: Timeout while adding cluster. Please check the log for details on the problem.." Then I checked opscenterd.log. It seems that OpsCenter can connect to the nodes, but there is a warning: "ProcessingError while calling CreateClusterConfController: Timeout while adding cluster. Please check the log for details on the problem."

Do you have any idea what causes this? I'm using DataStax Enterprise 4.0.2, Cassandra 2.0.6, and OpsCenter 4.1.2, and I'm creating the cluster on Ubuntu 12.04. I have checked the Cassandra system logs and the datastax-agent log, but there are no errors.

Is this a known issue, like the OpsCenter 4.1.0 issue "opscenterd breaking when updating definition files on platforms with Python 2.6", which was fixed in 4.1.1? http://www.datastax.com/documentation/opscenter/4.1/opsc/release_notes/opscReleaseNotes411.html

Please suggest.

All ports are open in the EC2 security groups (61620, 61621, etc.). I ran telnet from the OpsCenter machine to a node on port 61621 and from a node to the OpsCenter machine on port 61620; both connect. Below is opscenterd.log:

2014-05-14 05:53:46+0000 []  INFO: Starting factory <opscenterd.ThriftService.NoReconnectCassandraClientFactory instance at 0x4076c68>
2014-05-14 05:53:46+0000 []  INFO: Adding new cluster 'Connect2me': {u'jmx': {u'username': u'', u'password': u'', u'port': u'7199'}, 'kerberos_client_principals': {}, 'kerberos': {}, u'agents': {}, 'kerberos_hostnames': {}, 'kerberos_services': {}, u'cassandra': {u'username': u'', u'seed_hosts': u'54.214.1.100', u'api_port': u'9160', u'password': u''}}
2014-05-14 05:53:46+0000 []  INFO: Starting new cluster services for Connect2me
2014-05-14 05:53:46+0000 [Connect2me]  INFO: Starting services for cluster Connect2me
2014-05-14 05:53:46+0000 [Connect2me]  INFO: Loading event plugins
2014-05-14 05:53:46+0000 [Connect2me]  INFO: Loading event plugin conf /etc/opscenter/event-plugins/posturl.conf
2014-05-14 05:53:46+0000 [Connect2me]  INFO: Successfully loaded event plugin conf /etc/opscenter/event-plugins/posturl.conf
2014-05-14 05:53:46+0000 [Connect2me]  INFO: Loading event plugin conf /etc/opscenter/event-plugins/email.conf
2014-05-14 05:53:46+0000 [Connect2me]  INFO: Successfully loaded event plugin conf /etc/opscenter/event-plugins/email.conf
2014-05-14 05:53:46+0000 [Connect2me]  INFO: Done loading event plugins
2014-05-14 05:53:46+0000 []  INFO: Metric caching enabled with 50 points and 1000 metrics cached
2014-05-14 05:53:46+0000 []  INFO: Starting PushService
2014-05-14 05:53:46+0000 [Connect2me]  INFO: Starting CassandraCluster service
2014-05-14 05:53:46+0000 [Connect2me]  INFO: agent_config items: {'cassandra_log_location': '/var/log/cassandra/system.log', 'thrift_port': 9160, 'thrift_ssl_truststore': None, 'rollups300_ttl': 2419200, 'rollups86400_ttl': -1, 'jmx_port': 7199, 'metrics_ignored_solr_cores': '', 'api_port': '61621', 'metrics_enabled': 1, 'thrift_ssl_truststore_type': 'JKS', 'kerberos_use_ticket_cache': True, 'use_ssl': 1, 'kerberos_renew_tgt': True, 'rollups60_ttl': 604800, 'cassandra_install_location': '', 'rollups7200_ttl': 31536000, 'kerberos_debug': False, 'storage_keyspace': 'OpsCenter', 'ec2_metadata_api_host': '169.254.169.254', 'provisioning': 0, 'kerberos_use_keytab': True, 'metrics_ignored_column_families': '', 'thrift_ssl_truststore_password': None, 'metrics_ignored_keyspaces': 'system, system_traces, system_auth, dse_auth, OpsCenter'}
2014-05-14 05:53:46+0000 []  INFO: Stopping factory <opscenterd.ThriftService.NoReconnectCassandraClientFactory instance at 0x4076c68>
2014-05-14 05:53:47+0000 [Connect2me]  INFO: Enterprise functionality: True
2014-05-14 05:53:48+0000 [Connect2me]  INFO: Snitch: com.datastax.bdp.snitch.DseDelegateSnitch
2014-05-14 05:53:48+0000 [Connect2me]  INFO: Cluster Name: Connect2me
2014-05-14 05:53:48+0000 [Connect2me]  INFO: Partitioner: org.apache.cassandra.dht.RandomPartitioner
2014-05-14 05:53:50+0000 [Connect2me]  INFO: Recognizing new node 54.214.1.100 ('128010234515697016761586673489854425713')
2014-05-14 05:53:50+0000 [Connect2me]  INFO: Node 54.214.1.100 has multiple tokens (vnodes). Only one picked for display.
2014-05-14 05:53:50+0000 [Connect2me]  INFO: Recognizing new node 54.214.1.110 ('74547314523494862953006764525852718268')
2014-05-14 05:53:50+0000 [Connect2me]  INFO: Node 54.214.1.110 has multiple tokens (vnodes). Only one picked for display.
2014-05-14 05:53:50+0000 [Connect2me]  INFO: Recognizing new node 54.243.203.229 ('95676355653121167189122638977297238333')
2014-05-14 05:53:50+0000 [Connect2me]  INFO: Recognizing new node 54.214.1.78 ('165496574052081051366176941207447197429')
2014-05-14 05:53:50+0000 [Connect2me]  INFO: Recognizing new node 54.243.201.237 ('164812453774030768973707069212224107713')
2014-05-14 05:53:56+0000 [Connect2me]  INFO: Keyspaces: {'dse_security': CassandraKeyspace(name=dse_security, column_families=['tokens'], tables=[u'tokens'], attributes={'strategy_options': {'us-east': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}), 'solr_admin': CassandraKeyspace(name=solr_admin, column_families=[], tables=[u'solr_resources'], attributes={'strategy_options': {}, 'replica_placement_strategy': 'org.apache.cassandra.locator.EverywhereStrategy'}), 'mykeyspace1': CassandraKeyspace(name=mykeyspace1, column_families=[], tables=[u'mysolr1', u'videos', u'lyrics', u'song'], attributes={'strategy_options': {'us-west-2': '3', 'us-east': '2'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}), 'system': CassandraKeyspace(name=system, column_families=['IndexInfo', 'NodeIdInfo', 'schema_keyspaces', 'hints'], tables=[u'peers', u'range_xfers', u'schema_keyspaces', u'schema_columns', u'IndexInfo', u'schema_triggers', u'sstable_activity', u'peer_events', u'paxos', u'batchlog', u'NodeIdInfo', u'compaction_history', u'compactions_in_progress', u'schema_columnfamilies', u'local', u'hints'], attributes={'strategy_options': {}, 'replica_placement_strategy': 'org.apache.cassandra.locator.LocalStrategy'}), 'cfs_archive': CassandraKeyspace(name=cfs_archive, column_families=['rules', 'sblocks', 'cleanup', 'inode'], tables=[u'rules', u'sblocks', u'cleanup', u'inode'], attributes={'strategy_options': {'us-east': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}), 'OpsCenter': CassandraKeyspace(name=OpsCenter, column_families=['events_timeline', 'settings', 'rollups60', 'rollups86400', 'pdps', 'rollups7200', 'events', 'rollups300'], tables=[u'events_timeline', u'settings', u'rollups60', u'rollups86400', u'pdps', u'rollups7200', u'events', u'rollups300'], attributes={'strategy_options': {'replication_factor': '2'}, 'replica_placement_strategy': 
'org.apache.cassandra.locator.SimpleStrategy'}), 'system_traces': CassandraKeyspace(name=system_traces, column_families=[], tables=[u'events', u'sessions'], attributes={'strategy_options': {'replication_factor': '2'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.SimpleStrategy'}), 'HiveMetaStore': CassandraKeyspace(name=HiveMetaStore, column_families=['MetaStore'], tables=[u'MetaStore'], attributes={'strategy_options': {'replication_factor': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.SimpleStrategy'}), 'cfs': CassandraKeyspace(name=cfs, column_families=['rules', 'sblocks', 'cleanup', 'inode'], tables=[u'rules', u'sblocks', u'cleanup', u'inode'], attributes={'strategy_options': {'us-east': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}), 'dse_system': CassandraKeyspace(name=dse_system, column_families=[], tables=[u'job_trackers'], attributes={'strategy_options': {}, 'replica_placement_strategy': 'org.apache.cassandra.locator.EverywhereStrategy'})}
2014-05-14 05:54:06+0000 []  WARN: ProcessingError while calling CreateClusterConfController: Timeout while adding cluster. Please check the log for details on the problem.
2014-05-14 05:54:54+0000 [Connect2me]  INFO: Initializing event storage.
2014-05-14 05:54:54+0000 [Connect2me]  INFO: SSL agent communication is enabled. Automatic agent detection will be turned off.
2014-05-14 05:54:54+0000 [Connect2me]  INFO: Attempting to load all persisted alert rules
2014-05-14 05:54:55+0000 [Connect2me]  INFO: Done initializing event storage.
2014-05-14 05:54:55+0000 [Connect2me]  INFO: Done loading persisted scheduled job descriptions
2014-05-14 05:54:55+0000 [Connect2me]  INFO: Done loading persisted alert rules
2014-05-14 05:54:55+0000 [Connect2me]  INFO: OpsCenter starting up.

Here is datastax-agent/agent.log

INFO [qtp30763405-24] 2014-05-14 05:54:35,911 New JMX connection (127.0.0.1:7199)
INFO [qtp30763405-24] 2014-05-14 05:54:35,921 HTTP: :get /cluster/topology {:node_ip "54.214.1.100"} - 200
INFO [qtp30763405-21] 2014-05-14 05:54:35,934 New JMX connection (127.0.0.1:7199)
INFO [qtp30763405-21] 2014-05-14 05:54:35,945 HTTP: :get /cluster/topology {:node_ip "54.214.1.110"} - 200
INFO [qtp30763405-19] 2014-05-14 05:54:35,952 New JMX connection (127.0.0.1:7199)
INFO [qtp30763405-22] 2014-05-14 05:54:35,957 New JMX connection (127.0.0.1:7199)
INFO [qtp30763405-19] 2014-05-14 05:54:35,960 HTTP: :get /cluster/topology {:node_ip "54.243.203.229"} - 200
INFO [qtp30763405-22] 2014-05-14 05:54:35,972 HTTP: :get /cluster/topology {:node_ip "54.243.201.237"} - 200

I could not see any errors in either log, and I think the nodes are connecting, but I still get the timeout error:

2014-05-14 05:54:06+0000 []  WARN: ProcessingError while calling CreateClusterConfController: Timeout while adding cluster. Please check the log for details on the problem.

Can anyone help me with this?


Solution

Several things can cause this timeout; some are bugs in Cassandra, and some can be optimized on the OpsCenter side. You may be able to work around the issue by creating a cluster config file manually in /etc/opscenter/clusters/ and restarting opscenterd. For example, write the following to mycluster.conf:

[cassandra]
seed_hosts = 1.2.3.4, 2.3.4.5

It may still take about a minute for that cluster to work properly, but this bypasses the timeout check.
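
If you need to produce that file for several clusters, a small script can render it. The following is only a sketch: the function name render_cluster_conf is made up for illustration, and it writes nothing but the [cassandra] section and seed_hosts key shown above.

```python
import configparser
import io


def render_cluster_conf(seed_hosts):
    """Return the text of a minimal cluster config with only seed_hosts set.

    The function name is hypothetical; the section and key names come
    from the example above.
    """
    if not seed_hosts:
        raise ValueError("at least one seed host is required")
    parser = configparser.ConfigParser()
    parser["cassandra"] = {"seed_hosts": ", ".join(seed_hosts)}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder addresses; replace them with your own seed nodes.
    print(render_cluster_conf(["1.2.3.4", "2.3.4.5"]), end="")
```

Save the output as /etc/opscenter/clusters/mycluster.conf and restart opscenterd, as described above.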

OTHER TIPS

Since OpsCenter is separate from the cluster nodes, the most likely cause is a firewall (security group) issue.

Take a look at the ports listed here: http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/sec/secConfFirePort.html and make sure you can telnet from opscenterd to the cluster nodes and back again on the relevant ports.
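
If telnet is not installed, the same reachability check can be sketched in Python. This is a minimal example, not an official tool: port_is_open is a made-up helper, and the node addresses and port 61621 are the ones from the question. It only shows that a TCP connection can be opened, not that the agent behind the port is healthy.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Run this on the OpsCenter machine; the addresses are taken from the
    # question's log and should be replaced with your own nodes.
    for node in ["54.214.1.100", "54.214.1.110"]:
        print(node, 61621, port_is_open(node, 61621))
```

For the other direction, run the same check on a cluster node against the OpsCenter machine on port 61620.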

The 4.1.0 issue you mention would produce an error with a stack trace containing ERROR: Error upacking definitions file
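
To confirm whether opscenterd.log contains any such entry, you can filter it for WARN and ERROR lines. The sketch below assumes the line layout seen in the question's log excerpt (timestamp, bracketed cluster name, level, message); find_problems and the regular expression are illustrative, not part of OpsCenter.

```python
import re
import sys

# Layout assumed from the excerpt in the question, e.g.
# "2014-05-14 05:54:06+0000 []  WARN: ProcessingError while calling ..."
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[+-]\d{4}) "
    r"\[(?P<cluster>[^\]]*)\]\s+(?P<level>[A-Z]+): (?P<message>.*)$"
)


def find_problems(lines, levels=("WARN", "ERROR")):
    """Return (timestamp, level, message) for each line at one of the levels."""
    problems = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("level") in levels:
            problems.append(
                (match.group("timestamp"), match.group("level"), match.group("message"))
            )
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path to your opscenterd.log as the first argument.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for timestamp, level, message in find_problems(handle):
            print(timestamp, level, message)
```

Lines that do not match the layout, such as stack trace continuation lines, are skipped, so treat an empty result as a hint rather than proof.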

Licensed under: CC-BY-SA with attribution