Question

I am trying to follow a tutorial on big data that reads data from a keyspace defined with cqlsh.

I have run this piece of code successfully:

require 'rubygems'
require 'cassandra'

db = Cassandra.new('big_data', '127.0.0.1:9160')

# get a specific user's tags
row = db.get(:user_tags,"paul")

###
def tag_counts_from_row(row)
  tags = {}

  row.each_pair do |pair|
    column, tag_count = pair
    #tag_name = column.parts.first
    tag_name = column
    tags[tag_name] = tag_count
  end

  tags
end
###
# insert a new user
db.add(:user_tags, "todd", 3, "postgres")
db.add(:user_tags, "lili", 4, "win")


tags = tag_counts_from_row(row)
puts "paul - #{tags.inspect}"

But when I add this part to output everyone's tags, I get an error:

user_ids = []
db.get_range(:user_tags, :batch_size => 10000) do |id|
#  user_ids << id
end

rows_with_ids = db.multi_get(:user_tags, user_ids)
rows_with_ids.each do |row_with_id|
  name, row = row_with_id

  tags = tag_counts_from_row(row)
  puts "#{name} - #{tags.inspect}"
end

The error is:

line 33: warning: multiple values for a block parameter (2 for 1)

I think the error may have come from incompatible versions of Cassandra and Ruby. How can I fix it?

Solution

It's a little hard to tell which line is 33, but it looks like the problem is that get_range yields two values while your block only takes the first one. If you only care about the row keys and not the columns, you should use get_range_keys instead.
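
To see the mismatch without a running Cassandra node, here is a minimal sketch in plain Ruby. The each_row method and the sample data are hypothetical stand-ins for get_range and the user_tags column family; they are not part of the cassandra gem. On Ruby 1.8 a one-parameter block that is handed two values triggers the "multiple values for a block parameter (2 for 1)" warning, while newer Ruby versions silently keep only the first value.

```ruby
# Hypothetical stand-in for db.get_range: yields a row key and a
# columns hash for every row, i.e. two values per block call.
def each_row(rows)
  rows.each { |key, columns| yield key, columns }
end

# Made-up sample data shaped like the user_tags rows in the question.
rows = {
  "paul" => { "cassandra" => 2 },
  "todd" => { "postgres" => 3 }
}

# Block with one parameter: the columns value is not captured.
one_param = []
each_row(rows) { |id| one_param << id }

# Block with two parameters: both yielded values are captured.
two_params = []
each_row(rows) { |id, columns| two_params << [id, columns] }

two_params.each do |id, columns|
  puts "#{id} - #{columns.inspect}"
end
```

Declaring both block parameters is what removes the warning, and it also makes the columns available inside the block.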

It looks like you do in fact care about the column values, because you fetch them again using db.multi_get. That is an unnecessary additional query, since get_range already hands the columns to the block. You can update your code to something like:

db.get_range(:user_tags, :batch_size => 10000) do |id, columns|
  tags = tag_counts_from_row(columns)
  puts "#{id} - #{tags.inspect}"
end
Licensed under: CC-BY-SA with attribution