Thursday, March 29, 2012

Cassandra OpsCenter, no data collected

You might run into this: the ring view looks fine, but no data is being collected from the agents.
[screenshot: OpsCenter dashboard with an intact ring but empty metric graphs]
If you check the log at /var/log/opscenterd.log, you will see:

2012-03-28 16:20:43-0700 [] Problem while calling MetricsController
        Traceback (most recent call last):
        Failure: telephus.cassandra.c08.ttypes.InvalidRequestException: InvalidRequestException(why='range finish must come after start in the order of traversal')

(the same entry repeats every second)


Basically, this means the clocks disagree between the cluster nodes and your client machine (here, the machine running the browser). OpsCenter queries metrics over a time range, so if the client clock is ahead of the cluster, the requested range can finish before it starts, and Cassandra rejects the query with the exception above. Syncing the clocks fixes it.
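
A quick way to verify and fix the drift, assuming root access and a reachable NTP server (the server name here is just an example):

    # Compare the current time on each node and on the client machine:
    date

    # One-shot sync against an NTP server; for a permanent fix,
    # run ntpd (or another NTP daemon) on every node.
    sudo ntpdate pool.ntp.org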

DataStax DSE 2.0, Hadoop/Hive quick tutorial - 2

Hive is a batch-oriented tool; once we have the analysis result, we need to put it into a store that supports real-time access. For Cassandra, that store is a Cassandra keyspace.
Given the previous result, let's create a keyspace called OLAPks and a column family called result. In the result column family, we use the city name as the row key, and the number is stored in the column result:count.
[screenshot: layout of the result column family]

Like the HBase-Hive handler, we can either create the keyspace and column family first and then add an external table in Hive, or ask Hive to create the underlying keyspace and column family for us.

Now let's take the second approach: in the Hive CLI, create an external table backed by Cassandra (see the sketch below).
[screenshot: hive CLI creating the external table]
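
A sketch of what that statement looks like in DSE 2.0; the handler class and property names are as I recall them from the DataStax Hive driver and may vary by version:

    CREATE EXTERNAL TABLE result (cityname string, count int)
      STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
      WITH SERDEPROPERTIES ("cassandra.columns.mapping" = ":key,count")
      TBLPROPERTIES ("cassandra.ks.name" = "OLAPks",
                     "cassandra.cf.name" = "result");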

Then from OpsCenter you can see the keyspace and column family are there,
[screenshot: OpsCenter schema view showing OLAPks.result]
or check via the cassandra-cli with show schema:
[screenshot: cassandra-cli show schema output]
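
For reference, the cli session is just (host is an example):

    $ cassandra-cli -h localhost
    [default@unknown] use OLAPks;
    [default@OLAPks] show schema;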

Now we can insert some data into the result column family via cassandra-cli,
[screenshot: cassandra-cli set commands]
then list the result:
[screenshot: cassandra-cli list output]
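
A sketch of that session, with sample values taken from the CSV; the assume statements just tell the cli to display keys, columns, and values as text:

    [default@OLAPks] assume result keys as utf8;
    [default@OLAPks] assume result comparator as utf8;
    [default@OLAPks] assume result validator as utf8;
    [default@OLAPks] set result['Dallas']['count'] = '47';
    [default@OLAPks] list result;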
You can also view the data through the OpsCenter data explorer:

[screenshot: OpsCenter data explorer showing the result rows]


Now let's query the data through Hive:
[screenshot: hive query over the external result table]
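
Since result is an external table, a plain query reads straight from the Cassandra column family, for example:

    hive> SELECT * FROM result;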

Now we can load the Hive analysis result that we produced last time into this external table (a sketch of the statement follows).
[screenshot: hive running the insert as a MapReduce job]
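
A sketch of the load, assuming the part-1 table is named states (name assumed) with the CSV columns (city, id, name, address, state):

    hive> INSERT OVERWRITE TABLE result
        > SELECT city, count(1) FROM states GROUP BY city;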
From the OpsCenter Hadoop jobs view, click "full details" to go to the familiar JobTracker admin UI.
[screenshot: JobTracker admin UI showing the job]

Once it's done, the result is there, in the Hive external table called result, which is mapped to the column family named result under OLAPks.
[screenshot: the populated result table]
You can query it from Hive,
[screenshot: hive SELECT over result]
or through the Cassandra API:
[screenshot: the same rows fetched via the Cassandra API]

So we have built an HBase-Hive-handler-style workflow on Cassandra: run batch analysis with Hive, export the result into a Cassandra column family, and the data is ready for real-time access by ad-hoc clients.

The next stop will be the Solr integration. Stay tuned.

Wednesday, March 28, 2012

DataStax DSE 2.0, Hadoop/Hive quick tutorial - 1

I just set up 8 VMs and loaded them with DSE 2.0. One new feature of DSE 2.0 is the Solr support: when I run nodetool now, you can see it creates a virtual data center called Solr, alongside the Analytics one for Hadoop.

[screenshot: nodetool ring output with the Solr and Analytics virtual data centers]

OpsCenter shows the visual layout of the 3 virtual data centers:
[screenshot: OpsCenter ring view with 3 virtual data centers]

Now let's test the first case: loading a CSV file into CassandraFS (which plays the role of HDFS).
I googled for a sample CSV file and picked this one: http://jasperreports.sourceforge.net/sample.reference/csvdatasource/
It has rows like "Dallas",47,"Janet Fuller","445 Upland Pl.","Trial", which map to {"city", "id", "name", "address", "state"}.

So paste this data into a local file called state.csv and copy it to our CassandraFS:
[screenshot: copying state.csv into CFS]
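
CFS speaks the HDFS API, so the standard hadoop fs commands work; in DSE they run through the dse wrapper (the paths here are just examples):

    dse hadoop fs -mkdir /data
    dse hadoop fs -put state.csv /data/state.csv
    dse hadoop fs -ls /data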

Now let's run a wordcount MapReduce job on the cluster, powered by Cassandra.

[screenshot: wordcount job submitted from the command line]
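
Something along these lines, using the stock Hadoop examples jar (its exact path depends on your DSE install):

    dse hadoop jar hadoop-examples.jar wordcount /data/state.csv /wordcount-out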

You can see the running job from OpsCenter.
[screenshot: OpsCenter Hadoop jobs view]
Once it's done, we can view the result:
[screenshot: wordcount output in CFS]
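
For example, by cat-ing the output file (the reducer file name may differ):

    dse hadoop fs -cat /wordcount-out/part-r-00000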

Now that the data is there, let's create a Hive table and run the first test:
Create a Hive table and load data from CassandraFS

When you enable the Analytics role, those nodes work like Hadoop nodes, so I can run this test on any node allocated to the Analytics data center zone.
Run dsetool to identify the current JobTracker node:
[screenshot: dsetool output showing the JobTracker]
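
With dsetool this is a one-liner that prints the address of the active JobTracker:

    dsetool jobtracker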

Then just run dse hive and you will be in the Hive CLI shell:
[screenshot: dse hive starting the CLI]


Then create a table and load the data from CassandraFS into Hive using the standard Hive syntax (sketched below), and the data is there:
[screenshot: hive CREATE TABLE and LOAD DATA]
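
A sketch with assumed table and column names; note the sample file quotes its strings, so a real run would want a CSV SerDe or a cleanup pass rather than this naive comma split:

    hive> CREATE TABLE states (city string, id int, name string, address string, state string)
        > ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
    hive> LOAD DATA INPATH '/data/state.csv' INTO TABLE states;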
Then we can write a query that groups by city,
[screenshot: hive GROUP BY query]
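
for example (against the table sketched above):

    hive> SELECT city, count(1) FROM states GROUP BY city;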
From OpsCenter you can see the Hadoop job submitted to the JobTracker,
[screenshot: OpsCenter showing the running Hive job]
and once it's done you get a result like this, which is no surprise at all. :)
[screenshot: city counts returned by the query]

That wraps up the first test case: load data into CassandraFS, create a Hive table, load the data from CassandraFS, and run a query.

Next, let's build another case: like the HBase-Hive handler, we create a Hive table that maps to a Cassandra column family, so we can save query results into that table, backed by Cassandra.

Thursday, March 22, 2012

Apache Solr and SolrNet quickstart, using the Netflix Dataset (Odata format)

Here is a quick tutorial on sending some movie data (Netflix OData format) to Solr for indexing, then doing some basic queries.

Sample dataset: http://odata.netflix.com/Catalog/
[screenshot: the Netflix OData catalog feed]

I picked 5 fields and put them into the Solr schema.xml:
[screenshot: the added field definitions in schema.xml]
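
A sketch of what those definitions might look like (the field names and types are assumptions, based on the example schema conventions):

    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="name" type="text_general" indexed="true" stored="true"/>
    <field name="synopsis" type="text_general" indexed="true" stored="true"/>
    <field name="rating" type="float" indexed="true" stored="true"/>
    <field name="releaseyear" type="int" indexed="true" stored="true"/>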

Then add SolrNet to your C# client and define a POCO class used for the mapping. The Solr* attributes let you bridge naming differences between the Solr schema and your C# object.

[screenshot: the POCO class with SolrNet mapping attributes]
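
A sketch of such a POCO (the class and property names are assumptions, matching the schema fields above):

    using SolrNet.Attributes;

    public class Movie
    {
        [SolrUniqueKey("id")]
        public string Id { get; set; }

        [SolrField("name")]
        public string Name { get; set; }

        [SolrField("synopsis")]
        public string Synopsis { get; set; }

        [SolrField("rating")]
        public float Rating { get; set; }

        [SolrField("releaseyear")]
        public int ReleaseYear { get; set; }
    }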

Then use the OData helper and LINQ entities to load the data (using .NET Framework 4.0):
[screenshot: the OData/LINQ download code]
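
A sketch of the download step; NetflixCatalog is the service context generated from the OData feed, and the entity property names (and their nullability) are assumptions:

    var catalog = new NetflixCatalog(new Uri("http://odata.netflix.com/Catalog/"));
    var movies = catalog.Titles.Take(500).AsEnumerable()
        .Select(t => new Movie
        {
            Id = t.Id,
            Name = t.Name,
            Synopsis = t.Synopsis,
            Rating = (float)(t.AverageRating ?? 0),
            ReleaseYear = t.ReleaseYear ?? 0
        })
        .ToList();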

Once the catalog is downloaded, several parallel tasks are created to post the data to Solr for indexing. When they finish, you can see the data is there:

[screenshot: the indexed documents in the Solr admin UI]
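
A minimal sketch of the parallel indexing with SolrNet (the Solr URL and batch size are arbitrary):

    using System.Collections.Generic;
    using System.Linq;
    using System.Threading.Tasks;
    using Microsoft.Practices.ServiceLocation;
    using SolrNet;

    // Inside your indexing routine:
    Startup.Init<Movie>("http://localhost:8983/solr");
    var solr = ServiceLocator.Current.GetInstance<ISolrOperations<Movie>>();

    var tasks = new List<Task>();
    for (int i = 0; i < movies.Count; i += 100)
    {
        int start = i;  // capture the loop variable for the closure
        tasks.Add(Task.Factory.StartNew(() =>
            solr.Add(movies.Skip(start).Take(100))));
    }
    Task.WaitAll(tasks.ToArray());
    solr.Commit();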

Notes:
SolrNet uses Microsoft.Practices.ServiceLocation.dll as a dependency to manage object creation.
You can search NuGet for "OData" to download the OData helper.

Tuesday, March 20, 2012

Jquery error, Uncaught ReferenceError: class2type is not defined

This is an entry-level error you might hit when you first add jQuery to your site. In Chrome, it looks like this in the Console tab of the developer tools:
[screenshot: Chrome console showing "Uncaught ReferenceError: class2type is not defined"]

The error means class2type is not defined at the moment an anonymous function inside the jQuery UI library is invoked.

[screenshot: the jQuery UI source line where class2type is referenced]

So why? class2type is an internal helper defined by jQuery core, so it should be a standard jQuery function. The answer is short: when that anonymous function is called, the class2type function does not exist yet. Here is my script include order:
[screenshot: the page's script tags, with jQuery UI on line 2 and jQuery core on line 3]

When script 2 (jQuery UI) was loaded, its dependency, script 3 (jQuery core), was not loaded yet. Dragging line 3 above line 2 fixes the issue.

[screenshot: the corrected script order]
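
In other words (the file names and versions here are just examples):

    <!-- load jQuery core first, then any plugin that depends on it -->
    <script src="js/jquery-1.7.1.min.js"></script>
    <script src="js/jquery-ui-1.8.18.min.js"></script>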

Hope this helps. :)

 