Wednesday, March 28, 2012

DataStax DSE 2.0, Hadoop/Hive quick tutorial - 1

I just set up 8 VMs and installed DSE 2.0 on them. One new feature of DSE 2.0 is Solr support. Now, when I run nodetool, you can see that it creates a virtual data center called Solr alongside the Analytics one for Hadoop.

image

OpsCenter shows the layout visually: 3 virtual datacenters.
image

Now let’s test the first case: loading a CSV file into CassandraFS (which works like HDFS).
I googled for a sample CSV file and picked this one: http://jasperreports.sourceforge.net/sample.reference/csvdatasource/
It has rows like "Dallas",47,"Janet Fuller","445 Upland Pl.","Trial", which map to the columns {"city", "id", "name", "address", "state"}.
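
To make the row-to-column mapping concrete, here is a minimal Python sketch that parses the sample row with the standard csv module. The helper name parse_rows is my own for illustration and is not part of DSE or Hive; note that csv.reader returns every field as a string, so the id comes back as "47".

```python
import csv
import io

# Column names as listed in the post.
COLUMNS = ["city", "id", "name", "address", "state"]

# The sample row quoted in the post.
SAMPLE = '"Dallas",47,"Janet Fuller","445 Upland Pl.","Trial"\n'

def parse_rows(text):
    """Parse CSV text into a list of dicts keyed by COLUMNS (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text))
    # Skip empty lines; every field is returned as a string.
    return [dict(zip(COLUMNS, row)) for row in reader if row]

rows = parse_rows(SAMPLE)
print(rows[0])
```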

So paste this data into a local file called state.csv, and copy it to CassandraFS.
image

Now let’s run a wordcount MapReduce job on the cluster, powered by Cassandra.

image

You can see the running job in OpsCenter:
image
Once it is done, we can view the result:
image
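
For readers who have not seen a word count job before, the sketch below mimics the map and reduce steps locally in plain Python. It is only an illustration of what such a job computes, not the Hadoop job that ran on the cluster. I am assuming tokens are split on whitespace, so a CSV row breaks into odd-looking "words"; the function names are my own.

```python
from collections import defaultdict

def map_phase(line):
    """Map step: emit a (word, 1) pair for each whitespace-separated token."""
    return [(word, 1) for word in line.split()]

def reduce_phase(pairs):
    """Reduce step: sum the counts emitted for each distinct word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

def word_count(lines):
    """Run the map step over every line, then reduce all emitted pairs."""
    pairs = []
    for line in lines:
        pairs.extend(map_phase(line))
    return reduce_phase(pairs)

# The sample row quoted in the post, treated as one line of input.
lines = ['"Dallas",47,"Janet Fuller","445 Upland Pl.","Trial"']
counts = word_count(lines)
for word, n in sorted(counts.items()):
    print(word, n)
```

Because the split is on whitespace only, commas and quotes stay attached to the tokens, which is worth keeping in mind when reading the job output for a CSV file.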

Now that the data is there, we can create a Hive table and run the first test.
Create Hive Table and Load data from CassandraFS

When you enable the Analytics role, those nodes work like Hadoop nodes, so I can run this test on any node allocated to the Analytics datacenter.
Run dsetool to identify the current JobTracker (JT) node:
image

Then just run dse hive, and you will be in the Hive CLI shell:
image


Then load the data from CassandraFS into Hive using the standard Hive syntax, and the data is there:
image
Then we can write a query that groups by city:
image
In OpsCenter, you can see the Hadoop job submitted to the JT:
image
Once it is done, you will get a result like this, which is no surprise at all :)
image
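
As a rough picture of what a group-by-city query returns, here is a small Python sketch. I am assuming the aggregate is a row count per city, which the post does not spell out. Only the first row comes from the sample data; the other two rows are made up for illustration.

```python
from collections import Counter

# The first row is the sample quoted in the post; the other two are
# made-up rows for illustration only.
rows = [
    {"city": "Dallas", "id": 47, "name": "Janet Fuller"},
    {"city": "Dallas", "id": 1001, "name": "Hypothetical Person A"},
    {"city": "Example City", "id": 1002, "name": "Hypothetical Person B"},
]

def count_by_city(records):
    """Count rows per city, like a GROUP BY city with a row-count aggregate."""
    return Counter(record["city"] for record in records)

city_counts = count_by_city(rows)
for city, n in sorted(city_counts.items()):
    print(city, n)
```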

That wraps up the first test case: load data into CassandraFS, create a Hive table, then load the data from CassandraFS and run a query.

Next, let’s create another case. Like the HBase-Hive handler, we create a Hive table that maps to a Cassandra ColumnFamily. Then we can save the query result to this Hive table, which is backed by Cassandra data.
