Once the data is loaded into HDFS, let's create a Hive external table using the Hive shell.
First, make sure the data is there:
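For example, you can check straight from the Hive shell; the path and file name below are placeholders, so substitute wherever you actually loaded the data:

-- hypothetical path and file name; use your own
dfs -ls /user/demo/customers;
dfs -tail /user/demo/customers/customers.csv;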
Then, from the Hive shell, create an external table named customers:
CREATE EXTERNAL TABLE customers(id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','  -- assumes a comma-delimited file
LOCATION '/user/demo/customers';               -- placeholder path; point it at your data
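Once the table exists, a quick sanity check from the Hive shell confirms it can read the data:

SELECT * FROM customers LIMIT 5;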
Now let's create a job to read the data from Hive: pick the Hive version, set the Thrift server IP/port, and enter a Hive query.
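The query itself can be as simple as the following (it assumes the customers table created above):

SELECT id, name FROM customers;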
Once that step is added, click Edit schema and add a single column with its type set to Object; we will parse the result and map it to our schema afterwards.
On the Advanced settings tab, enable the parse query results option, using the Object-type column we just created.
Drag the parserecordset component onto the design surface, connect the Main output of hiverow to it, and click Edit schema to do the mapping.
Then match the values to the output schema columns:
The job now looks like the following:
If we need to print the rows, just drop in a logrow component and connect it. The final layout:
Now run the job. The console tells you whether it connected to the Hive server successfully, and if it failed, why.
On the Hive server side, you can see that it receives the query and compiles it into MapReduce on the fly.
Once the job finishes, you can see the results in the run console.
With that working, we can export the results to HBase to enable real-time application queries; check it here.
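For a rough idea of what that export can look like in HiveQL, here is a sketch using Hive's HBase storage handler; the target table name customers_hbase, the column family cf, and the column mapping are assumptions for illustration:

-- Hive table backed by an HBase table (names and mapping are assumptions)
CREATE TABLE customers_hbase(id INT, name STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:name')
TBLPROPERTIES ('hbase.table.name' = 'customers');

-- copy the query results into HBase, where applications can do real-time lookups by key
INSERT OVERWRITE TABLE customers_hbase
SELECT id, name FROM customers;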