Big data is a hot topic right now, and Talend is very popular in the open source ETL community. Until recently, however, it did not offer much support for big data. Talend has now released a new open source product called Talend Open Studio for Big Data. You can download it and try it in your own environment; it has built-in support for big data. I will walk through several test cases here so you can get an impression of what the product offers.
1. From DB to HDFS
To stand in for the database, I just use the row generator to simulate the rows in a DB, with some simple logic that creates a series of rows with 3 columns: ID, name and level.
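To picture what the row generator step produces, here is a minimal Python sketch. It is not what Talend itself generates; the function name generate_rows, the name pattern and the 1 to 5 range for level are my own assumptions for illustration.

```python
import random


def generate_rows(count, seed=0):
    """Yield (id, name, level) tuples, similar in spirit to a row generator.

    The name pattern and the 1-5 level range are assumptions, not Talend defaults.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same rows
    for i in range(1, count + 1):
        yield (i, f"name_{i}", rng.randint(1, 5))


rows = list(generate_rows(5))
for row in rows:
    print(row)
```

Each tuple plays the role of one database row with the three columns ID, name and level.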
Then drag and drop the HDFS output component onto the design surface and connect the main output of the row generator to it. For the HDFS component, just specify the name node address and the folder that will hold the file.
Then click to run the job. It will create a file that contains all the rows we generated, in CSV format.
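As a rough illustration of what "rows in CSV format" means, the sketch below serializes a few rows with Python's standard csv module. The sample values and the comma delimiter are assumptions; the separator actually used depends on how the output component is configured in your job.

```python
import csv
import io


def rows_to_csv(rows, delimiter=","):
    """Serialize row tuples to CSV text, one line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample rows with the three columns ID, name and level.
sample = [(1, "alice", 3), (2, "bob", 1)]
csv_text = rows_to_csv(sample)
print(csv_text)
```

The file the job writes should look much like this text, with one generated row per line.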
Remember to pick the version that matches your Hadoop environment. When the job is done, you can see the time taken to transfer the rows between the two systems.
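If you want to measure the transfer time yourself rather than read it off the job run, the idea can be sketched like this. The example writes to a local temporary file as a stand-in for HDFS, so the number it prints says nothing about a real cluster; timed_export and the row count are hypothetical.

```python
import csv
import os
import tempfile
import time


def timed_export(rows, path):
    """Write rows to a CSV file and return the elapsed time in seconds."""
    start = time.perf_counter()
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return time.perf_counter() - start


# 10,000 simulated rows; a local temp file stands in for the HDFS folder.
rows = [(i, f"name_{i}", i % 5 + 1) for i in range(1, 10001)]
target = os.path.join(tempfile.mkdtemp(), "rows.csv")
elapsed = timed_export(rows, target)
print(f"wrote {len(rows)} rows in {elapsed:.3f}s")
```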
Once the data has been exported from a traditional DB to HDFS, we can run a Hive query to get results. That's the next case to test.
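To give a feel for that next step, here is a small sketch that builds the kind of HiveQL you could run over the exported folder. The table name generated_rows, the folder /user/talend/out, the column types and the comma delimiter are all assumptions; adjust them to match your own job.

```python
def external_table_ddl(table, location, delimiter=","):
    """Build a HiveQL statement that maps an HDFS folder of delimited text files to a table."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        "  id INT,\n"
        "  name STRING,\n"
        "  level INT\n"
        ")\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


# Hypothetical table name and HDFS folder.
ddl = external_table_ddl("generated_rows", "/user/talend/out")
query = "SELECT level, COUNT(*) FROM generated_rows GROUP BY level"
print(ddl)
print(query)
```

The script only prints the statements; you would still run them in Hive yourself.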