Thursday, September 23, 2010

How to download, compile, and run Tika on Windows: a tutorial

You might get scared when you first try to download and run Tika on Windows if, like me, you don't have much experience with SVN and Maven. Here is a quick tutorial that walks through the process.

1. Use a Subversion client to download the Tika source code.

There is a Windows shell extension for Subversion; download and install it on your Windows box. Then right-click a folder such as C:\temp\tika and click the SVN Checkout context menu item.

image

Enter the SVN source URL: http://svn.apache.org/repos/asf/tika/trunk

image

It may take a couple of seconds to download the source code. Click OK when done.
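If you prefer a script to the right-click menu, the same checkout can be sketched in Python. This is only a sketch: it assumes the svn command-line client is installed and on your PATH (the shell extension alone may not provide it), and the name checkout_command is my own, not part of any tool.

```python
# Source URL from the step above.
TIKA_TRUNK_URL = "http://svn.apache.org/repos/asf/tika/trunk"


def checkout_command(url, dest):
    """Build the argument list for `svn checkout <url> <dest>`."""
    return ["svn", "checkout", url, dest]


if __name__ == "__main__":
    # Only prints the command; nothing is downloaded here.
    print(" ".join(checkout_command(TIKA_TRUNK_URL, r"C:\temp\tika")))
```

To actually run the checkout, pass the list to subprocess.run(..., check=True) from the standard library.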

image

2. Download Maven, a build utility like MSBuild or Ant, and add the folder that contains mvn.bat to the Windows PATH.

After the path is set, you should be able to run “mvn” at the command prompt.
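If "mvn" is not recognized, the PATH is usually the culprit. Here is a small sketch that checks it from Python using the standard shutil.which lookup; the helper name find_tool is just something I made up for illustration.

```python
import shutil


def find_tool(name, path=None):
    """Return the full path of `name` if it is on the search path, else None.

    With path=None the PATH environment variable is searched.
    """
    return shutil.which(name, path=path)


if __name__ == "__main__":
    mvn = find_tool("mvn")
    if mvn:
        print("mvn found at " + mvn)
    else:
        print("mvn is not on PATH; add the folder that contains mvn.bat")
```

The same check works for "java" and "svn", which the other steps depend on.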

image

3. Go to the downloaded Tika source folder, c:\temp\tika, and run “mvn install”.

The build will download the necessary components and compile the project. This may take a while.

image

4. Run the Tika app.

Go to the folder that contains the built jar and run “java -jar tika-app-0.8-SNAPSHOT.jar -m a.txt”

to pull the metadata of a.txt,

image

or use -t yourpdf.pdf instead to extract the content of a PDF file.
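Finding the jar after the build can be the fiddly part, so here is a rough Python sketch that looks for it and builds the java command for you. It assumes the build puts the jar under tika-app\target inside the source folder, which is my understanding of the project layout rather than something shown above; the function names are hypothetical too.

```python
import glob
import os


def find_tika_app_jar(source_root):
    """Return the runnable tika-app jar under <source_root>/tika-app/target, or None.

    The tika-app/target location is an assumption about the build layout.
    """
    pattern = os.path.join(source_root, "tika-app", "target", "tika-app-*.jar")
    # Skip the sources/javadoc jars a build may also produce.
    jars = [j for j in sorted(glob.glob(pattern))
            if not j.endswith(("-sources.jar", "-javadoc.jar"))]
    return jars[0] if jars else None


def tika_command(jar, option, document):
    """Build `java -jar <jar> <option> <document>`; option is -m or -t."""
    return ["java", "-jar", jar, option, document]


if __name__ == "__main__":
    jar = find_tika_app_jar(r"C:\temp\tika")
    if jar:
        print(" ".join(tika_command(jar, "-m", "a.txt")))
    else:
        print("No tika-app jar found; did the build finish?")
```

Swap "-m" for "-t" to get the text-extraction command instead of the metadata one.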
