Tuesday, August 31, 2010

Nutch Crawling and Searching Test on Mac/Linux

Nutch is open-source web-search software built on Lucene Java. It can crawl websites (intranet or internet) and expose a search UI over the crawled and indexed content.
Within the Lucene ecosystem, Nutch is a great content-generation component. It can also work together with Solr to provide richer query functionality, such as faceting.
Let me run through a quick tutorial that indexes my blog and turns on the query interface by deploying the web app to Tomcat.

1. Download the Nutch bits from the Apache website. I picked apache-nutch-1.1-bin.tar.gz, downloaded it, and extracted it to a local folder such as ~/apache-nutch.

2. Make sure JAVA_HOME is defined. On a Mac, JAVA_HOME will be

/System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home/

You can set it by running export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home/

3. Create a directory named Output to hold the crawled and indexed content, and create a file such as test.url that contains the list of root URLs. I put http://androidyou.blogspot.com in the file.

4. Edit conf/crawl-urlfilter.txt so that only URLs beginning with http://androidyou.blogspot.com (in my example) are indexed.

# accept hosts in MY.DOMAIN.NAME
+^http://androidyou.blogspot.com
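To see how a rule like this behaves, here is a minimal sketch that applies the same sign-plus-regex convention with java.util.regex. The class and method names are hypothetical and this is not Nutch's own filter code. It assumes that the first matching rule decides and that a URL matching no rule is rejected, which is how the comments in crawl-urlfilter.txt describe the file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of a +/- regex URL filter; not Nutch's implementation.
final class UrlFilterSketch {
    // One rule line: '+' means accept, '-' means reject, followed by a regex.
    static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(String line) {
            this.accept = line.charAt(0) == '+';
            this.pattern = Pattern.compile(line.substring(1));
        }
    }

    // Assumption: the first matching rule wins; no match means the URL is rejected.
    static boolean accepts(List<Rule> rules, String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The dots are escaped here so they match only a literal '.'.
        List<Rule> rules = Arrays.asList(new Rule("+^http://androidyou\\.blogspot\\.com"));
        System.out.println(accepts(rules, "http://androidyou.blogspot.com/2010/08/"));
        System.out.println(accepts(rules, "http://example.com/"));
    }
}
```

Note that an unescaped dot in a regular expression matches any character, so the rule as written in step 4 is slightly more permissive than a literal host match.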

5. Edit conf/nutch-default.xml and assign a default user agent name.

<property>
<name>http.agent.name</name>
<value>AndroidyouTestAgent</value>
</property>

6. Start the crawling and indexing process by entering

bin/nutch crawl test.url -dir Output -depth 1 >& log.log
You can check log.log to see whether anything went wrong.

7. Verify that the index was created by running ls Output.
You should be able to see 5 folders:

crawldb index indexes linkdb segments

8. Download Luke, the Lucene index toolbox, to view the index and its documents: http://code.google.com/p/luke/
Run lukeall.jar and browse to the Output/index folder.

You will get an idea of how many documents were crawled and indexed, and what the top terms are.

Now the crawling and indexing process is done. Let's host the search function in Tomcat.

A. Download and install Tomcat, then start the service.
B. Copy nutch-1.1.war to the Tomcat webapps folder. (This deploys the app to Tomcat.)
C. Edit webapps/nutch-1.1/WEB-INF/classes/nutch-site.xml to tell the runtime where the index folder is.
Here is my file.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>searcher.dir</name>
<value>/Users/androidyou/Documents/apache-nutch/Output</value>
</property>
</configuration>

Then restart Tomcat.
Browse to http://localhost:8080/nutch-1.1/en

Search for androidyou and you get one result.

You can click cache or explain to see more details, such as the Lucene document fields.

Now you can push the indexed result to Solr.



Monday, August 30, 2010

Tomcat Architecture and Configuration

Tomcat has a lot of .xml configuration files, such as server.xml, context.xml, and web.xml, and each of them can be placed in different locations. Today I found a useful diagram that explains the tiered layout of the Tomcat architecture; it is provided by Marakana. I will add more details from a programmer's perspective.

Tomcat Listener: 

A Listener element defines a component that performs actions when specific events occur, usually Tomcat starting or Tomcat stopping.

Listeners may be nested inside a Server, Engine, Host or Context. Some Listeners are only intended to be nested inside specific elements. These constraints are noted in the documentation below.

Here are all the listener implementations. The LifecycleListener interface has only one method, lifecycleEvent.

public abstract interface LifecycleListener
{
  public abstract void lifecycleEvent(LifecycleEvent paramLifecycleEvent);
}

image

Services:
 

A "Service" is a collection of one or more "Connectors" that share a single "Container". Note: A "Service" is not itself a "Container", so you may not define subcomponents such as "Valves" at this level.

A Service element represents the combination of one or more Connector components that share a single Engine component for processing incoming requests. One or more Service elements may be nested inside a Server element
 

Connectors:

There are Connectors that let browsers connect directly to Tomcat and Connectors that do so through a web server. Typically these are the HTTP connector and the AJP connector (which connects through a web server such as IIS or Apache).
   Each connector has its own thread settings, or can use a shared thread executor pool.

Each incoming request requires a thread for the duration of that request. If more simultaneous requests are received than can be handled by the currently available request processing threads, additional threads will be created up to the configured maximum (the value of the maxThreads attribute). If still more simultaneous requests are received, they are stacked up inside the server socket created by the Connector, up to the configured maximum (the value of the acceptCount attribute). Any further simultaneous requests will receive "connection refused" errors, until resources are available to process them.
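The arithmetic in that paragraph can be sketched as a small model. This is a deliberately simplified illustration, not Tomcat code: the class name is hypothetical, the numbers 200 and 100 are just example values for maxThreads and acceptCount, and timing and operating-system backlog behavior are ignored.

```java
// Hypothetical, simplified model of how a burst of simultaneous requests
// is split between worker threads, the accept queue, and refusals.
final class ConnectorCapacitySketch {
    // Returns {processing, queued, refused} for a burst of simultaneous requests.
    static int[] classify(int simultaneous, int maxThreads, int acceptCount) {
        int processing = Math.min(simultaneous, maxThreads);           // each gets a thread
        int queued = Math.min(simultaneous - processing, acceptCount); // wait in the socket backlog
        int refused = simultaneous - processing - queued;              // "connection refused"
        return new int[] { processing, queued, refused };
    }

    public static void main(String[] args) {
        int[] r = classify(350, 200, 100);
        System.out.println("processing=" + r[0] + " queued=" + r[1] + " refused=" + r[2]);
    }
}
```

With these example values, a burst of 350 requests would leave 200 being processed, 100 waiting, and 50 refused.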
 

So each connector takes care of a different protocol.
 image

Engines:

The Engine element represents the entire request processing machinery associated with a particular Catalina Service. It receives and processes all requests from one or more Connectors, and returns the completed response to the Connector for ultimate transmission back to the client.

Exactly one Engine element MUST be nested inside a Service element, following all of the corresponding Connector elements associated with this Service.

Each Engine contains one or more Hosts, and an Engine can be given a jvmRoute so that a load balancer can route requests to it. By default, Tomcat uses the StandardEngine implementation.
image

Hosts:

The Host element represents a virtual host, which is an association of a network name for a server (such as "www.mycompany.com") with the particular server on which Catalina is running. In order to be effective, this name must be registered in the Domain Name Service (DNS) server that manages the Internet domain you belong to - contact your Network Administrator for more information.

In many cases, System Administrators wish to associate more than one network name (such as www.mycompany.com and company.com) with the same virtual host and applications. This can be accomplished using the Host Name Aliases feature discussed below.

One or more Host elements are nested inside an Engine element. Inside the Host element, you can nest Context elements for the web applications associated with this virtual host. Exactly one of the Hosts associated with each Engine MUST have a name matching the defaultHost attribute of that Engine.
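The name, alias and defaultHost rules can be pictured with a small lookup table. This is only a sketch under assumptions: the class and method names are hypothetical, and it models nothing more than "match the requested name or an alias, otherwise fall back to the default host".

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of virtual-host selection; not Tomcat's mapper.
final class HostSelectSketch {
    private final Map<String, String> nameToHost = new HashMap<>();
    private final String defaultHost;

    HostSelectSketch(String defaultHost) {
        this.defaultHost = defaultHost;
        register(defaultHost); // the default host must itself be a defined host
    }

    // Registers a host under its own name and any aliases.
    void register(String hostName, String... aliases) {
        nameToHost.put(lower(hostName), hostName);
        for (String alias : aliases) {
            nameToHost.put(lower(alias), hostName);
        }
    }

    // Host names are compared case-insensitively; unknown names go to the default host.
    String resolve(String requestedName) {
        if (requestedName == null) {
            return defaultHost;
        }
        return nameToHost.getOrDefault(lower(requestedName), defaultHost);
    }

    private static String lower(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        HostSelectSketch hosts = new HostSelectSketch("localhost");
        hosts.register("www.mycompany.com", "company.com");
        System.out.println(hosts.resolve("company.com"));
        System.out.println(hosts.resolve("unknown.example"));
    }
}
```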


A Host maps URLs to Contexts.
image

Contexts:

The Context element represents a web application, which is run within a particular virtual host. Each web application is based on a Web Application Archive (WAR) file, or a corresponding directory containing the corresponding unpacked contents, as described in the Servlet Specification (version 2.2 or later). For more information about web application archives, you can download the Servlet Specification, and review the Tomcat Application Developer's Guide.

The web application used to process each HTTP request is selected by Catalina based on matching the longest possible prefix of the Request URI against the context path of each defined Context. Once selected, that Context will select an appropriate servlet to process the incoming request, according to the servlet mappings defined in the web application deployment descriptor file (which MUST be located at /WEB-INF/web.xml within the web app's directory hierarchy).
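The longest-prefix rule can be sketched in a few lines. This is an illustration only, with hypothetical names, and it assumes that a context path matches when it equals the request URI or is followed by a "/" in it, with the empty string standing for the root context.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of longest-prefix context selection; not Catalina code.
final class ContextMatchSketch {
    // Returns the longest matching context path, or null if nothing matches.
    static String selectContext(List<String> contextPaths, String requestUri) {
        String best = null;
        for (String path : contextPaths) {
            boolean matches = path.isEmpty()                  // root context matches everything
                    || requestUri.equals(path)
                    || requestUri.startsWith(path + "/");     // match on a path-segment boundary
            if (matches && (best == null || path.length() > best.length())) {
                best = path;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> contexts = Arrays.asList("", "/docs", "/docs/api");
        System.out.println(selectContext(contexts, "/docs/api/index.html"));
        System.out.println(selectContext(contexts, "/docs/index.html"));
    }
}
```

Matching on a segment boundary means a request for /docsx/page falls through to the root context rather than to /docs.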

There are two Context implementations in Tomcat, the standard one and the replicated one. (The replicated context is used for Tomcat clustering, where session state is kept in sync between contexts across different hosts.)

image

The Context takes care of the filters and JSPs, and of the final execution of all JSP logic. Settings such as the welcome page, watched resources, servlet mappings, and status page mappings are all configured in web.xml, located in WEB-INF.
image

Valves:

A Valve element represents a component that will be inserted into the request processing pipeline for the associated Catalina container (Engine, Host, or Context). Individual Valves have distinct processing capabilities, and are described individually below.

There are a lot of valves available in Tomcat, such as the access log, authenticator, and JVM routing valves.

image

Wednesday, August 25, 2010

Silverlight communication with WCF service, Disable the .svc web browser Access

As of the latest Silverlight 4.0, the WCF client runtime supports two message encodings; the MtomMessageEncoding encoding is not available yet.

  • BinaryMessageEncoding
  • TextMessageEncoding

So from the client-side (browser-side) perspective, the Silverlight app communicates with the WCF service (.svc) using either the SOAP/text format or the SOAP/binary format. Typically, you can tell which from the Content-Type header.

  • application/soap+msbin1
  • application/soap+xml;
image

Here is the binary encoding.

image

If you browse to the .svc URL directly, you may be able to see the WSDL (if you have already enabled httpGetEnabled on serviceMetadata).
image
Usually, you don’t want end users to view this information directly; you would rather they get a custom error or even a 404 error. How do you do that?

Write a simple HTTP module and deploy it to the web server that hosts the .svc service. It denies those requests and only allows the soap+xml and soap+msbin1 content types.
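The decision such a module makes can be sketched as a single predicate. This is a hypothetical Java rendering of the check only, not the ASP.NET module itself; the class and method names are illustrative, and it assumes the service URL ends in .svc and that a plain browser request carries no SOAP Content-Type.

```java
import java.util.Locale;

// Hypothetical sketch of the allow/deny check; the real module would be an ASP.NET IHttpModule.
final class SvcRequestGateSketch {
    // Allows non-.svc requests, and .svc requests whose Content-Type is one of the two SOAP encodings.
    static boolean isAllowed(String path, String contentType) {
        if (!path.toLowerCase(Locale.ROOT).endsWith(".svc")) {
            return true; // not a service endpoint, leave it alone
        }
        if (contentType == null) {
            return false; // e.g. a browser GET of the .svc page
        }
        String ct = contentType.toLowerCase(Locale.ROOT).trim();
        return ct.startsWith("application/soap+xml")
                || ct.startsWith("application/soap+msbin1");
    }

    // Maps the decision to the status code the post aims for.
    static int statusFor(String path, String contentType) {
        return isAllowed(path, contentType) ? 200 : 404;
    }

    public static void main(String[] args) {
        System.out.println(statusFor("/Service1.svc", "application/soap+msbin1"));
        System.out.println(statusFor("/Service1.svc", null));
    }
}
```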
image

Then, when a user tries to access the .svc directly, they get a 404 error, as we expect.

 image

“could not get a worker for name ajp13” when configuring IIS with Tomcat

I just followed the instructions at http://tomcat.apache.org/connectors-doc/reference/iis.html.

I installed a Tomcat 7 instance with all the default settings, then started the instance by running “bin\startup.bat”.

Download and install the mod_jk module for IIS from http://www.apache.org/dist/tomcat/tomcat-connectors/jk/binaries/. If you are using 64-bit Windows, be sure to choose the 64-bit ISAPI extension; otherwise IIS will ignore the extension.

Then create the workers.properties and uriworkermap.properties files. They are pretty simple: they forward all URLs under /docs to the Tomcat documentation.

# workers.properties

worker.list=tomcat1,jkstatus
worker.tomcat1.cachesize=10
worker.tomcat1.host=127.0.0.1
worker.tomcat1.port=8009
worker.tomcat1.type=ajp13
worker.jkstatus.type=status


# uriworkermap.properties

/docs=tomcat1
/docs/*=tomcat1
/status=jkstatus
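To make the two rule styles concrete, here is a minimal sketch of exact and wildcard matching. It is only an illustration with hypothetical names; the real mod_jk matcher supports more rule types and its own ordering, so treat the first-match loop as an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of exact vs. wildcard URI rules; not mod_jk's matcher.
final class UriWorkerMapSketch {
    // Returns the worker name for a URI, or null if no rule matches.
    static String workerFor(Map<String, String> rules, String uri) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            String pattern = rule.getKey();
            if (pattern.endsWith("/*")) {
                // Wildcard rule: "/docs/*" matches anything that starts with "/docs/".
                String prefix = pattern.substring(0, pattern.length() - 1);
                if (uri.startsWith(prefix)) {
                    return rule.getValue();
                }
            } else if (uri.equals(pattern)) {
                // Exact rule: "/docs" matches only "/docs".
                return rule.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("/docs", "tomcat1");
        rules.put("/docs/*", "tomcat1");
        rules.put("/status", "jkstatus");
        System.out.println(workerFor(rules, "/docs/index.html"));
        System.out.println(workerFor(rules, "/status"));
    }
}
```

This is why both /docs and /docs/* are listed: the exact rule covers the bare path and the wildcard rule covers everything beneath it.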

Since I am using IIS 7.5 (the one that comes with Windows 7), I have to enable the IIS ISAPI features; they are disabled by default.

image

After that, open IIS Manager and enable the ISAPI extension. (Go to the default site, open Handler Mappings, and click Add Wildcard Script Map in the Actions panel on the right.)

image

Then create a virtual directory called jakarta and point it to the directory that contains the isapi_redirect DLL.

>>> This is important: later on we will configure the registry to point extension_uri to /jakarta/isapi_redirect_xxx.dll

Now fill in the default configuration in the registry. You can save and import the following settings; change the folder paths to match yours.

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Apache Software Foundation\Jakarta Isapi Redirector\1.0]
"log_file"="C:\\apache-tomcat-7.0.0\\IIS\\redirector.log"
"log_level"="debug"
"worker_file"="C:\\apache-tomcat-7.0.0\\IIS\\workers.properties"
"worker_mount_file"="C:\\apache-tomcat-7.0.0\\IIS\\uriworkermap.properties"
"extension_uri"="/jakarta/isapi_redirect-64.dll"
"strip_session"="1"
"reject_unsafe"="1"

image

Then run “IISReset” to apply the changes.
 

After that, it’s time for a test drive. When I entered “http://localhost/docs”, nothing appeared: just a blank screen, no errors.
image

What happened under the hood? I opened the redirector log; here is what I got.
I highlighted some parts. It looks like the configuration did get picked up and the tomcat1 worker is set up correctly, so that part is fine.

[Wed Aug 25 10:30:06.871 2010] [3868:5124] [debug] jk_set_time_fmt::jk_util.c (459): Pre-processed log time stamp format is '[%a %b %d %H:%M:%S.000 %Y] '
[Wed Aug 25 10:30:06.881 2010] [3868:5124] [info] init_jk::jk_isapi_plugin.c (2403): Starting Jakarta/ISAPI/isapi_redirector/1.2.30
[Wed Aug 25 10:30:06.884 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2421): Detected IIS version 7.5
[Wed Aug 25 10:30:06.887 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2426): Using registry.
[Wed Aug 25 10:30:06.890 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2429): Using log file C:\apache-tomcat-7.0.0\IIS\redirector.log.
[Wed Aug 25 10:30:06.894 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2430): Using log level 1.
[Wed Aug 25 10:30:06.897 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2431): Using extension uri /jakarta/isapi_redirect-64.dll.
[Wed Aug 25 10:30:06.900 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2432): Using worker file C:\apache-tomcat-7.0.0\IIS\workers.properties.
[Wed Aug 25 10:30:06.902 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2433): Using worker mount file C:\apache-tomcat-7.0.0\IIS\uriworkermap.properties.

[Wed Aug 25 10:30:06.905 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2435): Using rewrite rule file .
[Wed Aug 25 10:30:06.908 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2437): Using uri select 3.
[Wed Aug 25 10:30:06.911 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2438): Using no chunked encoding.
[Wed Aug 25 10:30:06.912 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2440): Using notification event SF_NOTIFY_AUTH_COMPLETE (0x04000000)
[Wed Aug 25 10:30:06.915 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2450): Using uri header TOMCATURI0000000010000000:.
[Wed Aug 25 10:30:06.918 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2451): Using query header TOMCATQUERY0000000010000000:.
[Wed Aug 25 10:30:06.920 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2452): Using worker header TOMCATWORKER0000000010000000:.
[Wed Aug 25 10:30:06.922 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2453): Using worker index TOMCATWORKERIDX0000000010000000:.
[Wed Aug 25 10:30:06.925 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2454): Using translate header TOMCATTRANSLATE0000000010000000:.
[Wed Aug 25 10:30:06.928 2010] [3868:5124] [debug] init_jk::jk_isapi_plugin.c (2455): Using a default of 250 connections per pool.
[Wed Aug 25 10:30:06.931 2010] [3868:5124] [debug] jk_map_read_property::jk_map.c (491): Adding property '/docs' with value 'tomcat1' to map.
[Wed Aug 25 10:30:06.933 2010] [3868:5124] [debug] jk_map_read_property::jk_map.c (491): Adding property '/docs/*' with value 'tomcat1' to map.
[Wed Aug 25 10:30:06.936 2010] [3868:5124] [debug] jk_map_read_property::jk_map.c (491): Adding property '/status' with value 'jkstatus' to map.

[Wed Aug 25 10:30:06.939 2010] [3868:5124] [debug] uri_worker_map_load::jk_uri_worker_map.c (1102): Loading urimaps from C:\apache-tomcat-7.0.0\IIS\uriworkermap.properties with reload check interval 60 seconds
[Wed Aug 25 10:30:06.941 2010] [3868:5124] [debug] uri_worker_map_add::jk_uri_worker_map.c (729): exact rule '/docs=tomcat1' source 'uriworkermap' was added
[Wed Aug 25 10:30:06.943 2010] [3868:5124] [debug] uri_worker_map_add::jk_uri_worker_map.c (720): wildchar rule '/docs/*=tomcat1' source 'uriworkermap' was added
[Wed Aug 25 10:30:06.946 2010] [3868:5124] [debug] uri_worker_map_add::jk_uri_worker_map.c (729): exact rule '/status=jkstatus' source 'uriworkermap' was added
[Wed Aug 25 10:30:06.951 2010] [3868:5124] [debug] uri_worker_map_dump::jk_uri_worker_map.c (171): uri map dump after file load: index=0 file='C:\apache-tomcat-7.0.0\IIS\uriworkermap.properties' reject_unsafe=1 reload=60 modified=1282755822 checked=1282757406
[Wed Aug 25 10:30:06.954 2010] [3868:5124] [debug] uri_worker_map_dump::jk_uri_worker_map.c (176): generation 0: size=0 nosize=0 capacity=0
[Wed Aug 25 10:30:06.956 2010] [3868:5124] [debug] uri_worker_map_dump::jk_uri_worker_map.c (176): generation 1: size=3 nosize=0 capacity=4
[Wed Aug 25 10:30:06.961 2010] [3868:5124] [debug] uri_worker_map_dump::jk_uri_worker_map.c (186): NEXT (1) map #0: uri=/docs/* worker=tomcat1 context=/docs/* source=uriworkermap type=Wildchar len=7
[Wed Aug 25 10:30:06.964 2010] [3868:5124] [debug] uri_worker_map_dump::jk_uri_worker_map.c (186): NEXT (1) map #1: uri=/status worker=jkstatus context=/status source=uriworkermap type=Exact len=7
[Wed Aug 25 10:30:06.966 2010] [3868:5124] [debug] uri_worker_map_dump::jk_uri_worker_map.c (186): NEXT (1) map #2: uri=/docs worker=tomcat1 context=/docs source=uriworkermap type=Exact len=5
[Wed Aug 25 10:30:06.968 2010] [3868:5124] [debug] jk_map_read_property::jk_map.c (491): Adding property 'worker.list' with value 'tomcat1,jkstatus' to map.
[Wed Aug 25 10:30:06.971 2010] [3868:5124] [warn] jk_map_validate_property::jk_map.c (411): The attribute 'worker.tomcat1.cachesize' is deprecated - please check the documentation for the correct replacement.
[Wed Aug 25 10:30:06.974 2010] [3868:5124] [debug] jk_map_read_property::jk_map.c (491): Adding property 'worker.tomcat1.cachesize' with value '10' to map.
[Wed Aug 25 10:30:06.979 2010] [3868:5124] [debug] jk_map_read_property::jk_map.c (491): Adding property 'worker.tomcat1.host' with value '127.0.0.1' to map.
[Wed Aug 25 10:30:06.980 2010] [3868:5124] [debug] jk_map_read_property::jk_map.c (491): Adding property 'worker.tomcat1.port' with value '8009' to map.
[Wed Aug 25 10:30:06.984 2010] [3868:5124] [debug] jk_map_read_property::jk_map.c (491): Adding property 'worker.tomcat1.type' with value 'ajp13' to map.
[Wed Aug 25 10:30:06.986 2010] [3868:5124] [debug] jk_map_read_property::jk_map.c (491): Adding property 'worker.jkstatus.type' with value 'status' to map.
[Wed Aug 25 10:30:06.988 2010] [3868:5124] [debug] jk_map_resolve_references::jk_map.c (774): Checking for references with prefix worker. with wildcard (recursion 1)
[Wed Aug 25 10:30:06.991 2010] [3868:5124] [debug] jk_shm_calculate_size::jk_shm.c (132): shared memory will contain 1 ajp workers of size 320 and 0 lb workers of size 320 with 0 members of size 384+320
[Wed Aug 25 10:30:06.994 2010] [3868:5124] [debug] jk_shm_open::jk_shm.c (254): Initialized shared memory JKISAPISHMEM_LOCALHOST_1 size=448 free=320 addr=0x3b0000
[Wed Aug 25 10:30:06.997 2010] [3868:5124] [debug] jk_map_dump::jk_map.c (589): Dump of map: 'worker.list' -> 'tomcat1,jkstatus'
[Wed Aug 25 10:30:07.000 2010] [3868:5124] [debug] jk_map_dump::jk_map.c (589): Dump of map: 'worker.tomcat1.cachesize' -> '10'
[Wed Aug 25 10:30:07.002 2010] [3868:5124] [debug] jk_map_dump::jk_map.c (589): Dump of map: 'worker.tomcat1.host' -> '127.0.0.1'
[Wed Aug 25 10:30:07.005 2010] [3868:5124] [debug] jk_map_dump::jk_map.c (589): Dump of map: 'worker.tomcat1.port' -> '8009'
[Wed Aug 25 10:30:07.008 2010] [3868:5124] [debug] jk_map_dump::jk_map.c (589): Dump of map: 'worker.tomcat1.type' -> 'ajp13'
[Wed Aug 25 10:30:07.011 2010] [3868:5124] [debug] jk_map_dump::jk_map.c (589): Dump of map: 'worker.jkstatus.type' -> 'status'
[Wed Aug 25 10:30:07.014 2010] [3868:5124] [debug] build_worker_map::jk_worker.c (242): creating worker tomcat1
[Wed Aug 25 10:30:07.017 2010] [3868:5124] [debug] wc_create_worker::jk_worker.c (146): about to create instance tomcat1 of ajp13
[Wed Aug 25 10:30:07.020 2010] [3868:5124] [debug] wc_create_worker::jk_worker.c (159): about to validate and init tomcat1
[Wed Aug 25 10:30:07.023 2010] [3868:5124] [debug] ajp_validate::jk_ajp_common.c (2605): worker tomcat1 contact is '127.0.0.1:8009'
[Wed Aug 25 10:30:07.027 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2800): setting endpoint options:
[Wed Aug 25 10:30:07.030 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2803): keepalive:              0
[Wed Aug 25 10:30:07.035 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2807): socket timeout:         0
[Wed Aug 25 10:30:07.038 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2811): socket connect timeout: 0
[Wed Aug 25 10:30:07.040 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2815): buffer size:            0
[Wed Aug 25 10:30:07.044 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2819): pool timeout:           0
[Wed Aug 25 10:30:07.047 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2823): ping timeout:           10000
[Wed Aug 25 10:30:07.050 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2827): connect timeout:        0
[Wed Aug 25 10:30:07.053 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2831): reply timeout:          0
[Wed Aug 25 10:30:07.056 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2835): prepost timeout:        0
[Wed Aug 25 10:30:07.059 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2839): recovery options:       0
[Wed Aug 25 10:30:07.062 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2843): retries:                2
[Wed Aug 25 10:30:07.063 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2847): max packet size:        8192
[Wed Aug 25 10:30:07.066 2010] [3868:5124] [debug] ajp_init::jk_ajp_common.c (2851): retry interval:         100
[Wed Aug 25 10:30:07.069 2010] [3868:5124] [debug] ajp_create_endpoint_cache::jk_ajp_common.c (2662): setting connection pool size to 10 with min 5 and acquire timeout 200
[Wed Aug 25 10:30:07.072 2010] [3868:5124] [debug] build_worker_map::jk_worker.c (242): creating worker jkstatus
[Wed Aug 25 10:30:07.075 2010] [3868:5124] [debug] wc_create_worker::jk_worker.c (146): about to create instance jkstatus of status
[Wed Aug 25 10:30:07.078 2010] [3868:5124] [debug] wc_create_worker::jk_worker.c (159): about to validate and init jkstatus
[Wed Aug 25 10:30:07.081 2010] [3868:5124] [debug] init::jk_status.c (5053): Status worker 'jkstatus' is read/write and has css '(null)', prefix 'worker', name space 'jk:', xml name space 'xmlns:jk="http://tomcat.apache.org"', document type '(null)'
[Wed Aug 25 10:30:07.084 2010] [3868:5124] [debug] init::jk_status.c (5104): Status worker 'jkstatus' has good rating for '0000000f' and bad rating for '00ff1010'
[Wed Aug 25 10:30:07.085 2010] [3868:5124] [debug] wc_get_worker_for_name::jk_worker.c (116): found a worker tomcat1
[Wed Aug 25 10:30:07.089 2010] [3868:5124] [debug] wc_get_name_for_type::jk_worker.c (293): Found worker type 'ajp13'
[Wed Aug 25 10:30:07.091 2010] [3868:5124] [debug] uri_worker_map_ext::jk_uri_worker_map.c (512): Checking extension for worker 0: tomcat1 of type ajp13 (2)
[Wed Aug 25 10:30:07.094 2010] [3868:5124] [debug] wc_get_worker_for_name::jk_worker.c (116): found a worker jkstatus
[Wed Aug 25 10:30:07.097 2010] [3868:5124] [debug] wc_get_name_for_type::jk_worker.c (293): Found worker type 'status'
[Wed Aug 25 10:30:07.100 2010] [3868:5124] [debug] uri_worker_map_ext::jk_uri_worker_map.c (512): Checking extension for worker 1: jkstatus of type status (6)
[Wed Aug 25 10:30:07.103 2010] [3868:5124] [debug] wc_get_worker_for_name::jk_worker.c (116): found a worker tomcat1
[Wed Aug 25 10:30:07.106 2010] [3868:5124] [debug] wc_get_name_for_type::jk_worker.c (293): Found worker type 'ajp13'
[Wed Aug 25 10:30:07.109 2010] [3868:5124] [debug] uri_worker_map_ext::jk_uri_worker_map.c (512): Checking extension for worker 2: tomcat1 of type ajp13 (2)
[Wed Aug 25 10:30:07.111 2010] [3868:5124] [debug] uri_worker_map_dump::jk_uri_worker_map.c (171): uri map dump after extension stripping: index=0 file='C:\apache-tomcat-7.0.0\IIS\uriworkermap.properties' reject_unsafe=1 reload=60 modified=1282755822 checked=1282757406
[Wed Aug 25 10:30:07.115 2010] [3868:5124] [debug] uri_worker_map_dump::jk_uri_worker_map.c (176): generation 0: size=0 nosize=0 capacity=0
[Wed Aug 25 10:30:07.117 2010] [3868:5124] [debug] uri_worker_map_dump::jk_uri_worker_map.c (176): generation 1: size=3 nosize=0 capacity=4
[Wed Aug 25 10:30:07.120 2010] [3868:5124] [debug] uri_worker_map_dump::jk_uri_worker_map.c (186): NEXT (1) map #0: uri=/docs/* worker=tomcat1 context=/docs/* source=uriworkermap type=Wildchar len=7
[Wed Aug 25 10:30:07.123 2010] [3868:5124] [debug] uri_worker_map_dump::jk_uri_worker_map.c (186): NEXT (1) map #1: uri=/status worker=jkstatus context=/status source=uriworkermap type=Exact len=7
[Wed Aug 25 10:30:07.126 2010] [3868:5124] [debug] uri_worker_map_dump::jk_uri_worker_map.c (186): NEXT (1) map #2: uri=/docs worker=tomcat1 context=/docs source=uriworkermap type=Exact len=5

[Wed Aug 25 10:30:07.129 2010] [3868:5124] [debug] uri_worker_map_switch::jk_uri_worker_map.c (482): Switching uri worker map from index 0 to index 1
[Wed Aug 25 10:30:07.132 2010] [3868:5124] [info] init_jk::jk_isapi_plugin.c (2573): Jakarta/ISAPI/isapi_redirector/1.2.30 initialized
[Wed Aug 25 10:30:07.135 2010] [3868:5124] [debug] wc_maintain::jk_worker.c (339): Maintaining worker tomcat1
[Wed Aug 25 10:30:07.137 2010] [3868:5124] [debug] init_ws_service::jk_isapi_plugin.c (2792): Reading extension header HTTP_TOMCATWORKER0000000010000000: (null)
[Wed Aug 25 10:30:07.140 2010] [3868:5124] [debug] init_ws_service::jk_isapi_plugin.c (2793): Reading extension header HTTP_TOMCATWORKERIDX0000000010000000: -1
[Wed Aug 25 10:30:07.144 2010] [3868:5124] [debug] init_ws_service::jk_isapi_plugin.c (2794): Reading extension header HTTP_TOMCATURI0000000010000000: (null)
[Wed Aug 25 10:30:07.147 2010] [3868:5124] [debug] init_ws_service::jk_isapi_plugin.c (2795): Reading extension header HTTP_TOMCATQUERY0000000010000000: (null)
[Wed Aug 25 10:30:07.149 2010] [3868:5124] [debug] init_ws_service::jk_isapi_plugin.c (2800): No URI header value provided. Defaulting to old behaviour
[Wed Aug 25 10:30:07.152 2010] [3868:5124] [debug] init_ws_service::jk_isapi_plugin.c (3108): Service protocol=HTTP/1.1 method=GET host=::1 addr=::1 name=localhost port=80 auth= user= uri=/docs/
[Wed Aug 25 10:30:07.155 2010] [3868:5124] [debug] init_ws_service::jk_isapi_plugin.c (3120): Service request headers=7 attributes=0 chunked=no content-length=0 available=0
[Wed Aug 25 10:30:07.158 2010] [3868:5124] [debug] wc_get_worker_for_name::jk_worker.c (116): did not find a worker ajp13
[Wed Aug 25 10:30:07.161 2010] [3868:5124] [debug] HttpExtensionProc::jk_isapi_plugin.c (2162): could not get a worker for name ajp13
[Wed Aug 25 10:30:07.163 2010] [3868:5124] [error] HttpExtensionProc::jk_isapi_plugin.c (2210): could not get a worker for name ajp13

could not get a worker for name ajp13
  That’s the error. From the log you can see that the mapping from /docs to tomcat1 is correct, and the request URI, /docs, is also correct. So why does it still try to pick up a worker named ajp13?

After three hours of research, the answer turned out to be pretty obvious.
isapi_redirect-64.dll contains two IIS-specific modules:
ISAPI Extension
and
ISAPI Filter

ISAPI extensions are true applications that run on IIS and have access to all of the functionality provided by IIS. As an example of how powerful ISAPI extensions can be, ASP pages are processed through an ISAPI extension called ASP.dll. In general, clients can access ISAPI extensions the same way they access a static HTML file or dynamic ASP file.

ISAPI extensions are implemented as DLLs that are loaded into a process that is controlled by IIS. Like ASP and HTML pages, IIS uses the virtual location of the DLL file in the file system to map the ISAPI extension into the URL namespace that is served by IIS.

Extensions and filters are the two types of applications that can be developed using ISAPI. An ISAPI extension runs when requested just like any other static HTML file or dynamic ASP file. Since ISAPI applications are compiled code, they are processed much faster than ASP files or files that call COM+ components.

ISAPI filters are DLL files that can be used to modify and enhance the functionality provided by IIS. ISAPI filters always run on an IIS server, filtering every request until they find one they need to process. The ability to examine and modify both incoming and outgoing streams of data makes ISAPI filters powerful and flexible.

Filters are registered at either the site level or the global level (that is, global filters apply to all sites on the IIS server), and are initialized when the worker process is started. A filter listens to all requests to the site on which it is installed.


When you request a link like /jakarta/isapi_redirect-64.dll directly,
that is the Extension. It has a method called HttpExtensionProc, which runs the mapping logic (say, from /docs to the tomcat1 worker). This part is OK.
For the second part, we need the
ISAPI Filter module, which is hit on every request to run the redirect logic. It also holds the mapping (say, from /docs to our extension DLL, /jakarta/isapi_redirect-64.dll). This piece is missing.

image

What we have to do is open the ISAPI Filters feature in IIS and add this filter.

image

Restart IIS, and everything is back to normal. You get the familiar help page from localhost/docs, which is served by Tomcat in the back end.
 image

The status page works too.
image

Here is a link on running an IIS trace, which shows how the request is processed differently with and without the filter.

Using the IIS 7/7.5 tracing features to locate the mod_jk Tomcat ISAPI problem

One day I got a strange error, “could not get a worker for name ajp13”. The first thing I did was turn on the tracing features of IIS 7.
 image

Click Failed Request Tracing in the Actions panel on the right. Here I log all requests, whether they succeed or not.

image

Dump all the details for all the requests.

image

Then I tried the URL http://localhost/docs/.

Next, look at the dump. You can open the XML file directly in the browser.

image

Click the compact view. The request URI is /docs. At first, the ISAPI extension module is loaded.

image

Then it gets executed: the HttpExtensionProc method.
image

That’s it. There is no server-side URL rewrite logic, such as redirecting /docs to /jakarta/isapi_redirector.dll.

After we added the filter to IIS and ran the trace again, we got different results.

First, the URL is still /docs.
image

The module gets loaded as well.

image

This time, the filter gets loaded. We should expect a server-side URL redirect.

image

Here it is: /docs is redirected to /jakarta/isapi_redirector.dll.

image

Then the extension gets executed.
image

It repeats the logic from the previous configuration, but this time it gets a lot more context.
image

We made it. :)

Thursday, August 19, 2010

Install and test Apache Tomcat on Windows 7

If you want a server to host some JSPs and servlets, Tomcat is one of the popular free servers. Installing and configuring it is super easy; I will show you how to install it on Windows. Basically there are 3 steps.

  • Tomcat is a standard Java application, so make sure the JDK is installed. Each Tomcat version requires a minimum Java version; check it here: http://tomcat.apache.org/whichversion.html
  • Download and unpack the zip file, which contains the jars, some resources, and the configuration files.
  • Start it manually or add it as a Windows service.

So, first make sure the JDK is there. If not, download a JDK from Sun and run the installer. After that, add the JDK bin directory to the Windows PATH variable and set JAVA_HOME to the JDK folder.
   Run java -version | javac -version and you will get output like the following, depending on your version. This means that the JDK and PATH are all set.

C:\>java -version | javac -version
java version "1.6.0_21"
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) Client VM (build 17.0-b16, mixed mode, sharing)
javac 1.6.0_21

Then download and unzip the Tomcat files from http://tomcat.apache.org/download-70.cgi. Since Tomcat is a standard Java application, it should be the same for 32-bit and 64-bit Windows. So why are there two downloads? The only difference is that each contains a specific build of the Java service wrapper that lets you run Tomcat as a Windows service.
  Unzip it to c:\apache; the folder should look like this.


image

Start it manually or add it as a Windows service.
  Go to the command prompt in the bin folder and enter “catalina.bat start” (starts Tomcat in a new window) or “catalina.bat run” (runs it in the current window).
image

The script basically starts the Java process and passes it the parameters it needs.
  Here is mine.
 

"C:\jdk1.6.0_21\bin\java"   -Djava.util.logging.config.file="C:\apache\conf\logging.properties" -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager   -Djava.endorsed.dirs="C:\apache\endorsed" -classpath "C:\apache\bin\bootstrap.jar;C:\apache\bin\tomcat-juli.jar" -Dcatalina.base="C:\apache" -Dcatalina.home="C:\apache" -Djava.io.tmpdir="C:\apache\temp" org.apache.catalina.startup.Bootstrap  start 

Or run “service.bat install tomcat7” to create a Windows service.

image

image

The Windows service will be named “Apache Tomcat [yourservicename]”. tomcat7.exe is the file that differs between the 32-bit and 64-bit downloads.
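
To make the long java command line shown earlier easier to read, here is a rough sketch that rebuilds the same argument list from two folders. The class name `CatalinaCommand` and the method `build` are made up for this illustration. It only mirrors the parameters from my machine; the real catalina.bat may add other options depending on version and environment variables.

```java
import java.util.ArrayList;
import java.util.List;

public class CatalinaCommand {
    // Hypothetical helper: rebuilds the Windows command line shown above
    // from the JDK folder and the Tomcat folder.
    static List<String> build(String javaHome, String catalinaHome) {
        List<String> cmd = new ArrayList<String>();
        cmd.add(javaHome + "\\bin\\java");
        // Logging configuration used by Tomcat's JULI logger.
        cmd.add("-Djava.util.logging.config.file=" + catalinaHome + "\\conf\\logging.properties");
        cmd.add("-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager");
        cmd.add("-Djava.endorsed.dirs=" + catalinaHome + "\\endorsed");
        // Only the bootstrap jars go on the classpath; Tomcat loads the rest itself.
        cmd.add("-classpath");
        cmd.add(catalinaHome + "\\bin\\bootstrap.jar;" + catalinaHome + "\\bin\\tomcat-juli.jar");
        cmd.add("-Dcatalina.base=" + catalinaHome);
        cmd.add("-Dcatalina.home=" + catalinaHome);
        cmd.add("-Djava.io.tmpdir=" + catalinaHome + "\\temp");
        // Main class followed by the action to perform.
        cmd.add("org.apache.catalina.startup.Bootstrap");
        cmd.add("start");
        return cmd;
    }

    public static void main(String[] args) {
        // The two folders are the ones used in this post.
        List<String> cmd = build("C:\\jdk1.6.0_21", "C:\\apache");
        System.out.println(String.join(" ", cmd));
    }
}
```

Passing the list to a ProcessBuilder would launch Tomcat the same way the script does; the sketch only prints it.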

Monday, August 16, 2010

Cisco wireless web authentication, web form auto login, Java HtmlUnit, WebBrowser control

In the .NET world, if you want to invoke web pages, fill out forms, click links, and so on, you can write logic against the WebBrowser class.

For example, you need to enter a username and password every time you access a wireless network that is secured by Cisco wireless web authentication.

You can write a simple C# application that parses the DOM, finds the input fields and assigns their values, then invokes the click event on the submit button. Here is some sample C# code.

If you have the same problem on a Mac, you can’t use .NET, of course.
I kept looking for an equivalent web browser control; unfortunately, there is no WebBrowser control that works like the C# one. However, you can use HtmlUnit.

With HtmlUnit, you can write about 10 lines of code to load a URL, fill out a form, and submit it.
Here is some sample Java code.
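
Under the hood, submitting a login form is just an HTTP POST of URL-encoded fields. The sketch below shows that step with the plain JDK only; it is not HtmlUnit and does not parse the page the way HtmlUnit does. The class name `FormLogin`, the field names `username` and `password`, and the portal URL in the comment are all assumptions for illustration; the real names depend on the login page you get.

```java
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class FormLogin {
    // Encodes the fields as application/x-www-form-urlencoded,
    // the format a browser sends when a form is submitted.
    static String encodeForm(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a Java runtime.
            throw new IllegalStateException(ex);
        }
        return body.toString();
    }

    // Posts the encoded fields to the given URL and returns the HTTP status code.
    static int post(String url, Map<String, String> fields) throws Exception {
        byte[] body = encodeForm(fields).getBytes("UTF-8");
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        OutputStream out = conn.getOutputStream();
        try {
            out.write(body);
        } finally {
            out.close();
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical field names and values.
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("username", "guest");
        fields.put("password", "p@ss word");
        System.out.println(encodeForm(fields));
        // Hypothetical portal URL; uncomment to actually send the request.
        // System.out.println(post("https://portal.example/login.html", fields));
    }
}
```

HtmlUnit is still the more convenient choice when the page has hidden fields or JavaScript, because it reads the form from the page instead of relying on guessed field names.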

Thursday, August 12, 2010

Test AppFabric on Windows 7

Let me start from my new IIS7 ultimate dev machine, where I’ve installed VS 2010 and the SQL Server 2008 R2 developer edition. You don’t need to read the installation guide in order to start AppFabric programming.

It requires four major components; just install them and you are ready to run.

  • .NET runtime 4.0
  • IIS 7 or above and some admin packages (the IIS 7.0 Manager for Remote Administration is required for managing remote IIS servers)
  • SQL Server (monitoring data storage and workflow persistence storage)
  • AppFabric runtime (some .NET assemblies)

First, make sure .NET Framework 4.0 is installed (if you want to do some development on AppFabric, the VS 2010 setup will install .NET Framework 4.0 for you). You can check the Microsoft.NET folder quickly.

image

Then enable IIS 7 and some other features as well.
Simply run “optionalfeatures” and choose the IIS 7 features and WAS.
 image

image

Then download and install the IIS 7 Manager for Remote Administration. You can download it here:
http://go.microsoft.com/fwlink/?LinkId=182018

If you don’t want to use the Web Platform Installer, select the right version for your CPU architecture at the bottom right.
image 
Once done, start IIS Manager; you will have the “connect to remote server” context menu.
image
 

Download and install the AppFabric runtime from MSDN: http://www.microsoft.com/downloads/details.aspx?FamilyID=467e5aa5-c25b-4c80-a6d2-9f8fb0f337d2&displaylang=en

Pay attention to the bits: if you are using Windows 7 x64, choose WindowsServerAppFabricSetup_x64_6.1.exe,
and WindowsServerAppFabricSetup_x86_6.1.exe for x86.

image

for me, I use the x64 version. I will select all the features during the setup
image

Once finished, it will pop up the configuration GUI. All the bits are installed to C:\Windows\System32\AppFabric. If you missed the configuration, start the configuration exe manually.
image 

For the configuration part, basically it just needs to create the databases that hold the monitoring data and the WF service data (persistence and tracking DB).
By default, it ships with a SQL provider for monitoring and persistence. All you need to do is point it to the right service account and DB server.
Once done, it may prompt you to start the SQL Agent in order to show statistics on the dashboard. (Why? After monitoring is enabled, the WCF runtime emits ETW events, which are written to a staging table; some SQL jobs are needed to import the data from the staging table into the query tables.)
image

you may skip the cache configuration at this point.

Then open IIS Manager; you should be able to see the AppFabric Dashboard add-in for IIS. We are done.
image

What happened under the hood?
It created 3 databases,
installed some assemblies and put them in the GAC, and
changed web.config and ApplicationHost.config to inject the monitoring logic.
How?
     WCF 4.0 comes with a new feature called default configuration (default bindings, default behaviors), so the new config adds some behaviors that are picked up by every application unless you clear those behaviors explicitly. You can check C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Config\web.config
   image

It also added some diagnostics listeners, a WF tracking profile, and persistence instances.

Now let’s write a HelloAppFabric application and deploy it to the “fabric”. First, create a new WCF web project.

   image

Click OK. If you get an error saying that ASP.NET 4.0 is not configured for IIS,
image
go to the Framework v4 folder and run the ASP.NET 4.0 registration utility (aspnet_regiis.exe -i).
image 

Then the project is created. Browse to the svc directly: http://localhost/HelloFabric/Service.svc
After that, open the AppFabric dashboard in IIS. You see nothing.
All counters are zero.
image

When you get this, make sure to start the SQL Agent and refresh again. You will see that the counters have increased.
image

Then you can play more with AppFabric. Select an application and right-click to manage its AppFabric behavior.
image

 