My Note on Solutions.
Monday, October 24, 2016
How to test Pacemaker by creating a virtual IP for LB/HA purposes across 3 nodes
nodeone = 'node1'
nodetwo = 'node2'
nodethree = 'node3'

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  #### node 1 ###########
  config.vm.define :"node1" do |node1|
    node1.vm.box = BOX
    node1.vm.box_url = BOX_URL
    node1.vm.network :private_network, ip: "192.168.2.101"
    node1.vm.hostname = nodeone
    node1.vm.synced_folder ".", "/vagrant", disabled: true
    node1.ssh.shell = "bash -c 'BASH_ENV=/etc/profile exec bash'"
    node1.vm.provision "shell", path: "post-deploy.sh", run: "always"
    node1.vm.provider "virtualbox" do |v|
      v.customize ["modifyvm", :id, "--memory", "1300"]
      v.name = nodeone
      v.gui = true
    end
  end
node2 and node3 are defined the same way, with only the hostname and IP (192.168.2.102 / 192.168.2.103) changed. And here is post-deploy.sh:
#!/bin/bash
# append the cluster hosts entries, but only once
value=$( grep -ic "entry" /etc/hosts )
if [ "$value" -eq 0 ]
then
echo "
################ hosts entry ############
192.168.2.101 node1
192.168.2.102 node2
192.168.2.103 node3
######################################################
" >> /etc/hosts
fi
Once done, you can run vagrant up node1 node2 node3 to get the 3 VMs ready.
Now vagrant ssh into every node to install the dependencies:
yum install -y pcsd pacemaker corosync
After that is done, enable all the daemons to start automatically:
systemctl enable pcsd
systemctl enable pacemaker
systemctl enable corosync
# disable firewalld so the cluster traffic is not blocked (fine for a test setup)
systemctl disable firewalld
Now it is time to configure corosync for the node topology.
You can just copy /etc/corosync/corosync.conf.example to /etc/corosync/corosync.conf and
change your IP and quorum settings (by default it uses multicast to maintain the cluster membership).
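For reference, a minimal sketch of what the changed sections can look like (the cluster name, bind network, and expected_votes values here are assumptions; keep whatever the example file ships with and adjust for your subnet):

totem {
    version: 2
    cluster_name: mycluster
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.2.0
        mcastaddr: 239.255.1.1
        mcastport: 5405
    }
}

quorum {
    provider: corosync_votequorum
    expected_votes: 3
}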
Then create a user named haclusteruser with the same password on all 3 nodes:
echo your_special_complicated_password | passwd --stdin haclusteruser
Once done, you can call pcs cluster auth to authorize each host using the shared credential:
pcs cluster auth node1
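In practice you authorize all three nodes in one go; something like the following should work (a sketch, assuming pcsd accepts the account created above):

pcs cluster auth node1 node2 node3 -u haclusteruser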
After this, start corosync and pacemaker on all nodes.
Run crm_mon to verify that all the nodes have joined.
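On each node that boils down to something like this (a sketch, assuming systemd):

systemctl start corosync
systemctl start pacemaker
crm_mon -1   # one-shot status snapshot; all 3 nodes should show as Online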
Now we can create a floating VIP for HA purposes:
pcs resource create VIP2 IPaddr2 ip=192.168.2.100 cidr_netmask=32 nic=enp0s8 op monitor interval=30s
After that, you might see the resource status stuck at Stopped. Since there is no fencing device in this test cluster, we need to disable STONITH:
pcs property set stonith-enabled=false
Check the status again and the VIP is now bound to one node (node1 here).
To test the HA, shut down node1 and run crm_mon again; the VIP is now assigned to node2.
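For example (a sketch; pcs cluster stop node1 is a gentler alternative to powering the node off):

# on node1, the current VIP owner
shutdown -h now
# then on node2
crm_mon -1   # the VIP2 resource should now show as Started on node2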
Wednesday, August 10, 2016
How to locate a Magento credit card leakage
It turned out to be a very smart hacker. Here are the steps I took to locate the issue.
First I used tcpdump to capture all the traffic for a couple of hours and did a quick analysis to see whether anything special stood out: credit card numbers being sent out over SMTP on port 25, or just a plain HTTP POST somewhere; if it is HTTPS, you can sort the TLS certificates to spot any unusual certificate that points to some evil 3rd-party host.
Nothing special was found in my case; most of it was just traffic to this site and some 3rd-party API calls like shipping rate calculation and FedEx integration.
Then do an analysis of the request log to see whether any URL gets a huge number of hits from the same IP.
You can use some handy Linux commands to do the aggregation:
awk '{print $8$2}' requestlog|sort|uniq -c|sort -r -n|more
Nothing special again; most hits were from crawler IPs such as Google's.
Feeling a little bit frustrated now; this must be a smart hacker. So I watched for file changes within a 5-minute window: if the hacker intercepts the request and stores it somewhere, I will definitely find out where he/she stores the file.
Go to the root directory of the Magento install and inspect all files changed within the last 5 minutes (I placed an order using a dummy credit card first):
find . -mmin -5 -ls
There we go: a file called db-tab-footer_bg.gif was changed minutes ago. There should not be any changes to gif files, and this turns out to be no ordinary image file, since it is all filled with encrypted data (it must be the credit cards).
Now, time to locate how they capture the data and dump it to this file.
Just search all PHP files containing db-tab-footer_bg.gif.
It is in the global config file. Essentially they intercept every POST request, encrypt the payload with their RSA public key, and append it to the gif file, so nobody but them can decrypt it.
Now it is time to get rid of the backdoor and do the housekeeping to find out how they injected it onto the server in the first place.
Sunday, July 31, 2016
Group All established connections by target IP
Basic idea: netstat -an | grep 'EST' will show you all the established connections.
The 4th column is the local endpoint and the 5th column is the remote (target) one.
If we want to see how many connections there are per target IP, you can run the following command:
netstat -an -p tcp|grep "EST"|awk '{print $5}'|awk -F "." '{print $1"."$2"."$3"."$4 }'|sort |uniq -c |sort -n -r
Tuesday, May 17, 2016
Spark window function, failure: ``union'' expected but `(' found
team=[("Lakers","WEST",29 ),("Golden State","WEST",89 ), ("MIA HEAT","EAST",79 ),("SAS","WEST",9 ), ("RAPTORS","EAST",29 ) ] sql.createDataFrame( sc.parallelize( team).map(lambda x: Row(Team=x[0],Division=x[1], Score=x[2])))\ .registerAsTable("team") print sql.sql("SELECT team, division, score, rank() OVER (PARTITION BY division ORDER BY score desc)" " as rank FROM team").take(10)
And I got this error complaining about the syntax:
py4j.protocol.Py4JJavaError: An error occurred while calling o36.sql.
: java.lang.RuntimeException: [1.43] failure: ``union'' expected but `(' found
SELECT team, division, score, rank() OVER (PARTITION BY division ORDER BY score desc) as rank FROM team
^
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:36)
at org.apache.spark.sql.catalyst.DefaultParserDialect.parse(ParserDialect.scala:67)
at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:211)
at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:211)
To fix this, make sure you are using HiveContext instead of SQLContext; in Spark 1.x the plain SQL parser does not understand window functions.
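A minimal sketch of the change (assuming Spark 1.x, with sc being the existing SparkContext):

from pyspark.sql import HiveContext, Row

# the Hive-backed parser understands rank() OVER (PARTITION BY ...) in Spark 1.x
sql = HiveContext(sc)

With this, the same registerAsTable / sql.sql snippet above parses and runs.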
How to run a Spark test application in your favorite Python IDE
1. copy the pyspark folder from the standard Spark distribution into your project folder
2. set up some bootstrap code to take care of the environment, using the following code; I use 1.6.1 as an example, and you may create this as a module.
import os
import sys

import pyspark


class Setup(object):
    def setupSpark(self):
        os.environ["SPARK_HOME"] = "/Users/and/Development/spark/spark-1.6.1-bin-hadoop2.6/"
        os.environ["PYSPARK_SUBMIT_ARGS"] = "--master local[2]"
        spark_home = os.environ.get("SPARK_HOME")
        spark_release_file = spark_home + "/RELEASE"
        if os.path.exists(spark_release_file) and "Spark 1.6.1" in open(spark_release_file).read():
            pyspark_submit_args = os.environ.get("PYSPARK_SUBMIT_ARGS", "")
            if not "pyspark-shell" in pyspark_submit_args:
                pyspark_submit_args += " pyspark-shell"
                os.environ["PYSPARK_SUBMIT_ARGS"] = pyspark_submit_args
        sys.path.insert(0, spark_home + "/python")
        sys.path.insert(0, os.path.join(spark_home, "python/lib/py4j-0.9-src.zip"))
        return pyspark.SparkContext()
3. you are good to go:

from lib.setup import Setup

sc = Setup().setupSpark()
print sc.parallelize(range(1, 10)).count()
Tuesday, December 15, 2015
How to: test ElasticSearch geoLocation support
The query looks like this: basically, query all the earthquakes that happened within 300 km of San Diego.
The results show the individual earthquakes as well as the aggregated distance buckets.
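Since the screenshots are not reproduced here, a rough sketch of such a request (the index/type names, the location field name and the San Diego coordinates are assumptions) would be:

POST /quakes/eq/_search
{
  "query": {
    "bool": {
      "filter": {
        "geo_distance": {
          "distance": "300km",
          "location": { "lat": 32.72, "lon": -117.16 }
        }
      }
    }
  },
  "aggs": {
    "rings": {
      "geo_distance": {
        "field": "location",
        "origin": { "lat": 32.72, "lon": -117.16 },
        "unit": "km",
        "ranges": [
          { "to": 100 },
          { "from": 100, "to": 200 },
          { "from": 200, "to": 300 }
        ]
      }
    }
  }
}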
You can grab the data source from USGS; I enclosed the CSV file here.
To do the indexing, first we define the mapping; I used Postman to make the REST calls. My mapping looks like this:
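A minimal sketch of a mapping with a geo_point field (the index, type and field names are assumptions, just to show the geo part):

PUT /quakes
{
  "mappings": {
    "eq": {
      "properties": {
        "time":      { "type": "date" },
        "magnitude": { "type": "double" },
        "location":  { "type": "geo_point" }
      }
    }
  }
}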
Then copy and paste the data we converted; it basically tells the importer we want to create a new document of type eq, then the doc itself.
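In bulk-API terms that boils down to pairs of lines like these (the values are placeholders just to show the format):

POST /quakes/eq/_bulk
{ "index": {} }
{ "time": "2015-12-01T10:15:00Z", "magnitude": 3.2, "location": { "lat": 33.2, "lon": -116.5 } }
{ "index": {} }
{ "time": "2015-12-02T04:41:00Z", "magnitude": 2.7, "location": { "lat": 32.7, "lon": -117.3 } }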
Thursday, May 21, 2015
HTTP 1.1 chunked transfer.
If you inspect the HTTP response for some zipped resources (assuming we are on HTTP 1.1), you may find 2 things: there is no Content-Length header in the response, and the Transfer-Encoding is chunked.
HTTP 1.1 has a full spec for chunked encoding; check it here: http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6
If you use node.js, we can write a simple test to try out chunked transfer; here is the source code.
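The original listing is not reproduced here, but a minimal sketch of such a test server could look like this (the port and the exact Okay strings are assumptions based on the description below):

// chunked.js - node sends Transfer-Encoding: chunked automatically
// when no Content-Length is set; each write() goes out as a chunk.
var http = require('http');

http.createServer(function (req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.write('Okay1');
  res.write('Okay2');
  res.write('Okay3');
  setTimeout(function () {
    res.write('Okay4');
    res.end();   // emits the terminating zero-length chunk
  }, 3000);
}).listen(8080);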
Then try pulling the response using wget; you will notice the 3-second delay between Okay3 and Okay4.
Inspect the traffic in between and you can see the wire-level bits:
the 1st data frame returns Okay1
then the 2nd returns Okay2 and Okay3
after the 3-second delay
we get Okay4
once we call end, you can see the end trailer (the zero-length chunk).
Wireshark is smart enough to tell you 4 frames were involved in this HTTP request/response.
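On the wire, the body of that response is roughly the following (a sketch assuming one chunk per write; the chunk size is in hex and the empty 0 chunk is the end trailer):

5\r\n
Okay1\r\n
5\r\n
Okay2\r\n
5\r\n
Okay3\r\n
5\r\n
Okay4\r\n
0\r\n
\r\n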