
Thursday, October 10, 2013

Simple caching with a TTL

Spring's cache abstraction framework is hugely useful for declarative caching, as outlined in previous posts. As we start to use more caches, an inherent requirement that arises is periodic and on-demand purging. For the first kind of purge, you need a Time To Live (TTL) associated with your cache. External solutions like Ehcache provide configuration for this, along with a host of other tunables: writing to disk, disk location, buffer sizes, max limits, and so on. But what if your requirement is simpler and you don't want to marry into Ehcache just yet?
Spring's ConcurrentMapCacheFactoryBean is nicely pluggable: you can supply any backing store you want for the concurrent-map-based caching. So here we can plug in our own TTLAwareConcurrentMap. But I don't want to write TTL logic myself, right? Sure - just use the constructs available from Guava. Guava gives you a TTL-backed map via CacheBuilder, as simply as:
return CacheBuilder.newBuilder().expireAfterWrite(duration, unit).build().asMap();
All we need to do now is create a FactoryBean in Spring that is injected with the duration (and unit) of the TTL and vends out the store used by Spring's caching framework. A sample FactoryBean is TTLAwareConcurrentMapFactoryBean.
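For illustration, here is a minimal, hand-rolled sketch of the kind of store such a factory could vend. The class and method names are mine (not Spring's or Guava's), and in practice the Guava one-liner above is the better choice - this just makes the expire-after-write semantics concrete:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;

/** Minimal sketch of a TTL-expiring map, illustrating what
 *  CacheBuilder.expireAfterWrite gives us for free. Not production code. */
class TtlMap<K, V> {

    private static final class Entry<V> {
        final V value;
        final long writtenAtNanos;
        Entry(V value, long writtenAtNanos) {
            this.value = value;
            this.writtenAtNanos = writtenAtNanos;
        }
    }

    private final long ttlNanos;
    private final ConcurrentMap<K, Entry<V>> store = new ConcurrentHashMap<>();

    TtlMap(long duration, TimeUnit unit) {
        this.ttlNanos = unit.toNanos(duration);
    }

    void put(K key, V value) {
        store.put(key, new Entry<>(value, System.nanoTime()));
    }

    /** Returns null for missing or expired entries, evicting lazily on read. */
    V get(K key) {
        Entry<V> e = store.get(key);
        if (e == null) {
            return null;
        }
        if (System.nanoTime() - e.writtenAtNanos > ttlNanos) {
            store.remove(key, e); // lazy eviction on access
            return null;
        }
        return e.value;
    }
}
```

A FactoryBean would simply hold the duration/unit as injected properties and hand such a map to the ConcurrentMapCacheFactoryBean.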
For on-demand cache flushes, we can define a JMX operation on a custom CacheManager that is injected with the Spring caches at startup. On invocation, a specific cache (or all caches) can be flushed by calling Cache.clear(). Due credit to this SO question.
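The flush idea can be sketched in plain Java (names here are mine; in the real setup the values would be org.springframework.cache.Cache instances and the flush methods would carry Spring's @ManagedOperation to surface them over JMX):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of a registry that is handed the caches at startup and
 *  can clear one or all of them on demand, e.g. from a JMX console. */
class CacheFlushRegistry {

    private final Map<String, Map<?, ?>> cachesByName = new ConcurrentHashMap<>();

    void register(String name, Map<?, ?> cache) {
        cachesByName.put(name, cache);
    }

    /** Flush one cache by name; unknown names are silently ignored. */
    void flush(String name) {
        Map<?, ?> cache = cachesByName.get(name);
        if (cache != null) {
            cache.clear();
        }
    }

    /** Flush every registered cache. */
    void flushAll() {
        cachesByName.values().forEach(Map::clear);
    }
}
```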
Hope this helps!

Monday, May 27, 2013

Let's go uber!

How do you run your non-webapp maven-based java program from the command line? One might use the exec:java route and specify the main class. The only sticking point here is that the classpath will be filled with references to the user's local repository (usually NFS-mounted). I would be much more comfortable if I had the whole set of jars I depend on packaged WAR-style and ready at my disposal. In that sense, the application would be much more self-contained. It also becomes that much easier for someone to test-drive your application. Hence the concept of an "uber" jar - a unified jar that houses all classes/resources of the project along with the classes/resources of its transitive dependencies.
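One common way to build such an uber jar is the maven-shade-plugin. A minimal sketch (the version is a placeholder, and the main class is illustrative - substitute your own):

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.0</version> <!-- placeholder; pick a current version -->
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <!-- sets Main-Class so the uber jar runs with java -jar -->
                    <transformer
                        implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <mainClass>com.kilo.DriverCLI</mainClass> <!-- illustrative -->
                    </transformer>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>
```

With this bound to the package phase, `mvn package` emits a single self-contained jar alongside the regular artifact.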

Thursday, March 14, 2013

Blocked persistence nuance with GET*DATE()


I noticed an interesting (and desirable) behavior when persisting timestamps while blocked by a competing transaction, which might be useful for others as well. For unitemporal and bitemporal tables, we frequently set the knowledge_date to GETDATE() or GETUTCDATE(), but this does not guarantee that the wall-clock time at which the record was persisted to the DB is the same as the value noted in the knowledge_date column.
To illustrate this, let's say we have a table as
CREATE TABLE special_table (
special_key INT PRIMARY KEY,
special_value INT,
knowledge_date DATETIME)
And we insert a few values into it:
INSERT INTO special_table VALUES(1, 100, GETUTCDATE())
INSERT INTO special_table VALUES(2, 100, GETUTCDATE())
INSERT INTO special_table VALUES(100, 10000, GETUTCDATE())
Now, let's say, we've got two competing transactions, one a read and another a write with the read preceding the write and the read transaction taking a lot more time to finish (simulated with a WAITFOR DELAY).
Read Transaction (at isolation level repeatable read):
BEGIN TRANSACTION
SELECT GETUTCDATE() --ts1
SELECT * FROM special_table WHERE special_key = 2
WAITFOR DELAY '00:00:10'
SELECT * FROM special_table WHERE special_key = 2
SELECT GETUTCDATE() --ts2
COMMIT
Write Transaction (at isolation level read committed):
BEGIN TRANSACTION
SELECT GETUTCDATE() --ts3
UPDATE special_table
   SET special_value = special_value + 1, knowledge_date=GETUTCDATE()
 WHERE special_key = 2
SELECT GETUTCDATE() --ts4
SELECT * FROM special_table WHERE special_key = 2
COMMIT
Execute these two batches in two windows of SSMS with the read preceding the write. Since the read started before the write, ts1 + 10 ~= ts2, because the read transaction experiences no blocking. The write transaction was kicked off a little after the read (say with interval d), hence ts1 + d ~= ts3.
Question: will the knowledge_date updated be closer to ts3 or ts4?
One might think that the knowledge_date value ends up closer to ts4, when the write transaction actually gets unblocked; however, this is not the case. For SQL Server to figure out whether or not the transaction needs to be blocked (because of the default page-level locking scheme), the query has to be evaluated, and hence the value assigned to knowledge_date is computed at a time closer to ts3. So the knowledge_date timestamp is persisted to the DB at a wall-clock time closer to ts4, even though the DB records the timestamp as closer to ts3.
This can be verified from the output of the read and write batches, where there is a marked delay between the knowledge_date persisted and ts4. It becomes even more interesting when you have multiple updates in the same write transaction - some of which can proceed until the point where one gets blocked by the read - and you can then notice varying knowledge_date values across records even though they were all issued in the same transaction.
BEGIN TRANSACTION
SELECT GETUTCDATE() --ts1
UPDATE special_table
   SET special_value = special_value + 1, knowledge_date=GETUTCDATE() --ts
 WHERE special_key = 100
SELECT GETUTCDATE() --ts2
UPDATE special_table
   SET special_value = special_value + 1, knowledge_date=GETUTCDATE() --tss
 WHERE special_key = 2
SELECT GETUTCDATE() --ts3
SELECT * FROM special_table WHERE special_key = 2
COMMIT
Here, knowledge_date for key 100 would be closer to ts1 due to it not getting blocked by the read and the knowledge_date for key 2 would be closer to ts3 and away from ts2 since it was blocked by the read.
BTW, this should not haunt the trigger-based td_bl temporal tables, as the trigger only fires after the base table is updated and effectively captures the timestamp of when the base table changed (though, due to the same blocking concerns, that may not be the exact time the temporal record got persisted to the DB).
Hope this helps!

Saturday, March 9, 2013

Quick headless JAX-RS servers with CXF


If you need to vend out JSON data in a JAX-RS-compatible way with minimal setup fuss, CXF + Spring provides a good out-of-the-box solution.
The steps would be:
  1. Write your service class (interface and impl preferably)
  2. Annotate your service impl methods with
    1. @Path annotation indicating the URI on which it will serve the resource
    2. @GET/@POST indicating the HTTP method which it serves
    3. @Produces("application/json") indicating that the output format is JSON
  3. Define a jaxrs:server directive in your spring context file indicating the address and resource path on which the service is hosted
  4. Add maven dependencies of javax.ws.rs-api (for annotations), cxf-rt-core (for stubbing RS communication over a http conduit) and cxf-rt-transports-http-jetty (for embedded jetty)
and voila you are done.
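The dependencies from step 4 might look roughly like this in the pom (the groupIds are my best recollection and the versions are placeholders; check the coordinates against the current CXF documentation):

```xml
<!-- JAX-RS annotations (@GET, @Path, @Produces, ...) -->
<dependency>
    <groupId>javax.ws.rs</groupId>
    <artifactId>javax.ws.rs-api</artifactId>
    <version>2.0</version> <!-- placeholder -->
</dependency>
<!-- CXF runtime for stubbing RS communication over an http conduit -->
<dependency>
    <groupId>org.apache.cxf</groupId>
    <artifactId>cxf-rt-core</artifactId>
    <version>2.7.3</version> <!-- placeholder -->
</dependency>
<!-- embedded Jetty transport -->
<dependency>
    <groupId>org.apache.cxf</groupId>
    <artifactId>cxf-rt-transports-http-jetty</artifactId>
    <version>2.7.3</version> <!-- placeholder -->
</dependency>
```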
Concretely:
public interface SpecialService {
    String getSomeText();
}
public class SpecialServiceImpl implements SpecialService {

    @GET
    @Produces("application/json")
    @Path("/someText/")
    @Override
    public String getSomeText() {
        return "kilo";
    }
}
    <bean id="specialService" class="com.kilo.SpecialServiceImpl"/>

    <bean id="inetAddress" class="java.net.InetAddress" factory-method="getLocalHost" />

    <jaxrs:server id="specialServiceRS"
        address="http://#{inetAddress.hostName}:${com.kilo.restful.port}/specialServiceRS">
        <jaxrs:serviceBeans>
            <ref bean="specialService" />
        </jaxrs:serviceBeans>
    </jaxrs:server>
And now hit http://yourhostname:yourportnum/specialServiceRS/someText to get the response as "kilo". If you examine the request via some developer tools, you will see that the content type is application/json.
CXF JAX-RS uses an embedded Jetty as the HTTP container, so we don't really need a Tomcat for this setup. That might bring up the question of Tomcat vs. Jetty overall; here are my thoughts:
Tomcat
  • Lightweight
  • Servlet 3 style async threading in the works
  • Known beast in terms of configuration
Jetty
  • Even more lightweight
  • Implements servlet 3 style async thread allocation (like node) and hence more responsive and efficient
  • Easy to have an embedded server with cxf (embeddability is synonymous with jetty)
  • Ability to have multiple simple java processes that act as headless servers quickly
Overall, I believe we should give Jetty a chance and see how it performs. If it ever lets us down, it is easy to take the process and house it in a Tomcat container.

We will try to cover some more involved use cases of passing in inputs via JAX-RS, dealing with complex objects, CORS and GZIP in subsequent posts (the samples already have them explained).


Tuesday, January 29, 2013

CORS filters choices


Have you ever had to work around JavaScript's same-origin policy and felt the need to take a shower with industrial-grade disinfectant immediately after? Well, fear no more: HTML5 to the rescue with CORS - Cross-Origin Resource Sharing. Almost every conceivable hack web developers used to resort to is being standardized as a feature spec in HTML5, and I thought CORS was a good one. While there is good merit in the same-origin policy, with the ubiquity of data-vending servers being reused by other data-vending servers, there needs to be a straightforward solution for transitive data sharing. The workarounds - setting the document.domain property, using JSONP, or using crossdomain.xml (Flex only) - were not seamless. By making this an HTML5 spec conveyed by means of headers, there is now a definite method to the madness. This tutorial does a very good job of explaining the nuances related to preflight, credentials (cookies), and caching policies, along with usage via jQuery. The obvious concern of security is aptly addressed in the following snippet from the tutorial.

A WORD ABOUT SECURITY

While CORS lays the groundwork for making cross-domain requests, the CORS headers are not a substitute for sound security practices. You shouldn’t rely on the CORS header for securing resources on your site. Use the CORS headers to give the browser directions on cross-domain access, but use some other security mechanism, such as cookies or OAuth2, if you need additional security restrictions on your content.
What was missing was a good server-side filter to enable this for a typical Java webapp. While the spec is pretty straightforward for the vanilla use cases, it quickly gets involved once we get into the details. There are two alternatives out there that I could find:
  1. com.thetransactioncompany.cors.CORSFilter
    1. Pros:
      1. Hugely popular
      2. Additional support for allowing subdomains in allow list
      3. Frequently updated for latest specs and bug fixes
      4. Liberal license (Apache)
    2. Cons:
      1. Not from a stable house
  2. org.eclipse.jetty.servlets.CrossOriginFilter
    1. Pros:
      1. From Jetty
      2. Liberal license (Eclipse)
    2. Cons:
      1. Not very popular
      2. May seem somewhat unintuitive if we use jetty related artifacts in a tomcat container environment
      3. Not aggressively updated
So, overall, it seems that the CORS filter from dzhuvinov is the winner, but I would be happy to know if others feel differently.
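For reference, wiring the dzhuvinov filter into a webapp's web.xml looks roughly like this (the init-param names are as I recall them from the filter's documentation, and the values are examples - double-check both against the current docs):

```xml
<filter>
    <filter-name>CORSFilter</filter-name>
    <filter-class>com.thetransactioncompany.cors.CORSFilter</filter-class>
    <!-- param names per the filter's docs; values are illustrative -->
    <init-param>
        <param-name>cors.allowOrigin</param-name>
        <param-value>http://trusted.example.com</param-value>
    </init-param>
    <init-param>
        <param-name>cors.supportedMethods</param-name>
        <param-value>GET, POST, OPTIONS</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>CORSFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
```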

Thursday, November 1, 2012

Spring PropertyPlaceholderConfigurer Tip

With the increased scrutiny on keeping passwords in files rather than in source code, our reliance on PropertyPlaceholderConfigurer (PPC) has greatly increased. An important property of a PPC is ignoreUnresolvablePlaceholders, which defaults to false. With it set to false, if any placeholder passed to the PPC cannot be resolved, it throws an exception and the entire context load fails.

With multiple PPCs in a Spring context (many of our own, plus a few from the transitive contexts we load), it helps to understand the interplay between them. Ideally, every module that defines a PPC should set its ignoreUnresolvablePlaceholders to true - and most do, since otherwise a placeholder meant for some other PPC would fail the context load. However, the topmost context may still need to know whether all properties from all contexts were resolved: with ignoreUnresolvablePlaceholders set to true everywhere, you may end up loading the context with placeholders left unresolved.

The order in which the different PPCs are consulted is determined by the order property. If it is not defined, it defaults to Integer.MAX_VALUE. If order is not defined in any of the PPCs, the ordering is slightly non-deterministic but will mostly correspond to the order in which they are defined. I say mostly because Spring can choose to eagerly initialize beans to resolve circular dependencies. This is not limited to PPCs; it also applies to vanilla beans referring to other beans - a topic I believe we don't really pay much attention to, but one that bites when you least expect it. Anyway, there is a Spring ticket out there to have an "order" for vanilla beans as well.

Coming back to ensuring that the context load succeeds only when all placeholders have been resolved: you can define an empty PPC that acts as the signaller for any unresolved placeholders.
We will not define any locations for this PPC, and will give it an order that is the highest among all - say Integer.MAX_VALUE - along with the default value of ignoreUnresolvablePlaceholders (false). Since this is the last one to be triggered, any unresolved placeholders will be flagged and the context load stopped. Of course, this rests on the premise that the contexts (from different jars) loaded by the topmost context do not define PPCs without an order (thereby defaulting to Integer.MAX_VALUE), which would get consulted after the empty PPC in the topmost context.
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans-3.0.xsd">

    <bean id="ppc1"
        class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
        <property name="locations">
            <list>
                <value>file:///tmp/p1.properties</value>
            </list>
        </property>
        <property name="ignoreUnresolvablePlaceholders" value="true" />
        <property name="order" value="10000" />
    </bean>

    <bean id="ppc2"
        class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
        <property name="locations">
            <list>
                <value>file:///tmp/p2.properties</value>
            </list>
        </property>
        <property name="ignoreUnresolvablePlaceholders" value="true" />
        <property name="order" value="10001" />
    </bean>

    <bean id="overallPPC"
        class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
        <property name="order" value="20000" />
    </bean>

    <bean id="simpleBean" class="com.kilo.SimpleBean">
        <property name="property1" value="${com.kilo.property1}" />
        <property name="property2" value="${com.kilo.property2}" />
        <property name="property3" value="${com.kilo.property3}" />
        <property name="property4" value="${com.kilo.property4}" />
    </bean>

</beans>

So, all you nice people out there giving out your jars with Spring contexts to others: please ensure that you set an order on your PPC other than the default of Integer.MAX_VALUE. A good rule of thumb could be to use 10000-90000 for our own non-top-level PPCs.

References:

1. http://tarlogonjava.blogspot.in/2009/02/tips-regarding-springs.html
2. Sample setup available at https://github.com/kilokahn/spring-testers/tree/master/spring-ppc-tester


Monday, October 29, 2012

Classpath of a running java process spawned via maven exec:java

I ran into a peculiar situation where I was using the maven exec:java plugin to launch a java process. The plugin does the heavy lifting of creating the correct classpath comprising its dependencies (much like what JUnit does). In this case, I suspected java and maven of conspiring against me by picking up a stale jar. With Tomcat (or other containers) it is usually easy to look at WEB-INF/lib for the jars that will be used, but here the command line didn't yield much information.

> ps -ef | grep produser| grep java
munshi     953 27405  0 01:21 pts/0    00:00:00 grep java
munshi   31771 31758 11 01:14 pts/0    00:00:50 /usr/local/java/jdk/bin/java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,address=8765,suspend=n -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8219 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -classpath /tools/apache/maven/boot/plexus-classworlds-2.4.jar -Dclassworlds.conf=/tools/apache/maven/bin/m2.conf -Dmaven.home=/tools/apache/maven org.codehaus.plexus.classworlds.launcher.Launcher --settings /u/produser/.m2/settings.xml clean compile exec:java -Dexec.mainClass=com.kilo.DriverCLI -Dexec.args=SOMETHING 

The -classpath /tools/maven/boot/plexus-classworlds-2.4.jar didn't help much here. In JUnit, a whole file is created with the list of jars being used, and it is very easy to inspect. There are intrusive ways to print the classpath of an application: either set -verbose:class when starting the process, or explicitly request the system property "java.class.path" in code. I fired up JVisualVM to see if it could help me examine the system property non-intrusively (since I really didn't want to restart the application for investigative reasons - just yet). Again, it showed /tools/maven/boot/plexus-classworlds-2.4.jar as the value. That's when I came across an alternate way: use the lsof command to see which files are open. Since the classloader has to refer to the JAR from which a class is being loaded, the JAR shows up in the list of open files. However, I am not sure this is a surefire way to get things done - what if the JVM decides to close its handle to the JAR as a way of reducing the number of open file descriptors?
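For completeness, the in-code variant mentioned above is trivial (class name is mine). As the JVisualVM experiment showed, for an exec:java process this reports only the plexus-classworlds bootstrap jar, which is exactly why it was not enough here:

```java
/** Tiny utility that prints what the JVM reports as its classpath,
 *  one entry per line. */
class ClasspathDumper {
    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path");
        // entries are separated by ':' on Unix, ';' on Windows
        for (String entry : classpath.split(java.io.File.pathSeparator)) {
            System.out.println(entry);
        }
    }
}
```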

> lsof -p 31771 | less
COMMAND   PID   USER   FD   TYPE             DEVICE SIZE/OFF     NODE NAME
java    31771 produser  cwd    DIR               0,70     4096 13085795 /u/produser/testproject (filer.kilo.com:/vol/proj1/proj-302)
java    31771 produser  rtd    DIR              253,0     4096        2 /
java    31771 produser    txt    REG              253,0     7630  1465254 /usr/local/java/jdk1.7.0.04/bin/java
java    31771 produser  mem    REG              253,0   156872  1441803 /lib64/ld-2.12.so
java    31771 produser  mem    REG              253,0  1922112  1441807 /lib64/libc-2.12.so
java    31771 produser  mem    REG              253,0    22536  1444029 /lib64/libdl-2.12.so
java    31771 produser  mem    REG              253,0   145720  1441859 /lib64/libpthread-2.12.so
java    31771 produser  mem    REG              253,0   598800  1444297 /lib64/libm-2.12.so
java    31771 produser  mem    REG              253,0    47064  1441866 /lib64/librt-2.12.so
java    31771 produser  mem    REG              253,0   113952  1444031 /lib64/libresolv-2.12.so
java    31771 produser  mem    REG              253,0   124624  1444032 /lib64/libselinux.so.1
java    31771 produser  mem    REG              253,0    17256  1444294 /lib64/libcom_err.so.2.1
java    31771 produser  mem    REG              253,0    12592  1444030 /lib64/libkeyutils.so.1.3
java    31771 produser    mem    REG              253,0   915104  1444295 /lib64/libkrb5.so.3.3
java    31771 produser  mem    REG              253,0    43304  1444292 /lib64/libkrb5support.so.0.1
java    31771 produser  mem    REG              253,0   181608  1444293 /lib64/libk5crypto.so.3.1
java    31771 produser  mem    REG              253,0   268944  1444296 /lib64/libgssapi_krb5.so.2.2
java    31771 produser  mem    REG              253,0     6398  1361340 /usr/local/java/jdk1.7.0.04/jre/lib/amd64/librmi.so
java    31771 produser  mem    REG              253,0  1023488  1361282 /usr/local/java/jdk1.7.0.04/jre/lib/ext/localedata.jar
java    31771 produser  mem    REG              253,0  2476995  1361287 /usr/local/java/jdk1.7.0.04/jre/lib/resources.jar
java    31771 produser  mem    REG              253,0     8934  1361283 /usr/local/java/jdk1.7.0.04/jre/lib/ext/dnsns.jar
java    31771 produser  mem    REG               0,71  1501575  1312261 /u/produser/.m2/repository/com/google/guava/guava/10.0.1/guava-10.0.1.jar (filer.kilo.com:/vol/home2/produser)

References:

1. http://thilinamb.wordpress.com/2009/07/01/analyse-the-classpath-of-a-running-java-program/

Tuesday, October 9, 2012

SQLServer Plan-Fu

Like google-fu, here is my MS SQL Server plan-fu :D

Finding out the cached plan handle for a query. You need some distinguishing text for the query - which I refer to as MyMarker. Word of caution: this is computationally intensive, so exercise discretion when running it - otherwise you might hear from your friendly neighborhood DBA-man :)
     SELECT cache_plan.plan_handle, cache_plan.objtype, cache_plan.size_in_bytes,
            cache_plan.cacheobjtype, cache_plan.usecounts, sql_text.text
       FROM sys.dm_exec_cached_plans as cache_plan WITH (NOLOCK)
OUTER APPLY sys.dm_exec_sql_text (cache_plan.plan_handle) as sql_text
      WHERE sql_text.text like N'%MyMarker%' AND cache_plan.cacheobjtype='Compiled Plan' AND cache_plan.objtype='Prepared'

To see the query plan using the plan_handle from above
SELECT query_plan
  FROM sys.dm_exec_query_plan (0x06003500F43BCC1940A17B9E010000000000000000000000)

To kick out the bad boy from the cache
DBCC FREEPROCCACHE (0x0600050093BEF30740014500030000000000000000000000)

To see the currently running query for a given spid
DBCC INPUTBUFFER (<spid>)

To see all the connections from a given login (to know the spid)
SELECT * FROM master..sysprocesses WHERE loginame='someuser'

Friday, October 5, 2012

Eclipse importing plugins and features from existing installation

Eclipse Juno released its SR1 last week, so now seemed the right time to move away from our trusted Indigo SR2 to the Juno world. (Typically, we want to wait for an SR1 to ensure that all nagging issues with the new release are tackled, with fixes or workarounds available from the community. Yes, this means we are not on the bleeding edge, but caution in this case is well-advised - after all, we wouldn't want some nagging issue in the new release to set us back in productivity!) Update: Even the SR1 sucked!! Check out this if you don't believe me!

But what about the umpteen plugins I had already installed with Indigo? You can either be lazy and wait for someone to tar up an installation with most of the required plugins and then add the missing ones again... or you can ask Eclipse to help you out. Starting from Indigo, Eclipse can import the plugins and features from another installation. So you could download a vanilla copy of the JEE version from Eclipse's site and bring it up. Next, via File -> Import -> Installation -> From existing Installation, point it to where your previous Eclipse installation is present and presto - it will identify and import all compatible plugins. I have read about problems when plugins were incompatible, but luckily for me all plugins were compatible.

The code recommender and the JDT improvements are the two features of Juno that I am looking forward to the most!

References:

1. http://stackoverflow.com/a/11264964/874076

2. http://eclipsesource.com/blogs/2012/06/27/top-10-eclipse-juno-features/

Wednesday, October 3, 2012

Details of long running and long-forgotten processes

Sometimes we start off a new process, and after quite some time wish to know details of how it was started: things like the current working directory, the complete command line, and the environment variables used. This usually happens for programs kicked off as a sudo user, after you have long since quit the shell from which they were invoked. In my specific case, I was looking for the current working directory (to see where it would dump core) and the environment variables in use. In Linux, this is available in the /proc/<pid>/ files.

I started browsing the other entries and was amazed at the amount of useful information transparently provided by Linux. Some notable ones are limits, statm, and task. The reference link covers the other interesting entries.

Here is the story:

I remember that I started the process on a particular host as produser, so log on there:

> sudo -u produser ssh localhost

I remember that it is a java process - so get a handle on the process id:
> ps -ef | grep java | grep produser 
produser 8113  7983  0 02:14 pts/10   00:00:00 grep java
produser 21765 21750  0 Aug31 ?        01:46:45 /usr/local/java/jdk/bin/java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,address=8765,suspend=n -Dcom.sun.management.jmxremote.port=8219 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -classpath /tools/apache/maven/boot/classworlds-1.1.jar -Dclassworlds.conf=/tools/apache/maven/bin/m2.conf -Dmaven.home=/tools/apache/maven org.codehaus.classworlds.Launcher "--settings" "/u/produser/.m2/settings.xml" "clean" "compile" "exec:java" "-Dexec.mainClass=com.kilo.DriverCLI" "-Dexec.args=SOMETHING"

Now check the current working directory:
> ls -al /proc/21765/cwd
lrwxrwxrwx 1 produser produser 0 Oct  1 08:04 /proc/21765/cwd -> /u/produser/testproject

Now for the environment settings:
> cat /proc/21765/environ
#All environment variables were listed - elided for privacy reasons

Other interesting info:
> cat /proc/21765/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            10485760             unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             16341                16341                processes
Max open files            65536                65536                files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       256591               256591               signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

> cat /proc/21765/task/21771/status
Name: java
State: S (sleeping)
Tgid: 21765
Pid: 21771
PPid: 21750
TracerPid: 0
Uid: 5307 5307 5307 5307
Gid: 5307 5307 5307 5307
Utrace: 0
FDSize: 256
Groups: 5307 6135 6584
VmPeak: 11245832 kB
VmSize: 11243768 kB
VmLck:        0 kB
VmHWM:  3575560 kB
VmRSS:  2966532 kB
VmData: 11064932 kB
VmStk:       88 kB
VmExe:        4 kB
VmLib:    15324 kB
VmPTE:     6360 kB
VmSwap:        0 kB
Threads: 31
SigQ: 0/256591
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000004
SigIgn: 0000000000000000
SigCgt: 2000000181005ccf
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Cpus_allowed: ff
Cpus_allowed_list: 0-7
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 8189
nonvoluntary_ctxt_switches: 933

References:

1. http://www.kernel.org/doc/man-pages/online/pages/man5/proc.5.html

Tuesday, September 25, 2012

Generating AS objects for AMF remoting

We know that Flex/Flash is a technology we wish to move away from in favor of HTML5. While HTML5 itself doesn't seem to be quite there with what it promised, it is definitely going strong, and with Adobe itself saying that HTML5 is the future, newer projects are not using Flex/Flash. There is, however, a sizeable population of existing projects which use Flex and can benefit from the nice tools already available. One of them is GraniteDS. While we generally use BlazeDS as the remoting mechanism, when it comes to helping developers with cross-language quirks, GraniteDS is quite helpful. With our typical application stack, we have (and want to continue to have) the bulk of the business logic in a 3-tier JEE app, with a Flex UI talking to it via a remoting message broker, and Spring thrown in to access remote services nicely. The trouble comes in when we talk beyond primitives being passed back and forth. I am by no means a Flex expert, but it was a pain to hand-roll the AS classes mirroring the Java classes that were needed on the Flex UI for rendering purposes. Here is where GraniteDS, with its maven codegen plugin, stepped in. You can model this as a maven dependency for your UI module and have the packaged swc included. Let's work it out with an example:

Say you have domain objects called SpecialObject and User, and now you have to display a combination of these in the UI. You create a view object called UserSpecialObjectView which is crafted by your viewing service call. This needs to be passed over to the Flex side (without having to hand-roll it again in AS, and without having to worry about changes to the dependent Java classes). We define a module that is supposed to create the AS view objects (called graniteds-tester-vo in our example). In the pom we reuse the configuration given in the documentation and ask it to generate AS classes for all Java classes in the said package; it creates a nice swc that can be included in the UI module.

One of my colleagues asked how the plugin handles customizations that may be needed in the generated classes. The plugin creates two classes for each class it encounters: a base class and an extension. The base is out of bounds for modifications (it lives in target/generated-sources), and any changes to it are overwritten on each build. The extension is what is available for customization; with the configuration in the example, it goes and sits in src/main/flex and should be checked in. The funny thing here is that the code in src/main/flex depends on target/generated-sources. In Eclipse this might show up as a warning/error, but the Maven command line handles it correctly, because maven recognizes source generation as a valid phase in the build lifecycle. It is only because the M2Eclipse plugin cannot handle this that it shows up as an error/warning in Eclipse. However, once you do a build from the command line, everything shows up correctly in Eclipse as well.

References:

Tuesday, September 18, 2012

Sudden Death

We often have scenarios where we need nimble restarts for our web applications. (The reasons for the restarts can be quite creative - from stale file handles to a strategy for dealing with leaky libraries.) Let's take Tomcat as our container. In most cases, we rely on catalina.sh stop -force to bring our application to a halt and then proceed to start it again. Internally, catalina.sh stop -force does a kill -9 on $CATALINA_PID. However, the kill itself is not synchronous; even a kill -9 can stall if the process is stuck in an uninterruptible kernel operation. A snippet suggested in the references which can work is something like:
kill -9 $CATALINA_PID; while kill -0 $CATALINA_PID 2>/dev/null; do sleep 1; done

kill with signal 0 is special: no signal is actually delivered, but the error checking is still performed, so it can be used to test whether a process is up. It returns 0 (a good return code) if it is able to signal the other process and non-zero if it can't (either due to an invalid PID or due to insufficient permissions). catalina.sh already has such a while loop for the regular stop; however, when forced, it only does the kill -9, and sometimes it takes a non-trivial amount of time (say 10 seconds) for the process to completely halt. Was wondering if it would make sense for Tomcat to include the loop in its stop -force path as well? Have logged a Bugzilla ticket for it as an enhancement to Tomcat.
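The kill -0 semantics are easy to demonstrate with a throwaway process. A minimal sketch - the sleep stands in for the Tomcat JVM; note that in the real case the shell running the check is not the parent of the JVM, so the explicit wait to reap the child is only needed here:

```shell
# Probe a process with `kill -0`: no signal is delivered, only the
# "can I signal this PID?" permission/existence check is performed.
sleep 60 &
PID=$!

kill -0 "$PID" 2>/dev/null && echo "alive"    # exit code 0: process exists

kill -9 "$PID"
wait "$PID" 2>/dev/null    # reap the child so the PID really disappears
                           # (needed only because this shell is the parent)

kill -0 "$PID" 2>/dev/null || echo "gone"     # non-zero: PID no longer valid
```

Prints "alive" before the kill and "gone" after, which is exactly the condition the while loop above polls for.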

References:

Friday, September 14, 2012

Eclipse Debugger Step filtering

Ever noticed the T kind of an arrow in the Eclipse debug tab?


It is called "Use Step Filters" and seems a rather useful feature. Say you have code like:
            specialObjects.add(new SpecialClass1(new SpecialClass2(
                    new SpecialClass3(i, String.valueOf(i), Boolean
                            .valueOf(String.valueOf(i))))));

When you wish to step into the constructor of SpecialClass2 using F5 (Step Into), Eclipse makes you jump through hoops, taking you on a tour of class loaders, security managers, their internals, their grand-daddy's internals (sun.misc.* and com.sun.*), whereas what you really care about are your own classes. Enter step filters. Configure them via Window > Preferences > Java > Debug > Step Filtering as shown, and voila.


A "Step Into" operation skips classes from the packages listed out there. There are a few other options as well like filtering out getters/setters and static initializers, etc. which can also come in handy.

Happy Eclipsing!

References:

Thursday, September 13, 2012

Checking stdout and stderr for a process already running

Did you ever run into a scenario where you ran a job (either a Perl process or a Java process), forgot to redirect its stdout and stderr to files, and now your terminal is gone too? After some time, you suspect the process is not working, but you cannot see the error messages it is generating. Without resorting to snooping techniques like ttysnoop or X-window solutions, I came across a pretty neat way of intercepting those calls (tested on Linux):
strace -ewrite -p <pid>

The strace command traces all the system calls and signals of the process; -ewrite restricts the trace to the write calls, which include whatever the process writes to stdout (fd 1) and stderr (fd 2). Turned out to be quite useful in a crunch scenario, but one should always think upfront about where stdout and stderr should be redirected.
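For completeness, here is the redirection that avoids the scramble in the first place. A minimal sketch - long_running_job is a hypothetical stand-in for your Perl or Java process:

```shell
# Hypothetical stand-in for the real job; it writes to both streams.
long_running_job() {
    echo "processed batch 1"            # normal progress goes to stdout
    echo "lookup failed, retrying" >&2  # errors go to stderr
}

# Capture the two streams in separate files so they survive the terminal.
long_running_job > job.out 2> job.err &
wait $!

cat job.out   # → processed batch 1
cat job.err   # → lookup failed, retrying
```

With the streams on disk, losing the terminal costs you nothing, and there is no need for strace heroics later.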

Hope this helps!

References:

Wednesday, September 12, 2012

Real size of objects

What does the size of an object really mean in Java? The usual answer of counting the flat or shallow size may not always be what we are looking for. Say we are vending out a lot of data from our application to an external application, and in the process we happen to die with an OOME; it suddenly becomes relevant to know the size of the overall data we are vending out. Here is where I stumbled on the concept called "retained size".

From YourKit's site:
Shallow size of an object is the amount of memory allocated to store the object itself, not taking into account the referenced objects. Shallow size of a regular (non-array) object depends on the number and types of its fields. Shallow size of an array depends on the array length and the type of its elements (objects, primitive types). Shallow size of a set of objects represents the sum of shallow sizes of all objects in the set.

Retained size of an object is its shallow size plus the shallow sizes of the objects that are accessible, directly or indirectly, only from this object. In other words, the retained size represents the amount of memory that will be freed by the garbage collector when this object is collected.

The site gives a good pictorial representation that distinguishes the two. Now, YourKit is commercial third-party software. For us mere mortals, we have to make do with JVisualVM. So I fired up jvisualvm to perform a heap dump. Note that memory-related operations using jvisualvm need to be done on the same host as the process and cannot be done remotely. Once the heap dump was done, take a deep breath, hit the "Find 20 biggest objects by retained size" button, and leave for the day :)

If you are lucky, jvisualvm would have been done computing the retained size for the heap dump by the time you are back the next day. I was trying this out for a heap of size 1.3G and it didn't complete even after consuming ~75 hours of CPU time. Similar complaints are heard in forums.

Next stop: YourKit Java Profiler 11.0.8. I got an evaluation licence valid for 15 days and proceeded to download the profiler tar (~68 MB) for Linux. I loaded the snapshot taken from jvisualvm into this and in a few seconds, the snapshot loading action was done. There was a question on CompressedOops in the 64-bit JVM which was best explained in detail here.

The default view itself shows the number of objects, the shallow size and the retained size upfront. It uses some heuristics to guess these sizes and you have the option of getting the exact values by hitting another button which refines them to the more accurate values. Interestingly, these differed only minutely.



Right click on one of the culprits and you know all the retained instances. It has a few other inspections that can be done. E.g.: Duplicate strings caused by careless use of new String(), duplicate arrays, sparse arrays, zero length arrays and other oddities along with a brief explanation.

Overall, the responsiveness seemed much better than jvisualvm's. If anyone has used JProfiler, do share whether such a feature exists there and how intuitive/responsive it is.

Now, over to MAT from Eclipse. This is a free tool, which is purportedly as fast as YourKit. So I downloaded the Linux archive (~46 MB) and fired it up against the heap dump generated earlier.






Again, in a couple of minutes, MAT was able to create an overview which gave the biggest objects by retained sizes.




References:

Tuesday, September 11, 2012

Maven release plugin on a multi-module project having different release cycles and hence versions

With a Git migration headed our way, we were testing the readiness of our utility scripts for the new SCM. The maven-release-plugin advertises support for Git as an SCM, and we were hoping it would be straightforward.

The project structure was like

  • parent


    • child1

    • child2

    • child3

We modified the pom file of the parent as:
    <scm>
        <connection>scm:git:ssh://gitserve/test/parent.git</connection>
        <developerConnection>scm:git:ssh://gitserve/test/parent.git</developerConnection>
        <url>http://gitweb/?p=test/parent.git;a=summary</url>
    </scm>

FTR, the connection tag is the read link and developerConnection is the read-write link. This supports cases where reads and writes happen over different transports - say HTTP for reads and SSH only for writes.

Also, maven-release-plugin 2.3.2 fixes some significant bugs, so make sure you use this version of the plugin. Best to set it up via pluginManagement.

We follow the standard pattern of moving changes from trunk -> branch -> tag in the SVN world.
With the move to Git, the tag is a real tag - a genuinely read-only point in the history of the project - so there is no need for the commit hooks that were added to support the notion of tags in the SVN world. Other open source projects like Spring and Sonatype have adopted a similar approach.

Now comes the interesting part: doing a minor release for one of the sub-modules - say child2. With the regular invocation of maven release:branch, it complained of an invalid Git location:
ssh://gitserve/test/parent.git/child2

Clearly, this was not a valid Git location. I finally stumbled on an SO question which explains that this is actually some intelligence on the part of the release plugin: in a multi-module project, it derives a child's SCM URL by appending the module name to the parent's SCM URL. The workaround? Just redefine the scm tags in the individual sub-modules, and voila - you are done!
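To make the workaround concrete, here is roughly what the override looks like in child2's pom (a sketch reusing this post's gitserve URLs - the point is simply that the child repeats the parent's scm block verbatim instead of letting the plugin append /child2 to it):

```xml
<!-- In child2/pom.xml: repeat the scm block so the release plugin does not
     derive ssh://gitserve/test/parent.git/child2 from the parent's URL. -->
<scm>
    <connection>scm:git:ssh://gitserve/test/parent.git</connection>
    <developerConnection>scm:git:ssh://gitserve/test/parent.git</developerConnection>
    <url>http://gitweb/?p=test/parent.git;a=summary</url>
</scm>
```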

Yes, this was a long, winding rant, but hopefully someone will benefit from it!