Tuesday, October 23, 2012

Summary of use cases for using Hadoop in the enterprise


  • Need to analyze / summarize / query / store unstructured or semi-structured data. Examples:
    • logs
    • sensor data
    • emails
    • blogs
    • web content
    • DOCs / PDFs
    • images
    • videos
  • Need to support multiple data sources that produce very disparate, unstructured data
  • Rate at which data is generated is very high, continuous and unpredictable (say 1 TB per day or per cycle)
  • Data to be analyzed is massively distributed, e.g. logs
    • Not possible to intercept the data being generated at a single / known source
  • Using traditional ETL batch processes to summarize data is too time-consuming, impractical or expensive
    • Moving all the data to one storage area network (SAN) or ETL server becomes infeasible at big data volumes.
    • Even if you can move the data, processing it is slow, limited to SAN bandwidth, and often fails to meet batch processing windows.
  • There is a need to run analytics on raw data
    • Queries that will be run on raw data are not predetermined, so the criteria / parameters for summarizing the data are not known upfront
  • Huge amounts of data need to be retained on cheap commodity hardware
    • Using the expensive storage typically used by an RDBMS is not feasible
  • To be continued

Friday, October 12, 2012

Demystifying AOP, Getting started with LTW (load time weaving)

Using AspectJ AOP, especially with LTW (load time weaving), is often shrouded in mystery.
I thought I would write up a short note on getting started with AspectJ AOP LTW.

Here goes... all you need is a simple Eclipse Java project containing the following:
  1. an aspect source file - MySimpleLoggerAspect.java
  2. a sample service which will get AOP-ed - SampleService.java
  3. a test class with main method - Tester.java
  4. an AspectJ LTW config file - META-INF\aop.xml
  5. aspectjrt-1.7.0.jar and aspectjweaver-1.7.0.jar on your project classpath


MySimpleLoggerAspect.java is the logging aspect. For further details about writing AspectJ aspects, please refer to www.eclipse.org/aspectj/docs.php. The listing of the simple aspect in Java is below.

----------------------------------------------------------------
package com.ghag.rnd.aspects.ltw;

import org.aspectj.lang.ProceedingJoinPoint;

import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Pointcut;

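@Aspect
public class MySimpleLoggerAspect {

    // Matches the execution of every method in the classes the weaver includes (see aop.xml).
    @Pointcut("execution(* *(..))")
    public void myTraceCall() {
    }

    // Logs before and after each matched method and passes the return value through.
    // Note: getTarget() is null for static methods, so this assumes instance methods only.
    @Around("com.ghag.rnd.aspects.ltw.MySimpleLoggerAspect.myTraceCall()")
    public Object myTrace(ProceedingJoinPoint joinPoint) throws Throwable {
        System.out.println("myTrace:before call "
                + joinPoint.getTarget().getClass().getName()
                + "." + joinPoint.getSignature().getName());
        Object retVal = null;
        try {
            retVal = joinPoint.proceed();
        } finally {
            // Runs even if the method throws; retVal stays null in that case.
            System.out.println("myTrace:after call "
                    + joinPoint.getTarget().getClass().getName()
                    + "." + joinPoint.getSignature().getName() + " retval=" + retVal);
        }
        return retVal;
    }
}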
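
The around advice above follows a "log before, proceed, log after in a finally block" shape. To see that shape in isolation, without any weaver involved, here is a rough sketch using a plain JDK dynamic proxy. This is only an analogy: AspectJ LTW rewrites bytecode at class load time and does not use proxies. The class AroundAdviceSketch, the Service interface and the traced method are hypothetical names made up for this illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class AroundAdviceSketch {

    /** Hypothetical interface; JDK proxies can only wrap interfaces. */
    public interface Service {
        String doService(String in);
    }

    /** Wraps target so that every call is recorded before and after, like myTrace does. */
    public static Service traced(Service target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = target.getClass().getName() + "." + method.getName();
            log.add("before call " + name);
            Object retVal = null;
            try {
                // Plays the role of joinPoint.proceed() in the aspect.
                retVal = method.invoke(target, args);
            } catch (InvocationTargetException e) {
                // Rethrow the target's own exception rather than the reflection wrapper.
                throw e.getCause();
            } finally {
                log.add("after call " + name + " retval=" + retVal);
            }
            return retVal;
        };
        return (Service) Proxy.newProxyInstance(
                Service.class.getClassLoader(), new Class<?>[] {Service.class}, handler);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Service service = traced(in -> in, log);
        service.doService("hello");
        log.forEach(System.out::println);
    }
}
```

The practical difference is that the proxy only intercepts calls made through the wrapper object, whereas the woven aspect applies to the real SampleService class with no change to the calling code.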




SampleService.java is as simple as this:
----------------------------------------------------------------
package com.ghag.rnd.aspects.sample;

public class SampleService {
    public String doService(String in) {
        System.out.println("inside doService");
        return in;
    }
}




Tester.java listing is a few more lines of code:
----------------------------------------------------------------
package com.ghag.test;

import com.ghag.rnd.aspects.sample.SampleService;

public class Tester {
    public static void main(String[] args) {
        new SampleService().doService("Ganesh Ghag");
    }
}




And finally, the META-INF\aop.xml listing is simple and self-explanatory, especially the package names ;-)
----------------------------------------------------------------
<aspectj>
    <aspects>
        <aspect name="com.ghag.rnd.aspects.ltw.MySimpleLoggerAspect" />
    </aspects>

    <!--  <weaver options="-verbose -debug -showWeaveInfo"> -->
    <weaver>
        <include within="com.ghag.rnd.aspects.sample.*" />
        <include within="com.ghag.rnd.aspects.ltw.*" />
    </weaver>
</aspectj>
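
The weaver picks up aop.xml as a classpath resource, so a common stumbling block is the file not actually ending up on the runtime classpath. A small sketch like the one below can confirm what the class loader sees. The class AopXmlCheck and the method locate are hypothetical helper names, not part of AspectJ; note that the resource name uses a forward slash regardless of platform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class AopXmlCheck {

    /** Returns every location at which the given resource is visible to the loader. */
    public static List<URL> locate(ClassLoader loader, String resourceName) {
        try {
            return Collections.list(loader.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<URL> found = locate(AopXmlCheck.class.getClassLoader(), "META-INF/aop.xml");
        if (found.isEmpty()) {
            System.out.println("META-INF/aop.xml is NOT visible on the classpath");
        } else {
            found.forEach(url -> System.out.println("found " + url));
        }
    }
}
```

If nothing is found, it is worth checking that the META-INF folder sits inside a source folder in the Eclipse project, so that it gets copied to the output directory.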





Now when you run Tester.java, make sure the following parameter is supplied as a JVM argument (substitute the path to the jar in your own dev environment):
-javaagent:/path/to/aspectj/aspectjweaver-1.7.0.jar
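
If the aspect does not seem to fire, the first thing to rule out is that the -javaagent argument never reached the JVM (for example, it was entered as a program argument in the Eclipse run configuration by mistake). Here is a minimal sketch that inspects the running JVM's arguments; AgentCheck and hasJavaAgent are hypothetical names chosen for this example.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class AgentCheck {

    /** True if any JVM argument is a -javaagent option that mentions the given jar name fragment. */
    public static boolean hasJavaAgent(List<String> jvmArgs, String jarNameFragment) {
        for (String arg : jvmArgs) {
            if (arg.startsWith("-javaagent:") && arg.contains(jarNameFragment)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The arguments passed to the JVM itself, not the program arguments in args.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("aspectjweaver agent present: "
                + hasJavaAgent(jvmArgs, "aspectjweaver"));
    }
}
```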

That's it, folks! When you run Tester.java, the SampleService call gets AOP-ed and produces the following output:
----------------------------------------------------------------
myTrace:before call com.ghag.rnd.aspects.sample.SampleService.doService
inside doService
myTrace:after call com.ghag.rnd.aspects.sample.SampleService.doService retval=Ganesh Ghag


Getting started with AspectJ AOP with LTW is that easy, folks!