Tuesday, August 26, 2014

A Snowball


It was a cold winter evening. I was at work, sitting at my laptop. Our team had many talented people, but that evening I was sitting with two extremely talented ones: a product manager and a data scientist, a PhD guy. I was working on a task when the PM asked me when I expected the feature to be done. I started thinking. Then the PhD guy joined the discussion and started explaining how complicated the task was… I soon realized that the PM had no idea what the PhD guy was talking about. But he tried to stay active in the discussion and asked one more question. Well, he made the mistake of his life… a stream of math equations was thrown at him to explain what he was asking about… and guess what! It absolutely didn't answer the second question… Well, guess what happened then… The misunderstanding started rolling like a snowball. After 15 minutes of this conversation both of them were totally exhausted, with zero understanding of each other… But the terrible part is that both of them were absolutely sure that they had understood each other, and both drew conclusions.
The next day the PM had a meeting where he presented his conclusions from yesterday's conversation. I can summarize his explanation as "totally wrong" (being very gentle here!), but worst of all, he mentioned that the math guy had told him this. At that moment the math guy was on a day off and nobody could reach him to confirm or refute it. As a result, the snowball of crap kept rolling down… and other people started making their decisions based on totally wrong things.
And the main problem is that this is the way it works. Not only in that company.
The snowball grows, and the worst thing is that it's unpredictable. You never know when it starts or what its impact will be.


So please… when you discuss something, double-check that you understand before you draw conclusions. And in addition to that, double-check that the other person understood your question. I know it's annoying. But that's the only way…

Friday, August 22, 2014

Java 8 Lambdas - examples

Group by and counting

List<String> strings = Arrays.asList("rami", "ida", "maya", "ida", "ida", "rami");
Map<String, Long> collect = strings.stream().collect(Collectors.groupingBy(e -> e, Collectors.counting()));
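
If you want to run the snippet as it is, here is a minimal self-contained sketch of the same idea. The class and method names are made up for illustration, and I swapped in a TreeMap (not in the original snippet) only so that the printed order is predictable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class GroupByCountExample {

    // Counts how many times each string occurs.
    // TreeMap keeps the keys sorted, so the printed result is deterministic.
    static Map<String, Long> countOccurrences(List<String> values) {
        return values.stream()
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("rami", "ida", "maya", "ida", "ida", "rami");
        System.out.println(countOccurrences(strings)); // prints {ida=3, maya=1, rami=2}
    }
}
```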


Generate random numbers

ThreadLocalRandom.current().longs().mapToObj(Long::toHexString).limit(10).sorted().forEach(System.out::println);


Filter and count
long num = strList.stream().filter(x -> x.length() > 3).count();



Partitioning by

Map<Boolean, List<String>> collect1 = Arrays.asList("rami", "ida", "maya", "ida", "ida", "rami").stream().collect(Collectors.partitioningBy((x) -> x.startsWith("r")));

Result:
{false=[ida, maya, ida, ida], true=[rami, rami]}

Thursday, August 21, 2014

Java 8 Lambdas

First of all, have you ever asked yourself: why do I need to create a Java interface with a single method?
Hell! Why create a whole file if the only thing I want is one single generic function?
If you did, you are absolutely right! Many functional languages revolve around that idiom.

So that's exactly what Java 8 did. It defines generic interfaces of this kind (functional interfaces), and the main ones are:

Interface name      Arguments   Returns   Example
Predicate<T>        T           boolean   Has this album been released yet?
Consumer<T>         T           void      Printing out a value
Function<T,R>       T           R         Get the name from an Artist object
Supplier<T>         None        T         A factory method
UnaryOperator<T>    T           T         Logical not (!)
BinaryOperator<T>   (T, T)      T         Multiplying two numbers (*)
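
To make the table a bit more concrete, here is a small sketch with one lambda per interface. The class name and the constant names are invented for the example; the interfaces themselves come from java.util.function.

```java
import java.util.function.BinaryOperator;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

// Hypothetical class, one constant per interface from the table.
class FunctionalInterfacesExample {
    static final Predicate<String> IS_LONG = s -> s.length() > 3;     // T -> boolean
    static final Function<String, Integer> LENGTH = String::length;   // T -> R
    static final Supplier<String> DEFAULT_NAME = () -> "unknown";     // () -> T
    static final UnaryOperator<Boolean> NOT = b -> !b;                // T -> T
    static final BinaryOperator<Integer> MULTIPLY = (a, b) -> a * b;  // (T, T) -> T

    public static void main(String[] args) {
        Consumer<Object> print = System.out::println;                 // T -> void
        print.accept(IS_LONG.test("rami"));    // true
        print.accept(LENGTH.apply("maya"));    // 4
        print.accept(DEFAULT_NAME.get());      // unknown
        print.accept(NOT.apply(true));         // false
        print.accept(MULTIPLY.apply(6, 7));    // 42
    }
}
```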

But that's not the end.
To let us write elegant code with the already existing classes of the SDK,
the Stream API has been added.

For example, you can now convert a list into a stream and then process it.
Example:

long count = allArtists.stream().filter(artist -> artist.isFrom("London")).count();

And this gives us a way to write VERY short and elegant code.
Now we can create functions on the fly just by writing
() -> do something
and passing it to some method.
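
Just to show the difference, here is a rough sketch that passes the same behaviour to a method twice: once as an old-style anonymous class and once as a lambda. The class, the constants and the sortedCopy helper are all hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical class comparing the two ways of passing a function.
class LambdaVersusAnonymousClass {

    // Before Java 8: a whole anonymous class just to supply one method.
    static final Comparator<String> BY_LENGTH_OLD = new Comparator<String>() {
        @Override
        public int compare(String a, String b) {
            return Integer.compare(a.length(), b.length());
        }
    };

    // Java 8: the same behaviour written as a lambda.
    static final Comparator<String> BY_LENGTH_NEW = (a, b) -> Integer.compare(a.length(), b.length());

    // Takes the "function" as a parameter and uses it to sort a copy of the input.
    static List<String> sortedCopy(List<String> input, Comparator<String> order) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(order); // stable sort: equal-length strings keep their original order
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("maya", "ida", "rami");
        System.out.println(sortedCopy(names, BY_LENGTH_OLD)); // prints [ida, maya, rami]
        System.out.println(sortedCopy(names, BY_LENGTH_NEW)); // prints [ida, maya, rami]
    }
}
```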

Examples:

Converting Stream to list:
Stream.of("a", "b", "c").collect(Collectors.toList());

Mapping (transforming a collection)
List<String> collected = Stream.of("a", "b", "hello")
         .map(string -> string.toUpperCase())
         .collect(toList());

Filtering
Stream.of("a", "b", "hello")
.filter(value -> isDigit(value.charAt(0)))
.collect(toList());

flatMap
Replaces every element with a stream and concatenates all those streams into one
Stream.of(asList(1, 2), asList(3, 4))
.flatMap(numbers -> numbers.stream())
.collect(toList());

Max and Min
Track shortestTrack = tracks.stream().min(Comparator.comparing(track -> track.getLength())).get();

Reducing
Stream.of(1, 2, 3)
.reduce(0, (acc, element) -> acc + element);

Combining actions together
album.getMusicians()
.filter(artist -> artist.getName().startsWith("The"))
.map(artist -> artist.getNationality())
.collect(toSet());
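
The snippets above use types like Artist that are not defined in this post. Here is a minimal runnable sketch of the combined pipeline with a made-up Artist class (only the two getters the pipeline needs) and a plain List as input, which is my assumption and not the original album object.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical class wrapping the filter + map + collect pipeline.
class CombinedPipelineExample {

    // Hypothetical stand-in for the Artist type used in the snippets above.
    static final class Artist {
        private final String name;
        private final String nationality;

        Artist(String name, String nationality) {
            this.name = name;
            this.nationality = nationality;
        }

        String getName() { return name; }
        String getNationality() { return nationality; }
    }

    // Keeps artists whose name starts with "The" and collects their distinct nationalities.
    static Set<String> nationalitiesOfTheBands(List<Artist> musicians) {
        return musicians.stream()
                .filter(artist -> artist.getName().startsWith("The"))
                .map(Artist::getNationality)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Artist> musicians = Arrays.asList(
                new Artist("The Beatles", "UK"),
                new Artist("Daft Punk", "France"),
                new Artist("The Doors", "US"));
        // The set contains UK and US; France is filtered out.
        System.out.println(nationalitiesOfTheBands(musicians));
    }
}
```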

Great article to read
http://javarevisited.blogspot.co.il/2014/02/10-example-of-lambda-expressions-in-java8.html

Thursday, June 5, 2014

Spring - short note - 1- REST

Creating Rest Services

1.       Create a controller class annotated with @Controller.
2.       For GET methods:
    @RequestMapping(method = RequestMethod.GET,value = "/rami/{id}/{p}")
    @ResponseBody
    public String home(@PathVariable String id,@PathVariable int p) {
        return "Hello World!"+ id + "----"+p;
    }

Or:
    @RequestMapping(method = RequestMethod.GET, value = "/rami")
    @ResponseBody
    public ClassA home3() {
        ClassA a = new ClassA();
        a.setName("ROMA");
        a.setAge(1112222);
        return a;
    }

For POST:
    @RequestMapping(method = RequestMethod.POST, value = "/p1/")
    @ResponseBody
    public ClassA home2(@RequestBody ClassA id) {
        return id;
    }

3.      Create the main method (Spring Boot):
public static void main(String[] args) throws Exception {
        SpringApplication.run(SampleController.class, args);
    }
With the following in the Gradle build file:
repositories {
    maven { url "http://repo.spring.io/libs-snapshot" }
    mavenCentral()
}

dependencies {
    testCompile group: 'junit', name: 'junit', version: '4.11'
    compile("org.springframework.boot:spring-boot-starter-web:1.0.2.RELEASE")

}
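
The controller methods above return and accept ClassA, which is not shown in this note. A minimal sketch of what it could look like, assuming it is a plain bean with only the two properties used above (name and age). The JSON library that comes with the web starter (Jackson) works with the getters, the setters and the no-argument constructor.

```java
// Hypothetical shape of ClassA: only the two properties the controller sets.
// Declare it public if the controller lives in another package.
class ClassA {
    private String name;
    private int age;

    // No-argument constructor, used when the object is built from a request body.
    public ClassA() {
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }
}
```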




Working with Config
1. Create Config Class

@Configuration
public class InfiniSpanConfig {

    @Bean(name = "cache1")
    public Cache<String,String> namesCache() {

        Cache<String,String> cache = new DefaultCacheManager().getCache("test",true);
        return cache;
    }
}


2. Add @ComponentScan
3. When autowiring a generic collection type, use the @Resource annotation instead of @Autowired

    @Resource(name="cache1")
    private Cache<String,String> cache;

Friday, January 10, 2014

Order the mess: ACID, CAP theorem and NoSQL



Database transaction
Let's define a transaction as a unit of work that is reliable and recoverable. By definition a transaction must support the ACID properties:
Atomic - if a transaction contains several actions, they happen in an all-or-nothing manner.
Consistent - a committed transaction takes the database from one valid state to another, so all defined rules and constraints still hold.
Isolated - two transactions that run on the same data do not see each other's uncommitted changes.
Example
One transaction (A) reads a value while another (B) updates it. Until transaction B commits, transaction A will see the old value.
Durable - once a transaction is committed, its results are not lost even if the system crashes. This is achieved by keeping transaction logs that can be replayed.

Great! We have 4 properties of a transaction.

All those properties are achieved in different ways.
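
As a rough illustration of the atomic (all-or-nothing) property, here is a toy in-memory sketch. It is not how a real database does it: the class, the snapshot-based rollback and the account names are all assumptions made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory "database" of account balances (hypothetical, for illustration only).
class AtomicTransferSketch {
    private final Map<String, Long> balances = new HashMap<>();

    void open(String account, long amount) {
        balances.put(account, amount);
    }

    long balanceOf(String account) {
        return balances.get(account);
    }

    // Applies both steps or neither: on any failure the snapshot is restored (rollback).
    boolean transfer(String from, String to, long amount) {
        Map<String, Long> snapshot = new HashMap<>(balances);
        try {
            // Step 1: credit first, on purpose, so a failure leaves partial work to undo.
            balances.merge(to, amount, Long::sum);
            // Step 2: debit, refusing to break the "no negative balance" rule.
            long remaining = balances.get(from) - amount;
            if (remaining < 0) {
                throw new IllegalStateException("insufficient funds");
            }
            balances.put(from, remaining);
            return true; // commit
        } catch (RuntimeException e) {
            balances.clear();
            balances.putAll(snapshot); // rollback: the credit from step 1 disappears
            return false;
        }
    }

    public static void main(String[] args) {
        AtomicTransferSketch db = new AtomicTransferSketch();
        db.open("a", 100);
        db.open("b", 50);
        System.out.println(db.transfer("a", "b", 30));   // prints true
        System.out.println(db.transfer("a", "b", 500));  // prints false
        System.out.println(db.balanceOf("a") + " " + db.balanceOf("b")); // prints 70 80
    }
}
```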


CAP theorem
It's a set of requirements for a distributed system:
Consistency - all servers in the distributed system see the same data at the same time.
Availability - you will always be able to get data out of the system (even if it isn't consistent).
Partition tolerance - the system keeps working even if the network between some of its parts breaks and messages are lost.

It has been proved that all 3 cannot be guaranteed at the same time, so a system can provide at most 2 out of 3.

So let's make some order:
1. CAP are properties of a distributed system.
2. ACID are properties of a transaction.

Transactions are a feature of relational databases like MySQL, Oracle etc.

Big Data era.

The amount of data generated in recent years has made scale and performance more important than some ACID features. Providing consistency, for example, can make the system much slower. And sometimes we can live with the idea that the data will not be consistent at the same second (but eventually it will be!).
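
To show what "not consistent at the same second" means, here is a toy sketch with one primary copy and one replica that is updated later. Everything in it (the class, the replication log, the replicate method) is made up for illustration and does not describe any specific product.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Toy model of eventual consistency (hypothetical, single-threaded).
class EventualConsistencySketch {
    private final Map<String, String> primary = new HashMap<>();
    private final Map<String, String> replica = new HashMap<>();
    private final Queue<String[]> replicationLog = new ArrayDeque<>();

    // The write is acknowledged as soon as the primary has it; the replica is updated later.
    void write(String key, String value) {
        primary.put(key, value);
        replicationLog.add(new String[] {key, value});
    }

    // Reads served by the replica can be stale until replication catches up.
    String readFromReplica(String key) {
        return replica.get(key);
    }

    // Stands in for the background replication process.
    void replicate() {
        String[] entry;
        while ((entry = replicationLog.poll()) != null) {
            replica.put(entry[0], entry[1]);
        }
    }

    public static void main(String[] args) {
        EventualConsistencySketch store = new EventualConsistencySketch();
        store.write("user", "rami");
        System.out.println(store.readFromReplica("user")); // prints null (stale read)
        store.replicate();
        System.out.println(store.readFromReplica("user")); // prints rami
    }
}
```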

This is why other DB systems became popular. Some of them provide parts of ACID, some of them don't have SQL as the main interface. We call them NoSQL databases.
Some of them avoid the ideas of relational storage.
But as a main concept they provide 2 out of 3 CAP requirements, give up ACID and solve some predefined data storage problem, like graph relations, columnar storage etc.

Which means the main focus now is on the data instead of the system. You look at the data and investigate it, and only then look for a storage solution, instead of taking some relational storage and starting to model the data.




http://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdf
http://en.wikipedia.org/wiki/Eventual_consistency
http://www.cs.berkeley.edu/~istoica/classes/cs268/06/notes/20-BFTx2.pdf




Thursday, January 9, 2014

Couchbase : almost document-oriented database





After a deep investigation of several NoSQL DBs, it looks like Couchbase is one of the best options.

1. It has JSON support
like a document-oriented DB, but it doesn't index all the fields on all levels of the document (good and bad!).
It is something between a document-oriented DB and a key-value store, with an option to create custom index views, each of which is actually a simple map-reduce job that runs automatically.

2. Auto-Sharding
In 1 click you can simply add servers.

3. Cross-cluster replication

4. Object level cache based on memcached

5. Built-in management and monitoring tool

6. Asynchronous persistence.


7. Benchmarks:
http://www.couchbase.com/presentations/benchmarking-couchbase
http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-708169.pdf

8. Best practices

9. Great documentation

10. Supports (not officially) GeoSearch

11. Good APIs
Contains bulk modes (on views)
CAS operations (optimistic locking)
and much more
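
To explain what CAS (optimistic locking) means, here is a toy in-memory sketch. It is NOT the Couchbase client API: the class and all method names are hypothetical. The idea is that every value carries a version (the CAS value), and a write succeeds only if the version is still the one you read.

```java
import java.util.HashMap;
import java.util.Map;

// Toy key-value store with compare-and-swap writes (hypothetical, not a real client API).
class CasStoreSketch {

    // A value together with the version it had when it was written.
    static final class Versioned {
        final String value;
        final long cas;

        Versioned(String value, long cas) {
            this.value = value;
            this.cas = cas;
        }
    }

    private final Map<String, Versioned> data = new HashMap<>();
    private long nextCas = 1;

    synchronized Versioned get(String key) {
        return data.get(key);
    }

    // Unconditional write; returns the new CAS value.
    synchronized long set(String key, String value) {
        long cas = nextCas++;
        data.put(key, new Versioned(value, cas));
        return cas;
    }

    // Writes only if nobody has changed the value since expectedCas was read.
    synchronized boolean replaceIfUnchanged(String key, String value, long expectedCas) {
        Versioned current = data.get(key);
        if (current == null || current.cas != expectedCas) {
            return false; // somebody else won; the caller should re-read and retry
        }
        data.put(key, new Versioned(value, nextCas++));
        return true;
    }

    public static void main(String[] args) {
        CasStoreSketch store = new CasStoreSketch();
        store.set("k", "v1");
        Versioned seen = store.get("k");
        System.out.println(store.replaceIfUnchanged("k", "v2", seen.cas)); // prints true
        System.out.println(store.replaceIfUnchanged("k", "v3", seen.cas)); // prints false (stale CAS)
        System.out.println(store.get("k").value);                          // prints v2
    }
}
```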

It looks like it fits all average needs, but while working with it you need to pay attention to some things:
1. Work with keys and values instead of indexes (indexes defined by map-reduce jobs are updated incrementally and you cannot control that much).
2. Compaction of views and buckets exists and can be controlled.
3. Don't define more than a couple of buckets. (Read the attached best practices paper.)

Have fun with Couchbase.