Getting started with Big Data

Recently I have developed a strong interest in Big Data and have started exploring different learning resources. I will list some of the good ones I came across here, so that this post can serve as a starting point. I would ask readers to suggest additional readings in the comments.

1. Getting Started with Microsoft Big Data – From Microsoft Virtual Academy. This training has a good introduction to using Windows Azure HDInsight. The sessions walk you through running MapReduce jobs, Hive, etc. These sessions are a little outdated, but you will get the idea.

2. Getting started with Windows Azure HDInsight: This has the latest documentation, with up-to-date information on HDInsight.

The two links above are Microsoft resources and cover Microsoft technologies.

3. Yahoo! Hadoop Tutorial: This has a very good description of the Hadoop architecture.

4. Apart from these, there are good courses on Coursera (Data Science, a very good course for getting an idea of various Big Data solutions) and from Cloudera Training.

5. Introduction to Hadoop and MapReduce: This is a course on Udacity. Very quick to finish. Really nice one.

6. Map Reduce – A really simple Introduction – here
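For readers who want a feel for the idea before opening the tutorials, here is a minimal single-process sketch of the classic word-count example. It is not Hadoop code and does not use any Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are my own, chosen only to mirror the usual description of the map, shuffle and reduce steps.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key.

    In a real cluster the framework does this between map and reduce;
    here it is just a dictionary of lists.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three steps in order on a list of strings."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big ideas"]))
```

The point to notice is that each map call looks at one document only and each reduce call looks at one key only, which is what lets a real framework spread the work across many machines.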

Please do suggest any good references that would help me and other readers.

Design Patterns and Advanced Design Principles

Last week, I attended a training at the Microsoft campus on Design Patterns and Advanced Design Principles. It was a four-day session, in which we learnt some good design principles and practised a few patterns. It was a wonderful session: the discussions were very interactive and the instructor knew the subject very well. The most interesting lesson for me was that one should focus on design principles, and the design patterns then follow by themselves. Our sessions were also more focused on principles than on patterns (though a good number of patterns were covered with real-world applications).

I won’t cover the complete sessions here, but will summarize what I learnt as a list of points. If you are interested in learning more, I encourage you to read more on the web about each topic.

Recommended readings:

1. Head First Design Patterns

2. Gang of Four’s Design Patterns Book

Pointers for Good Design:

1. Define abstractions from a given requirement

2. Design from client code perspective. Validate your design by writing code.

3. Understand Is-A and Has-A relationships

4. Focus on Design Principles

5. Separation of concerns

6. OCP – Open/Closed Principle [A class should be open for extension, but closed for modification]

7. SRP – Single Responsibility Principle

8. ISP – Interface Segregation Principle

9. LSP – Liskov Substitution Principle

10. Program to an interface and not to an implementation

11. Prefer Has-A over Is-A

12. High cohesion and loose coupling are desired

13. Avoid Conditional delegation

14. Avoid Redundancy in Design

15. Avoid class proliferation

16. Decouple an abstraction from Implementation
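To make a couple of these points concrete, here is a small sketch of "program to an interface" and the Open/Closed Principle working together. This is my own illustration, not material from the training; the names `DiscountPolicy`, `NoDiscount`, `PercentageDiscount` and `Checkout` are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import List


class DiscountPolicy(ABC):
    """The abstraction that client code depends on (hypothetical name)."""

    @abstractmethod
    def apply(self, amount: float) -> float:
        """Return the amount to charge after the discount."""


class NoDiscount(DiscountPolicy):
    def apply(self, amount: float) -> float:
        return amount


class PercentageDiscount(DiscountPolicy):
    def __init__(self, percent: float) -> None:
        self.percent = percent

    def apply(self, amount: float) -> float:
        return amount * (1 - self.percent / 100)


class Checkout:
    """Has-A DiscountPolicy and knows only the interface, never a concrete class."""

    def __init__(self, policy: DiscountPolicy) -> None:
        self.policy = policy

    def total(self, prices: List[float]) -> float:
        return self.policy.apply(sum(prices))


if __name__ == "__main__":
    print(Checkout(NoDiscount()).total([10.0, 20.0]))
    print(Checkout(PercentageDiscount(50)).total([10.0, 20.0]))
```

Adding a new kind of discount means writing one more `DiscountPolicy` subclass; `Checkout` itself is not modified, which is the "open for extension, closed for modification" idea. It also avoids an if/else chain on a discount type inside `Checkout`.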

Keep these principles in mind while designing. Practise a few patterns.
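As one way to practise, here is a minimal sketch of "Prefer Has-A over Is-A" and "Avoid class proliferation". Again, this is my own example with hypothetical names (`PlainFormatter`, `HtmlFormatter`, `InMemoryChannel`, `Report`). With inheritance alone, every combination of format and delivery channel would need its own subclass; with composition, the two concerns vary independently.

```python
class PlainFormatter:
    """Renders a report as plain text."""

    def render(self, title, body):
        return f"{title}\n{body}"


class HtmlFormatter:
    """Renders a report as a tiny HTML fragment."""

    def render(self, title, body):
        return f"<h1>{title}</h1><p>{body}</p>"


class InMemoryChannel:
    """Stand-in for a real delivery channel (email, file, ...); just records output."""

    def __init__(self):
        self.sent = []

    def send(self, text):
        self.sent.append(text)


class Report:
    """Has-A formatter and Has-A channel, instead of one subclass per combination."""

    def __init__(self, formatter, channel):
        self.formatter = formatter
        self.channel = channel

    def publish(self, title, body):
        self.channel.send(self.formatter.render(title, body))


if __name__ == "__main__":
    channel = InMemoryChannel()
    Report(HtmlFormatter(), channel).publish("Q1", "All good")
    print(channel.sent)
```

Two formatters and, say, three channels give six combinations from five small classes plus `Report`, rather than six separate report subclasses. `Report` also stays decoupled from how either part is implemented.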