Big Data: Hadoop and Its Impact on Business Intelligence Systems
Recently my work required me to look into the new features added in Informatica 9.1, but I never thought the journey would lead me to explore the topic further and write a blog post about it. Let's see how I worked through several new developments that are becoming closely tied to data management and Business Intelligence. First we will look at what Big Data is and where it stands now.
People often wonder how organizations like Yahoo, Google, and Facebook store such large amounts of user data. Note that Facebook stores more photos than Google's Picasa. Any guesses?
What is Hadoop?
The answer is Hadoop, a way to store and process very large amounts of data, on the scale of petabytes and beyond. Its storage layer is called the Hadoop Distributed File System (HDFS). Hadoop was developed by Doug Cutting based on ideas described in Google's papers. Much of the data involved is machine generated. For example, the Large Hadron Collider, built to study the origins of the universe, produces about 15 petabytes of data every year.
MapReduce
The next question is how quickly we can access such large amounts of data. Hadoop uses MapReduce, a programming model that first appeared in Google research papers. It follows a divide-and-conquer approach, with the data organized as key-value pairs. A single master node splits the job into chunks, and the map step processes those chunks in parallel on the many machines where the data is stored. The framework then sorts the intermediate results by key, and the reduce step processes the collected values for each key.
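To make the map, sort, and reduce steps concrete, here is a minimal single-process sketch of a word count in Python. It only imitates the model in memory and does not use Hadoop itself; the function names (map_phase, shuffle_and_sort, reduce_phase, word_count) and the sample chunks are made up for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Emit a (word, 1) key-value pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_and_sort(pairs):
    """Group the emitted values by key and return the groups sorted by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    # In a real cluster each chunk would be mapped on a different node;
    # here the chunks are simply processed one after another.
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    grouped = shuffle_and_sort(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped)


# Hypothetical input: two chunks standing in for pieces of a large file.
chunks = ["big data big", "data hadoop"]
result = word_count(chunks)
print(result)
```

Because each chunk is mapped independently, the map step is the part that can be spread across many machines, while the sort step guarantees that every value for a given key reaches the same reduce call.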
Hadoop runs on standard PC servers: it connects all of them and distributes the data files across these nodes. It uses all the nodes as one large file system to store and process the data, making it a truly distributed file system. Extra nodes can be added when the data reaches the installed capacity, which makes the setup highly scalable. It is also inexpensive, because it is open source and doesn't require the specialized hardware used in traditional servers. Hadoop is often counted among the NoSQL implementations.
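The idea of spreading one file across many nodes can be sketched in a few lines of Python. This is a toy model only: the round-robin placement, the replication parameter, the node names, and the tiny block size are all assumptions chosen for illustration, and they are not the placement policy HDFS actually uses.

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks, nodes, replication=2):
    """Assign each block to `replication` distinct nodes in round-robin order.

    Toy policy for illustration only, not HDFS's real placement logic.
    """
    copies = min(replication, len(nodes))
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(copies)]
        for block_id in range(num_blocks)
    }


# Hypothetical cluster of four commodity servers.
nodes = ["node-a", "node-b", "node-c", "node-d"]
blocks = split_into_blocks(b"x" * 10, block_size=4)
placement = place_blocks(len(blocks), nodes, replication=2)

for block_id, holders in placement.items():
    print(block_id, len(blocks[block_id]), holders)
```

Adding capacity in this model is just a matter of appending another name to the nodes list before placing new blocks, which mirrors why adding servers makes the setup scale.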
Hadoop in the Real World
The Tennessee Valley Authority (TVA) uses smart-grid field devices to collect data on its power-transmission lines and facilities across the country. These sensors send in data 30 times per second. At that rate, the TVA estimates it will have half a petabyte of data archived within a few years. TVA uses Hadoop to store and analyze this data. In India, the Power Grid Corporation of India intends to install such smart devices in its grids to collect data and reduce transmission losses, and it would do well to follow TVA's example. Recently Facebook moved to a 30-petabyte Hadoop cluster, which sounds incredible; it is hard to digest that we are using such an enormous volume of data.
Data Warehouse and Business Intelligence Products Supporting Hadoop and MapReduce
1) Greenplum
2) Informatica
3) Teradata
5) Pentaho
6) Talend
As Hadoop and other NoSQL implementations become widely used, limitations of traditional SQL systems, such as the difficulty of storing unstructured data, can be overcome. With data volumes growing exponentially, Hadoop will be commercialized on a large scale, and data integration tools will play a key role in mining that data for business.
Readers, please share your experiences if you have worked with Hadoop alongside any of the ETL and BI tools available in the market.
by: Sumit Srivastava