
Platforms for Big Data

Posted in the Dev Platform category by Lee Kyu Jae

    Dealing with Future Problems

The reason we need to be concerned about problems that have not yet occurred is to secure sufficient response time. Sufficient time enables us to give full consideration before making decisions, as well as to prepare. Therefore, it is invaluable to predict upcoming events and be concerned about future problems.

To solve matters at hand, explicit recognition is needed.

What kind of skills do we need to solve problems?

Intuition and data analysis skills. The merging of the two elements is critical. Intuition is necessary to determine which data to analyze; some degree of intuition is also required because you have to deal with events that have not yet occurred, even if you handle massive data analysis well. It is obvious, though, that decision making based on data analysis is more trustworthy than intuition alone, because it excludes the tendencies, tastes, and experience of the decision maker, allowing an objective approach to problems.

This article covers technical details on how to effectively analyze data from a platform perspective: so-called big data. Big data technology allows storing a large amount of data, searching for meaningful data, visualization, and predictive analysis, thereby internalizing the business process for an application.

Figure 1: Useful technology for the present and the future. (Big Data, Analytics and the Path from Insights to Value, MIT Sloan Management Review, Winter 2011.)

There are several priorities that we need to check before handling big data.

First, data is important by nature; the increased amount of data is not the reason. Rather, the increased amount of data has made it harder to process with the existing software.

Second, we can obtain significant results from data and technical elements only through interaction and repeated iteration with the people who analyze, use, and determine those elements. Understanding of the applications and businesses involved is imperative, and you must state clear definitions to resolve problems, including deciding which data needs to be analyzed and processed. There is no system that automatically handles all of the tasks, such as how to process data, which analysis algorithms and processing systems to use, and how to interpret results.

Lastly, analysis based on data should be closely connected with the business processes and management of the corresponding organization.

    What is Big Data?


Big data means an immense amount of data, so much so that it is difficult to collect, store, manage, and analyze via general database software. In general, this immensity is classified into three types as follows:

Volume: There is too much data to store, and it requires too much processing (semantic analysis and data processing). These are the two elements that we need to understand.

    Velocity: It means storage and processing speed.

Variety: The demand for unstructured data, such as text and images, is increasing, in addition to refined data that can be standardized and defined in advance, like RDBMS table records.

In addition to these three aspects, Value is often added, from the standpoint that value can be created only when data is analyzed.

    What kind of data should be stored?

We can think of data produced in the corporate environment, including customer information, service logs, and ERP (Enterprise Resource Planning) data. There are also CDRs (Call Detail Records) generated by telecom companies; sensor and metering data from machinery such as planes, oil pipelines, and power cables; web server and application logs; and user-created data from social media, blogs, mail, and bulletin boards. In short, everything that can be measured and stored could be a target for analysis.

Such data has been managed and analyzed for a long time. So why is the term big data emerging now?

Environmental Factors

It is necessary to briefly mention the environmental factors that make big data attractive.

Figure 2: What makes big data attractive.

With the rise of personalized and social services, restructuring is occurring in the existing Internet service environment. From an Internet Web service environment that focused on search and portal sites, there is now demand for personalized and social services in every service area, such as telecommunication, games, music, search, and shopping. This is why scale-out technology is becoming more important than scale-up storage. Moreover, for the complicated functions, storage sizes, and processing requirements involved in analyzing relationships between individuals or personal preferences, data processing technology beyond the scope of OLTP is becoming increasingly important.

The universalization of the mobile environment has changed both where data is generated and where it is consumed. Users' mobility and activity information is becoming data to be stored and analyzed. Cloud-based information processing technology for portability between different devices, such as smartphones, PCs, and TVs, is a reality.

It is now feasible to provide mass data storage, processing, and servicing at an appropriate level of investment through the enhanced IT environment (IaaS), platform environment (PaaS), and service environment (SaaS) represented by cloud computing.

Large-scale data processing systems released by leading Internet firms, such as Google, are being recreated in the open-source territory, where they can be used in any development environment.

Efficient storage of large data and processing capability, driven by all of these environmental factors, are closely related to business competition.

The platformization of data storage and processing functionality by many software organizations and companies enables easy use of these services. Small groups and companies are now able to utilize big data technology, which used to be exclusive to large firms due to the difficulty of securing specialized manpower and the high costs involved.

Platform Technology that Handles Big Data


We will classify the various platforms by three aspects (storage system, processing, and analysis method) and describe related products and technologies. While some of them are unsuitable for such a classification standard, we believe the approach is easy to understand intuitively.

    Storage System Aspect

Parallel DBMS

    NoSQL

Both parallel DBMS and NoSQL are identical in that they use the scale-out expansion approach in order to store large data.

Note that there are also existing storage technologies, namely SAN and NAS; cloud file storage systems, such as Amazon S3 or OpenStack Swift; and distributed file systems, such as GFS and HDFS, all of which are technologies for storing large data.

    PARALLEL DBMS

Michael Stonebraker, a pioneer who led the development of PostgreSQL, an open source database, built prototype systems such as H-Store and C-Store, and demonstrated that while the existing RDBMS technology was designed to fit all areas with a single system, systems with specialized architectures are far superior at handling OLAP, text, streaming, and high-dimensional data. [1]

According to Stonebraker, there have been many environmental changes compared to the time when the RDBMS was first designed and implemented; thus, such changes should be reflected in the existing OLTP processing area. For example, he said that the disk of the 1970s would be regarded as memory nowadays. The disk log and tape-backup disaster recovery should be replaced with the high-cost structure of a k-safe environment (that is, replication), and the concurrency technique of the conventional lock method should be reconsidered according to changes in processor technology.

    VOLTDB

VoltDB is a system organized in a format suited to a high-performance OLTP environment. Beyond memory-based data processing and SQL, it performs sequential processing on each data split based on stored procedures, reducing lock and communication overhead, and helps configure a high-speed OLTP system through horizontal splitting of table data.[2]

Figure 3: VoltDB architecture.

Figure 3 shows that a task which needs to operate on just one partition is executed sequentially in the corresponding partition, while a task which needs to be handled across several partitions is processed by a coordinator. If there are many operations that must be processed across several partitions, performance with large rows and sizes may not be good.

    SAP HANA

SAP HANA is memory-based storage made by SAP. Its characteristic is to organize a system optimized for analysis tasks, such as OLAP. When all data is inside system memory, maximizing CPU utilization is crucial, and the key point is to reduce bottlenecks between memory and the CPU cache. In order to minimize cache misses, it is more advantageous for data processed within a given time to be laid out consecutively, meaning that the configuration of column-oriented tables can be favorable when running many OLAP analyses.

There are many advantages to the column-oriented table configuration; typical examples are a high data compression ratio and processing speed. Values drawn from the same data domain compress better than values from several domains combined together. Moreover, the configuration makes it possible to reduce CPU operations through lightweight compression, such as RLE (run-length encoding) or dictionary encoding, or to execute desired operations without a decompression step for the compressed data. The following figure shows a brief comparison of row-oriented and column-oriented methods.

Figure 4: A comparison of row-oriented and column-oriented methods.
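To see why a column store compresses well, here is a minimal sketch (not from any particular product) of run-length encoding applied to a sorted column, in Python. Note that a query such as a count can be answered directly on the encoded runs, without decompressing:

```python
def rle_encode(column):
    """Compress a column into (value, run_length) pairs."""
    runs = []
    for v in column:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]

def count_value(runs, value):
    """Answer COUNT(*) WHERE col = value directly on compressed runs."""
    return sum(n for v, n in runs if v == value)

# A sorted column drawn from one domain collapses to a handful of runs.
column = ["DE"] * 4 + ["FR"] * 3 + ["KR"] * 5
runs = rle_encode(column)       # [("DE", 4), ("FR", 3), ("KR", 5)]
total = count_value(runs, "KR") # 5, computed without decompression
```

Twelve values became three runs here; on real columns with millions of sorted values the ratio is far larger, which is the effect the paragraph above describes.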

    VERTICA

Vertica is a database specialized for OLAP, which stores data on disk in column format. Its shared-nothing MPP structure comprises a write-optimized store for loading data fast, a compressed read-optimized store, and a tuple mover that manages the data flow between the two. Figure 5 below helps in understanding the Vertica structure.

Figure 5: Vertica structure.

    GREENPLUM

The Greenplum database is a shared-nothing MPP structure built on PostgreSQL. Stored data can use the row-oriented or column-oriented method according to the operations applied to the corresponding data. Data is stored on segment servers and gains availability through segment-level replication using the log-shipping method. The query engine, developed based on PostgreSQL, is configured to execute basic SQL operations (hash join, hash aggregation) or map-reduce programs, so as to effectively process parallel queries or map-reduce-style programs. Each processing node is connected to a software-oriented data switch component.


    Figure 6: Greenplum architecture.

IBM NETEZZA DATA WAREHOUSE

The IBM Netezza data warehouse has a two-tier architecture consisting of SMP and MPP, called AMPP (Asymmetric Massively Parallel Processing).

A host with an SMP structure handles query execution planning and result aggregation, while S-blade nodes with an MPP structure handle query execution.

Each S-blade is connected to disk through a special data processor called an FPGA (Field Programmable Gate Array).

Each S-blade and the host are connected to a network that uses IP addresses.

Unlike other systems, the FPGA performs decompression and record- or column-level filtering; in transaction processing, it enables filtering or transformation functions, such as visibility checks, while data is retrieved from disk to memory for real-time processing. When processing large data, it adheres to the principle of processing close to the data source, which is to reduce unnecessary data transmission as much as possible by performing data operations where the data is located.

    Figure 7: IBM Netezza data architecture.

There are many other solutions for big data processing, namely Teradata, Sybase, and Essbase: traditional and powerful products.

The parallel DBMS is an advanced form of the old RDBMS, in many cases with an MPP structure. In addition, the companies and organizations that develop parallel DBMSs have been taken over by IT conglomerates and are continuing development in the appliance form. The names of the conglomerates and the acquisition dates of the aforementioned parallel DBMSs are shown in the following table:


Table 1: Companies which acquired parallel RDBMS products.

Acquiring company | Database                      | Year
SAP               | Sybase                        | 2010
HP                | Vertica                       | 2011
IBM               | Netezza                       | 2010
Oracle            | Essbase (Hyperion Solutions)  | 2007
Teradata          | Aster Data                    | 2011
EMC               | Greenplum                     | 2010

    NOSQL

In an RDBMS, scaling out while supporting ACID (Atomicity, Consistency, Isolation, and Durability) is almost impossible. For storage, data has to be divided across several devices; to satisfy ACID over divided data, you have to use complicated locking and replication methods, which lead to performance degradation.

NoSQL, a general term for a new class of storage systems, has emerged in order to simplify data models for easy definition of shards, which are the basis of distribution, and to relax requirements (eventual consistency) in a distributed replication environment, or to loosen constraints such as isolation.
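As a rough illustration of the sharding idea (not any specific NoSQL product's API), hash-based sharding can be sketched in Python: each key maps deterministically to exactly one shard, so single-key reads and writes never need cross-device locking.

```python
import hashlib

class ShardedStore:
    """Minimal hash-sharded key-value store: each key lives on exactly
    one shard, so single-key operations never span devices."""

    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]

    def _shard_for(self, key):
        # Stable hash so the same key always routes to the same shard.
        digest = hashlib.md5(key.encode()).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self._shard_for(key)].get(key)

store = ShardedStore(num_shards=4)
store.put("user:42", {"name": "Lee"})
value = store.get("user:42")    # served by a single shard
```

The trade-off the paragraph describes shows up immediately: an operation touching one key is trivial, while a transaction spanning keys on different shards would need exactly the coordination that NoSQL systems choose to avoid.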

Since NoSQL has been covered many times on our Dev Platform blog and there are many places to obtain information, we will not go over NoSQL products here.

    Processing Aspects

The key point of parallel processing is divide and conquer. That is, data is divided into independent parts and processed in parallel. Just imagine matrix multiplication, where each operation can be divided and processed separately. Big data processing means dividing a problem into several small operations and combining them into a single result. If there is dependence between operations, it is impossible to make the best use of parallelism. It is necessary to save and process data with these factors in mind.
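The matrix-multiplication example above can be made concrete in plain Python: each cell of the result depends only on one row and one column of the inputs, so the cells are independent tasks that could run in parallel (this is an illustration of the divide-and-conquer structure, not an efficient implementation).

```python
def matmul_cell(A, B, i, j):
    """One independent task: compute a single cell of A x B."""
    return sum(A[i][k] * B[k][j] for k in range(len(B)))

def matmul_divide_and_conquer(A, B):
    """Divide the problem into independent per-cell tasks, then combine.
    Because no task depends on another's result, all of them could run
    in parallel; only the final assembly needs every task to finish."""
    rows, cols = len(A), len(B[0])
    tasks = [(i, j) for i in range(rows) for j in range(cols)]
    results = {(i, j): matmul_cell(A, B, i, j) for (i, j) in tasks}
    # Combine: assemble the independent cells into the result matrix.
    return [[results[(i, j)] for j in range(cols)] for i in range(rows)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul_divide_and_conquer(A, B)   # [[19, 22], [43, 50]]
```

If one cell's value depended on another cell's result, the task list could no longer be dispatched freely, which is exactly the operation-dependence caveat in the paragraph above.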

    MAP-REDUCE

The most widely known technology for handling large data would be a distributed data processing framework using the map-reduce method, such as Apache Hadoop.

Data processing via the map-reduce method has the following characteristics:

It operates on regular computers with built-in hard disks, not special storage. The computers are extremely loosely coupled, so the cluster can expand to hundreds or thousands of machines.

Since many computers participate in processing, system errors and hardware failures are assumed to be normal circumstances rather than exceptional ones.

With the simplified and abstracted basic operations of Map and Reduce, you can solve many complicated problems. Programmers who are not familiar with parallel programming can easily perform parallel processing of data.

    It supports high throughput by using many computers.

The following figure displays the execution flow of the map-reduce method. Data stored in HDFS is divided among the available workers and mapped to key-value pairs, with intermediate results stored on local disk. The data is then compiled by reduce workers to generate a result file.


Figure 8: Map-Reduce execution.
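The flow in Figure 8 can be imitated in a few lines of plain Python (a toy sketch, not Hadoop's actual API): map each input split to key-value pairs, shuffle the pairs by key, then reduce each group.

```python
from collections import defaultdict

def map_phase(split):
    # Map: emit a (word, 1) pair for every word in this input split.
    return [(word, 1) for word in split.split()]

def shuffle(mapped_pairs):
    # Shuffle: group all emitted values by key, as the framework would.
    groups = defaultdict(list)
    for pairs in mapped_pairs:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: combine each key's values into a single result.
    return {key: sum(values) for key, values in groups.items()}

splits = ["big data big", "data platform"]       # two input splits
counts = reduce_phase(shuffle([map_phase(s) for s in splits]))
# counts == {"big": 2, "data": 2, "platform": 1}
```

In a real framework each `map_phase(s)` call runs on a different worker and the shuffle moves data over the network; the program structure, however, is exactly this.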

Depending on the characteristics of the data storage, locality can be exploited by placing workers at the location (based on the network switch) where the data is stored, reducing the gap between the node processing the data and the source data location. Each worker can be implemented in various languages through a streaming interface (standard in/out).

    DRYAD

Dryad is a framework that enables parallel data processing by forming data channels between programs in the shape of a graph. Developers who use a map-reduce framework write Map and Reduce functions; with Dryad, they instead build a graph that processes the corresponding data. Figure 9 below describes data processing in Dryad in comparison with the UNIX pipe.

    Figure 9: Dryad data processing.

Dryad enables data flow processing of the DAG (Directed Acyclic Graph) type.
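A DAG-style data flow can be sketched in Python by running each node once all of its dependencies have produced output (an illustration of the concept, not Dryad's actual interface):

```python
def run_dag(nodes, deps):
    """Run each node after all of its dependencies, passing their
    outputs in as arguments. `nodes` maps name -> function; `deps`
    maps name -> list of upstream node names (empty for sources)."""
    done = {}
    pending = set(nodes)
    while pending:
        ready = [n for n in pending if all(d in done for d in deps[n])]
        if not ready:
            raise ValueError("cycle detected: not a DAG")
        for n in ready:  # every node in `ready` could run in parallel
            done[n] = nodes[n](*[done[d] for d in deps[n]])
            pending.remove(n)
    return done

# A small pipeline resembling "cat | sort | uniq -c", expressed as a graph.
result = run_dag(
    nodes={
        "read":  lambda: ["b", "a", "b", "c"],
        "sort":  lambda xs: sorted(xs),
        "count": lambda xs: {x: xs.count(x) for x in set(xs)},
    },
    deps={"read": [], "sort": ["read"], "count": ["sort"]},
)
# result["count"] == {"a": 1, "b": 2, "c": 1}
```

The generalization over map-reduce is visible in the `deps` table: a node may have any number of inputs and outputs, not just the fixed map-shuffle-reduce shape.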

Although parallel data processing frameworks offer enough functionality for processing big data, there are barriers to entry in using such systems for inexperienced developers, data analysts, and data miners. As a result, we need methods that allow easier data processing through a higher level of abstraction.

    APACHE PIG

Apache Pig provides a high-level data processing structure, allowing large data processing through the combination of operators. It supports a language called Pig Latin and has the following characteristics:

It provides high-level structures, such as relation, bag, and tuple, in addition to basic types such as int, long, and double.

It supports relational (relation and table) operations, such as FILTER, FOREACH, GROUP, JOIN, LOAD, and STORE.


You can specify user-defined functions.

A data processing program specified in Pig Latin is converted to a logical execution plan, which is in turn converted to a Map-Reduce execution plan for execution. Figure 10 below shows the Pig operation process.

Figure 10: Conversion process from Pig Latin to Map-Reduce.
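To make the abstraction concrete, the effect of a Pig Latin script using LOAD, FILTER, GROUP, and FOREACH can be imitated with ordinary Python over tuples (the relation and the threshold here are hypothetical, chosen only to show what each operator contributes):

```python
# Hypothetical relation of (user, category, amount) tuples, as LOAD would yield.
records = [
    ("kim", "book", 12), ("lee", "book", 30),
    ("kim", "game", 55), ("park", "book", 7),
]

# FILTER records BY amount >= 10;
filtered = [r for r in records if r[2] >= 10]

# GROUP filtered BY category;
grouped = {}
for r in filtered:
    grouped.setdefault(r[1], []).append(r)

# FOREACH grouped GENERATE group, SUM(amount);
totals = {cat: sum(r[2] for r in rows) for cat, rows in grouped.items()}
# totals == {"book": 42, "game": 55}
```

Pig's value is that each of these steps is a declared relational operator, so the planner is free to translate FILTER into a map task and GROUP into a shuffle, rather than the developer writing that plumbing by hand.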

Apache Pig is an approach that makes writing a large data processing program feel like using a procedural programming language, such as C or Java. A similar approach is Google's Sawzall.

Some technologies support declarative data processing, like SQL, without specifying data processing procedures as in programming languages. Typical examples are Apache Hive, Google Tenzing, and Microsoft's SCOPE.

    APACHE HIVE

Apache Hive helps analyze large data by using a query language called HiveQL over data sources such as HDFS or HBase. The architecture is divided into a Map-Reduce-oriented execution part, metadata information about the data storage, and a part that receives queries from users or applications for execution.

    Figure 11: HIVE architecture.

To support user extension, it allows user-specified functions at the scalar, aggregation, and table level.

    Analysis Aspects

We have reviewed systems that store big data and procedural/declarative technologies for processing large data. Finally, let us look into technology that analyzes big data.

The process of finding meaning in data is called KDD (Knowledge Discovery in Databases). KDD stores data and processes/analyzes all or part of the data of interest in order to extract meaningful value, or to discover previously unknown facts and ultimately turn them into knowledge. For this, various technologies are applied comprehensively, such as artificial intelligence, machine learning, statistics, and databases.


    GNU R

GNU R is a software environment comprising a programming language specialized for statistical analysis and graphics (visualization), plus packages. It ensures smooth processing of vector and matrix data, so the language is optimized for statistical calculations. You can easily acquire a desired statistics library from the R package site known as CRAN (the Comprehensive R Archive Network). It can be touted as the open source standard in the field of statistics.

In the past, R put the data to be processed into the memory of a single computer and analyzed it using a single CPU. There has been much progress since then, driven by the ever-increasing amount of data to be processed. Some typical examples are as follows:

doSMP package: uses multiple cores.

bigmemory package: stores data in shared memory; it keeps only a reference to the in-memory value and stores the value itself on disk.

snow package: enables R program execution in a computer cluster environment.

The following figure demonstrates how the snow package processes data in a cluster environment: a typical divide and conquer method.

Figure 12: Execution environment for GNU R and the snow package.
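The snow-style split/apply/combine pattern shown in Figure 12 can be approximated with Python's standard-library process pool (a sketch of the pattern, not an R interface): split the data, apply a function to each chunk on a separate worker, then combine the partial results.

```python
from multiprocessing import Pool

def partial_sum(chunk):
    # The function applied to each data split, on a separate worker.
    return sum(chunk)

def split(data, n):
    # Divide the data into n roughly equal chunks.
    k = (len(data) + n - 1) // n
    return [data[i:i + k] for i in range(0, len(data), k)]

if __name__ == "__main__":
    data = list(range(1, 101))
    chunks = split(data, 4)                          # divide
    with Pool(processes=4) as pool:
        partials = pool.map(partial_sum, chunks)     # conquer in parallel
    total = sum(partials)                            # combine: 5050
```

This works only because the chunks are independent; an analysis whose steps depend on each other's results would not parallelize this way, which mirrors the divide-and-conquer caveat discussed in the Processing Aspects section.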

As map-reduce operations become more common, we see technologies such as RHIPE, the RHadoop package, and Ricardo, in which MR operations and R are integrated. There is also the RHive package, which combines R with high-level data processing technology such as Apache Hive.

    APACHE MAHOUT

Apache Mahout is implemented to scale data analysis algorithms. It can operate not only on Apache Hadoop but in a variety of environments. It also offers effective packages for mathematical libraries and Java collections, such as vectors and matrices, as well as the ability to use a DBMS or other NoSQL databases as data sources.

Besides, there is a trend of integrating and supporting R in the big data storage products of the existing MPP structure. For example, the Oracle Big Data Appliance enables execution of R programs targeting data stored in an Oracle database by integrating with R.

To summarize, data analysis technology such as GNU R is converging with the existing large-data processing frameworks and with operations on storage systems.

In Conclusion

We have looked into platform technology that processes big data by classifying it into storage, processing, and analysis. We also briefly reviewed storage systems and large data processing and analysis technologies.

We wish to convey that big data will eventually serve its role when the processing technology and the capability of the people and organizations who use it are well combined.

By Lee Kyu Jae, Senior Engineer, Storage System Development Team, NHN Corporation.

References

1. Steve LaValle, Eric Lesser, Rebecca Shockley, Michael S. Hopkins and Nina Kruschwitz (December 21, 2010), Big Data, Analytics and the Path from Insights to Value.

2. James Manyika, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh, Angela Hung Byers (May 2011), Big Data: The Next Frontier for Innovation, Competition, and Productivity.


3. Richard Winter (December 2011), Big Data: Business Opportunities, Requirements and Oracle's Approach (PDF).

4. Michael Stonebraker, Nabil Hachem, Pat Helland, The End of an Architectural Era (It's Time for a Complete Rewrite), VLDB 2007 (PDF).

5. Michael Stonebraker et al., One Size Fits All? Part 2: Benchmarking Results, CIDR 2007 (PDF).

6. Daniel J. Abadi et al., Integrating Compression and Execution in Column-Oriented Database Systems, SIGMOD '06 (PDF).

7. VoltDB | Lightning Fast, Rock Solid.

8. SAP HANA.

9. Real-Time Analytics Platform | Big Data Analytics | MPP Data Warehouse.

10. Greenplum is driving the future of Big Data analytics.

11. Data Warehouse Appliance, Data Warehouse Appliances, and Data Warehousing from Netezza.

12. Jeffrey Dean and Sanjay Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, CACM, Jan. 2008 (PDF).

13. Mihai Budiu (March 2008), Cluster Computing with Dryad, MSR-SVC LiveLabs (PPT).

14. Hung-chih Yang et al., Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters, SIGMOD '07 (PPT).

15. Welcome to Apache Hadoop.

16. The R Project for Statistical Computing.

1. C-Store was designed to fit OLAP applications; it was commercialized as Vertica, which was acquired by HP in 2011. H-Store was built for memory-based, high-performance OLTP and was commercialized as VoltDB.

2. Recent snapshot and command-logging methods support durability on disk, and map-reduce-style read operations and materialized views enhance the analytic functions (aggregation operations).
